
2.0.0 Release Candidate #9497

Closed · 5 tasks done
trivialfis opened this issue Aug 17, 2023 · 33 comments
Comments

@trivialfis
Member

trivialfis commented Aug 17, 2023

We are about to release version 2.0.0 of XGBoost. We invite everyone to try out the release candidate (RC).

Roadmap: https://github.com/dmlc/xgboost/projects/2
Release note: #9484

Feedback period: until the end of August 28, 2023. No new features will be added to the release; only critical bug fixes will be backported.

@dmlc/xgboost-committer

Available packages:

  • Python packages:
pip install xgboost==2.0.0rc1
  • R packages:

R binary packages with CUDA enabled for testing:

  • xgboost_r_gpu_linux_2.0.0-rc1.tar.gz: Download
  • xgboost_r_gpu_win64_2.0.0-rc1.tar.gz: Download

sha256sum:

671a66478c910546b5715d9a42f6111beb149e8b470ca217c252129469174647  xgboost_r_gpu_win64_2.0.0-rc1.tar.gz
e25ae92db365a25010de0c2d8e1eaa2f567c4980f6a6f944e690b6480117be93  xgboost_r_gpu_linux_2.0.0-rc1.tar.gz

Install:

R CMD INSTALL ./xgboost_r_gpu_linux_2.0.0-rc1.tar.gz
  • JVM packages
Instructions (Maven/SBT)

Maven

<dependencies>
  ...
  <dependency>
      <groupId>ml.dmlc</groupId>
      <artifactId>xgboost4j_2.12</artifactId>
      <version>2.0.0-RC1</version>
  </dependency>
  <dependency>
      <groupId>ml.dmlc</groupId>
      <artifactId>xgboost4j-spark_2.12</artifactId>
      <version>2.0.0-RC1</version>
  </dependency>
</dependencies>

<repositories>
  <repository>
    <id>XGBoost4J Release Repo</id>
    <name>XGBoost4J Release Repo</name>
    <url>https://s3-us-west-2.amazonaws.com/xgboost-maven-repo/release/</url>
  </repository>
</repositories>

SBT

libraryDependencies ++= Seq(
  "ml.dmlc" %% "xgboost4j" % "2.0.0-RC1",
  "ml.dmlc" %% "xgboost4j-spark" % "2.0.0-RC1"
)
resolvers += ("XGBoost4J Release Repo"
              at "https://s3-us-west-2.amazonaws.com/xgboost-maven-repo/release/")

Starting from 1.2.0, XGBoost4J-Spark supports training with NVIDIA GPUs. To enable this capability, download artifacts suffixed with -gpu, as follows:

Instructions (Maven/SBT)

Maven

<dependencies>
  ...
  <dependency>
      <groupId>ml.dmlc</groupId>
      <artifactId>xgboost4j-gpu_2.12</artifactId>
      <version>2.0.0-RC1</version>
  </dependency>
  <dependency>
      <groupId>ml.dmlc</groupId>
      <artifactId>xgboost4j-spark-gpu_2.12</artifactId>
      <version>2.0.0-RC1</version>
  </dependency>
</dependencies>

<repositories>
  <repository>
    <id>XGBoost4J Release Repo</id>
    <name>XGBoost4J Release Repo</name>
    <url>https://s3-us-west-2.amazonaws.com/xgboost-maven-repo/release/</url>
  </repository>
</repositories>

SBT

libraryDependencies ++= Seq(
  "ml.dmlc" %% "xgboost4j-gpu" % "2.0.0-RC1",
  "ml.dmlc" %% "xgboost4j-spark-gpu" % "2.0.0-RC1"
)
resolvers += ("XGBoost4J Release Repo"
              at "https://s3-us-west-2.amazonaws.com/xgboost-maven-repo/release/")

Backports

@terrytangyuan
Member

This is exciting! Thanks for driving it.

Quick question: why are you bumping the version to v2.1 though? #9498

@hcho3
Collaborator

hcho3 commented Aug 17, 2023

@terrytangyuan #9498 is updating the development branch. We have a separate branch for the 2.0 release.

@terrytangyuan
Member

It's bumping from 2.0.0 to 2.1.0 directly. Are there new features in 2.1.0 that are worth a minor release?

@trivialfis
Member Author

@terrytangyuan The 2.1 bump is for the master branch as the future release version.

@terrytangyuan
Member

Maybe I am not following the current release process in this project, but usually we only bump versions right before we release. That way we can determine the release version based on the additional commits on top of the last release. Another concern is that if users install the R package directly from GitHub, they would not be able to upgrade later to the official 2.1 release, since that version already exists. One convention (at least in the R community) is to use x.x.x.9999 to indicate that a build is not officially released yet.
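
For illustration of that convention (the version numbers below are hypothetical): a fourth .9999 component keeps a development install sorting below the next official release, so the eventual upgrade still works:

# Hypothetical versions: a dev build based on the last release vs. the next CRAN release.
package_version("2.0.0.9999") < package_version("2.1.0")
#> [1] TRUE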

@trivialfis
Member Author

trivialfis commented Aug 17, 2023

Yeah, it would be confusing for R users. For Python users, the version is actually 2.1.0-dev, and for JVM users it's 2.1.0-SNAPSHOT, following the respective conventions. The suffix indicates a development branch. It would also be confusing if we didn't bump the version and kept 2.0.0-dev while the official 2.0.0 is out: installing from the master branch would then be a downgrade.

@pangshengwei

Thanks for this. Will the release date also be 28 Aug, or later?

@trivialfis
Member Author

trivialfis commented Aug 22, 2023

It will be a bit later even if everything goes well. We still need to go through all the CI builds and package submissions.

@trivialfis
Member Author

Delaying the release due to blocking Spark issues: #9510.

@hcho3 hcho3 unpinned this issue Sep 11, 2023
@trivialfis
Member Author

@hcho3 We are hitting the total size limit on PyPI. I will be removing some prior RC releases (like 1.0.0rc1) from PyPI. What do you think?

WARNING  Error during upload. Retry with the --verbose option for more details.
ERROR    HTTPError: 400 Bad Request from https://upload.pypi.org/legacy/
         Project size too large. Limit for project 'xgboost' total size is 10 GB. See
         https://pypi.org/help/#project-size-limit                                       

@hcho3
Collaborator

hcho3 commented Sep 12, 2023

Sure, sounds good to me

@terrytangyuan
Member

I remember we can request an exception by filing an issue with PyPI.

@trivialfis
Member Author

Yes, as documented in https://pypi.org/help/#project-size-limit . I will do that after the release. At the moment, I think pulling down prior RC releases seems reasonable.

@trivialfis
Member Author

trivialfis commented Sep 12, 2023

I'm pulling down binary files and keeping the release tags. The source package will continue to be available.

@trivialfis
Member Author

trivialfis commented Sep 13, 2023

Some other R failures on CRAN:

Examples with CPU (user + system) or elapsed time > 5s
                   user system elapsed
xgb.plot.deepness 9.057  0.275   1.081
xgb.plot.shap     7.498  0.164   1.041
Examples with CPU time > 2.5 times elapsed time
                      user system elapsed  ratio
xgb.config           1.330  0.064   0.043 32.419
xgb.model.dt.tree    2.210  0.059   0.071 31.958
xgb.DMatrix          1.167  0.036   0.038 31.658
xgb.train            0.985  0.057   0.095 10.968
xgb.plot.multi.trees 3.383  0.063   0.378  9.116
xgb.plot.deepness    9.057  0.275   1.081  8.633
xgb.plot.shap        7.498  0.164   1.041  7.360
xgb.plot.importance  1.297  0.049   0.352  3.824

I don't know how xgb.config can violate the CPU time constraint; it doesn't launch any threads, it's just a simple setter/getter.

I'm running the tests on a 12-core/24-thread machine and couldn't reproduce any of the time violations.

@trivialfis
Member Author

trivialfis commented Sep 14, 2023

I ran the test using R version 4.3.1 (2023-06-16), but still haven't been able to reproduce it. I'm really curious what kind of machines the CRAN test farm is using.

@jameslamb
Contributor

I think those example names correspond to the names of the .Rd files in man/, which aren't guaranteed to be the same as function names.

So for example, man/xgb.config.Rd contains an example that's actually training a model.

\examples{
data(agaricus.train, package='xgboost')
train <- agaricus.train
bst <- xgboost(data = train$data, label = train$label, max_depth = 2,
eta = 1, nthread = 2, nrounds = 2, objective = "binary:logistic")
config <- xgb.config(bst)
}

I can see there that nthread = 2 is being passed, but maybe there are some operations where that isn't respected and a higher degree of parallelism is being used?

@jameslamb
Contributor

For the two plotting ones where CRAN is also complaining about the absolute time

Examples with CPU (user + system) or elapsed time > 5s
                   user system elapsed
xgb.plot.deepness 9.057  0.275   1.081
xgb.plot.shap     7.498  0.164   1.041

It might help to:

  • train shallower models

e.g. are 50 iterations really necessary to show what this function does?

#' # Change max_depth to a higher number to get a more significant result
#' bst <- xgboost(data = agaricus.train$data, label = agaricus.train$label, max_depth = 6,
#' eta = 0.1, nthread = 2, nrounds = 50, objective = "binary:logistic",
#' subsample = 0.5, min_child_weight = 2)

  • add a call to data.table::setDTthreads(1), to ensure that {data.table}, which is used inside those functions, isn't using more than 2 threads (sketched below)
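
For instance, a minimal sketch of what the amended example preamble could look like (illustrative, not an actual patch):

library(xgboost)

# Cap {data.table} to a single thread before running the example body
data.table::setDTthreads(1)

data(agaricus.train, package = "xgboost")
bst <- xgboost(
  data = agaricus.train$data, label = agaricus.train$label,
  max_depth = 2, eta = 1, nthread = 2, nrounds = 2,
  objective = "binary:logistic"
)
config <- xgb.config(bst)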

still haven't been able to reproduce it

I also have not been able to reproduce it.

There are some details here about the CRAN check farm, but a lot is missing (like the values of OMP_* environment variables): https://cran.r-project.org/web/checks/check_flavors.html

@trivialfis
Member Author

@jameslamb Thank you for sharing!

add a call to data.table::setDTthreads(1), to ensure that the use of {data.table} inside those functions isn't using more than 2 threads

This could be it! Let me try to limit the data.table thread usage and re-submit.

@trivialfis
Member Author

trivialfis commented Sep 18, 2023

Second attempt after #9591:

                   user system elapsed  ratio
xgb.config        1.348  0.047   0.043 32.442
xgb.model.dt.tree 2.171  0.099   0.071 31.972
xgb.DMatrix       1.150  0.062   0.038 31.895
xgb.save.raw      1.107  0.064   0.047 24.915
xgb.serialize     1.041  0.061   0.045 24.489
xgb.train         1.181  0.035   0.095 12.800

On my machine:

xgb.train           1.245  0.026   0.058 21.914
cb.gblinear.history 2.420  0.060   0.232 10.690

@trivialfis
Member Author

trivialfis commented Sep 18, 2023

I have manually verified with gdb that some of these failing examples use no more than 2 threads, by watching for hints like [New Thread 0x7ffff33ff640 (LWP 94358)] [Thread 0x7ffff33ff640 (LWP 94358) exited]. I will keep an eye on how the discussion on the mailing list goes. A ratio of 32.442 for the example in xgb.config just cannot be accurate.

After #9591, I don't think there's anything we can do on XGBoost's side other than removing those examples.

Interestingly, I was only able to get some warnings by using clang instead of gcc to compile XGBoost. #9591 lists some notes on how to do it. @jameslamb

@jameslamb
Contributor

A ratio of 32.442 with the example in xgb.config is just not accurate.

I agree, that just can't be right.

To help, I looked into a mirror of the R source code to understand the code in R CMD check that produces these timings.

It's here: https://github.com/wch/r-source/blame/b1fc9701b2d323f838c0f358d314f87ea827666f/src/library/tools/R/check.R#L4128-L4167.

I tried to build the source distribution of {xgboost} that'd be uploaded to CRAN, but didn't succeed.

Details

I ran the following:

python ./tests/ci_build/test_r_package.py --task build

and that yielded this error

usage: test_r_package.py [-h] [--compiler {mingw,msvc}] [--build-tool {cmake,autotools}]
test_r_package.py: error: unrecognized arguments: --task build

which I didn't understand, given that the script appears to define task as an argument:

parser.add_argument(
    "--task",
    type=str,
    choices=["pack", "build", "check", "doc"],
    default="check",
    required=False,
)

So I decided to do this with LightGBM, which I knew how to build a CRAN-style source distribution for.

sh build-cran-package.sh --no-build-vignettes

MAKEFLAGS=-j3 \
_R_CHECK_EXAMPLE_TIMING_THRESHOLD_=0 \
_R_CHECK_EXAMPLE_TIMING_CPU_TO_ELAPSED_THRESHOLD_=0.1 \
R --vanilla CMD check \
    --no-codoc \
    --no-manual \
    --no-tests \
    --no-vignettes \
    --run-dontrun \
    --run-donttest \
    --timings \
    ./lightgbm_4.1.0.99.tar.gz

These have the following meanings:

  • MAKEFLAGS=-j3 = compile 3 objects at a time, to make the build faster
  • _R_CHECK_EXAMPLE_TIMING_THRESHOLD_ = make R CMD check raise a NOTE if any examples take longer than this many seconds
    • (appears to be set to 5 on CRAN, based on your notes above)
  • _R_CHECK_EXAMPLE_TIMING_CPU_TO_ELAPSED_THRESHOLD_ = make R CMD check raise a NOTE if CPU time for examples is this many times greater than elapsed time
    • (appears to be set to 2.5 on CRAN, based on your notes above)
  • --no-codoc, --no-manual, --no-tests, --no-vignettes = skip other skippable checks not related to the examples
  • --run-dontrun, --run-donttest = run all the examples
    • (as CRAN does with R CMD check --as-cran)
  • --timings = produce a file {pkgname}.Rcheck/{pkgname}-Ex.timings, the source data for those NOTEs on timing

I hope that information will be helpful to you in debugging this. If you tell me how to build the tar.gz source distribution that gets uploaded to CRAN, I can try similar testing on the branch of {xgboost} from #9591.

@trivialfis
Member Author

@jameslamb Thank you for sharing the detailed information!

which I didn't understand, given that the script appears to define task as an argument:

python ./tests/ci_build/test_r_package.py --task=build # <- the `=` operator

After which, there will be a tarball in the working directory.

With the same script, you can run

python ./tests/ci_build/test_r_package.py --task=check

The check uses --as-cran along with _R_CHECK_EXAMPLE_TIMING_CPU_TO_ELAPSED_THRESHOLD_=2.5. It uses the maximum number of logical cores available on your machine to run make (as specified by MAKEFLAGS). The _R_CHECK_EXAMPLE_TIMING_THRESHOLD_ was not used in the test since we don't know the CI machine specification.

--timings = produce a file {pkgname}.Rcheck/{pkgname}-Ex.timings, the source data for those NOTEs on timing

Thank you! This is useful; let me try to produce one and share it here.

@trivialfis
Member Author

trivialfis commented Sep 20, 2023

I can reproduce the error now based on the --timings output (though not via the environment variables):

|    | name                                  |   user |   system |   elapsed |   user/elapsed |
|---:|:--------------------------------------|-------:|---------:|----------:|---------------:|
|  0 | a-compatibility-note-for-saveRDS-save |  0.312 |    0.025 |     0.049 |        6.36735 |
|  1 | cb.gblinear.history                   |  2.618 |    0.111 |     0.222 |       11.7928  |
| 12 | xgb.DMatrix                           |  0.446 |    0.006 |     0.019 |       23.4737  |
| 13 | xgb.DMatrix.save                      |  0.313 |    0.009 |     0.017 |       18.4118  |
| 14 | xgb.attr                              |  0.16  |    0.009 |     0.026 |        6.15385 |
| 15 | xgb.config                            |  0.518 |    0.023 |     0.023 |       22.5217  |
| 20 | xgb.load                              |  0.2   |    0.007 |     0.03  |        6.66667 |
| 21 | xgb.model.dt.tree                     |  0.834 |    0.016 |     0.036 |       23.1667  |
| 30 | xgb.save.raw                          |  0.605 |    0.005 |     0.026 |       23.2692  |
| 32 | xgb.train                             |  1.359 |    0.05  |     0.059 |       23.0339  |

A simple script to make the results easier to share:

import pandas as pd
from io import StringIO

# Timings file produced by `R CMD check --timings`
path = "./xgboost.Rcheck/xgboost-Ex.timings"
with open(path, "r") as fd:
    # Strip leading/trailing whitespace from each line before parsing the TSV
    con_content = "\n".join(line.strip() for line in fd)

df = pd.read_csv(StringIO(con_content), delimiter="\t")
ratio_n = "user/elapsed"
df[ratio_n] = df["user"] / df["elapsed"]
df.to_markdown("timings.md")

# Examples exceeding CRAN's CPU-to-elapsed threshold of 2.5
offending = df[df[ratio_n] > 2.5]
offending.to_markdown("offending.md")

@trivialfis
Member Author

trivialfis commented Sep 20, 2023

Extracting the example out as an independent script doesn't reproduce the error. For instance, I took the example from xgb.config out:

library(xgboost)

data(agaricus.train, package = "xgboost")
## Keep the number of threads to 1 for examples
nthread <- 1
data.table::setDTthreads(nthread)
train <- agaricus.train

bst <- xgboost(
  data = train$data, label = train$label, max_depth = 2,
  eta = 1, nthread = nthread, nrounds = 2, objective = "binary:logistic"
)
config <- xgb.config(bst)

And run:

time Rscript test-config.R

This is reported by time:

real    0m0.535s
user    0m0.487s
sys     0m0.048s

Similar observation for cb.gblinear.history.

Update: I have also verified this by using proc.time() in R.
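
For reference, the same user/system/elapsed split can be captured from within R (a minimal sketch, reusing the extracted test-config.R script above):

t0 <- proc.time()
source("test-config.R")  # the extracted xgb.config example from above
dt <- proc.time() - t0

print(dt)  # user.self, sys.self, elapsed, ...
print(dt[["user.self"]] / dt[["elapsed"]])  # CPU-to-elapsed ratio; CRAN flags values above 2.5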

@trivialfis
Member Author

@jameslamb I think I have found (one of) the causes for XGBoost. XGBoost uses multiple threads to load a model, and that code path is exercised in one of the examples, xgb.Booster.complete. Somehow, the CPU time spilled into other examples.
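
For context, a minimal sketch of the save-then-load pattern in question (the file name is illustrative); the extra threads were observed during model loading:

library(xgboost)

data(agaricus.train, package = "xgboost")
bst <- xgboost(
  data = agaricus.train$data, label = agaricus.train$label,
  nthread = 1, nrounds = 2, objective = "binary:logistic"
)

xgb.save(bst, "xgb.model.json")
bst2 <- xgb.load("xgb.model.json")  # loading the model back is where multiple threads appeared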

@trivialfis
Member Author

@hcho3 We will delay the R release to 2.1. Anything that concerns configuration is not trivial.

@trivialfis
Member Author

Alternatively, we could just restrict the number of threads used during model loading for the R build.

@trivialfis
Member Author

trivialfis commented Sep 21, 2023

@jameslamb

I wrote a helper script for running examples individually:

library(pkgload)
library(xgboost)

files <- list.files("./man")


run_example_timeit <- function(f) {
  path <- paste("./man/", f, sep = "")
  print(paste("Test", f))
  flush.console()
  t0 <- proc.time()
  run_example(path)
  t1 <- proc.time()
  list(file = f, time = t1 - t0)
}

timings <- lapply(files, run_example_timeit)

for (t in timings) {
  ratio <- t$time[1] / t$time[3]
  if (!is.na(ratio) && !is.infinite(ratio) && ratio >= 2.5) {
    print(paste("Offending example:", t$file, ratio))
  }
}

@jameslamb
Contributor

!!! this is very helpful, thank you!

@trivialfis
Member Author

The CRAN check with reverse dependencies failed. The failures are likely caused by the change of the default tree method and the addition of initial estimation.
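
For downstream packages whose tests hard-code expected numbers, one way to reduce the drift would be to pin the parameters whose defaults changed (a sketch; the specific values below are assumptions about what the old defaults resolved to, not a recommendation from this thread):

library(xgboost)

data(agaricus.train, package = "xgboost")
bst <- xgboost(
  data = agaricus.train$data, label = agaricus.train$label,
  nthread = 1, nrounds = 2, objective = "binary:logistic",
  tree_method = "exact",  # 2.0 changed the default tree method to "hist"
  base_score = 0.5        # 2.0 estimates the intercept from data by default
)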

Debian log

Package check result: OK

Changes to worse in reverse depends:

Package: CausalGPS
Check: tests
New result: ERROR
Running ‘testthat.R’ [139s/146s]
Running the tests in ‘tests/testthat.R’ failed.
Complete output:
> library(testthat)
> library(CausalGPS)
>
> Sys.setenv("R_TESTS" = "")
> library(testthat)
> test_check("CausalGPS")
2023-09-22 16:05:32.877062 anduin2 3241335 CausalGPS log_system_info INFO: System name: Linux, OS type: unix, machine architecture: x86_64, user: hornik, R Under development (unstable) (2023-09-21 r85196), detected cores: 32
2023-09-22 16:05:38.770944 anduin2 3241335 CausalGPS compile_pseudo_pop INFO: Starting compiling pseudo population (original data size: 196) ...
2023-09-22 16:05:39.061797 anduin2 3241335 CausalGPS FUN WARN: There is no data to match with 19.9149974416582 in 0.5 radius.
2023-09-22 16:05:39.246097 anduin2 3241335 CausalGPS compile_pseudo_pop INFO: Finished compiling pseudo population (Pseudo population data size: 196)
2023-09-22 16:05:41.715596 anduin2 3241335 CausalGPS log_system_info INFO: System name: Linux, OS type: unix, machine architecture: x86_64, user: hornik, R Under development (unstable) (2023-09-21 r85196), detected cores: 32
2023-09-22 16:05:49.850126 anduin2 3241335 CausalGPS compile_pseudo_pop INFO: Starting compiling pseudo population (original data size: 490) ...
2023-09-22 16:05:49.858623 anduin2 3241335 CausalGPS compile_pseudo_pop INFO: Finished compiling pseudo population (Pseudo population data size: 490)
2023-09-22 16:05:49.932548 anduin2 3241335 CausalGPS log_system_info INFO: System name: Linux, OS type: unix, machine architecture: x86_64, user: hornik, R Under development (unstable) (2023-09-21 r85196), detected cores: 32
2023-09-22 16:05:51.946565 anduin2 3241335 CausalGPS compile_pseudo_pop INFO: Starting compiling pseudo population (original data size: 196) ...
2023-09-22 16:05:52.057325 anduin2 3241335 CausalGPS FUN WARN: There is no data to match with 21.2300763979248 in 0.5 radius.
2023-09-22 16:05:52.248203 anduin2 3241335 CausalGPS compile_pseudo_pop INFO: Finished compiling pseudo population (Pseudo population data size: 196)
2023-09-22 16:05:53.189214 anduin2 3241335 CausalGPS compile_pseudo_pop INFO: Starting compiling pseudo population (original data size: 100) ...
2023-09-22 16:05:53.196196 anduin2 3241335 CausalGPS FUN WARN: There is no data to match with -8.74760785492387 in 0.5 radius.
2023-09-22 16:05:53.1984 anduin2 3241335 CausalGPS FUN WARN: There is no data to match with -7.74760785492387 in 0.5 radius.
2023-09-22 16:05:53.200478 anduin2 3241335 CausalGPS FUN WARN: There is no data to match with -6.74760785492387 in 0.5 radius.
2023-09-22 16:05:53.202507 anduin2 3241335 CausalGPS FUN WARN: There is no data to match with -5.74760785492387 in 0.5 radius.
2023-09-22 16:05:53.204529 anduin2 3241335 CausalGPS FUN WARN: There is no data to match with -4.74760785492387 in 0.5 radius.
2023-09-22 16:05:53.206583 anduin2 3241335 CausalGPS FUN WARN: There is no data to match with -3.74760785492387 in 0.5 radius.
2023-09-22 16:05:53.273383 anduin2 3241335 CausalGPS FUN WARN: There is no data to match with 21.2523921450761 in 0.5 radius.
2023-09-22 16:05:53.277081 anduin2 3241335 CausalGPS FUN WARN: There is no data to match with 22.2523921450761 in 0.5 radius.
2023-09-22 16:05:53.280568 anduin2 3241335 CausalGPS FUN WARN: There is no data to match with 23.2523921450761 in 0.5 radius.
2023-09-22 16:05:53.479519 anduin2 3241335 CausalGPS compile_pseudo_pop INFO: Finished compiling pseudo population (Pseudo population data size: 100)
2023-09-22 16:05:53.497842 anduin2 3241335 CausalGPS compile_pseudo_pop INFO: Starting compiling pseudo population (original data size: 1000) ...
2023-09-22 16:05:53.503459 anduin2 3241335 CausalGPS compile_pseudo_pop INFO: Finished compiling pseudo population (Pseudo population data size: 1000)
2023-09-22 16:05:53.726737 anduin2 3241335 CausalGPS log_system_info INFO: System name: Linux, OS type: unix, machine architecture: x86_64, user: hornik, R Under development (unstable) (2023-09-21 r85196), detected cores: 32
2023-09-22 16:06:02.65557 anduin2 3241335 CausalGPS compile_pseudo_pop INFO: Starting compiling pseudo population (original data size: 98) ...
2023-09-22 16:06:02.662154 anduin2 3241335 CausalGPS compile_pseudo_pop INFO: Finished compiling pseudo population (Pseudo population data size: 98)
2023-09-22 16:06:02.78074 anduin2 3241335 CausalGPS log_system_info INFO: System name: Linux, OS type: unix, machine architecture: x86_64, user: hornik, R Under development (unstable) (2023-09-21 r85196), detected cores: 32
2023-09-22 16:06:15.572037 anduin2 3241335 CausalGPS compile_pseudo_pop INFO: Starting compiling pseudo population (original data size: 98) ...
2023-09-22 16:06:15.57836 anduin2 3241335 CausalGPS compile_pseudo_pop INFO: Finished compiling pseudo population (Pseudo population data size: 98)
2023-09-22 16:06:22.683724 anduin2 3241335 CausalGPS log_system_info INFO: System name: Linux, OS type: unix, machine architecture: x86_64, user: hornik, R Under development (unstable) (2023-09-21 r85196), detected cores: 32
2023-09-22 16:06:38.532326 anduin2 3241335 CausalGPS compile_pseudo_pop INFO: Starting compiling pseudo population (original data size: 392) ...
2023-09-22 16:06:38.931338 anduin2 3241335 CausalGPS compile_pseudo_pop INFO: Finished compiling pseudo population (Pseudo population data size: 392)
2023-09-22 16:06:42.349596 anduin2 3241335 CausalGPS estimate_npmetric_erf INFO: The band width with the minimum risk value: 2.
2023-09-22 16:06:42.416235 anduin2 3241335 CausalGPS log_system_info INFO: System name: Linux, OS type: unix, machine architecture: x86_64, user: hornik, R Under development (unstable) (2023-09-21 r85196), detected cores: 32
2023-09-22 16:06:59.34287 anduin2 3241335 CausalGPS compile_pseudo_pop INFO: Starting compiling pseudo population (original data size: 392) ...
2023-09-22 16:06:59.790529 anduin2 3241335 CausalGPS compile_pseudo_pop INFO: Finished compiling pseudo population (Pseudo population data size: 392)
2023-09-22 16:07:03.52904 anduin2 3241335 CausalGPS estimate_npmetric_erf INFO: The band width with the minimum risk value: 2.
2023-09-22 16:07:03.654735 anduin2 3241335 CausalGPS log_system_info INFO: System name: Linux, OS type: unix, machine architecture: x86_64, user: hornik, R Under development (unstable) (2023-09-21 r85196), detected cores: 32
2023-09-22 16:07:06.636457 anduin2 3241335 CausalGPS compile_pseudo_pop INFO: Starting compiling pseudo population (original data size: 490) ...
2023-09-22 16:07:07.055452 anduin2 3241335 CausalGPS compile_pseudo_pop INFO: Finished compiling pseudo population (Pseudo population data size: 490)
2023-09-22 16:07:07.159511 anduin2 3241335 CausalGPS log_system_info INFO: System name: Linux, OS type: unix, machine architecture: x86_64, user: hornik, R Under development (unstable) (2023-09-21 r85196), detected cores: 32
2023-09-22 16:07:09.316784 anduin2 3241335 CausalGPS compile_pseudo_pop INFO: Starting compiling pseudo population (original data size: 460) ...
2023-09-22 16:07:09.595503 anduin2 3241335 CausalGPS compile_pseudo_pop INFO: Finished compiling pseudo population (Pseudo population data size: 460)
2023-09-22 16:07:09.651717 anduin2 3241335 CausalGPS log_system_info INFO: System name: Linux, OS type: unix, machine architecture: x86_64, user: hornik, R Under development (unstable) (2023-09-21 r85196), detected cores: 32
2023-09-22 16:07:09.672039 anduin2 3241335 CausalGPS log_system_info INFO: System name: Linux, OS type: unix, machine architecture: x86_64, user: hornik, R Under development (unstable) (2023-09-21 r85196), detected cores: 32
2023-09-22 16:07:09.691187 anduin2 3241335 CausalGPS log_system_info INFO: System name: Linux, OS type: unix, machine architecture: x86_64, user: hornik, R Under development (unstable) (2023-09-21 r85196), detected cores: 32
2023-09-22 16:07:09.724338 anduin2 3241335 CausalGPS log_system_info INFO: System name: Linux, OS type: unix, machine architecture: x86_64, user: hornik, R Under development (unstable) (2023-09-21 r85196), detected cores: 32
2023-09-22 16:07:09.755583 anduin2 3241335 CausalGPS log_system_info INFO: System name: Linux, OS type: unix, machine architecture: x86_64, user: hornik, R Under development (unstable) (2023-09-21 r85196), detected cores: 32
2023-09-22 16:07:09.793293 anduin2 3241335 CausalGPS log_system_info INFO: System name: Linux, OS type: unix, machine architecture: x86_64, user: hornik, R Under development (unstable) (2023-09-21 r85196), detected cores: 32
2023-09-22 16:07:09.809022 anduin2 3241335 CausalGPS log_system_info INFO: System name: Linux, OS type: unix, machine architecture: x86_64, user: hornik, R Under development (unstable) (2023-09-21 r85196), detected cores: 32
2023-09-22 16:07:09.837447 anduin2 3241335 CausalGPS log_system_info INFO: System name: Linux, OS type: unix, machine architecture: x86_64, user: hornik, R Under development (unstable) (2023-09-21 r85196), detected cores: 32
2023-09-22 16:07:09.856138 anduin2 3241335 CausalGPS log_system_info INFO: System name: Linux, OS type: unix, machine architecture: x86_64, user: hornik, R Under development (unstable) (2023-09-21 r85196), detected cores: 32
2023-09-22 16:07:11.809242 anduin2 3241335 CausalGPS compile_pseudo_pop INFO: Starting compiling pseudo population (original data size: 460) ...
2023-09-22 16:07:11.816538 anduin2 3241335 CausalGPS compile_pseudo_pop INFO: Finished compiling pseudo population (Pseudo population data size: 460)
2023-09-22 16:07:11.874854 anduin2 3241335 CausalGPS log_system_info INFO: System name: Linux, OS type: unix, machine architecture: x86_64, user: hornik, R Under development (unstable) (2023-09-21 r85196), detected cores: 32
2023-09-22 16:07:13.886692 anduin2 3241335 CausalGPS compile_pseudo_pop INFO: Starting compiling pseudo population (original data size: 460) ...
2023-09-22 16:07:14.127239 anduin2 3241335 CausalGPS compile_pseudo_pop INFO: Finished compiling pseudo population (Pseudo population data size: 460)
2023-09-22 16:07:15.770124 anduin2 3241335 CausalGPS compile_pseudo_pop INFO: Starting compiling pseudo population (original data size: 460) ...
2023-09-22 16:07:16.023317 anduin2 3241335 CausalGPS compile_pseudo_pop INFO: Finished compiling pseudo population (Pseudo population data size: 460)
2023-09-22 16:07:17.761662 anduin2 3241335 CausalGPS compile_pseudo_pop INFO: Starting compiling pseudo population (original data size: 460) ...
2023-09-22 16:07:17.928777 anduin2 3241335 CausalGPS compile_pseudo_pop INFO: Finished compiling pseudo population (Pseudo population data size: 460)
2023-09-22 16:07:19.715724 anduin2 3241335 CausalGPS compile_pseudo_pop INFO: Starting compiling pseudo population (original data size: 460) ...
2023-09-22 16:07:19.931238 anduin2 3241335 CausalGPS compile_pseudo_pop INFO: Finished compiling pseudo population (Pseudo population data size: 460)
2023-09-22 16:07:19.983181 anduin2 3241335 CausalGPS log_system_info INFO: System name: Linux, OS type: unix, machine architecture: x86_64, user: hornik, R Under development (unstable) (2023-09-21 r85196), detected cores: 32
2023-09-22 16:07:21.203905 anduin2 3241335 CausalGPS compile_pseudo_pop INFO: Starting compiling pseudo population (original data size: 460) ...
2023-09-22 16:07:21.476891 anduin2 3241335 CausalGPS compile_pseudo_pop INFO: Finished compiling pseudo population (Pseudo population data size: 460)
2023-09-22 16:07:22.723018 anduin2 3241335 CausalGPS compile_pseudo_pop INFO: Starting compiling pseudo population (original data size: 460) ...
2023-09-22 16:07:22.995949 anduin2 3241335 CausalGPS compile_pseudo_pop INFO: Finished compiling pseudo population (Pseudo population data size: 460)
2023-09-22 16:07:24.260637 anduin2 3241335 CausalGPS compile_pseudo_pop INFO: Starting compiling pseudo population (original data size: 460) ...
2023-09-22 16:07:24.50107 anduin2 3241335 CausalGPS compile_pseudo_pop INFO: Finished compiling pseudo population (Pseudo population data size: 460)
2023-09-22 16:07:25.822035 anduin2 3241335 CausalGPS compile_pseudo_pop INFO: Starting compiling pseudo population (original data size: 460) ...
2023-09-22 16:07:26.106343 anduin2 3241335 CausalGPS compile_pseudo_pop INFO: Finished compiling pseudo population (Pseudo population data size: 460)
2023-09-22 16:07:27.312351 anduin2 3241335 CausalGPS compile_pseudo_pop INFO: Starting compiling pseudo population (original data size: 460) ...
2023-09-22 16:07:27.485747 anduin2 3241335 CausalGPS compile_pseudo_pop INFO: Finished compiling pseudo population (Pseudo population data size: 460)
2023-09-22 16:07:28.715146 anduin2 3241335 CausalGPS compile_pseudo_pop INFO: Starting compiling pseudo population (original data size: 460) ...
2023-09-22 16:07:28.959134 anduin2 3241335 CausalGPS compile_pseudo_pop INFO: Finished compiling pseudo population (Pseudo population data size: 460)
2023-09-22 16:07:29.889963 anduin2 3241335 CausalGPS compile_pseudo_pop INFO: Starting compiling pseudo population (original data size: 460) ...
2023-09-22 16:07:30.105026 anduin2 3241335 CausalGPS compile_pseudo_pop INFO: Finished compiling pseudo population (Pseudo population data size: 460)
2023-09-22 16:07:30.118417 anduin2 3241335 CausalGPS generate_pseudo_pop INFO: All possible combination of transformers has been tried. Retrying ... .
2023-09-22 16:07:30.142533 anduin2 3241335 CausalGPS log_system_info INFO: System name: Linux, OS type: unix, machine architecture: x86_64, user: hornik, R Under development (unstable) (2023-09-21 r85196), detected cores: 32
2023-09-22 16:07:33.218621 anduin2 3241335 CausalGPS compile_pseudo_pop INFO: Starting compiling pseudo population (original data size: 490) ...
2023-09-22 16:07:33.641911 anduin2 3241335 CausalGPS compile_pseudo_pop INFO: Finished compiling pseudo population (Pseudo population data size: 490)
2023-09-22 16:07:33.695104 anduin2 3241335 CausalGPS log_system_info INFO: System name: Linux, OS type: unix, machine architecture: x86_64, user: hornik, R Under development (unstable) (2023-09-21 r85196), detected cores: 32
2023-09-22 16:07:33.715707 anduin2 3241335 CausalGPS log_system_info INFO: System name: Linux, OS type: unix, machine architecture: x86_64, user: hornik, R Under development (unstable) (2023-09-21 r85196), detected cores: 32
2023-09-22 16:07:33.740538 anduin2 3241335 CausalGPS log_system_info INFO: System name: Linux, OS type: unix, machine architecture: x86_64, user: hornik, R Under development (unstable) (2023-09-21 r85196), detected cores: 32
[ FAIL 13 | WARN 91 | SKIP 0 | PASS 205 ]

══ Failed tests ════════════════════════════════════════════════════════════════
── Failure ('test-check_covar_balance.R:43:1'): Covariate balance check works as expected ──
val1$pass is not TRUE

`actual`:   FALSE
`expected`: TRUE 
── Failure ('test-check_kolmogorov_smirnov.R:39:3'): check_kolmogorov_smirnov works as expected. ──
output$ks_stat[["w"]] not equal to 0.1098639.
1/1 mismatches
[1] 0.105 - 0.11 == -0.0051
── Failure ('test-check_kolmogorov_smirnov.R:40:3'): check_kolmogorov_smirnov works as expected. ──
output$ks_stat[["cf3"]] not equal to 0.1319728.
1/1 mismatches
[1] 0.0563 - 0.132 == -0.0757
── Failure ('test-check_kolmogorov_smirnov.R:41:3'): check_kolmogorov_smirnov works as expected. ──
output$stat_vals[["maximal_val"]] not equal to 0.1931973.
1/1 mismatches
[1] 0.116 - 0.193 == -0.0774
── Failure ('test-create_weighting.R:27:3'): create_weighting works as expected. ──
pseudo_pop$passed_covar_test is not FALSE

`actual`:   TRUE 
`expected`: FALSE
── Failure ('test-estimate_gps.R:14:3'): estimate_gps works as expected. ───────
data_with_gps_1$dataset$gps[2] not equal to 20.991916.
1/1 mismatches
[1] 443 - 21 == 422
── Failure ('test-estimate_gps.R:28:3'): estimate_gps works as expected. ───────
data_with_gps_2$dataset$e_gps_pred[58, ] not equal to 19.07269287.
1/1 mismatches
[1] 19.1 - 19.1 == 0.000925
── Failure ('test-generate_pseudo_pop.R:38:3'): generate_pseudo_pop works as expected. ──
ps_pop1$adjusted_corr_results$mean_absolute_corr not equal to 0.2580481.
1/1 mismatches
[1] 0.291 - 0.258 == 0.0327
── Failure ('test-generate_pseudo_pop.R:74:3'): generate_pseudo_pop works as expected. ──
ps_pop2$adjusted_corr_results$mean_absolute_corr not equal to 0.2241794.
1/1 mismatches
[1] 0.205 - 0.224 == -0.0195
── Failure ('test-generate_pseudo_pop.R:255:3'): generate_pseudo_pop works as expected. ──
ps_pop3$adjusted_corr_results$mean_absolute_corr not equal to 0.3750209.
1/1 mismatches
[1] 0.572 - 0.375 == 0.197
── Failure ('test-generate_pseudo_pop.R:281:3'): generate_pseudo_pop works as expected. ──
ps_pop4$adjusted_corr_results$mean_absolute_corr not equal to 0.2209775.
1/1 mismatches
[1] 0.133 - 0.221 == -0.0881
── Failure ('test-generate_pseudo_pop.R:308:3'): generate_pseudo_pop works as expected. ──
ps_pop5$adjusted_corr_results$mean_absolute_corr not equal to 0.1076907.
1/1 mismatches
[1] 0.0914 - 0.108 == -0.0163
── Failure ('test-matching_fn.R:36:4'): matching_l1 functions as expected. ─────
nrow(val) not equal to 6.
1/1 mismatches
[1] 5 - 6 == -1

[ FAIL 13 | WARN 91 | SKIP 0 | PASS 205 ]
Error: Test failures
Execution halted

Package: CRE
Check: tests
New result: ERROR
Running ‘testthat.R’ [295s/152s]
Running the tests in ‘tests/testthat.R’ failed.
Complete output:
> library(testthat)
> library(CRE)
>
> test_check("CRE")
2023-09-22 16:04:42.970418 anduin2 3230382 CRE cre INFO: Starting rules discovery...
2023-09-22 16:04:43.005929 anduin2 3230382 CRE cre INFO: Starting rules discovery...
fitting treatment model via method 'bart'
fitting response model via method 'bart'
2023-09-22 16:05:01.848323 anduin2 3230382 CRE cre INFO: Done with rules discovery. (WC: 18.839 seconds.)
2023-09-22 16:05:01.851805 anduin2 3230382 CRE cre INFO: Starting inference...
2023-09-22 16:05:02.259985 anduin2 3230382 CRE cre INFO: Starting rules discovery...
2023-09-22 16:05:19.558476 anduin2 3230382 CRE cre INFO: Done with rules discovery. (WC: 17.294 seconds.)
2023-09-22 16:05:19.561759 anduin2 3230382 CRE cre INFO: Starting inference...
fitting treatment model via method 'bart'
fitting response model via method 'bart'
2023-09-22 16:05:24.042061 anduin2 3230382 CRE cre INFO: Done with inference. (WC: 4.477 seconds .)
2023-09-22 16:05:24.044301 anduin2 3230382 CRE cre INFO: Done with running CRE function!(WC: 21.788 seconds.)
2023-09-22 16:05:24.046918 anduin2 3230382 CRE cre INFO: Done!
2023-09-22 16:05:24.056295 anduin2 3230382 CRE cre INFO: Starting rules discovery...
2023-09-22 16:05:26.983551 anduin2 3230382 CRE cre INFO: Done with rules discovery. (WC: 2.924 seconds.)
2023-09-22 16:05:26.986682 anduin2 3230382 CRE cre INFO: Starting inference...
fitting treatment model via method 'bart'
fitting response model via method 'bart'
2023-09-22 16:05:30.296713 anduin2 3230382 CRE cre INFO: Done with inference. (WC: 3.307 seconds .)
2023-09-22 16:05:30.298401 anduin2 3230382 CRE cre INFO: Done with running CRE function!(WC: 6.244 seconds.)
2023-09-22 16:05:30.299849 anduin2 3230382 CRE cre INFO: Done!
2023-09-22 16:05:30.306063 anduin2 3230382 CRE cre INFO: Starting rules discovery...
2023-09-22 16:05:32.802845 anduin2 3230382 CRE cre INFO: Done with rules discovery. (WC: 2.495 seconds.)
2023-09-22 16:05:32.805091 anduin2 3230382 CRE cre INFO: Starting inference...
2023-09-22 16:05:34.994778 anduin2 3230382 CRE cre INFO: Done with inference. (WC: 2.187 seconds .)
2023-09-22 16:05:34.997608 anduin2 3230382 CRE cre INFO: Done with running CRE function!(WC: 4.692 seconds.)
2023-09-22 16:05:35.000015 anduin2 3230382 CRE cre INFO: Done!
2023-09-22 16:05:35.010265 anduin2 3230382 CRE cre INFO: Starting rules discovery...
2023-09-22 16:05:35.012938 anduin2 3230382 CRE cre INFO: Using the provided ITE estimations...
2023-09-22 16:05:35.398855 anduin2 3230382 CRE cre INFO: Done with rules discovery. (WC: 0.386 seconds.)
2023-09-22 16:05:35.40121 anduin2 3230382 CRE cre INFO: Starting inference...
2023-09-22 16:05:35.403371 anduin2 3230382 CRE cre INFO: Skipped generating ITE.The provided ITE will be used.
2023-09-22 16:05:35.420071 anduin2 3230382 CRE cre INFO: Done with inference. (WC: 0.016 seconds .)
2023-09-22 16:05:35.422587 anduin2 3230382 CRE cre INFO: Done with running CRE function!(WC: 0.414 seconds.)
2023-09-22 16:05:35.424796 anduin2 3230382 CRE cre INFO: Done!
fitting treatment model via method 'bart'
fitting response model via method 'bart'
fitting treatment model via method 'bart'
fitting response model via method 'bart'
Error in setinfo.xgb.DMatrix(dmat, names(p), p[[1]]) :
The length of labels must equal to the number of rows in the input data
Error in setinfo.xgb.DMatrix(dmat, names(p), p[[1]]) :
The length of labels must equal to the number of rows in the input data
Error in setinfo.xgb.DMatrix(dmat, names(p), p[[1]]) :
The length of labels must equal to the number of rows in the input data
Error in setinfo.xgb.DMatrix(dmat, names(p), p[[1]]) :
The length of labels must equal to the number of rows in the input data
fitting treatment model via method 'bart'
fitting response model via method 'bart'
fitting treatment model via method 'bart'
fitting response model via method 'bart'
fitting treatment model via method 'bart'
fitting response model via method 'bart'
Error in contrasts<-(*tmp*, value = contr.funs[1 + isOF[nn]]) :
contrasts can be applied only to factors with 2 or more levels
Error in contrasts<-(*tmp*, value = contr.funs[1 + isOF[nn]]) :
contrasts can be applied only to factors with 2 or more levels
fitting treatment model via method 'bart'
fitting response model via method 'bart'
fitting treatment model via method 'bart'
fitting response model via method 'bart'
fitting treatment model via method 'bart'
fitting response model via method 'bart'
fitting treatment model via method 'bart'
fitting response model via method 'bart'
fitting treatment model via method 'bart'
fitting response model via method 'bart'
fitting treatment model via method 'bart'
fitting response model via method 'bart'
[ FAIL 28 | WARN 0 | SKIP 0 | PASS 210 ]

══ Failed tests ════════════════════════════════════════════════════════════════
── Failure ('test-estimate_ite_aipw.R:24:3'): AIPW ITE Estimated Correctly ─────
ite_result[1] not equal to 0.5578771.
1/1 mismatches
[1] 0.62 - 0.558 == 0.0621
── Failure ('test-estimate_ite_aipw.R:25:3'): AIPW ITE Estimated Correctly ─────
ite_result[11] not equal to -1.218074.
1/1 mismatches
[1] -1.11 - -1.22 == 0.104
── Failure ('test-estimate_ite_aipw.R:26:3'): AIPW ITE Estimated Correctly ─────
ite_result[91] not equal to 1.429128.
1/1 mismatches
[1] 1.48 - 1.43 == 0.0513
── Failure ('test-estimate_ite_slearner.R:24:3'): S-Learner ITE Estimated Correctly ──
ite[1] not equal to 0.4066854.
1/1 mismatches
[1] 0.561 - 0.407 == 0.155
── Failure ('test-estimate_ite_slearner.R:25:3'): S-Learner ITE Estimated Correctly ──
ite[11] not equal to -0.8926377.
1/1 mismatches
[1] -0.744 - -0.893 == 0.149
── Failure ('test-estimate_ite_slearner.R:26:3'): S-Learner ITE Estimated Correctly ──
ite[91] not equal to 1.136385.
1/1 mismatches
[1] 1.41 - 1.14 == 0.269
── Failure ('test-estimate_ite_tlearner.R:23:3'): T-Learner ITE Estimated Correctly ──
ite[1] not equal to 0.8611812.
1/1 mismatches
[1] 0.814 - 0.861 == -0.0471
── Failure ('test-estimate_ite_tlearner.R:24:3'): T-Learner ITE Estimated Correctly ──
ite[11] not equal to -1.713489.
1/1 mismatches
[1] -1.73 - -1.71 == -0.0188
── Failure ('test-estimate_ite_tlearner.R:25:3'): T-Learner ITE Estimated Correctly ──
ite[91] not equal to -0.4767654.
1/1 mismatches
[1] -0.484 - -0.477 == -0.00674
── Failure ('test-estimate_ite_xlearner.R:23:3'): X-Learner ITE Estimated Correctly ──
ite[1] not equal to 0.9212638.
1/1 mismatches
[1] 0.881 - 0.921 == -0.0403
── Failure ('test-estimate_ite_xlearner.R:24:3'): X-Learner ITE Estimated Correctly ──
ite[11] not equal to -1.103585.
1/1 mismatches
[1] -1.13 - -1.1 == -0.0242
── Failure ('test-estimate_ite_xlearner.R:25:3'): X-Learner ITE Estimated Correctly ──
ite[91] not equal to -0.226993.
1/1 mismatches
[1] -0.224 - -0.227 == 0.00314
── Failure ('test-estimate_ps.R:22:3'): Propensity Score Estimated Correctly ───
est_ps[2] not equal to 0.5685197.
1/1 mismatches
[1] 0.556 - 0.569 == -0.0125
── Failure ('test-estimate_ps.R:23:3'): Propensity Score Estimated Correctly ───
est_ps[37] not equal to 0.124416.
1/1 mismatches
[1] 0.128 - 0.124 == 0.00377
── Failure ('test-extract_rules.R:33:3'): Rules Extracted Correctly ────────────
ite[10] not equal to 0.6874263.
1/1 mismatches
[1] 0.614 - 0.687 == -0.0738
── Failure ('test-extract_rules.R:34:3'): Rules Extracted Correctly ────────────
ite[25] not equal to -0.2175163.
1/1 mismatches
[1] -0.416 - -0.218 == -0.199
── Failure ('test-extract_rules.R:35:3'): Rules Extracted Correctly ────────────
ite[70] not equal to 1.656867.
1/1 mismatches
[1] 2.05 - 1.66 == 0.395
── Failure ('test-extract_rules.R:67:3'): Rules Extracted Correctly ────────────
treelist[2]$list[[1]][2, 6] not equal to 0.4320062.
1/1 mismatches
[1] -0.428 - 0.432 == -0.86
── Failure ('test-extract_rules.R:68:3'): Rules Extracted Correctly ────────────
treelist[2]$list[[2]][3, 6] not equal to -0.5863133.
1/1 mismatches
[1] 0.738 - -0.586 == 1.32
── Failure ('test-extract_rules.R:69:3'): Rules Extracted Correctly ────────────
treelist[2]$list[[10]][3, 6] not equal to 0.06850307.
1/1 mismatches
[1] -0.00645 - 0.0685 == -0.075
── Failure ('test-extract_rules.R:83:3'): Rules Extracted Correctly ────────────
length(rules_RF) not equal to 428.
1/1 mismatches
[1] 417 - 428 == -11
── Failure ('test-extract_rules.R:84:3'): Rules Extracted Correctly ────────────
rules_RF[3] not equal to "X[,1]<=0.5 & X[,5]>0.5".
1/1 mismatches
x[1]: "X[,2]>0.5 & X[,5]<=0.5 & X[,9]>0.5"
y[1]: "X[,1]<=0.5 & X[,5]>0.5"
── Failure ('test-filter_irrelevant_rules.R:30:3'): Filter ireelevant rules run correctly ──
ite[10] not equal to 0.6874263.
1/1 mismatches
[1] 0.614 - 0.687 == -0.0738
── Failure ('test-filter_irrelevant_rules.R:31:3'): Filter ireelevant rules run correctly ──
ite[25] not equal to -0.2175163.
1/1 mismatches
[1] -0.416 - -0.218 == -0.199
── Failure ('test-filter_irrelevant_rules.R:32:3'): Filter ireelevant rules run correctly ──
ite[70] not equal to 1.656867.
1/1 mismatches
[1] 2.05 - 1.66 == 0.395
── Failure ('test-filter_irrelevant_rules.R:66:3'): Filter ireelevant rules run correctly ──
treelist[2]$list[[1]][2, 6] not equal to 0.4320062.
1/1 mismatches
[1] -0.428 - 0.432 == -0.86
── Failure ('test-filter_irrelevant_rules.R:67:3'): Filter ireelevant rules run correctly ──
treelist[2]$list[[2]][3, 6] not equal to -0.5863133.
1/1 mismatches
[1] 0.738 - -0.586 == 1.32
── Failure ('test-filter_irrelevant_rules.R:68:3'): Filter ireelevant rules run correctly ──
treelist[2]$list[[10]][3, 6] not equal to 0.06850307.
1/1 mismatches
[1] -0.00645 - 0.0685 == -0.075

[ FAIL 28 | WARN 0 | SKIP 0 | PASS 210 ]
Error: Test failures
Execution halted

Package: GPCERF
Check: tests
New result: ERROR
Running ‘testthat.R’ [125s/155s]
Running the tests in ‘tests/testthat.R’ failed.
Complete output:
> library(testthat)
> library(GPCERF)
>
> test_check("GPCERF")
2023-09-22 16:05:08.754071 anduin2 3236593 GPCERF estimate_gps INFO: Started estimating GPS values ...
Loading required package: nnls
2023-09-22 16:05:15.138245 anduin2 3236593 GPCERF estimate_gps INFO: Started estimating GPS values ...
2023-09-22 16:05:21.617365 anduin2 3236593 GPCERF estimate_gps INFO: Started estimating GPS values ...
2023-09-22 16:05:30.245974 anduin2 3236593 GPCERF estimate_gps INFO: Started estimating GPS values ...
2023-09-22 16:05:35.765607 anduin2 3236593 GPCERF estimate_gps INFO: Started estimating GPS values ...
2023-09-22 16:05:41.752051 anduin2 3236593 GPCERF estimate_noise_nn INFO: Working on estimating residual error with nngp ...
starting worker pid=3243465 on localhost:11475 at 16:05:41.995
2023-09-22 16:05:44.228993 anduin2 3236593 GPCERF estimate_noise_nn INFO: Done with estimating residual error with nngpWall clock time: 2.479 s.
2023-09-22 16:05:44.274859 anduin2 3236593 GPCERF estimate_gps INFO: Started estimating GPS values ...
2023-09-22 16:05:50.164587 anduin2 3236593 GPCERF estimate_gps INFO: Started estimating GPS values ...
2023-09-22 16:05:56.538252 anduin2 3236593 GPCERF estimate_gps INFO: Started estimating GPS values ...
[1] train-rmse:5.313567
[2] train-rmse:4.667407
[3] train-rmse:3.922342
[4] train-rmse:3.496441
[5] train-rmse:3.189745
[6] train-rmse:2.909992
[7] train-rmse:2.579096
[8] train-rmse:2.417789
[9] train-rmse:2.250876
[10] train-rmse:2.132709
[11] train-rmse:1.927925
[12] train-rmse:1.740749
[13] train-rmse:1.582843
[14] train-rmse:1.456660
[15] train-rmse:1.370258
[16] train-rmse:1.253989
[17] train-rmse:1.123036
[18] train-rmse:1.086729
[19] train-rmse:1.017817
[20] train-rmse:0.881426
[21] train-rmse:0.792504
[22] train-rmse:0.723483
[23] train-rmse:0.668225
[24] train-rmse:0.601481
[25] train-rmse:0.568259
[26] train-rmse:0.530765
[27] train-rmse:0.500917
[28] train-rmse:0.484038
[29] train-rmse:0.455969
[30] train-rmse:0.426960
[31] train-rmse:0.380551
[32] train-rmse:0.358708
[33] train-rmse:0.336700
[34] train-rmse:0.300794
[35] train-rmse:0.290543
[36] train-rmse:0.266671
[37] train-rmse:0.261051
[38] train-rmse:0.235661
[39] train-rmse:0.225523
[40] train-rmse:0.214115
[41] train-rmse:0.201783
[42] train-rmse:0.186876
[43] train-rmse:0.180376
[44] train-rmse:0.168879
[45] train-rmse:0.160807
[46] train-rmse:0.153128
[47] train-rmse:0.150586
[48] train-rmse:0.134723
[49] train-rmse:0.127087
[50] train-rmse:0.115730
2023-09-22 16:06:02.897922 anduin2 3236593 GPCERF estimate_gps INFO: Started estimating GPS values ...
2023-09-22 16:06:08.777707 anduin2 3236593 GPCERF log_system_info INFO: System name: Linux, OS type: unix, machine architecture: x86_64, user: hornik, R Under development (unstable) (2023-09-21 r85196), detected cores: 32
2023-09-22 16:06:11.537879 anduin2 3236593 GPCERF estimate_gps INFO: Started estimating GPS values ...
2023-09-22 16:06:16.619067 anduin2 3236593 GPCERF log_system_info INFO: System name: Linux, OS type: unix, machine architecture: x86_64, user: hornik, R Under development (unstable) (2023-09-21 r85196), detected cores: 32
2023-09-22 16:06:16.635795 anduin2 3236593 GPCERF estimate_gps INFO: Started estimating GPS values ...
2023-09-22 16:06:22.667979 anduin2 3236593 GPCERF log_system_info INFO: System name: Linux, OS type: unix, machine architecture: x86_64, user: hornik, R Under development (unstable) (2023-09-21 r85196), detected cores: 32
2023-09-22 16:06:22.685522 anduin2 3236593 GPCERF log_system_info INFO: System name: Linux, OS type: unix, machine architecture: x86_64, user: hornik, R Under development (unstable) (2023-09-21 r85196), detected cores: 32
2023-09-22 16:06:22.702528 anduin2 3236593 GPCERF log_system_info INFO: System name: Linux, OS type: unix, machine architecture: x86_64, user: hornik, R Under development (unstable) (2023-09-21 r85196), detected cores: 32
2023-09-22 16:06:22.720567 anduin2 3236593 GPCERF log_system_info INFO: System name: Linux, OS type: unix, machine architecture: x86_64, user: hornik, R Under development (unstable) (2023-09-21 r85196), detected cores: 32
2023-09-22 16:06:22.738173 anduin2 3236593 GPCERF log_system_info INFO: System name: Linux, OS type: unix, machine architecture: x86_64, user: hornik, R Under development (unstable) (2023-09-21 r85196), detected cores: 32
2023-09-22 16:06:22.755677 anduin2 3236593 GPCERF log_system_info INFO: System name: Linux, OS type: unix, machine architecture: x86_64, user: hornik, R Under development (unstable) (2023-09-21 r85196), detected cores: 32
2023-09-22 16:06:22.773586 anduin2 3236593 GPCERF log_system_info INFO: System name: Linux, OS type: unix, machine architecture: x86_64, user: hornik, R Under development (unstable) (2023-09-21 r85196), detected cores: 32
2023-09-22 16:06:22.790068 anduin2 3236593 GPCERF estimate_gps INFO: Started estimating GPS values ...
2023-09-22 16:06:28.664223 anduin2 3236593 GPCERF log_system_info INFO: System name: Linux, OS type: unix, machine architecture: x86_64, user: hornik, R Under development (unstable) (2023-09-21 r85196), detected cores: 32
starting worker pid=3265346 on localhost:11475 at 16:06:28.920
starting worker pid=3265347 on localhost:11475 at 16:06:28.921
2023-09-22 16:06:35.659107 anduin2 3236593 GPCERF estimate_gps INFO: Started estimating GPS values ...
2023-09-22 16:06:42.25386 anduin2 3236593 GPCERF log_system_info INFO: System name: Linux, OS type: unix, machine architecture: x86_64, user: hornik, R Under development (unstable) (2023-09-21 r85196), detected cores: 32
2023-09-22 16:06:42.256446 anduin2 3236593 GPCERF estimate_cerf_nngp INFO: Working on estimating cerf using nngp approach ...
2023-09-22 16:06:42.26128 anduin2 3236593 GPCERF find_optimal_nn INFO: Started finding optimal values ...
starting worker pid=3289088 on localhost:11475 at 16:06:42.507
2023-09-22 16:06:45.854515 anduin2 3236593 GPCERF find_optimal_nn INFO: Done with finding optimal value.(Wall clock time: 3.59100000000001 s.} ...
2023-09-22 16:06:45.857175 anduin2 3236593 GPCERF estimate_noise_nn INFO: Working on estimating residual error with nngp ...
starting worker pid=3291610 on localhost:11475 at 16:06:46.133
2023-09-22 16:06:49.038773 anduin2 3236593 GPCERF estimate_noise_nn INFO: Done with estimating residual error with nngpWall clock time: 3.181 s.
2023-09-22 16:06:49.041783 anduin2 3236593 GPCERF estimate_mean_sd_nn INFO: Working on estimating mean and sd using nngp approach ...
starting worker pid=3294374 on localhost:11475 at 16:06:49.293
2023-09-22 16:06:53.169014 anduin2 3236593 GPCERF estimate_mean_sd_nn INFO: Done with estimating mean and sd using nngp approach Wall clock time: 4.127 s.
2023-09-22 16:06:53.171528 anduin2 3236593 GPCERF estimate_cerf_nngp INFO: Done with estimating cerf using nngp approach Wall clock time: 10.915 s.
2023-09-22 16:06:53.184003 anduin2 3236593 GPCERF log_system_info INFO: System name: Linux, OS type: unix, machine architecture: x86_64, user: hornik, R Under development (unstable) (2023-09-21 r85196), detected cores: 32
2023-09-22 16:06:53.186982 anduin2 3236593 GPCERF estimate_cerf_nngp INFO: Working on estimating cerf using nngp approach ...
2023-09-22 16:06:53.209631 anduin2 3236593 GPCERF log_system_info INFO: System name: Linux, OS type: unix, machine architecture: x86_64, user: hornik, R Under development (unstable) (2023-09-21 r85196), detected cores: 32
2023-09-22 16:06:53.212335 anduin2 3236593 GPCERF estimate_cerf_nngp INFO: Working on estimating cerf using nngp approach ...
2023-09-22 16:06:53.243408 anduin2 3236593 GPCERF log_system_info INFO: System name: Linux, OS type: unix, machine architecture: x86_64, user: hornik, R Under development (unstable) (2023-09-21 r85196), detected cores: 32
2023-09-22 16:06:53.245793 anduin2 3236593 GPCERF estimate_cerf_nngp INFO: Working on estimating cerf using nngp approach ...
2023-09-22 16:06:53.258139 anduin2 3236593 GPCERF estimate_gps INFO: Started estimating GPS values ...
2023-09-22 16:06:58.177807 anduin2 3236593 GPCERF log_system_info INFO: System name: Linux, OS type: unix, machine architecture: x86_64, user: hornik, R Under development (unstable) (2023-09-21 r85196), detected cores: 32
2023-09-22 16:06:58.180978 anduin2 3236593 GPCERF estimate_cerf_nngp INFO: Working on estimating cerf using nngp approach ...
2023-09-22 16:06:58.200074 anduin2 3236593 GPCERF log_system_info INFO: System name: Linux, OS type: unix, machine architecture: x86_64, user: hornik, R Under development (unstable) (2023-09-21 r85196), detected cores: 32
2023-09-22 16:06:58.202828 anduin2 3236593 GPCERF estimate_cerf_nngp INFO: Working on estimating cerf using nngp approach ...
2023-09-22 16:06:58.231735 anduin2 3236593 GPCERF estimate_gps INFO: Started estimating GPS values ...
2023-09-22 16:07:04.690614 anduin2 3236593 GPCERF estimate_gps INFO: Started estimating GPS values ...
2023-09-22 16:07:10.859255 anduin2 3236593 GPCERF estimate_noise_nn INFO: Working on estimating residual error with nngp ...
starting worker pid=3304299 on localhost:11475 at 16:07:11.056
2023-09-22 16:07:13.691146 anduin2 3236593 GPCERF estimate_noise_nn INFO: Done with estimating residual error with nngpWall clock time: 2.83199999999999 s.
2023-09-22 16:07:13.694097 anduin2 3236593 GPCERF estimate_mean_sd_nn INFO: Working on estimating mean and sd using nngp approach ...
starting worker pid=3305307 on localhost:11475 at 16:07:13.893
2023-09-22 16:07:16.863155 anduin2 3236593 GPCERF estimate_mean_sd_nn INFO: Done with estimating mean and sd using nngp approach Wall clock time: 3.16899999999998 s.
2023-09-22 16:07:16.901533 anduin2 3236593 GPCERF estimate_gps INFO: Started estimating GPS values ...
2023-09-22 16:07:21.502832 anduin2 3236593 GPCERF estimate_gps INFO: Started estimating GPS values ...
2023-09-22 16:07:27.406164 anduin2 3236593 GPCERF estimate_noise_nn INFO: Working on estimating residual error with nngp ...
starting worker pid=3309560 on localhost:11475 at 16:07:27.636
2023-09-22 16:07:30.141098 anduin2 3236593 GPCERF estimate_noise_nn INFO: Done with estimating residual error with nngpWall clock time: 2.73500000000001 s.
2023-09-22 16:07:30.169791 anduin2 3236593 GPCERF estimate_gps INFO: Started estimating GPS values ...
2023-09-22 16:07:36.367018 anduin2 3236593 GPCERF find_optimal_nn INFO: Started finding optimal values ...
starting worker pid=3311342 on localhost:11475 at 16:07:36.561
2023-09-22 16:07:40.392681 anduin2 3236593 GPCERF find_optimal_nn INFO: Done with finding optimal value.(Wall clock time: 4.02300000000002 s.} ...
[ FAIL 6 | WARN 0 | SKIP 0 | PASS 101 ]

══ Failed tests ════════════════════════════════════════════════════════════════
── Failure ('test-compute_deriv_nn.R:21:3'): compute_deriv_nn works as expected! ──
deriv_val[1, 1] (`actual`) not equal to 1.558466 (`expected`).

  `actual`: -1
`expected`:  2
── Failure ('test-compute_deriv_weights_gp.R:17:3'): compute_deriv_weights_gp works as expected! ──
weights[37] (`actual`) not equal to 3.307249e-05 (`expected`).

  `actual`: -0.0001
`expected`:  0.0000
── Failure ('test-compute_m_sigma.R:25:4'): compute_m_sigma works as expected! ──
gp_cerf[10] (`actual`) not equal to 2.105425 (`expected`).

  `actual`: -6
`expected`:  2
── Failure ('test-compute_weight_gp.R:49:3'): multiplication works ─────────────
weight$weight[28] (`actual`) not equal to 0.0002182767 (`expected`).

  `actual`: 0.0000
`expected`: 0.0002
── Failure ('test-estimate_gps.R:15:3'): estimate_gps works as expected. ───────
gps_m$gps[4, 1] (`actual`) not equal to 8.74310154 (`expected`).

  `actual`: 7
`expected`: 9
── Failure ('test-estimate_mean_sd_nn.R:37:3'): estimate_mean_sd_nn works as expected! ──
val[10] (`actual`) not equal to 5.031225 (`expected`).

  `actual`: 5.07
`expected`: 5.03

[ FAIL 6 | WARN 0 | SKIP 0 | PASS 101 ]
Error: Test failures
Execution halted
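
Note on the GPCERF failures: the mismatches above are large value shifts rather than rounding noise. One candidate explanation worth checking is that xgboost 2.0 switches the default `tree_method` to `hist`, so any learner that previously relied on the 1.x default can produce noticeably different fits. A minimal sketch, not taken from the GPCERF sources and using synthetic data, of pinning the tree method so results stay comparable across versions:

```r
library(xgboost)

set.seed(1)
x <- matrix(rnorm(200 * 5), ncol = 5)   # synthetic features
y <- x[, 1] + rnorm(200, sd = 0.1)      # synthetic response
dtrain <- xgb.DMatrix(x, label = y)

bst <- xgb.train(
  params = list(
    objective   = "reg:squarederror",
    tree_method = "exact",   # explicit method instead of the new "hist" default
    nthread     = 1          # single thread for reproducible numerics
  ),
  data    = dtrain,
  nrounds = 10
)
```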

Package: lime
Check: tests
New result: ERROR
Running ‘testthat.R’ [36s/37s]
Running the tests in ‘tests/testthat.R’ failed.
Complete output:
> library(testthat)
> library(lime)
>
> test_check("lime")
[ FAIL 1 | WARN 0 | SKIP 1 | PASS 21 ]

══ Skipped tests (1) ═══════════════════════════════════════════════════════════
• On CRAN (1): 'test-h2o.R:3:1'

══ Failed tests ════════════════════════════════════════════════════════════════
── Failure ('test-text.R:40:3'): multiple sentences, multiple explanations ─────
"our" %in% explanation$feature is not TRUE

`actual`:   FALSE
`expected`: TRUE 

[ FAIL 1 | WARN 0 | SKIP 1 | PASS 21 ]
Error: Test failures
Execution halted

Package: pdp
Check: tests
New result: ERROR
Running ‘tinytest.R’ [115s/99s]
Running the tests in ‘tests/tinytest.R’ failed.
Complete output:
> # Run tests in local environment
> if (requireNamespace("tinytest", quietly = TRUE)) {
+ home <- length(unclass(packageVersion("pdp"))[[1]]) == 4
+ tinytest::test_package("pdp", at_home = home)
+ }

test_cats_argument.R..........    0 tests    
test_cats_argument.R..........    0 tests    
test_cats_argument.R..........    0 tests    
test_cats_argument.R..........    0 tests    
test_cats_argument.R..........    0 tests    
test_cats_argument.R..........    0 tests    
test_cats_argument.R..........    0 tests    
test_cats_argument.R..........    0 tests    
test_cats_argument.R..........    4 tests OK randomForest 4.7-1.1
Type rfNews() to see new features/changes/bug fixes.

test_cats_argument.R..........    5 tests OK 0.3s

test_exemplar.R...............    0 tests    
test_exemplar.R...............    0 tests    
test_exemplar.R...............    0 tests    
test_exemplar.R...............    1 tests OK 19ms

test_get_training_data.R......    0 tests    
Attaching package: 'ggplot2'

The following object is masked from 'package:randomForest':

    margin


test_get_training_data.R......    2 tests OK
test_get_training_data.R......    5 tests OK
test_get_training_data.R......    8 tests OK
Attaching package: 'ranger'

The following object is masked from 'package:randomForest':

    importance


test_get_training_data.R......   10 tests OK 4.2s

test_pkg_C50.R................    0 tests    
test_pkg_C50.R................    4 tests OK 0.1s

test_pkg_MASS.R...............    0 tests    
test_pkg_MASS.R...............    0 tests    
test_pkg_MASS.R...............    8 tests OK 2.3s

test_pkg_e1071.R..............    0 tests    
test_pkg_e1071.R..............    0 tests    
test_pkg_e1071.R..............    0 tests    
test_pkg_e1071.R..............    0 tests    
test_pkg_e1071.R..............    8 tests OK 1.3s

Attaching package: 'zoo'

The following objects are masked from 'package:base':

    as.Date, as.Date.numeric


test_pkg_party.R..............   12 tests OK 1.1s

test_pkg_ranger.R.............    0 tests    
test_pkg_ranger.R.............    0 tests    
test_pkg_ranger.R.............    0 tests    3ms

test_pkg_stats.R..............   10 tests OK 1.7s

test_pkg_xgboost.R............    0 tests    
test_pkg_xgboost.R............    0 tests    
test_pkg_xgboost.R............    0 tests    
test_pkg_xgboost.R............    0 tests    
test_pkg_xgboost.R............    0 tests    
test_pkg_xgboost.R............    0 tests    
test_pkg_xgboost.R............   12 tests OK 7.1s

test_pred_grid.R..............    0 tests    
test_pred_grid.R..............    0 tests    
test_pred_grid.R..............    0 tests    
test_pred_grid.R..............    0 tests    
test_pred_grid.R..............    0 tests    
test_pred_grid.R..............    1 tests OK
test_pred_grid.R..............    2 tests OK
test_pred_grid.R..............    3 tests OK
test_pred_grid.R..............    4 tests OK
test_pred_grid.R..............    5 tests OK
test_pred_grid.R..............    6 tests OK
test_pred_grid.R..............    6 tests OK
test_pred_grid.R..............    6 tests OK
test_pred_grid.R..............    7 tests OK 10ms

test_xgboost_pima.R...........   12 tests 2 fails 12.2s
----- FAILED[data]: test_xgboost_pima.R<1--96>
 call| expect_identical(pds[[1]], target = pds[[3]])
 diff| Component "yhat": Mean relative difference: 0.008709692
----- FAILED[data]: test_xgboost_pima.R<1--96>
 call| expect_identical(pds[[1]], target = pds[[9]])
 diff| Component "yhat": Mean relative difference: 0.008709692
Error: 2 out of 89 tests failed
In addition: Warning messages:
1: In partial.default(fit4, pred.var = "x.4", prob = TRUE, ice = TRUE,  :
  Centering may result in probabilities outside of [0, 1].
2: glm.fit: algorithm did not converge 
3: glm.fit: fitted probabilities numerically 0 or 1 occurred 
4: In partial.default(fit2_glm, pred.var = "x.3", prob = TRUE, ice = TRUE,  :
  Centering may result in probabilities outside of [0, 1].
Execution halted
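
Note on the pdp failures: both failing assertions report a mean relative difference of about 0.9% in the `yhat` component, which points at changed xgboost numerics rather than a pdp bug. A minimal, self-contained sketch (placeholder values, not the pdp test code) showing how a tolerance-based comparison with tinytest absorbs a shift of that size; the tolerance value is an assumption:

```r
library(tinytest)

pd_old <- c(0.101, 0.215, 0.330)   # placeholder partial dependence values
pd_new <- pd_old * 1.0087          # ~0.9% relative shift, as reported above

expect_equal(pd_new, target = pd_old, tolerance = 0.05)   # passes
# expect_identical(pd_new, target = pd_old)               # fails, like the check above
```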

Package: personalized
Check: re-building of vignette outputs
New result: ERROR
Error(s) in re-building vignettes:
...
--- re-building ‘efficiency_augmentation_personalized.Rmd’ using rmarkdown
--- finished re-building ‘efficiency_augmentation_personalized.Rmd’

--- re-building ‘fitting_itrs_with_xgboost.Rmd’ using rmarkdown

Quitting from lines 125-141 [unnamed-chunk-4] (fitting_itrs_with_xgboost.Rmd)
Error: processing vignette 'fitting_itrs_with_xgboost.Rmd' failed with diagnostics:
argument "modelfile" is missing, with no default
--- failed re-building ‘fitting_itrs_with_xgboost.Rmd’

--- re-building ‘multicategory_treatments_with_personalized.Rmd’ using rmarkdown
--- finished re-building ‘multicategory_treatments_with_personalized.Rmd’

--- re-building ‘usage_of_the_personalized_package.Rmd’ using rmarkdown
--- finished re-building ‘usage_of_the_personalized_package.Rmd’

SUMMARY: processing the following file failed:
‘fitting_itrs_with_xgboost.Rmd’

Error: Vignette re-building failed.
Execution halted

Package: personalized
Check: tests
New result: ERROR
Running ‘testthat.R’ [144s/144s]
Running the tests in ‘tests/testthat.R’ failed.
Complete output:
> Sys.setenv("R_TESTS" = "")
> library(testthat)
> library(personalized)
Loading required package: glmnet
Loading required package: Matrix
Loaded glmnet 4.1-8
Loading required package: mgcv
Loading required package: nlme
This is mgcv 1.9-0. For overview type 'help("mgcv-package")'.
Loading required package: ggplot2
Loading required package: plotly

Attaching package: 'plotly'

The following object is masked from 'package:ggplot2':

    last_plot

The following object is masked from 'package:stats':

    filter

The following object is masked from 'package:graphics':

    layout

> 
> test_check("personalized")
family:    gaussian 
loss:      sq_loss_lasso 
method:    weighting 
cutpoint:  0 
propensity 
function:  propensity.func 

benefit score: f(x), 
Trt recom = 1*I(f(x)>c)+0*I(f(x)<=c) where c is 'cutpoint'

Average Outcomes:
             Recommended 0    Recommended 1
Received 0 8.9342 (n = 24) -9.2993 (n = 17)
Received 1 -5.888 (n = 29)  7.2975 (n = 30)

Treatment effects conditional on subgroups:
Est of E[Y|T=0,Recom=0]-E[Y|T=/=0,Recom=0] 
                          14.8221 (n = 53) 
Est of E[Y|T=1,Recom=1]-E[Y|T=/=1,Recom=1] 
                          16.5969 (n = 47) 

NOTE: The above average outcomes are biased estimates of
      the expected outcomes conditional on subgroups. 
      Use 'validate.subgroup()' to obtain unbiased estimates.

---------------------------------------------------

Benefit score quantiles (f(X) for 1 vs 0): 
      0%      25%      50%      75%     100% 
-21.9806  -6.3018  -0.6354   6.1213  25.5801 

---------------------------------------------------

Summary of individual treatment effects: 
E[Y|T=1, X] - E[Y|T=0, X]

    Min.  1st Qu.   Median     Mean  3rd Qu.     Max. 
-43.9612 -12.6036  -1.2708   0.1229  12.2425  51.1602 
family:    gaussian 
loss:      sq_loss_lasso 
method:    weighting 
cutpoint:  0 
propensity 
function:  propensity.func 

benefit score: f(x), 
Trt recom = 1*I(f(x)<c)+0*I(f(x)>=c) where c is 'cutpoint'

Average Outcomes:
              Recommended 0    Recommended 1
Received 0 -9.1702 (n = 17)  8.2552 (n = 24)
Received 1  7.1945 (n = 31) -5.9147 (n = 28)

Treatment effects conditional on subgroups:
Est of E[Y|T=0,Recom=0]-E[Y|T=/=0,Recom=0] 
                         -16.3647 (n = 48) 
Est of E[Y|T=1,Recom=1]-E[Y|T=/=1,Recom=1] 
                         -14.1699 (n = 52) 

NOTE: The above average outcomes are biased estimates of
      the expected outcomes conditional on subgroups. 
      Use 'validate.subgroup()' to obtain unbiased estimates.

---------------------------------------------------

Benefit score quantiles (f(X) for 1 vs 0): 
      0%      25%      50%      75%     100% 
-21.1598  -6.0406  -0.5452   6.0647  25.1260 

---------------------------------------------------

Summary of individual treatment effects: 
E[Y|T=1, X] - E[Y|T=0, X]

    Min.  1st Qu.   Median     Mean  3rd Qu.     Max. 
-42.3197 -12.0812  -1.0904   0.3885  12.1294  50.2519 
Summary of individual treatment effects: 
E[Y|T=1, X] - E[Y|T=0, X]

   Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
-43.913 -10.825  -1.574  -1.009   8.205  38.722 
Summary of individual treatment effects: 
E[Y|T=1, X] - E[Y|T=0, X]

   Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
-34.390  -5.795   1.886   2.259  10.456  37.506 
Summary of individual treatment effects: 
E[Y|T=1, X] - E[Y|T=0, X]

   Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
-34.390  -5.795   1.886   2.259  10.456  37.506 
Summary of individual treatment effects: 
E[Y|T=1, X] / E[Y|T=0, X]

Note: for survival outcomes, the above ratio is 
E[g(Y)|T=1, X] / E[g(Y)|T=0, X], 
where g() is a monotone increasing function of Y, 
the survival time

   Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
 0.2452  0.8272  1.1267  1.2324  1.5091  3.9228 
family:  binomial 
loss:    logistic_loss_lasso 
method:  weighting 

validation method:  training_test_replication 
cutpoint:           0 
replications:       3 

benefit score: f(x), 
Trt recom = 1*I(f(x)>c)+0*I(f(x)<=c) where c is 'cutpoint'

Average Test Set Outcomes:
                              Recommended 0                    Recommended 1
Received 0           1 (SE = 0, n = 3.6667) 0.0424 (SE = 0.0735, n = 5.6667)
Received 1 0.0331 (SE = 0.0573, n = 7.3333) 0.9339 (SE = 0.1145, n = 8.3333)

Treatment effects conditional on subgroups:
Est of E[Y|T=0,Recom=0]-E[Y|T=/=0,Recom=0] 
              0.9669 (SE = 0.0573, n = 11) 
Est of E[Y|T=1,Recom=1]-E[Y|T=/=1,Recom=1] 
               0.8915 (SE = 0.188, n = 14) 

Est of 
E[Y|Trt received = Trt recom] - E[Y|Trt received =/= Trt recom]:                     
0.9067 (SE = 0.1213) 
family:  gaussian 
loss:    sq_loss_lasso 
method:  weighting 

validation method:  training_test_replication 
cutpoint:           0 
replications:       3 

benefit score: f(x), 
Trt recom = 1*I(f(x)<c)+0*I(f(x)>=c) where c is 'cutpoint'

Average Test Set Outcomes:
                               Recommended 0                     Recommended 1
Received 0 -27.101 (SE = 5.5353, n = 2.6667) 21.9228 (SE = 4.1631, n = 7.6667)
Received 1 12.1774 (SE = 1.8509, n = 8.6667)     -12.9406 (SE = 8.3275, n = 6)

Treatment effects conditional on subgroups:
Est of E[Y|T=0,Recom=0]-E[Y|T=/=0,Recom=0] 
       -39.2784 (SE = 5.2129, n = 11.3333) 
Est of E[Y|T=1,Recom=1]-E[Y|T=/=1,Recom=1] 
      -34.8635 (SE = 12.2911, n = 13.6667) 

Est of 
E[Y|Trt received = Trt recom] - E[Y|Trt received =/= Trt recom]:                       
-38.4093 (SE = 4.2242) 
family:  binomial 
loss:    logistic_loss_lasso 
method:  weighting 

validation method:  training_test_replication 
cutpoint:           Quant_67 
replications:       3 

benefit score: f(x), 
Trt recom = 1*I(f(x)>c)+0*I(f(x)<=c) where c is 'cutpoint'

Average Test Set Outcomes:
                               Recommended 0          Recommended 1
Received 0  0.8827 (SE = 0.1035, n = 4.6667) 0 (SE = 0, n = 4.6667)
Received 1 0.3673 (SE = 0.0878, n = 12.3333) 1 (SE = 0, n = 3.3333)

Treatment effects conditional on subgroups:
Est of E[Y|T=0,Recom=0]-E[Y|T=/=0,Recom=0] 
              0.5154 (SE = 0.1634, n = 17) 
Est of E[Y|T=1,Recom=1]-E[Y|T=/=1,Recom=1] 
                         1 (SE = 0, n = 8) 

Est of E[Y|Trt received = Trt recom] - E[Y|Trt received =/= Trt recom]:                    
0.704 (SE = 0.1303) 

<===============================================>

family:  binomial 
loss:    logistic_loss_lasso 
method:  weighting 

validation method:  training_test_replication 
cutpoint:           Quant_83 
replications:       3 

benefit score: f(x), 
Trt recom = 1*I(f(x)>c)+0*I(f(x)<=c) where c is 'cutpoint'

Average Test Set Outcomes:
                              Recommended 0     Recommended 1
Received 0 0.5684 (SE = 0.1514, n = 6.3333) 0 (SE = 0, n = 3)
Received 1 0.4308 (SE = 0.065, n = 13.6667) 1 (SE = 0, n = 2)

Treatment effects conditional on subgroups:
Est of E[Y|T=0,Recom=0]-E[Y|T=/=0,Recom=0] 
              0.1377 (SE = 0.1768, n = 20) 
Est of E[Y|T=1,Recom=1]-E[Y|T=/=1,Recom=1] 
                         1 (SE = 0, n = 5) 

Est of E[Y|Trt received = Trt recom] - E[Y|Trt received =/= Trt recom]:                     
0.3255 (SE = 0.1659) 
family:  binomial 
loss:    logistic_loss_lasso 
method:  weighting 

validation method:  training_test_replication 
cutpoint:           Quant_67 
replications:       3 

benefit score: f(x), 
Trt recom = 1*I(f(x)>c)+0*I(f(x)<=c) where c is 'cutpoint'

Average Test Set Outcomes:
                            Recommended 0        Recommended 1
Received 0 0.8827 (SE = 0.1035, 18.6667%) 0 (SE = 0, 18.6667%)
Received 1 0.3673 (SE = 0.0878, 49.3333%) 1 (SE = 0, 13.3333%)

Treatment effects conditional on subgroups:
Est of E[Y|T=0,Recom=0]-E[Y|T=/=0,Recom=0] 
                 0.5154 (SE = 0.1634, 68%) 
Est of E[Y|T=1,Recom=1]-E[Y|T=/=1,Recom=1] 
                           1 (SE = 0, 32%) 

Est of E[Y|Trt received = Trt recom] - E[Y|Trt received =/= Trt recom]:                    
0.704 (SE = 0.1303) 

<===============================================>

family:  binomial 
loss:    logistic_loss_lasso 
method:  weighting 

validation method:  training_test_replication 
cutpoint:           Quant_83 
replications:       3 

benefit score: f(x), 
Trt recom = 1*I(f(x)>c)+0*I(f(x)<=c) where c is 'cutpoint'

Average Test Set Outcomes:
                            Recommended 0   Recommended 1
Received 0 0.5684 (SE = 0.1514, 25.3333%) 0 (SE = 0, 12%)
Received 1  0.4308 (SE = 0.065, 54.6667%)  1 (SE = 0, 8%)

Treatment effects conditional on subgroups:
Est of E[Y|T=0,Recom=0]-E[Y|T=/=0,Recom=0] 
                 0.1377 (SE = 0.1768, 80%) 
Est of E[Y|T=1,Recom=1]-E[Y|T=/=1,Recom=1] 
                           1 (SE = 0, 20%) 

Est of E[Y|Trt received = Trt recom] - E[Y|Trt received =/= Trt recom]:                     
0.3255 (SE = 0.1659) 
family:  cox 
loss:    cox_loss_lasso 
method:  weighting 

validation method:  training_test_replication 
cutpoint:           0 
replications:       3 

benefit score: f(x), 
Trt recom = 1*I(f(x)>c)+0*I(f(x)<=c) where c is 'cutpoint'

Average Test Set Outcomes:
                           Recommended 0                    Recommended 1
Received 0 24.4288 (SE = 30.2323, n = 4)           0 (SE = 0, n = 4.6667)
Received 1        0 (SE = 0, n = 7.6667) 1.1995 (SE = 0.7975, n = 8.6667)

Treatment effects conditional on subgroups:
Est of E[Y|T=0,Recom=0]-E[Y|T=/=0,Recom=0] 
       24.4288 (SE = 30.2323, n = 11.6667) 
Est of E[Y|T=1,Recom=1]-E[Y|T=/=1,Recom=1] 
         1.1995 (SE = 0.7975, n = 13.3333) 

Est of 
E[Y|Trt received = Trt recom] - E[Y|Trt received =/= Trt recom]:                       
14.7452 (SE = 12.9127) 
family:  cox 
loss:    cox_loss_lasso 
method:  weighting 

validation method:  boot_bias_correction 
cutpoint:           0 
replications:       3 

benefit score: f(x), 
Trt recom = 1*I(f(x)>c)+0*I(f(x)<=c) where c is 'cutpoint'

Average Bootstrap Bias-Corrected Outcomes:
                                 Recommended 0                    Recommended 1
Received 0 31.5529 (SE = 16.8759, n = 17.3333)          0 (SE = 0, n = 25.6667)
Received 1             0 (SE = 0, n = 31.6667) 1.8959 (SE = 0.891, n = 25.3333)

Treatment effects conditional on subgroups:
Est of E[Y|T=0,Recom=0]-E[Y|T=/=0,Recom=0] 
            31.5529 (SE = 16.8759, n = 49) 
Est of E[Y|T=1,Recom=1]-E[Y|T=/=1,Recom=1] 
               1.8959 (SE = 0.891, n = 51) 

Est of 
E[Y|Trt received = Trt recom] - E[Y|Trt received =/= Trt recom]:                       
34.3274 (SE = 11.4573) 
family:    gaussian 
loss:      sq_loss_lasso 
method:    weighting 
cutpoint:  0 
propensity 
function:  propensity.func 

benefit score: f_2(x): 2 vs 1,  f_3(x): 3 vs 1 
               f_1(x): 0 
maxval = max(f_2(x), f_3(x)) 
which.max(maxval) = The trt level which maximizes maxval
Trt recom = which.max(maxval)*I(maxval > c) + 1*I(maxval <= c) where c is 'cutpoint'

Average Outcomes:
              Recommended 1    Recommended 2     Recommended 3
Received 1  19.7513 (n = 4) 15.9236 (n = 28)   23.9965 (n = 1)
Received 2 -13.9114 (n = 2)  31.9898 (n = 6) -15.5207 (n = 34)
Received 3 -28.2337 (n = 5) -41.1735 (n = 6)  29.1472 (n = 14)

Treatment effects conditional on subgroups:
Est of E[Y|T=1,Recom=1]-E[Y|T=/=1,Recom=1] 
                          41.5168 (n = 11) 
Est of E[Y|T=2,Recom=2]-E[Y|T=/=2,Recom=2] 
                          30.0126 (n = 40) 
Est of E[Y|T=3,Recom=3]-E[Y|T=/=3,Recom=3] 
                          41.6508 (n = 49) 

NOTE: The above average outcomes are biased estimates of
      the expected outcomes conditional on subgroups. 
      Use 'validate.subgroup()' to obtain unbiased estimates.

---------------------------------------------------

Benefit score 1 quantiles (f(X) for 2 vs 1): 
     0%     25%     50%     75%    100% 
-52.419 -18.669  -1.927  13.652  61.772 

Benefit score 2 quantiles (f(X) for 3 vs 1): 
      0%      25%      50%      75%     100% 
-103.787  -30.817    3.594   34.699  105.366 

---------------------------------------------------

Summary of individual treatment effects: 
E[Y|T=trt, X] - E[Y|T=1, X]
where 'trt' is 2 and 3

     2-vs-1             3-vs-1        
 Min.   :-104.839   Min.   :-207.573  
 1st Qu.: -37.338   1st Qu.: -61.633  
 Median :  -3.855   Median :   7.188  
 Mean   :  -1.162   Mean   :   2.327  
 3rd Qu.:  27.303   3rd Qu.:  69.399  
 Max.   : 123.544   Max.   : 210.733  
family:  gaussian 
loss:    sq_loss_lasso 
method:  weighting 

validation method:  training_test_replication 
cutpoint:           0 
replications:       2 

benefit score: f_2(x): 2 vs 1,  f_3(x): 3 vs 1 
               f_1(x): 0 
maxval = max(f_2(x), f_3(x)) 
which.max(maxval) = The trt level which maximizes maxval
Trt recom = which.max(maxval)*I(maxval > c) + 1*I(maxval <= c) where c is 'cutpoint'

Average Test Set Outcomes:
                            Recommended 1                   Recommended 2
Received 1  10.9365 (SE = 18.2443, n = 2) 15.0693 (SE = 11.5189, n = 6.5)
Received 2           NaN (SE = NA, n = 0)        18.6765 (SE = NA, n = 1)
Received 3 -25.872 (SE = 4.9554, n = 1.5)      17.1442 (SE = NA, n = 0.5)
                            Recommended 3
Received 1     23.9965 (SE = NA, n = 0.5)
Received 2 -18.3166 (SE = 3.866, n = 9.5)
Received 3 45.3389 (SE = 5.7727, n = 3.5)

Treatment effects conditional on subgroups:
Est of E[Y|T=1,Recom=1]-E[Y|T=/=1,Recom=1] 
           36.8085 (SE = 23.1997, n = 3.5) 
Est of E[Y|T=2,Recom=2]-E[Y|T=/=2,Recom=2] 
                  11.7522 (SE = NA, n = 8) 
Est of E[Y|T=3,Recom=3]-E[Y|T=/=3,Recom=3] 
           58.4252 (SE = 9.3034, n = 13.5) 

Est of 
E[Y|Trt received = Trt recom] - E[Y|Trt received =/= Trt recom]:                      
39.3959 (SE = 13.392) 
family:  gaussian 
loss:    sq_loss_lasso 
method:  weighting 

validation method:  training_test_replication 
cutpoint:           Quant_33 
replications:       2 

benefit score: f_2(x): 2 vs 1,  f_3(x): 3 vs 1 
               f_1(x): 0 
maxval = max(f_2(x), f_3(x)) 
which.max(maxval) = The trt level which maximizes maxval
Trt recom = which.max(maxval)*I(maxval > c) + 1*I(maxval <= c) where c is 'cutpoint'

Average Test Set Outcomes:
                         Recommended 1                  Recommended 2
Received 1        NaN (SE = NA, n = 0) 16.7838 (SE = 1.8647, n = 8.5)
Received 2        NaN (SE = NA, n = 0)       18.6765 (SE = NA, n = 1)
Received 3 -44.1777 (SE = NA, n = 0.5)       10.5345 (SE = NA, n = 1)
                            Recommended 3
Received 1     23.9965 (SE = NA, n = 0.5)
Received 2 -18.3166 (SE = 3.866, n = 9.5)
Received 3   43.4349 (SE = 3.0802, n = 4)

Treatment effects conditional on subgroups:
Est of E[Y|T=1,Recom=1]-E[Y|T=/=1,Recom=1] 
                    NaN (SE = NA, n = 0.5) 
Est of E[Y|T=2,Recom=2]-E[Y|T=/=2,Recom=2] 
                3.2112 (SE = NA, n = 10.5) 
Est of E[Y|T=3,Recom=3]-E[Y|T=/=3,Recom=3] 
             56.5212 (SE = 6.6108, n = 14) 

Est of E[Y|Trt received = Trt recom] - E[Y|Trt received =/= Trt recom]:                      
39.5889 (SE = 4.6787) 

<===============================================>

family:  gaussian 
loss:    sq_loss_lasso 
method:  weighting 

validation method:  training_test_replication 
cutpoint:           Quant_67 
replications:       2 

benefit score: f_2(x): 2 vs 1,  f_3(x): 3 vs 1 
               f_1(x): 0 
maxval = max(f_2(x), f_3(x)) 
which.max(maxval) = The trt level which maximizes maxval
Trt recom = which.max(maxval)*I(maxval > c) + 1*I(maxval <= c) where c is 'cutpoint'

Average Test Set Outcomes:
                              Recommended 1                Recommended 2
Received 1     17.8705 (SE = 2.6598, n = 4) 17.1043 (SE = 11.472, n = 5)
Received 2         18.6765 (SE = NA, n = 1)         NaN (SE = NA, n = 0)
Received 3 -12.0197 (SE = 24.5456, n = 3.5)         NaN (SE = NA, n = 0)
                            Recommended 3
Received 1           NaN (SE = NA, n = 0)
Received 2 -18.3166 (SE = 3.866, n = 9.5)
Received 3   52.8108 (SE = 4.7941, n = 2)

Treatment effects conditional on subgroups:
Est of E[Y|T=1,Recom=1]-E[Y|T=/=1,Recom=1] 
           19.2468 (SE = 12.1534, n = 8.5) 
Est of E[Y|T=2,Recom=2]-E[Y|T=/=2,Recom=2] 
                      NaN (SE = NA, n = 5) 
Est of E[Y|T=3,Recom=3]-E[Y|T=/=3,Recom=3] 
           71.1273 (SE = 8.6602, n = 11.5) 

Est of E[Y|Trt received = Trt recom] - E[Y|Trt received =/= Trt recom]:                      
42.2847 (SE = 5.1894) 
family:    gaussian 
loss:      sq_loss_lasso 
method:    weighting 
cutpoint:  0 
propensity 
function:  propensity.func 

benefit score: f_2(x): 2 vs 1,  f_3(x): 3 vs 1 
               f_1(x): 0 
minval = min(f_2(x), f_3(x)) 
which.min(minval) = The trt level which minimizes minval
Trt recom = which.min(minval)*I(minval < c) + 1*I(minval >= c) where c is 'cutpoint'

Average Outcomes:
              Recommended 1     Recommended 2    Recommended 3
Received 1 -12.4319 (n = 3)   23.9965 (n = 1) 20.0737 (n = 29)
Received 2  16.5515 (n = 8) -23.5188 (n = 28)  24.3617 (n = 6)
Received 3  41.8225 (n = 2)  18.1545 (n = 14) -39.4779 (n = 9)

Treatment effects conditional on subgroups:
Est of E[Y|T=1,Recom=1]-E[Y|T=/=1,Recom=1] 
                         -44.6999 (n = 13) 
Est of E[Y|T=2,Recom=2]-E[Y|T=/=2,Recom=2] 
                         -42.1553 (n = 43) 
Est of E[Y|T=3,Recom=3]-E[Y|T=/=3,Recom=3] 
                         -61.4123 (n = 44) 

NOTE: The above average outcomes are biased estimates of
      the expected outcomes conditional on subgroups. 
      Use 'validate.subgroup()' to obtain unbiased estimates.

---------------------------------------------------

Benefit score 1 quantiles (f(X) for 2 vs 1): 
     0%     25%     50%     75%    100% 
-52.058 -18.664  -2.026  13.390  61.143 

Benefit score 2 quantiles (f(X) for 3 vs 1): 
      0%      25%      50%      75%     100% 
-103.661  -30.787    3.412   34.372  104.861 

---------------------------------------------------

Summary of individual treatment effects: 
E[Y|T=trt, X] - E[Y|T=1, X]
where 'trt' is 2 and 3

     2-vs-1             3-vs-1        
 Min.   :-104.117   Min.   :-207.321  
 1st Qu.: -37.327   1st Qu.: -61.574  
 Median :  -4.051   Median :   6.825  
 Mean   :  -1.278   Mean   :   2.121  
 3rd Qu.:  26.780   3rd Qu.:  68.744  
 Max.   : 122.285   Max.   : 209.722  
family:  gaussian 
loss:    sq_loss_lasso 
method:  weighting 

validation method:  training_test_replication 
cutpoint:           0 
replications:       2 

benefit score: f_2(x): 2 vs 1,  f_3(x): 3 vs 1 
               f_1(x): 0 
minval = min(f_2(x), f_3(x)) 
which.min(minval) = The trt level which minimizes minval
Trt recom = which.min(minval)*I(minval < c) + 1*I(minval >= c) where c is 'cutpoint'

Average Test Set Outcomes:
                            Recommended 1                   Recommended 2
Received 1    -12.4319 (SE = NA, n = 1.5)            NaN (SE = NA, n = 0)
Received 2 21.007 (SE = 17.8902, n = 1.5) -24.8481 (SE = 3.3736, n = 7.5)
Received 3        17.1442 (SE = 0, n = 1) 35.5173 (SE = 22.6333, n = 2.5)
                           Recommended 3
Received 1  24.4409 (SE = 4.4403, n = 7)
Received 2      42.5672 (SE = NA, n = 1)
Received 3 -43.2845 (SE = 7.6407, n = 3)

Treatment effects conditional on subgroups:
Est of E[Y|T=1,Recom=1]-E[Y|T=/=1,Recom=1] 
                 -31.3354 (SE = NA, n = 4) 
Est of E[Y|T=2,Recom=2]-E[Y|T=/=2,Recom=2] 
           -60.3653 (SE = 26.0069, n = 10) 
Est of E[Y|T=3,Recom=3]-E[Y|T=/=3,Recom=3] 
           -72.1813 (SE = 18.3825, n = 11) 

Est of 
E[Y|Trt received = Trt recom] - E[Y|Trt received =/= Trt recom]:                       
-60.6921 (SE = 3.7457) 
family:  gaussian 
loss:    sq_loss_lasso 
method:  weighting 

validation method:  training_test_replication 
cutpoint:           Quant_33 
replications:       2 

benefit score: f_2(x): 2 vs 1,  f_3(x): 3 vs 1 
               f_1(x): 0 
minval = min(f_2(x), f_3(x)) 
which.min(minval) = The trt level which minimizes minval
Trt recom = which.min(minval)*I(minval < c) + 1*I(minval >= c) where c is 'cutpoint'

Average Test Set Outcomes:
                            Recommended 1                   Recommended 2
Received 1 0.1205 (SE = 17.7517, n = 2.5)            NaN (SE = NA, n = 0)
Received 2 -2.8543 (SE = 0.4226, n = 3.5) -28.3557 (SE = 3.2631, n = 5.5)
Received 3  31.1302 (SE = 15.1212, n = 2)   3.6383 (SE = 8.0984, n = 1.5)
                           Recommended 3
Received 1  27.5455 (SE = 8.8308, n = 6)
Received 2      42.5672 (SE = NA, n = 1)
Received 3 -43.2845 (SE = 7.6407, n = 3)

Treatment effects conditional on subgroups:
Est of E[Y|T=1,Recom=1]-E[Y|T=/=1,Recom=1] 
            -23.4634 (SE = 33.2635, n = 8) 
Est of E[Y|T=2,Recom=2]-E[Y|T=/=2,Recom=2] 
              -31.994 (SE = 4.8353, n = 7) 
Est of E[Y|T=3,Recom=3]-E[Y|T=/=3,Recom=3] 
           -73.7928 (SE = 20.6616, n = 10) 

Est of E[Y|Trt received = Trt recom] - E[Y|Trt received =/= Trt recom]:                       
-54.5619 (SE = 1.9297) 

<===============================================>

family:  gaussian 
loss:    sq_loss_lasso 
method:  weighting 

validation method:  training_test_replication 
cutpoint:           Quant_67 
replications:       2 

benefit score: f_2(x): 2 vs 1,  f_3(x): 3 vs 1 
               f_1(x): 0 
minval = min(f_2(x), f_3(x)) 
which.min(minval) = The trt level which minimizes minval
Trt recom = which.min(minval)*I(minval < c) + 1*I(minval >= c) where c is 'cutpoint'

Average Test Set Outcomes:
                  Recommended 1                   Recommended 2
Received 1 NaN (SE = NA, n = 0)            NaN (SE = NA, n = 0)
Received 2 NaN (SE = NA, n = 0)  -18.496 (SE = 5.7219, n = 8.5)
Received 3 NaN (SE = NA, n = 0) 35.5173 (SE = 22.6333, n = 2.5)
                             Recommended 3
Received 1 19.3597 (SE = 11.6262, n = 8.5)
Received 2      40.7872 (SE = NA, n = 1.5)
Received 3    -6.2096 (SE = 7.7322, n = 4)

Treatment effects conditional on subgroups:
Est of E[Y|T=1,Recom=1]-E[Y|T=/=1,Recom=1] 
                      NaN (SE = NA, n = 0) 
Est of E[Y|T=2,Recom=2]-E[Y|T=/=2,Recom=2] 
           -54.0133 (SE = 28.3552, n = 11) 
Est of E[Y|T=3,Recom=3]-E[Y|T=/=3,Recom=3] 
           -29.5908 (SE = 25.0456, n = 14) 

Est of E[Y|Trt received = Trt recom] - E[Y|Trt received =/= Trt recom]:                       
-42.9246 (SE = 2.8594) 

Hyperbolic Tangent kernel function. 
 Hyperparameters : scale =  1  offset =  1 

                                   
C                    1.0000 10.0000
CV weighted accuracy 0.3839  0.3521
[ FAIL 1 | WARN 2 | SKIP 0 | PASS 237 ]

══ Failed tests ════════════════════════════════════════════════════════════════
── Error ('test-fitsubgroup.R:1083:9'): test fit.subgroup with augment.func for continuous outcomes and various losses ──
Error in `xgb.Booster.handle(params, list(dtrain, dtest))`: argument "modelfile" is missing, with no default
Backtrace:
     ▆
  1. ├─testthat::expect_warning(...) at test-fitsubgroup.R:1083:8
  2. │ └─testthat:::quasi_capture(...)
  3. │   ├─testthat (local) .capture(...)
  4. │   │ └─base::withCallingHandlers(...)
  5. │   └─rlang::eval_bare(quo_get_expr(.quo), quo_get_env(.quo))
  6. └─personalized::fit.subgroup(...)
  7.   ├─base::do.call(...)
  8.   └─personalized:::fit_sq_loss_xgboost(...)
  9.     ├─base::do.call(...)
 10.     └─personalized (local) `<fn>`(...)
 11.       └─base::lapply(...)
 12.         └─personalized (local) FUN(X[[i]], ...)
 13.           └─xgboost:::xgb.Booster.handle(params, list(dtrain, dtest))

[ FAIL 1 | WARN 2 | SKIP 0 | PASS 237 ]
Error: Test failures
Execution halted
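
Note on the personalized failures: both the vignette and the test error come from a direct call into xgboost's internal constructor, `xgboost:::xgb.Booster.handle(params, list(dtrain, dtest))`, which now errors because `modelfile` is required ("argument 'modelfile' is missing, with no default"). Internal (`:::`) helpers carry no stability guarantee across releases; a minimal sketch (not the personalized sources, using the toy agaricus data shipped with the xgboost R package) of the documented entry point that avoids the breakage:

```r
library(xgboost)

data(agaricus.train, package = "xgboost")
dtrain <- xgb.DMatrix(agaricus.train$data, label = agaricus.train$label)

# xgb.train() constructs and trains the booster through the public API,
# so it is unaffected by changes to xgb.Booster.handle()'s signature.
bst <- xgb.train(
  params  = list(objective = "binary:logistic", max_depth = 2, nthread = 1),
  data    = dtrain,
  nrounds = 5
)
```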

Package: pmml
Check: tests
New result: ERROR
Running ‘testthat.R’ [30s/30s]
Running the tests in ‘tests/testthat.R’ failed.
Complete output:
> library(testthat)
> library(pmml, quietly = T)
>
> test_check("pmml")
[ FAIL 9 | WARN 1 | SKIP 51 | PASS 368 ]

══ Skipped tests (51) ══════════════════════════════════════════════════════════
• On CRAN (48): 'test_pmml.iForest.R:6:3',
  'test_pmml_integration_ARIMA.R:109:3', 'test_pmml_integration_ARIMA.R:184:3',
  'test_pmml_integration_ARIMA.R:269:3',
  'test_pmml_integration_e1071_svm.R:27:3',
  'test_pmml_integration_e1071_svm.R:275:3', 'test_pmml_integration_lm.R:13:3',
  'test_pmml_integration_lm.R:123:3', 'test_pmml_integration_lm.R:175:3',
  'test_pmml_integration_other.R:120:3', 'test_pmml_integration_other.R:167:3',
  'test_pmml_integration_other.R:265:3', 'test_pmml_integration_other.R:439:3',
  'test_pmml_integration_other.R:607:3', 'test_pmml_integration_other.R:692:3',
  'test_pmml_integration_other.R:849:3',
  'test_pmml_integration_other.R:1062:3',
  'test_pmml_integration_other.R:1322:3',
  'test_pmml_integration_other.R:1442:3',
  'test_pmml_integration_other.R:1547:3',
  'test_pmml_integration_other.R:1633:3',
  'test_pmml_integration_other.R:1822:3',
  'test_pmml_integration_transformations.R:19:3',
  'test_pmml_integration_transformations.R:319:3',
  'test_pmml_integration_transformations.R:354:3',
  'test_pmml_integration_transformations.R:377:3',
  'test_pmml_integration_transformations.R:407:3',
  'test_pmml_integration_transformations.R:469:3',
  'test_pmml_integration_xgboost.R:21:3', 'test_schema_validation.R:135:3',
  'test_schema_validation.R:183:3', 'test_schema_validation.R:204:3',
  'test_schema_validation.R:248:3', 'test_schema_validation.R:343:3',
  'test_schema_validation.R:426:3', 'test_schema_validation.R:458:3',
  'test_schema_validation.R:500:3', 'test_schema_validation.R:603:3',
  'test_schema_validation.R:795:3', 'test_schema_validation.R:933:3',
  'test_schema_validation.R:1008:3', 'test_schema_validation.R:1045:3',
  'test_schema_validation.R:1077:3', 'test_schema_validation.R:1146:3',
  'test_schema_validation.R:1193:3', 'test_schema_validation.R:1429:3',
  'test_schema_validation.R:1510:3', 'test_schema_validation.R:1540:3'
• skip (2): 'test_pmml_integration_lm.R:147:3',
  'test_pmml_integration_transformations.R:439:3'
• skip until export issue is resolved (1): 'test_pmml.nnet.R:66:3'

══ Failed tests ════════════════════════════════════════════════════════════════
── Failure ('test_pmml.miningschema.R:44:3'): invalidValueTreatment attribute is exported correctly for xgboost models ──
`ms2` not equal to c(...).
2/12 mismatches
x[4]: "spore-print-color"
y[4]: "stalk-root"

x[7]: "stalk-root"
y[7]: "spore-print-color"
── Failure ('test_pmml.miningschema.R:50:3'): invalidValueTreatment attribute is exported correctly for xgboost models ──
`ms3` not equal to c(...).
2/12 mismatches
x[4]: "spore-print-color"
y[4]: "stalk-root"

x[7]: "stalk-root"
y[7]: "spore-print-color"
── Failure ('test_pmml.miningschema.R:56:3'): invalidValueTreatment attribute is exported correctly for xgboost models ──
`ms4` not equal to c(...).
2/12 mismatches
x[4]: "spore-print-color"
y[4]: "stalk-root"

x[7]: "stalk-root"
y[7]: "spore-print-color"
── Failure ('test_pmml.miningschema.R:78:3'): invalidValueTreatment attribute is exported correctly for xgboost models ──
unlist(ms22) not equal to c(...).
2/12 mismatches
x[4]: "spore-print-color"
y[4]: "stalk-root"

x[7]: "stalk-root"
y[7]: "spore-print-color"
── Failure ('test_pmml.miningschema.R:84:3'): invalidValueTreatment attribute is exported correctly for xgboost models ──
`ms23` not equal to c(...).
2/12 mismatches
x[4]: "spore-print-color"
y[4]: "stalk-root"

x[7]: "stalk-root"
y[7]: "spore-print-color"
── Failure ('test_pmml.miningschema.R:90:3'): invalidValueTreatment attribute is exported correctly for xgboost models ──
`ms24` not equal to c(...).
2/12 mismatches
x[4]: "spore-print-color"
y[4]: "stalk-root"

x[7]: "stalk-root"
y[7]: "spore-print-color"
── Failure ('test_pmml.miningschema.R:111:3'): invalidValueTreatment attribute is exported correctly for xgboost models ──
unlist(ms32) not equal to c(...).
2/12 mismatches
x[4]: "spore-print-color"
y[4]: "stalk-root"

x[7]: "stalk-root"
y[7]: "spore-print-color"
── Failure ('test_pmml.miningschema.R:118:3'): invalidValueTreatment attribute is exported correctly for xgboost models ──
`ms33` not equal to c(...).
2/12 mismatches
x[4]: "spore-print-color"
y[4]: "stalk-root"

x[7]: "stalk-root"
y[7]: "spore-print-color"
── Failure ('test_pmml.miningschema.R:124:3'): invalidValueTreatment attribute is exported correctly for xgboost models ──
`ms34` not equal to c(...).
2/12 mismatches
x[4]: "spore-print-color"
y[4]: "stalk-root"

x[7]: "stalk-root"
y[7]: "spore-print-color"

[ FAIL 9 | WARN 1 | SKIP 51 | PASS 368 ]
Error: Test failures
Execution halted
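
Note on the pmml failures: in every case the exported MiningSchema contains the same field names and only positions 4 and 7 (`spore-print-color` and `stalk-root`) are swapped, which suggests xgboost 2.0 reports the model's features in a different order. If field order is not semantically meaningful for the exported PMML, an order-insensitive comparison passes under both versions; a minimal sketch with placeholder vectors, not the pmml test code:

```r
library(testthat)

ms2             <- c("f1", "f2", "f3", "spore-print-color", "f5", "f6", "stalk-root")
expected_fields <- c("f1", "f2", "f3", "stalk-root", "f5", "f6", "spore-print-color")

expect_setequal(ms2, expected_fields)            # order-insensitive, passes
expect_equal(sort(ms2), sort(expected_fields))   # equivalent alternative
```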

@trivialfis
Copy link
Member Author

We will delay the R package release to 2.1. In the meantime, I will prepare a patch release for https://github.com/dmlc/xgboost/projects/12. Thank you to everyone who has participated!

@jameslamb
Copy link
Contributor

Sounds good, @ me any time for help. Thanks so much for your scripts above, they'll be really helpful for us over in LightGBM as well.
