propensityml

Overview

This R package supports the SKKU modern statistical methods project. It is based on the paper:

Lee, B. K., Lessler, J., & Stuart, E. A. (2010). Improving propensity score weighting using machine learning. Statistics in Medicine, 29(3), 337–346. doi:10.1002/sim.3782

Installation

# install.packages("remotes")
remotes::install_github("ygeunkim/propensityml")

Usage

The propensityml package aims to estimate propensity scores with machine learning methods, as in the paper mentioned above.

library(propensityml)

The package provides a simulation function that generates the dataset used in the paper:

Setoguchi, S., Schneeweiss, S., Brookhart, M. A., Glynn, R. J., & Cook, E. F. (2008). Evaluating uses of data mining techniques in propensity score estimation: a simulation study. Pharmacoepidemiology and Drug Safety, 17(6), 546–555. doi:10.1002/pds.1555

and additional toy datasets. Consider the simulation.

The simplest scenario, i.e. the additive and linear model:

(x <- sim_outcome(1000, covmat = build_covariate()))
#>       w1      w2 w3     w4 w5 w6      w7 w8 w9    w10 exposure       y
#>    1:  0 -0.0234  1  1.397  0  0 -0.2294  0  1 -0.865        0 -101.50
#>    2:  0 -0.8632  0 -0.144  1  0  1.1669  1  0  2.907        1    1.41
#>    3:  1 -2.4124  1 -0.224  0  0 -1.7911  1  0  1.695        0  174.52
#>    4:  1 -0.7639  0 -0.838  1  0 -1.1463  0  0 -1.809        0  -65.72
#>    5:  1  1.1810  0  0.352  0  1  0.8446  0  1 -0.344        0   -3.22
#>   ---                                                                 
#>  996:  0  0.9541  0  1.736  1  1  2.3953  0  1 -0.510        1  -73.39
#>  997:  1 -1.0332  0  0.703  0  0 -0.1423  0  1 -0.143        1    6.69
#>  998:  1 -2.3697  1  0.068  1  0  0.4174  0  0  0.291        0   -2.68
#>  999:  1  0.8074  1 -0.834  0  1 -0.7798  0  0 -0.430        0   -3.21
#> 1000:  1 -0.3249  0  1.176  0  1 -0.0487  0  1 -0.394        1  -88.73
#>       exposure_prob
#>    1:         0.500
#>    2:         0.849
#>    3:         0.588
#>    4:         0.538
#>    5:         0.104
#>   ---              
#>  996:         0.992
#>  997:         0.352
#>  998:         0.155
#>  999:         0.168
#> 1000:         0.935
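
To make "additive and linear" concrete, here is a minimal base R sketch of that kind of exposure model. It does not call propensityml, and the covariates, coefficients, and the data frame name `toy` are hypothetical choices for illustration, not the values used by sim_outcome() or in Setoguchi et al. (2008).

```r
# Illustrative sketch only: covariates and coefficients are hypothetical,
# not the ones used by sim_outcome() or by Setoguchi et al. (2008).
set.seed(1)
n <- 1000

# Three covariates: one binary, two continuous
w1 <- rbinom(n, size = 1, prob = 0.5)
w2 <- rnorm(n)
w3 <- rnorm(n)

# Additive and linear: the log-odds of exposure is a weighted sum of
# main effects, with no interaction or polynomial terms
beta <- c(w1 = 0.8, w2 = -0.25, w3 = 0.6)
linear_predictor <- beta[["w1"]] * w1 + beta[["w2"]] * w2 + beta[["w3"]] * w3

# The inverse logit maps the log-odds to a probability in (0, 1)
exposure_prob <- plogis(linear_predictor)

# Draw the binary exposure from its own probability
exposure <- rbinom(n, size = 1, prob = exposure_prob)

toy <- data.frame(w1, w2, w3, exposure, exposure_prob)
head(toy)
```

A non-additive or non-linear scenario would add product terms (such as `w1 * w2`) or squared terms to `linear_predictor`.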

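Fit a random forest propensity score model with ps_rf, excluding the outcome y and the simulated exposure probability exposure_prob from the predictors:

(fit_rf <-
  x %>%
  ps_rf(exposure ~ . - y - exposure_prob, data = .))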
#> 
#> Call:
#>  randomForest(formula = formula, data = .) 
#>                Type of random forest: classification
#>                      Number of trees: 500
#> No. of variables tried at each split: 3
#> 
#>         OOB estimate of  error rate: 47.2%
#> Confusion matrix:
#>     0   1 class.error
#> 0 266 232       0.466
#> 1 240 262       0.478
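
For comparison, Lee et al. (2010) evaluate machine learning methods against main-effects logistic regression. The sketch below fits such a baseline in base R. It does not use propensityml, and the data frame `toy`, its covariates, and its coefficients are assumptions made up for illustration.

```r
# Hypothetical toy data: two covariates and a binary exposure
set.seed(2)
n <- 500
toy <- data.frame(w1 = rbinom(n, 1, 0.5), w2 = rnorm(n))
toy$exposure <- rbinom(n, 1, plogis(0.5 * toy$w1 - 0.7 * toy$w2))

# Main-effects logistic regression of exposure on the covariates
fit_glm <- glm(exposure ~ w1 + w2, data = toy, family = binomial())

# Fitted probabilities of exposure serve as the estimated propensity scores
ps_glm <- predict(fit_glm, type = "response")
summary(ps_glm)
```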

The package defines a class named propmod for fitted propensity score models:

class(fit_rf)
#> [1] "propmod"

Estimate the propensity scores:

estimate_ps(fit_rf) %>% head()
#>     1     2     3     4     5     6 
#> 0.433 0.584 0.500 0.548 0.446 0.364
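
Estimated propensity scores are typically turned into weights, as in the propensity score weighting studied by Lee et al. (2010). The following is a minimal base R sketch of inverse probability weighting, not part of the propensityml API. The simulated data, the true effect of 2, and the names `ps`, `ipw`, and `ate_hat` are hypothetical; the vector `ps` stands in for scores such as those returned by estimate_ps().

```r
# Hypothetical data with a known treatment effect of 2
set.seed(3)
n <- 2000
w <- rnorm(n)
exposure <- rbinom(n, 1, plogis(0.5 * w))
y <- 2 * exposure + w + rnorm(n)

# Estimated propensity scores (logistic regression here; any estimator
# that returns P(exposure = 1 | covariates) could be used instead)
ps <- fitted(glm(exposure ~ w, family = binomial()))

# Inverse probability weights: 1 / ps for exposed, 1 / (1 - ps) for unexposed
ipw <- ifelse(exposure == 1, 1 / ps, 1 / (1 - ps))

# The weighted difference in mean outcomes estimates the average treatment effect
ate_hat <- weighted.mean(y[exposure == 1], ipw[exposure == 1]) -
  weighted.mean(y[exposure == 0], ipw[exposure == 0])
ate_hat
```

Because the weights divide by `ps` or `1 - ps`, scores very close to 0 or 1 produce extreme weights, which is one reason the quality of the propensity score estimate matters.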
