Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

IPW for Categorical Exposure with 4 levels #32

Closed
Soudi00 opened this issue Dec 3, 2017 · 1 comment
Closed

IPW for Categorical Exposure with 4 levels #32

Soudi00 opened this issue Dec 3, 2017 · 1 comment

Comments

@Soudi00
Copy link

Soudi00 commented Dec 3, 2017

IPW_for_Categorical_Exposure_with_4_Levels.docx

fitPropensity has error when the exposure is categorical with more than 2 levels (‘max’ not meaningful for factors). Please see the example bellow

stremr version 0.8.99 and Installing stremr package from GitHub and loading the packages


# ----------------------------------------------------------------------
# Instal stremr Version 0.8.99 Data
# ----------------------------------------------------------------------
#knitr::opts_chunk$set(echo = TRUE)

library(devtools)
#install_github("osofr/stremr", ref = "experimental_master")

# ----------------------------------------------------------------------
# Get Libraries
# ----------------------------------------------------------------------
library(stremr)
library(data.table)
library(magrittr)
library(h2o)
options(stremr.verbose=TRUE)
sessionInfo()

R version 3.4.2 (2017-09-28)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows >= 8 x64 (build 9200)

Matrix products: default

locale:
[1] LC_COLLATE=English_United States.1252 LC_CTYPE=English_United States.1252
[3] LC_MONETARY=English_United States.1252 LC_NUMERIC=C
[5] LC_TIME=English_United States.1252

attached base packages:
[1] stats graphics grDevices utils datasets methods base

other attached packages:
[1] rmarkdown_1.8 repmis_0.5 sl3_1.0.0 h2o_3.16.0.2
[5] magrittr_1.5 data.table_1.10.5 devtools_1.13.4 haven_1.1.0
[9] stremr_0.8.99

Get Source Data from another Github repository

Read sampleAD.RData from Soudi00 GitHub repository

library(repmis)

source_data("https://github.com/Soudi00/Multi-Treatment-Causal-Modeling/blob/master/sampleAD.RData?raw=True")

AD = as.data.table(AD, key= c(ID,SEQ))

use importData to prepare the data to get porpensity scores, also define censoring and exposrure regressions

# ----------------------------------------------------------------------
# Import Data
# ----------------------------------------------------------------------
OData.2  <-  importData(AD, ID = "ID", t_name = "SEQ", 
                        covars = c("CAT_VAR1","CAT_VAR2","CONT_VAR1"),           
                        CENS = c("CNS","ADM_CNS"), 
                        TRT = "TRTN",
                        MONITOR = NULL, OUTCOME = "STATUS",
                        weights = NULL, remove_extra_rows = TRUE,
                        verbose = getOption("stremr.verbose"))

# ----------------------------------------------------------------------
# Look at the input data object
# ----------------------------------------------------------------------
print(OData.2)

# ----------------------------------------------------------------------
# Access the input data
# ----------------------------------------------------------------------
get_data(OData.2)

# ----------------------------------------------------------------------
# Regression formula for Right Censoring and Administrative
# Censoring and  Exposure
# ----------------------------------------------------------------------
gform_CENS <- "CNS + ADM_CNS ~ CAT_VAR1 + CONT_VAR1"
gform_TRT = "TRTN ~ CAT_VAR1 + CAT_VAR2 + CONT_VAR1"

#Error in fitPropensity (not meaningful for factors)

I tried different options in fitPropensity but none of them works for categorical with more than 2 levels.


# ----------------------------------------------------------------------
# Estimate Propensity Scores
# fitPRopensity score with all defult option has an error
# ----------------------------------------------------------------------

OData.2 <- fitPropensity(OData.2, gform_CENS = gform_CENS,ngform_TRT = gform_TRT )

Using the default regression formula: TRTN ~ CAT_VAR1_2+CAT_VAR1_3+CAT_VAR1_4+CAT_VAR1_6+CAT_VAR1_7+CAT_VAR2_2+CAT_VAR2_3+CAT_VAR2_4+CAT_VAR2_5+CONT_VAR1
[1] "New 'ModelBinomial' regression defined:"
[1] "P(CNS|CAT_VAR1_2, CAT_VAR1_3, CAT_VAR1_4, CAT_VAR1_6, CAT_VAR1_7, CONT_VAR1);\ outvar.class: binomial;\ Stratify: ;\ N: NA"
[1] "New 'ModelBinomial' regression defined:"
[1] "P(ADM_CNS|CNS, CAT_VAR1_2, CAT_VAR1_3, CAT_VAR1_4, CAT_VAR1_6, CAT_VAR1_7, CONT_VAR1);\ outvar.class: binomial;\ Stratify: CNS == 0;\ N: NA"
[1] "New 'ModelCategorical' regression defined:"
[1] "P(TRTN|CAT_VAR1_2, CAT_VAR1_3, CAT_VAR1_4, CAT_VAR1_6, CAT_VAR1_7, CAT_VAR2_2, CAT_VAR2_3, CAT_VAR2_4, CAT_VAR2_5, CONT_VAR1);\ outvar.class: categorical;\ Stratify: ;\ N: NA"
[1] "fitting the model: P(CNS|CAT_VAR1_2, CAT_VAR1_3, CAT_VAR1_4, CAT_VAR1_6, CAT_VAR1_7, CONT_VAR1);\ outvar.class: binomial;\ Stratify: ;\ N: NA"
[1] "fitting the model: P(ADM_CNS|CNS, CAT_VAR1_2, CAT_VAR1_3, CAT_VAR1_4, CAT_VAR1_6, CAT_VAR1_7, CONT_VAR1);\ outvar.class: binomial;\ Stratify: CNS == 0;\ N: NA"
[1] "fitting the model: P(TRTN|CAT_VAR1_2, CAT_VAR1_3, CAT_VAR1_4, CAT_VAR1_6, CAT_VAR1_7, CAT_VAR2_2, CAT_VAR2_3, CAT_VAR2_4, CAT_VAR2_5, CONT_VAR1);\ outvar.class: categorical;\ Stratify: ;\ N: NA"
Failed on Lrnr_condensier_c("equal.mass", "equal.len", "dhist")_5_20_FALSE_NA_FALSE_NULL
Error in Summary.factor(structure(c(3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, :
‘max’ not meaningful for factors

sl3 error debugging info:
[1] "Error in Summary.factor(structure(c(3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, : \n ‘max’ not meaningful for factors\n"
attr(,"class")
[1] "try-error"
attr(,"condition")
<simpleError in Summary.factor(structure(c(3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 1L, 1L, 2L, 2L, 2L, 1L, 2L, 1L, 1L, 2L, 2L, 1L, 2L, 2L, 2L, 1L, 1L, 2L, 2L, 2L, 1L, 1L, 2L, 2L, 1L, 2L, 2L, 2L, 2L, 1L, 2L, 2L, 2L, 1L, 1L, 1L, 2L, 2L, 1L, 1L, 3L, 3L, 3L, 1L, 1L, 3L, 3L, 3L, 1L, 3L, 3L, 3L, 1L, 3L, 1L, 1L, 3L, 1L, 3L, 1L, 3L, 1L, 1L, 1L, 1L, 3L, 3L, 3L, 3L, 1L, 1L, 3L, 3L, 3L, 1L, 1L, 3L, 1L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 2L, 2L, 3L, 1L, 1L, 1L, 2L, 2L, 3L, 4L, 2L, 2L, 2L, 2L, 3L, 1L, 2L, 3L, 3L, 1L, 1L, 1L, 2L, 2L, 3L, 4L, 3L, 2L, 3L, 1L, 2L, 3L, 2L, 3L, 1L, 1L, 2L, 2L, 2L, 2L, 3L, 1L, 2L, 2L, 3L, 2L, 2L, 2L, 3L, 1L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 1L, 3L, 1L, 1L, 2L, 1L, 3L, 3L, 3L, 1L, 1L, 3L, 3L, 3L, 1L, 3L, 3L, 3L, 3L, 1L, 3L, 1L, 1L, 3L, 1L, 3L, 3L, 3L, 1L, 2L, 1L, 1L, 2L, 1L, 3L, 4L, 2L, 2L, 2L, 2L, 2L, 4L, 2L, 2L, 4L, 4L, 4L, 2L, 2L, 2L, 4L, 2L, 4L, 4L, 4L, 2L, 2L, 2L, 4L, 4L, 4L, 2L, 4L, 2L, 2L, 4L, 2L, 4L, 4L, 4L, 2L, 2L, 2L, 4L, 4L, 2L, 2L, 2L), .Label = c("1", "2", "3", "4"), class = "factor"), na.rm = FALSE): 'max' not meaningful for factors>
...trying to run Lrnr_glm_fast as a backup...
Warning in Ops.factor(y, mu): '-' not meaningful for factors
Warning in Ops.factor(eta, offset): '-' not meaningful for factors
Warning in Ops.factor(y, mu): '-' not meaningful for factors
Warning in Ops.factor(y, mu): '-' not meaningful for factors
Warning in Ops.factor(weights, y): '*' not meaningful for factors
Warning in Ops.factor(y, mu): '-' not meaningful for factors
Warning in Ops.factor(y, mu): '-' not meaningful for factors
Error in private$PsAsW.models[[k_i]]$predictAeqa(newdata = newdata, n = n, : some of the modeling predictions resulted in NAs, which indicates an error in a prediction routine

tried modeling treatment with Gradient Boosting machines same error


# ----------------------------------------------------------------------
# Fitting treatment model with Gradient Boosting machines:
# ----------------------------------------------------------------------
require("h2o")
h2o::h2o.init(nthreads = -1)
gform_CENS <- "CNS + ADM_CNS ~ CAT_VAR1 + CONT_VAR1"
models_TRT <- sl3::Lrnr_h2o_grid$new(algorithm = "gbm")
OData.2 <- fitPropensity(OData.2, gform_CENS = gform_CENS,
                        gform_TRT = gform_TRT,
                        models_TRT = models_TRT)

# Use `H2O-3` distributed implementation of GLM for treatment model estimator:
models_TRT <- sl3::Lrnr_h2o_glm$new(family = "multinomial")
OData.2 <- fitPropensity(OData.2, gform_CENS = gform_CENS,
                        gform_TRT = gform_TRT,
                        models_TRT = models_TRT)

@osofr
Copy link
Collaborator

osofr commented Dec 10, 2017

The previous issue address this I believe. Therefore closing this.

@osofr osofr closed this as completed Dec 10, 2017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants