Using weights #5

Open
francobeltran opened this issue Apr 22, 2019 · 7 comments
Comments

@francobeltran

Hi, I have been trying to use the weights option with no success. I get the following error:

Error in wfe(y ~ tr + x1 + x2 + :
'C.it' must be a numeric vector with length equal to number of observations

What I did was rename my weight variable to C.it in the main dataset. I also tried defining a separate dataset named C.it that contains only this variable. My weights are integers: the number of observations behind each of the averages I am using at the unit-by-time level. I also tried defining these weights as proportions (the ratio of this number to the total number of observations). Do you have an example of how to work with weights? Or could you please tell me how to incorporate them?

Another thing I realized is that even without weights I need to convert all columns into integers (except for time and year which I set up as factors) for the code to work, else I get the following error:

Error in $<-.data.frame(*tmp*, "W.it", value = numeric(0)) :
replacement has 0 rows, data has 17122

Thank you very much,

@insongkim
Owner

Did you set C.it = "varname", where varname is a character string giving the name of the weight variable in the data frame? Could you use one of the built-in quantities of interest, e.g., qoi = "ate", rather than your own weights? Note that different weights correspond to different quantities of interest, so your quantity of interest might not be clear with arbitrary weights. Thank you very much.
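
For readers following along, here is a minimal sketch of the two options mentioned above. It assumes the argument names documented for `wfe()` (`treat`, `unit.index`, `time.index`, `method`, `qoi`, `C.it`). The toy data and every column name (`unit`, `time`, `tr`, `x1`, `y`, `n_obs`) are hypothetical, and the calls have not been checked against a particular package version.

```r
library(wfe)

# Toy balanced panel; all column names are made up for illustration.
set.seed(1)
df <- expand.grid(unit = 1:20, time = 1:6)
# Alternate treatment so every unit has both treated and control periods.
df$tr <- as.integer((df$unit + df$time) %% 2 == 0)
df$x1 <- rnorm(nrow(df))
df$y  <- 1 + 0.5 * df$tr + 0.2 * df$x1 + rnorm(nrow(df))
# Hypothetical cell sizes behind each unit-time average, used as weights.
df$n_obs <- sample(5:50, nrow(df), replace = TRUE)
df$unit <- factor(df$unit)
df$time <- factor(df$time)

# Option 1: let wfe derive the weights from a quantity of interest.
fit_ate <- wfe(y ~ tr + x1, data = df, treat = "tr",
               unit.index = "unit", time.index = "time",
               method = "unit", qoi = "ate")

# Option 2: pass the *name* of the weight column as a string,
# not the vector itself and not a separate data frame.
fit_w <- wfe(y ~ tr + x1, data = df, treat = "tr",
             unit.index = "unit", time.index = "time",
             method = "unit", C.it = "n_obs")
```

The point of the sketch is only the form of the `C.it` argument: the weights stay in `data` under their own column name, and that name is passed as a character string.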

@francobeltran
Author

francobeltran commented Apr 22, 2019 via email

@insongkim
Owner

The treat variable should be binary. The outcome variable y can be numeric, and so can the control variables. The unit and time indices should be factors, as you noted. I recommend that you start with a simple model containing only the treatment, and then add control variables one at a time to identify what triggers the error. We also include a few examples: please try running example(wfe). Thank you very much.
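
As an illustration of the column types described above, the following base R sketch coerces a raw data frame before it is passed to `wfe()`. The data frame `raw` and its column names are hypothetical.

```r
# Hypothetical raw data in which every column was read in as character.
raw <- data.frame(
  unit = c("a", "a", "b", "b"),
  time = c("2001", "2002", "2001", "2002"),
  tr   = c("0", "1", "0", "0"),
  y    = c("1.2", "2.3", "0.7", "0.9"),
  x1   = c("10", "12", "9", "11"),
  stringsAsFactors = FALSE
)

# Coerce each column to the type the thread recommends.
dat <- transform(
  raw,
  unit = factor(unit),      # unit index as factor
  time = factor(time),      # time index as factor
  tr   = as.integer(tr),    # binary treatment coded 0/1
  y    = as.numeric(y),     # numeric outcome
  x1   = as.numeric(x1)     # numeric control
)

# Fail early if the treatment is not binary or a conversion produced NAs.
stopifnot(all(dat$tr %in% c(0L, 1L)), !anyNA(dat))
str(dat)
```

Checking the types up front makes it easier to tell a data problem apart from a model specification problem when adding controls one at a time.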

@HaoShiming

Dear Professor Kim, I'd like to know whether the "wfe" package is also suitable for other data types, such as cross-sectional data or data that is neither panel, time-series, nor cross-sectional.

Thanks so much !

@insongkim
Owner

@HaoShiming Thanks for using the package. You may use it on cross-sectional data that has a distinct group structure (using one-way wfe), although our discussion of dynamics in the following paper may not apply in that case: http://web.mit.edu/insong/www/pdf/FEmatch.pdf
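
A rough sketch of what a one-way specification on grouped cross-sectional data might look like is given below. It assumes that `method = "unit"` with `time.index` left at its default is accepted for one-way models. The data frame `cs` and its columns (`group`, `tr`, `y`) are hypothetical, and the call is untested against a specific package version.

```r
library(wfe)

# Hypothetical cross-section: 25 groups with 4 observations each,
# two treated and two untreated observations per group.
set.seed(2)
cs <- data.frame(
  group = factor(rep(1:25, each = 4)),
  tr    = rep(c(0L, 1L, 0L, 1L), times = 25)
)
cs$y <- 2 + 0.3 * cs$tr + rnorm(nrow(cs))

# One-way weighted fixed effects: the group plays the role of the unit,
# and no time index is supplied (an assumption about the default).
fit_cs <- wfe(y ~ tr, data = cs, treat = "tr",
              unit.index = "group", method = "unit", qoi = "ate")
```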

@HaoShiming

@insongkim Thanks so much for your reply! However:
(i) I'm still wondering whether wfe is sufficient for handling endogeneity problems (omitted variables, measurement error, etc.) when the data is cross-sectional and no other covariates are included;
(ii) the reason for not including covariates is that they make the estimated ATE unreasonable and hard to interpret;
(iii) neither empirical nor methodological studies have paid enough attention to the use of covariates in causal analysis. I have seen some papers suggest that including covariates is not necessary, such as work on Synthetic Control Methods (SCM; Abadie, Diamond & Hainmueller, 2010) and HCW (Hsiao, Ching & Wan, 2012), while others suggest that we must include covariates. In my experience, not including covariates sometimes gives better ATE estimates in Monte Carlo simulations.
So I'm wondering when we should include covariates, and what the criterion for choosing them is.

Thanks again and sorry for interrupting.

@insongkim
Owner

An important identification assumption for causal inference is conditional ignorability. You want to adjust for the pre-treatment covariates (confounders) such that the potential outcome is independent of the treatment conditional on pre-treatment covariates. I don't think that including fixed effects is sufficient for solving any endogeneity problem if there are other confounders.
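
In symbols, the conditional ignorability assumption described above is commonly written as follows. The notation is an assumption introduced here for illustration: $Y_i(1)$ and $Y_i(0)$ denote the potential outcomes of observation $i$ under treatment and control, $T_i$ the binary treatment, and $X_i$ the pre-treatment covariates.

$$
\{Y_i(1),\, Y_i(0)\} \;\perp\!\!\!\perp\; T_i \mid X_i
$$

Under this reading, fixed effects can absorb only confounders that are constant within a unit (or time period). Any confounder that varies within units has to enter through $X_i$.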


3 participants