conceptual replication of temporal cheating analysis #29

Open
allanjust opened this issue Dec 17, 2022 · 1 comment

Comments

@allanjust
Member

allanjust commented Dec 17, 2022

Building on the primary (simplest) temporal cheating analysis in this publication by Eric Zou 2021:

  1. Define the dataset:
    The first step is to identify the subset of AQS sites that sample on a 1-in-6-day schedule and pull out the collocated dataset (MCD19A2 raw and corrected) for the relevant years. We could either take the same set of sites Zou used from his replication dataset or build it ourselves. He discusses his strategy on page 8. I think we could examine the modal sampling interval of 88101 measurements for each monitor at a unique site in each year (to allow for monitoring approaches changing over time) and then classify that site-year as having 1-in-6, 1-in-3, or 1-in-1 monitoring, or a mixture of frequencies.

  2. Regression strategy (using raw versus corrected MAIAC AOD)
    from Page 9 of the publication:

$$\text{Aerosol}_{st} = \beta \cdot \mathbf{1}(\text{Off-days}_{t}) + \text{Time}_{t} + \alpha_{s} + X_{st}\gamma + \varepsilon_{st} \tag{1}$$

where $\text{Aerosol}_{st}$ is the logged satellite aerosol concentration at monitoring site $s$ at time $t$. $\mathbf{1}(\text{Off-days}_{t})$ is a dummy variable that indicates days when monitoring is scheduled off (five out of six days of the monitoring cycle). The key coefficient of interest is $\beta$, which represents the gap in pollution levels between an average off-day and an average on-day. The strictly 1-in-6-day cyclicality in the $\mathbf{1}(\text{Off-days}_{t})$ variable implies that very few confounders may bias the OLS estimate $\hat{\beta}$ from identifying the causal effect of the monitoring schedule. To confirm this point, I report results from two types of specifications. In the first, I report regressions conditional on no covariates, so that $\hat{\beta}$ is simply the raw difference between off-days and on-days.
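
The two steps above could be sketched roughly as follows, using only the Python standard library. This is an illustration under stated assumptions, not the method from the paper or from this repository: the function names, the input shapes (a collection of sample dates per site-year, a date-to-AOD mapping), and the `min_share` threshold for calling a site-year "mixed" are all hypothetical. The second function covers only the no-covariate specification, where the estimate reduces to a difference in mean log AOD between off-days and on-days; it ignores site fixed effects, time controls, and standard errors.

```python
from collections import Counter
from datetime import date, timedelta
from math import log
from statistics import mean


def classify_site_year(sample_dates, min_share=0.8):
    """Label one site-year as "1-in-1", "1-in-3", "1-in-6", or "mixed".

    The label comes from the modal gap (in days) between consecutive
    sample dates. min_share is an assumed threshold: the modal gap must
    account for at least this share of all gaps, otherwise "mixed".
    """
    days = sorted(set(sample_dates))
    gaps = [(b - a).days for a, b in zip(days, days[1:])]
    if not gaps:
        return "mixed"  # fewer than two samples: no interval to measure
    interval, count = Counter(gaps).most_common(1)[0]
    if interval in (1, 3, 6) and count / len(gaps) >= min_share:
        return f"1-in-{interval}"
    return "mixed"


def off_day_gap(aod_by_date, on_days):
    """Mean log AOD on off-days minus mean log AOD on on-days.

    aod_by_date maps date -> AOD for one site; on_days is the set of
    scheduled monitoring dates. Non-positive AOD values are skipped
    because their logarithm is undefined.
    """
    on = [log(v) for d, v in aod_by_date.items() if d in on_days and v > 0]
    off = [log(v) for d, v in aod_by_date.items() if d not in on_days and v > 0]
    return mean(off) - mean(on)
```

A positive value from `off_day_gap` would mean higher satellite AOD on unmonitored days. Running it once with raw and once with corrected MAIAC AOD gives the comparison described in step 2.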

@Kodiologist
Member

I've left behind some aborted work on this in the `cheating` branch.
