Additions to experimental module #37

Open
tsrobinson opened this issue Jul 13, 2022 · 1 comment
Labels
new feature New feature or request refine Improvements to code short of a bug fix

Comments

@tsrobinson
Owner

tsrobinson commented Jul 13, 2022

Thanks to @antndlcrx we now have the basic treatment shock function. I will work up a demo for this at some point.

Looking forward, there are two further types of treatment shock we should model.

Interactive effects

First, what if we assume that, beyond the main effect, there is an interaction effect with another variable in the data? Suppose:

  • $\tilde{X}$ -- a sample from the ESS-trained SyGNet model with $n$ observations
  • $\tilde{y} \in \tilde{X}$ -- some outcome of interest from the synthetic data
  • $\tilde{z} \in \tilde{X}$ -- some variable already present in the synthetic data
  • $\mu, \sigma$ -- hypothesized treatment effect and noise parameters for the main effect
  • $\mu_\text{Int.}, \sigma_\text{Int.}$ -- hypothesized treatment effect and noise parameters for an interaction effect

We can then simulate a scenario where:

  • $d \sim \text{Binom.}(n, 1, 0.5)$
  • $y' \sim \mathcal{N}\bigg(d \times \mu + \tilde{z} + d \times \tilde{z} \times \mathcal{N}(\mu_\text{Int.},\sigma_\text{Int.}), \ \sigma\bigg)$
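As a rough illustration of the interaction case above, here is a minimal NumPy sketch. The function name `interaction_shock` and its signature are hypothetical, not part of the existing experimental module. It also rests on three assumptions: $\text{Binom.}(n, 1, 0.5)$ is read as $n$ independent draws with one trial each, the inner $\mathcal{N}(\mu_\text{Int.},\sigma_\text{Int.})$ is drawn once per observation, and both $\sigma$ parameters are standard deviations.

```python
import numpy as np

def interaction_shock(z, mu, sigma, mu_int, sigma_int, rng=None):
    """Hypothetical helper: simulate y' with a main and an interaction effect.

    Assumes sigma and sigma_int are standard deviations and that the
    interaction coefficient is drawn independently for each observation.
    """
    rng = np.random.default_rng(rng)
    z = np.asarray(z, dtype=float)
    n = z.shape[0]

    # Treatment indicator: n draws from Binomial(1, 0.5)
    d = rng.binomial(1, 0.5, size=n)

    # Per-observation interaction coefficient ~ N(mu_int, sigma_int)
    beta_int = rng.normal(mu_int, sigma_int, size=n)

    # Mean of the outcome: d * mu + z + d * z * beta_int
    mean = d * mu + z + d * z * beta_int

    # Outcome with main-effect noise sigma
    y_new = rng.normal(mean, sigma)
    return d, y_new

# Example with made-up inputs standing in for a synthetic column z
z_demo = np.random.default_rng(0).normal(size=500)
d_demo, y_demo = interaction_shock(z_demo, mu=1.0, sigma=0.5,
                                   mu_int=0.3, sigma_int=0.1, rng=1)
```

Returning `d` alongside `y_new` keeps the treatment assignment available for whatever model is later fit to the shocked data.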

Heterogeneous Treatment Effects (HTE)

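Unlike in the interaction case, we might want to preserve the ATE by simulating an HTE where the main effect is a function of some third variable, centred on $\mu$. Because that variable is standardized to have mean zero, the unit-level effects average out to $\mu$. So: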

  • $\tilde{X}$ -- a sample from the ESS-trained SyGNet model with $n$ observations
  • $\tilde{y} \in \tilde{X}$ -- some outcome of interest from the synthetic data
  • $\tilde{z} \in \tilde{X}$ -- some variable already present in the synthetic data
  • $\mu, \sigma$ -- hypothesized treatment effect and noise parameters for the main effect

We can simulate a scenario where:

  • $d \sim \text{Binom.}(n, 1, 0.5)$
  • $\tilde{z}_\text{Z-score} = \frac{\tilde{z} - \text{Mean}(\tilde{z})}{\text{StdDev}(\tilde{z})}$
  • $y' \sim \mathcal{N}\bigg(d \times \mu \times (1 + \tilde{z}_\text{Z-score}),\sigma\bigg)$
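The HTE steps above could be sketched in NumPy as follows. The name `hte_shock` is hypothetical and not taken from the existing module, and the sketch assumes $\sigma$ is a standard deviation and that $\text{StdDev}$ is the population standard deviation (NumPy's default, `ddof=0`).

```python
import numpy as np

def hte_shock(z, mu, sigma, rng=None):
    """Hypothetical helper: simulate y' with a treatment effect that varies in z.

    Assumes sigma is a standard deviation and z is not constant
    (a constant z would give a zero standard deviation).
    """
    rng = np.random.default_rng(rng)
    z = np.asarray(z, dtype=float)

    # Treatment indicator: one Binomial(1, 0.5) draw per observation
    d = rng.binomial(1, 0.5, size=z.shape[0])

    # Standardize z so the effect modifier has mean 0 and unit spread
    z_score = (z - z.mean()) / z.std()

    # Unit-level effect mu * (1 + z_score), switched on by d
    y_new = rng.normal(d * mu * (1.0 + z_score), sigma)
    return d, y_new

# Example with made-up inputs standing in for a synthetic column z
z_demo = np.random.default_rng(0).uniform(18, 90, size=500)
d_demo, y_demo = hte_shock(z_demo, mu=1.0, sigma=0.5, rng=1)
```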

One further complication we could add is a parameter $\psi$ to control the amount of heterogeneity:
$\tilde{z}_\text{Z-score} = \psi \times \frac{\tilde{z} - \text{Mean}(\tilde{z})}{\text{StdDev}(\tilde{z})}$,
and then we keep the same outcome equation.
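A quick numerical check of what $\psi$ does, under the same reading of the equations: the unit-level effect is $\mu \times (1 + \tilde{z}_\text{Z-score})$, so scaling the z-score by $\psi$ should leave the mean effect at $\mu$ while stretching its spread to $|\mu| \times \psi$. The helper `unit_effects` and the gamma-distributed stand-in for $\tilde{z}$ are illustrative assumptions only.

```python
import numpy as np

def unit_effects(z, mu, psi):
    """Hypothetical helper: unit-level effects mu * (1 + psi * zscore(z))."""
    z = np.asarray(z, dtype=float)
    z_score = psi * (z - z.mean()) / z.std()
    return mu * (1.0 + z_score)

# Made-up skewed variable standing in for a synthetic column z
z_demo = np.random.default_rng(0).gamma(2.0, 1.5, size=1000)

for psi in (0.0, 0.5, 2.0):
    tau = unit_effects(z_demo, mu=1.0, psi=psi)
    # Mean stays at mu (up to floating-point error); spread grows with psi
    print(f"psi={psi}: mean effect={tau.mean():.3f}, sd of effects={tau.std():.3f}")
```

Setting $\psi = 0$ recovers a homogeneous effect of $\mu$, while $\psi > 1$ allows some units to have effects of the opposite sign, which may or may not be desirable depending on the scenario being simulated.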

@tsrobinson tsrobinson added new feature New feature or request refine Improvements to code short of a bug fix labels Jul 13, 2022
@antndlcrx
Collaborator

Added a trial version of the requested functions.
