# Tutorial

This tutorial covers the configuration and API usage of `grmpy`. For the economic theory behind the generalized Roy model, see the [course documentation](https://eisenhauerio.github.io/courses-business-decisions/).

## Quickstart

### Setup

In [1]:
import grmpy

### Simulate Data

Simulate a dataset from the generalized Roy model using a YAML configuration file.

In [2]:
config = grmpy.process_config("tutorial.grmpy.yml")
result = grmpy.simulate(config)
result.data.head()

|   | X0 | X1 | Z0 | Z1 | U1 | U0 | V | Y1 | Y0 | D | Y |
|---|----|----|----|----|----|----|---|----|----|---|---|
| 0 | 1.0 | -0.860385 | -0.413606 | 1.887688 | 0.496714 | -0.138264 | 0.647689 | 1.066522 | 0.10362 | 1.0 | 1.066522 |
| 1 | 1.0 | -1.335482 | 0.486036 | -1.547304 | 1.52303 | -0.234153 | -0.234137 | 1.855289 | -0.134798 | 0.0 | -0.134798 |
| 2 | 1.0 | -0.471125 | -0.093636 | 1.325797 | 1.579213 | 0.767435 | -0.469474 | 2.34365 | 1.126097 | 1.0 | 2.34365 |
| 3 | 1.0 | -1.397118 | -0.583599 | 1.038379 | 0.54256 | -0.463418 | -0.46573 | 0.844001 | -0.382553 | 1.0 | 0.844001 |
| 4 | 1.0 | -2.832156 | -0.451159 | 0.551741 | 0.241962 | -1.91328 | -1.724918 | -0.174116 | -2.262927 | 1.0 | -0.174116 |


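The simulated data contain:
- `Y1`, `Y0`: Potential outcomes with and without treatment
- `D`: Treatment indicator (1 if treated, 0 otherwise)
- `Y`: Observed outcome (`Y1` if `D` is 1, `Y0` otherwise)
- `X0`, `X1`, `Z0`, `Z1`: Covariates
- `U1`, `U0`, `V`: Unobservables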
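
The relationship between these columns can be checked by hand. The sketch below copies `Y1`, `Y0`, `D`, and `Y` from the five rows shown above into plain Python and verifies the switching rule `Y = D * Y1 + (1 - D) * Y0`. The helper name `observed_outcome` is an assumption made for illustration and is not part of `grmpy`.

```python
# Rows copied from the head() output above, as (Y1, Y0, D, Y).
rows = [
    (1.066522, 0.10362, 1.0, 1.066522),
    (1.855289, -0.134798, 0.0, -0.134798),
    (2.34365, 1.126097, 1.0, 2.34365),
    (0.844001, -0.382553, 1.0, 0.844001),
    (-0.174116, -2.262927, 1.0, -0.174116),
]


def observed_outcome(y1, y0, d):
    """Switching rule: Y1 is observed when treated (D = 1), otherwise Y0."""
    return d * y1 + (1 - d) * y0


# Every displayed row should satisfy the switching rule.
for y1, y0, d, y in rows:
    assert abs(observed_outcome(y1, y0, d) - y) < 1e-9

# Individual treatment effects Y1 - Y0 can only be computed in simulated
# data, because real data never reveal both potential outcomes for one person.
effects = [y1 - y0 for y1, y0, _, _ in rows]
print(effects)
```

Because both potential outcomes are available in the simulated data, individual effects such as `Y1 - Y0` can be inspected directly, which is not possible with observational data.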

### Estimation

Estimate the model parameters.

In [3]:
est_result = grmpy.estimate(config, result.data)
print(est_result)

EstimationResult(mte=array([0.54554859, 0.54554859, 0.54554859, 0.54554859, 0.54554859,
       0.54554859, 0.54554859, 0.54554859, 0.54554859, 0.54554859,
       ... (output truncated; every displayed `mte` entry is 0.54554859)


## Configuration Reference

Configuration files use YAML format with `FUNCTION` and `PARAMS` blocks.

### SIMULATION

| Parameter | Type | Description |
|-----------|------|-------------|
| agents | int | Number of individuals to simulate |
| seed | int | Random seed for reproducibility |
| coefficients_treated | list | Coefficients for treated outcome equation |
| coefficients_untreated | list | Coefficients for untreated outcome equation |
| coefficients_choice | list | Coefficients for treatment choice equation |
| covariance | matrix | Covariance matrix of unobservables |
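
As a rough illustration of how these parameters fit together, the sketch below holds them in a plain Python dictionary and runs a few sanity checks before simulation. All values, the `check_simulation` helper, and the ordering of the covariance matrix as (`U1`, `U0`, `V`) are assumptions made for this example; the layout of a real `grmpy` configuration file and the checks `grmpy` itself performs may differ.

```python
# Hypothetical SIMULATION parameters keyed by the names in the table above.
# All values are illustrative; the layout of a real grmpy file may differ.
simulation = {
    "agents": 1000,
    "seed": 42,
    "coefficients_treated": [1.0, 0.5],
    "coefficients_untreated": [0.5, 0.25],
    "coefficients_choice": [0.0, 1.0],
    # Covariance of the unobservables, ordered as (U1, U0, V) by assumption.
    "covariance": [
        [1.0, 0.0, 0.3],
        [0.0, 1.0, -0.2],
        [0.3, -0.2, 1.0],
    ],
}

COEFFICIENT_KEYS = (
    "coefficients_treated",
    "coefficients_untreated",
    "coefficients_choice",
)


def check_simulation(params):
    """Return a list of problems found; an empty list means the checks passed."""
    problems = []
    for key in ("agents", "seed"):
        if not isinstance(params.get(key), int):
            problems.append(f"{key} must be an int")
    if isinstance(params.get("agents"), int) and params["agents"] <= 0:
        problems.append("agents must be positive")
    for key in COEFFICIENT_KEYS:
        if not isinstance(params.get(key), list) or not params[key]:
            problems.append(f"{key} must be a non-empty list")
    cov = params.get("covariance", [])
    n = len(cov)
    if n == 0 or any(len(row) != n for row in cov):
        problems.append("covariance must be a square matrix")
    else:
        if any(cov[i][j] != cov[j][i] for i in range(n) for j in range(n)):
            problems.append("covariance must be symmetric")
        if any(cov[i][i] <= 0 for i in range(n)):
            problems.append("covariance must have positive variances")
    return problems


print(check_simulation(simulation))
```

Returning a list of problems rather than raising on the first one makes it easier to report every issue in a configuration at once.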

### ESTIMATION

| Parameter | Type | Description |
|-----------|------|-------------|
| function | str | Estimation method: 'parametric' or 'semiparametric' |
| file | str | Path to data file |
| dependent | str | Name of outcome variable |
| treatment | str | Name of treatment indicator |
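
A minimal sketch of how the estimation parameters relate to the data: the method must be one of the two listed values, and the `dependent` and `treatment` names must match columns in the data set. The `check_estimation` helper and the file name `data.csv` are hypothetical and serve only as an illustration; they are not part of `grmpy`.

```python
ALLOWED_FUNCTIONS = {"parametric", "semiparametric"}

# Hypothetical ESTIMATION parameters; the file name is a placeholder.
estimation = {
    "function": "parametric",
    "file": "data.csv",
    "dependent": "Y",
    "treatment": "D",
}


def check_estimation(params, columns):
    """Check the method name and that both named variables are data columns."""
    problems = []
    if params.get("function") not in ALLOWED_FUNCTIONS:
        problems.append("function must be 'parametric' or 'semiparametric'")
    for key in ("dependent", "treatment"):
        if params.get(key) not in columns:
            problems.append(f"{key} column {params.get(key)!r} not found in data")
    return problems


# Column names taken from the simulated data shown in the Quickstart.
columns = ["X0", "X1", "Z0", "Z1", "U1", "U0", "V", "Y1", "Y0", "D", "Y"]
print(check_estimation(estimation, columns))
```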