In [1]:
from IPython.display import HTML

# Load the notebook's custom stylesheet
css_file = 'custom.css'
HTML(open(css_file, 'r').read())

*grmpy* package
========

The *grmpy* package can be used to estimate benefits, costs, and surplus from your own data or from simulated data, following:

    Eisenhauer, Philipp, James J. Heckman, and Edward Vytlacil (2015): The Generalized Roy Model and the Cost-Benefit Analysis of Social Programs, Journal of Political Economy, 123(2): 413-443.


There are two functions: (1) estimate and (2) simulate.

The grmpy package ships with a sample dataset, 'test.dat', and a sample init file, 'test.ini', that illustrate how the two functions work.

This tutorial walks through both functions using those sample files.

The *grmpy* package is maintained by [Philipp Eisenhauer](https://github.com/peisenha) and [Chase Corbin](https://github.com/cocorbin). We want to thank [Jake Torcasso](http://jaketorcasso.com/) and [Luke Schmerold](https://www.linkedin.com/pub/luke-schmerold/5a/308/a65/) for their help in developing earlier versions of the package.

Model Specification
----------------------

Unless you estimate on a simulated dataset, the path to the source data must be given by the source entry in the DATA section of the init file.

The example init file is reproduced below for convenience:

```
DATA
	source		../data/test.dat
	agents 		10000

	outcome		0
	treatment	1

BENE
        
	coeff  2  0.00	0.00	false
	coeff  3  0.00	0.00	true

	coeff  4  0.00	0.00	true
	
	int       0.00	0.00
	sd        1.00	1.00

COST

	coeff  4  0.00
	coeff  5  0.00
	
	int       0.00
	sd        !1.00

RHO

	 treated   0.0
	 untreated   0.0

ESTIMATION
	
	algorithm 	bfgs
	maxiter    	15
	start		manual
	gtol       	1e-05

	epsilon    	1.4901161193847656e-08
	differences	one-sided

	asymptotics true
	hessian    	numdiff

	draws    	1000
	alpha		0.05

	version     fast

SIMULATION

	agents		1747
	seed 	  	123
	target  	simulation.dat

```
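
The layout above is a sequence of uppercase section headers, each followed by whitespace-separated entries. As a rough illustration only, the sketch below groups such lines by section. The helper `read_init_sections` is hypothetical and is not grmpy's own parser; it does not interpret the values, the true/false flags, or the `!` prefix.

```python
def read_init_sections(text):
    """Group the lines of an init file by section header (hypothetical helper)."""
    sections = {}
    current = None
    for raw in text.splitlines():
        tokens = raw.split()
        if not tokens:
            continue  # skip blank lines
        # Assumption: a section header is a single all-uppercase word (DATA, BENE, ...).
        if len(tokens) == 1 and tokens[0].isupper():
            current = tokens[0]
            sections[current] = []
        elif current is not None:
            sections[current].append(tokens)
    return sections


# A shortened excerpt of the init file shown above.
sample = """
DATA
    source  ../data/test.dat
    agents  10000

SIMULATION
    agents  1747
    seed    123
    target  simulation.dat
"""

sections = read_init_sections(sample)
print(sections["SIMULATION"])
```

Each entry is kept as a list of raw string tokens, so any conversion to numbers or booleans is left to the caller.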



Getting Started
-----------------

You can use the grmpy package's functions in any Python project after importing it, just as you would any
other Python library.

In [2]:
# Import grmpy package
import grmpy

# Let us run some basic tests
grmpy.test()

We briefly reproduce the interface to the *estimate* function.

In [3]:
def estimate(init='init.ini', resume=False, use_simulation=False):
    """
            Parameters
            ----------
            
                init: str, optional
                    Path to the initialization file.
                    
                resume: bool, optional
                    Restart estimation, requires info.grmpy.out.
                    
                use_simulation: bool, optional
                    Use information from the SIMULATION section of
                    the initialization file.
                
            Results
            -------
            
                info.grmpy.out: file
                    Text file with results from estimation run.
                    
                rslt.grmpy.pkl: serialized Python object    
                    Results from estimation run.
    """
    
    pass


For any initialization file, you can simulate a dataset from the specified data-generating process. This is useful both for inspecting properties of the simulated population and for testing the reliability of your estimator. If information from a previous estimation run is available, the estimated structural parameters can also be used for the simulation.

Here we reproduce the interface to the *simulate()* function.

In [4]:
def simulate(init='init.ini', update=False):
    """
            Parameters
            ----------
            
                init: str, optional
                    Path to the initialization file.
                    
                update: bool, optional
                    Update structural parameters from info.grmpy.out.
                                    
            Results
            -------
            
                simulation.infos.grmpy.out
                    Text file with basic information on 
                    simulated economy.
                    
                *.dat
                    Text file with simulated dataset. The file
                    name is determined by the target flag in 
                    the SIMULATION section of the initialization
                    file.
                
    """
    
    pass
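
The *update* flag reads structural parameters from info.grmpy.out, so it only makes sense once an estimation run has left that file behind. A minimal sketch of that decision follows. The wrapper `simulate_with_latest_estimates` is hypothetical, and the simulate function is passed in as an argument, so a stand-in with the signature shown above is used here in place of the real one.

```python
import os


def simulate_with_latest_estimates(simulate, init_file, info_file="info.grmpy.out"):
    """Call simulate with update=True only if earlier estimation output exists."""
    update = os.path.isfile(info_file)
    simulate(init_file, update=update)
    return update


# Stand-in with the same signature as the interface above, for illustration only.
calls = []

def fake_simulate(init='init.ini', update=False):
    calls.append((init, update))


# The info file name below is deliberately one that does not exist.
used_estimates = simulate_with_latest_estimates(
    fake_simulate, 'example.grmpy.ini', info_file='no-such-file.grmpy.out'
)
print(used_estimates, calls)
```

In a notebook session you would presumably pass `grmpy.simulate` as the first argument instead of the stand-in.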


Basic Workflow
----------------

Let us simulate a dataset and have a look at some basic information about the economy.

In [8]:
# Simulate dataset
grmpy.simulate('example.grmpy.ini')

# Inspect the results
%cat simulation.infos.grmpy.out
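
Beyond the summary file, the simulated dataset itself can be inspected. The sketch below computes the number of agents and the mean outcome by treatment status, using the column positions from the DATA section (outcome in column 0, treatment in column 1). It assumes the file is whitespace-delimited, has no header row, and codes treatment as 0 or 1; none of that is confirmed here. The helper `summarize_by_treatment` is hypothetical and the rows are made up for illustration.

```python
import io


def summarize_by_treatment(lines, outcome_col=0, treatment_col=1):
    """Return {status: (count, mean outcome)} for untreated (0) and treated (1)."""
    totals = {0: [0, 0.0], 1: [0, 0.0]}
    for line in lines:
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        status = int(float(fields[treatment_col]))
        totals[status][0] += 1
        totals[status][1] += float(fields[outcome_col])
    return {
        status: (count, total / count if count else float("nan"))
        for status, (count, total) in totals.items()
    }


# Toy rows, made up for illustration: outcome, treatment, one covariate.
toy = io.StringIO("1.5 1 0.2\n0.5 0 0.1\n2.5 1 0.4\n")
summary = summarize_by_treatment(toy)
print(summary)
```

For a real run, an open file handle on the dataset named by the target flag (simulation.dat in the example init file) would replace the toy buffer.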

Now we run an estimation.

In [7]:
# Estimate model
grmpy.estimate('example.grmpy.ini', use_simulation=True)

# Inspect the results
%cat info.grmpy.out
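
According to the interface above, an estimation run writes info.grmpy.out and rslt.grmpy.pkl. A small check like the following can confirm both are present before a later step relies on them, for example a resumed estimation. The helper `missing_outputs` is hypothetical, and the temporary directory with a placeholder file only stands in for a real working directory.

```python
import tempfile
from pathlib import Path

# File names taken from the estimate docstring above.
ESTIMATION_OUTPUTS = ("info.grmpy.out", "rslt.grmpy.pkl")


def missing_outputs(directory, names=ESTIMATION_OUTPUTS):
    """List the expected output files that are not present in directory."""
    return [name for name in names if not (Path(directory) / name).is_file()]


# Demonstration in a scratch directory holding only one of the two files.
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "info.grmpy.out").write_text("placeholder")
    missing = missing_outputs(tmp)

print(missing)
```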

