Using the grmpy package
===================

The grmpy package can be used to estimate benefits, costs, and surplus using either your own data or simulated data.
It includes two functions, `estimate()` and `simulate()`.

The grmpy package ships with a sample dataset, `test.dat`, and a sample init file, `test.ini`, which illustrate
how the two functions work.

This tutorial walks through both functions using those sample files.

You can use the grmpy package's functions in any Python project after importing it, just as you would any
other Python library.


In [1]:
import grmpy
from grmpy import *
import cPickle

grmpy.estimate(init, resume, useSimulation)
--------------
The grmpy estimate command only requires that you specify an init file. 

The estimation run should take approximately 15 seconds.

This command will fork a process on your computer for the 
optimization algorithm. 

While it is running, you can inspect the current
parameter values and function value; the latter
is written to `info.grmpy.out`.

The example directory should now contain 2 additional files:

`info.grmpy.out`, which updates continuously during
optimization with the current function value; 


`rslt.grmpy.pkl`, a python pickle file which stores
all of the results. 
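
Once the run has finished, the stored results can be read back with the standard pickle machinery. The sketch below is an illustration rather than part of grmpy: `load_results` is a hypothetical helper, it uses Python 3's `pickle` module (under Python 2, `cPickle` offers the same `load` call), and unpickling the real `rslt.grmpy.pkl` presumably requires grmpy to be importable so that the result class can be reconstructed.

```python
import os
import pickle


def load_results(path='rslt.grmpy.pkl'):
    """Return the unpickled object stored at path, or None if the file is missing."""
    if not os.path.exists(path):
        return None
    with open(path, 'rb') as handle:
        return pickle.load(handle)


# None until an estimation run has written the file to the working directory.
rslt = load_results()
```

Returning `None` for a missing file is a design choice made for this sketch; raising an error would be equally reasonable.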

Please note that the program only uses
the complete cases of the data.
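
To make "complete cases" concrete: a row is kept only if none of its entries is missing. The following sketch is not grmpy's own code; `complete_cases` is a hypothetical helper, and it assumes that missing values are represented as `None` or NaN.

```python
import math


def is_missing(value):
    """True for None and for floating-point NaN."""
    return value is None or (isinstance(value, float) and math.isnan(value))


def complete_cases(rows):
    """Keep only the rows in which no entry is missing."""
    return [row for row in rows if not any(is_missing(v) for v in row)]


rows = [
    [1.0, 0, 2.5],
    [0.3, 1, float('nan')],  # dropped: NaN in the last column
    [None, 1, 0.7],          # dropped: missing first column
    [0.9, 1, 1.1],
]
kept = complete_cases(rows)
print(len(kept))  # 2
```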

InitFile (='test.ini')
=============

Unless you are estimating on a simulated dataset, the source data must be specified in the init file.

The example init file is reproduced below for convenience:

```
DATA
	source		../data/test.dat
	agents 		10000

	outcome		0
	treatment	1

BENE
        
	coeff  2  0.00	0.00	false
	coeff  3  0.00	0.00	true

	coeff  4  0.00	0.00	true
	
	int       0.00	0.00
	sd        1.00	1.00

COST

	coeff  4  0.00
	coeff  5  0.00
	
	int       0.00
	sd        !1.00

RHO

	 treated   0.0
	 untreated   0.0

ESTIMATION
	
	algorithm 	bfgs
	maxiter    	15
	start		manual
	gtol       	1e-05

	epsilon    	1.4901161193847656e-08
	differences	one-sided

	asymptotics true
	hessian    	numdiff

	draws    	1000
	alpha		0.05

	version     fast

SIMULATION

	agents		1747
	seed 	  	123
	target  	simulation.dat

```
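
The layout above is line-oriented: an unindented upper-case word opens a section, and each indented line holds a keyword followed by whitespace-separated values. A minimal sketch of reading that layout follows. It is an illustration only, not grmpy's actual parser; `parse_init` and the dictionary structure it returns are assumptions made here.

```python
def parse_init(text):
    """Map each section name to a list of token lists, one per non-empty line."""
    sections = {}
    current = None
    for line in text.splitlines():
        tokens = line.split()
        if not tokens:
            continue  # skip blank lines
        if not line[0].isspace() and len(tokens) == 1 and tokens[0].isupper():
            # An unindented upper-case word starts a new section.
            current = tokens[0]
            sections[current] = []
        elif current is not None:
            sections[current].append(tokens)
    return sections


sample = """DATA
\tsource\t../data/test.dat
\tagents\t10000

SIMULATION
\tagents\t1747
\tseed\t123
"""
parsed = parse_init(sample)
print(parsed['SIMULATION'])
```

All values are kept as strings; converting them (for example `int` for `agents`) is left to the caller in this sketch.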



In [2]:
grmpy.estimate('test.ini')



<grmpy.clsRslt.RsltCls at 0x10384a990>

`grmpy.simulate(init, resume)`
--------------
After estimation you can simulate a dataset of agents drawn from the distribution
of your estimated data-generating process (i.e. implied by your
parameter estimates).

The filename of the simulated dataset is specified in the INITFILE. 
Specifying the `update=True` option ensures your data is simulated
from the current parameter values.

Otherwise, it will take the parameter
values from your INITFILE. 

You should see that the simulation produced 3 files:

`simulation.dat`, which is the simulated data;

`simulation.infos.grm.out`, which summarizes the data;

and `simulation.paras.grm.out`, which shows the true
parameter values underlying the simulated data.

In [3]:
grmpy.simulate('test.ini')

`grmpy.simulate()` Results:
----------

The resulting dataset is written to `simulation.dat`.

Information on the simulation is written out to `simulation.infos.grmpy.out`, reproduced below:

```
 Simulated Economy

   Number of Observations: 1747
   Function Value:         2.08302019005

   Choices:  

     Treated            849
     Untreated          898


   Outcomes:  

     Treated           0.04408
     Untreated         0.08214

```
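
As a quick sanity check on this summary, the two choice counts should add up to the number of observations, and the treated share follows directly. The snippet below simply redoes that arithmetic with the figures from the report; the variable names are illustrative.

```python
treated, untreated, observations = 849, 898, 1747

# The choice counts partition the simulated sample.
assert treated + untreated == observations

treated_share = treated / float(observations)
print(round(treated_share, 3))  # 0.486
```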

`grmpy.estimate(init, resume, useSimulation=True)`
============
So far we have estimated our model and created a simulated dataset
from our estimate. 

We can use the `useSimulation=True` option when issuing
the `grmpy.estimate()` command to see whether we recover the true parameter values.

This is not a new command; it merely uses the `useSimulation`
option of `grmpy.estimate()`.

The estimation will 
run using the dataset and observation count as specified in the `SIMULATION` 
section of the INITFILE. 

Using the instructions contained in the INITFILE, 
it generates a new set of estimates for the simulated data. 

The user can then compare the estimated parameter values in `stepParas.grm.out` with the true parameter values of the simulated
data. 
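
One way to make that comparison systematic is to line the two parameter vectors up and look at the absolute deviations. The sketch below assumes the values have already been read into plain Python lists; `compare_parameters` is a hypothetical helper, not a grmpy function, and the numbers in the usage lines are made up for illustration.

```python
def compare_parameters(true_values, estimates):
    """Return (true, estimate, absolute deviation) for each parameter."""
    if len(true_values) != len(estimates):
        raise ValueError('parameter vectors differ in length')
    return [(t, e, abs(t - e)) for t, e in zip(true_values, estimates)]


# Made-up numbers, purely to show the call.
for true, est, dev in compare_parameters([0.0, 1.0, 0.5], [0.25, 1.0, 0.0]):
    print('%8.4f %8.4f %8.4f' % (true, est, dev))
```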


In [4]:
grmpy.estimate('test.ini', False, True)

<grmpy.clsRslt.RsltCls at 0x1043d0050>

`grmpy.estimate(useSimulation=True)` Results:
-------
Full results are written to the new file `info.grm.out`. All output is also recorded in the pickle file `rslt.grmpy.pkl`.

A sample of the output found in `info.grm.out` is reproduced below:

```
 OPTIMIZATION REPORT 

      Function:   2.07807655743
      Gradient:   0.000131011009216

      Success:    False
      Message:    Maximum number of function evaluations.
      
```
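
The report above shows `Success: False`, so a script that consumes it may want to check that flag before using the estimates. Below is a minimal sketch of pulling the fields out of report text of the form shown; `parse_report` is a hypothetical helper and assumes that each field sits on its own `Name: value` line.

```python
def parse_report(text):
    """Collect 'Name: value' lines into a dictionary of strings."""
    fields = {}
    for line in text.splitlines():
        name, sep, value = line.partition(':')
        if sep and value.strip():
            fields[name.strip()] = value.strip()
    return fields


report = """
 OPTIMIZATION REPORT

      Function:   2.07807655743
      Gradient:   0.000131011009216

      Success:    False
      Message:    Maximum number of function evaluations.
"""
fields = parse_report(report)
converged = fields['Success'] == 'True'
print(converged, float(fields['Function']))
```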