In [2]:
from __future__ import print_function
import numpy as np
from scipy import stats
import statsmodels.api as sm
from statsmodels.base.model import GenericLikelihoodModel

The data listed in Appendix Table F21.1 were taken from a study by Spector and Mazzeo
(1980), which examined whether a new method of teaching economics, the Personalized
System of Instruction (PSI), significantly influenced performance in later economics courses.
The "dependent variable" used in our application is GRADE, which indicates whether
a student's grade in an intermediate macroeconomics course was higher than that in the
principles course. The other variables are GPA, their grade point average; TUCE, the score
on a pretest that indicates entering knowledge of the material; and PSI, the binary variable
indicator of whether the student was exposed to the new teaching method. (Spector and
Mazzeo's specific equation was somewhat different from the one estimated here.)

### DATASET DESCRIPTION

In [3]:
data = sm.datasets.spector.load_pandas()

In [4]:
print(sm.datasets.spector.NOTE)

::

    Number of Observations - 32

    Number of Variables - 4

    Variable name definitions::

        Grade - binary variable indicating whether or not a student's grade
                improved.  1 indicates an improvement.
        TUCE  - Test score on economics test
        PSI   - participation in program
        GPA   - Student's grade point average



#### INDEPENDENT VARIABLES
In our case, the independent variables would be:
- Number of adopter neighbors
- Possibly the time since the neighbor adopted (to be confirmed)

I believe our function is currently a long product. We need to take its log and simplify it before we can use it here.
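A quick sketch of why taking the log helps: the log turns the long product into a sum, which is what `loglike` should return, and it also avoids numerical underflow. The probabilities below are made-up illustrative values, not our adoption model, and the names `p`, `likelihood` and `log_likelihood` are only placeholders.

```python
import numpy as np

# Hypothetical per-observation likelihood contributions (illustrative values only).
rng = np.random.default_rng(0)
p = rng.uniform(0.1, 0.9, size=2000)

# The long product of many numbers below 1 underflows to 0.0 in floating point.
likelihood = np.prod(p)

# The same quantity on the log scale is an ordinary, finite sum.
log_likelihood = np.sum(np.log(p))

# On a short sample the two routes agree: log(prod) == sum(log).
small = p[:10]
log_of_product = np.log(np.prod(small))
sum_of_logs = np.sum(np.log(small))

print(likelihood, log_likelihood)
print(log_of_product, sum_of_logs)
```

So once our product is written as a sum of logs, each term can be simplified separately before it goes into the likelihood class.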


In [5]:
exog = data.exog
print(data.exog.head())

    GPA  TUCE  PSI
0  2.66  20.0  0.0
1  2.89  22.0  0.0
2  3.28  24.0  0.0
3  2.92  12.0  0.0
4  4.00  21.0  0.0


#### DEPENDENT VARIABLE (REGRESSAND)

Here it is the binary outcome variable GRADE (0 or 1), where 1 indicates that the student's grade improved and 0 that it did not.
In our case it should be adopted or not adopted.

In [6]:
endog = data.endog
print(data.endog.head())

0    0.0
1    0.0
2    0.0
3    0.0
4    1.0
Name: GRADE, dtype: float64


STEP 1: Add a constant

In [7]:
exog = sm.add_constant(exog, prepend=True)
print(exog[:5])

   const   GPA  TUCE  PSI
0      1  2.66  20.0  0.0
1      1  2.89  22.0  0.0
2      1  3.28  24.0  0.0
3      1  2.92  12.0  0.0
4      1  4.00  21.0  0.0
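
Adding a constant just prepends a column of ones, so that the first coefficient plays the role of the intercept. A minimal NumPy sketch of the same operation, using the five rows printed above (the names `X` and `X_const` are only illustrative):

```python
import numpy as np

# First five rows of the regressors printed above (GPA, TUCE, PSI).
X = np.array([
    [2.66, 20.0, 0.0],
    [2.89, 22.0, 0.0],
    [3.28, 24.0, 0.0],
    [2.92, 12.0, 0.0],
    [4.00, 21.0, 0.0],
])

# Prepend a column of ones; its coefficient is the intercept ("const").
X_const = np.column_stack([np.ones(len(X)), X])
print(X_const)
```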


STEP 2: Define the Likelihood Function

This is the main step. Here we supply the log-likelihood function, written in terms of the parameters to be estimated.
exog and endog are known data; only params is unknown.

In [8]:
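class MyProbit(GenericLikelihoodModel):
    def loglike(self, params):
        """Return the probit log-likelihood summed over all observations."""
        exog = self.exog
        endog = self.endog
        # Map the 0/1 outcome to -1/+1 so one expression covers both cases.
        q = 2 * endog - 1
        # log Phi(q * x'b) is log Phi(x'b) if y = 1 and log(1 - Phi(x'b)) if y = 0.
        return stats.norm.logcdf(q*np.dot(exog, params)).sum()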
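
The `q = 2 * endog - 1` trick relies on the symmetry of the normal CDF, 1 - Phi(z) = Phi(-z). A small check, as a sketch: the compact form used in the class should match the textbook form y*log(Phi(x'b)) + (1 - y)*log(1 - Phi(x'b)). The data are the first five rows shown earlier; `trial_params` is an arbitrary illustrative vector, not an estimate, and the two function names are placeholders.

```python
import numpy as np
from scipy import stats

def loglike_compact(params, exog, endog):
    # Same expression as in the class: q is +1 when y = 1 and -1 when y = 0.
    q = 2 * endog - 1
    return stats.norm.logcdf(q * np.dot(exog, params)).sum()

def loglike_explicit(params, exog, endog):
    # Textbook probit log-likelihood written out case by case.
    prob = stats.norm.cdf(np.dot(exog, params))
    return np.sum(endog * np.log(prob) + (1 - endog) * np.log(1 - prob))

# First five observations from the notebook output (const, GPA, TUCE, PSI).
exog = np.array([
    [1.0, 2.66, 20.0, 0.0],
    [1.0, 2.89, 22.0, 0.0],
    [1.0, 3.28, 24.0, 0.0],
    [1.0, 2.92, 12.0, 0.0],
    [1.0, 4.00, 21.0, 0.0],
])
endog = np.array([0.0, 0.0, 0.0, 0.0, 1.0])

# Arbitrary trial parameter values, chosen only for illustration.
trial_params = np.array([-7.0, 1.5, 0.05, 1.4])

compact = loglike_compact(trial_params, exog, endog)
explicit = loglike_explicit(trial_params, exog, endog)
print(compact, explicit)
```

For our adoption model, the same pattern should apply: whatever the per-observation probability is, `loglike` returns the sum of its logs.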


STEP 3: Fit the model by maximizing the likelihood

In [9]:
sm_probit_manual = MyProbit(endog, exog).fit()

Optimization terminated successfully.
         Current function value: 0.400588
         Iterations: 292
         Function evaluations: 494


STEP 4: Results

In [10]:
print(sm_probit_manual.summary())

                               MyProbit Results                               
Dep. Variable:                  GRADE   Log-Likelihood:                -12.819
Model:                       MyProbit   AIC:                             33.64
Method:            Maximum Likelihood   BIC:                             39.50
Date:                Fri, 02 Dec 2016                                         
Time:                        12:25:26                                         
No. Observations:                  32                                         
Df Residuals:                      28                                         
Df Model:                           3                                         
                 coef    std err          z      P>|z|      [95.0% Conf. Int.]
------------------------------------------------------------------------------
const         -7.4523      2.542     -2.931      0.003       -12.435    -2.469
GPA            1.6258      0.694      2.343      0.0