NegativeBinomial strange results with bfgs #1539

Closed
josef-pkt opened this Issue Apr 2, 2014 · 24 comments


@josef-pkt
Member

bug or weird

I'm getting strange convergence in 'nb1' with bfgs or lbfgs: the result
has nans in bse, params that are not close to the true values,
and a score that is not close to zero.

('nb2' looks fine on the same dataset)

>>> print res_nb1_lb.summary()
                     NegativeBinomial Regression Results                      
==============================================================================
Dep. Variable:                      y   No. Observations:                  100
Model:               NegativeBinomial   Df Residuals:                       95
Method:                           MLE   Df Model:                            4
Date:                Wed, 02 Apr 2014   Pseudo R-squ.:                  0.3760
Time:                        11:52:12   Log-Likelihood:                -757.27
converged:                       True   LL-Null:                       -1213.5
                                        LLR p-value:                3.345e-196
==============================================================================
                 coef    std err          z      P>|z|      [95.0% Conf. Int.]
------------------------------------------------------------------------------
const         57.4164        nan        nan        nan           nan       nan
x1             0.0082      0.061      0.134      0.893        -0.112     0.128
x2             0.0068      0.055      0.124      0.902        -0.102     0.115
x3             0.0070      0.057      0.122      0.903        -0.105     0.119
x4             0.0081      0.059      0.137      0.891        -0.108     0.124
alpha       5.258e+26        nan        nan        nan           nan       nan
==============================================================================

>>> res_nb1_lb.model.score(res_nb1_lb.params)
array([ -9.16821547e-04,  -4.94179169e-04,   2.07220644e-03,
        -2.37378048e-03,  -4.15836361e-03,  -3.24589003e-27])

method='nm' looks good with a large maxiter
method='newton' works at the 'nm' starting values (it doesn't change them)

artificial dataset (I didn't set a seed)

import numpy as np

import statsmodels.api as sm

nobs, k_vars = 100, 5
sig_e = 0.5
beta = np.array([1,1,1,1])
need = k_vars - len(beta)
beta = np.concatenate((beta, np.ones(need)))
beta *= 2. / beta.sum()   # normalize to something to choose
#x = 3 * np.random.rand(nobs, k_vars)  
# with `3 *` values become too large, poisson parameter estimates don't look good
x = 6 * (np.random.rand(nobs, k_vars) - 0.25)
x[:, 0] = 1
y_true = np.exp(x.dot(beta))
sig_e = y_true / y_true.mean() * sig_e
sig_e = np.clip(sig_e, 0, 2)

# multiplicative
error = np.exp(sig_e * np.random.randn(nobs))
y = y_true * error
mod_ols = sm.OLS(np.log(y), x)
res_ols = mod_ols.fit()
mod_nb1 = sm.NegativeBinomial(y, x, loglike_method='nb1')
res_nb1_b = mod_nb1.fit(start_params=np.concatenate((res_ols.params, [50.])), method='bfgs')
print(res_nb1_b.summary())
@jseabold
Member
jseabold commented Apr 2, 2014

Can you save the dataset? Not much we can do without being able to replicate.

@josef-pkt
Member

I assume the problem will also occur with other seeds.

(I'm looking at Poisson and wanted to run NegativeBinomial just for comparison.)

>>> x
array([[  1.00000000e+00,   2.58609859e+00,  -4.32963903e-01,
          3.73517419e+00,   3.34323375e+00],
       [  1.00000000e+00,   1.27766877e+00,  -2.93624210e-01,
         -6.69391010e-02,  -2.18096356e-01],
       [  1.00000000e+00,   1.37265862e+00,  -9.87237762e-01,
          2.01472662e+00,   6.44555063e-01],
       [  1.00000000e+00,   2.82057686e+00,   2.59561451e+00,
          6.24709518e-01,   1.79485252e+00],
       [  1.00000000e+00,   2.43967139e+00,   2.22752742e+00,
         -3.38591658e-01,   9.29854960e-01],
       [  1.00000000e+00,  -7.91500615e-01,   2.35252348e+00,
          8.00432925e-01,   4.71966239e-01],
       [  1.00000000e+00,   7.08122448e-01,   1.46843547e+00,
          2.93431658e+00,   6.65607115e-01],
       [  1.00000000e+00,  -1.18748755e+00,  -1.46663186e+00,
          4.19439302e+00,  -1.24909171e+00],
       [  1.00000000e+00,  -1.09890181e+00,  -1.67944423e-01,
         -1.30336252e+00,  -1.31468263e+00],
       [  1.00000000e+00,  -1.21589368e-02,   1.91003477e+00,
         -1.00323455e+00,   2.09684047e+00],
       [  1.00000000e+00,   2.83972974e+00,   1.05638129e+00,
          9.35074401e-01,   2.09935371e+00],
       [  1.00000000e+00,   2.94386908e+00,  -8.92562476e-01,
          4.60557933e-01,   9.55387261e-02],
       [  1.00000000e+00,   1.33604539e+00,  -8.72336190e-01,
          3.94484784e+00,  -1.09526791e+00],
       [  1.00000000e+00,   3.24718573e+00,   3.54400086e+00,
          3.00777010e+00,   1.61301164e-01],
       [  1.00000000e+00,   2.06873407e+00,   4.13535365e+00,
          3.20991544e+00,   1.63188425e+00],
       [  1.00000000e+00,   3.99816192e+00,  -1.35841972e+00,
         -8.14614957e-01,   4.28205654e+00],
       [  1.00000000e+00,   4.45188448e+00,  -2.75617550e-01,
          3.63366543e+00,  -1.43524757e+00],
       [  1.00000000e+00,   7.93463051e-01,  -4.53933919e-01,
          3.60815862e+00,   3.63950371e+00],
       [  1.00000000e+00,   2.97599748e+00,   2.06454445e-01,
         -9.19440556e-01,   8.10111578e-01],
       [  1.00000000e+00,   6.80524302e-01,   3.62411745e+00,
         -6.81688935e-01,   4.51832541e-01],
       [  1.00000000e+00,   1.97748773e+00,   2.31273641e-01,
          1.98268481e+00,   1.52463772e+00],
       [  1.00000000e+00,   3.48265417e+00,  -6.34366634e-01,
          1.78516411e+00,   4.18247559e+00],
       [  1.00000000e+00,   3.70447869e+00,   4.40999129e+00,
          2.00539042e-01,   2.77836325e+00],
       [  1.00000000e+00,   2.81369458e+00,  -1.08286112e+00,
          1.33316683e+00,   3.64850628e+00],
       [  1.00000000e+00,   2.53663198e-03,  -7.84740396e-01,
          2.70698026e+00,   6.15780588e-01],
       [  1.00000000e+00,   4.32982892e+00,   9.09253411e-01,
          1.64113149e+00,  -1.27026796e+00],
       [  1.00000000e+00,  -6.22776121e-02,   8.07496513e-01,
          1.52732107e+00,   7.63314283e-01],
       [  1.00000000e+00,  -2.64363540e-01,   2.50731487e+00,
          2.79096931e+00,  -8.82904378e-01],
       [  1.00000000e+00,   4.88326178e-01,   4.46441003e+00,
          3.53147774e+00,   1.21109288e+00],
       [  1.00000000e+00,   1.33805197e+00,   4.46331931e+00,
         -1.22508707e+00,   1.00474282e+00],
       [  1.00000000e+00,   1.53762698e+00,  -5.10076184e-01,
          1.54219952e+00,  -7.36999263e-01],
       [  1.00000000e+00,  -1.44966401e+00,   4.17247097e+00,
         -1.59540205e-02,  -1.05458964e+00],
       [  1.00000000e+00,   4.29800276e+00,   1.94173211e+00,
          4.76258635e-01,   1.72348004e+00],
       [  1.00000000e+00,   1.54237169e+00,   1.51574704e+00,
          3.60374301e-01,  -3.87217307e-01],
       [  1.00000000e+00,   1.09000631e+00,   1.74705055e+00,
          1.34945934e+00,  -1.30592895e+00],
       [  1.00000000e+00,   4.22604063e+00,   2.82231001e+00,
          1.10529556e+00,   1.41824120e+00],
       [  1.00000000e+00,  -8.84976207e-01,  -1.19040752e+00,
         -5.84639371e-01,   7.43905915e-01],
       [  1.00000000e+00,   9.30371071e-01,   3.61209478e+00,
          3.64480163e-02,   4.03428979e+00],
       [  1.00000000e+00,   4.80855657e-01,   1.70684312e+00,
          3.21358705e+00,   4.42482555e+00],
       [  1.00000000e+00,   3.37583328e+00,  -1.39039371e+00,
         -1.39398257e+00,   3.97787572e+00],
       [  1.00000000e+00,   2.18415483e+00,  -8.48671640e-02,
         -8.77400680e-01,  -1.14279772e+00],
       [  1.00000000e+00,  -2.41188949e-01,   4.56352806e-01,
         -1.11768658e-01,   1.22738121e+00],
       [  1.00000000e+00,   4.43169996e+00,  -1.45416174e+00,
          4.13584000e-01,  -9.92962156e-01],
       [  1.00000000e+00,   3.75715974e+00,   1.50447670e+00,
          2.15065385e+00,   4.42713138e+00],
       [  1.00000000e+00,   1.34560995e+00,  -1.30291101e+00,
         -2.59799990e-01,  -1.42022280e+00],
       [  1.00000000e+00,   3.92935477e+00,   3.03048235e-01,
          1.33386440e+00,   3.54763956e-02],
       [  1.00000000e+00,   3.09946482e+00,   1.97031460e+00,
          4.44911701e+00,   2.96403637e+00],
       [  1.00000000e+00,   5.78618051e-01,   2.84682398e+00,
          3.49928061e+00,  -2.54024398e-01],
       [  1.00000000e+00,   3.52954250e+00,  -5.62283858e-01,
          4.21297957e-01,   1.86772280e-01],
       [  1.00000000e+00,   9.36647327e-01,   3.44012687e-01,
         -1.37183947e+00,   1.22836307e-01],
       [  1.00000000e+00,   2.01375122e+00,   3.12490220e+00,
         -1.29852743e+00,  -1.08303273e+00],
       [  1.00000000e+00,  -4.96298756e-02,   3.02346088e+00,
          2.29319846e+00,   2.27542559e+00],
       [  1.00000000e+00,   9.93204597e-01,   2.50757229e-01,
         -5.33658162e-01,   4.01512424e-02],
       [  1.00000000e+00,   4.26709442e+00,  -1.29739911e-01,
          3.60678357e+00,  -7.58656724e-01],
       [  1.00000000e+00,  -3.38044025e-01,  -7.70453685e-01,
         -1.06005498e+00,  -1.79778540e-01],
       [  1.00000000e+00,  -1.24286335e+00,   7.46816822e-01,
         -1.42827692e+00,  -1.41428949e+00],
       [  1.00000000e+00,   2.70698595e+00,   1.26638415e+00,
          3.97320315e-01,  -1.27826337e+00],
       [  1.00000000e+00,   3.77415899e+00,   1.04322709e+00,
          3.90277358e+00,  -1.18569175e-01],
       [  1.00000000e+00,   1.27133490e+00,   5.48706067e-01,
          3.08441259e+00,   1.94860777e+00],
       [  1.00000000e+00,  -4.91990609e-01,  -1.07904540e+00,
         -5.23664161e-01,   4.43974736e+00],
       [  1.00000000e+00,   1.47563564e+00,  -1.19734578e+00,
          2.44114612e+00,   3.24319789e+00],
       [  1.00000000e+00,   2.91270502e+00,   4.11800823e+00,
          2.97221313e+00,   2.76355254e+00],
       [  1.00000000e+00,  -1.26609640e+00,   1.36297170e-01,
          3.46392172e+00,   2.37297630e+00],
       [  1.00000000e+00,   1.02730586e+00,  -4.47959241e-01,
          4.06458963e+00,   1.75090603e-01],
       [  1.00000000e+00,   1.82659964e+00,   2.33109860e+00,
          2.06641754e+00,   2.61855995e+00],
       [  1.00000000e+00,   1.39542567e+00,   2.54380182e+00,
          1.18366508e+00,   4.02993659e+00],
       [  1.00000000e+00,  -1.08046542e+00,  -8.28945968e-01,
          1.82659013e+00,   1.05539160e+00],
       [  1.00000000e+00,   7.35918827e-01,  -1.31234954e+00,
          2.09058276e+00,   1.41301544e+00],
       [  1.00000000e+00,   1.77406728e-01,  -6.93757413e-01,
         -6.77761413e-01,   3.39597770e+00],
       [  1.00000000e+00,   2.97248008e+00,  -4.16982378e-01,
          1.83311888e+00,   2.58036168e+00],
       [  1.00000000e+00,   2.77167417e+00,  -8.90043026e-02,
          4.45715775e+00,   2.80999668e-01],
       [  1.00000000e+00,   3.91610421e+00,  -9.82559128e-01,
          2.71479034e+00,  -1.12180956e+00],
       [  1.00000000e+00,   2.49036655e+00,   2.04609755e+00,
         -1.37033610e-01,   4.67891352e-02],
       [  1.00000000e+00,   9.17497982e-01,   3.19968346e+00,
          3.43256366e-01,   4.65712303e-01],
       [  1.00000000e+00,   4.44428259e+00,  -1.43377836e+00,
         -1.20299162e+00,  -2.43532203e-01],
       [  1.00000000e+00,   1.29525533e+00,   4.09495129e+00,
          2.25954796e+00,   3.08073394e+00],
       [  1.00000000e+00,   2.97901813e-01,  -1.07641200e+00,
          2.83585974e+00,  -1.23019377e+00],
       [  1.00000000e+00,   2.55361685e+00,   2.57698481e-01,
         -1.40354230e+00,   1.86674273e+00],
       [  1.00000000e+00,   1.39222991e+00,   8.94948409e-01,
          1.34873286e+00,   3.03824072e+00],
       [  1.00000000e+00,   4.47264834e+00,  -1.30423933e+00,
         -1.13962940e+00,   4.28447421e+00],
       [  1.00000000e+00,   2.30159189e+00,  -1.08531940e-01,
          3.19686781e+00,   2.78229798e+00],
       [  1.00000000e+00,   4.39690817e+00,  -3.43200103e-01,
          4.41271317e+00,   5.01853815e-02],
       [  1.00000000e+00,   4.48160523e+00,   2.75058349e+00,
         -9.38416486e-01,   1.60332118e+00],
       [  1.00000000e+00,   2.46759690e+00,   4.38730807e+00,
          4.27294372e+00,  -7.83307077e-03],
       [  1.00000000e+00,   1.98400518e+00,  -3.56225636e-01,
          3.68926599e+00,   2.45601550e+00],
       [  1.00000000e+00,   2.80221602e+00,  -8.33450115e-01,
          6.74241509e-01,  -5.66943610e-01],
       [  1.00000000e+00,   4.18146299e+00,   5.39663327e-01,
          6.91160520e-02,   1.00635334e+00],
       [  1.00000000e+00,   2.31029874e+00,  -1.09082002e+00,
          6.98964950e-01,   1.81775015e+00],
       [  1.00000000e+00,  -4.22962602e-01,   2.59384287e+00,
          3.42575500e+00,   1.52515355e+00],
       [  1.00000000e+00,   8.50373767e-01,   1.93919424e+00,
          1.20707531e+00,   8.62924749e-01],
       [  1.00000000e+00,   3.80083776e+00,   2.05469921e+00,
          1.55208680e-01,   5.00331981e-01],
       [  1.00000000e+00,   1.98081300e-01,   2.47304846e+00,
          1.77270520e+00,  -2.93249768e-01],
       [  1.00000000e+00,   3.66640681e+00,  -1.40101487e+00,
          3.57091183e+00,   4.20751490e+00],
       [  1.00000000e+00,   2.59124763e+00,   3.70406401e+00,
          2.25495780e+00,   2.39880609e+00],
       [  1.00000000e+00,   1.15051915e+00,   2.55498114e+00,
         -9.42907464e-04,   1.46616012e+00],
       [  1.00000000e+00,   1.70172294e+00,   3.69451299e+00,
          1.34596620e+00,   1.06212318e+00],
       [  1.00000000e+00,   1.15574459e+00,  -8.11013989e-01,
          2.82603663e+00,   9.36208181e-01],
       [  1.00000000e+00,   3.02879815e+00,   3.16959962e+00,
         -3.78029811e-01,   1.64606733e+00],
       [  1.00000000e+00,  -7.64405319e-01,  -1.28785631e+00,
          3.40967322e+00,   2.63215005e+00],
       [  1.00000000e+00,   1.67041377e+00,   9.41527546e-01,
          9.22255620e-01,   3.39695715e+00]])
>>> y
array([  1.23955572e+02,   1.94661183e+00,   4.66863369e+00,
         1.12763195e+01,   1.25785598e+01,   4.19157828e+00,
         1.39590910e+01,   1.66522836e+00,   3.16119583e-01,
         5.05901696e+00,   2.35376422e+01,   4.21434229e+00,
         5.63114293e+00,   3.56719157e+02,   1.01101843e+01,
         3.09961777e+01,   1.47373216e+01,   2.61237052e+01,
         5.08489372e+00,   8.58922310e+00,   2.01905602e+01,
         2.16551316e+01,   1.17504782e+03,   1.03346158e+01,
         4.14794392e+00,   2.64596904e+01,   5.65196785e+00,
         7.10601829e+00,   4.87905504e+01,   1.24505982e+01,
         2.95542728e+00,   3.18667874e+00,   6.02972807e+01,
         5.34694759e+00,   4.24248778e+00,   1.48948436e+02,
         6.84661436e-01,   2.97668568e+02,   1.39310456e+02,
         9.24596651e+00,   1.58996761e+00,   2.52015341e+00,
         3.65466524e+00,   4.21746503e+02,   7.85189237e-01,
         1.24960352e+01,   2.46275025e+02,   1.83603920e+01,
         6.46034586e+00,   1.47183890e+00,   5.04866383e+00,
         3.93870705e+01,   2.07596119e+00,   4.99082465e+01,
         5.76805849e-01,   3.91344024e-01,   5.64374038e+00,
         6.60000572e+01,   3.34612325e+01,   4.08528024e+00,
         1.27321481e+01,   8.41280065e+01,   9.14161740e+00,
         1.10577769e+01,   5.43620917e+01,   4.81300777e+01,
         2.18788678e+00,   5.24257088e+00,   3.59849705e+00,
         1.81239472e+01,   2.65743145e+01,   9.90714320e+00,
         7.52432600e+00,   1.35229641e+01,   2.75371660e+00,
         7.82330198e+01,   2.17518577e+00,   5.83546976e+00,
         1.96904138e+01,   1.76560609e+01,   1.21211519e+01,
         3.22415093e+01,   5.85593069e+01,   1.68771249e+01,
         3.16586309e+01,   3.15921112e+00,   1.63664317e+01,
         6.28861463e+00,   3.66768096e+01,   1.47433165e+01,
         8.01034754e+00,   8.36119355e+00,   7.03867280e+02,
         6.28662855e+00,   1.06547999e+01,   3.75774094e+01,
         8.27040581e+00,   3.80330341e+01,   6.22474574e+00,
         1.53619366e+01])
@josef-pkt
Member

and here's my reduced script (without the GLM and Poisson parts) to replicate; I added a seed and get the same problem

# -*- coding: utf-8 -*-
"""estimating `y = exp(x b)`

(1) : loglinear OLS
(2) : GLM gaussian, log-link   (is non-linear least squares ?)
(3) : GLM Poisson, log-link
(4) : Poisson, uses log-link


data generating process
-----------------------

with and without heteroscedasticity
additive or multiplicative error


Created on Tue Mar 25 12:17:28 2014

Author: Josef Perktold
"""

import numpy as np

import statsmodels.api as sm

np.random.seed(963457)

nobs, k_vars = 100, 5
sig_e = 0.5
beta = np.array([1,1,1,1])
need = k_vars - len(beta)
beta = np.concatenate((beta, np.ones(need)))
beta *= 2. / beta.sum()   # normalize to something to choose
#x = 3 * np.random.rand(nobs, k_vars)  
# with `3 *` values become too large, poisson parameter estimates don't look good
x = 6 * (np.random.rand(nobs, k_vars) - 0.25)
x[:, 0] = 1
y_true = np.exp(x.dot(beta))
sig_e = y_true / y_true.mean() * sig_e
sig_e = np.clip(sig_e, 0, 2)

# multiplicative
error = np.exp(sig_e * np.random.randn(nobs))
y = y_true * error
# additive error need to clip below at zero

mod_ols = sm.OLS(np.log(y), x)
res_ols = mod_ols.fit()
#print(res_ols.summary())

# compare with Negative binomial
mod_nb2 = sm.NegativeBinomial(y,x)
res_nb2 = mod_nb2.fit(start_params=np.concatenate((res_ols.params, [0.5])))
#Optimization terminated successfully.
print(res_nb2.summary())

mod_nb1 = sm.NegativeBinomial(y,x, loglike_method='nb1')
res_nb1 = mod_nb1.fit(start_params=np.concatenate((res_ols.params, [0.5])))
#Warning: Desired error not necessarily achieved due to precision loss.
#print(res_nb1.summary())  # no Hessian with current random seed
# Note: results don't look good

res_nb1_b = mod_nb1.fit(start_params=np.concatenate((res_ols.params, [50])))
print(res_nb1_b.summary())
@jseabold
Member
jseabold commented Apr 2, 2014

I would add some logging to the score and see what's going on. My guess is that it gets into a weird region; NM doesn't use the score. I ran into something similar with ARMA recently: it returned all NaNs, but `converged` from the solver was True.
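A minimal sketch of that logging idea (`logged_score` is a hypothetical helper, not statsmodels API): wrap the model's score so every evaluation a solver requests is recorded, and non-finite gradients are flagged as they appear.

```python
import numpy as np

def logged_score(score_func, log=None):
    """Hypothetical debugging wrapper: record every (params, gradient)
    pair a solver requests, and flag non-finite gradients."""
    if log is None:
        log = []

    def wrapper(params):
        grad = np.asarray(score_func(params))
        log.append((np.asarray(params).copy(), grad.copy()))
        if not np.all(np.isfinite(grad)):
            print("non-finite score at params:", params)
        return grad

    wrapper.log = log
    return wrapper

# usage sketch against a model from this thread (assumed names):
#   mod_nb1.score = logged_score(mod_nb1.score)
#   res = mod_nb1.fit(start_params=..., method='bfgs')
#   print(len(mod_nb1.score.log), "score evaluations recorded")
```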

@josef-pkt
Member

I just realized that the script version uses newton; in the interpreter I used bfgs and lbfgs.

call to get the score:

>>> res_nb1_b.model.score(res_nb1_b.params)
array([ -7.63876770e-05,  -1.48348029e-04,  -1.15866271e-04,
        -6.27801985e-05,  -1.07283293e-04,  -4.12306866e-22])
@josef-pkt
Member

There were also some unclear issues with the analytical derivatives and parameter transformations when we merged NegativeBinomial. I haven't looked into it yet.

aside: skew and kurtosis are very large in this DGP.
I don't know yet whether to do anything about it, but there can be numerical problems if we are not close to the optimal params.

@jseabold
Member
jseabold commented Apr 2, 2014

IIRC, the only unclear issue was whether to present the post-fit statistics for alpha in terms of lnalpha or alpha. Everything should be correct (and tested) though. My guess is that the special functions may be returning NaNs or something in some region of the search space, which the solvers don't handle well.

@jseabold
Member
jseabold commented Apr 2, 2014

It looks like there are two code paths for newton/ncg vs the rest. One optimizes in alpha space; the other uses transparams to go into lnalpha. I don't remember why exactly there are two code paths. It may be discussed in the comments, tests, or PR though.

@jseabold
Member
jseabold commented Apr 2, 2014

Luckily we discussed this (at length) in PR #795. Don't get bogged down in my earlier comments there; the discussion starts to converge on actually making sense after I refactored the code for the analytic Hessians. I'm going to take some convincing that the code is not right and that it's not a numerical issue. It's easy to add logging to the score and see.

@jseabold
Member
jseabold commented Apr 2, 2014

If the code weren't right, it wouldn't work at all for optimization, instead of there being corner-case failures.

@jseabold
Member
jseabold commented Apr 2, 2014

Looking at my last comment on that PR, I don't know anymore... Sorry, but I don't have a huge chunk of time to dig into this again.

@josef-pkt
Member

related: To test the score in another model, I added the numerical derivative as an extra private method, to check the analytical score against.
Similarly, it would be possible to call numerical derivatives in a super call, if the superclass defines them (not yet in LikelihoodModel, but available in GenericLikelihoodModel).
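The `nd.approx_fprime_cs` check used in the sessions below can be sketched in a few lines (hand-rolled here for illustration; the real helper lives in `statsmodels.tools.numdiff`): perturb each parameter by a tiny imaginary step, and the imaginary part of the log-likelihood divided by the step size is the derivative, with no subtractive cancellation.

```python
import numpy as np

def approx_fprime_cs(params, f, h=1e-20):
    """Hand-rolled complex-step gradient (the idea behind
    statsmodels.tools.numdiff.approx_fprime_cs). Requires f to be
    analytic in its arguments (no abs or comparisons on the
    perturbed values)."""
    params = np.asarray(params, dtype=complex)
    grad = np.empty(params.size)
    for k in range(params.size):
        p = params.copy()
        p[k] += 1j * h          # tiny imaginary perturbation
        grad[k] = f(p).imag / h  # derivative to machine precision
    return grad

# toy log-likelihood-shaped function with a known gradient -p + 1/(p + 3)
def loglike(p):
    return -0.5 * np.sum(p ** 2) + np.sum(np.log(p + 3))

p0 = np.array([0.5, 1.5])
print(approx_fprime_cs(p0, loglike))
```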

@josef-pkt
Member

untransformed score looks good

>>> nd.approx_fprime_cs(res_nb1_n.params, res_nb1_nm.model.loglike)
array([ -1.44497769e-04,  -8.51417965e-05,  -7.36037004e-04,
        -3.06818971e-04,  -3.04473978e-04,  -1.23255466e-08])
>>> res_nb1_nm.model.score(res_nb1_n.params)
array([ -1.44497769e-04,  -8.51417965e-05,  -7.36037004e-04,
        -3.06818971e-04,  -3.04473978e-04,  -1.23255466e-08])
>>> nd.approx_fprime_cs(res_nb1_nm.params, res_nb1_nm.model.loglike)
array([ -1.46689471e-03,   1.41427569e-03,   5.56680455e-03,
        -5.55133301e-03,  -3.38088669e-03,   5.99298099e-07])
>>> res_nb1_nm.model.score(res_nb1_nm.params)
array([ -1.46689471e-03,   1.41427569e-03,   5.56680455e-03,
        -5.55133301e-03,  -3.38088669e-03,   5.99298099e-07])

with transformed params

>>> res_nb1_nm.model._transparams
False
>>> res_nb1_nm.model._transparams = True
>>> res_nb1_nm.model.score(res_nb1_nm.params)
array([ nan,  nan,  nan,  nan,  nan,  nan])
>>> nd.approx_fprime_cs(res_nb1_nm.params, res_nb1_nm.model.loglike)
array([ nan,  nan,  nan,  nan,  nan,  nan])
>>> res_nb1_nm.model.score(res_nb1_n.params)
array([ nan,  nan,  nan,  nan,  nan,  nan])
>>> nd.approx_fprime_cs(res_nb1_n.params, res_nb1_nm.model.loglike)
array([ nan,  nan,  nan,  nan,  nan,  nan])

>>> start_params=np.concatenate((res_ols.params, [20]))
>>> nd.approx_fprime_cs(start_params, res_nb1_nm.model.loglike)
array([  99.99984084,  142.30063289,  112.97414007,  147.60326816,
        173.07226423,  -99.9997753 ])
>>> res_nb1_nm.model.score(start_params)
array([  9.99998408e+01,   1.42300633e+02,   1.12974140e+02,
         1.47603268e+02,   1.73072264e+02,  -2.06114899e-07])

the last number (the alpha score) looks wrong

@josef-pkt
Member

My guess was that there was a missing transformation, but it might only have a visible effect if alpha is large or small enough.

reverse engineering:

with transform = False

>>> start_params=np.concatenate((res_ols.params, [np.log(50)]))
>>> res_nb1_nm.model.score(start_params)
array([ 121.81145261,  308.66268649,  173.2680627 ,  369.69518077,
        319.28101175,   10.03401603])
>>> nd.approx_fprime_cs(start_params, res_nb1_nm.model.loglike)
array([ 121.81145261,  308.66268649,  173.2680627 ,  369.69518077,
        319.28101175,  501.70080142])
>>> 501.70080142 / 50
10.0340160284

I can look into it

@jseabold
Member
jseabold commented Apr 2, 2014

Hmm, yeah. It looks like the dalpha calculation there isn't right.

@josef-pkt
Member

I haven't looked at the code yet.

If you see where we need to make the change, then make a PR with your guess and I go over it, and add a test case.

@jseabold
Member
jseabold commented Apr 2, 2014

It looks like you may have spotted it.

@jseabold
Member
jseabold commented Apr 2, 2014
[~/statsmodels/statsmodels-skipper/statsmodels/]
[38]: res_nb1_nm.model._transparams
[38]: False

[~/statsmodels/statsmodels-skipper/statsmodels/]
[39]: nd.approx_fprime_cs(start_params, res_nb1_nm.model.loglike)
[39]: 
array([ 121.81145261,  308.66268649,  173.2680627 ,  369.69518077,
        319.28101175,   10.03401603])

[~/statsmodels/statsmodels-skipper/statsmodels/]
[40]: res_nb1_nm.model.score(start_params)
[40]: 
array([ 121.81145261,  308.66268649,  173.2680627 ,  369.69518077,
        319.28101175,   10.03401603])
@jseabold
Member
jseabold commented Apr 2, 2014
[~/statsmodels/statsmodels-skipper/statsmodels/]
[41]: res_nb1_nm.model._transparams = True

[~/statsmodels/statsmodels-skipper/statsmodels/]
[42]: nd.approx_fprime_cs(start_params, res_nb1_nm.model.loglike)
[42]: 
array([ 100.        ,  142.30107798,  112.97458042,  147.60373115,
        173.07275067, -100.        ])

[~/statsmodels/statsmodels-skipper/statsmodels/]
[43]: res_nb1_nm.model.score(start_params)
[43]: 
array([ 100.        ,  142.30107798,  112.97458042,  147.60373115,
        173.07275067,   -0.        ])

It does look like it's the transparams handling somewhere after all.

@jseabold
Member
jseabold commented Apr 2, 2014

I should probably actually think through the code, but I'm kind of hoping it's just obvious to you what's going on.

[~/statsmodels/statsmodels-skipper/statsmodels/]
[49]: score = res_nb1_nm.model.score(start_params)

[~/statsmodels/statsmodels-skipper/statsmodels/]
[50]: score[-1] * np.exp(start_params[-1])
[50]: -99.999999999999787

[~/statsmodels/statsmodels-skipper/statsmodels/]
[51]: res_nb1_nm.model._transparams
[51]: True
@jseabold
Member
jseabold commented Apr 2, 2014

I think the score needs the transformation for the derivative on the way back out:

    if self._transparams:
        return np.r_[dparams.sum(0), dalpha*alpha]
    else:
        return np.r_[dparams.sum(0), dalpha]
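The factor-of-alpha discrepancy observed above (501.7 / 50 = 10.03) is exactly the chain rule for the log transform: when the solver works in lnalpha space, the gradient it needs is d loglike / d lnalpha = (d loglike / d alpha) * alpha. A toy check (the `loglike` here is an arbitrary smooth stand-in, not the NB1 likelihood):

```python
import numpy as np

def loglike(alpha):
    # arbitrary smooth function of alpha with a known derivative
    return -0.5 * (np.log(alpha) - 1.0) ** 2

def dloglike_dalpha(alpha):
    return -(np.log(alpha) - 1.0) / alpha

alpha = 50.0
lnalpha = np.log(alpha)

# numerical derivative taken in lnalpha space
h = 1e-6
num = (loglike(np.exp(lnalpha + h)) - loglike(np.exp(lnalpha - h))) / (2 * h)

# analytical derivative in alpha space, with the chain-rule factor
ana = dloglike_dalpha(alpha) * alpha

assert np.isclose(num, ana)
# dropping the `* alpha` factor leaves the two off by exactly alpha,
# the same pattern as the 501.7 vs 10.03 values above
```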
@josef-pkt
Member

Yes, I can find it. There is just an alpha or 1./alpha factor missing in the chain rule when _transparams=True.

I just have to look where to add it.

general aside: When I didn't want to hardcode the derivative w.r.t. one extra parameter, I mixed analytical derivatives for the mean parameters with a numerical derivative for the extra parameter. That uses the analytical result for the easy part and the generic one for the difficult part, or the part that was still changing.
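That mixed strategy can be sketched as follows (all names here are hypothetical, not statsmodels API): the analytical score covers the mean parameters, and a central finite difference on the full log-likelihood covers the last, extra parameter.

```python
import numpy as np

def mixed_score(params, score_mean, loglike, h=1e-6):
    """Sketch of the mixed strategy: `score_mean` gives the analytical
    gradient for the mean parameters, and a central finite difference
    on `loglike` handles the extra parameter (e.g. alpha) whose
    derivative is hard to hardcode or still changing."""
    beta, extra = params[:-1], params[-1]
    g_beta = score_mean(beta, extra)                   # analytical part
    up = np.r_[beta, extra + h]
    dn = np.r_[beta, extra - h]
    g_extra = (loglike(up) - loglike(dn)) / (2.0 * h)  # numerical part
    return np.r_[g_beta, g_extra]

# toy check with loglike(p) = -0.5 * sum(p**2), whose score is -p
toy_loglike = lambda p: -0.5 * np.sum(p ** 2)
toy_score_mean = lambda beta, extra: -beta
print(mixed_score(np.array([1.0, 2.0, 3.0]), toy_score_mean, toy_loglike))
```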

@jseabold
Member
jseabold commented Apr 2, 2014

The example is sensitive to start values but looks better with the fix in #1540. You need to select an lnalpha >= ~0. We might consider adding this to the docs?

[~/statsmodels/statsmodels-skipper/statsmodels/discrete/tests/]
[34]: mod_nb1.fit(start_params=np.r_[res_ols.params, .99] , method='nm', maxiter=100000).summary()
Optimization terminated successfully.
        Current function value: 5.326293
        Iterations: 622
        Function evaluations: 951
[34]: 
<class 'statsmodels.iolib.summary.Summary'>
"""
                    NegativeBinomial Regression Results                      
==============================================================================
Dep. Variable:                      y   No. Observations:                  100
Model:               NegativeBinomial   Df Residuals:                       95
Method:                           MLE   Df Model:                            4
Date:                Wed, 02 Apr 2014   Pseudo R-squ.:                 0.01533
Time:                        14:02:56   Log-Likelihood:                -532.63
converged:                       True   LL-Null:                       -540.92
                                        LLR p-value:                  0.002325
==============================================================================
                coef    std err          z      P>|z|      [95.0% Conf. Int.]
------------------------------------------------------------------------------
const          5.0918      0.301     16.929      0.000         4.502     5.681
x1             0.1401      0.059      2.366      0.018         0.024     0.256
x2             0.0911      0.054      1.672      0.094        -0.016     0.198
x3             0.1533      0.061      2.498      0.012         0.033     0.274
x4             0.1118      0.054      2.072      0.038         0.006     0.218
alpha       1348.9428    308.216      4.377      0.000       744.851  1953.035
==============================================================================
"""

[~/statsmodels/statsmodels-skipper/statsmodels/discrete/tests/]
[35]: mod_nb1.fit(start_params=np.r_[res_ols.params, .99], maxiter=1000).summary()
Optimization terminated successfully.
        Current function value: 5.326293
        Iterations: 55
        Function evaluations: 64
        Gradient evaluations: 64
[35]: 
<class 'statsmodels.iolib.summary.Summary'>
"""
                    NegativeBinomial Regression Results                      
==============================================================================
Dep. Variable:                      y   No. Observations:                  100
Model:               NegativeBinomial   Df Residuals:                       95
Method:                           MLE   Df Model:                            4
Date:                Wed, 02 Apr 2014   Pseudo R-squ.:                 0.01533
Time:                        14:03:05   Log-Likelihood:                -532.63
converged:                       True   LL-Null:                       -540.92
                                        LLR p-value:                  0.002325
==============================================================================
                coef    std err          z      P>|z|      [95.0% Conf. Int.]
------------------------------------------------------------------------------
const          5.0918      0.301     16.928      0.000         4.502     5.681
x1             0.1401      0.059      2.366      0.018         0.024     0.256
x2             0.0911      0.054      1.673      0.094        -0.016     0.198
x3             0.1533      0.061      2.498      0.012         0.033     0.274
x4             0.1118      0.054      2.071      0.038         0.006     0.218
alpha       1348.9425    308.223      4.377      0.000       744.837  1953.048
==============================================================================
"""

[~/statsmodels/statsmodels-skipper/statsmodels/discrete/tests/]
[36]: mod_nb1.fit(start_params=np.r_[res_ols.params, .99], maxiter=1000, method='bfgs').summary()
Optimization terminated successfully.
        Current function value: 5.326293
        Iterations: 55
        Function evaluations: 64
        Gradient evaluations: 64
[36]: 
<class 'statsmodels.iolib.summary.Summary'>
"""
                    NegativeBinomial Regression Results                      
==============================================================================
Dep. Variable:                      y   No. Observations:                  100
Model:               NegativeBinomial   Df Residuals:                       95
Method:                           MLE   Df Model:                            4
Date:                Wed, 02 Apr 2014   Pseudo R-squ.:                 0.01533
Time:                        14:03:20   Log-Likelihood:                -532.63
converged:                       True   LL-Null:                       -540.92
                                        LLR p-value:                  0.002325
==============================================================================
                coef    std err          z      P>|z|      [95.0% Conf. Int.]
------------------------------------------------------------------------------
const          5.0918      0.301     16.928      0.000         4.502     5.681
x1             0.1401      0.059      2.366      0.018         0.024     0.256
x2             0.0911      0.054      1.673      0.094        -0.016     0.198
x3             0.1533      0.061      2.498      0.012         0.033     0.274
x4             0.1118      0.054      2.071      0.038         0.006     0.218
alpha       1348.9425    308.223      4.377      0.000       744.837  1953.048
==============================================================================
"""
@josef-pkt
Member

I'd like to add better default starting values to Poisson and similar models.

The numbers look a bit strange, but everything looks fine now. I think the seed is not very good in this case.

>>> y.max()
28131.193260885393

My guess is that this dataset would make a good test case for robust Poisson or negbin regression: because of the fat tails we can get results that are not close to the DGP.
My previous example without a seed looked better.

The nb2 parameter estimates are closer to the true DGP parameters.

None of the models is correctly specified for the DGP.

@jseabold jseabold closed this in #1540 Apr 2, 2014