# Zero-inflated Negative Binomial Model Test
This notebook shows that stan and statsmodels produce very close results on a simulated dataset, provided that statsmodels converges. The first example also cross-validates the stan and statsmodels implementations against each other. However, with a slight change to the simulation parameters, statsmodels fails to return results.

In [1]:
import sys
sys.path.append("..")
from models.zinb import ZINB
sys.path.remove("..")

import numpy as np

In [2]:
from scipy.stats import uniform, binom, nbinom, bernoulli
import statsmodels.api as sm

np.random.seed(1)                   # set seed to replicate example
nobs = 25000                        # number of observations

x1 = binom.rvs(1, 0.6, size=nobs)   # binary explanatory variable
x2 = uniform.rvs(size=nobs)         # continuous explanatory variable

theta = 0.5                         # second argument (p) of nbinom.rvs below
X = sm.add_constant(np.column_stack((x1, x2)))
beta = [1.0, 0.8, -0.5]
xb = np.dot(X, beta)                # linear predictor

exb = np.exp(xb)                    # first argument (n) of nbinom.rvs below

xc = 1.6
exc = 1.0 / (1.0 + np.exp(-xc))     # probability that a count is not a structural zero

p = bernoulli.rvs(exc, size=(nobs, 1))   # 1 = keep the count, 0 = structural zero

nby = nbinom.rvs(exb, theta).reshape((-1, 1)) * p   # zero-inflated counts
X_infl = np.ones((nobs, 1))         # intercept-only design for the zero-inflation part
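
The same simulation is repeated later with different coefficients, so it may help to wrap it in a function. The following is a minimal sketch, not the notebook's code: the name simulate_zinb is hypothetical, and it uses NumPy's Generator API instead of scipy.stats with np.random.seed, so its draws will not match the data generated above.

```python
import numpy as np

def simulate_zinb(beta, xc, theta=0.5, nobs=25000, seed=1):
    """Simulate zero-inflated counts following the recipe in the cell above.

    Hypothetical helper: it uses numpy's Generator API, so the random draws
    differ from the scipy-based cell even with the same seed.
    """
    rng = np.random.default_rng(seed)
    x1 = rng.binomial(1, 0.6, size=nobs)           # binary explanatory variable
    x2 = rng.uniform(size=nobs)                    # continuous explanatory variable
    X = np.column_stack((np.ones(nobs), x1, x2))   # constant column first
    exb = np.exp(X @ np.asarray(beta, dtype=float))
    keep_prob = 1.0 / (1.0 + np.exp(-xc))          # P(not a structural zero)
    keep = rng.binomial(1, keep_prob, size=(nobs, 1))
    counts = rng.negative_binomial(exb, theta).reshape((-1, 1))
    y = counts * keep                              # zero out the structural zeros
    X_infl = np.ones((nobs, 1))                    # intercept-only inflation design
    return y, X, X_infl

# Small example run; the *_sim names avoid overwriting the notebook's variables.
y_sim, X_sim, X_infl_sim = simulate_zinb(beta=[1.0, 0.8, -0.5], xc=1.6, nobs=2000)
print(y_sim.shape, X_sim.shape, X_infl_sim.shape)
```

Calling it with beta=[1.0, 2.0, -1.5] and xc=1.2 would give the analogue of the second example.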

In [3]:
mod = ZINB(nby,X,exog_infl=X_infl,model_path='../models')

In [4]:
res0=mod.fit(method='stan')[0]
res1=mod.fit(method='statsmodels')[0]

True values of the parameters (zero-inflation intercept, then the regression coefficients, then theta):

In [5]:
[-xc]+beta+[theta]

[-1.6, 1.0, 0.8, -0.5, 0.5]
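
The first entry is -xc rather than xc because, in the simulation, exc = sigmoid(xc) is the probability that a count is kept, so the probability of a structural zero is 1 - sigmoid(xc) = sigmoid(-xc). This assumes that the zero-inflation part of the model is a logit for the structural-zero probability, which is what the -xc in the cell above implies. A quick numerical check of the identity:

```python
import math

def sigmoid(z):
    """Logistic function."""
    return 1.0 / (1.0 + math.exp(-z))

xc = 1.6
p_keep = sigmoid(xc)               # exc in the simulation cell
p_structural_zero = 1.0 - p_keep   # probability that the count is forced to zero

# The structural-zero probability equals sigmoid(-xc), hence the sign flip.
print(p_keep, p_structural_zero, sigmoid(-xc))
```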

In [6]:
res0

{'params': array([-1.5140275 ,  1.012603  ,  0.77146666, -0.45672494,  1.38959896]),
 'llf': -54171.74256677648,
 'df': 5,
 'aic': 108353.48513355295,
 'cpu_time': 0.30619287490844727,
 'model': 'zinb',
 'method': 'stan'}

In [7]:
res1

{'params': array([-1.51405322,  1.01260657,  0.77150181, -0.45679195,  0.24917075]),
 'llf': -54171.74256465311,
 'df': 5,
 'aic': 108353.48512930622,
 'cpu_time': 1.131302833557129,
 'model': 'zinb',
 'method': 'statsmodels'}

In [8]:
np.exp(res0['params'][-1])*res1['params'][-1]

0.9999821084306637

In [9]:
res0['llf']-res1['llf']

-2.1233645384199917e-06

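stan and statsmodels return almost identical results on the simulated data, provided that statsmodels converges: the first four parameters agree to about four decimal places and the log-likelihoods differ by about 2e-6. The last parameter is reported on different scales by the two methods. As cell [8] shows, the exponential of the stan value times the statsmodels value is approximately 1, so the two estimates are equivalent.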
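
The relationship checked in cell [8] can be written as a small conversion helper. This is a sketch based only on the numbers printed above: the function name is hypothetical, and the exp(-x) mapping for the last parameter is inferred from the outputs rather than taken from either library's documentation.

```python
import numpy as np

def stan_to_statsmodels(params):
    """Return a copy of params with the last entry mapped to exp(-x).

    Hypothetical helper; the mapping is inferred from the printed fits.
    """
    out = np.array(params, dtype=float)   # np.array copies, so the input is untouched
    out[-1] = np.exp(-out[-1])
    return out

# Parameter vectors copied from the res0 (stan) and res1 (statsmodels) outputs above.
stan_params = np.array([-1.5140275, 1.012603, 0.77146666, -0.45672494, 1.38959896])
sm_params = np.array([-1.51405322, 1.01260657, 0.77150181, -0.45679195, 0.24917075])

converted = stan_to_statsmodels(stan_params)
print(np.max(np.abs(converted - sm_params)))   # largest absolute difference
```

Working on a copy matters here, because the parameter arrays inside the result dictionaries are reused in later cells.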

In [10]:
res2=mod.fit(method='tensorflow')[0]

Metal device set to: Apple M2


In [11]:
res2

{'params': array([-1.514226  ,  1.0128202 ,  0.77172554, -0.45652267,  1.3894166 ],
       dtype=float32),
 'llf': -54171.73121811506,
 'aic': 108353.46243623013,
 'df': 5,
 'cpu_time': 2.5477070808410645,
 'model': 'zinb',
 'method': 'tensorflow'}

In [12]:
res0['llf']-res2['llf']

-0.011348661413649097

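TensorZINB returns essentially the same results as stan: the parameters agree to within about 3e-4 and the log-likelihoods differ by about 0.01.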

In [13]:
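from scipy.stats import uniform, binom, nbinom, bernoulli
import statsmodels.api as sm

np.random.seed(1)                   # set seed to replicate example
nobs = 25000                        # number of observations

x1 = binom.rvs(1, 0.6, size=nobs)   # binary explanatory variable
x2 = uniform.rvs(size=nobs)         # continuous explanatory variable

theta = 0.5                         # second argument (p) of nbinom.rvs below
X = sm.add_constant(np.column_stack((x1, x2)))
beta = [1.0, 2.0, -1.5]             # changed from [1.0, 0.8, -0.5]
xb = np.dot(X, beta)                # linear predictor

exb = np.exp(xb)                    # first argument (n) of nbinom.rvs below

xc = 1.2                            # changed from 1.6
exc = 1.0 / (1.0 + np.exp(-xc))     # probability that a count is not a structural zero

p = bernoulli.rvs(exc, size=(nobs, 1))   # 1 = keep the count, 0 = structural zero

nby = nbinom.rvs(exb, theta).reshape((-1, 1)) * p   # zero-inflated counts
X_infl = np.ones((nobs, 1))         # intercept-only design for the zero-inflation part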

In [14]:
mod = ZINB(nby,X,exog_infl=X_infl,model_path='../models')

In [15]:
res0=mod.fit(method='stan')[0]
res1=mod.fit(method='statsmodels')[0]

In [16]:
[-xc]+beta+[theta]

[-1.2, 1.0, 2.0, -1.5, 0.5]

In [17]:
res0

{'params': array([-0.99293726,  1.11499893,  1.88851324, -1.4900054 , 23.40440002]),
 'llf': -56680.87932371076,
 'df': 5,
 'aic': 113371.75864742152,
 'cpu_time': 0.24606108665466309,
 'model': 'zinb',
 'method': 'stan'}

In [18]:
res1

{'params': array([-415.77309263,  -83.61421954, -185.98976246,  -56.76172747,
        -402.59494368]),
 'llf': nan,
 'df': 5,
 'aic': nan,
 'cpu_time': 3.7377138137817383,
 'model': 'zinb',
 'method': 'statsmodels'}

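We change the simulation parameters only slightly (beta from [1.0, 0.8, -0.5] to [1.0, 2.0, -1.5] and xc from 1.6 to 1.2). statsmodels does not converge (its log-likelihood is nan), while stan still returns parameter estimates.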

In [19]:
res2=mod.fit(method='tensorflow')[0]

In [20]:
res2

{'params': array([-1.0538806,  1.0795665,  1.9260335, -1.4970527,  2.2396865],
       dtype=float32),
 'llf': -54906.80794988855,
 'aic': 109823.6158997771,
 'df': 5,
 'cpu_time': 0.8851919174194336,
 'model': 'zinb',
 'method': 'tensorflow'}

In [21]:
res2['llf']-res0['llf']

1774.0713738222112

<!-- TensorZINB returns the same results as stan -->
The TensorZINB log-likelihood is higher than stan's by about 1774, which indicates that stan has converged to a local optimum.
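
Comparing fits by log-likelihood can be automated. The sketch below uses only the llf values printed above; the helper name best_fit is hypothetical, and it skips fits whose log-likelihood is nan.

```python
import math

def best_fit(results):
    """Return the result dict with the highest finite log-likelihood."""
    finite = [r for r in results if math.isfinite(r["llf"])]
    if not finite:
        raise ValueError("no fit has a finite log-likelihood")
    return max(finite, key=lambda r: r["llf"])

# llf values copied from the res0, res1 and res2 outputs of the second example.
fits = [
    {"method": "stan", "llf": -56680.87932371076},
    {"method": "statsmodels", "llf": float("nan")},
    {"method": "tensorflow", "llf": -54906.80794988855},
]

best = best_fit(fits)
print(best["method"], best["llf"] - fits[0]["llf"])
```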

In [22]:
res3=mod.fit(method='stan',start_params=res2['params'])[0]

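# Copy before rescaling so that res2['params'] is not modified in place.
start_params = res2['params'].copy()
# statsmodels reports the last parameter on a different scale (see cell [8]).
start_params[-1] = np.exp(-start_params[-1])
res4 = mod.fit(method='statsmodels', start_params=start_params)[0]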

In [23]:
res3

{'params': array([-1.05385225,  1.07903573,  1.92554664, -1.49730359,  2.24009856]),
 'llf': -54906.77331677281,
 'df': 5,
 'aic': 109823.54663354561,
 'cpu_time': 0.0932009220123291,
 'model': 'zinb',
 'method': 'stan'}

When stan is initialized with the TensorZINB results, it converges to a higher log-likelihood (-54906.77 instead of -56680.88), which again indicates that the earlier stan fit was a local optimum.

In [24]:
start_params

array([-1.0538806 ,  1.0795665 ,  1.9260335 , -1.4970527 ,  0.10649189],
      dtype=float32)

In [25]:
res4

{'params': array([-1.05345518,  1.07908155,  1.92561293, -1.49752726,  0.10644205]),
 'llf': -54906.77287275191,
 'df': 5,
 'aic': 109823.54574550383,
 'cpu_time': 0.6428148746490479,
 'model': 'zinb',
 'method': 'statsmodels'}

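When statsmodels is initialized with the TensorZINB results (with the last parameter converted to its scale), it returns valid results that match the other two methods. This again indicates that statsmodels has numerical stability issues in general and depends on good starting values here.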
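
The warm-start pattern used in cell [22] can be wrapped in a fallback routine. This is a sketch under assumptions: fit_with_warm_start and the stub model are hypothetical, the only interface assumed is the fit(method=..., start_params=...) call returning a list of result dictionaries as shown above, and the exp(-x) rescaling of the last parameter is the one applied in cell [22].

```python
import numpy as np

def fit_with_warm_start(mod, warm_method="tensorflow"):
    """Fit with statsmodels; on a non-finite llf, retry from another method's estimate."""
    res = mod.fit(method="statsmodels")[0]
    if np.isfinite(res["llf"]):
        return res
    warm = mod.fit(method=warm_method)[0]
    start = np.array(warm["params"], dtype=float)   # copy, leave warm["params"] intact
    start[-1] = np.exp(-start[-1])                  # rescale the last parameter
    return mod.fit(method="statsmodels", start_params=start)[0]


class StubModel:
    """Stand-in for the notebook's ZINB object, used only to exercise the logic."""

    def fit(self, method, start_params=None):
        # Mimic the failure above: a cold statsmodels start returns a nan llf.
        if method == "statsmodels" and start_params is None:
            return [{"params": np.zeros(2), "llf": float("nan"), "method": method}]
        params = np.array([1.0, 2.0]) if start_params is None else np.asarray(start_params)
        return [{"params": params, "llf": -10.0, "method": method}]


res = fit_with_warm_start(StubModel())
print(res["method"], res["params"])
```

With the real model, the same function would be called as fit_with_warm_start(mod).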

## Test TensorFlow init

In [26]:
mod = ZINB(np.concatenate((nby,nby),axis=1),X,exog_infl=X_infl,model_path='../models')

In [27]:
res6=mod.fit(method='tensorflow',start_params=res2['params'])
res6

[{'params': array([-1.0536436,  1.0793037,  1.9258136, -1.4972854,  2.239911 ],
        dtype=float32),
  'llf': -54907.19857488814,
  'aic': 109824.39714977628,
  'df': 5,
  'cpu_time': 0.7104119062423706,
  'model': 'zinb',
  'method': 'tensorflow'},
 {'params': array([-1.0536436,  1.0793037,  1.9258136, -1.4972854,  2.239911 ],
        dtype=float32),
  'llf': -54907.19857488814,
  'aic': 109824.39714977628,
  'df': 5,
  'cpu_time': 0.7104119062423706,
  'model': 'zinb',
  'method': 'tensorflow'}]

In [28]:
res7=mod.fit(method='tensorflow',start_params=[r['params'] for r in res6])
res7

[{'params': array([-1.0537659,  1.0794762,  1.9259092, -1.4971656,  2.2398   ],
        dtype=float32),
  'llf': -54907.41732488814,
  'aic': 109824.83464977628,
  'df': 5,
  'cpu_time': 0.20449388027191162,
  'model': 'zinb',
  'method': 'tensorflow'},
 {'params': array([-1.0537659,  1.0794762,  1.9259092, -1.4971656,  2.2398   ],
        dtype=float32),
  'llf': -54907.41732488814,
  'aic': 109824.83464977628,
  'df': 5,
  'cpu_time': 0.20449388027191162,
  'model': 'zinb',
  'method': 'tensorflow'}]