# Module biogeme.results

## Examples of use of each function

This page is for programmers who need usage examples for the functions of the class. The examples only illustrate the syntax and do not correspond to any meaningful model. For examples of models, visit [biogeme.epfl.ch](http://biogeme.epfl.ch).

In [1]:
import datetime
print(datetime.datetime.now())

2023-08-04 18:40:49.146423


In [2]:
import biogeme.version as ver
print(ver.getText())

biogeme 3.2.12 [2023-08-04]
Home page: http://biogeme.epfl.ch
Submit questions to https://groups.google.com/d/forum/biogeme
Michel Bierlaire, Transport and Mobility Laboratory, Ecole Polytechnique Fédérale de Lausanne (EPFL)



In [3]:
import numpy as np
import pandas as pd

In [4]:
import biogeme.biogeme as bio
import biogeme.database as db
import biogeme.results as res
from biogeme.expressions import Beta, Variable, exp

## Definition of a database

In [5]:
df = pd.DataFrame({'Person': [1, 1, 1, 2, 2],
                   'Exclude': [0, 0, 1, 0, 1],
                   'Variable1': [1, 2, 3, 4, 5],
                   'Variable2': [10, 20, 30, 40, 50],
                   'Choice': [1, 2, 3, 1, 2],
                   'Av1': [0, 1, 1, 1, 1],
                   'Av2': [1, 1, 1, 1, 1],
                   'Av3': [0, 1, 1, 1, 1]})
myData = db.Database('test', df)

## Definition of various expressions

In [6]:
Variable1 = Variable('Variable1')
Variable2 = Variable('Variable2')
beta1 = Beta('beta1', -1.0, -3, 3, 0)
beta2 = Beta('beta2', 2.0, -3, 10, 0)
likelihood = -beta1**2 * Variable1 - exp(beta2 * beta1) * \
    Variable2 - beta2**4
simul = beta1 / Variable1 + beta2 / Variable2
dictOfExpressions = {'loglike':likelihood,
                     'beta1':beta1,
                     'simul':simul}

## Creation of the BIOGEME object

In [7]:
myBiogeme = bio.BIOGEME(myData, dictOfExpressions)
myBiogeme.modelName = 'simpleExample'
myBiogeme.bootstrap_samples = 10
results = myBiogeme.estimate(run_bootstrap=True)
print(results)

100%|██████████| 10/10 [00:00<00:00, 600.81it/s]


Results for model simpleExample
Output file (HTML):			simpleExample.html
Nbr of parameters:		2
Sample size:			5
Excluded data:			0
Init log likelihood:		-115.3003
Final log likelihood:		-67.06549
Likelihood ratio test (init):		96.4696
Rho square (init):			0.418
Rho bar square (init):			0.401
Akaike Information Criterion:	138.131
Bayesian Information Criterion:	137.3499
Final gradient norm:		1.152097e-07
beta1          : -1.27[0.115 -11.1 0][0.0137 -92.8 0][0.0181 -70.3 0]
beta2          : 1.25[0.0848 14.7 0][0.0591 21.1 0][0.0809 15.4 0]
('beta2', 'beta1'):	0.00167	0.171	19.3	0	0.000811	1	55.6	0
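The initial and final log likelihood values reported above can be checked by hand, because the `loglike` expression is summed over the rows of the database. The following is a minimal plain-Python sketch under that assumption, not Biogeme's implementation. The helper name `total_loglike` is hypothetical, and the final parameter values are the ones returned by `getBetaValues()` further down this page.

```python
import math

# Columns of the database defined earlier on this page.
variable1 = [1, 2, 3, 4, 5]
variable2 = [10, 20, 30, 40, 50]


def total_loglike(beta1, beta2):
    """Sum -beta1**2 * V1 - exp(beta2 * beta1) * V2 - beta2**4 over all rows.

    Hypothetical helper mirroring the 'loglike' expression of this example.
    """
    return sum(
        -beta1**2 * v1 - math.exp(beta2 * beta1) * v2 - beta2**4
        for v1, v2 in zip(variable1, variable2)
    )


# Initial values of the Beta objects: beta1 = -1.0, beta2 = 2.0.
init_ll = total_loglike(-1.0, 2.0)

# Estimated values as reported by getBetaValues().
final_ll = total_loglike(-1.273263987213694, 1.2487688099301162)

print(f'Init log likelihood:  {init_ll:.7g}')
print(f'Final log likelihood: {final_ll:.7g}')
```

Both numbers should agree with the report above up to rounding.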






Dump the results to a file

In [8]:
the_pickle_file = results.writePickle()
print(the_pickle_file)

simpleExample~00.pickle


Results can be imported from a previously generated file

In [9]:
readResults = res.bioResults(pickleFile=the_pickle_file)
print(readResults)


Results for model simpleExample
Output file (HTML):			simpleExample.html
Nbr of parameters:		2
Sample size:			5
Excluded data:			0
Init log likelihood:		-115.3003
Final log likelihood:		-67.06549
Likelihood ratio test (init):		96.4696
Rho square (init):			0.418
Rho bar square (init):			0.401
Akaike Information Criterion:	138.131
Bayesian Information Criterion:	137.3499
Final gradient norm:		1.152097e-07
beta1          : -1.27[0.115 -11.1 0][0.0137 -92.8 0][0.0181 -70.3 0]
beta2          : 1.25[0.0848 14.7 0][0.0591 21.1 0][0.0809 15.4 0]
('beta2', 'beta1'):	0.00167	0.171	19.3	0	0.000811	1	55.6	0
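The summary statistics in this report follow from the two log likelihood values, the number of parameters and the sample size. The sketch below recomputes them with the standard textbook definitions. It is an illustration of those formulas, assumed to match what Biogeme reports, and the variable names are chosen here for readability.

```python
import math

n_parameters = 2
sample_size = 5
init_ll = -115.30029248549191   # 'Init log likelihood'
final_ll = -67.06549047946355   # 'Final log likelihood'

# Likelihood ratio test against the initial model.
lr_test = -2.0 * (init_ll - final_ll)

# Rho-square and its version penalized by the number of parameters.
rho_square = 1.0 - final_ll / init_ll
rho_bar_square = 1.0 - (final_ll - n_parameters) / init_ll

# Information criteria.
aic = 2.0 * n_parameters - 2.0 * final_ll
bic = n_parameters * math.log(sample_size) - 2.0 * final_ll

for name, value in [
    ('Likelihood ratio test (init)', lr_test),
    ('Rho square (init)', rho_square),
    ('Rho bar square (init)', rho_bar_square),
    ('Akaike Information Criterion', aic),
    ('Bayesian Information Criterion', bic),
]:
    print(f'{name}: {value:.7g}')
```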



Results can be formatted in LaTeX

In [10]:
print(readResults.getLaTeX())

%% This file is designed to be included into a LaTeX document
%% See http://www.latex-project.org for information about LaTeX
%% simpleExample - Report from biogeme 3.2.12 [2023-08-04]
%% biogeme 3.2.12 [2023-08-04]
%% Version entirely written in Python
%% Home page: http://biogeme.epfl.ch
%% Submit questions to https://groups.google.com/d/forum/biogeme
%% Michel Bierlaire, Transport and Mobility Laboratory, Ecole Polytechnique Fédérale de Lausanne (EPFL)

%% This file has automatically been generated on 2023-08-04 18:40:49.872066</p>

%%Database name: test

%% General statistics
\section{General statistics}
\begin{tabular}{ll}
Number of estimated parameters & 2 \\
Sample size & 5 \\
Excluded observations & 0 \\
Init log likelihood & -115.3003 \\
Final log likelihood & -67.06549 \\
Likelihood ratio test for the init. model & 96.4696 \\
Rho-square for the init. model & 0.418 \\
Rho-square-bar for the init. model & 0.401 \\
Akaike Information Criterion & 138.131 \\
Bayesian Information C

Results can be formatted in HTML

In [11]:
print(readResults.getHtml())

<html>
<head>
<script src="http://transp-or.epfl.ch/biogeme/sorttable.js"></script>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
<title>simpleExample - Report from biogeme 3.2.12 [2023-08-04]</title>
<meta name="keywords" content="biogeme, discrete choice, random utility">
<meta name="description" content="Report from biogeme 3.2.12 [2023-08-04]">
<meta name="author" content="{bv.author}">
<style type=text/css>
.biostyle
	{font-size:10.0pt;
	font-weight:400;
	font-style:normal;
	font-family:Courier;}
.boundstyle
	{font-size:10.0pt;
	font-weight:400;
	font-style:normal;
	font-family:Courier;
        color:red}
</style>
</head>
<body bgcolor="#ffffff">
<p>biogeme 3.2.12 [2023-08-04]</p>
<p><a href="https://www.python.org/" target="_blank">Python</a> package</p>
<p>Home page: <a href="http://biogeme.epfl.ch" target="_blank">http://biogeme.epfl.ch</a></p>
<p>Submit questions to <a href="https://groups.google.com/d/forum/biogeme" target="_blank">https://groups.googl

General statistics, including a suggested format.

In [12]:
statistics = readResults.getGeneralStatistics()
statistics

{'Number of estimated parameters': GeneralStatistic(value=2, format=''),
 'Sample size': GeneralStatistic(value=5, format=''),
 'Excluded observations': GeneralStatistic(value=0, format=''),
 'Init log likelihood': GeneralStatistic(value=-115.30029248549191, format='.7g'),
 'Final log likelihood': GeneralStatistic(value=-67.06549047946355, format='.7g'),
 'Likelihood ratio test for the init. model': GeneralStatistic(value=96.46960401205672, format='.7g'),
 'Rho-square for the init. model': GeneralStatistic(value=0.4183406734384276, format='.3g'),
 'Rho-square-bar for the init. model': GeneralStatistic(value=0.40099466366788294, format='.3g'),
 'Akaike Information Criterion': GeneralStatistic(value=138.1309809589271, format='.7g'),
 'Bayesian Information Criterion': GeneralStatistic(value=137.3498567837953, format='.7g'),
 'Final gradient norm': GeneralStatistic(value=1.1520974391720636e-07, format='.4E'),
 'Bootstrapping time': GeneralStatistic(value=datetime.timedelta(microseconds=794

The suggested format can be used as follows

for k, (v, p) in statistics.items():
    print(f'{k}:\t{v:{p}}')

This result can be generated directly with the following function

In [13]:
print(results.printGeneralStatistics())

Number of estimated parameters:	2
Sample size:	5
Excluded observations:	0
Init log likelihood:	-115.3003
Final log likelihood:	-67.06549
Likelihood ratio test for the init. model:	96.4696
Rho-square for the init. model:	0.418
Rho-square-bar for the init. model:	0.401
Akaike Information Criterion:	138.131
Bayesian Information Criterion:	137.3499
Final gradient norm:	1.1521E-07
Bootstrapping time:	0:00:00.079456
Nbr of threads:	12
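The suggested format is an ordinary Python format specification, which can be nested inside an f-string. The sketch below reproduces the mechanism without Biogeme. The `GeneralStatistic` named tuple defined here is a hypothetical stand-in for the objects returned by `getGeneralStatistics()`, with values copied from the output above.

```python
from collections import namedtuple

# Hypothetical stand-in for the objects returned by getGeneralStatistics().
GeneralStatistic = namedtuple('GeneralStatistic', ['value', 'format'])

statistics = {
    'Sample size': GeneralStatistic(value=5, format=''),
    'Init log likelihood': GeneralStatistic(
        value=-115.30029248549191, format='.7g'
    ),
    'Rho-square for the init. model': GeneralStatistic(
        value=0.4183406734384276, format='.3g'
    ),
    'Final gradient norm': GeneralStatistic(
        value=1.1520974391720636e-07, format='.4E'
    ),
}

# Each tuple unpacks into (value, format); the format is applied to the value.
lines = [f'{k}:\t{v:{p}}' for k, (v, p) in statistics.items()]
print('\n'.join(lines))
```

An empty format string leaves the value with its default representation, which is why the integer statistics are printed unchanged.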



Estimated parameters as pandas dataframe

In [14]:
readResults.getEstimatedParameters()

Unnamed: 0,Value,Rob. Std err,Rob. t-test,Rob. p-value
beta1,-1.273264,0.013724,-92.776664,0.0
beta2,1.248769,0.059086,21.134794,0.0
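The columns of this table are related in a simple way: the t-test is the ratio of the estimate to its standard error, and the p-value is the two-sided tail probability of a standard normal variable. The sketch below checks this on the rounded values shown above. It assumes that this is how the columns are obtained, which the numbers are consistent with, and the function name `two_sided_p_value` is hypothetical.

```python
import math


def two_sided_p_value(t_statistic):
    """Two-sided tail probability of a standard normal variable."""
    return math.erfc(abs(t_statistic) / math.sqrt(2.0))


# (value, robust standard error), rounded as displayed in the table.
estimates = {
    'beta1': (-1.273264, 0.013724),
    'beta2': (1.248769, 0.059086),
}

t_tests = {
    name: value / std_err for name, (value, std_err) in estimates.items()
}
p_values = {name: two_sided_p_value(t) for name, t in t_tests.items()}

for name in estimates:
    print(f'{name}: t-test = {t_tests[name]:.4g}, p-value = {p_values[name]:.3g}')
```

Because the inputs are rounded, the ratios match the table only to a few significant digits.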


Correlation results

In [15]:
readResults.getCorrelationResults()

Unnamed: 0,Covariance,Correlation,t-test,p-value,Rob. cov.,Rob. corr.,Rob. t-test,Rob. p-value,Boot. cov.,Boot. corr.,Boot. t-test,Boot. p-value
beta2-beta1,0.001671,0.171121,19.280039,0.0,0.000811,1.0,55.597975,0.0,0.001464,0.999779,40.189571,0.0


Obtain the values of the parameters

In [16]:
readResults.getBetaValues()

{'beta1': -1.273263987213694, 'beta2': 1.2487688099301162}

In [17]:
readResults.getBetaValues(myBetas=['beta2'])

{'beta2': 1.2487688099301162}

Variance-covariance matrix (Rao-Cramer)

In [18]:
readResults.getVarCovar()

Unnamed: 0,beta1,beta2
beta1,0.013258,0.001671
beta2,0.001671,0.007196
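The standard errors and the correlation reported earlier can be recovered from this matrix: the standard errors are the square roots of the diagonal entries, and the correlation is the covariance divided by the product of the two standard errors. A minimal sketch using the rounded entries displayed above:

```python
import math

# Entries of the Rao-Cramer variance-covariance matrix, as displayed.
var_beta1 = 0.013258
var_beta2 = 0.007196
cov_beta1_beta2 = 0.001671

# Standard errors are the square roots of the variances.
std_err_beta1 = math.sqrt(var_beta1)
std_err_beta2 = math.sqrt(var_beta2)

# Correlation is the covariance scaled by both standard errors.
correlation = cov_beta1_beta2 / (std_err_beta1 * std_err_beta2)

print(f'Std err beta1: {std_err_beta1:.3g}')
print(f'Std err beta2: {std_err_beta2:.3g}')
print(f'Correlation:   {correlation:.3g}')
```

These values should be close to the first bracket of each parameter in the printed results and to the `Correlation` column of `getCorrelationResults()`.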


Variance-covariance matrix (robust)

In [19]:
readResults.getRobustVarCovar()

Unnamed: 0,beta1,beta2
beta1,0.000188,0.000811
beta2,0.000811,0.003491


Variance-covariance matrix (bootstrap)

In [20]:
readResults.getBootstrapVarCovar()

Unnamed: 0,beta1,beta2
beta1,0.000328,0.001464
beta2,0.001464,0.006539


Draws for sensitivity analysis are generated using bootstrapping. Any indicator can be generated by the model for each draw, and its empirical distribution can be investigated.

In [21]:
readResults.getBetasForSensitivityAnalysis(['beta1', 'beta2'],
                                           size=10)

[{'beta1': -1.2925578214686664, 'beta2': 1.164322217510277},
 {'beta1': -1.282393736283639, 'beta2': 1.209195603860136},
 {'beta1': -1.2611085813010012, 'beta2': 1.3007517002098443},
 {'beta1': -1.2649797742013058, 'beta2': 1.2842631765105266},
 {'beta1': -1.2649797742013058, 'beta2': 1.2842631765105266},
 {'beta1': -1.273263987213694, 'beta2': 1.2487688099301162},
 {'beta1': -1.317066045664775, 'beta2': 1.0497316460596449},
 {'beta1': -1.2777126116342756, 'beta2': 1.2295568140731905},
 {'beta1': -1.3040075416731258, 'beta2': 1.1122455742287705},
 {'beta1': -1.2777126116342756, 'beta2': 1.2295568140731905}]
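As an illustration of how such draws might be used, the sketch below evaluates an indicator for each draw and summarizes its empirical distribution. It is plain Python rather than Biogeme's simulation machinery. The indicator is the `simul` expression of this example evaluated at `Variable1 = 1` and `Variable2 = 10`, the function name `indicator` is hypothetical, and the draws are a rounded subset of the list above.

```python
# A rounded subset of the draws listed above.
draws = [
    {'beta1': -1.2926, 'beta2': 1.1643},
    {'beta1': -1.2824, 'beta2': 1.2092},
    {'beta1': -1.2611, 'beta2': 1.3008},
    {'beta1': -1.3171, 'beta2': 1.0497},
    {'beta1': -1.3040, 'beta2': 1.1122},
]


def indicator(betas, variable1=1.0, variable2=10.0):
    """Hypothetical indicator: beta1 / Variable1 + beta2 / Variable2."""
    return betas['beta1'] / variable1 + betas['beta2'] / variable2


# One value of the indicator per draw, sorted to read off the range.
values = sorted(indicator(d) for d in draws)
mean_value = sum(values) / len(values)

print(f'min = {values[0]:.4f}, mean = {mean_value:.4f}, max = {values[-1]:.4f}')
```

With a larger number of draws, empirical quantiles of `values` would give a confidence interval for the indicator.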

Results can be produced in the ALOGIT F12 format

In [22]:
readResults.getF12()

'                                                                  simpleExample\nFrom biogeme 3.2.12                                     2023-08-04 18:40:49  \nEND\n   0      beta1 F  -1.273263987214e+00 +1.372396817719e-02\n   0      beta2 F  +1.248768809930e+00 +5.908592246840e-02\n  -1\n       5                  0                   0 -6.706549047946e+01\n   0   0  2023-08-04 18:40:49\n  99999\n'

# Miscellaneous functions

## Likelihood ratio test

Let's first estimate a constrained model

In [23]:
beta2_constrained = Beta('beta2_constrained', 2.0, -3, 10, 1)
likelihood_constrained = -beta1**2 * Variable1 - exp(beta2_constrained * beta1) * \
    Variable2 - beta2_constrained**4
myBiogemeConstrained = bio.BIOGEME(myData, likelihood_constrained)
myBiogemeConstrained.modelName = 'simpleExampleConstrained'
results_constrained = myBiogemeConstrained.estimate()
print(results_constrained.short_summary())


Results for model simpleExampleConstrained
Nbr of parameters:		1
Sample size:			5
Excluded data:			0
Final log likelihood:		-114.7702
Akaike Information Criterion:	231.5403
Bayesian Information Criterion:	231.1498



We can now perform a likelihood ratio test.

In [24]:
results.likelihood_ratio_test(results_constrained, 0.95)

LRTuple(message='H0 can be rejected at level 95.0%', statistic=95.40936413216019, threshold=0.003932140000019531)
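The statistic can be checked by hand: it is minus twice the difference between the restricted and unrestricted final log likelihoods, compared with a quantile of the chi-square distribution with one degree of freedom (one parameter is fixed). The sketch below uses only the standard library and the rounded values printed above. The helper `chi2_one_dof_quantile` is hypothetical and relies on the fact that a chi-square variable with one degree of freedom is the square of a standard normal variable.

```python
from statistics import NormalDist


def chi2_one_dof_quantile(p):
    """Quantile of order p of the chi-square distribution with 1 degree of freedom."""
    z = NormalDist().inv_cdf((1.0 + p) / 2.0)
    return z * z


ll_unrestricted = -67.06549   # final log likelihood of simpleExample
ll_restricted = -114.7702     # final log likelihood of simpleExampleConstrained

lr_statistic = -2.0 * (ll_restricted - ll_unrestricted)

quantile_05 = chi2_one_dof_quantile(0.05)
quantile_95 = chi2_one_dof_quantile(0.95)

print(f'LR statistic:  {lr_statistic:.4f}')
print(f'5% quantile:   {quantile_05:.6f}')
print(f'95% quantile:  {quantile_95:.4f}')
```

The threshold reported above, about 0.00393, coincides with the 5% quantile, whereas the usual critical value for a test at the 5% significance level is the 95% quantile, about 3.84. This suggests that the second argument is interpreted as a significance level, so a value such as 0.05 may be what is typically intended. This reading is inferred from the output, not from the Biogeme documentation. The conclusion is the same here, since the statistic exceeds both values.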

## Calculation of the $p$-value

In [25]:
res.calcPValue(1.96)

0.04999579029644097
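The value above is consistent with the two-sided p-value of a standard normal test statistic, $2(1 - \Phi(|t|))$. A minimal standard-library sketch, assuming this is the quantity that `calcPValue` returns (the function name `two_sided_normal_p_value` is hypothetical):

```python
from statistics import NormalDist


def two_sided_normal_p_value(t_statistic):
    """Probability that a standard normal variable exceeds |t| in absolute value."""
    return 2.0 * (1.0 - NormalDist().cdf(abs(t_statistic)))


p_value = two_sided_normal_p_value(1.96)
print(p_value)
```

For very large statistics, `math.erfc(abs(t) / math.sqrt(2))` is a numerically safer way to compute the same quantity.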

# Compilation of results

In [26]:
dict_of_results = {'Model A': readResults, 'Model B': the_pickle_file}

In [27]:
df = res.compile_estimation_results(dict_of_results)


In [28]:
df

(                                       Model A         Model B
 Number of estimated parameters               2               2
 Sample size                                  5               5
 Final log likelihood                 -67.06549       -67.06549
 Akaike Information Criterion        138.130981      138.130981
 Bayesian Information Criterion      137.349857      137.349857
 beta1 (t-test)                  -1.27  (-92.8)  -1.27  (-92.8)
 beta2 (t-test)                    1.25  (21.1)    1.25  (21.1),
 {'Model A': 'Model A', 'Model B': 'Model B'})

# Covariance and correlation between two alternatives of a cross-nested logit model


First, we document the special case of a logit model.

In [29]:
mu_nest_1 = 1.0
alphas_1 =  {'i': 1, 'j': 1}
nest_1 = mu_nest_1, alphas_1
mu_nest_2 = 1.0
alphas_2 = {'j': 0.0, 'k': 1, 'm': 1}
nest_2 = mu_nest_2, alphas_2
nests = nest_1, nest_2

In [30]:
res.correlation_cross_nested(nests)

Unnamed: 0,i,j,k,m
i,1.0,8.237105e-12,8.237105e-12,8.237105e-12
j,8.237105e-12,1.0,8.237105e-12,8.237105e-12
k,8.237105e-12,8.237105e-12,1.0,8.237105e-12
m,8.237105e-12,8.237105e-12,8.237105e-12,1.0


Entries of the covariance matrix can also be obtained. Here, we report the variance for alternative `i`.

In [31]:
res.covariance_cross_nested('i', 'i', nests)

1.6449340668482264

This is $\pi^2/6$, the variance of a standard Gumbel (extreme value) distribution.

In [32]:
np.pi**2 / 6

1.6449340668482264

Second, a nested logit model

In [33]:
mu_nest_1 = 1.5
alphas_1 =  {'i': 1, 'j': 1}
nest_1 = mu_nest_1, alphas_1
mu_nest_2 = 2.0
alphas_2 = {'j': 0.0, 'k': 1, 'm': 1}
nest_2 = mu_nest_2, alphas_2
nests = nest_1, nest_2

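Theoretical value of the correlation between two alternatives in the same nest $m$, which is $1 - 1/\mu_m^2$ when the nest parameter is $\mu_m$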

In [34]:
correl_nest_1 = 1 - (1 / mu_nest_1**2)
correl_nest_1

0.5555555555555556

In [35]:
correl_nest_2 = 1 - (1 / mu_nest_2**2)
correl_nest_2

0.75

In [36]:
res.correlation_cross_nested(nests)

  quad_r = quad(f, low, high, args=args, full_output=self.full_output,


  in the extrapolation table.  It is assumed that the requested tolerance
  cannot be achieved, and that the returned result (if full_output = 1) is 
  the best which can be obtained.
  quad_r = quad(f, low, high, args=args, full_output=self.full_output,


Unnamed: 0,i,j,k,m
i,1.0,0.5555556,8.248444e-12,8.248444e-12
j,0.5555556,1.0,8.248444e-12,8.248444e-12
k,8.248444e-12,8.248444e-12,1.0,0.75
m,8.248444e-12,8.248444e-12,0.75,1.0


Finally, a cross-nested logit model

In [37]:
mu_nest_1 = 1.5
alphas_1 =  {'i': 1, 'j': 0.5}
nest_1 = mu_nest_1, alphas_1
mu_nest_2 = 2.0
alphas_2 = {'j': 0.5, 'k': 1, 'm': 1}
nest_2 = mu_nest_2, alphas_2
nests = nest_1, nest_2

In [38]:
res.correlation_cross_nested(nests)


Unnamed: 0,i,j,k,m
i,1.0,0.37618,8.248444e-12,8.248444e-12
j,0.3761799,1.0,0.5,0.5
k,8.248444e-12,0.5,1.0,0.75
m,8.248444e-12,0.5,0.75,1.0
