File 01-contingency_table_solution.py


Michel Bierlaire

Wed Aug 7 18:03:43 2024



In [None]:

import biogeme.database as db
import biogeme.biogeme as bio
from IPython.display import display
from biogeme.expressions import Beta, log, Variable
import pandas as pd


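We consider a data set collected per income category, where income is coded as
follows: 1: low, 2: medium, 3: high. Each row reports the number of individuals
(Total) in an income category who chose an electric vehicle (Electric = 1) or
did not (Electric = 0).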

In [None]:
income_data = pd.DataFrame(
    {
        'Income': [1, 1, 2, 2, 3, 3],
        'Electric': [1, 0, 1, 0, 1, 0],
        'Total': [15, 200, 50, 450, 135, 150],
    }
)
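For readability, the same counts can be laid out as a contingency table with one row per income category. The snippet below is only a sketch using pandas; the names `table` and `All` are introduced here for illustration and are not used in the rest of the solution.

```python
import pandas as pd

# Same data as above, repeated so that the snippet is self-contained.
income_data = pd.DataFrame(
    {
        'Income': [1, 1, 2, 2, 3, 3],
        'Electric': [1, 0, 1, 0, 1, 0],
        'Total': [15, 200, 50, 450, 135, 150],
    }
)

# One row per income category, one column per value of Electric.
table = income_data.pivot(index='Income', columns='Electric', values='Total')

# Row totals: number of individuals observed in each income category.
table['All'] = table.sum(axis=1)
print(table)
```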


1. Estimate the parameters of the model predicting the choice of
an electric vehicle as a function of income.

2. Consider a scenario where the
income distribution is as follows: 7.5% of the population
with low income, 40% of the population with medium income
and 52.5% of the population with high income. Use the
estimated model to forecast the market share of electric
vehicles under this scenario.

We proceed in the same way. We first import the data into the Biogeme database.

In [None]:
database_income = db.Database('contingency_income', income_data)


We define the variables.

In [None]:
Income = Variable('Income')
Electric = Variable('Electric')
Total = Variable('Total')


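We define the parameters to be estimated: one probability of choosing an
electric vehicle per income category, each bounded between 0 and 1 and
initialized at 0.5. The expression pi selects the parameter that corresponds
to the income category of the row.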

In [None]:
pi1 = Beta('pi1', 0.5, 0, 1, 0)
pi2 = Beta('pi2', 0.5, 0, 1, 0)
pi3 = Beta('pi3', 0.5, 0, 1, 0)
pi = (Income == 1) * pi1 + (Income == 2) * pi2 + (Income == 3) * pi3


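We define the contribution of each row to the log likelihood function. A row
with count Total contributes Total * log(pi) if Electric = 1, and
Total * log(1 - pi) if Electric = 0.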

In [None]:
loglike = Total * log(pi) * (Electric == 1) + Total * log(1 - pi) * (
    Electric == 0
)
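For this model, the log likelihood separates by income category, and its maximum is attained at the observed proportion of electric vehicles in each category. The sketch below checks this numerically with NumPy, outside Biogeme. The function `log_likelihood` and the array `closed_form` are names introduced here for illustration. The Biogeme estimates obtained later should be close to these values.

```python
import numpy as np

# Same data as above, as plain arrays.
income = np.array([1, 1, 2, 2, 3, 3])
electric = np.array([1, 0, 1, 0, 1, 0])
total = np.array([15, 200, 50, 450, 135, 150])


def log_likelihood(p):
    """Log likelihood for p = (pi1, pi2, pi3), mirroring the expression above."""
    # Pick the probability matching the income category of each row.
    pi = np.asarray(p, dtype=float)[income - 1]
    contributions = total * np.where(electric == 1, np.log(pi), np.log(1 - pi))
    return float(contributions.sum())


# Observed proportions of electric vehicles per income category.
closed_form = np.array([15 / 215, 50 / 500, 135 / 285])
best = log_likelihood(closed_form)

# Moving any single parameter away from its observed proportion
# decreases the log likelihood.
for k in range(3):
    for step in (-0.01, 0.01):
        trial = closed_form.copy()
        trial[k] += step
        assert log_likelihood(trial) < best

print(closed_form.round(4))
```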


We provide the database and the expression to the Biogeme object.

In [None]:
biogeme = bio.BIOGEME(database_income, loglike)
biogeme.modelName = 'contingency_income'


We run the estimation.

In [None]:
results = biogeme.estimate()


We retrieve the estimated parameters.

In [None]:
pandas_results = results.get_estimated_parameters()
display(pandas_results)


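Finally, we use the estimated model to forecast the market share under the
scenario: each estimated probability is weighted by the share of the
corresponding income category in the population.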

In [None]:
pi1_estimate = pandas_results.loc['pi1', 'Value']
pi2_estimate = pandas_results.loc['pi2', 'Value']
pi3_estimate = pandas_results.loc['pi3', 'Value']

market_share = (
    pi1_estimate * 0.075 + pi2_estimate * 0.40 + pi3_estimate * 0.525
)
print(f'Market share future scenario: {100*market_share:.3g}%')
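As a sanity check, the forecast can be reproduced by hand, assuming that the estimates coincide with the observed proportions per income category (15/215, 50/500 and 135/285). This is a sketch that does not rely on Biogeme; the dictionaries `shares` and `scenario` are names introduced here for illustration.

```python
# Observed proportion of electric vehicles per income category
# (assumed here to match the maximum likelihood estimates).
shares = {1: 15 / 215, 2: 50 / 500, 3: 135 / 285}

# Income distribution in the future scenario.
scenario = {1: 0.075, 2: 0.40, 3: 0.525}

# Weighted average of the category-specific probabilities.
forecast = sum(scenario[k] * shares[k] for k in shares)

# Market share observed in the sample, for comparison.
sample_share = (15 + 50 + 135) / (215 + 500 + 285)

print(f'Forecast under the scenario: {100 * forecast:.3g}%')  # 29.4%
print(f'Share in the sample: {100 * sample_share:.3g}%')  # 20%
```

The forecast is higher than the share in the sample because the scenario puts more weight on the high income category, which has the largest probability of choosing an electric vehicle.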