# class Expressions: examples of use of each function

This webpage is for programmers who need examples of how to use the functions of the class. The examples only illustrate the syntax and do not correspond to any meaningful model. For examples of models, visit [biogeme.epfl.ch](http://biogeme.epfl.ch).

In [1]:
import biogeme.version as ver
print(ver.getText())

biogeme 3.1.3beta [March 14, 2019]
Version entirely written in Python
Home page: http://biogeme.epfl.ch
Submit questions to https://groups.google.com/d/forum/biogeme
Michel Bierlaire, Transport and Mobility Laboratory, Ecole Polytechnique Fédérale de Lausanne (EPFL)



In [2]:
import pandas as pd
import biogeme.expressions as ex
import biogeme.database as db

We first create a small database.

In [3]:
df = pd.DataFrame({'Person':[1,1,1,2,2],
                   'Exclude':[0,0,1,0,1],
                   'Variable1':[10,20,30,40,50],
                   'Variable2':[100,200,300,400,500],
                   'Choice':[1,2,3,1,2],
                   'Av1':[0,1,1,1,1],
                   'Av2':[1,1,1,1,1],
                   'Av3':[0,1,1,1,1]})
myData = db.Database('test',df)

The following type of expression is a literal called Variable, which corresponds to a column of the database.

In [4]:
Person=ex.Variable('Person')
Variable1=ex.Variable('Variable1')
Variable2=ex.Variable('Variable2')
Choice=ex.Variable('Choice')
Av1=ex.Variable('Av1')
Av2=ex.Variable('Av2')
Av3=ex.Variable('Av3')

It is possible to add a new column to the database. This creates a new variable that can be used in expressions.

In [5]:
newvar = ex.DefineVariable('newvar',Variable1+Variable2,myData)
print(myData)

biogeme database test:
   Person  Exclude  Variable1  Variable2  Choice  Av1  Av2  Av3  newvar
0       1        0         10        100       1    0    1    0     110
1       1        0         20        200       2    1    1    1     220
2       1        1         30        300       3    1    1    1     330
3       2        0         40        400       1    1    1    1     440
4       2        1         50        500       2    1    1    1     550


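The following type of expression is another literal, corresponding to an unknown parameter. The arguments of Beta are the name of the parameter, its initial value, its lower and upper bounds (None if there is no bound), and a status flag: 0 if the parameter is free, 1 if it is fixed to its initial value.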

In [6]:
beta1 = ex.Beta('beta1',1,None,None,0)
beta2 = ex.Beta('beta2',2,None,None,0)
beta3 = ex.Beta('beta3',3,None,None,1)
beta4 = ex.Beta('beta4',2,None,None,1)

Arithmetic operators are overloaded to allow standard manipulations of expressions. The first expression is $$e_1 = 2  \beta_1 - \frac{\exp(-\beta_2)}{\beta_3 (\beta_2 \geq \beta_1)},$$
where $(\beta_2 \geq \beta_1)$ equals 1 if $\beta_2 \geq \beta_1$ and 0 otherwise.

In [7]:
expr1 = 2 * beta1 - ex.exp(-beta2) / (beta3 * (beta2 >= beta1))
print(expr1)

((`2` * beta1(1)) - (exp((-beta2(2))) / (beta3(3) * (beta2(2) >= beta1(1)))))


Expressions can be evaluated in two ways. For simple expressions, the function getValue(), implemented in Python, returns the value of the expression.

In [8]:
expr1.getValue()

1.954888238921129
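
As a sanity check, the value above can be reproduced with plain Python arithmetic. This is a minimal sketch that does not use Biogeme at all: it plugs in the initial values of the parameters declared earlier and assumes that the comparison evaluates to 1 when it is true.

```python
import math

# Initial values of the parameters, as declared with ex.Beta above.
beta1, beta2, beta3 = 1.0, 2.0, 3.0

# Assumption: the comparison (beta2 >= beta1) counts as 1.0 if true, 0.0 otherwise.
indicator = 1.0 if beta2 >= beta1 else 0.0

e1 = 2 * beta1 - math.exp(-beta2) / (beta3 * indicator)
print(e1)
```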

The function getValue_c() is implemented in C++, and works for any expression. It requires a database as input, and evaluates the expression for each entry in the database.
In the following example, as no variable of the database is involved in the expression, the output of the expression is the same for each entry.

In [9]:
expr1.getValue_c(myData)

Signature: [b'<Numeric>{4622513992},2', b'<Beta>{4622514104}"beta1"[0],0,0', b'<Times>{4622513656}(2),4622513992,4622514104', b'<Beta>{4622514608}"beta2"[0],1,1', b'<UnaryMinus>{4622515728}(1),4622514608', b'<exp>{4622514552}(1),4622515728', b'<Beta>{4622515056}"beta3"[1],2,0', b'<Beta>{4622514608}"beta2"[0],1,1', b'<Beta>{4622514104}"beta1"[0],0,0', b'<GreaterOrEqual>{4622513936}(2),4622514608,4622514104', b'<Times>{4622513880}(2),4622515056,4622513936', b'<Divide>{4622514720}(2),4622514552,4622513880', b'<Minus>{4622514664}(2),4622513656,4622514720']


[1.954888238921129,
 1.954888238921129,
 1.954888238921129,
 1.954888238921129,
 1.954888238921129]

The following function scans the expression and extracts the set of names of all free parameters.

In [10]:
expr1.setOfBetas()

{'beta1', 'beta2'}

Options can be set to extract free parameters, fixed parameters, or both. 

In [11]:
expr1.setOfBetas(free=False,fixed=True)

{'beta3'}

In [12]:
expr1.setOfBetas(free=True,fixed=True)

{'beta1', 'beta2', 'beta3'}
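
To give an intuition of how such a scan might work, here is a minimal, hypothetical sketch of an expression tree built with overloaded operators. The classes `Node`, `Const`, `Param` and `BinaryOp`, the helper `wrap` and the method name `set_of_betas` are invented for this illustration. They are not Biogeme's implementation.

```python
class Node:
    """Base class: overloaded operators build a tree instead of computing."""

    def __mul__(self, other):
        return BinaryOp('*', self, wrap(other))

    def __rmul__(self, other):
        return BinaryOp('*', wrap(other), self)

    def __sub__(self, other):
        return BinaryOp('-', self, wrap(other))

    def set_of_betas(self, free=True, fixed=False):
        # Default for leaves that are not parameters: nothing to report.
        return set()


class Const(Node):
    def __init__(self, value):
        self.value = value


class Param(Node):
    def __init__(self, name, value, is_fixed):
        self.name, self.value, self.is_fixed = name, value, is_fixed

    def set_of_betas(self, free=True, fixed=False):
        wanted = fixed if self.is_fixed else free
        return {self.name} if wanted else set()


class BinaryOp(Node):
    def __init__(self, op, left, right):
        self.op, self.left, self.right = op, left, right

    def set_of_betas(self, free=True, fixed=False):
        # Recursive scan: union of what the two subtrees report.
        return (self.left.set_of_betas(free, fixed)
                | self.right.set_of_betas(free, fixed))


def wrap(x):
    """Turn plain numbers into constant nodes."""
    return x if isinstance(x, Node) else Const(x)


b1 = Param('beta1', 1, is_fixed=False)
b2 = Param('beta2', 2, is_fixed=False)
b3 = Param('beta3', 3, is_fixed=True)
expr = 2 * b1 - b2 * b3

print(sorted(expr.set_of_betas()))                        # ['beta1', 'beta2']
print(sorted(expr.set_of_betas(free=False, fixed=True)))  # ['beta3']
print(sorted(expr.set_of_betas(free=True, fixed=True)))   # ['beta1', 'beta2', 'beta3']
```

The design choice illustrated here is that each node only knows about its own children, so the scan is a simple recursion over the tree.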

In [13]:
expr1.getElementaryExpression('beta2')

Let's consider an expression involving two variables $V_1$ and $V_2$: $$e_2 =2 \beta_1  V_1 - \frac{\exp(-\beta_2 V_2) }{ \beta_3  (\beta_2 \geq \beta_1)}.$$ Note that, in our example, the second term is numerically negligible with respect to the first one.

In [14]:
expr2 = 2 * beta1 * Variable1 - ex.exp(-beta2*Variable2) / (beta3 * (beta2 >= beta1))
print(expr2)

(((`2` * beta1(1)) * Variable1) - (exp(((-beta2(2)) * Variable2)) / (beta3(3) * (beta2(2) >= beta1(1)))))


It is not a simple expression anymore, and only the function getValue_c can be invoked.

In [15]:
expr2.getValue_c(myData)

Signature: [b'<Numeric>{4622747688},2', b'<Beta>{4622514104}"beta1"[0],0,0', b'<Times>{4622747632}(2),4622747688,4622514104', b'<Variable>{4620195992}"Variable1",5,2', b'<Times>{4622747744}(2),4622747632,4620195992', b'<Beta>{4622514608}"beta2"[0],1,1', b'<UnaryMinus>{4622747800}(1),4622514608', b'<Variable>{4620196104}"Variable2",6,3', b'<Times>{4622747856}(2),4622747800,4620196104', b'<exp>{4622747912}(1),4622747856', b'<Beta>{4622515056}"beta3"[1],2,0', b'<Beta>{4622514608}"beta2"[0],1,1', b'<Beta>{4622514104}"beta1"[0],0,0', b'<GreaterOrEqual>{4622747968}(2),4622514608,4622514104', b'<Times>{4622748024}(2),4622515056,4622747968', b'<Divide>{4622748080}(2),4622747912,4622748024', b'<Minus>{4622748136}(2),4622747744,4622748080']


[20.0, 40.0, 60.0, 80.0, 100.0]
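
The values above can be checked row by row with plain Python. This sketch does not use Biogeme: it copies the two columns from the database defined earlier and assumes that the comparison evaluates to 1. It also shows why the second term plays no role: `exp(-200)` is already far below the precision of a double relative to 20.

```python
import math

beta1, beta2, beta3 = 1.0, 2.0, 3.0
variable1 = [10, 20, 30, 40, 50]
variable2 = [100, 200, 300, 400, 500]

values = []
for v1, v2 in zip(variable1, variable2):
    # Assumption: (beta2 >= beta1) is true and counts as 1.0.
    second_term = math.exp(-beta2 * v2) / (beta3 * 1.0)
    values.append(2 * beta1 * v1 - second_term)

print(values)  # [20.0, 40.0, 60.0, 80.0, 100.0]
```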

The following function extracts the names of the free parameters appearing in the expression.

In [18]:
expr2.setOfBetas()

{'beta1', 'beta2'}

The list of parameters can be obtained in the form of a dictionary.

In [19]:
expr2.dictOfBetas(free=True,fixed=True)

{'beta1': beta1(1), 'beta2': beta2(2), 'beta3': beta3(3)}

Expressions are defined recursively, using a tree representation. The following function returns the type of the uppermost node of the tree.

In [None]:
expr2.getClassName()

The signature is a formal representation of the expression, assigning identifiers to each node of the tree, and representing them starting from the leaves. It is easy to parse, and is passed to the C++ implementation. 

In [None]:
expr2.getSignature()
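
The idea of "starting from the leaves" corresponds to a post-order traversal of the tree. The sketch below is hypothetical: the tuple encoding of the nodes, the function `collect` and the exact line format are invented for illustration. They only mimic the general look of the signatures printed earlier, using small counters instead of memory addresses as identifiers.

```python
def collect(node, lines):
    """Append the description of `node` after those of its children.

    Returns the identifier assigned to `node`.
    """
    kind, payload, children = node
    child_ids = [collect(child, lines) for child in children]
    node_id = len(lines)  # identifier = position in the listing
    suffix = ''.join(f',{c}' for c in child_ids)
    lines.append(f'<{kind}>{{{node_id}}}{payload}{suffix}')
    return node_id


# (2 * beta1) * Variable1, written as nested (kind, payload, children) tuples.
tree = ('Times', '', [
    ('Times', '', [('Numeric', ',2', []), ('Beta', '"beta1"', [])]),
    ('Variable', '"Variable1"', []),
])

lines = []
collect(tree, lines)
for line in lines:
    print(line)
```

Because every child is listed before its parent, a parser can rebuild the tree in a single pass: when it reads a node, all the identifiers it refers to are already known.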

Monte Carlo integration is based on draws. 

In [None]:
myDraws = ex.bioDraws('myDraws','UNIFORM')
expr3 = ex.MonteCarlo(myDraws*myDraws)

In [None]:
print(expr3)

Note that draws are not literals. 

In [None]:
expr3.setOfLiterals()

... nor are they random variables, which are used for numerical integration.

In [None]:
expr3.dictOfRandomVariables()

The following function reports the draws involved in an expression.

In [None]:
expr3.getDraws()

The expression is a Monte Carlo integration.

In [None]:
expr3.getClassName()

Here is its signature.

In [None]:
expr3.getSignature()

... and its value. It is an approximation of $\int_0^1 x^2 dx=\frac{1}{3}$.

In [None]:
expr3.getValue_c(myData,numberOfDraws=100000)
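
The principle of Monte Carlo integration can be sketched without Biogeme: average the integrand over uniform draws. The snippet below uses Python's standard `random` module. The seed and the variable names are choices made for this illustration only.

```python
import random

random.seed(0)  # fixed seed, only to make the run reproducible
number_of_draws = 100_000

total = 0.0
for _ in range(number_of_draws):
    u = random.random()  # uniform draw on [0, 1)
    total += u * u       # integrand x^2 evaluated at the draw
estimate = total / number_of_draws

print(estimate)  # close to 1/3
```

With 100000 draws, the standard error of the estimate is roughly 0.001, so the first two decimals can be trusted.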

The same integral can be calculated using numerical integration, declaring a random variable. 

In [None]:
omega = ex.RandomVariable('omega')

Numerical integration calculates integrals between $-\infty$ and $+\infty$. Here, the interval being $[0,1]$, a change of variables is required.

In [None]:
a = 0
b = 1
x = a + (b-a) / ( 1 + ex.exp(-omega))
dx = (b-a) * ex.exp(-omega) * (1+ex.exp(-omega))**(-2) 
integrand = x * x
expr4 = ex.Integrate(integrand * dx /(b-a),'omega')

In this case, omega is a literal.

In [None]:
expr4.setOfLiterals()

In [None]:
expr4.dictOfRandomVariables()

Calculating its value requires the C++ implementation.

In [None]:
expr4.getValue_c(myData)
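
The change of variables can be verified independently of Biogeme. The sketch below integrates the transformed integrand over a wide interval of $\omega$ with a plain trapezoid rule. The trapezoid rule, the truncation at $\pm 30$ and the number of steps are assumptions made for this illustration. The text does not say which quadrature scheme Biogeme uses.

```python
import math

a, b = 0.0, 1.0


def x_of(omega):
    """Map omega in (-inf, +inf) to x in (a, b)."""
    return a + (b - a) / (1 + math.exp(-omega))


def dx_domega(omega):
    """Derivative of x with respect to omega."""
    return (b - a) * math.exp(-omega) * (1 + math.exp(-omega)) ** (-2)


def transformed_integrand(omega):
    x = x_of(omega)
    return x * x * dx_domega(omega) / (b - a)


# Trapezoid rule on [-30, 30]; the tails beyond are numerically negligible.
lower, upper, steps = -30.0, 30.0, 6000
h = (upper - lower) / steps
total = 0.5 * (transformed_integrand(lower) + transformed_integrand(upper))
for k in range(1, steps):
    total += transformed_integrand(lower + k * h)
approx = h * total

print(approx)  # close to 1/3
```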

We now illustrate the Elem function. It takes two arguments: a dictionary, and a formula for the key. For each entry in the database, the key formula is evaluated, and its result identifies which formula in the dictionary should be evaluated.
Here, if 'Person' is 1, the expression is $$e_1=2  \beta_1 - \frac{\exp(-\beta_2)}{\beta_3 (\beta_2 \geq \beta_1)},$$ and if 'Person' is 2, the expression is $$e_2=2 \beta_1  V_1 - \frac{\exp(-\beta_2 V_2) }{ \beta_3  (\beta_2 \geq \beta_1)}.$$ As Elem is an expression like any other, it can be included in any formula. Here, we illustrate it by dividing the result by 10.

In [None]:
expr5 = ex.Elem({1:expr1,2:expr2},Person) / 10
print(expr5)

In [None]:
expr5.dictOfVariables()

In [None]:
expr5.getValue_c(myData)
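
The selection logic can be mimicked in plain Python with a dictionary of functions, keyed by the value of 'Person'. This is only a sketch of the idea, not Biogeme code: the functions `e1` and `e2` and the list `rows` are written for this illustration, using the database and the initial parameter values defined earlier.

```python
import math

beta1, beta2, beta3 = 1.0, 2.0, 3.0

rows = [
    {'Person': 1, 'Variable1': 10, 'Variable2': 100},
    {'Person': 1, 'Variable1': 20, 'Variable2': 200},
    {'Person': 1, 'Variable1': 30, 'Variable2': 300},
    {'Person': 2, 'Variable1': 40, 'Variable2': 400},
    {'Person': 2, 'Variable1': 50, 'Variable2': 500},
]


# In both formulas, (beta2 >= beta1) is true and is taken to be 1.
def e1(row):
    return 2 * beta1 - math.exp(-beta2) / beta3


def e2(row):
    return (2 * beta1 * row['Variable1']
            - math.exp(-beta2 * row['Variable2']) / beta3)


formulas = {1: e1, 2: e2}

# For each row, the key 'Person' selects the formula to evaluate.
values = [formulas[row['Person']](row) / 10 for row in rows]
print(values)
```

The first three rows use $e_1$ and give about 0.195. The last two use $e_2$ and give 8.0 and 10.0.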

The next expression is simply the sum of multiple expressions. The argument is a list of expressions.

In [None]:
expr6 = ex.bioMultSum([expr1,expr2,expr4])

In [None]:
print(expr6)

In [None]:
expr6.getValue_c(myData,100000)

We now illustrate how to calculate the logarithm of the choice probability of a logit model, that is $$\ln \frac{y_1 e^{V_1}}{y_0 e^{V_0}+y_1 e^{V_1}+y_2 e^{V_2}}$$ where $V_0=-\beta_1$, $V_1=-\beta_2$ and $V_2=-\beta_1$, and $y_i = 1$, $i=0,1,2$.

In [None]:
V = {0:-beta1,1:-beta2,2:-beta1}
av = {0:1,1:1,2:1}
expr7 = ex.LogLogit(V,av,1)

In [None]:
expr7.getValue()

It is actually better to use the C++ implementation using the following syntax.

In [None]:
expr8 = ex.bioLogLogit(V,av,1)

In [None]:
expr8.getValue_c(myData)
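
For reference, the quantity can be computed by hand with the standard library. The sketch below assumes, as the name LogLogit suggests, that the result is the logarithm of the logit probability of the chosen alternative. The function `log_logit` is written for this illustration and is not part of Biogeme.

```python
import math


def log_logit(V, av, choice):
    """Log of the logit probability of `choice`.

    V maps alternatives to utilities, av maps them to availabilities (0 or 1).
    Assumption: the chosen alternative is available.
    """
    denominator = sum(av[i] * math.exp(V[i]) for i in V)
    return V[choice] - math.log(denominator)


beta1, beta2 = 1.0, 2.0  # initial values declared earlier
V = {0: -beta1, 1: -beta2, 2: -beta1}
av = {0: 1, 1: 1, 2: 1}

value = log_logit(V, av, 1)
print(value)

# The probabilities of the three alternatives must sum to one.
total_probability = sum(math.exp(log_logit(V, av, i)) for i in V)
print(total_probability)
```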

It is possible to calculate the derivative of a formula with respect to a literal: $$e_9=\frac{\partial e_8}{\partial \beta_2}.$$

In [None]:
expr9 = ex.Derive(expr8,'beta2')

In [None]:
expr9.getValue_c(myData)
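
In this example the derivative has a simple closed form. Assuming that the expression is the logarithm of the logit probability of alternative 1, only $V_1=-\beta_2$ depends on $\beta_2$, so that $\partial e_8 / \partial \beta_2 = -(1 - P_1)$, where $P_1$ is the probability of alternative 1. The sketch below checks this against a central finite difference in plain Python. The helper `log_prob_alt1` and the step size are choices made for this illustration.

```python
import math


def log_prob_alt1(b1, b2):
    """Log-probability of alternative 1 with V = {0: -b1, 1: -b2, 2: -b1}."""
    utilities = [-b1, -b2, -b1]
    return utilities[1] - math.log(sum(math.exp(v) for v in utilities))


beta1, beta2 = 1.0, 2.0

# Closed form: -(1 - P1).
p1 = math.exp(log_prob_alt1(beta1, beta2))
analytic = -(1.0 - p1)

# Central finite difference with respect to beta2.
step = 1e-6
numeric = (log_prob_alt1(beta1, beta2 + step)
           - log_prob_alt1(beta1, beta2 - step)) / (2 * step)

print(analytic, numeric)
```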

Biogeme also provides an approximation of the CDF of the normal distribution: $$e_{10}= \frac{1}{\sigma \sqrt{2\pi}}\int_{-\infty}^t e^{-\frac{(x - \mu)^2}{2\sigma^2}}\,dx$$

In [None]:
expr10 = ex.bioNormalCdf(Variable1/10-1)

In [None]:
expr10.getValue_c(myData)
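
For comparison, the CDF can be evaluated with the error function from Python's standard library. The sketch assumes that the single-argument function refers to the standard normal distribution ($\mu=0$, $\sigma=1$). This is an assumption of the illustration, not something stated above.

```python
import math


def standard_normal_cdf(t):
    """CDF of the standard normal distribution, written with erf."""
    return 0.5 * (1.0 + math.erf(t / math.sqrt(2.0)))


variable1 = [10, 20, 30, 40, 50]

# Same argument as in the expression above: Variable1 / 10 - 1 = 0, 1, 2, 3, 4.
values = [standard_normal_cdf(v / 10 - 1) for v in variable1]
print(values)
```

The first value is exactly 0.5, as the argument is 0 for the first row, and the values increase towards 1.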

Min and max operators are also available. To avoid any ambiguity with the Python operator, they are called bioMin and bioMax. 

In [None]:
expr11 = ex.bioMin(expr5,expr10)
expr11.getValue_c(myData)

In [None]:
expr12 = ex.bioMax(expr5,expr10)
expr12.getValue_c(myData)