
Commit

Merge branch 'master' of https://github.com/jpn--/larch
jpn-- committed Mar 3, 2017
2 parents e6494f8 + 17c08a3 commit a5b767e
Showing 155 changed files with 535 additions and 160 deletions.
2 changes: 1 addition & 1 deletion build_configuration.py
@@ -1,6 +1,6 @@
#!/usr/bin/python
#
-# Copyright 2007-2016 Jeffrey Newman
+# Copyright 2007-2017 Jeffrey Newman
#
# This file is part of Larch.
#
Binary file added doc/agg-choice-variance.png
4 changes: 2 additions & 2 deletions doc/conf.py
@@ -113,7 +113,7 @@ def __getattr__(cls, name):

# General information about the project.
project = u'Larch'
-copyright = u'2010-2016, Jeffrey Newman'
+copyright = u'2010-2017, Jeffrey Newman'

# The version info for the project you're documenting, acts as replacement for
# |version| and |release|, also used in various other places throughout the
@@ -377,7 +377,7 @@ def setup(app):
epub_title = u'larch'
epub_author = u'Jeffrey Newman'
epub_publisher = u'Jeffrey Newman'
-epub_copyright = u'2016, Jeffrey Newman'
+epub_copyright = u'2017, Jeffrey Newman'

# The language of the text. It defaults to the language option
# or en if the language is not set.
144 changes: 140 additions & 4 deletions doc/math.rst
@@ -3,12 +3,148 @@
Mathematics of Logit Choice Modeling
====================================

-This documentation will eventually provide some instruction on the underlying
-mathematics of logit models. For example:
+This documentation will eventually provide instruction on some of the more interesting topics in the underlying
+mathematics of logit models.



~~~~~~~~~~~~~~~~~~~~~~~
Aggregate Choice Models
~~~~~~~~~~~~~~~~~~~~~~~

Sometimes, a discrete choice is made from a very large pool of possible choices. In these
circumstances, it may be useful to aggregate choices together, and represent a set of choices
as a single meta-choice. This is particularly common in destination choice models, where the
individual possible destinations are aggregated together as traffic analysis zones.

In many ways, the aggregate choice model resembles a nested logit model, with the aggregations corresponding to the nests.

We can make some assumptions:

1. The individual elemental alternatives within each zone or aggregate are homogeneous.
   That is, each such alternative has the same systematic utility, :math:`V_{i} = \beta X_{i}`.
2. The particular locations of the zonal or aggregation boundaries are arbitrary, and have
   no systematic meaning themselves.

Using these assumptions, we can derive an aggregate/zonal choice model.

The usual form of the nested logit model calculates the probability of an alternative as :math:`P_{nest}P_{alt|nest}`.
In the case of aggregate choices, we do not observe the elemental alternative that was chosen, only the nest that
contains it, so we only care about :math:`P_{nest}`. The nested formula for that term is

.. math::

    P_{nest}=\frac{\exp(V_{nest})}{\sum_{j\in nests}\exp(V_{j})}

with

.. math::

    V_{nest}=\mu_{nest}\log\left(\sum_{i\in nest}\exp\left(\frac{V_{i}}{\mu_{nest}}\right)\right)

Using assumption 2, we know that :math:`\mu_{nest}` must be 1, as we want the aggregation nesting structure to
collapse to a multinomial logit model. Further, our first assumption is that all the :math:`V_{i}` are equal,
so the terms inside the summation can collapse together, leaving

.. math::

    V_{nest}=\log\left(N_{nest}\exp\left(V_{i}\right)\right)=V_{i}+\log\left(N_{nest}\right)

with :math:`N_{nest}` as the number of discrete elemental alternatives inside the nest. This can be estimated
by creating a variable for each aggregate alternative that has a value of :math:`\log\left(N_{nest}\right)`,
and including it in a MNL model, with a beta coefficient constrained to be equal to 1.
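
As a quick numerical check of the size term, the following sketch computes the nest logsum directly with
:math:`\mu_{nest}=1` and compares it to :math:`V_{i}+\log\left(N_{nest}\right)`. It is plain Python, not Larch API
code; the zone names, utilities, and counts are made-up illustrative values.

```python
import math


def nest_logsum(utilities, mu=1.0):
    """Return mu * log(sum(exp(V_i / mu))) over the elemental alternatives in one nest."""
    return mu * math.log(sum(math.exp(v / mu) for v in utilities))


# Hypothetical zones: (common systematic utility V_i, number of elemental alternatives N)
zones = {"A": (-1.0, 50), "B": (-0.5, 10), "C": (-1.2, 200)}

# Logsum over N identical alternatives versus the closed form V_i + log(N)
direct = {z: nest_logsum([v] * n) for z, (v, n) in zones.items()}
closed = {z: v + math.log(n) for z, (v, n) in zones.items()}

# Zone probabilities from an MNL over the aggregate utilities
denom = sum(math.exp(u) for u in closed.values())
prob = {z: math.exp(u) / denom for z, u in closed.items()}

for z in zones:
    print(z, round(direct[z], 6), round(closed[z], 6), round(prob[z], 4))
```

The two utility columns agree for every zone, which is the reason a :math:`\log\left(N_{nest}\right)` variable with a
coefficient fixed at 1 can stand in for the full set of elemental alternatives.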

One caution for these models: the “zeros” model used to compute the null log likelihood should hold the parameter
on :math:`\log\left(N_{nest}\right)` at 1, not 0. This parameter is not something we are
estimating in the model; it is a direct function of the structure of aggregation, which we have imposed externally.

Relax Arbitrary Boundaries Assumption
-------------------------------------

Relaxing the assumption of arbitrary boundaries puts :math:`\mu_{nest}` back into the equation for :math:`V_{nest}`:

.. math::

-    P(i) = \frac{ \exp(V_i) }{ \sum_j \exp(V_j) }
+    V_{nest}=\mu_{nest}\log\left(\sum_{i\in nest}\exp\left(\frac{V_{i}}{\mu_{nest}}\right)\right)=V_{i}+\mu_{nest}\log\left(N_{nest}\right)

The logsum parameter thus appears as a coefficient on :math:`\log\left(N_{nest}\right)`. Whether this is a good
idea for a transportation model depends on what the boundaries mean. In an intra-urban model whose zones are TAZs,
small sectors drawn only for modelling purposes, relaxing this assumption probably doesn't make sense. If the boundaries
are aligned with political boundaries (counties, towns) that have differing taxation, administration, or other policies,
it might be reasonable to relax this assumption. In a long distance travel model, if the boundaries are aligned with
metropolitan areas, then it is certainly reasonable to relax the arbitrary boundaries assumption.
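
One consequence of this form is that a zone's odds scale with :math:`N_{nest}^{\mu_{nest}}` rather than with
:math:`N_{nest}` itself. The sketch below illustrates this under the homogeneity assumption; the function names and
the value of :math:`\mu_{nest}` are hypothetical and chosen only for illustration.

```python
import math


def zone_utility(v, n, mu):
    """V_nest = V_i + mu * log(N_nest) for a zone of n homogeneous alternatives."""
    return v + mu * math.log(n)


def odds(v_a, n_a, v_b, n_b, mu):
    """Odds of choosing zone a over zone b in the aggregate MNL."""
    return math.exp(zone_utility(v_a, n_a, mu) - zone_utility(v_b, n_b, mu))


mu = 0.6  # hypothetical logsum parameter
base = odds(-1.0, 100, -1.0, 100, mu)     # identical zones: even odds
doubled = odds(-1.0, 200, -1.0, 100, mu)  # zone a holds twice as many alternatives
print(base, doubled, 2 ** mu)
```

With :math:`\mu_{nest}=1` doubling the size of a zone doubles its odds; with :math:`\mu_{nest}<1` the effect of size
is damped.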


Relaxing Homogeneity
--------------------

The other assumption we made was that the individual alternatives within a zone are homogeneous... but it is highly likely
they are not. Variance in the systematic utilities, and in particular heteroskedastic variance, can change the calculations.
Consider the one dimensional destination choice depicted here:

.. image:: agg-choice-variance.png

The choice has been subdivided into three aggregation zones. The average utility of Zone A is lower than that of Zone B
or Zone C, but the variance of utility in Zone A is much larger.

Recall that utility maximization theory posits that a decision maker will choose the one discrete alternative with maximum
utility. The aggregation of those discrete alternatives into zones or aggregate choices does not change the underlying
choice; a decision maker does not choose a zone, but she chooses a single discrete alternative in a zone.

While the average utility in Zone A is smaller, you can see that there are some points in Zone A with much higher utility,
and which are more likely to be chosen. In general, all other things being equal, aggregate alternatives get a positive
bump in their probability of selection with an increase in variance of the systematic utility.

[McFadden1978]_ showed that, when the utilities in an aggregate are distributed normally, if we define :math:`\omega_{nest}^{2}`
as the variance of :math:`V_{i}` in a nest, and :math:`\bar{V}_{i}` as the average systematic utility of alternatives in
the nest, then

.. math::

    V_{nest}=\bar{V}_{i}+\mu_{nest}\log\left(N_{nest}\right)+\frac{1}{2}\frac{\omega_{nest}^{2}}{\mu_{nest}}

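
The variance term can be checked numerically. The sketch below is an illustration, not part of Larch: it places the
utilities of each zone at evenly spaced quantiles of a normal distribution (an assumption made so the example is
deterministic), computes the exact logsum, and compares it with the formula above. All means, standard deviations,
and the value of :math:`\mu_{nest}` are made up.

```python
import math
from statistics import NormalDist, fmean, pvariance


def nest_logsum(utilities, mu):
    """Exact V_nest = mu * log(sum(exp(V_i / mu)))."""
    return mu * math.log(sum(math.exp(v / mu) for v in utilities))


def variance_adjusted(utilities, mu):
    """Mean utility + mu * log(N) + 0.5 * variance / mu."""
    n = len(utilities)
    return fmean(utilities) + mu * math.log(n) + 0.5 * pvariance(utilities) / mu


def normal_utilities(mean, sd, n):
    """n utilities placed at evenly spaced quantiles of Normal(mean, sd)."""
    dist = NormalDist(mean, sd)
    return [dist.inv_cdf((k + 0.5) / n) for k in range(n)]


mu = 0.8  # hypothetical logsum parameter
zone_a = normal_utilities(mean=-1.0, sd=0.5, n=1000)    # lower mean, high variance
zone_b = normal_utilities(mean=-0.95, sd=0.05, n=1000)  # higher mean, low variance

exact_a, approx_a = nest_logsum(zone_a, mu), variance_adjusted(zone_a, mu)
exact_b, approx_b = nest_logsum(zone_b, mu), variance_adjusted(zone_b, mu)
print(exact_a, approx_a)
print(exact_b, approx_b)
```

In this setup the high-variance zone ends up with the larger aggregate utility even though its mean utility is lower,
which matches the intuition from the figure.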
Estimating N
------------
Sometimes, it is not obvious what :math:`N` should be. Land area? Employment? Population? It might be different
for different types of trips, even if the types of trips are not differentiated in the data.

It is possible to build :math:`N` as a linear combination of several component parts, so that you might have

.. math::

    N_{nest}=\gamma_{remp}RetailEmployment+\gamma_{nemp}NonretailEmployment+\gamma_{pop}Population

The :math:`\gamma`'s then become new parameters to the model, in addition to the :math:`\beta` and :math:`\mu` parameters.

The size value :math:`N_{nest}` still needs to be strictly positive, as it represents the number of discrete
alternatives in the zone or aggregation. Therefore, all the data values and all the parameters inside :math:`N` also
need to be positive (or, more precisely, they must all be non-negative, and for at least one term both the data value
and its parameter must be strictly positive).
Enforcing positive data is easy: only choose variables that reflect size attributes
(like employment, population, area). Enforcing positive coefficients requires constraints on the :math:`\gamma` parameters,
or, more simply, a rewrite of the formulation of :math:`N`:

.. math::

    N_{nest}=\exp(\dot{\gamma}_{remp})RetailEmployment+\exp(\dot{\gamma}_{nemp})NonretailEmployment+\exp(\dot{\gamma}_{pop})Population

Then :math:`\dot{\gamma}` can be unconstrained. (This form also has advantages in the calculation of derivatives, the
details of which are not important for users to understand.)

One of the issues with estimating :math:`N` in this fashion is that the scale of :math:`N`, like the scale of :math:`V`,
is not defined. Doubling the :math:`N` size of all alternatives, by adding :math:`\log(2)` to all :math:`\dot{\gamma}`,
will not affect the probabilities. Therefore, one :math:`\dot{\gamma}` needs to be arbitrarily fixed at zero.
(In the non-estimated :math:`N` case, this normalization occurs implicitly; there is no parameter inside the log term
on :math:`N`.)
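
The need for this normalization can be seen in a small sketch. The code below is illustrative only (it does not use
Larch, and the zone data and :math:`\dot{\gamma}` values are invented): it builds :math:`N_{nest}` from the
exponentiated parameters and shows that adding :math:`\log(2)` to every :math:`\dot{\gamma}` leaves the zone
probabilities unchanged.

```python
import math


def size(gammas, attributes):
    """N = sum_k exp(gamma_k) * x_k, positive for any real gammas and positive x."""
    return sum(math.exp(g) * x for g, x in zip(gammas, attributes))


def zone_probabilities(gammas, zones, mu=1.0):
    """Aggregate MNL with V_nest = V + mu * log(N) for each zone."""
    utils = [v + mu * math.log(size(gammas, x)) for v, x in zones]
    denom = sum(math.exp(u) for u in utils)
    return [math.exp(u) / denom for u in utils]


# Hypothetical zones: (systematic utility, (retail emp, non-retail emp, population))
zones = [
    (-1.0, (120.0, 800.0, 3000.0)),
    (-0.4, (40.0, 150.0, 9000.0)),
    (-2.0, (900.0, 2500.0, 500.0)),
]

gammas = [0.0, -0.7, -2.1]                   # first gamma fixed at zero as the normalization
shifted = [g + math.log(2) for g in gammas]  # doubles the size N of every zone

p = zone_probabilities(gammas, zones)
p_shifted = zone_probabilities(shifted, zones)
print(p)
print(p_shifted)
```

Because the shift adds the same constant :math:`\mu\log(2)` to every zone utility, it cancels in the logit ratio, so
the data cannot identify the overall level of the :math:`\dot{\gamma}` parameters.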




~~~~~~~~~~


-with :math:`V_i = \beta X_i`.
.. [McFadden1978] McFadden, D. (1978) Modelling the choice of residential location.
   Spatial Interaction Theory and Planning Models (Karlqvist A. Ed., pp. 75-96).
   North Holland, Amsterdam.
4 changes: 2 additions & 2 deletions py/__init__.py
@@ -2,7 +2,7 @@
#
# Larch is free, open source software to estimate discrete choice models.
#
-# Copyright 2007-2016 Jeffrey Newman
+# Copyright 2007-2017 Jeffrey Newman
#
# Larch is free software: you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
@@ -34,7 +34,7 @@


info = """Larch is free, open source software to estimate discrete choice models.
-Copyright 2007-2016 Jeffrey Newman
+Copyright 2007-2017 Jeffrey Newman
This program is licensed under GPLv3 and comes with ABSOLUTELY NO WARRANTY."""

status = ""
2 changes: 1 addition & 1 deletion py/examples/__init__.py
@@ -1,6 +1,6 @@
######################################################### encoding: utf-8 ######
#
-# Copyright 2007-2016 Jeffrey Newman.
+# Copyright 2007-2017 Jeffrey Newman.
#
# This file is part of Larch.
#
2 changes: 1 addition & 1 deletion py/examples/itin80.py
@@ -1,6 +1,6 @@
################################################################################
#
-# Copyright 2007-2016 Jeffrey Newman.
+# Copyright 2007-2017 Jeffrey Newman.
#
# This file is part of Larch.
#
2 changes: 1 addition & 1 deletion py/examples/mtc01e.py
@@ -1,6 +1,6 @@
################################################################################
#
-# Copyright 2007-2016 Jeffrey Newman.
+# Copyright 2007-2017 Jeffrey Newman.
#
# This file is part of Larch.
#
2 changes: 1 addition & 1 deletion py/examples/mtc17.py
@@ -1,6 +1,6 @@
################################################################################
#
-# Copyright 2007-2016 Jeffrey Newman.
+# Copyright 2007-2017 Jeffrey Newman.
#
# This file is part of Larch.
#
2 changes: 1 addition & 1 deletion py/examples/mtc22.py
@@ -1,6 +1,6 @@
################################################################################
#
-# Copyright 2007-2016 Jeffrey Newman.
+# Copyright 2007-2017 Jeffrey Newman.
#
# This file is part of Larch.
#
2 changes: 1 addition & 1 deletion py/examples/swissmetro00data.py
@@ -1,6 +1,6 @@
######################################################### encoding: utf-8 ######
#
-# Copyright 2007-2016 Jeffrey Newman.
+# Copyright 2007-2017 Jeffrey Newman.
#
# This file is part of Larch.
#
2 changes: 1 addition & 1 deletion py/examples/swissmetro01logit.py
@@ -1,6 +1,6 @@
######################################################### encoding: utf-8 ######
#
-# Copyright 2007-2016 Jeffrey Newman.
+# Copyright 2007-2017 Jeffrey Newman.
#
# This file is part of Larch.
#
2 changes: 1 addition & 1 deletion py/examples/swissmetro02weighted.py
@@ -1,6 +1,6 @@
######################################################### encoding: utf-8 ######
#
-# Copyright 2007-2016 Jeffrey Newman.
+# Copyright 2007-2017 Jeffrey Newman.
#
# This file is part of Larch.
#
2 changes: 1 addition & 1 deletion py/examples/swissmetro04transforms.py
@@ -1,6 +1,6 @@
######################################################### encoding: utf-8 ######
#
-# Copyright 2007-2016 Jeffrey Newman.
+# Copyright 2007-2017 Jeffrey Newman.
#
# This file is part of Larch.
#
2 changes: 1 addition & 1 deletion py/examples/swissmetro09nested.py
@@ -1,6 +1,6 @@
######################################################### encoding: utf-8 ######
#
-# Copyright 2007-2016 Jeffrey Newman.
+# Copyright 2007-2017 Jeffrey Newman.
#
# This file is part of Larch.
#
2 changes: 1 addition & 1 deletion py/examples/swissmetro11cnl.py
@@ -1,6 +1,6 @@
######################################################### encoding: utf-8 ######
#
-# Copyright 2007-2016 Jeffrey Newman.
+# Copyright 2007-2017 Jeffrey Newman.
#
# This file is part of Larch.
#
2 changes: 1 addition & 1 deletion py/examples/swissmetro14selectionBias.py
@@ -1,6 +1,6 @@
######################################################### encoding: utf-8 ######
#
-# Copyright 2007-2016 Jeffrey Newman.
+# Copyright 2007-2017 Jeffrey Newman.
#
# This file is part of Larch.
#
2 changes: 1 addition & 1 deletion py/linalg.py
@@ -1,5 +1,5 @@
#
-# Copyright 2007-2016 Jeffrey Newman
+# Copyright 2007-2017 Jeffrey Newman
#
# This file is part of Larch.
#
2 changes: 1 addition & 1 deletion py/logging.py
@@ -1,5 +1,5 @@
#
-# Copyright 2007-2016 Jeffrey Newman
+# Copyright 2007-2017 Jeffrey Newman
#
# This file is part of Larch.
#
21 changes: 20 additions & 1 deletion py/model_reporter/docx.py
@@ -2,6 +2,7 @@
try:
import docx
from docx.enum.style import WD_STYLE_TYPE
from docx.enum.text import WD_ALIGN_PARAGRAPH
except ImportError:

class DocxModelReporter():
@@ -30,6 +31,15 @@ def _append_to_document(self, other_doc):
def document_larchstyle():
document = docx.Document()

# normal = document.styles['Normal']
# normal.font.name = 'Arial'
# normal.font.size = docx.shared.Pt(11)
# normal.paragraph_format.alignment = WD_ALIGN_PARAGRAPH.JUSTIFY
# normal.paragraph_format.line_spacing = 1.0
# normal.paragraph_format.widow_control = True
#
body_text = document.styles['Body Text']

monospaced_small = document.styles.add_style('Monospaced Small',WD_STYLE_TYPE.TABLE)
monospaced_small.base_style = document.styles['Normal']
monospaced_small.font.name = 'Courier New'
@@ -38,6 +48,15 @@ def document_larchstyle():
monospaced_small.paragraph_format.space_after = docx.shared.Pt(0)
monospaced_small.paragraph_format.line_spacing = 1.0

table_body_text = document.styles.add_style('Table Body Text',WD_STYLE_TYPE.TABLE)
table_body_text.base_style = document.styles['Body Text']
table_body_text.font.name = 'Arial Narrow'
table_body_text.font.size = docx.shared.Pt(9)
table_body_text.paragraph_format.space_before = docx.shared.Pt(1)
table_body_text.paragraph_format.space_after = docx.shared.Pt(1)
table_body_text.paragraph_format.line_spacing = 1.0


return document


@@ -204,7 +223,7 @@ def docx_params(self, groups=None, display_inital=False, **format):
if groups is None and hasattr(self, 'parameter_groups'):
groups = self.parameter_groups

-table = docx_table(rows=1, cols=number_of_columns, style='Monospaced Small',
+table = docx_table(rows=1, cols=number_of_columns, style='Table Body Text',
header_text="Model Parameter Estimates", header_level=2)

def append_simple_row(name, initial_value, value, std_err, tstat, nullvalue, holdfast):
2 changes: 1 addition & 1 deletion py/test/__init__.py
@@ -1,5 +1,5 @@
#
-# Copyright 2007-2016 Jeffrey Newman
+# Copyright 2007-2017 Jeffrey Newman
#
# This file is part of Larch.
#
2 changes: 1 addition & 1 deletion py/test/test_data.py
@@ -1,5 +1,5 @@
#
-# Copyright 2007-2016 Jeffrey Newman
+# Copyright 2007-2017 Jeffrey Newman
#
# This file is part of Larch.
#
2 changes: 1 addition & 1 deletion py/test/test_examples.py
@@ -1,5 +1,5 @@
#
-# Copyright 2007-2016 Jeffrey Newman
+# Copyright 2007-2017 Jeffrey Newman
#
# This file is part of Larch.
#
2 changes: 1 addition & 1 deletion py/test/test_mixed.py
@@ -1,5 +1,5 @@
#
-# Copyright 2007-2016 Jeffrey Newman
+# Copyright 2007-2017 Jeffrey Newman
#
# This file is part of Larch.
#
2 changes: 1 addition & 1 deletion py/test/test_mnl.py
@@ -1,5 +1,5 @@
#
-# Copyright 2007-2016 Jeffrey Newman
+# Copyright 2007-2017 Jeffrey Newman
#
# This file is part of Larch.
#
2 changes: 1 addition & 1 deletion py/test/test_nl.py
@@ -1,5 +1,5 @@
#
-# Copyright 2007-2016 Jeffrey Newman
+# Copyright 2007-2017 Jeffrey Newman
#
# This file is part of Larch.
#