From d1ee62c6e1b0e4333810b860d6f5e350ba78f128 Mon Sep 17 00:00:00 2001
From: Eddie Janowicz
Date: Thu, 14 Aug 2014 12:33:47 -0700
Subject: [PATCH] a set of very minor doc edits

---
 docs/examples.rst                  | 16 ++++++++--------
 docs/gettingstarted.rst            |  8 ++++----
 urbansim/developer/sqftproforma.py |  2 +-
 urbansim/models/lcm.py             | 10 +++++-----
 4 files changed, 18 insertions(+), 18 deletions(-)

diff --git a/docs/examples.rst b/docs/examples.rst
index d1ad575a..91501bf9 100644
--- a/docs/examples.rst
+++ b/docs/examples.rst
@@ -8,7 +8,7 @@ A fairly complete case study of using UrbanSim can be shown entirely within a si

 As the canonical example of using UrbanSim, take the case of a residential sales hedonic model used to perform an ordinary least squares regression on a table of building price data. The best practice would be to store the building data in a Pandas HDFStore, and the buildings table can include millions of rows (all of the buildings in a region) and attributes like square footage, lot size, number of bedrooms and bathrooms and the like. Importantly, the dependent variable should also be included which in this case might be the assessed or observed price of each unit. The example repository includes sample data so that this Notebook can be executed.

-This Notebook performs the exact same residential price hedonic as in the complete example below, but all entirely within the same IPython Notebook (and without explicitly using the ``sim.model`` decorator). The simplest use case of the UrbanSim methodology is to create a single model to study an emperical behavior or interest to the modeler, and a good place to start in building such a model is this example.
+This Notebook performs the exact same residential price hedonic as in the complete example below, but all entirely within the same IPython Notebook (and without explicitly using the ``sim.model`` decorator). The simplest use case of the UrbanSim methodology is to create a single model to study an emperical behavior of interest to the modeler, and a good place to start in building such a model is this example.

 Note that the flow of the notebook is one often followed in statistical modeling:

@@ -107,7 +107,7 @@ The ``buildings`` object that gets passed in is a `Table Wrapper `_ but this returns *all* computed columns on the table and so has performance implications. In general it's better to use the Series objects directly where possible.

-As a concrete example, the above code is recommended: ::
+As a concrete example, the following code is recommended: ::

     return buildings.residential_units.groupby(buildings.zone_id).sum()

@@ -130,13 +130,13 @@ Finally, if all the attributes being used are primary, the user can call ``local
 Models
 ~~~~~~

-The main objective of the `models.py `_ file is to define the "entry points" into the model system. Although UrbanSim provides the direct API for a `Regression Model `_ a `Location Choice Model `_, etc, it is the models.py file which defines the specific *steps* that outline a simulation or even a more general data processing workflow.
+The main objective of the `models.py `_ file is to define the "entry points" into the model system. Although UrbanSim provides the direct API for a `Regression Model `_, a `Location Choice Model `_, etc, it is the models.py file which defines the specific *steps* that outline a simulation or even a more general data processing workflow.

 In the San Francisco example, there are two price/rent `hedonic models `_ which both use the RegressionModel, one which is the residential sales hedonic which is estimated with the entry point `rsh_estimate `_ and then run in simulation mode with the entry point rsh_simulate. The non-residential rent hedonic has similar entry points `nrh_estimate `_ and nrh_simulate. Note that both functions call `hedonic_estimate `_ and hedonic_simulate in `utils.py `_. In this case ``utils.py`` actually uses the UrbanSim API by calling the `fit_from_cfg `_ method on the Regressionmodel.

 There are two things that warrant further explanation at this point.

-* ``utils.py`` is a set of helper functions that assist with merging data and running models from configuration files. Note that the code in this file is generally sharable across UrbanSim implementations (in fact, this exact code is in use in multiple live simulations). It defines a certain style of UrbanSim and handles a number of boundary cases in a transparent way. In the long run, this kind of functionality might be unit tested and moved to UrbanSim, but for now we think it helps with transparency, flexibility, and debugging to keep this file with the specific client implementations.
+* ``utils.py`` is a set of helper functions that assist with merging data and running models from configuration files. Note that the code in this file is generally shareable across UrbanSim implementations (in fact, this exact code is in use in multiple live simulations). It defines a certain style of UrbanSim and handles a number of boundary cases in a transparent way. In the long run, this kind of functionality might be unit tested and moved to UrbanSim, but for now we think it helps with transparency, flexibility, and debugging to keep this file with the specific client implementations.

 * Many of the models use configuration files to define the actual model configuration. In fact, most models in this file are very short *stub* functions which pass a Pandas DataFrame into the estimation and configure the model using a configuration file in the `YAML file format `_. For instance, the ``rsh_estimate`` function knows to read the configuration file, estimate the model defined in the configuration on the dataframe passed in, and write the estimated coefficients back to the same configuration file, and the complete method is pasted below::

@@ -215,7 +215,7 @@ This notebook estimates all of the models in the example that need estimation (b
 Simulation Workflow
 ~~~~~~~~~~~~~~~~~~~

-A sample simulation workflow (a complete UrbanSim simulation is available `in this Notebook `__.
+A sample simulation workflow (a complete UrbanSim simulation) is available `in this Notebook `__.

 This notebook is possibly even simpler than the estimation workflow as it has only one substantive cell which runs all of the available models in the appropriate sequence. Passing a range of years will run the simulation for multiple years (the example simply runs the simulation for a single year). Other parameters are available to the `sim.run `_ method which write the output to an HDF5 file.

@@ -230,7 +230,7 @@ This is another simple and powerful notebook which can be used to quickly map va

 See :ref:`dframe-explorer` for detailed information on how to call the ``start`` method and what queries the website is performing.

-Once the ``start`` method has been called, the IPython Notebook is running a web service which will respond to queries from a web browser. Try is out - open your web browser and navigate to http://localhost:8765/ or follow the same link embedded in your notebook. Note the link won't work on the web example - you need to have the example running on your local machine - all queries are run interactively between your web browser and the IPython Notebook. Your web browser should show a page like the following:
+Once the ``start`` method has been called, the IPython Notebook is running a web service which will respond to queries from a web browser. Try it out - open your web browser and navigate to http://localhost:8765/ or follow the same link embedded in your notebook. Note the link won't work on the web example - you need to have the example running on your local machine - all queries are run interactively between your web browser and the IPython Notebook. Your web browser should show a page like the following:

 .. image:: screenshots/dframe_explorer_screenshot.png

@@ -323,11 +323,11 @@ to set the name appropriately): ::
     }
 })

-The keys in this object are table names, the values are also dictionary
+The keys in this object are table names, the values are also a dictionary
 where the keys are column names and the values are a tuple. The first value of the tuple is what to call the Pandas ``fillna`` function with, and can be a choice of "zero," "median," or "mode" and should be set appropriately by the user for the specific column. The second argument is
-the data type to conver to. The user can then call
+the data type to convert to. The user can then call
 ``utils.fill_na_from_config`` as in the `example `_ with a DataFrame and table name and all NaNs will be filled. This functionality will eventually be moved into UrbanSim.
\ No newline at end of file
diff --git a/docs/gettingstarted.rst b/docs/gettingstarted.rst
index 87623c63..87ab11d6 100644
--- a/docs/gettingstarted.rst
+++ b/docs/gettingstarted.rst
@@ -48,7 +48,7 @@ One of the main motivations for the current implementation of UrbanSim is to ref
 A Note on Pandas Indexing
 ~~~~~~~~~~~~~~~~~~~~~~~~~

-One very import note about Pandas - the real genius of the abstraction is that all records in a table are viewed as key-value pairs. Every table has an `index `_ or a `multi-index `_ which is used to `align `_ the table on the key for that table.
+One very important note about Pandas - the real genius of the abstraction is that all records in a table are viewed as key-value pairs. Every table has an `index `_ or a `multi-index `_ which is used to `align `_ the table on the key for that table.

 This is similar to having a `primary key `_ in a database except that now you can do mathematical operations with columns. For instance, you can now take a column from one table and a column from another table and add or multiply them and the operation will automatically align on the key (i.e. it will add elements with the same index value).

@@ -72,7 +72,7 @@ IPython

 One of the most useful features of IPython is the `IPython notebook `_, which is perfect for interactively executing small cells of Python code. We use notebooks a LOT, and they are a wonderful way to avoid the command line in a cross-platform way. The notebook is a fantastic tool to develop snippets of code a few lines at a time, and to capture and communicate higher-level workflows.

-This also makes the notebook a fantastic pedagogical tool - in other words it's great for demos and communicating both the input and output of cells of Python code (e.g. `nbviewer `_. Many of the full-size examples of UrbanSim on this site are presented in notebooks.
+This also makes the notebook a fantastic pedagogical tool - in other words it's great for demos and communicating both the input and output of cells of Python code (e.g. `nbviewer `_). Many of the full-size examples of UrbanSim on this site are presented in notebooks.

 In many cases, you can write entire UrbanSim models in the notebook, but this is not generally considered the best practice. It's entirely up to you though, and we are happy to share with you our insights from many hours of developing and using this set of tools.

@@ -90,7 +90,7 @@ UrbanSim has been an active research project since the late 1990's, and has unde
     for model in models:
         model.simulate(model_configuration_parameters)

-The set of models varies among the many UrbanSim applications to different regions, due to the data availability and cleanliness, the time and resources that can be devoted to the project, and specific research questions that motivated the projects. The set of models almost always includes at least the following:
+The set of models varies among the many UrbanSim applications to different regions, due to data availability and cleanliness, the time and resources that can be devoted to the project, and specific research questions that motivated the projects. The set of models almost always includes at least the following:

 Residential Real Estate Models
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
@@ -127,7 +127,7 @@ Some representation of real estate development must be modeled to accurately rep

 It should be noted that many other kinds of models can be included in the simulation loop as well. For instance, inclusion of scheduled development events is a key element to representing known future development projects.

-In general, any Python script that reads and writes data can be included to help answer a specific research question or to model a certain real-world behavior - models can even be parameterized in JSON or YAML and included in the standard model set and an ever-increasing set of functionality will be added over time.
+In general, any Python script that reads and writes data can be included to help answer a specific research question or to model a certain real-world behavior - models can even be parameterized in JSON or YAML and included in the standard model set, and an ever-increasing set of functionality will be added over time.

 Specifying Scenario Inputs
 --------------------------
diff --git a/urbansim/developer/sqftproforma.py b/urbansim/developer/sqftproforma.py
index 9d16cc5b..bd14b87c 100644
--- a/urbansim/developer/sqftproforma.py
+++ b/urbansim/developer/sqftproforma.py
@@ -13,7 +13,7 @@ class SqFtProFormaConfig(object):

     parcel_sizes : list
         A list of parcel sizes to test. Interestingly, right now
-        the parcel sizes cancel is this style of pro forma computation so
+        the parcel sizes cancel in this style of pro forma computation so
         you can set this to something reasonable for debugging purposes - e.g. [10000]. All sizes can be feet or meters as long as they are consistently used.
diff --git a/urbansim/models/lcm.py b/urbansim/models/lcm.py
index 8184672b..45588bef 100644
--- a/urbansim/models/lcm.py
+++ b/urbansim/models/lcm.py
@@ -311,7 +311,7 @@ def predict(self, choosers, alternatives, debug=False):
         alternatives : pandas.DataFrame
             Table describing the things from which agents are choosing.
         debug : bool
-            If debug is set to true, well set the variable "sim_pdf" on
+            If debug is set to true, will set the variable "sim_pdf" on
             the object to store the probabilities for mapping of the outcome.

@@ -504,7 +504,7 @@ def predict_from_cfg(cls, movers, locations, cfgname,
         movers : DataFrame
             A dataframe of agents doing the choosing.
         locations : DataFrame
-            A dataframe of locations which the choosers are location in and which
+            A dataframe of locations which the choosers are locating in and which
             have a supply.
         cfgname : string
             The name of the yaml config file from which to read the location
@@ -697,7 +697,7 @@ def predict(self, choosers, alternatives, debug=False):
         alternatives : pandas.DataFrame
             Table describing the things from which agents are choosing.
         debug : bool
-            If debug is set to true, well set the variable "sim_pdf" on
+            If debug is set to true, will set the variable "sim_pdf" on
             the object to store the probabilities for mapping of the outcome.

@@ -985,7 +985,7 @@ def predict(self, choosers, alternatives, debug=False):
         alternatives : pandas.DataFrame
             Table describing the things from which agents are choosing.
         debug : bool
-            If debug is set to true, well set the variable "sim_pdf" on
+            If debug is set to true, will set the variable "sim_pdf" on
             the object to store the probabilities for mapping of the outcome.

@@ -1175,7 +1175,7 @@ def predict_from_cfg(cls, movers, locations, cfgname,
         movers : DataFrame
             A dataframe of agents doing the choosing.
         locations : DataFrame
-            A dataframe of locations which the choosers are location in and which
+            A dataframe of locations which the choosers are locating in and which
             have a supply.
         cfgname : string
             The name of the yaml config file from which to read the location