create TM2 test example for 3 zones and TVPB #334

Closed
bstabler opened this issue Aug 27, 2020 · 24 comments

@bstabler
Contributor

@lmz
Contributor

lmz commented Sep 3, 2020

See also: https://github.com/ActivitySim/activitysim/wiki/MTC-TVPB-Test-Data

I'm converting skims to OMX and posting on Box (@bstabler and @toliwaga already have edit access)

For next steps, how do I set up the test run to use these files?

@bstabler
Contributor Author

bstabler commented Sep 3, 2020

Thanks @lmz. It looks like the data inputs are coming along. Can you also post your scripts for converting the files, since we may need to make some updates once we start testing these? For now, I'll set up the test environment, since I need to develop the expression files to go with these inputs. Once I get it basically working, which may require a bit of back and forth between us, I'll post it for you to try. How does that sound?

@lmz
Contributor

lmz commented Sep 8, 2020

Sure, any preference for where they ought to go? I can put them in travel-model-two or in this repo.

@bstabler
Contributor Author

bstabler commented Sep 8, 2020

Or maybe the Box account folder for now? Any of the locations is fine. If you use this repo, create a new folder under other_resources. Thanks.

@lmz
Contributor

lmz commented Sep 10, 2020

To continue on from our conversation today - I need some clarity about what actually needs to be done for this (which is why I thought it would be easier to understand if something is runnable to see what fails). For example:

  • I would think we would just use householdData_3.csv as the household input file, personData_3.csv for the person input file. It's not clear to me what changes are required to those. Zone renumbering?

  • Also, do input land use data files need to be combined? Can the test run be configured to use multiple files?

  • Assuming the tour files are needed for the tour destinations and times, do the individual and joint tour files need to be appended? I don't know what other changes would be necessary. Zone renumbering?

  • The maz-based skims are in text form and it seems like they're already the right format (but without headers -- which I find unacceptable...!)
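
A minimal sketch of the header fix for the MAZ-based skims, assuming a comma-delimited text file with three fields per row. The column names are placeholders for illustration, not the actual TM2 layout, and the in-memory strings stand in for the real files.

```python
import csv
import io

# Hypothetical column names; confirm the real MAZ-to-MAZ skim layout
# against the TM2 files before using anything like this.
ASSUMED_COLUMNS = ["OMAZ", "DMAZ", "DISTANCE"]


def add_header(src, dst, columns):
    """Copy headerless CSV rows from src to dst, writing `columns` first."""
    reader = csv.reader(src)
    writer = csv.writer(dst, lineterminator="\n")
    writer.writerow(columns)
    for row in reader:
        # Fail loudly if a row does not match the assumed layout.
        if len(row) != len(columns):
            raise ValueError(f"expected {len(columns)} fields, got {len(row)}: {row}")
        writer.writerow(row)


# In-memory stand-ins for the input and output files.
raw = io.StringIO("1,2,0.35\n1,3,0.80\n")
out = io.StringIO()
add_header(raw, out, ASSUMED_COLUMNS)
result = out.getvalue()
```

With real files, `src` and `dst` would be file objects opened with `newline=""`, as the `csv` module documentation recommends.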

@bstabler
Contributor Author

Sure. Here's what I think still needs to be done:

  • for the synthetic households and persons, we need to make sure all the fields required by the tour mode choice UEC are in the input files. For example, expression valueOfTime refers to person.value_of_time.
  • for the land use data, we need to make sure all the fields required by the tour mode choice UEC are in the input file and all the land use data is in one file. I think this means merging maz_data.csv, accessibilities.csv, and mgraParkingCost.csv.
  • for the tour data, we need to merge indivTourData_3.csv and jointTourData_3.csv into one tour table and also create the joint_tour_participants table. We also need to make sure all the fields required by the tour mode choice UEC are in the input files.
  • for the maz-based skims, we need to add headers
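
The land use merge in the second bullet could be sketched with pandas roughly as follows. The key and attribute column names (MAZ, mgra, hourly_parking_cost, and so on) are assumptions made for illustration and should be checked against the actual maz_data.csv, accessibilities.csv, and mgraParkingCost.csv files.

```python
import io

import pandas as pd

# Tiny stand-ins for maz_data.csv, accessibilities.csv and mgraParkingCost.csv.
# All column names here are assumed, not taken from the real files.
maz_data = pd.read_csv(io.StringIO("MAZ,TAZ,emp_total\n1,10,50\n2,10,75\n3,11,20\n"))
accessibilities = pd.read_csv(io.StringIO("mgra,auto_peak_access\n1,9.1\n2,8.7\n3,7.9\n"))
parking = pd.read_csv(io.StringIO("mgra,hourly_parking_cost\n1,2.5\n3,0.0\n"))

# Left-join the side files onto the MAZ table so every zone is kept, and let
# pandas verify that each file has at most one row per zone.
land_use = (
    maz_data
    .merge(accessibilities.rename(columns={"mgra": "MAZ"}),
           on="MAZ", how="left", validate="one_to_one")
    .merge(parking.rename(columns={"mgra": "MAZ"}),
           on="MAZ", how="left", validate="one_to_one")
)

# Zones absent from a side file come through as NaN; list them for review.
missing_parking = land_use.loc[land_use["hourly_parking_cost"].isna(), "MAZ"].tolist()
```

A left join keeps the zone list from the MAZ file authoritative, so gaps in the other files surface as missing values rather than silently dropping zones.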

In terms of formats, see the examples referenced at https://github.com/ActivitySim/activitysim/wiki/MTC-TVPB-Test-Data. Thanks.
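
For the tour data, one possible shape of the merge is sketched below. It assumes the joint tour file lists participants as space-separated person numbers in a tour_participants column and that a lookup from (hh_id, person_num) to person_id is available. Those column names, the is_joint flag, and the tour_id numbering are all illustrative assumptions, not the confirmed TM2 or ActivitySim schema.

```python
import io

import pandas as pd

# Stand-ins for indivTourData_3.csv and jointTourData_3.csv (assumed columns).
indiv = pd.read_csv(io.StringIO(
    "hh_id,person_id,tour_purpose,dest_mgra\n"
    "1,101,Work,5\n"
    "2,201,School,7\n"
))
joint = pd.read_csv(io.StringIO(
    "hh_id,tour_purpose,dest_mgra,tour_participants\n"
    "1,Shop,9,1 2\n"
))
# Hypothetical lookup from household and person number to person id.
persons = pd.DataFrame(
    {"hh_id": [1, 1, 2], "person_num": [1, 2, 1], "person_id": [101, 102, 201]}
)

# Stack both files into one table and give every tour a new unique id.
tours = pd.concat(
    [indiv.assign(is_joint=False), joint.assign(is_joint=True)],
    ignore_index=True,
)
tours["tour_id"] = tours.index + 1

# One row per participant of each joint tour.
participants = (
    tours.loc[tours["is_joint"], ["tour_id", "hh_id", "tour_participants"]]
    .assign(person_num=lambda df: df["tour_participants"].str.split())
    .explode("person_num")
    .astype({"person_num": "int64"})
    .merge(persons, on=["hh_id", "person_num"], how="left")
    [["tour_id", "hh_id", "person_id", "person_num"]]
)

# The participant list now lives in its own table.
# Note: person_id on the joint tour rows is left missing in this sketch.
tours = tours.drop(columns="tour_participants")
```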

@lmz
Contributor

lmz commented Sep 14, 2020

When running this example, what's the mechanism used to read the tour files (instead of generating them with the other submodels)? I would think it would be something like a submodel (or more than one) that reads these files rather than doing the cdap / school_location / workplace_location, etc. As such, I think it would make sense for that submodel to read the files as is and do some limited processing, consistent with the original submodel(s). That submodel would need to do some processing anyway, so why not have it read the raw TM2 files as they are and do all the work, rather than splitting the work between a pre-processing script and the submodel?

Where is the settings.yml / setup so far for a 3 zone mtc/tam example?

@bstabler
Contributor Author

For now, the mechanism to read the tour file will be to restart the pipeline at tour mode choice. To do so, we'll create a pipeline file of pandas DataFrames from the TM2 CSV files. Longer term, @toliwaga and I discussed creating an initialize_tours submodel to read this table from disk (like initialize_households). I like the idea of putting the pre-processing logic into an initialize_tours expression file, but we don't have that functionality currently, so I was thinking we'd write a script. Once we have the script, then we could convert the logic into an expressions file.

Here is the current example 3 zone setup. I'm working on building a TM2 example setup based on the data you're posting.

@lmz
Contributor

lmz commented Sep 16, 2020

Ok, forgive me for my dumb questions (I'm still an ActivitySim noob), but I don't understand what triggers the model to "restart the pipeline at tour mode choice". What command would I run? Where is this reflected in a settings.yaml? This one doesn't include the models section

@bstabler
Contributor Author

@lmz - I'm concerned we're missing each other a bit on this. I'm thinking you provide the reasonably well-formatted data inputs and then I'll build and test the example (possibly revising the inputs if needed, on my own and/or in coordination with you). We currently don't have out-of-the-box functionality for this exact use case, so I'm going to have to put something together for now. Once I get the example working, I'm planning to share it with you and @toliwaga and the rest of the team. Thanks.

P.S. Here is an example with the ability to restart.

@toliwaga
Contributor

@lmz  The link @bstabler provided above was broken when I moved the branch from rsg to activitysim repo.
Here is the correct link.

@lmz
Contributor

lmz commented Sep 17, 2020

Hi @bstabler - I think I understand your approach but I don't agree with it. I think it makes sense to develop the code in tandem with the inputs, because it will help us to understand how the inputs should work (and if there might be a more elegant way to do this, like the way I proposed above). Having a skeleton and then fixing input files in response to error messages (rather than scanning a UEC and guessing what needs to be updated) makes more sense to me.

@lmz
Contributor

lmz commented Sep 17, 2020

> @lmz The link @bstabler provided above was broken when I moved the branch from rsg to activitysim repo.
> Here is the correct link.

Is it the resume_after line? I searched the documentation for more about how to make this work (e.g. what's the input file for this functionality?) but couldn't find much.

@toliwaga
Contributor

toliwaga commented Sep 17, 2020 via email

@lmz
Contributor

lmz commented Sep 17, 2020

I am sure it (and everything in asim) could be better documented.

Ok, so I am requesting that you document this feature and how to use it now, since it's relevant for this task.
Thank you.

@lmz
Contributor

lmz commented Sep 17, 2020

Based on this note, it looks like the relevant command would be something like the following?

python simulation.py -c configs_local -c configs_3_zone -c configs -d data_3 -o output_3

And so is the plan to create a directory alongside configs_3_zone called configs_3_zone_mtc?

@bstabler
Contributor Author

bstabler commented Sep 17, 2020

Hi @lmz, that's correct. Here's basically what I'm thinking, which we can discuss in more detail on the call:

  • copy configs_3_zone and rename as configs_3_zone_mtc
  • first run the example through tour mode choice since we need the pipeline file at this point in time in order to jump in at tour mode choice
  • modify the setup to use the network LOS data you provide
    • maz nearby skims
    • OMX Cube skims
  • now here's the tricky part, transform the example pipeline to use your data instead
    • replace the pipeline households table with your table
    • replace the pipeline persons table with your table
    • replace the pipeline land use table with your table (including maz, accessibility, and parking data in one file)
    • replace the pipeline tours table with your table (including both individual and joint tours in one file)
  • update the tour mode choice expression files as needed
  • update the tvpb expression files as needed
  • restart the model run at tour_mode_choice and only run tour_mode_choice
  • write out the tours table and summarize results against the previous model to check results
    • mode share
    • frequency of taps, tap pairs selected
    • trace some od pairs for similar taps selected
    • etc
  • give @toliwaga the example so he can work on performance
  • iterate/debug as needed in order to confirm the software is working fine and producing reasonable results
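
The mode share check in the plan above could start from something like this sketch. The mode labels and tour lists are made up for illustration. In practice the two inputs would come from the tour mode column of each model's output tours table.

```python
from collections import Counter


def mode_shares(modes):
    """Return {mode: share of tours} for a sequence of mode labels."""
    counts = Counter(modes)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {mode: n / total for mode, n in counts.items()}


def share_differences(new_modes, reference_modes):
    """Return {mode: new share - reference share} over the union of modes."""
    new, ref = mode_shares(new_modes), mode_shares(reference_modes)
    return {m: new.get(m, 0.0) - ref.get(m, 0.0) for m in sorted(set(new) | set(ref))}


# Made-up tour modes standing in for the two sets of model results.
asim = ["DRIVEALONE", "WALK_TRANSIT", "WALK", "DRIVEALONE"]
tm2 = ["DRIVEALONE", "WALK_TRANSIT", "WALK_TRANSIT", "DRIVEALONE"]
diffs = share_differences(asim, tm2)
```

Taking the union of modes means a mode chosen in only one of the two runs still appears in the comparison, which is the kind of discrepancy this check is meant to expose.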

@bstabler
Contributor Author

regarding resume_after, see https://activitysim.github.io/activitysim/abmexample.html#pipeline

@lmz
Contributor

lmz commented Sep 17, 2020

The referenced documentation says "These model steps must be registered orca steps, as noted below. If you provide a resume_after argument to activitysim.core.pipeline.run() the pipeliner will load checkpointed tables from the checkpoint store and resume pipeline processing on the next model step after the specified checkpoint."

What's the "checkpoint store"? How does one create this file? If this is the recommended way to restart the model/skip some submodels, then there needs to be documentation on this.

@lmz
Contributor

lmz commented Sep 17, 2020

> Hi @lmz, that's correct. Here's basically what I'm thinking, which we can discuss in more detail on the call:
>
>   • copy configs_3_zone and rename as configs_3_zone_mtc
>   • first run the example through tour mode choice since we need the pipeline file at this point in time in order to jump in at tour mode choice

This sounds to me like a cumbersome way to set up a model just to test the tour mode choice submodel.
I don't believe this is user-friendly skip-model functionality. It sounds like this is a topic to discuss tomorrow morning.

@bstabler
Contributor Author

I agree, it's cumbersome and it's something we should improve. We don't have a recommended way to skip submodels so that's why we're pivoting off a previous run. The checkpoint store is also known as the data store and the pipeline file. It's in the output folder from the previous run.
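
To make the idea of pivoting off a previous run concrete, here is a toy sketch of a checkpointed pipeline. It is not ActivitySim's implementation: the ToyPipeline class, the make_tours step name, and the in-memory dict standing in for the pipeline file are all hypothetical. It only illustrates saving a snapshot of the tables after each step, swapping in an externally prepared table, and resuming on the next step after a named checkpoint.

```python
import copy


class ToyPipeline:
    """Toy stand-in for a checkpointed pipeline; not ActivitySim's code."""

    def __init__(self, steps):
        self.steps = steps        # ordered list of (name, function)
        self.checkpoints = {}     # step name -> snapshot of tables after it

    def run(self, tables=None, resume_after=None):
        names = [name for name, _ in self.steps]
        if resume_after is None:
            tables = dict(tables or {})
            start = 0
        else:
            # Load the snapshot saved after `resume_after`, then continue
            # with the step that follows it.
            tables = copy.deepcopy(self.checkpoints[resume_after])
            start = names.index(resume_after) + 1
        for name, func in self.steps[start:]:
            func(tables)
            self.checkpoints[name] = copy.deepcopy(tables)
        return tables


def make_tours(tables):
    tables["tours"] = [{"tour_id": 1, "distance": 2.0}]


def tour_mode_choice(tables):
    for tour in tables["tours"]:
        tour["mode"] = "WALK" if tour["distance"] < 1.0 else "DRIVE"


pipeline = ToyPipeline([("make_tours", make_tours),
                        ("tour_mode_choice", tour_mode_choice)])
first = pipeline.run()

# Swap in externally prepared tours, then rerun only the mode choice step.
pipeline.checkpoints["make_tours"]["tours"] = [{"tour_id": 99, "distance": 0.5}]
second = pipeline.run(resume_after="make_tours")
```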

@bstabler
Contributor Author

Here's some additional helpful info on the very simple examples being used so far - https://github.com/ActivitySim/activitysim/wiki/TVPB-Design

@bstabler
Contributor Author

@toliwaga finished creating the Marin example this week. I'm going to create a new issue for performance tuning/pre-computing/caching.

@bstabler
Contributor Author

this is now in the release as example_3_marin_full
