New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Rework hdf output #11

Merged
merged 6 commits into from Jul 7, 2016

Conversation

Projects
None yet
4 participants
@bridwell
Contributor

bridwell commented Jul 1, 2016

This addresses #9 and supersedes a previous pull request I made #10:

  • Regardless of the output interval, tables will always be output for the base, the 1st iteration's results and the final iteration's results.
    • The interval is defined relative to the first iteration. For example, a run beginning in 2015, ending in 2020, with an out_interval of 2, will now write out results for base, 2015, 2017, 2019 and 2020.
    • Base tables are still prefixed with '/base/'.
    • Non-base tables are always prefixed with the iteration value.
    • Previously, the final iteration results were prefixed with '/final/' for cases where the final iteration didn't coincide with the output interval and with the iteration value when it did. This has been changed so that the final iteration tables are always prefixed with the iteration value.
  • Optionally, the run method can now be given a set of specific base tables and/or run tables that will be written. If not provided, then the tables will be inferred from those injected into the steps as is done now (this is still the default behavior). I've found this useful for suppressing tables (e.g. OD skims matrix) that are injected into steps every year but that we don't change and don't want to write because they add a lot of size to the output file. For the base, I typically write out all tables, regardless of what's injected, just to keep everything in a single store. For example:
orca.run(
    ['my_step'],
    range(2010, 2030),
    data_out=out_file,
    out_interval=5,
    out_base_tables=orca.list_tables(),
    out_run_tables=['a_table']
)
  • The write_tables method was re-factored so that it no longer take steps as inputs. This allows for outputting tables outside the context of a run:
# just output a couple tables
orca.write_tables(out_h5, ['buildings', 'persons'])

# write out everything that's registered
orca.write_tables(out_h5)

# still allow for step inference
step_tables = orca.get_step_table_names(['my_step'])
orca.write_tables(out_h5, step_tables)

The docstrings and tests have been updated to reflect these changes.

I looked through the repo and the only method calling write_tables is the run method. So all tests pass fine. However, if there are calls to orca.write_tables coming from outside orca, then these will now fail. I don't see why this previously would be called outside the context of a run, but if that is the case then I can revert the signature back to the previous version.

bridwell added some commits Jun 30, 2016

Update logic for outputing hdf files. Now, the first iterations resul…
…ts will always be output in addition to the base. Also, the final iteration will always be prefix with iteration value. Previously these tables were predixed with final in cases where the iteration didnt correspond to interval.
@coveralls

This comment has been minimized.

Show comment
Hide comment
@coveralls

coveralls Jul 1, 2016

Coverage Status

Changes Unknown when pulling ef03a5d on AZMAG:rework_hdf_output into * on UDST:master*.

coveralls commented Jul 1, 2016

Coverage Status

Changes Unknown when pulling ef03a5d on AZMAG:rework_hdf_output into * on UDST:master*.

@bridwell

This comment has been minimized.

Show comment
Hide comment
@bridwell

bridwell Jul 1, 2016

Contributor

This passes for python 3.3 but fails for 2.7 and 3.4 on an assert column series equal.

It appears that setting values via .loc is changing the data type from int64 to float. Should the update_col_from_series method be updated so that the data type of the incoming series is cast to match the local column?

Or should update_col_from_series check the data types of the series and the incoming values and raise an exception if they don't match?

Contributor

bridwell commented Jul 1, 2016

This passes for python 3.3 but fails for 2.7 and 3.4 on an assert column series equal.

It appears that setting values via .loc is changing the data type from int64 to float. Should the update_col_from_series method be updated so that the data type of the incoming series is cast to match the local column?

Or should update_col_from_series check the data types of the series and the incoming values and raise an exception if they don't match?

@coveralls

This comment has been minimized.

Show comment
Hide comment
@coveralls

coveralls Jul 5, 2016

Coverage Status

Changes Unknown when pulling fa560ee on AZMAG:rework_hdf_output into * on UDST:master*.

coveralls commented Jul 5, 2016

Coverage Status

Changes Unknown when pulling fa560ee on AZMAG:rework_hdf_output into * on UDST:master*.

@fscottfoti

This comment has been minimized.

Show comment
Hide comment
@fscottfoti

fscottfoti Jul 5, 2016

Member

FWIW these changes all seem very reasonable to me. Ready to merge?

Member

fscottfoti commented Jul 5, 2016

FWIW these changes all seem very reasonable to me. Ready to merge?

@coveralls

This comment has been minimized.

Show comment
Hide comment
@coveralls

coveralls Jul 5, 2016

Coverage Status

Changes Unknown when pulling 7aba309 on AZMAG:rework_hdf_output into * on UDST:master*.

coveralls commented Jul 5, 2016

Coverage Status

Changes Unknown when pulling 7aba309 on AZMAG:rework_hdf_output into * on UDST:master*.

@bridwell

This comment has been minimized.

Show comment
Hide comment
@bridwell

bridwell Jul 5, 2016

Contributor

OK I think it's ready now.

Contributor

bridwell commented Jul 5, 2016

OK I think it's ready now.

@janowicz

This comment has been minimized.

Show comment
Hide comment
@janowicz

janowicz Jul 5, 2016

Contributor

These changes make a lot of sense to me too. Looks good.

Contributor

janowicz commented Jul 5, 2016

These changes make a lot of sense to me too. Looks good.

@fscottfoti fscottfoti merged commit 5983296 into UDST:master Jul 7, 2016

2 checks passed

continuous-integration/travis-ci/pr The Travis CI build passed
Details
coverage/coveralls First build on master at 96.516%
Details

@bridwell bridwell deleted the AZMAG:rework_hdf_output branch Jul 7, 2016

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment