-
Notifications
You must be signed in to change notification settings - Fork 131
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
This is the set up updates to finish moving urbansim to the new simulation framework #67
Conversation
yamlmodelrunner needs to be moved and import time at the top
marking agents to relocate with np.nan is a big performance issue. I think this forces the type to float and then calling .loc with a float is a major performance issue. Moving this back to marking relocating agents with -1 fixes the problem for now. This might need a more permanent fix. Also there was a warning in sqftproforma that needed to be addressed
need to remove nans before sampling -1 now marks relocating agents
This branch probably can't be merged until we converge on the best way to do the yamlmodelrunner and then update the client repos |
@@ -325,6 +332,7 @@ class instance for use during prediction. | |||
self.fit_parameters = _model_fit_to_table(fit) | |||
if debug: | |||
index = util.apply_filter_query(data, self.fit_filters).index | |||
assert len(fit.model.exog) == len(index) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe add a helpful message to this so someone knows what's wrong if this happens. The syntax is assert expr, 'message'
.
Conflicts: urbansim/sim/simulation.py
Conflicts: urbansim/sim/simulation.py
@@ -325,6 +333,8 @@ class instance for use during prediction. | |||
self.fit_parameters = _model_fit_to_table(fit) | |||
if debug: | |||
index = util.apply_filter_query(data, self.fit_filters).index | |||
assert len(fit.model.exog) == len(index), "The estimate data is" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think you'll need to format this line like this:
assert len(fit.model.exog) == len(index), (
'The estimate data is unequal in length to '
'the original dataframe, usually caused by nans')
As is I think the second line of the message will be ignored.
and the test needed to account for this. I think this is the desired behavior since the segmentation column needs to be added to the dataframe
@jiffyclub I think this pull request is pretty close to ready to be merged - can you take a look? The Bay Area model no longer uses dataset.py or yamlmodelrunner.py but they're still there until semcog gets switched over. Let me know if you see any issues. |
if len(locations) > len(movers) * location_ratio: | ||
print("Location ratio exceeded: %d locations and only %d choosers" % | ||
(len(locations), len(movers))) | ||
idxes = random.choice(locations.index, size=len(movers) * location_ratio, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
numpy is already imported, you can use np.random
here and remove the import up top.
In lcm.py and regression.py it'd be good to have some logging statements along the lines of There are also a lot of print statements that don't have context. If those are providing essential info to users they are fine, but there should be context before printing a dataframe description or fit results. |
|
||
new_units = lcm.predict(movers, locations, debug=True) | ||
print("Assigned %d choosers to new units" % len(new_units.index)) | ||
return new_units |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK I fixed the hardwired debug. What do you think about returning lcm, new_units
from the prediction function?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Works for me, just document it. Any function/method that returns something should have that documented in a Returns
section as described here: https://github.com/numpy/numpy/blob/master/doc/HOWTO_DOCUMENT.rst.txt#sections.
Conflicts: urbansim/sim/simulation.py
This is the set up updates to finish moving urbansim to the new simulation framework.
Hopefully I got all of the updates @jiffyclub suggested - if I didn't let me know. |
Looks good. |
No description provided.