Monkeypatch patsy with faster handling of boolean categoricals #31

Merged: 2 commits into master from patsy-patch, May 28, 2014

jiffyclub
Member

In urbansim we know that data passing through patsy will always be pandas Series instances. We can capitalize on that in a couple of places to avoid expensive categorical NaN checking. Instead, we check the dtype of the data and, if it is boolean, take the appropriate action without actually looking at the data.

The patched pieces are the function patsy.categorical.categorical_to_int and the method patsy.categorical.CategoricalSniffer.sniff.

These changes may also end up being discussed on a PR I submitted to patsy at pydata/patsy#44.
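As a rough illustration of the approach, here is a minimal sketch of a dtype-based fast path installed by monkeypatching. It is not the code from this PR and does not touch patsy itself: `slow_lib`, `FakeSeries`, `apply_patch`, and the `_is_patched` marker are all hypothetical stand-ins, and the stand-in `categorical_to_int` takes a simplified argument list rather than patsy's own. The marker attribute is one possible way to check that the patch is actually in place.

```python
import types

# Hypothetical stand-in for a third-party module; this is NOT patsy's code.
slow_lib = types.SimpleNamespace()


def _slow_categorical_to_int(data, levels):
    """Generic path: checks every value for None/NaN before mapping it."""
    level_to_int = {level: i for i, level in enumerate(levels)}
    out = []
    for value in data.values:
        if value is None or value != value:  # None, or NaN (NaN != NaN)
            out.append(-1)  # -1 marks a missing value in this sketch
        else:
            out.append(level_to_int[value])
    return out


slow_lib.categorical_to_int = _slow_categorical_to_int


class FakeSeries:
    """Hypothetical stand-in for a pandas Series: values plus a dtype label."""

    def __init__(self, values, dtype):
        self.values = list(values)
        self.dtype = dtype


def apply_patch(module):
    """Replace module.categorical_to_int with a wrapper that has a bool fast path."""
    original = module.categorical_to_int
    if getattr(original, "_is_patched", False):
        return  # already patched; applying twice would stack wrappers

    def fast_categorical_to_int(data, levels):
        # Decide from the dtype alone: boolean data cannot hold missing
        # values, so the per-value None/NaN check is skipped entirely.
        if data.dtype == "bool" and tuple(levels) == (False, True):
            return [int(v) for v in data.values]
        # Anything else falls back to the original, slower function.
        return original(data, levels)

    fast_categorical_to_int._is_patched = True  # marker for later verification
    module.categorical_to_int = fast_categorical_to_int


apply_patch(slow_lib)
bools = FakeSeries([True, False, True], "bool")
codes = slow_lib.categorical_to_int(bools, (False, True))
```

Keeping a reference to the original function inside the wrapper means non-boolean data behaves exactly as before, so the patch only changes the one case it is meant to speed up.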

@coveralls

Coverage Status

Coverage increased (+5.16%) when pulling 0d9680d on patsy-patch into 2e65245 on master.

@jiffyclub
Member Author

Going to do a little more testing on this to make sure the patch has actually been applied when models are running.

@coveralls

Coverage Status

Coverage increased (+5.22%) when pulling bc33a91 on patsy-patch into 2e65245 on master.

jiffyclub added a commit that referenced this pull request May 28, 2014
Monkeypatch patsy with faster handling of boolean categoricals
@jiffyclub jiffyclub merged commit faaed39 into master May 28, 2014
@jiffyclub jiffyclub deleted the patsy-patch branch May 28, 2014 00:20