Handle range on heterogeneously typed Dataset column #2345

Merged
merged 3 commits into master from hetero_ds_range on Feb 19, 2018

2 participants
Contributor

philippjfr commented Feb 15, 2018

Calling range on a heterogeneously typed column currently causes a variety of problems, including errors, because Python 3 no longer allows ordering comparisons between unrelated types. Although range should generally not be called on object or string dtype columns, it currently is, so it needs to work robustly for the time being. This PR therefore uses the python2sort utility to sort such columns robustly in Python 3.
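
For readers unfamiliar with the problem, here is a minimal sketch of the general idea. It is not the python2sort utility used in this PR; the helper names `mixed_type_sort` and `mixed_type_range` are hypothetical, and the ordering rule (real numbers first, then other values grouped by type name) only approximates Python 2 behaviour.

```python
from itertools import groupby
from numbers import Real


def _group_key(value):
    # Hypothetical grouping rule: real numbers share one group that sorts
    # first; everything else is grouped by the name of its type.
    if isinstance(value, Real):
        return (0, "")
    return (1, type(value).__name__)


def mixed_type_sort(values):
    """Sort values that may not be mutually comparable in Python 3."""
    values = list(values)
    try:
        # Fast path: homogeneous (or mutually comparable) values.
        return sorted(values)
    except TypeError:
        pass
    # Fallback: order the groups, then sort within each group.
    # Values inside a group must still be comparable with each other.
    ordered = []
    for _, group in groupby(sorted(values, key=_group_key), key=_group_key):
        ordered.extend(sorted(group))
    return ordered


def mixed_type_range(values):
    """Return (lowest, highest) under the ordering above."""
    ordered = mixed_type_sort(values)
    return (ordered[0], ordered[-1]) if ordered else (None, None)


column = [3, "b", 1.5, "a"]
print(mixed_type_sort(column))
print(mixed_type_range(column))
```

In Python 3, calling `sorted(column)` directly raises a `TypeError` for this input, which is the failure mode described above. The fallback trades a strict total order for robustness, so the resulting range is only meaningful as a rough bound on mixed columns.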

philippjfr force-pushed the hetero_ds_range branch from ca2382e to ff26b55 Feb 15, 2018

Contributor

jlstevens commented Feb 16, 2018

This is one semantic change in Python 3 that I find to be downright annoying. Anyway, the suggested fix makes sense.

dimensions = [dataset.get_dimensions(d).name for d in dimensions]
inds = [dataset.data.columns.index(dim.name) for dim in dimensions]
return dataset.data.values[:, inds]

jlstevens Feb 16, 2018

Contributor

I don't quite see how this relates to the Python 2 vs 3 sorting issue. Is this method implementing an unrelated fix?

philippjfr Feb 16, 2018

Author Contributor

Yes, accidentally pushed this. Will revert.

philippjfr force-pushed the hetero_ds_range branch 2 times, most recently from bf5ea1b to 92d9e35 Feb 18, 2018

Contributor Author

philippjfr commented Feb 19, 2018

Going to need rebuilt test data, but let's get #1978 merged first since that also requires new test data.

philippjfr force-pushed the hetero_ds_range branch from 92d9e35 to 7eeae5a Feb 19, 2018

Contributor

jlstevens commented Feb 19, 2018

Happy to merge when the tests go green.

jlstevens merged commit d05358f into master Feb 19, 2018

4 checks passed

continuous-integration/travis-ci/pr: The Travis CI build passed
continuous-integration/travis-ci/push: The Travis CI build passed
coverage/coveralls: Coverage increased (+0.02%) to 82.058%
s3-reference-data-cache: Test data is cached.

philippjfr deleted the hetero_ds_range branch Feb 20, 2018
