Handle range on heterogeneously typed Dataset column #2345

Merged
merged 3 commits into from Feb 19, 2018

@philippjfr
Member

@philippjfr philippjfr commented Feb 15, 2018

Calling range on a heterogeneously typed column currently causes a variety of problems, including outright errors, because Python 3 no longer defines an ordering between unrelated types. Although range should generally not be called on object or string dtype columns, it currently is, so it needs to work robustly for the time being. This PR therefore uses the python2sort utility to sort such columns robustly in Python 3.
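To illustrate the failure mode, here is a minimal, self-contained sketch. It is not the python2sort utility and does not reproduce its behavior exactly; `hetero_range` is a hypothetical helper name, and the fallback of grouping values by type name is an assumption chosen for illustration only.

```python
def hetero_range(values):
    """Return (min, max) of values, tolerating mutually unorderable types.

    Hypothetical helper for illustration; not part of any library.
    """
    values = [v for v in values if v is not None]
    if not values:
        return (None, None)
    try:
        # Works whenever all values are mutually comparable.
        ordered = sorted(values)
    except TypeError:
        # Python 3 refuses to compare e.g. int and str, so sort by type name
        # first and compare values only within the same type.
        # Assumption: values that share a type are orderable among themselves.
        ordered = sorted(values, key=lambda v: (type(v).__name__, v))
    return ordered[0], ordered[-1]


print(hetero_range([3, 'a', 1, 'b']))  # (1, 'b')
```

Plain `sorted([3, 'a', 1, 'b'])` raises `TypeError` in Python 3, which is the kind of error a range call on a mixed-type column runs into.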

@jlstevens
Contributor

@jlstevens jlstevens commented Feb 16, 2018

This is one semantic change in Python 3 that I find to be downright annoying. Anyway, the suggested fix makes sense.

dimensions = [dataset.get_dimensions(d).name for d in dimensions]
inds = [dataset.data.columns.index(dim.name) for dim in dimensions]
return dataset.data.values[:, inds]

@jlstevens

jlstevens Feb 16, 2018
Contributor

I don't quite see how this relates to the Python 2 vs 3 sorting issue. Is this method implementing an unrelated fix?

@philippjfr

philippjfr Feb 16, 2018
Author Member

Yes, accidentally pushed this. Will revert.

@philippjfr philippjfr force-pushed the hetero_ds_range branch 2 times, most recently from bf5ea1b to 92d9e35 Feb 18, 2018
@philippjfr
Member Author

@philippjfr philippjfr commented Feb 19, 2018

Going to need rebuilt test data, but let's get #1978 merged first since that also requires new test data.

@philippjfr philippjfr force-pushed the hetero_ds_range branch from 92d9e35 to 7eeae5a Feb 19, 2018
@jlstevens
Contributor

@jlstevens jlstevens commented Feb 19, 2018

Happy to merge when the tests go green.

@jlstevens jlstevens merged commit d05358f into master Feb 19, 2018
4 checks passed
continuous-integration/travis-ci/pr The Travis CI build passed
continuous-integration/travis-ci/push The Travis CI build passed
coverage/coveralls Coverage increased (+0.02%) to 82.058%
s3-reference-data-cache Test data is cached.
@philippjfr philippjfr deleted the hetero_ds_range branch Feb 20, 2018