Handle range on heterogeneously typed Dataset column #2345

Merged
merged 3 commits into master from hetero_ds_range on Feb 19, 2018

2 participants
@philippjfr
Member

philippjfr commented Feb 15, 2018

Calling range on a heterogeneously typed column currently causes a variety of problems, including errors, because of Python 3 sorting semantics. Although range should generally not be called on object or string dtype columns, it currently is, so it needs to work robustly for the time being. This PR therefore uses the python2sort utility to do the sorting robustly in Python 3.
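
For readers unfamiliar with the failure mode: in Python 3, ordering comparisons between unrelated types (for example `int` and `str`) raise `TypeError`, so `sorted`, `min` and `max` fail on a mixed column. The sketch below only illustrates the general idea and is not the `python2sort` implementation used in this PR. `hetero_range` and `_type_key` are hypothetical names, and grouping values by type name (numbers first) is an assumption that loosely mimics Python 2 ordering.

```python
def _type_key(value):
    """Sort key that groups values by type, then orders within each group.

    Hypothetical helper: grouping by type name is an assumption, loosely
    modelled on Python 2's cross-type ordering (numbers before strings).
    """
    # Put ints and floats in one group so that 1 and 2.5 stay comparable.
    if isinstance(value, (int, float)) and not isinstance(value, bool):
        return ("", value)  # the empty string sorts before any type name
    return (type(value).__name__, value)


def hetero_range(values):
    """Return (min, max) for a column that may mix numbers and strings."""
    values = list(values)
    if not values:
        return (None, None)
    ordered = sorted(values, key=_type_key)
    return (ordered[0], ordered[-1])


mixed = [3, "b", 1.5, "a", 2]

try:
    sorted(mixed)  # Python 3 cannot order str against int/float
except TypeError as exc:
    print("plain sort failed:", exc)

print(hetero_range(mixed))
```

Values of a single type that are themselves unorderable (such as dicts) would still raise with this key, so it should be read as an illustration for the number/string case only.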

@jlstevens

Member

jlstevens commented Feb 16, 2018

This is one semantic change in Python 3 that I find to be downright annoying. Anyway, the suggested fix makes sense.

dimensions = [dataset.get_dimensions(d).name for d in dimensions]
inds = [dataset.data.columns.index(dim.name) for dim in dimensions]
return dataset.data.values[:, inds]

@jlstevens

jlstevens Feb 16, 2018

Member

I don't quite see how this relates to the Python 2 vs 3 sorting issue. Is this method implementing an unrelated fix?

@philippjfr

philippjfr Feb 16, 2018

Member

Yes, accidentally pushed this. Will revert.

@philippjfr

Member

philippjfr commented Feb 19, 2018

Going to need rebuilt test data, but let's get #1978 merged first since that also requires new test data.

@jlstevens

Member

jlstevens commented Feb 19, 2018

Happy to merge when the tests go green.

@jlstevens jlstevens merged commit d05358f into master Feb 19, 2018

4 checks passed

continuous-integration/travis-ci/pr The Travis CI build passed
continuous-integration/travis-ci/push The Travis CI build passed
coverage/coveralls Coverage increased (+0.02%) to 82.058%
s3-reference-data-cache Test data is cached.

@philippjfr philippjfr deleted the hetero_ds_range branch Feb 20, 2018
