Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fixes int overflow when requesting large slices #389

Merged
merged 12 commits into from Mar 5, 2020

Conversation

jasonb5
Copy link
Contributor

@jasonb5 jasonb5 commented Feb 24, 2020

Fix for #383

The large slice size (300, 60, 384, 320) was causing the variable holding the total number of elements to overflow. The variable was type int, whose max value is 2,147,483,647 and the request slice was 2,211,840,000, causing the overflow. I've changed the type to unsigned long which i dont forsee anyone reading that much data into memory anytime soon.

@muryanto1 How would you like to tackle writing a test for this?

@jasonb5 jasonb5 requested a review from muryanto1 Feb 24, 2020
@coveralls
Copy link

coveralls commented Feb 27, 2020

Coverage Status

Coverage remained the same at ?% when pulling de80b82 on large_file_overflow_fix into 34cec4f on master.

@muryanto1 muryanto1 merged commit d32af17 into master Mar 5, 2020
@muryanto1 muryanto1 deleted the large_file_overflow_fix branch Mar 9, 2020
@durack1
Copy link
Member

durack1 commented Mar 12, 2020

@jasonb5 @muryanto1 absolute champs, I'll take nightly for a test and let you know if I hit any ongoing problems.

For a test, I am not sure how we can get around having a huge (>8Gb) dataset to read?

@hyma68
Copy link

hyma68 commented Mar 17, 2020

@jasonb5 @muryanto1 @durack1
I just tested the nightly version on NERSC cori use the following command:

conda create -n nightly -c cdat/label/nightly -c conda-forge cdat python=3.6

I still got zero values when loading a large size array. My array size is (56623106, 137), which is 7,757,365,522. This should be within the "unsigned long" limit. Did I get the right nightly version?

@muryanto1
Copy link
Member

muryanto1 commented Mar 17, 2020

@hyma68 Please do:
conda create -n cdms_nightly -c cdat/label/nightly -c conda-forge cdms2 python=3.7.
I only built for python 2.7 and python 3.7 https://anaconda.org/cdat/cdms2/files
I hope you can use python 3.7.

@hyma68
Copy link

hyma68 commented Mar 17, 2020

@muryanto1
Thanks! It is working now for loading large size array. I was able to do python 3.7.
Another question, it seems I can't import "cdutil", and "genutil" with this build. Is there a quick solution that I can import other cdat modules including the two with this build? Thanks again.

@muryanto1
Copy link
Member

muryanto1 commented Mar 17, 2020

@hyma68 If you need any other cdat packages, you can do:
conda install -n cdms_nightly -c cdat/label/nightly -c conda-forge cdutil genutil
and any other cdat packages you may need.

@durack1
Copy link
Member

durack1 commented Mar 18, 2020

@muryanto1, if it's not complicated to implement, could this fix be rolled out for all nightlies? It would create more work for you guys to address continuing bug reports when turning on broader nightly builds could solve this long-standing issue

@jasonb5 jasonb5 linked an issue Jul 8, 2020 that may be closed by this pull request
@jasonb5 jasonb5 added this to the 8.2.1 milestone Jul 28, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Problems with reading "big" arrays (>8.1Gb)
5 participants