Large netCDF-4 file reading strategy #69
Comments
This looks good. I've modified it slightly so that it uses the DataType size (from your code) and compares against a multiple of the maximum available memory. This will make it into the next release of ncWMS2, and should hopefully be in a subsequent TDS release.
Thank you Guy.
Great, thanks, I've fixed that one too.
So after some testing, it turns out that this has a very detrimental effect on displaying data from large datasets: SCANLINE is a lot slower for compressed data, and this change picks SCANLINE for datasets which really don't need it. I've changed the code so that only the size of the horizontal grid is taken into account. That's all that DataReadingStrategy applies to anyway, so this should give a more realistic estimate of the amount of data which needs to be read, and should only choose SCANLINE in cases where it's really necessary to avoid an OutOfMemoryError. Once I've confirmed that it's all working properly, would you mind testing with your dataset to make sure that SCANLINE is still chosen?
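As a rough illustration of the revised check described above, here is a minimal sketch that estimates only the horizontal grid size and compares it against a memory budget. The class, method names, and the half-of-max-heap budget are assumptions made for illustration; the actual threshold and code in edal-java may differ.

```java
/** Hypothetical sketch; not the actual edal-java implementation. */
final class HorizontalGridStrategySketch {
    /** Stand-in for the two DataReadingStrategy values discussed in this thread. */
    enum Strategy { SCANLINE, BOUNDING_BOX }

    /** Bytes needed to hold one full horizontal (x by y) slice in memory. */
    static long horizontalGridBytes(long xSize, long ySize, int bytesPerElement) {
        // multiplyExact throws ArithmeticException instead of silently overflowing
        return Math.multiplyExact(Math.multiplyExact(xSize, ySize), (long) bytesPerElement);
    }

    /** Picks SCANLINE only when a full horizontal slice would exceed the budget. */
    static Strategy choose(long xSize, long ySize, int bytesPerElement, long memoryBudgetBytes) {
        return horizontalGridBytes(xSize, ySize, bytesPerElement) > memoryBudgetBytes
                ? Strategy.SCANLINE
                : Strategy.BOUNDING_BOX;
    }

    /** Assumed budget: half of the JVM's maximum heap. The real multiple is not stated in the thread. */
    static long assumedBudget() {
        return Runtime.getRuntime().maxMemory() / 2;
    }
}
```

With a 1 GiB budget, a 4000 x 4000 grid of 4-byte values (64,000,000 bytes) would stay on BOUNDING_BOX, while a 200,000 x 200,000 grid would be pushed to SCANLINE.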
Hi Guy - do you have a compiled ncwms jar containing your change that will On 30 Sep 2016 21:18, "Guy Griffiths" notifications@github.com wrote:
Thanks again Guy. I'll backport your patch into 4.6 for Adam and test the current master branch.
Just catching up here. What's the best way to go about testing this patch? Is it part of any edal-java release yet (I'm wondering if I should leap ahead to TDS 5 at this point)? And/or where can I grab a compiled ncwms.jar file containing the patch for TDS 4.x? Thanks
@adamsteer - Yes, this will have made it into any recent edal-java release, and so should be available in the latest TDS 5 builds. @PeterWarren would be better placed to tell you whether this is in any 4.x version of TDS.
We are using THREDDS (currently with ncWMS, but soon with edal-java) to render WMS layers from large (64 GB) NetCDF-4 files. To avoid out-of-memory errors we need to ensure the NetCDF reading strategy is set to SCANLINE. Currently, the reading strategy chooser (getOptimumDataReadingStrategy) only selects SCANLINE if the file type is "netCDF" or "HDF4". Our files are "NetCDF-4", so the chooser falls back to the BOUNDING_BOX reading strategy and THREDDS quickly exhausts even very large memory allocations.
To avoid this we have patched our ncWMS (THREDDS 4.6) to look for "NetCDF-4" files and force them into SCANLINE mode. We would now like to find a more permanent solution for THREDDS 5.0 onwards.
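To make the behaviour described above concrete, here is a minimal sketch of a chooser keyed on the file type string, with "NetCDF-4" added as in our local workaround. The class and enum are hypothetical stand-ins; only the strategy names and the three file type strings come from this issue, and the real getOptimumDataReadingStrategy may be structured differently.

```java
import java.util.Set;

/** Hypothetical sketch of a file-type-based chooser; not the actual ncWMS code. */
final class FileTypeStrategySketch {
    /** Stand-in for the two DataReadingStrategy values discussed in this issue. */
    enum Strategy { SCANLINE, BOUNDING_BOX }

    // "netCDF" and "HDF4" are the types the existing chooser recognises (per this issue);
    // "NetCDF-4" is the extra entry added by the workaround.
    private static final Set<String> SCANLINE_TYPES = Set.of("netCDF", "HDF4", "NetCDF-4");

    static Strategy choose(String fileTypeId) {
        // Set.of(...).contains(null) throws, so guard against a missing file type
        if (fileTypeId != null && SCANLINE_TYPES.contains(fileTypeId)) {
            return Strategy.SCANLINE;
        }
        return Strategy.BOUNDING_BOX; // fallback for every unrecognised type
    }
}
```

The weakness of this approach is visible in the sketch: any file type string that is not listed silently falls through to BOUNDING_BOX, which is how our files ended up exhausting memory in the first place.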
I have 2 proposed solutions:
(1) is trivial so I won't provide any code for it.
I had a go at implementing (2) (attached below). I assumed that all NetcdfDatasets can be treated as gridded datasets; I am not sure whether that is safe. I calculated the size of the dataset by taking the product of all dimensions.
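The size calculation can be sketched roughly as follows. This is a standalone illustration rather than the attached patch: the class name, method name, and the example dimension lengths are hypothetical, and it takes plain arrays instead of netCDF-Java objects so that it runs on its own.

```java
/** Hypothetical sketch of the size estimate; not the code in the attached patch. */
final class DatasetSizeSketch {
    /**
     * Estimated in-memory size of a variable: the product of all dimension
     * lengths multiplied by the size of one element in bytes.
     */
    static long estimateBytes(long[] dimensionLengths, int bytesPerElement) {
        long total = bytesPerElement;
        for (long length : dimensionLengths) {
            // multiplyExact throws ArithmeticException rather than overflowing silently,
            // which matters because these products easily exceed the int range
            total = Math.multiplyExact(total, length);
        }
        return total;
    }
}
```

For example, a hypothetical variable with dimensions time = 1000, lat = 4000, lon = 4000 and 4-byte values gives `estimateBytes(new long[] {1000, 4000, 4000}, 4)`, which is 64,000,000,000 bytes.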
Please let me know what you think.
NetCDF-4ReadingStratPatch.zip
Thanks