
Address cdms2 slow read from multiple netcdf files accessed through spanning xml #479

Open
durack1 opened this issue Jun 19, 2014 · 20 comments

@durack1 (Member) commented Jun 19, 2014

It appears cdms2 is taking a very long time (tens of minutes) to read a subset of data from multiple netcdf files containing a continuous time axis. In particular, the issue appears with very high-resolution (daily, 2700 x 3600) surface ocean fields.

The slow read also happens when addressing the netcdf file(s) directly.

The issue is apparent on the GFDL computing systems and has also been replicated on a workstation physical disk, so it's likely a cdms2 problem rather than anything hardware-related.
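
For reference, the access pattern that is slow looks roughly like the sketch below (the paths, the variable name 'tos' and the region are placeholders rather than the actual GFDL names):

```python
import time

import cdms2

t0 = time.time()
f = cdms2.open('/path/to/spanning.xml')   # XML spanning the daily netcdf files
tos = f('tos',                            # placeholder variable name
        time=('2000-1-1', '2000-12-31'),
        latitude=(-30, 30),
        longitude=(0, 360))
f.close()
print('read took %.1f s, shape %s' % (time.time() - t0, str(tos.shape)))
```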

@doutriaux1 (Contributor)

I suspect the time dim to be the killer; will report here as soon as we find a solution, or at least a reason why.

doutriaux1 self-assigned this Jun 19, 2014
@durack1 (Member, Author) commented Jun 19, 2014

With compressed (and potentially shuffled) netcdf4 files potentially coming down the pipe with CMIP6, it could be a good time to revisit how cdms2 calls the netcdf C API; if some of the compression/shuffling magic is better done by the library than by Cdunif, switch this to use the library. And if data reads can be sped up by reading across certain dimensions (it appeared that reading all lons and subsetting lats was 2x faster than the inverse), it would be great if a read call did that automagically for the user.
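
A rough sketch of the kind of timing comparison meant above (the file path, variable name and index ranges are placeholders, not the actual test case):

```python
import time

import cdms2

f = cdms2.open('/path/to/one_daily_file.nc')   # placeholder path
v = f['tos']                                   # lazy file variable; no data read yet

t0 = time.time()
a = v[0, 0:1350, :]                            # all lons, subset of lats
t_sub_lat = time.time() - t0

t0 = time.time()
b = v[0, :, 0:1800]                            # all lats, subset of lons
t_sub_lon = time.time() - t0

f.close()
print('all lons / subset lats: %.1f s' % t_sub_lat)
print('all lats / subset lons: %.1f s' % t_sub_lon)
```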

@aashish24 (Contributor)

👍 Would using parallel netcdf help?

@doutriaux1 (Contributor)

I don't think so in this case; it would probably help a bit, but not that much. Most of the wasted time (I believe) is spent reading the time dimension from each file and recomposing the time objects. Removing cdunif is pretty much out of the question, since it is what allows us to read multiple formats.
But @aashish24 you're right, we should definitely take advantage of parallel netcdf anyway.
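
One way to check that hypothesis would be to time the per-file time-axis reads separately from a small data read, along these lines (the paths and variable name are placeholders):

```python
import glob
import time

import cdms2

# Time spent only reading and recomposing the time axis of every file.
t0 = time.time()
for path in sorted(glob.glob('/path/to/files/*.nc')):   # placeholder pattern
    fh = cdms2.open(path)
    tax = fh['tos'].getTime()      # lazy file variable; only the axis is touched
    _ = tax.asComponentTime()      # recompose the component-time objects
    fh.close()
print('per-file time-axis reads: %.1f s' % (time.time() - t0))

# Compare with a small subset read through the spanning XML.
t0 = time.time()
f = cdms2.open('/path/to/spanning.xml')
sub = f('tos', time=slice(0, 1), latitude=(-5, 5), longitude=(120, 130))
f.close()
print('spanning-XML subset read: %.1f s' % (time.time() - t0))
```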

@durack1 (Member, Author) commented Sep 22, 2014

You folks might want to assign this to the 2.1 milestone to get it off the to-do list.

aashish24 added this to the 2.1 milestone Sep 22, 2014
@aashish24 (Contributor)

Done

@doutriaux1 (Contributor)

@durack1 it's not a fix, but since our files are nicely ordered by name, using the -j option makes re-reading fast.
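
For the record, the workaround is roughly the sketch below (the cdscan command is shown as a comment; the file pattern, XML name and variable name are placeholders):

```python
# Rebuild the spanning XML with the -j option mentioned above, e.g.:
#   cdscan -j -x ocean_daily.xml /path/to/files/*.nc
# then read through the regenerated XML as usual:
import cdms2

f = cdms2.open('ocean_daily.xml')
tos = f('tos', time=slice(0, 30))   # e.g. the first 30 daily steps
f.close()
```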

doutriaux1 modified the milestones: 2.2, 2.1 Nov 10, 2014
@durack1 (Member, Author) commented Nov 11, 2014

I'll take another look at this when I revisit the GFDL data; I do think it's going to cause problems in the CMIP6+ timeframe.

@doutriaux1 (Contributor)

@painter1 that is the bug I think is hitting us on rhea. Will take a look tomorrow.

@doutriaux1 (Contributor)

@durack1 are the files still around? @painter1's thread seems to indicate -j helps. I can't reproduce this locally on a tiny example.

@painter1 (Contributor)

When I looked at the -j option, it didn't seem to do much relative to the default behavior, except that it disables the default linearizing of time for very short time units. Maybe I missed something.

@durack1 (Member, Author) commented Feb 12, 2015

@doutriaux1 yep - ocean:/export/doutriaux1/Paul; we last discussed this back in June last year.

doutriaux1 modified the milestones: 2.3, 2.2 Mar 2, 2015
@durack1 (Member, Author) commented May 6, 2015

@doutriaux1 can you assign the enhancement label please?

@doutriaux1 (Contributor)

@dnadeau4 assigning this to you, but let's work on it together.

@dnadeau4 (Contributor) commented Sep 7, 2015

I think it affects THREDDS access as well (#1475). I would like to read only one time dimension at a time instead of loading everything into memory at once. Will start investigating a solution.

@durack1 (Member, Author) commented Sep 8, 2015

@dnadeau4 happy to run any tweaks you have over the test case; we have it pretty well reproduced.

dnadeau4 modified the milestones: 3.0, 2.4 Sep 8, 2015
@dnadeau4 (Contributor)

I have made progress on this parallel issue. Everything is working well. In a parallel world, every node must open the file, create dimensions and create variables; only the slicing is different for each one.

Please run testing/cmds2/test_mpi_write_2.py on branch issue_479.
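
For anyone not checking out the branch, the pattern is roughly the sketch below; the branch test is the authoritative version, and the axis sizes, file/variable names and the cdms2.setNetcdfUseParallelFlag call here are my own placeholders/assumptions:

```python
import cdms2
import numpy
from mpi4py import MPI

# Assumption: cdms2 built against parallel netcdf; this flag may not exist
# in older builds.
cdms2.setNetcdfUseParallelFlag(1)

comm = MPI.COMM_WORLD
rank = comm.Get_rank()

nlat, nlon, nt = 180, 360, 10                       # per-rank time steps (made up)
lat = cdms2.createUniformLatitudeAxis(-89.5, nlat, 1.0)
lon = cdms2.createUniformLongitudeAxis(0.5, nlon, 1.0)
tax = cdms2.createAxis(numpy.arange(rank * nt, (rank + 1) * nt, dtype='d'))
tax.id = 'time'
tax.units = 'days since 2000-1-1'
tax.designateTime()

data = numpy.full((nt, nlat, nlon), float(rank), dtype='f')
var = cdms2.createVariable(data, axes=[tax, lat, lon], id='tos')

# Every rank opens the same file and defines the same variable; only the
# written time slab differs (run under mpirun).
f = cdms2.open('parallel_out.nc', 'w')
f.write(var, index=rank * nt)
f.close()
```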

@durack1 (Member, Author) commented Sep 28, 2015

@dnadeau4 are the tweaks on the branch above addressing the slow read times that were the focus of this original issue?

@dnadeau4 (Contributor)

This is a parallel-write program; it has nothing to do with the original issue.

The original issue will need a re-architecture of CDMS. CDMS reads the entire array into memory. I would like to change the code to return an empty handler and read slices of the array per user request. Of course, if the user asks for the entire array it will still be slow, but it could then be possible to read in parallel using multiple nodes. Something to think about.
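
In the meantime, the lazy file/dataset variables already get part of the way there; something like the sketch below only reads the requested slab (paths, variable name and index ranges are placeholders):

```python
import cdms2

f = cdms2.open('/path/to/spanning.xml')

# f('tos') would materialise the whole variable in memory (the slow path).
lazy = f['tos']                               # dataset variable: no data read yet
one_day = lazy[0]                             # reads only the first time step
window = lazy[0:30, 1000:1200, 2000:2400]     # reads only the requested slab

f.close()
```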

@durack1 (Member, Author) commented Sep 28, 2015

@dnadeau4 if there is no "fix" for this issue, aside from a re-architecture of cdms2, then close this issue and add the large-grid test to the suite for testing when cdms2 is updated.
