Skip to content

llcreader + ASTE Release 1#231

Merged
rabernat merged 33 commits intoMITgcm:masterfrom
timothyas:aste_reader
Nov 16, 2020
Merged

llcreader + ASTE Release 1#231
rabernat merged 33 commits intoMITgcm:masterfrom
timothyas:aste_reader

Conversation

@timothyas
Copy link
Member

@timothyas timothyas commented Oct 17, 2020

Release 1 of Arctic Subpolar gyre sTate Estimate, a data constrained Arctic from 2002-2017, will soon be made public. This has been a massive effort led by @antnguyen13 to get a data constrained Arctic state estimate. We thought it would be quite useful and convenient to provide the model output in a similar manner as the LLC 4320/2160 are on the ECCO Data Portal, so this PR aims to accomplish that. This required some generalizations within the llcreader module, namely:

  • generalizing to nfaces != 13, which involves a slight modification of _facet_strides
  • "padding" of various facets, as necessary for the ASTE version of the LLC grid (see llcmodel._pad_facet)
  • now one can specify iters directly for an LLCModel rather than iter_start, iter_stop, iter_step, since the output time steps are not uniform. This is the only addition to the existing user facing functions.
  • specifying unique data types per field. This is basically because some files are in double precision to close budgets as accurately as possible (you know us state estimators love that!), while for other fields this is not necessary

Some other small details with this PR:

  • llcreader will now grab attributes that make for a seamless integration with xgcm
  • some additional KPP variables added to variables.py

Of course this doesn't change anything for the existing predefined llcreader models, just adds some more. I originally went through all the debugging on our local cluster Sverdrup, and so left this available in the same spirit as the Pleiades file store.

The output are available through AWS, and while preparing this I've noticed no connectivity issues ok actually I'm starting to run into some connectivity issues when doing things like computing depth integrated volume transports. I'm looking into this...

I'm looking forward to hearing what people think if/when they have time to check this out! :)

# for fld in ds_ll[['Z','Zl','Zu','Zp1']].coords:
# if isinstance(ds_ll[fld].data,dsa):
# assert len(ds_ll[fld].data.chunks)==1
# assert (len(ds_ll[fld]),)==ds_ll[fld].data.chunks[0]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you explain why all these tests are commented out?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

😧 oh my gosh I can’t believe I pushed that! Total accident while I was locally testing the new ASTE tests. Whoops! I’ll get ‘em back.

@rabernat
Copy link
Member

Tim this looks absolutely fantastic! Thanks so much for all your work here.

I will try to review this carefully within a few days.

@timothyas
Copy link
Member Author

Thanks @rabernat!

@rabernat
Copy link
Member

Ok, 10 days have passed, and I still haven't reviewed this. I'm very sorry.

Is there a specific deadline here? That would be useful to help me prioritize.

@timothyas
Copy link
Member Author

No worries @rabernat, I totally understand. It sounds like we are calling "official" ASTE release to be whenever we are able to submit to essoar and get a DOI, so not firm but some time next week or the week after. If that time frame is possible that would be great, but not critical.

On another note I've created an example notebook/Binder demo here, which we could link to in the xmitgcm docs as well if you like - let me know what you think. For now this is using my branch of xmitgcm

@rabernat
Copy link
Member

rabernat commented Nov 12, 2020

I just pushed a fix for a few existing bugs in #225. I would appreciate your review there. I'd like to get that in before we move on to this.

Unfortunately that will cause some conflicts with this PR. But I looked things over and don't think it is too bad.

Thanks for your patience.

@rabernat rabernat mentioned this pull request Nov 12, 2020
6 tasks
@timothyas
Copy link
Member Author

Awesome, that sounds great. I'll check out #225 then rebase this and ping you when that's done.

@timothyas
Copy link
Member Author

Ok it's rebased and looks good, the conflict was just from our different styles for implementing #239 so not bad at all (I kept yours don't worry) 😁

Once #225 is in I can quickly rebase again

@timothyas
Copy link
Member Author

Alright #225 is in and this branch is rebased and ready for review whenever you get the chance @rabernat

Copy link
Member

@rabernat rabernat left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tim this is truly a fantastic piece of work. I read through the code carefully and didn't see any major issues. The old tests all pass, so I think we are good!

My big concern is that the extra complexity of ASTE makes this code even more complicated. We now have to drag around nfaces, domain, and padding attributes. This will make it harder to maintain. For the sake of the other xmitgcm devs (basically just me! 😆), I would like to see a long comment in the code describing how the ASTE data are laid out on disk. Clearly you have a mental model of this--otherwise you wouldn't have been able to write this code. I would like that mental model to get written down somewhere.

Going forward, I think we need to refactor this code to define something like a FileLayout object, perhaps using a dataclass, which contains all of the information about a particular file configuration. Then, instead of calling

_get_facet_chunk(store, varname, iternum, nfacet, klevels, nx, nz, nfaces, dtype, mask_override, domain, pad_before, pad_after)

we could call

_get_facet_chunk(store, varname, iternum, nfacet, klevels, file_layout)

However, we don't need that for this PR.

Thanks again for all your work. This is pretty exciting! Just add the comment I requested and then we're good to go.

data_blank.shape = mask.shape
return data_blank

def _pad_facet(data,facet_shape,reshape,pad_before,pad_after,dtype):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Above this line would be a good place to put a detailed comment about the ASTE grid layout and padding strategy.

@timothyas
Copy link
Member Author

Awesome, thanks @rabernat for the review and for the praise! I'm humbled... Yes this did get quite complicated and I agree something to remind other devs (possibly even myself later down the line!) how this went down would be good. I'm actually thinking a picture of the padding might go a long way to help this out, so I'll whip something up in addition to comments where you requested. I'll try to get this done asap while the momentum is going.

Going forward, I think we need to refactor this code to define something like a FileLayout object, perhaps using a dataclass, which contains all of the information about a particular file configuration. Then, instead of calling

I definitely agree this would be nice, and I'm a big fan of dataclasses 👍

Finally a comment on writing helpful comments: having the comments for ASTE in utils.get_extra_metadata (which I'm guessing as @raphaeldussin ?) helped a ton - thanks to whoever did that groundwork!!

@rabernat
Copy link
Member

Finally a comment on writing helpful comments: having the comments for ASTE in utils.get_extra_metadata (which I'm guessing as @raphaeldussin ?) helped a ton - thanks to whoever did that groundwork!!

FWIW, my eventual plan is to refactor all of xmitgcm around the approach in llcreader, which is better than the original code path.

@timothyas
Copy link
Member Author

FWIW, my eventual plan is to refactor all of xmitgcm around the approach in llcreader, which is better than the original code path.

Very cool. Hopefully this helps anyone struggling with the various LLC flavors then... I added some documentation in a new "Developer Notes" section of the API Documentation (of the llcreader section) and included a figure. I think your suggestion to add this was essential because even today I had to scratch my head and remember how this was done.

I think the content of that documentation is good, but I'm happy to move things around / reorganize / reword - just let me know!

@timothyas
Copy link
Member Author

aste_pad

this is the figure I added in the docs

@rabernat rabernat merged commit 5d87248 into MITgcm:master Nov 16, 2020
fraserwg pushed a commit to fraserwg/xmitgcm that referenced this pull request Nov 23, 2021
* very rough draft .... almost have all tiles correct...

* bering strait working

* all faces accounted for ... now to code it better...

* quick prefix fix

* look for meta file for data prec, otherwise use class default

* debugged ... was looking in grid dir

* to aws

* prefix bug

* give standard dimensions some attrs for xgcm

* temporary aws changes

* look for extra grid var filename prefixes

* semi general padding for non shrunk fields

* slightly better

* lots of variables

* added KPP nonlocal salt/temp flux

* dtype from meta prop to dask array creation

* some small bugs...

* provide dtype apriori, reading on the fly too slow. revert utils

* semi better handling of index key

* no latlon for aste

* specify iters for nonuniform step

* aws nested store, specify iters, add tests

* tests actually pass locally...

* python 3 prints

* ASTE documentation

* pretty it up

* hopefully clearer about iters vs iter params

* change some wording

* update ref names and order

* fixed up padding logic

* update that aste ref ... again

* get my face back!

* _pad_facet sufficiently documented...with pics
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants