Design for CRS extension #9

Closed
snowman2 opened this issue Jun 6, 2019 · 10 comments

@snowman2
Collaborator

snowman2 commented Jun 6, 2019

Continuation from: corteva/rioxarray#4 (comment)

Related to #2 & #8

Here are the basic needs for rioxarray and probably other geospatial projects as well:

  1. A way to retrieve a pyproj.CRS based on the information in the file. (I think pyproj.CRS.from_cf() will be useful here)

    It would be useful to have an attribute such as:

    xdf.geo.crs
    

    See:

  2. A way to set a pyproj.CRS if it does not exist/is incorrect. (I think pyproj.CRS.from_cf() and pyproj.CRS.from_user_input() will be useful here)

    It would be useful to be able to set the attribute such as:

    xdf.geo.crs = "epsg:4326"
    xdf.geo.crs_from_cf(...)
    

    See xgeo projection setter

  3. A way to update the CRS information on the Dataset/DataArray in a CF compliant way. (I think pyproj.CRS.to_cf() will be useful here)

    Something like:

    xdf_cf = xdf.geo.write_crs(version="WKT1_GDAL")
    

    Also see add_spatial_ref
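A minimal sketch of how the three pyproj calls named above could fit together, independent of any accessor. The `xdf.geo` accessor itself is still only a proposal at this point in the thread, so it is not used here; the variable names are illustrative.

```python
from pyproj import CRS

# Need 2: build a CRS from arbitrary user input (an EPSG string here).
crs = CRS.from_user_input("epsg:4326")

# Need 3: export the CRS as a dict of CF grid mapping attributes,
# suitable for writing onto a grid mapping variable.
cf_attrs = crs.to_cf()

# Need 1: rebuild a CRS from CF attributes, as if read back from a file.
restored = CRS.from_cf(cf_attrs)

print(cf_attrs["grid_mapping_name"])
print(restored.to_epsg())
```

An accessor property such as the proposed `xdf.geo.crs` could wrap the last step, with the setter wrapping the first.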

@djhoese
Contributor

djhoese commented Jun 25, 2019

A couple questions/clarifications if you could:

  1. Is there any reason that geo.crs has to be set or should crs_from_cf allow users to provide the CF grid mapping information to create the CRS with the default (no arguments) being to look at the current Dataset/DataArray for expected attributes?

  2. This could only be done for Datasets right?

My understanding was that geoxarray would be providing utilities for creating the crs coordinate and adding additional "standard" attributes (CF, etc) from an existing crs coordinate. Let me know if this isn't what you were thinking.

@snowman2
Collaborator Author

snowman2 commented Jun 25, 2019

Is there any reason that geo.crs has to be set or should crs_from_cf allow users to provide the CF grid mapping information to create the CRS with the default (no arguments) being to look at the current Dataset/DataArray for expected attributes?

Ideally, the CRS would be pulled from the expected attributes of the current Dataset/DataArray. However, this information is stored in so many different ways that the lookup is likely to fail if it is in a non-standard location or format. With this line of thinking, it would be nice to have multiple methods for setting the CRS as a backup. For example, if you already have a pyproj.CRS or rasterio.CRS generated using an alternate method, it would be nice to be able to write:

geo.crs = CRS("epsg:4326")

I guess you could do:

geo.crs_from_cf(crs_wkt=CRS("epsg:4326").to_wkt())

But, it is not as clean IMO.

This could only be done for Datasets right?

This can work for both data arrays and datasets. If it is a data array, then it is much simpler as you only have one location to search for the CRS. Whereas if it is a dataset, then some logic needs to be added to determine which CRS is used by which variable if there is more than one (or just assume that there is only one CRS for the entire dataset).
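As a rough sketch of that dataset logic, assuming the file follows the CF convention where each data variable names its CRS variable in a `grid_mapping` attribute. The function name and the plain-dict input are hypothetical; with xarray the input could be built from something like `{name: var.attrs for name, var in ds.variables.items()}`, restricted to data variables and CRS variables.

```python
from typing import Dict, Mapping, Optional

# Attribute names that suggest a variable holds CRS information.
CRS_MARKERS = ("grid_mapping_name", "crs_wkt", "spatial_ref")


def grid_mapping_by_variable(
    var_attrs: Mapping[str, Mapping[str, object]],
) -> Dict[str, Optional[str]]:
    """Map each data variable name to the name of its CRS variable."""
    crs_vars = {
        name
        for name, attrs in var_attrs.items()
        if any(marker in attrs for marker in CRS_MARKERS)
    }
    result: Dict[str, Optional[str]] = {}
    for name, attrs in var_attrs.items():
        if name in crs_vars:
            continue  # the CRS variables themselves are not data
        declared = attrs.get("grid_mapping")
        if declared in crs_vars:
            result[name] = declared
        elif declared is None and len(crs_vars) == 1:
            # Fallback: assume the single CRS applies to the whole dataset.
            result[name] = next(iter(crs_vars))
        else:
            result[name] = None  # unknown or ambiguous
    return result


example = {
    "temperature": {"grid_mapping": "crs_utm"},
    "mask": {"grid_mapping": "crs_latlon"},
    "crs_utm": {"grid_mapping_name": "transverse_mercator"},
    "crs_latlon": {"grid_mapping_name": "latitude_longitude"},
}
print(grid_mapping_by_variable(example))
```

The sketch ignores the extended CF form of `grid_mapping` that lists coordinate names after the variable name; a real implementation would need to parse that as well.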

My understanding was that geoxarray would be providing utilities for creating the crs coordinate and adding additional "standard" attributes (CF, etc) from an existing crs coordinate. Let me know if this isn't what you were thinking.

Yes, this is what I am thinking is the end goal. But, until it is ready in geoxarray, I was thinking about adding it in to rioxarray as a temporary addition that can be replaced at a later date.

@djhoese
Contributor

djhoese commented Jun 25, 2019

So is data_arr.geo.crs = CRS(...) kind of a magic "take this CRS information and add it to my DataArray, whether it be a dictionary (of CF attributes), a rasterio CRS object, a pyproj CRS object, or something else"?

Looking at the spatial_ref stuff I think I get it a little better now. It is being added as a coordinate variable and not as another variable in a Dataset.

@snowman2
Collaborator Author

So is data_arr.geo.crs = CRS(...) kind of a magic "take this CRS information and add it to my DataArray, whether it be a dictionary (of CF attributes), a rasterio CRS object, a pyproj CRS object, or something else"?

That is the general idea. I am not sure about the CF dict as there can also be PROJ dicts. So, you would need a way to differentiate. Maybe look for proj or init for PROJ string versus grid_mapping_name, crs_wkt or spatial_ref for CF dict.
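A small sketch of that key-based check. The function name and the returned labels are hypothetical, and testing the CF keys first when both kinds are present is an arbitrary choice made for the example.

```python
from typing import Mapping

# Keys suggested above for telling the two dict flavours apart.
PROJ_KEYS = ("proj", "init")
CF_KEYS = ("grid_mapping_name", "crs_wkt", "spatial_ref")


def classify_crs_dict(params: Mapping[str, object]) -> str:
    """Return "cf", "proj", or "unknown" based on the keys present."""
    if any(key in params for key in CF_KEYS):
        return "cf"
    if any(key in params for key in PROJ_KEYS):
        return "proj"
    return "unknown"


print(classify_crs_dict({"grid_mapping_name": "latitude_longitude"}))
print(classify_crs_dict({"proj": "utm", "zone": 15}))
```

A setter could then hand "cf" dicts to `pyproj.CRS.from_cf()` and "proj" dicts to `pyproj.CRS.from_dict()`, and raise an error for anything else.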

@djhoese
Contributor

djhoese commented Jun 25, 2019

I'd say that wouldn't be a bad idea if the property is already magic-ish.

@snowman2
Collaborator Author

Okay, sounds good to me 👍

@atedstone

Hi. A couple of days ago I was poised to generalise some existing code of mine, written a couple of years ago, for writing CF-compliant netCDF files using xarray, but some searching revealed this project. I can give some time to helping with getting and setting CRS info on xarray Datasets, especially in creating functionality to write CF-compliant netCDFs. Before I start anything though, is there any local development of the CRS extension happening at the moment that hasn't made it to GitHub?

@djhoese
Contributor

djhoese commented Aug 2, 2019

Yes! I was just about to push the changes even though I wasn't happy with everything. I'd also check out @snowman2's progress on rioxarray.

@djhoese
Contributor

djhoese commented Aug 3, 2019

@atedstone FYI #13

@djhoese
Contributor

djhoese commented Jul 10, 2023

Closing this as the initial support in #26 is merged and mimics rioxarray's support. That is, .geo.crs gives you a pyproj CRS object, and .geo.write_crs writes a CF-compatible grid_mapping variable to the .coords of the current xarray object. Due to difficulties persisting information in an xarray accessor, there is no way to provide the geoxarray accessor with CRS information without persisting it with .geo.write_crs. Put another way, it is too easy to accidentally lose the temporary/internal CRS information if it isn't persisted in the xarray object itself.
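To illustrate the idea of persisting the CRS in the object itself, here is a rough sketch using plain xarray and pyproj. It is not geoxarray's implementation: the helper names are hypothetical, the `spatial_ref` coordinate name is borrowed from the discussion above, and only the DataArray case is handled.

```python
import xarray as xr
from pyproj import CRS


def write_crs_sketch(obj, crs_input, name="spatial_ref"):
    """Hypothetical helper: persist a CRS on `obj` as a scalar coordinate."""
    crs = CRS.from_user_input(crs_input)
    # A zero-dimensional coordinate whose attributes hold the CF description.
    crs_coord = xr.Variable((), 0, attrs=crs.to_cf())
    out = obj.assign_coords({name: crs_coord})
    # Point the data at its grid mapping variable, as CF expects.
    return out.assign_attrs(grid_mapping=name)


def read_crs_sketch(obj):
    """Hypothetical helper: rebuild the CRS from the persisted coordinate."""
    name = obj.attrs.get("grid_mapping", "spatial_ref")
    return CRS.from_cf(obj.coords[name].attrs)


data = xr.DataArray([[1.0, 2.0], [3.0, 4.0]], dims=("y", "x"))
tagged = write_crs_sketch(data, "epsg:4326")
restored = read_crs_sketch(tagged)
print(restored.to_epsg())
```

Because the CRS lives in a coordinate rather than on the accessor instance, it belongs to the object's own data and does not depend on any accessor state.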

@djhoese djhoese closed this as completed Jul 10, 2023
3 participants