Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Convert input data CRS to config's input_crs only if there is no CRS in input data #26

Open
julietcohen opened this issue May 31, 2023 · 4 comments
Labels
good first issue Good for newcomers

Comments

@julietcohen
Copy link
Collaborator

In the config, there is an option to set input_crs (see here) which was intended to be used when the input data lacks CRS information, which was the case with some early ice wedge polygon data. However, the way set_crs() is currently configured here in TileStager.py, the CRS of input data is set to the input_crs if the value in this in the config is not None. To be clear, the way the operation is set at the moment, the data is not transformed. See the documentation for geopandas set_crs() here.

We need the data to only be set to the value of input_crs when it is not None and the input data does not already have CRS info.

@robyngit
Copy link
Member

robyngit commented Jun 2, 2023

I was thinking about this @julietcohen... Do you think there will ever be a scenario where we would want to correct an existing CRS in a dataset? set_crs would already work for that case, but maybe we would want an option like replace_crs (True/False) to indicate when the given CRS needs to be corrected and when it needs to be set only for files missing that info? or maybe that would be option overload!

@julietcohen
Copy link
Collaborator Author

@robyngit That's a good idea, it does seem likely that one day we will receive data with incorrect CRS information. I have not encountered this before, but maybe instead of including another option like replace_crs, we could include a check for the actual CRS of the geometry using geodataframe.geometry.crs and compare it to what is returned by geodataframe.crs. Then if they are not the same, we use set_crs() to correct the geodataframe CRS info

@julietcohen
Copy link
Collaborator Author

julietcohen commented Jul 24, 2023

Seems like the output of both geodataframe.geometry.crs and geodataframe.crs are changed by set_crs() even though set_crs() doesn't transform the data so we would need anther way to check the actual CRS of the geometries and not just the metadata and there's no way to check the CRS of geometries besides the metadata

@robyngit robyngit added the good first issue Good for newcomers label Jan 23, 2024
@julietcohen
Copy link
Collaborator Author

To clarify: In the current code, here are the 4 possible scenarios and their outcome:

Does input data already have a CRS set ? Does input_crs config option have a value besides None? Result
yes yes data is set to the CRS defined as input_crs option in the config, then transformed to the CRS of the TMS
yes no data is transformed to the CRS of the TMS
no yes data is set to the CRS defined as input_crs option in the config, then transformed to the CRS of the TMS
no no data is transformed to the CRS of the TMS

So considering options 1 and 3, if input_crs has a value, then the input data is set to that CRS regardless if the input data already has a CRS. Perhaps this was the intentional purpose of this option in order to correct an incorrectly set CRS. If not, then the code should be adjusted.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
good first issue Good for newcomers
Projects
Status: No status
Status: No status
Development

No branches or pull requests

2 participants