Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Check metadata (geometry_name, crs) in merge/concat #326

Closed
jorisvandenbossche opened this issue May 26, 2016 · 14 comments
Closed

Check metadata (geometry_name, crs) in merge/concat #326

jorisvandenbossche opened this issue May 26, 2016 · 14 comments

Comments

@jorisvandenbossche
Copy link
Member

Follow-up issue for #322

When concatting or merging multiple GeoDataFrames, we could:

  • check if the crs is the same. If not, we can either raise, or convert on the fly to the crs of the first object.
  • check if the _geometry_column_name is equal. If not, raise?
@knaaptime
Copy link

knaaptime commented Sep 20, 2023

I'm getting an odd error where pandas refuses to concatenate two geodataframes with the same CRS. I have two dataframes, both in 4269, but trying to stack them gives

ValueError: Cannot determine common CRS for concatenation inputs, got ['NAD83', 'NAD83']. Use `to_crs()` to transform geometries to the same CRS before merging.

even though i can verify df1.crs.equals(df2.crs).

Data and details here https://gist.github.com/knaaptime/8e3b0841facd39856495dcb4923ffb16

@martinfleis
Copy link
Member

@knaaptime that is a bug. For some reason, set([dc18.crs, dc19.crs]) results in a set of 2 equal objects, breaking our common CRS check.

@knaaptime
Copy link

adding to the mix, other files (like 2017) play just fine (i.e. you can concat 17 and 18, just not 19) even though they're all 4269

@martinfleis
Copy link
Member

@snowman2 do you have any idea why == considers two pyproj.crs.CRS objects equal but set does not?

@snowman2
Copy link
Contributor

A set hashes the CRS object and that uses the __hash__ method to convert the CRS to WKT2 strings. The __eq__ method uses the PROJ to compare whether the CRS are the eqivalent. Two CRS can be equivalent without having the same WKT2 string.

Side note: The CRS object has an is_exact_same and an equals method to compare CRS. Most users want to check if the CRS are equivalent, which is why that is the default for CRS comparison.

@m-richards
Copy link
Member

A set hashes the CRS object and that uses the __hash__ method to convert the CRS to WKT2 strings. The __eq__ method uses the PROJ to compare whether the CRS are the eqivalent. Two CRS can be equivalent without having the same WKT2 string.

Is this considered correct behaviour in pyproj? Generally if objects are "equal" they should have the same hash - the python docs say

The only required property is that objects which compare equal have the same hash value;

(I am not at all across the technical details of CRS, so perhaps this is the best compromise to have an efficient hash, and we certainly should be able to work around this on the geopandas side. But we might not be the only ones caught out by this)

@snowman2
Copy link
Contributor

Is this considered correct behaviour in pyproj?

It is the best that we were able to come up with. The logic PROJ uses to compare CRS is complex and is difficult to be simplified to a hash. Ideas and contributions are welcome to improve this behavior 😃

@snowman2
Copy link
Contributor

Related pyproj4/pyproj#303

@m-richards
Copy link
Member

Going to close this issue now, since we do check metadata in merge and concat and have done for some time. Any related discussions are probably better served in a new issue.

@knaaptime
Copy link

@m-richards I still have the same issue shown this gist on 14.2. Woud you prefer if i open another issue?

@m-richards
Copy link
Member

@knaaptime that seems to be exactly the issue resolved in #3023, but it hasn't made it into a released version of geopandas yet ( we have only done 0.14.x releases patch releases since this was merged). If you think it's distinct, then let's continue the discussion in another issue.

@knaaptime
Copy link

knaaptime commented Jan 18, 2024

hm. since #3023 was merged in november and 14.2 was released 2 weeks ago, i figured 14.2 includes that patch? (in which case, it's the same issue and remains unresolved by 3023?)

@m-richards
Copy link
Member

hm. since #3023 was merged in november and 14.2 was released 2 weeks ago, i figured 14.2 includes that patch? (in which case, it's the same issue and remains unresolved by 3023?)

No, it's not included in 0.14.2 - see https://github.com/geopandas/geopandas/commits/0.14.x. This was a backport release to fix regressions. (Having said that we may do a 0.14.3 xref #3068 and this is perhaps small enough we could include)

@knaaptime
Copy link

ah gotcha. thanks, will keep an eye out

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

5 participants