Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Help with PROJ 9.3 CRS comparison issue #1339

Closed
djhoese opened this issue Aug 31, 2023 · 26 comments
Closed

Help with PROJ 9.3 CRS comparison issue #1339

djhoese opened this issue Aug 31, 2023 · 26 comments
Labels
proj Bug or issue related to PROJ question

Comments

@djhoese
Copy link
Contributor

djhoese commented Aug 31, 2023

Code Sample, a copy-pastable example if possible

See full context here: pytroll/pyresample#539

In [1]: from pyproj import CRS

In [2]: crs1 = CRS.from_proj4("+proj=laea +lat_0=90 +lon_0=0 +a=6371228.0 +units=m")

In [3]: crs2 = CRS.from_epsg(3408)

In [4]: crs1.to_proj4()
/usr/lib/python3/dist-packages/pyproj/crs/crs.py:1293: UserWarning: You will likely lose important projection information when converting to a PROJ string from another format. See: https://proj.org/faq.html#what-is-the-best-format-for-describing-coordinate-reference-systems
  proj = self._crs.to_proj4(version=version)
Out[4]: '+proj=laea +lat_0=90 +lon_0=0 +x_0=0 +y_0=0 +R=6371228 +units=m +no_defs +type=crs'

In [5]: crs2.to_proj4()
/usr/lib/python3/dist-packages/pyproj/crs/crs.py:1293: UserWarning: You will likely lose important projection information when converting to a PROJ string from another format. See: https://proj.org/faq.html#what-is-the-best-format-for-describing-coordinate-reference-systems
  proj = self._crs.to_proj4(version=version)
Out[5]: '+proj=laea +lat_0=90 +lon_0=0 +x_0=0 +y_0=0 +R=6371228 +units=m +no_defs +type=crs'

In [6]: crs1.to_epsg()
Out[6]: 3408

In [7]: crs2.to_epsg()
Out[7]: 3408

In [8]: crs1 == crs2
Out[8]: False

Problem description

The above code with PROJ 9.2.x does not produce an EPSG code for crs1. In PROJ 9.3, it does. The equality is False in both cases...but I'm not sure it should be in the 9.3 case if PROJ 9.3 is able to think of them as equivalent EPSG codes.

Expected Output

Either no EPSG code for crs1 (the from PROJ.4 string) or equality between the two.

NOTE: This has only been tested on PROJ 9.3 by @avalentino on their test system. I do not have 9.3 running locally (yet).

@snowman2 Can you help me file a bug with PROJ with a C example? Or do you not consider this a bug?

Environment Information

  • Output from: pyproj -v
  • Output from: python -m pyproj -v
  • Output from: python -c "import pyproj; pyproj.show_versions()"
  • pyproj version (python -c "import pyproj; print(pyproj.__version__)")
  • PROJ version (python -c "import pyproj; print(pyproj.proj_version_str)")
  • PROJ data directory (python -c "import pyproj; print(pyproj.datadir.get_data_dir())")
  • Python version (python -c "import sys; print(sys.version.replace('\n', ' '))")
  • Operation System Information (python -c "import platform; print(platform.platform())")

Installation method

  • conda, pip wheel, from source, etc...

Conda environment information (if you installed with conda):


Environment (conda list):
$ conda list proj


Details about conda and system ( conda info ):
$ conda info

@djhoese djhoese added the bug label Aug 31, 2023
@avalentino
Copy link

@sebastic FYI

@sebastic
Copy link
Contributor

@djhoese, I think you should bring this up on the PROJ list:

https://lists.osgeo.org/mailman/listinfo/proj

Specially the thread about PROJ 9.3:

https://lists.osgeo.org/pipermail/proj/2023-August/011116.html

@djhoese
Copy link
Contributor Author

djhoese commented Aug 31, 2023

$ docker run --rm -it osgeo/proj:latest bash
root@7aa0aa098f43:/# apt update && apt install python3-pip
root@7aa0aa098f43:/# python3 -m pip install pyproj
root@7aa0aa098f43:/# python3 -c "from pyproj import CRS; crs1 = CRS.from_proj4('+proj=laea +lat_0=90 +lon_0=0 +a=6371228.0 +units=m'); crs2 = CRS.from_epsg(3408); print(crs1.to_epsg() or "None"); print(crs2.to_epsg()); print(crs1 == crs2)"
None
3408
False
root@7aa0aa098f43:/# pyproj -v
pyproj info:
    pyproj: 3.6.0
      PROJ: 9.2.1
  data dir: /usr/local/lib/python3.10/dist-packages/pyproj/proj_dir/share/proj
user_data_dir: /root/.local/share/proj
PROJ DATA (recommended version): 1.14
PROJ Database: 1.2
EPSG Database: v10.088 [2023-05-13]
ESRI Database: ArcGIS Pro 3.1 [2023-19-01]
IGNF Database: 3.1.0 [2019-05-24]

System:
    python: 3.10.12 (main, Jun 11 2023, 05:26:28) [GCC 11.4.0]
executable: /usr/bin/python3
   machine: Linux-6.4.6-76060406-generic-x86_64-with-glibc2.35

Python deps:
   certifi: 2023.7.22
    Cython: None
setuptools: 59.6.0
       pip: 22.0.2

@avalentino This example above doesn't seem to produce an EPSG code. And just like in the pyresample issue it lists PROJ 9.2.1 even though I think it should be using PROJ from the system. Maybe?

@avalentino
Copy link

avalentino commented Aug 31, 2023

@djhoese to use libproj v9.3rc1 in debian you should enable "experimental" and then run:

# apt -t experimental  install libproj-dev libproj25 proj-bin proj-data

EDIT: moreover probably it is better to use "debian/sid" of "debian/sid-slim" as docker image

@sebastic
Copy link
Contributor

Just install libproj-dev & proj-bin from experimental, the rest get pulled in via the dependencies.

@djhoese
Copy link
Contributor Author

djhoese commented Aug 31, 2023

I was able to build a version of pyproj from source (github from main) and run the same command I showed above. Even the pyproj command showed it was using PROJ 9.3.0, but still that crs1 did not produce an EPSG code.

Then I used debian:sid, added the experimental repository (along with some certificate packages to get GPG to work), then installed libproj-dev and proj-bin. I then installed python3-pip and python3-pyproj. Running my example command from that showed that crs1 produced an EPSG code.

Then I uninstalled python3-pyproj, installed git, and did python3 -m pip install --break-system-packages --force-reinstall git+https://github.com/pyproj4/pyproj. And this too produced an EPSG code for crs1.

So...to me this means either:

  1. In my first test case using the PROJ docker image, I wasn't ever actually using PROJ 9.3.0 even though I'm pretty sure I was.
  2. There is a major difference between PROJ 9.3.0 in the PROJ images versus the experimental debian repos.

@sebastic
Copy link
Contributor

There is a major difference between PROJ 9.3.0 in the PROJ images versus the experimental debian repos.

The PROJ images may have grids from PROJ-data installed which are not available in Debian.

@djhoese
Copy link
Contributor Author

djhoese commented Aug 31, 2023

Hhhmmm any idea if/why that would cause a CRS to not resolve to an EPSG code? If they were missing then I'd guess that an EPSG couldn't be found, but this is the opposite.

@snowman2 snowman2 added question and removed bug labels Aug 31, 2023
@snowman2
Copy link
Member

While digging into this, consider:

  • Note warning: UserWarning: You will likely lose important projection information when converting to a PROJ string from another format. PROJ strings are not good for comparing CRS.
  • See: min_confidence parameter in CRS.to_epsg. Set it to 100 to see if it is an exact match.
  • Notice the difference in datums:
>>> crs1
<Projected CRS: +proj=laea +lat_0=90 +lon_0=0 +a=6371228.0 +units= ...>
Name: unknown
Axis Info [cartesian]:
- E[south]: Easting (metre)
- N[south]: Northing (metre)
Area of Use:
- undefined
Coordinate Operation:
- name: unknown
- method: Lambert Azimuthal Equal Area (Spherical)
Datum: unknown
- Ellipsoid: unknown
- Prime Meridian: Greenwich

>>> crs2
<Projected CRS: EPSG:3408>
Name: NSIDC EASE-Grid North
Axis Info [cartesian]:
- X[south]: Easting (metre)
- Y[south]: Northing (metre)
Area of Use:
- name: Northern hemisphere.
- bounds: (-180.0, 0.0, 180.0, 90.0)
Coordinate Operation:
- name: US NSIDC Equal Area north projection
- method: Lambert Azimuthal Equal Area (Spherical)
Datum: Not specified (based on International 1924 Authalic Sphere)
- Ellipsoid: International 1924 Authalic Sphere
- Prime Meridian: Greenwich

@djhoese
Copy link
Contributor Author

djhoese commented Aug 31, 2023

Thanks @snowman2! I know about the PROJ.4 string warning, but that's why I'm confused with this behavior. It is close enough to be considered the same EPSG with to_epsg, but not when doing equality. I did not know about min_confidence, but I would have assumed that in cases like this it would default to a value that would handle the round trip decently well.

The original issue started in Pyresample and how it bounces between EPSG and PROJ.4 strings when exporting to on-disk formats. I think bumping up the min_confidence would be a "good enough" workaround.

I'm currently trying to build PROJ from source in the PROJ docker container and see if that gives me different results. I tried syncing the proj-data in the debian:sid image and that doesn't change the final result (crs1 still produces an EPSG code).

@snowman2
Copy link
Member

snowman2 commented Aug 31, 2023

The min_confidence defaults to 70, which I would have assumed to be equivalent. If you need to be able to roundtrip, I would change it to 100. Alternatively, WKT2 is also a good format for storing a CRS if you need to roundtrip.

If you are confident that the CRS are equivalent, I can help you create an issue upstream with PROJ to ask what their thoughts are on it.

@snowman2
Copy link
Member

I tried syncing the proj-data in the debian:sid image and that doesn't change the final result

That is only for transfomations. proj.db is all you need for CRS information.

@djhoese
Copy link
Contributor Author

djhoese commented Aug 31, 2023

The min_confidence defaults to 70, which I would have assumed to be equivalent. If you need to be able to roundtrip, I would change it to 100. Alternatively, WKT2 is also a good format for storing a CRS if you need to roundtrip.

Sounds good. I wouldn't say I need roundtripping in normal user use cases, but the pyresample tests assume it can be done sometimes. I understand WKT2 would be good for a roundtrip, but human readability/parse-ability is also something we'd like to have.

@djhoese
Copy link
Contributor Author

djhoese commented Aug 31, 2023

The other question here is what is different between the two systems (debian:sid with experimental libproj-dev versus osgeo/proj:latest) that is causing an EPSG code to be used or not.

@snowman2
Copy link
Member

I don't recommend using osgeo/proj:latest as it doesn't work how you would expect: #1085

@djhoese
Copy link
Contributor Author

djhoese commented Aug 31, 2023

Hhhmmm in the 9.3.0 case it was showing the proper version.

@djhoese
Copy link
Contributor Author

djhoese commented Aug 31, 2023

I haven't been able to figure out what's going on here. I installed PROJ from github on both systems and pyproj from github on both and pointing to the PROJ which was in a custom prefix in PROJ_DIR and I get an EPSG code for crs1 in the debian:sid container but not the osgeo/proj:latest which I think is Ubuntu-based.

Bumping up min_confidence from 70 to 71 causes there to be no EPSG code generated on the debian system.

@djhoese
Copy link
Contributor Author

djhoese commented Sep 1, 2023

@snowman2 Do you have a suggested way of investigating the proj.db database? Or verifying that it is the same version between PROJ installations? I was about to post on the PROJ mailing list about this issue, specifically that one platform is getting an EPSG code and one is not. But then I realized I should know more about the proj_identify function used by pyproj and about the proj.db used in each installation.

https://proj.org/en/9.2/development/reference/functions.html#c.proj_identify

Is there a way to update the proj.db for an installation?

@snowman2
Copy link
Member

snowman2 commented Sep 1, 2023

Or verifying that it is the same version between PROJ installations?

import pyproj

print(pyproj.show_versions())

or

python -m pyproj -v

Is there a way to update the proj.db for an installation?

That is not recommended as it may not be compatible with a different PROJ version.

@djhoese
Copy link
Contributor Author

djhoese commented Sep 1, 2023

Well when I do md5sum /path/to/proj.db with installations from source (both from github and from the RC1 tarball posted on the mailing list) the md5sums are the same on a particular platform but different between them (debian versus the ubuntu-based osgeo/proj container). Any ideas what would lead to that? It would have to be some dependency or something available on one platform and not the other.

@snowman2
Copy link
Member

snowman2 commented Sep 1, 2023

It would have to be some dependency or something available on one platform and not the other.

sqllite3 is the only one I can think of that would impact that.

@djhoese
Copy link
Contributor Author

djhoese commented Sep 3, 2023

@snowman2 I just did:

rm -r /usr/local/lib/python3.10/dist-packages/pyproj/proj_dir

And now I'm getting the EPSG code in the crs1 case (from PROJ.4). When does proj_dir get created in the pyproj package and when does it use the system version?

@djhoese
Copy link
Contributor Author

djhoese commented Sep 3, 2023

Ok so I'm going with a corrupt installation. We've got a workaround for the EPSG roundtrip inequality from Even here: OSGeo/PROJ#3879

If/when that makes it through then the pyresample tests should pass with it. @snowman2 @avalentino I don't think I want to change the to_epsg confidence used in pyresample since the general purpose is to have it "magically" replace old PROJ.4 versions people were using that lacked datum information with some level of standard EPSG versions of these projections/CRSes. Thoughts?

@snowman2 snowman2 added the proj Bug or issue related to PROJ label Sep 20, 2023
@snowman2
Copy link
Member

When does proj_dir get created in the pyproj package and when does it use the system version?

This is what is included in the wheel. If you install pyproj with --no-binary pyproj ref it will compile using the system PROJ and won't include proj_dir.

@snowman2
Copy link
Member

I don't think I want to change the to_epsg confidence used in pyresample since the general purpose is to have it "magically" replace old PROJ.4 versions people were using that lacked datum information with some level of standard EPSG versions of these projections/CRSes. Thoughts?

I would be careful when "magically" replacing PROJ strings with EPSG versions. Unintended changes may occur that could result in differences in transformations. This is likely why Even is nervous about the change ref.

@djhoese
Copy link
Contributor Author

djhoese commented Sep 20, 2023

Point taken. The main purpose of the code in pyresample is to gently replace users understanding of their PROJ.4 string that they've used for the last 10 to 20 years and to recognize there is something more accurate and explicit in at least some kind of standard. By that I mean, many users probably copied PROJ.4 strings from the EPSG site many years ago only because that's what we supported back then. Now that EPSG strings/codes are actually supported we're "helping" the user realize that.

I think I can close this and when things break I'll come back and ask you to save us.

@djhoese djhoese closed this as completed Sep 20, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
proj Bug or issue related to PROJ question
Projects
None yet
Development

No branches or pull requests

4 participants