Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Suboptimal (?) result when going from "WGS 84" to "GDA2020 + AHD height" #2348

Closed
rouault opened this issue Sep 5, 2020 · 10 comments
Closed
Labels

Comments

@rouault
Copy link
Member

rouault commented Sep 5, 2020

echo -33 151 5 | PROJ_NETWORK=ON PROJ_DEBUG=3 PROJ_LIB=data src/cs2cs -d 8 "WGS 84" "GDA2020 + AHD height"

returns

-32.99998729	151.00000557 -20.85300064

and takes the "Using coordinate operation Inverse of Null geographic offset from WGS 84 (geog2D) to WGS 84 (geog3D) + Inverse of AGD66 to WGS 84 (17) + AGD66 to GDA94 (11) + Inverse of Null geographic offset from GDA94 (geog3D) to GDA94 (geog2D) + GDA94 to AHD height (49) + GDA94 to GDA2020 (1)" root, using au_ga_AUSGeoid09_V1.01.tif

whereas

echo -33 151 5 | PROJ_NETWORK=ON PROJ_DEBUG=3 PROJ_LIB=data src/cs2cs -d 8 GDA2020 "GDA2020 + AHD height"

returns

-33.00000000	151.00000000 -20.74200058

using
au_ga_AUSGeoid2020_20180201.tif

The later should ideally also be the result of "WGS 84" to "GDA2020 + AHD height" since there is a (null) WGS 84 to GDA2020 transformation available, and there is a geoid transformation for AHD height to GDA2020 (using AUSGeoid2020). Going through older AGD66, GDA94 and AUSGeoid09 is clearly suboptimal

@rouault rouault added the bug label Sep 5, 2020
@rouault
Copy link
Member Author

rouault commented Sep 5, 2020

Hum, actually this is tricky !

Looking at:

projinfo -s EPSG:4326 -t  EPSG:7856+5711  --spatial-test intersects --summary
Candidate operations found: 42
unknown id, WGS 84 to WGS 84 (G1762) + Inverse of Conversion from WGS 84 (G1762) (geocentric) to WGS 84 (G1762) (geog2D) + Inverse of GDA2020 to WGS 84 (G1762) (1) + Inverse of Conversion from GDA2020 (geog3D) to GDA2020 (geocentric) + GDA2020 to AHD height (1) + Map Grid of Australia zone 56, 2.23 m, Australia Christmas and Cocos - onshore
[...]
unknown id, Inverse of AGD66 to WGS 84 (17) + AGD66 to GDA94 (11) + Inverse of Null geographic offset from GDA94 (geog3D) to GDA94 (geog2D) + GDA94 to AHD height (49) + GDA94 to GDA2020 (1) + Map Grid of Australia zone 56, 1.14 m, Australia - mainland
[...]

So using the route through AGD66 and GDA94 has actually a 1.14 m accuracy, whether the more direct route has only 2.23m accuracy !

However, if using the deprecated +init=epsg:XXXX syntax,

echo 151 -33  5 | PROJ_NETWORK=ON PROJ_DEBUG=3 PROJ_LIB=data src/cs2cs -d 8  +init=epsg:4326 +to +init=epsg:7844+5711

one gets

-33.00000000	151.00000000 -20.74200058

that is the result using AUSGeoid2020. This is presumably due to internally the EPSG CRS codes being lost when using the +init=epsg:XXXX syntax (the CRS definition is derived from the EPSG one, but with the axis order altered), and thus the complicated logic of createOperations() goes through the datum codes to find CRS -> CRS transformations, whereas when using plain EPSG:4326 it will start by using it first, after trying through datum codes...

Here I'm a bit at a loss to do. Seems that PROJ is too smart for that case. Or more precisely, that the EPSG database has non consistent advertized accuracies, which cause PROJ to take potential suboptimal decisions

@rouault rouault changed the title Suboptimal result when going from "WGS 84" to "GDA2020 + AHD height" Suboptimal (?) result when going from "WGS 84" to "GDA2020 + AHD height" Sep 5, 2020
@gdt
Copy link
Contributor

gdt commented Sep 5, 2020

Without fully understanding, this strikes me as likely a bug in the EPSG database. I have a hard time believing that using a 1966 datum as an intermediary actually has better accuracy. It may be that with proj's new ability to search that this is only now becoming apparent.

It is also hard to fully understand what these accuracies mean when WGS84 is involved, given that it is a family of datums with internal consistency no better than the values being discussed.

@rouault
Copy link
Member Author

rouault commented Sep 5, 2020

@gdt Actually, you make good point. Digging more, I can see:

Route using GDA2020 and AUSGeoid2020 (2.23m total accuracy):
"WGS 84 to WGS 84 (G1762)": 2m --> this one doesn't come from EPSG. This is a null transformation added under the PROJ authority with 2m accuracy corresponding to the accuracy of the WGS 84 datum enseemble !!
"GDA2020 to WGS 84 (G1762) (1)": 0.2m
"GDA2020 to AHD height (1)": 0.03m

Route using AGD66, GDA94 (1.14m total accuracy):
"AGD66 to WGS 84 (17)": 1m
"AGD66 to GDA94 (11)": 0.1 m
"GDA94 to AHD height (49)": 0.03m
"GDA94 to GDA2020 (1)": 0.01m

So the more straightforward road is not taken because it takes the PROJ entry "WGS 84 to WGS 84 (G1762)": 2m . But actually, this is not PROJ's fault. This PROJ entry is preferred because the "GDA2020 to WGS 84 (2)" (EPSG:8450), 3.0 m (which is also a no-op) exposes an even worse accuracy.

So in all in all, the data in EPSG is what makes PROJ take surprising roads. Mostly the "AGD66 to WGS 84 (17)" transformation which advertizes a 1m accuracy is the main culprit...

@gdt
Copy link
Contributor

gdt commented Sep 5, 2020

Are you (or the OP) going to file a bug with EPSG? My understanding is that the EPSG license is non-free so it can't really be patched in proj.

On the proj side, I can see the point of having the WGS84 to WGS84(G1762) transform and that 2m is reasonable. But having done that, I would suggest considering the impact of a rule that any transform including an unqualified WGS84 have its accuracy brought up to somewhat more than 2m.

I would also suggest that it's a bug in datum datasets to not clearly have WGS84(unspecified) vs the various realizations. And likely an error to say one's data is in WGS84(unqualified), in many cases.

@rouault
Copy link
Member Author

rouault commented Sep 5, 2020

Just reported that to feedback@epsg.org:
"""
Hello,

The PROJ software strongly leverages the EPSG database to determine which
transformations are involved when transforming between 2 CRSs. In particular it uses
advertized accuracies of tranformations to try to identify the most relevant possibilities

In #2348, I came to a case where the accuracies lead
the software to take wrong conclusions

Particularly

"AGD66 to WGS 84 (17)" (EPSG:15786) reports a 1m accuracy,
whereas "GDA2020 to WGS 84 (2)" (EPSG:8450) reports a 3m accuracy

This seems to be quite surprising.
Shouldn't all transformations involving WGS 84 be reported at least with an accuracy of
2m, since this is the one of the WGS 84 ensemble ?

Looking at the comment of "AGD66 to WGS 84 (17)" (EPSG:15786), it mentions
"""Transformation taken from AGD66 to GDA94 (11) (code 1803) assuming that GDA94 is
equivalent to WGS 84 within the accuracy of this tfm"""
and EPSG:1803 "AGD66 to GDA94 (11)" has a 0.1m accuracy. So in the process of copying it to
"AGD66 to WGS 84 (17)" (EPSG:15786) , this was bumped to 1m. But it looks quite weird to
have a transformation involving an older datum and WGS 84 being reported as more accurate
as one with a newer datum and WGS 84.
"""

@RogerLott
Copy link

This is complicated issue to comment upon. There are several interweaving elements within the theme.

The route WGS 84 geographic 3D > compound GDA2020 + AHD height via 2D AGD66 is seriously flawed. If you have a requirement for a 3D transformation that includes changing to a compound pseudo3D (2D + 1D) CRS, the logic should be:

  • The last step must be geog3D to compound (here GDA2020 geog3D to compound GDA2020 geog2D + AHD height)
  • The route from WGS 84 geog3D to GDA2020 geog3D must be entirely through 3D CRSs. The geodetic reason for this is that ellipsoid height changes with change of datum and cannot be carried through as a 1D value in parallel with a set of 2D transformations.
  • If a direct transformation exists it should be used, as it is likely to be more correct regardless of accuracy. In this case there is one, EPSG::8450.
  • Operation Accuracy is used only to determine which of multiple direct transformations should be preferred. In this case there is only one WGS 84 to GDA2020 transformation, so operation accuracy does not need to be used.

In the EPSG Dataset, the Operation Accuracy is provided by the Information Source. There is no consistency in what has been and is being provided, even from within the same agency as it seems to be dependent upon the individual providing the information. Although we suggest that a 1-sigma accuracy is provided, in practice we have some RMS, a few 1-sigma, a few 2-sigma and many where we have no idea what statistic is given. We have suspicions that a majority are actually the internal precision of an adjustment, which is always likely to be much lower than the actual accuracy. A couple of years ago we investigated harmonising all of the values we carry to 1-sigma. We abandoned it as the provenance of over 60% of accuracy values was uncertain. As such, to be using Operation Accuracy for a general routing between CRS A and CRS B is not sound, from both a geodetic perspective in principle and especially when the variable values of this parameter in the EPSG Dataset is understood.
For data in the EPSG Dataset, Operation Accuracy should be used only to give a relative ranking of transformations between any particular pair of source/target CRSs. Relative values across different CRS pairs are not reliable. Operation Accuracy should be used as a differentiator only after superseded records have been eliminated from consideration (use their replacement), and area (geographic extent) of all CRSs have been check as including point(s) to be transformed.

@gdt said

I have a hard time believing that using a 1966 datum as an intermediary actually has better accuracy.
Yes it is to be expected that more recent CRSs have higher accuracy. But that is CRS network internal accuracy, which is a different concept to Operation Accuracy. Operation Accuracy estimates the impact of the coordinate operation on point accuracy. It assumes no errors in source coordinates.

But do not assume that newer always means more accurate. "AGD66 to WGS 84 (1)" (EPSG::1108) had an RMS error of 6m. It has been superseded by its information source by "AGD66 to WGS 84 (20)" (EPSG::6905) which is better because more points are included in its derivation. But EPSG::6905 accuracy is given as 9m RMS.

And then we have a complex topic in its own right, the use of WGS 84 (EPSG::4326 or EPSG::4979). Transformations to WGS 84 with EPSG or IOGP as information source (as opposed transformations from other agencies such as US NGA) are copies of some other transformation or an artificial null transformation. These have been given a nominal accuracy of 1m. This caters for 30-40 years of tectonic plate movement in most of North America and Europe. But in Australia 1m was valid only for 15-20 years, hence EPSG::1150 ("GDA94 to WGS 84" had its accuracy changed from 1m to 3m a couple of years ago.

If data has been geolocated using GPS in the past 7 years then it could be assumed to be referenced to the current WGS 84 (G1762) realization. Within a decimetre or so this is similar to ATRF2014, and EPSG::8448 can be used to transform coordinates directly to GDA2020.

Bottom line is that the accuracy of "AGD66 to WGS 84 (17)" (EPSG::15786) should be totally irrelevant to a transformation from WGS 84 to GDA2020 as AGD66 should not be part of the path.

@rouault
Copy link
Member Author

rouault commented Sep 8, 2020

@RogerLott Thanks a lot for those detailed explanations ! They make a lot of sense.

Unfortunately your recommendations are not necessarily easy to implement with the PROJ logic without altering it considerably, although perhaps in "simple" cases like the one that triggers this ticket, one could probably improve things along the directions you give.

I'm pretty sure that PROJ can be triggered in situations where it must compare added accuracies of things that aren't comparable. Like if doing CRS A to CRS B, and having to go through intermediary datums I1 or I2 if there's no direct route. In that case you'll have to compare the set of result (candidate transformation for A -> I1 + candidate transformation I1 -> B) to the set of result (candidate transformation A -> I2 + candidate transformation I2 -> B)

@gdt
Copy link
Contributor

gdt commented Sep 9, 2020

While I can appreciate that the "Operation Accuracies" in the database for various operations come from various sources and are therefore not comparable, I think what proj is trying to do fundamentally needs comparable accuracies. Setting aside where that will be done and by who, it also doesn't seem easy.

It also seems that there are two separate concepts which currently aren't being separated. One is the intrinsic accuracy of a datum. That is perhaps some measure of the variance between distances and angles in that datum, compared to those in ITRF2014 for the same points - but that's a difficult definition. For something like NAD27, this is considerable, due to distortions within the datum. For a modern GNSS-derived datum, much less so. That's a separate issue, perhaps, from the error in the transform -- although it seems somewhat difficult to talk about these concepts separately and rigorously. Perhaps proj's pipeline cost estimate should include not only costs for transforms, but for the intrinsic accuracy of intermediate datums.

I am tentatively arriving at the notion that someone needs to assign accuracies to datums and transforms, and if EPSG isn't doing this, perhaps because it is wisely declining to bite off too much, then perhaps proj does. We have to be careful about the non-Free license, so perhaps can maintain a lookaside table of codes and accuracies.

(There's a much larger issue about "WGS84", which I won't drag into this.)

rouault added a commit to rouault/PROJ that referenced this issue Jul 3, 2022
…4/GDA2020

Prior to this change, transforming to WGS 84 to GDA2020 used in priority
transformation EPGS:9690 ("WGS 84 to GDA2020 (3)"), which is a Helmert
transformation assuming WGS 84 ~= GDA94.
But transforming from GDA2020 to WGS 84 used EPSG:8450 "GDA2020 to WGS
84 (2)" which is a no-op.
Both have the same advertized accuracy 3m
Similar case for WGS 84 <--> GDA94.

This non symetric behaviour is clearly non-desirable in scenarios where
data is transformed back and forth. I believe it is preferable to use in
that situation of transformations with the same accuracy to use the one
that involves the less transformation steps, ie the no-op one.

Seen when checking OSGeo#2348 again
(whose bulk issues happens to have been fixed by other previous commits)
@rouault
Copy link
Member Author

rouault commented Jul 3, 2022

Testing again with latest PROJ master, the bulk of this issue has been fixed in between. Now the pipeline just involves GDA2020 and with #3248 that further tune behaviour,

$ echo -33 151 5 | PROJ_NETWORK=ON PROJ_LIB=data bin/cs2cs -d 8 "WGS 84" "GDA2020 + AHD height" --3d
-33.00000000	151.00000000 -20.74200058

consistently with:

$ echo -33 151 5 | PROJ_NETWORK=ON PROJ_LIB=data bin/cs2cs -d 8 "GDA2020" "GDA2020 + AHD height" --3d
-33.00000000	151.00000000 -20.74200058

@rouault rouault closed this as completed Jul 3, 2022
rouault added a commit to rouault/PROJ that referenced this issue Jul 3, 2022
…4/GDA2020

Prior to this change, transforming to WGS 84 to GDA2020 used in priority
transformation EPSG:9690 ("WGS 84 to GDA2020 (3)"), which is a Helmert
transformation assuming WGS 84 ~= GDA94.
But transforming from GDA2020 to WGS 84 used EPSG:8450 "GDA2020 to WGS
84 (2)" which is a no-op.
Both have the same advertized accuracy 3m
Similar case for WGS 84 <--> GDA94.

This non symetric behaviour is clearly non-desirable in scenarios where
data is transformed back and forth. I believe it is preferable to use in
that situation of transformations with the same accuracy to use the one
that involves the less transformation steps, ie the no-op one.

Seen when checking OSGeo#2348 again
(whose bulk issues happens to have been fixed by other previous commits)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

4 participants
@gdt @rouault @RogerLott and others