Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add explicit NaN handling to proj_trans and gie #3603

Merged
merged 10 commits into from
Feb 10, 2023

Conversation

djhoese
Copy link
Contributor

@djhoese djhoese commented Feb 3, 2023

This is an attempt at resolving some of the NaN issues I described in #3596. It has been a long time since I've done any real C/C++ development so feel free to complain about everything. I'll describe a little of what I did in the next paragraph(s) and also with inline comments after I create this.

First, I added a test to more_builtins.gie. This (as predicted by @rouault) meant adding some level of support of NaNs to gie.cpp. To fully support roundtrip tests this also meant updating proj_roundtrip. So now gie and proj_roundtrip understand a NaN output and expected NaN to be a distance of 0.

Next, to get gie to actually parse NaNs from the text file I needed to update proj_strtod. I went the simplest and maybe least flexible route. I do a strncmp for the exact characters "NaN" which as far as the documentation tells me is also supported by strtod (the C++ builtin), but strtod also supports NAN and nan I think. See the C99/C11 docs here: https://cplusplus.com/reference/cstdlib/strtod/

Lastly, I updated proj_trans to check if any of the 4D input coordinates for NaN and if so skip the actual transformation and just return NaNs for all 4 coordinates.

src/4D_api.cpp Outdated
@@ -189,6 +189,11 @@ double proj_roundtrip(PJ *P, PJ_DIRECTION direction, int n, PJ_COORD *coord) {
/* finally, we take the last half-step */
t = proj_trans(P, opposite_direction(direction), t);

/* if we start with NaN, we expect NaN as output */
if (isnan(org.v[0]) && isnan(t.v[0])) {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should I check all 4 coordinates?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should I check all 4 coordinates?

I think that would make sense. Also, should it be changed to or instead of and?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Regarding the or/and: I don't think so. This is checking that the original input is NaN and the final output is NaN. If the final output was NaN, but we didn't give it something that should return NaN then that should be a failure.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I could do: (isnan(org.v[0]) || isnan(org.v[1]) || ...) && (isnan(t.v[0]) || isnan(t.v[1]) || ...)? Basically if any of the inputs are NaN and any of the outputs are NaN...no maybe it should be that any NaNs in the input produce all NaNs in the output?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If everyone agrees on other parts of this PR, I think this is the only other part I might want to change. That is, checking all axes for NaN. This also adds to execution time.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

proj_roundtrip() makes only sense as part of a test utility like gie, not "production" use cases. Execution time is not a concern

src/apps/gie.cpp Outdated
@@ -753,7 +753,7 @@ static PJ_COORD parse_coord(const char *args) {
// test failures when testing points at edge of grids.
// For example 1501000.0 becomes 1501000.000000000233
double d = proj_strtod(prev, &endp);
if (*endp != '\0' && !isspace(*endp)) {
if (!isnan(d) && *endp != '\0' && !isspace(*endp)) {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was the simplest solution here to just say "skip this if statement if we have a NaN" but I'm not sure it is the right decision.

Comment on lines 126 to 131
/* NaN */
if (strncmp(p, "NaN", 3) == 0) {
if (endptr)
*endptr = const_cast<char *>(p + 3);
return NAN;
}
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure how strict/flexible this code needs to be. It specifically checks for only NaN (not "nan", not "NAN", etc). It also doesn't do anything if the character after NaN is a whitespace. For a valid NaN it should be, but right now someone could technically enter the next coordinate number right after a NaN (ex. NaN0.0m would be two values I think).

@djhoese
Copy link
Contributor Author

djhoese commented Feb 3, 2023

Also, I didn't notice until after I started making this PR that gie has its own tests. If you want me to add tests there I can.

src/4D_api.cpp Outdated Show resolved Hide resolved
src/apps/proj_strtod.cpp Outdated Show resolved Hide resolved
src/apps/proj_strtod.cpp Outdated Show resolved Hide resolved
src/apps/proj_strtod.cpp Show resolved Hide resolved
src/4D_api.cpp Outdated Show resolved Hide resolved
djhoese and others added 5 commits February 3, 2023 12:11
Co-authored-by: Even Rouault <even.rouault@spatialys.com>
Co-authored-by: Even Rouault <even.rouault@spatialys.com>
Co-authored-by: Even Rouault <even.rouault@spatialys.com>
Co-authored-by: Even Rouault <even.rouault@spatialys.com>
@@ -123,6 +125,13 @@ double proj_strtod(const char *str, char **endptr) {
return 0;
}

/* NaN */
if (strncmp(p, "NaN", 3) == 0) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should this comparison also include nan?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Technically, if we're going for feature parity with standard strtod, then it should handle nan, NAN, NaN, and maybe every combination of upper and lowercase. I mentioned in a higher up comment:

I'm not sure how strict/flexible this code needs to be. It specifically checks for only NaN (not "nan", not "NAN", etc). It also doesn't do anything if the character after NaN is a whitespace. For a valid NaN it should be, but right now someone could technically enter the next coordinate number right after a NaN (ex. NaN0.0m would be two values I think).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you may use the ci_equal() function for case insensitive comparison, provided you add at top of file

#include "proj/internal/internal.hpp"
using namespace NS_PROJ::internal;

Note that GDAL's strtod() function also catches other variations of NaN that used to be/are output by Microsoft C runtime library. Cf https://github.com/OSGeo/gdal/blob/master/port/cpl_strtod.cpp#L227 and below

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ah, actually ci_equal() isn't appropriate. That should be ci_starts_with()

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How may those comparisons affect performance?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

#include "proj/internal/internal.hpp"
using namespace NS_PROJ::internal;

results in:

In file included from /home/davidh/repos/git/PROJ/src/apps/proj_strtod.cpp:89:
/home/davidh/repos/git/PROJ/include/proj/internal/internal.hpp:30:2: error: #error This file should only be included from a PROJ cpp file
   30 | #error This file should only be included from a PROJ cpp file
      |  ^~~~~
gmake[2]: *** [src/apps/CMakeFiles/cct.dir/build.make:90: src/apps/CMakeFiles/cct.dir/proj_strtod.cpp.o] Error 1

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

add #define FROM_PROJ_CPP before including proj/internal/internal.hpp

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

#include "proj_strtod.h"
#define FROM_PROJ_CPP
#include "proj/internal/internal.hpp"
using namespace NS_PROJ::internal;

#include <ctype.h>
#include <errno.h>

It's been a long time since I've done real C, and I really don't know C++ and especially not namespaces...

    if (ci_starts_with(p, "NaN")) {
/usr/bin/ld: CMakeFiles/cct.dir/proj_strtod.cpp.o: in function `proj_strtod(char const*, char**) [clone .part.0]':
proj_strtod.cpp:(.text+0x6b): undefined reference to `osgeo::proj::internal::ci_starts_with(char const*, char const*)'

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

proj_strtod.cpp:(.text+0x6b): undefined reference to `osgeo::proj::internal::ci_starts_with(char const*, char const*)'

ah, I missed proj_strtod.cpp was an application file, and not a file in the library. ci_starts_with() needs to be exported then.

You need to apply the following patch:

diff --git a/include/proj/internal/internal.hpp b/include/proj/internal/internal.hpp
index dda11cbc..d7aa0d49 100644
--- a/include/proj/internal/internal.hpp
+++ b/include/proj/internal/internal.hpp
@@ -128,7 +128,7 @@ inline bool starts_with(const std::string &str, const char *prefix) noexcept {
 
 bool ci_less(const std::string &a, const std::string &b) noexcept;
 
-bool ci_starts_with(const char *str, const char *prefix) noexcept;
+PROJ_DLL bool ci_starts_with(const char *str, const char *prefix) noexcept;
 
 bool ci_starts_with(const std::string &str, const std::string &prefix) noexcept;
 
diff --git a/scripts/reference_exported_symbols.txt b/scripts/reference_exported_symbols.txt
index 6ff75bef..fc512037 100644
--- a/scripts/reference_exported_symbols.txt
+++ b/scripts/reference_exported_symbols.txt
@@ -335,6 +335,7 @@ osgeo::proj::HorizontalShiftGridSet::reopen(pj_ctx*)
 osgeo::proj::internal::ci_equal(std::string const&, char const*)
 osgeo::proj::internal::ci_equal(std::string const&, std::string const&)
 osgeo::proj::internal::ci_find(std::string const&, char const*)
+osgeo::proj::internal::ci_starts_with(char const*, char const*)
 osgeo::proj::internal::c_locale_stod(std::string const&)
 osgeo::proj::internal::replaceAll(std::string const&, std::string const&, std::string const&)
 osgeo::proj::internal::split(std::string const&, char)

@kbevers
Copy link
Member

kbevers commented Feb 4, 2023

What are the exact implications of this change for me as a user of PROJ? It's clear we've had a somewhat undefined approach to NaNs and sorting it out is probably a good idea. I'm also worried this will break existing use cases that have come to rely on specific behaviour.

@djhoese
Copy link
Contributor Author

djhoese commented Feb 4, 2023

What are the exact implications of this change for me as a user of PROJ?

@kbevers To me it is about consistency. From my experience, most projections "handle" NaNs just by the basic nature of any math performed on a NaN results in a NaN. So NaN in means NaN out. The LAEA projection (as described in #3596) is the first I've found where it returned 0 with NaN input and even that may have been a PROJ 6+ change. This had the odd effect of causing my resampling algorithm to put image pixels at the projections reference point (0m, 0m) because there were some invalid (NaN) values in the satellite instrument's coordinates. So I think anyone using NaN as a sentinel value for invalid/bad values in their data will be pleased.

Anyone who was benefiting from the NaN -> 0 or NaN -> ? of certain projections probably (big assumption) didn't know they were using it, especially given that most projections do return NaNs. I could also see people who aren't used to working with NaNs (isnan, val != val, etc) would be confused by these new results...but at the same time they are providing NaNs as input.

On the developer side, this should mean any projection algorithm development shouldn't have to worry about NaN handling like doing weird inverse/negated if statements to avoid NaNs.

@djhoese
Copy link
Contributor Author

djhoese commented Feb 4, 2023

How may those comparisons affect performance?

@jjimenezshaw brings up a good point that I didn't address in my above comment. This is adding extra checks in places where people who aren't working with NaN will still be affected by them. I'm not sure what modern C/C++ compilers can do as far as NaN checks and string comparison optimizations.

@rouault
Copy link
Member

rouault commented Feb 4, 2023

How may those comparisons affect performance?

proj_strtod() is not used in computational code paths, just in pipeline initialization, so I'd expect any change in it to have totally unmeasurable timing effects

@snowman2
Copy link
Contributor

snowman2 commented Feb 4, 2023

Would there be any benefit to removing all NaN checks from the lower level projection code?

@kbevers
Copy link
Member

kbevers commented Feb 4, 2023

Anyone who was benefiting from the NaN -> 0 or NaN -> ? of certain projections probably (big assumption) didn't know they were using it, especially given that most projections do return NaNs.

Or possibly they have put in work-arounds that'll break. I'm asking to get a sense of the scale of this change. If it ends up only affecting a handful of projections I'm happy to call it a bug fix and include this in the next release. If this ends up affecting most projections it's a big change in behaviour and I don't think that's justifiable in anything other than a major version release.

@djhoese
Copy link
Contributor Author

djhoese commented Feb 5, 2023

@kbevers Understood. I think this will be difficult to have a real answer to those questions. There is such inconsistent behavior here that I'm nervous to find out what workarounds people may have had. I would hope any generic application (one that handles a lot of projections) is using the detection of NaNs in the inputs to mask the results because otherwise I'm not sure how NaNs could be worked with.

I did a little test where I did proj -lp | cut -d : -f 1 to get a list of projections then ran this python code with that list using pyproj:

from pyproj import Transformer, CRS
import numpy as np

for proj in projections:
    try:
        t = Transformer.from_crs('EPSG:4326', CRS.from_proj4(f"+proj={proj}"))
        print(f"{proj}: ", t.transform(np.nan, np.nan))
    except:
        print(f"Bad projection: {proj}")

So basically: for each projection define a PROJ.4 projection as +proj=<projection name> and try to forward transform NaN. If the +proj= is not enough to properly define the CRS and we get an error, print "Bad projection" and continue on. Here are the results:

Proj transform results
adams_hemi:  (0.0, 16723842.659738889)
adams_ws1:  (11825542.552198235, 11825542.552198235)
adams_ws2:  (0.0, 16723842.659738889)
Bad projection: aea
aeqd:  (nan, nan)
affine:  (nan, nan)
airy:  (nan, nan)
aitoff:  (nan, nan)
alsk:  (nan, nan)
apian:  (0.0, nan)
august:  (nan, nan)
Bad projection: axisswap
bacon:  (0.0, nan)
bertin1953:  (nan, nan)
bipc:  (nan, nan)
boggs:  (nan, nan)
Bad projection: bonne
calcofi:  (nan, nan)
cart:  (nan, nan)
cass:  (nan, nan)
cc:  (nan, nan)
Bad projection: ccon
cea:  (nan, nan)
Bad projection: chamb
collg:  (nan, nan)
col_urban:  (nan, nan)
comill:  (nan, nan)
crast:  (nan, nan)
Bad projection: defmodel
Bad projection: deformation
denoy:  (nan, nan)
eck1:  (nan, nan)
eck2:  (nan, nan)
eck3:  (nan, nan)
eck4:  (nan, 8460601.461471582)
eck5:  (nan, nan)
eck6:  (inf, inf)
eqearth:  (nan, nan)
eqc:  (nan, nan)
Bad projection: eqdc
Bad projection: euler
etmerc:  (inf, inf)
fahey:  (nan, nan)
fouc:  (nan, nan)
fouc_s:  (nan, nan)
gall:  (nan, nan)
Bad projection: geoc
geogoffset:  (nan, nan)
Bad projection: geos
gins8:  (nan, nan)
Bad projection: gn_sinu
gnom:  (nan, nan)
goode:  (nan, 9356813.481673645)
gs48:  (nan, nan)
gs50:  (nan, nan)
guyou:  (11825542.552198235, 11825542.552198235)
hammer:  (nan, nan)
hatano:  (nan, nan)
healpix:  (nan, nan)
rhealpix:  (nan, nan)
Bad projection: helmert
Bad projection: hgridshift
Bad projection: horner
igh:  (nan, 9356835.979712626)
igh_o:  (nan, 9356835.979712626)
imoll:  (nan, 9020047.848073646)
imoll_o:  (nan, 9020047.848073646)
Bad projection: imw_p
isea:  (nan, nan)
kav5:  (nan, nan)
kav7:  (nan, nan)
krovak:  (nan, nan)
Bad projection: labrd
laea:  (nan, nan)
lagrng:  (nan, nan)
larr:  (nan, nan)
lask:  (nan, nan)
lonlat:  (nan, nan)
latlon:  (nan, nan)
Bad projection: lcc
Bad projection: lcca
leac:  (nan, nan)
lee_os:  (nan, nan)
loxim:  (nan, nan)
Bad projection: lsat
mbt_s:  (nan, nan)
mbt_fps:  (nan, nan)
mbtfpp:  (nan, nan)
mbtfpq:  (nan, nan)
mbtfps:  (inf, inf)
merc:  (nan, nan)
mil_os:  (nan, nan)
mill:  (nan, nan)
Bad projection: misrsom
moll:  (nan, 9020047.848073646)
Bad projection: molobadekas
Bad projection: molodensky
Bad projection: murd1
Bad projection: murd2
Bad projection: murd3
natearth:  (nan, nan)
natearth2:  (nan, nan)
nell:  (nan, nan)
nell_h:  (nan, nan)
nicol:  (nan, nan)
Bad projection: nsper
nzmg:  (nan, nan)
noop:  (nan, nan)
Bad projection: ob_tran
ocea:  (nan, nan)
Bad projection: oea
Bad projection: omerc
ortel:  (0.0, nan)
ortho:  (nan, nan)
Bad projection: pconic
patterson:  (nan, nan)
peirce_q:  (11825542.552198235, 11825542.552198235)
Bad projection: pipeline
poly:  (nan, nan)
pop:  (nan, nan)
push:  (nan, nan)
putp1:  (nan, nan)
putp2:  (nan, 9492245.3982264)
putp3:  (nan, nan)
putp3p:  (nan, nan)
putp4p:  (nan, nan)
putp5:  (nan, nan)
putp5p:  (nan, nan)
putp6:  (nan, 10153534.22269667)
putp6p:  (nan, 8882436.792968152)
qua_aut:  (nan, nan)
qsc:  (nan, nan)
robin:  (inf, inf)
rouss:  (nan, nan)
rpoly:  (nan, nan)
s2:  (nan, nan)
Bad projection: sch
set:  (nan, nan)
sinu:  (nan, nan)
somerc:  (nan, nan)
stere:  (nan, nan)
sterea:  (nan, nan)
gstmerc:  (nan, nan)
tcc:  (nan, nan)
tcea:  (nan, nan)
times:  (nan, nan)
Bad projection: tinshift
Bad projection: tissot
tmerc:  (inf, inf)
tobmerc:  (nan, nan)
Bad projection: topocentric
Bad projection: tpeqd
Bad projection: tpers
Bad projection: unitconvert
ups:  (nan, nan)
Bad projection: urm5
Bad projection: urmfps
Bad projection: utm
vandg:  (nan, nan)
vandg2:  (nan, nan)
vandg3:  (nan, nan)
vandg4:  (nan, nan)
vertoffset:  (nan, nan)
Bad projection: vitk1
Bad projection: vgridshift
wag1:  (nan, nan)
wag2:  (nan, nan)
wag3:  (nan, nan)
wag4:  (nan, 9984854.942197768)
wag5:  (nan, 10524818.98918)
wag6:  (nan, nan)
wag7:  (nan, nan)
webmerc:  (nan, nan)
weren:  (nan, nan)
wink1:  (nan, nan)
wink2:  (nan, nan)
wintri:  (nan, nan)
Bad projection: xyzgridshift

What you should notice: many of the projections are (nan, nan), many were also not able to be defined this way, and others had nan only in the x coordinate (the first axis). Others have (inf, inf) or completely valid results. I should also note that the LAEA projection that I originally filed the related issue about returned (nan, nan) here because the "bug" I reported was only revealed in the north/south pole version of the projection in which (0, 0) is returned.

@djhoese
Copy link
Contributor Author

djhoese commented Feb 5, 2023

Would there be any benefit to removing all NaN checks from the lower level projection code?

@snowman2 I don't think so, but I'm also not sure how much work people have put in to handling them. Like I said, I think in a lot of cases NaN handling is just happening due to the math. When the result is not NaN, my guess is this is from if statements giving undesired results:

if (lon > some_valid_threshold) {
    // do math
} else {
    // fail
    x = 0
    y = 0
}

// more math

@snowman2
Copy link
Contributor

snowman2 commented Feb 5, 2023

In case you run it again, I tweaked the script so you can do everything natively in python without proj -lp | cut -d : -f 1

from math import nan

from pyproj.list import get_proj_operations_map
from pyproj import Transformer


for proj in get_proj_operations_map():
    try:
        t = Transformer.from_crs('EPSG:4326', f"+proj={proj}")
        print(f"{proj}: ", t.transform(nan, nan))
    except Exception:
        print(f"Bad projection: {proj}")

@rouault
Copy link
Member

rouault commented Feb 5, 2023

Or possibly they have put in work-arounds that'll break.

@kbevers I doubt the proposed change in this PR would have large breaking potential. I would assume that anyone having relied in the past on PROJ's undefined NaN handling got the (nan, nan) result as output as @djhoese's experiment shows and thus will see no difference, and if they ran into the few cases where PROJ currently returns garbage, the user would have had to catch the NaN input before submitting coordinates to proj_trans() and do whatever they felt appropriate, so any change in PROJ isn't going to affect them. So overall I'd see this as a fix-as-usual situation.

This is adding extra checks in places where people who aren't working with NaN will still be affected by them

@djhoese Could you run bin/bench_proj_trans on a few transformations before & after your change (on a -DCMAKE_BUIKD_TYPE=Release build to get realistic numbers) ?

@djhoese
Copy link
Contributor Author

djhoese commented Feb 5, 2023

Ok so I have my main build directory for this PR, and a build2 for master where I switched the branch before running the build commands. I build both of these with:

cmake -DCMAKE_BUILD_TYPE=Release -DCMAKE_PREFIX_PATH="/home/davidh/miniconda3/envs/satpy_py310" ..
cmake --build .

So both of these are based off of a conda environment (for the dependencies) which also has PROJ installed.

And I'm not really sure how to use the benchmark tool, but here's what I got:

# this PR:
$ ./bin/bench_proj_trans -s EPSG:4326 -t EPSG:3572 72.0 -120.0
72 -120 -> 1000792.32698984 -1733423.15817148
Duration: 432 ms
Throughput: 11.57 million coordinates/s

# master:
$ cd ../build2
$ ./bin/bench_proj_trans -s EPSG:4326 -t EPSG:3572 72.0 -120.0
72 -120 -> 1000792.32698984 -1733423.15817148
Duration: 441 ms
Throughput: 11.34 million coordinates/s

I'll see if I can verify that this is the different code sources being used.

@rouault
Copy link
Member

rouault commented Feb 5, 2023

And I'm not really sure how to use the benchmark tool, but here's what I got:


ok, well, this mostly demonstrates that the effect of this PR is neglectable on performance. In theory your PR should be slightly slower than master as it adds a few instructions, but with modern CPUs and dynamic CPU frequency scaling, accurate measurement is incredible hard.

Actually trying myself, and using hyperfine to get hopefully more accurate measurements, I find that this PR would be ~ 2% slower (but this might be within measurement noise and not being significant) on that particular use case:

master:

$ hyperfine --warmup 20  './bin/bench_proj_trans -s EPSG:4326 -t EPSG:3572 72.0 -120.0'
Benchmark #1: ./bin/bench_proj_trans -s EPSG:4326 -t EPSG:3572 72.0 -120.0
  Time (mean ± σ):     458.7 ms ±   5.4 ms    [User: 454.0 ms, System: 4.5 ms]
  Range (min … max):   453.2 ms … 468.7 ms
 
$ hyperfine --warmup 20  './bin/bench_proj_trans -s EPSG:4326 -t EPSG:3572 72.0 -120.0'
Benchmark #1: ./bin/bench_proj_trans -s EPSG:4326 -t EPSG:3572 72.0 -120.0
  Time (mean ± σ):     455.0 ms ±   4.4 ms    [User: 451.0 ms, System: 3.8 ms]
  Range (min … max):   450.3 ms … 463.5 ms

your PR:

 $ hyperfine --warmup 20  './bin/bench_proj_trans -s EPSG:4326 -t EPSG:3572 72.0 -120.0'
Benchmark #1: ./bin/bench_proj_trans -s EPSG:4326 -t EPSG:3572 72.0 -120.0
  Time (mean ± σ):     469.4 ms ±   7.3 ms    [User: 463.8 ms, System: 3.9 ms]
  Range (min … max):   461.1 ms … 480.5 ms

$ hyperfine --warmup 20  './bin/bench_proj_trans -s EPSG:4326 -t EPSG:3572 72.0 -120.0'
Benchmark #1: ./bin/bench_proj_trans -s EPSG:4326 -t EPSG:3572 72.0 -120.0
  Time (mean ± σ):     466.2 ms ±   3.7 ms    [User: 461.8 ms, System: 4.0 ms]
  Range (min … max):   459.9 ms … 471.2 ms

So, I'd say we are OK from a performance point of view

@djhoese
Copy link
Contributor Author

djhoese commented Feb 6, 2023

Not sure how you feel about helper functions, but I made two inline functions for the "any NaNs" and "all NaNs" checks in 4D_api.cpp.

@kbevers
Copy link
Member

kbevers commented Feb 6, 2023

I did a little test where I did proj -lp | cut -d : -f 1 to get a list of projections then ran this python code with that list using pyproj

Thanks. That gave a good overview of the current state of affairs. I agree with @rouault that this isn't in breaking changes territory. There's a chance some people will have put in place code to sanitize potentially weird PROJ output but most likely that will not be tripped up by this change.

I would like to have it mentioned in the docs for proj_trans that we are now guaranteed to get NaN output on NaN input. Normally that probably wouldn't be necessary but since we've been inconsistent on the matter for 30+ years I think it's worth adding a sentence about.

@djhoese
Copy link
Contributor Author

djhoese commented Feb 6, 2023

I would like to have it mentioned in the docs for proj_trans that we are now guaranteed to get NaN output on NaN input. Normally that probably wouldn't be necessary but since we've been inconsistent on the matter for 30+ years I think it's worth adding a sentence about.

Sounds good. Would that go in docs/source/development/reference/functions.rst or in the comments/docstring inside the proj_trans function or somewhere else or all of the above?

@rouault
Copy link
Member

rouault commented Feb 6, 2023

Would that go in docs/source/development/reference/functions.rst

yes, that one

@djhoese
Copy link
Contributor Author

djhoese commented Feb 6, 2023

Last release was 9.1.1, if I put this under a versionchanged directive should I put 9.1.2 or 9.2.0?

@kbevers
Copy link
Member

kbevers commented Feb 7, 2023

should I put 9.1.2 or 9.2.0?

9.2.0

@djhoese
Copy link
Contributor Author

djhoese commented Feb 8, 2023

should I put 9.1.2 or 9.2.0?

9.2.0

I've updated this. Could someone start the tests for me?

@rouault rouault added this to the 9.2.0 milestone Feb 8, 2023
Copy link
Contributor

@snowman2 snowman2 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @djhoese 👍

@rouault rouault merged commit 1be6e21 into OSGeo:master Feb 10, 2023
@djhoese djhoese deleted the feature-explicit-nan-handling branch February 10, 2023 19:33
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

LAEA projection odd handling of NaN
5 participants