Update Mercator projection #2397

Merged Nov 1, 2020 (10 commits)

Conversation

@cffk (Contributor) commented Oct 26, 2020

Introduction

The existing formulation for the Mercator projection is
"satisfactory"; it is reasonably accurate. However for a core
projection like Mercator, I think we should strive for full double
precision accuracy.

This commit uses cleaner, more accurate, and faster methods for
computing the forward and inverse projections. These use the
formulation in terms of hyperbolic functions that are manifestly odd
in latitude

psi = asinh(tan(phi)) - e * atanh(e * sin(phi))

(phi = latitude; psi = isometric latitude = Mercator y coordinate).
Contrast this with the existing formulation

psi = log(tan(pi/4 + phi/2))
      - e/2 * log((1 + e * sin(phi)) / (1 - e * sin(phi)))

where psi(-phi) isn't exactly equal to -psi(phi) and psi(0) isn't
guaranteed to be 0.
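
A minimal sketch of the new form (psif is an illustrative name, not
the one used in the code):

#include <cmath>

// Isometric latitude via the hyperbolic form.  Since tan, sin, asinh,
// and atanh are all odd functions, psif(-phi) == -psif(phi) exactly
// and psif(0) == 0.
double psif(double phi, double e) {
    return asinh(tan(phi)) - e * atanh(e * sin(phi));
}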

Implementation

There's no particular issue implementing the forward projection: just
apply the formulas above. The inverse projection is tricky because
there's no closed form solution for the inverse. The existing code
for the inverse uses an iterative method from Snyder. This is the
usual hokey function iteration, and, as usual, the convergence rate is
linear (error reduced by a constant factor on each iteration). This
is OK (just) for low accuracy work. But nowadays, something with
quadratic convergence (e.g., Newton's method, number of correct digits
doubles on each iteration) is preferred (and used here). More on this
later.

The solution for phi(psi) I use is described in my TM paper and I
lifted the specific formulation from GeographicLib's Math::tauf, which
uses the same underlying machinery for all conformal projections. It
solves for tan(phi) in terms of sinh(psi), which, being a near-identity
mapping, is ideal for Newton's method.
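
A sketch of that Newton solve, modeled on GeographicLib's Math::tauf
(the PR's pj_sinhpsi2tanphi may differ in detail):

#include <cfloat>
#include <cmath>

// Solve taup = sinh(psi(phi)) for tau = tan(phi).  The mapping is
// nearly the identity, and the initial guess tau = taup / (1 - e^2)
// is good enough that a couple of Newton iterations give full double
// precision accuracy.
double sinhpsi2tanphi(double taup, double e) {
    const double e2m = 1 - e * e;
    const double tol = sqrt(DBL_EPSILON) / 10;
    double tau = taup / e2m;
    for (int i = 0; i < 5; ++i) {
        double tau1 = hypot(1.0, tau);
        double sig = sinh(e * atanh(e * tau / tau1));
        double taupa = hypot(1.0, sig) * tau - sig * tau1; // sinh(psi(tau))
        double dtau = (taup - taupa) * (1 + e2m * tau * tau) /
                      (e2m * tau1 * hypot(1.0, taupa));    // Newton step
        tau += dtau;
        if (fabs(dtau) < tol * (1 + fabs(taup)))
            break;
    }
    return tau;
}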

For comparison I also look at the approach adopted by Poder + Engsager
in their TM paper and implemented in etmerc. This uses trigonometric
series (accurate to order n^6, n = the third flattening) to convert
phi <-> chi. psi is then given by

psi = asinh(tan(chi))

Accuracy

I tested just the routines for transforming phi <-> psi from merc.cpp
and measured the errors (converted to true nm = nanometers) for the
forward and inverse mapping. I also included in my analysis the
method used by etmerc. This uses a trigonometric series to convert
phi <-> chi = atan(sinh(psi)), the conformal latitude.

            forward (nm)     inverse (nm)
            max     rms      max      rms
old merc    3.60    0.85     2189.47  264.81
  etmerc    1.82    0.38     1.42     0.37
new merc    1.83    0.30     2.12     0.31

1 nm is pretty much the absolute limit for accuracy in double
precision (10^7 m, roughly the pole-to-equator distance, divided by
2^53 ≈ 9.0e15 gives about 1.1 nm), and 5 nm is probably the limit on
what you should routinely expect. So the old merc inverse is
considerably less accurate than it could be. The old merc forward is
OK on accuracy -- except that it does not preserve the parity of the
projection.

The accuracy of etmerc is fine (the truncation error of the 6th-order
series is small compared with the round-off error). However, the
situation reverses as the flattening is increased. E.g., at f =
1/150, the max error for the inverse projection is 8 nm. etmerc is OK
for terrestrial applications, but couldn't be used for Mars.

Timing

Here's what I get with g++ -O3 on various Linux machines with recent
versions of g++. As always, you should take these with a grain of
salt. You might expect the relative timings to vary by 20% or so when
switching between compilers/machines. Times per call in ns =
nanoseconds.

           forward inverse
old merc     121      360
  etmerc    4e-6      1.4
new merc      20      346

The new merc method is 6 times faster at the forward projection and
modestly faster at the inverse projection (despite being more
accurate). The latter result is because it only takes 2 iterations of
Newton's method to get full accuracy, compared with an average of 5
iterations for the old method to get only micrometer accuracy.

A shocking aspect of these timings is how fast etmerc is. Another is
that forward etmerc is streaks faster than inverse etmerc (it made me
doubt my timing code). Evidently, asinh(tan(chi)) is a lot faster to
compute than atan(sinh(psi)). The hesitation about adopting etmerc
then comes down to:

  • the likelihood that Mercator may be used for non-terrestrial
    bodies;

  • the question of whether the timing benefits for the etmerc method
    would be noticeable in a realistic application;

  • need to duplicate the machinery for evaluating the coefficients
    for the series and for Clenshaw summation in the current code
    layout (a sketch of the Clenshaw step follows this list).
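
For reference, the Clenshaw part of that machinery is compact. A
minimal sketch (illustrative names; the series coefficients, which
depend on the third flattening n, are omitted):

#include <cmath>

// Clenshaw summation of S = sum_{k=1..nc} c[k-1] * sin(2*k*chi), the
// form of trigonometric series used for auxiliary-latitude
// conversions such as phi <-> chi.
double clenshaw_sin(const double c[], int nc, double chi) {
    const double X = 2 * cos(2 * chi);  // multiplier in the recurrence
    double u0 = 0, u1 = 0;              // u_{k+1} and u_{k+2}
    for (int k = nc; k > 0; --k) {
        double t = X * u0 - u1 + c[k - 1];
        u1 = u0;
        u0 = t;
    }
    return u0 * sin(2 * chi);
}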

More discussion follows in the comments below.

  • Tests added
  • Added clear title that can be used to generate release notes
  • Fully documented, including updating docs/source/*.rst for new API

@cffk (Contributor Author) commented Oct 26, 2020

Ripple effects

The Mercator routines embodying the Snyder method, pj_tsfn and
pj_phi2, are also used in other projections. These relate phi to t =
exp(-psi) (a rather bizarre choice in my book). I've retrofitted them
to use the more accurate methods. They do the "right thing" for phi
in [-pi/2, pi/2], t in [0, inf], and e in [0, 1). NANs are properly
handled.
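
A hedged sketch of the underlying identity (not the exact code in
tsfn.cpp): since exp(-asinh(tan(phi))) reduces to cos(phi)/(1 +
sin(phi)), t = exp(-psi) can be computed without exponentiating psi:

#include <cmath>

// t = exp(-psi) without cancellation.  For sin(phi) < 0 the
// algebraically equivalent form (1 - sin(phi))/cos(phi) avoids the
// small denominator 1 + sin(phi).  Illustrative only.
double tsfn(double phi, double e) {
    const double sinphi = sin(phi), cosphi = cos(phi);
    const double t0 = sinphi >= 0 ? cosphi / (1 + sinphi)
                                  : (1 - sinphi) / cosphi;
    return t0 * exp(e * atanh(e * sinphi));
}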

Of course, phi = pi/2 in double precision is actually less than pi/2,
so cos(pi/2) > 0. So no special handling is needed for pi/2. Even if
angles were handled in such a way that 90deg were exactly represented,
these routines would still "work", with, e.g., tan(pi/2) -> inf.

(A caution: with long doubles = a 64-bit fraction, we have cos(pi/2) <
0; and now we would need to be careful.)
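
A quick illustration of the point:

#include <cmath>
#include <cstdio>

int main() {
    double phi = M_PI / 2;  // the nearest double lies slightly below pi/2
    printf("cos(phi)        = %.3e\n", cos(phi));        // ~6.1e-17, > 0
    printf("tan(phi)        = %.3e\n", tan(phi));        // ~1.6e16, finite
    printf("asinh(tan(phi)) = %.6f\n", asinh(tan(phi))); // ~38.025
    return 0;
}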

As a consequence, there's no need for error handling in pj_tsfn; the
HUGE_VAL return has gone and, of course, HUGE_VAL is a perfectly legal
input to tsfn's inverse, phi2, which would return -pi/2. This "error
handling" was only needed for e = 1, a case which is filtered out
upstream. I will note that bad argument handling is much more natural
using NAN instead of HUGE_VAL. See issue #2376.

I've renamed the error condition for non-convergence of the inverse
projection from "non-convergent inverse phi2" to "non-convergent
sinh(psi) to tan(phi)".

Now that pj_tsfn and pj_phi2 return "better" results, there were
some malfunctions in the projections that call them, specifically
gstmerc, lcc, and tobmerc.

  • gstmerc invoked pj_tsfn(phi, sinphi, e) with a value of sinphi
    that wasn't equal to sin(phi). Disaster followed. I fixed this.
    I also replaced numerous occurrences of "-1.0 * x" by "-x".
    (Defining a function with arguments phi and sinphi is asking for
    trouble.)

  • lcc incorrectly thinks that the projection isn't defined for
    standard latitude = +/- 90d. This happens to be false (it reduces
    to polar stereographic in this limit). The check was whether
    tsfn(phi) = 0 (which only tested for the north pole, not the south
    pole). However, since tsfn(pi/2) now (correctly) returns a nonzero
    result, this test fails. I now just test for |phi| = pi/2. This
    is clearer and catches both poles (I'm assuming that the current
    implementation will probably fail in these cases).

  • tobmerc similarly thinks that phi close to +/- pi/2 can't be
    transformed even though psi(pi/2) is only 38. I'm disinclined to
    fight this. However, I did tighten up the failure condition
    (strict equality of |phi| == pi/2).

@cffk (Contributor Author) commented Oct 26, 2020

OTHER STUFF

Testing

builtins.gie: I tightened up the tests for merc (and, while I was
about it, etmerc and tmerc) to reflect full double precision accuracy.
My test values are generated with MPFR-enabled code and so should be
accurate to all digits given. For the record, for GRS80 I use f =
1/298.2572221008827112431628366 in these calculations.

pj_phi2_test: many of the tests were bogus, testing irrelevant input
parameters (like negative values of exp(-psi)) and freezing in the
arbitrary behavior of phi2. I've reworked most of the tests to be
semi-useful. @schwehr, can you review?
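
An example of the sort of semi-useful test I have in mind (a sketch
only; the real test file has the proper includes and a real context
in place of nullptr):

#include "gtest/gtest.h"
#include <cmath>
// assumes PROJ's internal header declaring pj_tsfn and pj_phi2

// phi2 is the inverse of tsfn, so a round trip should reproduce phi
// to double precision (1e-15 rad is about 6 nm on the Earth).
TEST(Phi2Test, TsfnPhi2RoundTrip) {
    const double e = 0.0818191908426215;  // approximately WGS84
    for (double phi = -1.5; phi <= 1.5; phi += 0.25) {
        const double ts = pj_tsfn(phi, sin(phi), e);
        EXPECT_NEAR(pj_phi2(nullptr, ts, e), phi, 1e-15);
    }
}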

Documentation

I've updated merc.rst to outline the calculation of the inverse
projection.

phi2.cpp includes detailed notes about applying Newton's method to
find tan(phi) in terms of sinh(psi).

Future work

lcc needs some tender loving care. It can easily (and should) be
modified to allow stdlat = +/- 90 (reduces to polar stereographic),
stdlat = 0 and stdlat_1 + stdlat_2 = 0 (reduces to Mercator). A
little more elbow grease will allow the treatment of stdlat_1 close to
stdlat_2 using divided differences. (See my implementation of the
LambertConformalConic class in GeographicLib.)

All the places where pj_tsfn and pj_phi2 are called need to be
reworked to cut out the use of Snyder's t = exp(-psi) variable and
instead use sinh(psi).

Maybe include the machinery for series conversions between all
auxiliary latitudes as "support functions". Then etmerc could use
this (as could mlfn for computing meridional distance). merc could
offer the etmerc-style projection via chi as an option when the
flattening is sufficiently small.

@schwehr (Member) commented Oct 26, 2020

For pj_get_default_ctx, the tests seem fine. The tests I wrote long ago were done without a deep understanding of the code. Just some style comments. I'm not a fan of some things, like converting 1.0 to 1, which causes a hidden cast. These aren't critical by any means. I just always try to keep conversions to a minimum and use const / constexpr as much as possible. It makes debugging easier and every once in a while it finds a sneaky bug. I'm unusual in that I'm usually debugging through massive apps that I often don't know, so these things probably matter to me far more than to most.

src/phi2.cpp:
// if (e2m < 0) return std::numeric_limits<double>::quiet_NaN();
int i = numit;
for (; i; --i) {
double tau1 = sqrt(1 + tau * tau),
Member:

All these are const, yes? I've seen too many sneaky things over the years.

Contributor Author:

We went around the block a few times on this particular question a while back. I know that I can make these variables const. But that's not how I think of them, which is "a bunch of variables I need in this do loop". const elevates them in importance, so now I have to ask: "gee, is there something I need to watch out for?" So in my mind, I reserve the const declaration for long-lived immutable quantities; maybe I would agree with you if the body of the for loop were 50 lines or more. This is all a matter of taste, of course. The compiler will produce the same code in either case.

Member:

It's fine to not make them const, but I take issue with "const declaration for long-lived immutable quantities" and "elevates them in importance". E.g., this says I'm iterating over something and I'm not mutating it. It's definitely not important:

for (const auto& item : container) {
  // short bit of code.
}

@@ -106,21 +106,21 @@ PJ *PROJECTION(lcc) {
double ml1, m1;

m1 = pj_msfn(sinphi, cosphi, P->es);
ml1 = pj_tsfn(Q->phi1, sinphi, P->e);
if( ml1 == 0 ) {
if( abs(Q->phi1) == M_HALFPI ) {
Member:

I believe those checks were added because ossfuzz came up with odd PROJ strings (mostly degenerate ellipsoids) where ml1 would evaluate to 0 but abs(phi1) != M_HALFPI. I don't recall if phi1 was close to half pi. Same for ml2 below.

I'm wondering how exact equality testing against half pi is useful? It is generally considered not a good idea to use exact equality for floating point numbers.

Contributor Author:

The "normal" way ml is zero is with phi = pi/2. phi = -pi/2 which from
the perspective of projections should be the equivalent situation gives
ml = inf (or maybe just a large number). This is why the test is best
done on phi instead of ml.

It's likely that many things come apart the very degenerate ellipsoids.
I can see the need for an up front check on the flattening (e.g., f <
0.99) and that would probably be preferable to junking up the code with

The mantra that "comparing floating point numbers for equality is bad"
is now somewhat outdated. Since IEEE floating point has become the
norm, floating point arithmetic is much more predictable (and accurate)
and equality tests are often OK. That's the case here, I would say.
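
For instance (illustrative only), the pole check compares a value that
was assigned, not computed:

#include <cmath>
#include <cstdio>

int main() {
    const double halfpi = 1.5707963267948966;  // stand-in for M_HALFPI
    double phi = halfpi;         // parameter set directly to the constant
    // Exact equality is dependable here: no arithmetic perturbed phi.
    printf("%d\n", fabs(phi) == halfpi);       // prints 1
    return 0;
}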

Member:

I can see the need for an up-front check on the flattening (e.g., f <
0.99) and that would probably be preferable to junking up the code with
special-case checks like this one.

Sounds good

equality tests are often OK. That's the case here, I would say.

definitely trusting you on this

static PJ_XY tobmerc_s_forward (PJ_LP lp, PJ *P) { /* Spheroidal, forward */
PJ_XY xy = {0.0, 0.0};
double cosphi;

if (fabs(fabs(lp.phi) - M_HALFPI) <= EPS10) {
if (fabs(lp.phi) >= M_HALFPI) {
// builtins.gie tests "Test expected failure at the poles:". However
Member:

I guess the test could be amended/removed if we have better accuracy now

Contributor Author:

Yes, this is possible. Best would be to have someone review the code for tobmerc and make a decision on how the near-pole situation should be handled (and document this). An implementation of the ellipsoidal version would be easy (however, it's clear that the projection is there "to make a point" and not for serious use).

// given that M_HALFPI is strictly less than pi/2 in double precision,
// it's not clear why we shouldn't just return a large result for xy.y (and
// it's not even that large, merely 38.025...). Even if the logic was
// such that phi was strictly equal to pi/2, allowing xy.y = inf would be
Member:

I'm not sure returning inf as a valid value is a good idea (especially since it is our HUGE_VAL error marker).

Contributor:

I think that makes #2376 more relevant. It would help differentiate between HUGE_VALs that are valid versus invalid.

Contributor Author:

I agree with @snowman2. However, the issue is moot. An inf return isn't on the cards in the present code base, because there's no way that phi can be set to exactly 90deg. The closest is pi/2 rounded to a double, which then gives asinh(tan(phi)) = 38, a distinctly non-large number. Things would change if the representation of angles were changed (e.g., after we get sinpi and cospi functions) so that the cardinal points were exactly represented. I don't see this happening in the short term.


return (tan (.5 * (M_HALFPI - phi)) /
pow((1. - sinphi) / (denominator), .5 * e));
double cosphi = cos(phi);
@rouault (Member), Oct 26, 2020:

if we need cos(phi) now, then we should probably alter the signature of pj_tsfn to provide this value. This will enable callers to implicitly use sincos() to compute both sin(phi) and cos(phi), which is faster than computing them in an isolated way.

Contributor Author:

I wonder about this. I recommend putting a higher priority on removing the calls to tsfn and phi2 by reformulating the algorithms not to use Snyder's flaky t = exp(-psi) variable (which leads to oddities like the N and S poles being treated differently).

src/phi2.cpp:
// if (e2m < 0) return std::numeric_limits<double>::quiet_NaN();
int i = numit;
for (; i; --i) {
double tau1 = sqrt(1 + tau * tau),
Member:

Same here: splitting onto several lines would make it more readable.

*
* returns atan(sinhpsi2tanphi(tau'))
***************************************************************************/
return atan(pj_sinhpsi2tanphi(ctx, (1/ts0 - ts0) / 2, e));
Member:

I remember I recently optimized pj_phi2() to use fewer transcendental functions to get speedups (#2052). At first sight, it looks like the new implementation will use more. Any idea of the perf impact?

Member:

Ah sorry, I didn't read your cover letter, where you explain that the faster convergence makes the computation faster globally.

Contributor Author:

Yes, cutting down on the number of iterations is the key. Also, I wonder... should pow count as 2 transcendental calls, log then exp? That then matches (sort of) the new code: atanh (log-ish) then sinh (exp-ish).

@rouault (Member) commented Oct 26, 2020

A shocking aspect of these timings is how fast etmerc is.

Are you really sure about your timings? 4e-6 ns per call for etmerc looks completely bogus to me.

@cffk (Contributor Author) commented Oct 26, 2020

A shocking aspect of these timings is how fast etmerc is.

Are you really sure about your timings? 4e-6 ns per call for etmerc looks completely bogus to me.

I agree. I looked long and hard at the timing code and can't fathom what's happening. Perhaps something is screwing up somehow. Anyway, this quirk of timing (affecting the etmerc method) is a side issue for now.

@rouault (Member) commented Oct 26, 2020

I looked long and hard at the timing code and can't fathom what's happening

Can you show your timing code? Isn't it possible, if you put it too close to the etmerc function and repeat the measurement on the same input value, that the compiler is smart enough to realize that the output is the same without side effects and just completely removes the loop?

@cffk (Contributor Author) commented Oct 26, 2020

I looked long and hard at the timing code and can't fathom what's happening

Can you show your timing code? Isn't it possible, if you put it too close to the etmerc function and repeat the measurement on the same input value, that the compiler is smart enough to realize that the output is the same without side effects and just completely removes the loop?

Even, you're right. The compiler outsmarted me. I know most of its
tricks that can give you bogus timings. I had

initialize big arrays of starting values for phi/psi

double xx = 0;

start timer1
loop over phi
  xx += psif(phi[i]);
stop timer1

start timer2
loop over psi
  xx += phif(psi[i]);
stop timer2

The "xx +=" is there to make the compiler actually call the function.
HOWEVER!!!, I neglected to include at the end

cout << "IGNORE " << xx << "\n";

So the compiler was skipping some of the function calls. (Not all of
them, because some have mutable variables for counting the number of
iterations.) I will post updated timing data...
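
For the record, a minimal harness that defeats this optimization
(psif/phif stand for the forward/inverse transforms; illustrative
only):

#include <chrono>
#include <cstdio>
#include <vector>

double psif(double phi);  // phi -> psi, assumed defined elsewhere
double phif(double psi);  // psi -> phi, assumed defined elsewhere

// Time f over the inputs; the accumulator *xx must be printed by the
// caller so the compiler cannot discard the calls as dead code.
static double time_ns(const std::vector<double>& in,
                      double (*f)(double), double* xx) {
    auto t0 = std::chrono::steady_clock::now();
    for (double v : in) *xx += f(v);
    auto t1 = std::chrono::steady_clock::now();
    return std::chrono::duration<double, std::nano>(t1 - t0).count()
           / in.size();
}

// Usage:
//   double xx = 0;
//   printf("fwd %g ns/call\n", time_ns(phis, psif, &xx));
//   printf("inv %g ns/call\n", time_ns(psis, phif, &xx));
//   printf("IGNORE %g\n", xx);  // the line I forgot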

@rouault (Member) commented Oct 26, 2020

So the compiler was skipping some of the function calls.

Ok, that's reassuring.

Side thought (no action needed for this PR): perhaps we should have a reference benchmark program? It could use proj_trans() so as to be realistic about the timings users will get when using the API.

@cffk (Contributor Author) commented Oct 26, 2020

Timing UPDATED

Sorry, the previous timing data was wrong. Here are corrected values.

Here's what I get with g++ -O3 on two Linux machines with recent
versions of g++. As always, you should take these with a grain of
salt. Times per call in ns = nanoseconds.

            Fedora 31    Ubuntu 18 
            g++-9.3.1    g++-7.5.0
            fwd   inv    fwd   inv
old merc    207   461    217   522
new merc    228   457    168   410
  etmerc    212   196    174   147

The new forward method is 10% slower (resp. 20% faster) on Fedora
31 (resp. Ubuntu 18). The new inverse method is the same speed (resp.
20% faster) on Fedora 31 (resp. Ubuntu 18).

Roughly speaking, the speed comparison is a wash. Maybe we should pay
more attention to the Fedora 31 results since these are with a newer
version of the compiler. I would still argue that a 20% time penalty
(which in a full PROJ pipeline would probably be no more than a 5%
penalty) would be a worthwhile price to pay for a more robust
implementation of the projection.

@cffk (Contributor Author) commented Oct 26, 2020

I'm not a fan of some things, like converting 1.0 to 1, which causes a hidden cast. These aren't critical by any means.

I come from the world of scientific writing where 1.0 is hardly ever used to represent 1 and, likewise, 0.5 is not typically used to represent 1/2. So I normally like to see sqrt(1 + x*x) because it's closer to the expression given in the paper. (Of course, I then have to remember to write 1/2.0. Life is full of little compromises.)

xy.x = P->k0 * lp.lam;
xy.y = - P->k0 * log(pj_tsfn(lp.phi, sin(lp.phi), P->e));
xy.y = P->k0 * (asinh(tan(lp.phi)) - P->e * atanh(P->e * sin(lp.phi)));
Member:

I'm wondering if the following wouldn't recover some speed:

Suggested change
xy.y = P->k0 * (asinh(tan(lp.phi)) - P->e * atanh(P->e * sin(lp.phi)));
const double cosphi = cos(lp.phi);
const double sinphi = sin(lp.phi);
xy.y = P->k0 * (asinh(sinphi/cosphi) - P->e * atanh(P->e * sinphi));

This should be optimized as a sincos() call instead of tan() + sin(). But perhaps the numerical accuracy of computing tan() this way is questionable?

Contributor Author:

Yup, this speeds the forward projection up by 20-30%. The accuracy is hardly affected: rms error increases from 0.30 nm to 0.33 nm, max error increases from 1.83 nm to 1.84 nm (a barely noticeable degradation). So now we have a modest speedup overall for the new method. Thanks.

BTW, I know gcc has this sincos optimization. Does Visual Studio have it too?

Member:

Does Visual Studio have it too?

Just found https://github.com/MicrosoftDocs/cpp-docs/blob/master/docs/overview/what-s-new-for-visual-cpp-in-visual-studio.md which mentions "Some of the compiler optimizations are brand new, such as [...] the combining of calls sin(x) and cos(x) into a new sincos(x)" since Visual Studio 2017 version 15.5

Contributor Author:

Times per call in ns = nanoseconds.

                Fedora 31    Ubuntu 18
                g++-9.3.1    g++-7.5.0
                fwd   inv    fwd   inv
    old merc    207   461    217   522
    new merc    159   457    137   410
      etmerc    212   196    174   147

The new forward method is now 25% faster (resp. 35% faster) on Fedora
31 (resp. Ubuntu 18). The new inverse method is the same speed (resp.
20% faster) on Fedora 31 (resp. Ubuntu 18).

The accuracy is hardly affected: rms error increases from 0.30 nm to
0.33 nm, max error increases from 1.83 nm to 1.84 nm (a barely
noticeable degradation).
@kbevers (Member) left a comment

Charles, this PR is very impressive. I appreciate the effort that has gone into this.

I'm a bit late to the game so most everything has been mentioned and dealt with already. Good job by everyone. Here are my few comments, which mostly just reiterate what's been said already:

  • I agree with Kurt and Even that multiple variable initialization in one statement is not a good practice.
  • I am mostly indifferent regarding 1.0 vs 1. The "scientific writing" approach is simpler to read and I trust that compilers are clever enough to handle the implicit conversions correctly, so I have a slight preference for that.
  • I am slightly worried that changing the wording of an error message might cause problems downstream. This is only a concern, though, if this PR is sneaked into 7.2.0RC2 tomorrow.

Regarding that last point, do you feel this has been tested enough to be included in 7.2.0? I am happy to include it so that these improvements will be available right away (next chance is in six months).

@cffk (Contributor Author) commented Oct 27, 2020

Regarding that last point, do you feel this has been tested enough to be included in 7.2.0? I am happy to include it so that these improvements will be available right away (next chance is in six months).

Let's not rush this into 7.2.0. No bugs are being fixed. It's just part of an ongoing effort to improve PROJ.
