Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Extended tmerc (Poder/Engsager): speed optimizations #2036

Merged
merged 1 commit into from Mar 29, 2020

Conversation

rouault
Copy link
Member

@rouault rouault commented Mar 9, 2020

fwd: 52% faster on Core-i7@2.6GHz, 40% faster on GCE Xeon@2GHz
inv: 24% faster on Core-i7@2.6GHz, 36% faster on GCE Xeon@2GHz

Those speed-ups are obtained due to elimination of a number of
transcendental functions (atan, sincos, cosh, sinh), through the
use of usual trigonometric/hyperbolic formulas.

The numeric results before/after seem identical at worse up to
14 decimal digits, which is way beyond the apparent accuracy of the
computations

On a point, far from central meridian, where round-tripping is not so great:

Before:
echo "81 1" | src/proj -d 18 +proj=utm +zone=31 | src/proj -d 18 -I +proj=utm +zone=31
80.999994286593448578 0.999987970918421620

After:
$ echo "81 1" | src/proj -d 18 +proj=utm +zone=31 | src/proj -d 18 -I +proj=utm +zone=31
80.999994286593448578 0.999987970918422175

The benchmarking program used is:

 #include "proj.h"
 #include <stdlib.h>
 #include <stdio.h>
 #include <string.h>

int main(int argc, char* argv[])
{
    if( argc != 3 ) {
        fprintf(stderr, "Usage: bench quoted_proj_string fwd/inv\n");
        exit(1);
    }
    PJ* p = proj_create(0, argv[1]);
    const int dir = strcmp(argv[2], "inv") == 0 ? PJ_INV : PJ_FWD;
    PJ_COORD coord;
    if( dir == PJ_FWD )
    {
        coord.xy.x = 0.5; // rad
        coord.xy.y = 0.5; // rad
    }
    else
    {
        coord.xy.x = 450000; // m
        coord.xy.y = 5000000; // m
    }
    for(int i = 0; i < 40 * 1000* 1000; i++)
        proj_trans(p, dir, coord);
    proj_destroy(p);
    return 0;
}

fwd: 52% faster on Core-i7@2.6GHz, 40% faster on GCE Xeon@2GHz
inv: 24% faster on Core-i7@2.6GHz, 36% faster on GCE Xeon@2GHz

Those speed-ups are obtained due to elimination of a number of
transcendental functions (atan, sincos, cosh, sinh), through the
use of usual trigonometric/hyperbolic formulas.

The numeric results before/after seem identical at worse up to
14 decimal digits, which is way beyond the apparent accuracy of the
computations

On a point, far from central meridian, where round-tripping is not so great:

Before:
echo "81 1" | src/proj -d 18 +proj=utm +zone=31 | src/proj -d 18 -I +proj=utm +zone=31
80.999994286593448578	0.999987970918421620

After:
$ echo "81 1" | src/proj -d 18 +proj=utm +zone=31 | src/proj -d 18 -I +proj=utm +zone=31
80.999994286593448578	0.999987970918422175

The benchmarking program used is:
```
 #include "proj.h"
 #include <stdlib.h>
 #include <stdio.h>
 #include <string.h>

int main(int argc, char* argv[])
{
    if( argc != 3 ) {
        fprintf(stderr, "Usage: bench quoted_proj_string fwd/inv\n");
        exit(1);
    }
    PJ* p = proj_create(0, argv[1]);
    const int dir = strcmp(argv[2], "inv") == 0 ? PJ_INV : PJ_FWD;
    PJ_COORD coord;
    if( dir == PJ_FWD )
    {
        coord.xy.x = 0.5; // rad
        coord.xy.y = 0.5; // rad
    }
    else
    {
        coord.xy.x = 450000; // m
        coord.xy.y = 5000000; // m
    }
    for(int i = 0; i < 40 * 1000* 1000; i++)
        proj_trans(p, dir, coord);
    proj_destroy(p);
    return 0;
}
```
@cffk
Copy link
Contributor

cffk commented Mar 10, 2020

Good job with the speed-up. There is a potential for additional
round-off errors in, e.g.,

const double cosh_arg_i = 2 * tmp_i - 1;

So I'll try to benchmark the errors rigorously within the next week
(comparing the results before/after to the exact -- elliptic function --
implementation with long double).

I wonder which of the changes you made really made the speed difference?
Surely the "const double" declarations don't tell the compiler anything
it didn't know already?

@rouault
Copy link
Member Author

rouault commented Mar 10, 2020

There is a potential for additional round-off errors

Yes, perhaps, but if the additional round-off is several order of magnitude below the inaccuracy due to the truncation of the series, I hope we can accept it.

I wonder which of the changes you made really made the speed difference?

that's hard to tell. Benchmarking reliably tend to be quite difficult due to how modern CPU & OS work, so I didn't try all combinations of improvements.

Surely the "const double" declarations don't tell the compiler anything

Yes, compilers should hopefully be smart enough. That's more for us mortals to figure out more easily which are invariant and which are really variables.

@cffk
Copy link
Contributor

cffk commented Mar 10, 2020

There is a potential for additional round-off errors

Yes, perhaps, but if the additional round-off is several order of magnitude below the inaccuracy due to the truncation of the series, I hope we can accept it.

I'm rather expecting everything to be OK. However, it's good to check. For the record, the round-off error exceeds the truncation error for the extended tmerc (at order 6) within 4000 km of the central meridian (for WGS84). See Fig.2 at https://geographiclib.sourceforge.io/html/transversemercator.html#tmfigures

@cffk
Copy link
Contributor

cffk commented Mar 28, 2020

I at last got around to checking the errors with this version of the code (speedup_poder_engsager_tmerc). There is no appreciable change in the errors compared to version 7.0.0. (The difference in the error is less than 1 nm.) For the record here is an abbreviated printout of the errors for the Poder/Engsager algorithms (errror in nm as a function of distance from the central meridian):

GeographicLib precision = 64 bits
Errors in PROJ tmerc (Poder/Engsager formulation)
as a function of distance from central medidian
dist(km) ferr(nm) rerr(nm)
0.0 3.2 2.6
...
3800.0 3.3 2.4
3900.0 4.6 1.9
4000.0 5.2 2.3
...
5000.0 33.4 2.7
...
6000.0 685.3 21.2
...
7000.0 32147.7 1052.3

and for the Snyder/Evenden algorithm (errors now in um and as a function of longitude):

GeographicLib precision = 64 bits
Errors in PROJ tmerc (Snyder/Evenden formulation)
as a function of longitude from central medidian
lon(deg) ferr(um) rerr(um)
0.0 4.9 4.9
...
2.0 4.9 5.1
...
2.5 5.8 6.0
...
3.0 15.5 10.1
...
4.0 76.1 46.4
...
5.0 276.1 179.3
...
6.0 826.3 595.0
...
7.0 2160.9 1841.9

Finally here is the code I used to do the checking (so you can run checks with other modifications to the tm projection):

// Compute absolute errors in PROJ implementations of the transverse Mercator
// projection.  The compares the computations with GeographicLib's
// TransverseMercatorExact (which uses elliptic functions).  GeographicLib
// should be configured to use long double as the working precision, e.g., with
//
// cmake -D GEOGRAPHICLIB_PRECISION=3
//       -D CMAKE_INSTALL_PREFIX=/tmp/geographiclib-3
//       ..
//
// so that round-off errors can be ignored.  Then configure the compilation of
// this file with
//
// cmake -D CMAKE_PREFIX_PATH="/tmp/geographiclib-3;$PROJ_INSTALL" ..
//
// Errors are expressed as true meters (in the case of the forward projection,
// the error needs to be reduced by the scale of the projection).  By default,
// the Poder/Engsager implemention in PROJ is used and the errors (in
// nanometers) are given as a function of distance from the central meridian.
// If the program is invoked with an argument "+approx", then the
// Snyder/Evenden implemention in PROJ is used and the errors (in micrometers)
// are given as a function of longitude from the central meridian.

#include <iostream>
#include <iomanip>
#include <string>
#include <GeographicLib/TransverseMercatorExact.hpp>
#include <GeographicLib/Geodesic.hpp>
#include <GeographicLib/Utility.hpp>
#include <proj.h>

int main(int argc, const char* const argv[]) {
  using namespace std;
  using namespace GeographicLib;
  // for GEOGRAPHICLIB_PRECISION = 1, 2, 3, 4, 5, real is
  // float, double, long double, boost::multiprecision::float128, mpfr::mpreal
  typedef Math::real real;
  Utility::set_digits();
  cout << "GeographicLib precision = " << Math::digits() << " bits\n";
  bool approx = argc > 1 && string(argv[1]) == "+approx";
  cout << "Errors in PROJ tmerc (" 
       << (approx ? "Snyder/Evenden" : "Poder/Engsager") 
       << " formulation)\nas a function of "
       << (approx ? "longitude" : "distance")
       << " from central medidian\n";
  // WGS84 parameters + central meridian
  string astr = "6378137", rfstr= "298.257223563", lon0str = "0";
  int nlat = 900;                // sample every 0.01 deg in [0, 90] deg
  // approx: sample longitude every 0.1 deg in [0, 15] deg
  // !approx: sample distance every 100 km in [0, 9000] km
  real d = approx ? 1/real(10) : 100000;
  int n = approx ? 150 : 90;
  real a = Utility::val<real>(astr), f = 1 / Utility::val<real>(rfstr),
    lon0 = Utility::val<real>(lon0str), deg = Math::degree();
  const TransverseMercatorExact tm(a, f, 1);
  const Geodesic geod(a, f);
  string projstr =
    "+proj=tmerc +a=" + astr + " +rf=" + rfstr + " +lon_0=" + lon0str;
  if (approx) projstr += " +approx";
  PJ* p = proj_create(0, projstr.c_str());
  cout << fixed << setprecision(1)
       << (approx ? "lon(deg) ferr(um) rerr(um)" :
                    "dist(km) ferr(nm) rerr(nm)")
       << "\n";
  for (int i = 0; i <= n; ++i) { // loop over distances from lon = lon0
    double ferr = 0, rerr = 0;  // errors in forward and reverse projections
    real D = i * d, azi = 90;
    for (int ilat = 0; ilat <= nlat; ++ilat) { // loop over lats on lon = lon0
      real lat0 = 90 * ilat/real(nlat), lat, lon, x, y, gamma, k, err;
      if (approx) {
        lat = lat0, lon = D;
      } else {
        geod.Direct(lat0, lon0, azi, D, lat, lon);
      }
      tm.Forward(lon0, lat, lon, x, y, gamma, k); // TM via GeographicLb
      PJ_COORD lp, xy;
      lp.lp.phi = double(lat * deg); lp.lp.lam = double(lon * deg);
      xy = proj_trans(p, PJ_FWD, lp);             // TM via PROJ
      // divide xy error by scale to get true error
      ferr = max(double(hypot(xy.xy.x - x, xy.xy.y - y) / k), ferr);
      xy.xy.x = double(x); xy.xy.y = double(y);
      lp = proj_trans(p, PJ_INV, xy); // inv TM via PROJ
      // Use geodesic calculation to figure error (overkill for small errors)
      geod.Inverse(lat, lon, real(lp.lp.phi) / deg, real(lp.lp.lam) / deg, err);
      rerr = max(double(err), rerr);
    }
    D /= (approx ? 1 : 1000);
    ferr *= (approx ? 1e6 : 1e9);
    rerr *= (approx ? 1e6 : 1e9);
    cout << D << " " << ferr << " " << rerr << endl;
  }
  proj_destroy(p);
}

@rouault
Copy link
Member Author

rouault commented Mar 29, 2020

Thanks @cffk for the detailed analaysis. I've run it locally and of course confirm your findings. The difference between master and this pull request do not appear to be significant compared to the error of Poder/Engsager w.r.t the output of GeographicLib

@rouault rouault merged commit 455e630 into OSGeo:master Mar 29, 2020
@rouault rouault added this to the 7.1.0 milestone Mar 29, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants