New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Extended tmerc (Poder/Engsager): speed optimizations #2036
Conversation
fwd: 52% faster on Core-i7@2.6GHz, 40% faster on GCE Xeon@2GHz inv: 24% faster on Core-i7@2.6GHz, 36% faster on GCE Xeon@2GHz Those speed-ups are obtained due to elimination of a number of transcendental functions (atan, sincos, cosh, sinh), through the use of usual trigonometric/hyperbolic formulas. The numeric results before/after seem identical at worse up to 14 decimal digits, which is way beyond the apparent accuracy of the computations On a point, far from central meridian, where round-tripping is not so great: Before: echo "81 1" | src/proj -d 18 +proj=utm +zone=31 | src/proj -d 18 -I +proj=utm +zone=31 80.999994286593448578 0.999987970918421620 After: $ echo "81 1" | src/proj -d 18 +proj=utm +zone=31 | src/proj -d 18 -I +proj=utm +zone=31 80.999994286593448578 0.999987970918422175 The benchmarking program used is: ``` #include "proj.h" #include <stdlib.h> #include <stdio.h> #include <string.h> int main(int argc, char* argv[]) { if( argc != 3 ) { fprintf(stderr, "Usage: bench quoted_proj_string fwd/inv\n"); exit(1); } PJ* p = proj_create(0, argv[1]); const int dir = strcmp(argv[2], "inv") == 0 ? PJ_INV : PJ_FWD; PJ_COORD coord; if( dir == PJ_FWD ) { coord.xy.x = 0.5; // rad coord.xy.y = 0.5; // rad } else { coord.xy.x = 450000; // m coord.xy.y = 5000000; // m } for(int i = 0; i < 40 * 1000* 1000; i++) proj_trans(p, dir, coord); proj_destroy(p); return 0; } ```
Good job with the speed-up. There is a potential for additional
So I'll try to benchmark the errors rigorously within the next week I wonder which of the changes you made really made the speed difference? |
Yes, perhaps, but if the additional round-off is several order of magnitude below the inaccuracy due to the truncation of the series, I hope we can accept it.
that's hard to tell. Benchmarking reliably tend to be quite difficult due to how modern CPU & OS work, so I didn't try all combinations of improvements.
Yes, compilers should hopefully be smart enough. That's more for us mortals to figure out more easily which are invariant and which are really variables. |
I'm rather expecting everything to be OK. However, it's good to check. For the record, the round-off error exceeds the truncation error for the extended tmerc (at order 6) within 4000 km of the central meridian (for WGS84). See Fig.2 at https://geographiclib.sourceforge.io/html/transversemercator.html#tmfigures |
I at last got around to checking the errors with this version of the code (speedup_poder_engsager_tmerc). There is no appreciable change in the errors compared to version 7.0.0. (The difference in the error is less than 1 nm.) For the record here is an abbreviated printout of the errors for the Poder/Engsager algorithms (errror in nm as a function of distance from the central meridian):
and for the Snyder/Evenden algorithm (errors now in um and as a function of longitude):
Finally here is the code I used to do the checking (so you can run checks with other modifications to the tm projection): // Compute absolute errors in PROJ implementations of the transverse Mercator
// projection. The compares the computations with GeographicLib's
// TransverseMercatorExact (which uses elliptic functions). GeographicLib
// should be configured to use long double as the working precision, e.g., with
//
// cmake -D GEOGRAPHICLIB_PRECISION=3
// -D CMAKE_INSTALL_PREFIX=/tmp/geographiclib-3
// ..
//
// so that round-off errors can be ignored. Then configure the compilation of
// this file with
//
// cmake -D CMAKE_PREFIX_PATH="/tmp/geographiclib-3;$PROJ_INSTALL" ..
//
// Errors are expressed as true meters (in the case of the forward projection,
// the error needs to be reduced by the scale of the projection). By default,
// the Poder/Engsager implemention in PROJ is used and the errors (in
// nanometers) are given as a function of distance from the central meridian.
// If the program is invoked with an argument "+approx", then the
// Snyder/Evenden implemention in PROJ is used and the errors (in micrometers)
// are given as a function of longitude from the central meridian.
#include <iostream>
#include <iomanip>
#include <string>
#include <GeographicLib/TransverseMercatorExact.hpp>
#include <GeographicLib/Geodesic.hpp>
#include <GeographicLib/Utility.hpp>
#include <proj.h>
int main(int argc, const char* const argv[]) {
using namespace std;
using namespace GeographicLib;
// for GEOGRAPHICLIB_PRECISION = 1, 2, 3, 4, 5, real is
// float, double, long double, boost::multiprecision::float128, mpfr::mpreal
typedef Math::real real;
Utility::set_digits();
cout << "GeographicLib precision = " << Math::digits() << " bits\n";
bool approx = argc > 1 && string(argv[1]) == "+approx";
cout << "Errors in PROJ tmerc ("
<< (approx ? "Snyder/Evenden" : "Poder/Engsager")
<< " formulation)\nas a function of "
<< (approx ? "longitude" : "distance")
<< " from central medidian\n";
// WGS84 parameters + central meridian
string astr = "6378137", rfstr= "298.257223563", lon0str = "0";
int nlat = 900; // sample every 0.01 deg in [0, 90] deg
// approx: sample longitude every 0.1 deg in [0, 15] deg
// !approx: sample distance every 100 km in [0, 9000] km
real d = approx ? 1/real(10) : 100000;
int n = approx ? 150 : 90;
real a = Utility::val<real>(astr), f = 1 / Utility::val<real>(rfstr),
lon0 = Utility::val<real>(lon0str), deg = Math::degree();
const TransverseMercatorExact tm(a, f, 1);
const Geodesic geod(a, f);
string projstr =
"+proj=tmerc +a=" + astr + " +rf=" + rfstr + " +lon_0=" + lon0str;
if (approx) projstr += " +approx";
PJ* p = proj_create(0, projstr.c_str());
cout << fixed << setprecision(1)
<< (approx ? "lon(deg) ferr(um) rerr(um)" :
"dist(km) ferr(nm) rerr(nm)")
<< "\n";
for (int i = 0; i <= n; ++i) { // loop over distances from lon = lon0
double ferr = 0, rerr = 0; // errors in forward and reverse projections
real D = i * d, azi = 90;
for (int ilat = 0; ilat <= nlat; ++ilat) { // loop over lats on lon = lon0
real lat0 = 90 * ilat/real(nlat), lat, lon, x, y, gamma, k, err;
if (approx) {
lat = lat0, lon = D;
} else {
geod.Direct(lat0, lon0, azi, D, lat, lon);
}
tm.Forward(lon0, lat, lon, x, y, gamma, k); // TM via GeographicLb
PJ_COORD lp, xy;
lp.lp.phi = double(lat * deg); lp.lp.lam = double(lon * deg);
xy = proj_trans(p, PJ_FWD, lp); // TM via PROJ
// divide xy error by scale to get true error
ferr = max(double(hypot(xy.xy.x - x, xy.xy.y - y) / k), ferr);
xy.xy.x = double(x); xy.xy.y = double(y);
lp = proj_trans(p, PJ_INV, xy); // inv TM via PROJ
// Use geodesic calculation to figure error (overkill for small errors)
geod.Inverse(lat, lon, real(lp.lp.phi) / deg, real(lp.lp.lam) / deg, err);
rerr = max(double(err), rerr);
}
D /= (approx ? 1 : 1000);
ferr *= (approx ? 1e6 : 1e9);
rerr *= (approx ? 1e6 : 1e9);
cout << D << " " << ferr << " " << rerr << endl;
}
proj_destroy(p);
} |
Thanks @cffk for the detailed analaysis. I've run it locally and of course confirm your findings. The difference between master and this pull request do not appear to be significant compared to the error of Poder/Engsager w.r.t the output of GeographicLib |
fwd: 52% faster on Core-i7@2.6GHz, 40% faster on GCE Xeon@2GHz
inv: 24% faster on Core-i7@2.6GHz, 36% faster on GCE Xeon@2GHz
Those speed-ups are obtained due to elimination of a number of
transcendental functions (atan, sincos, cosh, sinh), through the
use of usual trigonometric/hyperbolic formulas.
The numeric results before/after seem identical at worse up to
14 decimal digits, which is way beyond the apparent accuracy of the
computations
On a point, far from central meridian, where round-tripping is not so great:
Before:
echo "81 1" | src/proj -d 18 +proj=utm +zone=31 | src/proj -d 18 -I +proj=utm +zone=31
80.999994286593448578 0.999987970918421620
After:
$ echo "81 1" | src/proj -d 18 +proj=utm +zone=31 | src/proj -d 18 -I +proj=utm +zone=31
80.999994286593448578 0.999987970918422175
The benchmarking program used is: