Proj 6.1.1 to 8.0.1 performance issues and regressions #2785

Closed
rconde01 opened this issue Jul 22, 2021 · 2 comments

@rconde01

Example of problem

#include <filesystem>
#include <fstream>
#include <iostream>
#include <stdexcept>
#include <string>
#include <vector>

#include "gdal_priv.h"

enum class EPSGCodes {
    // https://epsg.io/4326
    WGS84_Horizontal = 4326,

    // https://epsg.io/4979
    WGS84_Vertical = 4979,

    // https://epsg.io/5773
    EGM96_Vertical = 5773
};

// Hold input data
struct data {
    std::vector<double> latitudes;
    std::vector<double> longitudes;
    std::vector<double> elevations;
};

// batches of data
std::vector<data> data_groups;

// Load the data we want to transform
void load_data(std::filesystem::path const & exe_dir){
    auto data_path = (exe_dir / "terrain_data.dat").string();

    std::ifstream f(data_path,std::ios::binary);

    if(!f)
        throw std::runtime_error("could not open " + data_path);

    while(true){
        int count{};

        // Stop at end of file rather than appending an empty trailing group
        if(!f.read(reinterpret_cast<char *>(&count),sizeof(count)))
            break;

        data_groups.push_back(data{});

        auto & lat = data_groups.back().latitudes;
        auto & lon = data_groups.back().longitudes;
        auto & elv = data_groups.back().elevations;

        lat.reserve(count);
        lon.reserve(count);
        elv.reserve(count);

        for(int i = 0; i < count; ++i){
            double v;

            f.read(reinterpret_cast<char *>(&v),sizeof(double));
            lat.push_back(v);

            f.read(reinterpret_cast<char *>(&v),sizeof(double));
            lon.push_back(v);

            f.read(reinterpret_cast<char *>(&v),sizeof(double));
            elv.push_back(v);
        }
    }
}

// transform the data
void transform_data(std::filesystem::path const & exe_dir){
    std::string proj_data_path = (exe_dir / "./proj").string();

    const char* proj_search_paths[] = {proj_data_path.c_str(), nullptr};

    OSRSetPROJSearchPaths(proj_search_paths);

    OGRSpatialReference source_horizontal;

    if(source_horizontal.importFromEPSG(
        static_cast<int>(EPSGCodes::WGS84_Horizontal)) != OGRERR_NONE)
        throw std::runtime_error("could not import EPSG:4326");

    OGRSpatialReference source_vertical;

    if(source_vertical.importFromEPSG(
        static_cast<int>(EPSGCodes::EGM96_Vertical)) != OGRERR_NONE)
        throw std::runtime_error("could not import EPSG:5773");

    OGRSpatialReference source_coordinate_system;

    if(source_coordinate_system.SetCompoundCS(
        "WGS84 Horizontal + EGM96 Vertical",
        &source_horizontal,
        &source_vertical) != OGRERR_NONE)
        throw std::runtime_error("could not build compound coordinate system");

    OGRSpatialReference dest_coordinate_system;

    if(dest_coordinate_system.importFromEPSG(
        static_cast<int>(EPSGCodes::WGS84_Vertical)) != OGRERR_NONE)
        throw std::runtime_error("could not import EPSG:4979");

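    auto transform = 
        OGRCreateCoordinateTransformation(
            &source_coordinate_system,
            &dest_coordinate_system);

    // OGRCreateCoordinateTransformation returns null on failure
    if(transform == nullptr)
        throw std::runtime_error("could not create coordinate transformation");

    for(auto & d : data_groups){
        transform->Transform(
            static_cast<int>(d.elevations.size()),
            d.latitudes.data(),
            d.longitudes.data(),
            d.elevations.data(),
            nullptr);
    }

    // The caller owns the transformation object
    OGRCoordinateTransformation::DestroyCT(transform);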
}

int main(int argc, char ** argv){
    std::filesystem::path exe_path(argv[0]);
    auto exe_dir = exe_path.parent_path();

    std::cout << "Load Data" << std::endl;
    load_data(exe_dir);

    std::cout << "Transform Data" << std::endl;
    transform_data(exe_dir);
}

Problem description

I compiled and ran the above example with Visual Studio 2019 against a preprocessed data file (~500 MB of lat/lon/elevation data). I ran it against GDAL/PROJ 3.0.1/6.1.1 and 3.2.1/8.0.1.

  • 3.0.1/6.1.1 takes 17.9 seconds to run
  • 3.2.1/8.0.1 takes 21.6 seconds to run
  • Loading the data takes ~3.3 seconds in both cases
  • So, excluding load time, the newer version is ~25% slower (18.3 s vs. 14.6 s)
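
For anyone wanting to reproduce the split between load time and transform time, here is a minimal sketch of how the two phases could be timed and compared. The helper names `time_seconds` and `slowdown_percent` are hypothetical and not part of the example above; the numbers fed to them would be the totals listed in the bullets.

```cpp
#include <cassert>
#include <chrono>
#include <utility>

// Run `work` once and return its wall-clock duration in seconds.
template <typename F>
double time_seconds(F && work){
    auto const start = std::chrono::steady_clock::now();
    std::forward<F>(work)();
    auto const stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double>(stop - start).count();
}

// Percentage by which `new_total` is slower than `old_total` once a shared
// fixed cost (here: the load time) has been subtracted from both.
double slowdown_percent(double old_total, double new_total, double fixed_cost){
    assert(old_total > fixed_cost);
    double const old_work = old_total - fixed_cost;
    double const new_work = new_total - fixed_cost;
    return 100.0 * (new_work - old_work) / old_work;
}
```

With the totals above, `slowdown_percent(17.9, 21.6, 3.3)` comes out at roughly 25%, and `time_seconds([&]{ transform_data(exe_dir); })` would time the transform phase on its own.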

The vTune profile from 6.1.1 is:
image

The vTune profile from 8.0.1 is:
image

Besides the performance regression, it is notable that about 25% of the runtime in both cases is spent just calling proj_errno_reset. I'm not sure whether anything can be done about that.

Environment Information

  • PROJ version (proj) : 6.1.1/8.0.1
  • Operating System Information: Windows 10

Installation method

  • from source
@rconde01 rconde01 added the bug label Jul 22, 2021
@rouault
Member

rouault commented Jul 22, 2021

Do the longitudes and latitudes cover the whole world (or a large extent), or a small extent? If they cover a large extent, are consecutive points in "random order" or are they generally grouped by location? I'm asking because us_nga_egm96_15.tif uses 256x256 tiles (18 tiles for the whole world), and PROJ has a cache of 12 tiles per file. Could you also expand the vTune profile for GTiffVGrid::valueAt?
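
To make the tile arithmetic behind that question concrete, here is a small sketch. It assumes the grid is 1440 x 721 pixels (a global 15-arc-minute raster); that size is an assumption and is not stated in this thread. The tile size (256) and the cache size (12) are the figures quoted above. The helper names are hypothetical and are not PROJ functions.

```cpp
#include <cassert>

// Number of tiles needed to cover `extent` pixels with square tiles of
// `tile_size` pixels (the last tile may be only partly filled).
int tiles_along(int extent, int tile_size){
    assert(extent > 0 && tile_size > 0);
    return (extent + tile_size - 1) / tile_size;
}

// Total tile count for a raster of `width` x `height` pixels.
int tile_count(int width, int height, int tile_size){
    return tiles_along(width, tile_size) * tiles_along(height, tile_size);
}

// True when the points touch more distinct tiles than the cache can hold,
// which is the situation where tiles could be evicted and re-read.
bool exceeds_cache(int touched_tiles, int cached_tiles){
    return touched_tiles > cached_tiles;
}
```

Under the assumed size, `tile_count(1440, 721, 256)` gives 6 x 3 = 18 tiles, which is more than the 12 that are cached. That is why a world-wide data set visited in random order could keep evicting tiles, while points confined to a small region should stay within one or a few tiles.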

@rconde01
Author

Here's a summary of the point-count, south-west, and north-east bounds for each data group:
terrain_regions.txt

It's a small region near 37 latitude, -119 longitude. It should be scanning west to east, south to north.

Here's the expanded detail in the profile:
image

rouault added a commit to rouault/PROJ that referenced this issue Jul 22, 2021
With this commit, and the 2 previous ones, given mytest.cpp
```
#include "proj.h"

int main()
{
    PJ* pj = proj_create(nullptr, "+proj=vgridshift +grids=us_nga_egm96_15.tif");
    if( pj == nullptr )
        return 1;
    for( int i = 0; i < 5*1000*1000; i++)
    {
        PJ_COORD coord;
        coord.lpz.lam = 0;
        coord.lpz.phi = 0;
        coord.lpz.z = 0;
        proj_trans(pj, PJ_FWD, coord);
    }
    proj_destroy(pj);
    return 0;
}
```

we get a 2x speedup

Before:
```
$ PROJ_LIB=data:$HOME/proj/PROJ-data/us_nga LD_LIBRARY_PATH=src/.libs  hyperfine --warmup 1 'taskset -c 11 ./mytest'
Benchmark #1: taskset -c 11 ./mytest
  Time (mean ± σ):      1.950 s ±  0.014 s    [User: 1.945 s, System: 0.005 s]
  Range (min … max):    1.937 s …  1.971 s
```

After:
```
$ PROJ_LIB=data:$HOME/proj/PROJ-data/us_nga LD_LIBRARY_PATH=src/.libs  hyperfine --warmup 1 'taskset -c 11 ./mytest'
Benchmark #1: taskset -c 11 ./mytest
  Time (mean ± σ):     984.4 ms ±   3.1 ms    [User: 977.0 ms, System: 7.2 ms]
  Range (min … max):   979.3 ms … 990.5 ms
```
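
As a rough reading of the two hyperfine runs, the means can be converted to a per-call cost and a speedup ratio. This is only a sketch of the arithmetic; `ns_per_call` and `speedup` are hypothetical helper names, and the inputs are the mean times and the 5,000,000 iterations from mytest.cpp.

```cpp
#include <cassert>

// Mean cost of one call in nanoseconds, given the total time in seconds
// and the number of calls made in that time.
double ns_per_call(double total_seconds, long long calls){
    assert(calls > 0);
    return total_seconds * 1e9 / static_cast<double>(calls);
}

// Ratio of the old runtime to the new one; values above 1 mean faster.
double speedup(double before_seconds, double after_seconds){
    assert(after_seconds > 0.0);
    return before_seconds / after_seconds;
}
```

With the means above, `ns_per_call(1.950, 5000000)` is about 390 ns before the change, `ns_per_call(0.9844, 5000000)` is about 197 ns after it, and `speedup(1.950, 0.9844)` is just under 2. These figures include process start-up and grid loading, so the per-call numbers are upper bounds.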
rouault added a commit that referenced this issue Jul 23, 2021
GeoTIFF grid reading: perf improvements (fixes #2785)
github-actions bot pushed a commit that referenced this issue Jul 23, 2021
GeoTIFF grid reading: perf improvements (fixes #2785)
rouault added a commit that referenced this issue Jul 23, 2021
[Backport 8.1] GeoTIFF grid reading: perf improvements (fixes #2785)
a0x8o added a commit to a0x8o/PROJ that referenced this issue Jul 23, 2021
GeoTIFF grid reading: perf improvements (fixes OSGeo#2785)