Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

H3 library is not inlined #23

Closed
Komzpa opened this issue Mar 25, 2020 · 8 comments
Closed

H3 library is not inlined #23

Komzpa opened this issue Mar 25, 2020 · 8 comments
Assignees
Labels
enhancement 🚀 New feature or request

Comments

@Komzpa
Copy link
Contributor

Komzpa commented Mar 25, 2020

Hi,

Investingating performance issues I found that H3 actively resists inlining.

objdump -D h3.so:

image

Simple math like *pi/180 is not getting inlined.

After some fight I got some pieces but haven't got to destination.

CMake supposedly enables inline like this:

cmake_minimum_required(VERSION 3.9)
cmake_policy(SET CMP0069 NEW)
set(CMAKE_INTERPROCEDURAL_OPTIMIZATION TRUE)

These guys helped a bit inside the h3-pg itself but haven't got me to let linktime optimizations to cross library boundary.

CFLAGS += -flto -O2
LDFLAGS += -flto -Wl,-Bsymbolic-functions -Wl,-z,relro

Any ideas help.

@zachasme
Copy link
Owner

Thanks for the PR, I'll try looking into it. Could it be related to the way h3 is built?

cmake \
  -DCMAKE_C_FLAGS=-fPIC \
  -DBUILD_TESTING=OFF \
  -DENABLE_COVERAGE=OFF \
  -DENABLE_DOCS=OFF \
  -DENABLE_FORMAT=OFF \
  -DENABLE_LINTING=OFF \
  ..

Is there a Release flag I'm missing or similar?

Thanks for opening the issue on h3 core as well.

@zachasme zachasme self-assigned this Mar 26, 2020
@zachasme zachasme added the enhancement 🚀 New feature or request label Mar 26, 2020
@Komzpa
Copy link
Contributor Author

Komzpa commented Mar 26, 2020

Per uber/h3#326 (comment) flag is -DCMAKE_BUILD_TYPE=Release, thanks @isaacbrodsky.

@Komzpa
Copy link
Contributor Author

Komzpa commented Mar 26, 2020

I've spent another couple of hours with it and give up for today.
Test is simple: after you build everything right, this should show empty. In non-inlined build it shows four callsites for square and the function def.
objdump -D h3.so | grep _square

@AbelVM
Copy link

AbelVM commented Apr 8, 2020

Per uber/h3#326 (comment) flag is -DCMAKE_BUILD_TYPE=Release, thanks @isaacbrodsky.

Why is this flag not actually used in the latest v3.6.2 release? It looks harmless, and the 2x/3x improvements shown at uber/h3#326 are huge

@zachasme
Copy link
Owner

zachasme commented Apr 8, 2020

Release v3.6.3 includes -DCMAKE_BUILD_TYPE=Release which should improve performance.

Still objdump -D h3.so | grep _square shows three callsites, so I'll keep this issue open until we figure out how to inline h3. Hopefully something good will come from uber/h3#326 but apart from that I have no ideas.

@Komzpa do you have any ideas for benchmarking the built extension? I would like to be able to compare changes like these (after being built into extension and not just h3 core directly).

@Komzpa
Copy link
Contributor Author

Komzpa commented Apr 8, 2020

@zachasme pgbench is an industry standard to benchmark query execution times in postgres.

@zachasme
Copy link
Owner

We should also keep an eye on uber/h3#360.

@zachasme
Copy link
Owner

Solved by #75

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement 🚀 New feature or request
Projects
None yet
Development

No branches or pull requests

3 participants