Clarify performance impact of reentrant libqhull_r #85

Open
StefanBruens opened this issue Apr 3, 2021 · 1 comment

Comments

@StefanBruens

Currently, the documentation mentions a 1% to 2% performance penalty for the reentrant version compared with the non-reentrant one. This is a problem for two reasons:

  1. It scares dependent projects away from libqhull_r, so they stay with the deprecated libqhull.
  2. It is very likely incorrect.

Both versions read all data from the qhT struct as loads relative to its base pointer, so any difference has to be attributed to how that base pointer is obtained. In the reentrant version, the base pointer is passed explicitly as the first argument; in the non-reentrant version, the (relocatable) pointer has to be fetched from memory with a program-counter-relative load. This can be confirmed by analyzing the assembly of both variants.
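To make the two access patterns concrete, here is a minimal sketch. The struct `ctxT`, the global `g_ctx`, and both functions are hypothetical stand-ins invented for illustration; they are not the real qhT or any Qhull API.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for qhT; the real struct has many more fields. */
typedef struct ctxT {
    int hull_dim;
    int num_points;
} ctxT;

/* Non-reentrant style: a single global context.
 * When built as position-independent code for a shared library, the
 * address of g_ctx typically has to be obtained first (for example via
 * a PC-relative load) before any field can be read. */
ctxT g_ctx = { 3, 100 };

int coord_count_global(void) {
    return g_ctx.hull_dim * g_ctx.num_points;
}

/* Reentrant style: the context pointer is the first argument, so on
 * register-rich ABIs it normally arrives in a register and the fields
 * are read relative to it. */
int coord_count_reentrant(const ctxT *ctx) {
    assert(ctx != NULL);
    return ctx->hull_dim * ctx->num_points;
}
```

Compiling a file like this with something like `gcc -O2 -fPIC -S` and comparing the two function bodies is one way to check the claim on a given platform; the exact instructions will differ by compiler and target.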

Current architectures have enough registers to pass all function parameters in registers in the vast majority of cases, so register pressure and the extra loads/stores from/to the stack caused by a shortage of registers are no longer an issue (on ix86, this is likely different).

The qhT context address has to be restored on each function call: by the caller in the reentrant version, and by the callee in the non-reentrant version. The former can be, and according to the assembly code is, done as a register move, whereas the latter always involves a load from memory. A load from memory may be slower (higher latency), depending on cache state and instruction interleaving.
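The caller/callee distinction can be sketched with a small call chain. All names here (`demoT`, `demo_global`, `next_visit`, `next_visit_r`, and so on) are hypothetical and chosen only for illustration; they are not Qhull functions.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical context with a single counter field. */
typedef struct demoT {
    unsigned visit_id;
} demoT;

/* Non-reentrant style: every callee locates the global context itself. */
demoT demo_global = { 0 };

unsigned next_visit(void) {
    /* The callee has to obtain the address of demo_global on entry. */
    return ++demo_global.visit_id;
}

unsigned visit_twice(void) {
    next_visit();
    return next_visit();
}

/* Reentrant style: the caller hands the context pointer to each callee. */
unsigned next_visit_r(demoT *d) {
    assert(d != NULL);
    return ++d->visit_id;
}

unsigned visit_twice_r(demoT *d) {
    /* The caller only has to place d in the argument register again. */
    next_visit_r(d);
    return next_visit_r(d);
}
```

Both variants do the same work on the struct fields; they differ only in who supplies the context address. Note that an optimizing compiler may inline such small functions, in which case the difference can disappear entirely.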

Given that both libraries contain almost identical code, and the same amount of it, identical performance should be expected.

A static, non-relocatable version of the library can be faster, as the extra load of the context base pointer can be optimized away. This is not possible for any relocatable version.

@cbbarber
Collaborator

Thanks for your note and good comments. I'll update the documentation in the next release.

                               --Brad