Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add back preadv2 optimization for try_acquire on Linux #90

Open
wants to merge 12 commits into
base: main
Choose a base branch
from

Conversation

NobodyXu
Copy link
Contributor

Fixed #87

@NobodyXu
Copy link
Contributor Author

cc @the8472 can you check the syscall I added please?

I'm not sure if it is correct, thank you!

Signed-off-by: Jiahao XU <Jiahao_XU@outlook.com>
@NobodyXu NobodyXu force-pushed the reintroduce-preadv2-optimization branch from 5a43208 to c65cd8e Compare April 22, 2024 13:53
Signed-off-by: Jiahao XU <Jiahao_XU@outlook.com>
Signed-off-by: Jiahao XU <Jiahao_XU@outlook.com>
src/unix.rs Outdated Show resolved Hide resolved
src/unix.rs Outdated Show resolved Hide resolved
src/unix.rs Outdated Show resolved Hide resolved
Copy link
Member

@the8472 the8472 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe we can simplify things by limiting to aarch64, x86_64 linux gnu? Those are probably the most common environments where stuff is built. And it's mostly a perf optimization, so we can be conservative in which platforms get it.

src/unix.rs Show resolved Hide resolved
Copy link

@workingjubilee workingjubilee left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what are the numbers on this optimization like?

src/unix.rs Show resolved Hide resolved
src/unix.rs Outdated Show resolved Hide resolved
src/unix.rs Outdated Show resolved Hide resolved
@NobodyXu
Copy link
Contributor Author

what are the numbers on this optimization like?

It would enable try_acquire support on cases where:

  • jobserver gets an annoymous pipe instead of named fifo
  • procfs is not available
    Without preadv2, try_acquire would return a not-supported error.

On performance, it's mostly for avoiding performance loss with acquire when jobserver receives a named fifo or procfs is available (which jobserver would use to convert pipe to fifo).

Without preadv2,try_acquire would set the fifo to non-blocking, which would cause acquire to enter poll + read if the fifo is empty, which is less efficient than a blocking read.

@NobodyXu
Copy link
Contributor Author

Maybe we can simplify things by limiting to aarch64, x86_64 linux gnu? Those are probably the most common environments where stuff is built.

Well, from the glibc implementation, it looks like there're only 3 versions for it, so maybe we can still bare with the cost?

And it's mostly a perf optimization, so we can be conservative in which platforms get it.

While it's mostly a perf optimization, there is situation where preadv2 actually enable try_acquire to work properly instead of returning a not-supported error:

  • jobserver gets an annoymous pipe instead of named fifo
  • procfs is not available
    Without preadv2, try_acquire would return a not-supported error.

Signed-off-by: Jiahao XU <Jiahao_XU@outlook.com>
Signed-off-by: Jiahao XU <Jiahao_XU@outlook.com>
Signed-off-by: Jiahao XU <Jiahao_XU@outlook.com>
@weihanglo
Copy link
Member

what are the numbers on this optimization like?

It would enable try_acquire support on cases where:

* jobserver gets an annoymous pipe instead of named fifo

* procfs is not available
  Without `preadv2`, `try_acquire` would return a not-supported error.

On performance, it's mostly for avoiding performance loss with acquire when jobserver receives a named fifo or procfs is available (which jobserver would use to convert pipe to fifo).

Without preadv2,try_acquire would set the fifo to non-blocking, which would cause acquire to enter poll + read if the fifo is empty, which is less efficient than a blocking read.

Thanks @NobodyXu for the work. Believing that people already know the benefit of this for fixing unsupported situations. What Jubilee asked for are numbers. I think that's a reasonable request for a change aiming at optimization. Could you do some benchmark and show the number, so we can have a baseline in mind? It is also beneficial to prevent from future performance regression if we have the reproducible benchmark steps (if you don't mind sharing).

Signed-off-by: Jiahao XU <Jiahao_XU@outlook.com>
@NobodyXu NobodyXu force-pushed the reintroduce-preadv2-optimization branch from ab7b8db to 5cf22b5 Compare April 25, 2024 12:41
Co-authored-by: Jubilee

Signed-off-by: Jiahao XU <Jiahao_XU@outlook.com>
Signed-off-by: Jiahao XU <Jiahao_XU@outlook.com>
Signed-off-by: Jiahao XU <Jiahao_XU@outlook.com>
@NobodyXu
Copy link
Contributor Author

NobodyXu commented Apr 25, 2024

What Jubilee asked for are numbers. I think that's a reasonable request for a change aiming at optimization. Could you do some benchmark and show the number, so we can have a baseline in mind? It is also beneficial to prevent from future performance regression if we have the reproducible benchmark steps (if you don't mind sharing).

I could add one in benches/, it would first call try_acquire on Linux and then call acquire.
Might have to add criterion as a dev-dep though.

Edit:

Would do that tomorrow.

@the8472
Copy link
Member

the8472 commented Apr 25, 2024

Note that using blocking reads in acquire is mostly beneficial when there are many concurrent waiters. A single waiter won't see much of a speedup (perhaps a little from the avoided syscalls).

See #30 and torvalds/linux@0ddad21, the latter one has numbers.

@workingjubilee
Copy link

Oh, cute, so it is in fact a kernel optimization specifically for this protocol!

src/unix.rs Show resolved Hide resolved
src/unix.rs Outdated Show resolved Hide resolved
src/unix.rs Outdated Show resolved Hide resolved
src/unix.rs Outdated Show resolved Hide resolved
@petrochenkov
Copy link
Contributor

Can the code generally minimize amount of #[cfg(...)] in favor of cfg!(...)?
It will get more checking and less surprises on remaining platforms.
#[cfg(...)] is not really needed that often, only when some names are unavailable on some platforms (and even then they can sometimes be defined to dummies, see e.g. how type RawFd is defined).

Signed-off-by: Jiahao XU <Jiahao_XU@outlook.com>
Signed-off-by: Jiahao XU <Jiahao_XU@outlook.com>
@NobodyXu
Copy link
Contributor Author

Can the code generally minimize amount of #[cfg(...)] in favor of cfg!(...)? It will get more checking and less surprises on remaining platforms.

Thanks, I've added the code in preadv2 to use cfg! instead of #[cfg(...)], it's actually more readable now.

@NobodyXu NobodyXu requested a review from the8472 April 29, 2024 13:19
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

preadv2 not found during linking
5 participants