Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Excessive syscalls when using BufWriter #220

Open
meithecatte opened this issue Aug 20, 2023 · 2 comments
Open

Excessive syscalls when using BufWriter #220

meithecatte opened this issue Aug 20, 2023 · 2 comments
Labels
performance-issue Oops! Slowness occurred

Comments

@meithecatte
Copy link

I am writing to a BufWriter<File>. Unfortunately, std::io::BufWriter::stream_position falls back to running a seek syscall to figure out the current stream position. This makes my serialization workload take ~300x longer than necessary.

As a workaround, NoSeek<BufWriter<File>> works for my usecase. In fact, the values returned by stream_position are never used by the generated code. Emitting them only when necessary should improve the performance in many cases.

@csnover
Copy link
Collaborator

csnover commented Sep 21, 2023

Hi, sorry for the delay. Thanks for reporting this issue, it is symmetrical to the issue with std::io::BufReader where we provide a different implementation there for the same reason.

Could you provide an example of something that that emits a stream_position call where the value is never consumed? Mostly, stream position is emitted for the sake of error handling, so there should always be a consumer of the emitted value, even if it is never used most of the time because most of the time writes don’t fail.

Thanks!

@csnover csnover added the performance-issue Oops! Slowness occurred label Sep 22, 2023
@meithecatte
Copy link
Author

Hi, sorry for taking a while to get back to you. In the following example, the output of cargo expand has 3 occurrences of stream_position, always assigned to an entirely dead variable:

use binrw::binwrite;

#[binwrite]
struct Meow {
    a: u32,
    b: u32,
}

fn main() {
    println!("Hello, world!");
}

Tested with binrw@0.12.0 (latest at the time of writing).

csnover added a commit to csnover/binrw that referenced this issue Oct 25, 2023
This should increase the accuracy of positioning information in
errors and ensures `stream_position` calls will only be emitted
when the result may actually be used.

There are still too many `stream_position` calls in part because
there is no API to coordinate between parent and child objects to
prevent children from doing their own position management when the
parent is already doing it. Improving this will probably require
some kind of rudimentary stack tracking, or at least to split the
API so that `read_options` is an entrypoint and then there is e.g.
`read_inner` that does not try to restore position on error.

In future, it should be possible for authors to configure trading
off worse error handling with performance by e.g. making error
position `Option<u64>` and emitting `None`s instead of a variable
reference, and giving some option to not leave the stream in a
known good state on failure.

Refs jam1garner#220.
csnover added a commit to csnover/binrw that referenced this issue Oct 25, 2023
This should increase the accuracy of positioning information in
errors and ensures `stream_position` calls will only be emitted
when the result may actually be used.

There are still too many `stream_position` calls in part because
there is no API to coordinate between parent and child objects to
prevent children from doing their own position management when the
parent is already doing it. Improving this will probably require
some kind of rudimentary stack tracking, or at least to split the
API so that `read_options` is an entrypoint and then there is e.g.
`read_inner` that does not try to restore position on error.

In future, it should be possible for authors to configure trading
off worse error handling with performance by e.g. making error
position `Option<u64>` and emitting `None`s instead of a variable
reference, and giving some option to not leave the stream in a
known good state on failure.

Refs jam1garner#220.
csnover added a commit to csnover/binrw that referenced this issue Oct 26, 2023
This should increase the accuracy of positioning information in
errors and ensures `stream_position` calls will only be emitted
when the result may actually be used.

There are still too many `stream_position` calls in part because
there is no API to coordinate between parent and child objects to
prevent children from doing their own position management when the
parent is already doing it. Improving this will probably require
some kind of rudimentary stack tracking, or at least to split the
API so that `read_options` is an entrypoint and then there is e.g.
`read_inner` that does not try to restore position on error.

In future, it should be possible for authors to configure trading
off worse error handling with performance by e.g. making error
position `Option<u64>` and emitting `None`s instead of a variable
reference, and giving some option to not leave the stream in a
known good state on failure.

Refs jam1garner#220.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
performance-issue Oops! Slowness occurred
Projects
None yet
Development

No branches or pull requests

2 participants