Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add functions to replace BufReader's reader #122666

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

Morganamilo
Copy link
Contributor

This allows the buffer of the BufRead to be used between multiple readers.

This allows the buffer of the BufRead to be used between multiple
readers.
@rustbot
Copy link
Collaborator

rustbot commented Mar 17, 2024

r? @Mark-Simulacrum

rustbot has assigned @Mark-Simulacrum.
They will have a look at your PR within the next two weeks and either review your PR or reassign to another reviewer.

Use r? to explicitly pick a reviewer

@rustbot rustbot added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. T-libs Relevant to the library team, which will review and decide on the PR/issue. labels Mar 17, 2024
@the8472
Copy link
Member

the8472 commented Mar 18, 2024

This

  • requires &mut
  • discards the buffer
  • then replaces the reader

At which point you can just do mem::replace(&mut existing_buf_reader, BufReader::new(new_reader))

@Morganamilo
Copy link
Contributor Author

That allocates a new buffer on the heap instead of reusing the allocation.

@the8472
Copy link
Member

the8472 commented Mar 18, 2024

Which likely isn't significant compared to the syscall overhead of opening the file.

And I also find discarding the buffer and the old reader dubious since it can lead to data being dropped. This can be important for streams (e.g. TCP) where the data can only be obtained once.

@Morganamilo
Copy link
Contributor Author

My specific use case is I have a few tens of thousand files that are a line based format that I'm parsing in a loop.

I'd like my parser to not make the same allocation over and over when it could easily use the same buffer. I don't expect this to be useful on streams but it seemed useful here and was counter intuitive that I couldn't here.

@workingjubilee
Copy link
Contributor

Do you have a benchmark suggesting this is a substantial win?

@Mark-Simulacrum
Copy link
Member

We already allow replacing the underlying reader (*BufReader::get_mut() = new_reader), so the only new functionality here is (a) allowing replacing with a different type (map) and (b) discarding the current buffer before doing so.

Discarding the buffer does not seem particularly useful to me for the known use case. Presumably, you want to exhaust the reader anyway, not accidentally drop contents. I would probably assert!(buf_reader.buffer().is_empty()) rather than call discard on it.

I think (a) is also not yet well-motivated, it feels like an even more rare use case. Box<dyn Read> / &mut dyn Read as the stored type are probably also fine for 95% of BufReaders given the intentionally rare calls to the inner type.

@Mark-Simulacrum Mark-Simulacrum added S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Apr 5, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. T-libs Relevant to the library team, which will review and decide on the PR/issue.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants