BufReader should provide a way to seek without dumping the buffer #31100
Comments
(I'm in the process of writing my own BufReader implementation that doesn't do this.)
steveklabnik added the A-libs label Jan 26, 2016
neokril commented Jan 27, 2016
Currently BufReader saves only the offset within buffered data. Another variant is to make a separate function (e.g.
I'm not trying to avoid seeking the underlying stream, I'm trying to avoid dumping the buffer. In my implementation, small seeks that fit in the existing buffer cause a call to self.inner.seek(SeekFrom::Current(0)).
taralx changed the title from "BufReader's seek implementation is somewhat irritating for performance" to "BufReader should provide a way to seek without dumping the buffer" Jan 27, 2016
neokril commented Jan 28, 2016
@taralx Could you please share your implementation? You use self.inner.seek(SeekFrom::Current(0)) to get the current file offset. Am I correct?
Not without getting source approval from my employer, sorry. Yes, I implement
    let p = try!(self.pos());
    self.pos -= back;
    return p - back;
gcarq commented Oct 22, 2016
Some time ago I had a similar requirement. I extracted the implementation into its own library, seek_bufread; maybe it's useful for you.
steveklabnik added T-libs and removed A-libs labels Mar 24, 2017
Mark-Simulacrum added the C-feature-request label Jul 24, 2017
lolgesten commented Oct 14, 2017
I stumbled upon this today. When decoding mp4 movies, the decoder does small forward seeks, skipping in the region of 100 bytes at a time. I used a BufReader with 65k capacity to avoid my underlying reader doing HTTP requests for every seek, only to find that BufReader isn't helping at all. I think BufReader would be more useful if it also served a seek from the buffer when the target fits.
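To make the cost described here concrete, the sketch below counts how often the underlying reader is asked to seek while a caller reads one byte and then skips 100 bytes, ten times over. `CountingReader` and `count_inner_seeks` are hypothetical names invented for this illustration. `BufReader::seek_relative` is a real method in current std; treating it as the method this thread eventually produced is an assumption, because the thread's own references to it are cut off.

```rust
use std::io::{self, BufReader, Cursor, Read, Seek, SeekFrom};

/// Hypothetical wrapper that counts seeks reaching the underlying reader.
/// In the mp4-over-HTTP scenario, each of these would be a network request.
struct CountingReader<R> {
    inner: R,
    seeks: usize,
}

impl<R: Read> Read for CountingReader<R> {
    fn read(&mut self, buf: &mut [u8]) -> io::Result<usize> {
        self.inner.read(buf)
    }
}

impl<R: Seek> Seek for CountingReader<R> {
    fn seek(&mut self, pos: SeekFrom) -> io::Result<u64> {
        self.seeks += 1;
        self.inner.seek(pos)
    }
}

/// Reads one byte and skips 100 bytes, ten times, then reports how many
/// seeks were forwarded to the underlying reader.
fn count_inner_seeks(use_seek_relative: bool) -> io::Result<usize> {
    let counting = CountingReader {
        inner: Cursor::new(vec![0u8; 64 * 1024]),
        seeks: 0,
    };
    let mut reader = BufReader::with_capacity(64 * 1024, counting);
    let mut byte = [0u8; 1];
    for _ in 0..10 {
        reader.read_exact(&mut byte)?;
        if use_seek_relative {
            // Stays inside the buffer when the target is already buffered.
            reader.seek_relative(100)?;
        } else {
            // Always discards the buffer and seeks the underlying reader.
            reader.seek(SeekFrom::Current(100))?;
        }
    }
    Ok(reader.get_ref().seeks)
}
```

Calling `count_inner_seeks(true)` should report no forwarded seeks for this access pattern, while `count_inner_seeks(false)` forwards every skip to the underlying reader.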
dtolnay added C-feature-accepted and removed C-feature-request labels Nov 18, 2017
I would like to see a PR that implements
Diggsey referenced this issue Dec 19, 2017: Merged "BufRead: Only flush the internal buffer if seeking outside of it." #46832
bors added a commit that referenced this issue Jan 14, 2018
bors closed this in #46832 Jan 14, 2018
FWIW, this solution requires R:Seek. A
This should not be closed, the new method is still unstable. It’s been a couple months though: @rfcbot fcp merge
SimonSapin reopened this Mar 16, 2018
@SimonSapin to confirm, this is for the
Yes, that is the one item with
Hmm, maybe I needed to reopen the issue before making the command: @rfcbot fcp merge
rfcbot added the proposed-final-comment-period label Mar 17, 2018
rfcbot commented Mar 17, 2018
Team member @SimonSapin has proposed to merge this. The next step is review by the rest of the tagged teams: No concerns currently listed. Once a majority of reviewers approve (and none object), this will enter its final comment period. If you spot a major issue that hasn't been raised at any point in this process, please speak up! See this document for info about what commands tagged team members can give me.
rfcbot added the final-comment-period label Mar 18, 2018
rfcbot commented Mar 18, 2018
rfcbot removed the proposed-final-comment-period label Mar 18, 2018
rfcbot commented Mar 28, 2018
The final comment period is now complete.
Centril added disposition-merge, finished-final-comment-period and removed final-comment-period labels May 24, 2018
I agree with @dtolnay and don't think a separate method is needed for this. This can just be the behavior when calling
I think it could be done something like this:
    fn seek(&mut self, pos: SeekFrom) -> io::Result<u64> {
        let result: u64;
        if let SeekFrom::Current(n) = pos {
            if let Some(new_pos) = (self.pos as i64).checked_add(n) {
                if new_pos >= 0 && new_pos <= self.cap as i64 {
                    self.pos = new_pos as usize;
                    // The inner reader sits at the end of the buffered data, so
                    // subtract the bytes that are buffered but not yet consumed.
                    return self.inner.seek(SeekFrom::Current(0)).map(|p| p - (self.cap - self.pos) as u64);
                }
            }
        }
        // .. the rest is the same as before
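The bounds check and the position arithmetic in a sketch like this can be isolated into two small functions and checked on their own. This is an illustration only, not std's code: `in_buffer_target` and `logical_position` are hypothetical helper names, and `pos` and `cap` are assumed to mean the read index into the buffer and the number of valid buffered bytes.

```rust
/// Returns the new in-buffer index if a relative seek of `n` bytes lands
/// inside the buffered region `0..=cap`, otherwise `None`.
fn in_buffer_target(pos: usize, cap: usize, n: i64) -> Option<usize> {
    // checked_add guards against overflow for extreme offsets.
    let new_pos = (pos as i64).checked_add(n)?;
    if new_pos >= 0 && (new_pos as u64) <= cap as u64 {
        Some(new_pos as usize)
    } else {
        None
    }
}

/// The underlying reader is positioned at the end of the buffered data, so
/// the caller-visible position is that offset minus the unread bytes.
/// Assumes `pos <= cap` and `inner_pos >= (cap - pos)`.
fn logical_position(inner_pos: u64, pos: usize, cap: usize) -> u64 {
    inner_pos - (cap - pos) as u64
}
```

For example, with 100 buffered bytes and the read index at 10, a seek of -10 stays in the buffer (index 0) while a seek of -11 does not, and if the underlying reader is at offset 100 the caller-visible position is 10.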
I don't have any real cases in mind but changing the existing seek semantics could cause subtle bugs for people who are relying on it. E.g. if the user is expecting the data at an in-buffer offset to change in the underlying reader and is using seek to refresh the buffer, this optimization would cause it to yield stale data.
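The stale-data concern can be reproduced with an in-memory reader. The sketch below is a contrived illustration: `read_after_change` is a hypothetical name, and a `Cursor<Vec<u8>>` stands in for a file that some other party modifies. It relies on current std behaviour, where `seek` discards the buffer and `seek_relative` keeps it when the target is already buffered.

```rust
use std::io::{self, BufReader, Cursor, Read, Seek, SeekFrom};

/// Reads one byte, changes the next byte in the underlying data, then reads
/// again after either a buffer-discarding seek or a buffer-preserving one.
fn read_after_change(discard_buffer: bool) -> io::Result<u8> {
    let mut reader = BufReader::new(Cursor::new(vec![1u8, 2, 3, 4]));
    let mut byte = [0u8; 1];

    // The first read pulls all four bytes into the internal buffer.
    reader.read_exact(&mut byte)?;

    // Modify the underlying data behind the buffer's back.
    reader.get_mut().get_mut()[1] = 99;

    if discard_buffer {
        // Discards the buffer, so the next read goes back to the source.
        reader.seek(SeekFrom::Current(0))?;
    } else {
        // Keeps the buffer, so the next read is served from the old copy.
        reader.seek_relative(0)?;
    }

    reader.read_exact(&mut byte)?;
    Ok(byte[0])
}
```

With `discard_buffer` set the second read sees the updated byte; without it the reader returns the value that was buffered before the change.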
t-rapp commented Dec 5, 2018
I stumbled over this ticket when implementing a library that parses media files. What makes me uneasy about using It would be great to add an optimization path for this case (get current position) to the
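For the "get current position" case, current std offers `Seek::stream_position`, which `BufReader` answers without flushing its buffer. Whether that is the exact optimization being requested here is an assumption, since the comment is cut off. The sketch below compares it with `seek(SeekFrom::Current(0))`; `position_and_buffered` is a hypothetical name for this illustration.

```rust
use std::io::{self, BufReader, Cursor, Read, Seek, SeekFrom};

/// Reads three bytes from a ten-byte source, queries the position, and
/// returns (position, bytes still held in the buffer afterwards).
fn position_and_buffered(use_stream_position: bool) -> io::Result<(u64, usize)> {
    let data: Vec<u8> = (0u8..10).collect();
    let mut reader = BufReader::new(Cursor::new(data));
    let mut head = [0u8; 3];
    reader.read_exact(&mut head)?;

    let pos = if use_stream_position {
        // Reports the position while keeping the buffered bytes.
        reader.stream_position()?
    } else {
        // Reports the same position but discards the buffer first.
        reader.seek(SeekFrom::Current(0))?
    };

    Ok((pos, reader.buffer().len()))
}
```

Both variants report position 3; only the `stream_position` variant leaves the remaining seven bytes buffered.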
funchal commented Feb 9, 2019
Hi, I'm new to Rust, and I'm surprised by this behaviour. Can I ask why
Because that's the way to sync the BufReader's idea of where the file pointer is with the underlying reader. The other alternative would be to make
taralx commented Jan 22, 2016
BufReader's seek() implementation always dumps the buffer. This is not good for performance, where users may want to skip a variable amount of buffered data -- only BufReader really knows if it's reasonable to move the pointer or not.
Additionally, only BufReader can know if the pointer can be reversed or not -- so there's absolutely no cheap way to "unget" data from BufReader, even if you know it's seekable.
I'd recommend removing this behavior and moving it to unwrap() or another method (sync?), but it's now baked into a stable API.
What are the options now?
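One option in current std is a negative `BufReader::seek_relative`, which can act as a cheap "unget" as long as the bytes are still buffered. The sketch below is a minimal illustration with a hypothetical function name, `read_unget_read`; it assumes the underlying reader implements `Seek`, which `seek_relative` requires.

```rust
use std::io::{self, BufReader, Cursor, Read};

/// Reads four bytes, steps back over them, and reads them again.
/// Returns both reads and the number of bytes buffered after stepping back.
fn read_unget_read() -> io::Result<([u8; 4], [u8; 4], usize)> {
    let mut reader = BufReader::new(Cursor::new(b"abcdefgh".to_vec()));

    let mut first = [0u8; 4];
    reader.read_exact(&mut first)?;

    // Move the read index back inside the buffer; the buffer is kept.
    reader.seek_relative(-4)?;
    let buffered_after_unget = reader.buffer().len();

    let mut second = [0u8; 4];
    reader.read_exact(&mut second)?;

    Ok((first, second, buffered_after_unget))
}
```

Both reads return the same four bytes, and all eight source bytes are still buffered after stepping back, so the second read does not touch the underlying reader.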