Join GitHub today
GitHub is home to over 28 million developers working together to host and review code, manage projects, and build software together.
Sign upProposed EOF reinstatement. #903
Conversation
tfogal
referenced this pull request
Feb 24, 2015
Closed
BufRead::read_line is not as described in the io RFC #22588
jmesmon
reviewed
Feb 24, 2015
| number of bytes read or an error. If a short read occurs, that call | ||
| gives the (short) data and returns the number of bytes written into | ||
| that buffer. If the error was non-transient, future calls shall return | ||
| the reason for the error (usually `EndOfFile`). |
This comment has been minimized.
This comment has been minimized.
jmesmon
Feb 24, 2015
So now we'll need to wrap read_to_end in order to actually read to the end? Why not add a provided method called really_read_to_end?
Why doesn't a plain read() work for the use case that you're using this modified read_to_end() on?
This comment has been minimized.
This comment has been minimized.
tfogal
Feb 24, 2015
Point taken. Honestly, perhaps it would be better just to remove these methods and let them get handled at the user level. I would certainly be more in favor of that than standardizing something that was not vetted.
Anyway, the thought was that sometimes you might intend to read "everything that's there". Depending on how things are set up, those (system call) reads (i.e. Rust's implementation) might get some partial bytes and not actually block. Without this clause, the implementation would need to busy wait (or "busy read" i guess) to deal with non-blocking and non-terminating sources.
I would be much more in favor of dropping these from the trait and only providing them as specific methods on types that make sense (i.e. synchronous files).
This comment has been minimized.
This comment has been minimized.
mahkoh
Feb 24, 2015
Contributor
These methods worked perfectly fine in old_io. There is no reason to remove them.
This comment has been minimized.
This comment has been minimized.
jmesmon
commented
Feb 24, 2015
|
For what it's worth, the returning of 0 on EOF is similar in style to standard posix apis. Not sure we should really be looking at Win32 as an example of what to do in API design ("Well, mostly." doesn't inspire confidence). The use of slices is for efficiency purposes, if iterators were used most |
This comment has been minimized.
This comment has been minimized.
tfogal
commented
Feb 24, 2015
|
C's We probably agree w.r.t. win32; just trying to be thorough. I expected that the slice implementation would decay into a pointer under the hood and the implementation would thus get to read directly into it. If this isn't conceivable, first I'd cry, then I'd agree with you. |
alexcrichton
reviewed
Feb 24, 2015
| @@ -556,6 +556,15 @@ buffers as input. This has multiple benefits: | |||
| results are wanted, one can use `read_to_end` and convert to a | |||
| `String` only at the end. | |||
|
|
|||
| `read_to_end` and `read_to_string` return a `Result` with either the | |||
| number of bytes read or an error. If a short read occurs, that call | |||
This comment has been minimized.
This comment has been minimized.
alexcrichton
Feb 24, 2015
Member
In this context, what would you define as a "short read"? If you're asking for the entire contents of the stream then a "short read" implies you already know how many bytes are in the stream in the first place.
Additionally, below you mention a "non-transient" error, but how do we classify errors as transient or not transient?
This comment has been minimized.
This comment has been minimized.
tfogal
Feb 24, 2015
In this context, what would you define as a "short read"? If you're asking for the entire
contents of the stream then a "short read" implies you already know how many bytes are in
the stream in the first place.
We talked about this above, I conceded the point. But to reiterate, the intent was to allow the implementation not to implement a busy loop if given a non-blocking source.
Additionally, below you mention a "non-transient" error, but how do we classify errors as
transient or not transient?
Not sure it's worth discussing, given the above, but: the intent was to give liberty to an implementation to just return whatever it got from the syscall.
Anyway, we've established that this solution sucks, but I still believe there should be something that prevents these from getting called with non-blocking sources, because it can't have any reasonable semantics for that case.
This comment has been minimized.
This comment has been minimized.
alexcrichton
Feb 25, 2015
Member
We talked about this above, I conceded the point.
Ah ok, do you want to update the RFC with the recent comments?
alexcrichton
reviewed
Feb 24, 2015
| fn flush(&mut self) -> Result<(), Error>; | ||
| fn write_all(&mut self, buf: &[u8]) -> Result<(), Error> { .. } | ||
| fn write_fmt(&mut self, fmt: &fmt::Arguments) -> Result<(), Error> { .. } | ||
| fn write_all(&mut self, buf: &[u8]) -> Result<usize, Error> { .. } |
This comment has been minimized.
This comment has been minimized.
alexcrichton
Feb 24, 2015
Member
This is a crucial part of the design of write_all that it fails if the entire buffer could not be written. Otherwise there are no real semantic wins over the write function and this ends up just largely being a duplicate.
This comment has been minimized.
This comment has been minimized.
alexcrichton
reviewed
Feb 24, 2015
| @@ -1009,6 +1021,13 @@ cross-platform compatibility (in the case of `read_line`) and for ease | |||
| in copying data without loss (in particular, distinguishing whether | |||
| the last line included a final delimiter). | |||
|
|
|||
| `EINTR`-based errors are automatically restarted by `read_until` | |||
| and `read_line`. Short reads give a partial buffer and return the | |||
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
tfogal
Feb 24, 2015
e.g. for read_line, it might not represent a line in the sense that it might not ever see a '\n'.
Or were you hinting that it was clear from discussion but not in the doc (and I should update the commit)?
aturon
self-assigned this
Mar 5, 2015
This comment has been minimized.
This comment has been minimized.
|
Thanks for this RFC! Note that the method signatures have now all been updated to return The other proposal of the RFC is to produce a separate notion of EOF error after the first time a zero byte result is returned. This seems problematic for a few reasons:
Note that in particular, with the current IO setup, it works well to use With all of that said, and the signature change already in place, I'm going to close out this RFC. Thanks again for bringing these issues to the community's attention! |
tfogal commentedFeb 24, 2015
Precedence: C, C++, Go, and Java all return EOFs as error codes when
the stream ends. Java includes higher-level APIs that return nulls,
about as close to the current Rust implementation as can be, given the
language differences.
Precedence: Win32 APIs and Windows-based stuff in general tends to
return 'success' and an indicator that it read 0 bytes when such calls
fail. Well, mostly. Asynchronous calls are weird, and if you supply
one of the parameters that is commonly null, then Win32's "ReadFile"
will actually start to give you EOFs.
Personally, I have a strong preference for using EOFs a la "the C way"
everywhere. Rust being sold as a systems programming language makes me
expect this kind of behavior. Hearing that the IO reforms are trying
to simplify and not implement the union of things between platforms
again implies to me that this will work in the "standard" way.
To be honest, I'm not sure I even like this. It seems strange that we have
read_to_endandread_to_string, and furthermore thatreadstaticallytakes a &mut [u8]. Why don't these methods take an iterator (or is it a
Cursorin this nomenclature?) and the specifics of whether that goes into au8 slice or a Vec or a String or a user-implemented-type-T get handled
by the iterator implementation?
Also, side bug fixed:
writewas returning a uint, but apparentlyusizeis thenew
uint.