Join GitHub today
GitHub is home to over 31 million developers working together to host and review code, manage projects, and build software together.
Sign upstd: Add `io` module again #21835
Conversation
rust-highfive
assigned
aturon
Feb 1, 2015
This comment has been minimized.
This comment has been minimized.
|
r? @aturon (rust_highfive has picked a reviewer for you, use r? to override) |
This comment has been minimized.
This comment has been minimized.
|
Note that rust-lang/rfcs#576 has not yet been merged, so this should hold off on merging until that happens. |
alexcrichton
referenced this pull request
Feb 1, 2015
Merged
Amend RFC 517: Revisions to reader/writer, core::io and std::io #576
alexcrichton
force-pushed the
alexcrichton:iov2
branch
from
0ae1533
to
7aaa7de
Feb 1, 2015
alexcrichton
reviewed
Feb 1, 2015
| unsafe { | ||
| let other = ptr::read(self); | ||
| *self = &mut other[amt..]; | ||
| } |
This comment has been minimized.
This comment has been minimized.
alexcrichton
Feb 1, 2015
Author
Member
I was unable to figure out a safe way of doing this, but I have a feeling one may be lurking somewhere, certainly open to suggestions!
This comment has been minimized.
This comment has been minimized.
eddyb
Feb 1, 2015
Member
This works (see in playpen):
fn cut<T>(xs: &mut &mut [T], amt: usize) {
let original = mem::replace(xs, &mut []);
mem::replace(xs, &mut original[amt..]);
}
This comment has been minimized.
This comment has been minimized.
nagisa
reviewed
Feb 1, 2015
| /// | ||
| /// The returned type implements `Iterator` where the `Item` is `Result<u8, | ||
| /// R::Err>`. The yielded item is `Ok` if a byte was successfully read and | ||
| /// `Err` otherwise for I/O errors. EOF is mapped to returning `None` for |
This comment has been minimized.
This comment has been minimized.
nagisa
reviewed
Feb 1, 2015
| /// | ||
| /// The returned instance of `Read` will yield all this object's bytes | ||
| /// until EOF is reached. Afterwards the bytes of `next` will be yielded | ||
| /// infinitely. |
This comment has been minimized.
This comment has been minimized.
nagisa
Feb 1, 2015
Contributor
Infinitely as even regardless of EOFs? Where will it get those infinite bytes from?
The returned
Readinstance will first read all bytes from this object until EOF is encountered. Afterwards the instance(’s output) is equivalent to thenext(’s output).
nagisa
reviewed
Feb 1, 2015
| /// throughout the I/O and related libraries take and provide types which | ||
| /// implement the `Write` trait. | ||
| pub trait Write { | ||
| /// Write a buffer into this object, returning how many bytes were read. |
This comment has been minimized.
This comment has been minimized.
nagisa
reviewed
Feb 1, 2015
| match self.write(buf) { | ||
| Ok(0) => return Err(Error::new(ErrorKind::EndOfFile, | ||
| "failed to write whole buffer: \ | ||
| eof reached", None)), |
This comment has been minimized.
This comment has been minimized.
nagisa
Feb 1, 2015
Contributor
This is Write, not Read. Ok(0) does not signify EOFs, does it? I’d use ErrorKind::Other or something here.
This comment has been minimized.
This comment has been minimized.
alexcrichton
Feb 1, 2015
Author
Member
Currently this is used to signify that the write could not complete because the "destination was full", which in some sense is a bit of an EOF condition. For example the main user of this is the Write impl for &mut [u8] which returns EOF when self is the empty slice.
That being said though, I can definitely see where having an EndOfFile variant of ErrorKind is somewhat confusing as it's not intended to be returned for functions like read.
This comment has been minimized.
This comment has been minimized.
aturon
Feb 3, 2015
Member
Yes, we definitely need to change this error variant, or we will have mass confusion.
nagisa
reviewed
Feb 1, 2015
|
|
||
| /// Enumeration of possible methods to seek within an I/O object. | ||
| #[derive(Copy, PartialEq, Eq, Clone, Debug)] | ||
| pub enum SeekFrom { |
This comment has been minimized.
This comment has been minimized.
nagisa
Feb 1, 2015
Contributor
This doesn’t match the current version of RFC. Did somebody forgot to update it?
This comment has been minimized.
This comment has been minimized.
alexcrichton
Feb 1, 2015
Author
Member
Indeed! I adopted the proposed renaming ahead of time as I think we'll take it. I'll update this to whatever ends up being settled upon, however.
nagisa
reviewed
Feb 1, 2015
|
|
||
| /// Read all bytes until the byte 0xA ('\n') is reached. | ||
| /// | ||
| /// This function will ;consume all bytes in the underlying stream until a |
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
mahkoh
Feb 1, 2015
Contributor
This is not good because
- It will likely read more bytes from the underlying stream
- Not all of those bytes will be appended
nagisa
reviewed
Feb 1, 2015
| /// Cursors are not currently generic over the type contained within, but may | ||
| /// become so. | ||
| pub struct Cursor<T> { | ||
| pos: u64, |
This comment has been minimized.
This comment has been minimized.
mahkoh
reviewed
Feb 1, 2015
| /// | ||
| /// This writer will be flushed when it is dropped. | ||
| pub struct BufWriter<W> { | ||
| inner: Option<W>, |
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
alexcrichton
Feb 1, 2015
Author
Member
It can indeed. That is currently being tracked in #21729. I would like to investigate the performance impact of having Option<W> vs W and didn't want to go too much into the benchmarking details on this PR. This is an implementation detail which shouldn't affect the API and we can always update this in the future.
mahkoh
reviewed
Feb 1, 2015
| fn flush_buf(&mut self) -> io::Result<usize> { | ||
| let len = self.buf.len(); | ||
| let n = try!(self.inner.as_mut().unwrap().write(&self.buf[])); | ||
| // FIXME would be better expressed as .remove(0..n) if it existed |
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
mahkoh
reviewed
Feb 1, 2015
| } | ||
| } | ||
|
|
||
| fn flush(&mut self) -> io::Result<()> { |
This comment has been minimized.
This comment has been minimized.
mahkoh
Feb 1, 2015
Contributor
/// Flush this output stream, ensuring that all intermediately buffered contents reach their destination.
This implementation is wrong because flush_buf does not ensure this.
This comment has been minimized.
This comment has been minimized.
mahkoh
Feb 1, 2015
Contributor
flush should probably return an error if flush_buf doesn't write all bytes.
mahkoh
reviewed
Feb 1, 2015
| fn drop(&mut self) { | ||
| if self.inner.is_some() { | ||
| // dtors should not panic, so we ignore a panicked flush | ||
| let _ = self.flush_buf(); |
This comment has been minimized.
This comment has been minimized.
mahkoh
reviewed
Feb 1, 2015
|
|
||
| impl<W: Write> Write for LineWriter<W> { | ||
| fn write(&mut self, buf: &[u8]) -> io::Result<usize> { | ||
| match buf.rposition_elem(&b'\n') { |
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
alexcrichton
Feb 1, 2015
Author
Member
I agree that this is definitely an area where we can have future optimizations. For now I think we may want to track this in an issue because we're not exposing memchr in the standard library (or memrchr). I tried to move instances of .iter().position(..) to position_elem or rposition_elem so they're easy to flag and migrate once we have them implemented.
One possibility would be to expand the slice::bytes module with some functions like this as a place to house the optimized versions.
This comment has been minimized.
This comment has been minimized.
huonw
Feb 2, 2015
Member
memrchr seems to be a GNU extension, and isn't necessarily available cross-platform.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
eddyb
Feb 2, 2015
Member
We could at least have the assembly implementations bundled with Rust to not depend on system ones - however, that's subpar compared with LLVM handling them - but it could be cumbersome for LLVM to keep track of all these intrinsics, maybe what it needs is more general loop idioms that optimize down to instruction similar to the manual versions.
mahkoh
reviewed
Feb 1, 2015
| let n = try!(self.inner.write(&buf[..i + 1])); | ||
| if n != i + 1 { return Ok(n) } | ||
| try!(self.inner.flush()); | ||
| self.inner.write(&buf[i + 1..]) |
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
mahkoh
reviewed
Feb 1, 2015
| // currently are | ||
| let difference = pos as i64 - len as i64; | ||
| if difference > 0 { | ||
| self.get_mut().extend(repeat(0).take(difference as usize)); |
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
alexcrichton
Feb 1, 2015
Author
Member
I definitely agree that memset would likely do the trick here, but for now I would prefer to avoid unnecessary unsafe code and this is a well known problem which needs an idiomatic solution no matter what.
For now I'd prefer to postpone optimizing this aspect in particular until we have a more general method of allowing the optimization to happen on Vec natively.
mahkoh
reviewed
Feb 1, 2015
| } | ||
| } | ||
| impl Write for Cursor<Vec<u8>> { | ||
| fn write(&mut self, buf: &[u8]) -> io::Result<usize> { |
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
alexcrichton
Feb 1, 2015
Author
Member
It would be more helpful to me at least for this to be a little more constructive. Do you have some suggestions about how the method could be trimmed down?
This comment has been minimized.
This comment has been minimized.
mahkoh
Feb 1, 2015
Contributor
Something like this:
fn write(&mut self, buf: &[u8]) -> io::Result<usize> {
let pos = self.pos as usize;
// Insert skipped zeros.
let tail = pos.saturating_sub(self.inner.len());
self.inner.extend(repeat(0).take(tail));
// Make room for buf.
let overhead = (pos + buf.len()).saturating_sub(self.inner.len());
self.inner.reserve(overhead);
unsafe {
let len = self.inner.len();
self.inner.set_len(len + overhead);
}
// Copy buf.
let dst = &mut self.inner[pos..];
slice::bytes::copy_memory(dst, buf);
self.pos += buf.len() as u64;
Ok(buf.len())
}
This comment has been minimized.
This comment has been minimized.
alexcrichton
Feb 1, 2015
Author
Member
Cool! I think that the size of this method originally stemmed from trying to micro-optimize it because Write for Vec<u8> didn't exist (and it was the only write-into-memory writer), but I agree that lots of the branches can be removed.
I've adapted what you've posted to use memcpy twice instead of once to avoid the use of unsafe, but if it becomes a bottleneck we can certainly optimize it a bit more.
mahkoh
reviewed
Feb 1, 2015
| } | ||
|
|
||
| /// Creates a new instance of an `Error` from a particular OS error code. | ||
| pub fn from_os_error(code: i32) -> Error { |
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
alexcrichton
Feb 1, 2015
Author
Member
Our conventions dictate that to_foo is a function which returns Foo of some form. In this case the Error isn't necessarily an OsError, so I wouldn't expect the os_error suffix of the to variant.
Our conventions also say that from_foo means that the function takes a Foo and returns a Self, which is largely what this method is doing. That being said this is saying that an "os error" is simply an instance of i32, which we may want to change in the future (with an explicit OsError), so it may not be the best name for this method.
This comment has been minimized.
This comment has been minimized.
mahkoh
Feb 1, 2015
Contributor
I actually meant that there should be an inverse to_os_error function that returns the underlying error code, if any, for C interop.
This comment has been minimized.
This comment has been minimized.
alexcrichton
Feb 3, 2015
Author
Member
I think for now I'm going to try to take a relatively conservative approach with io::Error and we can try to expand it later on.
mahkoh
reviewed
Feb 1, 2015
| } | ||
|
|
||
| /// Returns a detailed error message for this error (if one is available) | ||
| pub fn detail(&self) -> Option<String> { |
This comment has been minimized.
This comment has been minimized.
mahkoh
reviewed
Feb 1, 2015
| let n = try!(f({ | ||
| let base = v.as_mut_ptr().offset(v.len() as isize); | ||
| black_box(slice::from_raw_mut_buf(mem::copy_lifetime(v, &base), | ||
| v.capacity() - v.len())) |
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
mahkoh
reviewed
Feb 1, 2015
| unsafe fn black_box<T>(dummy: T) -> T { | ||
| asm!("" : : "r"(&dummy)); | ||
| dummy | ||
| } |
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
eddyb
Feb 1, 2015
Member
This is passing a shared reference to an (otherwise immutable) stack slot. The only reason it works is that LLVM doesn't tell the difference - it's like using const in C when you clearly want to pretend mutation is happening.
Aside from telling LLVM about its immutability, an immediate T with no unsafe interior is safe to promote to a SSA value when debuginfo isn't required, and I have implemented this optimization in all of of my trans refactoring branches.
I highly suggest passing &mut dummy to asm!, in addition to the "memory" clobber suggested by @mahkoh.
mahkoh
reviewed
Feb 1, 2015
| /// | ||
| /// If this function encounters any form of I/O or other error, an error | ||
| /// variant will be returned. If an error is returned then it is guaranteed | ||
| /// that no bytes were read successfully. |
This comment has been minimized.
This comment has been minimized.
mahkoh
reviewed
Feb 1, 2015
| /// | ||
| /// If the return value of this method is `Ok(n)`, then it must be | ||
| /// guaranteed that `0 <= n <= buf.len()`. If `n` is `0` then it indicates | ||
| /// that this reader as reached "end of file" and will likely no longer be |
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
mahkoh
Feb 1, 2015
Contributor
There are well known examples where 0 does not mean that further reads will also return 0. Maybe this should not talk about what's "likely" or "unlikely".
mahkoh
reviewed
Feb 1, 2015
| /// is reached, at which point this function will immediately return. | ||
| /// | ||
| /// This function will only successfully return if an invocation of `Ok(0)` | ||
| /// succeeds at which point all bytes have been read into `buf`. |
This comment has been minimized.
This comment has been minimized.
mahkoh
Feb 1, 2015
Contributor
an invocation of
Ok(0)
Maybe my English is failing me but this doesn't make much sense.
at which point all bytes have been read into
buf
All bytes of what?
mahkoh
reviewed
Feb 1, 2015
| /// | ||
| /// If any other read error is encountered then this function immediately | ||
| /// returns. Any bytes which have already been read will be present in | ||
| /// `buf` and the length will be adjusted appropriately. |
This comment has been minimized.
This comment has been minimized.
mahkoh
Feb 1, 2015
Contributor
and the length will be adjusted appropriately.
Sounds like an implementation detail.
mahkoh
reviewed
Feb 1, 2015
| read_to_end(self, buf) | ||
| } | ||
|
|
||
| /// Read all remaining bytes in this source, placing them into `buf`. |
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
mahkoh
Feb 1, 2015
Contributor
It reads until EOF which does not imply that further reads also return EOF.
alexcrichton
force-pushed the
alexcrichton:iov2
branch
from
8346a0d
to
5cf9905
Feb 3, 2015
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
bors
added a commit
that referenced
this pull request
Feb 3, 2015
This comment has been minimized.
This comment has been minimized.
|
|
This comment has been minimized.
This comment has been minimized.
|
|
This comment has been minimized.
This comment has been minimized.
|
|
This comment has been minimized.
This comment has been minimized.
|
@bors: retry |
bors
added a commit
that referenced
this pull request
Feb 4, 2015
This comment has been minimized.
This comment has been minimized.
|
|
alexcrichton
added a commit
to alexcrichton/rust
that referenced
this pull request
Feb 4, 2015
This comment has been minimized.
This comment has been minimized.
|
|
This comment has been minimized.
This comment has been minimized.
|
@bors: retry |
bors
added a commit
that referenced
this pull request
Feb 4, 2015
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
|
|
This comment has been minimized.
This comment has been minimized.
|
@bors: retry |
This comment has been minimized.
This comment has been minimized.
bors
added a commit
that referenced
this pull request
Feb 4, 2015
This comment has been minimized.
This comment has been minimized.
|
|
This comment has been minimized.
This comment has been minimized.
|
@bors: retry |
1 similar comment
This comment has been minimized.
This comment has been minimized.
|
@bors: retry |
This comment has been minimized.
This comment has been minimized.
bors
added a commit
that referenced
this pull request
Feb 4, 2015
This comment has been minimized.
This comment has been minimized.
|
|
This comment has been minimized.
This comment has been minimized.
|
@bors: retry |
bors
merged commit 5cf9905
into
rust-lang:master
Feb 4, 2015
alexcrichton
deleted the
alexcrichton:iov2
branch
Feb 4, 2015
reem
reviewed
Feb 24, 2015
|
|
||
| // Semi-hack used to prevent LLVM from retaining any assumptions about | ||
| // `dummy` over this function call | ||
| unsafe fn black_box<T>(mut dummy: T) -> T { |
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
nagisa
Feb 24, 2015
Contributor
It is not related to memory, though. I find test be a better location for this.
alexcrichton commentedFeb 1, 2015
This commit is an implementation of RFC 576 which adds back the
std::iomodule to the standard library. No functionality in
std::old_iohas beendeprecated just yet, and the new
std::iomodule is behind the sameiofeature gate.
A good bit of functionality was copied over from
std::old_io, but many tweakswere required for the new method signatures. Behavior such as precisely when
buffered objects call to the underlying object may have been tweaked slightly in
the transition. All implementations were audited to use composition wherever
possible. For example the custom
posandcapcursors inBufReaderwereremoved in favor of just using
Cursor<Vec<u8>>.A few liberties were taken during this implementation which were not explicitly
spelled out in the RFC:
LineBufferedWriteris now namedLineWriterErrornow favors OS error codes (a0-allocation path) and contains a
Boxfor extra semantic data.SeekasNewSeekto prevent conflictswith the real prelude reexport of
old_io::Seekcharsmethod was moved fromBufReadExttoReadExt.charsiterator returns a custom error with a variant that explains thatthe data was not valid UTF-8.