Tracking Issue for RFC 2930 (read-buf) #78485

Open
1 of 5 tasks
nikomatsakis opened this issue Oct 28, 2020 · 40 comments
Labels
A-io Area: std::io, std::fs, std::net and std::path B-RFC-approved Feature: Approved by a merged RFC but not yet implemented. C-tracking-issue Category: A tracking issue for an RFC or an unstable feature. Libs-Tracked Libs issues that are tracked on the team's project board. T-libs-api Relevant to the library API team, which will review and decide on the PR/issue.

Comments

@nikomatsakis
Contributor

nikomatsakis commented Oct 28, 2020

This is a tracking issue for the RFC "2930" (rust-lang/rfcs#2930).
The feature gate for the issue is #![feature(read_buf)].

About tracking issues

Tracking issues are used to record the overall progress of implementation.
They are also used as hubs connecting to other relevant issues, e.g., bugs or open design questions.
A tracking issue is however not meant for large scale discussion, questions, or bug reports about a feature.
Instead, open a dedicated issue for the specific matter and add the relevant feature gate label.

Steps

Unresolved Questions

Implementation history

@nikomatsakis nikomatsakis added B-RFC-approved Feature: Approved by a merged RFC but not yet implemented. T-libs-api Relevant to the library API team, which will review and decide on the PR/issue. C-tracking-issue Category: A tracking issue for an RFC or an unstable feature. labels Oct 28, 2020
@nikomatsakis
Contributor Author

( cc @rust-lang/libs )

@beepster4096
Contributor

I'm interested in working on this.

@rustbot claim

@KodrAus KodrAus added the Libs-Tracked Libs issues that are tracked on the team's project board. label Nov 6, 2020
@beepster4096
Contributor

beepster4096 commented Nov 8, 2020

I am slightly confused by the API of ReadBufs. What return type should methods like initialized have: &[u8] or &[IoSlice]? If it's the former, how do you know which slices are initialized/filled? If it's the latter, what happens if a slice is partially initialized/filled?

edit: probably I should ask this on zulip instead of github

@sfackler
Member

sfackler commented Nov 8, 2020

It would return &[IoSliceMut], and only include the slices that are fully initialized.

@programmerjake
Member

> It would return &[IoSliceMut], and only include the slices that are fully initialized.

I would have expected it to return (&[IoSliceMut], &[u8]) where it is some number of fully initialized buffers and the initialized portion of the partially initialized buffer.
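
A minimal sketch of the bookkeeping behind that suggested return type, assuming the buffers are tracked with a single count of initialized bytes. `split_initialized` is a hypothetical helper, not part of `std` or the RFC, and for simplicity it uses `IoSliceMut` over already-valid bytes rather than `MaybeUninit` storage:

```rust
use std::io::IoSliceMut;

/// Hypothetical helper: splits `bufs` into the buffers that are fully
/// covered by the first `init` bytes and the initialized prefix of the
/// next, partially covered buffer (empty if there is none).
fn split_initialized<'s, 'a>(
    bufs: &'s [IoSliceMut<'a>],
    mut init: usize,
) -> (&'s [IoSliceMut<'a>], &'s [u8]) {
    let mut full = 0;
    for buf in bufs {
        if init < buf.len() {
            break;
        }
        init -= buf.len();
        full += 1;
    }
    // Whatever is left of `init` falls inside the buffer at index `full`.
    let partial: &[u8] = match bufs.get(full) {
        Some(buf) => &buf[..init],
        None => &[],
    };
    (&bufs[..full], partial)
}
```

With buffers of lengths 3, 4 and 2 and five initialized bytes, this yields one full buffer and a two-byte partial slice.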

@sfackler
Member

sfackler commented Nov 8, 2020

Sure, that makes sense.

@sfackler
Member

sfackler commented Dec 5, 2020

@drmeepster are you still working on this?

@beepster4096
Contributor

Yeah, I am. Although I'm currently working on #79607 because I needed it for this.

@beepster4096
Contributor

Okay, I can continue working on this now that we have MaybeUninit::write_slice

@sunshowers
Contributor

I have a PR to add an inner_mut method to Tokio's implementation of ReadBuf: tokio-rs/tokio#3443. As far as I can tell, it's a valid use case that can't be handled any other way short of pointer arithmetic, so it may make sense for upstream Rust's ReadBuf to have this as well.

@erickt
Contributor

erickt commented Apr 28, 2021

Have we considered extending ReadBuf to be generic over the element type, rather than being constrained to u8? I'm guessing much of ReadBuf doesn't depend on the element type.

This came up because I'm looking into fixing some UB in rayon, which is passing around uninitialized &mut [T] in its collect iterator. I think the main thing making rayon use this over a &mut Vec<T> is that it wants to split the output buffer across threads, but there's no safe way to do this without initializing the slice. I'd like to replace this with a safe abstraction that's probably quite similar to the design proposed for ReadBuf (plus a split_at method), so I thought maybe there are other people potentially interested in this functionality.

Going further, it would also be interesting to see if we could rewrite Vec to sit upon ReadBuf.

@djc
Contributor

djc commented Apr 28, 2021

@Amanieu
Member

Amanieu commented Apr 28, 2021

Note that we now have a spare_capacity_mut method on Vec which gives you a &mut [MaybeUninit<T>] for the uninitialized part of a vector.
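
A minimal sketch of how `spare_capacity_mut` could cover the "split the output buffer across threads" case mentioned above, assuming two scoped threads and a `u32` payload. This is not rayon's implementation; `fill` and `collect_in_two_halves` are hypothetical names used only for illustration:

```rust
use std::mem::MaybeUninit;
use std::thread;

/// Writes `offset, offset + 1, ...` into every slot of `chunk`.
fn fill(chunk: &mut [MaybeUninit<u32>], offset: u32) {
    for (i, slot) in chunk.iter_mut().enumerate() {
        slot.write(offset + i as u32);
    }
}

/// Builds `0..n` by filling the two halves of the vector's spare
/// capacity on separate threads, without zeroing the memory first.
fn collect_in_two_halves(n: usize) -> Vec<u32> {
    let mut out: Vec<u32> = Vec::with_capacity(n);
    {
        // The spare capacity may be larger than requested; use only `n` slots.
        let spare = &mut out.spare_capacity_mut()[..n];
        let (left, right) = spare.split_at_mut(n / 2);
        let mid = left.len() as u32;
        thread::scope(|s| {
            s.spawn(move || fill(left, 0));
            s.spawn(move || fill(right, mid));
        });
    }
    // SAFETY: both threads have been joined by `thread::scope`, and together
    // they wrote every one of the first `n` slots.
    unsafe { out.set_len(n) };
    out
}
```

The only `unsafe` step left is the final `set_len`, which is the part a `ReadBuf`-like abstraction with a `split_at` method would aim to encapsulate.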

@AngelicosPhosphoros
Contributor

Can we put this struct into the core crate? It could be useful for no_std implementations of getrandom and similar code.
Am I right that this struct never allocates?

@andylizi
Contributor

andylizi commented Jul 9, 2022

> Can we put this struct into the core crate?

I'd imagine this would be blocked by #48331.

@beepster4096
Contributor

There's no reason that this isn't in core other than the fact that the module core::io doesn't exist yet. It doesn't rely on anything else in std::io.

matthiaskrgr added a commit to matthiaskrgr/rust that referenced this issue Aug 28, 2022
std::io: migrate ReadBuf to BorrowBuf/BorrowCursor

This PR replaces `ReadBuf` (used by the `Read::read_buf` family of methods) with `BorrowBuf` and `BorrowCursor`.

The general idea is to split `ReadBuf` because its API is large and confusing. `BorrowBuf` represents a borrowed buffer which is mostly read-only and (other than for construction) deals only with filled vs unfilled segments. A `BorrowCursor` is a mostly write-only view of the unfilled part of a `BorrowBuf` which distinguishes between initialized and uninitialized segments. For `Read::read_buf`, the caller would create a `BorrowBuf`, then pass a `BorrowCursor` to `read_buf`.

In addition to the major API split, I've made the following smaller changes:

* Removed some methods entirely from the API (mostly the functionality can be replicated with two calls rather than a single one)
* Unified naming, e.g., by replacing `initialized` with `init` and `assume_init` with `set_init`
* Added an easy way to get the number of bytes written to a cursor (`written` method)

As well as simplifying the API (IMO), this approach has the following advantages:

* Since we pass the cursor by value, we remove the 'unsoundness footgun' where a malicious `read_buf` could swap out the `ReadBuf`.
* Since `read_buf` cannot write into the filled part of the buffer, we prevent the filled part from shrinking or changing, which could cause underflow for the caller or other unexpected behaviour.

## Outline

```rust
pub struct BorrowBuf<'a>

impl Debug for BorrowBuf<'_>

impl<'a> From<&'a mut [u8]> for BorrowBuf<'a>
impl<'a> From<&'a mut [MaybeUninit<u8>]> for BorrowBuf<'a>

impl<'a> BorrowBuf<'a> {
    pub fn capacity(&self) -> usize
    pub fn len(&self) -> usize
    pub fn init_len(&self) -> usize
    pub fn filled(&self) -> &[u8]
    pub fn unfilled<'this>(&'this mut self) -> BorrowCursor<'this, 'a>
    pub fn clear(&mut self) -> &mut Self
    pub unsafe fn set_init(&mut self, n: usize) -> &mut Self
}

pub struct BorrowCursor<'buf, 'data>

impl<'buf, 'data> BorrowCursor<'buf, 'data> {
    pub fn clone<'this>(&'this mut self) -> BorrowCursor<'this, 'data>
    pub fn capacity(&self) -> usize
    pub fn written(&self) -> usize
    pub fn init_ref(&self) -> &[u8]
    pub fn init_mut(&mut self) -> &mut [u8]
    pub fn uninit_mut(&mut self) -> &mut [MaybeUninit<u8>]
    pub unsafe fn as_mut(&mut self) -> &mut [MaybeUninit<u8>]
    pub unsafe fn advance(&mut self, n: usize) -> &mut Self
    pub fn ensure_init(&mut self) -> &mut Self
    pub unsafe fn set_init(&mut self, n: usize) -> &mut Self
    pub fn append(&mut self, buf: &[u8])
}
```
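
As an illustration of the split described above, here is a toy model, assuming only the parts of the outline needed to show the idea. `ToyBuf`, `ToyCursor` and `toy_read` are hypothetical names, and this is not the implementation in `std`:

```rust
use std::mem::MaybeUninit;

/// Toy stand-in for the buffer: read-mostly, tracks the filled/init watermarks.
struct ToyBuf<'data> {
    buf: &'data mut [MaybeUninit<u8>],
    filled: usize,
    init: usize,
}

/// Toy stand-in for the cursor: a write-mostly view of the unfilled part.
struct ToyCursor<'buf, 'data> {
    buf: &'buf mut ToyBuf<'data>,
    start: usize,
}

impl<'data> ToyBuf<'data> {
    fn new(buf: &'data mut [MaybeUninit<u8>]) -> Self {
        ToyBuf { buf, filled: 0, init: 0 }
    }

    fn init_len(&self) -> usize {
        self.init
    }

    fn filled(&self) -> &[u8] {
        // SAFETY: every byte below `filled` was written by `ToyCursor::append`.
        unsafe { std::slice::from_raw_parts(self.buf.as_ptr().cast::<u8>(), self.filled) }
    }

    fn unfilled<'this>(&'this mut self) -> ToyCursor<'this, 'data> {
        let start = self.filled;
        ToyCursor { buf: self, start }
    }
}

impl ToyCursor<'_, '_> {
    /// Bytes that can still be written.
    fn capacity(&self) -> usize {
        self.buf.buf.len() - self.buf.filled
    }

    /// Bytes written since this cursor was created.
    fn written(&self) -> usize {
        self.buf.filled - self.start
    }

    /// Copies `data` into the unfilled part; panics if it does not fit.
    fn append(&mut self, data: &[u8]) {
        assert!(data.len() <= self.capacity(), "not enough room in the buffer");
        let start = self.buf.filled;
        for (slot, byte) in self.buf.buf[start..].iter_mut().zip(data) {
            slot.write(*byte);
        }
        self.buf.filled += data.len();
        self.buf.init = self.buf.init.max(self.buf.filled);
    }
}

/// A reader in this model takes the cursor by value. It can only append;
/// it has no way to replace the caller's buffer or shrink the filled part.
fn toy_read(mut cursor: ToyCursor<'_, '_>) {
    cursor.append(b"hello");
}
```

A caller would create a `ToyBuf` over its storage, pass `buf.unfilled()` to `toy_read`, and then inspect `buf.filled()`.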

## TODO

* ~~Migrate non-unix libs and tests~~
* ~~Naming~~
  * ~~`BorrowBuf` or `BorrowedBuf` or `SliceBuf`? (We might want an owned equivalent for the async IO traits)~~
  * ~~Should we rename the `readbuf` module? We might keep the name to indicate it includes both the buf and cursor variations and someday the owned version too. Or we could change it. It is not publicly exposed, so it is not that important~~.
  * ~~`read_buf` method: we read into the cursor now, so the `_buf` suffix is a bit weird.~~
* ~~Documentation~~
* Tests are incomplete (I adjusted existing tests, but did not add new ones).

cc rust-lang#78485, rust-lang#94741
supersedes: rust-lang#95770, rust-lang#93359
fixes rust-lang#93305
workingjubilee pushed a commit to tcdi/postgrestd that referenced this issue Sep 15, 2022
std::io: Modify some ReadBuf method signatures to return `&mut Self`

This allows using `ReadBuf` in a builder-like style and setting up a `ReadBuf` and
passing it to `read_buf` in a single expression, e.g.,

```rust
// With this PR:
reader.read_buf(ReadBuf::uninit(buf).assume_init(init_len))?;

// Previously:
let mut buf = ReadBuf::uninit(buf);
buf.assume_init(init_len);
reader.read_buf(&mut buf)?;
```

r? `@sfackler`

cc rust-lang/rust#78485, rust-lang/rust#94741
workingjubilee pushed a commit to tcdi/postgrestd that referenced this issue Sep 15, 2022
std::io: migrate ReadBuf to BorrowBuf/BorrowCursor
@ChayimFriedman2
Contributor

Currently, if you Write::write() more data into a BorrowedCursor than it has room for, it panics, just as it does if you call append(). I'd expect write() to write only part of the data and write_all() to fail with ErrorKind::WriteZero, as impl Write for &mut [u8] does.
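
For reference, the `impl Write for &mut [u8]` behaviour mentioned above can be observed on stable Rust with a sketch like the following. The helper names `short_write` and `strict_write` are made up for this example:

```rust
use std::io::{self, Write};

/// Writes `data` into `storage` with `write`, which copies as much as fits
/// and reports the number of bytes copied.
fn short_write(storage: &mut [u8], data: &[u8]) -> io::Result<usize> {
    let mut dst: &mut [u8] = storage;
    dst.write(data)
}

/// Writes `data` into `storage` with `write_all`, which fails if the
/// whole of `data` does not fit.
fn strict_write(storage: &mut [u8], data: &[u8]) -> io::Result<()> {
    let mut dst: &mut [u8] = storage;
    dst.write_all(data)
}
```

Writing six bytes into a four-byte slice returns `Ok(4)` from `short_write`, while `strict_write` returns an error whose kind is `ErrorKind::WriteZero`.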

@Pr0methean

Pr0methean commented Apr 24, 2023

Could we please have a shared interface trait for Seek::stream_position and BorrowedBuf::len? That would help in adapting library crates such as zip_next to use a BorrowedBuf (which would make it possible to read back one compressed file while writing another without loading the whole file into memory).

@CAD97
Contributor

CAD97 commented Jul 7, 2023

I just want to note that while rustc_must_implement_one_of is a prerequisite to stabilizing Read::read_buf, BorrowedBuf/BorrowedCursor are separately useful even without the existence of Read::read_buf, so could be stabilized separately. I happen to be writing some unsafe APIs which could be made meaningfully safer with the use of BorrowedCursor.

@jmillikin
Contributor

I'm interested in BorrowedBuf and BorrowedCursor for use in no_std environments. Would it be possible to move them into core and stabilize them separately from the new std::io::{Read,Write} functionality?

The current implementation of those types has no dependency on std, and I'd be happy to send out the PRs if the Rust folks are willing to review them.

fmease added a commit to fmease/rust that referenced this issue Nov 8, 2023
…nic, r=dtolnay

Don't panic in `<BorrowedCursor as io::Write>::write`

Instead of panicking if the BorrowedCursor does not have enough capacity for the whole buffer, just return a short write, [like `<&mut [u8] as io::Write>::write` does](https://doc.rust-lang.org/src/std/io/impls.rs.html#349).

(cc `@ChayimFriedman2` rust-lang#78485 (comment))

(I'm not sure if this needs an ACP? since it's not changing the "API", just what the function does)
bors added a commit to rust-lang-ci/rust that referenced this issue Nov 8, 2023
…c, r=dtolnay
github-actions bot pushed a commit to rust-lang/miri that referenced this issue Nov 15, 2023
Don't panic in `<BorrowedCursor as io::Write>::write`
@the8472
Member

the8472 commented Nov 23, 2023

The BorrowedCursor documentation could use some polish. It says some slightly contradictory or misleading things. The docs start with

> A writeable view of the unfilled portion of a BorrowedBuf.

but the very next paragraph:

> Provides access to the initialized and uninitialized parts of the underlying BorrowedBuf.

Looking at the actual implementations makes it obvious that they mostly pass through to the underlying BorrowedBuf, grab slices from it, and totally ignore the write position (start). The write position is only relevant for written().

@tgross35
Contributor

Does core_io_borrowed_buf really need to be a separate feature gate, or could it be rolled into read_buf? It's a bit confusing that the BorrowedBuf/BorrowedCursor docs all point to #117693 rather than this issue, assuming the same types are designed to work in core.

Also related to @the8472's comment, docs need examples.
