Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add provided methods `Seek::{stream_len, stream_position}` #58422

Open
wants to merge 5 commits into
base: master
from

Conversation

@LukasKalbertodt
Copy link
Contributor

LukasKalbertodt commented Feb 13, 2019

This adds two new, provided methods to the io::Seek trait:

  • fn stream_len(&mut self) -> Result<u64>
  • fn stream_position(&mut self) -> Result<u64>

Both are added for convenience and to improve readability in user code. Reading file.stream_len() is much better than to manually seek two or three times. Similarly, file.stream_position() is much more clear than file.seek(SeekFrom::Current(0)).

You can find prior discussions in this internals thread. I think I addressed all concerns in that thread.

I already wrote three RFCs to add a small new API to libstd but I noticed that many public changes to libstd happen without an RFC. So I figured I can try opening a PR directly without going through RFCs first. After all, we do have rfcbot here too. If you think this change is too big to merge without an RFC, I can still close this PR and write an RFC.

@rust-highfive

This comment has been minimized.

Copy link
Collaborator

rust-highfive commented Feb 13, 2019

r? @alexcrichton

(rust_highfive has picked a reviewer for you, use r? to override)

@rust-highfive

This comment has been minimized.

Copy link
Collaborator

rust-highfive commented Feb 13, 2019

The job x86_64-gnu-llvm-6.0 of your PR failed on Travis (raw log). Through arcane magic we have determined that the following fragments from the build log may contain information about the problem.

Click to expand the log.
travis_time:end:06691ecb:start=1550058005071365345,finish=1550058097613898394,duration=92542533049
$ git checkout -qf FETCH_HEAD
travis_fold:end:git.checkout

Encrypted environment variables have been removed for security reasons.
See https://docs.travis-ci.com/user/pull-requests/#pull-requests-and-security-restrictions
$ export SCCACHE_BUCKET=rust-lang-ci-sccache2
$ export SCCACHE_REGION=us-west-1
Setting environment variables from .travis.yml
$ export IMAGE=x86_64-gnu-llvm-6.0
---
travis_time:start:test_debuginfo
Check compiletest suite=debuginfo mode=debuginfo-both (x86_64-unknown-linux-gnu -> x86_64-unknown-linux-gnu)
[01:10:32] 
[01:10:32] running 119 tests
[01:11:01] .iiiii...i.....i..i...i..i.i..i.ii...i.....i..i....i..........iiii..........i...ii...i.......ii.i.i. 100/119
[01:11:06] i......iii.i.....ii
[01:11:06] 
[01:11:06]  finished in 33.570
[01:11:06] travis_fold:end:test_debuginfo

---
[01:29:21] running 991 tests
[01:29:44] i................................................................................................... 100/991
[01:29:57] .................................................................................................... 200/991
[01:30:05] ..........iii......i......i...i......i.............................................................. 300/991
[01:30:10] ...............................................................................................F.F.. 400/991
[01:30:26] .................................................................................................... 600/991
[01:30:34] .................................................................................................... 700/991
[01:30:43] .............iiii................................................................................... 800/991
[01:30:56] .................................................................................................... 900/991
[01:30:56] .................................................................................................... 900/991
[01:31:04] .......................................iiii................................................
[01:31:04] failures:
[01:31:04] 
[01:31:04] ---- io/mod.rs - io::Seek::stream_len (line 1232) stdout ----
[01:31:04] error[E0658]: use of unstable library feature 'seek_convenience'
[01:31:04]   --> io/mod.rs:1241:17
[01:31:04]    |
[01:31:04] 11 |     let len = f.stream_len()?;
[01:31:04]    |
[01:31:04]    |
[01:31:04]    = help: add #![feature(seek_convenience)] to the crate attributes to enable
[01:31:04] 
[01:31:04] thread 'io/mod.rs - io::Seek::stream_len (line 1232)' panicked at 'couldn't compile the test', src/librustdoc/test.rs:354:13
[01:31:04] 
[01:31:04] ---- io/mod.rs - io::Seek::current_position (line 1261) stdout ----
[01:31:04] ---- io/mod.rs - io::Seek::current_position (line 1261) stdout ----
[01:31:04] error[E0658]: use of unstable library feature 'seek_convenience'
[01:31:04]   --> io/mod.rs:1270:20
[01:31:04]    |
[01:31:04] 11 |     let before = f.current_position();
[01:31:04]    |
[01:31:04]    |
[01:31:04]    = help: add #![feature(seek_convenience)] to the crate attributes to enable
[01:31:04] 
[01:31:04] error[E0658]: use of unstable library feature 'seek_convenience'
[01:31:04]   --> io/mod.rs:1272:19
[01:31:04]    |
[01:31:04] 13 |     let after = f.current_position()?;
[01:31:04]    |
[01:31:04]    |
[01:31:04]    = help: add #![feature(seek_convenience)] to the crate attributes to enable
[01:31:04] 
[01:31:04] error[E0277]: cannot subtract `std::result::Result<u64, std::io::Error>` from `u64`
[01:31:04]   --> io/mod.rs:1274:56
[01:31:04]    |
[01:31:04] 15 |     println!("The first line was {} bytes long", after - before);
[01:31:04]    |                                                        ^ no implementation for `u64 - std::result::Result<u64, std::io::Error>`
[01:31:04]    |
[01:31:04]    = help: the trait `std::ops::Sub<std::result::Result<u64, std::io::Error>>` is not implemented for `u64`
[01:31:04] thread 'io/mod.rs - io::Seek::current_position (line 1261)' panicked at 'couldn't compile the test', src/librustdoc/test.rs:354:13
[01:31:04] 
[01:31:04] 
[01:31:04] failures:
---
[01:31:04] 
[01:31:04] error: test failed, to rerun pass '--doc'
[01:31:04] 
[01:31:04] 
[01:31:04] command did not execute successfully: "/checkout/obj/build/x86_64-unknown-linux-gnu/stage0/bin/cargo" "test" "--target" "x86_64-unknown-linux-gnu" "-j" "4" "--release" "--locked" "--color" "always" "--features" "panic-unwind backtrace" "--manifest-path" "/checkout/src/libstd/Cargo.toml" "-p" "std" "--" "--quiet"
[01:31:04] 
[01:31:04] 
[01:31:04] failed to run: /checkout/obj/build/bootstrap/debug/bootstrap test
[01:31:04] Build completed unsuccessfully in 0:32:57
[01:31:04] Build completed unsuccessfully in 0:32:57
[01:31:04] make: *** [check] Error 1
[01:31:04] Makefile:48: recipe for target 'check' failed
The command "stamp sh -x -c "$RUN_SCRIPT"" exited with 2.
travis_time:start:081d3200
$ date && (curl -fs --head https://google.com | grep ^Date: | sed 's/Date: //g' || true)
Wed Feb 13 13:12:51 UTC 2019
---
travis_time:end:05960c8a:start=1550063573399702000,finish=1550063573405182950,duration=5480950
travis_fold:end:after_failure.3
travis_fold:start:after_failure.4
travis_time:start:091a8e90
$ ln -s . checkout && for CORE in obj/cores/core.*; do EXE=$(echo $CORE | sed 's|obj/cores/core\.[0-9]*\.!checkout!\(.*\)|\1|;y|!|/|'); if [ -f "$E

I'm a bot! I can only do what humans tell me to, so if this was not helpful or you have suggestions for improvements, please ping or otherwise contact @TimNN. (Feature Requests)

@LukasKalbertodt LukasKalbertodt force-pushed the LukasKalbertodt:seek-convenience branch from c2f43a1 to 2440db0 Feb 13, 2019

@alexcrichton

This comment has been minimized.

Copy link
Member

alexcrichton commented Feb 14, 2019

Seems plausible to me! Since this is changing a pretty core trait, even though these functions are unstable, I'd want to gain some other feedback too:

@rfcbot fcp merge

@rfcbot

This comment has been minimized.

Copy link

rfcbot commented Feb 14, 2019

Team member @alexcrichton has proposed to merge this. The next step is review by the rest of the tagged team members:

Concerns:

Once a majority of reviewers approve (and at most 2 approvals are outstanding), this will enter its final comment period. If you spot a major issue that hasn't been raised at any point in this process, please speak up!

See this document for info about what commands tagged team members can give me.

@dtolnay

This comment has been minimized.

Copy link
Member

dtolnay commented Feb 14, 2019

current_position looks very similar to Cursor::position. Could we remove "current" from the name or does that make it less clear?

@rfcbot concern position

@dtolnay

This comment has been minimized.

Copy link
Member

dtolnay commented Feb 14, 2019

Could stream_len be len? Would there be any Seek impl that has a "stream len" different from its "len" or for which "len" isn't an appropriate term?

@rfcbot concern len

@LukasKalbertodt

This comment has been minimized.

Copy link
Contributor Author

LukasKalbertodt commented Feb 14, 2019

I would be fine with changing both names!

The only (very minor) argument against len I can think of: len has a precedence in many data structures where the method just returns an already stored value. For example, slices, vectors and everything in std::collections just return a stored length in len(). As such, people might assume Seek::len is also just a simple getter. But I don't really think it's a strong argument.

@sfackler

This comment has been minimized.

Copy link
Member

sfackler commented Feb 15, 2019

This len would return a result, which should indicate that there's something bigger going on.

@LukasKalbertodt LukasKalbertodt force-pushed the LukasKalbertodt:seek-convenience branch from 2440db0 to 2dfac80 Feb 17, 2019

@rust-highfive

This comment has been minimized.

Copy link
Collaborator

rust-highfive commented Feb 17, 2019

The job x86_64-gnu-llvm-6.0 of your PR failed on Travis (raw log). Through arcane magic we have determined that the following fragments from the build log may contain information about the problem.

Click to expand the log.
travis_time:end:04940500:start=1550414973194700673,finish=1550415055424368629,duration=82229667956
$ git checkout -qf FETCH_HEAD
travis_fold:end:git.checkout

Encrypted environment variables have been removed for security reasons.
See https://docs.travis-ci.com/user/pull-requests/#pull-requests-and-security-restrictions
$ export SCCACHE_BUCKET=rust-lang-ci-sccache2
$ export SCCACHE_REGION=us-west-1
Setting environment variables from .travis.yml
$ export IMAGE=x86_64-gnu-llvm-6.0
---
[00:05:10]    Compiling parking_lot v0.6.4
[00:05:11]    Compiling rustc-rayon v0.1.1
[00:05:15]    Compiling rustc_data_structures v0.0.0 (/checkout/src/librustc_data_structures)
[00:05:15]    Compiling tempfile v3.0.5
[00:05:15] error[E0658]: use of unstable library feature 'seek_convenience'
[00:05:15]   --> /cargo/registry/src/github.com-1ecc6299db9ec823/tempfile-3.0.5/src/spooled.rs:90:50
[00:05:15]    |
[00:05:15] 90 |                 file.seek(SeekFrom::Start(cursor.position()))?;
[00:05:15]    |
[00:05:15]    |
[00:05:15]    = help: add #![feature(seek_convenience)] to the crate attributes to enable
[00:05:15] 
[00:05:15] error: internal compiler error: src/librustc/hir/def.rs:257: attempted .def_id() on invalid def: NonMacroAttr(Builtin)
[00:05:15] thread 'rustc' panicked at 'Box<Any>', src/librustc_errors/lib.rs:588:9
[00:05:15] note: Run with `RUST_BACKTRACE=1` environment variable to display a backtrace.
[00:05:15] error: aborting due to 2 previous errors
[00:05:15] 
---
[00:05:15] note: we would appreciate a bug report: https://github.com/rust-lang/rust/blob/master/CONTRIBUTING.md#bug-reports
[00:05:15] 
[00:05:15] note: rustc 1.33.0-beta.1 (d1add9723 2019-01-17) running on x86_64-unknown-linux-gnu
[00:05:15] 
[00:05:15] note: compiler flags: -Z force-unstable-if-unmarked -C opt-level=2 -C prefer-dynamic -C debug-assertions=y -C link-args=-Wl,-rpath,$ORIGIN/../lib --crate-type lib
[00:05:15] note: some of the compiler flags provided by cargo are hidden
[00:05:15] 
[00:05:15] error: Could not compile `tempfile`.
[00:05:15] warning: build failed, waiting for other jobs to finish...
[00:05:15] warning: build failed, waiting for other jobs to finish...
[00:05:17] error: build failed
[00:05:17] command did not execute successfully: "/checkout/obj/build/x86_64-unknown-linux-gnu/stage0/bin/cargo" "build" "--target" "x86_64-unknown-linux-gnu" "-j" "4" "--release" "--locked" "--color" "always" "--features" "" "--manifest-path" "/checkout/src/rustc/Cargo.toml" "--message-format" "json"
[00:05:17] expected success, got: exit code: 101
[00:05:17] failed to run: /checkout/obj/build/bootstrap/debug/bootstrap build
[00:05:17] Build completed unsuccessfully in 0:02:01
[00:05:17] Makefile:18: recipe for target 'all' failed
[00:05:17] make: *** [all] Error 1
The command "stamp sh -x -c "$RUN_SCRIPT"" exited with 2.
travis_time:start:0bdd40b4
$ date && (curl -fs --head https://google.com | grep ^Date: | sed 's/Date: //g' || true)
Sun Feb 17 14:56:22 UTC 2019

I'm a bot! I can only do what humans tell me to, so if this was not helpful or you have suggestions for improvements, please ping or otherwise contact @TimNN. (Feature Requests)

@jonas-schievink

This comment has been minimized.

Copy link
Member

jonas-schievink commented Feb 17, 2019

The ICE will be fixed once a new beta is cut that includes #58501, but the bigger issue is that tempfile attempts to call Cursor::position, but for some reason ends up calling the new Seek::position (inherent impls should be preferred over trait impls, though?)

@LukasKalbertodt

This comment has been minimized.

Copy link
Contributor Author

LukasKalbertodt commented Feb 17, 2019

Interesting! I just pushed the renaming changes (len and position). I already thought that the same name on Cursor and Seek might lead to problems, but I wasn't sure. Thankfully Travis showed it to us.

First I was confused, but then I remembered this question of dtolnay's Rust quiz. Here is a minimal example on Playground.

So: keep current_position as name? I think that we are technically allowed to make these changes, because programmers could have prevented this by using a more explicit function call syntax. But I doubt that we practically want to make a change that breaks real crates out there. So what is the best way forward?

@jonas-schievink

This comment has been minimized.

Copy link
Member

jonas-schievink commented Feb 17, 2019

FWIW I also think that having a method len() that returns a Result is more confusing than naming it differently. I'm pretty sure it would be the first len() method in std that isn't just a getter (even Metadata::len is just a getter that returns u64 directly).

@dtolnay

This comment has been minimized.

Copy link
Member

dtolnay commented Feb 23, 2019

Oh man, that quiz question. I think that rules out position.

I am mildly opposed to current_position because every getter is implicitly current -- we don't say HashMap::current_len and String::current_capacity for example. I would make an exception for getters for quantities that can be changed from outside of the current thread of execution. These are more like "measurements" than getters, and it may make sense to indicate that the current value of the quantity is measured but if you measure again you may get a different number. This applies to stream len (so if someone wanted current_len that would be justifiable) but not to position which only changes if we act on the stream from the current thread of execution.

I think we can land this with any name other than position and current_position. io_position or stream_position would be fine with me but open to other suggestions.

@dtolnay

This comment has been minimized.

Copy link
Member

dtolnay commented Feb 23, 2019

We can reconsider len also if people find &mut self -> Result<u64> to be a weird signature for something called len (though I don't expect it to cause problems). Bonus points for harmony between the len name and the position name, so that could be io_len + io_position or stream_len + stream_position.

@LukasKalbertodt

This comment has been minimized.

Copy link
Contributor Author

LukasKalbertodt commented Feb 24, 2019

I think that rules out position.

Is that really the case, though? RFC 1105 defines what changes are considered "major" and which ones are considered "minor". It says:

Minor change: adding a defaulted item

Adding a defaulted item is technically a breaking change: [...] According to the basic principles of this RFC, such a change is minor: it is always possible to annotate the call [...].

The RFC uses a different example to show that adding a defaulted item can be a breaking change, but I guess the reasoning still applies.

Now of course we wouldn't want to have a change that we know causes real breakage in the world -- regardless if it's minor or major according to the RFC. So I guess what I'm asking is: do you consider this one breakage in the compiler suite as enough evidence that such a change is not OK? Or would it be better to have a crater run?

As @jonas-schievink pointed out in a private chat with me, the code that breaks in Travis is extremely fishy: neither get_ref nor position of Cursor need a mutable reference, so ref cursor should be enough. Furthermore the mutable reference is then explicitly dropped which shouldn't do anything because it's at the end of its scope anyway. That doesn't mean that there can't be non-fishy code that breaks under this change to Seek, but I guess it's still worth mentioning that in that particular case, everything would be fine if the code was "more correct".


but not to position which only changes if we act on the stream from the current thread of execution.

That's not necessarily true, right? Someone could implement Seek for a type that can be shared across multiple threads. (Although I still agree that current_ in current_position is kinda bad.)

I don't like the io_ prefix. I would be fine with stream_len and stream_position (although I'm not super happy about the stream_position). Alternatives: seek_position or position_in_stream?

@bors

This comment has been minimized.

Copy link
Contributor

bors commented Feb 28, 2019

☔️ The latest upstream changes (presumably #58208) made this pull request unmergeable. Please resolve the merge conflicts.

Add provided methods `Seek::{stream_len, stream_position}`
These two methods are defined in terms of `Seek::seek` and are
added for convenience. Tests are included.

@LukasKalbertodt LukasKalbertodt force-pushed the LukasKalbertodt:seek-convenience branch from 2dfac80 to e8ee00a Mar 10, 2019

@LukasKalbertodt LukasKalbertodt changed the title Add provided methods `Seek::{stream_len, current_position}` Add provided methods `Seek::{stream_len, stream_position}` Mar 10, 2019

@LukasKalbertodt

This comment has been minimized.

Copy link
Contributor Author

LukasKalbertodt commented Mar 10, 2019

I renamed the methods again to: stream_len and stream_position (one of @dtolnay's suggestions). This should avoid any name-clashes and make this PR mergable. Figuring out the exact name can probably be done in the tracking issue after merging this, right?

@dtolnay

This comment has been minimized.

Copy link
Member

dtolnay commented Mar 11, 2019

Thanks, that works for me.

@rfcbot resolve position
@rfcbot resolve len

@rfcbot

This comment has been minimized.

Copy link

rfcbot commented Mar 11, 2019

🔔 This is now entering its final comment period, as per the review above. 🔔

@Dr-Emann

This comment has been minimized.

Copy link

Dr-Emann commented Mar 13, 2019

Are there plans to specialize this where possible?

  • Cursor can trivially get the stream length without seeking twice (might get optimized out anyway?)
  • File can try to go through metadata.len, before trying to fall back to the double-seek?
@Dr-Emann

This comment has been minimized.

Copy link

Dr-Emann commented Mar 13, 2019

Are we also allowed to do something like

let current = cursor.seek(SeekFrom::Current(0))?;
let end = cursor.seek(SeekFrom::End(0))?;
if (current != end) {
    cursor.seek(SeekFrom::Start(current))?;
}

To prevent the third call to seek if we are at the end already?

@jonas-schievink jonas-schievink added relnotes and removed relnotes labels Mar 13, 2019

@hellow554

This comment has been minimized.

Copy link
Contributor

hellow554 commented Mar 14, 2019

  • File can try to go through metadata.len, before trying to fall back to the double-seek?

But then you have to sync the metadata first, which can be more expensive than two seeks (see Stackoverflow)

Show resolved Hide resolved src/libstd/io/mod.rs Outdated
Show resolved Hide resolved src/libstd/io/mod.rs Outdated

LukasKalbertodt added some commits Mar 14, 2019

@LukasKalbertodt LukasKalbertodt force-pushed the LukasKalbertodt:seek-convenience branch from a481512 to c518f2d Mar 14, 2019

@Dr-Emann

This comment has been minimized.

Copy link

Dr-Emann commented Mar 14, 2019

  • File can try to go through metadata.len, before trying to fall back to the double-seek?

But then you have to sync the metadata first, which can be more expensive than two seeks (see Stackoverflow)

Is that accurate? I thought sync_all is for ensuring changes make it to disk (in case of shutdown, etc), but metadata should still be consistent on a running system.

However, I did find another possible reason to prefer seeking, even on files: files in /proc are reported as 0 sized in file metadata, whereas trying to get the length by seeking fails

https://play.rust-lang.org/?version=stable&mode=debug&edition=2018&gist=923ffea90a1e35d53c79ddc3a134619d

Show resolved Hide resolved src/libstd/io/mod.rs Outdated
Show resolved Hide resolved src/libstd/io/mod.rs Outdated
Apply suggestions from code review
Fix typos in the documentation

Co-Authored-By: LukasKalbertodt <lukas.kalbertodt@gmail.com>
@rfcbot

This comment has been minimized.

Copy link

rfcbot commented Mar 21, 2019

The final comment period, with a disposition to merge, as per the review above, is now complete.

As the automated representative of the governance process, I would like to thank the author for their work and everyone else who contributed.

The RFC will be merged soon.

@alexcrichton

This comment has been minimized.

Copy link
Member

alexcrichton commented Mar 21, 2019

@bors: r+

@bors

This comment has been minimized.

Copy link
Contributor

bors commented Mar 21, 2019

📌 Commit f95219f has been approved by alexcrichton

@bors

This comment has been minimized.

Copy link
Contributor

bors commented Mar 21, 2019

⌛️ Testing commit f95219f with merge 89573b3...

bors added a commit that referenced this pull request Mar 21, 2019

Auto merge of #58422 - LukasKalbertodt:seek-convenience, r=alexcrichton
Add provided methods `Seek::{stream_len, stream_position}`

This adds two new, provided methods to the `io::Seek` trait:
- `fn stream_len(&mut self) -> Result<u64>`
- `fn stream_position(&mut self) -> Result<u64>`

Both are added for convenience and to improve readability in user code. Reading `file.stream_len()` is much better than to manually seek two or three times. Similarly, `file.stream_position()` is much more clear than `file.seek(SeekFrom::Current(0))`.

You can find prior discussions [in this internals thread](https://internals.rust-lang.org/t/pre-rfc-idea-extend-io-seek-with-convenience-methods-with-e-g-stream-len/9262). I think I addressed all concerns in that thread.

I already wrote three RFCs to add a small new API to libstd but I noticed that many public changes to libstd happen without an RFC. So I figured I can try opening a PR directly without going through RFCs first. After all, we do have rfcbot here too. If you think this change is too big to merge without an RFC, I can still close this PR and write an RFC.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.