Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

std: Update documentation of seek_write on Windows #120452

Merged
merged 1 commit into from Jan 30, 2024

Conversation

alexcrichton
Copy link
Member

Currently the documentation of FileExt::seek_write on Windows indicates that writes beyond the end of the file leave intermediate bytes uninitialized. This commentary dates back to the original inclusion of these functions in #35704 (wow blast from the past!). At the time the functionality here was implemented using WriteFile, but nowadays the NtWriteFile method is used instead. The documentation for NtWriteFile explicitly states:

If Length and ByteOffset specify a write operation past the current
end-of-file mark, NtWriteFile automatically extends the file and updates
the end-of-file mark; any bytes that are not explicitly written between
such old and new end-of-file marks are defined to be zero.

This commentary has had a downstream impact in the system-interface crate where it tries to handle this by explicitly writing zeros, but I don't believe that's necessary any more. I'm sending a PR upstream here to avoid future confusion and codify that zeros are written in the intermediate bytes matching what Windows currently provides.

Currently the documentation of `FileExt::seek_write` on Windows
indicates that writes beyond the end of the file leave intermediate
bytes uninitialized. This commentary dates back to the original
inclusion of these functions in rust-lang#35704 (wow blast from the past!). At
the time the functionality here was implemented using `WriteFile`, but
nowadays the `NtWriteFile` method is used instead. The documentation for
`NtWriteFile` explicitly states:

> If Length and ByteOffset specify a write operation past the current
> end-of-file mark, NtWriteFile automatically extends the file and updates
> the end-of-file mark; any bytes that are not explicitly written between
> such old and new end-of-file marks are defined to be zero.

This commentary has had a downstream impact in the `system-interface`
crate where it tries to handle this by explicitly writing zeros, but I
don't believe that's necessary any more. I'm sending a PR upstream here
to avoid future confusion and codify that zeros are written in the
intermediate bytes matching what Windows currently provides.
@rustbot
Copy link
Collaborator

rustbot commented Jan 28, 2024

r? @m-ou-se

(rustbot has picked a reviewer for you, use r? to override)

@rustbot rustbot added O-windows Operating system: Windows S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. T-libs Relevant to the library team, which will review and decide on the PR/issue. labels Jan 28, 2024
alexcrichton added a commit to alexcrichton/system-interface that referenced this pull request Jan 28, 2024
This commit it borne out of CI failures on bytecodealliance/wasmtime#7638.
Investigating this failure has revealed a number of aspects here which
I've attempted to address in this PR. The notable changes here are:

* The current code in this crate was handling the case where
  `FileExt::seek_write` on Windows was leaving intermediate bytes as
  undefined when a write happened beyond the end of a file. I believe
  that this is due to an error in the documentation of the Rust standard
  library which I've submitted rust-lang/rust#120452 to fix.

* Removing handling of "always write zeros" handles the primary failure
  of the PR bytecodealliance/wasmtime#7638 which is that
  `write_vectored_at` was always returning 0 on Windows for writes past
  the end of the file. This is because Windows doesn't have a vectored
  file write so the vector chosen was the first nonempty vector which
  was the one containing zeros to extend the file. That meant that the
  method always returned zero.

* Previously the methods here used file locking which appeared to handle
  the case where the file was calculated and then the write happened.
  Given that this no longer happens I've removed the locking here.

* The `write_all_at` method had a loop around `reopen_write` handling
  the `Interrupted` error but no other methods did, so I opted to remove
  the loop and leave that to the internals of `reopen_write` if
  necessary.

* Other methods related to this are all simplified to directly use
  `seek_write` and avoid handling the case where writes past the end
  need to write zeros (as zeros are guaranteed by Windows).

Overall my hope is to use this to unblock bytecodealliance/wasmtime#7638
to get more platform-agnostic behavior for writing beyond the end of a
file.
@ChrisDenton ChrisDenton assigned ChrisDenton and unassigned m-ou-se Jan 29, 2024
@ChrisDenton
Copy link
Contributor

Hm, the move to NtWriteFile was done for soundness reasons, with the intent to keep open the option of moving back (or to another function) should that become viable.

That said, I'm pretty sure WriteFile behaves the same way as NtWriteFile considering the former wraps the latter. And tbh I don't think it can work any other way than zeroing the bytes. The docs for SetFileValidData have a long discussion on the security implications of uninitialized file data. And the whole reason that function exists as a separate thing is to silo such concerns (and the privileges it requires). My complete guess is that the microsoft docs are hedging based on old (win95?) behaviour.

Which is a long winded way of saying I'll probably accept this but I'd want to check in with a few people first.

@ChrisDenton
Copy link
Contributor

I am told that, yes, WriteFile and friends will always zero initialize too so there's no concern if we ever did move back to them.

@bors r+ rollup

@bors
Copy link
Contributor

bors commented Jan 29, 2024

📌 Commit b85b2a7 has been approved by ChrisDenton

It is now in the queue for this repository.

@bors bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Jan 29, 2024
bors added a commit to rust-lang-ci/rust that referenced this pull request Jan 30, 2024
…llaumeGomez

Rollup of 18 pull requests

Successful merges:

 - rust-lang#119123 (Add triagebot mentions entry for simd intrinsics)
 - rust-lang#119991 (Reject infinitely-sized reads from io::Repeat)
 - rust-lang#120172 (bootstrap: add more unit tests)
 - rust-lang#120250 (rustdoc: Prevent JS injection from localStorage)
 - rust-lang#120376 (Update codegen test for LLVM 18)
 - rust-lang#120387 (interpret/memory: fix safety comment for large array memset optimization)
 - rust-lang#120400 (Bound errors span label cleanup)
 - rust-lang#120402 (Make the coroutine def id of an async closure the child of the closure def id)
 - rust-lang#120403 (Add instructions of how to use pre-vendored 'rustc-src')
 - rust-lang#120424 (raw pointer metadata API: data address -> data pointer)
 - rust-lang#120425 (Remove unnecessary unit returns in query declarations)
 - rust-lang#120439 (Move UI issue tests to subdirectories)
 - rust-lang#120443 (Fixes footnote handling in rustdoc)
 - rust-lang#120452 (std: Update documentation of seek_write on Windows)
 - rust-lang#120460 (Be more careful about interpreting a label/lifetime as a mistyped char literal.)
 - rust-lang#120464 (Add matthewjasper to some review groups)
 - rust-lang#120467 (Update books)
 - rust-lang#120488 (Diagnostic lifetimes cleanups)

r? `@ghost`
`@rustbot` modify labels: rollup
@bors bors merged commit d5e8d85 into rust-lang:master Jan 30, 2024
11 checks passed
@rustbot rustbot added this to the 1.77.0 milestone Jan 30, 2024
rust-timer added a commit to rust-lang-ci/rust that referenced this pull request Jan 30, 2024
Rollup merge of rust-lang#120452 - alexcrichton:update-windows-seek-write-docs, r=ChrisDenton

std: Update documentation of seek_write on Windows

Currently the documentation of `FileExt::seek_write` on Windows indicates that writes beyond the end of the file leave intermediate bytes uninitialized. This commentary dates back to the original inclusion of these functions in rust-lang#35704 (wow blast from the past!). At the time the functionality here was implemented using `WriteFile`, but nowadays the `NtWriteFile` method is used instead. The documentation for `NtWriteFile` explicitly states:

> If Length and ByteOffset specify a write operation past the current
> end-of-file mark, NtWriteFile automatically extends the file and updates
> the end-of-file mark; any bytes that are not explicitly written between
> such old and new end-of-file marks are defined to be zero.

This commentary has had a downstream impact in the `system-interface` crate where it tries to handle this by explicitly writing zeros, but I don't believe that's necessary any more. I'm sending a PR upstream here to avoid future confusion and codify that zeros are written in the intermediate bytes matching what Windows currently provides.
sunfishcode pushed a commit to bytecodealliance/system-interface that referenced this pull request Feb 13, 2024
This commit it borne out of CI failures on bytecodealliance/wasmtime#7638.
Investigating this failure has revealed a number of aspects here which
I've attempted to address in this PR. The notable changes here are:

* The current code in this crate was handling the case where
  `FileExt::seek_write` on Windows was leaving intermediate bytes as
  undefined when a write happened beyond the end of a file. I believe
  that this is due to an error in the documentation of the Rust standard
  library which I've submitted rust-lang/rust#120452 to fix.

* Removing handling of "always write zeros" handles the primary failure
  of the PR bytecodealliance/wasmtime#7638 which is that
  `write_vectored_at` was always returning 0 on Windows for writes past
  the end of the file. This is because Windows doesn't have a vectored
  file write so the vector chosen was the first nonempty vector which
  was the one containing zeros to extend the file. That meant that the
  method always returned zero.

* Previously the methods here used file locking which appeared to handle
  the case where the file was calculated and then the write happened.
  Given that this no longer happens I've removed the locking here.

* The `write_all_at` method had a loop around `reopen_write` handling
  the `Interrupted` error but no other methods did, so I opted to remove
  the loop and leave that to the internals of `reopen_write` if
  necessary.

* Other methods related to this are all simplified to directly use
  `seek_write` and avoid handling the case where writes past the end
  need to write zeros (as zeros are guaranteed by Windows).

Overall my hope is to use this to unblock bytecodealliance/wasmtime#7638
to get more platform-agnostic behavior for writing beyond the end of a
file.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
O-windows Operating system: Windows S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. T-libs Relevant to the library team, which will review and decide on the PR/issue.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants