Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Call FileEncoder::finish in rmeta encoding #117301

Merged
merged 1 commit into from Nov 26, 2023

Conversation

saethlin
Copy link
Member

Fixes #117254

The bug here was that rmeta encoding never called FileEncoder::finish. Now it does. Most of the changes here are needed to support that, since rmeta encoding wants to finish then access the File in the encoder, so finish can't move out.

I tried adding a cfg(debug_assertions) exploding Drop impl to FileEncoder that checked for finish being called before dropping, but fatal errors cause unwinding so this isn't really possible. If we encounter a fatal error with a dirty FileEncoder, the Drop impl ICEs even though the implementation is correct. If we try to paper over that by wrapping FileEncoder in ManuallyDrop then that just erases the fact that Drop automatically checks that we call finish on all paths.

I also changed the name of DepGraph::encode to DepGraph::finish_encoding, because that's what it does and it makes the fact that it is the path to FileEncoder::finish less confusing.

r? @WaffleLapkin

@rustbot
Copy link
Collaborator

rustbot commented Oct 28, 2023

Could not assign reviewer from: WaffleLapkin.
User(s) WaffleLapkin are either the PR author, already assigned, or on vacation, and there are no other candidates.
Use r? to specify someone else to assign.

@rustbot
Copy link
Collaborator

rustbot commented Oct 28, 2023

r? @TaKO8Ki

(rustbot has picked a reviewer for you, use r? to override)

@rustbot rustbot added A-query-system Area: The rustc query system (https://rustc-dev-guide.rust-lang.org/query.html) S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. labels Oct 28, 2023
@Mark-Simulacrum
Copy link
Member

Mark-Simulacrum commented Oct 28, 2023

I tried adding a cfg(debug_assertions) exploding Drop impl to FileEncoder that checked for finish being called before dropping, but fatal errors cause unwinding so this isn't really possible. If we encounter a fatal error with a dirty FileEncoder, the Drop impl ICEs even though the implementation is correct.

Can't we guard the assertion in Drop with std::thread::panicking, so that a fatal error just skips over the explosion?

@saethlin
Copy link
Member Author

Yep I think that can be done. I'll push a change that does that in a bit.

@rustbot
Copy link
Collaborator

rustbot commented Oct 28, 2023

These commits modify the Cargo.lock file. Unintentional changes to Cargo.lock can be introduced when switching branches and rebasing PRs.

If this was unintentional then you should revert the changes before this PR is merged.
Otherwise, you can ignore this comment.

@bors
Copy link
Contributor

bors commented Oct 30, 2023

☔ The latest upstream changes (presumably #117405) made this pull request unmergeable. Please resolve the merge conflicts.

@bors
Copy link
Contributor

bors commented Nov 17, 2023

☔ The latest upstream changes (presumably #117993) made this pull request unmergeable. Please resolve the merge conflicts.

@bors
Copy link
Contributor

bors commented Nov 22, 2023

☔ The latest upstream changes (presumably #118086) made this pull request unmergeable. Please resolve the merge conflicts.

file.write_all(&[(pos >> 24) as u8, (pos >> 16) as u8, (pos >> 8) as u8, (pos >> 0) as u8])
.unwrap_or_else(|err| tcx.sess.emit_fatal(FailWriteFile { err }));
file.seek(std::io::SeekFrom::Start(header as u64))?;
file.write_all(&[(pos >> 24) as u8, (pos >> 16) as u8, (pos >> 8) as u8, (pos >> 0) as u8])?;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unrelated to the PR but why is this not (pos as u32).to_be_bytes()? D:

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This code is very old. It could probably do with being totally rewritten, and I didn't even notice that this is another u32 position, ouch.

@WaffleLapkin
Copy link
Member

r? WaffleLapkin
@bors r+

@bors
Copy link
Contributor

bors commented Nov 26, 2023

📌 Commit fbaa24e has been approved by WaffleLapkin

It is now in the queue for this repository.

@rustbot rustbot assigned WaffleLapkin and unassigned TaKO8Ki Nov 26, 2023
@bors bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Nov 26, 2023
@bors
Copy link
Contributor

bors commented Nov 26, 2023

⌛ Testing commit fbaa24e with merge 3dbb4da...

@bors
Copy link
Contributor

bors commented Nov 26, 2023

☀️ Test successful - checks-actions
Approved by: WaffleLapkin
Pushing 3dbb4da to master...

@bors bors added the merged-by-bors This PR was explicitly merged by bors. label Nov 26, 2023
@bors bors merged commit 3dbb4da into rust-lang:master Nov 26, 2023
12 checks passed
@rustbot rustbot added this to the 1.76.0 milestone Nov 26, 2023
@rust-timer
Copy link
Collaborator

Finished benchmarking commit (3dbb4da): comparison URL.

Overall result: no relevant changes - no action needed

@rustbot label: -perf-regression

Instruction count

This benchmark run did not return any relevant results for this metric.

Max RSS (memory usage)

Results

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

mean range count
Regressions ❌
(primary)
- - 0
Regressions ❌
(secondary)
1.3% [1.2%, 1.7%] 3
Improvements ✅
(primary)
-2.1% [-2.1%, -2.1%] 1
Improvements ✅
(secondary)
-4.1% [-4.7%, -3.7%] 3
All ❌✅ (primary) -2.1% [-2.1%, -2.1%] 1

Cycles

Results

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

mean range count
Regressions ❌
(primary)
- - 0
Regressions ❌
(secondary)
- - 0
Improvements ✅
(primary)
- - 0
Improvements ✅
(secondary)
-2.5% [-2.5%, -2.5%] 1
All ❌✅ (primary) - - 0

Binary size

This benchmark run did not return any relevant results for this metric.

Bootstrap: 674.277s -> 674.133s (-0.02%)
Artifact size: 313.37 MiB -> 313.38 MiB (0.00%)

@saethlin saethlin deleted the finish-rmeta-encoding branch November 26, 2023 18:54
bors added a commit to rust-lang-ci/rust that referenced this pull request Dec 11, 2023
…apkin

 Use a u64 for the rmeta root position

Waffle noticed this in rust-lang#117301 (comment)

We've upgraded the other file offsets to u64, and this one only costs 4 bytes per file. Also the way the truncation was being done before was extremely easy to miss, I sure missed it! It's not clear to me if not having this change effectively made the other upgrades from u32 to u64 ineffective, but we can have it now.

r? `@WaffleLapkin`
github-actions bot pushed a commit to rust-lang/miri that referenced this pull request Dec 12, 2023
 Use a u64 for the rmeta root position

Waffle noticed this in rust-lang/rust#117301 (comment)

We've upgraded the other file offsets to u64, and this one only costs 4 bytes per file. Also the way the truncation was being done before was extremely easy to miss, I sure missed it! It's not clear to me if not having this change effectively made the other upgrades from u32 to u64 ineffective, but we can have it now.

r? `@WaffleLapkin`
lnicola pushed a commit to lnicola/rust-analyzer that referenced this pull request Dec 12, 2023
 Use a u64 for the rmeta root position

Waffle noticed this in rust-lang/rust#117301 (comment)

We've upgraded the other file offsets to u64, and this one only costs 4 bytes per file. Also the way the truncation was being done before was extremely easy to miss, I sure missed it! It's not clear to me if not having this change effectively made the other upgrades from u32 to u64 ineffective, but we can have it now.

r? `@WaffleLapkin`
fmease added a commit to fmease/rust that referenced this pull request Jan 3, 2024
…Lapkin,Nilstrieb

Report I/O errors from rmeta encoding with emit_fatal

rust-lang#119456 reminded me that I never did systematic testing to provoke the out-of-disk ICEs so I grepped through a recent crater run (rust-lang#119440 (comment)) for more out-of-disk ICEs on current master and yep there's 2 in there.

So I finally cooked up a way to provoke for these crashes. I wrote a little `cdylib` crate that has a `#[no_mangle] pub extern "C" fn write` which occasionally reports `ENOSPC`, and prints a backtrace when it does.
<details><summary><strong>code for the dylib</strong></summary>
<p>

```rust
// cargo add libc rand backtrace
use rand::Rng;

#[no_mangle]
pub extern "C" fn write(
    fd: libc::c_int,
    buf: *const libc::c_void,
    count: libc::size_t,
) -> libc::ssize_t {
    if fd > 2 && rand::thread_rng().gen::<u8>() == 0 {
        let mut count = 0;
        backtrace::trace(|frame| {
            backtrace::resolve_frame(frame, |symbol| {
                if let Some(name) = symbol.name() {
                    if count > 3 {
                        eprintln!("{}", name);
                    }
                }
                count += 1;
            });
            true
        });

        unsafe {
            *libc::__errno_location() = libc::ENOSPC;
        }
        return -1;
    } else {
        unsafe {
            let res =
                libc::syscall(libc::SYS_write, fd as usize, buf as usize, count as usize) as isize;
            if res < 0 {
                *libc::__errno_location() = -res as i32;
                -1
            } else {
                res
            }
        }
    }
}
```

</p>
</details>

Then `LD_PRELOAD` that dylib and repeatedly build a big project until it ICEs, such as with this:
```bash
while true; do
    cargo clean
    LD_PRELOAD=/home/ben/evil/target/release/libevil.so cargo +stage1 check 2> errors
    if grep "thread 'rustc' panicked" errors; then
        break
    fi
done
```
My "big project" for testing was an otherwise-empty project with `cargo add axum`.

Before this PR, the above procedure finds a crash in between 1 and 15 minutes. With this PR, I have not found a crash in 30 minutes, and I'll be leaving this to run overnight (starting now). (A night has now passed, no crashes were found)

I believe the problem is that even though since rust-lang#117301 we correctly check `FileEncoder` for errors on all paths, we use `emit_err`, so there is a window of time between the call to `emit_err` and the full error reporting where rustc believes it has emitted a valid rmeta file and will permit Cargo to launch a build for a dependent crate. Changing these calls to `emit_fatal` closes that window.

I think there are a number of other cases where `emit_err` has been used instead of the more-correct `emit_fatal` such as https://github.com/rust-lang/rust/blob/e51e98dde6a60637b6a71b8105245b629ac3fe77/compiler/rustc_codegen_ssa/src/back/write.rs#L542 but unlike rmeta encoding I am not aware of those cases of those causing problems.

r? `@WaffleLapkin`
fmease added a commit to fmease/rust that referenced this pull request Jan 3, 2024
…Lapkin,Nilstrieb

Report I/O errors from rmeta encoding with emit_fatal

rust-lang#119456 reminded me that I never did systematic testing to provoke the out-of-disk ICEs so I grepped through a recent crater run (rust-lang#119440 (comment)) for more out-of-disk ICEs on current master and yep there's 2 in there.

So I finally cooked up a way to provoke for these crashes. I wrote a little `cdylib` crate that has a `#[no_mangle] pub extern "C" fn write` which occasionally reports `ENOSPC`, and prints a backtrace when it does.
<details><summary><strong>code for the dylib</strong></summary>
<p>

```rust
// cargo add libc rand backtrace
use rand::Rng;

#[no_mangle]
pub extern "C" fn write(
    fd: libc::c_int,
    buf: *const libc::c_void,
    count: libc::size_t,
) -> libc::ssize_t {
    if fd > 2 && rand::thread_rng().gen::<u8>() == 0 {
        let mut count = 0;
        backtrace::trace(|frame| {
            backtrace::resolve_frame(frame, |symbol| {
                if let Some(name) = symbol.name() {
                    if count > 3 {
                        eprintln!("{}", name);
                    }
                }
                count += 1;
            });
            true
        });

        unsafe {
            *libc::__errno_location() = libc::ENOSPC;
        }
        return -1;
    } else {
        unsafe {
            let res =
                libc::syscall(libc::SYS_write, fd as usize, buf as usize, count as usize) as isize;
            if res < 0 {
                *libc::__errno_location() = -res as i32;
                -1
            } else {
                res
            }
        }
    }
}
```

</p>
</details>

Then `LD_PRELOAD` that dylib and repeatedly build a big project until it ICEs, such as with this:
```bash
while true; do
    cargo clean
    LD_PRELOAD=/home/ben/evil/target/release/libevil.so cargo +stage1 check 2> errors
    if grep "thread 'rustc' panicked" errors; then
        break
    fi
done
```
My "big project" for testing was an otherwise-empty project with `cargo add axum`.

Before this PR, the above procedure finds a crash in between 1 and 15 minutes. With this PR, I have not found a crash in 30 minutes, and I'll be leaving this to run overnight (starting now). (A night has now passed, no crashes were found)

I believe the problem is that even though since rust-lang#117301 we correctly check `FileEncoder` for errors on all paths, we use `emit_err`, so there is a window of time between the call to `emit_err` and the full error reporting where rustc believes it has emitted a valid rmeta file and will permit Cargo to launch a build for a dependent crate. Changing these calls to `emit_fatal` closes that window.

I think there are a number of other cases where `emit_err` has been used instead of the more-correct `emit_fatal` such as https://github.com/rust-lang/rust/blob/e51e98dde6a60637b6a71b8105245b629ac3fe77/compiler/rustc_codegen_ssa/src/back/write.rs#L542 but unlike rmeta encoding I am not aware of those cases of those causing problems.

r? ``@WaffleLapkin``
rust-timer added a commit to rust-lang-ci/rust that referenced this pull request Jan 3, 2024
Rollup merge of rust-lang#119510 - saethlin:fatal-io-errors, r=WaffleLapkin,Nilstrieb

Report I/O errors from rmeta encoding with emit_fatal

rust-lang#119456 reminded me that I never did systematic testing to provoke the out-of-disk ICEs so I grepped through a recent crater run (rust-lang#119440 (comment)) for more out-of-disk ICEs on current master and yep there's 2 in there.

So I finally cooked up a way to provoke for these crashes. I wrote a little `cdylib` crate that has a `#[no_mangle] pub extern "C" fn write` which occasionally reports `ENOSPC`, and prints a backtrace when it does.
<details><summary><strong>code for the dylib</strong></summary>
<p>

```rust
// cargo add libc rand backtrace
use rand::Rng;

#[no_mangle]
pub extern "C" fn write(
    fd: libc::c_int,
    buf: *const libc::c_void,
    count: libc::size_t,
) -> libc::ssize_t {
    if fd > 2 && rand::thread_rng().gen::<u8>() == 0 {
        let mut count = 0;
        backtrace::trace(|frame| {
            backtrace::resolve_frame(frame, |symbol| {
                if let Some(name) = symbol.name() {
                    if count > 3 {
                        eprintln!("{}", name);
                    }
                }
                count += 1;
            });
            true
        });

        unsafe {
            *libc::__errno_location() = libc::ENOSPC;
        }
        return -1;
    } else {
        unsafe {
            let res =
                libc::syscall(libc::SYS_write, fd as usize, buf as usize, count as usize) as isize;
            if res < 0 {
                *libc::__errno_location() = -res as i32;
                -1
            } else {
                res
            }
        }
    }
}
```

</p>
</details>

Then `LD_PRELOAD` that dylib and repeatedly build a big project until it ICEs, such as with this:
```bash
while true; do
    cargo clean
    LD_PRELOAD=/home/ben/evil/target/release/libevil.so cargo +stage1 check 2> errors
    if grep "thread 'rustc' panicked" errors; then
        break
    fi
done
```
My "big project" for testing was an otherwise-empty project with `cargo add axum`.

Before this PR, the above procedure finds a crash in between 1 and 15 minutes. With this PR, I have not found a crash in 30 minutes, and I'll be leaving this to run overnight (starting now). (A night has now passed, no crashes were found)

I believe the problem is that even though since rust-lang#117301 we correctly check `FileEncoder` for errors on all paths, we use `emit_err`, so there is a window of time between the call to `emit_err` and the full error reporting where rustc believes it has emitted a valid rmeta file and will permit Cargo to launch a build for a dependent crate. Changing these calls to `emit_fatal` closes that window.

I think there are a number of other cases where `emit_err` has been used instead of the more-correct `emit_fatal` such as https://github.com/rust-lang/rust/blob/e51e98dde6a60637b6a71b8105245b629ac3fe77/compiler/rustc_codegen_ssa/src/back/write.rs#L542 but unlike rmeta encoding I am not aware of those cases of those causing problems.

r? ``@WaffleLapkin``
lnicola pushed a commit to lnicola/rust-analyzer that referenced this pull request Apr 7, 2024
 Use a u64 for the rmeta root position

Waffle noticed this in rust-lang/rust#117301 (comment)

We've upgraded the other file offsets to u64, and this one only costs 4 bytes per file. Also the way the truncation was being done before was extremely easy to miss, I sure missed it! It's not clear to me if not having this change effectively made the other upgrades from u32 to u64 ineffective, but we can have it now.

r? `@WaffleLapkin`
RalfJung pushed a commit to RalfJung/rust-analyzer that referenced this pull request Apr 27, 2024
 Use a u64 for the rmeta root position

Waffle noticed this in rust-lang/rust#117301 (comment)

We've upgraded the other file offsets to u64, and this one only costs 4 bytes per file. Also the way the truncation was being done before was extremely easy to miss, I sure missed it! It's not clear to me if not having this change effectively made the other upgrades from u32 to u64 ineffective, but we can have it now.

r? `@WaffleLapkin`
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-query-system Area: The rustc query system (https://rustc-dev-guide.rust-lang.org/query.html) merged-by-bors This PR was explicitly merged by bors. S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

FileEncoder delayed error reporting is still broken
8 participants