
Increase max file size to ~500MB #2490

Closed
Mark-Simulacrum opened this issue Sep 16, 2020 · 9 comments
@Mark-Simulacrum
Member

Problem

The rustc_driver dll on Windows bundles the whole compiler and LLVM into one file, and unsurprisingly that's a big binary, currently hovering at ~193 MiB (about 202 MB). This is quite close to the 200 MB limit (and, depending on whether that limit is defined in powers of 10 or powers of 2, already over it).

Steps

I can't reproduce this myself, but we did get a user report (rust-lang/rust#76795) -- maybe the limit used to be smaller?

Possible Solution(s)

Notes

Output of rustup --version: unknown
Output of rustup show: unknown

@rbtcollins
Contributor

rbtcollins commented Sep 17, 2020 via email

@kinnison
Contributor

I've asked about the rustup version on the linked issue. 500M is definitely not going to happen, because small platforms are more at risk. Chunking the installation of larger files would be more effective long term, since we could then remove the maximum, or set it to something entirely unlikely to ever happen, like 1G.

@rbtcollins
Contributor

We don't actually measure the largest file size today; it's a buffer size. So we wouldn't make it bigger, we'd just be able to make the minimum size we can work with smaller.

@Mark-Simulacrum
Member Author

@kinnison it seems odd to artificially constrain the file size rather than either just allocating or using the try_reserve operators to fallibly do so (to the extent possible). I would expect this to be a perfect case where even if you don't have much RAM, you would then allocate swap -- failing to install because of hard-coded limits seems worse. Could we at least make it an environment-configured option or something like that?
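The fallible-allocation idea above can be sketched like this (the alloc_unpack_buffer helper is hypothetical, not rustup code; try_reserve was unstable when this thread was written and later stabilized in Rust 1.57):

```rust
use std::collections::TryReserveError;

// Hypothetical sketch: allocate the unpack buffer fallibly instead of
// enforcing a hard-coded size cap. On failure the caller gets an error
// it can report, rather than the process aborting on OOM.
fn alloc_unpack_buffer(size: usize) -> Result<Vec<u8>, TryReserveError> {
    let mut buf: Vec<u8> = Vec::new();
    buf.try_reserve_exact(size)?; // fails gracefully if the allocation can't be made
    Ok(buf)
}
```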

@rbtcollins
Contributor

Can we just not speculate about what we might or might not do, and actually go off what the code does?

We probably want to change this to a warning:

if size > MAX_FILE_SIZE {

The current limit is 220MB, well over the size of the file they are failing on. So we don't know why the OP has encountered a failure, and guessing at what is going on isn't going to make things better. They are on Windows where the current rustup code should work perfectly fine with no tuning or tweaking at all, and figuring out why it didn't should be our first goal.
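A hedged sketch of what demoting that check to a warning could look like (the MAX_FILE_SIZE value and the check_size helper here are illustrative, not rustup's actual code):

```rust
// Illustrative constant: the ~220MB limit discussed in this thread.
const MAX_FILE_SIZE: u64 = 220 * 1024 * 1024;

fn exceeds_limit(size: u64) -> bool {
    size > MAX_FILE_SIZE
}

// Warn and continue, rather than returning a hard error.
fn check_size(path: &str, size: u64) {
    if exceeds_limit(size) {
        eprintln!("warning: {} is {} bytes, over the expected maximum", path, size);
    }
}
```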

In terms of the code today, our unpack code path is tuned to submit IO in a way that is friendly to Linux with local FS's, Linux with networked FS's, and Windows similarly. That means threads because neither Windows nor Linux have non-blocking close available, and both Linux and Windows have blocking behaviour on close under various circumstances we encounter.

The code is somewhat naive though, and submits all the IO for a single file at once, rather than permitting chunks of a file to be dispatched and completed incrementally; this means that the peak memory footprint to handle a single file is precisely the size of that single file.

try_reserve as a way to handle backpressure on the input into the IO scheduler would be great in principle, but it has two problems: 1) it is unstable, and 2) it is not sufficient, because any use of swap would completely undermine the point of threaded IO in the first place, which is to be fast. rustup was taking up to 10 minutes to unpack on Windows, and driving small machines into hundreds of MB of swap, by decompressing straight into RAM faster than a relatively slow flash device can write, would be a regression.

But that's ok - we already have autodetection of the actual available RAM for the process, taking into account ulimits etc (see impl MemoryBudget in the unpack code). We currently set that to a max size of 500MB, for no particularly good reason, except that we felt it was unlikely that any one file would exceed 500MB for a good long time, and hopefully we'd be chunking individual files by then, at which point having arbitrarily large buffers would be irrelevant: we only need a large enough buffer to keep ahead of the OS.

MAX_FILE_SIZE is the magic constant where we track the current largest file size; when what Rust is shipping changes, we update it, and it exists solely so that users in low-memory situations get a better error message than a panic.
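A rough sketch of how a budget like this could behave, assuming hypothetical claim/free methods and the 500MB ceiling mentioned above (not the actual rustup implementation):

```rust
// Illustrative ceiling from the thread: no single buffer pool larger than this.
const DEFAULT_MAX: usize = 500 * 1024 * 1024;

// Hypothetical memory budget: cap the IO buffer pool at the lesser of
// detected available RAM and the hard ceiling, and refuse (or defer)
// work that would exceed the budget.
struct MemoryBudget {
    limit: usize,
    used: usize,
}

impl MemoryBudget {
    fn new(detected_ram: usize) -> Self {
        MemoryBudget { limit: detected_ram.min(DEFAULT_MAX), used: 0 }
    }

    // Try to claim `bytes` from the budget; false means the caller must wait.
    fn claim(&mut self, bytes: usize) -> bool {
        if self.used + bytes <= self.limit {
            self.used += bytes;
            true
        } else {
            false
        }
    }

    // Return `bytes` to the budget once the IO completes.
    fn free(&mut self, bytes: usize) {
        self.used = self.used.saturating_sub(bytes);
    }
}
```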

Low memory situations include:

  • Running on a small unix device without swap
  • Running with low ulimits, including on Windows with Job objects.

So in summary today:

  • We should make the current hard failure on files over the limit a logged warning rather than an error, so that we have a grace period to get new rustup releases out - that's easy to do and will make this less of a panic situation.
  • Other than that, it should Just Work for the vast majority of users, and certainly all tier-one cases and all desktop and server cases, without configuration or mucking around.
  • Contributing a chunking implementation to the unpack code path would remove the core scalability issue that drives the problem, and would let us drop the message people see down to verbose mode.
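The chunking idea in that last bullet can be sketched as a fixed-size streaming copy, so peak memory per file is one chunk regardless of file size (illustrative only; the chunk size and helper name are assumptions, not rustup's code):

```rust
use std::io::{Read, Result, Write};

// Illustrative chunk size; the real value would be tuned.
const CHUNK_SIZE: usize = 8 * 1024 * 1024;

// Stream `src` into `dst` one chunk at a time, so peak memory is
// CHUNK_SIZE rather than the whole file. Returns total bytes copied.
fn copy_chunked<R: Read, W: Write>(mut src: R, mut dst: W) -> Result<u64> {
    let mut buf = vec![0u8; CHUNK_SIZE];
    let mut total = 0u64;
    loop {
        let n = src.read(&mut buf)?;
        if n == 0 {
            break;
        }
        dst.write_all(&buf[..n])?;
        total += n as u64;
    }
    Ok(total)
}
```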

@kinnison
Contributor

Thanks for that, Robert. I wonder, though, if the original poster ran into the limit we still have coded here: https://github.com/rust-lang/rustup/blob/master/src/dist/component/package.rs#L300

@rbtcollins
Contributor

I suspect they did - where that is used to artificially error is the first link in my screed. Also my apologies to @Mark-Simulacrum for the slight tone in my long post, I'm very tired right now having moved across the globe, and I probably shouldn't be interacting with anyone just now.

@kinnison
Contributor

Aha, I read "the magic constant where we track the current largest file size" as referring to the 500MiB total IO buffer. Perhaps I shouldn't be on issue duty either :D I'll go back to human management instead :D

kinnison added a commit that referenced this issue Sep 30, 2020
Fix #2490: Don't fail artificially on large files
@gilescope

gilescope commented Oct 22, 2020

We saw this 'File too big' error running cargo update:

error: File too big rustc-nightly-x86_64-pc-windows-gnu/rustc/bin/rustc_driver-343d46098513aae4.dll 202316426

Ah, there was a rust-toolchain file in there that was sparking this off... OK, that makes sense. (That was the 18-10-2020 nightly.)
