
rustc runs out of memory while trying to compile a large static (lazy_static!) HashMap #61513

Open
ctrlcctrlv opened this issue Jun 4, 2019 · 9 comments
Labels
A-borrow-checker Area: The borrow checker C-bug Category: This is a bug. I-compilemem Issue: Problems and improvements with respect to memory usage during compilation. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.

Comments

@ctrlcctrlv
ctrlcctrlv commented Jun 4, 2019

    Blocking waiting for file lock on build directory
   Compiling crash v0.1.0 (/home/fred/Workspace/rtuts/crash)
error: Could not compile `crash`.

Caused by:
  process didn't exit successfully: `rustc --edition=2018 --crate-name crash src/main.rs --color always --crate-type bin --emit=dep-info,link -C debuginfo=2 -C metadata=71bdac3a949cee3b -C extra-filename=-71bdac3a949cee3b --out-dir /home/fred/Workspace/rtuts/crash/target/debug/deps -C incremental=/home/fred/Workspace/rtuts/crash/target/debug/incremental -L dependency=/home/fred/Workspace/rtuts/crash/target/debug/deps --extern lazy_static=/home/fred/Workspace/rtuts/crash/target/debug/deps/liblazy_static-7468e7fc38fb612c.rlib` (signal: 9, SIGKILL: kill)

shell returned 101

My machine has 48 GB of memory, yet it still can't compile the example project, which creates a static hash map at compile time.

I tried both nightly and stable; same thing.

The input source code is only 19 MB and contains nothing but u32s and strings. This shouldn't happen; GCC could do this with no problem.

(Screenshot: 2019-06-04-161946_2554x1540_scrot)

Project crash.zip
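For reference, the failing pattern is presumably along these lines (a minimal sketch with made-up entries; the real generated source in crash.zip has hundreds of thousands of them, and std's OnceLock is used here in place of lazy_static! so the snippet needs no dependencies):

```rust
use std::collections::HashMap;
use std::sync::OnceLock;

// In the real project this table has hundreds of thousands of generated
// entries in a single source file; at that scale rustc's borrow checker
// runs out of memory, which is the subject of this issue.
fn names() -> &'static HashMap<u32, &'static str> {
    static NAMES: OnceLock<HashMap<u32, &'static str>> = OnceLock::new();
    NAMES.get_or_init(|| {
        let mut m = HashMap::new();
        m.insert(0x0041, "LATIN CAPITAL LETTER A");
        m.insert(0x00E9, "LATIN SMALL LETTER E WITH ACUTE");
        m.insert(0x4E2D, "CJK UNIFIED IDEOGRAPH-4E2D");
        m
    })
}

fn main() {
    assert_eq!(names().get(&0x0041), Some(&"LATIN CAPITAL LETTER A"));
    println!("{} entries", names().len());
}
```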

@jonas-schievink jonas-schievink added I-compilemem Issue: Problems and improvements with respect to memory usage during compilation. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. labels Jun 4, 2019
@JuanPotato
Member

Hey, I'm doing almost exactly the same thing. I solved it by using a PHF (perfect hash function). It still crashed when I had all the Unicode entries added in one file, so I put the data in a separate file and read it from there in a loop while adding entries.

https://github.com/JuanPotato/charname

@jonas-schievink
Contributor

GCC is not a Rust compiler, so that comparison doesn't really work. It's also a single function containing 277,000 lines, so it's not exactly surprising that it takes a lot of time and memory to build.

That said, this should of course not blow up as drastically. According to -Ztime-passes, the memory usage spike happens after the misc checking 2 phase, which should be the borrow checking phase. It also reproduces with --edition=2018 with NLL (after printing solve_nll_region_constraints). The phases before that are comparatively tame, but during borrow checking the compiler somehow starts allocating multiple gigabytes per second.

Even if this gets fixed, I cannot recommend compiling this much generated code. A better solution might be to prepare the data in a build script and include it using include_bytes! or include_str!, which shouldn't suffer from this.
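The include_str! route might look roughly like this (a sketch; the string literal below stands in for the macro's output, and the tab-separated format is an assumption):

```rust
use std::collections::HashMap;
use std::sync::OnceLock;

// Stand-in for `include_str!("unicode_names.txt")`. The macro embeds the
// file's contents in the binary at compile time; only the parsing happens
// at run time, so rustc never type- or borrow-checks a 277,000-line
// function body.
static DATA: &str =
    "0041\tLATIN CAPITAL LETTER A\n00E9\tLATIN SMALL LETTER E WITH ACUTE\n";

fn names() -> &'static HashMap<u32, &'static str> {
    static NAMES: OnceLock<HashMap<u32, &'static str>> = OnceLock::new();
    NAMES.get_or_init(|| {
        DATA.lines()
            .filter_map(|line| {
                // Each line is "<hex codepoint>\t<name>".
                let (code, name) = line.split_once('\t')?;
                Some((u32::from_str_radix(code, 16).ok()?, name))
            })
            .collect()
    })
}

fn main() {
    assert_eq!(names()[&0x00E9], "LATIN SMALL LETTER E WITH ACUTE");
}
```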

@ctrlcctrlv
Author

@jonas-schievink This should be fixed. Compiling in data like this is routine. I don't want to use a data file or include_str!; I want all the data in the binary directly so the program can start using it right away without deserializing.

Compiling large amounts of data into a program as an optimization is common in C and C++: machine learning, video game graphics, etc. Being able to do this is essential for a systems programming language.

@JuanPotato Are there run-time costs? It's interesting how close our projects are, but I only made crash to demo the problem; my real program is a text-mode character map, so it will need all the Unicode data compiled in.
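One dependency-free way to get the data into the binary with zero run-time deserialization (a sketch; this workaround is not from the thread) is a static sorted table with binary search. A static like this is evaluated as a constant rather than as a giant function body of inserts, so it avoids the borrow-check blowup, at the cost of O(log n) lookups instead of a hash map's O(1):

```rust
// Static table sorted by codepoint. The data lives in the binary's
// read-only section; nothing is built or parsed at startup.
static NAMES: &[(u32, &str)] = &[
    (0x0041, "LATIN CAPITAL LETTER A"),
    (0x00E9, "LATIN SMALL LETTER E WITH ACUTE"),
    (0x4E2D, "CJK UNIFIED IDEOGRAPH-4E2D"),
];

fn lookup(cp: u32) -> Option<&'static str> {
    // Binary search on the codepoint column of the sorted table.
    NAMES
        .binary_search_by_key(&cp, |&(k, _)| k)
        .ok()
        .map(|i| NAMES[i].1)
}

fn main() {
    assert_eq!(lookup(0x4E2D), Some("CJK UNIFIED IDEOGRAPH-4E2D"));
    assert_eq!(lookup(0xFFFF), None);
}
```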

@JuanPotato
Member

I'm not sure about runtime costs, you would have to look up more about phf (perfect hash functions) for that.

@ctrlcctrlv
Author

I tried using phf, but it similarly uses all my memory and crashes. I didn't try the codegen approach though, just phf::map.

@JuanPotato
Member

Yeah, I had the same issue and ended up with my build.rs approach. My friend says phf might have better performance than a standard HashMap; I'll try a benchmark sometime today.

@jonas-schievink jonas-schievink added A-NLL Area: Non Lexical Lifetimes (NLL) NLL-performant Working towards the "performance is good" goal and removed A-NLL Area: Non Lexical Lifetimes (NLL) NLL-performant Working towards the "performance is good" goal labels Jun 4, 2019
@hellow554
Contributor

I want all the data in the binary directly

That's exactly what include_str! or include_bytes! does.

@ctrlcctrlv
Author

@hellow554 Without deserializing.

@jonas-schievink jonas-schievink added the A-borrow-checker Area: The borrow checker label Jun 5, 2019
@Enselic Enselic added the C-bug Category: This is a bug. label Nov 24, 2023
@AlexTMjugador

I've just been bitten by this issue while building a Rust file with ~26k quickphf set entries, worth 8.2 MiB of Rust source code. Unlike the quickphf benchmarks, I used a struct containing strings as the value type instead of primitive integers.

6 participants