syntax: Optimize some literal parsing
Currently in the `wasm-bindgen` project we have a very large crate that's
procedurally generated, `web-sys`. To generate this crate we parse all of a
browser's WebIDL and then generate bindings for all of the APIs contained
within.

The resulting Rust file is 18MB (wow!) and currently takes a very long time to
compile in debug mode. On the nightly compiler a *debug* build takes 90s for
the crate to finish. I was curious what was taking so long, and upon
investigating, a *massive* portion of the time was spent in the `lit_token`
method of the compiler, primarily formatting strings via `format!`.

Upon some more investigation it looks like `byte_str_lit` was allocating an
error message once per byte, causing a very large number of allocations for
large literals, of which wasm-bindgen generates quite a few (some are
megabytes in size).

This commit fixes the issue by lazily allocating the error message, only doing
so if the error message is actually needed (which should be never). As a result,
the debug mode compilation time for our `web-sys` crate decreased from 90s to
20s, a very nice improvement! (although we've still got some work to do).
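
To illustrate the difference, here is a minimal standalone sketch (not the compiler's code) that counts how often the message gets built under each style. The names `message`, `sum_eager`, and `sum_lazy` are hypothetical and exist only for this example; the `Option<u8>` stands in for the result of `chars.peek()`. Note that each message embeds the whole literal, so in the eager style the cost of every allocation also grows with the literal's length.

```rust
use std::cell::Cell;

// Hypothetical stand-in for the old `error` closure: formats the message
// and records that it was built.
fn message(lit: &str, i: usize, built: &Cell<usize>) -> String {
    built.set(built.get() + 1);
    format!("lexer should have rejected {} at {}", lit, i)
}

// Eager style (before the patch): the message is allocated up front on
// every iteration, even though `expect` only uses it on `None`.
fn sum_eager(lit: &str, built: &Cell<usize>) -> u64 {
    let mut total = 0u64;
    for (i, b) in lit.bytes().enumerate() {
        let next: Option<u8> = Some(b);
        let em = message(lit, i, built);
        total += u64::from(next.expect(&em));
    }
    total
}

// Lazy style (after the patch): the closure passed to `unwrap_or_else`
// runs only on `None`, so the happy path never formats anything.
fn sum_lazy(lit: &str, built: &Cell<usize>) -> u64 {
    let mut total = 0u64;
    for (i, b) in lit.bytes().enumerate() {
        let next: Option<u8> = Some(b);
        total += u64::from(next.unwrap_or_else(|| panic!("{}", message(lit, i, built))));
    }
    total
}

fn main() {
    let lit = "abc";
    let (eager, lazy) = (Cell::new(0), Cell::new(0));
    assert_eq!(sum_eager(lit, &eager), sum_lazy(lit, &lazy));
    println!("eager built {} messages, lazy built {}", eager.get(), lazy.get());
}
```

Both functions return the same result; only the number of messages built differs.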
alexcrichton committed Aug 20, 2018
1 parent 3ac79c7 commit 5bf2ad3
Showing 1 changed file with 4 additions and 6 deletions.
10 changes: 4 additions & 6 deletions src/libsyntax/parse/mod.rs
@@ -532,7 +532,7 @@ fn byte_lit(lit: &str) -> (u8, usize) {
 fn byte_str_lit(lit: &str) -> Lrc<Vec<u8>> {
     let mut res = Vec::with_capacity(lit.len());

-    let error = |i| format!("lexer should have rejected {} at {}", lit, i);
+    let error = |i| panic!("lexer should have rejected {} at {}", lit, i);

     /// Eat everything up to a non-whitespace
     fn eat<I: Iterator<Item=(usize, u8)>>(it: &mut iter::Peekable<I>) {
@@ -551,12 +551,11 @@ fn byte_str_lit(lit: &str) -> Lrc<Vec<u8>> {
     loop {
         match chars.next() {
             Some((i, b'\\')) => {
-                let em = error(i);
-                match chars.peek().expect(&em).1 {
+                match chars.peek().unwrap_or_else(|| error(i)).1 {
                     b'\n' => eat(&mut chars),
                     b'\r' => {
                         chars.next();
-                        if chars.peek().expect(&em).1 != b'\n' {
+                        if chars.peek().unwrap_or_else(|| error(i)).1 != b'\n' {
                             panic!("lexer accepted bare CR");
                         }
                         eat(&mut chars);
@@ -573,8 +572,7 @@ fn byte_str_lit(lit: &str) -> Lrc<Vec<u8>> {
                 }
             },
             Some((i, b'\r')) => {
-                let em = error(i);
-                if chars.peek().expect(&em).1 != b'\n' {
+                if chars.peek().unwrap_or_else(|| error(i)).1 != b'\n' {
                     panic!("lexer accepted bare CR");
                 }
                 chars.next();
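
For readers who want to try the patched control flow outside the compiler, the following is a simplified sketch of just the bare-CR check. It is an assumption-laden reduction, not the real `byte_str_lit`: the function name `normalize_crlf` is hypothetical, escape handling is omitted, and a diverging helper function `reject` is swapped in for the `error` closure to keep type inference simple.

```rust
// Hypothetical stand-in for the patched `error` closure. It never returns
// (`-> !`), so the message is only formatted on the failure path.
fn reject(lit: &str, i: usize) -> ! {
    panic!("lexer should have rejected {} at {}", lit, i)
}

// Simplified sketch: copies bytes through, turning each CRLF pair into LF
// and panicking on a CR that is not followed by LF.
fn normalize_crlf(lit: &str) -> Vec<u8> {
    let mut res = Vec::with_capacity(lit.len());
    let mut chars = lit.bytes().enumerate().peekable();
    while let Some((i, b)) = chars.next() {
        if b == b'\r' {
            // `peek` yields `Option<&(usize, u8)>`; the closure runs only
            // when the input ends right after the CR.
            if chars.peek().unwrap_or_else(|| reject(lit, i)).1 != b'\n' {
                panic!("lexer accepted bare CR");
            }
            chars.next();
            res.push(b'\n');
        } else {
            res.push(b);
        }
    }
    res
}

fn main() {
    assert_eq!(normalize_crlf("a\r\nb"), b"a\nb".to_vec());
}
```

Because `reject` has return type `!`, the closure `|| reject(lit, i)` coerces to whatever `unwrap_or_else` expects, which is what lets the panic live behind a lazily evaluated call.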