Avoid allocating when parsing \u{...} literals. #50052

nnethercote · 2018-04-18T12:07:04Z

char_lit uses an allocation in order to ignore '_' chars in \u{...}
literals. This patch changes it to not do that by processing the chars
more directly.
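
As a rough illustration of the difference (a sketch only, not the code from this patch): the helper names `parse_unicode_alloc` and `parse_unicode_direct` are hypothetical, and the input is assumed to be just the text between the braces of a `\u{...}` escape, made up of ASCII hex digits and `_`.

```rust
/// Allocating approach (hypothetical helper): build a filtered `String`
/// without the '_' separators, then parse that string as hexadecimal.
fn parse_unicode_alloc(digits: &str) -> Option<char> {
    let filtered: String = digits.chars().filter(|&c| c != '_').collect();
    let v = u32::from_str_radix(&filtered, 16).ok()?;
    char::from_u32(v)
}

/// Non-allocating approach (hypothetical helper): accumulate the value
/// one byte at a time, skipping '_' as it is encountered.
fn parse_unicode_direct(digits: &str) -> Option<char> {
    let mut v: u32 = 0;
    let mut seen_digit = false;
    // All hex digits and '_' are ASCII, so each byte can be treated as a char.
    for b in digits.bytes() {
        if b == b'_' {
            continue;
        }
        let d = char::from(b).to_digit(16)?; // None for any non-hex byte
        v = v.checked_mul(16)?.checked_add(d)?; // None on overflow
        seen_digit = true;
    }
    if !seen_digit {
        return None;
    }
    // Rejects surrogates and values above U+10FFFF.
    char::from_u32(v)
}
```

Both helpers should agree on inputs such as `"1F_600"`; the second simply never builds the intermediate `String`.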

This improves various rustc-perf benchmark measurements by up to 6%,
particularly regex, futures, clap, coercions, hyper, and encoding.

rustc-perf results, on a stage 2 build with jemalloc disabled:

regex-check
	avg: -5.4%	min: -6.5%	max: -2.7%
futures-check
	avg: -3.5%	min: -5.3%	max: -1.7%
regex-opt
	avg: -2.0%	min: -5.1%	max: -0.2%
regex
	avg: -2.3%	min: -5.0%	max: -0.6%
futures-opt
	avg: -3.0%	min: -4.8%	max: -1.1%
futures
	avg: -3.1%	min: -4.8%	max: -1.3%
clap-rs-check
	avg: -1.8%	min: -3.5%	max: -0.9%
coercions-check
	avg: -2.0%	min: -3.3%	max: -1.0%
hyper-check
	avg: -2.2%	min: -3.1%	max: -1.3%
hyper
	avg: -1.3%	min: -2.4%	max: -0.3%
hyper-opt
	avg: -0.9%	min: -2.3%	max: -0.1%
coercions
	avg: -1.1%	min: -2.2%	max: -0.4%
encoding-check
	avg: -1.7%	min: -2.2%	max: -0.9%
clap-rs-opt
	avg: -0.7%	min: -2.2%	max: 0.0%
coercions-opt
	avg: -1.2%	min: -2.1%	max: -0.3%
clap-rs
	avg: -0.8%	min: -1.9%	max: -0.4%
encoding-opt
	avg: -1.0%	min: -1.9%	max: -0.3%
encoding
	avg: -1.1%	min: -1.9%	max: -0.4%
piston-image-check
	avg: -0.7%	min: -1.3%	max: -0.3%
inflate-opt
	avg: -0.3%	min: -0.9%	max: -0.0%
piston-image
	avg: -0.3%	min: -0.8%	max: -0.1%
piston-image-opt
	avg: -0.3%	min: -0.7%	max: -0.1%
syn-check
	avg: -0.3%	min: -0.6%	max: -0.1%
deep-vector
	avg: 0.1%	min: -0.1%	max: 0.5%
syn-opt
	avg: -0.1%	min: -0.4%	max: 0.0%
html5ever
	avg: -0.2%	min: -0.4%	max: -0.0%
deep-vector-check
	avg: 0.0%	min: -0.3%	max: 0.3%
syn
	avg: -0.2%	min: -0.3%	max: -0.1%
html5ever-check
	avg: -0.3%	min: -0.3%	max: -0.2%
issue-46449-check
	avg: -0.1%	min: -0.2%	max: 0.2%
html5ever-opt
	avg: -0.0%	min: -0.2%	max: 0.1%
deep-vector-opt
	avg: -0.0%	min: -0.2%	max: 0.1%
issue-46449-opt
	avg: -0.0%	min: -0.2%	max: 0.1%
unify-linearly-check
	avg: -0.0%	min: -0.2%	max: 0.1%
helloworld-check
	avg: 0.0%	min: -0.0%	max: 0.2%
parser-check
	avg: -0.0%	min: -0.2%	max: 0.0%
inflate
	avg: 0.0%	min: -0.0%	max: 0.1%
tokio-webpush-simple-check
	avg: -0.1%	min: -0.1%	max: -0.0%
regression-31157-check
	avg: 0.0%	min: -0.1%	max: 0.1%
issue-46449
	avg: 0.0%	min: -0.1%	max: 0.1%
tuple-stress-opt
	avg: 0.0%	min: -0.0%	max: 0.1%
tuple-stress-check
	avg: -0.0%	min: -0.1%	max: 0.1%
tuple-stress
	avg: 0.0%	min: -0.0%	max: 0.1%
deeply-nested-check
	avg: 0.0%	min: -0.0%	max: 0.1%
regression-31157
	avg: -0.0%	min: -0.1%	max: 0.1%
deeply-nested-opt
	avg: -0.0%	min: -0.1%	max: 0.1%
parser-opt
	avg: -0.0%	min: -0.1%	max: 0.0%
parser
	avg: 0.1%	min: 0.0%	max: 0.1%
tokio-webpush-simple
	avg: -0.0%	min: -0.1%	max: 0.1%
regression-31157-opt
	avg: -0.0%	min: -0.1%	max: 0.1%
helloworld-opt
	avg: 0.0%	min: -0.0%	max: 0.1%
unify-linearly-opt
	avg: 0.0%	min: -0.0%	max: 0.1%
unused-warnings-check
	avg: 0.0%	min: 0.0%	max: 0.1%
tokio-webpush-simple-opt
	avg: -0.0%	min: -0.1%	max: 0.0%
helloworld
	avg: -0.0%	min: -0.0%	max: 0.1%
unused-warnings
	avg: 0.0%	min: -0.0%	max: 0.0%
deeply-nested
	avg: -0.0%	min: -0.0%	max: -0.0%
unused-warnings-opt
	avg: 0.0%	min: -0.0%	max: 0.0%
unify-linearly
	avg: 0.0%	min: -0.0%	max: 0.0%
inflate-check
	avg: 0.0%	min: -0.0%	max: 0.0%

kennytm · 2018-04-18T12:28:38Z

Could we do the same for the integer_lit function too? 🤔

Mark-Simulacrum · 2018-04-18T13:38:05Z

@bors try

bors · 2018-04-18T13:38:15Z

⌛ Trying commit ca47fdc00059278c920df9743d84758dac533a32 with merge 1d4dcc78c0dfae72827cd04a442c6bd9e9c89ce2...

bors · 2018-04-18T15:31:30Z

☀️ Test successful - status-travis
State: approved= try=True

leonardo-m · 2018-04-18T18:13:26Z

src/libsyntax/parse/mod.rs

+
+            // All digits and '_' are ascii, so treat each byte as a char.
+            let mut v: u32 = 0;
+            for &c in lit[3..idx].as_bytes().iter() {


Why not just lit[3..idx].bytes() {
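
For context, a small sketch of the two iteration styles being compared. The function names are hypothetical and the loop body (summing byte values) is only a stand-in; the point is that `as_bytes().iter()` yields `&u8`, so the loop needs a `&b` pattern, while `bytes()` yields `u8` directly.

```rust
/// Iterates via `as_bytes().iter()`, which yields `&u8`; the `&b` pattern
/// dereferences each item.
fn sum_via_as_bytes(s: &str) -> u32 {
    let mut total = 0u32;
    for &b in s.as_bytes().iter() {
        total += u32::from(b);
    }
    total
}

/// Iterates via `bytes()`, which yields `u8` by value; no pattern needed.
fn sum_via_bytes(s: &str) -> u32 {
    let mut total = 0u32;
    for b in s.bytes() {
        total += u32::from(b);
    }
    total
}
```

Neither form allocates; the second is just shorter.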

leonardo-m · 2018-04-18T18:14:54Z

src/libsyntax/parse/mod.rs

+            // All digits and '_' are ascii, so treat each byte as a char.
+            let mut v: u32 = 0;
+            for &c in lit[3..idx].as_bytes().iter() {
+                let c = c as char;


Why don't you replace that hard cast "as" with:

let c = char::from(c);
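
A short sketch of the two conversions, with hypothetical function names. For a `u8` both give the same result, since every byte value maps to the code point with the same number. The usual argument for `char::from` is that `From` is only implemented for conversions that cannot lose information, whereas `as` also compiles for casts that can truncate (for example `u32` to `u8`).

```rust
/// Conversion using an `as` cast.
fn byte_to_char_cast(b: u8) -> char {
    b as char
}

/// Conversion using the `From<u8>` impl for `char`.
fn byte_to_char_from(b: u8) -> char {
    char::from(b)
}
```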

nnethercote · 2018-04-18T23:04:01Z

> Could we do the same for the integer_lit function too 🤔

char_lit is much hotter, but I see in my profiles that integer_lit is moderately hot for a couple of benchmarks. So I'll do that too.

nnethercote · 2018-04-18T23:15:26Z

Actually, integer_lit is a bit harder, so I'd prefer to do that in a follow-up.

nnethercote · 2018-04-18T23:18:39Z

The new patch addresses @leonardo-m's comments.

Mark-Simulacrum · 2018-04-18T23:25:43Z

@bors r+

bors · 2018-04-18T23:25:44Z

📌 Commit 9f14502 has been approved by Mark-Simulacrum

bors · 2018-04-20T03:59:03Z

⌛ Testing commit 9f14502 with merge f6e532b8b41298b6adb04d6254e660cd7a5abb38...

bors · 2018-04-20T04:34:10Z

💔 Test failed - status-travis

nnethercote · 2018-04-20T06:16:54Z

Could this be an infrastructure issue? I don't see anything else obviously wrong...

kennytm · 2018-04-20T07:16:35Z

@bors retry

[00:01:44]    Compiling syn v0.13.1


No output has been received in the last 30m0s, this potentially indicates a stalled build or something wrong with the build itself.

bors · 2018-04-20T10:40:32Z

⌛ Testing commit 9f14502 with merge 85f5dd4...

bors · 2018-04-20T12:52:46Z

☀️ Test successful - status-appveyor, status-travis
Approved by: Mark-Simulacrum
Pushing 85f5dd4 to master...

nnethercote · 2018-10-29T21:48:59Z

> Actually, integer_lit is a bit harder, so I'd prefer to do that in a follow-up.

I finally did integer_lit (and float_lit) in #55384.

kennytm added the S-waiting-on-review label Apr 18, 2018

leonardo-m reviewed Apr 18, 2018

nnethercote force-pushed the char_lit branch from ca47fdc to 9f14502 Compare April 18, 2018 23:18

bors added the S-waiting-on-bors label and removed the S-waiting-on-review label Apr 18, 2018

bors added the S-waiting-on-review label and removed the S-waiting-on-bors label Apr 20, 2018

bors added the S-waiting-on-bors label and removed the S-waiting-on-review label Apr 20, 2018

pietroalbini assigned petrochenkov and unassigned petrochenkov Apr 20, 2018

pietroalbini assigned Mark-Simulacrum Apr 20, 2018

bors merged commit 9f14502 into rust-lang:master Apr 20, 2018

nnethercote deleted the char_lit branch April 23, 2018 05:10

nnethercote mentioned this pull request Aug 20, 2018

syntax: Optimize some literal parsing #53521

Merged
