Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use `Symbol` more #60630

Merged
merged 5 commits into from May 13, 2019

Conversation

@nnethercote
Copy link
Contributor

commented May 8, 2019

A Symbol can be equated with a string (e.g. &str). This involves a
TLS lookup to get the chars (and a Mutex lock in a parallel compiler)
and then a char-by-char comparison. This functionality is convenient but
avoids one of the main benefits of Symbols, which is fast equality
comparisons.

This PR removes the Symbol/string equality operations, forcing a lot
of existing string occurrences to become Symbols. Fortunately, these
are almost all static strings (many are attribute names) and we can add
static Symbols as necessary, and very little extra interning occurs.
The benefits are (a) a slight speedup (possibly greater in a parallel
compiler), and (b) the code is a lot more principled about Symbol use.
The main downside is verbosity, particularly with more use syntax::symbol::symbols items.

r? @Zoxc

@nnethercote nnethercote changed the title Use symbol more Use `Symbol` more May 8, 2019

@nnethercote

This comment has been minimized.

Copy link
Contributor Author

commented May 8, 2019

@bors try

@bors

This comment has been minimized.

Copy link
Contributor

commented May 8, 2019

⌛️ Trying commit b28d0ae with merge 1a89a32...

bors added a commit that referenced this pull request May 8, 2019

Auto merge of #60630 - nnethercote:use-Symbol-more, r=<try>
Use `Symbol` more

A `Symbol` can be equated with a string (e.g. `&str`). This involves a
TLS lookup to get the chars (and a Mutex lock in a parallel compiler)
and then a char-by-char comparison. This functionality is convenient but
avoids one of the main benefits of `Symbol`s, which is fast equality
comparisons.

This PR removes the `Symbol`/string equality operations, forcing a lot
of existing string occurrences to become `Symbol`s. Fortunately, these
are almost all static strings (many are attribute names) and we can add
static `Symbol`s as necessary, and very little extra interning occurs.
The benefits are (a) a slight speedup (possibly greater in a parallel
compiler), and (b) the code is a lot more principled about `Symbol` use.
The main downside is verbosity, particularly with more `use
syntax::symbol::symbols` items.

r? @Zoxc

@petrochenkov petrochenkov self-assigned this May 8, 2019

@oli-obk

This comment has been minimized.

Copy link
Contributor

commented May 8, 2019

bors seems to have gotten stuck

@bors try

@bors

This comment has been minimized.

Copy link
Contributor

commented May 8, 2019

💥 Test timed out

@mati865

This comment has been minimized.

Copy link
Contributor

commented May 8, 2019

@oli-obk I think you have to use retry, commit hash hasn't changed.

@oli-obk

This comment has been minimized.

Copy link
Contributor

commented May 8, 2019

@bors retry

@bors

This comment has been minimized.

Copy link
Contributor

commented May 8, 2019

⌛️ Trying commit b28d0ae with merge ce805fd...

bors added a commit that referenced this pull request May 8, 2019

Auto merge of #60630 - nnethercote:use-Symbol-more, r=<try>
Use `Symbol` more

A `Symbol` can be equated with a string (e.g. `&str`). This involves a
TLS lookup to get the chars (and a Mutex lock in a parallel compiler)
and then a char-by-char comparison. This functionality is convenient but
avoids one of the main benefits of `Symbol`s, which is fast equality
comparisons.

This PR removes the `Symbol`/string equality operations, forcing a lot
of existing string occurrences to become `Symbol`s. Fortunately, these
are almost all static strings (many are attribute names) and we can add
static `Symbol`s as necessary, and very little extra interning occurs.
The benefits are (a) a slight speedup (possibly greater in a parallel
compiler), and (b) the code is a lot more principled about `Symbol` use.
The main downside is verbosity, particularly with more `use
syntax::symbol::symbols` items.

r? @Zoxc
@bors

This comment has been minimized.

Copy link
Contributor

commented May 8, 2019

☔️ The latest upstream changes (presumably #60246) made this pull request unmergeable. Please resolve the merge conflicts.

@bors

This comment has been minimized.

Copy link
Contributor

commented May 8, 2019

☀️ Try build successful - checks-travis
Build commit: ce805fd

@Zoxc

This comment has been minimized.

Copy link
Contributor

commented May 8, 2019

@rust-timer

This comment has been minimized.

Copy link

commented May 8, 2019

Success: Queued ce805fd with parent b92d360, comparison URL.

@rust-timer

This comment has been minimized.

Copy link

commented May 8, 2019

Finished benchmarking try commit ce805fd

@oli-obk

This comment has been minimized.

Copy link
Contributor

commented May 8, 2019

I think github is being bugged by time travellers today

@oli-obk

This comment has been minimized.

Copy link
Contributor

commented May 8, 2019

One 1.4% perf improvement, many small ones (<1%)

@eddyb

This comment has been minimized.

Copy link
Member

commented May 8, 2019

cc @petrochenkov

(I haven't done a full review but this does sound like something I've been wanting for a long time)

@eddyb

This comment has been minimized.

Copy link
Member

commented May 8, 2019

@nnethercote

This comment has been minimized.

Copy link
Contributor Author

commented May 8, 2019

The symbols:: qualifier is now common enough that changing it to syms:: might be worth considering.

@nnethercote nnethercote force-pushed the nnethercote:use-Symbol-more branch from b28d0ae to 6df17b0 May 8, 2019

}

// Non-identifer symbols that can be referred to with syntax_pos::nisymbols::*.
NISymbols {

This comment has been minimized.

Copy link
@petrochenkov

petrochenkov May 9, 2019

Contributor

"Non-identness" doesnt' really matter, entries in the Symbols { ... } block should probably just allow an optional : "non-ident" literal (with everything generated in one mod symbols).

This comment has been minimized.

Copy link
@nnethercote

nnethercote May 9, 2019

Author Contributor

Ah, I didn't realize that was possible. Good suggestion, I will implement it.

This comment has been minimized.

Copy link
@Zoxc

Zoxc May 10, 2019

Contributor

I'd like us to have 2 namespaces here, so if you use symbols::a you know you're getting a and not some other identifier. It also ensure there won't be any name conflicts if you add a: "b" and you want to add a later.

We could probably rename symbols to idents in that case.

@petrochenkov

This comment has been minimized.

Copy link
Contributor

commented May 9, 2019

So, my primary concern is that we are front-loading interning of about 500 strings and do it in every compilation session even if most of them won't be actually used.

Could you measure how much time the UI test suite (a lot of little programs) takes with and without these changes?
If the change is negligible, then creation of the pre-interned table is cheap enough to be front-loaded safely.

@bors

This comment has been minimized.

Copy link
Contributor

commented May 13, 2019

⌛️ Testing commit ea9fac5 with merge fe5f42c...

bors added a commit that referenced this pull request May 13, 2019

Auto merge of #60630 - nnethercote:use-Symbol-more, r=petrochenkov
Use `Symbol` more

A `Symbol` can be equated with a string (e.g. `&str`). This involves a
TLS lookup to get the chars (and a Mutex lock in a parallel compiler)
and then a char-by-char comparison. This functionality is convenient but
avoids one of the main benefits of `Symbol`s, which is fast equality
comparisons.

This PR removes the `Symbol`/string equality operations, forcing a lot
of existing string occurrences to become `Symbol`s. Fortunately, these
are almost all static strings (many are attribute names) and we can add
static `Symbol`s as necessary, and very little extra interning occurs.
The benefits are (a) a slight speedup (possibly greater in a parallel
compiler), and (b) the code is a lot more principled about `Symbol` use.
The main downside is verbosity, particularly with more `use
syntax::symbol::symbols` items.

r? @Zoxc
@bors

This comment has been minimized.

Copy link
Contributor

commented May 13, 2019

☀️ Test successful - checks-travis, status-appveyor
Approved by: petrochenkov
Pushing fe5f42c to master...

@bors bors added the merged-by-bors label May 13, 2019

@bors bors merged commit ea9fac5 into rust-lang:master May 13, 2019

2 checks passed

Travis CI - Pull Request Build Passed
Details
homu Test successful
Details

@nnethercote nnethercote deleted the nnethercote:use-Symbol-more branch May 13, 2019

@nnethercote nnethercote referenced this pull request May 13, 2019

Merged

fix Miri #60780

bors added a commit that referenced this pull request May 14, 2019

Auto merge of #60740 - petrochenkov:kw, r=nnethercote
Simplify use of keyword symbols

They mirror non-keyword symbols now (see #60630).

`keywords::MyKeyword.name()` -> `kw::MyKeyword`
`keywords::MyKeyword.ident()` -> `Ident::with_empty_ctxt(kw::MyKeyword)` (not common)
`keywords::Invalid.ident()` -> `Ident::invalid()` (more common)

Keywords are simply `Symbol` constants now, the `Keyword` struct is eliminated.
This means `kw::MyKeyword` can now be used in `match` in particular.

bors added a commit that referenced this pull request May 14, 2019

Auto merge of #60740 - petrochenkov:kw, r=nnethercote
Simplify use of keyword symbols

They mirror non-keyword symbols now (see #60630).

`keywords::MyKeyword.name()` -> `kw::MyKeyword`
`keywords::MyKeyword.ident()` -> `Ident::with_empty_ctxt(kw::MyKeyword)` (not common)
`keywords::Invalid.ident()` -> `Ident::invalid()` (more common)

Keywords are simply `Symbol` constants now, the `Keyword` struct is eliminated.
This means `kw::MyKeyword` can now be used in `match` in particular.

bors added a commit that referenced this pull request May 15, 2019

Auto merge of #60740 - petrochenkov:kw, r=nnethercote
Simplify use of keyword symbols

They mirror non-keyword symbols now (see #60630).

`keywords::MyKeyword.name()` -> `kw::MyKeyword`
`keywords::MyKeyword.ident()` -> `Ident::with_empty_ctxt(kw::MyKeyword)` (not common)
`keywords::Invalid.ident()` -> `Ident::invalid()` (more common)

Keywords are simply `Symbol` constants now, the `Keyword` struct is eliminated.
This means `kw::MyKeyword` can now be used in `match` in particular.

bors added a commit that referenced this pull request May 15, 2019

Auto merge of #60740 - petrochenkov:kw, r=nnethercote
Simplify use of keyword symbols

They mirror non-keyword symbols now (see #60630).

`keywords::MyKeyword.name()` -> `kw::MyKeyword`
`keywords::MyKeyword.ident()` -> `Ident::with_empty_ctxt(kw::MyKeyword)` (not common)
`keywords::Invalid.ident()` -> `Ident::invalid()` (more common)

Keywords are simply `Symbol` constants now, the `Keyword` struct is eliminated.
This means `kw::MyKeyword` can now be used in `match` in particular.

bors added a commit that referenced this pull request May 15, 2019

Auto merge of #60740 - petrochenkov:kw, r=nnethercote
Simplify use of keyword symbols

They mirror non-keyword symbols now (see #60630).

`keywords::MyKeyword.name()` -> `kw::MyKeyword`
`keywords::MyKeyword.ident()` -> `Ident::with_empty_ctxt(kw::MyKeyword)` (not common)
`keywords::Invalid.ident()` -> `Ident::invalid()` (more common)

Keywords are simply `Symbol` constants now, the `Keyword` struct is eliminated.
This means `kw::MyKeyword` can now be used in `match` in particular.

nnethercote added a commit to nnethercote/rust that referenced this pull request May 21, 2019

Remove impls for `InternedString`/string equality.
`Symbol` received the same treatment in rust-lang#60630.

Also, we can derive `PartialEq` for `InternedString`.

nnethercote added a commit to nnethercote/rust that referenced this pull request May 21, 2019

Remove impls for `InternedString`/string equality.
`Symbol` received the same treatment in rust-lang#60630.

Also, we can derive `PartialEq` for `InternedString`.

bors added a commit that referenced this pull request May 21, 2019

Auto merge of #61003 - nnethercote:rm-InternedString-PartialEq-impls,…
… r=<try>

Remove impls for `InternedString`/string equality.

`Symbol` received the same treatment in #60630.

Also, we can derive `PartialEq` for `InternedString`.

r? @petrochenkov

bors added a commit that referenced this pull request May 21, 2019

Auto merge of #61003 - nnethercote:rm-InternedString-PartialEq-impls,…
… r=<try>

Remove impls for `InternedString`/string equality.

`Symbol` received the same treatment in #60630.

Also, we can derive `PartialEq` for `InternedString`.

r? @petrochenkov

bors added a commit that referenced this pull request May 21, 2019

Auto merge of #61003 - nnethercote:rm-InternedString-PartialEq-impls,…
… r=<try>

Remove impls for `InternedString`/string equality.

`Symbol` received the same treatment in #60630.

Also, we can derive `PartialEq` for `InternedString`.

r? @petrochenkov

Centril added a commit to Centril/rust that referenced this pull request May 22, 2019

Rollup merge of rust-lang#61003 - nnethercote:rm-InternedString-Parti…
…alEq-impls, r=petrochenkov

Remove impls for `InternedString`/string equality.

`Symbol` received the same treatment in rust-lang#60630.

Also, we can derive `PartialEq` for `InternedString`.

r? @petrochenkov

Centril added a commit to Centril/rust that referenced this pull request May 22, 2019

Rollup merge of rust-lang#61003 - nnethercote:rm-InternedString-Parti…
…alEq-impls, r=petrochenkov

Remove impls for `InternedString`/string equality.

`Symbol` received the same treatment in rust-lang#60630.

Also, we can derive `PartialEq` for `InternedString`.

r? @petrochenkov

bors added a commit that referenced this pull request May 23, 2019

Auto merge of #60740 - petrochenkov:kw, r=nnethercote
Simplify use of keyword symbols

They mirror non-keyword symbols now (see #60630).

`keywords::MyKeyword.name()` -> `kw::MyKeyword`
`keywords::MyKeyword.ident()` -> `Ident::with_empty_ctxt(kw::MyKeyword)` (not common)
`keywords::Invalid.ident()` -> `Ident::invalid()` (more common)

Keywords are simply `Symbol` constants now, the `Keyword` struct is eliminated.
This means `kw::MyKeyword` can now be used in `match` in particular.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.