Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Reuse phf hash and remove phf::OrderedSet indirection #103

Merged
merged 3 commits into from Sep 1, 2015
Merged

Conversation

@SimonSapin
Copy link
Member

SimonSapin commented Aug 3, 2015

Do not merge yet. This depends on sfackler/rust-phf#62

Use the phf_shared and phf_generator crates directly instead of phf. This allows us to re-use the phf hash in the dynamic table and avoid hashing the same string again.

Also remove the indirection of phf::OrderedSet compared to phf::Set: we don’t care about the order, only about getting numeric indices. (The optimization mentioned in a comment of using a bit map of the first 64 atoms in the html5ever tree builder was never implemented. If we want it, the indirection and order preservation can be added back while preserving the hash reuse.)

Fixes #38.

Review on Reviewable

SimonSapin added 2 commits Aug 3, 2015
This allows us to bypass the indirection of `phf::OrderedSet`:
we don’t care about the order, only about getting numeric indices.

This will also allow us to re-use the phf hash for the dynamic table,
to avoid hashing strings twice.
SimonSapin added a commit to SimonSapin/rust-phf that referenced this pull request Aug 3, 2015
In https://github.com/servo/string-cache, we currently use a `phf::OrderedSet`
with its `get_index` method to get an identified stored in an `Atom`,
and `index` to get a string back from that identifier.

However, the extra inderection of `OrderedSet` of `Set` is not necessary.
We don’t care about the order, only about getting numeric identifiers.

Additionally, when `get_index` returns `None`,
we hash the input string again to find it in table of dynamic atoms.
With this chang, we can reuse the phf hash instead:

servo/string-cache#103

At first I tried adding hash and index access to `phf::Map`,
but the API got messy quickly.
@SimonSapin
Copy link
Member Author

SimonSapin commented Aug 3, 2015

@pcwalton, would this help servo/servo#6906 ?

@pcwalton
Copy link
Contributor

pcwalton commented Aug 3, 2015

I suspect it will, yes.

@SimonSapin
Copy link
Member Author

SimonSapin commented Aug 4, 2015

r? @glennw

@glennw
Copy link
Member

glennw commented Sep 1, 2015

@bors-servo
Copy link
Contributor

bors-servo commented Sep 1, 2015

📌 Commit 573161f has been approved by glennw

@bors-servo
Copy link
Contributor

bors-servo commented Sep 1, 2015

Testing commit 573161f with merge dc46510...

bors-servo pushed a commit that referenced this pull request Sep 1, 2015
bors-servo
Reuse phf hash and remove phf::OrderedSet indirection

<s>Do not merge yet.</s> This depends on sfackler/rust-phf#62

Use the `phf_shared` and `phf_generator` crates directly instead of `phf`. This allows us to re-use the phf hash in the dynamic table and avoid hashing the same string again.

Also remove the indirection of `phf::OrderedSet` compared to `phf::Set`: we don’t care about the order, only about getting numeric indices. (The optimization mentioned in a comment of using a bit map of the first 64 atoms in the html5ever tree builder was never implemented. If we want it, the indirection and order preservation can be added back while preserving the hash reuse.)

Fixes #38.

<!-- Reviewable:start -->
[<img src="https://reviewable.io/review_button.png" height=40 alt="Review on Reviewable"/>](https://reviewable.io/reviews/servo/string-cache/103)
<!-- Reviewable:end -->
@bors-servo
Copy link
Contributor

bors-servo commented Sep 1, 2015

☀️ Test successful - travis

@bors-servo bors-servo merged commit 573161f into master Sep 1, 2015
3 checks passed
3 checks passed
continuous-integration/travis-ci/pr The Travis CI build passed
Details
continuous-integration/travis-ci/push The Travis CI build passed
Details
homu Test successful
Details
@tetsuharuohzeki tetsuharuohzeki deleted the raw-phf branch Sep 3, 2015
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Linked issues

Successfully merging this pull request may close these issues.

None yet

4 participants
You can’t perform that action at this time.