New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

perf(schema): use FnvHashMap for type and directive names #93

Merged
merged 2 commits into from Oct 8, 2017

Conversation

Projects
None yet
3 participants
@srijs
Contributor

srijs commented Oct 2, 2017

Changes SchemaType to use a HashMap backed by the [https://github.com/servo/rust-fnv](Fowler–Noll–Vo hash function). This hash function performs much better for smaller inputs (think short strings like most GraphQL type names) than the default hash function, SipHash. The downside is that it does not have the same cryptographic properties, which open it up to DoS vulnerabilities when the input can be manipulated from the outside.

In this change set, I have chosen to apply FNV only to the type and directive names tracked by the SchemaType, for two reasons:

  1. Look-ups in SchemaType::type_by_name and co. have been showing up as the most frequent/expensive use of hashing for my workload.
  2. I believe it is reasonable to assume that a malicious party would not have control over the schema type, therefore could not leverage FNV for a DoS attack (contrary to e.g. using it in the validation logic).

Here you can see how SchemaType::type_by_name, ::concrete_type_by_name as well as both ::make_type and ValidatorContext::with_pushed_type (these use type_by_name internally) show up in my profile:

schematype-hash-perf

In my (somewhat limited) benchmark, the change appears to lead to a worthwhile performance improvement.

Pre:

Running 30s test @ http://localhost:3000/graphql
  2 threads and 4 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency   572.58us  336.26us  14.93ms   97.35%
    Req/Sec     3.54k   284.33     3.93k    77.08%
  Latency Distribution
     50%  526.00us
     75%  573.00us
     90%  663.00us
     99%    1.15ms
  211827 requests in 30.10s, 26.46MB read
Requests/sec:   7037.39
Transfer/sec:      0.88MB

Post:

Running 30s test @ http://localhost:3000/graphql
  2 threads and 4 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency   550.45us  213.25us   8.73ms   97.53%
    Req/Sec     3.64k   169.32     3.93k    79.87%
  Latency Distribution
     50%  520.00us
     75%  554.00us
     90%  616.00us
     99%    0.96ms
  217928 requests in 30.10s, 27.23MB read
Requests/sec:   7240.40
Transfer/sec:      0.90MB
@Techcable

This comment has been minimized.

Show comment
Hide comment
@Techcable

Techcable Oct 7, 2017

Why not seahash? Isn't that signifigantly faster?

Techcable commented Oct 7, 2017

Why not seahash? Isn't that signifigantly faster?

@srijs

This comment has been minimized.

Show comment
Hide comment
@srijs

srijs Oct 8, 2017

Contributor

SeaHash might have a higher throughput, but FNV is pretty much on par or better on short strings, which is the typical workload for Juniper.

Juniper Benchmarks

FnvHasher

running 2 tests
test introspection_query ... bench:   1,744,556 ns/iter (+/- 422,545)
test query_type_name     ... bench:      14,961 ns/iter (+/- 3,658)

test result: ok. 0 passed; 0 failed; 0 ignored; 2 measured

SeaHasher

running 2 tests
test introspection_query ... bench:   1,774,394 ns/iter (+/- 458,161)
test query_type_name     ... bench:      15,497 ns/iter (+/- 12,935)

test result: ok. 0 passed; 0 failed; 0 ignored; 2 measured

DefaultHasher

running 2 tests
test introspection_query ... bench:   1,866,340 ns/iter (+/- 389,326)
test query_type_name     ... bench:      14,831 ns/iter (+/- 7,612)

test result: ok. 0 passed; 0 failed; 0 ignored; 2 measured
Contributor

srijs commented Oct 8, 2017

SeaHash might have a higher throughput, but FNV is pretty much on par or better on short strings, which is the typical workload for Juniper.

Juniper Benchmarks

FnvHasher

running 2 tests
test introspection_query ... bench:   1,744,556 ns/iter (+/- 422,545)
test query_type_name     ... bench:      14,961 ns/iter (+/- 3,658)

test result: ok. 0 passed; 0 failed; 0 ignored; 2 measured

SeaHasher

running 2 tests
test introspection_query ... bench:   1,774,394 ns/iter (+/- 458,161)
test query_type_name     ... bench:      15,497 ns/iter (+/- 12,935)

test result: ok. 0 passed; 0 failed; 0 ignored; 2 measured

DefaultHasher

running 2 tests
test introspection_query ... bench:   1,866,340 ns/iter (+/- 389,326)
test query_type_name     ... bench:      14,831 ns/iter (+/- 7,612)

test result: ok. 0 passed; 0 failed; 0 ignored; 2 measured
@srijs

This comment has been minimized.

Show comment
Hide comment
@srijs

srijs Oct 8, 2017

Contributor

@theduke @mhallin Friendly ping on this one, would appreciate a review!

Contributor

srijs commented Oct 8, 2017

@theduke @mhallin Friendly ping on this one, would appreciate a review!

@mhallin

This comment has been minimized.

Show comment
Hide comment
@mhallin

mhallin Oct 8, 2017

Collaborator

I think this looks good, thanks for the improvement! I think your assumptions about hash-collisions attacks are reasonable.

Without having done any deeper performance analysis, I'd think that Juniper's biggest overhead on larger queries is running the validation passes. Having a MFU/MRU cache of parsed and validated ASTs could probably help there, or implementing Apollo's persisted queries.

Collaborator

mhallin commented Oct 8, 2017

I think this looks good, thanks for the improvement! I think your assumptions about hash-collisions attacks are reasonable.

Without having done any deeper performance analysis, I'd think that Juniper's biggest overhead on larger queries is running the validation passes. Having a MFU/MRU cache of parsed and validated ASTs could probably help there, or implementing Apollo's persisted queries.

@mhallin mhallin merged commit 88f0fdc into graphql-rust:master Oct 8, 2017

2 checks passed

continuous-integration/appveyor/pr AppVeyor build succeeded
Details
continuous-integration/travis-ci/pr The Travis CI build passed
Details

@srijs srijs deleted the srijs:perf/schema-fnv-hash-map branch Oct 8, 2017

@srijs

This comment has been minimized.

Show comment
Hide comment
@srijs

srijs Oct 8, 2017

Contributor

@mhallin agreed. Coincidentally, my work on removing the lifetime parameter from the AST for purposes of introducing async could prove useful for introducing a query cache as well!

Generally though it'd be nice if Juniper was fast for all queries, regardless of how frequently they are executed :)

Contributor

srijs commented Oct 8, 2017

@mhallin agreed. Coincidentally, my work on removing the lifetime parameter from the AST for purposes of introducing async could prove useful for introducing a query cache as well!

Generally though it'd be nice if Juniper was fast for all queries, regardless of how frequently they are executed :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment