Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use fast HPACK comparisons when not checking sensitive headers #9259

Merged
merged 1 commit into from Oct 24, 2019

Conversation

@carl-mastrangelo
Copy link
Member

carl-mastrangelo commented Jun 20, 2019

Motivation:
Constant time comparison functions are used to compare HTTP/2 header
values, even if they are not sensitive.

Modification:
After checking for sensitivity, use fast comparison.

Result: Faster HPACK table reads/writes

@netty-bot

This comment has been minimized.

Copy link

netty-bot commented Jun 20, 2019

Can one of the admins verify this patch?

@carl-mastrangelo carl-mastrangelo requested review from Scottmitch and ejona86 Jun 20, 2019
@@ -59,21 +60,16 @@ final int size() {

@Override
public final int hashCode() {
// TODO(nmittler): Netty's build rules require this. Probably need a better implementation.
return super.hashCode();
throw new UnsupportedOperationException();

This comment has been minimized.

Copy link
@normanmaurer

normanmaurer Jun 20, 2019

Member

Isn’t it possible that the user calls this somehow ? Why not just leave it as it is ?

This comment has been minimized.

Copy link
@carl-mastrangelo

carl-mastrangelo Jun 20, 2019

Author Member

I was unsure, but I made it throw because it would be unsafe to use this without knowing it's insensitive.

This comment has been minimized.

Copy link
@normanmaurer

normanmaurer Jun 20, 2019

Member

Hmm still this seems wrong

This comment has been minimized.

Copy link
@carl-mastrangelo

carl-mastrangelo Jun 20, 2019

Author Member

I can remove hashCode and equals. WDYT?

HpackHeaderField other = (HpackHeaderField) obj;
// To avoid short circuit behavior a bitwise operator is used instead of a boolean operator.
return (HpackUtil.equalsConstantTime(name, other.name) & HpackUtil.equalsConstantTime(value, other.value)) != 0;
throw new UnsupportedOperationException();

This comment has been minimized.

Copy link
@normanmaurer

normanmaurer Jun 20, 2019

Member

Same... equals should never throw

@normanmaurer

This comment has been minimized.

Copy link
Member

normanmaurer commented Jun 20, 2019

@carl-mastrangelo do you have a benchmark by any chance?

@normanmaurer

This comment has been minimized.

Copy link
Member

normanmaurer commented Jun 20, 2019

@netty-bot test this please

Copy link
Member

normanmaurer left a comment

Just two questions

@carl-mastrangelo

This comment has been minimized.

Copy link
Member Author

carl-mastrangelo commented Jun 20, 2019

The effect is not so pronounced, but it helps in the larger cases. My guess why insensitive is slower for small headers is that there are multiple lookups involved in the dynamic and static table. It only gets faster when the headers are already in the table.

BEFORE:

Benchmark                     (duplicates)  (limitToAscii)  (sensitive)  (size)  Mode  Cnt       Score       Error  Units
HpackEncoderBenchmark.encode          true            true         true   SMALL  avgt   10    2572.595 ±    16.184  ns/op
HpackEncoderBenchmark.encode          true            true         true  MEDIUM  avgt   10   19580.815 ±   397.780  ns/op
HpackEncoderBenchmark.encode          true            true         true   LARGE  avgt   10  379456.381 ±  2059.919  ns/op
HpackEncoderBenchmark.encode          true            true        false   SMALL  avgt   10     730.579 ±     8.116  ns/op
HpackEncoderBenchmark.encode          true            true        false  MEDIUM  avgt   10    2087.590 ±    84.644  ns/op
HpackEncoderBenchmark.encode          true            true        false   LARGE  avgt   10   11725.228 ±    89.298  ns/op
HpackEncoderBenchmark.encode          true           false         true   SMALL  avgt   10     555.971 ±     5.120  ns/op
HpackEncoderBenchmark.encode          true           false         true  MEDIUM  avgt   10    2831.874 ±    41.801  ns/op
HpackEncoderBenchmark.encode          true           false         true   LARGE  avgt   10   36054.025 ±   179.504  ns/op
HpackEncoderBenchmark.encode          true           false        false   SMALL  avgt   10     340.337 ±     3.313  ns/op
HpackEncoderBenchmark.encode          true           false        false  MEDIUM  avgt   10    1006.817 ±     8.942  ns/op
HpackEncoderBenchmark.encode          true           false        false   LARGE  avgt   10    8784.168 ±   164.014  ns/op
HpackEncoderBenchmark.encode         false            true         true   SMALL  avgt   10    2561.934 ±    27.056  ns/op
HpackEncoderBenchmark.encode         false            true         true  MEDIUM  avgt   10   22061.105 ±   154.533  ns/op
HpackEncoderBenchmark.encode         false            true         true   LARGE  avgt   10  435744.897 ±  8853.388  ns/op
HpackEncoderBenchmark.encode         false            true        false   SMALL  avgt   10    2737.683 ±    47.142  ns/op
HpackEncoderBenchmark.encode         false            true        false  MEDIUM  avgt   10   22385.146 ±    98.430  ns/op
HpackEncoderBenchmark.encode         false            true        false   LARGE  avgt   10  408159.698 ± 12044.931  ns/op
HpackEncoderBenchmark.encode         false           false         true   SMALL  avgt   10     544.213 ±     3.279  ns/op
HpackEncoderBenchmark.encode         false           false         true  MEDIUM  avgt   10    2908.978 ±    31.026  ns/op
HpackEncoderBenchmark.encode         false           false         true   LARGE  avgt   10   36471.262 ±  1044.010  ns/op
HpackEncoderBenchmark.encode         false           false        false   SMALL  avgt   10     609.305 ±     4.371  ns/op
HpackEncoderBenchmark.encode         false           false        false  MEDIUM  avgt   10    3223.946 ±    23.505  ns/op
HpackEncoderBenchmark.encode         false           false        false   LARGE  avgt   10   39975.152 ±   655.196  ns/op

AFTER:

Benchmark                     (duplicates)  (limitToAscii)  (sensitive)  (size)  Mode  Cnt       Score      Error  Units
HpackEncoderBenchmark.encode          true            true         true   SMALL  avgt   10    2580.732 ±   39.935  ns/op
HpackEncoderBenchmark.encode          true            true         true  MEDIUM  avgt   10   19963.569 ±  417.324  ns/op
HpackEncoderBenchmark.encode          true            true         true   LARGE  avgt   10  361053.507 ± 2141.450  ns/op
HpackEncoderBenchmark.encode          true            true        false   SMALL  avgt   10     774.761 ±    3.446  ns/op
HpackEncoderBenchmark.encode          true            true        false  MEDIUM  avgt   10    1601.366 ±   11.409  ns/op
HpackEncoderBenchmark.encode          true            true        false   LARGE  avgt   10    4852.643 ±   43.546  ns/op
HpackEncoderBenchmark.encode          true           false         true   SMALL  avgt   10     551.301 ±    3.420  ns/op
HpackEncoderBenchmark.encode          true           false         true  MEDIUM  avgt   10    2821.285 ±   11.290  ns/op
HpackEncoderBenchmark.encode          true           false         true   LARGE  avgt   10   35519.282 ±  194.778  ns/op
HpackEncoderBenchmark.encode          true           false        false   SMALL  avgt   10     251.308 ±    2.342  ns/op
HpackEncoderBenchmark.encode          true           false        false  MEDIUM  avgt   10     509.551 ±    3.531  ns/op
HpackEncoderBenchmark.encode          true           false        false   LARGE  avgt   10    1648.384 ±   17.924  ns/op
HpackEncoderBenchmark.encode         false            true         true   SMALL  avgt   10    2782.013 ±  151.077  ns/op
HpackEncoderBenchmark.encode         false            true         true  MEDIUM  avgt   10   21969.109 ±   63.417  ns/op
HpackEncoderBenchmark.encode         false            true         true   LARGE  avgt   10  415850.888 ± 1507.624  ns/op
HpackEncoderBenchmark.encode         false            true        false   SMALL  avgt   10    2699.697 ±   34.675  ns/op
HpackEncoderBenchmark.encode         false            true        false  MEDIUM  avgt   10   23074.582 ±  105.401  ns/op
HpackEncoderBenchmark.encode         false            true        false   LARGE  avgt   10  405572.113 ± 2954.830  ns/op
HpackEncoderBenchmark.encode         false           false         true   SMALL  avgt   10     543.590 ±    2.325  ns/op
HpackEncoderBenchmark.encode         false           false         true  MEDIUM  avgt   10    2943.186 ±   12.283  ns/op
HpackEncoderBenchmark.encode         false           false         true   LARGE  avgt   10   36131.091 ±  119.945  ns/op
HpackEncoderBenchmark.encode         false           false        false   SMALL  avgt   10     601.432 ±    5.279  ns/op
HpackEncoderBenchmark.encode         false           false        false  MEDIUM  avgt   10    3225.028 ±   46.043  ns/op
HpackEncoderBenchmark.encode         false           false        false   LARGE  avgt   10   39672.871 ±  585.702  ns/op
@carl-mastrangelo

This comment has been minimized.

Copy link
Member Author

carl-mastrangelo commented Jun 20, 2019

@normanmaurer I am going to send a followup PR. Disabling the huffman coder makes this ridiculously faster, to the point where this PR makes only a minor difference.

@normanmaurer

This comment has been minimized.

Copy link
Member

normanmaurer commented Jun 20, 2019

@carl-mastrangelo so should we close this then ?

@carl-mastrangelo

This comment has been minimized.

Copy link
Member Author

carl-mastrangelo commented Jun 20, 2019

@normanmaurer I think it will still have a minor benefit for large header values, just small headers won't see it. The two changes are independent, but the other is more important.

@normanmaurer

This comment has been minimized.

Copy link
Member

normanmaurer commented Jun 20, 2019

@carl-mastrangelo

This comment has been minimized.

Copy link
Member Author

carl-mastrangelo commented Jun 20, 2019

I removed the two methods. I also applied this change on top of the huffman change. There is still benefit for larger header values I think. Results are a little noisy, but still give insight:

DIFFBASE

Benchmark                     (duplicates)  (limitToAscii)  (sensitive)  (size)  Mode  Cnt     Score     Error  Units
HpackEncoderBenchmark.encode          true            true         true   SMALL  avgt    5   379.473 ± 133.815  ns/op
HpackEncoderBenchmark.encode          true            true         true  MEDIUM  avgt    5  1118.772 ±  89.258  ns/op
HpackEncoderBenchmark.encode          true            true         true   LARGE  avgt    5  5366.828 ±  89.746  ns/op
HpackEncoderBenchmark.encode          true            true        false   SMALL  avgt    5   284.401 ±   2.088  ns/op
HpackEncoderBenchmark.encode          true            true        false  MEDIUM  avgt    5   922.805 ±  10.796  ns/op
HpackEncoderBenchmark.encode          true            true        false   LARGE  avgt    5  8727.831 ± 462.138  ns/op
HpackEncoderBenchmark.encode          true           false         true   SMALL  avgt    5   337.093 ±  22.585  ns/op
HpackEncoderBenchmark.encode          true           false         true  MEDIUM  avgt    5   693.689 ±  16.351  ns/op
HpackEncoderBenchmark.encode          true           false         true   LARGE  avgt    5  5616.786 ±  98.647  ns/op
HpackEncoderBenchmark.encode          true           false        false   SMALL  avgt    5   286.708 ±  13.765  ns/op
HpackEncoderBenchmark.encode          true           false        false  MEDIUM  avgt    5   906.279 ±  32.338  ns/op
HpackEncoderBenchmark.encode          true           false        false   LARGE  avgt    5  8304.736 ± 128.584  ns/op
HpackEncoderBenchmark.encode         false            true         true   SMALL  avgt    5   351.381 ±  15.547  ns/op
HpackEncoderBenchmark.encode         false            true         true  MEDIUM  avgt    5  1188.166 ±   7.023  ns/op
HpackEncoderBenchmark.encode         false            true         true   LARGE  avgt    5  6876.009 ±  48.117  ns/op
HpackEncoderBenchmark.encode         false            true        false   SMALL  avgt    5   434.759 ±   8.619  ns/op
HpackEncoderBenchmark.encode         false            true        false  MEDIUM  avgt    5   954.588 ±  58.514  ns/op
HpackEncoderBenchmark.encode         false            true        false   LARGE  avgt    5  8534.017 ± 552.597  ns/op
HpackEncoderBenchmark.encode         false           false         true   SMALL  avgt    5   223.713 ±   4.823  ns/op
HpackEncoderBenchmark.encode         false           false         true  MEDIUM  avgt    5  1181.538 ±  11.851  ns/op
HpackEncoderBenchmark.encode         false           false         true   LARGE  avgt    5  6670.830 ± 267.927  ns/op
HpackEncoderBenchmark.encode         false           false        false   SMALL  avgt    5   424.609 ±  27.477  ns/op
HpackEncoderBenchmark.encode         false           false        false  MEDIUM  avgt    5  1003.578 ±  53.991  ns/op
HpackEncoderBenchmark.encode         false           false        false   LARGE  avgt    5  8428.932 ± 102.838  ns/op

AFTER

Benchmark                     (duplicates)  (limitToAscii)  (sensitive)  (size)  Mode  Cnt     Score     Error  Units
HpackEncoderBenchmark.encode          true            true         true   SMALL  avgt    5   359.901 ±  12.893  ns/op
HpackEncoderBenchmark.encode          true            true         true  MEDIUM  avgt    5  1070.761 ±  11.828  ns/op
HpackEncoderBenchmark.encode          true            true         true   LARGE  avgt    5  5340.499 ±  62.160  ns/op
HpackEncoderBenchmark.encode          true            true        false   SMALL  avgt    5   214.511 ±  11.789  ns/op
HpackEncoderBenchmark.encode          true            true        false  MEDIUM  avgt    5   424.725 ±  39.835  ns/op
HpackEncoderBenchmark.encode          true            true        false   LARGE  avgt    5  1391.647 ±  11.890  ns/op
HpackEncoderBenchmark.encode          true           false         true   SMALL  avgt    5   344.233 ±  32.231  ns/op
HpackEncoderBenchmark.encode          true           false         true  MEDIUM  avgt    5  1065.387 ±  22.257  ns/op
HpackEncoderBenchmark.encode          true           false         true   LARGE  avgt    5  5294.608 ± 197.011  ns/op
HpackEncoderBenchmark.encode          true           false        false   SMALL  avgt    5   211.384 ±  12.504  ns/op
HpackEncoderBenchmark.encode          true           false        false  MEDIUM  avgt    5   415.328 ±   3.515  ns/op
HpackEncoderBenchmark.encode          true           false        false   LARGE  avgt    5  1403.609 ±   3.463  ns/op
HpackEncoderBenchmark.encode         false            true         true   SMALL  avgt    5   332.837 ±   5.381  ns/op
HpackEncoderBenchmark.encode         false            true         true  MEDIUM  avgt    5  1160.590 ±  23.529  ns/op
HpackEncoderBenchmark.encode         false            true         true   LARGE  avgt    5  6855.142 ± 676.847  ns/op
HpackEncoderBenchmark.encode         false            true        false   SMALL  avgt    5   409.124 ±  38.120  ns/op
HpackEncoderBenchmark.encode         false            true        false  MEDIUM  avgt    5  1479.240 ±  24.127  ns/op
HpackEncoderBenchmark.encode         false            true        false   LARGE  avgt    5  7261.508 ± 106.516  ns/op
HpackEncoderBenchmark.encode         false           false         true   SMALL  avgt    5   344.280 ±   6.342  ns/op
HpackEncoderBenchmark.encode         false           false         true  MEDIUM  avgt    5  1174.367 ±  41.517  ns/op
HpackEncoderBenchmark.encode         false           false         true   LARGE  avgt    5  6193.716 ± 452.406  ns/op
HpackEncoderBenchmark.encode         false           false        false   SMALL  avgt    5   294.076 ±  13.072  ns/op
HpackEncoderBenchmark.encode         false           false        false  MEDIUM  avgt    5  1508.438 ±  21.544  ns/op
HpackEncoderBenchmark.encode         false           false        false   LARGE  avgt    5  8798.796 ± 126.874  ns/op
@normanmaurer

This comment has been minimized.

Copy link
Member

normanmaurer commented Jun 20, 2019

@netty-bot test this please

Copy link
Member

ejona86 left a comment

Can name lookups always use the constant equals and value lookups always use variable equals? Value lookups are where we'd expect the most gain from the variable equals.

@@ -145,7 +146,7 @@ static int getIndex(CharSequence name) {
* Returns the index value for the given header field in the static table. Returns -1 if the
* header field is not in the static table.
*/
static int getIndex(CharSequence name, CharSequence value) {
static int getIndexSensitive(CharSequence name, CharSequence value) {

This comment has been minimized.

Copy link
@ejona86

ejona86 Jun 20, 2019

Member

I don't see any usage of this. And I wouldn't expect any usage since sensitive header values won't use the HPACK table.

This comment has been minimized.

Copy link
@carl-mastrangelo

carl-mastrangelo Jun 20, 2019

Author Member

Yup, no use, removed.

* Returns the lowest index value for the header field name in the dynamic table. Returns -1 if
* the header field name is not in the dynamic table.
*/
private int getIndexInsensitive(CharSequence name) {

This comment has been minimized.

Copy link
@ejona86

ejona86 Jun 20, 2019

Member

How much does this buy us? I'd expect not much. Since we only pay the equals check if there is a hash collision, and these are header names which are short, it seems like there's not room for gain here.

This comment has been minimized.

Copy link
@carl-mastrangelo

carl-mastrangelo Jun 20, 2019

Author Member

Maybe, but it's about code clarity. This method says it doesn't care about security, as opposed to getIndexSensitive.

Also in the case the header names change frequently, I'd expect there would be frequent collisions and (evictions).

This comment has been minimized.

Copy link
@normanmaurer

This comment has been minimized.

Copy link
@ejona86

ejona86 Jun 24, 2019

Member

To me the sensitive/nonsensitive split just is a bit more confusing. I'd have preferred dumping a comment on the previous getIndex: // use constant-time comparison since used with sensitive headers. And for the value look ups // not used for sensitive headers, so no need to use constant-time comparison.

If we really like the split names, then I'd just have getIndexInsensitive() call getIndexSensitive() in the implementation.

Header name changes doesn't have anything to do with frequency of collisions. Only the number of entries plays a part in collision rate. I still maintain that for header names the difference between constant- vs variable-time comparison is well below what we should care about. I'd much rather optimize for code clarity and size here by not duplicating the getIndex implementation.

@normanmaurer

This comment has been minimized.

Copy link
Member

normanmaurer commented Jun 24, 2019

@netty-bot test this please

@carl-mastrangelo

This comment has been minimized.

Copy link
Member Author

carl-mastrangelo commented Jul 1, 2019

@normanmaurer I would like to get this in, if there aren't any other concerns. There are always going to be differences of opinion, so we need to decide on what is a blocker and what isn't. I feel that since this code is faster, correct, and doesn't pose a significant maintainability burden, it should be committed.

@normanmaurer

This comment has been minimized.

Copy link
Member

normanmaurer commented Jul 1, 2019

@carl-mastrangelo I am not strong on this... If @ejona86 is not "strongly" against it I think we can merge... @ejona86 WDYT ?

@ejona86

This comment has been minimized.

Copy link
Member

ejona86 commented Jul 1, 2019

Why are those both getNameIndexSensitive and getNameIndexInsensitive? It is in the name of performance, right? But there has not been any performance benefit shown from that change. Benefits are for the value change, not the name change.

So the name change should be dropped. The name change adds code for no benefit.

I don't care if it is renamed or the like. Avoiding the second implementation is all I'm concerned about.

@normanmaurer

This comment has been minimized.

Copy link
Member

normanmaurer commented Aug 16, 2019

@carl-mastrangelo

This comment has been minimized.

Copy link
Member Author

carl-mastrangelo commented Oct 17, 2019

@carl-mastrangelo carl-mastrangelo force-pushed the carl-mastrangelo:insensitive branch from 23c6924 to 021679b Oct 17, 2019
@carl-mastrangelo

This comment has been minimized.

Copy link
Member Author

carl-mastrangelo commented Oct 17, 2019

@normanmaurer should be ready again

@CodingFabian

This comment has been minimized.

Copy link
Contributor

CodingFabian commented Oct 17, 2019

I can confirm a positive impact. While on a micro level it is very little, over time it does sum up, and seems to me as a sensible optimization.

@carl-mastrangelo carl-mastrangelo force-pushed the carl-mastrangelo:insensitive branch from 021679b to 1fff2fb Oct 17, 2019
@normanmaurer normanmaurer requested a review from bryce-anderson Oct 21, 2019
@normanmaurer

This comment has been minimized.

Copy link
Member

normanmaurer commented Oct 21, 2019

@bryce-anderson can you take a look as well ?

@normanmaurer normanmaurer added this to the 4.1.43.Final milestone Oct 22, 2019
@normanmaurer

This comment has been minimized.

Copy link
Member

normanmaurer commented Oct 23, 2019

@CodingFabian thanks for confirming :)

@normanmaurer

This comment has been minimized.

Copy link
Member

normanmaurer commented Oct 23, 2019

@netty-bot test this please

Copy link
Contributor

bryce-anderson left a comment

Perhaps a comment to justify why we are now using *Insensitive methods would be helpful for future security-oriented readers.

@carl-mastrangelo carl-mastrangelo force-pushed the carl-mastrangelo:insensitive branch from 955def7 to 87fc23a Oct 23, 2019
Motivation:
Constant time comparison functions are used to compare HTTP/2 header
values, even if they are not sensitive.

Modification:
After checking for sensitivity, use fast comparison.

Result: Faster HPACK table reads/writes
@carl-mastrangelo carl-mastrangelo force-pushed the carl-mastrangelo:insensitive branch from 87fc23a to 87a3785 Oct 23, 2019
@carl-mastrangelo

This comment has been minimized.

Copy link
Member Author

carl-mastrangelo commented Oct 23, 2019

@bryce-anderson I added a class comment about it.

@normanmaurer any chance this can make it into 4.1.43 ?

@normanmaurer

This comment has been minimized.

Copy link
Member

normanmaurer commented Oct 23, 2019

@netty-bot test this please

@normanmaurer normanmaurer merged commit e8e7a20 into netty:4.1 Oct 24, 2019
3 checks passed
3 checks passed
pull request validation (centos6-java11) Build finished.
Details
pull request validation (centos6-java12) Build finished.
Details
pull request validation (centos6-java8) Build finished.
Details
@normanmaurer

This comment has been minimized.

Copy link
Member

normanmaurer commented Oct 24, 2019

@carl-mastrangelo merged... thanks :)

normanmaurer added a commit that referenced this pull request Oct 24, 2019
Motivation:
Constant time comparison functions are used to compare HTTP/2 header
values, even if they are not sensitive.

Modification:
After checking for sensitivity, use fast comparison.

Result: Faster HPACK table reads/writes
@carl-mastrangelo carl-mastrangelo deleted the carl-mastrangelo:insensitive branch Oct 24, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
6 participants
You can’t perform that action at this time.