"Compiled" static header maps instead of big trie #33932

Merged
merged 33 commits into from
May 23, 2024
Changes from 6 commits
Commits
9b0d62c
Add static lookup benchmarks for header_map_impl
ravenblackx May 1, 2024
2e83788
compiled_string_map v1
ravenblackx May 2, 2024
be5dd61
optimize
ravenblackx May 2, 2024
611e1dd
Format
ravenblackx May 2, 2024
150040e
Document
ravenblackx May 2, 2024
90ba622
Format
ravenblackx May 2, 2024
e055146
Header
ravenblackx May 2, 2024
dd3a9fa
Comment and fix issue.
ravenblackx May 2, 2024
1990767
More comment
ravenblackx May 2, 2024
5df1270
Fix and simplify
ravenblackx May 3, 2024
92625f3
Merge branch 'headers' into headers_compiled
ravenblackx May 3, 2024
23042de
Split finalize and compile to support legacy host header injection
ravenblackx May 3, 2024
0808002
Merge branch 'main' into headers_compiled
ravenblackx May 7, 2024
b293819
Empty commit to retest
ravenblackx May 10, 2024
9eccaa9
Reduce string copies while populating lookup table
ravenblackx May 10, 2024
a3f8919
Rearrange node creation to be clearer that it's effectively a single …
ravenblackx May 10, 2024
d1a8ed8
More comments
ravenblackx May 10, 2024
7ece7eb
Explicit move
ravenblackx May 10, 2024
57b9055
Comment explaining string-view by reference
ravenblackx May 10, 2024
facdef8
Spelling
ravenblackx May 10, 2024
941d3fa
len
ravenblackx May 13, 2024
d75190d
key_size to avoid calling size() repeatedly (no difference)
ravenblackx May 13, 2024
30bab4b
Virtual class instead of lambdas
ravenblackx May 13, 2024
5469839
PURE
ravenblackx May 13, 2024
dc91907
Cut out a size comparison
ravenblackx May 13, 2024
4348681
move
ravenblackx May 14, 2024
9bd6190
Trailing underscores, some readability rearranging
ravenblackx May 14, 2024
9ee31c0
Comment on KV ownership.
ravenblackx May 14, 2024
405ac4d
data() not &[0]
ravenblackx May 14, 2024
a743ba2
Comments
ravenblackx May 16, 2024
c7458f4
Spelling
ravenblackx May 16, 2024
c4858ec
Fix comment
ravenblackx May 16, 2024
56c5e4e
Add compiled_string_map.md
ravenblackx May 17, 2024
5 changes: 5 additions & 0 deletions source/common/common/BUILD
@@ -473,6 +473,11 @@ envoy_cc_library(
],
)

envoy_cc_library(
name = "compiled_string_map_lib",
hdrs = ["compiled_string_map.h"],
)

envoy_cc_library(
name = "packed_struct_lib",
hdrs = ["packed_struct.h"],
157 changes: 157 additions & 0 deletions source/common/common/compiled_string_map.h
@@ -0,0 +1,157 @@
#pragma once

#include <algorithm>
#include <string>
#include <vector>

#include "absl/strings/string_view.h"

namespace Envoy {

/**
* This is a specialized structure intended for static header maps, but
* there may be other use cases.
* The structure is:
* 1. a length-based lookup table so only keys the same length as the
* target key are considered.
* 2. a trie that branches on the "most divisions" position of the key.
*
* For example, if we consider the case where the set of headers is
* `x-prefix-banana`
* `x-prefix-babana`
* `x-prefix-apple`
* `x-prefix-pineapple`
* `x-prefix-barana`
* `x-prefix-banaka`
*
* A standard front-first trie looking for `x-prefix-banana` would walk
* 7 nodes through the tree, first for `x`, then for `-`, etc.
*
* This structure first jumps to matching length, eliminating in this
* example case apple and pineapple.
* Then the "best split" node is on
* `x-prefix-banana`
* ^
* so the first node has 3 non-miss branches, n, b and r for that position.
* Down that n branch, the "best split" is on
* `x-prefix-banana`
* ^
* which has two branches, n or k.
Contributor:
IIUC the "best split" can be anywhere in the string. I think changing the example to the following will clarify how this should work.

 * `x-prefix-banana`
 * `x-prefix-banara`
 * `x-prefix-apple`
 * `x-prefix-pineapple`
 * `x-prefix-barana`
 * `x-prefix-banaka`

then the best split would've been in the character before-last.

Contributor Author:
It can't be anywhere in the string, only at an index where there's a difference. The example is specifically showing that 3 branches would be "best" over 2 branches. I don't understand what your example clarifies that the existing example doesn't.

Contributor:
My comment was looking at it from a "trie" perspective.
When I first read the comment, I thought that in each branch, the structure only keeps the "suffix" (after the branch point). I think that updating the example to a case where the branching happens due to a "best-split" that is not the first split will reduce that confusion for future readers.

Contributor Author:
Hopefully the md file explanation covers this sufficiently (though it doesn't do the second split earlier in the string).

If you'd like me to rearrange the strings in the new example so that it does the "last" index split first and the earlier one second I can do that, but I think the diagram having numbers in it probably makes it clear enough. (And late first would be just as potentially confusing as early first, and having three sequential splits would be getting so long as to be harder to read...)
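
For readers following this thread, here is a minimal sketch of the "most divisions" choice being discussed. It is not the PR's code: `bestSplitIndex` is a hypothetical helper, it uses `std::string` rather than the `KV` pairs in the diff, and it assumes all keys have already been filtered to the same length. Ties go to the earliest index, which mirrors the strict `>` comparison in the diff.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper (not part of the PR): returns the character index at
// which the given equal-length keys take the largest number of distinct
// values. Ties are resolved in favour of the earliest index.
size_t bestSplitIndex(const std::vector<std::string>& keys) {
  assert(!keys.empty());
  size_t best_index = 0;
  size_t best_count = 0;
  for (size_t i = 0; i < keys[0].size(); i++) {
    std::array<bool, 256> seen{};
    size_t count = 0;
    for (const std::string& key : keys) {
      assert(key.size() == keys[0].size());
      const unsigned char c = static_cast<unsigned char>(key[i]);
      if (!seen[c]) {
        seen[c] = true;
        count++;
      }
    }
    if (count > best_count) {
      best_count = count;
      best_index = i;
    }
  }
  return best_index;
}
```

With the four 15-character keys from the header comment (`banana`, `babana`, `barana`, `banaka` after the `x-prefix-` prefix) this picks index 11, where the keys differ three ways (n, b, r). With the alternative list proposed above (`banara` in place of `babana`) it picks index 13, the second-to-last character, so the first split does not have to be the leftmost differing position.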

* Down the n branch is the leaf node (only `x-prefix-banana` remains) - at
* this point a regular string-compare checks if the key is an exact match
* for the string node.
*/
template <class Value> class CompiledStringMap {
using FindFn = std::function<Value(const absl::string_view&)>;
Contributor:
standard best practice is to pass string_view as value rather than as a ref. I think it's really close but this was the decision made by smart people some time ago :)

Contributor Author:
Added a comment about references being faster for this use-case.


public:
using KV = std::pair<std::string, Value>;
Contributor:
Scanning through the code I'm wondering if there's some string copying going on here that we might be able to avoid.

Can we have a string-store held in the class, and have the KV that we manipulate in various sub-vectors have either a string_view or a string&?

Contributor Author:
I made the input a vector of string_views which eliminates the string copying during compile() except the copy taken for a leaf node (which is kept in case at some point the table wants to be initialized from a transient source rather than global singleton-constant-likes).

/**
* Returns the value with a matching key, or the default value
* (typically nullptr) if the key was not present.
* @param key the key to look up.
*/
Value find(const absl::string_view& key) const {
if (key.size() >= table_.size() || table_[key.size()] == nullptr) {
Contributor:
can we avoid the duplicate lookup of table_[key.size()]? It's probably optimized away but I am never certain.

I think for the compiler to optimize it away it would have to see inside table_.size() to ensure it could not mutate key. That seems obvious to humans but idk how smart compilers are with that. I think if you can't see inside table_.size() impl (ie not inlined) then you can't know that.

Contributor Author:
I tried taking it to a local variable, but that has to be after key.size() being too large has already been checked, which makes it two if-branches, which from benchmarking appears to make both the hit paths ~5% slower.

It's not like these are actual table lookups, they're just offsets, so even if it wasn't optimized away it might still be faster than taking an additional copy.

Contributor:
why does it have to be after the first check? Can't you have the first line of the function be the assignment to a temp?

Contributor Author:
table_[key.size()] is an out-of-bounds crash if key.size() >= table_.size().

Contributor:
const size_t key_size = key.size();
if (key_size >= table_.size() || table_[key_size] == nullptr) {
...
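
Written out as a self-contained sketch, the suggestion above might look like the following. The names are hypothetical and the types are simplified (`int` values with 0 as the default, `std::string_view` instead of `absl::string_view`); the point is only that the key length is read once and that short-circuit evaluation keeps `table[key_size]` from being touched when the length is out of range.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <string_view>
#include <utility>
#include <vector>

// Hypothetical stand-in for the length-indexed table in the diff:
// the element at index n handles all keys of length n.
using FindFn = std::function<int(std::string_view)>;

int findByLength(const std::vector<FindFn>& table, std::string_view key) {
  const size_t key_size = key.size();
  // `||` short-circuits, so table[key_size] is only read after the bounds
  // check has passed.
  if (key_size >= table.size() || table[key_size] == nullptr) {
    return 0; // default value, standing in for `return {}` in the diff
  }
  return table[key_size](key);
}

// Builds a table holding a single exact-match entry, for demonstration only.
std::vector<FindFn> makeSingleEntryTable(std::string key, int value) {
  assert(value != 0); // 0 is reserved for "not found" in this sketch
  const size_t length = key.size(); // read before `key` is moved from
  std::vector<FindFn> table(length + 1);
  table[length] = [key = std::move(key), value](std::string_view candidate) {
    return candidate == key ? value : 0;
  };
  return table;
}
```

Whether this form benchmarks any differently from the original is not something the sketch can show; the author's measurements earlier in the thread are the only data on that.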

return {};
}
return table_[key.size()](key);
};
/**
* Construct the lookup table. This is a somewhat slow multi-pass
* operation - using this structure is not recommended unless the
Contributor:
should this note be in the class comment?

For a moment I thought you were saying that you could use this class, but not call 'compile' and instead fall back to an interpret-mode, but I don't think that's what you are saying :)

Contributor Author:
Moved.

* table is initialize-once, use-many.
* @param initial a vector of key->value pairs.
*/
void compile(std::vector<KV> initial) {
Contributor:
Why is the parameter called initial? Would contents be a better description?

if (initial.empty()) {
return;
}
size_t longest = 0;
for (const KV& pair : initial) {
longest = std::max(pair.first.size(), longest);
}
table_.resize(longest + 1);
std::sort(initial.begin(), initial.end(),
[](const KV& a, const KV& b) { return a.first.size() < b.first.size(); });
auto it = initial.begin();
for (size_t i = 0; i <= longest; i++) {
auto start = it;
while (it != initial.end() && it->first.size() == i) {
Contributor:
I am guessing we don't consider compiling to be important to be fast. But this nested loop seems like it might be slower than needed.

How about this for an algo:

absl::flat_hash_map<size_t, std::vector<std::reference_wrapper<const KV>>> size_to_kvs_map;
for (const KV& pair : initial) {
  size_to_kvs_map[pair.first.size()].push_back(std::cref(pair));
}

I realize this creates a data structure which doesn't match the ones you are using, but it seems like a good way to get the elements organized by size. After that I'm not sure how this would fit into your algo.

I'm having a little trouble following the algorithm also. What causes it to re-examine the entries that don't match size 'i'?

Maybe some more comment would help, or breaking down your compile flow into a few helper methods that might be more self-documenting.

Contributor Author:
It was already not really a nested loop, in that the two indexes were moving "in parallel" (i.e. it wasn't n*m, it was n+m), but I've changed it to make that a bit clearer. Though still not much clearer since the std::find_if still looks like it might be an m-length lookup. But it's not!
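
To illustrate the n+m point, here is a small sketch of the same grouping pattern in isolation. `groupByLength` is a hypothetical helper working on plain strings, not the PR's `compile()`; it only shows that the inner iterator never rewinds.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper (not part of the PR): buckets[n] receives every key of
// length n. The keys are sorted by length first, then consumed in one pass.
std::vector<std::vector<std::string>> groupByLength(std::vector<std::string> keys) {
  std::vector<std::vector<std::string>> buckets;
  if (keys.empty()) {
    return buckets;
  }
  std::sort(keys.begin(), keys.end(),
            [](const std::string& a, const std::string& b) { return a.size() < b.size(); });
  const size_t longest = keys.back().size();
  buckets.resize(longest + 1);
  auto it = keys.begin();
  for (size_t len = 0; len <= longest; len++) {
    // `it` only moves forward, so across the whole outer loop the inner loop
    // body runs keys.size() times in total, not keys.size() times per length.
    while (it != keys.end() && it->size() == len) {
      buckets[len].push_back(std::move(*it));
      ++it;
    }
  }
  assert(it == keys.end()); // every key landed in exactly one bucket
  return buckets;
}
```

The total work after the sort is (longest + 1) outer steps plus one inner step per key, which is the "two indexes moving in parallel" behaviour described above.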

it++;
}
if (it != start) {
std::vector<KV> node_contents;
node_contents.reserve(it - start);
std::copy(start, it, std::back_inserter(node_contents));
table_[i] = createEqualLengthNode(node_contents);
}
}
}

private:
static FindFn createEqualLengthNode(std::vector<KV> node_contents) {
if (node_contents.size() == 1) {
return [pair = node_contents[0]](const absl::string_view& key) -> Value {
if (key != pair.first) {
return {};
}
return pair.second;
};
}
struct IndexSplitInfo {
uint8_t index, min, max, count;
} best{0, 0, 0, 0};
for (size_t i = 0; i < node_contents[0].first.size(); i++) {
Contributor:
would this be faster?

for (size_t i = 0, n = node_contents[0].first.size(); i < n; i++) {

Contributor Author:
I don't think it would (it's just reading a value from an address), and we don't care about the microoptimization performance here anyway.

Contributor:
to make this code more understandable consider naming:
const size_t each_key_length = node_contents[0].first.size(); (or something similar)

consider renaming: i -> key_char_idx, j -> key_idx, v -> ch/val

std::array<bool, 256> hits{};
IndexSplitInfo info{static_cast<uint8_t>(i), 255, 0, 0};
for (size_t j = 0; j < node_contents.size(); j++) {
uint8_t v = node_contents[j].first[i];
if (!hits[v]) {
hits[v] = true;
info.count++;
info.min = std::min(v, info.min);
Contributor:
I'm wondering if it would be faster to write this as:

if (v < info.min) {
 info.min = v;
} else if (v > info.max) {
  info.max = v;
}

It feels to me like what you have is a little more readable, but may have an extra branch. And it will be always executing two writes into info and only 0 or 1 are needed. Not sure if the compiler would do that optimization.

Contributor Author:
We shouldn't really care about microoptimizing the speed of any function that isn't find().

Contributor:
sure. I was having trouble figuring out what's find() and what's not.

Contributor Author:
Hopefully that's a lot clearer now that there's no lambdas. Now only find() is find(). :)
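
A side note on the `else if` variant suggested earlier in this thread, assuming the sentinel initialization from the diff (`min` = 255, `max` = 0) is kept: the two forms are not equivalent. A value that lowers `min` never gets a chance to raise `max`, so the first value seen, or any strictly descending run, leaves `max` stale. The sketch below uses hypothetical names to show the difference.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical struct mirroring the min/max fields of IndexSplitInfo.
struct Range {
  uint8_t min = 255;
  uint8_t max = 0;
};

// The form used in the diff: both bounds are updated for every value.
Range rangeWithMinMax(const std::vector<uint8_t>& values) {
  assert(!values.empty());
  Range r;
  for (uint8_t v : values) {
    r.min = std::min(v, r.min);
    r.max = std::max(v, r.max);
  }
  return r;
}

// The suggested form: at most one write per value, but the `else` means a
// value that lowers min is never compared against max.
Range rangeWithElseIf(const std::vector<uint8_t>& values) {
  assert(!values.empty());
  Range r;
  for (uint8_t v : values) {
    if (v < r.min) {
      r.min = v;
    } else if (v > r.max) {
      r.max = v;
    }
  }
  return r;
}
```

For the single value 100, the first function yields min = max = 100, while the second yields min = 100 and max = 0. Two independent `if` statements would avoid that, at the cost of the branch the suggestion was trying to save.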

info.max = std::max(v, info.max);
}
}
if (info.count > best.count) {
best = info;
}
}
std::vector<FindFn> nodes;
nodes.resize(best.max - best.min + 1);
std::sort(node_contents.begin(), node_contents.end(), [&best](const KV& a, const KV& b) {
return a.first[best.index] < b.first[best.index];
});
auto it = node_contents.begin();
for (int i = best.min; i <= best.max; i++) {
auto start = it;
while (it != node_contents.end() && it->first[best.index] == i) {
it++;
}
if (it != start) {
// Optimization was tried here, std::array<KV, 256> rather than
// a smaller-range vector with bounds, to keep locality and reduce
// comparisons. It didn't help.
std::vector<KV> next_contents;
next_contents.reserve(it - start);
std::copy(start, it, std::back_inserter(next_contents));
nodes[i - best.min] = createEqualLengthNode(next_contents);
Contributor:
this recursive lambda generation looks really cool but I'm wondering if it's the most efficient thing we can do. As a thought experiment I'm wondering what the code would look like if you had to write this in C. I assume it would be possible (if a little more verbose). Would it be faster?

Contributor Author:
If I was doing it in C it would be almost the same - I don't know how much overhead there is in std::function but for the C structure of this I'd be using something like a struct containing a function pointer and a union of the captured data that each of the node-types takes, and that function pointer would take a pointer to that same structure... which seems like it's probably at least about the same as what a lambda is doing.

It might be possible to optimize further with more manual constructs, but I think a 60% speedup is probably good enough, and it could have another pass later if someone wants to try to do better.
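
The commit list mentions a later switch to "Virtual class instead of lambdas". That code is not shown in this 6-commit view, so the following is only a rough sketch of what such a node layout could look like, not the merged implementation. All class names are hypothetical, the value type is fixed to `int` with 0 as "not found", and construction of the tree is left to the caller.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <memory>
#include <string>
#include <string_view>
#include <utility>
#include <vector>

// Hypothetical node interface; one virtual call per level replaces one
// std::function call per level.
class Node {
public:
  virtual ~Node() = default;
  virtual int find(std::string_view key) const = 0;
};

// Leaf: only one candidate key remains, so a full string compare decides.
class LeafNode : public Node {
public:
  LeafNode(std::string key, int value) : key_(std::move(key)), value_(value) {}
  int find(std::string_view key) const override { return key == key_ ? value_ : 0; }

private:
  const std::string key_;
  const int value_;
};

// Branch: dispatches on the character at index_, with children stored for the
// contiguous character range starting at min_ (gaps are null).
class BranchNode : public Node {
public:
  BranchNode(size_t index, uint8_t min, std::vector<std::unique_ptr<Node>> children)
      : index_(index), min_(min), children_(std::move(children)) {}
  int find(std::string_view key) const override {
    assert(index_ < key.size()); // callers are expected to match on length first
    const uint8_t k = static_cast<uint8_t>(key[index_]);
    if (k < min_ || k >= min_ + children_.size() || children_[k - min_] == nullptr) {
      return 0;
    }
    return children_[k - min_]->find(key);
  }

private:
  const size_t index_;
  const uint8_t min_;
  const std::vector<std::unique_ptr<Node>> children_;
};
```

A branch over `key-1` and `key-2` would be built with index 4, min `'1'`, and two leaf children. The bounds check is the same one the lambda in the diff performs; only the dispatch mechanism differs, and this sketch makes no claim about which is faster.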

}
}
return [nodes = std::move(nodes), min = best.min,
index = best.index](const absl::string_view& key) -> Value {
uint8_t k = static_cast<uint8_t>(key[index]);
// Possible optimization was tried here, populating empty nodes with
// a function that returns {} to reduce branching vs checking for null
// nodes. Checking for null nodes benchmarked faster.
if (k < min || k >= min + nodes.size() || nodes[k - min] == nullptr) {
return {};
}
return nodes[k - min](key);
};
}
std::vector<FindFn> table_;
};

} // namespace Envoy
1 change: 1 addition & 0 deletions source/common/http/BUILD
@@ -449,6 +449,7 @@ envoy_cc_library(
":headers_lib",
"//envoy/http:header_map_interface",
"//source/common/common:assert_lib",
"//source/common/common:compiled_string_map_lib",
"//source/common/common:dump_state_utils",
"//source/common/common:empty_string",
"//source/common/common:non_copyable",
10 changes: 5 additions & 5 deletions source/common/http/header_map_impl.cc
@@ -119,15 +119,15 @@ template <> HeaderMapImpl::StaticLookupTable<RequestHeaderMap>::StaticLookupTabl
   INLINE_REQ_HEADERS(REGISTER_DEFAULT_REQUEST_HEADER)
   INLINE_REQ_RESP_HEADERS(REGISTER_DEFAULT_REQUEST_HEADER)
 
-  finalizeTable();
-
   // Special case where we map a legacy host header to :authority.
   const auto handle =
       CustomInlineHeaderRegistry::getInlineHeader<RequestHeaderMap::header_map_type>(
           Headers::get().Host);
-  add(Headers::get().HostLegacy.get().c_str(), [handle](HeaderMapImpl& h) -> StaticLookupResponse {
-    return {&h.inlineHeaders()[handle.value().it_->second], &handle.value().it_->first};
-  });
+  finalizeTable(
+      {{std::string{Headers::get().HostLegacy.get()},
+        [handle](HeaderMapImpl& h) -> StaticLookupResponse {
+          return {&h.inlineHeaders()[handle.value().it_->second], &handle.value().it_->first};
+        }}});
 }
 
 template <> HeaderMapImpl::StaticLookupTable<RequestTrailerMap>::StaticLookupTable() {
20 changes: 14 additions & 6 deletions source/common/http/header_map_impl.h
@@ -11,6 +11,7 @@
 #include "envoy/config/core/v3/base.pb.h"
 #include "envoy/http/header_map.h"
 
+#include "source/common/common/compiled_string_map.h"
 #include "source/common/common/non_copyable.h"
 #include "source/common/common/utility.h"
 #include "source/common/http/headers.h"
@@ -146,18 +147,23 @@ class HeaderMapImpl : NonCopyable {
    */
   template <class Interface>
   struct StaticLookupTable
-      : public TrieLookupTable<std::function<StaticLookupResponse(HeaderMapImpl&)>> {
+      : public CompiledStringMap<std::function<StaticLookupResponse(HeaderMapImpl&)>> {
     StaticLookupTable();
 
-    void finalizeTable() {
+    void finalizeTable(std::vector<KV> extra = {}) {
       CustomInlineHeaderRegistry::finalize<Interface::header_map_type>();
       auto& headers = CustomInlineHeaderRegistry::headers<Interface::header_map_type>();
-      size_ = headers.size();
+      size_ = headers.size() + extra.size();
+      std::vector<KV> input;
+      input.reserve(size_);
       for (const auto& header : headers) {
-        this->add(header.first.get().c_str(), [&header](HeaderMapImpl& h) -> StaticLookupResponse {
-          return {&h.inlineHeaders()[header.second], &header.first};
-        });
+        input.emplace_back(std::make_pair(
+            std::string{header.first.get()}, [&header](HeaderMapImpl& h) -> StaticLookupResponse {
+              return {&h.inlineHeaders()[header.second], &header.first};
+            }));
       }
+      std::copy(extra.begin(), extra.end(), std::back_inserter(input));
+      compile(input);
     }
 
     static size_t size() {
@@ -345,6 +351,8 @@ class HeaderMapImpl : NonCopyable {
   const uint32_t max_headers_kb_ = UINT32_MAX;
   // This holds the max count of the headers in the HeaderMap.
   const uint32_t max_headers_count_ = UINT32_MAX;
+
+  template <class T> friend class StaticLookupBenchmarker;
 };
 
 /**
5 changes: 5 additions & 0 deletions test/common/common/BUILD
@@ -281,6 +281,11 @@ envoy_cc_test(
],
)

envoy_cc_test(
name = "compiled_string_map_test",
srcs = ["compiled_string_map_test.cc"],
)

envoy_cc_test(
name = "packed_struct_test",
srcs = ["packed_struct_test.cc"],
38 changes: 38 additions & 0 deletions test/common/common/compiled_string_map_test.cc
@@ -0,0 +1,38 @@
#include "source/common/common/compiled_string_map.h"

#include "gmock/gmock.h"
#include "gtest/gtest.h"

namespace Envoy {

using testing::IsNull;

TEST(CompiledStringMapTest, FindsEntriesCorrectly) {
CompiledStringMap<const char*> map;
map.compile({
{"key-1", "value-1"},
{"key-2", "value-2"},
{"longer-key", "value-3"},
{"bonger-key", "value-4"},
{"bonger-bey", "value-5"},
{"only-key-of-this-length", "value-6"},
});
EXPECT_EQ(map.find("key-1"), "value-1");
EXPECT_EQ(map.find("key-2"), "value-2");
EXPECT_THAT(map.find("key-0"), IsNull());
EXPECT_THAT(map.find("key-3"), IsNull());
EXPECT_EQ(map.find("longer-key"), "value-3");
EXPECT_EQ(map.find("bonger-key"), "value-4");
EXPECT_EQ(map.find("bonger-bey"), "value-5");
EXPECT_EQ(map.find("only-key-of-this-length"), "value-6");
EXPECT_THAT(map.find("songer-key"), IsNull());
EXPECT_THAT(map.find("absent-length-key"), IsNull());
}

TEST(CompiledStringMapTest, EmptyMapReturnsNull) {
CompiledStringMap<const char*> map;
map.compile({});
EXPECT_THAT(map.find("key-1"), IsNull());
}

} // namespace Envoy
60 changes: 60 additions & 0 deletions test/common/http/header_map_impl_speed_test.cc
@@ -295,5 +295,65 @@ static void headerMapImplRemovePrefix(benchmark::State& state) {
}
BENCHMARK(headerMapImplRemovePrefix)->Arg(0)->Arg(1)->Arg(5)->Arg(10)->Arg(50);

template <class HeaderMapType> class StaticLookupBenchmarker {
public:
absl::optional<HeaderMapImpl::StaticLookupResponse> lookup(absl::string_view key) {
return table_.lookup(*ignored_, key);
}

private:
std::unique_ptr<RequestHeaderMapImpl> ignored_ = RequestHeaderMapImpl::create();
HeaderMapImpl::StaticLookupTable<HeaderMapType> table_;
};

template <class HeaderMapType>
static void headerMapImplStaticLookups(benchmark::State& state,
const std::vector<std::string>& keys) {
int i = keys.size();
StaticLookupBenchmarker<HeaderMapType> table;
for (auto _ : state) {
UNREFERENCED_PARAMETER(_);
auto result = table.lookup(keys[--i]);
if (i == 0) {
i = keys.size();
}
benchmark::DoNotOptimize(result);
}
}

static std::vector<std::string> makeMismatchedHeaders() {
return {
"x-envoy-banana",
"some-unknown-header",
"what-is-this-header",
"nobody-expects-this-header",
"another-unexpected-header",
"x-is-a-letter",
"x-y-problems-are-the-worst",
};
}

#define ADD_HEADER_TO_KEYS(name) keys.emplace_back(Http::Headers::get().name);
static void bmHeaderMapImplRequestStaticLookupHits(benchmark::State& state) {
std::vector<std::string> keys;
INLINE_REQ_HEADERS(ADD_HEADER_TO_KEYS);
headerMapImplStaticLookups<RequestHeaderMap>(state, keys);
}
static void bmHeaderMapImplResponseStaticLookupHits(benchmark::State& state) {
std::vector<std::string> keys;
INLINE_RESP_HEADERS(ADD_HEADER_TO_KEYS);
headerMapImplStaticLookups<ResponseHeaderMap>(state, keys);
}
static void bmHeaderMapImplRequestStaticLookupMisses(benchmark::State& state) {
headerMapImplStaticLookups<RequestHeaderMap>(state, makeMismatchedHeaders());
}
static void bmHeaderMapImplResponseStaticLookupMisses(benchmark::State& state) {
headerMapImplStaticLookups<ResponseHeaderMap>(state, makeMismatchedHeaders());
}
BENCHMARK(bmHeaderMapImplRequestStaticLookupHits);
BENCHMARK(bmHeaderMapImplResponseStaticLookupHits);
BENCHMARK(bmHeaderMapImplRequestStaticLookupMisses);
BENCHMARK(bmHeaderMapImplResponseStaticLookupMisses);

} // namespace Http
} // namespace Envoy