Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

finagle-base-http: add MapBackedHeaderMap #805

Closed

Conversation

martijnhoekstra
Copy link
Contributor

Problem

The default headermap implementation is not future-proof, and relies on HashMap internals that will change in scala 2.13

Solution

Provide an alternative headermap implemtation that is backed by Map, an interface that will work across versions.

Result

The option to use a 2.13-proof headermap.

@martijnhoekstra
Copy link
Contributor Author

Needs to be benchmarked. I'll cherry-pick the benchmark extensions from https://github.com/twitter/finagle/pull/797/files#diff-b61dcd2fdf0b7a577f018c14e2d83800 into this, and we can bench-off

@martijnhoekstra martijnhoekstra changed the title add MapBackedHeaderMap finagle-base-http: add MapBackedHeaderMap Oct 4, 2019
@martijnhoekstra
Copy link
Contributor Author

flaky test error presumed spurious

@martijnhoekstra
Copy link
Contributor Author

https://docs.google.com/spreadsheets/d/1C402zjjqmEhC-Aw5LOf_XwzjYUFcPX1XT0kI-B_7lVQ/edit?usp=sharing has benchmarks (while laptop wasn't fully idle, so a bit dirty).

get takes four times as long in this impl vs baseline. How that reflects to real world performance is anybodies guess without macrobenchmarking/end-to-end tests

@mosesn
Copy link
Contributor

mosesn commented Oct 4, 2019

Thanks! I really appreciate you breaking up the patch, it's much easier to read now.

@mosesn
Copy link
Contributor

mosesn commented Oct 4, 2019

How does it work if it's backed by a java hashmap? I wonder if that will make it easier to avoid allocs etc

@martijnhoekstra
Copy link
Contributor Author

I can run another bench backed with a java hashmap, but if we want real insights in allocs, we should run under JFR, and analyse those results. I never did that before though.

I'm also not too sure the performance bottleneck hinges on allocation. That's also something that JFR dumps or flamegraphs could help us out more with than looking at microbenchmarks and guessing.

This weekend my volunteer/life balance will favour the living, so if there will be anything from me, it'll be after the weekend.

@mosesn
Copy link
Contributor

mosesn commented Oct 4, 2019

sgtm, no rush! jmh has an alloc mode also, I think it's -prof gc

@martijnhoekstra
Copy link
Contributor Author

Also related, the 2.13 Map implementation changed for performance reasons, and should be faster than the 2.12 Map. If performance of the 2.13 one is good (that's a big if, but still), you could just keep the current impl as the default for 2.12. It's just a matter of making what HeaderMap.apply returns version specific.

@bryce-anderson
Copy link
Contributor

I have an idea along the lines moses is suggesting that might be worth trying. We should try using a Java collection as the backing collection but use the java.util.TreeMap which allows you to provide a Comparitor[String] that is not case sensitive. For example this should work and be pretty snappy:

val SharedComparitor = new Comparator[String] {
  def compare(o1: String, o2: String): Int = {
    // Shorter strings are always less, regardless of their content
    val lenthDiff = key1.length - key2.length
    if (lenthDiff != 0) lenthDiff
    else {
      @tailrec
      def go(i: Int): Int = {
        if (i == key1.length) 0 // end, they are equal.
        else {
          val char1 = hashChar(key1.charAt(i))
          val char2 = hashChar(key2.charAt(i))
          val diff = char1 - char2
          if (diff == 0) go(i + 1)
          else diff
        }
      }
      go(0)
    }
  }

There's a really solid chance this would be faster than what is currently in master.

@martijnhoekstra
Copy link
Contributor Author

Sounds good! I also still want to give the trie approach another go. I also want to understand why the current approach is so much faster in finding an entry than the scala map backed approach is.

@bryce-anderson
Copy link
Contributor

bryce-anderson commented Oct 4, 2019 via email

@martijnhoekstra
Copy link
Contributor Author

That sounds plausible. For the plain tree version I'm afraid to lose a lot of performance on traversing a common prefix several times (e.g. content-*). At least it'll be good to have options to compare.

@martijnhoekstra
Copy link
Contributor Author

martijnhoekstra commented Oct 7, 2019

Some more benchmark results at this google doc (ran with too few iterations for too short for real results) indicate that the java tree map approach so far is indeed comes pretty close to the implementation master.

I'd propose going forward replacing the implementation in master with the java tree map backed version for all versions.

@martijnhoekstra martijnhoekstra force-pushed the mapBackedHeaderMap branch 3 times, most recently from b0ebd3f to 60e8df5 Compare October 7, 2019 10:57
@codecov-io
Copy link

Codecov Report

Merging #805 into develop will increase coverage by 0.14%.
The diff coverage is 1.6%.

Impacted file tree graph

@@             Coverage Diff             @@
##           develop     #805      +/-   ##
===========================================
+ Coverage    68.27%   68.41%   +0.14%     
===========================================
  Files          844      841       -3     
  Lines        24839    23821    -1018     
  Branches      1833     1854      +21     
===========================================
- Hits         16958    16297     -661     
+ Misses        7881     7524     -357
Impacted Files Coverage Δ
...tter/finagle/http/headers/MapBackedHeaderMap.scala 0% <0%> (ø)
...ter/finagle/http/headers/JMapBackedHeaderMap.scala 0% <0%> (ø)
...cala/com/twitter/finagle/http/headers/Header.scala 46.15% <0%> (-10.99%) ⬇️
.../com/twitter/finagle/http/HeaderMapBenchmark.scala 0% <0%> (ø) ⬆️
...finagle/http/headers/JTreeMapBackedHeaderMap.scala 0% <0%> (ø)
...com/twitter/finagle/http/headers/HeadersHash.scala 79.31% <100%> (-20.69%) ⬇️
.../twitter/finagle/netty4/http/Netty4HttpCodec.scala 0% <0%> (-100%) ⬇️
...com/twitter/finagle/http/service/NullService.scala 0% <0%> (-100%) ⬇️
...http2/transport/client/Http2ClientDispatcher.scala 0% <0%> (-100%) ⬇️
...finagle/http/filter/AddResponseHeadersFilter.scala 0% <0%> (-100%) ⬇️
... and 172 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 32bc9f4...60e8df5. Read the comment docs.

Copy link
Contributor

@bryce-anderson bryce-anderson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree that the j.u.TreeMap approach looks promising, promising enough to become the main implementation.

@bryce-anderson
Copy link
Contributor

@martijnhoekstra, I just wanted to give you a heads up, but I'm working on some optimizations in Headers.scala. It looks like conflicts will be minimal, but just wanted you to know.

@martijnhoekstra
Copy link
Contributor Author

Thanks for letting me know, I'm sure the conflicts won't be a big deal.

@martijnhoekstra
Copy link
Contributor Author

addressed the feedback, and rebased for the Header changes

@martijnhoekstra martijnhoekstra force-pushed the mapBackedHeaderMap branch 2 times, most recently from d5d7a03 to 7dae34c Compare October 10, 2019 08:05
@martijnhoekstra
Copy link
Contributor Author

@bryce-anderson I think I got them all. Do you have any additional issues?

Copy link
Contributor

@bryce-anderson bryce-anderson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry it took so long to check this out, I must have missed it in my inbox.
This all looks really good, pretty much just nit-picks for performance etc. Thanks for taking care of this!

def iterator: Iterator[HeaderMap.NameValue] =
if (next == null) Iterator.single(this)
else {
var cur: Header = this
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you move cur into the definition of AbstractIterator you can avoid lifting it to the the heap via scala.runtime.ObjectRef.

Comment on lines 67 to 69
final def iterator: Iterator[(String, String)] = underlying.synchronized {
copyHeaders.flatMap(_.iterator.map(nv => (nv.name, nv.value)))
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks like nameValueIterator.map(nv => (nv.name, nv.value)).

* a mutable [[Map[String, Header]]], where the map key
* is forced to lower case
*/
private[http] trait JMapBackedHeaderMap extends HeaderMap {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This feels like it should be an abstract class, and probably just vanilla private.

Comment on lines 15 to 21

def getAll(key: String): Seq[String] = underlying.synchronized {
underlying.get(key.toLowerCase) match {
case null => Nil
case r: Header.Root => r.values
}
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Indentation is off.

* a mutable [[Map[String, Header]]], where the map key
* is forced to lower case
*/
final private[http] class JTreeMapBackedHeaderMap extends JMapBackedHeaderMap {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is the value of separating this from JMapBackedHeaderMap? I don't anticipate we'll want to change the underlying map all that often.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fwiw, I'm not strongly opinionated about this, it's just that having an abstract backing map that is synchronized on in two different traits/classes makes me queasy.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It made sense when the HashMap variety still existed. Not so much anymore.


// Does not validate key and value.
def addUnsafe(key: String, value: String): this.type = underlying.synchronized {
def header = Header.root(key, value)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

header is only needed if we hit the null case. Can we move its instantiation there?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's a def for that reason, but it can go inside the match. The only reason it's outside now is neater indentation, so that's not much of a reason.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, I didn't notice it was a def. :) Maybe just inline it where header is used and we'll both be happy.

As an aside, do you know if that def allocates a lambda, or does that get lifted out into the object itself?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I inlined it.

As for the allocation, I assumed it gets hoysted to a method all the way to the top, especially since it doesn't capture anything and doesn't escape scope, but I'm not 100% sure either way.

}

def get(key: String): Option[String] = underlying.synchronized {
Option(underlying.get(key)).map(_.value)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's do a match to avoid allocating a second Some and the lambda used to map.

Copy link
Contributor

@bryce-anderson bryce-anderson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Only one more piece of feedback, but this looks really good! Thanks again for tackling it.

Comment on lines 65 to 83
override def keysIterator: Iterator[String] = underlying.synchronized {
//the common case has a single element in Headers. Prevent unneeded List
//allocations for that case (don't flatMap)
val valuesIterator = copyHeaders
var currentEntries: Iterator[String] = Iterator.empty
new AbstractIterator[String]{
def hasNext: Boolean = currentEntries.hasNext || valuesIterator.hasNext
def next(): String =
if (currentEntries.hasNext) currentEntries.next()
else {
val h = valuesIterator.next()
if (h.next == null) h.name
else {
currentEntries = h.iterator.map(nv => nv.name).toList.distinct.iterator
currentEntries.next()
}
}
}
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks pretty similar in result to HeadersHash.uniqueNamesIterator. Could we share that logic?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I factored the (lazier) logic of HeaderHash out, and made both call that.

@martijnhoekstra
Copy link
Contributor Author

CI failures look spurious

Copy link
Contributor

@mosesn mosesn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this looks pretty much done except we need to work out this concurrentmodification thing. LGTM otherwise!


/**
* Underlying headers eagerly copied to an array, without synchronizing
* on the underlying collection.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe add a note that since it's not synchronized, it must be synchronized by the calling method? I think it might be safe to always synchronize here, since java synchronized does reentrancy pretty cheaply (I think)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added the note -- I also think that re-entrance is pretty cheap, but I wasn't entirely sure, and it's called only a handful of times, and only private[this] I'm happy to change it if you see those trade-offs differently.

)
}

override def foreach[U](f: ((String, String)) => U): Unit =
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was talking to @bryce-anderson and we think this might open us to ConcurrentModificationException issues, since we don't synchronize while iterating, and it iterates over the underlying map directly. it seems like we might have a few options here:

  1. instead of TreeMap use a ConcurrentSkipListMap
  2. synchronize foreach on underlying
  3. remove the optimization, instead synchronize and copy the iterator when we call foreach

The risk of synchronizing foreach is that we're going to lock on someone else's lambda, but it might be worth just doing it for now, since it will still let us use your optimization at very little cost in the common case (single-threaded access). On the other hand, it would be pretty cool to use a ConcurrentSkipListMap 😈. how do you feel about it?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now I wish I never removed the abstraction to swap out the underlying Map implementation.

But let's take a step back. How often do people really concurrently iterate over headers and also mutate the header map and don't synchronize that access because they don't care whether or not the modified data should or shouldn't be part of the iteration? Is that a use-case you want to facilitate in the first place?

Thinking about those questions, my conclusion is that throwing a ConcurrentModificationException that makes the user aware of the situation/race condition/potential bug is better than synchronizing and not being deterministic in what happens first. If the user really wants that for some unforeseen, unspoken and unspeakable reason, they can always still externally synchronize (maybe with a ReadWriteLock or something).

I think in terms of common usage patterns, that's the fastest and most reasonable thing to do too. Even if a ConcurrentSkipListMap would be pretty cool.

If we do want to get overboard and fancy, it should be validated with benchmarks that reflect real-world usage patterns, including the real-world concurrent patterns, because otherwise we're not really measuring anything close to reality.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the good outcome is that they get a ConcurrentModificationException–a bad outcome might be an infinite loop or worse. See https://ivoanjo.me/blog/2018/07/21/writing-to-a-java-treemap-concurrently-can-lead-to-an-infinite-loop-during-reads/

I think correctness needs to be our #1 concern. I agree with you that it's extremely unlikely–that's one of the reasons I'm willing to just wrap it in a synchronized block and call it a day.

I'm not sure I understand your concern about not being deterministic about what happens first–that's always going to be the case with concurrent use of a data structure. My concern is that HeaderMap used to be threadsafe, so people may have been already using it assuming it was threadsafe, and then one day we merge this in and their application breaks in an extremely subtle way. If this was a new abstraction, I would be open to considering, "HeaderMap is not threadsafe, all concurrent access must be synchronized by the user" but I don't think that's the right way forward given the circumstances.

My bet is the same as yours, that it's extremely unusual to access it concurrently, so I don't think we need to benchmark it. But I do think that we should protect against it. Locking is pretty cheap when there's no concurrent access, so I don't think it should be that painful to just lock it.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Jikes! I didn't realize the concurrent case could lead to non-termination. That makes not synchronizing a non-starter, and the below just an aside for academic curiosity.

The concern about concurrent iteration and modification (specifically those two) is that the concept of iteration of all headers and concurrently modifying those headers makes no sense -- either you want those in or out, but you always care, or you wouldn't iteratate them. You must synchronise externally for the operation to be meaningful and correct.

That's not always the case for two concurrent modifications that will never modify the same key, or concurrently getting a key, and modifying some keys that you know to be different to the keys you're concurrently getting.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've run into this non-termination before... It was a nightmare, especially considering it was happening on a Netty thread.

Normally I'd agree that we'd prefer a ConcurrentModificationException and we've tried that in the past but many users are doing the wrong thing in benign places (think reusing a request to do warmup requests or something) and it just wasn't worth the hassle to introduce the behavior change.

I think the solution you've already committed of simply synchronizing the foreach call is fine since you're right: this shouldn't be happening very often, if at all. 🤞

Copy link
Contributor

@vkostyukov vkostyukov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks pretty great @martijnhoekstra! Any chance you can attach benchmark results to the PR description?

@martijnhoekstra
Copy link
Contributor Author

This looks pretty great @martijnhoekstra! Any chance you can attach benchmark results to the PR description?

Shall I re-run after I do the last round of feedback (and we know what to do with potentially concurrent modification and iteration?)

@martijnhoekstra
Copy link
Contributor Author

quick ( -i 5 -wi 3 -r 1 -w 5) benchmark run:

[info] Benchmark                                          Mode  Cnt     Score     Error  Units
[info] HashBackedHeaderMapBenchmark.create                avgt    5    58.571 ▒  18.232  ns/op
[info] HashBackedHeaderMapBenchmark.createAndAdd          avgt    5   148.290 ▒  28.821  ns/op
[info] HashBackedHeaderMapBenchmark.createAndAdd24        avgt    5  3435.482 ▒ 594.920  ns/op
[info] HashBackedHeaderMapBenchmark.getRandom             avgt    5    28.593 ▒   5.335  ns/op
[info] HashBackedHeaderMapBenchmark.getRealistic          avgt    5    26.842 ▒   2.421  ns/op
[info] HashBackedHeaderMapBenchmark.iterate               avgt    5   265.780 ▒  56.199  ns/op
[info] HashBackedHeaderMapBenchmark.iterateKeys           avgt    5   142.899 ▒  19.002  ns/op
[info] HashBackedHeaderMapBenchmark.iterateRepeatedKeys   avgt    5   109.857 ▒  48.143  ns/op
[info] HashBackedHeaderMapBenchmark.iterateValues         avgt    5   196.082 ▒ 113.488  ns/op
[info] HashBackedHeaderMapBenchmark.keySet                avgt    5     5.904 ▒   1.852  ns/op
[info] JTMapBackedHeaderMapBenchmark.create               avgt    5    15.710 ▒   3.428  ns/op
[info] JTMapBackedHeaderMapBenchmark.createAndAdd         avgt    5    78.716 ▒  18.136  ns/op
[info] JTMapBackedHeaderMapBenchmark.createAndAdd24       avgt    5  2242.704 ▒ 450.955  ns/op
[info] JTMapBackedHeaderMapBenchmark.getRandom            avgt    5    41.988 ▒   6.247  ns/op
[info] JTMapBackedHeaderMapBenchmark.getRealistic         avgt    5    41.815 ▒  24.553  ns/op
[info] JTMapBackedHeaderMapBenchmark.iterate              avgt    5   164.154 ▒  64.958  ns/op
[info] JTMapBackedHeaderMapBenchmark.iterateKeys          avgt    5   141.076 ▒  34.300  ns/op
[info] JTMapBackedHeaderMapBenchmark.iterateRepeatedKeys  avgt    5   110.051 ▒  32.983  ns/op
[info] JTMapBackedHeaderMapBenchmark.iterateValues        avgt    5   353.827 ▒  40.716  ns/op
[info] JTMapBackedHeaderMapBenchmark.keySet               avgt    5     6.344 ▒   2.265  ns/op

Copy link
Contributor

@mosesn mosesn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is really great, thanks so much @martijnhoekstra! really excited to start merging this in.

@mosesn mosesn added the Ship It label Oct 18, 2019
@mosesn
Copy link
Contributor

mosesn commented Nov 4, 2019

@martijnhoekstra the code for this was merged in 193c603, but we cut it back over to the old impl for now, so that we can do some performance tests before switching over entirely. I hope you're partially unblocked for now

@mosesn
Copy link
Contributor

mosesn commented Nov 7, 2019

@martijnhoekstra we found in someone else's test suite a use case where someone calls something like headerMap.foreach { case (k, v) => headerMap.add(k, ....) } which of course triggers a ConcurrentModificationException . . . so we're going to have to copy the iterator anyway. :/ Sorry about that, we'll be merging it in soon though!

@martijnhoekstra
Copy link
Contributor Author

https://xkcd.com/1172/

@enbnt
Copy link
Contributor

enbnt commented Nov 25, 2019

This branch was merged 193c603#diff-9cea8690cac77728b4a871d2c3eb222c. Just circling back now to close out the PR.

Thank you, @martijnhoekstra!

@enbnt enbnt closed this Nov 25, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Development

Successfully merging this pull request may close these issues.

None yet

6 participants