Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implementation of HPACK encoding and header tables #10

Merged
merged 30 commits into from
Aug 16, 2018

Conversation

AlanQuatermain
Copy link
Contributor

Implementation of HPACK routines, including integer compression, header compression and indexed header tables, and huffman string compression.

Motivation:

This is the beginning of removing the dependency on nghttp2.

Modifications:

I've added a new library target, NIOHPACK, which contains all the necessary pieces to implement HPACK encoding as specified in RFC 7541. There are unit tests exercising all the code and some additional edge cases around integer encoding, along with some performance tests for the Huffman encode/decode process. The latter currently takes longer to decode than to encode, which is worrying, but this is only really noticeable when encoding > 16KB strings as a rule; therefore I've decided not to pull out the optimization hammer just yet. Included in the unit tests are the inputs and outputs from all the examples in RFC 7541 and RFC 7540.

I've tried to match the coding style from the repo, but some of my own idioms may have crept in here & there. Please let me know if you see anything wrong in that regard.

Currently the static header table and Huffman encoding tables are implemented as Swift lets. For the static header table that's ok (only 62 entries), but the Huffman table is huge. I'm debating whether it's worth having that in a plain C file somewhere, but I'm not sure what the tradeoffs are vs. performance. Right now it seems to only affect compilation time, but I may be wrong.

Result:

In general, nothing will change: nothing in NIOHTTP2 is using this library yet. That part is what I'm working on now.

…er compression and indexed header tables, and huffman string compression.

Motivation:

This is the beginning of removing the dependency on `nghttp2`.

Modifications:

I've added a new library target, `NIOHPACK`, which contains all the necessary pieces to implement HPACK encoding as specified in RFC 7541. There are unit tests exercising all the code and some additional edge cases around integer encoding, along with some performance tests for the Huffman encode/decode process. The latter currently takes longer to decode than to encode, which is worrying, but this is only really noticeable when encoding > 16KB strings as a rule; therefore I've decided not to pull out the optimization hammer just yet. Included in the unit tests are the inputs and outputs from all the examples in RFC 7541 and RFC 7540.

I've tried to match the coding style from the repo, but some of my own idioms may have crept in here & there. Please let me know if you see anything wrong in that regard.

Currently the static header table and Huffman encoding tables are implemented as Swift `let`s. For the static header table that's ok (only 62 entries), but the Huffman table is huge. I'm debating whether it's worth having that in a plain C file somewhere, but I'm not sure what the tradeoffs are vs. performance. Right now it seems to only affect compilation time, but I may be wrong.

Result:

In general, nothing will change: nothing in NIOHTTP2 is using this library yet. That part is what I'm working on now.
@swift-nio-bot
Copy link

Can one of the admins verify this patch?

Copy link
Contributor

@Lukasa Lukasa left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Firstly, thanks so much for this @AlanQuatermain! This is massive patch and is generally of really high quality, which is absolutely fantastic. It's also a great first step towards removing the C code from this implementation, which would be truly excellent.

I've left a number of comments in the diff. The large number should not be considered a reflection on the quality of the patch, only on its size.

It's also really good to see that you're interested in doing more work on removing the C code. If you're planning to keep going on it, can I suggest that you consider opening pull requests earlier on in the development flow, with your proposed designs and rationale in place? That will make it easier for us to collaborate on the process, reducing the amount of development work you risk doing that isn't in line with the project direction. It'd also let me parallelise my work with yours when I get some bandwidth to return to this package and chase down the rest of the nghttp2 removal work.

Again, great job with this patch, thanks so much!

Package.swift Outdated
@@ -30,8 +31,13 @@ let package = Package(
.target(name: "NIOHTTP2Server",
dependencies: ["NIOHTTP2"]),
.target(name: "NIOHTTP2",
dependencies: ["NIO", "NIOHTTP1", "NIOTLS", "CNIONghttp2"]),
dependencies: ["NIO", "NIOHTTP1", "NIOTLS", "CNIONghttp2", "NIOHPACK"]),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No need for adding NIOHPACK as a dependency here, we can save the compiler some effort and not bother compiling this.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No problem. I'll add it back in the next pull request once I've added the frame compiler/decoder to NIOHTTP2.

Package.swift Outdated
@@ -19,7 +19,8 @@ let package = Package(
name: "swift-nio-http2",
products: [
.executable(name: "NIOHTTP2Server", targets: ["NIOHTTP2Server"]),
.library(name: "NIOHTTP2", targets: ["NIOHTTP2"])
.library(name: "NIOHTTP2", targets: ["NIOHTTP2"]),
.library(name: "NIOHPACK", targets: ["NIOHPACK"])
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure we want to make NIOHPACK publicly available: it's basically going to be an implementation detail for NIOHTTP2.


import NIOConcurrencyHelpers

class DynamicHeaderTable {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Given that an HPACK dynamic table is a per-connection object, I don't think this should be a class, I think it should be a struct. That has the advantage of making the future memory layout of any HTTP/2 connection object substantially more cache-coherent, which is a big win.

/// The actual table, with items looked up by index.
fileprivate var table: HeaderTableStore = []

fileprivate var lock = ReadWriteLock()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Making this a struct also lets you elide this lock, as we no longer need to consider users concurrently mutating this structure from multiple threads: that's simply forbidden.


subscript(i: Int) -> HeaderTableEntry {
return self.table[i]
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we document that our dynamic table is zero indexed for the purposes of this subscript?

(0x7ffffee, 27), (0x7ffffef, 27), (0x7fffff0, 27), (0x3ffffee, 26), (0x3fffffff, 30)
]

// Great googly-moogly! This comes from the nice folks at nghttp.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hah, nice comment. 😉

Want to extend this a bit with some explanatory text, like from this Python package? https://github.com/python-hyper/hpack/blob/c76d07a6b07a3473bde21b972353be3863a9b68f/hpack/huffman_table.py#L2-L72

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Incidentally we probably need a NOTICES file to add the license text from nghttp2.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, and more doc comments in the objects in this file would be really nice.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay, but I must insist on replacing every "nibble" with "nybble." Just because 😉

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will accept that editorial revision. 😂

static let failure = HuffmanDecoderFlags(rawValue: 0b100)
}

internal let HuffmanDecoderTable: [[HuffmanDecodeEntry]] = [
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So this is a great solution, but I think there are a few nice wins we can have here.

Firstly, array-of-array adds more indirection than we actually need. Rather than have an array of 256 elements, we should have an array of 256 * 16 elements by flattening out all the inner arrays (but preserving their contents).

Once you do that you lose the nice indexing behaviour, but we can get that back by wrapping this table in a struct that exposes a sensible subscript.

Something like this:

internal struct HuffmanDecoderTable {
    subscript(state state: UInt8, nibble nibble: UInt8) {
        assert(nibble < 16)
        return actualTable[(state * 16) + nibble]
    }

    private static let actualTable: [HuffmanDecodeEntry] = [
        // insert flattened table here.
    ]
}

That reduces one indirection at the cost of some math, which is a nice win I think.

HeaderTableEntry("vary", nil),
HeaderTableEntry("via", nil),
HeaderTableEntry("www-authenticate", nil)
]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we pull this out to a new file and just leave the Huffman table here? I really like the idea of sequestering the Huffman table as far from everything else as possible so that we never have to look at the damn thing. 😉

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Most certainly.

// Created by Jim Dovey on 6/27/18.
//

import Foundation
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we can drop this. 😉

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oops. That shouldn't have been in the commit 🤫

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Still here ;)

// now a silly version which should never have been encoded in so many bytes
XCTAssertEqual(try decodeInteger(from: [255, 129, 128, 128, 128, 128, 128, 128, 128, 0], prefix: 8), 256)
}
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This testing is good, but I'd love to see even more. Are you open to bringing in these test fixtures and using them to generate tests? We probably want to test that a) we can decode any of the encoded samples and get the expected result, and b) that we can encode the samples, decode them, and get back to the expected result.

That would give us loads of test coverage, which would be awesome.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rawr. Definitely.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's your preference for this? Have the regular XCTest implementation use these, or have a separate verification script that runs through them, leaving the XCTests for something simpler?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, this is a real future growth area for Swift tooling: the inability to dynamically turn each of those into an XCTest test case makes me extremely sad.

That said, I think I'd still prefer that they lived in XCTests. We probably get the best outcome by using custom error messages on each of the XCTAssert methods that encode the identity of the test case that failed the specific assertion, so that we can more easily track down failures.

@Lukasa
Copy link
Contributor

Lukasa commented Jun 28, 2018

@swift-nio-bot test this please

@Lukasa
Copy link
Contributor

Lukasa commented Jul 12, 2018

Hey @AlanQuatermain, what's the status of this patch?

@AlanQuatermain
Copy link
Contributor Author

Code is written, my own tests run successfully, and I'm integrating the tests from the HPACK test suite. Right now I'm trying to track down a weird bug: while encoding the fourth case of story 06, my decoder doesn't produce the same data that was input, and the two items' dynamic header tables' contents and size don't match. BUT that only happens if I run the entire suite of all stories. If I just run story 06, everything is happy.

The most annoying thing here is that I'm creating new encoder/decoder pairs for each story, with empty dynamic tables and new Huffman encoders. So I'm kinda scratching my head as to what's going on here.

@AlanQuatermain
Copy link
Contributor Author

Huh. It does sometimes go wrong if I just run that one test. Not often, but sometimes. In those cases, two bytes of the Huffman-encoded output are incorrect: A7 → AF and EB → E9. Both are single-bit changes.

Well, at least I know where it's happening now.

@AlanQuatermain
Copy link
Contributor Author

Okay, that problem is solved—UnsafeMutableRawPointer.allocate() isn't zeroing bytes, so I need to do that myself. Now it's crashing much later on in the decoder by going out of bounds while attempting to read an encoded integer. Sigh.

@Lukasa
Copy link
Contributor

Lukasa commented Jul 13, 2018

The joys of avoiding bounds checks! Thanks for keeping up the work though. 👍

Just to keep you appraised I’m swinging back around to the HTTP/2 stuff. I’m going to start work on the other side of the coin, with the frame parser. Hopefully I’ll have a patch up and ready by the end of next week.

@AlanQuatermain
Copy link
Contributor Author

AlanQuatermain commented Jul 13, 2018

Turns out the bounds problem was due to the indices on ByteBufferView not being zero based. There's something else going wrong now that I'll have to look into tomorrow. In the meantime, though, I do have a frame parser as well. I basically have all the wire protocol handling implemented, just needing conversion to mesh with the NIO HTTP/2 implementation. You can see broadly what it looks like at my older repository, where I initially started working on it. I have another variation here somewhere which has been ported to use NIO instead of Foundation, but that's not on Github apparently.

@Lukasa
Copy link
Contributor

Lukasa commented Jul 13, 2018

Turns out the bounds problem was due to the indices on ByteBufferView not being zero based.

Ah yeah, the typical Swift collections issue. Easy trap to fall into, I do it myself a lot.

In the meantime, though, I do have a frame parser as well.

Yeah, I did see that. In the end I concluded that it was probably just as easy to greenfield the frame parser, given that we'd have to make the following set of changes to yours:

  1. Do work early by eagerly parsing frame headers even when the entire frame body is not available.
  2. Avoid buffering DATA frames in their entirety.
  3. Produce NIO's HTTP/2 frame structures.
  4. Port to NIO data structures e.g. bytebuffer (as I didn't see any code with NIO support)

All-in-all that made me err on the side of doing the work in a green field implementation.

If you have a NIO-based implementation sitting around I'll happily re-evaluate it and see how much work it would be to adapt to NIOHTTP2.

@Lukasa
Copy link
Contributor

Lukasa commented Jul 13, 2018

Oh, also, it's my understanding that you have the wire format stuff but none of the protocol state machine, is that correct? It's totally fine, I've built a HTTP/2 state machine before and I'm happy to do it again, just wanted to check that I'd correctly understood the work you've done.

@AlanQuatermain
Copy link
Contributor Author

That's correct—wire protocol is implemented and tested (well, was, anyway) but I'd not yet begun work on the stream things. I never studied CS beyond sixth-form, so when specs throw out references to adjustable-priority queues like they're something everyone would know I rather break out in a cold sweat… 😰

I'm going to look at uploading the NIO-based decoder somewhere shortly, as soon as I've fixed my obviously incorrect decode-bytes-from-hex-string implementation in these integration tests. Oops.

- New `SimpleRingBuffer` type which is used to store the contents of the header tables.
- New `HeaderTableStorage` type which uses a `SimpleRingBuffer` and a `CircularBuffer` to contain header table bytes and indices respectively.
- New `HPACKHeaders` type modeled on `NIOHTTP1.HTTPHeaders` to keep headers around as contiguous UTF-8.
- HPACK encoders and decoders now prefer using `HPACKHeaders` for input and output, though they still provide public API for working with raw `String` types.
- Huffman encoder and decoder are now stateless, and operate on `ByteBuffer` and related types.
- Many things that use bytes are now using `Collection<UInt8>` or `ContiguousCollection<UInt8>` wherever possible, to free up the API to support different types of inputs. `String.UTF8View` is one such example.
- Huffman decoder table is no longer a big array of tuples in the source code. Instead the source contains a big Base-64 string from which the tuple-based table is decoded when it's first required. This reduced the compile time on my laptop by about 40 seconds, which frankly seems silly---in C, a plain array laid out like that wouldn't cause the compiler to bat an eyelid.

Aside from all the code changes, there's a major change in the tests. I'm now pulling in <https://github.com/http2jp/hpack-test-case> as a submodule and dynamically loading and running all the test stories there:

- `testEncoderWithoutHuffmanCoding()` runs all the stories in the `raw-data` folder, encoding and decoding, and writes out the resulting stories to `swift-nio-hpack-plain-text`.
- `testEncoderWithHuffmanCoding()` runs all the stories in the `raw-data` folder, encoding and decoding, and writes out the resulting stories to `swift-nio-hpack-huffman`.
- `testDecoder()` runs through all sets of stories containing wire data and decodes them all, including the ones we just generated while running our encode tests.
@AlanQuatermain
Copy link
Contributor Author

'Tis done, sir. Lots of information inside the commit message, though it's an even bigger diff than was originally the case. Sigh.

As to the frame encode/decode implementation, that would now need to be reworked again based on the changes I've made here. I can do that & submit a separate pull request early next week. One of the main differences between my approach and yours is around the handling of things like settings and frame flags; I quite liked my enumeration representation for those. I also found the OptionSet representation of frame flags to be useful when it comes to ignoring and/or not sending unknown flags. I would have each frame type understand what flags it allowed and vend that for use in intersection() or similar.

My initial first scratch-pad attempt has been uploaded now. It's incomplete, but it should give some idea how I was intending to go about things originally.

@Lukasa
Copy link
Contributor

Lukasa commented Jul 16, 2018

I never studied CS beyond sixth-form, so when specs throw out references to adjustable-priority queues like they're something everyone would know I rather break out in a cold sweat… 😰

Heh, the secret is that the priority queues are entirely optional, and can be omitted when first building the implementation. That's the plan here: we currently don't allow users to see any priority information, and will continue to do so while we replace nghttp2. Priority support can come later.

One of the main differences between my approach and yours is around the handling of things like settings and frame flags; I quite liked my enumeration representation for those.

I don't mind using an OptionSet for flags. I mostly avoided it for the sake of not overcomplicating the API in the initial version: I'd certainly be happy to accept a patch that swaps to that representation.

I'm certainly open to seeing a PR for the frame codec. We'll likely have to do some substantial reworking of it to achieve the incremental frame parsing support we need, but if you'd like to do that work I'm certainly not going to stop you! 😉

Package.swift Outdated
],
dependencies: [
.package(url: "https://github.com/apple/swift-nio.git", from: "1.7.0"),
.package(url: "https://github.com/apple/swift-nio.git", .branch("master")),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Where has this requirement come from?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

None of the tagged builds of swift-nio contain the fix for the precondition in ByteBufferView.init(), and that was causing me to crash all over the place when taking a view of the last chunk of a buffer 😉 If there's been a version-tagged release since last week which includes that change, I'll be happy to take that, because as I'm sure you're aware, targeting master isn't a great solution…

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just checked, and the v1.8.0 tag on swift-nio still contains the following broken precondition in ByteBuffer-views.swift:

precondition(range.lowerBound >= 0 && range.upperBound < buffer.capacity)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yup, there isn't a release yet. 1.9.0 is probably a few weeks away at the moment, so we just need to remember this is here ;)

/// The actual table, with items looked up by index.
private var storage: HeaderTableStorage

/// List of indexes to actual values within the table.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This comment is floating in the air, attached to nothing.


import NIO

struct DynamicHeaderTable {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This structure should get a doc comment.

/// - Returns: A tuple containing the matching index and, if a value was specified as a
/// parameter, an indication whether that value was also found. Returns `nil`
/// if no matching header name could be located.
func findExistingHeader<C : Collection>(named name: C, value: C? = nil) -> (index: Int, containsValue: Bool)? where C.Element == UInt8 {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This function is a bit more tightly constrained than it needs to be: there's no reason the name and value can't be different collection types.

firstNameMatch = index
}

if let value = value, self.storage.view(of: self.storage[index].value).matches(value) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's an early-exit condition here we aren't taking advantage of.

Rather than repeatedly unwrap value, let's hoist this up to the top of the function and invert the condition. We can do this:

guard let value = value else {
    // No constraint on the value, so the first matching name is perfect.
    return self.storage.indices(matching: name).first.map { ($0, false) }
}

That avoids the wrinkle found in this current version of the code, where searching for a name with no value would search the entire header table unnecessarily. It also means we only unwrap the optional once.

guard self.count == other.count else {
return false
}
for (mine, theirs) in zip(self, other) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think there's a function called elementsEqual that might do what we want here?

return true
}

public func matches<S : Sequence>(_ other: S) -> Bool where S.Element == UInt8 {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: space before the colon.

}

public func matches<S : Sequence>(_ other: S) -> Bool where S.Element == UInt8 {
for (mine, theirs) in zip(self, other) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This can definitely use elementsEqual.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Aha, I knew I'd seen something else that would do this. Couldn't for the life of me remember what it was though.


import NIO

public struct RingBufferView : ContiguousCollection, RandomAccessCollection {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it worth making this conform to ContiguousCollection by forcing a memory copy when we overlap the ring buffer? Or would we do better by just using Collection and allowing this to straddle the end of the ring?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Possibly, though its main use case is in providing a ContiguousCollection to be copied into ByteBuffers. The optimization here (and indeed everywhere else) is to try and avoid the byte-by-byte copy semantics of the Sequence writers in ByteBuffer.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suspect that's a false economy.

You're trading the byte-by-byte write for two memory allocations and two linear copies, one of the entire ring buffer and one of the view. Even if you assume the byte-by-byte write is 8x slower than a linear copy, and that the view is the entire ring buffer, if the allocations are even 3x more expensive than the copy then the efficiency has vanished. For small copies this is basically guaranteed. For full buffers this is also going to be an issue, as while we do an optimised vectorised copy of ring buffer we still have to copy the whole thing, and if that thing is 8x the size of the view we want then that copy is more expensive than the direct byte-by-byte write.

ContiguousCollection is really a protocol that is intended to be applied to things that already are contiguous, not to things that can become contiguous by allocating. That's why StaticString conforms to ContiguousCollection, but String does not: it's cheaper to use the Collection machinery on String than to make it contiguous and then memcpy the bytes.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

An allocation for this view would only happen in the case where a view straddled the ends of the ring buffer. For everything else—99% of the use cases—it's a straightforward reference to the underlying contiguous bytes.

I see your point about copying the entire buffer though. That was just me being lazy and not wanting to create a separate RingBuffer instance to hold the requested bytes. How about if I just do that? We still get the economy in 99% of cases, and in 1% we make one allocation and a pair of small copies?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be nice to have generic targets determined at runtime—I could return two instances of a protocol type, with one conforming to ContiguousCollection, and one to Collection, and expect the Right Thing to happen. Le sigh.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've done some tests, and the ratio of copies to direct references is pretty small. Here's the results after running all the decoder tests:

Ring buffer views created: 805842. Number which required copies: 914. Ratio: 0.11342173776000754%

And here for the encoder tests:

Ring buffer views created: 237170. Number which required copies: 873. Ratio: 0.36809039929164733%

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I'd need to see the performance cost of the Collection vs ContiguousCollection over the decoder tests. Mind benchmarking what happens if you change the protocol conformance?

}
}

public struct SimpleRingBuffer {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there any reason this shouldn't just wrap a ByteBuffer, rather than bring in all the implementation details of ByteBuffer?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tried a few different things, but from outside NIO itself I found it quite awkward to manage the internals of the ByteBuffer directly. I found I was doing basically all the same calculations, and the only real difference was that I was either using the ByteBuffer API or the UnsafeMutablePointer API for the underlying work. Since this had only a single client and one use-case I cut it down to just mirroring the String handling from ByteBuffer and none of the fancy raw pointer stuff, with the exception of withVeryUnsafePointer which I'd used somewhere in a private method somewhere else, or so I seemed to recall.

There's definitely an argument for something like ByteBuffer that actually operates as a fixed-capacity circular collection, but the right place for that would be inside NIO proper, sharing some of the implementation details with ByteBuffer directly. I could probably put one together as a PR for swift-nio, but that would happen later; after that I'd change the implementation of the header tables here to use that type instead.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess I'm a bit uncertain about why maintaining two indices outside of ByteBuffer and using the set/get APIs was more complex than bringing this code in. What am I missing?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll have another go; maybe it'll become simpler now that I've really pared down the ring buffer's API.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay, that's done, and I've pruned a lot of the speculative API that wasn't being used yet.

- Huffman encode/decode is now implemented directly on `ByteBuffer` as an extension, with get/read and set/write method pairs.
- Various APIs that take header key/value pairs as Collections or similar now allow differing types for the name and value.
- A few API tweaks as requested by the core team.
@Lukasa
Copy link
Contributor

Lukasa commented Jul 17, 2018

Thanks for pushing up the new patch! I'm going to let us finish circling around our discussion points before I re-review: each review of this patch takes around half a day, so I want to spread them out a bit just to make sure I get other stuff done!

Jim Dovey added 6 commits July 17, 2018 15:12
…as its storage, and its read/write indices have been renamed as head/tail. Much of HPACKHeaders API has been pruned or made private, and RingBufferView has been replaced wholesale by ByteBufferView. In the rare (0.11%–0.36%) case that a requested view wraps the end of the ring, a small ByteBuffer will be allocated to serve as the backing store for the requested view.
…bose, and to report useful timing metrics.
…d. There are still some unused prospective API lurking in there though, lowering the code coverage statistics...
@AlanQuatermain
Copy link
Contributor Author

Oh, hooray, String.UTF8View has its own implementation of _copyContents() that calls fatalError() if it can't write the entire thing. So the initialize(from:) call used to copy sequences is just crashing. Sigh.

@AlanQuatermain
Copy link
Contributor Author

Okay, it's only complaining because the target byte-count is below underestimatedCount, I think. I'm using dropLast(n:) to keep it from complaining about that. Not entirely sure on the economy of that move for the indefinite-length Sequence type, but it should happen rarely enough that it's not a problem. It also reminds me to support set(staticString:) and friends, too.

@AlanQuatermain
Copy link
Contributor Author

Oh wait, I already do support static strings, hehe.

…ng buffer, and added some description-related tests to get the code coverage up to ~90%.
@Lukasa
Copy link
Contributor

Lukasa commented Aug 6, 2018

Ok cool, I don't have admin on the Jenkins box so I can't add the submodule checkout until @tomerd gets back on Thursday. Let's spin back around on this then.

// this function if at least one byte is available to read.
let initial: UInt8 = buffer.getInteger(at: buffer.readerIndex)!
switch initial {
case let x where x & 0x80 == 0x80:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

given that you use 0b1xxxxxxx to document, you could just write this case (and the ones below) as

case let x where x & 0b1000_0000 != 0:

that might make it easier to read?

return false
}

let lhsNames = Set(lhs.names.map { lhs.string(idx: $0).lowercased() })
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

think we should use .lazy.map here as otherwise you'll first build (allocate & initialise) an Array and then build a Set throwing the newly built Array away again.

}

mutating func adjust(by delta: Int, wrappingAt max: Int) {
if start + delta < 0 {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

explicit self. for self.start

@Lukasa
Copy link
Contributor

Lukasa commented Aug 8, 2018

@weissi want to re-review?

Copy link
Member

@weissi weissi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks so much @AlanQuatermain! This looks pretty good! After this I'm happy, it's obviously a huge PR but there we do seem to have a good amount of tests here so I think we should start with this.

The only remaining things I have are regarding all these @_ attributes. I think we should not use them with a few exceptions:

  • public stuff that is performance sensitive and uses generics or closures can be @_inlineable and/or @_versioned because those attributes have now been made public (as of 4.2 as @inlinable and @usableFromInline)
  • other attributes that used to be @_ but are now public
  • the usual exceptions to the rule where an attribute is just obviously important (we should then consider filing a bugs.swift.org bug with this as an example why we think it should be public)

CONTRIBUTORS.txt Outdated
@@ -12,7 +12,11 @@ needs to be listed here.
### Contributors

- Cory Benfield <cbenfield@apple.com>
- Jim Dovey <jimdovey@mac.com>
- Johannes Weiß <github@tux4u.de>
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

mind making this read

Johannes Weiss <johannesweiss@apple.com>

just so my work email address is listed there, sorry

/// - Returns: A tuple containing the matching index and, if a value was specified as a
/// parameter, an indication whether that value was also found. Returns `nil`
/// if no matching header name could be located.
@_specialize(where Name == String.UTF8View, Value == String.UTF8View) // from String-based API
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do we actually need to manually specialise here? We use them both from within this module so it should already be specialised, no? I don't think it's decided yet that @_specialize will become public as @specialize so if possible I'd prefer not to use it at least until it is. @Lukasa what's your take?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suggested it mainly to defend against running into problems. I'm not wedded to it, but it strikes me that we're unlikely to see a meaningful change to _specialize.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Lukasa well, syntax might be different. Or the whole same set of problems that we had with @_inlineable and @_versioned: Fortunately Slava was nice enough to take the warnings out of the language version 4 but they could've also just renamed the attribute without a warning...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll remove them for now, and they can be re-added if necessary in a performance-focussed update. They caused a crash in the Swift 4.2 compiler anyway…

Personally I prefer getting the functionality in place with one commit, then adding the more esoteric performance adjustments separately. First commit works, more can be added for optimization and fine-tuning, of which there may be some here as the HTTP/2 framer implementation evolves.

/// - name: A collection of UTF-8 code points comprising the name of the header to insert.
/// - value: A collection of UTF-8 code points comprising the value of the header to insert.
/// - Returns: `true` if the header was added to the table, `false` if not.
@_specialize(where Name == String.UTF8View, Value == String.UTF8View) // from String-based API
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same comment here

/// - name: A contiguous collection of UTF-8 bytes comprising the name of the header to insert.
/// - value: A contiguous collection of UTF-8 bytes comprising the value of the header to insert.
/// - Returns: `true` if the header was added to the table, `false` if not.
@_specialize(where Name == ByteBufferView, Value == ByteBufferView) // from HPACKHeaders-based API
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

and here

@_versioned private(set) var _readableBytes = 0

// private but tests
@_inlineable @_versioned
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why do we have @_inlineable and @_versioned here? the struct StringRing is not public, nor do they use generics or closures so I don't understand why they would need to be inlined across a module boundary.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

They're an artifact of an earlier commit where I tried to have this be a general-purpose analogue to ByteBuffer. I should've removed them when I removed public.


// single write, should be straightforward
self._storage.withVeryUnsafeBytes { ptr in
let targetAddr = UnsafeMutableRawPointer(mutating: ptr.baseAddress!.advanced(by: self._ringTail))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this should be

let tailPtr = UnsafeMutableRawBufferPointer(rebasing: ptr[(ptr.startIndex + self._ringTail)...]) // given that we know we're not dealing with a slice here, we could drop the `ptr.startIndex +`
tailPtr.copyBytes(from: bytes)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The compiler makes be do this in two steps, but yeah, I like this API more:

let mutable = UnsafeMutableRawBufferPointer(mutating: ptr)
let tailPtr = UnsafeMutableRawBufferPointer(rebasing: mutable[(mutable.startIndex + self._ringTail)...])
tailPtr.copyBytes(from: bytes)

if bytesNeeded <= self.writableBytes {
// just zero the requested number of bytes before we start OR-ing in our values
self.withUnsafeMutableWritableBytes { ptr in
ptr.baseAddress!.assumingMemoryBound(to: UInt8.self).assign(repeating: 0, count: bytesNeeded)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I appreciate that we only recently changed our rules here but we should not longer blindly assume the memory is bound to UInt8. To just zero out the memory I'd suggest

ptr.copyBytes(from: repeatElement(0, count: bytesNeeded))


// now zero all writable bytes that we expect to use
self.withUnsafeMutableWritableBytes { ptr in
ptr.baseAddress!.assumingMemoryBound(to: UInt8.self).assign(repeating: 0, count: bytesNeeded)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same here

}
return withVeryUnsafeBytes { buffer in
// This should never happens as we control when this is called. Adding an assert to ensure this.
let address = buffer.baseAddress!.assumingMemoryBound(to: UInt8.self)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think we need to do any binding assumptions here. buffer does just fine to produce UInt8s.

let address = buffer.baseAddress!.assumingMemoryBound(to: UInt8.self)
for (idx, byte) in view.enumerated() {
// TODO(jim): Not a big fan of the modulo operation here.
guard byte.isASCII && address.advanced(by: ((index.start + idx) % self.capacity)).pointee & 0xdf == byte & 0xdf else {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

replace address.advanced(by: ((index.start + idx) % self.capacity).pointee by buffer[(index.start + idx) % self.capacity].

Lukasa pushed a commit to apple/swift-nio that referenced this pull request Aug 9, 2018
…API (#525)

Motivation:

HTTP/2 adds an additional item to a header key/value pair:
its indexability. By default, headers can be inserted into
the HPACK header table index; sometimes this will be denied
(common case is 'content-length' header, which is rarely going
to be the same, and will just cause the table to purge more
often than is really necessary). Additionally, a header may be
marked as not only non-indexable but also non-rewritable by
proxies. HTTPHeaders provides nowhere to store this, so the
HPACKHeaders implementation referenced in
apple/swift-nio-http2#10 was created to add in that capability.

Now, given that we really want the HTTP version to be an
implementation detail, we want to keep the HPACK details hidden,
and would thus be using HTTPHeaders in client APIs. NIOHTTP2
already has a HTTP1-to-HTTP2 channel handler that translates
between the two. Thus it behooves us to have the means to copy
the actual key/value pairs between the two types without making
a round-trip from UTF-8 bytes to UTF-16 Strings. These changes
allow NIOHPACK or NIOHTTP2 to implement that round-trip internally.

Modifications:

- HTTPHeader and HTTPHeaderIndex types are now public.
- HTTPHeaders.buffer and HTTPHeaders.headers properties are now
  public.
- A new static method, HTTPHeaders.createHeaderBlock(buffer:headers:)
  was created to serve as a public means to create a HTTPHeaders from
  a ByteBuffer from outside NIOHTTP1. @Lukasa suggested this should
  be a static method rather than an initializer.

Result:

Nothing in NIOHTTP1 changes in operation. All public types have
documentation comments noting that they are only public for very
specific reasons, and are not intended for general use. Once this
is committed, NIOHPACK & NIOHTTP2 will be able to implement fast
round-trips between HTTPHeaders and HPACKHeaders.
@AlanQuatermain
Copy link
Contributor Author

Ping! How is this looking now?

Copy link
Member

@weissi weissi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm happy here, thanks very much @AlanQuatermain

@@ -0,0 +1,3 @@
[submodule "hpack-test-case"]
path = hpack-test-case
url = git@github.com:http2jp/hpack-test-case
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

quick question for @normanmaurer . Do we need to add this to NOTICE.txt? It's only used for tests.

@tomerd
Copy link
Member

tomerd commented Aug 14, 2018

@Lukasa happy to help, i see tests are passing now. do we still need to do something here re. CI setup?

@AlanQuatermain
Copy link
Contributor Author

@tomerd There's a submodule used by some of the HPACK tests. The CI would need to be updated to initialize that and pull it down. Also note the submodule remote will likely change once this is live—it currently points to my own fork of a repo.

@tomerd
Copy link
Member

tomerd commented Aug 14, 2018

@AlanQuatermain @Lukasa i update the CI setting to fetch submodules, i hope this is what you need. can you confirm?

@swift-nio-bot test this please

@AlanQuatermain
Copy link
Contributor Author

Assuming the tests all run (they'll take a bit longer with all the test data from the submodule), then everything should be peachy.

CONTRIBUTORS.txt Outdated
- Johannes Weiß <johannesweiss@apple.com>
- Tim <0xTim@users.noreply.github.com>
- tomer doron <tomer.doron@gmail.com>
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we use my apple email address? tomer@apple.com

@tomerd
Copy link
Member

tomerd commented Aug 14, 2018

00:00:02.402 using GIT_SSH to set credentials 
00:00:02.402  > git fetch --tags --progress git@github.com:apple/swift-nio-http2.git +refs/pull/10/*:refs/remotes/origin/pr/10/*
00:00:02.805  > git rev-parse refs/remotes/origin/pr/10/merge^{commit} # timeout=10
00:00:02.808  > git rev-parse refs/remotes/origin/origin/pr/10/merge^{commit} # timeout=10
00:00:02.810 Checking out Revision e3b48433877895ff6339b79a835f8ebbd0088f28 (refs/remotes/origin/pr/10/merge)
00:00:02.810  > git config core.sparsecheckout # timeout=10
00:00:02.812  > git checkout -f e3b48433877895ff6339b79a835f8ebbd0088f28
00:00:02.855 Commit message: "Merge 43d8ad5cdabfee061be096d9588b89b6b0c5bb59 into 80963ee509e2d82b68c6e576dde9fb126bc9bcb3"
00:00:02.856 First time build. Skipping changelog.
00:00:02.856 Cleaning workspace
00:00:02.856  > git rev-parse --verify HEAD # timeout=10
00:00:02.858 Resetting working tree
00:00:02.858  > git reset --hard # timeout=10
00:00:02.871  > git clean -fdx # timeout=10
00:00:02.914  > git submodule foreach --recursive git reset --hard # timeout=10
00:00:02.954  > git submodule foreach --recursive git clean -fdx # timeout=10
00:00:02.980  > git remote # timeout=10
00:00:02.982  > git submodule init # timeout=10
00:00:03.021  > git submodule sync # timeout=10
00:00:03.058  > git config --get remote.origin.url # timeout=10
00:00:03.064  > git submodule init # timeout=10
00:00:03.096  > git config -f .gitmodules --get-regexp ^submodule\.(.+)\.url # timeout=10
00:00:03.099  > git config --get submodule.hpack-test-case.url # timeout=10
00:00:03.101  > git remote # timeout=10
00:00:03.103  > git config --get remote.origin.url # timeout=10
00:00:03.119  > git config -f .gitmodules --get submodule.hpack-test-case.path # timeout=10
00:00:03.121 using GIT_SSH to set credentials 
00:00:03.122  > git submodule update --init --recursive hpack-test-case

seems like its checking it out?

@AlanQuatermain
Copy link
Contributor Author

Looks like the checkout works and the tests run as expected:

23:23:59 Test timing metrics:
23:23:59  - haskell-http2-naive-huffman           : 9.12 seconds.
23:23:59  - haskell-http2-static-huffman          : 7.98 seconds.
23:23:59  - haskell-http2-linear-huffman          : 6.43 seconds.
23:23:59  - go-hpack                              : 9.08 seconds.
23:23:59  - nghttp2-change-table-size             : 6.42 seconds.
23:23:59  - node-http2-hpack                      : 6.45 seconds.
23:23:59  - haskell-http2-naive                   : 4.86 seconds.
23:23:59  - haskell-http2-linear                  : 4.83 seconds.
23:23:59  - swift-nio-hpack-huffman               : 6.50 seconds.
23:23:59  - python-hpack                          : 6.54 seconds.
23:23:59  - haskell-http2-static                  : 4.76 seconds.
23:23:59  - nghttp2                               : 6.27 seconds.
23:23:59  - swift-nio-hpack-plain-text            : 4.94 seconds.
23:23:59  - nghttp2-16384-4096                    : 6.17 seconds.
23:23:59 Test Case 'HPACKIntegrationTests.testDecoder' passed (130.125 seconds)

@AlanQuatermain
Copy link
Contributor Author

@tomerd I've updated the contributor list and added a mapping in the mailmap from your gmail address to your apple one. Sigh. I used to have one of those… 😟

@tomerd
Copy link
Member

tomerd commented Aug 14, 2018

<3 thanks

@Lukasa
Copy link
Contributor

Lukasa commented Aug 16, 2018

Ok, let's try to pull the trigger on this monster.

@Lukasa
Copy link
Contributor

Lukasa commented Aug 16, 2018

@swift-nio-bot test this please

@Lukasa Lukasa merged commit 6f8f523 into apple:master Aug 16, 2018
@Lukasa
Copy link
Contributor

Lukasa commented Aug 16, 2018

Nicely done @AlanQuatermain, great chunk of work! Looking forward to seeing the next stage here. 🎉

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants