This repository has been archived by the owner on Nov 23, 2021. It is now read-only.

GCD based, Asynchronous Implementation of the API #96

Closed
wants to merge 17 commits

Conversation

helje5
Contributor

@helje5 helje5 commented Nov 10, 2017

This PR contains a GCD DispatchIO based implementation of the API, which should presumably scale significantly better than the synchronous one. Throughput should be somewhat lower for obvious reasons.

I also setup a small test server demonstrating the API: http-testserver. Of special interest is the slow handler, which does the typical sleep() async demo.

Unlike Noze.io or Node.js it does not use a single network processing queue, but rather a set of base queues, as suggested by Johannes. To support that, the handler API has been enhanced with a queue argument. All API calls made by a handler need to be dispatched back to that queue (but the handler itself runs on the same queue, so the trivial cases require no extra work).
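As an illustration (a minimal sketch with made-up names, not the PR's actual signatures), a handler that does its work on some other queue has to hop back to the queue it was handed before calling the response API:

```swift
import Dispatch

// Hypothetical handler shape: the server tells the handler which queue it
// runs on; all response API calls must happen on that queue.
typealias RequestHandler =
  (_ queue: DispatchQueue, _ done: @escaping (String) -> Void) -> Void

let slowHandler: RequestHandler = { queue, done in
  // Do the actual work somewhere else ...
  DispatchQueue.global().async {
    let body = "Hello, async world!"
    // ... then dispatch back to the handler's base queue, as required.
    queue.async { done(body) }
  }
}
```

The trivial, synchronous case needs no extra dispatching at all, because the handler is already running on its base queue.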

This implementation is supposed to properly support back pressure and pipelining.

Caveats:

  • tests don't work
  • little testing has been done, probably has tons of bugs ;-)

The little socket stuff we need we can do directly in the
code. No need to wrap such basics.

The StreamingParser was some combination of
response writer and parser. Those are really
distinct things (the writer doesn't really
need to know about the parser, nor the reverse).
Similar to PR swift-server#86, except that here the
split is non-optional, the writer being a concrete
implementation.

All API functions need to run on the given queue. But
since the handler itself also runs on that queue, it
only needs to be done if the handler actually dispatches
to a different queue.

This is an optimization to avoid excessive dispatching
in async environments. (If API calls could be issued
from arbitrary queues, they would always need to be
queued to the handler queue, which is possible but
pretty expensive.)

P.S.: The argument is necessary because there is
nothing like `DispatchQueue.current` in GCD anymore.
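To illustrate the point (an editorial sketch, not code from the PR): GCD can still assert that you are on a given queue, but it cannot tell you which queue you are on, so the queue has to be passed around explicitly:

```swift
import Dispatch

let handlerQueue = DispatchQueue(label: "handler")

handlerQueue.sync {
  // There is no `DispatchQueue.current` to query anymore; the closest
  // thing GCD offers is an assertion, which traps if it doesn't hold.
  dispatchPrecondition(condition: .onQueue(handlerQueue))
}
```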
This is a concrete implementation. Unless we do anything as
protocols, there is no need to have this as one.
HTTP server which uses GCD to handle async I/O.
... on Darwin. Use OSAtomicIncrement/Dec.
Presumably tests would need to be done differently in an async
setup. Not sure how much we could reuse here.
Import Dispatch explicitly, some SOCK_STREAM weirdness.
... debug log, remove.
Contributor

@carlbrown carlbrown left a comment


I like this approach, but we at least need to get the tests running first before we merge it. I can help with that, if you think more than one of us can work on it without stepping all over each other.

@helje5
Contributor Author

helje5 commented Nov 10, 2017

It would be awesome if you would get the tests working again. I don't plan to work on such any time soon, so feel free to do anything you want!

Adjusting the tests might be challenging, because, well, they need to be asynchronous. For Noze.io I have a helper class which tries to deal with the asynchronicity, not sure whether it might help you or not (or even whether this is the correct way to test that stuff): NozeIOTestCase.swift

The other hurdle is that I made the HTTPResponseWriter concrete (i.e. it is not a protocol anymore). I suppose that makes some mocking hard. We could change it back to a protocol, but I think it was suggested that we should avoid them for performance reasons.

@carlbrown carlbrown mentioned this pull request Nov 10, 2017
@helje5
Contributor Author

helje5 commented Nov 10, 2017

BTW: I also still wonder whether we would want to allow multiple implementations, or whether that is just bad for a basic core module. E.g. I can imagine that we might be able to use generics to efficiently do something like Apache MPMs, for example:

let asyncServer = HTTPServer<DispatchMPM>(port: ...)

and

let syncServer = HTTPServer<ThreadedMPM>(port: ...)

Both have their unique advantages and disadvantages. But maybe this would make it too complex, not sure.
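A rough sketch of the shape this could take (all names here are hypothetical, this is not proposed API): an "MPM" protocol deciding how work is scheduled, with the server generic over it so the strategy is fixed at compile time and can be dispatched statically.

```swift
import Dispatch
import Foundation

// Hypothetical: an "MPM" decides how units of work get scheduled,
// roughly like Apache's multi-processing modules.
protocol MPM {
  init()
  func schedule(_ work: @escaping () -> Void)
}

// Async flavor: work is queued onto GCD.
struct DispatchMPM : MPM {
  private let queue = DispatchQueue(label: "mpm", attributes: .concurrent)
  func schedule(_ work: @escaping () -> Void) { queue.async(execute: work) }
}

// Threaded flavor: one thread per unit of work.
struct ThreadedMPM : MPM {
  func schedule(_ work: @escaping () -> Void) {
    Thread.detachNewThread(work)
  }
}

// Generic over the MPM, so the scheduling strategy is a compile-time choice.
struct HTTPServer<M: MPM> {
  let port : Int
  private let mpm = M()

  init(port: Int) { self.port = port }

  func handle(_ work: @escaping () -> Void) { mpm.schedule(work) }
}

let asyncServer = HTTPServer<DispatchMPM>(port: 8080)
let syncServer  = HTTPServer<ThreadedMPM>(port: 8081)
```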

@carlbrown
Contributor

Adjusting the tests might be challenging, because, well, they need to be asynchronous. For Noze.io I have a helper class which tries to deal with the asynchronicity, not sure whether it might help you or not

Ok. Let me go to work on that, and we'll see how far I get.

//
// See http://swift.org/LICENSE.txt for license information
// Created by Helge Hess on 22.10.17.
Contributor


We probably want to keep the swift.org license stuff, right?

Contributor Author


Nah, I wrote it. It should have my name on it 😬 License is fine though.


This is probably not a good idea.
None of the other files in this repo has this, nor a "Copyright © 2017 SomeCompany..." line.
Do you think each time someone changes this file, they should add "Changed by..." to it as well? We have a git history for it, so your work won't be forgotten :)


private var acceptCount = 0
private let handlerBaseQueues : [ DispatchQueue ] = {
  var queues = [ DispatchQueue ]()
Contributor


Nit: spacing, should be var queues = [DispatchQueue]()

Contributor Author


Yeah, that is ugly. But feel free to run one of your style 'cleanups', I don't care.

Contributor


👍 Sounds good

private let handlerBaseQueues : [ DispatchQueue ] = {
  var queues = [ DispatchQueue ]()
  for i in 0..<ProcessInfo.processInfo.processorCount {
    let q = DispatchQueue(label: "de.zeezide.swift.server.http.handler\(i)")
Contributor


Interesting - the idea here is that it's most efficient to have a queue per processor?

Contributor Author


That is not an idea, that is a fact ;-) There is also activeProcessorCount, which might be more appropriate (is that ever different?)

All this really belongs into the Options, care to file an Issue for this? E.g. the above is good for a server, but an on-device server may only want to use one thread.

Contributor

@anayini anayini Nov 11, 2017


Hm cool didn't know about that

Whereas the processorCount property reports the number of advertised processing cores, the activeProcessorCount property reflects the actual number of active processing cores on the system. There are a number of different factors that may cause a core to not be active, including boot arguments, thermal throttling, or a manufacturing defect.

I guess they can be different
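Both values are plain Foundation API, so comparing them is easy:

```swift
import Foundation

let advertised = ProcessInfo.processInfo.processorCount
let active     = ProcessInfo.processInfo.activeProcessorCount

// `active` can be lower than `advertised` because of boot arguments,
// thermal throttling, or disabled/defective cores.
print("advertised: \(advertised), active: \(active)")
```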


Shouldn't this be the responsibility of libdispatch? Having 1 warm thread per processor is great, but GCD is (partly, among other things) supposed to free us from having to worry about this detail (# of threads per process, and arguably # of active threads at all).

If we think it'll be commonly desirable to warm threads for this kind of application (a server written in Swift), can we move this logic behind a GCD API or have GCD (if it doesn't do this already) warm threads in a library constructor function? At the very least, I think we should open a ticket/issue and leave a TODO here to come back and clean this up.


DispatchQueue.concurrentPerform already leans on the internal function _dispatch_qos_max_parallelism to understand the capabilities of the system in terms of max parallelism, which contains some logic that is missed here (or at least repeated). I think that if we could figure out a way to get some help from GCD for this purpose, we'd end up with a better result (at least in terms of less repeated code, if not in terms of better performance).

Contributor Author


more threads won't help us to be faster

I think keeping the base queues is worth it for the reverse. I also do on-device servers and I would usually prefer to run those on fewer threads than the device has cores (one or maybe three on the iPhone Y). The same may be true on servers which are not dedicated to just one app.

What do you think?

Contributor Author


Also usleep(10_000) vs foo(remaining: 10000) 🤔


@helje5 the usleep is just a crappy way to force it to not re-use the very same thread.

If we want to over-engineer it we could also just have a baseQueueProvider closure or something that either returns an appropriate base queue or if the user doesn't want that just nil. Because base queue nil means no target queue AFAIK.

Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Man, this was 10_000 vs 10000 ;-)

The closure doesn't sound too bad. Would be optional and part of the Options object passed in.
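A sketch of how that could look (all names hypothetical, drafted before the actual commit): an optional closure on the options, where a missing provider, or a provider returning nil, means no target queue gets assigned and GCD decides on its own.

```swift
import Dispatch

struct ServerOptions {
  // nil provider (or a provider returning nil): let GCD pick.
  var baseQueueProvider : (() -> DispatchQueue?)? = nil
}

// Example provider: round-robin over a fixed set of serial base queues.
// Note: the captured counter is not synchronized; a real server would
// call this from a single accept queue.
func makeRoundRobinProvider(count: Int) -> () -> DispatchQueue? {
  let queues = (0..<count).map { DispatchQueue(label: "base\($0)") }
  var next = 0
  return {
    defer { next = (next + 1) % queues.count }
    return queues[next]
  }
}

var options = ServerOptions()
options.baseQueueProvider = makeRoundRobinProvider(count: 4)
```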

Contributor Author


I added the closure and dropped the specific algorithm.

for i in 0..<ProcessInfo.processInfo.processorCount {
  let q = DispatchQueue(label: "de.zeezide.swift.server.http.handler\(i)")
  queues.append(q)
  q.async { _ = 46 + 2 } // warm up threads
Contributor


What exactly happens in Dispatch when you do this? Is there like a fixed cost on a DispatchQueue during the first async to it?

Contributor Author


See the thread with Johannes on the mailing list. My takeaway was that GCD only assigns threads on-demand (which can take 100ms on Linux or something I don't know).
But maybe this code is nonsense, a GCD expert should have a look over the whole codebase.


I did indeed say that, and it is still the case (on Linux) for concurrent queues, but not for serial queues. Apologies.

private func handleListenEvent(on fd : Int32,
                               handler : @escaping HTTPRequestHandler,
                               localAddress : sockaddr_in)
{
Contributor

@anayini anayini Nov 10, 2017


In general I prefer:

private func handleListenEvent(
  on fd: Int32,
  handler: @escaping HTTPRequestHandler,
  localAddress: sockaddr_in
) {

which I think is common in many Swift style guides.

Contributor Author


I don't, but I don't care much either.

Contributor Author


Not that I care a lot but I only know this style guide: https://github.com/raywenderlich/swift-style-guide and that doesn't have it ...

guard !data.isEmpty else { return nil }
// TODO: reuse header values by matching the data via memcmp(), maybe first
// switch on length, compare c0
guard let s = String(data: data, encoding: .utf8) else { return nil }
Contributor


Can put these both in one guard if you like
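For reference, the merged form would read like this (the enclosing function is made up for context; the semantics are unchanged):

```swift
import Foundation

func headerString(from data: Data) -> String? {
  // Both checks folded into a single guard, as suggested.
  guard !data.isEmpty,
        let s = String(data: data, encoding: .utf8) else { return nil }
  return s
}
```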

Contributor Author


Why would I do this? Those are separate things. This one doesn't make sense to me.

internal class HTTPConnection : CustomStringConvertible {

/// The `HTTPServer` which created the connection. If the connection shuts
/// down, it will unregister from that server.
Contributor


Nit: spacing / color violations

Contributor Author


color violations??? :-)

2 spaces are right, has ARI approval: http://www.alwaysrightinstitute.com/swifter-space/

but feel free to change it.

private var writers = [ HTTPResponseWriter ]()
// FIXME: make this a linked list

internal init(fd : Int32,
Contributor


Nit: usually use newline before first arg when splitting args up on multiple lines

Contributor Author


This actually sounds a little weird, is there a style guide or standard lib example which does it like this?

Contributor


Hm - yeah maybe this is just a thing that we do in my company's internal style guide.

I actually don't even know, is there an Apple approved style guide? Couldn't find one with a quick search.

Contributor Author


You should find out if you propose such changes ;->


Contributor


I guess they do this only if the line would have been too long without, in which case they start the arguments on the next line.

I can go check to see if other projects are adhering to some style guide somewhere. Eventually we will probably run into a bunch of different tiny things like this.

Contributor Author

@helje5 helje5 Nov 13, 2017


This looks really ugly, but fair enough. On the pro side, it uses a proper 2-space indent :-> Maybe we should have an issue to track what style we want to do?

Just to be clear, I do care a lot about style (and know very well that mine is the only correct one), but I think that in a project like this it should just match the mainline - please never argue with "we do it in my company's internal style guide" but with "Swift does it like this, check [here]".

Contributor


👍 . Will file an issue.

Contributor


Wasn't really arguing with "we do this in my company's internal style guide". Only that I'm used to that style guide, so I made the mistake of marking this as wrong here.

  writers.remove(at: idx)
}
else {
  assert(false, "did not find writer which went done: \(self)")
Contributor


Can use assertionFailure(message: String)

Contributor Author


Don't care.

Contributor Author

@helje5 helje5 Nov 13, 2017


Because that got a thumbs down, I guess I should clarify what I meant: I don't care whether or not we do this, and I'm happy to do either :-) I don't care whether it is assert(false) or assertionFailure.

@helje5
Contributor Author

helje5 commented Nov 12, 2017

I enabled Issues on my fork https://github.com/ZeeZide/http/issues and marked those which may be necessary to merge this PR with a Required for PR label.

These tests pass on Linux if run one at a time,
but there's some issue where, if you run them all,
the first test passes and the second one hangs.

Still need to fix that.
@carlbrown
Contributor

@helje5 I added a PR against your fork that has a batch of changes for getting tests to run on the Mac. (There will be another one once I find/fix a bug on Linux).

Gets the simple end-to-end tests running on Mac
@tanner0101 tanner0101 mentioned this pull request Dec 6, 2017
helje5 and others added 4 commits December 7, 2017 09:30
... this was utter nonsense, no idea why I did this :-)
The listen socket is already non-blocking and it already
deals w/ nio accept. 🤦‍♀️
Remove the specific algorithm for selecting a base
queue and use an optional closure to return one, as
suggested by Johannes.

By default this will not assign a target queue and
let GCD deal with this. See related thread on
PR swift-server#96.
@GeorgeLyon

Don't want to derail this PR, but I think leaking the DispatchQueue the server uses into the handler is a mistake. Read and write operations on a descriptor don't need to be synchronized as far as I am aware. An example where the server and handler synchronization primitives are separate can be found here: https://github.com/GeorgeLyon/Server

@helje5
Contributor Author

helje5 commented Dec 11, 2017

Don't want to derail this PR, but I think leaking the DispatchQueue the server uses into the handler is a mistake.

It is an optimisation. I don't really like it either, but the alternatives seem too expensive to me (require the response-writer to be free-threaded).

Read and write operations on a descriptor don't need to be synchronized as far as I am aware

Yeah, but I don't see how that helps us in any way. The goal here is to allow for asynchronous handlers which support back pressure. That implies more book-keeping. Which needs to be synchronised.

P.S.: Please let's continue the discussion on-list.

The connection doesn't need to know about the
server anymore. Instead we pass in a closure to
be called when the connection tears down.

Note: there is an intentional cycle between
server and connection. It will be broken when
the connection is done.
@helje5
Contributor Author

helje5 commented Mar 1, 2018

@helje5 helje5 closed this Mar 1, 2018
7 participants