Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Import HTTP resumable upload sample code #203

Open
wants to merge 2 commits into
base: main
Choose a base branch
from

Conversation

guoye-zhang
Copy link
Contributor

Support HTTP resumable upload.

(NIOResumableUpload depends on NIOHTTPTypes, so this PR includes changes in #202. Please leave comments specific for NIOHTTPTypes in #202)

Motivation:

Supporting HTTP resumable upload protocol defined in https://datatracker.ietf.org/doc/html/draft-ietf-httpbis-resumable-upload-01

Modifications:

2 new public classes, HTTPResumableUploadHandler and HTTPResumableUploadContext, and a few other supporting objects to manage resumable uploads and translate them into regular uploads.

@Lukasa Lukasa added the 🔼 needs-minor-version-bump For PRs that when merged cause a bump of the minor version, ie. 1.x.0 -> 1.(x+1).0 label Oct 26, 2023
@Lukasa
Copy link
Contributor

Lukasa commented Oct 26, 2023

@FranzBusch want to take another quick look at this?

Copy link
Contributor

@FranzBusch FranzBusch left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Left some smaller comments inline but this is not going to be usable from our new async APIs since we lose type information in the pipeline. Furthermore, I think with our new async APIs we would implement all of this in Swift Concurrency in an HTTP Server and not inside the NIO channel pipeline. I am wondering if it makes sense to add this here or just defer it functionality that our general purpose HTTP server should handle. @Lukasa WDYT?

public typealias OutboundIn = Never
public typealias OutboundOut = HTTPResponsePart

var upload: HTTPResumableUpload
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we make this private?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is currently used in HTTPResumableUpload class. We could look into refactoring it.

import NIOCore

/// `HTTPResumableUploadContext` manages ongoing uploads.
public final class HTTPResumableUploadContext {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Two questions here. Why is this public in the first place? It looks like a type that holds context across the various uploads but what we actually only need from the user is origin, path, and timeout. In the end this type is passed to the handler so it might just be better to pass those parameters to the handler and internalise the context. Furthermore, I am questioning if we can make this a struct instead but I have to look a bit more to understand where the current need for the lock comes from.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey I'll be pitching in for Guoye to address feedback. I think HTTPResumableUploadContext is public because it provides a nice way for developers to configure unique or optional behaviors of their resumable uploads. In the future, the protocol draft may contain optional features such as progress reporting, and if we support them, this context could be the place to opt in or out (instead of passing more parameters to the handler). Guoye may have other considerations, too--I can ask when he's back from vacation.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Concretely, context holds all ongoing resumable uploads. Adopters must supply the same context when creating different handlers if they want the resumption to happen. The alternative is a global singleton which is less flexible.

self.allocator = parent.allocator
self.closePromise = parent.eventLoop.makePromise()
self.eventLoop = parent.eventLoop
self.autoRead = try! parent.syncOptions!.getOption(ChannelOptions.autoRead)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we get rid of these force-unwraps here and either make the init throwing or default to false?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was told that any reasonable Channel should implement syncOptions and we can refuse to work with channels that do not. In general, autoRead should be true and handlers can defer a read by intercepting read() calls. Alternatively, we could just say our autoRead defaults to true and not inherit it from the parent channel.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can also keep the force-unwraps but we should at least document why they are okay.

Comment on lines +36 to +80
let upload: HTTPResumableUpload

let allocator: ByteBufferAllocator

private let closePromise: EventLoopPromise<Void>

var closeFuture: EventLoopFuture<Void> {
self.closePromise.futureResult
}

private var _pipeline: ChannelPipeline!

var pipeline: ChannelPipeline {
self._pipeline
}

var localAddress: SocketAddress? {
self.parent?.localAddress
}

var remoteAddress: SocketAddress? {
self.parent?.remoteAddress
}

var parent: Channel? {
self.upload.parentChannel
}

var isWritable: Bool {
self.parent?.isWritable ?? false
}

private let _isActiveAtomic: ManagedAtomic<Bool> = .init(true)

var isActive: Bool {
self._isActiveAtomic.load(ordering: .relaxed)
}

var _channelCore: ChannelCore {
self
}

let eventLoop: EventLoop

private var autoRead: Bool
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we move all of the vars above the init please.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in 46a7e20

/// - channelConfigurator: A closure for configuring the child HTTP server channel.
public init(
context: HTTPResumableUploadContext,
channelConfigurator: @escaping (Channel) -> Void
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We normally call this the channelInitializer. Furthermore we normally let this return an EventLoopFuture to give people the chance to initialize their channel asynchronously.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I checked with @Lukasa before about this. Making it asynchronous is possible but it would definitely complicate the code by adding a pause after completing the header read. This is guaranteed to be invoked no more than once, so they can do the expensive work upfront and capture the result into this closure.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think at least we should rename it to channelInitializer. I am still a bit worried about diverting from the norm here but if @Lukasa is happy with it we can go ahead.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Lukasa
Copy link
Contributor

Lukasa commented Oct 26, 2023

I think we should add this here: it's in support of sample code that was released with WWDC, and we should be willing to support that code.

if let path = request.path, let upload = self.context.findUpload(path: path) {
self.uploadHandler = nil
self.uploadHandlerChannel.withLockedValue { $0 = nil }
upload.offsetRetrieving(otherHandler: handler)
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We immediately call detachUploadHandler in offsetRetrieving so I don't think we should set the handler or channel to nil here (otherwise detachUploadHandler does nothing).

}

public func read(context: ChannelHandlerContext) {
// Do nothing.
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overriding this method to do nothing prevents upload resumption in HTTP/1.1 (the bug Marius was seeing), but I'm not entirely sure why. In the failing case, we do

HTTPResumableUpload offsetRetrieving(otherHandler:)
HTTPResumableUploadHandler channelRead(context:data:)
HTTPResumableUploadHandler channelReadComplete(context:)
HTTPResumableUploadHandler read(context:)

and the server hangs here while the client tries to send the PATCH request. If we remove this, the upload appending happens as expected right after offset retrieving. I'm still learning the NIO architecture, is there a reason we want to override the default read(context:) handling?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems unlikely to be what we want. read is used to ask the channel to read more from the network, so doing this will suppress all reads. Probably not what we want.

In general, empty overrides should be removed: their default behaviour is better.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There was a specific reason that we intercepted read. I think we wanted read to be driven by the child channel managed by the client's handlers, so this was the way to disable auto read behavior of the parent channel. Wondering if I missed something.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That makes sense. Intercepting read like this does allow the child channel to drive the read behavior. But since the offset retrieving is transparent to the client's handler, I think the read doesn't propagate to the client's handler. I'm looking into how we could solve this while still overriding the read, maybe we need to explicitly attach the handler during offset retrieving?

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Lukasa Guoye and I were debugging this and found that multiple H1 requests were being received on the same handler, which caused the issue since we were setting self.uploadHandler to nil (e.g. the subsequent upload appending PATCH wasn't being handled). Is there a way to ensure that only one request is delivered per handler?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
🔼 needs-minor-version-bump For PRs that when merged cause a bump of the minor version, ie. 1.x.0 -> 1.(x+1).0
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants