Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Local server for debugging #73

Merged
merged 14 commits into from
May 15, 2020
Merged

Local server for debugging #73

merged 14 commits into from
May 15, 2020

Conversation

fabianfett
Copy link
Member

@fabianfett fabianfett commented May 7, 2020

Made some progress to cancel the /next request if one is running. This is needed for our local testing.

Further I removed two unnecessary locks, that are not needed, since we run everything in a single eventloop. #68

@fabianfett fabianfett requested a review from tomerd May 7, 2020 15:11
@@ -66,7 +83,8 @@ internal final class HTTPClient {

let promise = channel.eventLoop.makePromise(of: Response.self)
promise.futureResult.whenComplete { _ in
precondition(self.executing.compareAndExchange(expected: true, desired: false), "invalid execution state")
precondition(self.executing == true)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

restore the precondition message?

@@ -25,7 +25,7 @@ internal final class HTTPClient {
private let targetHost: String

private var state = State.disconnected
private let executing = NIOAtomic.makeAtomic(value: false)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the idea of using an atomic here was to make sure this is not misused, now its assuming to be safe

@@ -114,6 +114,13 @@ extension Lambda {
}
}
}

#if DEBUG
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this can probably be exposed without the DEBUG filter

/// cancels the current run, if we are waiting for next invocation (long poll from Lambda control plane)
/// only needed for debugging purposes.
func cancelWaitingForNextInvocation() {
guard self.isGettingNextInvocation else { return }
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

guard is idiomatically used for error conditions. if would be more idiomatic in this case

self.state = .shuttingdown
// make this method thread safe by dispatching onto the eventloop
self.eventLoop.execute {
guard case .active(let runner, _) = self.state else {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if instead of guard

@@ -18,10 +18,12 @@ import NIO

extension Lambda {
/// LambdaRunner manages the Lambda runtime workflow, or business logic.
internal struct Runner {
internal final class Runner {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is unfortunate a bit, I assume its for the new isGettingNextInvocation state. is that state required?

@fabianfett fabianfett force-pushed the feature/ff-cancel-next branch 2 times, most recently from c7b9ff2 to bf280f4 Compare May 12, 2020 12:50
@fabianfett fabianfett changed the base branch from feature/local-server to master May 12, 2020 12:50
@fabianfett fabianfett force-pushed the feature/ff-cancel-next branch 2 times, most recently from a1f2597 to c782a01 Compare May 12, 2020 21:26
tomerd and others added 10 commits May 13, 2020 09:55
motivation: allow end to end testing locally

changes:
* add a Lambda+LocalServer which exposes Lambda.withLocalServer available only in DEBUG mode
* local server can recieve POST requests with payloads on a configurable endpoint and and send them to the Lambda
* add a "noContent" mode to Lambda runtime to allow polling
* Don’t exit immediately
* Removed locks. Just running in one EL
motivation: allow end to end testing locally

changes:
* add a Lambda+LocalServer which exposes Lambda.withLocalServer available only in DEBUG mode
* local server can recieve POST requests with payloads on a configurable endpoint and and send them to the Lambda
* add a "noContent" mode to Lambda runtime to allow polling
@@ -25,7 +25,7 @@ internal final class HTTPClient {
private let targetHost: String

private var state = State.disconnected
private let executing = NIOAtomic.makeAtomic(value: false)
private var executing = false
Copy link
Contributor

@tomerd tomerd May 15, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the executing atomic was originally designed to make sure the http client is not misused in a concurrent way - like an assert for the developers of the library. if we feel this check is redundant then it serves no purpose any longer and can be removed.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think this check is redundant. I just think it doesn't need to be atomic. Since we designed the Lambda to be run in a single EventLoop. With a single EventLoop we don't need any locks to check state. Developers might though still be able to try to send requests at the same time/while another request is running within the same EventLoop. Example:

httpClient.get()
httpClient.get()

We should, in my opinion, definitely crash, if someone tries to use the HTTPClient in such a manner.

self.logger.debug("lambda lifecycle state: \(newValue)")
}
}

private enum State {
Copy link
Contributor

@tomerd tomerd May 15, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not sure about this ^^ change. the Lambda class can be called from the outside in a non-safe way and some of its public functions touch the state (e.g. start). by dropping this we are loosing the validation in L130 and also risk making concurrency mistakes. I'd argue to put this back

Copy link
Member Author

@fabianfett fabianfett May 15, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh yes, we should definitely check the state validation in L130. That's why I re-added this in a willSet block in L30. (see above).

Regarding the lock: In the "normal" Lambda use-case the state check after the initialization doesn't add any value, since it is the Lambdas destiny to be run forever. Having spend much time in making this as performant as possible I'm not sure if we should have a lock to protect against a state that is unreachable, that is used on every invocation.

Let's look at the public funcs/vars and which measures we can take to make this concurrency safe:

private var state = State.idle

Not public, but the thing that needs to be thread safe in the end. As mentioned I have added the state order check back. Further I added an assert to verify that the state is only changed within the EventLoop. Any read access to state will happen on the EventLoop.

        private var state = State.idle {
            willSet {
                assert(self.eventLoop.inEventLoop, "State may only be changed on the `Lifecycle`'s `eventLoop`")
                precondition(newValue.order > state.order, "invalid state \(newValue) after \(state.order)")
            }
        }

public func start() -> EventLoopFuture<Void>

I've added a function note and an assert as well to ensure this function is only called on the correct eventLoop.

        /// Start the `Lifecycle`.
        ///
        /// - Returns: An `EventLoopFuture` that is fulfilled after the Lambda hander has been created and initiliazed, and a first run has been scheduled.
        /// 
        /// - note: This method must be called  on the `EventLoop` the `Lifecycle` has been initialized with.
        public func start() -> EventLoopFuture<Void> {
            assert(self.eventLoop.inEventLoop, "Start must be called on the `EventLoop` the `Lifecycle` has been initialized with.")

        }

We protect against calling start() twice by ensuring this is called on the correct EventLoop and the state precondition to only increase the state.order.

public func shutdown()

This method can be called from any thread. We ensure thread safety by dispatching onto the eventLoop. According to Johannes this is normally more expensive than a lock, but it can be argued that: The dispatch happens only once (and only in DEBUG mode), the lock would be used before every invocation.

        #if DEBUG
        /// Begin the `Lifecycle` shutdown. Only needed for debugging purposes, hence behind a `DEBUG` flag.
        public func shutdown() {
            // make this method thread safe by dispatching onto the eventloop
            self.eventLoop.execute {
                let oldState = self.state
                self.state = .shuttingdown
                if case .active(let runner, _) = oldState {
                    runner.cancelWaitingForNextInvocation()
                }
            }
        }
        #endif

public var shutdownFuture: EventLoopFuture<Int>

The shutdownPromise is created in the init() and a let. For this reason any access to shutdownFuture is readonly, which makes it safe.

        /// The `Lifecycle` shutdown future.
        ///
        /// - Returns: An `EventLoopFuture` that is fulfilled after the Lambda lifecycle has fully shutdown.
        public var shutdownFuture: EventLoopFuture<Int> {
            self.shutdownPromise.futureResult
        }

@@ -18,10 +18,12 @@ import NIO

extension Lambda {
/// LambdaRunner manages the Lambda runtime workflow, or business logic.
internal struct Runner {
internal final class Runner {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is the isGettingNextInvocation state essential? would would happen if we dont track it always call runtimeClient.cancel()?

Copy link
Member Author

@fabianfett fabianfett May 15, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We might cancel a /response call. That is probably not in our interest.

Copy link
Member Author

@fabianfett fabianfett May 15, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Further I would argue, that the Runner was malloced even when it was a struct:

Runner.swift L48-L75:

        func run(logger: Logger, handler: Handler) -> EventLoopFuture<Void> {
            logger.debug("lambda invocation sequence starting")
            // 1. request work from lambda runtime engine
            return self.runtimeClient.getNextInvocation(logger: logger).peekError { error in
                logger.error("could not fetch invocation from lambda runtime engine: \(error)")
            }.flatMap { invocation, payload in
                // 2. send work to handler
                let context = Context(logger: logger, eventLoop: self.eventLoop, invocation: invocation)
                logger.debug("sending invocation to lambda handler \(handler)")
                return handler.handle(context: context, payload: payload)
                    .mapResult { result in
                        if case .failure(let error) = result {
                            logger.warning("lambda handler returned an error: \(error)")
                        }
                        return (invocation, result)
                    }
            }.flatMap { invocation, result in
                // 3. report results to runtime engine
                self.runtimeClient.reportResults(logger: logger, invocation: invocation, result: result).peekError { error in
                    logger.error("could not report results to lambda runtime engine: \(error)")
                }
            }.always { result in
                // we are done!
                logger.log(level: result.successful ? .debug : .warning, "lambda invocation sequence completed \(result.successful ? "successfully" : "with failure")")
            }
        }

In the report closure (the last flatmap) we capture self. This is why, as far as i understand Swift, the struct is alloced on the heap anyways, to ensure the long levity of the struct.

We have a similar "problem" in the initialize method.

As I can understand the drive to put as many things as possible on the stack, I'd argue that whether we use class or struct we will end up with something on the heap anyway.

Last but not least, Runner is not really a value type, is it?

Copy link
Contributor

@tomerd tomerd May 15, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

agreed that Runner is not a natural value type, and less concern about the one allocation (tho we may be able to extract runtimeClient out to avoid capturing self if we want to avoid it?). mostly not excited about tracking state if and when it can be avoided. creative ideas? otherwise I am good with this

@@ -12,7 +12,7 @@
//
//===----------------------------------------------------------------------===//

import AWSLambdaRuntimeCore
import AWSLambdaRuntime
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed

// therefore: it was shut down before asking for a new invocation
XCTAssertGreaterThan(invocationCount, 0, "should have stopped before any request made")
XCTAssertLessThan(invocationCount, maxTimes, "should have stopped before \(maxTimes)")
case .failure(HTTPClient.Errors.cancelled):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not sure the error should bubble out. I'd argue this is a success/expected case so we want to return a success with the number of iterations performed

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@tomerd fixed!

@tomerd
Copy link
Contributor

tomerd commented May 15, 2020

@fabianfett great work. one last nit about the error bubbling an we are good to go imo

@tomerd tomerd changed the title Cancel the local server Local server for debugging May 15, 2020
Copy link
Contributor

@tomerd tomerd left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks @fabianfett

@tomerd tomerd merged commit 53930fe into master May 15, 2020
@tomerd tomerd deleted the feature/ff-cancel-next branch May 15, 2020 23:56
@tomerd tomerd mentioned this pull request Jun 2, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants