Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Throw exception on "peer closed" event #937

Draft
wants to merge 2 commits into
base: master
Choose a base branch
from

Conversation

bgamari
Copy link

@bgamari bgamari commented Aug 10, 2023

This is a prototype enabling support for the evtPeerClosed event being introduced into GHC's event manager. This allows warp to throw a PeerClosedException exception to handler threads for connections which have been prematurely closed.

See ghc!11080 and ghc#23825

To do

@kazu-yamamoto kazu-yamamoto self-requested a review August 11, 2023 00:02
@kazu-yamamoto
Copy link
Contributor

In my understanding, the EPOLLRDHUP event happens when the peer closes the connection OR shutdown its writing side. My concern is that PeerClosedException is also thrown when the client shutdowns its writing side after uploading a file or something.

@bgamari Did you consider the uploading scenario?

@bgamari
Copy link
Author

bgamari commented Aug 15, 2023

Hmm, this is an interesting point, @kazu-yamamoto; I had not considered the uploading case. This indeed complicates matters as we certainly do not want to terminate any handlers until they have at very least had a change to read and process the input sent by the client. Consequently, I suspect the exception should not be delivered until the handler has declared that it is "done" with the connection; the idea here is that then "pure" handlers (e.g. pure queries, not uploads) could signal immediately that they are done with the connection.

I can think of at least two concrete interfaces by which this could be done.

For instance, one could imagine that Application could have the form:

type Application =
       Request
         -- ^ the request
    -> (Response -> IO ResponseReceieved)
         -- ^ respond to the request
    -> IO ()
         -- ^ signal that the handler is finished reading from the request 
    -> IO ()

Alternatively, we could make this a property of the Request:

requestFinishedReading :: Request -> IO ()

Given that changing Application would be quite problematic at this juncture, I suspect the latter is the only viable option.

@@ -59,6 +66,14 @@ socketConnection _ s = do
bufferPool <- newBufferPool 2048 16384
writeBuffer <- createWriteBuffer 16384
writeBufferRef <- newIORef writeBuffer
#if MIN_VERSION_base(4,18,0)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This condition is not good enough as error is called by GHC on macOS.

@kazu-yamamoto
Copy link
Contributor

I don't think I understand your idea.
Would you show me how to implement the latter?

@kazu-yamamoto
Copy link
Contributor

There are two kinds of applications.

  1. Not read the request body: for this applications, PeerClosedException is useful.
  2. Read the request body: for this applications, PeerClosedException is harmful since it is thrown immediately when TCP FIN is received. Pending data (not read yet) is lost. The proper way to detect peer's close/shutdown in this case is just use read() until EOF.

@kazu-yamamoto
Copy link
Contributor

Since HTTP/1.1 uses persist connections, clients rarely uses shutdown().
But it is OK for clients to use shutdown() after uploading with Connection: close.
We need to consider how to save this case.

@bgamari
Copy link
Author

bgamari commented Aug 17, 2023

I don't think I understand your idea. Would you show me how to implement the latter?

Sure. The idea is that distinguishing between your cases (1) and (2) in general
will require feedback from the user. To get this feedback, we equip each
connection with an "active" flag which can be deasserted by a type (1)
handlers. That is, a pure "query" handler might look like:

handleQuery :: Application
handleQuery request respond = do
    -- indicate that the handler can be safely killed if the client shuts down
    -- the connection
    connectionIsInactive request

    -- this may be a long computation
    result <- computeQueryResult
    sendResponse respond result

When the client closes their end of the connection, wai would first wait
until the connection is in the "inactive" state and would then throw a
PeerClosedException.

Naturally, ensuring that this is not racy in the face of connection reuse is
rather tricky. We likely want to re-assert the connection-active flag after
each request has been handled to avoid this sort of behavior:

Client                                               Server
---------------------------------                    --------------------------------------
opens connection
                                                     starts connection-handling thread
                                                     connection begins with active=True

submits GET query

                                                     starts handler for query
                                                     handler sets active=False

                                                     sends response

reads response

submits a non-query POST request
                                                     starts handler for request
shuts down connection
                                                     request killed with PeerClosedException

Note above that I am not terribly familiar with HTTP/2. My understanding is that it allows
multiple concurrent requests over a single connection; this obviously muddies the waters.
Any thoughts you might have on how to sensibly handle premature disconnection in
HTTP/2 would be valued.

@kazu-yamamoto
Copy link
Contributor

OK. It seems to me that this is a right direction.

HTTP/2 clients does not use shudown(). So, TCP FIN in HTTP/2 connections is just the end of connections.
I think that no special care for HTTP/2 is necessary in this scheme.

@bgamari Would you put this implementation forward?

@bgamari
Copy link
Author

bgamari commented Aug 24, 2023

@kazu-yamamoto, I have pushed an initial implementation.

By the way, please do ignore Repro.hs; this needs to be turned into a proper test.

@kazu-yamamoto
Copy link
Contributor

Looks good.

  • connectionIsInactive is not an appearing API name. What about readingIsDone or something?
  • Register a CB only on Linux (commented above)
  • Integrate the test case to hspec and remove Repro.hs and Test.hs

@bgamari
Copy link
Author

bgamari commented Aug 31, 2023

Register a CB only on Linux (commented above)

I am reluctant to make this platform dependent since we may implement evtPeerClosed on other platforms eventually (I suspect KQueue supports something similar, although I've struggled to figure out exactly how).

Currently GHC will throw an exception if you attempt to register for evtPeerClosed on an unsupported platform, which is why I surround the registration with tryIO. I suspect this is the right way forward since the implementation will naturally adapt to the capabilities of the host platform.

connectionIsInactive is not an appearing API name. What about readingIsDone or something?

readingIsDone sounds fine to me.

@kazu-yamamoto
Copy link
Contributor

I am reluctant to make this platform dependent since we may implement evtPeerClosed on other platforms eventually (I suspect KQueue supports something similar, although I've struggled to figure out exactly how).

Right. The combination of EV_ADD/EV_EOF and EVFILT_READ would implement this feature with KQueue.

@Vlix
Copy link
Contributor

Vlix commented Sep 11, 2023

Could we add the flag to the Request to then make a helper/wrapper function for users to implement on the path handlers they want the PeerClosedConnection exception to be thrown?
Like requestClientClosedConnectionFlag :: ConnActiveFlag and:

closeOnClientDisconnect :: Application -> Application
closeOnClientDisconnect app req respond = do
    setupDisconnectCheck $ requestClientClosedConnectionFlag req
    app req respond

or

closeOnClientDisconnect :: Request -> IO ()
closeOnClientDisconnect = setupDisconnectCheck . requestClientClosedConnectionFlag

Where the setupDisconnectCheck gets the ThreadId of that handler and throws the PeerClosedConnection exception when the flag switches?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants