Skip to content

Commit

Permalink
Shield HTTP request handlers from async cancellations.
Browse files Browse the repository at this point in the history
We now spawn a new task for every HTTP request, and wait on the
JoinHandle. If Hyper drops the Future, the spawned task will keep
running. This protects the rest of the pageserver code from unexpected
async cancellations.

This creates a CancellationToken for each request and passes it to the
handler function. If the HTTP request is dropped by the client, the
CancellationToken is signaled. None of the handler functions make use
for the CancellationToken currently, but they now they could.

The CancellationToken arguments also work like documentation. When
you're looking at a function signature and you see that it takes a
CancellationToken as argument, it's a nice hint that the function
might run for a long time, and won't be async cancelled. The default
assumption in the pageserver is now that async functions are not
cancellation-safe anyway, unless explictly marked as such, but this is
a nice extra reminder.

Spawning a task for each request is OK from a performance point of
view because spawning is very cheap in Tokio, and none of our HTTP
requests are very performance critical anyway.

Fixes issue #3478
  • Loading branch information
hlinnaka committed Jun 1, 2023
1 parent 3300836 commit efe1c37
Show file tree
Hide file tree
Showing 3 changed files with 206 additions and 127 deletions.
4 changes: 1 addition & 3 deletions docs/pageserver-thread-mgmt.md
Original file line number Diff line number Diff line change
Expand Up @@ -52,9 +52,7 @@ completion, or shield the rest of the code from surprise cancellations
by spawning a separate task. The code that handles incoming HTTP
requests, for example, spawns a separate task for each request,
because Hyper will drop the request-handling Future if the HTTP
connection is lost. (FIXME: our HTTP handlers do not do that
currently, but we should fix that. See [issue
3478](https://github.com/neondatabase/neon/issues/3478)).
connection is lost.


#### How to cancel, then?
Expand Down
37 changes: 0 additions & 37 deletions libs/utils/src/http/endpoint.rs
Original file line number Diff line number Diff line change
Expand Up @@ -68,7 +68,6 @@ where

let log_quietly = method == Method::GET;
async move {
let cancellation_guard = RequestCancelled::warn_when_dropped_without_responding();
if log_quietly {
debug!("Handling request");
} else {
Expand All @@ -79,8 +78,6 @@ where
// module to do that globally.
let res = handler(request).await;

cancellation_guard.disarm();

// Log the result if needed.
//
// We also convert any errors into an Ok response with HTTP error code here.
Expand Down Expand Up @@ -108,40 +105,6 @@ where
.await
}

/// Drop guard to WARN in case the request was dropped before completion.
struct RequestCancelled {
warn: Option<tracing::Span>,
}

impl RequestCancelled {
/// Create the drop guard using the [`tracing::Span::current`] as the span.
fn warn_when_dropped_without_responding() -> Self {
RequestCancelled {
warn: Some(tracing::Span::current()),
}
}

/// Consume the drop guard without logging anything.
fn disarm(mut self) {
self.warn = None;
}
}

impl Drop for RequestCancelled {
fn drop(&mut self) {
if std::thread::panicking() {
// we are unwinding due to panicking, assume we are not dropped for cancellation
} else if let Some(span) = self.warn.take() {
// the span has all of the info already, but the outer `.instrument(span)` has already
// been dropped, so we need to manually re-enter it for this message.
//
// this is what the instrument would do before polling so it is fine.
let _g = span.entered();
warn!("request was dropped before completing");
}
}
}

async fn prometheus_metrics_handler(_req: Request<Body>) -> Result<Response<Body>, ApiError> {
SERVE_METRICS_COUNT.inc();

Expand Down
Loading

0 comments on commit efe1c37

Please sign in to comment.