add a metric to track request timeouts #2419

Geal · 2023-01-17T16:22:50Z

This adds a counter for timeouts. It generates separate metrics for client request timeouts and subgraph request timeouts.
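As a rough illustration of "separate metrics" for the two kinds of timeout, here is a minimal standard-library sketch. The names `TimeoutKind` and `TimeoutCounters` are hypothetical and are not taken from the router's code, which records metrics through its telemetry stack rather than plain atomics.

```rust
use std::sync::atomic::{AtomicU64, Ordering};

/// Which kind of request timed out (hypothetical names for this sketch).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum TimeoutKind {
    ClientRequest,
    SubgraphRequest,
}

/// Two independent counters, one per kind of timeout.
#[derive(Default)]
pub struct TimeoutCounters {
    client: AtomicU64,
    subgraph: AtomicU64,
}

impl TimeoutCounters {
    /// Select the counter that matches the kind of timeout.
    fn counter(&self, kind: TimeoutKind) -> &AtomicU64 {
        match kind {
            TimeoutKind::ClientRequest => &self.client,
            TimeoutKind::SubgraphRequest => &self.subgraph,
        }
    }

    /// Increment the counter for one observed timeout.
    pub fn record(&self, kind: TimeoutKind) {
        self.counter(kind).fetch_add(1, Ordering::Relaxed);
    }

    /// Read the current value of a counter.
    pub fn get(&self, kind: TimeoutKind) -> u64 {
        self.counter(kind).load(Ordering::Relaxed)
    }
}
```

Keeping the kind as an explicit parameter means a single call site in a timeout layer can serve both the client-facing and the subgraph-facing pipelines.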

Apparently there's an issue with the downcast from the error here. I think we should move towards a timeout layer that returns an Ok(response) instead of Err(Elapsed).

This is a bit complicated because the timeout future only knows about a very limited set of generic parameters, while we actually want to:
- generate different responses depending on whether we're in the supergraph or a subgraph
- return a response containing the request context
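For readers unfamiliar with the downcast mentioned above, the following sketch shows the general pattern of detecting a timeout inside a type-erased error. It uses only the standard library; `Elapsed` here is a local stand-in for the timeout layer's error type, and `is_timeout` is a hypothetical helper, not a function from the router.

```rust
use std::error::Error;
use std::fmt;

/// Type-erased error, in the same shape as the BoxError alias used by tower-style services.
pub type BoxError = Box<dyn Error + Send + Sync>;

/// Local stand-in for the error a timeout layer returns (assumption for this sketch).
#[derive(Debug)]
pub struct Elapsed;

impl fmt::Display for Elapsed {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        f.write_str("request timed out")
    }
}

impl Error for Elapsed {}

/// Returns true when the boxed error is an `Elapsed`.
/// The check only succeeds if the concrete type inside the box is exactly `Elapsed`;
/// an error that wraps or converts it along the way no longer matches.
pub fn is_timeout(err: &BoxError) -> bool {
    err.downcast_ref::<Elapsed>().is_some()
}
```

The fragility is visible in the signature: the caller only learns "timeout or not", with no access to the request context, which is the limitation the comment describes.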

Geal · 2023-01-18T11:43:10Z

🤔 If the timeout returns an HTTP response, that breaks the retry layer, which only retries on Err. And the 504 status code does not make much sense at the subgraph request level, since it's not acting as a proxy there.
It makes more sense to use 504 for the supergraph-level timeout though.
So I'll revert those changes to reuse BoxError instead, update the status code in the axum response, and add a test.
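The trade-off can be sketched with the standard library alone. In the sketch below, `Elapsed`, `status_for_error`, and `call_with_retry` are all hypothetical names, and the fallback status of 500 for non-timeout errors is an assumption made for illustration. It shows a retry loop that only retries on `Err`, and a mapping from a timeout error to 504 that would be applied once, at the outermost (supergraph) level.

```rust
use std::error::Error;
use std::fmt;

pub type BoxError = Box<dyn Error + Send + Sync>;

/// Local stand-in for the timeout layer's error type (assumption for this sketch).
#[derive(Debug)]
pub struct Elapsed;

impl fmt::Display for Elapsed {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        f.write_str("request timed out")
    }
}

impl Error for Elapsed {}

/// Map an error to an HTTP status code at the outermost level:
/// 504 for a timeout, 500 for anything else (the 500 fallback is an assumption).
pub fn status_for_error(err: &BoxError) -> u16 {
    if err.downcast_ref::<Elapsed>().is_some() {
        504
    } else {
        500
    }
}

/// Retry only on `Err`. If a timeout were turned into an `Ok(response)` by an
/// inner layer, this loop would return it immediately and never retry.
pub fn call_with_retry<T, F>(max_attempts: u32, mut attempt: F) -> Result<T, BoxError>
where
    F: FnMut() -> Result<T, BoxError>,
{
    let mut last_err: BoxError = "no attempts were made".into();
    for _ in 0..max_attempts {
        match attempt() {
            Ok(value) => return Ok(value),
            Err(err) => last_err = err,
        }
    }
    Err(last_err)
}
```

Keeping the timeout as an error through the inner layers and converting it to a status code only where the HTTP response is built preserves the retry behaviour.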

SimonSapin · 2023-01-19T09:01:34Z

apollo-router/src/axum_factory/tests.rs

+    // we do the entire supergraph rebuilding instead of using `from_supergraph_mock_callback_and_configuration`
+    // because we need the plugins to apply on the supergraph


Could you say more about what this means? The amount of duplicated boilerplate code feels unfortunate.

Geal · 2023-01-19

`from_supergraph_mock_callback_and_configuration` creates a mock supergraph without applying plugins to it, because it is meant as a foundation for testing the router service. Here I want to test, at the router service response, the result of executing a plugin between the router service and the supergraph service. That boilerplate could be extracted into another function if we see another case where we need it.

add a metric to track request timeouts

62dac44

apollo-bot2 assigned Geal Jan 17, 2023

Geal added 5 commits January 17, 2023 17:25

return a timeout response

52bdbc7

lint

7a6f8ee

lint

bc34e5f

Merge branch 'dev' into geal/timeout-metrics

ec7cb5f

Geal added 5 commits January 18, 2023 16:01

Revert "lint"

c8e7bb9

This reverts commit bc34e5f.

Revert "lint"

66d3d7b

This reverts commit 7a6f8ee.

Revert "return responses from the timeout future"

ac78009

This reverts commit 84398d1.

add a test and set the correct status code

a2ff5c1

cleanup

942e6a7

Geal marked this pull request as ready for review January 18, 2023 15:04

Geal requested review from a team, SimonSapin and abernix and removed request for a team January 18, 2023 15:04

SimonSapin reviewed Jan 19, 2023

bnjjj approved these changes Jan 19, 2023

Geal added 2 commits January 19, 2023 16:54

Merge branch 'dev' into geal/timeout-metrics

f05e15e

docs and changelog

fc24793

Geal requested a review from StephenBarlow as a code owner January 19, 2023 16:04

Merge branch 'dev' into geal/timeout-metrics

e395f58

bnjjj approved these changes Jan 20, 2023

Merge branch 'dev' into geal/timeout-metrics

693eba8

Geal enabled auto-merge (squash) January 25, 2023 08:30

Geal merged commit 9e9fae8 into dev Jan 25, 2023

Geal deleted the geal/timeout-metrics branch January 25, 2023 09:35

abernix mentioned this pull request Feb 1, 2023

prep release: v1.10.0 #2516

Merged

glasser mentioned this pull request Mar 29, 2024

feat(pq): use 4xx status code on PQ errors #4887

Merged

6 tasks
