Skip to content
This repository has been archived by the owner on Apr 26, 2024. It is now read-only.

Add specific metric to time long-running /messages requests #13533

Merged
merged 4 commits into from
Aug 17, 2022
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
1 change: 1 addition & 0 deletions changelog.d/13533.misc
Original file line number Diff line number Diff line change
@@ -0,0 +1 @@
Track HTTP response times over 10 seconds from `/messages` (`synapse_room_message_list_rest_servlet_response_time_seconds`).
32 changes: 32 additions & 0 deletions synapse/rest/client/room.py
Original file line number Diff line number Diff line change
Expand Up @@ -19,6 +19,8 @@
from typing import TYPE_CHECKING, Awaitable, Dict, List, Optional, Tuple
from urllib import parse as urlparse

from prometheus_client.core import Histogram

from twisted.web.server import Request

from synapse import event_auth
Expand Down Expand Up @@ -60,6 +62,35 @@

logger = logging.getLogger(__name__)

# This is an extra metric on top of `synapse_http_server_response_time_seconds`
# which times the same sort of thing but this one allows us to see values
# greater than 10s. We use a separate dedicated histogram with its own buckets
# so that we don't increase the cardinality of the general one because it's
# multiplied across hundreds of servlets.
messsages_response_timer = Histogram(
MadLittleMods marked this conversation as resolved.
Show resolved Hide resolved
"synapse_room_message_list_rest_servlet_response_time_seconds",
"sec",
[],
buckets=(
0.005,
0.01,
0.025,
0.05,
0.1,
0.25,
0.5,
1.0,
2.5,
5.0,
10.0,
30.0,
60.0,
120.0,
180.0,
Comment on lines +87 to +89
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note: with such large buckets the resolution is going to be pretty low for anything above 10s. This is probably fine, but does mean that to move the needle meaningfully on this graph /messages timings would need to change by tens of seconds.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we're good for now. Didn't want to be too greedy with the number of buckets.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cool, like i said I'm way more blase of increasing the buckets on one of histograms

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Increased fidelity in #13584 (comment)

"+Inf",
),
)


class TransactionRestServlet(RestServlet):
def __init__(self, hs: "HomeServer"):
Expand Down Expand Up @@ -560,6 +591,7 @@ def __init__(self, hs: "HomeServer"):
self.auth = hs.get_auth()
self.store = hs.get_datastores().main

@messsages_response_timer.time()
MadLittleMods marked this conversation as resolved.
Show resolved Hide resolved
async def on_GET(
self, request: SynapseRequest, room_id: str
) -> Tuple[int, JsonDict]:
Expand Down