Skip to content

Tornado instrumentation HTTP_SERVER_DURATION metric is inaccurate and not async-safe #3486

Open
@devmonkey22

Description

@devmonkey22

Describe your environment

OS: Ubuntu
Python version: Python 3.10
SDK version: 1.31.1
API version: 0.52b1 (same code in 0.53b0 though)

What happened?

The tornado instrumenter and how it tracks a request's elapsed time is not async-safe and does not appear to be able to handle multiple concurrent requests without overwriting and using an incorrect value. This makes the HTTP_SERVER_DURATION metric inaccurate.

Steps to Reproduce

  1. Start one long-running request to a tornado server.
  2. Wait a moment, and start a second long-running request to a tornado server.
  3. Let the first request finish
  4. Let the second request finish.
  5. Review the calculated elapsed time of that request.

This effect is much more visible with streaming requests/handlers like an SSE handler or WebSocketHandler (if I ignore #2761) that may stay connected for hours/days.

The _START_TIME is stored in server_histograms which is a global dict created by the instrumentor. It stores this in the _prepare hook, then reads this back in _record_on_finish_metrics. You can get interleaved start times.

The start time really needs tracked per handler instance (connection).

Expected Result

Request1 Prepare -> StartTime = 10:00:00am
Request2 Prepare -> StartTime = 10:00:05am
Request1 OnFinish -> Reads original StartTime of 10:00:00am, so at 10:00:10am, elapsed time should be 10 seconds.
Request1 OnFinish -> Reads correct original StartTime as 10:00:05am, so at 10:00:10am, elapsed time should be 5 seconds.

Actual Result

Request1 Prepare -> StartTime = 10:00:00am
Request2 Prepare -> StartTime = 10:00:05am
Request1 OnFinish -> Reads incorrect StartTime of 10:00:05am, so at 10:00:10am, elapsed time reported incorrectly as 5 seconds.
Request1 OnFinish -> Reads correct original StartTime as 10:00:05am, so at 10:00:10am, elapsed reported as 5 seconds.

Additional context

No response

Would you like to implement a fix?

Yes

Metadata

Metadata

Assignees

No one assigned

    Labels

    bugSomething isn't working

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions