Skip to content

[monitorlib] Record time required for query authorization#1414

Merged
BenjaminPelletier merged 5 commits intointeruss:mainfrom
BenjaminPelletier:auth-time
Apr 9, 2026
Merged

[monitorlib] Record time required for query authorization#1414
BenjaminPelletier merged 5 commits intointeruss:mainfrom
BenjaminPelletier:auth-time

Conversation

@BenjaminPelletier
Copy link
Copy Markdown
Member

@BenjaminPelletier BenjaminPelletier commented Apr 8, 2026

Currently, monitorlib.fetch's query tools record the start time of a query before the auth adapter potentially must obtain an access token. Therefore, the time to retrieve that access token is added to/included in the query's total time. This PR introduces a new field to Query's RequestDescription that identifies how long authorization took to obtain when it is more than a minimal amount (0.5ms experimentally determined to fairly reliably separate true fast local queries from simple logic processing with no queries) a new token is issued. In the future, this information can be used to adjust the true query duration by subtracting the auth time, and it can be used to track how much time is spent on authorization overall.

The new RequestDescription auth_dt field is populated when a new "authorized_dt" attribute is added to the PreparedRequest that describe_request ingests, and this attributed is added by UTMClientSession when it adds authorization that takes a non-trivial amount of time authorization results in issuing a token. AsyncUTMClientSession does not yet add this attribute.

I tested this behavior by examining report.json from f3548_self_contained and netrid_v22 and finding a reasonable number of "auth_dt" fields populated.

monitorlib.fetch was previously importing Optional indirectly from implicit dict. This PR fixes that minor issue by importing it directly from typing, but this angers ruff so a blanket UP045 (Optional[Foo] -> Foo | None) exclusion is added to prevent unwanted changes to Optional ImplicitDict fields.

@BenjaminPelletier BenjaminPelletier marked this pull request as ready for review April 8, 2026 06:30
Copy link
Copy Markdown
Contributor

@mickmis mickmis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

when it is more than a minimal amount (0.5ms experimentally determined to fairly reliably separate true fast local queries from simple logic processing with no queries)

I'm not sure I understand why that is required? I.e. why bother with the additional complexity of separating actual queries from no queries? Especially if it is insignificant. In addition, here when recording the duration, it does include the processing time around the query.
And if we do care, have you considered returning this info from issue_token / get_headers / add_headers rather than inferring it from the caller? Probably that would be a slightly messy change, but the information would be reliable.

received_at: Optional[StringBasedDateTime]

auth_dt: Optional[StringBasedTimeDelta]
"""Amount of time required to obtain authorization before performing the primary query (de minimus or unknown by default)."""
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

de minimus

TIL :D

@BenjaminPelletier
Copy link
Copy Markdown
Member Author

when it is more than a minimal amount (0.5ms experimentally determined to fairly reliably separate true fast local queries from simple logic processing with no queries)

I'm not sure I understand why that is required? I.e. why bother with the additional complexity of separating actual queries from no queries? Especially if it is insignificant.

From a conceptual standpoint, ideally we would only record the authorization time when it is doing something non-trivial (usually making a query to get an access token). Checking that we already have an access token that can be reused isn't something we want to record the time required to do. But, there is no mechanism on the AuthAdapter interface to indicate when something non-trivial has been done, and it doesn't seem like we would want to add that (though see answer to your second question below). Fortunately, checking for the ability to reuse a token is very quick while presumably anything non-trivial is not nearly as quick, so I think we can do a pretty good job differentiating on time like this.

From a practical standpoint, getting an access token (or performing some other non-trivial auth task) is relatively rare -- for instance, it happens roughly 30 times in f3548_self_contained. But, there are roughly 1700 queries in f3548_self_contained. So, if we always populated auth_dt, that would add 1670 * "auth_dt":"0.0001s", bytes to the report for effectively no informational gain. And, real-world tests are 1-2 orders of magnitude bigger than this.

In addition, here when recording the duration, it does include the processing time around the query. And if we do care, have you considered returning this info from issue_token / get_headers / add_headers rather than inferring it from the caller? Probably that would be a slightly messy change, but the information would be reliable.

My thought was that the current PR approach would be briefer and complicate the interfaces (get_headers, add_headers) less, but considering more it seems worthwhile to do things thoroughly right the first time rather than "mostly right" -- I'll update the approach to this suggestion.

@BenjaminPelletier BenjaminPelletier marked this pull request as draft April 8, 2026 16:42
@BenjaminPelletier BenjaminPelletier marked this pull request as ready for review April 8, 2026 18:11
Copy link
Copy Markdown
Contributor

@mickmis mickmis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So, if we always populated auth_dt, that would add 1670 * "auth_dt":"0.0001s", bytes to the report for effectively no informational gain.

I haven't actually considered the impact of this going into the report, makes sense.

PR LGTM, just a typo.

@dataclass
class AdditionalHeaders:
headers: dict[str, str]
token_issueance_seconds: float | None = None
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
token_issueance_seconds: float | None = None
token_issuance_seconds: float | None = None

@BenjaminPelletier BenjaminPelletier merged commit 61700c7 into interuss:main Apr 9, 2026
22 checks passed
@BenjaminPelletier BenjaminPelletier deleted the auth-time branch April 9, 2026 15:36
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants