Closed
Labels
flaky-test, testing (Issues around tests in the codebase and the QA process)
Description
We have several different metrics libraries in the clients that all use some variant of global state (expvar, clientmetrics, and soon usermetrics (expvar), #13309). This works fine when there is a single Tailscale client or tsnet instance, but all of these metrics become unpredictable and sometimes wrong if multiple Tailscale instances are run, for example if an application has multiple tsnet instances, or if tests are run in parallel (or in sequence, as the metrics are not cleared between tests).
@knyar gathered a list of the behaviour we are currently seeing:
If I understand things correctly, when we do this, such “global” metrics become unusable/inaccurate in a few different ways:
- gauge metrics get overwritten by whatever instance set them last.
- counter metrics remain somewhat “correct” and track the cumulative count across all instances, without giving us the ability to distinguish them.
- on the clientmetric collection side, each tsnet app reports the same “global” values to logz independently, so we end up tracking the same inaccurate set of metrics multiple times.
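The first two behaviours can be reproduced with a minimal sketch using only the standard library `expvar` package. The `instance` type and the metric names `example_peers` and `example_packets_total` are hypothetical and are not taken from the Tailscale codebase; the sketch only assumes that both instances write to the same package-level variables, as described above.

```go
package main

import (
	"expvar"
	"fmt"
)

// Process-wide metrics shared by every instance in the process.
// The metric names are hypothetical, chosen for illustration only.
var (
	globalPeers   = expvar.NewInt("example_peers")         // used as a gauge
	globalPackets = expvar.NewInt("example_packets_total") // used as a counter
)

// instance stands in for one tsnet server or one client under test.
// It holds no metrics of its own, so every update lands in global state.
type instance struct {
	name string
}

// setPeers overwrites the shared gauge with this instance's value.
func (i *instance) setPeers(n int64) { globalPeers.Set(n) }

// sendPackets adds to the shared counter.
func (i *instance) sendPackets(n int64) { globalPackets.Add(n) }

func main() {
	a := &instance{name: "a"}
	b := &instance{name: "b"}

	a.setPeers(3)
	b.setPeers(7) // overwrites a's value: last writer wins

	a.sendPackets(10)
	b.sendPackets(5) // summed with a's packets, no way to tell them apart

	fmt.Println("peers:", globalPeers.Value())     // prints "peers: 7"
	fmt.Println("packets:", globalPackets.Value()) // prints "packets: 15"
}
```

Neither reported value describes instance `a` or instance `b` on its own: the gauge reflects only `b`, and the counter is the sum of both.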