Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

server: return authoritative span statistics for db details endpoint #108037

Merged

Conversation

THardy98
Copy link
Contributor

@THardy98 THardy98 commented Aug 2, 2023

Resolves: #96163

This change makes the admin API endpoint getting database statistics scan KV for span statistics instead of using the range descriptor cache. This provides authoritative output, helping deflake TestMultiRegionDatabaseStats.

Release note (sql change): admin API database details endpoint now returns authoritative range statistics.

@THardy98 THardy98 requested a review from a team August 2, 2023 17:30
@THardy98 THardy98 requested review from a team as code owners August 2, 2023 17:30
@THardy98 THardy98 requested review from nkodali and removed request for a team August 2, 2023 17:30
@blathers-crl
Copy link

blathers-crl bot commented Aug 2, 2023

It looks like your PR touches production code but doesn't add or edit any test code. Did you consider adding tests to your PR?

🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is dev-inf.

@cockroach-teamcity
Copy link
Member

This change is Reviewable

@THardy98 THardy98 added the backport-23.1.x Flags PRs that need to be backported to 23.1 label Aug 2, 2023
@THardy98 THardy98 removed request for a team and nkodali August 2, 2023 17:56
Copy link
Contributor

@j82w j82w left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @THardy98)


-- commits line 12 at r1:
Is there any concern about the additional overhead and perf impact of not using the cache? Would it not be better to just reset the stats for the test?

Copy link
Contributor Author

@THardy98 THardy98 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @j82w)


-- commits line 12 at r1:

Previously, j82w (Jake) wrote…

Is there any concern about the additional overhead and perf impact of not using the cache? Would it not be better to just reset the stats for the test?

I think this makes more sense for the endpoint generally. Using the descriptor cache to measure size seems odd. There's no guarantee the cache holds all the database/table's ranges (i.e. we could be missing ranges or reading old ranges that no longer correspond to the database/table) and there's also no guarantee that the cache at some point will be up-to-date either.

I do agree that there is a concern about the additional overhead/perf impact but we also use the same logic already for the status server which we use for the console. I think the alignment between the two is also good.

@knz
Copy link
Contributor

knz commented Aug 2, 2023

I agree with Thomas.

Additionally, if we find that performance is a concern, I would welcome a caching layer for the computed stats in the o11y service, separate from the range cache.

Copy link
Contributor

@maryliag maryliag left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just wanna see that the tests are passing before approval, to make sure this change is working as expected, since some of the failed ones are related to retrieving database details

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @j82w)

@THardy98 THardy98 force-pushed the deflake_TestMultiRegionDatabaseStats branch 2 times, most recently from 777c1a0 to de3b318 Compare August 11, 2023 21:47
@THardy98
Copy link
Contributor Author

THardy98 commented Aug 11, 2023

Tests were failing due to directly accessing meta2 keyspace which would fail in tenant scenarios.

Added a conditional to use the range iterator from tenantConnect if it was populated. Not sure if the tenantConnect != nil pattern is one we want to adopt but thought it was a simple solution here.

@THardy98 THardy98 requested review from maryliag and j82w August 11, 2023 21:50
Resolves: cockroachdb#96163

This change makes the admin API endpoint getting database statistics
scan KV for span statistics instead of using the range descriptor cache.
This provides authoritative output, helping deflake
`TestMultiRegionDatabaseStats`.

Release note (sql change): admin API database details endpoint now
returns authoritative range statistics.
@THardy98 THardy98 force-pushed the deflake_TestMultiRegionDatabaseStats branch from de3b318 to b2faa2b Compare August 11, 2023 21:50
Copy link
Contributor

@maryliag maryliag left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm:

Reviewed 1 of 2 files at r2, 1 of 1 files at r3, all commit messages.
Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @j82w)

@THardy98
Copy link
Contributor Author

bors r+

@craig
Copy link
Contributor

craig bot commented Aug 14, 2023

Build failed (retrying...):

@craig
Copy link
Contributor

craig bot commented Aug 14, 2023

Build failed (retrying...):

@craig
Copy link
Contributor

craig bot commented Aug 14, 2023

Build succeeded:

@craig craig bot merged commit dc2c52d into cockroachdb:master Aug 14, 2023
7 checks passed
@blathers-crl
Copy link

blathers-crl bot commented Aug 14, 2023

Encountered an error creating backports. Some common things that can go wrong:

  1. The backport branch might have already existed.
  2. There was a merge conflict.
  3. The backport branch contained merge commits.

You might need to create your backport manually using the backport tool.


error creating merge commit from b2faa2b to blathers/backport-release-23.1-108037: POST https://api.github.com/repos/cockroachdb/cockroach/merges: 409 Merge conflict []

you may need to manually resolve merge conflicts with the backport tool.

Backport to branch 23.1.x failed. See errors above.


🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is dev-inf.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backport-23.1.x Flags PRs that need to be backported to 23.1
Projects
None yet
Development

Successfully merging this pull request may close these issues.

ccl/multiregionccl: TestMultiRegionDatabaseStats failed
5 participants