Optimize data collection: add index and batch deletes #44692
Conversation
Claude Code Review
This repository is configured for manual code reviews. Comment @claude review to trigger a review and subscribe this PR to future pushes, or @claude review once for a one-time review.
@coderabbitai review
✅ Actions performed: Review triggered.
Pull request overview
This PR aims to reduce MySQL load from the chart historical data (SCD) system by (1) adding an index to accelerate host_scd_data lookups by valid_to, and (2) changing cleanup to delete old rows in smaller batches to shorten lock windows and better interleave with concurrent writers.
Changes:
- Add a new secondary index on `host_scd_data` intended to optimize queries filtering on `valid_to`.
- Update `CleanupSCDData` to delete rows in batches (looping `DELETE ... LIMIT ...`) instead of one large delete.
- Add a batch-size constant for SCD cleanup.
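The batched-delete loop described above can be sketched without a database. The following Go simulation is illustrative only, not the PR's actual code: the `scdCleanupBatch` value and the `deleteOldBatch` helper are hypothetical stand-ins for the real constant and the `DELETE ... WHERE valid_to < ? LIMIT ?` statement, and the loop shows the standard "repeat until a batch comes back short" pattern.

```go
package main

import "fmt"

// scdCleanupBatch caps how many rows each pass removes, shortening lock
// windows (hypothetical value; the real constant lives in data.go).
const scdCleanupBatch = 3

// deleteOldBatch stands in for one `DELETE ... WHERE valid_to < cutoff LIMIT n`
// statement: it removes up to n expired rows and reports how many it removed.
func deleteOldBatch(rows *[]int, cutoff, n int) int {
	kept := (*rows)[:0] // filter in place
	deleted := 0
	for _, v := range *rows {
		if v < cutoff && deleted < n {
			deleted++
			continue
		}
		kept = append(kept, v)
	}
	*rows = kept
	return deleted
}

func main() {
	rows := []int{1, 2, 3, 10, 11, 4, 5} // valid_to values; 10 and 11 are still fresh
	total := 0
	// Keep deleting until a batch removes fewer rows than the limit,
	// which means no expired rows remain.
	for {
		n := deleteOldBatch(&rows, 10, scdCleanupBatch)
		total += n
		if n < scdCleanupBatch {
			break
		}
	}
	fmt.Println(total, len(rows)) // 5 expired rows deleted across 2 batches, 2 remain
}
```

Compared with one large `DELETE`, each short statement holds row locks briefly, letting concurrent writers interleave between batches.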
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 4 comments.
| File | Description |
|---|---|
| server/datastore/mysql/migrations/tables/20260423161823_AddHostSCDData.go | Adds an index definition to the host_scd_data table creation SQL. |
| server/chart/internal/mysql/data.go | Implements batched deletion for old SCD rows and introduces scdCleanupBatch. |
No actionable comments were generated in the recent review. 🎉
ℹ️ Recent review info — ⚙️ Run configuration: Path: .coderabbit.yaml, Review profile: CHILL, Plan: Pro
📒 Files selected for processing (1)
Walkthrough: Added package-level variable
🚥 Pre-merge checks: ✅ 4 passed | ❌ 1 failed (1 warning)
Actionable comments posted: 1
🧹 Nitpick comments (1)
server/chart/internal/mysql/data.go (1)
384-389: 💤 Low value — Consider adding `ORDER BY valid_to` to the batched `DELETE` for statement-based replication safety.
`DELETE ... LIMIT` without `ORDER BY` is flagged as unsafe by MySQL in `binlog_format=STATEMENT` mode and can produce replication errors or warnings in mixed-mode setups. With RDS's default ROW format this is benign, but adding an explicit `ORDER BY` costs nothing and makes the statement portable:
♻️ Proposed change
res, err := ds.writer(ctx).ExecContext(ctx, `DELETE FROM host_scd_data WHERE valid_to < ? AND valid_to <> ? ORDER BY valid_to LIMIT ?`, cutoff, scdOpenSentinel, scdCleanupBatch)
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@server/chart/internal/mysql/data.go` around lines 384 - 389, The batched deletion on host_scd_data using ds.writer(...).ExecContext currently uses DELETE ... LIMIT without a deterministic order; update the SQL in the ExecContext call that deletes from host_scd_data to include an explicit ORDER BY valid_to (ascending) so rows are deleted in a stable order and the statement is safe for statement-based replication. Keep the same WHERE predicates (valid_to < ? AND valid_to <> ?) and LIMIT parameter; just append ORDER BY valid_to before LIMIT in the query string used by the ExecContext invocation.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@server/datastore/mysql/migrations/tables/20260423161823_AddHostSCDData.go`:
- Around line 35-36: The CREATE TABLE migration in
20260423161823_AddHostSCDData.go will not add idx_valid_to_dataset on
already-migrated DBs, so create a new migration file (e.g.,
20260503000000_AddHostSCDDataValidToIdx.go) that implements Up_20260503000000(tx
*sql.Tx) to run an ALTER TABLE host_scd_data ADD KEY idx_valid_to_dataset
(valid_to, dataset, entity_id), wrap and return any error (e.g., fmt.Errorf("add
idx_valid_to_dataset to host_scd_data: %w", err)); include a corresponding Down
if your migration framework requires it.
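The follow-up migration the comment above asks for could look like the sketch below. The migration name and timestamp are hypothetical, and `*sql.Tx` is abstracted behind a one-method interface here so the sketch runs without a live MySQL connection; a real Fleet migration would take `*sql.Tx` directly and be registered with the migration framework.

```go
package main

import "fmt"

// execer abstracts the Exec method of *sql.Tx so this sketch runs without a DB.
type execer interface {
	Exec(query string, args ...any) (any, error)
}

// Up_20260503000000 sketches the suggested follow-up migration (hypothetical
// name): it adds idx_valid_to_dataset to host_scd_data on databases that
// already ran the original CREATE TABLE migration.
func Up_20260503000000(tx execer) error {
	_, err := tx.Exec(`ALTER TABLE host_scd_data
		ADD KEY idx_valid_to_dataset (valid_to, dataset, entity_id)`)
	if err != nil {
		return fmt.Errorf("add idx_valid_to_dataset to host_scd_data: %w", err)
	}
	return nil
}

// fakeTx records the statement instead of hitting MySQL.
type fakeTx struct{ got string }

func (f *fakeTx) Exec(q string, args ...any) (any, error) { f.got = q; return nil, nil }

func main() {
	tx := &fakeTx{}
	if err := Up_20260503000000(tx); err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("would run:", tx.got)
}
```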
---
Nitpick comments:
In `@server/chart/internal/mysql/data.go`:
- Around line 384-389: The batched deletion on host_scd_data using
ds.writer(...).ExecContext currently uses DELETE ... LIMIT without a
deterministic order; update the SQL in the ExecContext call that deletes from
host_scd_data to include an explicit ORDER BY valid_to (ascending) so rows are
deleted in a stable order and the statement is safe for statement-based
replication. Keep the same WHERE predicates (valid_to < ? AND valid_to <> ?) and
LIMIT parameter; just append ORDER BY valid_to before LIMIT in the query string
used by the ExecContext invocation.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: 74fcc29e-30a5-42de-a49a-94abbef0ff53
📒 Files selected for processing (2)
server/chart/internal/mysql/data.go
server/datastore/mysql/migrations/tables/20260423161823_AddHostSCDData.go
Codecov Report — ❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #44692 +/- ##
========================================
Coverage 66.69% 66.69%
========================================
Files 2651 2652 +1
Lines 213440 213559 +119
Branches 9638 9638
========================================
+ Hits 142344 142438 +94
- Misses 58135 58154 +19
- Partials 12961 12967 +6
Flags with carried forward coverage won't be shown. ☔ View full report in Codecov by Sentry.
JordanMontgomery
left a comment
This seems like a good solution; we may want to ping infra, or possibly just manually add the index on QA-wolf (I think our DB access will allow it), to see if it resolves things.
Related issue: Resolves #44609
Details
This PR optimizes the historical data collection system in two ways:
- Adds a secondary index to the `host_scd_data` table, allowing more efficient lookups of rows by their `valid_to`, to optimize both closing out open rows and deleting old rows
- Deletes old rows in smaller batches rather than in one large statement

Checklist for submitter
If some of the following don't apply, delete the relevant line.
Changes file added in changes/, orbit/changes/, or ee/fleetd-chrome/changes. See Changes files for more information.
n/a, unreleased
`SELECT *` is avoided, SQL injection is prevented (using placeholders for values in statements), JS inline code is prevented especially for url redirects, and untrusted data interpolated into shell scripts/commands is validated against shell metacharacters.

Testing
Added/updated automated tests
Where appropriate, automated tests simulate multiple hosts and test for host isolation (updates to one hosts's records do not affect another)
QA'd all new/changed functionality manually
SQL explains -- before:
Using a test set of data (~144k "open" rows), UPDATES happened at 9 ops per second.
after:
Using the same test set of data, UPDATES happened at 4,910 ops per second.
For unreleased bug fixes in a release candidate, one of:
this should significantly improve results!
Database migrations
COLLATE utf8mb4_unicode_ci).

Summary by CodeRabbit