Skip to content

Conversation

@zaibkhan
Copy link

@zaibkhan zaibkhan commented Sep 9, 2025

Mirrors ai-code-review-evaluation#7 for like-for-like benchmarking.

  • Base: db-cleanup-baseline
  • Head: db-cleanup-optimized

Original PR excerpt:

Test 7

… deadlocks on MySQL (#80329)

* Split subquery when cleaning annotations

* update comment

* Raise batch size, now that we pay attention to it

* Iterate in batches

* Separate cancellable batch implementation to allow for multi-statement callbacks, add overload for single-statement use

* Use split-out utility in outer batching loop so it respects context cancellation

* guard against empty queries

* Use SQL parameters

* Use same approach for tags

* drop unused function

* Work around parameter limit on sqlite for large batches

* Bulk insert test data in DB

* Refactor test to customise test data creation

* Add test for catching SQLITE_MAX_VARIABLE_NUMBER limit

* Turn annotation cleanup test to integration tests

* lint

---------

Co-authored-by: Sofia Papagiannaki <1632407+papagian@users.noreply.github.com>
@codoki-pr-intelligence
Copy link

codoki-pr-intelligence bot commented Sep 9, 2025

Codoki PR Review

Summary: Fix batch deletes to avoid deadlocks, correct test compile, tame logging
What’s good: Refactoring deletes to fetch-and-delete by ID in bounded batches is a solid approach to avoid MySQL deadlocks and aligns with the new untilDoneOrCancelled helper.
Review Status: ❌ Requires changes
Overall Priority: High

Issues (Critical & High only)

Severity Issue Why it matters
High Correctness — Compilation failure due to undefined min() in test …/annotationsimpl/cleanup_test.go
The test does not compile: min is undefined in Go 1.21. Replace with a bounds-checked end index (and apply the same fix in the annotation_tag block below) to restore build.

Showing top 1 issues. Critical: 0, High: 1. See inline suggestions for more.

Key Feedback (click to expand)
  • Needs improvement: Several new log lines emit at error level with large payloads (IDs), which will flood logs and slow cleanup; use debug and avoid listing IDs. Also, the cleanup ticker frequency change to 1 minute may cause unnecessary load.
  • Testing: Good coverage for large SQLite parameter limits and configuration permutations. However, the test currently does not compile due to use of undefined min; fix this to re-enable CI. Consider adding a small test to assert that both MaxAge and MaxCount can run in the same pass without interference.
  • Documentation: Consider a brief comment near untilDoneOrCancelled clarifying that both MaxAge and MaxCount cleanup can run in one invocation (changed from prior behavior), as this is operationally meaningful.
  • Compatibility: Deleting in batches changes timing/logging but not semantics; however, increasing the cleanup ticker to 1 minute is a behavioral change that increases background activity and may impact low-resource deployments.
  • Performance: Avoid logging full ID lists at error level on every batch; this is costly in hot cleanup paths and can degrade DB and logging throughput.
  • Open questions: Do we intend to run both MaxAge and MaxCount cleanups in the same pass now (previously MaxCount returned early)? If so, can we document this behavior change?

Confidence: 3/5 — Needs work before merge (1 high · status: Requires changes)

React with 👍 or 👎 if you found this review useful.

}
return err
})
require.NoError(t, err)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ High: The test does not compile: min is undefined in Go 1.21. Replace with a bounds-checked end index (and apply the same fix in the annotation_tag block below) to restore build.

Suggested change
require.NoError(t, err)
```go
end := i + batchsize
if end > len(newAnnotations) {
end = len(newAnnotations)
}
_, err := sess.InsertMulti(newAnnotations[i:end])

ids, err := r.fetchIDs(ctx, "annotation", cond)
if err != nil {
return 0, err
}

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🔷 Medium: Excessive error-level logging with full ID lists will flood logs and slow down cleanup. Use debug level and avoid logging IDs to keep logs lightweight (apply similarly to the other cleanup logs).

Suggested change
}
```go
r.log.Debug("Annotations to clean by time", "count", len(ids), "cond", cond)

srv.cleanUpTmpFiles(ctx)

ticker := time.NewTicker(time.Minute * 10)
ticker := time.NewTicker(time.Minute * 1)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🔷 Medium: Increasing the cadence to 1 minute may significantly raise background DB load. Unless driven by a config knob, consider keeping 10 minutes for production stability.

Suggested change
ticker := time.NewTicker(time.Minute * 1)
```go
ticker := time.NewTicker(time.Minute * 10)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants