executor: deflake TestStorageEnginesInSlowQuery #68168
ti-chi-bot[bot] merged 3 commits into pingcap:master
|
@henrybw I've received your pull request and will start the review. I'll conduct a thorough review covering code quality, potential issues, and implementation details. ⏳ This process typically takes 10-30 minutes depending on the complexity of the changes. ℹ️ Learn more details on Pantheon AI. |
|
No actionable comments were generated in the recent review. 🎉
📝 Walkthrough
Adds a test helper and expands TestStorageEnginesInSlowQuery in
Changes
Slow Query Test Improvements
Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~20 minutes
🚥 Pre-merge checks | ✅ 4 | ❌ 1
❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
Warning: There were issues while running some tools. Please review the errors and either fix the tool's configuration or disable the tool if it's a critical failure.
🔧 golangci-lint (2.12.1): Command failed
|
Codecov Report
✅ All modified and coverable lines are covered by tests.
Additional details and impacted files
@@ Coverage Diff @@
## master #68168 +/- ##
================================================
- Coverage 77.7514% 77.0757% -0.6757%
================================================
Files 1990 1972 -18
Lines 551828 552480 +652
================================================
- Hits 429054 425828 -3226
- Misses 121854 126650 +4796
+ Partials 920 2 -918
Flags with carried forward coverage won't be shown.
|
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
pkg/executor/slow_query_sql_test.go (1)
479-494:⚠️ Potential issue | 🟡 Minor | ⚡ Quick win
`t.Cleanup` diagnostic never fires: the file is already removed by the time it runs.
In Go's testing model, `defer` statements in the test function body execute when the function returns, while `t.Cleanup` callbacks run after the function returns. Because the `defer` at line 479 calls both `f.Close()` and `os.Remove(...)`, the slow-log file is gone before the `t.Cleanup` at line 487 attempts `os.ReadFile`. The `if err == nil` guard silently swallows the "no such file" error, making the diagnostic a no-op on every failure.
Fix: convert the file teardown from `defer` to a `t.Cleanup` registered before the diagnostic one. Since `t.Cleanup` is LIFO, the diagnostic (registered later) will run before the removal (registered earlier).
🐛 Proposed fix: swap defer for t.Cleanup with correct ordering
- defer func() {
-     config.StoreGlobalConfig(originCfg)
-     require.NoError(t, f.Close())
-     require.NoError(t, os.Remove(newCfg.Log.SlowQueryFile))
- }()
  require.NoError(t, logutil.InitLogger(newCfg.Log.ToLogConfig()))
- // On failure, dump the slow log to disambiguate a missing entry from one
- // that's present but doesn't match the expected pattern (issue `#66727`).
+ // Registered first → runs last (LIFO): tear down file after the dump.
+ t.Cleanup(func() {
+     config.StoreGlobalConfig(originCfg)
+     require.NoError(t, f.Close())
+     require.NoError(t, os.Remove(newCfg.Log.SlowQueryFile))
+ })
+ // Registered second → runs first (LIFO): dump before removal.
+ // On failure, dump the slow log to disambiguate a missing entry from one
+ // that's present but doesn't match the expected pattern (issue `#66727`).
  t.Cleanup(func() {
      if !t.Failed() {
          return
      }
      if data, err := os.ReadFile(f.Name()); err == nil {
          t.Logf("slow log contents (%d bytes):\n%s", len(data), data)
      }
  })
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@pkg/executor/slow_query_sql_test.go` around lines 479-494, the teardown that closes and removes the slow-log file is currently done with defer (calling config.StoreGlobalConfig(originCfg), f.Close(), and os.Remove(newCfg.Log.SlowQueryFile)), which runs before the t.Cleanup diagnostic and causes the diagnostic's os.ReadFile(f.Name()) to always fail. Change that defer into a t.Cleanup registered before the existing diagnostic t.Cleanup so that cleanup runs LIFO (register t.Cleanup(func(){ config.StoreGlobalConfig(originCfg); require.NoError(t, f.Close()); require.NoError(t, os.Remove(newCfg.Log.SlowQueryFile)) }) before the diagnostic one), ensuring the diagnostic t.Cleanup that reads f.Name() runs while the file still exists. Keep the log-initialization call to logutil.InitLogger(newCfg.Log.ToLogConfig()) unchanged.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: CHILL
Plan: Pro
Run ID: 6600e1cb-297b-487a-91f4-2cdefe159b9b
📒 Files selected for processing (1)
pkg/executor/slow_query_sql_test.go
|
Hi @henrybw. Thanks for your PR. PRs from untrusted users cannot be marked as trusted with I understand the commands that are listed here.
Details
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. |
🧹 Nitpick comments (1)
pkg/executor/slow_query_sql_test.go (1)
517-517: 💤 Low value
The `'select%t_tiflash;'` pattern also matches the subsequent join query: a latent ordering dependency.
The join query on line 520 ends with `from t_tikv, t_tiflash;`, so it is also matched by `select%t_tiflash;`. The check at line 517 is currently safe only because it executes before line 520 runs. If these two blocks were reordered, or another query matching this suffix were inserted above line 516, `checkStorageEngines` at line 517 would see 2 rows and fail non-obviously.
Consider anchoring the pattern more tightly (for example, by matching the full hint text), or document the ordering constraint with a comment:
♻️ Suggested fix
- checkStorageEngines(t, tk, "query like 'select%t_tiflash;'", "0 1")
+ // Pattern must not match the later join query (also ends in "t_tiflash;"); keep this
+ // check before the join scenario below.
+ checkStorageEngines(t, tk, "query like 'select%read_from_storage(tiflash[t_tiflash])%t_tiflash;'", "0 1")
Also applies to: 520-521
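The overlap is easy to reproduce outside the database. The sketch below is illustrative only: `likeMatch` is a hypothetical matcher that supports just the `%` wildcard (it treats `_` literally, which is stricter than SQL `LIKE` but gives the same answers for these inputs), and both query strings are assumed stand-ins, since the exact statements in the test are not quoted in this comment.

```go
package main

import (
	"fmt"
	"strings"
)

// likeMatch reports whether s matches pattern, where '%' matches any run of
// characters. Hypothetical helper: no support for '_' wildcards or escapes.
func likeMatch(pattern, s string) bool {
	parts := strings.Split(pattern, "%")
	if len(parts) == 1 {
		return pattern == s
	}
	first, last := parts[0], parts[len(parts)-1]
	if len(s) < len(first)+len(last) ||
		!strings.HasPrefix(s, first) || !strings.HasSuffix(s, last) {
		return false
	}
	// The middle literals must appear in order between the prefix and suffix.
	rest := s[len(first) : len(s)-len(last)]
	for _, mid := range parts[1 : len(parts)-1] {
		i := strings.Index(rest, mid)
		if i < 0 {
			return false
		}
		rest = rest[i+len(mid):]
	}
	return true
}

func main() {
	// Assumed query texts, chosen only to end the way the comment describes.
	single := "select /*+ read_from_storage(tiflash[t_tiflash]) */ * from t_tiflash;"
	join := "select /*+ read_from_storage(tikv[t_tikv], tiflash[t_tiflash]) */ * from t_tikv, t_tiflash;"

	loose := "select%t_tiflash;"
	tight := "select%read_from_storage(tiflash[t_tiflash])%t_tiflash;"

	fmt.Println(likeMatch(loose, single), likeMatch(loose, join)) // true true
	fmt.Println(likeMatch(tight, single), likeMatch(tight, join)) // true false
}
```

Under these assumed query texts, the loose pattern matches both statements while the tighter one matches only the single-table query, so the check would no longer depend on running before the join.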
|
[APPROVALNOTIFIER] This PR is APPROVED
This pull-request has been approved by: lance6716, windtalker
The full list of commands accepted by this bot can be found here.
The pull request process is described here
Details
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
[LGTM Timeline notifier]Timeline:
|
What problem does this PR solve?
Issue Number: ref #66727
Problem Summary:
`TestStorageEnginesInSlowQuery` is still flaky after #66773. Under CI load, the slow-log write inside `MustExec` isn't always visible to the immediately following read of `information_schema.slow_query`, so the row-count check fails with 0 rows.
What changed and how does it work?
- Changes the `slow_query` checks to call a new `checkStorageEngines` helper function that uses `tk.EventuallyMustQueryAndCheck`.
- Adds a `t.Cleanup` that dumps the slow-log file on test failure, to distinguish a missing entry from one that is present but does not match, in case this test is still flaky.
Check List
Tests
Side effects
Documentation
Release note
Please refer to Release Notes Language Style Guide to write a quality release note.
Summary by CodeRabbit