tests: add temporary next gen bootstrap delay#4439
tests: add temporary next gen bootstrap delay#4439ti-chi-bot[bot] merged 1 commit intopingcap:masterfrom
Conversation
Improve robustness.
📝 WalkthroughWalkthroughThe change introduces a pre-start delay in the TiDB bootstrap sequence to await keyspace pre-allocation. A 30-second sleep with an echo statement is added before creating upstream TiDB system config and starting TiDB instances, with comments noting that PD health alone is insufficient and that this quiet window reduces CI startup flakiness. Changes
Estimated code review effort🎯 1 (Trivial) | ⏱️ ~3 minutes Suggested labels
Suggested reviewers
Poem
🚥 Pre-merge checks | ✅ 3✅ Passed checks (3 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
🧹 Nitpick comments (1)
tests/integration_tests/_utils/start_tidb_cluster_nextgen (1)
346-351: Consider making the delay configurable.The fixed 30-second delay unconditionally adds time to every CI run. In slower CI environments, 30 seconds may still be insufficient, while in faster environments it wastes time.
Consider making this configurable via an environment variable with a sensible default:
♻️ Suggested refactor for configurable delay
# TODO: replace this fixed delay with an explicit keyspace readiness check. # PD health only guarantees the service is up. In next-gen bootstrap we also # rely on PD finishing keyspace pre-allocation before the SYSTEM TiDB starts. # Giving PD a short quiet window here reduces flaky startup failures in CI. +KEYSPACE_PREALLOC_DELAY=${KEYSPACE_PREALLOC_DELAY:-30} echo "Waiting for next-gen keyspace pre-allocation to settle..." -sleep 30 +sleep "$KEYSPACE_PREALLOC_DELAY"This allows overriding the delay per environment (e.g., longer for slow CI runners) without modifying the script.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tests/integration_tests/_utils/start_tidb_cluster_nextgen` around lines 346 - 351, Replace the hardcoded sleep with a configurable environment variable: read a variable like NEXTGEN_KEYSPACE_SETTLE_SECONDS (defaulting to 30) and use it in place of the literal "sleep 30" so CI can override the delay; keep the existing echo "Waiting for next-gen keyspace pre-allocation to settle..." message and validate/coerce the env value to an integer/fallback to 30 before calling sleep to avoid invalid input.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In `@tests/integration_tests/_utils/start_tidb_cluster_nextgen`:
- Around line 346-351: Replace the hardcoded sleep with a configurable
environment variable: read a variable like NEXTGEN_KEYSPACE_SETTLE_SECONDS
(defaulting to 30) and use it in place of the literal "sleep 30" so CI can
override the delay; keep the existing echo "Waiting for next-gen keyspace
pre-allocation to settle..." message and validate/coerce the env value to an
integer/fallback to 30 before calling sleep to avoid invalid input.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 94fb399f-3b4b-4a35-b23c-6e3c0cff6dba
📒 Files selected for processing (1)
tests/integration_tests/_utils/start_tidb_cluster_nextgen
|
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: tenfyzhong, wk989898 The full list of commands accepted by this bot can be found here. The pull request process is described here DetailsNeeds approval from an approver in each of these files:
Approvers can indicate their approval by writing |
[LGTM Timeline notifier]Timeline:
|
Summary of ChangesHello, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed! This pull request enhances the reliability of integration tests by introducing a temporary delay during the bootstrap process of next-generation TiDB clusters. This delay ensures that the Placement Driver (PD) has sufficient time to complete keyspace pre-allocation before the Highlights
🧠 New Feature in Public Preview: You can now enable Memory to help Gemini Code Assist learn from your team's feedback. This makes future code reviews more consistent and personalized to your project's style. Click here to enable Memory in your admin console. Changelog
Activity
Using Gemini Code AssistThe full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips. Invoking Gemini You can request assistance from Gemini at any point by creating a comment using either
Customization To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a Limitations & Feedback Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for GitHub and other Google products, sign up here. You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension. Footnotes
|
There was a problem hiding this comment.
Code Review
This pull request introduces a temporary 30-second delay into an integration test script to address CI flakiness, which is a reasonable short-term solution. My review includes a suggestion to make this delay configurable via an environment variable. This would improve the flexibility of the script, allowing for easier adjustment of the delay in different environments without requiring code changes.
| # rely on PD finishing keyspace pre-allocation before the SYSTEM TiDB starts. | ||
| # Giving PD a short quiet window here reduces flaky startup failures in CI. | ||
| echo "Waiting for next-gen keyspace pre-allocation to settle..." | ||
| sleep 30 |
There was a problem hiding this comment.
Using a hardcoded sleep duration can be inflexible, even for a temporary solution. It's better to make the delay configurable via an environment variable. This allows for easier tuning in different environments (e.g., CI vs. local) without changing the code.
sleep "${NEXT_GEN_KEYSPACE_SETTLE_DELAY:-30}"
What problem does this PR solve?
Issue Number: close #4438
What is changed and how it works?
This PR adds a temporary 30 second delay in
tests/integration_tests/_utils/start_tidb_cluster_nextgenafter TiKV and TiKV-worker are ready and before the firstSYSTEMTiDB starts.PD health only guarantees that the service is up. In the flaky next-gen bootstrap path, giving PD a short quiet window here reduces cases where keyspace pre-allocation is still incomplete when
SYSTEMTiDB starts.Check List
Tests
bash -n tests/integration_tests/_utils/start_tidb_cluster_nextgenQuestions
Will it cause performance regression or break compatibility?
No. This only changes integration test bootstrap timing.
Do you need to update user documentation, design documentation or monitoring documentation?
No.
Release note
Summary by CodeRabbit