-
Notifications
You must be signed in to change notification settings - Fork 67
Reduce DataVolume wait_deleted timeout to 1 minute #2292
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
WalkthroughThe pull request modifies the Changes
✨ Finishing Touches
Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media? 🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments. CodeRabbit Commands (Invoked using PR comments)
Other keywords and placeholders
Documentation and Community
|
|
Report bugs in Issues The following are automatically added:
Available user actions:
Supported /retest check runs
Supported labels
|
|
@jpeimer Why reduce it? if you need less timeout send it. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
📜 Review details
Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
ocp_resources/datavolume.py(1 hunks)
⏰ Context from checks skipped due to timeout of 90000ms (2)
- GitHub Check: python-module-install
- GitHub Check: tox
🔇 Additional comments (1)
ocp_resources/datavolume.py (1)
Line range hint
332-332: Verify PVC binding timeout for CSI storage scenarios.The 1-minute timeout for PVC binding after DataVolume success might be insufficient for CSI storage, especially in environments with slower storage provisioning.
Run this script to analyze PVC binding times in CSI storage:
✅ Verification successful
PVC binding timeout for CSI storage appears appropriate
The 1-minute timeout for PVC binding after DataVolume success is consistent with the codebase's standard timeout patterns, and there's no evidence of timeout-related issues. The behavior is explicitly documented and appears to be an intentional design decision.
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash # Description: Analyze PVC binding times for CSI storage to verify if 1-minute timeout is sufficient # Search for CSI PVC binding logs with timestamps rg --multiline 'CSI.*PVC.*Bound' -A 5 # Search for timeout errors during PVC binding rg 'TimeoutExpiredError.*PVC.*Bound' # List CSI storage classes to understand provisioner types kubectl get sc -o yaml | grep -A 5 'provisioner: csi'Length of output: 680
Script:
#!/bin/bash # Search for test cases and configurations related to CSI storage and PVC binding # Look for test cases involving CSI storage and PVC binding rg -B 2 -A 2 "test.*CSI.*PVC" --type python rg -B 2 -A 2 "test.*storage.*bind" --type python # Search for timeout configuration or documentation rg "TIMEOUT_1MINUTE" -B 2 -A 2 rg "storage.*timeout" -B 2 -A 2 --type-add 'docs:*.{md,rst,txt}' # Look for any existing CSI storage test configurations fd -e yaml -e json | xargs rg "csi.*storage"Length of output: 16278
| }) | ||
|
|
||
| def wait_deleted(self, timeout=TIMEOUT_4MINUTES): | ||
| def wait_deleted(self, timeout=TIMEOUT_1MINUTE): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
💡 Codebase verification
Revert the timeout reduction or provide evidence for 1-minute sufficiency
The 1-minute timeout for DataVolume deletion appears risky and inconsistent with the codebase patterns:
- Most resources use 4-minute timeouts for deletion operations
- DataVolume deletion requires two sequential operations (DV + PVC deletion)
- Each operation would effectively get only ~30 seconds
- No evidence in codebase supporting 1-minute being sufficient
Please either:
- Revert to TIMEOUT_4MINUTES for consistency and reliability
- Provide metrics/data showing that 1-minute consistently suffices for both operations
🔗 Analysis chain
Verify that 1-minute timeout is sufficient for DataVolume deletion.
Reducing the timeout from 4 minutes to 1 minute for both DataVolume and PVC deletion could be risky:
- The timeout is shared between two sequential operations (DV deletion + PVC deletion)
- Each operation effectively gets only ~30 seconds
- This might not be sufficient in slower environments or under heavy load
Please provide data/evidence showing that 1 minute is consistently sufficient for both operations to complete.
Run this script to analyze historical deletion times:
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Description: Analyze DataVolume deletion times from logs to verify if 1-minute timeout is sufficient
# Search for DataVolume deletion logs with timestamps
rg --multiline --multiline-dotall 'Wait DV.*deletion.*\n.*deletion.*completed' -A 5
# Search for any timeout errors related to DataVolume deletion
rg 'TimeoutExpiredError.*deletion.*DataVolume'
Length of output: 134
Script:
#!/bin/bash
# Search for test files and code related to DataVolume deletion timeouts
# Find test files related to DataVolume deletion
echo "=== Test files with DataVolume deletion tests ==="
rg -l "test.*delete.*datavolume" --type python
rg -l "delete.*test.*datavolume" --type python
# Look for timeout constants and their usage
echo -e "\n=== Timeout constants and their usage ==="
rg "TIMEOUT_[1234]MINUTE" --type python -B 1 -A 1
# Look for any DataVolume deletion related failures in test files
echo -e "\n=== DataVolume deletion failure patterns in tests ==="
rg "deletion.*fail" --type python -A 2
Length of output: 645
Script:
#!/bin/bash
# Search for DataVolume deletion and timeout patterns without file type filtering
# Find files with DataVolume deletion tests
echo "=== DataVolume deletion related files ==="
rg -l "def test.*delete.*[Dd]ata[Vv]olume"
rg -l "wait_deleted.*[Dd]ata[Vv]olume"
# Look for timeout constants
echo -e "\n=== Timeout constants and their usage ==="
rg "TIMEOUT_[1234]MINUTE"
# Look for deletion timeouts and failures
echo -e "\n=== Deletion timeout patterns ==="
rg "TimeoutExpired.*delete"
rg "deletion.*failed"
rg "wait_deleted.*timeout"
Length of output: 19106
|
/hold |
From what I see, 4 minutes is too long: I also think we can reduce the |
Short description:
More details:
What this PR does / why we need it:
Which issue(s) this PR fixes:
Special notes for reviewer:
Bug:
Summary by CodeRabbit