fix: resolve stuck accounts in process of undelegation #595
Conversation
Walkthrough
Extended the account-removal predicate to also allow removing accounts owned by dlp::id().
Sequence Diagram(s): Skipped — changes are localized and do not introduce new cross-component control flow requiring a sequence diagram.
Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~20 minutes
Pre-merge checks and finishing touches: ✅ Passed checks (3 passed)
bmuddha left a comment
LGTM
GabrielePicco left a comment
LGTM!
```rust
(!account.delegated()
    // This fixes the edge-case of accounts that were in the process of
    // being undelegated but never completed while the validator was running
    || account.owner().eq(&dlp::id()))
```
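For orientation, here is a minimal, self-contained sketch of how a predicate like this could be applied when pruning accounts on restart. The `Pubkey`, `Account`, and `dlp` items below are simplified placeholders, not the actual magicblock-chainlink or dlp types.

```rust
// Sketch only: simplified stand-ins for the real chainlink/dlp types.
#[derive(Clone, Copy, PartialEq, Eq)]
pub struct Pubkey(pub [u8; 32]);

pub struct Account {
    owner: Pubkey,
    delegated: bool,
}

impl Account {
    pub fn delegated(&self) -> bool {
        self.delegated
    }
    pub fn owner(&self) -> &Pubkey {
        &self.owner
    }
}

pub mod dlp {
    use super::Pubkey;
    /// Placeholder for the delegation program id.
    pub fn id() -> Pubkey {
        Pubkey([0u8; 32])
    }
}

/// An account may be removed if it is not delegated, or if it is still
/// owned by the delegation program, i.e. an undelegation was started but
/// its completion was never observed while the validator was running.
pub fn can_remove(account: &Account) -> bool {
    !account.delegated() || account.owner().eq(&dlp::id())
}

/// Drop every removable account from an in-memory set on startup.
pub fn prune(accounts: &mut Vec<Account>) {
    accounts.retain(|acc| !can_remove(acc));
}
```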
Would this be fixed if we set the delegation flag to false in magic-program right when undelegation is requested? @thlorenz @GabrielePicco
It would be an improper fix since we don't know whether the account has been properly undelegated yet.
There is a reason for this two-step process.
We will provide a more complete fix soon; this is just an unblocker.
Why is it improper? In the context of the ER it was undelegated; everything else is a matter of properly committing it. We shouldn't bring network- or Intent-service-related issues into the account state.
From the ER's perspective, an account is either delegated or not.
I think setting the delegation flag to false is necessary but not sufficient to properly handle all edge cases. We also need additional context to determine when an account is awaiting_undelegation, so we can correctly handle transactions that attempt to use that account while it is in the process of being undelegated versus after it has already been undelegated. In the first case, transactions in the ER should fail. In the second case, the cloning pipeline should "re-fetch" the account.
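To illustrate the suggestion, here is a hypothetical sketch of the tri-state tracking being described; `DelegationStatus`, `TxOutcome`, and `handle_account_use` are illustrative names, not part of the codebase.

```rust
/// Hypothetical tri-state delegation tracking (names are illustrative).
#[derive(Clone, Copy, PartialEq, Eq, Debug)]
enum DelegationStatus {
    Delegated,
    /// Undelegation was requested but its completion has not been
    /// observed on chain yet.
    AwaitingUndelegation,
    Undelegated,
}

#[derive(Debug)]
enum TxOutcome {
    /// Run the transaction against the account in the ephemeral rollup.
    Execute,
    /// Reject: the account is mid-undelegation and must not be used.
    Fail,
    /// The account left the ER; the cloning pipeline should re-fetch it
    /// from the base chain before the transaction can proceed.
    Refetch,
}

fn handle_account_use(status: DelegationStatus) -> TxOutcome {
    match status {
        DelegationStatus::Delegated => TxOutcome::Execute,
        DelegationStatus::AwaitingUndelegation => TxOutcome::Fail,
        DelegationStatus::Undelegated => TxOutcome::Refetch,
    }
}
```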
Actionable comments posted: 1
📜 Review details
Configuration used: CodeRabbit UI
Review profile: ASSERTIVE
Plan: Pro
📒 Files selected for processing (1)
test-integration/test-chainlink/tests/ix_remote_account_provider.rs (2 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
test-integration/test-chainlink/tests/ix_remote_account_provider.rs (1)
test-integration/test-tools/src/integration_test_context.rs (1)
airdrop(609-634)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
- GitHub Check: run_make_ci_lint
- GitHub Check: run_make_ci_test
- GitHub Check: run_make_ci_test
🔇 Additional comments (2)
test-integration/test-chainlink/tests/ix_remote_account_provider.rs (2)
110-117: Consider timing flakiness with fixed sleep durations. The test relies on fixed sleep durations (500ms and 1500ms) to allow subscription updates to propagate. These hardcoded values could lead to flaky test behavior on slower systems or under load.
Consider whether these durations are sufficient for all environments, or if there's a more robust way to verify subscription updates have been processed (e.g., polling with a timeout, or waiting for specific events).
Have you observed this test running reliably across different environments (CI, local development, various system loads)?
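One possible shape for such a polling helper, assuming the test can query the balance in a closure; `wait_until`, `fetch_balance`, and the constants below are illustrative, not the project's actual test utilities.

```rust
use std::thread::sleep;
use std::time::{Duration, Instant};

/// Poll `check` until it returns true or `timeout` elapses.
/// Returns true if the condition was met before the deadline.
fn wait_until(
    timeout: Duration,
    poll_every: Duration,
    mut check: impl FnMut() -> bool,
) -> bool {
    let deadline = Instant::now() + timeout;
    loop {
        if check() {
            return true;
        }
        if Instant::now() >= deadline {
            return false;
        }
        sleep(poll_every);
    }
}

fn main() {
    // Usage sketch: instead of a fixed 1500ms sleep, wait until the
    // subscription-driven balance reaches the expected value.
    let expected_lamports = 2_000_000_000u64; // hypothetical value
    let fetch_balance = || expected_lamports; // stand-in for the real query
    let ok = wait_until(Duration::from_secs(10), Duration::from_millis(100), || {
        fetch_balance() == expected_lamports
    });
    assert!(ok, "balance never reached the expected value");
}
```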
94-128: Clarify connection to PR objective. The PR objective describes fixing stuck accounts during undelegation, and the related changes in magicblock-chainlink/src/chainlink/mod.rs update the removal filter for accounts owned by dlp::id(). However, this test modification focuses on airdrop subscription updates rather than undelegation scenarios.
Could you clarify how these test changes relate to the undelegation fix? Is this testing a side effect of the chainlink changes, or validating that the subscription system properly handles state changes that could affect undelegation processing?
Adding a comment in the test explaining this connection would help reviewers and future maintainers understand the broader context.
Actionable comments posted: 1
📜 Review details
Configuration used: CodeRabbit UI
Review profile: ASSERTIVE
Plan: Pro
📒 Files selected for processing (1)
test-integration/test-cloning/tests/01_program-deploy.rs (1 hunk)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
- GitHub Check: run_make_ci_lint
- GitHub Check: run_make_ci_test
- GitHub Check: run_make_ci_test
- GitHub Check: run_make_ci_format
🔇 Additional comments (1)
test-integration/test-cloning/tests/01_program-deploy.rs (1)
181-181: Verify the rationale for increasing MAX_RETRIES. The retry limit increased from 20 to 50 (2.5x). While this may address timing changes from the stuck account fix, please verify that this change is intentional and not masking a deeper issue with upgrade propagation timing.
Consider documenting why the increase was necessary, especially if the stuck account fix impacts upgrade timing characteristics.
```diff
 ctx.wait_for_delta_slot_ephem(5).unwrap();

-let bump = remaining_retries - MAX_RETRIES + 1;
+let bump = (remaining_retries - MAX_RETRIES) + 1;
```
Fix the incorrect bump calculation formula.
The current formula produces negative values after the first iteration:
- Iteration 1: remaining_retries=50, bump = (50-50)+1 = 1 ✓
- Iteration 2: remaining_retries=49, bump = (49-50)+1 = 0
- Iteration 3: remaining_retries=48, bump = (48-50)+1 = -1 ✗
This results in messages like "Hola Mundo 0", "Hola Mundo -1", which is incorrect for a retry counter.
Apply this diff to fix the formula:

```diff
-let bump = (remaining_retries - MAX_RETRIES) + 1;
+let bump = (MAX_RETRIES - remaining_retries) + 1;
```

This will produce incrementing values: 1, 2, 3, 4, ... as expected for a retry counter.
🤖 Prompt for AI Agents
In test-integration/test-cloning/tests/01_program-deploy.rs around line 186, the
bump calculation uses (remaining_retries - MAX_RETRIES) + 1 which yields
negative or zero values after the first iteration; replace that formula with
(MAX_RETRIES - remaining_retries) + 1 so bump increments 1,2,3,... as retries
decrease.
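For context, here is a simplified sketch of the retry-loop shape the suggested formula targets, assuming `remaining_retries` counts down from `MAX_RETRIES`; the structure and constants are illustrative and do not reproduce the actual test.

```rust
// Illustrative only: the real test in 01_program-deploy.rs differs in
// structure; this shows why the corrected bump formula counts upward.
const MAX_RETRIES: u64 = 50;

fn main() {
    let mut remaining_retries = MAX_RETRIES;
    while remaining_retries > 0 {
        // Corrected formula: bump counts 1, 2, 3, ... as retries decrease.
        let bump = (MAX_RETRIES - remaining_retries) + 1;
        let msg = format!("Hola Mundo {bump}");
        println!("{msg}");
        // ... perform the upgrade check here and break on success ...
        remaining_retries -= 1;
    }
}
```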
Summary by CodeRabbit

Bug Fixes
- Improved account cleanup to include accounts stuck in intermediate or pending states, so eligible accounts are no longer skipped during removal, increasing reliability.

Tests
- Adjusted integration timing to ensure subscriptions reflect consecutive airdrops and processing delays for consistent balance updates.
- Increased the retry limit in deployment tests to reduce transient failures and improve test stability.

This PR fixes an edge case where accounts in the process of undelegation can get stuck if the validator misses the update of the undelegation completion. The fix ensures that such accounts are properly cleaned up on restart.

Details
The issue occurred when accounts were being undelegated but the process didn't complete while the validator was active, or the validator missed the completion update.