obront - Fault game factory can be manipulated to DOS game type using malicious l2BlockNumber
#90
Comments
Factually valid although the impact here isn't different than having any game resolve incorrectly which would poison the |
Based on scoping details below, I believe this issue is valid and in-scope of the contest, as the root cause stems from the lack of a sanity check within the dispute game factory allowing large l2BlockNumber values.
https://docs.google.com/document/d/1xjvPwAzD2Zxtx8-P6UE69TuoBwtZPbpwf5zBHAvBJBw/edit
The potential to block the fault proofs system entirely by preventing further creation of new games is significant, so I believe it warrants high severity given the potential to block withdrawals from an OP bridge. Although the admin can temporarily resolve this by switching game type, I believe it is not a valid solution given the attack can be easily repeated. |
just a thought @nevillehuang:
Is this actually true? If they change the game type, the new FaultDisputeGame implementation will be fixed and won't have this vulnerability so the attack can't be repeated. Because of that, the sponsor's comment seems to make the most sense and calling this a high severity issue is quite sus. |
Forwarding a comment from the protocol team -- This issue isn't valid because the decoupling of the L2 block number that's determined during output bisection and the one on the root claim is intentional.
Essentially a proposal of this form would be invalidated by the current fault proof system so the bug itself wouldn't be possible |
This bug operates under the assumption that the FP system can cause an invalid game to be resolved as valid, and there were multiple ways to do this in the contest (see #8). If this can occur, then no more dispute games can be created for the same game type, which will lead to DoS. The only possible solution is to update the game type, as pointed out by the comments above. |
In this case, one single invalid game resolution with a very large block number DOSes the whole game type. Updating the game type does not seem to be a long-term solution, as there are not many game types to update to. The fix is still to add proper validation for the block number, or the fix in this report can be used as well
|
I want to note that the attacker is risking the bonds. They will likely lose them if any honest party challenges them. |
Whether the attacker gets challenged or not is not in scope; the audit and report are under the assumption that the game can be resolved incorrectly,
and in case the game is resolved incorrectly, a massive DOS for the game type occurs as outlined in the report. |
To share my perspective here:
TLDR: This is a difficult case, because the issue should be in scope, but the outcome that it causes is no worse than the manual fixes that would happen when the safeguards work properly.
Severity: As much as I'd like to, I can't see a justification for High. The outcome does not seem bad enough.
Scope: This does seem to meet the definitions laid out in the scope document. The issue is in the in scope contracts, and the outcome (DOS of game type) should be sufficient for a Medium. However, it is a weird dynamic because when safeguards are used, it also causes a DOS of game type, so it seems strange that the same outcome could be a valid issue.
Conclusion: My assessment is that this should remain as a valid Medium, because the contest rules didn't rule out all game type DOS, only those caused by game contract logic. That being said, I recognize this is difficult to judge and respect whatever decision the judge makes. |
As the original well-written report highlights,
the impact of DOSing game creation and the game type means no user can finalize their withdrawal transaction or execute an L2 -> L1 message, which clearly leads to loss of funds and lock of funds, as multiple duplicates such as #206 highlight |
I explained my assessment based on that sponsor comment here:
#90 (comment)
On Fri, May 3, 2024 at 4:02 PM Haxatron wrote:
Hi,
Already given my reasoning above.
I will defer to @zobront and @JeffCX for any additional comments
Acknowledge this one is quite tricky to judge.
|
Emm, seems like this is saying that the game cannot be resolved incorrectly... but during judging, we marked the game resolution logic out of scope and used that argument to invalidate many issues. It is contradictory to use the argument "incorrect game resolution out of scope" to invalidate many other issues, while using the argument "game cannot be resolved incorrectly" to invalidate this issue. The comment strongly contradicts the README as well. From the README:
The report is a perfect derivation from the statement above, without worrying about the game dispute logic...
So I think the original judging decision still stands. |
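I actually don't understand why the sponsor's claim is correct. If it does make sense to anyone else, can someone please explain?
What I understand they're saying is that the game can't resolve correctly in this way. But I don't understand how that is possible if the extraData's l2Blocknumber is never actually used by the proof system? The only user of that value is AnchorStateRegistry and it doesn't validate it.
There is no check I can see that checks that extraData's l2Blocknumber is actually in the rootClaim in any way.
From the point of view of the proof system, the block number it is using is unrelated to the one later being passed to AnchorStateRegistry.
If so, isn't it the case that anyone can always frontrun the legitimate proposals and use the legitimate data, but pass in type(uint).max as l2BlockNumber? Isn't that a perpetual DoS of the system?
What am I missing here? |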
If I am not wrong, it will be used in VM.step() after adding the block
number into the preimage oracle via addLocalData()
On Sun, 5 May 2024, 10:04 Guhu wrote:
I actually don't understand why the sponsor's claim (#90 (comment)) is correct. If it does make sense to anyone else, can someone please explain?
What I understand they're saying is that the game can't resolve correctly in this way. But I don't understand how that is possible if the extraData's l2Blocknumber is never actually used by the proof system? The only user of that value is AnchorStateRegistry and it doesn't validate it.
There is no check I can see that checks that extraData's l2Blocknumber is actually in the rootClaim in any way.
From the point of view of the proof system, the block number it is using is unrelated to the one later being passed to AnchorStateRegistry.
If so, isn't it the case that anyone can always frontrun the legitimate proposals and use the legitimate data, but pass in type(uint).max as l2BlockNumber? Isn't that a perpetual DoS of the system?
What am I missing here?
|
Apologies, you are correct, the L2 block number referenced on that line is the anchor root block number rather than the l2 block number passed via the extraData. Perhaps, this requires more clarification from protocol team... |
@Evert0x @nevillehuang it looks like the sponsor's argument for this being invalid is not well understood. The finding is also marked "won't fix", so if it is valid, it is not mitigated. It would appear that a mitigation would be needed at both game level and at the registry level, as fixing one without the other would leave the other vulnerable. Can @smartcontracts maybe have another look at this, and possibly address the questions in #90 (comment)? |
After a discussion with the protocol it's clear that this issue should be a valid Medium |
So permissionless shutdown of withdrawals/messaging until redeploy (Freeze of Funds of 2 weeks) is considered a Medium on Sherlock? Was the magnitude of effect this would have on the Optimism ecosystem considered? A 1-hour shutdown of Blast was the most talked about incident for months, how would a 2-week FoF be interpreted? |
In addition to the prolonged DoS, the DoS appears to be repeatable, and to require changes to the registry and portal, not just an update to the game. For the DoS NOT to be repeatable, the fix must be possible at the game level, such that updating the game type is sufficient. But that doesn't appear to be the case.
Please see these (new) fix PRs in OP dealing with this issue: https://github.com/ethereum-optimism/optimism/pull/10431/files, https://github.com/ethereum-optimism/optimism/pull/10434/files. They add extensive changes to both the ASR and the Portal to deal with the l2BlockNumber issue. The changes to the game are minimal, and it appears that this issue is NOT fixable by just updating the game implementation. [I suspect this is because in the game, the l2blocknumber is PART of disputed L2 output state, so is not a mutually agreed on external input, unlike the L1 number, which comes directly from the L1]
Summary: because the issue is not fixable by updating ONLY the game, and an upgrade of the ASR and Portal is needed, the safety measures are inadequate and the DoS WITHOUT the full fix is repeatable. |
this incident has not even happened 2 months ago and people cared about it for a few days at most |
@Evert0x can you elaborate why? (for those of us who weren't on that discussion) |
The justification for Medium severity is as follows
As stated in the README, part of the security model is an honest and responsive admin that can recover from a DoS within 72 hours.
The supplied block being higher than the actual block number in the EVM is obviously detectable malicious activity. In conclusion, once the proposal is detected, the Proxy Admin Owner is trusted to be responsive within 72 hours and is able to switch the game type to a permissioned implementation within a new AnchorStateRegistry to mitigate the DoS. Note: It’s important to note that there’s a difference between switching the game type (which invalidates all withdrawals with that game type) and switching the implementation (which does not invalidate withdrawals). |
@Evert0x but can switching to a permissioned implementation be considered final "recovery"? If assumed permanent - it permanently breaks core functionality (no fraud proofs from that point). If assumed temporary - it only postpones the switch of the game type and the withdrawals DoS. |
This downplayed take is plagued with intellectual dishonesty.
If it was obvious as something to look for, Opt would have validated that the l2BlockNumber is the same as the VM block number. Detection is highly improbable. Please provide the defender off-chain code to show awareness of this vulnerability. Once again the benefit of the doubt is given to an opaque statement by the sponsor and against honest Watsons. For fairness of discussion, it must be assumed Opt is aware only at the moment games cannot be created.
For the past month, during which Optimism had access to the repo, their suggested fix was moving to a new game type, confirming the 2 week DOS. Only a couple of days ago came the idea of overriding the same game type to avoid the DOS. Using this to reduce severity is unacceptable. It essentially extended their 3 day SLA to 1 month, letting them theorize over the best response for a tremendously long time and then argue the optimized response would be what they would be rolling with on day 1. Clear intellectual dishonesty. Additionally, from an air-gap perspective (up to High according to the README), the resolution and updating of the anchor state registry is instantaneous, making new withdrawals impossible from day 0 and bypassing intended airgaps. I will also state that over the past week Optimism has catapulted a variety of arguments against the submission which were technically proven wrong, showing they have no problem misrepresenting an issue or its characteristics in order to reduce its severity. |
I don't know why the Optimism team is trying to find some weird loopholes to argue for a downgrade if they could just use the official Sherlock docs to justify it: according to https://docs.sherlock.xyz/audits/judging/judging#v.-how-to-identify-a-medium-issue : about DOS: https://docs.sherlock.xyz/audits/judging/judging#iii.-sherlocks-standards
To be high severity, 1 is true and 2 is questionable if we assume that the admin deploys a new game type in time, so funds can be recovered and users can just make a new game instance, where these functions are available again. I don't see any air-gap bypass unless we use different definitions. My understanding is that the air-gap is the delay before a withdrawal of funds can happen, and it's not possible for users to withdraw early |
@guhu95 It's not a final recovery, but safety mechanisms are put in place first to mitigate the DoS and, secondly, to remove the DoS factor. |
@trust1995 Forwarding from the protocol team the detection code for this case.
So in a nutshell the monitoring service is here: https://github.com/ethereum-optimism/optimism/blob/5137f3b74c6ebcac4f0f5a118b0f4909df03aec6/op-dispute-mon/mon/monitor.go#L87
This service calls out to a forecasting function which checks the L2 block number and the claim provided against the real output root for that block number: https://github.com/ethereum-optimism/optimism/blob/5137f3b74c6ebcac4f0f5a118b0f4909df03aec6/op-dispute-mon/mon/forecast.go#L69
Claimed L2 block number and output are pulled from the game’s metadata: https://github.com/ethereum-optimism/optimism/blob/5137f3b74c6ebcac4f0f5a118b0f4909df03aec6/op-dispute-mon/mon/extract/extractor.go#L54
So in the case of that bug, the service would try to get the block number for the future block that doesn’t exist yet, get the following error, disagree, and raise an alert: https://github.com/ethereum-optimism/optimism/blob/5137f3b74c6ebcac4f0f5a118b0f4909df03aec6/op-dispute-mon/mon/validator.go#L40 |
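For readers who do not want to follow the Go links, the detection flow described in the comment above can be approximated with a small Python sketch. This is not the op-dispute-mon code: every name here (BlockNotFound, make_output_fetcher, check_game, monitor) is hypothetical, and the node lookup is faked with a dictionary. It only illustrates the idea that a claim about a block the node has never seen produces a lookup error, which is treated as a disagreement and surfaced as an alert.

```python
class BlockNotFound(Exception):
    """Raised by the hypothetical node client when a block does not exist yet."""


def make_output_fetcher(known_outputs):
    """Return a fake lookup from L2 block number to output root (assumption: a dict)."""
    def fetch(block_number):
        if block_number not in known_outputs:
            raise BlockNotFound(block_number)
        return known_outputs[block_number]
    return fetch


def check_game(game, fetch_output_root):
    """Return (agree, reason) for one game's claimed block number and root."""
    try:
        expected_root = fetch_output_root(game["l2_block_number"])
    except BlockNotFound:
        # A claim about a block the node has never seen cannot be confirmed.
        return False, "claimed block does not exist"
    if expected_root != game["root_claim"]:
        return False, "root claim differs from local output root"
    return True, "ok"


def monitor(games, fetch_output_root):
    """Collect an alert for every game the local view disagrees with."""
    alerts = []
    for game in games:
        agree, reason = check_game(game, fetch_output_root)
        if not agree:
            alerts.append((game["id"], reason))
    return alerts


fetch = make_output_fetcher({100: "0xaaa", 101: "0xbbb"})
games = [
    {"id": "honest", "l2_block_number": 101, "root_claim": "0xbbb"},
    {"id": "future-block", "l2_block_number": 2**256 - 1, "root_claim": "0xbbb"},
]
alerts = monitor(games, fetch)
print(alerts)
```

In this toy run only the game claiming block 2**256 - 1 is flagged, even though its root claim matches a real block's output root. Whether such an alert leads to a timely response is the separate question debated in the thread.
|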
@spearfish5609 I don't think I'm using weird loopholes to decide on the severity of this issue. It's not always clear if the DoS should be judged as indefinite just because the admin can recover from it. However, in this case, the language in the README makes it clear. |
The protocol team fixed this issue in the following PRs/commits: |
obront
medium
Fault game factory can be manipulated to DOS game type using malicious l2BlockNumber
Summary
All new games are proven against the most recent L2 block number in the ANCHOR_STATE_REGISTRY. This includes requiring that the block number we are intending to prove is greater than the latest proven block number in the registry. Due to insufficient validations of the passed L2 block number, it is possible for a user to set the latest block to type(uint256).max, blocking all possible future games from being initialized.
Vulnerability Detail
New games are created for a given root claim and L2 block number using the factory, by cloning the implementation of the specified game type and passing these values as immutable args (where _extraData is the L2 block number).
As a part of the initialize function, we pull the latest confirmed root and rootBlockNumber from the ANCHOR_STATE_REGISTRY. These will be used as the "starting points" for our proof. In order to confirm they are valid starting points, we require that the L2 block number we passed is greater than the last proven root block number.
However, the L2 block number we pass does not appear to be sufficiently validated. If we look at the Fault Dispute Game, we can see that disputed L2 block number passed to the oracle is calculated using the _execLeafIdx and does not make any reference to the L2 block number passed via extraData:
This allows us to pass an L2 block number that is disconnected from the proof being provided.
After the claim is resolved, we update the ANCHOR_STATE_REGISTRY to include our new root by calling tryUpdateAnchorState().
As long as the L2 block number we passed is greater than the last proven one, we update it with our new root. This allows us to set the ANCHOR_STATE_REGISTRY to contain an arbitrarily high blockRootNumber.
If we were to pass type(uint256).max as this value, it would be set in the anchors mapping, and would cause all other games to fail to initialize, because there is no value they could pass for the L2 block number that would be greater, and would therefore fail the check described above.
Proof of Concept
The following test can be dropped into DisputeGameFactory.t.sol to demonstrate the vulnerability:
Impact
For no cost, the factory can be DOS'd from creating new games of a given type.
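The mechanism described under Vulnerability Detail can also be illustrated off-chain. The following is a simplified Python model, not the Solidity test from the report and not the actual contract code: the class and function names are stand-ins, the anchor is reduced to a (root, block number) pair per game type, and the model simply assumes the malicious game resolves in the attacker's favor, which is the premise of the report.

```python
UINT256_MAX = 2**256 - 1
GAME_TYPE = 0  # hypothetical game type id


class AnchorStateRegistry:
    """Toy registry: one (root, block number) anchor per game type."""

    def __init__(self, genesis_root, genesis_block):
        self.anchors = {GAME_TYPE: (genesis_root, genesis_block)}

    def try_update_anchor_state(self, game):
        # As described above: a resolved game replaces the anchor only if
        # its claimed block number is strictly newer than the stored one.
        _, anchor_block = self.anchors[game["game_type"]]
        if game["l2_block_number"] > anchor_block:
            self.anchors[game["game_type"]] = (game["root_claim"], game["l2_block_number"])


def create_game(registry, game_type, root_claim, l2_block_number):
    """Model of the initialize check: the claimed block must exceed the anchor block."""
    if not 0 <= l2_block_number <= UINT256_MAX:
        raise ValueError("l2_block_number must fit in a uint256")
    _, anchor_block = registry.anchors[game_type]
    if l2_block_number <= anchor_block:
        raise ValueError("l2_block_number is not above the anchor block")
    return {"game_type": game_type, "root_claim": root_claim, "l2_block_number": l2_block_number}


def can_create(registry, l2_block_number):
    """True if an honest game for this block number would initialize."""
    try:
        create_game(registry, GAME_TYPE, "0xhonest", l2_block_number)
        return True
    except ValueError:
        return False


registry = AnchorStateRegistry("0xgenesis", 100)
before = can_create(registry, 101)

# The attacker's block number is never tied to the proof, so any value passes.
malicious = create_game(registry, GAME_TYPE, "0xattacker", UINT256_MAX)
registry.try_update_anchor_state(malicious)  # assumption: game resolved for the attacker

after = [can_create(registry, n) for n in (101, 10**9, UINT256_MAX)]
print(before, after)
```

Once the anchor block equals the largest uint256, no block number can be strictly greater, so every later creation attempt for that game type fails in this model.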
Code Snippet
https://github.com/sherlock-audit/2024-02-optimism-2024/blob/main/optimism/packages/contracts-bedrock/src/dispute/FaultDisputeGame.sol#L528-L539
https://github.com/sherlock-audit/2024-02-optimism-2024/blob/main/optimism/packages/contracts-bedrock/src/dispute/AnchorStateRegistry.sol#L59-L87
Tool used
Manual Review
Recommendation
In order to ensure that ordering does not need to be preserved, ANCHOR_STATE_REGISTRY should store a mapping of claims to booleans. This would allow users to prove against any proven state, instead of being restricted to proving against the latest state, which could be manipulated.
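One possible reading of this recommendation is sketched below in Python. It is an illustration under stated assumptions, not the fix the protocol team shipped: the ClaimSetRegistry class, its method names, and the choice to key the mapping by (root, block number) are all hypothetical. The point it shows is that when a game names the proven state it starts from, a single entry with an absurd block number no longer blocks games that start from an older, legitimate state.

```python
class ClaimSetRegistry:
    """Toy registry that maps proven (root, block number) claims to True."""

    def __init__(self, genesis_root, genesis_block):
        self.proven = {(genesis_root, genesis_block): True}

    def record_resolved(self, root, block_number):
        # No ordering requirement: any resolved claim is simply recorded.
        self.proven[(root, block_number)] = True

    def is_proven(self, root, block_number):
        return self.proven.get((root, block_number), False)


def create_game(registry, start_root, start_block, root_claim, l2_block_number):
    """A game picks its own proven starting state instead of 'the latest' one."""
    if not registry.is_proven(start_root, start_block):
        raise ValueError("starting state has not been proven")
    if l2_block_number <= start_block:
        raise ValueError("l2_block_number is not above the starting block")
    return {"start": (start_root, start_block), "root_claim": root_claim,
            "l2_block_number": l2_block_number}


registry = ClaimSetRegistry("0xgenesis", 100)

# Assumption: a malicious game with a huge block number resolved incorrectly.
registry.record_resolved("0xattacker", 2**256 - 1)

# An honest proposer can still start from the genesis anchor.
honest = create_game(registry, "0xgenesis", 100, "0xhonest", 101)
print(honest)
```

The poisoned entry is still recorded in this sketch, so it removes the ordering dependence rather than the bad claim itself; handling the bad claim would need separate validation of the block number, as discussed in the comments.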