derivation: remove EngineQueue
#10643
Might need a rebase to fix the finalize-test flake.
Trying to debug what is wrong with
Rebased on the rebased base.
The op-program-compat test is failing because it now reads an additional preimage compared to before. We can regenerate that test if needed and it will grab the extra preimages required, but it's slightly suspicious that the behaviour changed. The extra read is in the L2 state trie, though, and we have seen that hash map order can sometimes cause an extra read there in the case of a deletion (not an issue in cannon, but it can be when running natively) - odd that we haven't hit it for that test before, given how much it's been run since it went in. @Inphi any chance you can look at the failing test and quickly tell whether it is this random difference or not? It could just be a new read op-node is doing, which isn't an issue (fault proofs use a fixed version of op-program, so repeatability isn't a concern as long as this new behaviour is deterministic).
Ok I'm hitting that test failure locally as well with the
Code looks sane, though I need to dig into it further to follow the actual flow. I've captured fresh test data for the mainnet-genesis case - on this branch it winds up with 167758 preimages and on develop it has 167728. It quite consistently fails with the existing data and passes with the new data. So it is requesting something new, but it's not going crazy requesting a ton of extra data, which is good. Looking at the difference in the logs, I think it's not stopping derivation when it should and winds up importing a bunch more blocks. The old log stops at:
but the new one stops at:
And it's while importing one of those additional blocks that it winds up needing the extra L2 state nodes. It ultimately works because
I'll take a closer look later. But from glancing at the stacktrace, I see that the missing trie node was fetched during deletion:
It seems similar to the Go map iteration order issue.
Thanks @Inphi, I dug a bit deeper after my initial comment and it is because op-program is now processing more blocks than before - these state accesses happen because one of those blocks contains a transaction that accesses them. So thankfully not the iteration order issue.
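For context on the iteration-order concern that was ruled out here: Go randomizes map iteration order, so code that walks a map and performs one read per key can touch those keys in a different order from run to run. The sketch below is not the trie code from this PR. It uses a hypothetical set of pending deletions and a hypothetical helper `sortedKeys` to show the usual remedy of sorting the keys before iterating.

```go
package main

import (
	"fmt"
	"sort"
)

// sortedKeys returns the keys of m in ascending order, so callers get a
// stable order regardless of Go's randomized map iteration.
// (Hypothetical helper, for illustration only.)
func sortedKeys(m map[string]struct{}) []string {
	keys := make([]string, 0, len(m))
	for k := range m {
		keys = append(keys, k)
	}
	sort.Strings(keys)
	return keys
}

func main() {
	// Hypothetical set of keys scheduled for deletion.
	pending := map[string]struct{}{"0xc3": {}, "0xa1": {}, "0xb2": {}}

	// Ranging over pending directly may visit the keys in any order;
	// ranging over the sorted slice always visits them in the same order.
	for _, k := range sortedKeys(pending) {
		fmt.Println("delete", k)
	}
}
```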
This looks good to me - we just need to put in a fix for op-program not stopping at the target block. I'd suggest just having the `if head.number >= d.targetBlockNum` from the op-program driver.go run only if there isn't an error from the pipeline rather than on `NotEnoughData`. That actually picks up the new block slightly earlier even in the old code and avoids depending on implementation details - typically a `nil` return value means the `Step` made some progress, so we should check if we've reached the target block.
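A minimal sketch of that suggestion, under stated assumptions: the `stepper` interface, the `driver` type, the `errNotEnoughData` sentinel and the `fakePipeline` are all hypothetical stand-ins, not the real op-program or op-node types. It only illustrates the control flow, where the target-block check runs after every step that returns no error instead of being tied to the not-enough-data signal.

```go
package main

import (
	"errors"
	"fmt"
	"io"
)

// errNotEnoughData is a hypothetical stand-in for the pipeline's
// NotEnoughData signal.
var errNotEnoughData = errors.New("not enough data")

// stepper is a hypothetical stand-in for the derivation pipeline.
type stepper interface {
	Step() error
	Head() uint64
}

// driver is a hypothetical, simplified program driver.
type driver struct {
	pipeline       stepper
	targetBlockNum uint64
}

// run steps the pipeline until the target block is reached or the
// pipeline is exhausted (io.EOF). It returns the last head it saw.
func (d *driver) run() (uint64, error) {
	for {
		err := d.pipeline.Step()
		switch {
		case err == nil:
			// A nil error means the step made progress, so this is
			// where the target-block check belongs.
			if head := d.pipeline.Head(); head >= d.targetBlockNum {
				return head, nil
			}
		case errors.Is(err, errNotEnoughData):
			continue // nothing new yet, step again
		case errors.Is(err, io.EOF):
			return d.pipeline.Head(), nil // no more data to derive from
		default:
			return d.pipeline.Head(), err
		}
	}
}

// fakePipeline alternates between "not enough data" and deriving one
// block, until it has derived `available` blocks.
type fakePipeline struct {
	head, available uint64
	calls           int
}

func (p *fakePipeline) Step() error {
	p.calls++
	if p.head >= p.available {
		return io.EOF
	}
	if p.calls%2 == 1 {
		return errNotEnoughData
	}
	p.head++
	return nil
}

func (p *fakePipeline) Head() uint64 { return p.head }

func main() {
	d := &driver{pipeline: &fakePipeline{available: 10}, targetBlockNum: 3}
	head, err := d.run()
	fmt.Println("stopped at block", head, "err:", err)
}
```

With the check placed in the `nil` branch, the driver stops on the very step that derived the target block and does not depend on how many not-enough-data results the pipeline happens to return in between.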
Codecov Report
All modified and coverable lines are covered by tests ✅
Additional details and impacted files
@@             Coverage Diff              @@
##           develop   #10643       +/-   ##
============================================
+ Coverage    54.65%   81.92%   +27.27%
============================================
  Files           37       10       -27
  Lines         2944     1079     -1865
  Branches       415        0      -415
============================================
- Hits          1609      884      -725
+ Misses        1303      163     -1140
  Partials        32       32
Rebased on develop. Reviewing the op-program issue and the comments above now.
Rebased on develop, trying to deal with Fjord flakes. At least the op-program tests seem to be passing.
LGTM.
* op-node: remove engine queue
  (squashed) remove debug line
* op-node: test VerifyNewL1Origin
* op-node: engine-queue removal review fixes
Description
Removal of the engine-queue.
This decouples the execution-engine from the L2 derivation-pipeline, which now outputs L2 block attributes.
The "sync deriver" (in earlier PRs, the `syncStep` function) encapsulates the synchronous sequence of sub-derivers, still preserving the old behavior.
The `DerivationPipeline` no longer knows about anything related to the `EngineController`, making it possible to schedule it asynchronously to the other derivation work.
Builds on top of #10642
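The decoupling described above can be pictured with a small sketch. Everything in it is an assumption made for illustration: the `attributes`, `pipeline`, `engine` and `syncDeriver` types are hypothetical and far simpler than the real op-node types. The point is only the shape: the pipeline produces attributes without referencing the engine, and a separate synchronous step wires the two together in a fixed order.

```go
package main

import "fmt"

// attributes is a hypothetical stand-in for L2 block attributes.
type attributes struct{ blockNum uint64 }

// pipeline only produces attributes; it holds no reference to the engine.
type pipeline struct{ next uint64 }

func (p *pipeline) Step() attributes {
	p.next++
	return attributes{blockNum: p.next}
}

// engine consumes attributes and tracks its own head.
type engine struct{ head uint64 }

func (e *engine) Apply(a attributes) error {
	if a.blockNum != e.head+1 {
		return fmt.Errorf("unexpected block %d after head %d", a.blockNum, e.head)
	}
	e.head = a.blockNum
	return nil
}

// syncDeriver runs the sub-steps in a fixed, synchronous order:
// first derive attributes, then hand them to the engine.
type syncDeriver struct {
	p *pipeline
	e *engine
}

func (s *syncDeriver) SyncStep() error {
	attrs := s.p.Step()
	return s.e.Apply(attrs)
}

func main() {
	s := &syncDeriver{p: &pipeline{}, e: &engine{}}
	for i := 0; i < 3; i++ {
		if err := s.SyncStep(); err != nil {
			fmt.Println("sync step failed:", err)
			return
		}
	}
	fmt.Println("engine head:", s.e.head)
}
```

Because only `syncDeriver` knows about both sides in this sketch, the two halves could later be scheduled independently without changing the pipeline itself.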
Tests
Minor test changes:
Additional context
Completes phase 1a of op-node derivers design-doc
Metadata
Fix https://github.com/ethereum-optimism/protocol-quest/issues/272