Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Remove EEI from EVM v2 #2649

Merged
merged 28 commits into from May 11, 2023
Merged

Remove EEI from EVM v2 #2649

merged 28 commits into from May 11, 2023

Conversation

jochem-brouwer
Copy link
Member

@jochem-brouwer jochem-brouwer commented Apr 18, 2023

Follow-up of #2604

The goal of this PR is to remove the EEI. All the logic (except blockchain logic) can be moved into StateManager.

This PR:

  • Removes EEI
  • Creates an interface in Common
  • Setups a mock blockchain in EVM by default (blockchain in EVM only used for BLOCKHASH opcode: mockchain always returns zeros(32))

Follow-up PRs;

  1. Move all the touched/warm account journaling and friends into a journaler class
  2. All touched/warm accounts logic is now removed (no separate journals). These are instead tracked from the account and storage cache

Another PR which will build on top of this but can be done at any place before above PRs;

  • Ensure EVM runs ethereum/tests state tests (so remove @ethereumjs/vm dev dependency)

TODO in this PR:

  • Remove enableDefaultBlockchain logic
  • Move EVM logic from DefaultStateManager into EVM
  • Update StateManagerInterface such that it only exposes the "minimal" interface

evm: import eei

vm: update changes to evm eei

evm: fix test syntax

evm: ensure default blockchain is loaded in tests

vm/evm: fix vm test such that they run [no ci]

vm: fix vm copy

client/vm: fix build

vm: fix test runner

evm: fix example

common: fix interface function sig

stateManager/vm: fix tests

client: fix tests

client: fix tests

evm: fix example
@jochem-brouwer jochem-brouwer added the type: test all hardforks This special label enables VM state and blockchain tests for all hardforks on the respective PR. label Apr 23, 2023
@codecov
Copy link

codecov bot commented Apr 23, 2023

Codecov Report

❗ No coverage uploaded for pull request base (develop-v7@8b84f5d). Click here to learn what that means.
The diff coverage is n/a.

Additional details and impacted files

Impacted file tree graph

Flag Coverage Δ
block 90.30% <0.00%> (?)
blockchain 90.48% <0.00%> (?)
client 86.86% <0.00%> (?)
common 96.06% <0.00%> (?)
devp2p 91.74% <0.00%> (?)
ethash ∅ <0.00%> (?)
evm 79.49% <0.00%> (?)
rlp ∅ <0.00%> (?)
statemanager 81.12% <0.00%> (?)
trie 90.02% <0.00%> (?)
tx 95.50% <0.00%> (?)
util 81.34% <0.00%> (?)
vm 81.36% <0.00%> (?)

Flags with carried forward coverage won't be shown. Click here to find out more.

Copy link
Contributor

@acolytec3 acolytec3 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Have just done an initial look at things based on the current state of the PR. Looks generally pretty good!

const interpreter = new Interpreter(this, this.eei, env, message.gasLimit)
const interpreter = new Interpreter(
this,
this.stateManager,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it really necessary to pass this.stateManager and this.blockchain in this constructor since they are both properties of this (i.e. the EVM object)?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Interpreter does currently not have access to EVM currently IIRC (am on mobile)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it seems to have as far the the Interpreter constructor goes, but the question is will there be a usecase where to pass a different statemanager than the evm's?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, using EthersStateManager for instance :)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you are missing the point here, the question was not about using another StateManager but a different one for EVM and StateManager, so if we need to pass both „this“ and „this.StateManager“, where I would also think the answer is clearly: not.

(„This“ is the EVM, just not to overlook)

I think for this.blockchain question goes as well.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think I understand the question. The passed StateManager is part of EVM constructor, and Interpreter also needs access to it?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, but we do not need to pass this extra on since we already pass this (the EVM) so we can do evm.stateManager or evm.blockchain inside Interpreter and so do not need three Interpreter constructor paramters for this, that's is the whole point here. 🙂

// TODO: Switch to diff based version similar to _touchedStack
// (_accessStorage representing the actual state, separate _accessedStorageStack dictionary
// tracking the access diffs per commit)
protected _accessedStorage: Map<string, Set<string>>[]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How much of (if any) this caching/journaling stuff is duplicating the various bits of caching that has built out in the recent caching work merged into v7? I only ask because it feels like we're duplicating a lot of work here (though I do recognize that the stateManager is doing different work with the caching it does). What I'm wondering if is we need a generalized caching mechanism in the stateManager when we are tracking the state changes in each call frame as a checkpoint through the journaling here so seems like there's an opportunity to streamline caching.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes I agree we should check if we can generalize it. Note that the PR is still super WIP. I am now at the phase where I can really start refactoring out things from the old statemanager.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds good! Was just giving this an early look

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yup, great and thanks 😄 👍

@g11tech
Copy link
Contributor

g11tech commented May 1, 2023

Note: state manager tests fail, because EthersStateManager does not have the now newly required methods which are necessary in the state manager if you want to run it in EVM.

Shall we temporarily disable those tests..?

ha ha this was the exact question that was coming in my mind: what now happens to/implications are for EthersStateManager
since the EVM's intermittent state (previous vmState with checkpoints, warmed addresses etc) is now part of the state manager (which makes sense, but should the underlying committed data access layer be also carved out into a separate class, which can be plugged by EthersStateManager)

If we have an answer to how to best handle this, we can disable EthersStateManager to be refactored and re-activated in a followup PR.

@jochem-brouwer
Copy link
Member Author

Yup, so we can basically go two ways,

  1. Create a new BaseStateManager which handles this so we can just super.method() on both state managers (Holger does not like this since we just removed it)
  2. Create a new class which handles this. The methods should call into this class. We can create some subclass like StateManager.EVMJournal / EthersStateManager.EVMJournal and call directly into that from EVM.

@jochem-brouwer
Copy link
Member Author

Wait, there is also a third option, which is moving the journal logic inside EVM. 🤔 Which upon thought might be the cleanest..? Thoughts?

@g11tech
Copy link
Contributor

g11tech commented May 1, 2023

Wait, there is also a third option, which is moving the journal logic inside EVM. thinking Which upon thought might be the cleanest..? Thoughts?

I like this the best and to restrict state manager to dealing with committed data to provide us nice clean abstractions to do things like EthersStateManager (or PortalNetworkStateManager) etc

@jochem-brouwer
Copy link
Member Author

I think I agree, I don't know why I had not thought about this earlier. It makes more sense to move this EVM-related journal logic inside EVM and not outsource it to another package. When I find time I will move it into EVM 😄

@@ -356,7 +346,7 @@ async function _runTx(this: VM, opts: RunTxOpts): Promise<RunTxResult> {
if (opts.skipBalance === true && fromAccount.balance < maxCost) {
// if skipBalance, ensure caller balance is enough to run transaction
fromAccount.balance = maxCost
await this.stateManager.putAccount(caller, fromAccount)
await this.stateManager.putAccount(caller, fromAccount, true)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there any false case here?

(so is the parameter necessary?)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

putAccount defaults to touched = false. This fixes edge cases like: generating canonical accounts where empty accounts are put in trie (note that if we have empty account at address A, where this account exists in trie B and not in trie C, we get a different state root). In previous versions where putAccount on EEI was called, this would touch the account (and generateCanonicalGenesis called this). This would mark these empty accounts as touched. If you now execute a tx and you do not touch these accounts, they get wiped in the end anyways.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, from an API standpoint this is not so beautiful to have this extra touched option which puts the burden to think about what to put there also for more generic StateManager users, or keep too different method signatures, which is also not so nice.

I would think that it's worth to think about if we can just track these (very very few I guess) exceptions EVM internally in an inside data-structure _nonTouched or so and then also handle internally? 🤔

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep, this is the general idea also thus promoting the idea of moving this EVM logic which now resides in the StateManager (before this PR, it was inside EEI) back into EVM to track it there and not let StateManager track this EVM logic.

@holgerd77
Copy link
Member

I think I agree, I don't know why I had not thought about this earlier. It makes more sense to move this EVM-related journal logic inside EVM and not outsource it to another package. When I find time I will move it into EVM 😄

Can we take on this part a bit slow, from a performance POV it is relatively crucial that we get this right and I‘m still not 100% convinced that this could not just „fall out from the cache“, especially things like getOriginalContractStorage().

when I did some measurements these things were insanely slow and if we keep unnecessary double structures here we will never able to fully optimize this eventually.

Maybe rather something for next week, but eventually let’s do another call here.

@holgerd77
Copy link
Member

I have to say after two more days of thinking about this I am slightly panicking, haha. We are so close here and with moving this back to the EVM we would loose all the potential performance gains. This is not a smallish thing, since all this state stuff is so delicate and "performance impactful", I would roughly estimate this can lead to up to 1/4 to 1/3 of overall performance gain for VM runs.

I would still think all the VM specific code should now "fall automatically" [TM] out of the cache and most/all VM-specific state code can be somewhat considered "garbage" and be thrown away.

Side note: for EthersStateManager (and others in the future) for this to work, we need to obviously (?) in a first round have ESM also use the generic caches and throw the proprietary custom stuff away, but we would have neede to do this anyhow.

Anyhow: I will give this a try now and see how far I will come (no worries, in a separate PR 🙂).

If I realize that I am hunting ghosts here we can happily throw my stuff away again!

@holgerd77
Copy link
Member

@jochem-brouwer Update: I will just write down my thoughts and design suggestions in a first round to discuss, this is just too much stuff and touching too many thing to "just do". 🙂

Then we can discuss and eventually work on it together during the next week!

@holgerd77
Copy link
Member

Another update: ok, I am trippleing down step by step, this overloads me right now to be plainly honest. I can now also definitely say that moving to EVM is definitely the better option than the current status quo with this too overloaded StateManager interface. On the other hand I am seeing so many things which are just quite as the same as what I have done in the caches. Hmm.

I guess we are just coming from these very different angles right now, I am from my caching implementation experiences and additionally all these performance experiments, you being the expert on all these touched/warmed/EVM stuff.

I think it's likely most productive if we just do a longer call on this early next week (directly Monday or Tuesday?, both ok for me) than we can exchange the different perspectives and brainstorm a bit on this together.

Will lay this aside for now.

@jochem-brouwer
Copy link
Member Author

I agree that we come from different angles, but don't get me wrong, I care a lot about performance as well. I think the fact that we are both working on these things (so: me extracting EEI, and you working on the performance/caches) is a bit doing work which touches the same stuff and therefore we might both get a bit confused on what to do next.

Shall we do a call Monday afternoon? @holgerd77

@holgerd77
Copy link
Member

Hmm, since this will likely be a substantial round of additional work, I wonder: should we just merge this in "as is", eventually comment out 1-2 tests or so or fix some last stuff? 🤔

Then we can finally do this develop-v7 merge and bring this a bit forward, otherwise I fear that we'll always loose this point again.

@jochem-brouwer
Copy link
Member Author

We can do this, I will move the cleanup work into another PR.

@jochem-brouwer jochem-brouwer marked this pull request as ready for review May 11, 2023 13:43
Copy link
Contributor

@acolytec3 acolytec3 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@acolytec3 acolytec3 merged commit 32ffec2 into develop-v7 May 11, 2023
69 checks passed
@holgerd77 holgerd77 deleted the remove-eei-rebase-v2 branch May 12, 2023 09:49
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
package: common package: evm package: statemanager package: vm PR state: needs review type: test all hardforks This special label enables VM state and blockchain tests for all hardforks on the respective PR.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants