RFP Proposal: Tool for applying test vectors from Ethereum on FEVM #1202

Open
wenxinnnnn opened this issue Dec 5, 2022 · 22 comments

Comments

@wenxinnnnn

wenxinnnnn commented Dec 5, 2022

RFP Proposal: Tool for applying test vectors from Ethereum on FEVM

Name of Project: Tool for applying test vectors from Ethereum on FEVM

Link to RFP: fvm-ethereum-test-vectors

RFP Category: devtools-libraries

Proposer: FroghubMan

Do you agree to open source all work you do on behalf of this RFP and dual-license under MIT and APACHE2 licenses?: Yes

Project Description

FVM 2.1, which introduces support for Ethereum smart contracts in the Filecoin network (FEVM), will be the biggest network upgrade since mainnet launch. At the same time, it immensely widens the area of possible problems, so it must be tested thoroughly.
One way to test FEVM is to run existing, proven smart contracts from Ethereum and compare the state they produce on Ethereum vs Filecoin.

We are seeking proposals from teams that can build such a testing tool. This tool is expected to be in the form of a CLI.
It should be able to:

  1. Fetch and store on the order of hundreds of contracts from Ethereum mainnet. These should be widely used contracts covering various use-cases.
  2. Deploy these contracts to a Filecoin testnet (Butterflynet).
  3. Fetch thousands of transactions sent to these contracts on Ethereum in the past.
  4. Replay these transactions to contracts deployed on Filecoin.
  5. Compare the Ethereum state before and after these transactions with the results on Filecoin. We expect the team to research the ideas of how to do state comparison reliably.

We want to run this tool a few times if needed and fix encountered errors every time.
This tool can be written in either Go or Rust.
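
As an illustration of step 5, here is a minimal sketch of one possible storage comparison. It is not taken from the RFP or from any existing tool: the `StorageDiff` type, the `diff_storage` function, and the hex-string input format are all assumptions made for this example. It relies only on the fact that an unset EVM storage slot reads as zero, so a zero-valued slot on one side is treated as equivalent to an absent slot on the other.

```python
from dataclasses import dataclass, field


@dataclass
class StorageDiff:
    """Differences between two slot -> value maps for one contract (hypothetical type)."""
    missing_on_filecoin: dict = field(default_factory=dict)
    unexpected_on_filecoin: dict = field(default_factory=dict)
    mismatched: dict = field(default_factory=dict)

    def is_empty(self) -> bool:
        return not (self.missing_on_filecoin
                    or self.unexpected_on_filecoin
                    or self.mismatched)


def normalize(word: str) -> int:
    """Parse a hex string such as '0x01' or '0x1' so that differently
    padded encodings of the same 256-bit word compare equal."""
    return int(word, 16)


def diff_storage(eth: dict, fil: dict) -> StorageDiff:
    """Compare post-transaction storage captured on Ethereum and on Filecoin.

    Both inputs map hex slot keys to hex values (an assumed format).
    """
    eth_n = {normalize(k): normalize(v) for k, v in eth.items()}
    fil_n = {normalize(k): normalize(v) for k, v in fil.items()}
    diff = StorageDiff()
    for slot, value in eth_n.items():
        if slot not in fil_n:
            # An absent slot reads as zero, so only non-zero values count.
            if value != 0:
                diff.missing_on_filecoin[slot] = value
        elif fil_n[slot] != value:
            diff.mismatched[slot] = (value, fil_n[slot])
    for slot, value in fil_n.items():
        if slot not in eth_n and value != 0:
            diff.unexpected_on_filecoin[slot] = value
    return diff
```

For example, `diff_storage({"0x0": "0x2"}, {"0x0": "0x3"}).mismatched` would hold the pair of conflicting values for slot 0. Comparing normalized integers rather than raw strings avoids false mismatches caused by padding alone.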

Development Roadmap

Milestone1:

  1. Fetch and store thousands of transactions sent to contracts on Ethereum in the past.
    Through EVM tracing, intercept and store data for opcodes of the following types:

    • context: BLOCKHASH, COINBASE, GASPRICE, GAS, TIMESTAMP, NUMBER, DIFFICULTY, GASLIMIT, CHAINID, BASEFEE
    • ext: EXTCODESIZE, EXTCODECOPY, EXTCODEHASH, BALANCE
    • call: CALLDATALOAD, CALLDATASIZE, CALLDATACOPY, CODESIZE, CODECOPY, CALL
    • storage: SLOAD, SSTORE
    • state: SELFBALANCE
  2. With the help of ref-fvm and the opcode data of the call, state and storage types, correctly simulate
    the account data (address, balance, bytecode, slot data) on the state tree. Write the opcode data of the
    context and ext types to the blockstore, to be read by the related opcode functions in the testing phase.
    Finally, intercept blockstore writes and generate the tvx test vector file.
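
The split described in Milestone 1 can be sketched as a small routing table. This is only an illustration under stated assumptions: the trace-step shape (a dict with an `"op"` key) and the names `category_of` and `route_trace` are hypothetical and do not come from the proposal's code. The opcode groups and their destinations (state tree vs blockstore) follow the list above.

```python
# Opcode groups as listed in the proposal.
OPCODE_CATEGORIES = {
    "context": {"BLOCKHASH", "COINBASE", "GASPRICE", "GAS", "TIMESTAMP",
                "NUMBER", "DIFFICULTY", "GASLIMIT", "CHAINID", "BASEFEE"},
    "ext": {"EXTCODESIZE", "EXTCODECOPY", "EXTCODEHASH", "BALANCE"},
    "call": {"CALLDATALOAD", "CALLDATASIZE", "CALLDATACOPY",
             "CODESIZE", "CODECOPY", "CALL"},
    "storage": {"SLOAD", "SSTORE"},
    "state": {"SELFBALANCE"},
}

# Destinations as described in Milestone 1, step 2.
STATE_TREE_CATEGORIES = {"call", "state", "storage"}
BLOCKSTORE_CATEGORIES = {"context", "ext"}


def category_of(op: str):
    """Return the proposal's category for an opcode name, or None."""
    for category, ops in OPCODE_CATEGORIES.items():
        if op in ops:
            return category
    return None


def route_trace(steps):
    """Split trace steps by where their data would be written.

    Each step is assumed to be a dict with an "op" key (hypothetical
    format). Opcodes outside the listed groups are ignored.
    """
    routed = {"state_tree": [], "blockstore": []}
    for step in steps:
        category = category_of(step["op"])
        if category in STATE_TREE_CATEGORIES:
            routed["state_tree"].append(step)
        elif category in BLOCKSTORE_CATEGORIES:
            routed["blockstore"].append(step)
    return routed
```

Keeping the grouping in one table makes it easy to move an opcode to a different destination if the design changes.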

Milestone2:

  1. Expand the function implementation of context and ext type opcodes under ref-fvm,
    allowing data to be read from blockstore directly.
  2. Expand the tvx function so that the test vector of the evm type can be replayed.

Documentation, Education, and Community

The code repository will contain detailed instructions for using the CLI tool

Milestone Summary

Milestone No. | Milestone Summary & Staffing | Funding | Estimated Timeframe
1 | Generate tvx test vectors from transactions on Ethereum, 2 people | $16000 | 2 weeks
2 | Tvx tools consume test vectors of type evm, 2 people | $12000 | 1.5 weeks

Total Budget Requested

$28000

Maintenance and Upgrade Plans

We are willing to maintain this test vector tool for a long time

Team

Contact Info

wenxin@froghub.io

Team Members

  • He Zheng zhenghe@froghub.io
  • Minghang Fan (slack)
  • OM - FrogHub (slack)
  • Plus-FrogHub (slack)

Team Website

https://www.froghub.io/

Relevant Experience

Please describe (in words) your team's relevant experience, and why you think would do a great job with this RFP. You can cite your team's prior experience in similar domains, doing similar dev work, individual team members' backgrounds, etc.

  1. We are familiar with the implementation of the EVM. Using the source code of a public contract, we can output the state changes caused by a transaction on Ethereum in a human-readable form.
  2. We are familiar with the implementation of FVM / FEVM, and understand the underlying data structures (e.g. HAMT, AMT, BitField).
  3. We are familiar with the implementation of tvx and the design of its test vectors.

Team code repositories

https://github.com/froghub-io

@jennijuju
Member

@wenxinnnnn could you please provide the funding breakdowns? i.e: work hours, $ hr/dev ?

@wenxinnnnn
Author

wenxinnnnn commented Dec 5, 2022

Funding for this M1:

1 Software engineer lead resource: 1 * 120 $/h * 40 h/week * 2 weeks = $9600
1 Software engineer resource: 1 * 80 $/h * 40 h/week * 2 weeks = $6400

Total = $16000

Funding for this M2:

1 Software engineer lead resource: 1 * 120 $/h * 40 h/week * 1.5 weeks = $7200
1 Software engineer resource: 1 * 80 $/h * 40 h/week * 1.5 weeks = $4800

Total = $12000
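
The totals can be checked with a few lines of arithmetic. This sketch assumes one lead at 120 $/h and one engineer at 80 $/h per milestone, which is the reading consistent with the stated subtotals and with the "2 people" staffing in the milestone summary. The helper name `milestone_cost` is made up for the example.

```python
def milestone_cost(rates_per_hour, hours_per_week, weeks):
    """Sum rate * hours/week * weeks over all staff on a milestone."""
    return sum(rate * hours_per_week * weeks for rate in rates_per_hour)


# One lead (120 $/h) and one engineer (80 $/h), 40 h/week each.
m1 = milestone_cost([120, 80], hours_per_week=40, weeks=2)    # 9600 + 6400
m2 = milestone_cost([120, 80], hours_per_week=40, weeks=1.5)  # 7200 + 4800
total = m1 + m2
```

Under these assumptions `m1` is 16000, `m2` is 12000, and `total` matches the $28000 requested.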

@jennijuju
Member

capturing slack conversation
image

@maciejwitowski
Contributor

Hey @wenxinnnnn thanks for updating the team/price info. @raulk will review it.

Can you share any updates on the work so far? IIRC you started last week, right?

@wenxinnnnn
Author

wenxinnnnn commented Dec 13, 2022

  1. Mock blockstore: account generation is complete (mock_single_actor_blockstore). We then write slot data, opcodes, etc., and finally export the blockstore to CAR files, which are used to generate part of the corpus of JSON test vectors.
    https://github.com/froghub-io/builtin-actors/tree/tvx

  2. Fetching data from Ethereum and generating a corpus of JSON test vectors is also in progress

@raulk
Member

raulk commented Dec 13, 2022

@wenxinnnnn I think we are broadly aligned here, but there are some details that need to be adjusted.

  • You will need to modify the test vector schema to capture environmental data that is not part of the state (e.g. chainid).
  • You will need to generate a message test vector with the Filecoin message bytes (using the same logic as eth_sendRawTransaction in our JSON-RPC endpoint).
  • context: BLOCKHASH, COINBASE, GASPRICE, GAS, TIMESTAMP, NUMBER, DIFFICULTY, GASLIMIT, CHAINID, BASEFEE
    • COINBASE is 0
    • DIFFICULTY will be mapped to some randomness (we just found EIP-4399), so we might be ok using the randomness fields in the schema.
    • TIMESTAMP, BLOCKHASH (tipset CID) are provided to the machine on construction and will need to be part of the test vector schema if not already (can't remember).
    • GASPRICE and GAS will need to be provided manually on extraction, since Ethereum gas is irrelevant. This means that we cannot perform assertions on gas usage.
    • Same for BASEFEE, this will need to be provided manually.
    • CHAINID => you will need to enhance the test vector schema to store and supply this value on machine construction.
  • ext: EXTCODESIZE, EXTCODECOPY, EXTCODEHASH, BALANCE
    • You will need to intercept EXT operations to capture the bytecode of the requested contract and populate it in the state tree so that it's available when EXT* ops are called inside FEVM.
    • You will need to intercept BALANCE operations to populate the state tree.
  • call: CALLDATALOAD, CALLDATASIZE, CALLDATACOPY, CODESIZE, CODECOPY, CALL
    • CALLDATA operations operate on the input data. This requires nothing special on your end, you can discard any special actions here.
    • CALL and variants (DELEGATECALL, STATICCALL, CALLCODE) will need to be intercepted to capture storage slots accessed to populate the precondition state, and storage slots written to perform assertions.
  • storage: SLOAD, SSTORE
    • These are key! You will need to intercept SLOADs to populate the storage KAMT of the smart contract to make the precondition state available.
    • You will need to intercept SSTOREs to populate the post condition state so you can perform assertions.
  • state: SELFBALANCE
    • Also to populate the state tree.
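
The SLOAD/SSTORE handling described above can be sketched as follows. This is a simplified illustration, not the extraction tool's actual logic: the event tuple format `(op, address, slot, value)` and the function name `storage_conditions` are assumptions. The sketch also ignores reverted sub-calls, whose writes would have to be discarded, and it cannot recover the pre-transaction value of a slot that is written before it is ever read.

```python
def storage_conditions(events):
    """Derive precondition and postcondition storage from trace events.

    `events` is an ordered iterable of (op, address, slot, value) tuples
    (hypothetical format), where `value` is the word read by SLOAD or
    written by SSTORE.
    """
    pre = {}   # (address, slot) -> value that must exist before the tx
    post = {}  # (address, slot) -> last value written during the tx
    for op, address, slot, value in events:
        key = (address, slot)
        if op == "SLOAD":
            # Only the first read of a slot that has not been written yet
            # reflects pre-transaction state.
            if key not in pre and key not in post:
                pre[key] = value
        elif op == "SSTORE":
            post[key] = value
    return pre, post
```

The `pre` map would be used to populate the contract's storage before replay, and the `post` map would provide the values to assert against afterwards.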

If you agree on these, I'm happy to approve the submission.

@wenxinnnnn
Author

We agree with these adjustments. The tips are greatly appreciated.

@raulk raulk added this to Recommend for Grants WG Approval in FVM Grants Dec 14, 2022
@raulk
Member

raulk commented Dec 14, 2022

@realChainLife this is approved from the FVM team. Can we formalise?

@BlocksOnAChain
Collaborator

@wenxinnnnn - Hey, Dragan from the core FVM team here. I just wanted to check: have you already started some work on this RFP?
It would be great if we had something to try in January 2023.

@wenxinnnnn
Author

I think the progress is very smooth, and we will see results soon.

@jennijuju
Member

@BlocksOnAChain we are aiming to get MVP by EOW.

@jennijuju
Member

@wenxinnnnn Can we sync again tonight? I want to touch base on:

For Milestone 1:

  • Which repo can we use to track progress? When can we get a brief demo of the tooling MVP?
  • re: "Generate tvx test vectors from transactions on Ethereum", we'd like to be more specific about the transactions that we want to derive tvx from.
    • We have a list of top evm contracts captured here, and ideally we can extract the most recent 2000 transactions from these contracts and generate the test vectors against them. Is this feasible from your end?

For Milestone 2 Tvx tools consume test vectors
Other than the test vector coverage above, we think it would be really useful if we can build:

  • a bot that samples transactions on the Ethereum mainnet and replays them against FVM continuously
  • By continuously we mean: for each Ethereum block, we randomly sample X of the transactions executed in that block. Among that transaction set, 30% should be contract deployments and 70% contract invocations.
  • Run the test vector against that transaction set, and report the key results, including:
    • eth gas, fvm gas => to confirm that the gas difference is in an acceptable range, reporting a warning when it is not
    • exit code => the tx either succeeds or fails in both EVM and FEVM
    • storage keys => pre-execution state/storage keys and post-execution state/storage keys shall be the same in EVM and FEVM
      This bot is lower priority than Milestone 1, and it would be great if we could have an MVP in early Jan.
      Let me know how you feel about that!
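
The per-block sampling rule above could look roughly like this. It is a sketch under stated assumptions: transactions are plain dicts, a contract deployment is recognized by a missing `to` address (as in Ethereum contract-creation transactions), and the names `is_deployment` and `sample_block` are made up for the example. When a block has fewer deployments than the 30% target, the sketch fills the remainder with invocations, which is a design choice and not something specified in the thread.

```python
import random


def is_deployment(tx: dict) -> bool:
    """A contract-creation transaction has no recipient address."""
    return tx.get("to") is None


def sample_block(txs, sample_size, rng, deploy_share=0.3):
    """Pick up to `sample_size` transactions from one block, aiming for
    `deploy_share` deployments and the rest invocations."""
    deployments = [tx for tx in txs if is_deployment(tx)]
    invocations = [tx for tx in txs if not is_deployment(tx)]

    # Cap each target by what the block actually contains.
    n_deploy = min(round(sample_size * deploy_share), len(deployments))
    n_invoke = min(sample_size - n_deploy, len(invocations))

    return rng.sample(deployments, n_deploy) + rng.sample(invocations, n_invoke)
```

Passing an explicit `random.Random(seed)` as `rng` makes a sampling run reproducible, which helps when a failing replay needs to be re-examined.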

@wenxinnnnn
Author

wenxinnnnn commented Dec 21, 2022

For Milestone 1, repo
Generating tvx test vectors from transactions on Ethereum works for all Ethereum transactions, including “Constructor” and “applyTransaction”, but it requires the chain state of Ethereum nodes, so very old transactions are not easy to export.
The top EVM contracts are of course good test cases, but contracts that depend on the EVM context, such as timestamp, chainId, etc., are also critical.

For Milestone 2, repo
Testing with the test vector generated by Milestone 1 is underway.

@wenxinnnnn
Author

The MVP is ready.
@raulk Can you help us review whether it meets your expectations?
Some transactions replay perfectly, but replaying with the actual Ethereum gasLimit generally fails. There are also some strange problems whose cause we are still investigating.

@raulk
Member

raulk commented Dec 23, 2022

@wenxinnnnn that is great news! Yes, the gas limit on Ethereum is absolutely not translatable to Filecoin gas. See comment in the respective FIP: https://github.com/filecoin-project/FIPs/pull/569/files#diff-42db5d06aa41036fbab88aef4ad06e69fbfec97dba280a6227f27a1cffbd8f5dR808-R821

@wenxinnnnn
Author

@jennijuju
Member

Testing with the test vector generated by Milestone 1 is underway.

how is this going?

@wenxinnnnn
Author

Found another problem: simply sending funds to a contract without input_data does not trigger bytecode execution, which is inconsistent with the behavior of the EVM.

@jennijuju
Member

@wenxinnnnn I think this is a known behaviour, though. Could you please create a ticket in the ref-fvm repo for each of your discoveries so we can keep track of them? In the ticket, please include the original Ethereum tx id that triggered the behaviour.
Thank you!

@maciejwitowski
Contributor

@wenxinnnnn IIUC the issue you're talking about is filecoin-project/ref-fvm#835. Feel free to contribute to the discussion; no decision has been reached there yet.

@maciejwitowski
Contributor

More generally though, @wenxinnnnn do you have any report after running these test vectors already?

@wenxinnnnn
Author

The M1 and M2 goals of the current Test Vector Tool have been completed.

By running some test vectors of transactions, we found the following problems:

  1. filecoin-project/ref-fvm#1369
  2. Simply sending funds to the contract without input_data will not trigger bytecode execution,
    which is inconsistent with the behavior of evm. (related to filecoin-project/ref-fvm#835)

But there are still some problems whose cause has not been found:

  1. Rollbacks occur when transactions involving complex calls are replayed (e.g. Uniswap v2 and v3 transactions).
    The original Ethereum txs are 0x1e7fafb64440bd4c2e2793544baa613ad81a62edc5ec5918c302847b55627994,
    0xd4ff32d2b048dc0bce17b697e68766f218cdf0c81b9e931cab2da83afb9fd7a8.

Due to the lack of FEVM tracing, it is still difficult to locate the problem.
Would it make sense to implement tracing for FEVM in a format similar to geth's EVM tracing?

Projects
FVM Grants
Approved