Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add state tests for Istanbul #607

Merged
merged 15 commits into from
Nov 18, 2019
Merged

Add state tests for Istanbul #607

merged 15 commits into from
Nov 18, 2019

Conversation

s1na
Copy link
Contributor

@s1na s1na commented Nov 5, 2019

This points to ethereumjs/ethereumjs-testing#42 for upstream state tests. The dependency has to be fixed after ethereumjs-testing releases the istanbul tests.

Due to the re-structuring of the tests directories, we have to configure tester where to look for tests depending on the HF (tests for HFs prior to Istanbul are in LegacyTests).

A lot of tests are failing, even for Petersburg state tests. I'll now focus on figuring out what the issues are. @evertonfraga offered to help with resolving them.

(also closes #608 and #611)

@s1na
Copy link
Contributor Author

s1na commented Nov 5, 2019

The tester was instantiating Transaction objects without passing in the HF. Because the default HF in ethereumjs-tx is still Petersburg it resulted in wrong tx basefee costs. Fixing that the number of failing tests went down from 6k to 151.

"testBuildIntegrity": "npm run build:dist && node ./tests/tester -s --dist --test='stackOverflow'",
"testBlockchain": "npm run build:dist && node --stack-size=1500 ./tests/tester -b --fork='Petersburg' --dist --excludeDir='GeneralStateTests'",
"testBlockchain": "npm run build:dist && node --stack-size=1500 ./tests/tester -b --fork='Istanbul' --dist --excludeDir='GeneralStateTests'",
"testBlockchainPetersburg": "npm run build:dist && node --stack-size=1500 ./tests/tester -b --fork='Petersburg' --dist --excludeDir='GeneralStateTests'",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm curious about this line. Weren't we testing some of the HF in the CI?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We are always only testing the latest or eventually the latest two HFs on CI, otherwise this would blow CI time (this nevertheless improved recently and e.g. allowed to add the blockchain tests beneath the state tests at all, before only state tests were running).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Got it, thanks for the reply.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On second thought, maybe we can cron a daily full test run on master using https://circleci.com/docs/2.0/configuration-reference/#schedule

I use the travis equivalent for Buidler, and lets me catch errors introduced by dependencies asap.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So we're running the state tests for all HFs from Byzantium, but the blockchain tests were being run only for the latest HF because of their runtime, which was improved in #536. So I guess now we could run the blockchain tests for maybe 2-3 HFs.

Scheduling cron jobs is a good idea, I like it. We could also run run slow tests (#472) as a cron job.

The VM's testing infrastructure could generally use some improvements.

s1na and others added 2 commits November 11, 2019 14:20
@s1na
Copy link
Contributor Author

s1na commented Nov 11, 2019

Some Istanbul tests included accounts in the pre state which had 0x00 for storage value (e.g. this test). This was being inserted to the storage trie of that account by setupPreConditions in tests/util.js. The result was some tests failing because state root was wrong, or wrong EIP-2200 gas calculations. Fixed in b0d71f0.

@s1na
Copy link
Contributor Author

s1na commented Nov 11, 2019

#612 had caused some of the EIP-2200 API tests to break because they were checking against execResult.gasRefund which was removed. I noticed that by moving gasRefund to be a property of EVM it's not accessible anymore from outside (users calling runTx, runCall or runCode), and it was kind of a breaking change.

So after EVM finishes execution I add a clone of the refund counter to EVMResult which would it make it accessible when calling runTx or runCall (but not runCode). It'll still be breaking because it's not stored in execResult though. I'm open to changing this if people don't agree.

@alcuadrado
Copy link
Member

I just left a comment about the breaking change in the other PR.

I'm not against the change itself, but I think there's probably some people depending con that field, so we shouldn't change it without releasing a major version.

@s1na
Copy link
Contributor Author

s1na commented Nov 11, 2019

BN.js's imuln asserts the number to be smaller than 0x4000000 (not sure why). This caused the CALLBlake2f_MaxRounds which has 0xffffffff as number of rounds. Fixed in 1b5538e and verified that the test case passes locally, but because it takes a really long time I marked it as a slow test.

@s1na
Copy link
Contributor Author

s1na commented Nov 12, 2019

I'm not against the change itself, but I think there's probably some people depending con that field, so we shouldn't change it without releasing a major version.

I thought about this some more and I think we can keep it quasi-backwards-compatible (not quite). The difference would be that gasRefund could potentially be negative if it's kept as ExecResult (i.e. result of a call). It'll also likely make the code a bit (but not much) more complicated.

The reason is that before EIP-2200 (excluding constantinople), gasRefund was only additive. We could compute it for each call, and accumulate the result if the call succeeded and at the end of the tx we'd have the total gas refund.

Now however (after EIP-2200) gasRefund, a nested call can deduct from a gasRefund that was added in a parent call. This means either we need a global refund counter (which we added in #611, this is the geth way) or if we keep the counter local to each call, we need to allow it to become negative and then add these positive and negative numbers at the end of tx to get total (this is the py-evm way).

I don't have a strong preference for either, and could change to the second option.
We could keep the field on ExecResult, but from what we discovered in #611 it seems gas refund has a lifetime as long as a transaction and not a call message. If we want to have it in `

@s1na
Copy link
Contributor Author

s1na commented Nov 12, 2019

Haven't confirmed yet, but I think the reasons many blockchain tests are failing is that ethereumjs-block is using an old version of ethereumjs-tx.

^^ Another argument for a monorepo :)

@alcuadrado
Copy link
Member

I have to think a little more about the gasRefund thing. But in the meantime,

The difference would be that gasRefund could potentially be negative if it's kept as ExecResult (i.e. result of a call).

@davidmurdoch would this be a problem for Ganache?

@davidmurdoch
Copy link
Contributor

Ganache's gas estimations rely on the gasRefund property returned in the result of runTx. We do not currently need to access this value until after the conclusion of a transaction.

Probably not helpful, but here is some background details on why the property was introduced: #283

@evertonfraga
Copy link
Contributor

I've been playing with TAPE and its reporting methods, EVMDIS and other tools, to make a better analysis of these failing tests. Will update here as I go.

RevertPrecompiledTouch.json

# expected actual
2503 c4373f01b4d302d94a2ebe1a590a61594224909570a584a02a87cbe335a21ab0 c8983cc2d39ed8d5cf575e36003fa2aea420d04e931423442df843e55ad77213
2504 c4373f01b4d302d94a2ebe1a590a61594224909570a584a02a87cbe335a21ab0 c8983cc2d39ed8d5cf575e36003fa2aea420d04e931423442df843e55ad77213

RevertPrecompiledTouchExactOOG.json

test expected actual
2539 54eb6e8f350e8390ab10e56b8760a5108da8d4b8988ed32c989f0664e07981c5 4d9daa4d0f4e86f5878b1d46899f18e0d2392f562d21668b8334270e1ed44162
2540 54eb6e8f350e8390ab10e56b8760a5108da8d4b8988ed32c989f0664e07981c5 4d9daa4d0f4e86f5878b1d46899f18e0d2392f562d21668b8334270e1ed44162
2541 5db419df81dfa44071ed60c48a617b88af64ad4ae24f4c0624c542ef326508ad a196ae26b79148baacfeae2544d3a57076c6cbde3c78a2d6a70c9b546535f40b
2542 5db419df81dfa44071ed60c48a617b88af64ad4ae24f4c0624c542ef326508ad a196ae26b79148baacfeae2544d3a57076c6cbde3c78a2d6a70c9b546535f40b
2543 3160a042b1167fc558ab4b3ef1d5252c0d5d665bf89d441443404ffa178cfa7c a196ae26b79148baacfeae2544d3a57076c6cbde3c78a2d6a70c9b546535f40b
2544 3160a042b1167fc558ab4b3ef1d5252c0d5d665bf89d441443404ffa178cfa7c a196ae26b79148baacfeae2544d3a57076c6cbde3c78a2d6a70c9b546535f40b
2545 2468f0f4040c02cdda444db6ee3aabfb2b93e181da4d66bd57374515cb40f082 4d9daa4d0f4e86f5878b1d46899f18e0d2392f562d21668b8334270e1ed44162
2546 2468f0f4040c02cdda444db6ee3aabfb2b93e181da4d66bd57374515cb40f082 4d9daa4d0f4e86f5878b1d46899f18e0d2392f562d21668b8334270e1ed44162
2599 b4a96a1af2bc41677ed9e8eb46054fbaeecda5838f96f0f1c762447613dccde8 8f63acee18f6a2ea07238eab4104f0de5f56fbfe256071fdc62fcb104faef4d8
2600 b4a96a1af2bc41677ed9e8eb46054fbaeecda5838f96f0f1c762447613dccde8 8f63acee18f6a2ea07238eab4104f0de5f56fbfe256071fdc62fcb104faef4d8

RevertPrecompiledTouch_storage.json

test expected actual
2611 15ec8dfdbc33f634b77e62d0c1e92e11c54e4f691256e705c9dfb2a05da029c8 b59d6917799d9ea601d4c1841db5fd8f9003cda19d9697203600d05ebb64af47
2612 15ec8dfdbc33f634b77e62d0c1e92e11c54e4f691256e705c9dfb2a05da029c8 b59d6917799d9ea601d4c1841db5fd8f9003cda19d9697203600d05ebb64af47

@lgtm-com
Copy link

lgtm-com bot commented Nov 13, 2019

This pull request introduces 1 alert when merging c3e1ccf into 00d8fe3 - view on LGTM.com

new alerts:

  • 1 for Unused variable, import, function or class

@coveralls
Copy link

coveralls commented Nov 13, 2019

Coverage Status

Coverage decreased (-0.6%) to 95.41% when pulling 970ddbf on istanbul-tests into 00d8fe3 on master.

Copy link
Contributor Author

@s1na s1na left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

c3e1ccf fixes #608, and now all state tests are passing 😄

For context on the issue of deleting precompiles when a revert happens see EIP-716 and ethereum/aleth#5664.

@@ -163,10 +161,6 @@ export default class EEI {
* input data passed with the message call instruction or transaction.
*/
getCallDataSize(): BN {
if (this._env.callData.length === 1 && this._env.callData[0] === 0) {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some tests were using CALLDATASIZE as a way of determining the address to call. They were failing for the case were calldata was Buffer<00> (CALLDATASIZE returned 0 instead of 1).

I'm not sure why this condition was here? any ideas?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is really weird. I checked in pyevm, and it doesn't have this logic either.


const err = result.execResult.exceptionError
if (err) {
result.execResult.logs = []
await this._state.revert()
if (message.isCompiled) {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here we were re-touching any precompile in case of OOG.

// See [EIP-716](https://github.com/ethereum/EIPs/issues/716) for context.
// The RIPEMD precompile has to remain *touched* even when the call reverts,
// and be considered for deletion.
if (this._touched.has(ripemdPrecompileAddress)) {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I moved the re-touching of precompiles in case of OOG here (not a strong preference, I could re-move it to evm.ts). It is also limited to the precompile at 0x03.

this._result.gasRefund = new BN(0)
this._evm._refund.isub(amount)
if (this._evm._refund.ltn(0)) {
this._evm._refund = new BN(0)
trap(ERROR.REFUND_EXHAUSTED)
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe we can do this._result.gasRefund = this._evm._refund after modifying it. It clutters the code a little, but it avoids a breaking change.

As David, said Ganache only depends on gasRefund at the end of every tx, so this would work.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I re-added gasRefund to ExecResult, by copying EVM._refund at the end of message execution in EVM. This should work for the Ganache use-case. I think we should be backwards-compatible now, at least for the most part.

@lgtm-com
Copy link

lgtm-com bot commented Nov 15, 2019

This pull request introduces 1 alert when merging b26f15c into 00d8fe3 - view on LGTM.com

new alerts:

  • 1 for Unused variable, import, function or class

@lgtm-com
Copy link

lgtm-com bot commented Nov 15, 2019

This pull request introduces 1 alert when merging b9ebe69 into 00d8fe3 - view on LGTM.com

new alerts:

  • 1 for Unused variable, import, function or class

@s1na
Copy link
Contributor Author

s1na commented Nov 15, 2019

Strange. Circleci is failing with this error on install:

Command failed: /usr/bin/git checkout istanbul-tests
npm ERR! error: pathspec 'istanbul-tests' did not match any file(s) known to git.

It's as is if the istanbul-tests branch doesn't exist, but it clearly does. Anyone faced this?

Update: it was failing to checkout to the istanbul-tests branch of ethereumjs-testing (not VM which has the same branch name 🤦‍♂️) which was deleted after the PR was merged.

@lgtm-com
Copy link

lgtm-com bot commented Nov 15, 2019

This pull request introduces 1 alert when merging a29de25 into 00d8fe3 - view on LGTM.com

new alerts:

  • 1 for Unused variable, import, function or class

@holgerd77
Copy link
Member

Ok, we are stumbling over our own improvements:

# should contain values of hardcoded bitvector
/home/circleci/project/ethereumjs-vm/node_modules/ethereumjs-util/dist/bytes.js:79
                throw new Error("Cannot convert string to buffer. toBuffer only supports 0x-prefixed hex strings and this string was given: " + v);
                ^

Error: Cannot convert string to buffer. toBuffer only supports 0x-prefixed hex strings and this string was given: value 1
    at Object.exports.toBuffer (/home/circleci/project/ethereumjs-vm/node_modules/ethereumjs-util/dist/bytes.js:79:23)
    at Test.t.test (/home/circleci/project/ethereumjs-vm/tests/api/bloom.js:25:27)
    at Test.bound [as _cb] (/home/circleci/project/ethereumjs-vm/node_modules/tape/lib/test.js:66:32)
    at Test.run (/home/circleci/project/ethereumjs-vm/node_modules/tape/lib/test.js:85:10)
    at Test.bound [as run] (/home/circleci/project/ethereumjs-vm/node_modules/tape/lib/test.js:66:32)
    at Test._end (/home/circleci/project/ethereumjs-vm/node_modules/tape/lib/test.js:154:11)
    at Test.bound [as _end] (/home/circleci/project/ethereumjs-vm/node_modules/tape/lib/test.js:66:32)
    at Test.<anonymous> (/home/circleci/project/ethereumjs-vm/node_modules/tape/lib/test.js:153:40)
    at emitNone (events.js:106:13)
    at Test.emit (events.js:208:7)

test_api and test_api_browser are still failing, have you seen this? The above is what is given as output.

Otherwise, great! 😀 Blockchain test passing as well!

On the release question: would it be enough to release this as v4.1.1? Or would you rather suggest to go to v4.2.0? Maybe we target a release next Tuesday or Wednesday? Would this work?

@s1na
Copy link
Contributor Author

s1na commented Nov 15, 2019

test_api and test_api_browser are still failing, have you seen this? The above is what is given as output.

Interestingly #617 is also failing with the same error, hmm 🤔

would it be enough to release this as v4.1.1? Or would you rather suggest to go to v4.2.0? Maybe we target a release next Tuesday or Wednesday? Would this work?

I'll try to use Pato's approach for gasRefund to avoid a breaking change in this release and only make it deprecated for now. So we shouldn't have a breaking change and v4.1.1 should be fine IMO. I'll have a look again when the PR is completely ready and re-visit this question. Yes Tuesday/Wednesday should be fine on my end.

@lgtm-com
Copy link

lgtm-com bot commented Nov 15, 2019

This pull request introduces 1 alert when merging 18b34f9 into 00d8fe3 - view on LGTM.com

new alerts:

  • 1 for Unused variable, import, function or class

@alcuadrado
Copy link
Member

alcuadrado commented Nov 15, 2019

Ok, we are stumbling over our own improvements:

I had to fix some tests in -util because they were using utf8 test vectors. Doing things like toBuffer('A'). I fixed them by transforming those to their hex equivalent (e.g. toBuffer('0x41')), or by using new test values.

The problem is that it's hard to detect when a change like this would affect other packages before releasing it. I know I sound like a broken record, but this is why I'm a big fan of monorepos 😅

@lgtm-com
Copy link

lgtm-com bot commented Nov 18, 2019

This pull request introduces 1 alert when merging 57d7483 into 00d8fe3 - view on LGTM.com

new alerts:

  • 1 for Unused variable, import, function or class

@lgtm-com
Copy link

lgtm-com bot commented Nov 18, 2019

This pull request introduces 1 alert when merging 970ddbf into 00d8fe3 - view on LGTM.com

new alerts:

  • 1 for Unused variable, import, function or class

@s1na
Copy link
Contributor Author

s1na commented Nov 18, 2019

The PR is ready for review 😄

Ok, we are stumbling over our own improvements:

For this error I made the ethereumjs-util semver more strict by limiting it to ~6.1.0, as I didn't want to overload this PR which already includes many unrelated changes.

Also updated ethereumjs-blockchain to latest version and fixed some errors (hacky way) in how the API tests were instantiating Block and Blockchain objects in 57d7483, which were causing Common mismatch.

I think we might need to do something about this. Each of Tx, Block, Blockchain and VM accept a Common (or chain and hardfork) object and mismatch could easily happen, specially since Block and Blockchain set hardfork to null by default.

Copy link
Member

@holgerd77 holgerd77 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Most changes here were already in review respectively had been discussed here, had another overall look, additions to the testers and test utils also look good, would give this a go now.

@holgerd77 holgerd77 merged commit 41ebfba into master Nov 18, 2019
@holgerd77 holgerd77 deleted the istanbul-tests branch November 18, 2019 22:05
@holgerd77 holgerd77 changed the title [WIP] Add state tests for Istanbul Add state tests for Istanbul Nov 18, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Failing Petersburg tests after update to tests v7.0.0-beta.1
6 participants