-
Notifications
You must be signed in to change notification settings - Fork 3
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Implement the block execution pipeline #17
Conversation
…ions Signed-off-by: Ignacio Hagopian <jsign.uy@gmail.com>
Signed-off-by: Ignacio Hagopian <jsign.uy@gmail.com>
Signed-off-by: Ignacio Hagopian <jsign.uy@gmail.com>
Signed-off-by: Ignacio Hagopian <jsign.uy@gmail.com>
Signed-off-by: Ignacio Hagopian <jsign.uy@gmail.com>
Signed-off-by: Ignacio Hagopian <jsign.uy@gmail.com>
Signed-off-by: Ignacio Hagopian <jsign.uy@gmail.com>
Signed-off-by: Ignacio Hagopian <jsign.uy@gmail.com>
Signed-off-by: Ignacio Hagopian <jsign.uy@gmail.com>
Signed-off-by: Ignacio Hagopian <jsign.uy@gmail.com>
Signed-off-by: Ignacio Hagopian <jsign.uy@gmail.com>
Signed-off-by: Ignacio Hagopian <jsign.uy@gmail.com>
Signed-off-by: Ignacio Hagopian <jsign.uy@gmail.com>
Signed-off-by: Ignacio Hagopian <jsign.uy@gmail.com>
Signed-off-by: Ignacio Hagopian <jsign.uy@gmail.com>
Signed-off-by: Ignacio Hagopian <jsign.uy@gmail.com>
Signed-off-by: Ignacio Hagopian <jsign.uy@gmail.com>
Signed-off-by: Ignacio Hagopian <jsign.uy@gmail.com>
Signed-off-by: Ignacio Hagopian <jsign.uy@gmail.com>
Signed-off-by: Ignacio Hagopian <jsign.uy@gmail.com>
Signed-off-by: Ignacio Hagopian <jsign.uy@gmail.com>
Signed-off-by: Ignacio Hagopian <jsign.uy@gmail.com>
Signed-off-by: Ignacio Hagopian <jsign.uy@gmail.com>
Signed-off-by: Ignacio Hagopian <jsign.uy@gmail.com>
Signed-off-by: Ignacio Hagopian <jsign.uy@gmail.com>
Signed-off-by: Ignacio Hagopian <jsign.uy@gmail.com>
Signed-off-by: Ignacio Hagopian <jsign.uy@gmail.com>
Signed-off-by: Ignacio Hagopian <jsign.uy@gmail.com>
Signed-off-by: Ignacio Hagopian <jsign.uy@gmail.com>
Signed-off-by: Ignacio Hagopian <jsign.uy@gmail.com>
Signed-off-by: Ignacio Hagopian <jsign.uy@gmail.com>
Signed-off-by: Ignacio Hagopian <jsign.uy@gmail.com>
Signed-off-by: Ignacio Hagopian <jsign.uy@gmail.com>
Signed-off-by: Ignacio Hagopian <jsign.uy@gmail.com>
Signed-off-by: Ignacio Hagopian <jsign.uy@gmail.com>
Signed-off-by: Ignacio Hagopian <jsign.uy@gmail.com>
Signed-off-by: Ignacio Hagopian <jsign.uy@gmail.com>
Signed-off-by: Ignacio Hagopian <jsign.uy@gmail.com>
Signed-off-by: Ignacio Hagopian <jsign.uy@gmail.com>
Signed-off-by: Ignacio Hagopian <jsign.uy@gmail.com>
Signed-off-by: Ignacio Hagopian <jsign.uy@gmail.com>
…ich requied fixes Signed-off-by: Ignacio Hagopian <jsign.uy@gmail.com>
Signed-off-by: Ignacio Hagopian <jsign.uy@gmail.com>
Signed-off-by: Ignacio Hagopian <jsign.uy@gmail.com>
Signed-off-by: Ignacio Hagopian <jsign.uy@gmail.com>
Signed-off-by: Ignacio Hagopian <jsign.uy@gmail.com>
…ing snapshoting more seameless Signed-off-by: Ignacio Hagopian <jsign.uy@gmail.com>
Signed-off-by: Ignacio Hagopian <jsign.uy@gmail.com>
Signed-off-by: Ignacio Hagopian <jsign.uy@gmail.com>
Signed-off-by: Ignacio Hagopian <jsign.uy@gmail.com>
Signed-off-by: Ignacio Hagopian <jsign.uy@gmail.com>
.root_source_file = .{ .path = "src/main.zig" }, | ||
.root_source_file = .{ .path = "src/lib.zig" }, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's usually recommended to put library tests in their own file, to separate the unit_test
executable from the "real" executable.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
that's good, but maybe give it a more specific name than lib
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, maybe. I'll think about it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The rockstars in this PR are two new files:
src/blockchain/blockchain.zig
src/blockchain/vm.zig
The former (this file) contains most of the execution specs code, which includes:
- Block validation
- Tx validation
- Gas fee calcs (base fee, refunds, intrinsic costs, etc)
- Access lists initialization
- Tx -> Message creation (i.e: the first message to exec)
The latter (i.e: .../vm.zig
) has the message execution pipeline, which mostly translates to EVMOne implementation.
const VM = vm.VM; | ||
const Keccak256 = std.crypto.hash.sha3.Keccak256; | ||
|
||
pub const Blockchain = struct { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is the top level entity that manages the execution pipeline.
state: *StateDB, | ||
prev_block: BlockHeader, | ||
last_256_blocks_hashes: [256]Hash32, // ordered in asc order |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These are the three main fields that it tracks:
- The current state.
- The previous block, which will be used when asked to run the next block to do some validations.
- The last 256 block hashes since this state is needed for BLOCKHASH.
- The txn signer for this chain id (to be used in getting sender from signature).
}; | ||
} | ||
|
||
pub fn runBlock(self: *Blockchain, block: Block) !void { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
On every new block, the client calls this fn to move forward.
@@ -24,24 +24,38 @@ pub const Txn = union(TxnTypes) { | |||
} | |||
|
|||
// decode decodes a transaction from bytes. The provided bytes are referenced in the returned transaction. | |||
pub fn decode(bytes: []const u8) !Txn { | |||
pub fn decode(arena: Allocator, bytes: []const u8) !Txn { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Now we receive an allocator, since now we fixed zig-rlp to allow decoding the data
field which is a slice that requires an allocation.
Note something important. The rlp.deserialize(...)
receives an allocator and might potentially do a lot of allocations (in theory, depending how nested the struct is and how many slices might have).
But... there's no deinit()
. So actually the RLP library is using the allocator in a way that is not easy to do a "deinit" unless the caller knows all the fields that might have been allocated, etc.
This is clearly a case for an arena allocator, which how I use the library. (SImilar to how std.json works). The client should create an arena, send that as the allocator, and for "deinit()" simply destroy the arena.
|
||
return error.UnsupportedTxnType; | ||
} | ||
|
||
pub fn decodeFromRLP(self: *Txn, arena: Allocator, serialized: []const u8) !usize { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a custom decoder I had to do, since the Txn
type is an union and obviously that type isn't (and can't) be supported in zig-rlp.
Moreover, I need a logic to see if the encoded tx is a LegacyTx or is envelop-based one.
This is done by checking if the first byte:
- If it's a RLP-struct, then the
serialized
is a LegacyTx directly. - If it isn't, we have to first decode it as a
string
and then we can delegate toTxn.decode(..)
so it can decode it appropriately depending on the envelope tx type.
Probably the niciest example of a custom RLP decoder for zig-rlp.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
actually we could probably implement rlp unions, assuming that they work exactly like transactions: you start with an index in the union and then the payload of the subtype. But let's see if that shows up anywhere else than here. I'll make a pr when this one is merged, so that the logic is abstracted to comptime
stuff.
var ltx: LegacyTxn = undefined; | ||
const size = rlp.deserialize(LegacyTxn, arena, serialized, <x); | ||
self.* = .{ .LegacyTxn = ltx }; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note that (unfortunately) I cant do self.* = try Txn.decode(arena, str);
here (as done in L54), since I need the usize
of rlp.deserialize
to return in the custom deserializer.
In the case of envelope based tx (L52-L54), I can use Txn.decode
since I had to first deserialize serialized
as a string with rlp.deserialized
which gave me the size I needed. Pretty funny!
@@ -1,8 +1,8 @@ | |||
const std = @import("std"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note: I was tempted to do the Txn
->Tx
renaming in this PR but I prefer to do that in a separate one to avoid even more changed lines. I'll do that soon after.
const prev_block_hash = try common.decodeRLPAndHash(BlockHeader, allocator, prev_block, null); | ||
if (!std.mem.eql(u8, &curr_block.parent_hash, &prev_block_hash)) | ||
return error.InvalidParentHash; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Doing this decodeRLPAndHash(...)
is that I found this (fixed now) bug.
The usual thing of "Ah, doing this check is easy" and you get in a rabbit hole. :P
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🐇
Signed-off-by: Ignacio Hagopian <jsign.uy@gmail.com>
return evmc.EVMC_ACCESS_COLD; | ||
} | ||
|
||
fn call(ctx: ?*evmc.struct_evmc_host_context, msg: [*c]const evmc.struct_evmc_message) callconv(.C) evmc.struct_evmc_result { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Days after comment: this call
is missing some things for contract creation transactions. I'm working on that now and it will be in another PR. Just mentioning.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wow, that was intense! I left a lot of questions and comments, but they can all be ignored for now, except the one about the max code size. But feel free to merge whenever you feel it's ready.
.root_source_file = .{ .path = "src/main.zig" }, | ||
.root_source_file = .{ .path = "src/lib.zig" }, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
that's good, but maybe give it a more specific name than lib
?
pub const Blockchain = struct { | ||
allocator: Allocator, | ||
chain_id: config.ChainId, | ||
state: *StateDB, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
meh, that name strikes terror in the hearts of men. Let's name it something more contributor-friendly like just State
. It's not meant to be a db, is it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you refer to the field name or the type name?
StateDB
is a database, yes (in memory one).
state
refers to the state of the blockchain.
Are you suggesting State
instead of StateDB
?
I don't have a strong opinion, just checking that's your suggestion.
var arena = std.heap.ArenaAllocator.init(self.allocator); | ||
defer arena.deinit(); | ||
const allocator = arena.allocator(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
couldn't we make the top-level allocator an area allocator, and just pass self.allocator()
when we call runBlock
? and then let the caller deinit()
it? This way, we can use other types of allocators for "unit" tests without having to worry about it, and also so that some data can be extracted by the client call after runBlock
? I fear that, otherwise, it might force us to allocate+copy data that we then need to pass to the called, and then have to make the runBlock
function signature impossibly long.
pub fn processMessageCall(self: *VM, msg: Message) !evmc.struct_evmc_result { | ||
const evmc_message: evmc.struct_evmc_message = .{ | ||
.kind = if (msg.target != null) evmc.EVMC_CALL else evmc.EVMC_CREATE, | ||
.flags = 0, | ||
.depth = 0, | ||
.gas = @intCast(msg.gas), | ||
.recipient = toEVMCAddress(msg.current_target), | ||
.sender = toEVMCAddress(msg.caller), | ||
.input_data = msg.data.ptr, | ||
.input_size = msg.data.len, | ||
.value = blk: { | ||
var txn_value: [32]u8 = undefined; | ||
std.mem.writeIntSliceBig(u256, &txn_value, msg.value); | ||
break :blk .{ .bytes = txn_value }; | ||
}, | ||
.create2_salt = undefined, // EVMC docs: field only mandatory for CREATE2 kind which doesn't apply at depth 0. | ||
.code_address = toEVMCAddress(msg.code_address), | ||
}; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok if that's what you need, I was wondering if we could instead just abstract the struct_evmc_message
as a Message
and not have to worry about the internals/conversion, but I guess it's not necessary right now since we don't have any alternative interpreter.
|
||
// EVMOneHost contains the implementation of the EVMC host interface. | ||
// https://evmc.ethereum.org/structevmc__host__interface.html | ||
const EVMOneHost = struct { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would move this to its own emvc.zig
file for cleanliness
allocator: Allocator, | ||
chain_id: config.ChainId, | ||
state: *StateDB, | ||
prev_block: BlockHeader, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
how about making this a ?BlockHeader
since we execute in stateless and might not have that previous block?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Huh? A stateless client must always have the previous block.
Even if it's the first one that it validates. If that isn't the case, how would verify the witness proof?
const prev_block_hash = try common.decodeRLPAndHash(BlockHeader, allocator, prev_block, null); | ||
if (!std.mem.eql(u8, &curr_block.parent_hash, &prev_block_hash)) | ||
return error.InvalidParentHash; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🐇
|
||
fn applyBody(allocator: Allocator, chain: *Blockchain, state: *StateDB, block: Block) !BlockExecutionResult { | ||
var gas_available = block.header.gas_limit; | ||
for (block.transactions) |tx| { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
actually it should be called applyTransactions
or something of the sort, because the body contains more information than just the txs, like uncles (not implemented, but still) and withdrawals. If you want to process witndrawals in here was well, then maybe applyStateTransition
?
// TODO: self destruct processing | ||
// for address in output.accounts_to_delete: | ||
// destroy_account(env.state, address) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yeah I agree, but in spite of what I said in other comments about the tooling, we should start by supporting our basic use case. I said otherwise before I got to this massive file 😉
return false; | ||
if (tx.getNonce() >= (2 << 64) - 1) | ||
return false; | ||
if (tx.getTo() == null and tx.getData().len > 2 * params.max_code_size) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why 2*
here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, it was surprising to me too. Turns out that the inint code size seems to be double the max code size. This is done this way in the spec, and also double checked in geth.
This PR rebuilds our block execution pipeline to something rather complete. The implementation is based on the execution-spec and the EVMC docs to integrate with EVMOne.
Laterally, this also fixed existing bugs, added more signature support, RLP decoding for types, and many things needed to make this work.
Instead of writing a long PR description, I'll do some heavy PR commenting to help review and explain these other things.