Add EIP: EOF - Data section access instructions #7480

Merged: 6 commits, Sep 22, 2023
92 changes: 92 additions & 0 deletions EIPS/eip-7480.md
@@ -0,0 +1,92 @@
---
eip: 7480
title: EOF - Data section access instructions
description: Instructions to read data section of EOF container
author: Andrei Maiboroda (@gumb0), Alex Beregszaszi (@axic), Paweł Bylica (@chfast)
discussions-to: https://ethereum-magicians.org/t/eip-7480-eof-data-instructions/15414
status: Draft
type: Standards Track
category: Core
created: 2023-08-11
requires: 3540, 3670
---

## Abstract

Four new instructions are introduced that allow reading the EOF container's data section: `DATALOAD` loads a 32-byte word to the stack, `DATALOADN` loads a 32-byte word to the stack where the word is addressed by a static immediate argument, `DATASIZE` loads the data section size, and `DATACOPY` copies a segment of the data section to memory.

## Motivation

Clear separation between code and data is one of the main features of EOF1. The data section may contain anything, e.g. compiler metadata, but for it to be useful to smart contracts, the EVM needs instructions that read from the data section. Previously existing instructions for bytecode inspection (`CODECOPY`, `CODESIZE` etc.) are deprecated in EOF1 and cannot be used for this purpose.

The `DATALOAD`, `DATASIZE`, `DATACOPY` instruction pattern follows the design of existing instructions for reading other kinds of data (i.e. returndata and calldata).

`DATALOADN` is an optimized version of `DATALOAD`, where data offset to read is set at compilation time, and therefore need not be validated at run-time, which makes the instruction cheaper.

## Specification

We introduce four new instructions on the same block number [EIP-3540](./eip-3540.md) is activated on:

Review comment (Contributor):

I think it would be better to specify activation independently (this EIP already declares that it depends on 3540). Activation depends on the EIP being included in a particular hard fork, which is an independent decision by the ACD.

Suggested change (original line, then proposed replacement):
We introduce four new instructions on the same block number [EIP-3540](./eip-3540.md) is activated on:
We introduce four new instructions:

Reply (Member):

Most of EOF must be activated at the same time, otherwise the functionality is very limited.


1. `DATALOAD` (0xe8)
2. `DATALOADN` (0xe9)
3. `DATASIZE` (0xea)
4. `DATACOPY` (0xeb)

If the code is legacy bytecode, all of these instructions result in an *exceptional halt*. (*Note: This means no change to behaviour.*)

If the code is valid EOF1, the following execution rules apply:

### `DATALOAD`

1. Pops one value, `offset`, from the stack.
2. If `offset + 32` is greater than the data section size of the active container, execution results in exceptional halt.
3. Reads `[offset:offset+32]` segment from the data section and pushes it as 32-byte value to the stack.
4. Deducts 3 gas.
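
The `DATALOAD` rules above can be sketched in Python as follows. This is only an illustration under stated assumptions, not a client implementation: the names `ExceptionalHalt` and `dataload` are hypothetical, the stack is modelled as a Python list whose last element is the top, and the 32 bytes are interpreted as a big-endian word, as with other EVM load instructions. The bounds check runs before the gas is returned; the observable result is the same as in the listed order because an exceptional halt consumes all gas.

```python
class ExceptionalHalt(Exception):
    """Signals an exceptional halt (hypothetical name for this sketch)."""


DATALOAD_GAS = 3  # gas cost stated in the rule list above


def dataload(stack: list[int], data: bytes) -> int:
    """Apply DATALOAD to `stack` and the data section; return the gas charged."""
    offset = stack.pop()
    # Rule 2: the whole 32-byte window must lie inside the data section.
    if offset + 32 > len(data):
        raise ExceptionalHalt("DATALOAD reads past the end of the data section")
    # Assumption: the bytes form a big-endian 256-bit word.
    stack.append(int.from_bytes(data[offset:offset + 32], "big"))
    return DATALOAD_GAS
```

With a 40-byte data section, an `offset` of 8 is the largest value that succeeds, since `8 + 32` equals the section size.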

### `DATALOADN`

1. Has one immediate argument, `offset`, encoded as a 16-bit unsigned big-endian value.
2. Pops nothing from the stack.
3. Reads `[offset:offset+32]` segment from the data section and pushes it as 32-byte value to the stack.
4. Deducts 2 gas.

`[offset:offset+32]` is guaranteed to be within data bounds by [code validation](#code-validation).
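
A minimal sketch of `DATALOADN`, assuming the code section has already passed validation so that no run-time bounds check is needed. The function name `dataloadn`, the `(next_pc, gas)` return shape, and the list-based stack are assumptions made for this illustration only.

```python
DATALOADN_GAS = 2  # gas cost stated in the rule list above


def dataloadn(code: bytes, pc: int, stack: list[int], data: bytes) -> tuple[int, int]:
    """Apply DATALOADN located at `code[pc]`; return (next_pc, gas charged)."""
    # The two bytes after the opcode hold the offset as a 16-bit big-endian value.
    offset = int.from_bytes(code[pc + 1:pc + 3], "big")
    # No bounds check here: validation guarantees offset + 32 <= len(data).
    stack.append(int.from_bytes(data[offset:offset + 32], "big"))
    # Skip the opcode byte plus its two immediate bytes.
    return pc + 3, DATALOADN_GAS
```

Compared with the `DATALOAD` rules, nothing is popped and nothing is checked at run time, which is where the lower gas cost comes from.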

### `DATASIZE`

1. Pops nothing from the stack.
2. Pushes the size of the data section of the active container to the stack.
3. Deducts 2 gas.

### `DATACOPY`

1. Pops three values from the stack: `mem_offset`, `offset`, `size`.
2. Performs memory expansion to `mem_offset + size` and deducts memory expansion cost.
3. Deducts `3 * ((size + 31) // 32)` gas for copying.
4. If `offset + size` is greater than data section size of the active container, execution results in exceptional halt.
5. Reads `[offset:offset+size]` segment from the data section and writes it to memory starting at offset `mem_offset`.
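
The `DATACOPY` steps can be sketched as below. Several things here are assumptions rather than part of the text above: the names `datacopy` and `ExceptionalHalt` are hypothetical, memory is a `bytearray` grown in 32-byte words, a zero `size` triggers no expansion (as with the existing copy instructions), and the memory expansion cost itself is left out, so only the copying gas is returned.

```python
class ExceptionalHalt(Exception):
    """Signals an exceptional halt (hypothetical name for this sketch)."""


def datacopy(stack: list[int], memory: bytearray, data: bytes) -> int:
    """Apply DATACOPY; return the copying gas (expansion cost not modelled)."""
    # The top of the stack is the last list element, so mem_offset pops first.
    mem_offset = stack.pop()
    offset = stack.pop()
    size = stack.pop()
    # Memory expansion to mem_offset + size, rounded up to whole 32-byte words.
    # Assumption: a zero-length copy does not expand memory.
    if size > 0:
        new_len = (mem_offset + size + 31) // 32 * 32
        if new_len > len(memory):
            memory.extend(bytes(new_len - len(memory)))
    copy_gas = 3 * ((size + 31) // 32)
    # Unlike CODECOPY-style zero padding, reading out of bounds halts.
    if offset + size > len(data):
        raise ExceptionalHalt("DATACOPY reads past the end of the data section")
    memory[mem_offset:mem_offset + size] = data[offset:offset + size]
    return copy_gas
```

For example, copying 4 bytes from data offset 2 into empty memory grows memory to one 32-byte word and charges 3 gas for the copy.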


### Code Validation

We extend code section validation rules (as defined in [EIP-3670](./eip-3670.md)).

1. A code section is invalid if the immediate argument `offset` of any `DATALOADN` is such that `offset + 32` is greater than the data section size.
2. `RJUMP`, `RJUMPI` and `RJUMPV` immediate argument value (jump destination relative offset) validation: a code section is invalid if a jump destination points to either of the two bytes directly following a `DATALOADN` opcode.
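
The two added rules could be checked along the following lines. This is a sketch under explicit assumptions, not the EIP-3670 validator: `validate_dataloadn` and `InvalidCode` are hypothetical names, `immediate_sizes` is a caller-supplied map from opcode to immediate length for every other instruction that carries immediates, and `jump_targets` is assumed to already hold absolute positions resolved from the relative jump offsets.

```python
DATALOADN = 0xE9  # opcode assigned in this draft


class InvalidCode(Exception):
    """Raised when a code section fails validation (hypothetical name)."""


def validate_dataloadn(code: bytes, data_size: int,
                       jump_targets: list[int],
                       immediate_sizes: dict[int, int]) -> None:
    """Check both DATALOADN rules; raise InvalidCode on the first violation."""
    sizes = {**immediate_sizes, DATALOADN: 2}
    immediate_positions: set[int] = set()
    pc = 0
    while pc < len(code):
        opcode = code[pc]
        width = sizes.get(opcode, 0)
        if width > 0 and pc + width >= len(code):
            raise InvalidCode("truncated immediate")
        if opcode == DATALOADN:
            offset = int.from_bytes(code[pc + 1:pc + 3], "big")
            # Rule 1: the 32-byte window must fit in the data section.
            if offset + 32 > data_size:
                raise InvalidCode("DATALOADN offset out of data bounds")
            immediate_positions.update((pc + 1, pc + 2))
        # Skip immediates so data bytes are never decoded as opcodes.
        pc += 1 + width
    # Rule 2: no jump may land on a DATALOADN immediate byte.
    for target in jump_targets:
        if target in immediate_positions:
            raise InvalidCode("jump into DATALOADN immediate")
```

Because rule 1 is enforced once at validation time, the run-time `DATALOADN` path needs no bounds check.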


## Rationale

TBA

## Backwards Compatibility

This change poses no risk to backwards compatibility, as it is introduced only for EOF1 contracts, for which deploying undefined instructions is not allowed, therefore there are no existing contracts using these instructions. The new instructions are not introduced for legacy bytecode (code which is not EOF formatted).

## Security Considerations

TBA

## Copyright

Copyright and related rights waived via [CC0](../LICENSE.md).