SSTORE/SLOAD for byte arrays #97

vbuterin · 2016-04-26T08:26:49Z

NOTE: this is split off from #86, and updated to incorporate what was agreed on the dev call on 2016.04.25.

Parameters

BASE_COST: 2500
BYTE_STORAGE_COST: 250
BYTE_CHANGE_COST: 40
MIN_COST: 2500
GSSIZE: 50
GSLOADBYTES: 50
METROPOLIS_FORK_BLKNUM: TBA

Specification

If block.number >= METROPOLIS_FORK_BLKNUM, then:

Add an opcode SLOADBYTES at 0xe1, which takes three arguments off the stack: key, mstart, msize. Reads the storage of the account at the given key, and outputs the result into memory; unused memory is left unmodified, and memory is only extended to the point where it is needed (eg. if msize = 2**200 but the result only returns 1000 bytes, then you only pay gas for extending up to mstart + 1000). Gas cost GSLOADBYTES, plus GCOPY gas per 32 bytes (eg. 319 bytes -> 9 * GCOPY, 320 -> 10 * GCOPY, 321 -> 10 * GCOPY, just like CALLDATACOPY, CODECOPY and EXTCODECOPY)
Add an opcode SSTOREBYTES at 0xe2, which takes three arguments off the stack: key, mstart, msize. Copies the given memory slice into that key in storage. Gas cost is computed according to the following schedule: total_cost = BASE_COST + BYTE_STORAGE_COST * (adjusted byte count of memory slice - adjusted byte count of previous contents) + BYTE_CHANGE_COST * (adjusted byte count of memory slice), gas_cost = max(total_cost, MIN_COST), refund = MIN_COST - min(total_cost, MIN_COST). Adjusted byte count = 32 + length if length is nonzero, otherwize 0. An SLOAD operation on contract data that was previously filled with SSTOREBYTES return the last 32 bytes of the data, right-zeropadded. Also, note that using SSTOREBYTES to save a slice of zero bytes actually saves the zero bytes; it does not delete the slot.
Add an opcode SSIZE at 0xe3, which takes a key off the stack, and returns the length of the value in storage there. Gas cost GSSIZE.

Code for SSTOREBYTES

BASE_COST = 2500
BYTE_STORAGE_COST = 250
BYTE_CHANGE_COST = 40
MIN_COST = 2500
...
elif op == 'SSTOREBYTES':
    key, mstart, msize = stk.pop(), stk.pop(), stk.pop()
    if not mem_extend(mem, compustate, op, mstart, msize):
        return vm_exception('OOG EXTENDING MEMORY')
    prev_adjbyte_count = len(ext.get_storage_data(msg.to, key))
    if prev_adjbyte_count >= 0:
        prev_adjbyte_count += 32
    post_adjbyte_count = msize + (32 if msize else 0)
    gas_cost = BASE_COST + BYTE_STORAGE_COST * (post_adjbyte_count - prev_adjbyte_count) + BYTE_CHANGE_COST * post_adjbyte_count
    gas_payment = max(MIN_COST, gas_cost)
    refund = gas_payment - gas_cost
    if compustate.gas < gas_payment:
        return vm_exception('OUT OF GAS')
    compustate.gas -= gas_payment
    ext.set_storage_data(msg.to, ''.join(map(chr, mem[mstart: mstart + msize])))

Rationale

This allows contracts to store data more efficiently by reducing the number of trie/DB writes and reads needed for contract execution. Recent measurements show that trie and DB writes are a primary source of overhead, and serenity tests show a >2x speed improvement in some cases if code is written well. The canonical examples include:

Cases where a byte array or string needs to be saved to storage. One practical use case is the storage of 80-byte bitcoin block headers in btcrelay.
Cases where a struct needs to be saved to storage, and many elements in the struct are updated at the same time. One practical use case "order book" contracts where buyer, seller, amount, price, currency, etc are all entered at once.

Additionally, compiler logic may be simplified as storing data in sequential slots is no longer required. This does come at the cost of requiring updates to blockchain tools that currently assume that all storage values are either empty or exactly 32 bytes long.

The rationale behind adding a 32 byte cost to nonempty storage slots is that even a one-byte storage value will necessarily have overhead including key/value storage size costs, merkle tree branches/leaves, etc; additionally, an incentive is required to encourage saving a constant 320-byte chunk in a single key rather than splitting it up among 10 keys.

The text was updated successfully, but these errors were encountered:

wanderer · 2016-04-26T08:59:57Z

Possible implementation concern. Leveldb doesn't handle large values very well. To get around this, break large values into chunks and append nonce to the key. Since leveldb sorts key lexicographical it very efficient to stream the chunks. Leveldb has lg and lt operations, so the query needed here would be gt <key> and lt <key + 1> where +1 refers to the next value, not append.

vbuterin · 2016-05-01T14:44:23Z

If you look at the bytes per second from that benchmark, you get 779k writes/sec * 100 bytes/write = 77.9 mbps for small values and 1100 writes/sec * 100000 bytes/write = 110 mbps for large values. So it doesn't seem to be fatal or even that problematic.

chriseth · 2016-05-05T18:36:51Z

If this is adopted and should be usable from Solidity, it will add another layer of complexity. This "store as single blob" has to be an optional feature for structs and arrays, because otherwise, elements of the structs and arrays are not accessible as lvalues anymore.

Furthermore, the fact that storage size is dynamic will need some additional handling: If sloadbytes does not write to the full range of memory, we always have to check the size beforehand and optionally even throw if the size is not what we expect.

Finally, as already commented on the original issue, I think that having two "types" of storage slots will add similar complexity, although I think that this is not too big an issue.

janx · 2016-05-06T01:53:36Z

elements of the structs and arrays are not accessible as lvalues anymore.

@chriseth can you give more details?

Nashatyrev · 2016-05-13T10:28:04Z

Are there any other rationales behind this EIP besides optimization?
If not than in my opinion this is the overkill to solve pure engineering performance problem with such a serious protocol change.
That change introduces a number of complexities (and potential vulnerabilities accordingly) for clients which are mostly relying on 32 bytes values. The change also complicates the protocol itself without introducing any new fundamental improvements.

tawaren · 2017-02-17T12:21:01Z

What is the status of this?
I assume the gas costs as stated now are not updated to include adaptions from #150

github-actions · 2022-01-16T06:14:00Z

There has been no activity on this issue for two months. It will be closed in a week if no further activity occurs. If you would like to move this EIP forward, please respond to any outstanding feedback or add a comment indicating that you have addressed all required feedback and are ready for a review.

github-actions · 2022-01-30T07:10:25Z

This issue was closed due to inactivity. If you are still pursuing it, feel free to reopen it and respond to any feedback or request a review in a comment.

qizhou · 2022-06-16T05:15:54Z

Some thoughts on backward compatibility for SSTORE/SLOAD:

If SSTOREBYTES writes a bytes data with len < 32, then SLOAD will read the data with right-padded zeros to 32 bytes
If SSTOREBYTES writes a bytes data with len > 32, then SLOAD will return a left-32-bytes-truncated data
SSTORE = SSTOREBYES with size = 32

Further, to keep the IO loads of SSTORE/SLOAD as before, we could implement the storage layout by maintaining two mappings:

Original Mapping 32-bytes key => 32-bytes or 36-bytes original_value. If original_value size is 32 bytes (which is the same as the existing SSTORE/SLOAD), then the value size is 32 bytes. If original_value size is 36 bytes, then the last 4 bytes encode the size of the value. If the size < 32, then the value is the truncated version of original_value. If the size > 32, then the rest size-32 bytes are stored in the following mapping.
Extended Mapping 32-bytes key => arbitrary-size extended_value. This maps the rest part of the value if the size of the value.

SSIZE would simply read original_value from Original Mapping and return 32 if the value is 32 bytes or the decoded value of the last 4 bytes of the value.

This should make SSTOREBYTES/SLOADBYTES/SSIZE as a pure add-on feature of existing EVM - existing SSTORE/SLOAD semantics are kept and their IO costs are the same as before. However, if a contract mixes the usage of fixed-bytes SSTORE/SLOAD and arbitrary-size SSTOREBYTES/SLOADBYTES/SSIZE, then the gas metering of all these opcodes needs further discussions.

vbuterin mentioned this issue Apr 26, 2016

Proposed initial abstraction changes for Metropolis #86

Closed

konradkonrad mentioned this issue Jul 8, 2016

EIP 97 ethereum/pyethereum#381

Closed

wanderer mentioned this issue Jul 15, 2016

sstore/sload without fixed field size ewasm/design#11

Open

Souptacular added the editor-needs-to-review label Feb 10, 2017

wanderer mentioned this issue Mar 14, 2017

Storage Model and API for Prima primea/design#2

Open

Arachnid removed the editor-needs-to-review label Mar 27, 2018

computerality mentioned this issue Sep 4, 2018

Split instruction table into implemented vs un-implemented instructions crytic/evm-opcodes#5

Open

chfast mentioned this issue Feb 6, 2019

Extend storage values to byte arrays of arbitrary length ewasm/design#176

Closed

axic mentioned this issue Mar 9, 2019

EIP 104 (Serenity): Unlimited storage key or value size #36

Closed

github-actions bot added the stale label Jan 16, 2022

github-actions bot closed this as completed Jan 30, 2022

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

SSTORE/SLOAD for byte arrays #97

SSTORE/SLOAD for byte arrays #97

vbuterin commented Apr 26, 2016 •

edited

Loading

wanderer commented Apr 26, 2016

vbuterin commented May 1, 2016

chriseth commented May 5, 2016

janx commented May 6, 2016

Nashatyrev commented May 13, 2016 •

edited

Loading

tawaren commented Feb 17, 2017 •

edited

Loading

github-actions bot commented Jan 16, 2022

github-actions bot commented Jan 30, 2022

qizhou commented Jun 16, 2022

SSTORE/SLOAD for byte arrays #97

SSTORE/SLOAD for byte arrays #97

Comments

vbuterin commented Apr 26, 2016 • edited Loading

Parameters

Specification

Code for SSTOREBYTES

Rationale

wanderer commented Apr 26, 2016

vbuterin commented May 1, 2016

chriseth commented May 5, 2016

janx commented May 6, 2016

Nashatyrev commented May 13, 2016 • edited Loading

tawaren commented Feb 17, 2017 • edited Loading

github-actions bot commented Jan 16, 2022

github-actions bot commented Jan 30, 2022

qizhou commented Jun 16, 2022

vbuterin commented Apr 26, 2016 •

edited

Loading

Nashatyrev commented May 13, 2016 •

edited

Loading

tawaren commented Feb 17, 2017 •

edited

Loading