Skip to content

Latest commit

 

History

History
654 lines (489 loc) · 25 KB

File metadata and controls

654 lines (489 loc) · 25 KB
eip 6601
title EVM Modular Arithmetic Extensions (EVMMAX)
status Draft
type standards track
author Jared Wasinger <@jwasinger>, Alex Beregszaszi (@axic)
discussions-to https://ethereum-magicians.org/t/eip-6601-evm-modular-arithmetic-extensions-evmmax
category Core
created 2023-03-02
requires 4750, 3670

Abstract

This EIP proposes the addition of new optimized modular addition, subtraction and multiplication opcodes to the EVM. These support odd moduli up to 4096 bits in size.

Motivation

Benefits of the changes proposed in this EIP:

  • enables elliptic curve arithmetic operations on various curves including BLS12-381 to be implemented as EVM contracts
  • For operations on values up to 256bits in size, reduces gas cost per operation by 90-95% compared to the current MULMOD and ADDMOD opcodes.
  • for all cases where modexp precompile is useful, it could now be implemented as an EVM contract.
  • enables substantial cost reductions for algebraic hash functions (e.g. MiMC/Poseidon), zkp verification in the EVM.

Specification

Overview

Contracts that use EVMMAX make use of a new EOF section type called a setup section. This contains a modulus, the number of values that will be allocated to perform operations on and two special parameters (referred as the Montgomery parameters) specific to the chosen modulus which are precomputed off-chain by the contract author.

Multiple setup sections can be present to allow use of different moduli in the same contract.

To use a modulus that is chosen during contract execution, a setup section omits the modulus and Montgomery parameters.

During contract execution, a contract calls a setup instruction SETUPX, sourcing modulus and Montgomery parameters from a setup section (or if the setup section is configured to use a dynamic modulus, loading the modulus from memory and computing the Montgomery parameters). SETUPX allocates a zeroed memory space separate from EVM memory.

The modulus, Montgomery parameters and memory space are associated with the current call frame state and referred to as the active modulus state. If SETUPX is called again to switch to a different modulus, the memory space and Montgomery parameters of the previous active modulus state remain allocated (the memory spaces of active/previously-active modulus state are separate).

New store and load opcodes STOREX/LOADX are used to copy multiples values to/from EVM memory and the memory space of the active modulus state.

Arithmetic is performed with ADDMODX/SUBMODX/MULMODX opcodes which take and return no stack items, require a 3-byte immediate value appended to the opcode.

The immediate is interpreted as 3 1-byte values z, x, y which are indexes to the array of EVMMAX values that comprise the memory space of the active modulus state.

An arithmetic operation is performed on inputs at index x/y placing the result in index z.

Conventions

  1. x === y % m: x % m == y % m
  2. pow(x, -1, m): The modular multiplicative inverse of x with respect to modulus m.
  3. Opcode definition syntax is formatted as mneumonic {immediate - type} {immediate2 - type} ...: stack_arg_1, stack_arg_2, ... where immediates are listed in the order that they proceed the opcode and stack arguments are ordered starting at the top of the stack.
  4. In the provided pseudocode, it is assumed that opcode gas charging logic is executed prior to execution logic.
  5. Any exception thrown should immediately end the current execution frame and return to the caller.

Constants

Name Value Description
STOREX_BASE_GAS 3 base gas cost for STOREX opcode
LOADX_BASE_GAS 3 base gas cost for LOADX opcode
SETUPX_BASE_GAS 3 base gas cost for SETUPX opcode
EVMMAX_MAX_MEM 65,536 bytes maximum amount of EVMMAX memory that can be used in a call frame
MAX_MOD_SIZE 4096 bits tentative modulus size limit (can probably be removed because EVMMAX_MAX_MEM_SIZE effectively caps the modulus size)
MULMODX_SUBQUADRATIC_START 50 modulus size in in multiples of 8 bytes where we switch to subquadratic mulmont cost model
SYSTEM_WORD_SIZE_BITS varies depending on the system word size in bits of a client's CPU

Context Variables

Name Type Meaning
evmmax_state EVMMAXState a variable representing ephemeral state which exists for the duration of the current call and in the scope of the current call frame
evm_memory bytes EVM memory for the current call context
expand_evm_memory func(size_words: int) expands EVM memory by size_words * 32 bytes
cost_evm_memory_expansion func(new_size_evm_words: int) -> int EVM memory expansion cost function, modified according to this EIP
evm_stack object Allows access to the stack via pop() and peek(n) which return int stack elements
contract_code bytes code of the currently-executing contract
setup_sections dict map of mod_id (int) to a setup section object (field names are from setup section spec. int values)
pc int EVM program counter
class EVMMAXState():
    def __init__(self):
        # ModState currently being used
        self.active_mod_state = None
        # a lookup of mod_id (int) -> ModState
        self.mods = {}

class ModState():
    def __init__(self, mod: int, num_vals_used: int, mod: int, r: int, r_squared: int, mod_inv_full=None, mod_inv=None):
        self.mod = mod
        # size (expressed in multiples of 8 bytes) needed to represent mod
        self.val_size_multiplier = math.ceil(len(hex(mod)[2:]) / (2 * 8))
        
        self.num_vals_used = num_vals_used
        self.mod_inv = mod_inv
        self.mod_inv_full = mod_inv_full
        self.r = r
        self.r_squared = r_squared
        # a memory space of size num_vals_used * val_size_multiplier
        self.values = [0] * self.num_vals_used

Helpers

# -----------------------------------------------------------------------------
#  gas-charging helpers

def cost_precompute_mont(val_size_multiplier: int) -> int:
    PRECOMPUTE_MONT_LO_GAS_A = ?
    PRECOMPUTE_MONT_LO_GAS_B = ?

    PRECOMPUTE_MONT_HI_GAS_A = ?
    PRECOMPUTE_MONT_HI_GAS_B = ?
    
    cost = 0

    if val_size_multiplier < MULMODX_SUBQUADRATIC_START:
        cost = math.ceil(PRECOMPUTE_MONT_LO_GAS_A * val_size_multiplier + \
            PRECOMPUTE_MONT_LO_GAS_B)
    else:
        cost = math.ceil(PRECOMPUTE_MONT_HI_GAS_A * val_size_multiplier + \
            PRECOMPUTE_MONT_HI_GAS_B)

    return cost

def cost_addmodx(val_size_multiplier: int) -> int:
    ADDMODX_GAS_A = 0.20
    ADDMODX_GAS_B = 0.15
    
    cost = 0
    if val_size_multiplier == 6:
        cost = 1
    else:
        cost = round(ADDMODX_GAS_A * limb_count + ADDMODX_GAS_B)

    if cost == 0:
        cost = 1
    
    return cost

def cost_mulmodx(val_size_multiplier: int) -> int:
    MULMODX_LO_GAS_A = 0.090
    MULMODX_LO_GAS_B = 0 
    MULMODX_LO_GAS_C = 0.24

    MULMODX_HI_GAS_A = 0 
    MULMODX_HI_GAS_B = 10.0
    MULMODX_HI_GAS_C = -270.0
    
    cost = 0

    if val_size_multiplier == 6:
        cost = 2
    elif val_size_multiplier < MULMODX_SUBQUADRATIC_START:
        cost = math.ceil(MULMODX_LO_GAS_A * (val_size_multiplier ** 2) + \
            MULMODX_LO_GAS_B * val_size_multiplier + \
            MULMODX_LO_GAS_C)
    else:
        cost = math.ceil(MULMODX_HI_GAS_A * val_size_multiplier ** 2 + \
            MULMODX_HI_GAS_B * val_size_multiplier + \
            MULMODX_HI_GAS_C)

    if cost == 0:
        cost = 1
        
    return cost

# -----------------------------------------------------------------------------
#  bigint helpers
#   a bigint is a unsigned number represented as a list of unsigned system words in descending order of significance

# split a double-width value into hi/low words
def hi_lo(double_width_val: int) -> (int, int):
    base = 2**SYSTEM_WORD_SIZE_BITS
    assert double_width_val < base**SYSTEM_WORD_SIZE_BITS, "val must fit in two words"
    return (double_width_val >> SYSTEM_WORD_SIZE_BITS) % base, double_width_val % base

def bigint_to_int(x: [int]) -> int:
    res = 0
    for i in reversed(range(len(x))):
        res += x[i] * 2**(SYSTEM_WORD_BITS * (len(x) - i - 1))
    return res

def int_to_bigint(x: int, word_count: int):
    res = [0] * word_count
    for i in range(word_count):
        res[word_count - i - 1] = x & (2**SYSTEM_WORD_BITS - 1)
        x >>= SYSTEM_WORD_BITS
    return res

# return x - y (omitting borrow-out)
def bigint_sub(x: [int], y: [int]) -> [int]:
    num_words = len(x)
    res = [0] * num_words
    c = 0 

    for i in reversed(range(num_words)):
        c, res[i] = sub_with_borrow(x[i], y[i], c)

    return res

# return x >= y
def bigint_gte(x: [int], y: [int]) -> bool:
    for (x_word, y_word) in list(zip(x,y)):
        if x_word > y_word:
            return True
        elif x_word < y_word:
            return False
    # x == y
    return True

# CIOS Montgomery multiplication algorithm
#
# input:
# * x, y, mod - bigint inputs of `val_size_multiplier` length.  the most significant limb of the modulus cannot be zero.
# * mod_inv - pow(-mod, -1, 2**SYSTEM_WORD_SIZE_BITS)
# requires:
# * x < mod and y < mod
# * mod_int % 2 != 0
# * mod[0] != 0
# returns:
#    (x * y * pow(2**(SYSTEM_WORD_SIZE_BITS * val_size_multiplier), -1, mod)) % mod represented as a bigint
# note: references to x_int/y_int/mod_int/t_int refer to the python int representation of the corresponding bigint variable
def mulmont_quadratic(x: [int], y: [int], mod: [int], modinv: int) -> [int]:
    assert len(x) == len(y) and len(y) == len(mod), "{}, {}, {}".format(x, y, mod)
    assert mod[0] != 0, "modulus must occupy all words"

    word_count = len(mod)

    t = [0] * (word_count + 2)

    for i in reversed(range(word_count)):
        # first inner-loop: t <- t + x_int * y[i]
        c = 0
        for j in reversed(range(word_count)):
            c, t[j + 2] = hi_lo(t[j + 2] + x[j] * y[i] + c)

        t[0], t[1] = hi_lo(t[1] + c)

        m = (modinv * t[-1]) % BASE
        c, _ = hi_lo(m * mod[-1] + t[-1])

        # second inner-loop:
        #    1. t_int <- t_int + modinv * mod_int * t[-1]
        #    2. t_int <- t_int // (2**SYSTEM_WORD_SIZE)
        # note:
        #    after step 1:
        #    * modinv * mod_int * t[-1] === -1 % (2**SYSTEM_WORD_SIZE_BITS)
        #    * t_int === (t_int + (-1) t_int) % (2**SYSTEM_WORD_SIZE_BITS) === 0 % (2**SYSTEM_WORD_SIZE_BITS)
        #    so the shift in step 2 is a word-sized right shift.
        #    Steps 1 and 2 are combined and the shift is implict.
        for j in reversed(range(1, word_count)):
            c, t[j + 2] = hi_lo(t[j + 1] + mod[j - 1] * m + c)

        hi, t[2] = hi_lo(t[1] + c)
        t[1] = t[0] + hi

    # t_int = (t_int + t_int * mod_int * pow(-(2**(SYSTEM_WORD_SIZE_BITS*len(mod))), -1, mod_int)) // (2 ** (len(mod) * SYSTEM_WORD_SIZE_BITS))
    # 0 < t_int < 2 * mod_int
    t = t[1:]
    if t[0] != 0:
        # result occupies len(mod) + 1 words so it must be greater than modulus
        return bigint_sub(t, [0] + mod)[1:]
    elif bigint_gte(t[1:], mod):
        return bigint_sub(t[1:], mod)
    else:
        return t[1:]

# subquadratic mulmont:  same general algorithm as mulmont_quadratic with the assumption
#   that any multiplications will be performed using Karatsuba subquadratic multiplication algorithm
# input:
#   x, y, mod (int) - x < mod and y < mod
#   mod (int) - an odd modulus
#   R (int) -  a power of two, and greater than mod
#   mod_inv (int) - pow(-mod, -1, R)
# output:
#  (x * y * pow(R, -1, mod)) % mod
#
def mulmont_subquadratic(x: int, y: int, mod: int, mod_inv_full: int, R: int) -> int:
    T = x * y
    m = ((T % R) * mod_inv_full) % R
    T = T + m * mod
    T /= R
    if T >= mod:
        T -= mod
    return T

def mulmont(mod_state: ModState, x: int, y: int) -> int:
    if mod_state.val_size_multiplier >= MULMODX_SUBQUADRATIC_START:
        return mulmont_subquadratic(x, y, mod_state.mod, mod_state.mod_inv)
    else:
        x_bigint = int_to_bigint(x, (mod_state.val_size_multiplier * 64) // SYSTEM_WORD_SIZE_BITS)
        y_bigint = int_to_bigint(y, (mod_state.val_size_multiplier * 64) // SYSTEM_WORD_SIZE_BITS)
        mod_bigint = int_to_bigint(mod_state.mod)
        return bigint_to_int(mulmont_quadratic(x_bigint, y_bigint, mod_bigint, mod_state.mod_inv_full, mod_state.r))

Setup Section

Introduce a new section type with section_kind = 3.

Section Body

mod_id          <uint64> - setup section identifier
mod_size        <uint64> - size in bytes needed to represent the modulus
num_vals_used   <uint16> - number of allocated values (1-256)
mod_entry_size  <uint64> - size of the modulus entry in the setup section (0 if setup section is dynamic)
mod             <mod_size bytes> - modulus in big-endian ordering.  (omitted if setup section is dynamic)
mod_inv         <8 bytes> - Montgomery parameter for quadratic mulmont: pow(-mod, -1, 1<<64). (omitted if setup section is dynamic or ceil(mod_size / 8) >= MULMODX_SUBQUADRATIC_START))
mod_inv_full    <mod_size bytes> - Montgomery parameter for subquadratic mulmont.  (omitted if setup section is dynamic or ceil(mod_size / 8) < MULMODX_SUBQUADRATIC_START)
r_squared       <mod_size bytes> - Montgomery parameter used by STOREX to convert values to Montgomery form (omitted if setup section is dynamic)

New Opcodes

Mneumonic Opcode Immediate size (bytes) Stack in Stack out
SETUPX 0x21 1 1 0
ADDMODX 0x22 3 0 0
SUBMODX 0x23 3 0 0
MULMODX 0x24 3 0 0
LOADX 0x25 0 3 0
STOREX 0x26 0 3 0

SETUPX

SETUPX {mod_id - byte}: mod_offset

Gas Charging
mod_id = int(contract_code[pc+1:pc+2])

mod_offset = evm_stack.peek(0)
mod_size = setup_sections[mod_id].mod_size
cost = SETUPX_BASE_GAS

if mod_id in evmmax_state.mods:
    # the modulus state keyed by mod_id was already active in this call-frame.  nothing to do.
    return 

if mod_offset + mod_size > len(evm_memory):
    raise Exception("cannot load a modulus that would extend beyond the bounds of EVM memory")

num_vals_used = setup_sections[mod_id].num_vals_used
val_size_multiplier = math.ceil(setup_sections[mod_id].val_size / 8)

if setup_sections[mod_id].mod_entry_size == 0:
    # dynamic setup section
    cost += cost_precompute_mont(val_size_multiplier)
    
cost += cost_evm_memory_expansion(math.ceil((num_vals_used * val_size_multiplier * 8) / 32))
Execution
mod_offset = stack.pop()
mod = None
mod_inv = None
mod_id = int(contract_code[pc+1:pc+2])

if mod_id in evmmax_state.mods[mod_id]:
    # this mod state was previously used in this call frame. it is already allocated
    evmmax_state.active_mod_state = evmmax_state.mods[mod_id]
    return

val_size = setup_sections[mod_id].val_size
val_size_multiplier = math.ceil(val_size / 8)

if setup_sections[mod_id].mod_entry_size == 0:
    # setup section with dynamic modulus
    mod = int.from_bytes(evm_memory[mod_offset:mod_offset+val_size], byteorder='big')
    if mod == 0 or mod % 2 == 0:
        raise Exception("modulus must be nonzero and odd")

	if val_size_multiplier >= MULMODX_SUBQUADRATIC_START:
		mod_inv_full = pow(-r, -1, mod)
    else:
        mod_inv = pow(-mod, -1, 2**SYSTEM_WORD_SIZE_BITS)
else:
    mod = setup_sections[mod_id].mod
    mod_inv = setup_sections[mod_id].mod_inv
    mod_inv_full = setup_sections[mod_id].mod_inv_full

r = 2**(SYSTEM_WORD_SIZE_BITS * val_size_multiplier)
r_squared = r**2 % mod

mod_state = ModState(mod, val_size, r, r_squared, mod_inv_full=mod_inv_full, mod_inv=mod_inv)

evmmax_state.mods[mod_id] = mod_state
evmmax_state.active_mod_state = mod_state

LOADX

LOADX: dst_offset, val_idx, num_vals

Description

Load EVMMAX values in the current active modulus state to EVM memory.

Gas Charging
cost = LOADX_BASE_GAS
dst_offset = evm_stack.peek(0)
val_idx = evm_stack.peek(1)
num_vals = evm_stack.peek(2)

val_size_multiplier = evmmax_state.active_mod_state.val_size_multiplier
if dst_offset + num_vals * val_size_multiplier > len(evm_memory):
    cost += cost_evm_mem_expansion(evm_memory, (dst_offset + num_vals * val_size_multiplier) - len(evm_memory))

cost += cost_mulmodx(val_size_multiplier) * mod_state.num_vals
Execution
dst_offset = evm_stack.pop()
val_idx = evm_stack.pop()
num_vals = evm_stack.pop()

if num_vals == 0:
    return

mod_state = evmmax_state.active_mod_state
if mod_state == None:
    raise Exception("no modulus set")

if val_idx + num_vals > len(mod_state.vals):
    raise Exception("attempt to load beyond allocated values")

if dst_offset + num_vals * mod_state.val_size_multiplier > len(evm_memory):
    expand_evm_memory(evm_memory, (dst_offset + num_vals * mod_state.val_size_multiplier * 8) - len(evm_memory))

cur_dst_offset = dst_offset
for i in range(num_vals):
    mont_val = mod_state.vals[start_val + i]

    # convert the value to canonical form
    val = mulmont(mod_state, mont_val, 1)

    evm_memory[cur_dst_offset:cur_dst_offset + mod_state.val_size_multiplier] = val.to_bytes(mod_state.val_size_multiplier * 8, byteorder='big')
    cur_dst_offset += mod_state.val_size_multiplier * 8

STOREX

STOREX: dst_val, offset, num_vals

Description

Store values from EVM memory into EVMMAX memory space of the current active modulus state, validating that they are reduced by the modulus.

Gas Charging
dst_val = evm_stack.peek(0)
offset = evm_stack.peek(1)
num_vals = evm_stack.peek(2)

val_size_multiplier = evmmax_state.active_mod_state.val_size_multiplier
cost = STOREX_BASE_COST + num_vals * cost_mulmodx(val_size_multiplier)
Execution
dst_val = evm_stack.pop()
offset = evm_stack.pop()
num_vals = evm_stack.pop()

if num_vals == 0:
    return

mod_state = evmmax_state.active_mod_state
if mod_state == None:
    raise Exception("no modulus set")

if dst_val + num_vals > len(mod_state.vals):
    raise Exception("attempt to copy to destination beyond allocated values")

if offset + num_vals * mod_state.val_size_multiplier * 8 > len(evm_memory):
    raise Exception("source of copy would extend beyond allocated memory")

cur_src_offset = offset
r = 2** (mod_state.val_size_multiplier * SYSTEM_WORD_SIZE_BITS) % mod_state.mod
r_squared = r ** 2 % mod_state.mod

for i in range(num_vals):
    val = int.from_bytes(evm_memory[cur_src_offset:cur_src_offset + mod_state.val_size_multiplier * 8], byteorder='big')
    
    if val >= mod_state.modulus:
        raise Exception("values cannot be greater than the modulus")
    
    # convert the value to Montgomery form
    mont_val = mulmont(mod_state, val, mod_state.r_squared)

    mod_state.vals[dst_val + i] = mont_val
    cur_offset += mod_state.val_size_multiplier * 8

ADDMODX

ADDMODX {z_offset - byte}, {x_offset - byte}, {y_offset - byte}:

Description

Compute the modular addition of two EVMMAX values, storing the result in an output.

Gas Charging
val_size_multiplier = evmmax_state.active_mod_state.val_size_multiplier
cost = cost_addmodx(val_size_multiplier)
Execution
mod_state = evmmax_state.active_modulus
if mod_state == None:
    raise Exception("no mod state set")

z_offset = int(contract_code[pc+1:pc+2])
x_offset = int(contract_code[pc+2:pc+3])
y_offset = int(contract_code[pc+3:pc+4])

if x_offset >= mod_state.num_vals_used or y_offset >= mod_state.num_vals_used or z_offset >= mod_state.num_vals_used:
    raise Exception("out of bounds value reference")

mod_state.values[z_offset] = (mod_state.values[x_offset] + mod_state.values[y_offset]) % mod_state.mod

SUBMODX

SUBMODX {z_offset - byte}, {x_offset - byte}, {y_offset - byte}:

Description

Compute the modular subtraction of two EVMMAX values in the current active modulus state, storing the result in an output.

Gas Charging

Same as ADDMODX.

Execution
mod_state = evmmax_state.active_modulus
if mod_state == None:
    raise Exception("no mod state set")

z_offset = int(contract_code[pc+1:pc+2])
x_offset = int(contract_code[pc+2:pc+3])
y_offset = int(contract_code[pc+3:pc+4])

if x_offset >= mod_state.num_vals_used or y_offset >= mod_state.num_vals_used or z_offset >= mod_state.num_vals_used:
    raise Exception("out of bounds value reference")

mod_state.values[z_offset] = (mod_state.values[x_offset] - mod_state.values[y_offset]) % mod_state.mod

MULMODX

MULMODX {z_offset - byte}, {x_offset - byte}, {y_offset - byte}:

Description

Compute the Montgomery modular multiplication of two EVMMAX values in the current active modulus state, storing the result in an output.

Gas Charging
val_size_multiplier = evmmax_state.active_mod_state.val_size_multiplier
cost = cost_mulmodx(val_size_multiplier)
Execution
mod_state = evmmax_state.active_modulus
if mod_state == None:
    raise Exception("no mod state set")

z_offset = int(contract_code[pc+1:pc+2])
x_offset = int(contract_code[pc+2:pc+3])
y_offset = int(contract_code[pc+3:pc+4])

if x_offset >= mod_state.num_vals_used or y_offset >= mod_state.num_vals_used or z_offset >= mod_state.num_vals_used:
    raise Exception("out of bounds value reference")

mod_state.values[z_offset] = mulmont(mod_state, mod_state.values[x_offset], mod_state.values[y_offset])

Changes to Contract Execution

Any EVM operation which expands memory x bytes will charge to expand memory to cur_evm_mem_size + x + evmmax_mem_size bytes where evmmax_mem_size is the size of all allocated EVMMAX values in the current call context (the sum of the values used by each mod_id that has been previously/currently set with SETUPX).

Changes to Contract Creation

Code Section Validation:

  • cannot contain SETUPX{id} if there is not a setup section with mod_id == id.

Setup Section Validation:

  • assert that the section is not malformed
  • assert there is no other setup section with the same mod_id
  • if mod_entry_size != 0:
    • assert that the modulus is odd and the msb is set
    • assert r_squared == 2**(64 * ceil(mod_size / 8) * 2) % mod
    • if ceil(mod_size / 8) < MULMODX_SUBQUADRATIC_START:
      • assert (mod * mod_inv) % (2 ** 64) == 1
    • else:
      • assert (mod * mod_inv_full) % (2 ** (ceil(mod_size / 8) * 64) == 1

Reject contracts where:

  • there exists a setup section with mod_id = id where there is no SETUPX{id} in any code section.
  • A code section contains SETUPX{id} where there is not a corresponding setup section with mod_id = id.
  • the sum of num_vals_used * ceil(mod_size / 8) * 8 from each setup section is greater than EVMMAX_MAX_MEM.

Assume that there is some extra cost to account for these validation rules (amount TBD based on benchmarks. will scale linearly with the number of dynamic setup sections).

Rationale

Montgomery Modular Multiplication

EVMMAX values are stored internally in Montgomery form. Expressing values in Montgomery form enables the use of Montgomery reduction in modular multiplication which gives a substantial performance gain versus naive modular multiplication.

Modular addition and subtraction on Montgomery form values is computed the same as normal.

Memory Alignment for EVMMAX Values

LOADX/STOREX move 64bit-aligned big-endian values to/from the memory space of the active modulus state. SETUPX memory expansion pricing is tuned to assume that values will be stored in a as 64bit-aligned values in their EVMMAX memory space.

This choice is made to keep EVMMAX memory aligned to ensure performance.

Gas Costs

Gas models assume a rate of 1 gas per 25ns of execution time.

Arithmetic Opcodes

ADDMODX and SUBMODX can each be implemented using a single extended-precision addition, and single extended precision subtraction. This justifies a linear cost model.

MULMODX runtime scales quadratically with input size. After a certain threshold, the quadratic complexity of mulmont_quadratic dominates and it becomes more performant to use mulmont_subquadratic. Thus, there is a segmented cost model to reflect different asymptotic behavior between quadratic/subquadratic mulmont.

ADDMODX/SUBMODX/MULMODX pricing includes the cost of arithmetic and latency of accessing input values from CPU cache.

The price model assumes that the implementation will be generic for most bitwidths with the exception of 321-384bits which is priced aggressively.

Load/Store Opcodes

These perform conversion to/from Montgomery and canonical forms for each value copied (a single mulmont per value converted). The overhead of memory loading/copying is covered by cost_mulmontx.

Setup opcode

Specific costs TBD.

In both static/dynamic case, if SETUPX has not been called with a given setup_id, memory expansion cost is charged.

Dynamic Modulus

Gas cost should include overhead of computing the Montgomery parameters.

For choices of val_size_multiplier up to MULMODX_SUBQUADRATIC, this is constant (the inverse can be computed using the lowest SYSTEM_WORD_SIZE_BITS bits of the modulus).

At larger sizes, a full modular inverse has to be computed over a SYSTEM_WORD_SIZE_BITS * val_size_multiplier bit modulus.

The r_squared parameter requires computing a double-width product and modular reduction by mod.