Memory64

Summary

This page describes a proposal to support linear memories larger than 2^32 bytes. It adds no new instructions; instead, it extends the currently existing instructions to allow 64-bit indexes.

Implementation Status

  • spec interpreter: Done
  • v8/chrome: Done
  • Firefox: Done
  • Safari: ?
  • wabt: Done
  • binaryen: Done
  • emscripten: Done

Motivation

WebAssembly linear memory objects have sizes measured in pages. Each page is 65536 (2^16) bytes. In WebAssembly version 1, a linear memory can have at most 65536 pages, for a total of 2^32 bytes (4 gibibytes).

In addition to this page limit, all memory instructions currently use the i32 type as a memory index. This means they can address at most 2^32 bytes as well.

For many applications, 4 gibibytes of memory is enough. Using 32-bit memory indexes is sufficient in this case, and has the additional benefit that pointers in the producer language are smaller, which can yield memory savings. However, for applications that need more memory than this, there are no easy workarounds given the current WebAssembly feature set. Allowing the WebAssembly module to choose between 32-bit and 64-bit memory indexes addresses both concerns.

Similarly, since WebAssembly is a Virtual Instruction Set Architecture (ISA), some hosts may want to use the WebAssembly binary format as a portable executable format, in addition to supporting other non-virtual ISAs. Nearly all ISAs have support for 64-bit memory addresses now, and a host may not want to have to support 32-bit memory addresses in their ABI.

Overview

Structure

  • A new idxtype can be either i32 or i64

    • idxtype ::= i32 | i64
  • The limits structure is parameterised by the index value type

    • limits_iv ::= {min iv, max iv?} (the parameter is omitted where it is immaterial)
  • The table type continues to use i32 indices

    • tabletype ::= limits_i32 elemtype
  • The memory type structure is extended to have an index type

    • memtype ::= idxtype limits_iv (iff idxtype = type(iv))
    • where
      type(i32) = I32
      type(i64) = I64
      
  • The memarg immediate is changed to allow a 64-bit offset

    • memarg ::= {offset u64, align u32}
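
For example, using the text format defined later in this document, a module can declare a 64-bit memory by writing the index type before the limits. The sketch below is illustrative only (the export name is made up); it shows the extended memtype together with a memory access whose address operand is an i64:

    (module
      ;; a 64-bit memory: index type i64, min 1 page, max 4 pages
      (memory i64 1 4)

      (func (export "load_first_byte") (result i32)
        ;; the address operand is an i64 because the memory's index type is i64;
        ;; the static memarg offset may now be any u64 constant
        i64.const 0
        i32.load8_u offset=0))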

Validation

  • Index types are classified by their value range:

    • ----------------
      ⊦ i32 : 2**16
      
    • ----------------
      ⊦ i64 : 2**48
      
  • Memory page limits are classified by their index types

    • ⊦ it : k    n <= k    (m <= k)?    (n < m)?
      -------------------------------------------
      ⊦ { min n, max m? } : it
      
  • Memory types are validated accordingly:

    • ⊦ limits : it
      --------------
      ⊦ it limits ok
      
  • All memory instructions are changed to use the index type, and the offset must also be in range of the index type (see the example module after this list)

    • t.load memarg
      • C.mems[0] = it limits   2**memarg.align <= |t|/8   memarg.offset < 2**|it|
        --------------------------------------------------------------------------
                            C ⊦ t.load memarg : [it] → [t]
        
    • t.loadN_sx memarg
      • C.mems[0] = it limits   2**memarg.align <= N/8   memarg.offset < 2**|it|
        ------------------------------------------------------------------------
                          C ⊦ t.loadN_sx memarg : [it] → [t]
        
    • t.store memarg
      • C.mems[0] = it limits   2**memarg.align <= |t|/8   memarg.offset < 2**|it|
        --------------------------------------------------------------------------
                           C ⊦ t.store memarg : [it t] → []
        
    • t.storeN memarg
      • C.mems[0] = it limits   2**memarg.align <= N/8   memarg.offset < 2**|it|
        ------------------------------------------------------------------------
                           C ⊦ t.storeN memarg : [it t] → []
        
    • memory.size
      •    C.mems[0] = it limits
        ---------------------------
        C ⊦ memory.size : [] → [it]
        
    • memory.grow
      •     C.mems[0] = it limits
        -----------------------------
        C ⊦ memory.grow : [it] → [it]
        
    • memory.fill
      •     C.mems[0] = it limits
        -----------------------------
        C ⊦ memory.fill : [it i32 it] → []
        
    • memory.copy
      •     C.mems[0] = it limits
        -----------------------------
        C ⊦ memory.copy : [it it it] → []
        
    • memory.init x
      •     C.mems[0] = it limits   C.datas[x] = ok
        -------------------------------------------
            C ⊦ memory.init x : [it i32 i32] → []
        
    • (and similar for memory instructions from other proposals)
  • The SIMD proposal extends t.load memarg and t.store memarg above such that t may now also be v128, which accesses a 16-byte quantity in memory whose natural alignment is 16 bytes.

    In addition to this, it also has these SIMD specific memory operations (see SIMD proposal for full semantics):

    • v128.loadN_zero memarg (where N = 32/64): Load a single 32-bit or 64-bit element into the lowest bits of a v128 vector, and initialize all other bits of the v128 vector to zero.
    • v128.loadN_splat memarg (where N = 8/16/32/64): Load a single element and splat to all lanes of a v128 vector. The natural alignment is the size of the element loaded.
    • v128.loadN_lane memarg v128 immlaneidx (where N = 8/16/32/64): Load a single element from memarg into the lane of the v128 specified in the immediate mode operand immlaneidx. The values of all other lanes of the v128 are passed through unchanged.
    • v128.storeN_lane memarg v128 immlaneidx (where N = 8/16/32/64): Store into memarg the lane of v128 specified in the immediate mode operand immlaneidx.
    • v128.loadL_sx memarg (where L is 8x8/16x4/32x2, and sx is s/u): Fetch consecutive integers up to 32-bit wide and produce a vector with lanes up to 64 bits. The natural alignment is 8 bytes.

    All these operations now take 64-bit address operands when used with a 64-bit memory.

  • The Threads proposal has atomic versions of t.load and t.store (and of t.loadN_u / t.storeN_u; there are no sign-extending variants) specified above, except with . replaced by .atomic., and with the guarantee that the ordering of accesses is sequentially consistent.

    In addition to this, it has the following memory operations (see Threads proposal for full semantics):

    • t.atomic.rmwN.op_u memarg (where t = i32/i64; N = 8/16/32 when narrower than t, or empty otherwise; op is add/sub/and/or/xor/xchg/cmpxchg; and _u is only present when N is not empty): The first six operations atomically read a value from an address, modify the value, and store the resulting value to the same address; they then return the value read from memory before the modification was performed. In the case of cmpxchg, the operands are an address, an expected value, and a replacement value. If the loaded value is equal to the expected value, the replacement value is stored to the same memory address; if the values are not equal, no value is stored. In either case, the loaded value is returned.
    • memory.atomic.waitN (where N = 32/64): The wait operator takes three operands: an address operand, an expected value, and a relative timeout in nanoseconds as an i64. The return value is 0, 1, or 2, returned as an i32.
    • memory.atomic.notify: The notify operator takes two operands: an address operand and a count as an unsigned i32. The operation will notify as many waiters as are waiting on the same effective address, up to the maximum as specified by count. The operator returns the number of waiters that were woken as an unsigned i32.

    All these operations now take 64-bit address operands when used with a 64-bit memory.

  • The Multi-memory proposal extends each of these instructions with one or two memory index immediates. The index type for that memory will be used. For example,

    • memory.size x
      •    C.mems[x] = it limits
        ---------------------------
        C ⊦ memory.size x : [] → [it]
        

    memory.copy has two memory index immediates, so will have multiple possible signatures:

    • memory.copy d s
      • C.mems[d] = iN limits   C.mems[s] = iM limits    K = min {N, M}
        ---------------------------------------------------------------
            C ⊦ memory.copy d s : [iN iM iK] → []
        
  • Data segment validation uses the index type

    • C.mems[0] = it limits   C ⊦ expr: [it]   C ⊦ expr const
      -------------------------------------------------------
            C ⊦ {data x, offset expr, init b*} ok
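
To make the typing rules above concrete, here is a minimal sketch of a module that exercises them; it assumes a toolchain with the memory64 and bulk memory proposals enabled, and the function and export names are purely illustrative:

    (module
      (memory i64 1)

      ;; active data segment: the offset expression has the memory's index type, i64
      (data (i64.const 0) "hello")

      (func (export "demo") (result i64)
        ;; t.store : [it t] → [] and t.load : [it] → [t], with it = i64 here
        (i64.store (i64.const 8) (i64.const 42))
        (drop (i64.load (i64.const 8)))

        ;; memory.fill : [it i32 it] → []  (destination, value, size)
        (memory.fill (i64.const 16) (i32.const 0) (i64.const 16))

        ;; memory.copy : [it it it] → []   (destination, source, size)
        (memory.copy (i64.const 32) (i64.const 0) (i64.const 5))

        ;; memory.size and memory.grow produce and consume the index type, i64
        (drop (memory.grow (i64.const 1)))
        (memory.size)))

The same applies to the atomic operations from the Threads proposal; with a shared 64-bit memory, their address operands are i64 (again a sketch, assuming an engine with the threads proposal enabled):

    (module
      (memory i64 1 1 shared)

      (func (export "wait_then_notify") (result i32)
        ;; memory.atomic.wait64: address (i64), expected value (i64), timeout in ns (i64)
        (drop (memory.atomic.wait64 (i64.const 0) (i64.const 0) (i64.const 0)))
        ;; memory.atomic.notify: address (i64), waiter count (i32); returns woken count (i32)
        (memory.atomic.notify (i64.const 0) (i32.const 1))))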
      

Execution

  • Memory instances are extended to have 64-bit vectors and a u64 max size

    • meminst ::= { data vec64(byte), max u64? }
  • Memory instructions use the index type instead of i32

    • t.load memarg
    • t.loadN_sx memarg
    • t.store memarg
    • t.storeN memarg
    • memory.size
    • memory.grow
    • (spec text omitted)
  • memory.grow has behavior that depends on the index type:

    • for i32: no change
    • for i64: check for a size greater than 2^64 - 1, and return 2^64 - 1 when memory.grow fails.
  • Memory import matching requires that the index type matches

    •   it_1 = it_2   ⊦ limits_1 <= limits_2
      ----------------------------------------
      ⊦ mem it_1 limits_1 <= mem it_2 limits_2
      
  • Bounds checking is required to be the same as for 32-bit memories; that is, the effective address (index + offset, a u65 value) of a load or store operation is checked against the current memory size, and the access traps if it is out of range.

    It is expected that the cost of this check remains low if an implementation can perform the index check with a branch and handle the offset separately, using a guard page for all smaller offsets. Repeated accesses with the same index and different offsets then allow subsequent checks to be eliminated easily.
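
The following sketch illustrates both points: memory.grow returning the 64-bit failure sentinel, and a store whose index + offset falls outside the current memory size and therefore traps (the function and export names are illustrative only):

    (module
      ;; max is 2 pages, so growing by 4 pages must fail
      (memory i64 1 2)

      (func (export "grow_too_far") (result i64)
        ;; on failure, memory.grow returns 2^64 - 1, i.e. -1 as a signed i64
        (memory.grow (i64.const 4)))

      (func (export "out_of_bounds_store")
        ;; index + offset (65535 + 65536) exceeds the current size of one page, so this traps
        (i64.store offset=65536 (i64.const 65535) (i64.const 1))))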

Binary format

  • The limits structure also encodes an additional value to indicate the index type

    • limits ::= 0x00 n:u32        ⇒ i32, {min n, max ϵ}, 0
              |  0x01 n:u32 m:u32  ⇒ i32, {min n, max m}, 0
              |  0x02 n:u32        ⇒ i32, {min n, max ϵ}, 1  ;; from threads proposal
              |  0x03 n:u32 m:u32  ⇒ i32, {min n, max m}, 1  ;; from threads proposal
              |  0x04 n:u64        ⇒ i64, {min n, max ϵ}, 0
              |  0x05 n:u64 m:u64  ⇒ i64, {min n, max m}, 0
              |  0x06 n:u64        ⇒ i64, {min n, max ϵ}, 1  ;; from threads proposal
              |  0x07 n:u64 m:u64  ⇒ i64, {min n, max m}, 1  ;; from threads proposal
      
  • The memory type structure is extended to use this limits encoding

    • memtype ::= (it, lim, _):limits ⇒ it lim
      
  • The memarg's offset is read as u64

    • memarg ::= a:u32 o:u64
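
As a concrete illustration of the limits encoding above, a 64-bit memory with min 1 and max 2 pages uses flag 0x05 followed by two u64 LEB128 values. This is a sketch of just the memtype bytes within the memory section (it assumes no other proposal changes the encoding):

    (memory i64 1 2)
    ;; encodes as:
    ;;   0x05   ;; limits flag: i64 index type, max present, unshared
    ;;   0x01   ;; min = 1 page  (u64 LEB128)
    ;;   0x02   ;; max = 2 pages (u64 LEB128)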

Text format

  • There is a new index type:

    • idxtype ::= 'i32' ⇒ i32
               |  'i64' ⇒ i64
      
  • The memory type definition is extended to allow an optional index type, which must be either i32 or i64

    • memtype ::= lim:limits             ⇒ i32 lim
               |  it:idxtype lim:limits  ⇒ it lim
      
  • The memory abbreviation definition is extended to allow an optional index type too, which must be either i32 or i64

    • '(' 'memory' id? idxtype? '(' 'data' b_n:datastring ')' ')' === ...
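
For example, each of the following module fields is accepted under this grammar (they are shown in isolation, the $mem64 identifier is illustrative, and a module containing more than one memory additionally requires the multi-memory proposal):

    ;; explicit 64-bit memory type: optional index type before the limits
    (memory $mem64 i64 1 4)

    ;; the data abbreviation with an index type; the memory's size is derived
    ;; from the length of the data string
    (memory i64 (data "Hello, Memory64!"))

    ;; omitting the index type defaults to i32, as before
    (memory 1)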