Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Enhanced Orthogonal Persistence (64-Bit with Graph Copy) #4475

Open
wants to merge 201 commits into
base: luc/stable-heap64
Choose a base branch
from

Conversation

luc-blaeser
Copy link
Contributor

@luc-blaeser luc-blaeser commented Mar 27, 2024

Note: This requires adjustments of the IC runtime and execution layers - the PR is thus not yet ready to be merged.

PR Stack

Enhanced orthogonal persistence support is structured in four PRs to ease review:

  1. Enhanced Orthogonal Persistence (Complete Integration) #4488
  2. Enhanced Orthogonal Persistence (64-Bit with Graph Copy) #4475 <-- this PR
  3. Enhanced Orthogonal Persistence (64-Bit without Graph Copy) #4225
  4. Enhanced Orthogonal Persistence (32 Bit without Graph Copy) #4193

Enhanced Orthogonal Persistence (64-Bit with Graph Copy)

This implements the vision of enhanced orthogonal persistence in Motoko that combines:

  • Stable heap: Persisting the program main memory across canister upgrades.
  • 64-bit heap: Extending the main memory to 64-bit for large-scaled persistence.

As a result, the use of secondary storage (explicit stable memory, dedicated stable data structures, DB-like storage abstractions) will no longer be necessary: Motoko developers can directly work on their normal object-oriented program structures that are automatically persisted and retained across program version changes.

Advantages

Compared to the existing orthogonal persistence in Motoko, this design offers:

  • Performance: New program versions directly resume from the existing main memory and have access to the memory-compatible data.
  • Scalability: The upgrade mechanism scales with larger heaps and in contrast to serialization, does not hit IC instruction limits.

Compared to the explicit use of stable memory, this design improves:

  • Simplicity: Developers do not need to deal with explicit stable memory.
  • Performance: No copying to and from the separate stable memory is necessary.

Design

The enhanced orthogonal persistence is based on the following main properties:

  • Extension of the IC to retain main memory on upgrades.
  • Supporting 64-bit main memory on the IC.
  • A long-term memory layout that is invariant to new compiled program versions.
  • A fast memory compatibility check performed on each canister upgrade.
  • Incremental garbage collection using a partitioned heap.

IC Extension

The necessary IC extensions are implemented in a separate PR: dfinity/ic#139. This PR is based on these extensions.

Memory Layout

In a co-design between the compiler and the runtime system, the main memory is arranged in the following structure, invariant of the compiled program version:

  • Lower 4MB: Rust call stack.
  • Space between 4MB and 4.5MB: Limited reserved space Wasm data segments, only used for the Motoko runtime system.
  • Between 4.5MB and 5MB: Persistent metadata.
  • Thereafter: Dynamic heap space. Fix start address at 5MB.

Persistent Metadata

The persistent metadata describes all anchor information for the program to resume after an upgrade.

More specifically, it comprises:

  • A stable heap version that allows evolving the persistent memory layout in the future.
  • The stable subset of the main actor, containing all stable variables declared in the main actor.
  • A descriptor of the stable static types to check memory compatibility on upgrades.
  • The runtime state of the garbage collector, including the dynamic heap metadata and memory statistics.
  • A reserve for future metadata extensions.

Compatibility Check

Upgrades are only permitted if the new program version is compatible with the old version, such that the runtime system guarantees a compatible memory structure.

Compatible changes for immutable types are largely analogous to the allowed Motoko subtype relation, e.g.

  • Adding or removing actor fields.
  • Removing object fields.
  • Adding variant fields.
  • Nat to Int.
  • Shared function parameter contravariance and return type covariance.

The existing IDL-subtype functionality is reused with some adjustments to check memory compatibility: The compiler generates the type descriptor, a type table, that is recorded in the persistent metadata. Upon an upgrade, the new type descriptor is compared against the existing type descriptor, and the upgrade only succeeds for compatible changes.

This compatibility check serves as an additional safety measure on top of the DFX Candid subtype check that can be bypassed by users (when ignoring a warning). Moreover, in some aspects, the memory compatibility rules differ to the Candid sub-type check:

  • Top-level actor fields (stable fields) can change mutability (let to var and vice-versa).
  • Support of variable (MutBox) with type invariance.
  • Types cannot be made optional (no insertion of Option).
  • Same arity for function parameters and function return types (no removed optional parameters, no additional optional results).
  • Records cannot introduce additional optional fields.
  • Same arity for tuple types (no insertion of optional items).
  • Records and tuples are distinct.

Garbage Collection

The implementation focuses on the incremental GC and abandons the other GCs because the GCs use different memory layouts. For example, the incremental GC uses a partitioned heap with objects carrying a forwarding pointer.

The incremental GC is chosen because it is designed to scale on large heaps and the stable heap design also aims to increase scalability.

The garbage collection state needs to be persisted and retained across upgrades. This is because the GC may not yet be completed at the time of an upgrade, such that object forwarding is still in use. The partition table is stored as part of the GC state.

The garbage collector uses two kinds of roots:

  • Persistent roots: These refer to root objects that need to survive canister upgrades.
  • Transient roots: These cover additional roots that are only valid in a specific version of a program and are discarded on an upgrade.

The persistent roots are registered in the persistent metadata and comprise:

  • All stable variables of the main actor, only stored during an upgrade.
  • The stable type table.

The transient roots are referenced by the Wasm data segments and comprise:

  • All canister variables of the current version, including flexible variables.

Main Actor

On an upgrade, the main actor is recreated and existing stable variables are recovered from the persistent root. The remaining actor variables, the flexible fields as well as new stable variables, are (re)initialized.

As a result, the GC can collect unreachable flexible objects of previous canister versions. Unused stable variables of former versions can also be reclaimed by the GC.

No Static Heap

The static heap is abandoned and former static objects need to be allocated in the dynamic heap. This is because these objects may also need to survive upgrades and the persistent main memory cannot accommodate a growing static heap of a new program version in front of the existing dynamic heap. The incremental GC also operates on these objects, meaning that forwarding pointer resolution is also necessary for these objects.

For memory and runtime efficiency, object pooling is implemented for compile-time-known constant objects (with side-effect-free initialization), i.e. those objects are already created on program initialization/upgrade in the dynamic heap and thereafter the reference to the corresponding prefabricated object is looked up whenever the constant value is needed at runtime.

The runtime system avoids any global Wasm variables for state that needs to be preserved on upgrades. Instead, such global runtime state is stored in the persistent metadata.

Wasm Data Segments

Only passive Wasm data segments are used by the Motoko compiler and runtime system. In contrast to ordinary active data segments, passive segments can be explicitly loaded to a dynamic address.

This simplifies two aspects:

  • The generated Motoko code can contain arbitrarily large data segments (to the maximum that is supported by the IC). The segments can be loaded to the dynamic heap when needed.
  • The IC can simply retain the main memory on an upgrade without needing to patch any active data segments of the new program version to the persistent main memory.

However, more specific handling is required for the Rust-implemented runtime system (RTS): The Rust-generated active data segment of the runtime system is changed to the passive mode and loaded to the expected static address on the program start (canister initialization and upgrade). The location and size of the RTS data segments is therefore limited to a defined reserve of 512 KB, see above. This is acceptable because the RTS only requires a controlled small amount of memory for its data segments, independent of the compiled Motoko program.

Null Sentinel

As an optimization, the top-level null pointer is represented as a constant sentinel value pointing to the last unallocated Wasm page. This allows fast null tests without involving forwarding pointer resolution of potential non-null comparand pointers.

Migration Path

When migrating from the old serialization-based stabilization to the new persistent heap, the old data is deserialized one last time from stable memory and then placed in the new persistent heap layout. Once operating on the persistent heap, the system should prevent downgrade attempts to the old serialization-based persistence.

Assuming that the persistent memory layout needs to be changed in the future, the runtime system supports serialization and deserialization to and from stable memory in a defined data format using graph copy.

Graph Copy

The graph copy is an alternative persistence mechanism that will be only used in the rare situation when the persistent memory layout will be changed in the future. Arbitrarily large data can be serialized and deserialized beyond the instruction and working set limit of upgrades: Large data serialization and deserialization is split in multiple messages, running before and/or after the IC upgrade to migrate large heaps. Of course, other messages will be blocked during this process and only the canister owner or the canister controllers are permitted to initiate this process.

Graph copying needs to be explicitly initiated before an upgrade to new Motoko version that is incompatible to the current enhanced orthogonal persistent layout. For large data, the graph copy needs to be manually completed after the actual upgrade.

dfx canister call CANISTER_ID __motoko_stabilize_before_upgrade "()"
dfx deploy CANISTER_ID
dfx canister call CANISTER_ID __motoko_destabilze_after_upgrade "()"

More detailed information and instructions on graph copy are contained in design/GraphCopyStabilization.md.

Old Stable Memory

The old stable memory remains equally accessible as secondary (legacy) memory with the new support.

Current Limitations

  • Freeing old object fields: While new program versions can drop object fields, the runtime system should also delete the redundant fields of persistent objects of previous program versions. This could be realized during garbage collection when objects are copied. For this purpose, the runtime system may maintain a set of field hashes in use and consult this table during garbage collection. Another, probably too restrictive solution could be to disallow field removal (subtyping) on object upgrades during the memory compatibility check.
  • The incremental GC only allows 64 GB. Transitioning to a dynamic partition table would be necessary to go beyond this limit. This is to be be implemented in a separate PR.
  • The floating point display format differs in Wasm64 for special values, e.g. nan becomes NaN. There is currently no support for hexadecimal floating point text formatting.
  • Workaround for Rust needed to build PIC (position-independent code) libraries. Explicit use of emscripten via LLVM IR.
  • ic-wasm would need to be extended to Wasm64. The Wasm optimizations in test/bench are thus currently deactivated.
  • The Wasm profiler (only used for the flamegraphs) is no longer applicable because the underlying parity-wasm crate is deprecated before Wasm Memory64 support. It also lacks full support of passive data segments. A re-implementation of the profiler would be needed.

Related PRs

Underlying partial implementations:

@luc-blaeser
Copy link
Contributor Author

luc-blaeser commented May 17, 2024

I'm curious: what happens if you call __motoko_stabilize_before_upgrade() followed by __motoko_stabilize_after_upgrade(), without any upgrade in between? I guess it should just trap.

I double checked, it traps:

+ingress Completed: Reject: IC0503: Canister rwlgt-iiaaa-aaaaa-aaaaa-cai trapped explicitly: RTS error: No destabilization needed

The same happens when __motoko_destabilize_after_upgrade() is applied to a running canister.

Copy link
Contributor

@crusso crusso left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

virtual -> physical mem_size and grow function changes LGTM. Thanks!

Copy link
Contributor

@crusso crusso left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Try
stabilize-large-blob.mo

//MOC-FLAG --stabilization-instruction-limit=10000 --max-stable-pages 65536

import Prim "mo:prim";

actor {

    let pages : Nat64 = 65536;
    if (Prim.stableMemorySize() == 0) {
      Prim.debugPrint("growing stable memory");
      ignore Prim.stableMemoryGrow(pages);
    };
    assert Prim.stableMemorySize() == pages;
    stable let blob = Prim.stableMemoryLoadBlob(0, Prim.nat64ToNat pages * 65536);

    public func check() : async () {
        Prim.debugPrint(debug_show (blob.size()))
    };

    system func preupgrade() {
        Prim.debugPrint("PRE-UPGRADE HOOK!");
    };

    system func postupgrade() {
        Prim.debugPrint("POST-UPGRADE HOOK!");
    };
};

//CALL ingress check "DIDL\x00\x00"
//CALL ingress __motoko_stabilize_before_upgrade "DIDL\x00\x00"
//CALL upgrade ""
//CALL ingress __motoko_destabilize_after_upgrade "DIDL\x00\x00"
//CALL ingress check "DIDL\x00\x00"
//CALL ingress __motoko_stabilize_before_upgrade "DIDL\x00\x00"
//CALL upgrade ""
//CALL ingress __motoko_destabilize_after_upgrade "DIDL\x00\x00"
//CALL ingress check "DIDL\x00\x00"

//SKIP run
//SKIP run-ir
//SKIP run-low

produces

[nix-shell:~/clean/motoko/test/run-drun]$ ../run.sh -d stabilize-large-blob.mo
WARNING: Could not run ic-ref-run, will skip running some tests
stabilize-large-blob: [tc] [comp] [comp-ref] [valid] [valid-ref] [drun-run]
--- stabilize-large-blob.drun-run.ret (expected)
+++ stabilize-large-blob.drun-run.ret (actual)
@@ -0,0 +1 @@
+Return code 101
--- stabilize-large-blob.drun-run (expected)
+++ stabilize-large-blob.drun-run (actual)
@@ -0,0 +1,18 @@
+ingress Completed: Reply: 0x4449444c016c01b3c4b1f204680100010a00000000000000000101
+debug.print: growing stable memory
+ingress Completed: Reply: 0x4449444c0000
+debug.print: 4_294_967_296
+ingress Completed: Reply: 0x4449444c0000
+debug.print: PRE-UPGRADE HOOK!
+ingress Completed: Reply: 0x4449444c0000
+debug.print: POST-UPGRADE HOOK!
+ingress Completed: Reply: 0x4449444c0000
+ingress Completed: Reply: 0x4449444c0000
+debug.print: 4_294_967_296
+ingress Completed: Reply: 0x4449444c0000
+debug.print: PRE-UPGRADE HOOK!
+ingress Completed: Reply: 0x4449444c0000
+debug.print: POST-UPGRADE HOOK!
+thread 'main' panicked at rs/drun/src/lib.rs:394:5:
+Ingress message did not finish executing within 10000 batches, panicking
+note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
Some tests failed:
stabilize-large-blob.mo

That seems ok to me - I think the drun failure is just drun crappiness.

luc-blaeser and others added 17 commits May 17, 2024 14:29
Co-authored-by: Claudio Russo <claudio@dfinity.org>
Co-authored-by: Claudio Russo <claudio@dfinity.org>
Co-authored-by: Claudio Russo <claudio@dfinity.org>
Co-authored-by: Claudio Russo <claudio@dfinity.org>
Co-authored-by: Claudio Russo <claudio@dfinity.org>
Co-authored-by: Claudio Russo <claudio@dfinity.org>
Co-authored-by: Claudio Russo <claudio@dfinity.org>
Co-authored-by: Claudio Russo <claudio@dfinity.org>
@luc-blaeser
Copy link
Contributor Author

virtual -> physical mem_size and grow function changes LGTM. Thanks!

Thank you too for having found the bug!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants