Enhanced Orthogonal Persistence (64-Bit with Graph Copy) #4475
base: luc/stable-heap64
…dfinity/motoko into luc/graph-copy-on-stable-heap64
I double checked, it traps:
The same happens when
virtual -> physical mem_size and grow function changes LGTM. Thanks!
Try stabilize-large-blob.mo:
//MOC-FLAG --stabilization-instruction-limit=10000 --max-stable-pages 65536
import Prim "mo:prim";

actor {
  let pages : Nat64 = 65536;
  if (Prim.stableMemorySize() == 0) {
    Prim.debugPrint("growing stable memory");
    ignore Prim.stableMemoryGrow(pages);
  };
  assert Prim.stableMemorySize() == pages;

  stable let blob = Prim.stableMemoryLoadBlob(0, Prim.nat64ToNat pages * 65536);

  public func check() : async () {
    Prim.debugPrint(debug_show (blob.size()))
  };

  system func preupgrade() {
    Prim.debugPrint("PRE-UPGRADE HOOK!");
  };

  system func postupgrade() {
    Prim.debugPrint("POST-UPGRADE HOOK!");
  };
};

//CALL ingress check "DIDL\x00\x00"
//CALL ingress __motoko_stabilize_before_upgrade "DIDL\x00\x00"
//CALL upgrade ""
//CALL ingress __motoko_destabilize_after_upgrade "DIDL\x00\x00"
//CALL ingress check "DIDL\x00\x00"
//CALL ingress __motoko_stabilize_before_upgrade "DIDL\x00\x00"
//CALL upgrade ""
//CALL ingress __motoko_destabilize_after_upgrade "DIDL\x00\x00"
//CALL ingress check "DIDL\x00\x00"

//SKIP run
//SKIP run-ir
//SKIP run-low
produces
[nix-shell:~/clean/motoko/test/run-drun]$ ../run.sh -d stabilize-large-blob.mo
WARNING: Could not run ic-ref-run, will skip running some tests
stabilize-large-blob: [tc] [comp] [comp-ref] [valid] [valid-ref] [drun-run]
--- stabilize-large-blob.drun-run.ret (expected)
+++ stabilize-large-blob.drun-run.ret (actual)
@@ -0,0 +1 @@
+Return code 101
--- stabilize-large-blob.drun-run (expected)
+++ stabilize-large-blob.drun-run (actual)
@@ -0,0 +1,18 @@
+ingress Completed: Reply: 0x4449444c016c01b3c4b1f204680100010a00000000000000000101
+debug.print: growing stable memory
+ingress Completed: Reply: 0x4449444c0000
+debug.print: 4_294_967_296
+ingress Completed: Reply: 0x4449444c0000
+debug.print: PRE-UPGRADE HOOK!
+ingress Completed: Reply: 0x4449444c0000
+debug.print: POST-UPGRADE HOOK!
+ingress Completed: Reply: 0x4449444c0000
+ingress Completed: Reply: 0x4449444c0000
+debug.print: 4_294_967_296
+ingress Completed: Reply: 0x4449444c0000
+debug.print: PRE-UPGRADE HOOK!
+ingress Completed: Reply: 0x4449444c0000
+debug.print: POST-UPGRADE HOOK!
+thread 'main' panicked at rs/drun/src/lib.rs:394:5:
+Ingress message did not finish executing within 10000 batches, panicking
+note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
Some tests failed:
stabilize-large-blob.mo
That seems ok to me - I think the drun failure is just drun crappiness.
Co-authored-by: Claudio Russo <claudio@dfinity.org>
Thank you too for having found the bug!
* refactoring of ir
* fix arrange_ir.ml
Co-authored-by: luc-blaeser <luc.blaeser@dfinity.org>
Note: This requires adjustments of the IC runtime and execution layers - the PR is thus not yet ready to be merged.
PR Stack
Enhanced orthogonal persistence support is structured in four PRs to ease review:
Enhanced Orthogonal Persistence (64-Bit with Graph Copy)
This implements the vision of enhanced orthogonal persistence in Motoko that combines:
As a result, the use of secondary storage (explicit stable memory, dedicated stable data structures, DB-like storage abstractions) will no longer be necessary: Motoko developers can directly work on their normal object-oriented program structures that are automatically persisted and retained across program version changes.
Advantages
Compared to the existing orthogonal persistence in Motoko, this design offers:
Compared to the explicit use of stable memory, this design improves:
Design
The enhanced orthogonal persistence is based on the following main properties:
IC Extension
The necessary IC extensions are implemented in a separate PR: dfinity/ic#139. This PR is based on these extensions.
Memory Layout
In a co-design between the compiler and the runtime system, the main memory is arranged in the following structure, invariant of the compiled program version:
Persistent Metadata
The persistent metadata describes all anchor information for the program to resume after an upgrade.
More specifically, it comprises:
Compatibility Check
Upgrades are only permitted if the new program version is compatible with the old version, such that the runtime system guarantees a compatible memory structure.
Compatible changes for immutable types are largely analogous to the allowed Motoko subtype relation, e.g. Nat to Int.
The existing IDL-subtype functionality is reused with some adjustments to check memory compatibility: the compiler generates the type descriptor, a type table, that is recorded in the persistent metadata. Upon an upgrade, the new type descriptor is compared against the existing type descriptor, and the upgrade only succeeds for compatible changes.
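The comparison of an old and a new type descriptor can be illustrated with a toy model. The following is a minimal sketch, not the actual IDL-subtype code reused by the runtime system: the `Type` enum, `Field`, `is_compatible`, and `upgrade_allowed` are hypothetical names, only a handful of immutable types are modeled, and the only widening rule included is the Nat to Int example from the text.

```rust
/// Toy model of a few immutable Motoko types (hypothetical, for illustration only).
#[derive(Debug, Clone, PartialEq)]
pub enum Type {
    Nat,
    Int,
    Text,
    Opt(Box<Type>),
    Array(Box<Type>), // immutable array
}

/// Returns true if data stored at type `old` may be read at type `new`.
pub fn is_compatible(old: &Type, new: &Type) -> bool {
    match (old, new) {
        // Every Nat is also an Int, so widening Nat to Int is allowed.
        (Type::Nat, Type::Int) => true,
        // Immutable containers are checked element-wise in the same direction.
        (Type::Opt(o), Type::Opt(n)) => is_compatible(o, n),
        (Type::Array(o), Type::Array(n)) => is_compatible(o, n),
        // Everything else must match exactly in this toy model.
        _ => old == new,
    }
}

/// A stable field as it might appear in a type descriptor (hypothetical shape).
pub struct Field {
    pub name: &'static str,
    pub ty: Type,
}

/// An upgrade is allowed if every field kept from the old version has a
/// compatible type in the new version. Added and dropped fields are accepted.
pub fn upgrade_allowed(old: &[Field], new: &[Field]) -> bool {
    new.iter().all(|n| {
        old.iter()
            .find(|o| o.name == n.name)
            .map_or(true, |o| is_compatible(&o.ty, &n.ty))
    })
}
```

In this model, changing a stable `counter` from Nat to Int passes the check, while changing it from Int to Nat or from Nat to Text is rejected.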
This compatibility check serves as an additional safety measure on top of the DFX Candid subtype check that can be bypassed by users (when ignoring a warning). Moreover, in some aspects, the memory compatibility rules differ from the Candid subtype check:
Stable fields can change mutability (let to var and vice-versa).
Garbage Collection
The implementation focuses on the incremental GC and abandons the other GCs because the GCs use different memory layouts. For example, the incremental GC uses a partitioned heap with objects carrying a forwarding pointer.
The incremental GC is chosen because it is designed to scale on large heaps and the stable heap design also aims to increase scalability.
The garbage collection state needs to be persisted and retained across upgrades. This is because the GC may not yet be completed at the time of an upgrade, such that object forwarding is still in use. The partition table is stored as part of the GC state.
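To see why forwarding state must survive an upgrade, consider a simplified model of forwarding pointers. This is a sketch under assumptions, not the RTS implementation: it assumes a Brooks-style scheme in which an unmoved object forwards to itself, and `Heap`, `Object`, `allocate`, `relocate`, `read`, and `write` are hypothetical names.

```rust
/// Hypothetical object layout: every object carries a forwarding pointer.
pub struct Object {
    pub forward: usize, // index of the current copy of this object
    pub payload: u64,
}

/// Hypothetical heap modeled as a growable vector of objects.
pub struct Heap {
    pub objects: Vec<Object>,
}

impl Heap {
    /// Allocates an object that initially forwards to itself.
    pub fn allocate(&mut self, payload: u64) -> usize {
        let address = self.objects.len();
        self.objects.push(Object { forward: address, payload });
        address
    }

    /// Models the GC moving an object: copy it and redirect the old location.
    pub fn relocate(&mut self, old: usize) -> usize {
        let payload = self.objects[old].payload;
        let new = self.allocate(payload);
        self.objects[old].forward = new;
        new
    }

    /// Every access resolves the forwarding pointer first.
    pub fn read(&self, pointer: usize) -> u64 {
        let current = self.objects[pointer].forward;
        self.objects[current].payload
    }

    /// Writes through a possibly stale pointer land in the current copy.
    pub fn write(&mut self, pointer: usize, payload: u64) {
        let current = self.objects[pointer].forward;
        self.objects[current].payload = payload;
    }
}
```

If an upgrade happened after `relocate` but before all stale pointers were updated, discarding the forwarding information would leave those pointers referring to outdated copies.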
The garbage collector uses two kinds of roots:
The persistent roots are registered in the persistent metadata and comprise:
The transient roots are referenced by the Wasm data segments and comprise:
Main Actor
On an upgrade, the main actor is recreated and existing stable variables are recovered from the persistent root. The remaining actor variables, the flexible fields as well as new stable variables, are (re)initialized.
As a result, the GC can collect unreachable flexible objects of previous canister versions. Unused stable variables of former versions can also be reclaimed by the GC.
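The recovery of stable variables on upgrade can be sketched as follows. This is an illustrative model only: the persistent root is represented as a map from variable names to values, and `Value`, `StableDecl`, and `recover_stable_vars` are hypothetical names that do not appear in the PR.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for a heap value referenced from the persistent root.
#[derive(Debug, Clone, PartialEq)]
pub enum Value {
    Nat(u64),
    Text(String),
}

/// A stable variable declared by the new program version.
pub struct StableDecl {
    pub name: &'static str,
    pub init: fn() -> Value, // initializer, used only if no old value exists
}

/// Recreates the actor's stable variables on upgrade: existing values are
/// taken over from the old root, new variables are initialized. Variables of
/// the old version that are no longer declared are dropped with `old_root`
/// (in the real system they become unreachable and are reclaimed by the GC).
pub fn recover_stable_vars(
    mut old_root: HashMap<String, Value>,
    decls: &[StableDecl],
) -> HashMap<String, Value> {
    let mut new_root = HashMap::new();
    for decl in decls {
        let value = old_root.remove(decl.name).unwrap_or_else(decl.init);
        new_root.insert(decl.name.to_string(), value);
    }
    new_root
}
```

Here a variable that exists in both versions keeps its old value, a newly declared variable receives its initializer's value, and an undeclared old variable does not appear in the new root.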
No Static Heap
The static heap is abandoned and former static objects need to be allocated in the dynamic heap. This is because these objects may also need to survive upgrades and the persistent main memory cannot accommodate a growing static heap of a new program version in front of the existing dynamic heap. The incremental GC also operates on these objects, meaning that forwarding pointer resolution is also necessary for these objects.
For memory and runtime efficiency, object pooling is implemented for compile-time-known constant objects (with side-effect-free initialization), i.e. those objects are already created on program initialization/upgrade in the dynamic heap and thereafter the reference to the corresponding prefabricated object is looked up whenever the constant value is needed at runtime.
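A minimal sketch of the pooling idea, assuming constants are identified by a compile-time-assigned index: `ConstantPool`, `initialize`, and `get` are hypothetical names, and the `Vec` stands in for objects allocated in the dynamic heap.

```rust
/// Hypothetical stand-in for a constant heap object.
#[derive(Debug, PartialEq)]
pub enum Object {
    Text(&'static str),
    Nat(u64),
}

/// Pool of prefabricated constant objects.
pub struct ConstantPool {
    objects: Vec<Object>,
}

impl ConstantPool {
    /// Runs once on program initialization or upgrade: all compile-time-known
    /// constants are created up front.
    pub fn initialize(constants: Vec<Object>) -> Self {
        ConstantPool { objects: constants }
    }

    /// Each later use of a constant looks up the shared object by its index
    /// instead of allocating a fresh copy.
    pub fn get(&self, index: usize) -> &Object {
        &self.objects[index]
    }
}
```

Repeated lookups of the same index return a reference to the same object, so no additional allocation occurs when a constant is used at runtime.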
The runtime system avoids any global Wasm variables for state that needs to be preserved on upgrades. Instead, such global runtime state is stored in the persistent metadata.
Wasm Data Segments
Only passive Wasm data segments are used by the Motoko compiler and runtime system. In contrast to ordinary active data segments, passive segments can be explicitly loaded to a dynamic address.
This simplifies two aspects:
However, more specific handling is required for the Rust-implemented runtime system (RTS): The Rust-generated active data segment of the runtime system is changed to the passive mode and loaded to the expected static address on program start (canister initialization and upgrade). The location and size of the RTS data segments are therefore limited to a defined reserve of 512 KB, see above. This is acceptable because the RTS only requires a controlled small amount of memory for its data segments, independent of the compiled Motoko program.
Null Sentinel
As an optimization, the top-level null pointer is represented as a constant sentinel value pointing to the last unallocated Wasm page. This allows fast null tests without involving forwarding pointer resolution of potential non-null comparand pointers.
Migration Path
When migrating from the old serialization-based stabilization to the new persistent heap, the old data is deserialized one last time from stable memory and then placed in the new persistent heap layout. Once operating on the persistent heap, the system should prevent downgrade attempts to the old serialization-based persistence.
Assuming that the persistent memory layout needs to be changed in the future, the runtime system supports serialization and deserialization to and from stable memory in a defined data format using graph copy.
Graph Copy
The graph copy is an alternative persistence mechanism that is only used in the rare situation where the persistent memory layout needs to change in the future. Arbitrarily large data can be serialized and deserialized beyond the instruction and working set limits of upgrades: large data serialization and deserialization is split into multiple messages, running before and/or after the IC upgrade to migrate large heaps. Other messages are blocked during this process, and only the canister owner or the canister controllers are permitted to initiate it.
Graph copying needs to be explicitly initiated before an upgrade to a new Motoko version that is incompatible with the current enhanced orthogonal persistence layout. For large data, the graph copy needs to be manually completed after the actual upgrade.
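The splitting of a graph copy into bounded increments can be sketched as below. This is a simplified model under stated assumptions, not the stabilization code of this PR: it assumes a breadth-first (Cheney-style) copy, uses a number of scanned objects per step as a stand-in for the instruction limit, copies between in-memory vectors rather than to stable memory, and all names (`GraphCopy`, `Object`, `start`, `step`, `finish`) are hypothetical.

```rust
use std::collections::{HashMap, VecDeque};

/// Hypothetical heap object: a payload plus references to other objects,
/// expressed as indices into the heap.
#[derive(Debug, Clone, PartialEq)]
pub struct Object {
    pub payload: u64,
    pub fields: Vec<usize>,
}

/// State of an in-progress graph copy, retained between messages.
pub struct GraphCopy<'a> {
    source: &'a [Object],
    target: Vec<Object>,
    forwarding: HashMap<usize, usize>, // source index -> target index
    pending: VecDeque<usize>,          // copied objects whose fields are not yet scanned
}

impl<'a> GraphCopy<'a> {
    /// Starts a copy of everything reachable from `root`.
    pub fn start(source: &'a [Object], root: usize) -> Self {
        let mut copy = GraphCopy {
            source,
            target: Vec::new(),
            forwarding: HashMap::new(),
            pending: VecDeque::new(),
        };
        copy.evacuate(root);
        copy
    }

    /// Copies one object unless it was already copied; returns its target index.
    fn evacuate(&mut self, source_index: usize) -> usize {
        if let Some(&target_index) = self.forwarding.get(&source_index) {
            return target_index;
        }
        let target_index = self.target.len();
        self.target.push(self.source[source_index].clone());
        self.forwarding.insert(source_index, target_index);
        self.pending.push_back(target_index);
        target_index
    }

    /// Scans at most `limit` objects (one "message"); returns true when done.
    pub fn step(&mut self, limit: usize) -> bool {
        for _ in 0..limit {
            let target_index = match self.pending.pop_front() {
                Some(index) => index,
                None => return true,
            };
            // Fields still hold source indices; redirect them to the copies.
            let fields = self.target[target_index].fields.clone();
            let updated: Vec<usize> =
                fields.into_iter().map(|field| self.evacuate(field)).collect();
            self.target[target_index].fields = updated;
        }
        self.pending.is_empty()
    }

    /// Returns the copied graph; unreachable source objects are not included.
    pub fn finish(self) -> Vec<Object> {
        self.target
    }
}
```

A caller would invoke `step` repeatedly, once per message, until it returns true. Because the pending queue and the forwarding map are kept in the `GraphCopy` state, the work can be interrupted after any step and resumed later, and cycles in the object graph are handled by the forwarding map.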
More detailed information and instructions on graph copy are contained in design/GraphCopyStabilization.md.
Old Stable Memory
The old stable memory remains equally accessible as secondary (legacy) memory with the new support.
Current Limitations
nan becomes NaN. There is currently no support for hexadecimal floating point text formatting.
emscripten via LLVM IR.
ic-wasm would need to be extended to Wasm64. The Wasm optimizations in test/bench are thus currently deactivated.
parity-wasm crate is deprecated before Wasm Memory64 support. It also lacks full support of passive data segments. A re-implementation of the profiler would be needed.
Related PRs
Underlying partial implementations: