diff --git a/.github/CODEOWNERS b/.github/CODEOWNERS
index dbd77ef..7e493ac 100644
--- a/.github/CODEOWNERS
+++ b/.github/CODEOWNERS
@@ -3,3 +3,9 @@
# DX team members are also configured as bypass actors in the branch ruleset
# and can merge their own PRs without a separate review.
* @dfinity/dx
+
+# Security — product-security team must approve changes to security best practices
+docs/guides/security/ @dfinity/product-security @dfinity/dx
+docs/concepts/security.md @dfinity/product-security @dfinity/dx
+docs/references/message-execution-properties.md @dfinity/product-security @dfinity/dx
+docs/guides/canister-calls/idempotency.md @dfinity/product-security @dfinity/dx
diff --git a/docs/404.mdx b/docs/404.mdx
index a0f159d..dfa8a52 100644
--- a/docs/404.mdx
+++ b/docs/404.mdx
@@ -16,7 +16,7 @@ Pick a guide that matches what you're building.
- **[Frontends](/guides/frontends/asset-canister/)**: Serve assets, integrate frameworks, configure custom domains, and certify responses.
- **[Authentication](/guides/authentication/internet-identity/)**: Add passwordless login and verifiable user identity with Internet Identity.
- **[Chain Fusion](/guides/chain-fusion/bitcoin/)**: Connect canisters to Bitcoin, Ethereum, and Solana.
-- **[Security](/guides/security/access-management/)**: Access control, encryption, DoS prevention, and safe upgrade patterns.
+- **[Security](/guides/security/identity-and-access-management/)**: Access control, DoS prevention, and safe upgrade patterns.
## Other places to look
diff --git a/docs/concepts/security.md b/docs/concepts/security.md
index ef23dff..b281fd1 100644
--- a/docs/concepts/security.md
+++ b/docs/concepts/security.md
@@ -61,7 +61,7 @@ The following threats are your responsibility to mitigate:
Every update method is publicly callable. If you do not check the caller, anyone can invoke admin functions, drain funds, or corrupt state. The anonymous principal (`2vxsx-fae`) is a particularly common gap: it must be explicitly rejected in any authenticated endpoint, because otherwise it acts as a shared identity that anyone can use.
-See [Access management](../guides/security/access-management.md#reject-anonymous-callers) for implementation patterns.
+See [Access management](../guides/security/identity-and-access-management.md#reject-anonymous-callers) for implementation patterns.
### Reentrancy and async interleaving
@@ -97,11 +97,11 @@ Users have no way to verify that a canister's running code matches its published
## What's next
-- [Access management](../guides/security/access-management.md): caller checks, guards, and role-based access control
+- [Access management](../guides/security/identity-and-access-management.md): caller checks, guards, and role-based access control
- [Upgrade safety](../guides/security/canister-upgrades.md): safe upgrade patterns
- [Inter-canister call safety](../guides/security/inter-canister-calls.md): async pitfalls and mitigations
- [DoS prevention](../guides/security/dos-prevention.md): cycle drain protection
-- [Data integrity](../guides/security/data-integrity.md): input validation and storage safety
+- [Data integrity](../guides/security/data-integrity-and-authenticity.md): input validation and storage safety
- [Response certification](../guides/frontends/certification.md): certified variables for query responses
diff --git a/docs/concepts/vetkeys.md b/docs/concepts/vetkeys.md
index e27d660..cc14ef6 100644
--- a/docs/concepts/vetkeys.md
+++ b/docs/concepts/vetkeys.md
@@ -99,7 +99,6 @@ The vetKD management canister API is live on mainnet. The `ic-vetkeys` Rust crat
## Next steps
-- [Encryption with VetKeys](../guides/security/encryption.md): implement encrypted storage, IBE, and the full end-to-end key derivation flow
- [Chain-Key Cryptography](chain-key-cryptography.md): the threshold cryptographic foundation that vetKeys build on
- [Security](security.md): where vetKeys fit in the broader canister security model
diff --git a/docs/guides/backends/randomness.md b/docs/guides/backends/randomness.md
index ec94e8f..59438e6 100644
--- a/docs/guides/backends/randomness.md
+++ b/docs/guides/backends/randomness.md
@@ -225,7 +225,7 @@ Note: this example predates `mo:core` and uses the older `Random.Finite` API. Th
- [Verifiable Randomness (concept)](../../concepts/verifiable-randomness.md): how the IC's threshold VRF works
- [Management Canister](../../references/management-canister.md): `raw_rand` API reference
-- [Data Integrity](../security/data-integrity.md): using randomness in a secure application design
+- [Data Integrity](../security/data-integrity-and-authenticity.md): using randomness in a secure application design
- [Inter-canister calls](../canister-calls/inter-canister-calls.md#reentrancy): async patterns and reentrancy
diff --git a/docs/guides/canister-calls/idempotency.md b/docs/guides/canister-calls/idempotency.md
new file mode 100644
index 0000000..baff003
--- /dev/null
+++ b/docs/guides/canister-calls/idempotency.md
@@ -0,0 +1,121 @@
+---
+title: "Safe Retries and Idempotency"
+description: "Design idempotent canister APIs to enable safe retries for ingress calls and bounded-wait inter-canister calls, preventing double-spend and other correctness issues."
+---
+
+When network issues or other unexpected behavior occur, ICP clients (such as agents) that issue ingress update calls may be unable to determine whether their request has been processed. For example, this can happen if the client loses its connection and only reconnects after the request's ingress expiry has passed and the request's status has been removed from the system state tree.
+
+Similarly, canisters that call other canisters using bounded-wait calls may be unable to determine whether the call was successful or not.
+
+This is risky because the caller (an external user or application for ingress messages, or a canister for inter-canister calls) might retry the transaction, potentially leading to serious security vulnerabilities such as double spending.
+
+It is therefore important to design, or choose, canister APIs such that requests can be retried safely, even when ICP provides no information about previous request attempts. This page describes general approaches that both canister authors and clients can adopt to enable safe retries.
+
+## Idempotent canister APIs
+
+A canister endpoint is idempotent if executing it multiple times is equivalent to executing it once.[^1] Whenever an endpoint is idempotent or can be made idempotent by the developer, this provides an easy way to implement safe retries.
+
+Given an idempotent endpoint, an external application can implement retries by repeating the call until it observes a certified response, that is, a replied or rejected status; see the illustration below. If such a response is ever observed, the transaction is certain to have been executed at least once, which, thanks to idempotency, has the same result as executing it exactly once. However, the application may not be willing to wait for a response indefinitely, so a timeout can be implemented. Upon timeout, an error should be displayed instructing the user to wait until the most recently sent message has expired (as defined by the request's `ingress_expiry`) and then manually check the status of the transaction. Ideally, timeouts should be rare and not occur during normal operation.
+
+```plantuml
+actor User
+participant "Web Browser" as Browser
+participant Agent
+participant "Boundary Node" as BN
+participant "IC Node" as IC
+
+User -> Browser: Start transaction
+
+loop until certified response or timeout
+ Browser -> Agent: idempotent call
+ Agent -> BN: call & subsequent read_state calls
+ BN -> IC
+ IC --> BN
+ BN --> Agent: certified response or error
+ Agent --> Browser: certified response or error
+end
+
+Browser --> User: certified response\nor timeout error message
+```
+
+The situation is similar for bounded-wait inter-canister calls. Given an idempotent endpoint, the calling canister can keep retrying until a response other than `SYS_UNKNOWN` is observed or give up after a timeout if waiting indefinitely is not an option.
+
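+The client-side retry loop can be sketched generically. The following is a hypothetical, self-contained simulation: `Response`, `retry_idempotent`, and the simulated call are illustrative stand-ins, not agent or CDK APIs.

```rust
/// A simplified call outcome. `Unknown` models SYS_UNKNOWN for
/// bounded-wait calls, or an ingress status that is no longer available.
#[derive(Debug, Clone, PartialEq)]
enum Response {
    Replied(String),
    Rejected(String),
    Unknown,
}

/// Retry an idempotent call until a definite (certified) response is
/// observed, or give up after `max_attempts` (the "timeout").
fn retry_idempotent<F>(mut call: F, max_attempts: u32) -> Option<Response>
where
    F: FnMut() -> Response,
{
    for _ in 0..max_attempts {
        match call() {
            // Safe to retry only because the endpoint is idempotent.
            Response::Unknown => continue,
            definite => return Some(definite),
        }
    }
    // The caller must now check the call status manually after expiry.
    None
}

fn main() {
    let mut attempts = vec![
        Response::Unknown,
        Response::Unknown,
        Response::Replied("ok".to_string()),
    ]
    .into_iter();
    let result = retry_idempotent(|| attempts.next().unwrap_or(Response::Unknown), 10);
    assert_eq!(result, Some(Response::Replied("ok".to_string())));
}
```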
+Below are two approaches to making endpoints idempotent: sequence numbers and time-windowed ID deduplication.
+
+### Update sequence numbers
+
+An endpoint can provide idempotency through sequence numbers by taking a sequence number parameter in addition to its other parameters. In the extreme case, a canister could keep a single expected sequence number for every endpoint and accept a call only if it carries the next expected number, incrementing the expected number upon execution. This trivially implies that any call can be executed only once. More practically, an expected sequence number is kept per caller principal or, in the case of ledger-like canisters, per ledger account. Ethereum's account nonces are a well-known instance of this mechanism.
+
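+As a minimal, self-contained sketch (a plain string stands in for the caller principal; no real ICP APIs are used), per-caller sequence numbers could look like this:

```rust
use std::collections::HashMap;

/// Per-caller expected sequence numbers. All names here are illustrative.
struct SeqState {
    expected: HashMap<String, u64>,
}

impl SeqState {
    fn new() -> Self {
        SeqState { expected: HashMap::new() }
    }

    /// Accept a call only if `seq` is the caller's next expected sequence
    /// number, then increment it. A retry of an already-executed call
    /// carries a stale `seq` and is rejected, which gives idempotency.
    fn try_execute(&mut self, caller: &str, seq: u64) -> Result<(), String> {
        let next = self.expected.entry(caller.to_string()).or_insert(0);
        if seq != *next {
            return Err(format!("expected sequence number {}, got {}", next, seq));
        }
        *next += 1;
        Ok(())
    }
}

fn main() {
    let mut state = SeqState::new();
    assert!(state.try_execute("alice", 0).is_ok());
    // Retrying the same call does not execute it a second time.
    assert!(state.try_execute("alice", 0).is_err());
    assert!(state.try_execute("alice", 1).is_ok());
}
```

+A client that loses a response can thus simply resend the identical call: a stale sequence number is rejected rather than re-executed.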
+The advantages of this approach are:
+
+1. Sequence numbers are simple to implement and understand.
+2. When applicable, it has a modest memory footprint because only the next expected sequence number must be stored (for example, per active account).
+
+The approach also has some disadvantages:
+
+1. It limits throughput. With per-caller sequence numbers, a caller can generally perform only one ingress call per consensus block, translating to roughly one ingress call per second for that user. The situation is better for inter-canister calls, since requests (if delivered) are delivered in the order in which they were sent; the calling canister can therefore issue multiple requests simultaneously with consecutive sequence numbers. Under normal load, all requests should be delivered, but under heavy load, where the system may drop some requests, any request that follows a dropped one becomes invalid.
+
+2. It limits concurrency. The user has to sequentialize all their calls. This is straightforward when the user is another canister, but much harder when the canister is called through ingress messages, for example when the user accesses the canister from multiple clients or devices. This concurrency problem also makes the approach inapplicable where anonymous users are allowed to trigger update calls.
+
+3. If sequence numbers are stored per user or per account, tracking them for many users can exhaust the canister's memory, even though each individual number is small; an attacker could exploit this deliberately. The approach is thus best suited to cases where the user has to pay for usage in some way (for example, ledgers usually charge a fee both to create an account and to transfer funds), which thwarts attackers by requiring them to invest significant funds in an attack.
+
+### ID deduplication
+
+Another approach to idempotency is to make the calls uniquely identifiable on the receiving canister side (e.g., by using user-chosen IDs, sequence numbers, or a combination of several argument fields) to make sure a given call is executed at most once. The canister then deduplicates calls before executing them; if a call with the same ID has been executed previously, the new call is simply ignored (potentially returning the result of the previous call). Thus, the user can safely keep retrying the call until they get a response.
+
+For example, the ICRC ledger standard provides deduplication in this way. Using identical values for all call parameters, including the `created_at_time` and `memo` parameters, when issuing a transaction makes the transaction call idempotent by deduplicating calls with the same parameters.
+
+However, a naive implementation of this approach can exhaust the canister memory, as all successfully executed IDs need to be kept around forever. Thus, the deduplication is usually time-limited to a certain time window. For example, the ICP ledger uses a 24-hour window, and the ICRC standard defines a configuration parameter `TX_WINDOW` that determines the window length.
+
+Moreover, the ICP/ICRC ledgers use the `created_at_time` parameter to limit the validity period of a call. Roughly, the call is only considered valid if its `created_at_time` is not in the future and at most 24 hours in the past.[^2] This avoids the problem where the deduplication window expiring would allow a retried call to succeed again.
+
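+A minimal sketch of time-windowed deduplication, loosely modeled on the ledger behavior described above (the constant, the ID type, and the pruning strategy are illustrative, not the ledgers' actual implementation):

```rust
use std::collections::HashMap;

// Illustrative 24-hour deduplication window, in nanoseconds.
const TX_WINDOW_NANOS: u64 = 24 * 60 * 60 * 1_000_000_000;

#[derive(Debug, PartialEq)]
enum Outcome {
    Executed,
    Duplicate,
    TooOld,
    InFuture,
}

struct DedupStore {
    /// Maps a call's unique ID (e.g., a hash over all call parameters)
    /// to the time at which it was first executed.
    seen: HashMap<u64, u64>,
}

impl DedupStore {
    fn new() -> Self {
        DedupStore { seen: HashMap::new() }
    }

    fn apply(&mut self, id: u64, created_at_time: u64, now: u64) -> Outcome {
        // Reject calls whose validity period has passed, so that a retry
        // cannot succeed again after its dedup entry has expired.
        if created_at_time + TX_WINDOW_NANOS < now {
            return Outcome::TooOld;
        }
        if created_at_time > now {
            return Outcome::InFuture;
        }
        // Prune expired entries; memory stays bounded by the window
        // (up to the throughput the canister can sustain).
        self.seen.retain(|_, t| *t + TX_WINDOW_NANOS >= now);
        if self.seen.contains_key(&id) {
            return Outcome::Duplicate;
        }
        self.seen.insert(id, now);
        Outcome::Executed
    }
}

fn main() {
    let mut store = DedupStore::new();
    assert_eq!(store.apply(42, 100, 100), Outcome::Executed);
    // A retry within the window is deduplicated, not re-executed.
    assert_eq!(store.apply(42, 100, 200), Outcome::Duplicate);
    // After the window, the retry is rejected as too old.
    assert_eq!(store.apply(42, 100, 100 + TX_WINDOW_NANOS + 1), Outcome::TooOld);
}
```

+Note how the validity check and the window expiry must use the same bound: without the `TooOld` rejection, pruning an entry would let the same call execute twice.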
+But even with this improvement used in the ledgers, the time window approach implicitly assumes that the client will be able to get a definite answer to their call within the time window. For example, after the 24 hours expire, the user cannot easily tell if their ledger transfer happened; their only option is to analyze the ledger blocks, which is somewhat tedious and has to be done carefully to avoid asynchrony issues; see the section on [queryable call results](#queryable-call-results).
+
+Relying solely on a time window for deduplication does not guarantee bounded memory usage: in theory, an unlimited number of updates could occur within the window, though in practice this is constrained by ICP's scaling limits. The ICP/ICRC ledgers therefore also define a maximum capacity: a limit on the number of deduplication IDs that can be stored. Once this capacity is reached, further transactions are rejected until older entries expire from the deduplication store at the end of the time window.
+
+Yet another extension is to guarantee deduplication for the stated time window as above, but keep storing deduplication IDs beyond that window as long as the capacity is not reached. This way, clients obtain a hard deduplication guarantee for the time window and a best-effort attempt to deduplicate transactions even past it.
+
+An alternative is to do away with the time window and store the deduplication data forever. This requires storing this data in multiple canisters in order to prevent exhausting canister memory, similar to how the ICP/ICRC ledgers store transaction data in the archive canister. This shifts the tedious part of querying the deduplication data (e.g., ledger blocks) from the user to the canister.
+
+Summarizing, the advantages of this approach are:
+
+1. It can support high throughput.
+2. It requires no synchronization on the part of the user and supports use cases like multiple devices.
+
+The disadvantages are:
+
+1. It is more complicated to implement than sequence numbers.
+2. If a time window is used, it usually implicitly assumes that the user learns the call outcome within the time window.
+3. The memory usage can grow fairly high with high supported throughput and long deduplication windows. For example, supporting 100 transactions per second with a deduplication window of 24 hours can require hundreds of megabytes of heap space. This can be mitigated by using multiple canisters to store the deduplication data, at the expense of further implementation complexity and higher latency.
+
+## Other approaches to safe retries
+
+In the absence of idempotent endpoints, or even in addition to them, clients may be able to use other endpoints to make their retries safe.
+
+### Queryable call results
+
+If the canister exposes, in addition to the update endpoint, a query that reports the result of the update, the client can use it for safe retries as follows:
+
+1. Attempt to perform the update.
+2. If the result of the update is unknown (for example, it is no longer present in the ingress history, or a `SYS_UNKNOWN` error is returned for an inter-canister call), query the call-result endpoint to determine whether the update was applied. One also needs to ensure that the previously sent call cannot still be applied in the future. If the call was not applied and can no longer be applied, it can safely be retried or reported as failed.
+
+In practice, this pattern may be more complicated. For example, the ICP ledger exposes a `query_blocks` method that can be used to implement the above pattern for transfers initiated as ingress messages:
+
+1. Call the `query_blocks` method on the ledger to determine what the last block (as specified in the `chain_length` field of the response) currently is. Let's call this `last_block`.
+2. Attempt to perform a transfer. This ingress message includes an `ingress_expiry` field.
+3. If the result of the transfer is unknown, ensure that the transfer will not be applied at a later point:
+ - If using ingress messages, call the `read_state` endpoint on the ledger canister to obtain the `/time` branch of the system state tree. Repeat this until the reported time exceeds the `ingress_expiry` time.
+ - If using inter-canister calls, perform all subsequent calls (`query_blocks`) listed below from the same canister that initiated the transfer. The [ordering guarantees](../../references/message-execution-properties.md) then ensure that the transfer cannot happen later.
+4. Call the `query_blocks` method on the ledger again to retrieve all ledger blocks since `last_block`, and check that the `timestamp` also exceeds the `ingress_expiry` time. In case of failure, retry until a result is obtained. Then, scan through the returned blocks to determine whether the transaction has been included or not.
+
+### 2-step transfers
+
+Another approach applicable to ledgers (such as ICRC-1 or ICP) is to perform transfers in two steps:
+
+1. First, transfer the tokens to an intermediate subaccount of the sender that's specific to this transaction. For example, if the transaction has a unique ID, the client can hash the ID to obtain a subaccount. The transferred amount should be the desired amount plus the ledger transaction fee.
+2. If the result of the above transfer is unknown, query the balance of the transaction-specific subaccount. Like in the [queryable call result](#queryable-call-results) approach, if using ingress messages, this should be repeated until the `timestamp` accompanying the response exceeds the `ingress_expiry`. If the balance is 0, the transaction can safely be reported as failed, or it can be retried (starting from step 1). If the balance is at least the expected balance, one can proceed.
+3. If the transfer to the transaction-specific subaccount succeeded (as determined either by the transfer result or by the balance query above), the client sends another transfer from the transaction-specific subaccount to the desired target account. This step can be repeated as many times as necessary until a result of the call is known. Once a result is known, the overall transfer can be declared successful even if this step failed with an error (such as insufficient funds in the subaccount), since such a failure signifies that a previous attempt to transfer the funds to the target already succeeded.
+
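+The control flow of the 2-step pattern can be simulated against a toy in-memory ledger. `Ledger` and its methods are hypothetical stand-ins for an ICRC-1-style interface, not a real API; only the flow of funds matters here:

```rust
use std::collections::HashMap;

/// A toy in-memory ledger. A transfer of `amount` deducts
/// `amount + fee` from the sender and credits `amount` to the receiver.
struct Ledger {
    balances: HashMap<String, u64>,
    fee: u64,
}

impl Ledger {
    fn transfer(&mut self, from: &str, to: &str, amount: u64) -> Result<(), String> {
        let bal = self.balances.get(from).copied().unwrap_or(0);
        let total = amount + self.fee;
        if bal < total {
            return Err("insufficient funds".to_string());
        }
        self.balances.insert(from.to_string(), bal - total);
        *self.balances.entry(to.to_string()).or_insert(0) += amount;
        Ok(())
    }

    fn balance(&self, account: &str) -> u64 {
        self.balances.get(account).copied().unwrap_or(0)
    }
}

fn main() {
    let mut ledger = Ledger {
        balances: HashMap::from([("sender".to_string(), 1_000)]),
        fee: 10,
    };
    let fee = ledger.fee;
    let amount = 100;
    // A transaction-specific subaccount, e.g., derived by hashing the
    // transaction ID (the name here is purely illustrative).
    let sub = "sender.tx-1234";

    // Step 1: move amount + fee to the subaccount, so the subaccount
    // can pay the fee of the second transfer.
    ledger.transfer("sender", sub, amount + fee).unwrap();

    // Step 2 (on an unknown outcome): the subaccount balance tells us
    // whether step 1 happened; a balance of 0 would mean step 1 can
    // safely be retried or the transaction reported as failed.
    assert_eq!(ledger.balance(sub), amount + fee);

    // Step 3: forward to the target. Retrying after success fails with
    // "insufficient funds", which itself proves a prior attempt succeeded.
    ledger.transfer(sub, "target", amount).unwrap();
    assert!(ledger.transfer(sub, "target", amount).is_err());
    assert_eq!(ledger.balance("target"), amount);
}
```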
+[^1]: "Equivalent" is meant from the user perspective here. Multiple executions may trigger changes such as those in the canister's cycle balance, but they are not relevant for the user.
+
+[^2]: More precisely, the ledger also allows for a small time drift of `created_at_time` into the future, which has to be taken into account when clearing the deduplication window.
+
+
diff --git a/docs/guides/index.md b/docs/guides/index.md
index ff1a2f4..506a02a 100644
--- a/docs/guides/index.md
+++ b/docs/guides/index.md
@@ -20,7 +20,7 @@ Practical how-to guides organized by development stage. Each guide solves a spec
- **[Testing](testing/strategies.md)**: Write unit tests, run integration tests with PocketIC, and set up end-to-end testing.
- **[Canister Management](canister-management/lifecycle.md)**: Deploy, upgrade, fund, optimize, and back up canisters.
-- **[Security](security/access-management.md)**: Implement access control, encryption, DoS prevention, and safe upgrade patterns.
+- **[Security](security/identity-and-access-management.md)**: Implement access control, DoS prevention, and safe upgrade patterns.
## Advanced features
diff --git a/docs/guides/security/access-management.mdx b/docs/guides/security/access-management.mdx
deleted file mode 100644
index 8c756eb..0000000
--- a/docs/guides/security/access-management.mdx
+++ /dev/null
@@ -1,375 +0,0 @@
----
-title: "Access Management"
-description: "Control who can call your canister with guards, caller checks, and controller management"
-sidebar:
- order: 1
----
-
-import { Tabs, TabItem } from '@astrojs/starlight/components';
-
-Every canister method is callable by anyone on the internet. Without explicit access checks, any user or canister can invoke any of your public functions. This guide covers the patterns you need to restrict access.
-
-## Checklist
-
-Use this as a quick reference when securing your canister:
-
-- [ ] Reject the anonymous principal (`2vxsx-fae`) in every authenticated endpoint
-- [ ] Check the caller inside each update method: not just in `canister_inspect_message`
-- [ ] Use the `guard` attribute (Rust) or guard functions (Motoko) to enforce access rules
-- [ ] Add a backup controller so you never lose canister access
-- [ ] Use `canister_inspect_message` only as a cycle-saving optimization, never as a security boundary
-
-## How caller identity works
-
-When a canister receives a message, the network includes the caller's principal. This identity is provided by the system: it cannot be forged or spoofed. You access it with:
-
-- **Motoko:** `shared({ caller })` pattern on public functions
-- **Rust:** `ic_cdk::api::msg_caller()`
-
-Every principal is one of these types:
-
-| Type | Format | Example | Meaning |
-|------|--------|---------|---------|
-| User | Varies (self-authenticating) | `wo5qg-ysjaa-aaaaa-...` | Human with a cryptographic identity |
-| Canister | 10 bytes, ends in `-cai` | `rrkah-fqaaa-aaaaa-aaaaq-cai` | Another canister making an inter-canister call |
-| Anonymous | Fixed | `2vxsx-fae` | Unauthenticated caller: no identity |
-| Management | Fixed | `aaaaa-aa` | IC management canister (system calls) |
-
-## Reject anonymous callers
-
-Any endpoint that requires authentication must reject the anonymous principal. Without this check, unauthenticated callers can invoke protected methods. If your canister uses the caller principal as an identity key (for balances, ownership, etc.), the anonymous principal becomes a shared identity anyone can use.
-
-
-
-
-```motoko
-import Principal "mo:core/Principal";
-import Runtime "mo:core/Runtime";
-
-// Inside persistent actor { ... }
-
- func requireAuthenticated(caller : Principal) {
- if (Principal.isAnonymous(caller)) {
- Runtime.trap("anonymous caller not allowed");
- };
- };
-
- public shared ({ caller }) func protectedAction() : async Text {
- requireAuthenticated(caller);
- "ok";
- };
-```
-
-
-
-
-```rust
-use ic_cdk::update;
-use ic_cdk::api::msg_caller;
-use candid::Principal;
-
-fn require_authenticated() -> Result<(), String> {
- if msg_caller() == Principal::anonymous() {
- return Err("anonymous caller not allowed".to_string());
- }
- Ok(())
-}
-
-#[update(guard = "require_authenticated")]
-fn protected_action() -> String {
- "ok".to_string()
-}
-```
-
-The Rust `guard` attribute runs the check before the method body executes. If the guard returns `Err`, the call is rejected. This is more robust than calling guard functions inside the method: you cannot forget to add it. Multiple guards can be chained:
-
-```rust
-#[update(guard = "require_authenticated", guard = "require_admin")]
-fn admin_action() {
- // both guards passed
-}
-```
-
-
-
-
-## Owner and role-based access control
-
-There is no built-in role system on ICP. You implement it yourself by tracking principals in your canister state.
-
-
-
-
-The `shared(msg)` pattern on an actor class captures the deployer's principal atomically. No separate init call, no front-running risk. Use `transient` for the owner since it gets recomputed from `msg.caller` on each install/upgrade.
-
-```motoko
-import Principal "mo:core/Principal";
-import Set "mo:core/pure/Set";
-import Runtime "mo:core/Runtime";
-
-shared(msg) persistent actor class MyCanister() {
-
- transient let owner = msg.caller;
- var admins : Set.Set = Set.empty();
-
- func requireOwner(caller : Principal) {
- if (Principal.isAnonymous(caller)) {
- Runtime.trap("anonymous caller not allowed");
- };
- if (caller != owner) {
- Runtime.trap("caller is not the owner");
- };
- };
-
- func requireAdmin(caller : Principal) {
- if (Principal.isAnonymous(caller)) {
- Runtime.trap("anonymous caller not allowed");
- };
- if (caller != owner and not Set.contains(admins, Principal.compare, caller)) {
- Runtime.trap("caller is not an admin");
- };
- };
-
- public shared ({ caller }) func addAdmin(newAdmin : Principal) : async () {
- requireOwner(caller);
- admins := Set.add(admins, Principal.compare, newAdmin);
- };
-
- public shared ({ caller }) func removeAdmin(admin : Principal) : async () {
- requireOwner(caller);
- admins := Set.remove(admins, Principal.compare, admin);
- };
-
- public shared ({ caller }) func adminAction() : async () {
- requireAdmin(caller);
- // ... protected logic
- };
-};
-```
-
-
-
-
-```rust
-use ic_cdk::{init, update};
-use ic_cdk::api::msg_caller;
-use candid::Principal;
-use std::cell::RefCell;
-
-thread_local! {
- static OWNER: RefCell = RefCell::new(Principal::anonymous());
- static ADMINS: RefCell> = RefCell::new(vec![]);
-}
-
-fn require_authenticated() -> Result<(), String> {
- if msg_caller() == Principal::anonymous() {
- return Err("anonymous caller not allowed".to_string());
- }
- Ok(())
-}
-
-fn require_owner() -> Result<(), String> {
- require_authenticated()?;
- OWNER.with(|o| {
- if msg_caller() != *o.borrow() {
- return Err("caller is not the owner".to_string());
- }
- Ok(())
- })
-}
-
-fn require_admin() -> Result<(), String> {
- require_authenticated()?;
- let caller = msg_caller();
- let is_authorized = OWNER.with(|o| caller == *o.borrow())
- || ADMINS.with(|a| a.borrow().contains(&caller));
- if !is_authorized {
- return Err("caller is not an admin".to_string());
- }
- Ok(())
-}
-
-#[init]
-fn init(owner: Principal) {
- OWNER.with(|o| *o.borrow_mut() = owner);
-}
-// Unlike Motoko's shared(msg) pattern which captures the deployer automatically,
-// the Rust #[init] requires passing the owner explicitly at deploy time:
-// icp canister deploy backend --argument '(principal "your-principal-here")'
-
-#[update(guard = "require_owner")]
-fn add_admin(new_admin: Principal) {
- ADMINS.with(|a| a.borrow_mut().push(new_admin));
-}
-
-#[update(guard = "require_owner")]
-fn remove_admin(admin: Principal) {
- ADMINS.with(|a| a.borrow_mut().retain(|p| p != &admin));
-}
-
-#[update(guard = "require_admin")]
-fn admin_action() {
- // ... protected logic: guard already validated caller
-}
-```
-
-
-
-
-Always include admin revocation (`removeAdmin`). Missing revocation is a common source of bugs: once granted, admin access should be removable.
-
-## Controller checks
-
-Controllers are the principals authorized to manage a canister (install code, change settings, stop/delete). The controller list is managed at the IC level, not inside your canister code.
-
-
-
-
-**Motoko** provides `Principal.isController` to check if a principal is a controller of the current canister:
-
-```motoko
-import Principal "mo:core/Principal";
-import Runtime "mo:core/Runtime";
-
-// Inside persistent actor { ... }
-
- public shared ({ caller }) func controllerOnly() : async () {
- if (not Principal.isController(caller)) {
- Runtime.trap("caller is not a controller");
- };
- // ...
- };
-```
-
-
-
-
-In Rust, there is no built-in `is_controller` function: checking controllers requires an async call to the management canister. See [inter-canister calls](../canister-calls/inter-canister-calls.md#making-calls) for inter-canister call patterns.
-
-
-
-
-**Managing controllers with icp-cli:**
-
-```bash
-# View current canister settings including controllers
-icp canister settings show backend -e ic
-
-# Add a backup controller
-icp canister settings update backend --add-controller -e ic
-
-# Remove a controller (warning: removing yourself locks you out)
-icp canister settings update backend --remove-controller -e ic
-```
-
-Always add a backup controller. If you lose the private key of the only controller, the canister becomes permanently unupgradeable: there is no recovery mechanism.
-
-## `canister_inspect_message`: cycle optimization only
-
-`canister_inspect_message` is a hook that runs on a single replica before consensus. It can reject ingress messages early to save cycles on Candid decoding and execution. However, it is **not a security boundary**:
-
-- It runs on one node without consensus: a malicious boundary node can bypass it
-- It is never called for inter-canister calls, query calls, or management canister calls
-
-Always duplicate real access checks inside each method. Use `inspect_message` only to reduce cycle waste from spam.
-
-
-
-
-```motoko
-import Principal "mo:core/Principal";
-
-// Inside persistent actor { ... }
-// Method variants must match your public methods
-
- system func inspect(
- {
- caller : Principal;
- msg : {
- #adminAction : () -> ();
- #addAdmin : () -> Principal;
- #removeAdmin : () -> Principal;
- #protectedAction : () -> ();
- }
- }
- ) : Bool {
- switch (msg) {
- case (#adminAction _) { not Principal.isAnonymous(caller) };
- case (#addAdmin _) { not Principal.isAnonymous(caller) };
- case (#removeAdmin _) { not Principal.isAnonymous(caller) };
- case (#protectedAction _) { not Principal.isAnonymous(caller) };
- case (_) { true };
- };
- };
-```
-
-
-
-
-```rust
-use ic_cdk::api::{accept_message, msg_caller, msg_method_name};
-use candid::Principal;
-
-#[ic_cdk::inspect_message]
-fn inspect_message() {
- let method = msg_method_name();
- match method.as_str() {
- "admin_action" | "add_admin" | "remove_admin" | "protected_action" => {
- if msg_caller() != Principal::anonymous() {
- accept_message();
- }
- // Silently reject anonymous: saves cycles
- }
- _ => accept_message(),
- }
-}
-```
-
-
-
-
-## Debugging identity
-
-When troubleshooting access control issues, it helps to know which principal your canister sees. A simple `whoami` endpoint returns the caller's identity:
-
-
-
-
-```motoko
-// Inside persistent actor { ... }
-
- public shared ({ caller }) func whoami() : async Principal {
- caller;
- };
-```
-
-
-
-
-```rust
-use ic_cdk::query;
-use ic_cdk::api::msg_caller;
-use candid::Principal;
-
-#[query]
-fn whoami() -> Principal {
- msg_caller()
-}
-```
-
-
-
-
-Call it to verify which identity is being used:
-
-```bash
-icp canister call backend whoami
-```
-
-## Next steps
-
-- [Security concepts](../../concepts/security.md): understand the IC security model
-- [Canister settings](../canister-management/settings.md): configure controllers and freezing thresholds
-- [DoS prevention](dos-prevention.md): rate limiting as an access control mechanism
-
-{/* Upstream: informed by dfinity/icskills (skills/canister-security/SKILL.md, dfinity/portal) docs/building-apps/best-practices/general.mdx */}
diff --git a/docs/guides/security/canister-upgrades.md b/docs/guides/security/canister-upgrades.md
index 7e8ba7a..0af030d 100644
--- a/docs/guides/security/canister-upgrades.md
+++ b/docs/guides/security/canister-upgrades.md
@@ -1,350 +1,52 @@
---
-title: "Secure Upgrades"
-description: "Upgrade canisters safely: pre/post hooks, stable memory, Candid compatibility, snapshot rollbacks, schema evolution, and testing"
+title: "Canister Upgrade Security"
+description: "Security best practices for canister upgrade hooks, panics during upgrades, and timer reinstatement."
sidebar:
- order: 2
+ order: 8
---
-Canister upgrades are one of the highest-risk operations in production. A bad upgrade can corrupt state, make the canister permanently non-upgradeable, or break clients. This guide covers the patterns and checks you need to upgrade safely.
+## Be careful with panics during upgrades
-## Checklist
+### Security concern
-Use this before every production upgrade:
+If a canister traps or panics in `pre_upgrade`, this can permanently block the canister, leaving it in a state where upgrades fail or are no longer possible at all.
-- [ ] Take a snapshot immediately before upgrading
-- [ ] Run the upgrade locally first with `icp deploy`
-- [ ] Verify data survives: write → upgrade → read
-- [ ] Check Candid interface compatibility. No removed methods, no breaking type changes
-- [ ] Avoid `pre_upgrade` hooks that serialize large state (use stable structures instead)
-- [ ] In Motoko, use `persistent actor` (which eliminates the need for pre_upgrade hooks): avoid manual `pre_upgrade`/`post_upgrade`
-- [ ] Confirm you have a backup controller (cannot recover from a trapped `post_upgrade` without one)
-- [ ] Add a rollback plan: snapshot ID recorded, restore procedure tested
+### Recommendation
-## How upgrades work
+- Avoid using `pre_upgrade` hooks if possible. Panics in the `pre_upgrade` hook prevent upgrades, and since the `pre_upgrade` hook is controlled by the old code, it can permanently block upgrading.
-When you run `icp deploy` on an existing canister, the IC executes the following sequence:
+- Panic in the `post_upgrade` hook if the state is invalid. Panics in the `post_upgrade` hook abort the upgrade, so one can retry it with new code that fixes the invalid state.
-1. **Stop** the canister (waits for in-flight messages to complete)
-2. Run `pre_upgrade` on the old code (if defined)
-3. Replace the Wasm module with the new code
-4. Run `post_upgrade` on the new code (if defined)
-5. **Restart** the canister
+- [Test the upgrade hooks](https://mmapped.blog/posts/01-effective-rust-canisters.html#test-upgrades) (from [effective Rust canisters](https://mmapped.blog/posts/01-effective-rust-canisters.html)).
-Stable memory is preserved through steps 2–4. Heap memory is cleared when the new Wasm loads. If `pre_upgrade` or `post_upgrade` traps, the upgrade fails with different consequences:
+- See also the section on upgrades in [how to audit an Internet Computer canister](https://www.joachim-breitner.de/blog/788-How_to_audit_an_Internet_Computer_canister) (though focused on Motoko).
-| Hook | Trap result |
-|------|-------------|
-| `pre_upgrade` | Upgrade cancelled. Old code still running. State intact but may need attention. |
-| `post_upgrade` | New Wasm installed but initialization failed. Canister may be in an inconsistent state. |
+- See [current limitations of the Internet Computer](https://wiki.internetcomputer.org/wiki/Current_limitations_of_the_Internet_Computer), section "Bugs in `pre_upgrade` hooks."
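+
+The recommendations above can be sketched in Rust: omit `pre_upgrade` entirely and validate state in `post_upgrade`, where a trap safely aborts the upgrade. This is a hedged sketch only; `STATE` and `is_consistent` are hypothetical placeholders for your canister's state and its invariant check.
+
+```rust
+// No #[pre_upgrade] hook: nothing here can block future upgrades.
+
+#[ic_cdk::post_upgrade]
+fn post_upgrade() {
+    // Trapping here aborts the upgrade: the canister reverts to the old
+    // code and state, and the upgrade can be retried with corrected code.
+    STATE.with(|s| {
+        if !is_consistent(&s.borrow()) {
+            ic_cdk::trap("state invariant violated after upgrade");
+        }
+    });
+}
+```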
-Both scenarios leave the canister in a difficult state. Prevention is far better than recovery.
+## Reinstantiate timers during upgrades
-## Stable memory patterns
+### Security concern
-### Motoko: use `persistent actor`
+Global timers are deactivated upon changes to the canister's Wasm module. The [IC specification](../../references/ic-interface-spec/canister-interface.md#global-timer) states this as follows:
-The `persistent actor` declaration automatically stores all `let` and `var` fields in stable memory. No serialization, no upgrade hooks, no instruction-limit traps.
+> "The timer is also deactivated upon changes to the canister's Wasm module (calling install_code or uninstall_code methods of the management canister or if the canister runs out of cycles). In particular, the function canister_global_timer won't be scheduled again unless the canister sets the global timer again (using the System API function ic0.global_timer_set)."
-```motoko
-persistent actor Counter {
- var count : Nat = 0;
+Upgrade is a mode of `install_code`, and hence the timers are deactivated during an upgrade.
- public func increment() : async Nat {
- count += 1;
- count;
- };
+This could result in a vulnerability in certain cases where security controls or other critical features rely on these timers to function. For example, a DEX that relies on timers to update the exchange rates of currencies could be vulnerable to arbitraging opportunities if the rates are no longer updated.
- public query func get() : async Nat { count };
+Since global timers are used internally by the Motoko `Timer` mechanism, the same holds true for the Motoko Timer. As explained in the [pull request](https://github.com/dfinity/motoko/pull/3542) under "The upgrade story," the global timer gets discarded on upgrade, and the timers need to be set up in the `post_upgrade` hook.
- // transient: resets to [] on each upgrade: correct for caches, transient logs, and reset-on-upgrade counters
- transient var recentCallers : [Principal] = [];
-};
-```
+This behavior is different when [using Motoko](https://github.com/dfinity/motoko/pull/3542) and implementing `system func timer`. The `timer` function will be called after an upgrade. In case your canister was using timers for recurring tasks, the `timer` function would likely set the global timer again for a later time. However, the time between invocations of `timer` would not be consistent as the upgrade triggered an "unexpected" call to `timer`.
-**Key rules:**
+Using the Rust CDK, the recurring timer is also lost on upgrade as explained in the API documentation of [set_timer_interval](https://docs.rs/ic-cdk/0.6.9/ic_cdk/timer/fn.set_timer_interval.html).
-- All `let`/`var` fields persist automatically. No `stable` keyword needed
-- `transient var` for caches or counters that should reset on upgrade
-- Do not write manual `pre_upgrade`/`post_upgrade` hooks. The runtime handles everything
-- If a persistent field's type changes incompatibly, the upgrade traps. See [Schema evolution](#schema-evolution).
+### Recommendation
-### Rust: use stable structures
+- In Motoko canisters, global timers should be set up in the actor initializer for canister installation or reinstallation. Canister-wide timers should be set in the `post_upgrade` hook for upgrades, as timers do not survive upgrades and must be explicitly set up thereafter.
-In Rust, use [`ic-stable-structures`](https://docs.rs/ic-stable-structures/latest/ic_stable_structures/) to store data directly in stable memory. Data lives there from the start. No serialization step on upgrade.
+- See the Motoko documentation on [timers](../../languages/motoko/icp-features/timers.md).
-```rust
-use ic_stable_structures::{
- memory_manager::{MemoryId, MemoryManager, VirtualMemory},
- DefaultMemoryImpl, StableBTreeMap, StableCell,
-};
-use std::cell::RefCell;
+- See the Rust documentation on [set_timer_interval](https://docs.rs/ic-cdk/0.6.9/ic_cdk/timer/fn.set_timer_interval.html).
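+
+As a sketch of the Rust recommendation, set timers up in both `init` and `post_upgrade` via a shared helper (this example assumes the `ic-cdk-timers` crate; `refresh_exchange_rates` is a hypothetical periodic task):
+
+```rust
+use std::time::Duration;
+
+// Hypothetical recurring task, e.g., updating exchange rates on a DEX.
+fn refresh_exchange_rates() { /* ... */ }
+
+fn start_timers() {
+    // Recurring timers do not survive upgrades and must be re-created here.
+    ic_cdk_timers::set_timer_interval(Duration::from_secs(60), refresh_exchange_rates);
+}
+
+#[ic_cdk::init]
+fn init() {
+    start_timers();
+}
+
+#[ic_cdk::post_upgrade]
+fn post_upgrade() {
+    start_timers();
+}
+```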
-type Memory = VirtualMemory<DefaultMemoryImpl>;
-
-// Each structure must have its own unique MemoryId: never reuse IDs
-const USERS_MEM_ID: MemoryId = MemoryId::new(0);
-const COUNTER_MEM_ID: MemoryId = MemoryId::new(1);
-
-thread_local! {
-    static MEMORY_MANAGER: RefCell<MemoryManager<DefaultMemoryImpl>> =
- RefCell::new(MemoryManager::init(DefaultMemoryImpl::default()));
-
-    static USERS: RefCell<StableBTreeMap<u64, Vec<u8>, Memory>> =
- RefCell::new(StableBTreeMap::init(
- MEMORY_MANAGER.with(|m| m.borrow().get(USERS_MEM_ID))
- ));
-
-    static COUNTER: RefCell<StableCell<u64, Memory>> =
- RefCell::new(StableCell::init(
- MEMORY_MANAGER.with(|m| m.borrow().get(COUNTER_MEM_ID)),
- 0u64,
- ).expect("Failed to init counter"));
-}
-
-#[ic_cdk::post_upgrade]
-fn post_upgrade() {
- // Stable structures auto-restore: no deserialization needed.
- // Re-initialize timers or transient state here if required.
-}
-```
-
-> **Warning:** Each `MemoryId` must map to exactly one data structure for the lifetime of the canister. Reusing a `MemoryId` for a different structure after an upgrade corrupts both. Keep a written record of your `MemoryId` allocations and never reorder them.
-
-### Avoid `pre_upgrade` serialization
-
-The serialization-based upgrade pattern is common in older Rust code but is fundamentally fragile:
-
-```rust
-// DO NOT DO THIS in production
-#[ic_cdk::pre_upgrade]
-fn pre_upgrade() {
- // If STATE is large, this hits the instruction limit and traps.
- // A trapped pre_upgrade prevents the upgrade: canister stays on old code.
- ic_cdk::storage::stable_save((STATE.with(|s| s.borrow().clone()),)).unwrap();
-}
-```
-
-When `pre_upgrade` traps due to instruction exhaustion, the canister cannot be upgraded. The `skip_pre_upgrade` flag (an emergency escape hatch via the management canister's `install_code` API (see [Management canister reference](../../references/management-canister.md#install_code)) bypasses the hook) but anything the hook would have saved is lost. Use stable structures so the upgrade path cannot brick itself under load.
-
-## Candid interface compatibility
-
-The IC checks your new Wasm module's Candid interface against the old one before completing the upgrade. If the new interface is not backward-compatible, the upgrade is rejected.
-
-**Safe changes:**
-
-| Change | Why it is safe |
-|--------|---------------|
-| Add a new method | Existing clients don't call it |
-| Add optional parameters to an existing method | Old clients send no value; IC substitutes `null` |
-| Remove trailing parameters from an existing method | Old clients send extra values; IC ignores them |
-| Return additional values from a method | Old clients ignore extra return values |
-| Change a parameter type to a supertype | Old values remain valid inputs |
-| Change a return type to a subtype | New values remain valid for old clients |
-
-**Breaking changes (upgrade rejected or clients break):**
-
-| Change | Why it breaks |
-|--------|--------------|
-| Remove a method | Clients calling it get errors |
-| Add a required (non-optional) parameter | Old clients don't send it |
-| Change a parameter type to an incompatible type | Old clients send invalid values |
-
-**Example: safe evolution:**
-
-```candid
-// Before
-service counter : {
- add : (nat) -> ();
- get : () -> (int) query;
-}
-
-// After: safe: optional param added, new return value, new method
-service counter : {
- add : (nat, label : opt text) -> (new_val : nat);
- get : () -> (nat, last_change : nat) query;
- reset : () -> ();
-}
-```
-
-icp-cli checks Candid compatibility during deploy and prompts for confirmation if it detects a potentially breaking change. Use `--yes` in automated pipelines after manually verifying compatibility:
-
-```bash
-icp deploy my-canister -e ic --yes
-```
-
-## Snapshot-based rollback
-
-Always take a snapshot immediately before a risky upgrade. If the upgrade causes unexpected behavior, you can restore the previous state within minutes.
-
-```bash
-# 1. Stop the canister and create a snapshot
-icp canister stop my-canister -e ic
-icp canister snapshot create my-canister -e ic
-# Note the snapshot ID printed in the output
-icp canister start my-canister -e ic
-
-# 2. Deploy the upgrade
-icp deploy my-canister -e ic
-
-# 3. Verify correctness
-icp canister call my-canister health_check -e ic
-
-# 4a. If everything works, clean up when no longer needed
-icp canister snapshot delete my-canister -e ic
-
-# 4b. If something is wrong, stop and restore
-icp canister stop my-canister -e ic
-icp canister snapshot restore my-canister -e ic
-icp canister start my-canister -e ic
-```
-
-Snapshots capture the full canister state: Wasm module, Wasm heap memory, stable memory, and chunk store. Restoring from a snapshot brings back all of this state atomically.
-
-See [Canister snapshots](../canister-management/snapshots.md) for listing, downloading, and the state transfer workflow.
-
-## Schema evolution
-
-Upgrading canister code sometimes requires changing the shape of stored data. The rules differ by language.
-
-### Motoko
-
-When upgrading a `persistent actor`, the runtime checks that every persistent field's current type is compatible with the value stored in stable memory. Incompatible changes cause the upgrade to trap.
-
-**Safe changes:**
-
-- Add new `let` or `var` fields with initial values. The runtime initializes them on upgrade
-- Add optional record fields (e.g., change `{ name : Text }` to `{ name : Text; email : ?Text }`)
-- Widen a field's type (e.g., `Nat` → `Int`)
-
-**Unsafe changes (upgrade traps):**
-
-- Remove or rename a persistent field
-- Narrow a field's type (e.g., `Int` → `Nat`)
-- Change a non-optional field to an incompatible type
-
-If you need to make an unsafe change, migrate the data in two upgrades: add the new field alongside the old one, upgrade once (both fields present), then upgrade again to remove the old field. Test this two-step process locally before deploying to mainnet.
-
-### Rust
-
-Rust stable structures use serialized bytes on disk. Schema evolution safety depends on the serialization format and versioning strategy.
-
-**Adding fields safely with Candid encoding:**
-
-```rust
-use candid::{CandidType, Decode, Deserialize, Encode};
-use ic_stable_structures::storable::{Bound, Storable};
-use std::borrow::Cow;
-
-#[derive(CandidType, Deserialize, Clone)]
-struct UserV2 {
- id: u64,
- name: String,
- created: u64,
- // New optional field: safe to add: old records deserialize with None
-    email: Option<String>,
-}
-
-impl Storable for UserV2 {
- // Unbounded avoids write failures when struct grows.
- // Bounded requires a fixed max_size; if encoded size exceeds it after
- // adding fields, writes trap.
- const BOUND: Bound = Bound::Unbounded;
-
- fn to_bytes(&self) -> Cow<'_, [u8]> {
- Cow::Owned(Encode!(self).expect("failed to encode UserV2"))
- }
-
- fn from_bytes(bytes: Cow<'_, [u8]>) -> Self {
- Decode!(&bytes, Self).expect("failed to decode UserV2")
- }
-}
-```
-
-**Rules:**
-
-- Use `Option` for new fields: Candid deserializes absent fields as `None`, so old records remain readable after the upgrade
-- Use `Bound::Unbounded` unless you have a strict size requirement
-- Never reorder `MemoryId` allocations across upgrades: same effect as changing a field type
-- For breaking schema changes, use a versioned enum and migrate records lazily on read
-
-## Testing upgrades locally
-
-Never upgrade on mainnet without first verifying locally that data written before the upgrade is still readable after.
-
-**Motoko:**
-
-```bash
-# Start local network
-icp network start -d
-
-# Deploy initial version
-icp deploy backend
-
-# Write data
-icp canister call backend increment '()'
-icp canister call backend increment '()'
-icp canister call backend get '()'
-# Returns: (2 : nat)
-
-# Modify source code, then redeploy
-icp deploy backend
-
-# Verify data survived
-icp canister call backend get '()'
-# Must still return: (2 : nat)
-```
-
-**Rust:**
-
-```bash
-# Start local network
-icp network start -d
-
-# Deploy initial version
-icp deploy backend
-
-# Write data
-icp canister call backend add_user '("Alice")'
-icp canister call backend get_user_count '()'
-# Returns: (1 : nat64)
-
-# Modify source code, then upgrade
-icp deploy backend
-
-# Verify data survived
-icp canister call backend get_user_count '()'
-# Must still return: (1 : nat64)
-```
-
-If the count drops to zero after upgrade, your data is not in stable memory: review your storage declarations before touching mainnet.
-
-For advanced scenarios (upgrade rollbacks, schema migrations, concurrent call safety), use [PocketIC](../testing/pocket-ic.md) to script multi-step upgrade scenarios in a controlled environment.
-
-## Controller safety
-
-You cannot upgrade a canister without a valid controller. Losing all controller keys leaves the canister permanently frozen at its current code: there is no recovery path on the IC.
-
-```bash
-# Check current controllers
-icp canister settings show my-canister -e ic
-
-# Add a backup controller before any risky upgrade
-icp canister settings update my-canister --add-controller -e ic
-```
-
-For production canisters:
-
-- Maintain at least two controllers (primary identity + hardware wallet or multisig)
-- For fully onchain governance, add an SNS or DAO canister as controller and remove personal principals
-
-See [Access management](access-management.md) for detailed controller management patterns.
-
-## Next steps
-
-- [Data persistence](../backends/data-persistence.md): stable structures and upgrade patterns in depth
-- [Canister lifecycle](../canister-management/lifecycle.md#upgrade-a-canister): the full upgrade sequence and install modes
-- [Canister snapshots](../canister-management/snapshots.md): create and restore snapshots
-- [Testing strategies](../testing/strategies.md): test upgrade scenarios before deploying to mainnet
-- [Access management](access-management.md): manage controllers and prevent lock-out
-
-
+
diff --git a/docs/guides/security/data-integrity-and-authenticity.md b/docs/guides/security/data-integrity-and-authenticity.md
new file mode 100644
index 0000000..1433e65
--- /dev/null
+++ b/docs/guides/security/data-integrity-and-authenticity.md
@@ -0,0 +1,629 @@
+---
+title: "Data Integrity and Authenticity"
+description: "Security best practices for certified variables, asset certification, and protecting data authenticity on ICP."
+sidebar:
+ order: 4
+---
+
+## Certified variables
+
+### Security concern
+
+ICP offers three modes of operation for canisters: `update`, `query`, and `composite_query`. For the sake of simplicity, we group `composite_query` together with queries for the rest of this section.
+
+Update calls are slow and expensive but provide integrity guarantees as their responses include a threshold signature signed by the subnet.
+
+On the other hand, query calls are fast since a single replica formulates the response, but **there is no integrity guarantee, since the response can be manipulated by a single replica or boundary node.** For example, suppose the NNS dapp fetches proposal information from the governance canister via query calls. If the responding node is malicious, it can modify the proposal payload in the response to make an ill-intentioned proposal that causes irrevocable damage appear innocuous, misleading voters into voting yes. Another consequence is that canisters can't rely on [canister_inspect_message](../../references/ic-interface-spec/canister-interface.md#system-api-inspect-message) as a guard, since it is not invoked for query calls. **This makes query calls, in their raw form, unfit to serve data for security-critical applications.**
+
+### Using certified variables for secure queries
+In certain use cases, there is a third option whereby query results can return data that has been certified by the subnet in an earlier update call. This is the concept of certified data, and it requires changes to the update call to create the certification, the query call to return the certificate, and the frontend to verify the certificate. Using certified data provides the best of both worlds with query-like response times and update-like certified responses.
+
+Some examples of certified variables are asset certification in [Internet Identity](https://github.com/dfinity/internet-identity/blob/b29a6f68bbe5a49d048e12bc7a3263a9f43d080b/src/internet_identity/src/main.rs#L775-L808), [NNS dapp](https://github.com/dfinity/nns-dapp/blob/372c3562127d70c2fde059bc9c268e8ae858583e/rs/src/assets.rs#L121-L145), or the [canister signature implementation in Internet Identity](https://github.com/dfinity/ic-canister-sig-creation).
+
+:::tip
+Certified variables are an advanced feature that require careful implementation of authenticated data structures and verification on the canister and client sides, respectively. **If the client doesn't require fast response times, call the query method as an update call (replicated query).** The response would be certified by the subnet, and a single malicious or boundary node can't modify the response.
+:::
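+
+For instance, with the Rust agent, a query method can be invoked through the update endpoint so that the response is certified by the subnet. This is a sketch only; `agent`, `canister_id`, and the `get_user` method are assumed to be set up as in the examples later in this guide:
+
+```rust
+// Call the query method as an update (replicated query): slower, but the
+// response carries a subnet threshold signature instead of coming from a
+// single replica.
+let response = agent
+    .update(&canister_id, "get_user")
+    .with_arg(Encode!(&index).unwrap())
+    .call_and_wait()
+    .await
+    .expect("replicated query failed");
+```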
+
+:::tip
+ICP also provides replica signed queries, where query responses are signed by the answering replica node; however, it doesn't have the same security guarantees as an `update` call and only protects from malicious boundary nodes. Replica signed queries are enabled by default on both the ICP Rust agent and the ICP JavaScript agent.
+:::
+
+### What is certified data?
+Aside from update calls, the subnet certifies (creates a threshold signature) a part of the canister data every round. This is stored in the state tree under the label `certified_data`. However, since it's certified every round, the amount of data that can be stored in `certified_data` is limited to 32 bytes. Hence, when you modify the state of your canister during an update call, if you can convert the state into a unique representation that can fit into 32 bytes, you can store it under `certified_data`, and it will be certified. Naturally, this can be done by computing a hash of the data structure of the canister state. This is also why certified variables are difficult to implement. Depending on your data structures, you will need to develop a different kind of hashing function.
+
+Subsequent query calls can return the data as-is, including the signature on the `certified_data`, which the frontend can verify with the IC root public key. This means that data aggregation or other calculations can't be done in query calls, as there would be no way to produce a signature over that newly created data. There are two workarounds: either this data is precomputed in the update call or all raw data is sent to the frontend, which verifies it and does the calculations. Combining these features, a canister should be able to certify a variable in a query response with this [design](https://medium.com/dfinity/how-internet-computer-responses-are-certified-as-authentic-2ff1bb1ea659).
+
+On a high level, in your canister:
+1. Choose an [authenticated data structure](https://cs.brown.edu/research/pubs/pdfs/2003/Tamassia-2003-ADS.pdf) like Merkle trees to store a value in canister memory.
+2. In the **update** call:
+ - Perform the computation and store the result in the Merkle tree.
+ - The lookup path for the result must act as its `key`. Ideally this `key` should be the parameters provided by the caller in the query method.
+   - Recompute the Merkle proof (`root_hash`).
+ - Store the `root_hash` as the canister's certified data.
+ - Return the `key` as response.
+3. In the **query** call:
+ - Fetch the result from the Merkle structure using the query parameters as the lookup path.
+ - Fetch the current `certified_data` for the canister.
+ - Compute the witness for the result using the same lookup path. The Merkle witness provides proof of inclusion that the requested result exists in the Merkle tree under the given path.
+ - Return `(result, certified_data, witness)` as the response.
+
+The rest of the section shows an example canister, which can serve a certified response for a `query` using `certified_data` that is verified in the frontend. The examples are written in Rust and Motoko, but the overall design can be implemented in other languages.
+
+### Building a canister with certified variables
+Let's consider the following canister interface:
+
+```candid
+type User = record {
+ name: text;
+ age: nat8;
+};
+
+type CertifiedUser = record {
+ user : User;
+ certificate : blob;
+ witness : blob;
+};
+
+service : {
+ "set_user": (User) -> (nat64);
+ "get_user": (nat64) -> (CertifiedUser) query;
+}
+```
+
+The canister exposes the following service:
+- **set_user**: The caller provides a `User` object to the canister. The canister records it and serves a corresponding `index` for the entry as the response. Since `certified_data` can only store 32 bytes of data, it uses a specialized data structure from `ic_certified_map` to store the `User` data.
+ - The data structure internally stores the data in a `HashTree` (or [Merkle tree](https://en.wikipedia.org/wiki/Merkle_tree)) and records the `root_hash` of the data structure in the `certified_data`, which is 32 bytes.
+ - The `root_hash` cryptographically guarantees that only one tree can correspond to that hash. The `root_hash` is also referred to as the Merkle proof.
+- **get_user**: The caller provides a `index: nat64` to the canister and gets a certified response for the corresponding `User`. The `CertifiedUser` response must have the following structure for verifying the response:
+ - **user**: The actual response.
+ - **certificate**: The payload for verifying the signature on the `certified_data`. ICP provides the system API `data_certificate()` for this.
+ - **witness**: Allows for the final verification of the response to be completed with the requested input and `certified_data`.
+
+You can find an example implementation of the canister below.
+
+**Motoko:**
+
+```motoko
+import CertifiedData "mo:core/CertifiedData";
+import Blob "mo:core/Blob";
+import Nat8 "mo:core/Nat8";
+import Debug "mo:core/Debug";
+import Text "mo:core/Text";
+import Nat64 "mo:core/Nat64";
+import Array "mo:core/Array";
+import CertTree "mo:ic-certification/CertTree";
+import CV "mo:cbor/Value";
+import CborEncoder "mo:cbor/Encoder";
+import CborDecoder "mo:cbor/Decoder";
+
+actor CertifiedVariable {
+
+ type User = {
+ name : Text;
+ age : Nat8;
+ };
+
+ type CertifiedUser = {
+ user : User;
+ certificate : Blob;
+ witness : Blob;
+ };
+
+ stable var count : Nat64 = 0;
+ stable let cert_store : CertTree.Store = CertTree.newStore();
+ let ct = CertTree.Ops(cert_store);
+
+ public func set_user(user : User) : async Nat64 {
+ count += 1;
+ let path : [Blob] = [Text.encodeUtf8("user"), blobOfNat64(count)];
+ ct.put(path, encodeUser(user));
+ ct.setCertifiedData();
+ return count;
+ };
+
+ public query func get_user(index : Nat64) : async CertifiedUser {
+ let certificate = switch (CertifiedData.getCertificate()) {
+ case (?certificate) {
+ certificate;
+ };
+ case (null) {
+ Debug.trap("Certified data not set");
+ };
+ };
+
+ let path : [Blob] = [Text.encodeUtf8("user"), blobOfNat64(index)];
+
+ let value = switch (ct.lookup(path)) {
+ case (?value) {
+ value;
+ };
+ case (null) {
+ Debug.trap("Lookup failed");
+ };
+ };
+
+ let user : User = decodeUser(value);
+ let witness = ct.encodeWitness(ct.reveal(path));
+
+ let certifiedUser : CertifiedUser = {
+ certificate = certificate;
+ witness = witness;
+ user = user;
+ };
+
+ return certifiedUser;
+ };
+
+ func encodeUser(user : User) : Blob {
+ let bytes : CV.Value = #majorType5([
+ (#majorType3("name"), #majorType3(user.name)),
+ (#majorType3("age"), #majorType0(Nat64.fromNat(Nat8.toNat(user.age)))),
+ ]);
+
+ let #ok(encoded_user) = CborEncoder.encode(bytes);
+ return Blob.fromArray(encoded_user);
+ };
+
+ func decodeUser(bytes : Blob) : User {
+ let #ok(#majorType5(map)) = CborDecoder.decode(bytes);
+ let name_tag = Array.find<(CV.Value, CV.Value)>(map, func x = x.0 == #majorType3("name"));
+ let age_tag = Array.find<(CV.Value, CV.Value)>(map, func x = x.0 == #majorType3("age"));
+
+ let name = switch (name_tag) {
+ case (?name_value) {
+ let #majorType3(name) = name_value.1;
+ name;
+ };
+ case (null) {
+ Debug.trap("Decoding failed for name");
+ };
+ };
+
+ let age = switch (age_tag) {
+ case (?age_value) {
+ let #majorType0(age) = age_value.1;
+ Nat8.fromNat(Nat64.toNat(age));
+ };
+ case (null) {
+ Debug.trap("Decoding failed for age");
+ };
+ };
+
+ return {
+ name = name;
+ age = age;
+ };
+ };
+
+ func blobOfNat64(n : Nat64) : Blob {
+ let byteMask : Nat64 = 0xff;
+ func byte(x : Nat64) : Nat8 {
+ Nat8.fromNat(Nat64.toNat(x));
+ };
+ Blob.fromArray([
+ byte(((byteMask << 56) & n) >> 56),
+ byte(((byteMask << 48) & n) >> 48),
+ byte(((byteMask << 40) & n) >> 40),
+ byte(((byteMask << 32) & n) >> 32),
+ byte(((byteMask << 24) & n) >> 24),
+ byte(((byteMask << 16) & n) >> 16),
+ byte(((byteMask << 8) & n) >> 8),
+ byte(((byteMask << 0) & n) >> 0),
+ ]);
+ };
+
+};
+```
+
+**Rust:**
+
+```rust
+use candid::CandidType;
+use ic_certified_map::HashTree;
+use ic_certified_map::{leaf_hash, AsHashTree, Hash, RbTree};
+use serde::{Deserialize, Serialize};
+use std::borrow::Cow;
+use std::cell::Cell;
+use std::cell::RefCell;
+
+#[derive(CandidType, Serialize, Deserialize, Clone)]
+struct User {
+ name: String,
+ age: u8,
+}
+
+impl AsHashTree for User {
+ fn root_hash(&self) -> Hash {
+ let user_serialized = serde_cbor::to_vec(&self).unwrap();
+ leaf_hash(&user_serialized[..])
+ }
+ fn as_hash_tree(&self) -> HashTree<'_> {
+ HashTree::Leaf(Cow::from(serde_cbor::to_vec(&self).unwrap()))
+ }
+}
+
+#[derive(CandidType)]
+struct CertifiedUser {
+ user: User,
+    certificate: Vec<u8>,
+    witness: Vec<u8>,
+}
+
+thread_local! {
+    static INDEX: Cell<u64> = Cell::new(0);
+    static TREE: RefCell<RbTree<&'static str, RbTree<[u8; 8], User>>> = RefCell::new(RbTree::new());
+}
+
+#[ic_cdk::update]
+fn set_user(user: User) -> u64 {
+ let index = INDEX.with(|index| {
+ let count = index.get() + 1;
+ index.set(count);
+ count
+ });
+
+ TREE.with_borrow_mut(|tree| {
+ match tree.get(b"user") {
+ Some(_) => {
+ tree.modify(b"user", |inner| {
+ inner.insert(index.to_be_bytes(), user);
+ });
+ }
+ None => {
+ let mut inner = RbTree::new();
+ inner.insert(index.to_be_bytes(), user);
+ tree.insert("user", inner);
+ }
+ }
+ ic_cdk::api::set_certified_data(&tree.root_hash());
+ });
+ index
+}
+
+#[ic_cdk::query]
+fn get_user(index: u64) -> CertifiedUser {
+ let certificate = ic_cdk::api::data_certificate().expect("No data certificate available");
+
+ TREE.with_borrow(|tree| {
+ let user = match tree.get(b"user") {
+ Some(inner) => {
+ let user = inner.get(&index.to_be_bytes()[..]).expect("User not found");
+ user.to_owned()
+ }
+ None => {
+ panic!("Tree isn't initialized");
+ }
+ };
+
+ let mut witness = vec![];
+ let mut witness_serializer = serde_cbor::Serializer::new(&mut witness);
+ let _ = witness_serializer.self_describe();
+ tree.nested_witness(b"user", |inner| inner.witness(&index.to_be_bytes()[..]))
+ .serialize(&mut witness_serializer)
+ .unwrap();
+
+ CertifiedUser {
+ user,
+ certificate,
+ witness,
+ }
+ })
+}
+```
+
+### Verifying certified variables
+
+Once you have the response `CertifiedUser`, for the integrity guarantee, the frontend must verify the certification in the response. This is broken down into several steps implemented in the Rust and JavaScript example below.
+
+:::note
+The example has some extra steps to set up the canister with some `User` data before verification. You can ignore the section marked between `// ==== START of canister data setup` and `// ==== END of canister data setup`.
+:::
+
+1. Verify the IC certificate: Recompute the `root_hash` of `certificate.tree` (pruned state tree with the canister's `certified_data`) and verify the `certificate.signature` with `root_hash` as the message, `certificate.delegation`, and the IC `root_key` as the public key. This confirms that the signature is valid for the current state tree.
+2. Validate that the response is not stale by checking that the time at `/time` in `certificate.tree` is within a certain delta of the current time. The recommended delta is 5 minutes but should be adapted to the use case.
+3. Recompute the `root_hash` of the witness and verify equality with the `certified_data`. The `certified_data` can be obtained from `certificate.tree` under the path `/canister/<canister_id>/certified_data`.
+4. Check if the query parameters are in the witness. In this example, the lookup path is `/user/<index>` and should be present in the witness.
+5. Validate that the value found at `/user/<index>` matches `user` from the response.
+6. If all of the previous steps succeed, return `user` as the valid response.
+
+**Rust (client-side verification):**
+
+```rust
+use arbitrary::{Arbitrary, Unstructured};
+use candid::Encode;
+use candid::Principal;
+use candid::{CandidType, Decode, Deserialize};
+use futures::future::join_all;
+use ic_agent::identity::AnonymousIdentity;
+use ic_agent::Agent;
+use ic_certificate_verification::validate_certificate_time;
+use ic_certificate_verification::VerifyCertificate;
+use ic_certification::hash_tree::HashTree;
+use ic_certification::{Certificate, LookupResult};
+use rand::prelude::*;
+use serde_cbor::Deserializer;
+use std::time::{SystemTime, UNIX_EPOCH};
+
+#[derive(CandidType, Deserialize, Debug, PartialEq, Eq, Arbitrary)]
+struct User {
+ name: String,
+ age: u8,
+}
+
+#[derive(CandidType, Deserialize)]
+struct CertifiedUser {
+ user: User,
+ certificate: Vec<u8>,
+ witness: Vec<u8>,
+}
+
+static URL: &str = "http://localhost:41749";
+static CANISTER: &str = "a3shf-5eaaa-aaaaa-qaafa-cai";
+const MAX_CERT_TIME_OFFSET_NS: u128 = 300_000_000_000; // 5 min
+const MAX_CALLS: usize = 10;
+
+#[tokio::main]
+async fn main() {
+
+ let agent = Agent::builder()
+ .with_url(URL)
+ .with_identity(AnonymousIdentity)
+ .build()
+ .expect("Unable to create agent");
+
+ // This should be done only in demo environments.
+ // When interacting with mainnet, hardcode the root_key.
+ agent
+ .fetch_root_key()
+ .await
+ .expect("Unable to fetch root key");
+ let root_key = agent.read_root_key();
+
+ let canister_id = Principal::from_text(CANISTER).unwrap();
+
+ // ==== START of canister data setup
+ let mut rng = rand::thread_rng();
+
+ // Make MAX_CALLS to set_user
+ let mut get_user_calls = Vec::new();
+ for _ in 0..MAX_CALLS {
+ let bytes: [u8; 16] = rng.gen();
+ let mut u = Unstructured::new(&bytes[..]);
+ let temp_user = User::arbitrary(&mut u).unwrap();
+
+ println!("Calling set_user with {:?}", temp_user);
+ let response = agent
+ .update(&canister_id, "set_user")
+ .with_effective_canister_id(canister_id)
+ .with_arg(Encode!(&temp_user).unwrap())
+ .call_and_wait();
+ get_user_calls.push(response);
+ }
+ let results: Vec<u64> = join_all(get_user_calls)
+ .await
+ .into_iter()
+ .map(|result| {
+ Decode!(
+ result
+ .expect("Query call get_user failed")
+ .as_slice(),
+ u64
+ )
+ .unwrap()
+ })
+ .collect();
+
+ // From response indexes, choose a random index for get_user
+ let index: usize = rng.gen();
+ let index: u64 = *results.get(index % MAX_CALLS).unwrap();
+ // ==== END of canister data setup
+
+ println!("Fetching index {:?}", index);
+
+ let query_response = agent
+ .query(&canister_id, "get_user")
+ .with_effective_canister_id(canister_id)
+ .with_arg(Encode!(&index).unwrap())
+ .call()
+ .await
+ .expect("Unable to call query call get_user");
+
+ let certified_user = Decode!(&query_response, CertifiedUser).unwrap();
+
+ let mut deserializer = Deserializer::from_slice(&certified_user.certificate);
+ let certificate: Certificate = serde::de::Deserialize::deserialize(&mut deserializer).unwrap();
+
+ let start = SystemTime::now();
+ let current_time = start
+ .duration_since(UNIX_EPOCH)
+ .expect("Time went backwards")
+ .as_nanos();
+
+ // Step 1: Check if signature in the certificate can be validated with the
+ // root_hash of the tree in certificate as message and root_key as public_key
+ let verification_result = certificate.verify(canister_id.as_slice(), &root_key[..]);
+
+ println!(
+ "Step 1: Digest match & Signature verification: {:?}",
+ verification_result
+ );
+
+ // Step 2: Check if the response is not stale with the given time offset MAX_CERT_TIME_OFFSET_NS.
+ let time_verification_result =
+ validate_certificate_time(&certificate, &current_time, &MAX_CERT_TIME_OFFSET_NS);
+
+ println!("Step 2: Time skew: {:?}", time_verification_result);
+
+ // Step 3: Check if witness root_hash matches the certified_data
+ let lookup_result =
+ certificate
+ .tree
+ .lookup_path([b"canister", canister_id.as_slice(), b"certified_data"]);
+
+ let certified_data: [u8; 32] = match lookup_result {
+ LookupResult::Found(result) => result.try_into().unwrap(),
+ _ => panic!("Certified data not found"),
+ };
+
+ let mut deserializer = Deserializer::from_slice(&certified_user.witness);
+ let witness_decoded: HashTree<Vec<u8>> =
+ serde::de::Deserialize::deserialize(&mut deserializer).unwrap();
+ let witness_digest = witness_decoded.digest();
+
+ println!(
+ "Step 3: Witness digest matches certified data: {:?} ",
+ witness_digest == certified_data
+ );
+
+ // Step 4: Check if the query parameters are in the witness
+ let witness_lookup: User =
+ match witness_decoded.lookup_path([b"user", &index.to_be_bytes()[..]]) {
+ LookupResult::Found(result) => serde_cbor::from_slice(result).unwrap(),
+ _ => panic!("user {} not found", index),
+ };
+
+ // Step 5: Check if the data found in Witness matches the returned result from the query.
+ println!(
+ "Step 4 & Step 5: Witness data matches User value: {:?}",
+ witness_lookup == certified_user.user
+ );
+
+ // Step 6: Return the result
+ println!("Result: {:?}", certified_user.user);
+}
+```
+
+**JavaScript (client-side verification):**
+
+```js
+import pkg from "@dfinity/agent";
+const { Actor, HttpAgent, Certificate, blsVerify, Cbor, reconstruct, lookup_path } = pkg;
+import { IDL } from "@dfinity/candid";
+import { Principal } from "@dfinity/principal";
+import fetch from "isomorphic-fetch";
+import assert from "node:assert/strict";
+
+const idlFactory = ({ IDL }) => {
+ const User = IDL.Record({ age: IDL.Nat8, name: IDL.Text });
+ const CertifiedUser = IDL.Record({
+ certificate: IDL.Vec(IDL.Nat8),
+ user: User,
+ witness: IDL.Vec(IDL.Nat8),
+ });
+ return IDL.Service({
+ get_user: IDL.Func([IDL.Nat64], [CertifiedUser], ["query"]),
+ set_user: IDL.Func([User], [IDL.Nat64], []),
+ });
+};
+
+const canisterId = Principal.fromText("a3shf-5eaaa-aaaaa-qaafa-cai");
+const host = "http://localhost:35777";
+
+await start();
+
+async function start() {
+ const agent = new HttpAgent({ fetch, host });
+ await agent.fetchRootKey();
+
+ const rootKey = agent.rootKey.buffer;
+ let dummyUser = { name: "test_user", age: 21 };
+
+ const actor = Actor.createActor(idlFactory, {
+ agent,
+ canisterId,
+ });
+
+ let index = await actor.set_user(dummyUser);
+ let certifiedUser = await actor.get_user(index);
+
+ await verifyCertificate(certifiedUser, index, rootKey, canisterId);
+}
+
+async function verifyCertificate(certifiedUser, index, rootKey, canisterId) {
+ const certificate = certifiedUser.certificate.buffer;
+ const witness = certifiedUser.witness.buffer;
+ const user = certifiedUser.user;
+
+ const cert = new Certificate(certificate, rootKey, canisterId, blsVerify);
+
+ // Step 1: Check if signature in the certificate can be validated with the
+ // root_hash of the tree in certificate as message and root_key as public_key
+ await cert.verify();
+ console.log("Certificate verification succeeded");
+
+ // Step 2: Check if the response is not stale with the given time offset of 5m.
+ const te = new TextEncoder();
+ const pathTime = [te.encode("time")];
+ const rawTime = cert.lookup(pathTime).value;
+ console.log("Time skew: ", verifyTime(rawTime));
+
+ // Step 3: Check if witness root_hash matches the certified_data
+ const pathData = [
+ te.encode("canister"),
+ canisterId.toUint8Array(),
+ te.encode("certified_data"),
+ ];
+
+ const certifiedData = cert.lookup(pathData).value;
+ let witnessTree = Cbor.decode(witness);
+ let witnessRootHash = await reconstruct(witnessTree);
+ console.log(
+ "Verify CertifiedData matches witness root_hash: ",
+ Buffer.compare(Buffer.from(certifiedData), Buffer.from(witnessRootHash)) === 0
+ );
+
+ // Step 4: Check if the query parameters are in the witness
+ const query_params = [te.encode("user"), bigEndian(index).buffer];
+ const witnessData = Cbor.decode(lookup_path(query_params, witnessTree).value);
+ console.log("Witness data: ", witnessData);
+
+ // Step 5: Check if the data found in Witness matches the returned result from the query.
+ assert.deepStrictEqual(witnessData, user, "Value matches response data");
+
+ // Step 6: Return the result
+ return user;
+}
+
+function verifyTime(rawTime) {
+ const idlMessage = new Uint8Array([
+ ...new TextEncoder().encode("DIDL\x00\x01\x7d"),
+ ...new Uint8Array(rawTime),
+ ]);
+ const decodedTime = IDL.decode([IDL.Nat], idlMessage)[0];
+ const time = Number(decodedTime) / 1e9;
+ const now = Date.now() / 1000;
+ const diff = Math.abs(time - now);
+ if (diff > 5) {
+ return false;
+ }
+ return true;
+}
+
+function bigEndian(n) {
+ let buf = new Uint8Array(8);
+
+ for (let i = 7; i >= 0; i--) {
+ buf[i] = Number(n & 0xffn);
+ n >>= 8n;
+ }
+ return buf;
+}
+```
+
+## Use HTTP asset certification and avoid serving your dapp through `raw.icp0.io`
+
+### Security concern
+
+Dapps on ICP can use [asset certification](https://learn.internetcomputer.org/hc/en-us/articles/34276431179412-Asset-Certification) to make sure the HTTP assets delivered to the browser are authentic (i.e., threshold-signed by the subnet). If an app does not do asset certification, it can only be served insecurely through `raw.icp0.io`, where no asset certification is checked. This is insecure since a single malicious node or boundary node can freely modify the assets delivered to the browser.
+
+If an app is served through `raw.icp0.io` in addition to `icp0.io`, an adversary may trick users (phishing) into using the insecure `raw.icp0.io`.
+
+### Recommendation
+
+- Only serve assets through `.icp0.io`, where the boundary nodes enforce response verification on the served assets. Do not serve through `.raw.icp0.io`.
+
+- Serve assets using the asset canister, which creates asset certification automatically, or add the `ic-certificate` header including the asset certification as, e.g., done in the [NNS dapp](https://github.com/dfinity/nns-dapp) and [Internet Identity](https://github.com/dfinity/internet-identity).
+
+- In the canister's `http_request` method, check whether the request came in through the `raw` domain. If so, return an error and do not serve any assets.
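+
+The last recommendation can be sketched as a small helper that inspects the `Host` header forwarded by the HTTP gateway. The `HttpRequest` struct below is a simplified stand-in for the request type in a canister's `http_request` Candid interface; the names are illustrative, not an exact CDK API:
+
+```rust
+// Simplified stand-in for the HTTP gateway request type; in a real canister
+// this comes from the `http_request` Candid interface.
+struct HttpRequest {
+    headers: Vec<(String, String)>,
+}
+
+// Returns true if the request arrived via the uncertified `raw` domain,
+// judging by the `Host` header forwarded by the HTTP gateway.
+fn is_raw_domain(req: &HttpRequest) -> bool {
+    req.headers.iter().any(|(name, value)| {
+        name.eq_ignore_ascii_case("host") && value.to_ascii_lowercase().contains(".raw.icp0.io")
+    })
+}
+
+fn main() {
+    let raw = HttpRequest {
+        headers: vec![("Host".into(), "a3shf-5eaaa-aaaaa-qaafa-cai.raw.icp0.io".into())],
+    };
+    let certified = HttpRequest {
+        headers: vec![("Host".into(), "a3shf-5eaaa-aaaaa-qaafa-cai.icp0.io".into())],
+    };
+    // An actual `http_request` implementation would return an error response
+    // (e.g., HTTP 400) instead of the asset whenever this returns true.
+    println!("raw domain: {}", is_raw_domain(&raw)); // raw domain: true
+    println!("certified domain: {}", is_raw_domain(&certified)); // certified domain: false
+}
+```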
+
+
diff --git a/docs/guides/security/data-integrity.md b/docs/guides/security/data-integrity.md
deleted file mode 100644
index ad7d37d..0000000
--- a/docs/guides/security/data-integrity.md
+++ /dev/null
@@ -1,455 +0,0 @@
----
-title: "Data Integrity"
-description: "Protect data confidentiality and authenticity in canisters using vetKeys encryption, identity-based encryption, certified variables, and signature verification."
-sidebar:
- order: 3
----
-
-Data on the Internet Computer faces two distinct threats: **confidentiality** (unauthorized parties reading data) and **authenticity** (verifying that data hasn't been tampered with). This guide covers the IC mechanisms that address both: vetKeys for onchain encryption, certified variables for cryptographic data authenticity, and signature verification for external data.
-
-For a conceptual overview of how these fit into the IC security model, see [Security model](../../concepts/security.md). For a deeper look at the vetKeys cryptographic protocol, see [vetKeys](../../concepts/vetkeys.md).
-
-## Onchain encryption with vetKeys
-
-Canister state on standard application subnets is readable by node operators. If your application stores private data (notes, messages, files), you must encrypt it before storing. vetKeys (verifiably encrypted threshold keys) give canisters access to cryptographic key material derived by a threshold quorum of subnet nodes. No single node ever holds the raw key.
-
-The core workflow:
-
-1. The client generates an ephemeral **transport key pair**
-2. The canister calls `vetkd_derive_key` on the management canister, which derives a key encrypted under the client's transport public key
-3. The client decrypts the result with its transport private key to obtain the raw vetKey
-4. The client uses the vetKey to encrypt or decrypt data locally
-
-No key material ever leaves the subnet in plaintext. The canister never sees the raw key.
-
-### Prerequisites
-
-**Rust:**
-
-```toml
-[dependencies]
-ic-cdk = "0.19"
-ic-vetkeys = "0.6"
-ic-stable-structures = "0.7"
-```
-
-**Motoko** (`mops.toml`):
-
-```toml
-[dependencies]
-core = "2.0.0"
-```
-
-**Frontend:**
-
-```bash
-npm install @dfinity/vetkeys
-```
-
-> **API stability:** The `ic-vetkeys` crate and `@dfinity/vetkeys` package are published but their APIs may still change. Pin the versions above and check the [DFINITY forum](https://forum.dfinity.org) for migration guides before upgrading.
-
-### Key names and environments
-
-| Key name | Environment | Cycle cost (approx.) |
-|----------|-------------|----------------------|
-| `test_key_1` | Local + mainnet (testing) | ~10B cycles |
-| `key_1` | Mainnet (production) | ~26B cycles |
-
-Use `test_key_1` during development and mainnet testing. Switch to `key_1` for production. `vetkd_public_key` does not cost cycles; only `vetkd_derive_key` does.
-
-### Rust implementation
-
-The `ic-vetkeys` crate provides a high-level `KeyManager` that handles access control and stable storage. For simpler use cases, you can also call the management canister directly.
-
-**Using `ic-vetkeys` KeyManager (recommended):**
-
-Initialize the `KeyManager` with stable memory and a key ID in the `init` hook:
-
-```rust
-use ic_stable_structures::memory_manager::{MemoryId, MemoryManager};
-use ic_stable_structures::DefaultMemoryImpl;
-use ic_vetkeys::key_manager::KeyManager;
-use ic_vetkeys::types::{AccessRights, VetKDCurve, VetKDKeyId};
-
-thread_local! {
- static MEMORY_MANAGER: std::cell::RefCell<MemoryManager<DefaultMemoryImpl>> =
- std::cell::RefCell::new(MemoryManager::init(DefaultMemoryImpl::default()));
- static KEY_MANAGER: std::cell::RefCell