diff --git a/.github/CODEOWNERS b/.github/CODEOWNERS index dbd77ef..7e493ac 100644 --- a/.github/CODEOWNERS +++ b/.github/CODEOWNERS @@ -3,3 +3,9 @@ # DX team members are also configured as bypass actors in the branch ruleset # and can merge their own PRs without a separate review. * @dfinity/dx + +# Security — product-security team must approve changes to security best practices +docs/guides/security/ @dfinity/product-security @dfinity/dx +docs/concepts/security.md @dfinity/product-security @dfinity/dx +docs/references/message-execution-properties.md @dfinity/product-security @dfinity/dx +docs/guides/canister-calls/idempotency.md @dfinity/product-security @dfinity/dx diff --git a/docs/404.mdx b/docs/404.mdx index a0f159d..dfa8a52 100644 --- a/docs/404.mdx +++ b/docs/404.mdx @@ -16,7 +16,7 @@ Pick a guide that matches what you're building. - **[Frontends](/guides/frontends/asset-canister/)**: Serve assets, integrate frameworks, configure custom domains, and certify responses. - **[Authentication](/guides/authentication/internet-identity/)**: Add passwordless login and verifiable user identity with Internet Identity. - **[Chain Fusion](/guides/chain-fusion/bitcoin/)**: Connect canisters to Bitcoin, Ethereum, and Solana. -- **[Security](/guides/security/access-management/)**: Access control, encryption, DoS prevention, and safe upgrade patterns. +- **[Security](/guides/security/identity-and-access-management/)**: Access control, DoS prevention, and safe upgrade patterns. ## Other places to look diff --git a/docs/concepts/security.md b/docs/concepts/security.md index ef23dff..b281fd1 100644 --- a/docs/concepts/security.md +++ b/docs/concepts/security.md @@ -61,7 +61,7 @@ The following threats are your responsibility to mitigate: Every update method is publicly callable. If you do not check the caller, anyone can invoke admin functions, drain funds, or corrupt state. 
The anonymous principal (`2vxsx-fae`) is a particularly common gap: it must be explicitly rejected in any authenticated endpoint, because otherwise it acts as a shared identity that anyone can use. -See [Access management](../guides/security/access-management.md#reject-anonymous-callers) for implementation patterns. +See [Access management](../guides/security/identity-and-access-management.md#reject-anonymous-callers) for implementation patterns. ### Reentrancy and async interleaving @@ -97,11 +97,11 @@ Users have no way to verify that a canister's running code matches its published ## What's next -- [Access management](../guides/security/access-management.md): caller checks, guards, and role-based access control +- [Access management](../guides/security/identity-and-access-management.md): caller checks, guards, and role-based access control - [Upgrade safety](../guides/security/canister-upgrades.md): safe upgrade patterns - [Inter-canister call safety](../guides/security/inter-canister-calls.md): async pitfalls and mitigations - [DoS prevention](../guides/security/dos-prevention.md): cycle drain protection -- [Data integrity](../guides/security/data-integrity.md): input validation and storage safety +- [Data integrity](../guides/security/data-integrity-and-authenticity.md): input validation and storage safety - [Response certification](../guides/frontends/certification.md): certified variables for query responses diff --git a/docs/concepts/vetkeys.md b/docs/concepts/vetkeys.md index e27d660..cc14ef6 100644 --- a/docs/concepts/vetkeys.md +++ b/docs/concepts/vetkeys.md @@ -99,7 +99,6 @@ The vetKD management canister API is live on mainnet. 
The `ic-vetkeys` Rust crat ## Next steps -- [Encryption with VetKeys](../guides/security/encryption.md): implement encrypted storage, IBE, and the full end-to-end key derivation flow - [Chain-Key Cryptography](chain-key-cryptography.md): the threshold cryptographic foundation that vetKeys build on - [Security](security.md): where vetKeys fit in the broader canister security model diff --git a/docs/guides/backends/randomness.md b/docs/guides/backends/randomness.md index ec94e8f..59438e6 100644 --- a/docs/guides/backends/randomness.md +++ b/docs/guides/backends/randomness.md @@ -225,7 +225,7 @@ Note: this example predates `mo:core` and uses the older `Random.Finite` API. Th - [Verifiable Randomness (concept)](../../concepts/verifiable-randomness.md): how the IC's threshold VRF works - [Management Canister](../../references/management-canister.md): `raw_rand` API reference -- [Data Integrity](../security/data-integrity.md): using randomness in a secure application design +- [Data Integrity](../security/data-integrity-and-authenticity.md): using randomness in a secure application design - [Inter-canister calls](../canister-calls/inter-canister-calls.md#reentrancy): async patterns and reentrancy diff --git a/docs/guides/canister-calls/idempotency.md b/docs/guides/canister-calls/idempotency.md new file mode 100644 index 0000000..baff003 --- /dev/null +++ b/docs/guides/canister-calls/idempotency.md @@ -0,0 +1,121 @@ +--- +title: "Safe Retries and Idempotency" +description: "Design idempotent canister APIs to enable safe retries for ingress calls and bounded-wait inter-canister calls, preventing double-spend and other correctness issues." +--- + +In the case of network issues or other unexpected behavior, ICP clients (such as agents) that issue ingress update calls may be unable to determine whether their ingress request has been processed. 
For example, this can happen if the client loses connection until after the request's ingress expiry passes and the request's status is removed from the system state tree. + +Similarly, canisters that call other canisters using bounded-wait calls may be unable to determine whether the call succeeded. + +This is risky because the callers (external users or applications for ingress messages, or canisters for inter-canister calls) might decide to retry the transaction, potentially leading to serious security vulnerabilities such as double spending. + +Thus, it is important to design (or choose) canister APIs that allow requests to be retried safely, even when the ICP provides no information about previous request attempts. This page describes general approaches that both canister authors and clients can adopt to enable safe retries. + +## Idempotent canister APIs + +A canister endpoint is idempotent if executing it multiple times is equivalent to executing it once.[^1] Whenever an endpoint is idempotent, or can be made idempotent by the developer, this provides an easy way to implement safe retries. + +Given an idempotent endpoint, an external application can implement retries by repeating the call until it observes a certified response, that is, either a replied or a rejected status; see the illustration below. If such a response is ever observed, it is certain that the transaction has been executed at least once, which, thanks to idempotency, has the same result as executing it exactly once. However, the application may not be willing to wait indefinitely for a response and can implement a timeout. Upon timeout, an error should be displayed instructing the user to wait until the last message that was sent has expired (as defined by the request's `ingress_expiry`) and then manually check the status of the transaction. Ideally, timeouts should be rare and not occur during normal operation.
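+The client-side loop described here can be sketched as follows. This is a plain-Rust sketch, not real agent code: the `Attempt` and `Outcome` types and the injected `call` closure are hypothetical stand-ins for an agent's call-and-`read_state` machinery.

```rust
use std::time::{Duration, Instant};

// Outcome of a single call attempt, as seen by the client.
enum Attempt {
    // A certified reply or reject: the result is definite.
    Certified(Result<String, String>),
    // Connection lost, status pruned from the state tree, etc.
    Unknown,
}

enum Outcome {
    Definite(Result<String, String>),
    // The caller must wait until the ingress expiry passes and then
    // check the transaction status manually.
    TimedOut,
}

// Retry an idempotent call until a certified response is observed or the
// overall timeout elapses. This is safe only because the endpoint is
// idempotent: the call may execute more than once before a definite
// answer is seen.
fn retry_idempotent(
    mut call: impl FnMut() -> Attempt,
    timeout: Duration,
    backoff: Duration,
) -> Outcome {
    let deadline = Instant::now() + timeout;
    loop {
        match call() {
            Attempt::Certified(result) => return Outcome::Definite(result),
            Attempt::Unknown if Instant::now() >= deadline => return Outcome::TimedOut,
            Attempt::Unknown => std::thread::sleep(backoff),
        }
    }
}
```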
+ +```plantuml +actor User +participant "Web Browser" as Browser +participant Agent +participant "Boundary Node" as BN +participant "IC Node" as IC + +User -> Browser: Start transaction + +loop until certified response or timeout + Browser -> Agent: idempotent call + Agent -> BN: call & subsequent read_state calls + BN -> IC + IC --> BN + BN --> Agent: certified response or error + Agent --> Browser: certified response or error +end + +Browser --> User: certified response\nor timeout error message +``` + +The situation is similar for bounded-wait inter-canister calls. Given an idempotent endpoint, the calling canister can keep retrying until a response other than `SYS_UNKNOWN` is observed or give up after a timeout if waiting indefinitely is not an option. + +Below are two approaches to making endpoints idempotent: sequence numbers and (time window) ID deduplication. + +### Update sequence numbers + +An endpoint can make use of sequence numbers to provide idempotency by taking a sequence number parameter in addition to other parameters. In the extreme case, a canister could keep a single expected sequence number for every endpoint, and a call could only be accepted if it contained the next expected sequence number, causing the expected sequence number to be incremented upon call execution. This trivially implies that any call can only be executed once. More practically, an expected sequence number is kept for each caller principal, or, in the case of ledger-like canisters, each ledger account. Note that Ethereum implements this mechanism. + +The advantages of this approach are: + +1. Sequence numbers are simple to implement and understand. +2. When applicable, it has a modest memory footprint because only the next expected sequence number must be stored (for example, per active account). + +The approach also has some disadvantages: + +1. It limits the throughput. 
When per-caller sequence numbers are used, the caller can generally perform only one ingress call per consensus block, translating to a throughput of about 1 ingress call per second for that user. The situation is better for inter-canister calls, as the requests (if delivered) will be delivered in the order in which they were sent. Thus, the calling canister can issue multiple requests simultaneously, using appropriate sequence numbers. Under normal load, all requests should be delivered. However, under heavy load where the system may drop some requests, requests that follow such a dropped request may become invalid. + +2. It limits concurrency. The user has to sequentialize all their calls. This is straightforward when the user is another canister, but it can be much more difficult when the canister is called through ingress messages, in particular when the user accesses the canister from multiple clients or devices. This concurrency problem also makes the approach inapplicable to cases where anonymous users are allowed to trigger update calls. + +3. If the sequence number is stored per user or per account, tracking sequence numbers for too many users can exhaust the canister memory, even if each individual number is small. An attacker could exploit this to exhaust the memory. The approach is thus best suited for cases where the user has to pay for the usage in some way (e.g., the ledgers usually require a fee to both create an account and transfer funds), which thwarts attackers by requiring them to invest significant funds in an attack. + +### ID deduplication + +Another approach to idempotency is to make the calls uniquely identifiable on the receiving canister side (e.g., by using user-chosen IDs, sequence numbers, or a combination of several argument fields), so that a given call is executed at most once.
The canister then deduplicates calls before executing them; if a call with the same ID has been executed previously, the new call is simply ignored (potentially returning the result of the previous call). Thus, the user can safely keep retrying the call until they get a response. + +For example, the ICRC ledger standard provides deduplication in this way: issuing a transaction with identical values for all call parameters, including the `created_at_time` and `memo` parameters, makes the call idempotent, because the ledger deduplicates calls with the same parameters. + +However, a naive implementation of this approach can exhaust the canister memory, as all successfully executed IDs need to be kept around forever. Thus, the deduplication is usually limited to a certain time window. For example, the ICP ledger uses a 24-hour window, and the ICRC standard defines a configuration parameter `TX_WINDOW` that determines the window length. + +Moreover, the ICP/ICRC ledgers use the `created_at_time` parameter to limit the validity period of a call. Roughly, the call is only considered valid if its `created_at_time` is not in the future and at most 24 hours in the past.[^2] This prevents a retried call from being executed a second time after the deduplication window expires. + +Even with this improvement used in the ledgers, the time window approach implicitly assumes that the client will be able to get a definite answer to their call within the time window. For example, after the 24 hours expire, the user cannot easily tell if their ledger transfer happened; their only option is to analyze the ledger blocks, which is somewhat tedious and has to be done carefully to avoid asynchrony issues; see the section on [queryable call results](#queryable-call-results). + +Relying solely on a time window for deduplication does not guarantee bounded memory usage.
In theory, an unlimited number of updates could occur within the time window, though in practice this is constrained by the scaling limits of the ICP. The ICP/ICRC ledgers thus also define a maximum capacity: a limit on the number of deduplicated transactions (i.e., deduplication IDs) that can be stored in their deduplication store. Once this capacity is reached, further transactions are rejected until older transactions expire from the deduplication store at the end of the time window. A further extension of the approach is to guarantee deduplication for the stated time window as above, but to keep storing deduplication IDs beyond that window for as long as the capacity is not reached. This way, clients obtain a hard deduplication guarantee for the time window and a best-effort attempt to deduplicate transactions even past the window. + +An alternative is to do away with the time window and store the deduplication data forever. This requires spreading the data across multiple canisters in order to prevent exhausting canister memory, similar to how the ICP/ICRC ledgers store transaction data in the archive canister. This shifts the tedious part of querying the deduplication data (e.g., ledger blocks) from the user to the canister. + +To summarize, the advantages of this approach are: + +1. It can support high throughput. +2. It requires no synchronization on the part of the user and supports use cases like multiple devices. + +The disadvantages are: + +1. It is more complicated to implement than sequence numbers. +2. If a time window is used, it usually implicitly assumes that the user learns the call outcome within the time window. +3. Memory usage can grow fairly large when high throughput is combined with long deduplication windows. For example, supporting 100 transactions per second with a deduplication window of 24 hours can require hundreds of megabytes of heap space.
This can be mitigated by using multiple canisters to store the deduplication data, at the expense of further implementation complexity and higher latency. + +## Other approaches to safe retries + +In the absence of idempotent endpoints, or even in addition to them, clients may be able to use other endpoints to make their retries safe. + +### Queryable call results + +If the canister exposes, in addition to the update endpoint, a query that reports the result of the update, the client can use it for safe retries as follows: + +1. Attempt to perform the update. +2. If the result of the update is unknown (e.g., it is no longer present in the ingress history, or a `SYS_UNKNOWN` error is returned for an inter-canister call), query the call result endpoint to determine whether the update was applied. Moreover, one needs to ensure that the previously sent call cannot still be applied in the future. If the update was not applied and can no longer be applied, the call can safely be retried or reported as failed. + +In practice, this pattern may be more complicated. For example, the ICP ledger exposes a `query_blocks` method that can be used to implement the above pattern for transfers initiated as ingress messages: + +1. Call the `query_blocks` method on the ledger to determine the current last block (as specified in the `chain_length` field of the response). Let's call this `last_block`. +2. Attempt to perform a transfer. This ingress message includes an `ingress_expiry` field. +3. If the result of the transfer is unknown, ensure that the transfer will not be applied at a later point: + - If using ingress messages, call the `read_state` endpoint on the ledger canister to obtain the `/time` branch of the system state tree. Repeat this until the reported time exceeds the `ingress_expiry` time. + - If using inter-canister calls, perform all subsequent calls (`query_blocks`) listed below from the same canister that initiated the transfer.
The [ordering guarantees](../../references/message-execution-properties.md) then ensure that the transfer cannot happen later. +4. Call the `query_blocks` method on the ledger again to retrieve all ledger blocks since `last_block`, and check that the `timestamp` also exceeds the `ingress_expiry` time. In case of failure, retry until a result is obtained. Then, scan through the returned blocks to determine whether the transaction has been included. + +### 2-step transfers + +Another approach applicable to ledgers (such as ICRC-1 or ICP) is to perform transfers in two steps: + +1. First, transfer the tokens to an intermediate subaccount of the sender that's specific to this transaction. For example, if the transaction has a unique ID, the client can hash the ID to obtain a subaccount. The transferred amount should be the desired amount plus the ledger transaction fee. +2. If the result of the above transfer is unknown, query the balance of the transaction-specific subaccount. As in the [queryable call result](#queryable-call-results) approach, if using ingress messages, this should be repeated until the `timestamp` accompanying the response exceeds the `ingress_expiry`. If the balance is 0, the transaction can safely be reported as failed, or it can be retried (starting from step 1). If the balance is at least the expected amount, one can proceed. +3. If the transfer to the transaction-specific subaccount succeeded (as determined either by the transfer result or by the balance query above), the client sends another transfer from the transaction-specific subaccount to the desired target account. This can be repeated as many times as necessary until a result of the call is known. Once a result is known, the overall transfer can be declared successful even if this step fails with an error, since such an error signifies that a previous attempt to transfer the funds to the target already succeeded. + +[^1]: "Equivalent" is meant from the user perspective here.
Multiple executions may trigger changes such as those in the canister's cycle balance, but they are not relevant for the user. + +[^2]: More precisely, the ledger also allows for a small time drift of `created_at_time` into the future, which has to be taken into account when clearing the deduplication window. + + diff --git a/docs/guides/index.md b/docs/guides/index.md index ff1a2f4..506a02a 100644 --- a/docs/guides/index.md +++ b/docs/guides/index.md @@ -20,7 +20,7 @@ Practical how-to guides organized by development stage. Each guide solves a spec - **[Testing](testing/strategies.md)**: Write unit tests, run integration tests with PocketIC, and set up end-to-end testing. - **[Canister Management](canister-management/lifecycle.md)**: Deploy, upgrade, fund, optimize, and back up canisters. -- **[Security](security/access-management.md)**: Implement access control, encryption, DoS prevention, and safe upgrade patterns. +- **[Security](security/identity-and-access-management.md)**: Implement access control, DoS prevention, and safe upgrade patterns. ## Advanced features diff --git a/docs/guides/security/access-management.mdx b/docs/guides/security/access-management.mdx deleted file mode 100644 index 8c756eb..0000000 --- a/docs/guides/security/access-management.mdx +++ /dev/null @@ -1,375 +0,0 @@ ---- -title: "Access Management" -description: "Control who can call your canister with guards, caller checks, and controller management" -sidebar: - order: 1 ---- - -import { Tabs, TabItem } from '@astrojs/starlight/components'; - -Every canister method is callable by anyone on the internet. Without explicit access checks, any user or canister can invoke any of your public functions. This guide covers the patterns you need to restrict access. 
- -## Checklist - -Use this as a quick reference when securing your canister: - -- [ ] Reject the anonymous principal (`2vxsx-fae`) in every authenticated endpoint -- [ ] Check the caller inside each update method: not just in `canister_inspect_message` -- [ ] Use the `guard` attribute (Rust) or guard functions (Motoko) to enforce access rules -- [ ] Add a backup controller so you never lose canister access -- [ ] Use `canister_inspect_message` only as a cycle-saving optimization, never as a security boundary - -## How caller identity works - -When a canister receives a message, the network includes the caller's principal. This identity is provided by the system: it cannot be forged or spoofed. You access it with: - -- **Motoko:** `shared({ caller })` pattern on public functions -- **Rust:** `ic_cdk::api::msg_caller()` - -Every principal is one of these types: - -| Type | Format | Example | Meaning | -|------|--------|---------|---------| -| User | Varies (self-authenticating) | `wo5qg-ysjaa-aaaaa-...` | Human with a cryptographic identity | -| Canister | 10 bytes, ends in `-cai` | `rrkah-fqaaa-aaaaa-aaaaq-cai` | Another canister making an inter-canister call | -| Anonymous | Fixed | `2vxsx-fae` | Unauthenticated caller: no identity | -| Management | Fixed | `aaaaa-aa` | IC management canister (system calls) | - -## Reject anonymous callers - -Any endpoint that requires authentication must reject the anonymous principal. Without this check, unauthenticated callers can invoke protected methods. If your canister uses the caller principal as an identity key (for balances, ownership, etc.), the anonymous principal becomes a shared identity anyone can use. - - - - -```motoko -import Principal "mo:core/Principal"; -import Runtime "mo:core/Runtime"; - -// Inside persistent actor { ... 
} - - func requireAuthenticated(caller : Principal) { - if (Principal.isAnonymous(caller)) { - Runtime.trap("anonymous caller not allowed"); - }; - }; - - public shared ({ caller }) func protectedAction() : async Text { - requireAuthenticated(caller); - "ok"; - }; -``` - - - - -```rust -use ic_cdk::update; -use ic_cdk::api::msg_caller; -use candid::Principal; - -fn require_authenticated() -> Result<(), String> { - if msg_caller() == Principal::anonymous() { - return Err("anonymous caller not allowed".to_string()); - } - Ok(()) -} - -#[update(guard = "require_authenticated")] -fn protected_action() -> String { - "ok".to_string() -} -``` - -The Rust `guard` attribute runs the check before the method body executes. If the guard returns `Err`, the call is rejected. This is more robust than calling guard functions inside the method: you cannot forget to add it. Multiple guards can be chained: - -```rust -#[update(guard = "require_authenticated", guard = "require_admin")] -fn admin_action() { - // both guards passed -} -``` - - - - -## Owner and role-based access control - -There is no built-in role system on ICP. You implement it yourself by tracking principals in your canister state. - - - - -The `shared(msg)` pattern on an actor class captures the deployer's principal atomically. No separate init call, no front-running risk. Use `transient` for the owner since it gets recomputed from `msg.caller` on each install/upgrade. 
- -```motoko -import Principal "mo:core/Principal"; -import Set "mo:core/pure/Set"; -import Runtime "mo:core/Runtime"; - -shared(msg) persistent actor class MyCanister() { - - transient let owner = msg.caller; - var admins : Set.Set = Set.empty(); - - func requireOwner(caller : Principal) { - if (Principal.isAnonymous(caller)) { - Runtime.trap("anonymous caller not allowed"); - }; - if (caller != owner) { - Runtime.trap("caller is not the owner"); - }; - }; - - func requireAdmin(caller : Principal) { - if (Principal.isAnonymous(caller)) { - Runtime.trap("anonymous caller not allowed"); - }; - if (caller != owner and not Set.contains(admins, Principal.compare, caller)) { - Runtime.trap("caller is not an admin"); - }; - }; - - public shared ({ caller }) func addAdmin(newAdmin : Principal) : async () { - requireOwner(caller); - admins := Set.add(admins, Principal.compare, newAdmin); - }; - - public shared ({ caller }) func removeAdmin(admin : Principal) : async () { - requireOwner(caller); - admins := Set.remove(admins, Principal.compare, admin); - }; - - public shared ({ caller }) func adminAction() : async () { - requireAdmin(caller); - // ... protected logic - }; -}; -``` - - - - -```rust -use ic_cdk::{init, update}; -use ic_cdk::api::msg_caller; -use candid::Principal; -use std::cell::RefCell; - -thread_local! 
{ - static OWNER: RefCell = RefCell::new(Principal::anonymous()); - static ADMINS: RefCell> = RefCell::new(vec![]); -} - -fn require_authenticated() -> Result<(), String> { - if msg_caller() == Principal::anonymous() { - return Err("anonymous caller not allowed".to_string()); - } - Ok(()) -} - -fn require_owner() -> Result<(), String> { - require_authenticated()?; - OWNER.with(|o| { - if msg_caller() != *o.borrow() { - return Err("caller is not the owner".to_string()); - } - Ok(()) - }) -} - -fn require_admin() -> Result<(), String> { - require_authenticated()?; - let caller = msg_caller(); - let is_authorized = OWNER.with(|o| caller == *o.borrow()) - || ADMINS.with(|a| a.borrow().contains(&caller)); - if !is_authorized { - return Err("caller is not an admin".to_string()); - } - Ok(()) -} - -#[init] -fn init(owner: Principal) { - OWNER.with(|o| *o.borrow_mut() = owner); -} -// Unlike Motoko's shared(msg) pattern which captures the deployer automatically, -// the Rust #[init] requires passing the owner explicitly at deploy time: -// icp canister deploy backend --argument '(principal "your-principal-here")' - -#[update(guard = "require_owner")] -fn add_admin(new_admin: Principal) { - ADMINS.with(|a| a.borrow_mut().push(new_admin)); -} - -#[update(guard = "require_owner")] -fn remove_admin(admin: Principal) { - ADMINS.with(|a| a.borrow_mut().retain(|p| p != &admin)); -} - -#[update(guard = "require_admin")] -fn admin_action() { - // ... protected logic: guard already validated caller -} -``` - - - - -Always include admin revocation (`removeAdmin`). Missing revocation is a common source of bugs: once granted, admin access should be removable. - -## Controller checks - -Controllers are the principals authorized to manage a canister (install code, change settings, stop/delete). The controller list is managed at the IC level, not inside your canister code. 
- - - - -**Motoko** provides `Principal.isController` to check if a principal is a controller of the current canister: - -```motoko -import Principal "mo:core/Principal"; -import Runtime "mo:core/Runtime"; - -// Inside persistent actor { ... } - - public shared ({ caller }) func controllerOnly() : async () { - if (not Principal.isController(caller)) { - Runtime.trap("caller is not a controller"); - }; - // ... - }; -``` - - - - -In Rust, there is no built-in `is_controller` function: checking controllers requires an async call to the management canister. See [inter-canister calls](../canister-calls/inter-canister-calls.md#making-calls) for inter-canister call patterns. - - - - -**Managing controllers with icp-cli:** - -```bash -# View current canister settings including controllers -icp canister settings show backend -e ic - -# Add a backup controller -icp canister settings update backend --add-controller -e ic - -# Remove a controller (warning: removing yourself locks you out) -icp canister settings update backend --remove-controller -e ic -``` - -Always add a backup controller. If you lose the private key of the only controller, the canister becomes permanently unupgradeable: there is no recovery mechanism. - -## `canister_inspect_message`: cycle optimization only - -`canister_inspect_message` is a hook that runs on a single replica before consensus. It can reject ingress messages early to save cycles on Candid decoding and execution. However, it is **not a security boundary**: - -- It runs on one node without consensus: a malicious boundary node can bypass it -- It is never called for inter-canister calls, query calls, or management canister calls - -Always duplicate real access checks inside each method. Use `inspect_message` only to reduce cycle waste from spam. - - - - -```motoko -import Principal "mo:core/Principal"; - -// Inside persistent actor { ... 
} -// Method variants must match your public methods - - system func inspect( - { - caller : Principal; - msg : { - #adminAction : () -> (); - #addAdmin : () -> Principal; - #removeAdmin : () -> Principal; - #protectedAction : () -> (); - } - } - ) : Bool { - switch (msg) { - case (#adminAction _) { not Principal.isAnonymous(caller) }; - case (#addAdmin _) { not Principal.isAnonymous(caller) }; - case (#removeAdmin _) { not Principal.isAnonymous(caller) }; - case (#protectedAction _) { not Principal.isAnonymous(caller) }; - case (_) { true }; - }; - }; -``` - - - - -```rust -use ic_cdk::api::{accept_message, msg_caller, msg_method_name}; -use candid::Principal; - -#[ic_cdk::inspect_message] -fn inspect_message() { - let method = msg_method_name(); - match method.as_str() { - "admin_action" | "add_admin" | "remove_admin" | "protected_action" => { - if msg_caller() != Principal::anonymous() { - accept_message(); - } - // Silently reject anonymous: saves cycles - } - _ => accept_message(), - } -} -``` - - - - -## Debugging identity - -When troubleshooting access control issues, it helps to know which principal your canister sees. A simple `whoami` endpoint returns the caller's identity: - - - - -```motoko -// Inside persistent actor { ... 
} - - public shared ({ caller }) func whoami() : async Principal { - caller; - }; -``` - - - - -```rust -use ic_cdk::query; -use ic_cdk::api::msg_caller; -use candid::Principal; - -#[query] -fn whoami() -> Principal { - msg_caller() -} -``` - - - - -Call it to verify which identity is being used: - -```bash -icp canister call backend whoami -``` - -## Next steps - -- [Security concepts](../../concepts/security.md): understand the IC security model -- [Canister settings](../canister-management/settings.md): configure controllers and freezing thresholds -- [DoS prevention](dos-prevention.md): rate limiting as an access control mechanism - -{/* Upstream: informed by dfinity/icskills (skills/canister-security/SKILL.md, dfinity/portal) docs/building-apps/best-practices/general.mdx */} diff --git a/docs/guides/security/canister-upgrades.md b/docs/guides/security/canister-upgrades.md index 7e8ba7a..0af030d 100644 --- a/docs/guides/security/canister-upgrades.md +++ b/docs/guides/security/canister-upgrades.md @@ -1,350 +1,52 @@ --- -title: "Secure Upgrades" -description: "Upgrade canisters safely: pre/post hooks, stable memory, Candid compatibility, snapshot rollbacks, schema evolution, and testing" +title: "Canister Upgrade Security" +description: "Security best practices for canister upgrade hooks, panics during upgrades, and timer reinstatement." sidebar: - order: 2 + order: 8 --- -Canister upgrades are one of the highest-risk operations in production. A bad upgrade can corrupt state, make the canister permanently non-upgradeable, or break clients. This guide covers the patterns and checks you need to upgrade safely. +## Be careful with panics during upgrades -## Checklist +### Security concern -Use this before every production upgrade: +If a canister traps or panics in `pre_upgrade`, this can lead to permanently blocking the canister, resulting in a situation where upgrades fail or are no longer possible at all. 
-- [ ] Take a snapshot immediately before upgrading -- [ ] Run the upgrade locally first with `icp deploy` -- [ ] Verify data survives: write → upgrade → read -- [ ] Check Candid interface compatibility. No removed methods, no breaking type changes -- [ ] Avoid `pre_upgrade` hooks that serialize large state (use stable structures instead) -- [ ] In Motoko, use `persistent actor` (which eliminates the need for pre_upgrade hooks): avoid manual `pre_upgrade`/`post_upgrade` -- [ ] Confirm you have a backup controller (cannot recover from a trapped `post_upgrade` without one) -- [ ] Add a rollback plan: snapshot ID recorded, restore procedure tested +### Recommendation -## How upgrades work +- Avoid using `pre_upgrade` hooks if possible. Panics in the `pre_upgrade` hook prevent upgrades, and since the `pre_upgrade` hook is controlled by the old code, it can permanently block upgrading. -When you run `icp deploy` on an existing canister, the IC executes the following sequence: +- Panic in the `post_upgrade` hook if the state is invalid so that one can retry the upgrade and try to fix the invalid state. Panics in the `post_upgrade` hook abort the upgrade, but one can retry with new code. -1. **Stop** the canister (waits for in-flight messages to complete) -2. Run `pre_upgrade` on the old code (if defined) -3. Replace the Wasm module with the new code -4. Run `post_upgrade` on the new code (if defined) -5. **Restart** the canister +- [Test the upgrade hooks](https://mmapped.blog/posts/01-effective-rust-canisters.html#test-upgrades) (from [effective Rust canisters](https://mmapped.blog/posts/01-effective-rust-canisters.html)). -Stable memory is preserved through steps 2–4. Heap memory is cleared when the new Wasm loads. 
If `pre_upgrade` or `post_upgrade` traps, the upgrade fails with different consequences: +- See also the section on upgrades in [how to audit an Internet Computer canister](https://www.joachim-breitner.de/blog/788-How_to_audit_an_Internet_Computer_canister) (though focused on Motoko). -| Hook | Trap result | -|------|-------------| -| `pre_upgrade` | Upgrade cancelled. Old code still running. State intact but may need attention. | -| `post_upgrade` | New Wasm installed but initialization failed. Canister may be in an inconsistent state. | +- See [current limitations of the Internet Computer](https://wiki.internetcomputer.org/wiki/Current_limitations_of_the_Internet_Computer), section "Bugs in `pre_upgrade` hooks." -Both scenarios leave the canister in a difficult state. Prevention is far better than recovery. +## Reinstantiate timers during upgrades -## Stable memory patterns +### Security concern -### Motoko: use `persistent actor` +Global timers are deactivated upon changes to the canister's Wasm module. The [IC specification](../../references/ic-interface-spec/canister-interface.md#global-timer) states this as follows: -The `persistent actor` declaration automatically stores all `let` and `var` fields in stable memory. No serialization, no upgrade hooks, no instruction-limit traps. +> "The timer is also deactivated upon changes to the canister's Wasm module (calling install_code or uninstall_code methods of the management canister or if the canister runs out of cycles). In particular, the function canister_global_timer won't be scheduled again unless the canister sets the global timer again (using the System API function ic0.global_timer_set)." -```motoko -persistent actor Counter { - var count : Nat = 0; +Upgrade is a mode of `install_code`, and hence the timers are deactivated during an upgrade. 
- public func increment() : async Nat { - count += 1; - count; - }; +This could result in a vulnerability in certain cases where security controls or other critical features rely on these timers to function. For example, a DEX that relies on timers to update the exchange rates of currencies could be vulnerable to arbitraging opportunities if the rates are no longer updated. - public query func get() : async Nat { count }; +Since global timers are used internally by the Motoko `Timer` mechanism, the same holds true for the Motoko Timer. As explained in the [pull request](https://github.com/dfinity/motoko/pull/3542) under "The upgrade story," the global timer gets discarded on upgrade, and the timers need to be set up in the `post_upgrade` hook. - // transient: resets to [] on each upgrade: correct for caches, transient logs, and reset-on-upgrade counters - transient var recentCallers : [Principal] = []; -}; -``` +This behavior is different when [using Motoko](https://github.com/dfinity/motoko/pull/3542) and implementing `system func timer`. The `timer` function will be called after an upgrade. In case your canister was using timers for recurring tasks, the `timer` function would likely set the global timer again for a later time. However, the time between invocations of `timer` would not be consistent as the upgrade triggered an "unexpected" call to `timer`. -**Key rules:** +Using the Rust CDK, the recurring timer is also lost on upgrade as explained in the API documentation of [set_timer_interval](https://docs.rs/ic-cdk/0.6.9/ic_cdk/timer/fn.set_timer_interval.html). -- All `let`/`var` fields persist automatically. No `stable` keyword needed -- `transient var` for caches or counters that should reset on upgrade -- Do not write manual `pre_upgrade`/`post_upgrade` hooks. The runtime handles everything -- If a persistent field's type changes incompatibly, the upgrade traps. See [Schema evolution](#schema-evolution). 
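A minimal sketch of the mitigation: set the recurring timer both at installation and again in `post_upgrade`, since the timer does not survive the upgrade. This is illustrative, not a drop-in implementation; `start_rate_update_timer` is a hypothetical helper, and the `ic_cdk::timer` path matches the ic-cdk 0.6.x release linked above (newer CDK releases move this API to the `ic-cdk-timers` crate).

```rust
use std::time::Duration;

// Hypothetical helper wrapping the CDK timer API.
fn start_rate_update_timer() {
    // ic-cdk 0.6.x: ic_cdk::timer::set_timer_interval.
    ic_cdk::timer::set_timer_interval(Duration::from_secs(60), || {
        // e.g., refresh exchange rates
    });
}

#[ic_cdk_macros::init]
fn init() {
    start_rate_update_timer();
}

#[ic_cdk_macros::post_upgrade]
fn post_upgrade() {
    // install_code deactivates the global timer: set it again here, or the
    // recurring task silently stops after every upgrade.
    start_rate_update_timer();
}
```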
+### Recommendation -### Rust: use stable structures +- In Motoko canisters, global timers should be set up in the actor initializer for canister installation or reinstallation. Canister-wide timers should be set in the `post_upgrade` hook for upgrades, as timers do not survive upgrades and must be explicitly set up thereafter. -In Rust, use [`ic-stable-structures`](https://docs.rs/ic-stable-structures/latest/ic_stable_structures/) to store data directly in stable memory. Data lives there from the start. No serialization step on upgrade. +- See the Motoko documentation on [timers](../../languages/motoko/icp-features/timers.md). -```rust -use ic_stable_structures::{ - memory_manager::{MemoryId, MemoryManager, VirtualMemory}, - DefaultMemoryImpl, StableBTreeMap, StableCell, -}; -use std::cell::RefCell; +- See the Rust documentation on [set_timer_interval](https://docs.rs/ic-cdk/0.6.9/ic_cdk/timer/fn.set_timer_interval.html). -type Memory = VirtualMemory; - -// Each structure must have its own unique MemoryId: never reuse IDs -const USERS_MEM_ID: MemoryId = MemoryId::new(0); -const COUNTER_MEM_ID: MemoryId = MemoryId::new(1); - -thread_local! { - static MEMORY_MANAGER: RefCell> = - RefCell::new(MemoryManager::init(DefaultMemoryImpl::default())); - - static USERS: RefCell, Memory>> = - RefCell::new(StableBTreeMap::init( - MEMORY_MANAGER.with(|m| m.borrow().get(USERS_MEM_ID)) - )); - - static COUNTER: RefCell> = - RefCell::new(StableCell::init( - MEMORY_MANAGER.with(|m| m.borrow().get(COUNTER_MEM_ID)), - 0u64, - ).expect("Failed to init counter")); -} - -#[ic_cdk::post_upgrade] -fn post_upgrade() { - // Stable structures auto-restore: no deserialization needed. - // Re-initialize timers or transient state here if required. -} -``` - -> **Warning:** Each `MemoryId` must map to exactly one data structure for the lifetime of the canister. Reusing a `MemoryId` for a different structure after an upgrade corrupts both. 
Keep a written record of your `MemoryId` allocations and never reorder them. - -### Avoid `pre_upgrade` serialization - -The serialization-based upgrade pattern is common in older Rust code but is fundamentally fragile: - -```rust -// DO NOT DO THIS in production -#[ic_cdk::pre_upgrade] -fn pre_upgrade() { - // If STATE is large, this hits the instruction limit and traps. - // A trapped pre_upgrade prevents the upgrade: canister stays on old code. - ic_cdk::storage::stable_save((STATE.with(|s| s.borrow().clone()),)).unwrap(); -} -``` - -When `pre_upgrade` traps due to instruction exhaustion, the canister cannot be upgraded. The `skip_pre_upgrade` flag (an emergency escape hatch via the management canister's `install_code` API (see [Management canister reference](../../references/management-canister.md#install_code)) bypasses the hook) but anything the hook would have saved is lost. Use stable structures so the upgrade path cannot brick itself under load. - -## Candid interface compatibility - -The IC checks your new Wasm module's Candid interface against the old one before completing the upgrade. If the new interface is not backward-compatible, the upgrade is rejected. 
- -**Safe changes:** - -| Change | Why it is safe | -|--------|---------------| -| Add a new method | Existing clients don't call it | -| Add optional parameters to an existing method | Old clients send no value; IC substitutes `null` | -| Remove trailing parameters from an existing method | Old clients send extra values; IC ignores them | -| Return additional values from a method | Old clients ignore extra return values | -| Change a parameter type to a supertype | Old values remain valid inputs | -| Change a return type to a subtype | New values remain valid for old clients | - -**Breaking changes (upgrade rejected or clients break):** - -| Change | Why it breaks | -|--------|--------------| -| Remove a method | Clients calling it get errors | -| Add a required (non-optional) parameter | Old clients don't send it | -| Change a parameter type to an incompatible type | Old clients send invalid values | - -**Example: safe evolution:** - -```candid -// Before -service counter : { - add : (nat) -> (); - get : () -> (int) query; -} - -// After: safe: optional param added, new return value, new method -service counter : { - add : (nat, label : opt text) -> (new_val : nat); - get : () -> (nat, last_change : nat) query; - reset : () -> (); -} -``` - -icp-cli checks Candid compatibility during deploy and prompts for confirmation if it detects a potentially breaking change. Use `--yes` in automated pipelines after manually verifying compatibility: - -```bash -icp deploy my-canister -e ic --yes -``` - -## Snapshot-based rollback - -Always take a snapshot immediately before a risky upgrade. If the upgrade causes unexpected behavior, you can restore the previous state within minutes. - -```bash -# 1. Stop the canister and create a snapshot -icp canister stop my-canister -e ic -icp canister snapshot create my-canister -e ic -# Note the snapshot ID printed in the output -icp canister start my-canister -e ic - -# 2. Deploy the upgrade -icp deploy my-canister -e ic - -# 3. 
Verify correctness -icp canister call my-canister health_check -e ic - -# 4a. If everything works, clean up when no longer needed -icp canister snapshot delete my-canister -e ic - -# 4b. If something is wrong, stop and restore -icp canister stop my-canister -e ic -icp canister snapshot restore my-canister -e ic -icp canister start my-canister -e ic -``` - -Snapshots capture the full canister state: Wasm module, Wasm heap memory, stable memory, and chunk store. Restoring from a snapshot brings back all of this state atomically. - -See [Canister snapshots](../canister-management/snapshots.md) for listing, downloading, and the state transfer workflow. - -## Schema evolution - -Upgrading canister code sometimes requires changing the shape of stored data. The rules differ by language. - -### Motoko - -When upgrading a `persistent actor`, the runtime checks that every persistent field's current type is compatible with the value stored in stable memory. Incompatible changes cause the upgrade to trap. - -**Safe changes:** - -- Add new `let` or `var` fields with initial values. The runtime initializes them on upgrade -- Add optional record fields (e.g., change `{ name : Text }` to `{ name : Text; email : ?Text }`) -- Widen a field's type (e.g., `Nat` → `Int`) - -**Unsafe changes (upgrade traps):** - -- Remove or rename a persistent field -- Narrow a field's type (e.g., `Int` → `Nat`) -- Change a non-optional field to an incompatible type - -If you need to make an unsafe change, migrate the data in two upgrades: add the new field alongside the old one, upgrade once (both fields present), then upgrade again to remove the old field. Test this two-step process locally before deploying to mainnet. - -### Rust - -Rust stable structures use serialized bytes on disk. Schema evolution safety depends on the serialization format and versioning strategy. 
- -**Adding fields safely with Candid encoding:** - -```rust -use candid::{CandidType, Decode, Deserialize, Encode}; -use ic_stable_structures::storable::{Bound, Storable}; -use std::borrow::Cow; - -#[derive(CandidType, Deserialize, Clone)] -struct UserV2 { - id: u64, - name: String, - created: u64, - // New optional field: safe to add: old records deserialize with None - email: Option, -} - -impl Storable for UserV2 { - // Unbounded avoids write failures when struct grows. - // Bounded requires a fixed max_size; if encoded size exceeds it after - // adding fields, writes trap. - const BOUND: Bound = Bound::Unbounded; - - fn to_bytes(&self) -> Cow<'_, [u8]> { - Cow::Owned(Encode!(self).expect("failed to encode UserV2")) - } - - fn from_bytes(bytes: Cow<'_, [u8]>) -> Self { - Decode!(&bytes, Self).expect("failed to decode UserV2") - } -} -``` - -**Rules:** - -- Use `Option` for new fields: Candid deserializes absent fields as `None`, so old records remain readable after the upgrade -- Use `Bound::Unbounded` unless you have a strict size requirement -- Never reorder `MemoryId` allocations across upgrades: same effect as changing a field type -- For breaking schema changes, use a versioned enum and migrate records lazily on read - -## Testing upgrades locally - -Never upgrade on mainnet without first verifying locally that data written before the upgrade is still readable after. 
- -**Motoko:** - -```bash -# Start local network -icp network start -d - -# Deploy initial version -icp deploy backend - -# Write data -icp canister call backend increment '()' -icp canister call backend increment '()' -icp canister call backend get '()' -# Returns: (2 : nat) - -# Modify source code, then redeploy -icp deploy backend - -# Verify data survived -icp canister call backend get '()' -# Must still return: (2 : nat) -``` - -**Rust:** - -```bash -# Start local network -icp network start -d - -# Deploy initial version -icp deploy backend - -# Write data -icp canister call backend add_user '("Alice")' -icp canister call backend get_user_count '()' -# Returns: (1 : nat64) - -# Modify source code, then upgrade -icp deploy backend - -# Verify data survived -icp canister call backend get_user_count '()' -# Must still return: (1 : nat64) -``` - -If the count drops to zero after upgrade, your data is not in stable memory: review your storage declarations before touching mainnet. - -For advanced scenarios (upgrade rollbacks, schema migrations, concurrent call safety), use [PocketIC](../testing/pocket-ic.md) to script multi-step upgrade scenarios in a controlled environment. - -## Controller safety - -You cannot upgrade a canister without a valid controller. Losing all controller keys leaves the canister permanently frozen at its current code: there is no recovery path on the IC. - -```bash -# Check current controllers -icp canister settings show my-canister -e ic - -# Add a backup controller before any risky upgrade -icp canister settings update my-canister --add-controller -e ic -``` - -For production canisters: - -- Maintain at least two controllers (primary identity + hardware wallet or multisig) -- For fully onchain governance, add an SNS or DAO canister as controller and remove personal principals - -See [Access management](access-management.md) for detailed controller management patterns. 
- -## Next steps - -- [Data persistence](../backends/data-persistence.md): stable structures and upgrade patterns in depth -- [Canister lifecycle](../canister-management/lifecycle.md#upgrade-a-canister): the full upgrade sequence and install modes -- [Canister snapshots](../canister-management/snapshots.md): create and restore snapshots -- [Testing strategies](../testing/strategies.md): test upgrade scenarios before deploying to mainnet -- [Access management](access-management.md): manage controllers and prevent lock-out - - + diff --git a/docs/guides/security/data-integrity-and-authenticity.md b/docs/guides/security/data-integrity-and-authenticity.md new file mode 100644 index 0000000..1433e65 --- /dev/null +++ b/docs/guides/security/data-integrity-and-authenticity.md @@ -0,0 +1,629 @@ +--- +title: "Data Integrity and Authenticity" +description: "Security best practices for certified variables, asset certification, and protecting data authenticity on ICP." +sidebar: + order: 4 +--- + +## Certified variables + +### Security concern + +ICP offers three modes of operation for canisters: `update`, `query`, and `composite_query`. For the sake of simplicity, we will club `composite_query` under queries for the rest of this section. + +Update calls are slow and expensive but provide integrity guarantees as their responses include a threshold signature signed by the subnet. + +On the other hand, query calls are fast since a single replica formulates the response, but **there is no integrity guarantee, since the response can be manipulated by a single replica or boundary node.** For example, if the NNS dapp fetches proposal information from the governance canister via query calls and the responding node is malicious, it can mask an ill-intentioned proposal that causes irrevocable damage as innocuous by modifying the proposal payload in the response and mislead voters into voting yes. 
Another consequence of query calls is that users can't rely on [canister_inspect_message](../../references/ic-interface-spec/canister-interface.md#system-api-inspect-message) as a guard. **This makes query calls, in their raw form, unfit to serve data for security-critical applications.** + +### Using certified variables for secure queries +In certain use cases, there is a third option whereby query results can return data that has been certified by the subnet in an earlier update call. This is the concept of certified data, and it requires changes to the update call to create the certification, the query call to return the certificate, and the frontend to verify the certificate. Using certified data provides the best of both worlds with query-like response times and update-like certified responses. + +Some examples of certified variables are asset certification in [Internet Identity](https://github.com/dfinity/internet-identity/blob/b29a6f68bbe5a49d048e12bc7a3263a9f43d080b/src/internet_identity/src/main.rs#L775-L808), [NNS dapp](https://github.com/dfinity/nns-dapp/blob/372c3562127d70c2fde059bc9c268e8ae858583e/rs/src/assets.rs#L121-L145), or the [canister signature implementation in Internet Identity](https://github.com/dfinity/ic-canister-sig-creation). + +:::tip +Certified variables are an advanced feature that require careful implementation of authenticated data structures and verification on the canister and client sides, respectively. **If the client doesn't require fast response times, call the query method as an update call (replicated query).** The response would be certified by the subnet, and a single malicious or boundary node can't modify the response. +::: + +:::tip +ICP also provides replica signed queries, where query responses are signed by the answering replica node; however, it doesn't have the same security guarantees as an `update` call and only protects from malicious boundary nodes. 
Replica signed queries are enabled by default on both the ICP Rust agent and the ICP JavaScript agent. +::: + +### What is certified data? +Aside from update calls, the subnet certifies (creates a threshold signature) a part of the canister data every round. This is stored in the state tree under the label `certified_data`. However, since it's certified every round, the amount of data that can be stored in `certified_data` is limited to 32 bytes. Hence, when you modify the state of your canister during an update call, if you can convert the state into a unique representation that can fit into 32 bytes, you can store it under `certified_data`, and it will be certified. Naturally, this can be done by computing a hash of the data structure of the canister state. This is also why certified variables are difficult to implement. Depending on your data structures, you will need to develop a different kind of hashing function. + +Subsequent query calls can return the data as-is, including the signature on the `certified_data`, which the frontend can verify with the IC root public key. This means that data aggregation or other calculations can't be done in query calls, as there would be no way to produce a signature over that newly created data. There are two workarounds: either this data is precomputed in the update call or all raw data is sent to the frontend, which verifies it and does the calculations. Combining these features, a canister should be able to certify a variable in a query response with this [design](https://medium.com/dfinity/how-internet-computer-responses-are-certified-as-authentic-2ff1bb1ea659). + +On a high level, in your canister: +1. Choose an [authenticated data structure](https://cs.brown.edu/research/pubs/pdfs/2003/Tamassia-2003-ADS.pdf) like Merkle trees to store a value in canister memory. +2. In the **update** call: + - Perform the computation and store the result in the Merkle tree. + - The lookup path for the result must act as its `key`. 
Ideally this `key` should be the parameters provided by the caller in the query method. + - Recompute the Merkle proof (`root_hash`) + - Store the `root_hash` as the canister's certified data. + - Return the `key` as response. +3. In the **query** call: + - Fetch the result from the Merkle structure using the query parameters as the lookup path. + - Fetch the current `certified_data` for the canister. + - Compute the witness for the result using the same lookup path. The Merkle witness provides proof of inclusion that the requested result exists in the Merkle tree under the given path. + - Return `(result, certified_data, witness)` as the response. + +The rest of the section shows an example canister, which can serve a certified response for a `query` using `certified_data` that is verified in the frontend. The examples are written in Rust and Motoko, but the overall design can be implemented in other languages. + +### Building a canister with certified variables +Let's consider the following canister interface: + +```c +type User = record { + name: text; + age: nat8; +}; + +type CertifiedUser = record { + user : User; + certificate : blob; + witness : blob; +}; + +service : { + "set_user": (User) -> (nat64); + "get_user": (nat64) -> (CertifiedUser) query; +} +``` + +The canister exposes the following service: +- **set_user**: The caller provides a `User` object to the canister. The canister records it and serves a corresponding `index` for the entry as the response. Since `certified_data` can only store 32 bytes of data, it uses a specialized data structure from `ic_certified_map` to store the `User` data. + - The data structure internally stores the data in a `HashTree` (or [Merkle tree](https://en.wikipedia.org/wiki/Merkle_tree)) and records the `root_hash` of the data structure in the `certified_data`, which is 32 bytes. + - The `root_hash` cryptographically guarantees that only one tree can correspond to that hash. 
The `root_hash` is also referred to as the Merkle proof. +- **get_user**: The caller provides a `index: nat64` to the canister and gets a certified response for the corresponding `User`. The `CertifiedUser` response must have the following structure for verifying the response: + - **user**: The actual response. + - **certificate**: The payload for verifying the signature on the `certified_data`. ICP provides the system API `data_certificate()` for this. + - **witness**: Allows for the final verification of the response to be completed with the requested input and `certified_data`. + +You can find an example implementation of the canister below. + +**Motoko:** + +```motoko +import CertifiedData "mo:core/CertifiedData"; +import Blob "mo:core/Blob"; +import Nat8 "mo:core/Nat8"; +import Debug "mo:core/Debug"; +import Text "mo:core/Text"; +import Nat64 "mo:core/Nat64"; +import Array "mo:core/Array"; +import CertTree "mo:ic-certification/CertTree"; +import CV "mo:cbor/Value"; +import CborEncoder "mo:cbor/Encoder"; +import CborDecoder "mo:cbor/Decoder"; + +actor CertifiedVariable { + + type User = { + name : Text; + age : Nat8; + }; + + type CertifiedUser = { + user : User; + certificate : Blob; + witness : Blob; + }; + + stable var count : Nat64 = 0; + stable let cert_store : CertTree.Store = CertTree.newStore(); + let ct = CertTree.Ops(cert_store); + + public func set_user(user : User) : async Nat64 { + count += 1; + let path : [Blob] = [Text.encodeUtf8("user"), blobOfNat64(count)]; + ct.put(path, encodeUser(user)); + ct.setCertifiedData(); + return count; + }; + + public query func get_user(index : Nat64) : async CertifiedUser { + let certificate = switch (CertifiedData.getCertificate()) { + case (?certificate) { + certificate; + }; + case (null) { + Debug.trap("Certified data not set"); + }; + }; + + let path : [Blob] = [Text.encodeUtf8("user"), blobOfNat64(index)]; + + let value = switch (ct.lookup(path)) { + case (?value) { + value; + }; + case (null) { + 
Debug.trap("Lookup failed"); + }; + }; + + let user : User = decodeUser(value); + let witness = ct.encodeWitness(ct.reveal(path)); + + let certifiedUser : CertifiedUser = { + certificate = certificate; + witness = witness; + user = user; + }; + + return certifiedUser; + }; + + func encodeUser(user : User) : Blob { + let bytes : CV.Value = #majorType5([ + (#majorType3("name"), #majorType3(user.name)), + (#majorType3("age"), #majorType0(Nat64.fromNat(Nat8.toNat(user.age)))), + ]); + + let #ok(encoded_user) = CborEncoder.encode(bytes); + return Blob.fromArray(encoded_user); + }; + + func decodeUser(bytes : Blob) : User { + let #ok(#majorType5(map)) = CborDecoder.decode(bytes); + let name_tag = Array.find<(CV.Value, CV.Value)>(map, func x = x.0 == #majorType3("name")); + let age_tag = Array.find<(CV.Value, CV.Value)>(map, func x = x.0 == #majorType3("age")); + + let name = switch (name_tag) { + case (?name_value) { + let #majorType3(name) = name_value.1; + name; + }; + case (null) { + Debug.trap("Decoding failed for name"); + }; + }; + + let age = switch (age_tag) { + case (?age_value) { + let #majorType0(age) = age_value.1; + Nat8.fromNat(Nat64.toNat(age)); + }; + case (null) { + Debug.trap("Decoding failed for age"); + }; + }; + + return { + name = name; + age = age; + }; + }; + + func blobOfNat64(n : Nat64) : Blob { + let byteMask : Nat64 = 0xff; + func byte(x : Nat64) : Nat8 { + Nat8.fromNat(Nat64.toNat(x)); + }; + Blob.fromArray([ + byte(((byteMask << 56) & n) >> 56), + byte(((byteMask << 48) & n) >> 48), + byte(((byteMask << 40) & n) >> 40), + byte(((byteMask << 32) & n) >> 32), + byte(((byteMask << 24) & n) >> 24), + byte(((byteMask << 16) & n) >> 16), + byte(((byteMask << 8) & n) >> 8), + byte(((byteMask << 0) & n) >> 0), + ]); + }; + +}; +``` + +**Rust:** + +```rust +use candid::CandidType; +use ic_certified_map::HashTree; +use ic_certified_map::{leaf_hash, AsHashTree, Hash, RbTree}; +use serde::{Deserialize, Serialize}; +use std::borrow::Cow; +use 
std::cell::Cell; +use std::cell::RefCell; + +#[derive(CandidType, Serialize, Deserialize, Clone)] +struct User { + name: String, + age: u8, +} + +impl AsHashTree for User { + fn root_hash(&self) -> Hash { + let user_serialized = serde_cbor::to_vec(&self).unwrap(); + leaf_hash(&user_serialized[..]) + } + fn as_hash_tree(&self) -> HashTree<'_> { + HashTree::Leaf(Cow::from(serde_cbor::to_vec(&self).unwrap())) + } +} + +#[derive(CandidType)] +struct CertifiedUser { + user: User, + certificate: Vec, + witness: Vec, +} + +thread_local! { + static INDEX : Cell = Cell::new(0); + static TREE: RefCell>> = RefCell::new(RbTree::new()); +} + +#[ic_cdk::update] +fn set_user(user: User) -> u64 { + let index = INDEX.with(|index| { + let count = index.get() + 1; + index.set(count); + count + }); + + TREE.with_borrow_mut(|tree| { + match tree.get(b"user") { + Some(_) => { + tree.modify(b"user", |inner| { + inner.insert(index.to_be_bytes(), user); + }); + } + None => { + let mut inner = RbTree::new(); + inner.insert(index.to_be_bytes(), user); + tree.insert("user", inner); + } + } + ic_cdk::api::set_certified_data(&tree.root_hash()); + }); + index +} + +#[ic_cdk::query] +fn get_user(index: u64) -> CertifiedUser { + let certificate = ic_cdk::api::data_certificate().expect("No data certificate available"); + + TREE.with_borrow(|tree| { + let user = match tree.get(b"user") { + Some(inner) => { + let user = inner.get(&index.to_be_bytes()[..]).expect("User not found"); + user.to_owned() + } + None => { + panic!("Tree isn't initialized"); + } + }; + + let mut witness = vec![]; + let mut witness_serializer = serde_cbor::Serializer::new(&mut witness); + let _ = witness_serializer.self_describe(); + tree.nested_witness(b"user", |inner| inner.witness(&index.to_be_bytes()[..])) + .serialize(&mut witness_serializer) + .unwrap(); + + CertifiedUser { + user, + certificate, + witness, + } + }) +} +``` + +### Verifying certified variables + +Once you have the response `CertifiedUser`, for the 
integrity guarantee, the frontend must verify the certification in the response. This is broken down into several steps implemented in the Rust and JavaScript example below. + +:::note +The example has some extra steps to set up the canister with some `User` data before verification. You can ignore the section marked between `// ==== START of canister data setup` and `// ==== END of canister data setup`. +::: + +1. Verify the IC certificate: Recompute the `root_hash` of `certificate.tree` (pruned state tree with the canister's `certified_data`) and verify the `certificate.signature` with `root_hash` as the message, `certificate.delegation`, and the IC `root_key` as the public key. This confirms that the signature is valid for the current state tree. +2. Validate that the response is not stale by verifying the time at `/time` in `certificate.tree` is less than a certain delta of current time. The recommended delta is 5 minutes but should be adapted to the use case. +3. Recompute the `root_hash` of the witness and verify equality with the `certified_data`. The `certified_data` can be obtained from `certificate.tree` under the path `/canister//certified_data`. +4. Check if query parameters are in the witness. In this example, the lookup path is `/user/` and should be present in the witness. +5. Validate if the value found in `/user/` matches `user` from the response. +6. If all of the previous steps succeed, return `user` as the valid response. 
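The staleness check in step 2 can be sketched in isolation (plain Rust, independent of any agent library; `cert_time_ns` stands for the value read from the `/time` path of the certificate tree). Note that the check is symmetric: the subnet clock may also be slightly ahead of the client's.

```rust
use std::time::{SystemTime, UNIX_EPOCH};

/// Freshness check for step 2: accept the certificate only if its `/time`
/// value is within `max_offset_ns` of the verifier's clock, in either
/// direction.
fn certificate_is_fresh(cert_time_ns: u128, now_ns: u128, max_offset_ns: u128) -> bool {
    now_ns.abs_diff(cert_time_ns) <= max_offset_ns
}

fn main() {
    const MAX_OFFSET_NS: u128 = 300_000_000_000; // 5 minutes, the recommended delta
    let now_ns = SystemTime::now()
        .duration_since(UNIX_EPOCH)
        .expect("Time went backwards")
        .as_nanos();

    // A certificate produced 1 minute ago is accepted...
    assert!(certificate_is_fresh(now_ns - 60_000_000_000, now_ns, MAX_OFFSET_NS));
    // ...one produced 10 minutes ago is rejected as stale.
    assert!(!certificate_is_fresh(now_ns - 600_000_000_000, now_ns, MAX_OFFSET_NS));
}
```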
+ +**Rust (client-side verification):** + +```rust +use arbitrary::{Arbitrary, Unstructured}; +use candid::Encode; +use candid::Principal; +use candid::{CandidType, Decode, Deserialize}; +use futures::future::join_all; +use ic_agent::identity::AnonymousIdentity; +use ic_agent::Agent; +use ic_certificate_verification::validate_certificate_time; +use ic_certificate_verification::VerifyCertificate; +use ic_certification::hash_tree::HashTree; +use ic_certification::{Certificate, LookupResult}; +use rand::prelude::*; +use serde_cbor::Deserializer; +use std::time::{SystemTime, UNIX_EPOCH}; + +#[derive(CandidType, Deserialize, Debug, PartialEq, Eq, Arbitrary)] +struct User { + name: String, + age: u8, +} + +#[derive(CandidType, Deserialize)] +struct CertifiedUser { + user: User, + certificate: Vec, + witness: Vec, +} + +static URL: &str = "http://localhost:41749"; +static CANISTER: &str = "a3shf-5eaaa-aaaaa-qaafa-cai"; +const MAX_CERT_TIME_OFFSET_NS: u128 = 300_000_000_000; // 5 min +const MAX_CALLS: usize = 10; + +#[tokio::main] +async fn main() { + + let agent = Agent::builder() + .with_url(URL) + .with_identity(AnonymousIdentity) + .build() + .expect("Unable to create agent"); + + // This should be done only in demo environments. + // When interacting with mainnet, hardcode the root_key. 
+ agent + .fetch_root_key() + .await + .expect("Unable to fetch root key"); + let root_key = agent.read_root_key(); + + let canister_id = Principal::from_text(CANISTER).unwrap(); + + // ==== START of canister data setup + let mut rng = rand::thread_rng(); + + // Make MAX_CALLS to set_user + let mut get_user_calls = Vec::new(); + for _ in 0..MAX_CALLS { + let bytes: [u8; 16] = rng.gen(); + let mut u = Unstructured::new(&bytes[..]); + let temp_user = User::arbitrary(&mut u).unwrap(); + + println!("Calling set_user with {:?}", temp_user); + let response = agent + .update(&canister_id, "set_user") + .with_effective_canister_id(canister_id) + .with_arg(Encode!(&temp_user).unwrap()) + .call_and_wait(); + get_user_calls.push(response); + } + let results: Vec<u64> = join_all(get_user_calls) + .await + .into_iter() + .map(|result| { + Decode!( + result + .expect("Update call set_user failed") + .as_slice(), + u64 + ) + .unwrap() + }) + .collect(); + + // From response indexes, choose a random index for get_user + let index: usize = rng.gen(); + let index: u64 = *results.get(index % MAX_CALLS).unwrap(); + // ==== END of canister data setup + + println!("Fetching index {:?}", index); + + let query_response = agent + .query(&canister_id, "get_user") + .with_effective_canister_id(canister_id) + .with_arg(Encode!(&index).unwrap()) + .call() + .await + .expect("Unable to call query call get_user"); + + let certified_user = Decode!(&query_response, CertifiedUser).unwrap(); + + let mut deserializer = Deserializer::from_slice(&certified_user.certificate); + let certificate: Certificate = serde::de::Deserialize::deserialize(&mut deserializer).unwrap(); + + let start = SystemTime::now(); + let current_time = start + .duration_since(UNIX_EPOCH) + .expect("Time went backwards") + .as_nanos(); + + // Step 1: Check if signature in the certificate can be validated with the + // root_hash of the tree in certificate as message and root_key as public_key + let verification_result =
certificate.verify(canister_id.as_slice(), &root_key[..]); + + println!( + "Step 1: Digest match & Signature verification: {:?}", + verification_result + ); + + // Step 2: Check if the response is not stale with the given time offset MAX_CERT_TIME_OFFSET_NS. + let time_verification_result = + validate_certificate_time(&certificate, &current_time, &MAX_CERT_TIME_OFFSET_NS); + + println!("Step 2: Time skew: {:?}", time_verification_result); + + // Step 3: Check if witness root_hash matches the certified_data + let lookup_result = + certificate + .tree + .lookup_path([b"canister", canister_id.as_slice(), b"certified_data"]); + + let certified_data: [u8; 32] = match lookup_result { + LookupResult::Found(result) => result.try_into().unwrap(), + _ => panic!("Certified data not found"), + }; + + let mut deserializer = Deserializer::from_slice(&certified_user.witness); + let witness_decoded: HashTree<Vec<u8>> = + serde::de::Deserialize::deserialize(&mut deserializer).unwrap(); + let witness_digest = witness_decoded.digest(); + + println!( + "Step 3: Witness digest matches certified data: {:?} ", + witness_digest == certified_data + ); + + // Step 4: Check if the query parameters are in the witness + let witness_lookup: User = + match witness_decoded.lookup_path([b"user", &index.to_be_bytes()[..]]) { + LookupResult::Found(result) => serde_cbor::from_slice(result).unwrap(), + _ => panic!("user {} not found", index), + }; + + // Step 5: Check if the data found in Witness matches the returned result from the query.
+ println!( + "Step 4 & Step 5: Witness data matches User value: {:?}", + witness_lookup == certified_user.user + ); + + // Step 6: Return the result + println!("Result: {:?}", certified_user.user); +} +``` + +**JavaScript (client-side verification):** + +```js +import pkg from "@dfinity/agent"; +const { Actor, HttpAgent, Certificate, blsVerify, Cbor, reconstruct, lookup_path } = pkg; +import { IDL } from "@dfinity/candid"; +import { Principal } from "@dfinity/principal"; +import fetch from "isomorphic-fetch"; +import assert from "node:assert/strict"; + +const idlFactory = ({ IDL }) => { + const User = IDL.Record({ age: IDL.Nat8, name: IDL.Text }); + const CertifiedUser = IDL.Record({ + certificate: IDL.Vec(IDL.Nat8), + user: User, + witness: IDL.Vec(IDL.Nat8), + }); + return IDL.Service({ + get_user: IDL.Func([IDL.Nat64], [CertifiedUser], ["query"]), + set_user: IDL.Func([User], [IDL.Nat64], []), + }); +}; + +const canisterId = Principal.fromText("a3shf-5eaaa-aaaaa-qaafa-cai"); +const host = "http://localhost:35777"; + +await start(); + +async function start() { + const agent = new HttpAgent({ fetch, host }); + await agent.fetchRootKey(); + + const rootKey = agent.rootKey.buffer; + let dummyUser = { name: "test_user", age: 21 }; + + const actor = Actor.createActor(idlFactory, { + agent, + canisterId, + }); + + let index = await actor.set_user(dummyUser); + let certifiedUser = await actor.get_user(index); + + await verifyCertificate(certifiedUser, index, rootKey, canisterId); +} + +async function verifyCertificate(certifiedUser, index, rootKey, canisterId) { + const certificate = certifiedUser.certificate.buffer; + const witness = certifiedUser.witness.buffer; + const user = certifiedUser.user; + + const cert = new Certificate(certificate, rootKey, canisterId, blsVerify); + + // Step 1: Check if signature in the certificate can be validated with the + // root_hash of the tree in certificate as message and root_key as public_key + await cert.verify(); +
console.log("Certificate verification succeeded"); + + // Step 2: Check if the response is not stale with the given time offset of 5m. + const te = new TextEncoder(); + const pathTime = [te.encode("time")]; + const rawTime = cert.lookup(pathTime).value; + console.log("Time skew: ", verifyTime(rawTime)); + + // Step 3: Check if witness root_hash matches the certified_data + const pathData = [ + te.encode("canister"), + canisterId.toUint8Array(), + te.encode("certified_data"), + ]; + + const certifiedData = cert.lookup(pathData).value; + let witnessTree = Cbor.decode(witness); + let witnessRootHash = await reconstruct(witnessTree); + console.log( + "Verify CertifiedData matches witness root_hash: ", + // Compare byte-wise; comparing buffer object identity is always false. + Buffer.from(certifiedData).equals(Buffer.from(witnessRootHash)) + ); + + // Step 4: Check if the query parameters are in the witness + const query_params = [te.encode("user"), bigEndian(index).buffer]; + const witnessData = Cbor.decode(lookup_path(query_params, witnessTree).value); + console.log("Witness data: ", witnessData); + + // Step 5: Check if the data found in Witness matches the returned result from the query.
+ + assert.deepStrictEqual(witnessData, user, "Value matches response data"); + + // Step 6: Return the result + return user; +} + +function verifyTime(rawTime) { + const idlMessage = new Uint8Array([ + ...new TextEncoder().encode("DIDL\x00\x01\x7d"), + ...new Uint8Array(rawTime), + ]); + const decodedTime = IDL.decode([IDL.Nat], idlMessage)[0]; + const time = Number(decodedTime) / 1e9; + const now = Date.now() / 1000; + const diff = Math.abs(time - now); + if (diff > 300) { // 5 minutes, matching the recommended delta + return false; + } + return true; +} + +function bigEndian(n) { + let buf = new Uint8Array(8); + + for (let i = 7; i >= 0; i--) { + buf[i] = Number(n & 0xffn); + n >>= 8n; + } + return buf; +} +``` + +## Use HTTP asset certification and avoid serving your dapp through `raw.icp0.io` + +### Security concern + +Dapps on ICP can use [asset certification](https://learn.internetcomputer.org/hc/en-us/articles/34276431179412-Asset-Certification) to make sure the HTTP assets delivered to the browser are authentic (i.e., threshold-signed by the subnet). If an app does not use asset certification, it can only be served insecurely through `raw.icp0.io`, where no asset certification is checked. This is insecure since a single malicious node or boundary node can freely modify the assets delivered to the browser. + +If an app is served through `raw.icp0.io` in addition to `icp0.io`, an adversary may trick users (phishing) into using the insecure `raw.icp0.io`. + +### Recommendation + +- Only serve assets through `.icp0.io`, where the boundary nodes enforce response verification on the served assets. Do not serve through `.raw.icp0.io`. + +- Serve assets using the asset canister, which creates asset certification automatically, or add the `ic-certificate` header including the asset certification as, e.g., done in the [NNS dapp](https://github.com/dfinity/nns-dapp) and [Internet Identity](https://github.com/dfinity/internet-identity).
+ +- Check in the canister's `http_request` method if the request came through raw. If so, return an error and do not serve any assets. + + diff --git a/docs/guides/security/data-integrity.md b/docs/guides/security/data-integrity.md deleted file mode 100644 index ad7d37d..0000000 --- a/docs/guides/security/data-integrity.md +++ /dev/null @@ -1,455 +0,0 @@ ---- -title: "Data Integrity" -description: "Protect data confidentiality and authenticity in canisters using vetKeys encryption, identity-based encryption, certified variables, and signature verification." -sidebar: - order: 3 ---- - -Data on the Internet Computer faces two distinct threats: **confidentiality** (unauthorized parties reading data) and **authenticity** (verifying that data hasn't been tampered with). This guide covers the IC mechanisms that address both: vetKeys for onchain encryption, certified variables for cryptographic data authenticity, and signature verification for external data. - -For a conceptual overview of how these fit into the IC security model, see [Security model](../../concepts/security.md). For a deeper look at the vetKeys cryptographic protocol, see [vetKeys](../../concepts/vetkeys.md). - -## Onchain encryption with vetKeys - -Canister state on standard application subnets is readable by node operators. If your application stores private data (notes, messages, files), you must encrypt it before storing. vetKeys (verifiably encrypted threshold keys) give canisters access to cryptographic key material derived by a threshold quorum of subnet nodes. No single node ever holds the raw key. - -The core workflow: - -1. The client generates an ephemeral **transport key pair** -2. The canister calls `vetkd_derive_key` on the management canister, which derives a key encrypted under the client's transport public key -3. The client decrypts the result with its transport private key to obtain the raw vetKey -4. 
The client uses the vetKey to encrypt or decrypt data locally - -No key material ever leaves the subnet in plaintext. The canister never sees the raw key. - -### Prerequisites - -**Rust:** - -```toml -[dependencies] -ic-cdk = "0.19" -ic-vetkeys = "0.6" -ic-stable-structures = "0.7" -``` - -**Motoko** (`mops.toml`): - -```toml -[dependencies] -core = "2.0.0" -``` - -**Frontend:** - -```bash -npm install @dfinity/vetkeys -``` - -> **API stability:** The `ic-vetkeys` crate and `@dfinity/vetkeys` package are published but their APIs may still change. Pin the versions above and check the [DFINITY forum](https://forum.dfinity.org) for migration guides before upgrading. - -### Key names and environments - -| Key name | Environment | Cycle cost (approx.) | -|----------|-------------|----------------------| -| `test_key_1` | Local + mainnet (testing) | ~10B cycles | -| `key_1` | Mainnet (production) | ~26B cycles | - -Use `test_key_1` during development and mainnet testing. Switch to `key_1` for production. `vetkd_public_key` does not cost cycles; only `vetkd_derive_key` does. - -### Rust implementation - -The `ic-vetkeys` crate provides a high-level `KeyManager` that handles access control and stable storage. For simpler use cases, you can also call the management canister directly. - -**Using `ic-vetkeys` KeyManager (recommended):** - -Initialize the `KeyManager` with stable memory and a key ID in the `init` hook: - -```rust -use ic_stable_structures::memory_manager::{MemoryId, MemoryManager}; -use ic_stable_structures::DefaultMemoryImpl; -use ic_vetkeys::key_manager::KeyManager; -use ic_vetkeys::types::{AccessRights, VetKDCurve, VetKDKeyId}; - -thread_local! 
{ - static MEMORY_MANAGER: std::cell::RefCell<MemoryManager<DefaultMemoryImpl>> = - std::cell::RefCell::new(MemoryManager::init(DefaultMemoryImpl::default())); - static KEY_MANAGER: std::cell::RefCell<Option<KeyManager<ic_stable_structures::memory_manager::VirtualMemory<DefaultMemoryImpl>>>> = - std::cell::RefCell::new(None); -} - -#[ic_cdk::init] -fn init() { - let key_id = VetKDKeyId { - curve: VetKDCurve::Bls12381G2, - name: "key_1".to_string(), // "test_key_1" for local + mainnet testing - }; - MEMORY_MANAGER.with(|mm| { - KEY_MANAGER.with(|km| { - *km.borrow_mut() = Some(KeyManager::init( - "my_app_v1", key_id, - mm.borrow().get(MemoryId::new(0)), - mm.borrow().get(MemoryId::new(1)), - mm.borrow().get(MemoryId::new(2)), - )); - }); - }); -} -``` - -Expose the two endpoints callers need: one to retrieve an encrypted key, one to retrieve the verification key: - -```rust -use candid::Principal; -use ic_cdk::update; - -#[update] -async fn get_encrypted_vetkey(subkey_id: Vec<u8>, transport_public_key: Vec<u8>) -> Vec<u8> { - let caller = ic_cdk::caller(); // capture BEFORE await - let future = KEY_MANAGER.with(|km| { - km.borrow().as_ref().expect("not initialized") - .get_encrypted_vetkey(caller, subkey_id, transport_public_key) - .expect("access denied") - }); - future.await -} - -#[update] -async fn get_vetkey_verification_key() -> Vec<u8> { - let future = KEY_MANAGER.with(|km| { - km.borrow().as_ref().expect("not initialized") - .get_vetkey_verification_key() - }); - future.await -} -``` - -**Calling management canister directly (lower level):** - -Retrieve the public key (no cycles required): - -```rust -use ic_cdk::management_canister::{ - VetKDCurve, VetKDKeyId, VetKDPublicKeyArgs, -}; - -const CONTEXT: &[u8] = b"my_app_v1"; - -fn key_id() -> VetKDKeyId { - VetKDKeyId { - curve: VetKDCurve::Bls12_381_G2, - name: "key_1".to_string(), // "test_key_1" for testing - } -} - -#[ic_cdk::update] -async fn get_public_key() -> Vec<u8> { - let response = ic_cdk::management_canister::vetkd_public_key( - &VetKDPublicKeyArgs { canister_id: None, context: CONTEXT.to_vec(), key_id: key_id() } -
).await.expect("vetkd_public_key call failed"); - response.public_key -} -``` - -Derive a key for the authenticated caller (`key_1` costs ~26B cycles; `ic-cdk` attaches them automatically): - -```rust -use ic_cdk::management_canister::{VetKDDeriveKeyArgs, VetKDCurve, VetKDKeyId}; - -#[ic_cdk::update] -async fn derive_key(transport_public_key: Vec<u8>) -> Vec<u8> { - let caller = ic_cdk::api::msg_caller(); // MUST capture before await - let response = ic_cdk::management_canister::vetkd_derive_key( - &VetKDDeriveKeyArgs { - input: caller.as_slice().to_vec(), - context: CONTEXT.to_vec(), - transport_public_key, - key_id: key_id(), - } - ).await.expect("vetkd_derive_key call failed"); - response.encrypted_key -} -``` - -### Motoko implementation - -Motoko uses the management canister directly. Define the request/response types and declare the actor interface: - -```motoko -import Blob "mo:core/Blob"; -import Principal "mo:core/Principal"; -import Text "mo:core/Text"; - -persistent actor { - - type VetKdCurve = { #bls12_381_g2 }; - type VetKdKeyId = { curve : VetKdCurve; name : Text }; - type VetKdPublicKeyRequest = { canister_id : ?Principal; context : Blob; key_id : VetKdKeyId }; - type VetKdPublicKeyResponse = { public_key : Blob }; - type VetKdDeriveKeyRequest = { input : Blob; context : Blob; transport_public_key : Blob; key_id : VetKdKeyId }; - type VetKdDeriveKeyResponse = { encrypted_key : Blob }; - - let managementCanister : actor { - vetkd_public_key : VetKdPublicKeyRequest -> async VetKdPublicKeyResponse; - vetkd_derive_key : VetKdDeriveKeyRequest -> async VetKdDeriveKeyResponse; - } = actor "aaaaa-aa"; - - let context : Blob = Text.encodeUtf8("my_app_v1"); - // "test_key_1" for local + mainnet testing, "key_1" for production - func keyId() : VetKdKeyId = { curve = #bls12_381_g2; name = "key_1" }; - // ...
-``` - -Implement the public key and key derivation endpoints: - -```motoko - public shared func getPublicKey() : async Blob { - // vetkd_public_key does not require cycles - let response = await managementCanister.vetkd_public_key({ - canister_id = null; context; key_id = keyId(); - }); - response.public_key - }; - - public shared ({ caller }) func deriveKey(transportPublicKey : Blob) : async Blob { - // caller captured before the await; key_1 costs ~26B cycles - let response = await (with cycles = 26_000_000_000) managementCanister.vetkd_derive_key({ - input = Principal.toBlob(caller); - context; - transport_public_key = transportPublicKey; - key_id = keyId(); - }); - response.encrypted_key - }; -}; -``` - -### Frontend: decrypt and use the vetKey - -The frontend generates a transport key pair, sends the public half to the canister, receives the encrypted derived key, and decrypts it locally. - -Generate a fresh transport key pair each session, then request and decrypt the vetKey: - -```typescript -import { TransportSecretKey, DerivedPublicKey, EncryptedVetKey } from "@dfinity/vetkeys"; - -// 1. Generate an ephemeral transport key: new one each session -const transportSecretKey = TransportSecretKey.fromSeed(crypto.getRandomValues(new Uint8Array(32))); -const transportPublicKey = transportSecretKey.publicKey(); - -// 2. Request encrypted vetkey and verification key from the canister -const [encryptedKeyBytes, verificationKeyBytes] = await Promise.all([ - backendActor.get_encrypted_vetkey(subkeyId, transportPublicKey), - backendActor.get_vetkey_verification_key(), -]); - -// 3. Decrypt the vetkey using the transport secret -const vetKey = EncryptedVetKey.deserialize(new Uint8Array(encryptedKeyBytes)) - .decryptAndVerify( - transportSecretKey, - DerivedPublicKey.deserialize(new Uint8Array(verificationKeyBytes)), - new Uint8Array(subkeyId), - ); -``` - -Use the vetKey to derive a symmetric AES-GCM key and encrypt/decrypt data: - -```typescript -// 4. 
Derive a 256-bit AES key from the vetKey material -const aesKey = await crypto.subtle.importKey( - "raw", - vetKey.toDerivedKeyMaterial().data.slice(0, 32), - { name: "AES-GCM" }, - false, - ["encrypt", "decrypt"], -); - -// 5. Encrypt data -const iv = crypto.getRandomValues(new Uint8Array(12)); -const ciphertext = await crypto.subtle.encrypt( - { name: "AES-GCM", iv }, - aesKey, - new TextEncoder().encode("secret message"), -); - -// 6. Decrypt data -const plaintext = await crypto.subtle.decrypt({ name: "AES-GCM", iv }, aesKey, ciphertext); -``` - -### Common mistakes - -- **Reusing transport keys across sessions.** Generate a fresh transport key pair for each session. If an attacker ever learns the transport secret, they can decrypt all keys derived while that secret was in use. -- **Using derived key bytes directly as an AES key.** The `encrypted_key` field from `vetkd_derive_key` is an encrypted blob. After decryption, call `toDerivedKeyMaterial()` before using for AES: do not use the raw bytes directly. -- **Putting secret data in the `input` field.** The `input` field is sent to the management canister in plaintext and serves as a key identifier (e.g., a user principal or document ID). Never use it for actual secret data. -- **Inconsistent context values.** The `context` field on the canister and on the frontend must match exactly. A mismatch causes silent decryption failure. - -## Identity-based encryption (IBE) - -IBE lets you encrypt to an identity (such as a user's principal) without the recipient being online or having registered a key. Anyone who knows the canister's derived public key can encrypt to any principal. The recipient later authenticates to the canister, obtains their vetKey, and decrypts locally. - -This is useful for private messaging, sealed auctions, and any case where you want to encrypt data "to" a principal who will retrieve it later. 
- -> **Access control:** If you implement IBE without using `KeyManager` or `EncryptedMaps`, your canister must verify that `caller == recipient_principal` before calling `vetkd_derive_key`. Without this check, any caller can request any derived key and decrypt messages meant for someone else. The `ic-vetkeys` library handles this automatically. - -**TypeScript IBE example: encrypt (sender side):** - -```typescript -import { IbeCiphertext, IbeIdentity, IbeSeed } from "@dfinity/vetkeys"; - -// No canister call needed if the public key is already known -const recipientIdentity = IbeIdentity.fromBytes(recipientPrincipalBytes); -const ciphertext = IbeCiphertext.encrypt( - derivedPublicKey, recipientIdentity, - new TextEncoder().encode("secret message"), - IbeSeed.random(), -); -const serialized = ciphertext.serialize(); // store this onchain (ciphertext, not plaintext) -``` - -**TypeScript IBE example: decrypt (recipient side):** - -```typescript -import { TransportSecretKey, DerivedPublicKey, EncryptedVetKey, IbeCiphertext } from "@dfinity/vetkeys"; - -// Recipient authenticates to the canister to obtain their vetKey -const transportSecretKey = TransportSecretKey.fromSeed(crypto.getRandomValues(new Uint8Array(32))); -const [encryptedKeyBytes, verificationKeyBytes] = await Promise.all([ - backendActor.get_encrypted_vetkey(subkeyId, transportSecretKey.publicKey()), - backendActor.get_vetkey_verification_key(), -]); -const vetKey = EncryptedVetKey.deserialize(new Uint8Array(encryptedKeyBytes)) - .decryptAndVerify( - transportSecretKey, - DerivedPublicKey.deserialize(new Uint8Array(verificationKeyBytes)), - new Uint8Array(subkeyId), - ); - -const decrypted = IbeCiphertext.deserialize(serialized).decrypt(vetKey); -// decrypted is Uint8Array containing "secret message" -``` - -### Deriving public keys offline - -You can derive the canister's public key for a given context without making a canister call. 
This is useful for IBE encryption when the recipient is offline: - -```typescript -import { MasterPublicKey, DerivedPublicKey } from "@dfinity/vetkeys"; - -// Derive offline from the known mainnet master public key -const masterKey = MasterPublicKey.productionKey(); -const canisterKey = masterKey.deriveCanisterKey(canisterId); -const derivedKey: DerivedPublicKey = canisterKey.deriveSubKey( - new TextEncoder().encode("my_app_v1"), -); -// Use derivedKey for IBE encryption without any network calls -``` - -For complete IBE and encrypted storage examples, see: -- [Password manager](https://github.com/dfinity/examples/tree/master/rust/vetkeys/password_manager): encrypted key-value storage with `EncryptedMaps` -- [Encrypted notes app](https://github.com/dfinity/examples/tree/master/rust/vetkeys/encrypted_notes_dapp_vetkd): per-user encrypted note storage -- [IBE example](https://github.com/dfinity/examples/tree/master/rust/vetkeys/basic_ibe): identity-based encryption with Internet Identity principals - -## Certified variables for data authenticity - -Query calls on ICP run on a single replica and are not verified by consensus. A malicious or faulty replica could return fabricated data. Certified variables solve this: the canister stores a Merkle root hash in the subnet's certified state during update calls, and query responses include a subnet BLS signature proving the data is authentic. - -Use certified variables when: -- Query responses must be verifiable by clients without trusting any single replica -- You serve data that could change (balances, configuration, records) via fast query calls -- Your frontend needs to verify that data hasn't been tampered with in transit - -For the full implementation guide, including Merkle tree construction, witness generation, and frontend verification, see [Certified variables](../backends/certified-variables.md). 
- -**Key rules:** -- `certified_data_set` may only be called during update calls (not query calls) -- You can only certify 32 bytes: build a Merkle tree and certify the root hash -- Re-certify data in `post_upgrade`: certified data is cleared on upgrade -- Clients must verify certificate freshness (the certificate embeds a timestamp; reject certificates older than ~5 minutes) - -## Signature verification for external data - -When your canister receives data from external parties (signed messages, X.509 CSRs, or HTTP request signatures) it must verify the cryptographic signature before trusting the data. ICP verifies signatures on ingress messages automatically, but canister-to-canister or external data flows require manual verification. - -### IC ingress message signatures - -Every ingress call to a canister is signed by the caller's identity. The IC verifies these signatures automatically before the message reaches your canister: you do not need to verify them yourself. The `caller` principal in your canister method is already authenticated. - -For workflows that require additional independent verification (such as verifying a message offline or in a different context), the IC uses the following signature schemes: - -- **Ed25519**: used by Internet Identity and many wallet implementations -- **ECDSA on secp256r1 (P-256)**: used by some hardware authenticators -- **ECDSA on secp256k1**: used by Bitcoin-compatible wallets - -To verify IC signatures independently (outside the IC, or as a second layer of validation), use the `ic-validator-ingress-message` Rust crate or the `@dfinity/standalone-sig-verifier-web` JavaScript library. See the [independently verifying IC signatures (Rust)](https://github.com/dfinity/ic/tree/master/rs/validator) documentation, or the [`@dfinity/standalone-sig-verifier-web` npm package](https://www.npmjs.com/package/@dfinity/standalone-sig-verifier-web) for the JavaScript path. 
- -### X.509 certificate handling - -Canisters can act as certificate authorities using threshold signing keys. Because no single node ever holds the threshold private key, only the canister (via consensus) can sign certificates: this gives you a CA whose private key cannot be exfiltrated. - -The pattern: a canister generates a root CA certificate signed with its threshold Ed25519 or ECDSA key, then issues child certificates for CSRs submitted by external parties. Certificates can be verified by any standard X.509 tool. - -For a complete working example in Rust, see the [x509 example](https://github.com/dfinity/examples/tree/master/rust/x509), which demonstrates: - -1. Creating a root CA certificate with a threshold signing key -2. Issuing child certificates from externally provided CSRs (in PKCS#10/PEM format) -3. Verifying ownership of the CSR before signing - -The key pattern for issuing a child certificate: - -```rust -// Verify the CSR signature before trusting its contents -verify_certificate_request_signature(&cert_req)?; - -// Verify the caller owns the key in the CSR -prove_ownership(&cert_req, ic_cdk::api::caller())?; - -// Sign the child certificate using the canister's threshold key -// (ed25519_sign or ecdsa_sign via management canister) -``` - -This approach is used when you need to issue certificates to external systems that expect standard PKI infrastructure, while keeping the CA private key under threshold-protected control. 
- -## Deploying and testing - -### Local development - -```bash -# Start a local network: test_key_1 and key_1 are provisioned automatically -icp network start -d - -# Deploy your canister -icp deploy backend - -# Test public key retrieval -icp canister call backend getPublicKey '()' -# Returns: (blob "..."): the vetKD public key for your canister - -# Test key derivation (requires a 48-byte transport public key blob) -# In practice, the frontend generates this using TransportSecretKey.fromSeed() -icp canister call backend deriveKey '(blob "\00\01\02...")' -# Returns: (blob "..."): the encrypted derived key -``` - -### Mainnet deployment - -```bash -# Deploy to mainnet -icp deploy backend -e ic - -# Verify the public key is non-empty -icp canister call backend getPublicKey '()' -e ic -``` - -Confirm that: -- `getPublicKey` returns a non-empty blob (48+ bytes of BLS public key material) -- `deriveKey` returns a non-empty blob (encrypted key material) -- Different callers receive different derived keys (same caller + same input = same key; different caller = different key) - -## Next steps - -- [vetKeys concept guide](../../concepts/vetkeys.md): how the threshold key derivation protocol works -- [Encryption guide](./encryption.md): vetKeys encryption patterns including EncryptedMaps -- [Certified variables](../backends/certified-variables.md): full certified data implementation -- [Security model](../../concepts/security.md): IC security guarantees and threat model - - diff --git a/docs/guides/security/data-storage.md b/docs/guides/security/data-storage.md new file mode 100644 index 0000000..960491f --- /dev/null +++ b/docs/guides/security/data-storage.md @@ -0,0 +1,94 @@ +--- +title: "Data Storage" +description: "Security best practices for canister data storage, stable memory, encryption of sensitive data, and backups." 
+sidebar: + order: 3 +--- + +## Rust: Use `thread_local!` with `Cell/RefCell` for state variables and put all your globals in one basket + +### Security concern + +Canisters need global mutable state. In Rust, there are several ways to achieve this. However, some options can lead to vulnerabilities such as memory corruption. + +### Recommendation + +- [Use `thread_local!` with `Cell/RefCell` for state variables](https://mmapped.blog/posts/01-effective-rust-canisters.html#use-threadlocal) (from [effective Rust canisters](https://mmapped.blog/posts/01-effective-rust-canisters.html)). + +- [Put all your globals in one basket](https://mmapped.blog/posts/01-effective-rust-canisters.html#clear-state) (from [effective Rust canisters](https://mmapped.blog/posts/01-effective-rust-canisters.html)). + +## Limit the amount of data that can be stored in a canister per user + +### Security concern + +If a user is able to store a large amount of data on a canister, this may be abused to fill up the canister storage and make the canister unusable. + +### Recommendation + +Limit the amount of data that can be stored in a canister per user. This limit has to be checked whenever data is stored for a user in an update call. + +## Consider using stable memory, version it, and test it + +### Security concern + +Canister heap memory is not persisted across upgrades. If data needs to be kept across upgrades, you may serialize the canister memory in `pre_upgrade` and deserialize it in `post_upgrade`. However, relying on the `pre_upgrade` and `post_upgrade` hooks is discouraged: the number of instructions available to these methods is limited, and if the memory grows too large, the canister can no longer be upgraded. + +### Recommendation + +- Stable memory is persisted across upgrades and can be used to address this issue.
+ +- [Consider using stable memory](https://mmapped.blog/posts/01-effective-rust-canisters.html#stable-memory-main) (from [effective Rust canisters](https://mmapped.blog/posts/01-effective-rust-canisters.html)). Take note of the discussed disadvantages. + +- [Version stable memory](https://mmapped.blog/posts/01-effective-rust-canisters.html#version-stable-memory) (from [effective Rust canisters](https://mmapped.blog/posts/01-effective-rust-canisters.html)). + +- [Test the upgrade hooks](https://mmapped.blog/posts/01-effective-rust-canisters.html#test-upgrades) (from [effective Rust canisters](https://mmapped.blog/posts/01-effective-rust-canisters.html)). + +- See also the section on upgrades in [how to audit an ICP canister](https://www.joachim-breitner.de/blog/788-How_to_audit_an_Internet_Computer_canister) (focused on Motoko canisters). + +- Write tests for stable memory to avoid bugs. + +- Some libraries commonly used are: + + - [https://github.com/dfinity/stable-structures](https://github.com/dfinity/stable-structures) + + - [https://github.com/seniorjoinu/ic-stable-memory](https://github.com/seniorjoinu/ic-stable-memory) + +:::caution +Please note some of these libraries may be partially unfinished. +::: + +- See [current limitations of the Internet Computer](https://wiki.internetcomputer.org/wiki/Current_limitations_of_the_Internet_Computer), sections "Long running upgrades" and "\[de\]serializer requiring additional Wasm memory." + +- For example, [Internet Identity](https://github.com/dfinity/internet-identity) uses stable memory directly to store user data. + +## Consider encrypting sensitive data on canisters + +### Security concern + +By default, canisters provide integrity but not confidentiality. Data stored on canisters can be read by nodes/replicas. + +### Recommendation + +- Consider end-to-end encrypting any private or personal data (e.g., a user's personal or private information) on canisters. 
+ +- The example dapp [encrypted notes](https://github.com/dfinity/examples/tree/master/motoko/encrypted-notes-dapp) illustrates how end-to-end encryption can be done. + +## Create backups + +### Security concern + +A canister could be rendered unusable and impossible to upgrade, for example, for one of the following reasons: + +- Its upgrade process is faulty due to a bug introduced by the dapp developer. + +- The state becomes inconsistent or corrupt because of a bug in the code that persists data. + +### Recommendation + +- Make sure the methods used in upgrading are tested; otherwise, the canister may effectively become immutable. + +- It may be useful to have a disaster recovery strategy that makes it possible to reinstall the canister. + +- See the "Backup and recovery" section in [how to audit an Internet Computer canister](https://www.joachim-breitner.de/blog/788-How_to_audit_an_Internet_Computer_canister). + + diff --git a/docs/guides/security/decentralization.md b/docs/guides/security/decentralization.md new file mode 100644 index 0000000..14bcdf5 --- /dev/null +++ b/docs/guides/security/decentralization.md @@ -0,0 +1,76 @@ +--- +title: "Decentralization" +description: "Security best practices for decentralizing dapp control using SNS, governance, and reducing centralized trust." +sidebar: + order: 10 +--- + +## Use a decentralized governance system like SNS to put dapps under decentralized control + +### Security concerns + +If single entities or small groups control canisters, they can apply changes or updates whenever they like. If, for example, a canister holds assets such as ICP, ckBTC, or ckETH on a user's behalf, this effectively means that the controller could decide at any time to steal these funds by updating the canister and transferring the assets to their own account.
+ +Furthermore, the controller of canisters serving web content (such as the asset canister) could maliciously modify the web application to, for example, steal user funds or perform security-sensitive actions on the user's behalf. For example, if [Internet Identity](../authentication/internet-identity.mdx) is used, the user principal's private key for the given origin is stored in the browser storage, and a malicious app can therefore fully control the private key, the user's session, and any assets controlled by that key. + +Dapps are commonly reachable over their own custom domain name instead of ic0.app. These domains are registered with a DNS registrar by one of the developers. The developer can choose to have this domain point at a completely different web application, even one not hosted on ICP. Users will trust this domain and the app it serves. This could allow such a developer to steal funds, leak data, etc. + +A dapp might have privileged features that are only accessible to principals on an allow list, for example, minting new tokens, debugging functions, managing permissions, or removing NFTs for digital rights violations. This means that whoever controls such a principal (such as the dapp developers) may have central control over these privileged features. + +For performance or privacy reasons, some components of a dapp may be hosted off-chain. These off-chain components often control principals used to interact with the onchain components and are usually controlled by a developer holding credentials to the off-chain cloud environment. On top of that, third-party off-chain entities such as cloud providers can inspect and manipulate data in this environment if they choose. They could take ICP principal private keys out of this environment and call privileged operations on the canisters. Off-chain components can quickly lead to many additional centrally trusted parties.
Depending on the value managed by a dapp, these parties could be tempted to act maliciously. + +### Recommendations + +In the following list, we first provide recommendations for centralized dapp control and then move to recommendations for increasingly decentralized settings. From a security perspective, more decentralization is favorable. The following list could also be used as a basis for assessing a dapp's level of decentralization. This is just a set of recommendations and may be incomplete. + +1. **The dapp uses central, off-chain components:** The application makes use of centralized components such as those running in the cloud. The owners of these cloud services have full control over the application and assets managed by it. Your application should likely be further decentralized by avoiding central components. But while you have them, [securely manage your keys in the cloud](https://cloudsecurityalliance.org/research/topics/cloud-key-management/). +2. **The dapp is controlled by the developer team:** Your project is not under decentralized control, for example, because it is in an early development stage or does not (yet) hold significant funds. In that case, it is recommended to manage access to your canisters securely and ideally not let individuals control the application. To achieve that, consider the following: + - Require approval by several individuals or parties to perform any canister controller operations. + - Require approval by several individuals or parties for any security-sensitive changes at the application level that are restricted to privileged principals, such as admin operations including permissions management, minting new tokens, removing NFTs for digital rights violations, etc. + - A helpful tool to achieve either of the above two points is the [orbit station canister](https://github.com/dfinity/orbit) which allows you to configure intricate policies for canister control. 
[Orbit](https://orbit.global/) also serves as an enterprise wallet where token funds are governed using policies. Ideally, individuals also manage their key material using hardware security modules, such as [YubiHSM](https://www.yubico.com/ch/store/yubihsm-2-series/), and physically protect them, for example, by storing them in safes at different geographical locations. Some HSMs support threshold signature schemes, which can help to further secure the setup. To increase transparency about the changes made to a dapp, consider using a tool like [LaunchTrail](https://github.com/spinner-cash/launchtrail). +3. **Full decentralization using a DAO**: The dapp is controlled by a decentralized governance system such as ICP's [Service Nervous System (SNS)](https://learn.internetcomputer.org/hc/en-us/articles/34084394684564-SNS-Service-Nervous-System), so that any security-sensitive changes to the canisters are only executed if the SNS community approves them collectively through a proposal voting mechanism. If an SNS is used: + - Make sure voting power is distributed over many independent entities so that no single entity or small group of entities can unilaterally decide how the [DAO evolves](https://learn.internetcomputer.org/hc/en-us/articles/34088279488660-Tokenomics#voting-power-and-decentralization). + - Ensure all components of the dapp are under SNS control, including the canisters serving the web frontends; see [SNS asset canisters](../governance/managing.md). + - Consider the [SNS preparation checklist](../governance/launching.md). Important points from a security perspective are tokenomics, disclosing dependencies on off-chain components, and performing security reviews. + - Rather than self-deploying the SNS code or building your own DAO, consider using the official SNS on the SNS subnet, as this guarantees that the SNS is running an NNS-blessed version and is maintained as part of ICP.
+ - See also [verification and trust in a (launched) SNS](https://wiki.internetcomputer.org/wiki/Verification_and_trust_in_a_(launched)_SNS) and [SNS decentralization swap trust](https://wiki.internetcomputer.org/wiki/SNS_decentralization_swap_trust). + +An alternative to DAO control (option 3 above) would be to create an immutable canister smart contract by removing the canister controller completely. This can be achieved by setting the controller to a [black hole canister](https://github.com/ninegua/ic-blackhole). However, note that this implies that the canister can **never** be upgraded, which may have severe implications in case a bug is found. The complexity of ICP dapps and the fact that complex frontends are hosted onchain mean that black-holed canisters are rarely the right solution. The option to use a decentralized governance system, and thus retain the ability to upgrade smart contracts, is a big advantage of the ICP ecosystem compared to other blockchains. + +:::note +Contrary to some other blockchains, immutable smart contracts on ICP need cycles to run, and they can receive cycles. +::: + +It is also possible to implement a DAO on ICP from scratch. If you decide to do this (e.g., along the lines of the [basic DAO example](https://github.com/dfinity/examples/tree/master/rust/basic_dao)), be aware that this is security-critical and must be carefully security-reviewed. Furthermore, users will need to verify that the DAO is controlled by itself. + +## Verify the control and level of decentralization of smart contracts you depend on + +### Security concern + +If a dapp depends on a third-party canister smart contract (e.g., by making inter-canister calls to it), it is important to verify that the callee satisfies an appropriate level of decentralization. For example: +- If funds or cycles are transferred to a third-party canister, one might require the canister to be controlled by a decentralized governance system, as otherwise these funds are centrally controlled.
+- If inter-canister calls are made to a centrally controlled and potentially malicious canister, that canister could execute a denial of service attack on the caller or even trigger functional bugs; see [be aware of the risks involved in calling untrustworthy canisters](./inter-canister-calls.md#be-aware-of-the-risks-involved-in-calling-untrustworthy-canisters). + +### Recommendation + +If you interact with a canister that you require to be decentralized, make sure it is controlled by the NNS, a service nervous system (SNS), or another decentralized governance system, and review under what conditions and by whom the smart contract can be changed. + +## Don't load JavaScript or other assets from untrusted domains + +### Security concern + +Loading untrusted JavaScript from domains other than `.icp0.io` means you completely trust that domain. Also, assets loaded from these domains (including `.raw.icp0.io`) will not use asset certification. + +If those domains deliver malicious JavaScript, they can take over the web app or the user's account. This could, for example, happen by reading the private key managed by the ICP JavaScript agent from the browser's local storage. + +Note that loading other assets such as [CSS](https://xsleaks.dev/docs/attacks/css-injection/) from untrusted domains is also a security risk. + +### Recommendation + +- Loading JavaScript and other assets from other origins should be avoided. Especially for security-critical applications, you can't assume other domains are trustworthy. + +- Make sure all the content delivered to the browser is served and certified by the canister using asset certification. This holds in particular for any JavaScript, but also for fonts, CSS, etc. + +- Use a [content security policy](https://developer.mozilla.org/en-US/docs/Web/HTTP/CSP) to prevent scripts and other content from other origins from being loaded at all. See also [define security headers, including a content security policy (CSP)](./overview.md#web-security).
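For illustration, such a policy can be attached to the assets an asset canister serves. The sketch below assumes the `.ic-assets.json5` configuration file supported by recent asset canister tooling; the header value is a starting point, not a complete policy, and must be adapted to the resources your app actually loads (for example, the API gateway your agent connects to):

```json5
[
  {
    "match": "**/*",
    "headers": {
      // Restrict scripts, styles, images, and connections to the canister's
      // own origin by default. `connect-src` must additionally allow the
      // HTTP gateway / API endpoints your agent talks to (illustrative value).
      "Content-Security-Policy": "default-src 'self'; script-src 'self'; style-src 'self'; img-src 'self' data:; connect-src 'self' https://icp-api.io; frame-ancestors 'none';"
    }
  }
]
```

Serving such a header with every asset prevents the browser from loading scripts or styles from untrusted origins even if injected markup tries to reference them.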
+ + diff --git a/docs/guides/security/dos-prevention.md b/docs/guides/security/dos-prevention.md index 946284d..5cc8a93 100644 --- a/docs/guides/security/dos-prevention.md +++ b/docs/guides/security/dos-prevention.md @@ -1,344 +1,62 @@ --- -title: "DoS Prevention" -description: "Protect canisters from denial-of-service attacks with rate limiting, cycle drain protection, and resource management" +title: "Denial of Service Prevention" +description: "Security best practices for protecting canisters against DoS and DDoS attacks, noisy neighbors, and expensive calls." sidebar: - order: 4 + order: 7 --- -On ICP, [canisters pay for every message they process](../../concepts/cycles.md): including messages from attackers. Anyone on the internet can send update calls to your canister, and each call burns cycles even if your code ultimately rejects it. Left unmitigated, this lets an attacker drain your cycle balance by flooding your canister with messages. +## Protect against DoS and DDoS attacks -This guide covers the patterns that protect against denial-of-service (DoS) attacks: early message filtering, rate limiting, resource allocation, and cycle monitoring. +### Security concern -## Checklist +A denial of service (DoS) attack aims to make a system unavailable by overwhelming it with requests or data. A Distributed Denial of Service (DDoS) attack is a more sophisticated version, where the attack originates from multiple sources, making it harder to block. An attacker will typically search for operations that anyone can execute for free but that are expensive for the application in terms of resources such as storage, memory, network bandwidth, or compute. In the case of canisters, such attacks can aim to deplete cycles, making the canister unable to process legitimate requests. The reverse gas model means that a dapp needs to implement strategies to deal with this.
-- [ ] Use `canister_inspect_message` to drop obviously invalid messages before Candid decoding -- [ ] Reject the anonymous principal in every endpoint that requires authentication -- [ ] Enforce per-caller rate limits or concurrency locks for expensive operations -- [ ] Set a conservative freezing threshold (90–180 days) -- [ ] Set explicit `wasm_memory_limit` to guard against memory exhaustion -- [ ] Set `wasm_memory_threshold` to receive an `on_low_wasm_memory` hook notification before the limit is hit -- [ ] Monitor cycle balances and alert on unusual consumption spikes -- [ ] Reserve compute or memory allocation for high-traffic canisters +### Recommendation -## Cycle drain attacks +To protect your canisters from DoS and DDoS attacks, consider the following strategies: +* **Bot prevention techniques**: Use methods like captchas or proof of work to ensure only legitimate users can access your canister. Captchas help verify that the user is human, while proof of work requires the user to spend computational resources to proceed, deterring automated attacks. [Internet Identity](https://github.com/dfinity/internet-identity) has a [captcha implementation](https://github.com/dfinity/internet-identity/blob/2bf92dc16371428a3dcc1115580a691842ec76df/src/internet_identity/src/main.rs#L517) that can serve as an example for implementing this in other projects. +* **Monitor cycles usage**: Regularly track your canisters' cycle consumption and set alerts for any sudden spikes that may indicate an attack. +* **Ingress message charging**: While charging for ingress messages (external requests to the canister) is not natively supported, custom solutions could be implemented to make sure that any expensive actions have costs associated with them. +* **Filter ingress messages using inspect message**: Certain non-critical checks can be placed in the inspect message function to filter out ingress update messages before they are executed by all nodes of a subnet.
Since this code only runs on a single node, the execution does not consume cycles, but it also shouldn't be relied upon for security-critical checks such as access control. However, such checks can efficiently reject certain ingress messages early. Read the corresponding [documentation](../../references/ic-interface-spec/canister-interface.md#system-api-inspect-message) and [security best practice](./identity-and-access-management.mdx) carefully for the caveats. -Every ingress message (external call to your canister) costs cycles. The cost includes: +## Protect against noisy neighbors -- A base execution fee of 5M cycles per update message (13-node subnet), plus an ingress reception fee of ~1.2M cycles and 2,000 cycles per byte received -- Per-instruction fees for all code executed before a trap or rejection -- Candid decoding, which runs before your method body +### Security concern -This means an attacker can drain your cycles simply by sending many messages. The canister pays for Candid decoding and early checks even when it rejects the call. See [Cycles costs](../../references/cycles-costs.md#cost-table) for exact figures. +In a shared resource environment like the Internet Computer, multiple canisters can run on the same subnet. If one canister consumes too many resources (CPU, memory, etc.), it can negatively impact the performance of others on the same subnet. This is known as the "noisy neighbor" problem. -### Use inspect_message as a first-pass filter +### Recommendation -`canister_inspect_message` runs on a **single replica** before a message enters consensus. Code in this hook does not burn cycles, so it is an efficient place to drop messages that are obviously invalid: for example, calls from the anonymous principal to authenticated endpoints.
+To mitigate the "noisy neighbor" issue, manage your canister's resource allocation effectively: +* **Memory allocation**: Memory can be reserved per canister by setting `memory_allocation`, ensuring that your canister can always allocate memory up to the requested `memory_allocation` and preventing other canisters from using up the subnet's available memory. Note that memory availability is not guaranteed beyond the memory allocation, and thus monitoring actual memory usage against this value is important to avoid availability issues. +* **Compute reservation**: Similar to memory, computing power can also be reserved by setting `compute_allocation` to a value between 0 and 100, which denotes the percentage of one CPU core to be reserved for this canister. A value of 50 means that every 2 rounds, the canister will be scheduled to execute a message. This guarantees a minimum rate of progress for your canister, which protects against noisy neighbors. Both allocations reserve resources for your canister on the subnet, preventing other canisters from using them. Hence, they come at a cost. Memory allocation is charged as if the full allocated amount were in use. Compute allocation is currently charged at 10M cycles per percentage point. +Learn more about managing memory and compute resources in the [cycles costs reference](../../references/cycles-costs.md). +* **Subnet and canister distribution**: Implement a smart canister deployment strategy by monitoring the load on subnets. You can choose to deploy new canisters on less busy subnets or adopt a multi-canister architecture that balances the load across subnets. Be mindful to minimize inter-subnet communication for canisters that frequently interact with each other. Additionally, avoid deploying to known high-traffic subnets where possible, though keep in mind that resource usage can change unexpectedly with new dapps. -**Critical limitation:** `canister_inspect_message` is not a security boundary.
It runs on one node and can be bypassed by a malicious boundary node. It is also never called for inter-canister calls, query calls, or management canister calls. Always duplicate real access control inside each update method. See [Access management](access-management.md) for the full access control pattern. +:::note +When the subnet grows above 750GiB, then the new reservation mechanism activates. Every time a canister allocates new storage bytes, the system sets aside some amount of cycles from the main balance of the canister. These reserved cycles will be used to cover future payments for the newly allocated bytes. The reserved cycles are not transferable, and the amount of reserved cycles depends on how full the subnet is. For example, it may cover days, months, or even years of payments for the newly allocated bytes. It is important to note that the reservation mechanism applies only to the newly allocated bytes and does not apply to the storage already in use by the canister. See more at [resource reservations](https://forum.dfinity.org/t/increasing-subnet-storage-capacity-and-introducing-resource-reservation-mechanism/23447). +::: -`inspect_message` has a budget of **200 million instructions**: do not perform expensive work here. Use it only to short-circuit calls that are structurally invalid (wrong caller type, missing required data). +## Handle expensive calls -**Motoko: inspect_message:** +### Security concern -```motoko -import Principal "mo:core/Principal"; +Some calls (update or query) might be expensive in terms of the memory or cycles they consume. For example, any function using chain-key signing or HTTPS outcalls is relatively expensive. See the [additional documentation](../../references/cycles-costs.md) on cycles cost details and for other examples. -// Inside the persistent actor { ... } +An attacker will target expensive calls to drain the cycles balance or available memory quickly. 
-system func inspect( - { - caller : Principal; - msg : { - #adminAction : () -> (); - #publicAction : () -> (); - #expensiveOperation : () -> (); - } - } -) : Bool { - switch (msg) { - // Admin and expensive methods: reject anonymous callers before Candid decoding - case (#adminAction _) { not Principal.isAnonymous(caller) }; - case (#expensiveOperation _) { not Principal.isAnonymous(caller) }; - // Public methods: accept all - case (_) { true }; - }; -}; -``` +### Recommendation -**Rust: inspect_message:** +* **Use captchas**: Expensive operations should require a captcha to be solved. Try to use a library to implement a captcha instead of a cloud service, as such a service would require HTTPS outcalls and isn't decentralized. +* **Use PoW (proof-of-work)**: Require a proof-of-work challenge to be solved by the client for any expensive operation. The parameters need to be carefully chosen to require sufficient computation per call to the expensive operation without creating too much impact for legitimate clients. Don't forget to consider clients on slow and older mobile devices while protecting against attackers on modern multi-GPU systems. Certain algorithms can limit the performance increase of GPUs to improve this uneven battlefield. +* **Charge for expensive calls**: You can require that certain expensive calls from other canisters include cycles to compensate for the resources consumed. In addition, one can charge for ingress messages. However, that is not currently supported by the protocol itself, and a custom solution, such as pre-paying a certain amount, would need to be designed. +* **Differentiate between update and query calls**: Expensive computations should generally be avoided for update calls unless absolutely necessary. While query calls are not authenticated, they are faster and less resource-intensive. To check whether a method was called as a query or update call, you can use `ic0.in_replicated_execution()`. 
-```rust -use ic_cdk::api::{accept_message, msg_caller, msg_method_name}; -use candid::Principal; +### Further recommendations -/// Pre-filter to reduce cycle waste from spam. -/// Runs on ONE node. Can be bypassed. NOT a security check. -/// Always duplicate real access control inside each method. -#[ic_cdk::inspect_message] -fn inspect_message() { - let method = msg_method_name(); - match method.as_str() { - // Admin and expensive methods: reject anonymous callers - "admin_action" | "expensive_operation" => { - if msg_caller() != Principal::anonymous() { - accept_message(); - } - // Silently reject anonymous: saves cycles on Candid decoding - } - // Public methods: accept all - _ => accept_message(), - } -} -``` +- Automatically monitor cycles consumption and set appropriate alerts for cycles consumption rate and balance. Sudden spikes in cycles consumption could indicate an attack. +- Implement early authentication and rate limiting for your canisters. +- Be aware of attacks targeting high cycles-consuming calls. +- See the "Cycle balance drain attacks section" in [How to audit an ICP canister](https://www.joachim-breitner.de/blog/788-How_to_audit_an_Internet_Computer_canister). -### Rate limiting and per-caller locking - -For expensive operations (chain-key signing, HTTPS outcalls, large state writes), enforce per-caller concurrency limits. Allowing the same caller to queue up many concurrent requests multiplies the cost of any single caller's flood. - -The CallerGuard pattern prevents concurrent calls from the same principal. While the guard is held, any second call from the same caller is rejected immediately: before any expensive work runs. - -**Motoko: per-caller concurrency lock:** - -```motoko -import Map "mo:core/Map"; -import Principal "mo:core/Principal"; -import Result "mo:core/Result"; - -// Inside the persistent actor { ... 
} - -let pendingRequests = Map.empty(); - -func acquireGuard(principal : Principal) : Result.Result<(), Text> { - if (Map.get(pendingRequests, Principal.compare, principal) != null) { - return #err("already processing a request for this caller"); - }; - Map.add(pendingRequests, Principal.compare, principal, true); - #ok; -}; - -func releaseGuard(principal : Principal) { - ignore Map.delete(pendingRequests, Principal.compare, principal); -}; - -public shared ({ caller }) func expensiveOperation() : async Result.Result { - // 1. Reject anonymous - if (Principal.isAnonymous(caller)) { - return #err("anonymous principal not allowed"); - }; - - // 2. Acquire per-caller lock: rejects concurrent calls from same principal - switch (acquireGuard(caller)) { - case (#err(msg)) { return #err(msg) }; - case (#ok) {}; - }; - - // 3. Do expensive work (async calls, etc.) - try { - let result = await someExpensiveCall(); - #ok(result) - } catch _ { - #err("operation failed") - } finally { - // Released in cleanup context: runs even if the callback traps - releaseGuard(caller); - }; -}; -``` - -**Rust: per-caller concurrency lock (CallerGuard):** - -```rust -use std::cell::RefCell; -use std::collections::BTreeSet; -use candid::Principal; -use ic_cdk::update; -use ic_cdk::api::msg_caller; - -thread_local! 
{ - static PENDING: RefCell> = RefCell::new(BTreeSet::new()); -} - -struct CallerGuard { - principal: Principal, -} - -impl CallerGuard { - fn new(principal: Principal) -> Result { - PENDING.with(|p| { - if !p.borrow_mut().insert(principal) { - return Err("already processing a request for this caller".to_string()); - } - Ok(Self { principal }) - }) - } -} - -impl Drop for CallerGuard { - fn drop(&mut self) { - PENDING.with(|p| { - p.borrow_mut().remove(&self.principal); - }); - } -} - -#[update] -async fn expensive_operation() -> Result { - let caller = msg_caller(); - if caller == Principal::anonymous() { - return Err("anonymous principal not allowed".to_string()); - } - - // Acquire per-caller lock: Drop releases it even if the callback traps - let _guard = CallerGuard::new(caller)?; - - // Do expensive work: use Call::bounded_wait for inter-canister calls - // to avoid unbounded waits that would block canister upgrades - let result = do_expensive_work().await?; - Ok(result) - // _guard dropped here -> lock released -} -``` - -The guard releases automatically when it goes out of scope: including when an inter-canister call callback traps. Never use `let _ = CallerGuard::new(caller)?` (this drops the guard immediately, making locking ineffective). Always bind to a named variable (`let _guard`). - -### Proof-of-work and captchas for public endpoints - -For endpoints that must accept anonymous or unauthenticated callers: for example, a public registration flow. The per-caller lock pattern cannot apply. Instead, require the caller to prove they spent computational resources: - -- **Captcha:** Require solving a captcha before calling an expensive endpoint. Use a library-based captcha (not a cloud service) to keep the solution onchain and avoid HTTPS outcalls. -- **Proof of work:** Require the client to include a nonce that satisfies a hash challenge. The canister verifies the nonce in `inspect_message` before accepting the message. 
This imposes CPU cost on the caller proportional to the difficulty parameter. - -[Internet Identity](https://github.com/dfinity/internet-identity)'s [captcha implementation](https://github.com/dfinity/internet-identity/blob/2bf92dc16371428a3dcc1115580a691842ec76df/src/internet_identity/src/main.rs#L517) provides a working example. - -## Resource limit awareness - -The IC enforces hard limits on message execution. If your canister frequently approaches these limits, a flood of requests can make it unable to serve legitimate users: - -| Limit | Value | -|-------|-------| -| Instructions per update call | 40 billion | -| Instructions per query call | 5 billion | -| Instructions per `inspect_message` | 200 million | -| Max ingress message payload | 2 MiB | -| Wasm heap memory | 4 GiB (wasm32) | -| Wasm stable memory | 500 GiB | - -Source: [Cycles costs reference](../../references/cycles-costs.md#resource-limits). - -### Prevent memory exhaustion - -If users can store data without limits, an attacker can fill the 4 GiB Wasm heap or stable memory, causing allocation failures that corrupt canister state. Mitigations: - -- **Enforce per-user storage quotas**: track bytes stored per principal and reject requests that exceed the limit. -- **Validate input sizes**: check the size of user-provided blobs, text, or arrays before storing them. -- **Set a `wasm_memory_limit`**: configures a soft ceiling below the 4 GiB hard limit. When exceeded, new update calls trap instead of corrupting state. See [Canister settings](../canister-management/settings.md). - -```yaml -# icp.yaml: memory protection (settings nested under canister name) -canisters: - - name: backend - settings: - wasm_memory_limit: 3gib - wasm_memory_threshold: 512mib # triggers on_low_wasm_memory hook -``` - -### Paginate large queries - -Data queries that return unbounded result sets can exhaust the instruction limit for a single call. 
An attacker can exploit this by requesting a query that processes all stored data: - -- **Always paginate**: accept an optional cursor or offset and return at most a fixed number of results per call. -- **Avoid unbounded iteration**: do not iterate entire data structures in a single call unless the data set is provably bounded. - -## Freezing threshold as a safety net - -The `freezing_threshold` setting defines the minimum number of seconds the canister should be able to survive on its current cycle balance. When the balance drops below this reserve, the canister **freezes**: update calls are rejected. A frozen canister does not execute code, but it continues to pay for storage and compute allocation. - -The default threshold is 30 days. For production canisters holding valuable state, increase it to 90–180 days: - -```bash -# Set freezing threshold to 90 days -icp canister settings update backend --freezing-threshold 7776000 -e ic -``` - -Or via `icp.yaml`: - -```yaml -# icp.yaml: settings nested under canister name -canisters: - - name: backend - settings: - freezing_threshold: 90d -``` - -A conservative freezing threshold gives you time to detect and respond to a cycle drain attack before the canister is uninstalled. If cycles reach zero and the threshold expires, the canister is uninstalled: code and data are deleted permanently. See [Canister settings](../canister-management/settings.md) for full configuration details. - -## Noisy neighbor protection - -Multiple canisters share the same subnet. If a neighboring canister consumes excessive compute or memory, it can slow your canister's response times. 
You can reserve resources to protect against this: - -### Compute allocation - -Setting `compute_allocation` guarantees your canister a percentage of an execution core and ensures scheduled execution even when the subnet is busy: - -```yaml -# icp.yaml: settings nested under canister name -canisters: - - name: backend - settings: - compute_allocation: 10 # Guaranteed 10% of one execution core -``` - -A value of `10` means the canister is scheduled at least every 10 consensus rounds. Compute allocation incurs an ongoing rental fee (10M cycles per percentage point per second on a 13-node subnet). Only set it if you need guaranteed throughput under load. See [Cycles costs](../../references/cycles-costs.md#compute-allocation). - -### Memory allocation - -Setting `memory_allocation` reserves a fixed pool of memory for your canister, preventing other canisters from consuming the subnet's available memory: - -```yaml -# icp.yaml: settings nested under canister name -canisters: - - name: backend - settings: - memory_allocation: 4gib -``` - -Memory allocation is charged as if the full allocated amount were in use. Monitor actual memory usage to avoid paying for unused allocation. - -## Monitoring cycle consumption - -Cycle drain attacks appear as unusual spikes in consumption. 
Set up monitoring before deploying to mainnet: - -```bash -# Check current cycle balance -icp canister status backend -e ic - -# Check balance of a specific canister by ID -icp canister status -e ic -``` - -Key metrics to monitor: - -- **Balance**: alert when balance drops below a safe threshold (e.g., 2x the freezing threshold reserve) -- **Burn rate**: track cycles per day; a sudden spike indicates unexpected activity -- **Memory usage**: track growth over time; sudden jumps suggest user-driven data accumulation - -For production canister monitoring, consider automating balance checks with a heartbeat or timer canister that sends an alert notification when the balance approaches the freezing threshold. - -## Handling expensive operations safely - -Chain-key signing (threshold ECDSA/Schnorr), HTTPS outcalls, and Bitcoin API calls are significantly more expensive than standard update calls. These make attractive targets for attackers: - -- **Require authentication**: never allow anonymous callers to trigger expensive operations. -- **Apply per-caller locking**: use the CallerGuard pattern to prevent the same caller from queuing multiple expensive calls. -- **Charge callers**: for canister-to-canister calls, require the calling canister to attach cycles to cover the cost. The called canister accepts the cycles using `ic0.msg_cycles_accept` (Rust: `ic_cdk::api::msg_cycles_accept(max_amount: u128)`). -- **Differentiate update vs. query**: move expensive computations to update calls and use query calls for cheap reads. Check whether a method is running as a query or update with `ic0.in_replicated_execution()` (Rust: `ic_cdk::api::in_replicated_execution()`). 
- -## Next steps - -- [Access management](access-management.md): caller checks, anonymous principal rejection, and role-based guards -- [Inter-canister call safety](inter-canister-calls.md): TOCTOU vulnerabilities and the CallerGuard pattern -- [Canister settings](../canister-management/settings.md): freezing threshold, memory allocation, and compute allocation -- [Cycles costs](../../references/cycles-costs.md#cost-table): exact cost tables and resource limits -- [Security model](../../concepts/security.md): IC trust boundaries and threat model overview - - + diff --git a/docs/guides/security/encryption.mdx b/docs/guides/security/encryption.mdx deleted file mode 100644 index 3480abb..0000000 --- a/docs/guides/security/encryption.mdx +++ /dev/null @@ -1,410 +0,0 @@ ---- -title: "Encryption with VetKeys" -description: "Encrypt and decrypt data on ICP using VetKeys for privacy, key management, and identity-based encryption" -sidebar: - order: 6 ---- - -import { Tabs, TabItem } from '@astrojs/starlight/components'; - -VetKeys enable canisters to derive cryptographic key material on demand so that clients can encrypt and decrypt data without the canister ever seeing the raw key. This guide covers the complete flow: exposing vetKD endpoints in a canister, generating a transport key pair on the frontend, and using the derived key for symmetric encryption. It also covers higher-level patterns: the `EncryptedMaps` abstraction for encrypted key-value storage, and identity-based encryption (IBE) for sending encrypted messages to a principal. - -For background on how the vetKD protocol works, see [VetKeys](../../concepts/vetkeys.md). 
- -## Prerequisites - - - - -Add the following to `Cargo.toml`: - -```toml -[dependencies] -candid = "0.10" -ic-cdk = "0.19" -ic-vetkeys = "0.6" -ic-stable-structures = "0.7" -serde = { version = "1", features = ["derive"] } -serde_bytes = "0.11" -``` - - - - -Add `ic-vetkeys` to `mops.toml`: - -```toml -[package] -name = "my-vetkd-app" -version = "0.1.0" - -[dependencies] -core = "2.0.0" -``` - - - - -Frontend (TypeScript): - -```bash -npm install @dfinity/vetkeys@0.4.0 -``` - -## Step 1: Expose vetKD endpoints in the backend canister - -The backend canister wraps the management canister's `vetkd_derive_key` and `vetkd_public_key` methods and enforces per-caller key isolation. The context passed to both API methods encodes the domain separator and the caller's principal, so each caller's keys are cryptographically separate and only that caller can retrieve them. - - - - -```motoko -import Array "mo:core/Array"; -import Blob "mo:core/Blob"; -import Nat8 "mo:core/Nat8"; -import Principal "mo:core/Principal"; -import Text "mo:core/Text"; - -persistent actor { - - type VetKdCurve = { #bls12_381_g2 }; - - type VetKdKeyId = { - curve : VetKdCurve; - name : Text; - }; - - type VetKdPublicKeyRequest = { - canister_id : ?Principal; - context : Blob; - key_id : VetKdKeyId; - }; - - type VetKdPublicKeyResponse = { - public_key : Blob; - }; - - type VetKdDeriveKeyRequest = { - input : Blob; - context : Blob; - transport_public_key : Blob; - key_id : VetKdKeyId; - }; - - type VetKdDeriveKeyResponse = { - encrypted_key : Blob; - }; - - let managementCanister : actor { - vetkd_public_key : VetKdPublicKeyRequest -> async VetKdPublicKeyResponse; - vetkd_derive_key : VetKdDeriveKeyRequest -> async VetKdDeriveKeyResponse; - } = actor "aaaaa-aa"; - - let domainSeparator : [Nat8] = Blob.toArray(Text.encodeUtf8("my_app_v1")); - - // Encodes domain separator + caller principal so each caller's keys are isolated. 
- func callerContext(caller : Principal) : Blob { - Blob.fromArray( - Array.flatten([ - [Nat8.fromNat(domainSeparator.size())], - domainSeparator, - Blob.toArray(Principal.toBlob(caller)), - ]) - ) - }; - - func keyId() : VetKdKeyId { - { curve = #bls12_381_g2; name = "test_key_1" } - // Use "key_1" for production - }; - - public shared ({ caller }) func getPublicKey() : async Blob { - let response = await managementCanister.vetkd_public_key({ - canister_id = null; - context = callerContext(caller); - key_id = keyId(); - }); - response.public_key - }; - - public shared ({ caller }) func getEncryptedVetKey( - input : Blob, - transportPublicKey : Blob, - ) : async Blob { - // test_key_1 costs ~10B cycles; key_1 costs ~26B cycles - let response = await (with cycles = 10_000_000_000) managementCanister.vetkd_derive_key({ - input; - context = callerContext(caller); - transport_public_key = transportPublicKey; - key_id = keyId(); - }); - response.encrypted_key - }; -}; -``` - - - - -```rust -use ic_cdk::update; - -const DOMAIN_SEPARATOR: &[u8] = b"my_app_v1"; - -/// Encodes domain separator + caller principal so each caller's keys are isolated. 
-fn caller_context(caller: candid::Principal) -> Vec { - [DOMAIN_SEPARATOR.len() as u8] - .into_iter() - .chain(DOMAIN_SEPARATOR.iter().copied()) - .chain(caller.as_slice().iter().copied()) - .collect() -} - -fn key_id() -> ic_cdk::management_canister::VetKDKeyId { - ic_cdk::management_canister::VetKDKeyId { - curve: ic_cdk::management_canister::VetKDCurve::Bls12_381_G2, - name: "test_key_1".to_string(), // Use "key_1" for production - } -} - -#[update] -async fn get_public_key() -> Vec { - let caller = ic_cdk::caller(); - let request = ic_cdk::management_canister::VetKDPublicKeyArgs { - canister_id: None, - context: caller_context(caller), - key_id: key_id(), - }; - let reply = ic_cdk::management_canister::vetkd_public_key(&request) - .await - .expect("vetkd_public_key call failed"); - reply.public_key -} - -#[update] -async fn get_encrypted_vetkey(input: Vec, transport_public_key: Vec) -> Vec { - let caller = ic_cdk::caller(); // capture before await - // test_key_1 costs ~10B cycles; key_1 costs ~26B cycles - let request = ic_cdk::management_canister::VetKDDeriveKeyArgs { - input, - context: caller_context(caller), - transport_public_key, - key_id: key_id(), - }; - let reply = ic_cdk::management_canister::vetkd_derive_key(&request) - .await - .expect("vetkd_derive_key call failed"); - reply.encrypted_key -} - -ic_cdk::export_candid!(); -``` - - - - -**Key decisions:** - -- **Context**: encodes the domain separator (`my_app_v1`) plus the caller's principal. This makes every caller's keys cryptographically separate; a key derived for one principal cannot be decrypted by another. Both `getPublicKey` and `getEncryptedVetKey` must use the same context so that `decryptAndVerify` succeeds on the frontend. -- **Input**: an additional identifier within the caller's key space (a document ID, a room ID, or an empty vector for a single per-user key). Different inputs yield different keys; the same input always yields the same key. 
-- **Caller capture before `await`**: always read `caller` before any `await` in an update call. - -## Step 2: Generate a transport key pair on the frontend - -The transport key pair is ephemeral. Generate it fresh for each session or each key request. - -```typescript -import { TransportSecretKey } from "@dfinity/vetkeys"; - -const transportSecretKey = TransportSecretKey.random(); -const transportPublicKey = transportSecretKey.publicKeyBytes(); -``` - -Pass `transportPublicKey` to the canister when requesting a derived key. - -## Step 3: Retrieve and decrypt the vetKey - -```typescript -import { - TransportSecretKey, - DerivedPublicKey, - EncryptedVetKey, -} from "@dfinity/vetkeys"; - -// An additional identifier within the caller's key space. -// Use an empty vector for a single per-user key, or a document/room ID for multiple. -const input = new Uint8Array(0); - -const [encryptedKeyBytes, verificationKeyBytes] = await Promise.all([ - backendActor.get_encrypted_vetkey(input, transportPublicKey), - backendActor.get_public_key(), -]); - -const verificationKey = DerivedPublicKey.deserialize( - new Uint8Array(verificationKeyBytes), -); -const encryptedVetKey = EncryptedVetKey.deserialize( - new Uint8Array(encryptedKeyBytes), -); - -// Verify and decrypt: throws if the key is malformed or was tampered with -const vetKey = encryptedVetKey.decryptAndVerify( - transportSecretKey, - verificationKey, - input, -); -``` - -## Step 4: Derive a symmetric key and encrypt data - -The raw vetKey is not used directly as an AES key. Use `toDerivedKeyMaterial()` to derive a symmetric key from it. 
- -```typescript -// Derive a 256-bit AES-GCM key -const aesKeyMaterial = vetKey.toDerivedKeyMaterial(); -const aesKey = await crypto.subtle.importKey( - "raw", - aesKeyMaterial.data.slice(0, 32), - { name: "AES-GCM" }, - false, - ["encrypt", "decrypt"], -); - -// Encrypt -const iv = crypto.getRandomValues(new Uint8Array(12)); -const ciphertext = await crypto.subtle.encrypt( - { name: "AES-GCM", iv }, - aesKey, - new TextEncoder().encode("secret message"), -); - -// Store ciphertext (and iv) in the canister; never store the key - -// Decrypt -const plaintext = await crypto.subtle.decrypt( - { name: "AES-GCM", iv }, - aesKey, - ciphertext, -); -``` - -Store only the ciphertext and IV in the canister; the raw key exists only in the client's memory for the duration of the session. - -## Using EncryptedMaps for encrypted key-value storage - -`EncryptedMaps` is a higher-level abstraction that combines `KeyManager` (access-controlled vetKey derivation) with encrypted storage. It manages key derivation, access control, and client-side encryption transparently. Each named map is secured with a single vetKey; all key-value pairs in the map share the same access permissions. 
- - - - -```typescript -import { EncryptedMaps } from "@dfinity/vetkeys/encrypted_maps"; - -// encryptedMapsClientInstance connects to your backend canister -const encryptedMaps = new EncryptedMaps(encryptedMapsClientInstance); - -const mapOwner = Principal.fromText("aaaaa-aa"); -const mapName = "passwords"; -const mapKey = "email_account"; - -// Store an encrypted value (encryption is automatic) -const value = new TextEncoder().encode("my_secure_password"); -await encryptedMaps.setValue(mapOwner, mapName, mapKey, value); - -// Retrieve and decrypt a stored value -const stored = await encryptedMaps.getValue(mapOwner, mapName, mapKey); - -// Grant another user read-write access to the map -const user = Principal.fromText("bbbbbb-bb"); -await encryptedMaps.setUserRights(mapOwner, mapName, user, { ReadWrite: null }); -``` - - - - -The backend `EncryptedMaps` component stores only ciphertext; all plaintext stays on the frontend. See the [password manager example](https://github.com/dfinity/examples/tree/master/rust/vetkeys/password_manager) (Motoko + Rust) for a full implementation, or the [password manager with metadata](https://github.com/dfinity/examples/tree/master/rust/vetkeys/password_manager_with_metadata) variant that adds unencrypted metadata alongside encrypted values. - -For the Rust backend, `EncryptedMaps` lives in `ic_vetkeys::encrypted_maps`; for TypeScript, import from `@dfinity/vetkeys/encrypted_maps`. - -## Identity-based encryption (IBE) - -IBE lets anyone encrypt a message to a principal using only the canister's public key. The recipient authenticates to the canister, obtains their corresponding vetKey, and decrypts. No prior key exchange is needed and the sender does not need the recipient to be online. 
- -**Encrypt (sender, no canister call needed):** - -```typescript -import { - IbeCiphertext, - IbeIdentity, - IbeSeed, - DerivedPublicKey, -} from "@dfinity/vetkeys"; - -// Derive the canister's IBE public key (fetch once, cache) -const publicKeyBytes = await backendActor.get_public_key(); -const ibePublicKey = DerivedPublicKey.deserialize(new Uint8Array(publicKeyBytes)); - -// Encrypt to the recipient's principal -const recipientIdentity = IbeIdentity.fromBytes(recipientPrincipalBytes); -const seed = IbeSeed.random(); -const plaintext = new TextEncoder().encode("secret message"); - -const ciphertext = IbeCiphertext.encrypt( - ibePublicKey, - recipientIdentity, - plaintext, - seed, -); -const serialized = ciphertext.serialize(); // store in the canister or transmit -``` - -**Decrypt (recipient, after obtaining vetKey):** - -```typescript -import { IbeCiphertext } from "@dfinity/vetkeys"; - -// Obtain the vetKey for the recipient's principal (steps 2-3 above) -const vetKey = /* ... decryptAndVerify as shown in Step 3 ... */; - -const deserialized = IbeCiphertext.deserialize(serialized); -const decrypted = deserialized.decrypt(vetKey); -// decrypted is Uint8Array of the plaintext -``` - -See the [basic IBE example](https://github.com/dfinity/examples/tree/master/rust/vetkeys/basic_ibe) (Motoko + Rust) for a complete backend and frontend implementation. For IBE with a time-based release condition (timelock encryption), see the [secret-bid auction example](https://github.com/dfinity/examples/tree/master/rust/vetkeys/basic_timelock_ibe). - -## Testing locally - -Start the local network and deploy: - -```bash -icp network start -d -icp deploy backend -``` - -The local network automatically provisions both `test_key_1` and `key_1`. 
Verify that your canister returns a public key: - -```bash -icp canister call backend get_public_key '()' -# Returns: (blob "...") -- 48+ bytes of BLS public key data -``` - -For `vetkd_derive_key` testing, use the [chain-key testing canister](https://github.com/dfinity/chainkey-testing-canister) (`vrqyr-saaaa-aaaan-qzn4q-cai`) on mainnet as a lower-cost alternative during development. It provides a fake vetKD implementation with no threshold. Use key name `insecure_test_key_1`. Never use it with real data or in production. - -## Common mistakes - -- **Reusing transport keys across sessions.** Each session must generate a fresh transport key pair. -- **Using the raw vetKey as an AES key.** Always call `toDerivedKeyMaterial()` first; do not pass the raw bytes to `importKey`. -- **Putting secret data in the `input` field.** The `input` is sent to the management canister in plaintext. Use it as an identifier (principal, document ID), not for the secret data itself. -- **Mismatched `context` between `getPublicKey` and `getEncryptedVetKey`.** Both endpoints must derive context from the same inputs (domain separator + caller principal). If they differ, `decryptAndVerify` will fail silently. -- **Not attaching enough cycles to `vetkd_derive_key`.** `test_key_1` costs approximately 10 billion cycles; `key_1` costs approximately 26 billion cycles. 
- -## Next steps - -- [VetKeys concept](../../concepts/vetkeys.md): how the vetKD protocol works and what use cases it enables -- [Data integrity](data-integrity.md): certified variables and response verification -- [Internet Identity](../authentication/internet-identity.md): authenticate users before granting access to vetKeys -- [vetkeys examples](https://github.com/dfinity/examples/tree/master/rust/vetkeys): password manager, encrypted notes, IBE messaging, BLS signing, and secret-bid auction -- [ic-vetkeys library](https://github.com/dfinity/vetkeys): Rust crate and TypeScript package source - -{/* Upstream: informed by dfinity/portal docs/building-apps/network-features/vetkeys/ (introduction.mdx, api.mdx, encrypted-onchain-storage.mdx, identity-based-encryption.mdx, dkms.mdx, timelock-encryption.mdx); dfinity/icskills vetkd; dfinity/examples rust/vetkeys/password_manager, rust/vetkeys/password_manager_with_metadata, rust/vetkeys/basic_ibe, rust/vetkeys/basic_timelock_ibe, rust/vetkeys/encrypted_notes_dapp_vetkd */} diff --git a/docs/guides/security/formal-verification.md b/docs/guides/security/formal-verification.md new file mode 100644 index 0000000..502fb99 --- /dev/null +++ b/docs/guides/security/formal-verification.md @@ -0,0 +1,34 @@ +--- +title: "Formal Verification" +description: "Applying formal verification and TLA+ model checking to find and prove the absence of security bugs in ICP canisters." +sidebar: + order: 12 +--- + +Formal verification is the highest form of quality assurance for software. Given a specification of what the system should do, formal verification tools check whether this specification is satisfied by a model of the system. The unique advantage of formal verification is that it can not only find bugs but also formally **prove** their absence, including the absence of security bugs. This goes beyond what testing or manual audits can achieve. + +The proof is always relative to the model and the specification. 
Any simplifications and assumptions in the model, or omissions in the specification, may hide bugs and attacks. On the other hand, verification can require a lot of effort, and model simplifications can make it significantly easier. + +For a concrete example, when verifying the ckBTC minter canister, the DFINITY team used models that exclude the possibility of calling the `update_balance` canister method more than 2 times concurrently. This potentially misses attacks that require 3 or more concurrent calls to `update_balance` to trigger a bug. But we considered such bugs highly unlikely, and, in return, we were able to run a fully automatic verification process, which was much cheaper than other verification methods. + +There are many existing formal verification tools. While none of them take into account the specifics of ICP yet, many of them are general enough that they can be applied to canisters. In particular, the DFINITY team has made good use of the TLA+ toolkit to combat reentrancy bugs, a type of concurrency bug where one method of a particular canister is called while another one is still executing. + +These bugs are particularly difficult to find, as they can involve unexpected interactions of code scattered throughout a canister or even in different canisters. The number of such code interactions may be huge and thus difficult for humans to detect. These interactions are usually difficult to test automatically and systematically, which TLA+ can do. + +## TLA+ + +The Temporal Logic of Actions (TLA+) is a language for specifying and verifying complex systems. TLA+ comes with a set of tools for lightweight formal verification in the form of so-called model checking. Through model checking, it exhaustively (within bounds, such as the aforementioned 2 concurrent calls bound) explores all possible concurrent interactions of a model of the code (exactly the domain that is difficult to test) and finds bugs. 
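The interleaving exploration described above can be illustrated with a toy model checker, sketched here in Rust rather than TLA+ (the state, steps, and invariant are invented for illustration): two concurrent `withdraw(100)` calls are split at their await point into a read step and a debit step, every interleaving of those steps is enumerated, and the invariant that the balance never goes negative is checked in each final state.

```rust
#[derive(Clone)]
struct State {
    balance: i64,
    read: [i64; 2], // amount each in-flight call observed before its await
}

// One atomic step of call `call`: pc 0 reads and caps the balance (before
// the await), pc 1 debits the amount that was read (after the await).
fn step(s: &mut State, call: usize, pc: usize) {
    match pc {
        0 => s.read[call] = s.balance.min(100), // check funds before the await
        _ => s.balance -= s.read[call],         // settle after the await
    }
}

// Depth-first enumeration of every interleaving of two 2-step calls.
fn explore(s: &State, pcs: [usize; 2], violations: &mut u32) {
    if pcs == [2, 2] {
        if s.balance < 0 {
            *violations += 1; // invariant "balance >= 0" is broken
        }
        return;
    }
    for call in 0..2 {
        if pcs[call] < 2 {
            let mut next = s.clone();
            step(&mut next, call, pcs[call]);
            let mut next_pcs = pcs;
            next_pcs[call] += 1;
            explore(&next, next_pcs, violations);
        }
    }
}

fn main() {
    let mut violations = 0;
    explore(&State { balance: 100, read: [0, 0] }, [0, 0], &mut violations);
    println!("violating interleavings: {violations}");
    assert_eq!(violations, 4);
}
```

Four of the six interleavings violate the invariant: exactly those in which both calls read the balance before either debit lands, the reentrancy double-spend that is hard to hit with testing but trivial for exhaustive exploration to find.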
+ +Importantly, after building the model of the code, model checking runs with virtually no further human input, making it highly cost-effective. To illustrate with some made-up numbers: if the industry standard practices (such as testing and security reviews) eliminate 80% of the bugs, and "heavyweight" formal verification eliminates 99.99%, with TLA+ you can eliminate 90% with a fraction of the effort of the heavyweight verification. + +We have used TLA+ to create the following models that can be interesting for dapp developers: + +- NNS and SNS governance (focusing on interactions with the ledger canister). +- ICP ledger (focusing on block archival). +- ckBTC minter. +- SNS swap canister. +- People parties dapp. + +To find out more on why and how you can apply TLA+ to your canisters and dapps, including an in-depth guide to modeling canisters, refer to our series of blog posts ([1](https://medium.com/dfinity/eliminating-smart-contract-bugs-with-tla-e986aeb6da24), [2](https://medium.com/dfinity/weeding-out-the-bugs-with-tla-models-3606045bf24e), [3](https://mynosefroze.com/blog/2023-08-09-tla_for_canisters)). You can also look at the [DFINITY-produced TLA+ models](https://github.com/dfinity/formal-models) for examples and techniques. + + diff --git a/docs/guides/security/https-outcalls.md b/docs/guides/security/https-outcalls.md new file mode 100644 index 0000000..22e244b --- /dev/null +++ b/docs/guides/security/https-outcalls.md @@ -0,0 +1,97 @@ +--- +title: "HTTPS Outcall Security" +description: "Security best practices for canister HTTPS outcalls: API keys, rate limits, idempotency, response consistency, and input validation." +sidebar: + order: 6 +--- + +## Do not store sensitive data such as API keys in canisters + +### Security concern + +Sensitive data is a broad term that varies depending on your application logic and behavior. 
Here is a non-exhaustive list of secrets that are typically considered sensitive, such as API keys or tokens: +* Secrets that allow interaction with non-public endpoints. +* Secrets that allow querying or modifying endpoints with confidential data. +* API tokens that are fee-based. + +By default, the data stored inside your canister is unencrypted. Therefore, if your canister is installed on a malicious replica, it can easily retrieve and steal your keys, tokens, and secrets in plain text. + +### Recommendation + +Make sure you don't store sensitive data inside your canister. + +See also: [data confidentiality on ICP](./miscellaneous.md#data-confidentiality-on-icp). + +## Ensure your canisters have a sufficiently large quota with the HTTP server + +### Security concern + +When an HTTPS outcall is performed, it is amplified by the number of replicas in the subnet. The target web server will receive not only one request but as many requests as the number of nodes in the subnet. + +Most web servers implement some sort of rate limiting; this is a mechanism used to restrict the number of requests a client can make to a web server within a specific time period, preventing abuse or excessive usage of their API(s). + +### Recommendation + +You should consider such rate limits when designing and implementing your canisters. Rate limits are enforced using different time granularities, e.g., seconds or minutes. For second-granularity enforcement, make sure that the simultaneous requests by all subnet replicas do not violate the quota. Violations may lead to temporary or permanent bans. + +See the [HTTPS outcalls guide](../backends/https-outcalls.mdx) for more details. + +## Only make HTTPS outcall requests to idempotent endpoints + +### Security concern + +As mentioned before, if an HTTPS outcall is performed, it is amplified by the number of replicas in the subnet. That means the queried endpoint will receive the same request several times. 
This is especially risky for requests that change the endpoint's state: a single HTTPS outcall could unintentionally change that state several times.
+
+### Recommendation
+
+Make sure the endpoints called by an HTTPS outcall are idempotent: the endpoint must behave the same for a given request payload, no matter how many times it is called.
+
+Some servers support the use of idempotency keys. These keys are unique random strings submitted as HTTP request headers. If used with the HTTPS outcalls feature, all requests sent by the honest replicas will contain the same idempotency key. This allows the server to recognize duplicated requests (i.e., requests with the same idempotency key), handle just one, and modify the server state only once. Note that this is a feature that must be supported by the server.
+
+See the [HTTPS outcalls guide](../backends/https-outcalls.mdx) for more details.
+
+## Ensure HTTPS responses are identical
+
+### Security concern
+
+When the replicas of a subnet receive HTTP responses, these responses must be identical. Otherwise, consensus won't be achieved, the HTTP response will be rejected, and the canister will still be charged for the call.
+
+### Recommendation
+
+Make sure the HTTP responses sent to the consensus layer are identical.
+
+Ideally, the HTTP responses returned by the queried endpoint would always be the same. Most of the time, however, this cannot be controlled: responses include variable data such as timestamps, cookie values, or request identifiers. In those cases, use transformation functions to guarantee that the responses received by each replica are identical, by removing the variable data or extracting only the relevant data.
+
+This applies to both the HTTP response body and the headers. Make sure the transformation functions consider both. Response headers are often overlooked and are a common cause of failed consensus.
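As an illustration of such a transformation, the sketch below strips all headers from a response before it reaches consensus; the `HttpResponse` struct here is a stand-in for the response type your CDK passes to the registered transform function, not the CDK type itself:

```rust
// Stand-in for the HTTP response type a CDK hands to a transform function.
#[derive(Clone, Debug, PartialEq)]
struct HttpResponse {
    status: u16,
    headers: Vec<(String, String)>,
    body: Vec<u8>,
}

// Normalize a raw response before it reaches consensus: drop all headers,
// since dates, cookies, and request IDs differ per replica, and keep only
// the fields every honest replica will agree on.
fn transform(mut raw: HttpResponse) -> HttpResponse {
    raw.headers.clear();
    raw
}

fn main() {
    // Two replicas fetch the same resource; only volatile headers differ.
    let replica_a = HttpResponse {
        status: 200,
        headers: vec![("date".into(), "Mon, 01 Jan".into())],
        body: b"{\"price\":42}".to_vec(),
    };
    let replica_b = HttpResponse {
        headers: vec![("date".into(), "Tue, 02 Jan".into())],
        ..replica_a.clone()
    };
    // After transformation the responses are byte-identical, so consensus succeeds.
    assert_eq!(transform(replica_a), transform(replica_b));
}
```

In a real canister, the transform is a query method referenced in the outcall request; the normalization logic, clearing headers and extracting only the needed body fields, is the part that matters here.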
See the [HTTPS outcalls guide](../backends/https-outcalls.mdx) for more details.
+
+## Be aware of HTTP request and response sizes
+
+### Security concern
+
+The pricing of HTTPS outcalls is determined by the size of the HTTP request and the maximal response size, among other variables. Large requests can therefore quickly drain the canister's cycles balance. This is especially risky in scenarios where HTTPS outcalls are triggered by user actions (rather than a heartbeat or timer invocation).
+
+### Recommendation
+
+When using HTTPS outcalls, be mindful of the HTTP request and response sizes. Ensure that the size of the request issued and the size of the HTTP response coming from the server are reasonable.
+
+When making an HTTPS outcall, it is possible (and highly recommended) to define the `max_response_bytes` parameter, which sets the maximum allowed response size. If this parameter is not defined, it defaults to the hard response size limit of the HTTPS outcalls feature, which is 2 MiB. The cycle cost of the response is always charged based on `max_response_bytes`, or on the full 2 MiB if the parameter is not set.
+
+Finally, be aware that if HTTPS outcalls can be triggered by user actions, users can impose cycles costs on your canister.
+
+See the [cycles costs reference](../../references/cycles-costs.md) for pricing details.
+
+## Perform input validation in HTTPS outcalls
+
+### Security concern
+
+HTTPS outcalls that use user-submitted data are susceptible to various injection attacks, which may lead to several of the issues described above.
+
+### Recommendation
+
+Perform input validation when using user-submitted data in HTTPS outcalls.
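As a concrete sketch (the endpoint and parameter names are hypothetical), a canister fetching an asset price could allowlist the user-supplied ticker before splicing it into the outcall URL:

```rust
// Accept only short ASCII-alphanumeric tickers, so user input can never
// smuggle path segments, query parameters, or CRLF header injections
// into the outcall URL.
fn validate_ticker(input: &str) -> Result<&str, String> {
    let ok = (1..=10).contains(&input.len())
        && input.chars().all(|c| c.is_ascii_alphanumeric());
    if ok {
        Ok(input)
    } else {
        Err(format!("rejected ticker: {input:?}"))
    }
}

fn main() {
    assert!(validate_ticker("BTC").is_ok());
    // Injection attempts and empty input are rejected before any cycles are spent.
    assert!(validate_ticker("BTC\r\nHost: evil.example").is_err());
    assert!(validate_ticker("").is_err());
    // A validated ticker is then safe to splice into the request URL.
    let url = format!(
        "https://api.example.com/price?symbol={}",
        validate_ticker("ICP").unwrap()
    );
    assert_eq!(url, "https://api.example.com/price?symbol=ICP");
}
```

Allowlisting the accepted characters is safer than trying to blocklist known-bad ones.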
+ + diff --git a/docs/guides/security/identity-and-access-management.mdx b/docs/guides/security/identity-and-access-management.mdx new file mode 100644 index 0000000..a8a7f2b --- /dev/null +++ b/docs/guides/security/identity-and-access-management.mdx @@ -0,0 +1,250 @@ +--- +title: "Identity and Access Management" +description: "Security best practices for authentication, anonymous principal rejection, ingress message inspection, and session management." +sidebar: + order: 2 +--- + +import { Tabs, TabItem } from '@astrojs/starlight/components'; + +## Make sure specific user actions require authentication + +### Security concern + +If this is not the case, an attacker may be able to perform sensitive actions on behalf of a user, compromising their account. + +### Recommendation + +- The caller of every canister call can be identified. The calling [principal](../../references/ic-interface-spec/index.md#principal) can be accessed using the system API's methods [`ic0.msg_caller_size` and `ic0.msg_caller_copy`](../../references/ic-interface-spec/canister-interface.md#system-api-imports). If an identity provider such as Internet Identity is used, [the principal is the user identity for this specific origin](../../references/internet-identity-spec.md#identity-design-and-data-model). If some actions (e.g., access to user's account data or account-specific operations) should be restricted to a principal or a set of principals, then this must be explicitly checked in the canister call. An example in Rust can be found below: + +```rust +// Let pk be the public key of a principal that is allowed to perform +// this operation. This pk could be stored in the canister's state. +if caller() != Principal::self_authenticating(pk) { ic_cdk::trap(...) 
}
+
+// Alternatively, if the canister keeps data for different principals
+// in e.g., a map such as BTreeMap<Principal, UserData>, then the canister
+// must ensure that each caller can only access and perform operations
+// on their own data:
+if let Some(user_data) = user_data_store.get_mut(&caller()) {
+    // perform operations on the user's data
+}
+```
+- In Rust, the `ic_cdk` crate can be used to authenticate the caller using `ic_cdk::api::caller`. Make sure the returned principal is of type `Principal::self_authenticating` and identify the user's account using the public key of that principal. See the example code above.
+
+- Do authentication as early as possible in the call to avoid unauthenticated actions and potentially expensive operations before authentication. It is also a good idea to [deny service to anonymous users](#disallow-the-anonymous-principal-in-authenticated-calls).
+
+- Do not rely on authentication performed during [ingress message inspection](#do-not-rely-on-ingress-message-inspection).
+
+## Disallow the anonymous principal in authenticated calls
+
+### Security concern
+
+The caller from the system API (e.g., `ic_cdk::api::caller` in Rust) may also return `Principal::anonymous()`. In authenticated calls, this is probably undesired and could have security implications, since the anonymous principal would behave like a shared account for anyone who makes unauthenticated calls.
+
+### Recommendation
+
+In authenticated calls, make sure the caller is not anonymous and return an error or trap if it is. This could be done centrally by using a helper method. An example in Rust can be found below:
+```rust
+fn caller() -> Result<Principal, String> {
+    let caller = ic_cdk::api::caller();
+    // The anonymous principal is not allowed to interact with the canister.
+ if caller == Principal::anonymous() { + Err(String::from( + "Anonymous principal not allowed to make calls.", + )) + } else { + Ok(caller) + } +} +``` + +## Do not rely on ingress message inspection + +### Security concern + +The correct execution of [`canister_inspect_message`](../../references/ic-interface-spec/canister-interface.md#system-api-inspect-message) is not guaranteed because it is executed by a single node, and if that node is malicious, it can simply skip this check. In that case the update call would be executed without any message inspection checks. + +Also note that for inter-canister calls, `canister_inspect_message` is not invoked. + +### Recommendation + +Your canisters should not rely on the correct execution of `canister_inspect_message`. This in particular means that no security-critical code, such as [access control checks](#make-sure-specific-user-actions-require-authentication), should be solely performed in that method. Such checks **must** be performed as part of an update method to guarantee reliable execution. Ideally, they are executed both in the `canister_inspect_message` function and a guard function. + +## Use a well-audited authentication service and client-side ICP libraries + +### Security concern + +Implementing user authentication and canister calls yourself in your web app is error-prone and risky. For example, if canister calls are implemented from scratch, there may be bugs around signature creation or verification. + +### Recommendation + +- Consider using an identity provider such as [Internet Identity](https://github.com/dfinity/internet-identity) for authentication, and use the ICP JavaScript agent for making canister calls. + +- You may consider alternative authentication frameworks on ICP for authentication. + +## Set an appropriate session timeout + +### Security concern + +Currently, Internet Identity issues delegations with an expiry time. This expiry time can be set in the auth-client. 
After a delegation expires, the user has to re-authenticate. Setting a good value is a trade-off between security and usability. + +### Recommendation + +See the [OWASP recommendations](https://cheatsheetseries.owasp.org/cheatsheets/Session_Management_Cheat_Sheet.html#session-expiration). A timeout of 30 minutes should be set for security-sensitive applications. + +The auth-client supports [idle timeouts](https://github.com/dfinity/agent-js/tree/main/packages/auth-client#idle-management). + +## Don't use fetchRootKey in the ICP JavaScript agent in production + +### Security concern + +`agent.fetchRootKey()` can be used in the ICP JavaScript agent to fetch the root subnet threshold public key from a status call in test environments. This key is used to verify threshold signatures on certified data received through canister update calls. Using this method in a production web app gives an attacker the option to supply their own public key, invalidating all authenticity guarantees of update responses. + +### Recommendation + +Never use `agent.fetchRootKey()` in production builds, only in test builds. Not calling this method will result in the hardcoded root subnet public key of the mainnet being used for signature verification, which is the desired behavior in production. + +## Integrating Internet Identity on mobile devices + +A [short presentation](https://www.youtube.com/watch?v=iRmpCkzC6iI&t=1863s) can be found as part of the November 2024 global R&D. + +### Security concern + +Internet Identity has a standardized way for web applications to request authentication of a user. This [client authentication protocol](../../references/internet-identity-spec.md#client-authentication-protocol) allows a client dapp frontend to obtain a delegation signed by the Internet Identity for a locally generated session key pair. Using this delegation in combination with the session key allows the dapp frontend to make authenticated calls towards the backend canister. 
Such calls need to be digitally signed by the session private key. The IC will verify the signature and verify if there is a delegation (or chain of delegations) from the II key to the session public key. + +The II client authentication protocol leverages the browser's `postMessage` API to communicate between the client origin and the II origin. This protocol allows II to authenticate the origin of the authorization request using the hostname. + +:::note +As part of the client authentication protocol, a dapp can specify an alternative origin by following the [alternative frontend origins](../../references/internet-identity-spec.md#alternative-frontend-origins) requirements. +::: + +Upon successful authentication, II will return a delegation for the principal derived from the user's II for the specific frontend origin. This serves two purposes. First, a client dapp can't use this delegation on other dapps to impersonate the user. Second, multiple client dapps can't correlate user behavior across dapps, thereby preserving privacy. A dapp with a different frontend origin won't be able to request authentication for your dapp, which provides protection against certain phishing attacks. + +When integrating a mobile application with II, the implementation is not straightforward since a mobile app can't call the `postMessage` API. It is tempting to create a simple "proxy" web frontend served by the dapp as shown in the sequence diagram below. The mobile application can load this proxy to complete the normal II authorization flow. Upon completion, this proxy web app provides the delegation back to the mobile app. + +**Naive implementation sequence diagram:** +```mermaid +sequenceDiagram + actor U as User + participant MA as Mobile App + participant PWA as Proxy web app + participant II_FE as II Front-end + participant II_BE as II Back-end + participant BE as Back-end + activate U + U -> MA : 1. Login + activate MA + MA ->> MA : 2. Generate session key pair + activate PWA + MA ->> PWA: 3. Load /?sessionPublicKey= + activate II_FE + PWA ->> II_FE: 4. Standard II client auth protocol + Note over PWA,II_FE: II will create a delegation for the session<br>key generated by the mobile application. + U ->> II_FE: 5. Authenticate with passkey + activate II_BE + II_FE ->> II_BE: 6. getDelegation(frontEndHostname) + II_BE -->> II_FE: 7. <> + Note over II_BE,II_FE: Delegation can leak through<br>replica or boundary nodes + deactivate II_BE + II_FE -->> PWA: 8. Return the delegation<br>using postMessage API + deactivate II_FE + PWA -->> MA: 9. Return the delegation + Note over PWA,MA: Delegation can leak through<br>insecure return mechanism + deactivate PWA + U ->> MA: 10. Authenticated call + activate BE + MA ->> BE: 11. Update call using the delegation + Note over II_BE,BE: Delegation can leak through<br>replica or boundary nodes + BE -->> MA: <> + deactivate BE + deactivate MA + deactivate U +``` + +However, without any precautions, this proxy would happily accept malicious requests to authenticate the user and might return the delegation back to an attacker. + +Such an attack would start by phishing the user by means of a malicious mobile or web application. The user is asked to authenticate through II. However, instead of using II directly, the attacker abuses the open proxy to authenticate the user for the dapp under which the vulnerable proxy is running. The attacker would generate a session key and ask the proxy to use the session public key in the II authentication protocol. Through this method, II issues a signed delegation for the user's II derived for the frontend origin of the proxy. This delegation could leak to the attacker, who can use it to impersonate the user. For example, if the attacker can trick the proxy into redirecting to the malicious application (e.g., by registering Android deep links or iOS custom schemes), it could directly obtain the delegation. Furthermore, the delegation could leak through an insecure communication channel between the proxy and the mobile app or through observation of the IC state. + +The attack requires four conditions: +1. An attacker can provide a session key to be used in the II client authentication protocol. +2. The client authentication protocol is initiated for a target frontend hostname. +3. The user completes the II authentication protocol. +4. The attacker can obtain the delegation, which is signed by the II canister. + +Conditions 1, 2, and 3 can be satisfied by convincing the user to initiate an authentication flow with a session public key that is chosen by the attacker by loading the proxy from an attacker-controlled mobile or web application. Concretely, an attacker would execute a phishing attack where a victim is directed to the proxy from an unsuspicious application.
For example, the victim is convinced that the attacker is issuing an airdrop. The victim has to download a corresponding malicious mobile app that requires II authentication. This malicious mobile app would load the proxy (step 3) similarly to how the legitimate mobile app would. The malicious app would ask the proxy to authenticate the user for an attacker-chosen session key. Condition 2 is met for any dapp that exposes such an open II authentication proxy on its domain. The victim might not realize they are completing an authorization flow for a different dapp origin. + +Condition 4 can be satisfied by controlling a replica or boundary node that can observe the delegation in step 7. Alternatively, the delegation could leak in step 9 by using an HTTP GET parameter in a URI pointing to the IC. In such cases, if the mobile app that should receive the URI isn't installed, the browser loads the web app by making a request to the URI. Boundary and replica nodes would again receive the delegation as part of the URI. Condition 4 can also be met if the mobile app issues a request to the IC in step 11 without verifying the delegation obtained in step 9. + +Finally, condition 4 can also be satisfied if the delegation is returned insecurely from the proxy frontend to the mobile app, for example by using Android deep links or iOS custom schemes, which can be intercepted by a malicious app. + +### Recommendation + +In the standard integration between a client web app and the II web frontend, the origin of the client is verified **before** starting the client authentication protocol. Unfortunately, loading the URI of the proxy app in step 3 does not provide any information about the mobile application. Therefore, the proxy frontend is unable to authenticate the client. This creates an open endpoint for attackers to use, as described in the previous section. + +This risk can be addressed by adopting the following remediations, shown in the sequence diagram and explained further below.
+ +**Secure integration sequence diagram:** +```mermaid +sequenceDiagram + actor U as User + participant MA as Mobile App + participant PWA as Secure Proxy web app + participant II_FE as II Front-end + participant II_BE as II Back-end + participant BE as Back-end + + activate U + U -> MA : 1. Login + activate MA + MA ->> MA : 2. Generate session key pair + activate PWA + MA ->> PWA: 3. Load /?sessionPublicKey=
+ PWA ->> PWA: 4. Generate intermediate session key + Note over PWA: This key never<br>leaves the proxy<br>front-end + activate II_FE + PWA ->> II_FE: 5. Standard II client auth protocol<br>using intermediate session key + Note over PWA,II_FE: II will create a delegation for the<br>intermediate key and not for the<br>attacker-chosen session key. + U ->> II_FE: 6. Authenticate with passkey + activate II_BE + II_FE ->> II_BE: 7. getDelegation(frontEndHostname) + II_BE -->> II_FE: 8. <> + II_FE ->> II_FE: 9. Construct the delegation chain + Note over II_FE: The proxy front-end creates a<br>delegation from the intermediate<br>key to the mobile app session key<br>and combines it with the<br>delegation from the II canister key<br>to the intermediate key. + II_FE -->> PWA: 10. Return the delegation<br>using the postMessage API + deactivate II_FE + PWA -->> MA: 11. Return the delegation chain using<br>an app link (Android)<br>or universal link (iOS)<br>as part of the fragment<br>using associated domains + Note over MA,PWA: Protect the delegation chain<br>from being leaked to the web<br>server by using a URI fragment<br>instead of a GET parameter. + MA ->> MA: 12. Verify the delegation chain<br>against the session<br>key from step 2 + Note over MA: Verify the delegation chain before using it<br>to avoid leaking a delegation with an<br>attacker-controlled session key to the IC + MA -->> U: <> + U ->> MA: 13. Authenticated call + activate BE + MA ->> BE: 14. Update call using the verified delegation chain + BE -->> MA: <> + deactivate BE + deactivate MA + deactivate U +``` + +* Introduce an intermediate session key that is generated and stored by the web app proxy frontend. +* Initiate the II client authentication protocol using this intermediate session key. By using a new session key that the attacker can't control, the delegation issued by II would no longer be usable by the attacker if it were stolen in step 8, as the attacker doesn't have access to the intermediate session private key. +* [Create a delegation chain](../../references/ic-interface-spec/https-interface.md#authentication) to allow the mobile application to use its session key. The delegation chain consists of two delegations as shown in the figure below. The first one delegates from the II canister key to the intermediate key and is generated by the II canister. The second one delegates from the intermediate key to the mobile app public key and is signed by the proxy frontend's intermediate session private key. Note that this means the intermediate key can impersonate the user. Since the proxy frontend is served from the IC, it can be trusted to handle this key properly, but it is up to the developer to ensure the confidentiality of this key, for example by using the WebCrypto API to create non-extractable keys, as is done internally by the ICP JavaScript agent. Ideally, this intermediate key is short-lived to reduce the risk of exposure. + +![Delegation Chain](/img/docs/security/ii_mobile_delegation_chain.png) +* Return the delegation chain to the mobile app using [app links](https://developer.android.com/training/app-links) on Android and [universal links](https://developer.apple.com/documentation/xcode/allowing-apps-and-websites-to-link-to-your-content) on iOS.
These mechanisms bind the domain name/hostname to the mobile app, which prevents an attacker from using a malicious mobile app to receive the delegation chain. The domain-to-mobile app binding occurs through a JSON file that has to be hosted under the `/.well-known` directory of your web application. See [iOS](https://developer.apple.com/documentation/xcode/supporting-associated-domains) +and [Android](https://developer.android.com/training/app-links/verify-android-applinks) documentation for details. +* Return the delegation chain to the mobile app using a [URI fragment](https://www.w3.org/DesignIssues/Fragment.html) (everything following the # in the URI). The browser will load the URI if the mobile app linked to the app/universal link isn't installed on the mobile device. The benefit of URI fragments is that they are not included in the request to the server if the browser were to resolve the URI. A URL parameter or path would be included in such a request, and therefore it would leak the delegation chain to the proxy app backend (most likely the IC boundary and replica nodes). A URI fragment is still available to the mobile app for extraction. +* Verify the delegation chain in the mobile application before using it in an IC message. The mobile application likely uses an agent that does not verify whether the session key generated in step 2 corresponds to the delegation found in the delegation chain returned in step 11. Using such an agent to make a signed update call would simply create a message with the provided delegation chain and sign it with a mismatching key. Obviously, the IC would reject such a message as the signature does not correspond to the delegation chain, but the delegation chain would already have leaked to the boundary and potentially replica nodes where an attacker could steal it. +* Optionally, the proxy frontend could explicitly warn the user that it is about to sign in with II for your dapp. It could include the dapp's name and logo. 
This might alarm the user who is being phished, since the pretense used by the attacker would likely not match the purpose of your dapp. For example, the attacker claims the authentication is required as part of an airdrop while you are running an unrelated decentralized exchange. When the proxy dapp is opened, the user would see your dapp's logo and abort the sign-in. + +For more information, view an [example implementation in the form of a Unity app](https://github.com/dfinity/examples/tree/main/native-apps/unity_ii_deeplink). The following pieces of that codebase are most important: + +* Generation of the intermediate session key [in index.js](https://github.com/dfinity/examples/blob/main/native-apps/unity_ii_deeplink/ii_integration_dapp/src/greet_frontend/src/index.js#L25). +* [Authentication using the intermediate session key](https://github.com/dfinity/examples/blob/main/native-apps/unity_ii_deeplink/ii_integration_dapp/src/greet_frontend/src/index.js#L26-L36) instead of the mobile app public key. +* [Generating the delegation chain](https://github.com/dfinity/examples/blob/main/native-apps/unity_ii_deeplink/ii_integration_dapp/src/greet_frontend/src/index.js#L48-L57) by combining the delegation obtained from II with a delegation created by the frontend. +* [Returning the delegation chain using an applink/universal link](https://github.com/dfinity/examples/blob/main/native-apps/unity_ii_deeplink/ii_integration_dapp/src/greet_frontend/src/index.js#L71). +* Returning the delegation chain [using a URI fragment](https://github.com/dfinity/examples/blob/main/native-apps/unity_ii_deeplink/ii_integration_dapp/src/greet_frontend/src/index.js#L73). +* The example is currently being improved so that the delegation chain is also verified in the mobile app before it is used.
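Several of the recommendations above hinge on the mobile app verifying the delegation chain against its own session key before use. The following is a rough, self-contained sketch of that structural check in plain Rust, with no agent dependencies. The struct shapes and field names are simplified stand-ins for the delegation structures used by the agent libraries; real code must additionally verify each delegation's signature and should rely on a well-audited library.

```rust
// Illustrative only: structural checks a mobile app should run on a
// delegation chain before using it. Field names are simplified stand-ins;
// a real app must additionally verify each delegation's signature.

struct Delegation {
    pubkey: Vec<u8>,    // key being delegated to
    expiration_ns: u64, // expiry, nanoseconds since the UNIX epoch
}

struct SignedDelegation {
    delegation: Delegation,
    signature: Vec<u8>, // signature by the key one level up the chain
}

/// Accept the chain only if it is non-empty, no hop is expired, and the
/// final hop delegates to the session key generated in step 2. Otherwise
/// the chain may have been issued for an attacker-chosen key and must not
/// be sent to the IC.
fn chain_usable(chain: &[SignedDelegation], session_pubkey: &[u8], now_ns: u64) -> bool {
    if chain.is_empty() {
        return false;
    }
    if chain.iter().any(|d| d.delegation.expiration_ns <= now_ns) {
        return false; // an expired hop invalidates the whole chain
    }
    chain.last().unwrap().delegation.pubkey.as_slice() == session_pubkey
}
```

The same checks apply regardless of implementation language: on Android or iOS, the app would run them on the chain extracted from the URI fragment before making any call to the IC.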
+ +{/* Upstream: sync from dfinity/portal building-apps/security/iam.mdx */} diff --git a/docs/guides/security/inter-canister-calls.md b/docs/guides/security/inter-canister-calls.md index 2f5414b..5863fce 100644 --- a/docs/guides/security/inter-canister-calls.md +++ b/docs/guides/security/inter-canister-calls.md @@ -1,483 +1,342 @@ --- -title: "Inter-Canister Call Safety" -description: "Handle reentrancy, callback traps, and async safety in inter-canister calls" +title: "Inter-Canister Call Security" +description: "Security best practices for handling traps in callbacks, message ordering, rejected calls, and untrustworthy canisters." sidebar: order: 5 --- -Inter-canister calls are the most common source of security bugs on the Internet Computer. The async messaging model creates a class of vulnerabilities that do not exist in synchronous systems: state can change between an `await` and its response, traps in callbacks can skip security-critical operations, and calls to untrusted canisters can permanently block upgrades. +To understand the issues around async inter-canister calls, one needs to understand the [properties of message execution on ICP](../../references/message-execution-properties.md). Understanding these properties is a prerequisite for understanding the security issues discussed below. -This guide covers the specific patterns you must apply whenever your canister makes an inter-canister call. +This is also explained in the [community conversation on security best practices](https://www.youtube.com/watch?v=PneRzDmf_Xw&list=PLuhDt1vhGcrez-f3I0_hvbwGZHZzkZ7Ng&index=2&t=4s). -## Why inter-canister calls are dangerous +## Securely handle traps in callbacks -When your canister `await`s a call to another canister, the IC scheduler can interleave other incoming messages while your canister waits for the response. This means: +### Security concern -- State your canister read before the `await` may be different when the callback runs. 
-- A second call from the same user can arrive and begin executing before the first call's callback completes. -- If the callback traps, any mutations made in the callback are rolled back: but mutations made before the `await` are already committed. +Traps and panics roll back the canister state, as described in [Property 5](../../references/message-execution-properties.md#message-execution-properties). So any state change followed by a trap or panic can be risky. This is an important concern when inter-canister calls are made. If a trap occurs after an await to an inter-canister call, then the state is reverted to the snapshot before the inter-canister call's callback invocation, and not to the state before the entire call. -The code before `await` and the code after `await` execute as **separate atomic message executions**. Understanding this is the foundation of inter-canister call security. +More precisely, suppose some state changes are applied and then an inter-canister call is issued. Also, assume that these state changes leave the canister in an inconsistent state, and that state is only made consistent again in the callback. Now if there is a trap in the callback, this leaves the canister in an inconsistent state. -## Reentrancy and the CallerGuard pattern +Here are two example security issues that can arise because of this: -A reentrancy bug occurs when a second message from the same caller interleaves with a first message that is still in progress: that is, awaiting a response. In DeFi contexts this enables double-spending: the attacker calls `withdraw()`, waits for it to begin the inter-canister transfer, then calls `withdraw()` again before the first call updates the balance. +- Assume an inter-canister call is issued to transfer funds. In the callback, the canister accounts for having made that transfer by updating the balances in the canister storage. 
However, suppose the callback also updates some usage statistics data, which eventually leads to a trap when some data structure becomes full. As soon as that is the case, the canister ends up in an inconsistent state because the state changes in the callback are no longer applied, and thus the transfers are not correctly accounted for. + ![example_trap_after_await](/img/docs/security/example_trap_after_await.png) + This example is also discussed in this [community conversation](https://www.youtube.com/watch?v=PneRzDmf_Xw&list=PLuhDt1vhGcrez-f3I0_hvbwGZHZzkZ7Ng&index=2&t=4s). + +- Suppose part of the canister state is locked before an inter-canister call and released in the callback. Then the lock may never be released if the callback traps. + Note that in canisters implemented in Rust with Rust CDK version `0.5.1` or newer, any local variables still go out of scope if a callback traps. The CDK actually calls into the `ic0.call_on_cleanup` API to release these resources. This helps to prevent issues with locks not being released, as it is possible to use Rust's Drop implementation to release locked resources, as we discuss in [Be aware that there is no reliable message ordering](#be-aware-that-there-is-no-reliable-message-ordering). + +### Recommendation + +Recall that the responses to inter-canister calls are processed in the corresponding callback. If the callback traps, the cleanup (`ic0.call_on_cleanup`) is executed.
When making an inter-canister call, ICP reserves sufficiently many cycles to execute the response callback or cleanup, up to the instruction limit. A fixed fraction of the reservation is set aside for the cleanup. Thus, a response or cleanup execution can never "run out of cycles," but they can run into the instruction limit and trap. -```motoko -import Map "mo:core/Map"; -import Principal "mo:core/Principal"; -import Error "mo:core/Error"; -import Result "mo:core/Result"; +The naïve recommendation to address the security concern described above would be to avoid traps. However, that can be very difficult to achieve due to the following reasons: -// Inside your persistent actor class { ... } -// Replace otherCanister with your canister reference. +- The implementation can be involved and could panic due to bugs, such as index out-of-bounds errors or panics (expect, unwrap) that should supposedly never happen. -let pendingRequests = Map.empty(); +- It is hard to make sure the callback or cleanup doesn't run into the instruction limit and thus traps, because the number of instructions required can in general not be predicted and may depend on the data being processed. -func acquireGuard(principal : Principal) : Result.Result<(), Text> { - if (Map.get(pendingRequests, Principal.compare, principal) != null) { - return #err("already processing a request for this caller"); - }; - Map.add(pendingRequests, Principal.compare, principal, true); - #ok -}; +Due to these reasons, while it is easy to recommend "avoiding traps", this is actually hard to achieve in practice. Therefore, code should be written so that it can deal even with unexpected traps due to bugs or hitting the instruction limits. There are two approaches: -func releaseGuard(principal : Principal) { - ignore Map.delete(pendingRequests, Principal.compare, principal); -}; +1. Perform simple cleanups +1. Utilize "journaling." 
-public shared ({ caller }) func withdraw(amount : Nat) : async Result.Result<(), Text> { - if (Principal.isAnonymous(caller)) { - return #err("anonymous caller not allowed"); - }; +In the first approach, the cleanup callback is used to recover from unexpected panics. This can work, but it has several drawbacks: +- The cleanup itself could panic, in which case one is in the initial problematic situation again. The risk may be acceptable for simple cleanups, but as discussed above, it is hard to write code that never panics, especially if it is somewhat complex. +- As of version 0.12.0, Motoko provides the `try`/`finally` feature to clean up temporary resource allocations in a structured way. Cleanup is used (as formerly) internally by Motoko to perform some state manipulations and now allows inserting programmer-written code also. If an execution path after `await` traps, all `finally` blocks in (dynamic) scope will be executed as a last-resort measure. Be aware that `finally` is not a magical construct to end all trap worries, as trapping in the `finally` blocks themselves can still leave your canister in an inconsistent state. Thus we recommend keeping your `finally` code clear and concise and paying special attention to reviewing it well. +- As discussed above, the Rust CDK has a feature that automatically releases local variables in cleanup, which [can be used to release locks](#recommendation-1). Since only one cleanup callback can be defined, any custom cleanup would currently have to implement that feature itself if needed, making this currently hard to use and understand. - // Acquire per-caller lock before any state reads or async calls. - switch (acquireGuard(caller)) { - case (#err(msg)) { return #err(msg) }; - case (#ok) {}; - }; +Instead, "journaling" is the recommended way of addressing the problem at hand. - try { - // Read state and make the inter-canister call here. 
- let result = await otherCanister.transfer(caller, amount); - #ok(result) - } catch (e) { - #err("transfer failed: " # Error.message(e)) - } finally { - // Runs in cleanup context regardless of success or trap. - // State mutations here are always committed. - releaseGuard(caller); - }; -}; -``` +### Journaling -### Rust +Journaling can be used for ensuring that tasks are completed correctly in an asynchronous context, where any instruction or async task can fail. Journaling is generally useful in any security-critical application canister on ICP. The journaling concept we describe here is inspired and adapted from journaling in file systems. -In Rust, the `Drop` trait releases the lock when the guard goes out of scope: including when the async function is cancelled or a trap occurs. Never write `let _ = CallerGuard::new(caller)?`: the leading underscore drops the guard immediately, making locking ineffective. Always bind to a named variable: `let _guard = CallerGuard::new(caller)?`. +Conceptually, a journal is a chronological list of records kept in a canister's storage. It keeps track of tasks before they begin and when they are completed. Before each failable task, the journal records the intent to execute the task, and after the task, the journal records the result. The journal supports idempotent task flows by providing the necessary information for the canister to resume flows that failed to complete, report progress for ongoing flows, and report results for completed flows. Retries can be initiated by calls, automatically on a [heartbeat](../backends/timers.mdx#heartbeats-legacy) or using [timers](../backends/timers.mdx). If the task flow was completed in a heartbeat or a timer, a user can take advantage of idempotency to check the result. 
-```rust -use std::cell::RefCell; -use std::collections::BTreeSet; -use candid::Principal; -use ic_cdk::update; -use ic_cdk::api::msg_caller; -use ic_cdk::call::Call; +Creating a record in the journal is called "journaling." For example, to make an unreliable async call to a ledger: -// Replace other_canister_id() with your canister's ID lookup. +1. Check the journal to ensure the transfer is not already in progress. If it is already in progress, go into recovery (see the [Recovery](#recovery) section below). Otherwise, journal the intent to call a ledger to transfer 1 token from A to B. The journaled intent should contain sufficient context to later identify what happened to the call. -thread_local! { - static PENDING: RefCell> = RefCell::new(BTreeSet::new()); -} + - An "in progress" transfer would show in the journal as an entry containing intent to do the transfer without an entry containing the result of the transfer call. -struct CallerGuard { - principal: Principal, -} +1. Call the ledger to transfer 1 token from A to B. -impl CallerGuard { - fn new(principal: Principal) -> Result { - PENDING.with(|p| { - if !p.borrow_mut().insert(principal) { - return Err("already processing a request for this caller".to_string()); - } - Ok(Self { principal }) - }) - } -} +1. Journal the result of the transfer. -impl Drop for CallerGuard { - fn drop(&mut self) { - PENDING.with(|p| { - p.borrow_mut().remove(&self.principal); - }); - } -} + - On failure, record the error. -#[update] -async fn withdraw(amount: u64) -> Result<(), String> { - let caller = msg_caller(); - if caller == Principal::anonymous() { - return Err("anonymous caller not allowed".to_string()); - } + - On success, record success. In order to commit the record, an inter-canister call can be made to an endpoint on the same canister that does nothing. Otherwise, a trap could erase the journaled result, complicating recovery. - // Acquire per-caller lock. Drop releases the lock when _guard goes out of scope. 
- let _guard = CallerGuard::new(caller)?; +1. Continue onto the next blocked task. - // Make the inter-canister call while the lock is held. - Call::bounded_wait(other_canister_id(), "transfer") - .with_args(&(caller, amount)) - .await - .map_err(|e| format!("transfer failed: {:?}", e))?; + - "Blocked tasks" are those that require step 3 to be completed before execution. - Ok(()) - // _guard dropped here: lock released -} -``` - -## State mutations before and after await + - A blocked task may depend on the success or failure recorded in step 3. -Because the code before `await` and the code after `await` are separate message executions, you must treat them independently when reasoning about consistency. + - Examples of blocked tasks: -**The critical rule:** If your canister mutates state before an `await`, that mutation is committed even if the callback traps. + - On failure, log the failure in a user-visible log, and if less than 5 failures have occurred, make a new transfer outcall with the same parameters. -### Example: deduct before transferring + - On success, update the internal accounting of assets to conform to the result of the transfer. -In a token transfer flow, deduct the balance before the inter-canister call rather than after. If the call fails, refund in the callback. This approach is safe: if the callback traps, the pre-deducted balance stays deducted (you can detect and remediate the stuck state. If you deduct after the call and the callback traps, the transfer happened but the balance was never deducted) funds are double-spent. + - Note that any independent task does not need to wait for any part of this flow. -**Motoko:** +The critical property of the journal is that at any point, if there is a failure, the journal is sufficient to determine what the next safe step should be. 
If, after step 1 (journal the intent), +there is a failure in step 2 or 3, and step 3 has not been completed, then the application should complete step 3 by finding out what happened to the call in step 2. If finding out what happened to the call is too difficult to automate, it can be done manually. The journal can indicate whether a manual intervention is necessary and the type of intervention that is necessary. +The fact that the intent has been journaled and the app knows not to reenter the flow until the result has been recorded means the journal acts as a lock on the critical section containing +the ledger outcall. The lock will not get stuck, assuming the application can always find out what happened to a call. Enough context about the call should be recorded in the intent to ensure +that this is the case. For the ICP ledger, an ID can be generated and recorded in the journaled intent, and the ledger can be called with the ID included in the memo so that the result of the +call can be queried later. -```motoko -import Map "mo:core/Map"; -import Principal "mo:core/Principal"; -import Error "mo:core/Error"; -import Result "mo:core/Result"; +### Journaling is robust to panics -// Inside your persistent actor class { ... } -let balances = Map.empty(); +Continuing the above example, consider a panic at any point. +1. If there is panic before the async outcall, then the journaled intent will be lost. No state change occurred internally, and no outcalls were made, so the app is in a safe state. The next step is to record a new intent. +1. If there is a panic after the async outcall and no self-call was used to commit the journal, the journaled result (step 3) will be lost. This means the app will need to determine the result and journal it before continuing to step 4. As long as it is possible to determine the result, the app can be brought back to a consistent state. 
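The bookkeeping described above can be condensed into a small sketch. This is plain Rust with no IC dependencies; the `Journal` and `Entry` types and all names are illustrative. A real canister would keep the journal in stable storage and record enough context in each intent (e.g., the ledger memo and deduplication timestamp) to resolve unknown outcomes later.

```rust
// Minimal illustrative sketch of the journaling pattern. The pending
// intent doubles as the lock on the critical section: a second intent
// for the same task is rejected until an outcome has been journaled.

#[derive(Debug, Clone, PartialEq)]
enum Entry {
    Intent { id: u64, from: String, to: String, amount: u64 },
    Outcome { id: u64, success: bool },
}

#[derive(Default)]
struct Journal {
    entries: Vec<Entry>,
}

impl Journal {
    /// Record the intent to perform a transfer. Fails if the same task
    /// is already in progress.
    fn record_intent(&mut self, id: u64, from: &str, to: &str, amount: u64) -> Result<(), String> {
        if self.in_progress(id) {
            return Err(format!("transfer {id} already in progress"));
        }
        self.entries.push(Entry::Intent { id, from: from.into(), to: to.into(), amount });
        Ok(())
    }

    /// Record the result of a previously journaled intent.
    fn record_outcome(&mut self, id: u64, success: bool) {
        self.entries.push(Entry::Outcome { id, success });
    }

    /// A task is in progress if an intent exists without a matching outcome.
    fn in_progress(&self, id: u64) -> bool {
        let has_intent = self.entries.iter().any(|e| matches!(e, Entry::Intent { id: i, .. } if *i == id));
        let has_outcome = self.entries.iter().any(|e| matches!(e, Entry::Outcome { id: i, .. } if *i == id));
        has_intent && !has_outcome
    }

    /// All ids that need recovery: intent journaled, result unknown.
    fn unresolved(&self) -> Vec<u64> {
        self.entries
            .iter()
            .filter_map(|e| match e {
                Entry::Intent { id, .. } if self.in_progress(*id) => Some(*id),
                _ => None,
            })
            .collect()
    }
}
```

Here `unresolved()` lists exactly the tasks the recovery process must look at, and `record_intent` acting as the lock is what prevents reentering the critical section around the ledger outcall.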
-public shared ({ caller }) func transfer(to : Principal, amount : Nat) : async Result.Result<(), Text> { - // 1. Validate balance before the await. - let balance = switch (Map.get(balances, Principal.compare, caller)) { - case (?b) b; - case null 0; - }; - if (balance < amount) { - return #err("insufficient balance"); - }; +### Journaling and audit events - // 2. Deduct BEFORE the await: mutation is committed regardless of callback outcome. - Map.add(balances, Principal.compare, caller, balance - amount); - - // 3. Perform the inter-canister call. - try { - await ledgerCanister.transfer(to, amount); - #ok(()) - } catch (e) { - // 4. Refund on failure: the deduction persists even if this try/catch runs. - let currentBalance = switch (Map.get(balances, Principal.compare, caller)) { - case (?b) b; - case null 0; - }; - Map.add(balances, Principal.compare, caller, currentBalance + amount); - #err("transfer failed: " # Error.message(e)) - } -}; -``` +The journal can be used to augment the audit trail for recent events. However, it is probably too detailed for long-term storage. After a while, journal entries could be compressed and incorporated into long-term audit events. The process for creating audit events could itself be journaled. -**Rust:** +### Recovery -```rust -use std::cell::RefCell; -use std::collections::BTreeMap; -use candid::Principal; -use ic_cdk::update; -use ic_cdk::api::msg_caller; -use ic_cdk::call::Call; +The journal ensures the application knows that recovery from an error is needed and aids in making recovery decisions. In order to support the recovery process, the journal should support querying all unresolved tasks of a certain type and tasks of a certain type that resulted in an error. Given an intent, the journal should also be able to return the result if it exists and indicate if it does not exist. -// Replace ledger_canister_id() with your canister's ID lookup. +Note that recovery can often be complex to automate. 
In such cases, the journal can support a manual recovery process. +Extending the ledger example above, a recovery process could look as follows: -thread_local! { - static BALANCES: RefCell> = - RefCell::new(BTreeMap::new()); -} +1. There is a panic, and the status of the ledger call is unknown. However, the journal has recorded that a call to transfer with particular parameters and a memo has been made, including the deduplication timestamp of the transfer. +1. The app calls the ledger to determine whether a transaction with the journaled parameters has succeeded on the ledger. Due to the guarantee that any pair of messages that are both executed are always executed in the order issued, if the ledger indicates that the transaction has not occurred, then the transaction will never occur. +1. The app journals the result of the transfer call. +1. The app journals the intention to update internal state according to the result of the transfer call, then updates the internal state, and finally journals the result of the attempt to update the internal state. Journaling this step is still useful even if it does not contain outcalls, because outcalls may be introduced later, and the step could conflict with other processes that are not atomic. -#[update] -async fn transfer(to: Principal, amount: u64) -> Result<(), String> { - let caller = msg_caller(); - - // 1. Validate and deduct BEFORE the await. - BALANCES.with(|b| { - let mut balances = b.borrow_mut(); - let balance = balances.get(&caller).copied().unwrap_or(0); - if balance < amount { - return Err("insufficient balance".to_string()); - } - balances.insert(caller, balance - amount); - Ok(()) - })?; - - // 2. Make the inter-canister call. - let result = Call::bounded_wait(ledger_canister_id(), "transfer") - .with_args(&(to, amount)) - .await; - - if let Err(e) = result { - // 3. Refund on failure. 
- BALANCES.with(|b| { - let mut balances = b.borrow_mut(); - let current = balances.get(&caller).copied().unwrap_or(0); - balances.insert(caller, current + amount); - }); - return Err(format!("transfer failed: {:?}", e)); - } +Note that querying the ICP ledger or an ICRC ledger to determine whether a transaction has succeeded is not straightforward to automate, so it could be done manually. - Ok(()) -} -``` +### Example implementation of journaling -## Callback traps and security-critical cleanup +GoldDAO's GLDT-swap has an implementation of journaling. In their case, the journal entries are recorded in the "registry." Note that in GLDT-swap there is also a separate concept of "record," which is a permanent audit trail and is not used for journaling. Some error paths require manual recovery. See the following reference points: -A trap in an inter-canister call callback is particularly dangerous: the callback's state mutations are rolled back, but the pre-`await` mutations are not. A malicious callee can induce a trap in your callback to skip actions that should always run: like debiting an account. +- Registry (journal) structure: + - https://github.com/GoldDAO/gold-dao/blob/ledger-v1.0.0/canister/gldt_core/src/registry.rs#L18 + - https://github.com/GoldDAO/gold-dao/blob/ledger-v1.0.0/canister/gldt_core/src/lib.rs#L654 +- The registry is used in `notify_sale_nft_origyn` to record progress and enforce correctness of the flow. + - https://github.com/GoldDAO/gold-dao/blob/ledger-v1.0.0/canister/gldt_core/src/lib.rs#L910 + - Note that not all details of the flow appear in the registry. The amount of detail to include depends on one's goals for recovery. -To protect against this: +## Be aware that there is no reliable message ordering -1. **Keep callbacks minimal.** The less logic in a callback, the fewer opportunities for a trap. -2. 
**Use `finally` (Motoko) or `Drop` guards (Rust) for cleanup.** Cleanup that runs in `finally` or in `drop()` executes in cleanup context where mutations persist even after a trap. -3. **Avoid calling untrusted canisters** from callbacks that perform security-critical state changes. The callee can cause your callback to trap. +### Security concern -### Motoko: cleanup in finally +As described in the [properties of message executions on ICP](../../references/message-execution-properties.md), messages (but not entire calls) are processed atomically. In particular, as described in Property 4 in that document, messages from interleaving calls do not have a reliable execution ordering. Thus, the state of the canister (and other canisters) may change between the time an inter-canister call is started and the time when it returns, which may lead to issues if not handled correctly. These issues are generally called 'reentrancy bugs' (see the [Ethereum best practices on reentrancy](https://consensysdiligence.github.io/smart-contract-best-practices/attacks/reentrancy/)). Note, however, that the messaging guarantees, and thus the bugs, on ICP are different from Ethereum. -```motoko -import Error "mo:core/Error"; +Here are two concrete and somewhat similar types of bugs to illustrate potential reentrancy security issues: -// Inside your persistent actor class { ... } -// Replace otherCanister with your canister reference. +- **Time-of-check time-of-use issues:** These occur when some condition on global state is checked before an inter-canister call and then wrongly assuming the condition still holds when the call returns. For example, one might check if there is sufficient balance on some account, then issue an inter-canister call, and finally make a transfer as part of the callback message. 
When the second inter-canister call starts, it is possible that the condition that was checked initially no longer holds, because other ledger transfers may have happened before the callback of the first call is executed (see also Property 4 above). -var operationInProgress = false; +- **Double-spending issues**: Such issues occur when a transfer is issued twice, often because of unfavorable message scheduling. For example, suppose you check if a caller is eligible for a refund, and if so, transfer some refund amount to them. When the refund ledger call returns successfully, you set a flag in the canister storage indicating that the caller has been refunded. This is vulnerable to double-spending because the refund method can be called twice by the caller in parallel, in which case it is possible that the messages before issuing the transfer (including the eligibility check) are scheduled before both callbacks. A detailed explanation of this issue can be found in the [community conversation on security best practices](https://www.youtube.com/watch?v=PneRzDmf_Xw&list=PLuhDt1vhGcrez-f3I0_hvbwGZHZzkZ7Ng&index=2&t=4s). -public shared ({ caller }) func riskyOperation() : async () { - operationInProgress := true; // Committed immediately +### Recommendation - try { - await otherCanister.doSomething(); - // ... callback logic - } catch (e) { - // Handle error - ignore Error.message(e); - } finally { - // Runs in cleanup context: mutation persists even if callback trapped. - operationInProgress := false; - } -}; -``` +It is highly recommended to carefully review any canister code that makes async inter-canister calls (`await`). If two messages read or write the same state, review if there is a possible scheduling of these messages that leads to illegal transactions or an inconsistent state. 
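To make the double-spending example concrete, the following plain-Rust sketch (no IC APIs; all names are illustrative) shows the safe ordering: the "refunded" flag is committed *before* the ledger call would be awaited, and rolled back only if that call is rejected, so a second interleaved call for the same caller finds the flag already set:

```rust
use std::collections::BTreeSet;

// Tracks which callers have been refunded or have a refund in flight.
struct Refunds {
    refunded: BTreeSet<String>,
}

impl Refunds {
    fn new() -> Self {
        Refunds { refunded: BTreeSet::new() }
    }

    // Commit the flag BEFORE awaiting the ledger transfer. Returns false if
    // a refund for this caller is already recorded or in progress, so an
    // interleaved second call cannot trigger a second transfer.
    fn try_begin_refund(&mut self, caller: &str) -> bool {
        self.refunded.insert(caller.to_string())
    }

    // Roll the flag back only if the ledger call was rejected.
    fn undo_refund(&mut self, caller: &str) {
        self.refunded.remove(caller);
    }
}
```

Setting the flag only in the callback, after the transfer succeeded, is the vulnerable variant: two interleaved calls can both pass the eligibility check before either callback runs.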
-### Rust: cleanup via Drop +See also: "Inter-canister calls" section in [how to audit an ICP canister](https://www.joachim-breitner.de/blog/788-How_to_audit_an_Internet_Computer_canister). -```rust -use std::cell::Cell; -use ic_cdk::update; -use ic_cdk::call::Call; +To address issues around message ordering that can lead to bugs, one usually employs locking mechanisms to ensure that a caller or anyone can only execute an entire call, which involves several messages, once at a time. A simple example is also given in the [community conversation](https://www.youtube.com/watch?v=PneRzDmf_Xw&list=PLuhDt1vhGcrez-f3I0_hvbwGZHZzkZ7Ng&index=2&t=4s) mentioned above. -// Replace other_canister_id() with your canister's ID lookup. +The locks would usually be released in the callback. That bears the risk that the lock may never be released in case the callback traps, as we discussed in [securely handle traps in callbacks](#securely-handle-traps-in-callbacks). The code examples below show how one can securely implement a lock per caller. +- In Rust, one can use the drop pattern where each caller lock (`CallerGuard` struct) implements the `Drop` trait to release the lock. From Rust CDK version `0.5.1`, any local variables still go out of scope if the callback traps, so the lock on the caller is released even in that case. Technically, the CDK calls into the `ic0.call_on_cleanup` API to release these resources. Recall that `ic0.call_on_cleanup` is executed if the `reply` or the `reject` callback executed and trapped. +- In Motoko, one can use the `try`/`finally` control flow construct. This construct guarantees that the lock is released in the `finally` block regardless of any errors or traps in the `try` or `catch` blocks. -thread_local! 
{ - static OPERATION_IN_PROGRESS: Cell = Cell::new(false); -} +**Motoko:** -struct OperationGuard; +```motoko +import Result "mo:core/Result"; +import Map "mo:core/Map"; +import Error "mo:core/Error"; +import Principal "mo:core/Principal"; -impl Drop for OperationGuard { - fn drop(&mut self) { - // Runs when the guard is dropped, even during cleanup after a trap. - OPERATION_IN_PROGRESS.with(|f| f.set(false)); - } -} +actor { -#[update] -async fn risky_operation() -> Result<(), String> { - OPERATION_IN_PROGRESS.with(|f| f.set(true)); // Committed immediately + let pending_requests = Map.empty(); - // _guard released (Drop called) when this function returns or is cancelled. - let _guard = OperationGuard; + private func guard(principal : Principal) : Result.Result<(), Error.Error> { + if (Map.get(pending_requests, Principal.compare, principal) != null) { + #err (Error.reject("Already processing a request for principal " # Principal.toText(principal))); + } else { + Map.add(pending_requests, Principal.compare, principal, true); + #ok; + }; + }; - Call::bounded_wait(other_canister_id(), "do_something") - .await - .map_err(|e| format!("call failed: {:?}", e))?; + private func drop_guard(principal : Principal) { + ignore Map.delete(pending_requests, Principal.compare, principal); + }; - Ok(()) -} + public shared ({ caller }) func example_call_with_locking_per_caller() : async Result.Result<(), (Error.ErrorCode, Text)> { + var guard_acquired = false; + try { + // Try to create a lock for `caller`, return an error immediately if there is already a call in progress for `caller` + switch (guard caller) { + case (#ok) guard_acquired := true; + case (#err e) return (#err (Error.code e, Error.message e)); + }; + // do anything, call other canisters + #ok + } catch e { + #err (Error.code e, Error.message e); + } finally { + // Release the guard (only requests that have created a lock) + if guard_acquired { + drop_guard caller; + }; + }; + }; +}; ``` -## Bounded vs unbounded wait 
- -The IC offers two kinds of inter-canister calls: - -| | `bounded_wait` | `unbounded_wait` | -|---|---|---| -| Timeout | 300 seconds (default) | No timeout | -| If callee is unresponsive | Returns `SYS_UNKNOWN` error | Waits indefinitely | -| Upgrade safety | Canister can be stopped and upgraded after timeout | Canister **cannot be stopped** while awaiting | -| Use for | Calls to external or untrusted canisters | Calls to your own canisters you control | - -**The upgrade safety issue:** A canister cannot be stopped (and therefore cannot be upgraded) while it has outstanding unbounded-wait calls. If the callee is malicious or buggy and never responds, your canister is permanently stuck. Use `bounded_wait` for any call to a canister you do not control. - -### Motoko: bounded vs unbounded - -Motoko does not yet expose a direct API to switch between bounded and unbounded wait. The `await` keyword currently uses unbounded wait. For calls to untrusted canisters, prefer the system-level API (available via Rust) or structure your application so calls to untrusted canisters only go out from canisters you can afford to sacrifice. - - - -### Rust: choose bounded_wait for untrusted canisters +**Rust:** ```rust -use ic_cdk::call::Call; -use candid::Principal; - -async fn call_trusted(canister: Principal, method: &str) -> Result { - // Use unbounded_wait only for canisters you control. - Call::unbounded_wait(canister, method) - .await - .map_err(|e| format!("call failed: {:?}", e))? - .candid() - .map_err(|e| format!("decode failed: {:?}", e)) +pub struct State { + pending_requests: BTreeSet, } -async fn call_untrusted(canister: Principal, method: &str) -> Result { - // Use bounded_wait for external or untrusted canisters. - // Default timeout is 300 seconds. Adjust with .change_timeout(seconds). - Call::bounded_wait(canister, method) - .await - .map_err(|e| format!("call failed: {:?}", e))? - .candid() - .map_err(|e| format!("decode failed: {:?}", e)) +thread_local! 
{ + static STATE: RefCell = RefCell::new(State{pending_requests: BTreeSet::new()}); } -``` -## Response size limits - -All inter-canister call payloads (both requests and responses) are limited to **2 MB**. A request above 2 MB fails synchronously. A response above 2 MB causes the callee to trap. +pub struct CallerGuard { + principal: Principal, +} -When reading large datasets across canisters, use pagination: return chunks of data per call rather than everything at once. Keep individual payloads under 1 MB to leave room for encoding overhead. +impl CallerGuard { + pub fn new(principal: Principal) -> Result { + STATE.with(|state| { + let pending_requests = &mut state.borrow_mut().pending_requests; + if pending_requests.contains(&principal){ + return Err(format!("Already processing a request for principal {:?}", &principal)); + } + pending_requests.insert(principal); + Ok(Self { principal }) + }) + } +} -```motoko -// Paginated query: avoid returning unbounded data -// Requires: import Array "mo:core/Array"; import Nat "mo:core/Nat"; -public query func getItems(offset : Nat, limit : Nat) : async [Item] { - // Return at most `limit` items starting from `offset`. - // Caller makes multiple calls to retrieve all data. 
- Array.sliceToArray(items, offset, offset + Nat.min(limit, items.size() - offset)) -}; -``` +impl Drop for CallerGuard { + fn drop(&mut self) { + STATE.with(|state| { + state.borrow_mut().pending_requests.remove(&self.principal); + }) + } +} -## Caller identity across await points +#[update] +#[candid_method(update)] +async fn example_call_with_locking_per_caller() -> Result<(), String> { + let caller = ic_cdk::caller(); + // using `?`, return an error immediately if there is already a call in progress for `caller` + // warning: never use `let _ = CallerGuard::new(caller)?`, because this will drop the guard immediately + // and locking would not be effective + let _guard = CallerGuard::new(caller)?; + // do anything, call other canisters + Ok(()) +} // here the guard goes out of scope and is dropped + +mod test { + use super::*; + + #[test] + fn should_obtain_guard_for_different_principals() { + let principal_1 = Principal::anonymous(); + let principal_2 = Principal::management_canister(); + let caller_guard = CallerGuard::new(principal_1); + assert!(caller_guard.is_ok()); + assert!(CallerGuard::new(principal_2).is_ok()); + } -In Motoko, the `caller` is captured as an immutable binding at function entry via `public shared ({ caller }) func`. This is safe across `await` points. + #[test] + fn should_not_obtain_guard_twice_for_same_principal() { + let principal = Principal::anonymous(); + let caller_guard = CallerGuard::new(principal); + assert!(caller_guard.is_ok()); + assert!(CallerGuard::new(principal).is_err()); + } -In Rust, the current ic-cdk executor preserves caller across `.await` points via protected tasks, but this is an implementation detail: not a language guarantee. Bind `msg_caller()` before the first `await` as a defensive practice. 
+ #[test] + fn should_release_guard_on_drop() { + let principal = Principal::anonymous(); + { + let caller_guard = CallerGuard::new(principal); + assert!(caller_guard.is_ok()); + } // drop caller_guard as it goes out of scope here + // it is possible to get a guard again: + assert!(CallerGuard::new(principal).is_ok()); + } +} +``` -```rust -use ic_cdk::update; -use ic_cdk::api::msg_caller; -use ic_cdk::call::Call; -use candid::Principal; +This pattern can be extended to work for the following use cases: -// Replace other_canister_id() with your canister's ID lookup. +- A global lock that does not only lock per caller. For this, set a boolean flag in the canister state instead of using a `BTreeSet` (Rust) or `Map` (Motoko). +- A guard that makes sure that only a limited number of principals are allowed to execute a method at the same time. + - Rust: Return an error in `CallerGuard::new()` in case `pending_requests.len() >= MAX_NUM_CONCURRENT_REQUESTS`. + - Motoko: Return an error in `guard` in case `Map.size(pending_requests) >= MAX_NUM_CONCURRENT_REQUESTS`. +- A guard that limits the number of times a method can be called in parallel. + - Rust: Use a counter in the canister state that is checked and increased in `CallerGuard::new()` and decreased in `Drop`. + - Motoko: Increase a counter in the `guard` function and decrease it in the `drop` function. +- A guard that makes sure that every task from a set of tasks can only be processed once, independent of the caller who triggered the processing. [View example project](https://github.com/dfinity/examples/tree/master/rust/guards). +- A lock that uses a different type than `Principal` to grant access to the resource. [View an implementation using generic types](https://github.com/dfinity/examples/tree/master/rust/guards). -#[update] -async fn process() -> Result<(), String> { - // Capture caller BEFORE any await: defensive practice in Rust. 
-    let caller: Principal = msg_caller();

+Finally, note that the same guard can be used in several methods to restrict their parallel execution.

-    Call::bounded_wait(other_canister_id(), "validate")
-        .with_arg(caller)
-        .await
-        .map_err(|e| format!("validation failed: {:?}", e))?;

+## Handle rejected inter-canister calls correctly

-    // Use the captured binding, not msg_caller() again.
-    do_work_for(caller);
-    Ok(())
-}

+### Security concern

-fn do_work_for(_caller: Principal) {
-    // ...
-}
-```

+As stated in [Property 6](../../references/message-execution-properties.md#message-execution-properties), inter-canister calls can fail, in which case they result in a **reject**. See [reject codes](../../references/ic-interface-spec/https-interface.md#reject-codes) for more detail. The caller must correctly deal with the reject cases, as they can happen in normal operation, because of insufficient cycles on the sender or receiver side, or because some data structures like message queues are full. Consider, for example, a transfer call to a ledger canister; the following cases can occur:

+1. The call was issued as a bounded-wait (best-effort response) call, and the system responded with a `SYS_UNKNOWN` reject code. In this case, the caller cannot be a priori sure whether the call took effect or not.
+2. The system responded with a `CANISTER_ERROR` reject code. This indicates a bug in the ledger canister. In this case, it is still possible that the call had a partial effect on the ledger canister.
+3. The system responded with a `CANISTER_REJECT` reject code. This means that the call was explicitly rejected by the ledger canister. Normally, this indicates that the transfer didn't happen, but this depends on the ledger canister. The ICP ledger canister, for example, never rejects calls explicitly.
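The cases above suggest a conservative decision rule for the caller. This sketch uses illustrative enums, not the CDK's own API; real code would inspect the reject code carried on the call error:

```rust
// Hypothetical reject classes, mirroring the cases discussed above.
#[derive(Debug, PartialEq)]
enum Reject {
    SysUnknown,     // bounded-wait call: outcome unknown
    CanisterError,  // callee trapped; partial effect on the callee possible
    CanisterReject, // callee explicitly rejected the call
}

#[derive(Debug, PartialEq)]
enum Next {
    // SYS_UNKNOWN: the call may or may not have taken effect; resolve the
    // outcome (e.g. query the callee by memo) before any retry.
    ResolveOutcomeFirst,
    // The message was not successfully executed, although CANISTER_ERROR
    // may still have left partial state on the callee.
    TreatAsFailed,
}

fn next_step(r: Reject) -> Next {
    match r {
        Reject::SysUnknown => Next::ResolveOutcomeFirst,
        Reject::CanisterError | Reject::CanisterReject => Next::TreatAsFailed,
    }
}
```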
+### Recommendation -This means any access control you implement in `inspect_message` does not protect your canister from being called by another canister. Always duplicate access checks inside the method body itself. +When making inter-canister calls, always handle the error cases (rejects) correctly. Other than the `SYS_UNKNOWN` error code, these errors imply that the message has not been successfully executed. For `SYS_UNKNOWN`, follow the guidelines in the [safe retries and idempotency](../canister-calls/idempotency.md) document to handle this scenario correctly. -For full details on access control patterns, see [access management](access-management.md). +## Be aware of the risks involved in calling untrustworthy canisters -## Handling rejected calls +### Security concern -Inter-canister calls can be rejected for reasons beyond your control: the callee may have trapped, run out of cycles, been stopped, or the system may have rejected the message due to resource pressure. Unhandled rejections trap your canister. +- If inter-canister calls are made to potentially malicious canisters, this can lead to DoS issues, or there could be issues related to candid decoding. Also, the data returned from a canister call could be assumed to be trustworthy when it is not. -Always handle the error result of an inter-canister call. +- When a canister `C1` calls a canister `C2` using an unbounded-wait (guaranteed-response) inter-canister call, and `C2` stalls the response indefinitely by not responding, the result would be a DoS on `C1`. Additionally, since the call registers a callback on `C1`, `C1` can no longer be stopped because of the outstanding callback, and thus can no longer be cleanly upgraded. Recovery would require wiping the state of the canister by reinstalling it. Note that even if `C2` was trustworthy it could still stall indefinitely. This could happen due to a bug in `C2` (which may be unlikely to occur). 
But other causes could be a stall of the subnet hosting `C2` (assuming that `C1` and `C2` are on different subnets), or `C2` making a downstream call to an untrusted canister `C3`. -**Motoko:** use `try/catch`: +- In summary, this can DoS a canister, consume an excessive amount of resources, or lead to logic bugs if the behavior of the canister depends on the inter-canister call response. -```motoko -import Error "mo:core/Error"; -import Result "mo:core/Result"; +### Recommendation -// Inside your persistent actor class { ... } -// Replace otherCanister with your canister reference. +- Making inter-canister calls to trustworthy canisters is safe, except for the (possibly unlikely) case that there is a bug in the callee or its subnet that makes it stall for a long time. -public shared func callSomething() : async Result.Result { - try { - let result = await otherCanister.someMethod(); - #ok(result) - } catch (e) { - #err("call failed: " # Error.message(e)) - } -}; -``` +- Interacting with untrustworthy canisters is still possible by using a state-free proxy canister which could easily be re-installed if it is attacked as described above and is stuck. When the proxy is reinstalled, the caller obtains an error response to the open calls. -**Rust:** handle the `Result` from `Call::bounded_wait`: +- Sanitize data returned from inter-canister calls. -```rust -use ic_cdk::update; -use ic_cdk::call::Call; -use candid::Principal; +- See the "Talking to malicious canisters" section in [how to audit an ICP canister](https://www.joachim-breitner.de/blog/788-How_to_audit_an_Internet_Computer_canister). -// Replace other_canister_id() with your canister's ID lookup. +- See [current limitations of the Internet Computer](https://wiki.internetcomputer.org/wiki/Current_limitations_of_the_Internet_Computer), section "Calling potentially malicious or buggy canisters can prevent canisters from upgrading." 
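The "sanitize data" recommendation can be as simple as validating shape and bounds before a response from another canister touches your state. A minimal sketch (the field name and limits are assumptions, not part of any real API):

```rust
// Validate a token symbol returned by an untrusted canister before storing
// or displaying it. Limits are illustrative.
fn sanitize_symbol(raw: &str) -> Result<String, String> {
    const MAX_LEN: usize = 16;
    if raw.len() > MAX_LEN {
        return Err("symbol too long".to_string());
    }
    if !raw.chars().all(|c| c.is_ascii_alphanumeric()) {
        return Err("unexpected characters in symbol".to_string());
    }
    Ok(raw.to_string())
}
```

The same idea applies to numeric fields (range checks) and collections (length caps), so that a malicious response cannot blow up memory use or smuggle unexpected content into certified state.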
-#[update] -async fn call_something() -> Result { - let response = Call::bounded_wait(other_canister_id(), "some_method") - .await - .map_err(|e| format!("call rejected: {:?}", e))?; - response.candid::() - .map_err(|e| format!("decode failed: {:?}", e)) -} -``` +## Make sure there are no loops in call graphs -## Summary checklist +### Security concern -Before shipping any canister that makes inter-canister calls: +Loops in the call graph (e.g., canister A calling B, B calling C, C calling A) may lead to canister deadlocks. -- **Reentrancy:** Apply CallerGuard (per-caller lock) to any method that makes an inter-canister call and reads or writes shared state. -- **State ordering:** Deduct or commit before `await`; compensate on failure in the callback. -- **Cleanup:** Use `finally` (Motoko) or `Drop` (Rust) for locks and cleanup that must always run. -- **Wait type:** Use `bounded_wait` for calls to canisters you do not control; `unbounded_wait` only for your own canisters. -- **Payload size:** Keep request and response payloads under 1 MB; paginate larger datasets. -- **Caller capture:** In Rust, bind `msg_caller()` before the first `await`. -- **Access control:** Do not rely on `canister_inspect_message` for inter-canister call security: always check the caller inside the method. -- **Error handling:** Always handle the `Result` of every inter-canister call. +### Recommendation -## Next steps +- Avoid such loops, or rely on bounded-wait calls instead, since these provide timeouts. 
-- [Inter-canister calls](../canister-calls/inter-canister-calls.md#making-calls): Basic inter-canister call patterns and the `Call` API -- [Parallel inter-canister calls](../canister-calls/parallel-inter-canister-calls.md): Running multiple calls concurrently and handling partial failures -- [Security concepts](../../concepts/security.md): IC security model and threat landscape +- For more information, see [current limitations of the Internet Computer](https://wiki.internetcomputer.org/wiki/Current_limitations_of_the_Internet_Computer), section "Loops in call graphs." - + diff --git a/docs/guides/security/miscellaneous.md b/docs/guides/security/miscellaneous.md new file mode 100644 index 0000000..792e737 --- /dev/null +++ b/docs/guides/security/miscellaneous.md @@ -0,0 +1,280 @@ +--- +title: "Miscellaneous Security Practices" +description: "Miscellaneous security best practices: data confidentiality, secure randomness, endpoint validation, testing, reproducible builds, monotonic time, and floating point." +sidebar: + order: 11 +--- + +## Data confidentiality on ICP + +### Security concern + +When storing data on ICP, there are two levels of data access. + +1. Nodes are able to read all data that is stored on a subnet. This includes all messages sent to or from a canister, along with all data stored in a canister. This means a node could extract all data available to a canister. This will change with the implementation of TEE-based security for nodes. + +2. End user clients can only access whatever data that nodes and canisters have made available to them. If the subnet's nodes do not misbehave and leak data, clients can only read the responses to ingress messages and queries that they have sent. The canister decides what data is exposed to the client. + +Partial information on data that is stored in the subnet state tree will always leak. 
Therefore, data with low entropy, such as a Boolean value that can only be "True" or "False," may leak entirely and be fully exposed. Leakage of data with high entropy is negligible.

+There are two types of user-related data that may be stored in the subnet state tree. The first is when a user sends an ingress message to a canister; the message hash and the response are both stored in the subnet state tree so that the client can retrieve them securely. The ingress message should contain a high-entropy nonce, which is added by the agent and typically not exposed to the user. The message response, however, is determined by the canister and may not contain a high-entropy value. If the canister response consists of a low-entropy value, then the data may be leaked to users other than the ingress message sender.

+The second type of user-related data is certified variables maintained by a canister, which are also exposed through the subnet state tree. If a canister places low-entropy data into the state tree, then the data may leak to users who should not have access to that piece of data.

+### Recommendation

+Developers who need to protect the confidentiality of their data against external users should ensure that data in the subnet state tree has a sufficient level of entropy; 128 bits is recommended. If the data does not have enough entropy itself, it is recommended to pad it with artificial random data.

+In particular, a canister can ensure that responses to ingress messages do not leak data to external users other than the sender by including high-entropy data in the response. Similarly, a canister can ensure that data in certified variables is not leaked by adding high-entropy data to the variables that should be kept confidential.

+Additionally, similarly to ingress message responses, a canister's private custom sections that contain low-entropy data could leak to unauthorized users.
Therefore, a sufficient level of entropy should also be used for canister private custom sections; 128 bits is recommended. If the data does not have enough entropy itself, it is recommended to pad it with artificial random data.

+## Using secure randomness in canisters

+Canister developers often require access to secure randomness in their canisters to perform certain operations. The requirements for a secure randomness source include:
+* **Unbiased:** The value shouldn't be influenced by anyone.
+* **Unpredictable:** The value is unknown to anyone before it is generated.

+ICP exposes the system API [`raw_rand`](../../references/ic-interface-spec/management-canister.md#ic-raw_rand) for this exact purpose, which accepts no input and returns 32 bytes of cryptographically secure randomness. It is always recommended to use `raw_rand` as the source of randomness in canisters and to **avoid** sources with low entropy, such as the current time.

+To illustrate the usage of `raw_rand`, two examples in Motoko and Rust can be found below, along with the benefits and caveats of each approach.

+### 1. Direct usage of `raw_rand` as the random number generator

+In this Motoko example, the canister provides the requested number of random bytes by calling `raw_rand`. However, a single call can only produce 32 bytes of secure randomness, so subsequent calls to the system API are required to fill the requested size.
+
+```motoko
+import Random "mo:core/Random";
+import Array "mo:core/Array";
+
+actor Randomness {
+  public func random_bytes(n : Nat) : async [Nat8] {
+    let byteArray : [var Nat8] = Array.init(n, 0);
+    let entropy = await Random.blob();
+    var f = Random.Finite(entropy);
+    var i = 0;
+    loop {
+      if (i == n) {
+        return Array.freeze(byteArray);
+      } else {
+        switch (f.byte()) {
+          case (?byte) {
+            byteArray[i] := byte;
+            i := i + 1;
+          };
+          case null {
+            let entropy = await Random.blob();
+            f := Random.Finite(entropy);
+          };
+        };
+      };
+    };
+  };
+};
+```
+
+#### Benefits:
+- The returned bytes are guaranteed to be cryptographically secure.
+
+#### Caveats:
+- The method doesn't scale when a large number of random bytes is requested, as `raw_rand` must be called for every 32 bytes.
+
+### 2. Using `raw_rand` as a seed for a pseudo random number generator (PRNG)
+
+In this Rust example, the output of `raw_rand` seeds a well-known PRNG (ChaCha20) in the `init` and `post_upgrade` hooks, and randomness is generated by calling the `random_bytes` method.
+
+```rust
+use candid::{CandidType, Principal};
+use rand_chacha::rand_core::{RngCore, SeedableRng};
+use rand_chacha::ChaCha20Rng;
+use std::cell::RefCell;
+use std::time::Duration;
+
+thread_local!
{
+    static RNG: RefCell<Option<ChaCha20Rng>> = RefCell::new(None);
+}
+
+const SEEDING_INTERVAL: Duration = Duration::from_secs(3600);
+
+#[derive(CandidType)]
+enum RngError {
+    RngNotInitialized(String),
+}
+
+type RandomBytesResult = Result<String, RngError>;
+
+async fn seed_randomness() {
+    let (seed,): ([u8; 32],) = ic_cdk::call(Principal::management_canister(), "raw_rand", ())
+        .await
+        .expect("Failed to call the management canister");
+    RNG.with_borrow_mut(|rng| *rng = Some(ChaCha20Rng::from_seed(seed)));
+}
+
+fn schedule_seeding(duration: Duration) {
+    ic_cdk_timers::set_timer(duration, || {
+        ic_cdk::spawn(async {
+            seed_randomness().await;
+            // Schedule reseeding on a timer with duration SEEDING_INTERVAL
+            schedule_seeding(SEEDING_INTERVAL);
+        })
+    });
+}
+
+#[ic_cdk::init]
+fn init() {
+    // Initialize randomness during canister install or reinstall
+    schedule_seeding(Duration::ZERO);
+}
+
+#[ic_cdk::post_upgrade]
+fn post_upgrade() {
+    // Initialize randomness after a canister upgrade
+    schedule_seeding(Duration::ZERO);
+}
+
+// This must always be an update method or the PRNG state won't be updated
+#[ic_cdk::update]
+fn random_bytes(size: u32) -> RandomBytesResult {
+    let mut buf = vec![0; size as usize];
+    RNG.with_borrow_mut(|rng| match rng.as_mut() {
+        Some(rand) => {
+            rand.fill_bytes(&mut buf);
+            Ok(hex::encode(buf))
+        }
+        None => Err(RngError::RngNotInitialized(
+            "Randomness is not initialized. Please try again later".to_string(),
+        )),
+    })
+}
+```
+
+#### Benefits:
+- This method scales to large amounts of random bytes, as `raw_rand` needs to be called only once; subsequent PRNG computation is local to the canister.
+
+#### Caveats:
+- `seed_randomness` must **always** be scheduled in both the `init` and `post_upgrade` hooks, as `init` [is not invoked during a canister upgrade](../../references/ic-interface-spec/canister-interface.md#system-api-upgrades).
+
+- The `init` and `post_upgrade` hooks don't allow async calls, so a timer is scheduled immediately to seed the randomness.
+- Once the seed is initialized, the outcome of all future `random_bytes` calls is predictable to anyone who has the seed (e.g., node providers), as the PRNG is deterministic. This breaks the unpredictability property of secure randomness. Hence, to balance security and performance, the PRNG should be reseeded frequently on a timer. The example above already does this with an interval of **1 hour**; developers can choose an interval appropriate to the sensitivity of their dapp by setting `SEEDING_INTERVAL`.
+- `random_bytes` must **always** be an `update` method so that the advanced PRNG state is persisted and every request receives unique randomness.
+
+## Verify that your canister doesn't export malicious endpoints
+
+### Security concern
+
+Malicious code, introduced for example through a compromised library in a supply chain attack, could export unintended canister endpoints, potentially leading to exposure of sensitive data, malicious canister state changes, or denial of service.
+
+### Recommendation
+
+Verify in your CI pipeline that the endpoints a canister's WASM exports are legitimate and intended. [The `ic-wasm check-endpoints` command can be used for that purpose](https://github.com/dfinity/ic-wasm/blob/main/README.md#check-endpoints).
+
+## Test your canister code even in the presence of system API calls
+
+### Security concern
+
+Because canisters interact with the system API, their code is harder to test: unit tests cannot call the system API. This may lead to a lack of unit tests.
+
+### Recommendation
+
+- Create loosely coupled modules that do not depend on the system API and unit test those. See this [recommendation](https://mmapped.blog/posts/01-effective-rust-canisters.html#target-independent) (from [effective Rust canisters](https://mmapped.blog/posts/01-effective-rust-canisters.html)).
+
+- For the parts that still interact with the system API, create a thin abstraction of the system API that is faked in unit tests. See the [recommendation](https://mmapped.blog/posts/01-effective-rust-canisters.html#target-independent) (from [effective Rust canisters](https://mmapped.blog/posts/01-effective-rust-canisters.html)). For example, one can implement a `Runtime` trait as follows and then use the `MockRuntime` in tests (code by Dimitris Sarlis):
+
+```rust
+use ic_cdk::api::{
+    call::call, caller, data_certificate, id, print, time, trap,
+};
+
+#[async_trait]
+pub trait Runtime {
+    fn caller(&self) -> Result<Principal, String>;
+    fn id(&self) -> Principal;
+    fn time(&self) -> u64;
+    fn trap(&self, message: &str) -> !;
+    fn print(&self, message: &str);
+    fn data_certificate(&self) -> Option<Vec<u8>>;
+    (...)
+}
+
+#[async_trait]
+impl Runtime for RuntimeImpl {
+    fn caller(&self) -> Result<Principal, String> {
+        let caller = caller();
+        // The anonymous principal is not allowed to interact with the canister.
+        if caller == Principal::anonymous() {
+            Err(String::from(
+                "Anonymous principal not allowed to make calls.",
+            ))
+        } else {
+            Ok(caller)
+        }
+    }
+
+    fn id(&self) -> Principal {
+        id()
+    }
+
+    fn time(&self) -> u64 {
+        time()
+    }
+
+    (...)
+}
+
+pub struct MockRuntime {
+    pub caller: Principal,
+    pub canister_id: Principal,
+    pub time: u64,
+    (...)
+}
+
+#[async_trait]
+impl Runtime for MockRuntime {
+    fn caller(&self) -> Result<Principal, String> {
+        Ok(self.caller)
+    }
+
+    fn id(&self) -> Principal {
+        self.canister_id
+    }
+
+    fn time(&self) -> u64 {
+        self.time
+    }
+
+    (...)
+}
+```
+
+## Make canister builds reproducible
+
+### Security concern
+
+It should be possible to verify that a canister does what it claims to do. ICP provides a SHA256 hash of the deployed WASM module. For this to be useful, the canister build has to be reproducible.
+
+### Recommendation
+
+Make canister builds reproducible.
See this [recommendation](https://mmapped.blog/posts/01-effective-rust-canisters.html#reproducible-builds) (from [effective Rust canisters](https://mmapped.blog/posts/01-effective-rust-canisters.html)). See also the [developer docs on reproducible builds](../canister-management/reproducible-builds.md).
+
+## Don't rely on time being strictly monotonic
+
+### Security concern
+
+The time read from the system API is monotonic but not strictly monotonic: two subsequent calls can return the same time, which could lead to security bugs when the time API is used.
+
+### Recommendation
+
+See the "Time is not strictly monotonic" section in [How to audit an ICP canister](https://www.joachim-breitner.de/blog/788-How_to_audit_an_Internet_Computer_canister).
+
+## Rust: Avoid floating point arithmetic for financial information
+
+### Security concern
+
+Floats in Rust may behave unexpectedly: there can be an undesirable loss of precision under certain circumstances. When dividing by zero, the result could be `-inf`, `inf`, or `NaN`. When converting to an integer, this can lead to unexpected results (there is no `checked_div` for floats).
+
+### Recommendation
+
+Use [`rust_decimal::Decimal`](https://docs.rs/rust_decimal/latest/rust_decimal/) or [`num_rational::Ratio`](https://docs.rs/num-rational/latest/num_rational/). `Decimal` uses a fixed-point representation with base-10 denominators, and `Ratio` represents rational numbers. Both implement `checked_div` to handle division by zero, which is not available for floats. Numbers in common use, like 0.1 and 0.2, can be represented more intuitively with `Decimal` and exactly with `Ratio`. Rounding oddities like `0.1 + 0.2 != 0.3`, which happen with floats in Rust (see [0.30000000000000004.com](https://0.30000000000000004.com/)), do not arise with `Decimal`. With `Ratio`, the desired precision can be made explicit. Although precision still has to be managed with either type, the arithmetic becomes easier to reason about.
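To make these pitfalls concrete, here is a small, self-contained Rust sketch (standard library only, not tied to any canister API) reproducing the float behaviors described above:

```rust
fn main() {
    // Binary floats cannot represent 0.1 or 0.2 exactly,
    // so the classic rounding oddity appears:
    assert!(0.1 + 0.2 != 0.3);

    // Division by zero does not fail; it silently yields inf or NaN:
    let inf = 1.0_f64 / 0.0;
    let nan = 0.0_f64 / 0.0;
    assert!(inf.is_infinite());
    assert!(nan.is_nan());

    // Converting such values to an integer saturates silently:
    // NaN becomes 0, and inf becomes i64::MAX.
    assert_eq!(nan as i64, 0);
    assert_eq!(inf as i64, i64::MAX);

    // Integer arithmetic, by contrast, makes failure explicit:
    assert_eq!(1_i64.checked_div(0), None);
}
```

In a financial context, any of these silent behaviors can corrupt balances; `Decimal` or `Ratio` surfaces them as explicit errors instead.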
+ + diff --git a/docs/guides/security/observability-and-monitoring.md b/docs/guides/security/observability-and-monitoring.md new file mode 100644 index 0000000..79477d5 --- /dev/null +++ b/docs/guides/security/observability-and-monitoring.md @@ -0,0 +1,22 @@ +--- +title: "Observability and Monitoring" +description: "Security best practices for monitoring canister cycles, logs, and health indicators." +sidebar: + order: 9 +--- + +## Monitor your canister + +### Security concern + +Without monitoring, it can be hard to detect attacks or vulnerabilities that are being actively exploited. For example, a sudden increase in cycles consumption could indicate a DoS attack, while unexpected changes in canister state could indicate a security breach. + +### Recommendation + +- Monitor your canister's cycles balance regularly, set up alerts for sudden changes in cycles consumption, and add an endpoint to expose health indicators. See the [DoS prevention best practices](./dos-prevention.md) for more context on cycles monitoring. + +- Consider emitting logs for security-relevant events (e.g., access control failures, unexpected state transitions). Since logs are stored on-chain, they provide a tamper-resistant audit trail. + +- See [effective Rust canisters](https://mmapped.blog/posts/01-effective-rust-canisters.html) for general patterns on canister observability. + + diff --git a/docs/guides/security/overview.md b/docs/guides/security/overview.md new file mode 100644 index 0000000..1fbf3d3 --- /dev/null +++ b/docs/guides/security/overview.md @@ -0,0 +1,54 @@ +--- +title: "Security Overview" +description: "Introduction to the ICP security best practices for canister and web app developers." +sidebar: + order: 1 +--- + +This section provides security best practices for developing canisters and web apps served by canisters on ICP. These best practices are mostly inspired by issues found in security reviews. 
+
+The goal of these best practices is to enable developers to identify and address potential issues early during the development of new dapps, rather than only at the end, when (if at all) a security review is performed. Ideally, this makes the development of secure dapps more efficient.
+
+Several of the best practices linked here come from [effective Rust canisters](https://mmapped.blog/posts/01-effective-rust-canisters.html) and [how to audit an ICP canister](https://www.joachim-breitner.de/blog/788-How_to_audit_an_Internet_Computer_canister). The relevant sections are linked in the individual best practices.
+
+## Target audience
+
+The target audience for these documents is any developer working on ICP canisters or web apps, as well as anyone who reviews such code.
+
+## Disclaimers and limitations
+
+The collection of best practices may grow over time. While such a list is useful for improving the security of dapps on ICP, it will never be complete and will never cover all potential security concerns. For example, there will always be attack vectors very specific to a dapp's use cases that cannot be covered by general best practices. Thus, following the best practices can complement, but not replace, security reviews. Especially for security-critical dapps, it is recommended to perform security reviews or audits. Furthermore, please note that the best practices are currently not ordered according to risk or priority.
+
+## Further reading
+
+Below are resources covering security best practices for technologies commonly used in ICP dapps. These are just as important as the ICP-specific guidelines and should be studied carefully.
+
+### General
+* [How to audit an Internet Computer canister](https://www.joachim-breitner.de/blog/788-How_to_audit_an_Internet_Computer_canister) by Joachim Breitner
+* [OWASP application security verification standard](https://owasp.org/www-project-application-security-verification-standard/)
+* [OWASP top ten](https://owasp.org/www-project-top-ten/)
+
+### Rust
+* [Secure Rust guidelines](https://anssi-fr.github.io/rust-guide/01_introduction.html), in particular [unsafe code](https://anssi-fr.github.io/rust-guide/04_language.html#unsafe-code), [overflows](https://anssi-fr.github.io/rust-guide/04_language.html#integer-overflows) and [Cargo-audit](https://anssi-fr.github.io/rust-guide/03_libraries.html#cargo-audit)
+  * For overflowing operations, consider using the `saturating` or `checked` variants, such as `saturating_add`, `saturating_sub`, `checked_add`, `checked_sub`. See the [Rust docs](https://doc.rust-lang.org/std/primitive.u32.html#method.saturating_add) for `u32`.
+
+### Crypto
+* [OWASP cryptographic failures](https://owasp.org/Top10/A02_2021-Cryptographic_Failures/) points out issues related to cryptography, or the lack thereof.
+* [OWASP application security verification standard](https://owasp.org/www-project-application-security-verification-standard/) (see Section V6)
+* **Use the [Web Crypto API](https://developer.mozilla.org/en-US/docs/Web/API/Web_Crypto_API).** Storing key material in the browser storage (such as [sessionStorage](https://developer.mozilla.org/en-US/docs/Web/API/Window/sessionStorage) or [localStorage](https://developer.mozilla.org/en-US/docs/Web/API/Window/localStorage)) is considered unsafe because these keys can be accessed by JavaScript code, e.g. in an XSS attack. To protect the private key from direct access, use Web Crypto's [generateKey](https://developer.mozilla.org/en-US/docs/Web/API/SubtleCrypto/generateKey) with `extractable=false`.
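To illustrate the overflow-handling variants recommended in the Rust section above, a minimal standard-library sketch:

```rust
fn main() {
    // Saturating arithmetic clamps at the numeric bounds instead of wrapping:
    assert_eq!(u32::MAX.saturating_add(1), u32::MAX);
    assert_eq!(0_u32.saturating_sub(1), 0);

    // Checked arithmetic surfaces overflow explicitly as `None`:
    assert_eq!(u32::MAX.checked_add(1), None);
    assert_eq!(2_u32.checked_add(3), Some(5));

    // Plain `+` panics in debug builds and silently wraps in release
    // builds, so prefer the variants above for security-relevant values.
}
```

Which variant to prefer depends on the context: `checked_*` forces the caller to handle the overflow case, while `saturating_*` is appropriate when clamping is an acceptable outcome.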
+
+### Web security {#web-security}
+* Resources for setting security headers:
+  * [securityheaders.com](https://securityheaders.com/)
+  * [Permissions policy generator](https://www.permissionspolicy.com/)
+  * [Content security policy evaluator](https://csp-evaluator.withgoogle.com/) and [strict CSP](https://csp.withgoogle.com/docs/strict-csp.html)
+  * [OWASP secure headers project](https://owasp.org/www-project-secure-headers/)
+* [SSL server test](https://www.ssllabs.com/ssltest/)
+* Don't use features that could lead to an XSS vulnerability, such as [@html in Svelte](https://svelte.dev/docs#template-syntax-html).
+* **Log out securely.** On logout, clear all session data, especially [sessionStorage](https://developer.mozilla.org/en-US/docs/Web/API/Window/sessionStorage) and [localStorage](https://developer.mozilla.org/en-US/docs/Web/API/Window/localStorage), and clear [IndexedDB](https://developer.mozilla.org/en-US/docs/Web/API/IndexedDB_API). Make sure other browser tabs showing the same origin are logged out if the logout is triggered in one tab. This may not happen automatically when the ICP JavaScript agent is used, since the agent keeps the private key in memory once initialized.
+
+### Testing
+* In [effective Rust canisters](https://mmapped.blog/posts/01-effective-rust-canisters.html): [test upgrades](https://mmapped.blog/posts/01-effective-rust-canisters.html#test-upgrades), [make code target-independent](https://mmapped.blog/posts/01-effective-rust-canisters.html#target-independent)
+* Consider [PocketIC](../testing/pocket-ic.md) for canister testing
+
+
diff --git a/docs/references/message-execution-properties.md b/docs/references/message-execution-properties.md
new file mode 100644
index 0000000..6aecbf8
--- /dev/null
+++ b/docs/references/message-execution-properties.md
@@ -0,0 +1,78 @@
+---
+title: "Properties of Message Executions on ICP"
+description: "The 11 properties of message execution on ICP, covering atomicity, ordering guarantees, inter-canister call delivery, and cycle handling for bounded-wait and unbounded-wait calls."
+---
+
+## Asynchronous messaging model
+
+ICP relies on an asynchronous messaging model. Compared to a synchronous model such as Ethereum's, this provides performance advantages because multiple calls can be executed concurrently, and even in parallel when multiple canisters are involved. However, asynchronous message execution can also lead to unexpected or unintuitive behavior, so it is important to understand its properties. Potential security issues that arise in this model, such as reentrancy bugs, are discussed in the [security best practices on inter-canister calls](../guides/security/inter-canister-calls.md).
+
+The [community conversation on security best practices](https://www.youtube.com/watch?v=PneRzDmf_Xw&list=PLuhDt1vhGcrez-f3I0_hvbwGZHZzkZ7Ng&index=2&t=4s) also discusses the messaging properties.
+
+## Message execution properties
+
+A canister exposes two primary types of **entry points**, or **methods**, that can be called: update methods and query methods.
If the Rust CDK is used, these are usually annotated with `#[update]` or `#[query]`, respectively. In Motoko, updates are declared as `public func`, and queries use the dedicated keyword `public query func`.
+
+These entry points can be called either by external users through the IC's HTTP interface or by other canisters. ICP also supports additional [entry points](./ic-interface-spec/canister-interface.md#entry-points) such as heartbeats, timers, and initialization or upgrade hooks. These cannot be called directly, but the properties listed in this document are still relevant for them; in particular, heartbeats and timers behave like update methods that are called by the system.
+
+A **message execution** is a set of consecutive instructions that a subnet executes when a canister's method is invoked. The code execution for any such method can be split into several message executions if the method makes inter-canister calls. The following properties are essential:
+
+- **Property 1**: Only a single message execution runs at a time per canister. Message execution within a single canister is atomic and sequential, never parallel.
+
+Note that parallel message execution across multiple canisters is possible; this property concerns only a single canister.
+
+- **Property 2**: Each downstream call that a canister makes, query or update, triggers a message. When using `await` on the response from an inter-canister call, the code after the `await` (the callback, highlighted in blue) is executed as a separate message execution.
+
+  For example, consider the following Motoko code:
+
+  ![example_highlighted_code](/img/docs/references/example_highlighted_code.png)
+
+  The first message execution spans lines 2-3, until the inter-canister call is made using the `await` syntax (orange box). The second message execution spans lines 3-5, when the inter-canister call returns (blue box). This part is called the _callback_ of the inter-canister call.
The two message executions involved in this example will always be scheduled sequentially.
+
+:::note
+An `await` in the code does not necessarily mean that an inter-canister call is made, i.e., that the message execution ends and the code after the `await` runs as a separate message execution (callback). Async code with the `await` syntax (e.g., in Rust or Motoko) can also be used "internally" in the canister, without issuing an inter-canister call. In that case, the code including the `await` is processed within a single message execution. In Rust, both cases are possible when `await` is used: an inter-canister call is only made if the system API `ic0.call_perform` is invoked, e.g., when awaiting the result of the CDK's `call` method. In Motoko, `await` always commits the current state and triggers a new message send, while `await*` does not necessarily commit the current state or trigger new message sends. See [Motoko actors and async programming](../languages/motoko/fundamentals/actors-async.md) for details on `await` vs. `await*`.
+:::
+
+:::note
+In the Rust CDK, it is the `await` expression that triggers the canister call, rather than invoking the `call` function itself. That is, a call that is not `await`-ed is never executed. In Motoko, if the code does not `await` the response of a call, the call is still made, but the code after the call is executed in the same message execution, until the next inter-canister call is triggered using `await`. Also, multiple outgoing calls can be triggered in parallel from the same message execution; see the [parallel calls examples](https://github.com/dfinity/examples).
+:::
+
+- **Property 3**: Successfully delivered requests are received in the order in which they were sent. In particular, if a canister A sends `m1` and `m2` to canister B in that order, then, if both are accepted, `m1` is executed before `m2`.
+ +:::note +This property only gives a guarantee on when the request messages are executed, but there is no guarantee on the ordering of the responses received. +::: + +- **Property 4**: Multiple message executions, e.g., from different calls, can interleave and have no reliable execution ordering. + + Property 3 provides a guarantee on the order of message executions on a target canister. However, if multiple calls interleave, one cannot assume additional ordering guarantees for these interleaving calls. To illustrate this, let's consider the above example code again, and assume the method `example` is called twice in parallel, the resulting calls being Call 1 and Call 2. The following illustration shows two possible message execution orderings. On the left, the first call's message executions are scheduled first, and only then the second call's messages are executed. On the right, you can see another possible message execution scheduling, where the first messages of each call are executed first. Your code should result in a correct state regardless of the message execution ordering. + + ![example_orderings](/img/docs/references/example_orderings.png) + +- **Property 5**: On a trap or panic, modifications to the canister state for the current message execution are not applied. + + For example, if a trap occurs in the execution of the second message (blue box) of the above example, canister state changes resulting from that message execution, i.e. everything in the blue box, are discarded. However, note that any state changes from earlier message executions and in particular the first message execution (orange box) have been applied, as that message executed successfully. + +- **Property 6**: Inter-canister calls are not guaranteed to be delivered to the destination canister, but they are guaranteed to be delivered at most once. When a call does reach the destination canister, the destination canister may trap or return a reject response while processing the call. 
+
+  There are many reasons why a call might not be delivered to the destination canister. Some of them are under the control of the canister developers, such as ensuring that the destination has sufficient cycles to process the call. Others are not: the Internet Computer may decide to fail the call under high load.
+
+- **Property 7**: Every inter-canister call is guaranteed to receive a single response, either from the callee or synthetically produced by the protocol.
+
+  However, a malicious destination canister could choose to delay the response for arbitrarily long if it is willing to put in the required cycles. Also, the response does not have to be successful; it can also be a reject response. The reject may come from the called canister, but it may also be generated by ICP. Such protocol-generated rejects can occur at any time before the call reaches the callee canister, as well as once the call does reach the callee canister, for example if the callee canister traps while processing the call. Thus, it's important that the calling canister handles reject responses as well. A reject response means that the message hasn't been successfully processed by the receiver, but it doesn't guarantee that the receiver's state wasn't changed.
+
+- **Property 8**: If the calling canister makes an unbounded-wait (as opposed to a bounded-wait) inter-canister call, and the call is delivered to the callee and the callee responds without trapping, the protocol guarantees that the first such response gets back to the caller canister. Otherwise, the caller receives a reject response (with a code that is never `SYS_UNKNOWN`).
+
+- **Property 9**: If the calling canister makes a bounded-wait call, the system may generate a reject response and deliver it to the caller. This can happen even if the call is successfully delivered to the callee and the callee responds without trapping. But this can only happen if the reject response is `SYS_UNKNOWN`.
+
+  Since, by Property 7, the caller only ever receives a single response, this means that the real response from the callee, if produced, is dropped by the system when `SYS_UNKNOWN` is returned.
+
+- **Property 10**: If cycles are attached to an unbounded-wait call, the sum of the cycles accepted by the callee and those refunded to the caller equals the amount that the caller attached.
+
+- **Property 11**: If cycles are attached to a bounded-wait call, they may get lost whenever the caller receives a `SYS_UNKNOWN` response. In particular, any cycles refunded by the callee are lost, and if the callee doesn't receive the call, all attached cycles are lost. If any response other than `SYS_UNKNOWN` is received, Property 10 holds even for bounded-wait calls.
+
+  Note that cycles do not necessarily get lost: even if the caller receives `SYS_UNKNOWN`, the callee may still have received the sent cycles.
+
+For more details, refer to the [IC Interface Specification abstract behavior](./ic-interface-spec/abstract-behavior.md), which defines message execution in more detail.
+ + diff --git a/public/img/docs/references/example_highlighted_code.png b/public/img/docs/references/example_highlighted_code.png new file mode 100644 index 0000000..e7b1a1f Binary files /dev/null and b/public/img/docs/references/example_highlighted_code.png differ diff --git a/public/img/docs/references/example_orderings.png b/public/img/docs/references/example_orderings.png new file mode 100644 index 0000000..f2f62a6 Binary files /dev/null and b/public/img/docs/references/example_orderings.png differ diff --git a/public/img/docs/security/example_trap_after_await.png b/public/img/docs/security/example_trap_after_await.png new file mode 100644 index 0000000..f611f11 Binary files /dev/null and b/public/img/docs/security/example_trap_after_await.png differ diff --git a/public/img/docs/security/ii_mobile_delegation_chain.png b/public/img/docs/security/ii_mobile_delegation_chain.png new file mode 100644 index 0000000..3f12f22 Binary files /dev/null and b/public/img/docs/security/ii_mobile_delegation_chain.png differ diff --git a/sidebar.mjs b/sidebar.mjs index 4ef125b..e99f28e 100644 --- a/sidebar.mjs +++ b/sidebar.mjs @@ -129,6 +129,7 @@ export const sidebar = [ { slug: "references/cycles-costs" }, { slug: "references/subnet-types" }, { slug: "references/execution-errors" }, + { slug: "references/message-execution-properties" }, { slug: "references/http-gateway-spec" }, { slug: "references/candid-spec" }, { slug: "references/internet-identity-spec" },