
Observations from a real-world dfx → icp-cli migration #135

@marc0olo

Context

During a real-world migration of Kairos (a production ICP app) from dfx to icp-cli, an AI agent (Claude) used the ICP skills to guide the migration. The migration partially failed, leaving the project in a hybrid state — backend on icp-cli, frontend still requiring dfx.

A follow-up session completed the migration successfully, but identified several areas where the skills may not have provided sufficient guidance, potentially contributing to incorrect agent decisions. These observations are documented below as recommendations for review — they reflect one migration experience and should be validated before applying changes to the skills.

The project was originally built on the Caffeine platform (managed ICP deployment service) and exported for standalone deployment. It uses a Motoko backend with enhanced orthogonal persistence and a React 19 + TypeScript frontend.


Observation 1: Asset canister version drift can cause upgrade failures

Affected skills: asset-canister, icp-cli

What we observed: The project's config pinned assetstorage.wasm v0.27.0 via type: pre-built. The developer had since deployed with dfx (which installed a newer version on-chain). When the agent migrated to icp-cli, it carried over the v0.27.0 reference. Installing v0.27.0 onto a canister running a newer version caused a "Cannot parse header" panic.

Possible gap: Mistake #7 in the asset-canister skill warns about versions below 0.30.2 for the ic_env cookie, but doesn't mention that downgrading an asset canister version causes stable memory parse failures. Agents have no way to query the on-chain version, so they can't detect this mismatch.

Recommendation to review: Consider adding a note that asset canister stable memory is not backwards-compatible, and that when migrating from dfx, the pinned version should match or exceed what was previously deployed. Since there's no way to query the running version, using the latest available version is the safest default when the on-chain version is unknown.
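
The "match or exceed" rule above can be sketched as a small pre-deploy check. This is a minimal illustration, not part of icp-cli or any skill: the helper names (parseVersion, compareVersions, isSafeAssetCanisterPin) are hypothetical, and the last deployed version has to come from the developer's own records because it cannot be queried on-chain.

```typescript
// Hypothetical helper: turns "0.27.0" or "v0.27.0" into [0, 27, 0].
function parseVersion(version: string): number[] {
  return version
    .replace(/^v/, "")
    .split(".")
    .map((part) => {
      const n = Number.parseInt(part, 10);
      if (Number.isNaN(n)) {
        throw new Error(`Invalid version segment in "${version}"`);
      }
      return n;
    });
}

// Negative if a < b, zero if equal, positive if a > b.
function compareVersions(a: string, b: string): number {
  const left = parseVersion(a);
  const right = parseVersion(b);
  const length = Math.max(left.length, right.length);
  for (let i = 0; i < length; i++) {
    const diff = (left[i] ?? 0) - (right[i] ?? 0);
    if (diff !== 0) return diff;
  }
  return 0;
}

// Returns true only when the pin is known not to be a downgrade.
// An unknown deployed version cannot be verified, so it returns false,
// which is the case where the latest available version is the safer choice.
function isSafeAssetCanisterPin(pinned: string, lastDeployed?: string): boolean {
  if (lastDeployed === undefined) return false;
  return compareVersions(pinned, lastDeployed) >= 0;
}
```

With the versions from this migration, isSafeAssetCanisterPin("0.27.0", "0.30.2") is false, so the pin would be flagged before it reaches the canister.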


Observation 2: Frontend runtime migration from env.json to ic_env cookie lacks concrete examples

Affected skills: icp-cli, asset-canister, icp-cli/references/dfx-migration.md

What we observed: The project used env.json to pass the backend canister ID to the frontend. The dfx-migration reference says to remove output_env_file, and the icp-cli skill mentions safeGetCanisterEnv(), but the agent didn't connect these into an actual code migration. It kept the env.json approach.

Possible gap: There may not be a clear end-to-end code example showing the before/after for frontend canister ID resolution.

Recommendation to review: Consider adding a concrete migration example:

// Before (dfx): fetch canister ID from env.json
const response = await fetch('/env.json');
const { backend_canister_id } = await response.json();

// After (icp-cli): read from ic_env cookie
import { safeGetCanisterEnv } from '@icp-sdk/core/agent/canister-env';
const backendCanisterId = safeGetCanisterEnv()?.['PUBLIC_CANISTER_ID:backend'];

Implementation detail we discovered: safeGetCanisterEnv() returns IC_ROOT_KEY (uppercase) as a Uint8Array, parsed from the cookie param ic_root_key (lowercase). This casing transformation caused bugs during our migration. It may be worth documenting the exact property names and types of the returned object.
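
The casing change can be illustrated with a stand-alone sketch. It is not the library's implementation: the cookie encoding assumed here (a URL-encoded string of key=value pairs joined by &, with the root key hex-encoded) and the names CanisterEnvSketch, hexToBytes, and parseIcEnvSketch are assumptions for illustration. Only the property names ic_root_key, IC_ROOT_KEY, and PUBLIC_CANISTER_ID:backend come from the observations above.

```typescript
// Illustrative shape only; the real return type of safeGetCanisterEnv()
// should be taken from the library's own type definitions.
interface CanisterEnvSketch {
  IC_ROOT_KEY?: Uint8Array;
  [key: string]: string | Uint8Array | undefined;
}

// Decodes a hex string such as "0a0b" into bytes.
function hexToBytes(hex: string): Uint8Array {
  if (hex.length % 2 !== 0 || /[^0-9a-fA-F]/.test(hex)) {
    throw new Error("Invalid hex string");
  }
  const bytes = new Uint8Array(hex.length / 2);
  for (let i = 0; i < bytes.length; i++) {
    bytes[i] = Number.parseInt(hex.slice(i * 2, i * 2 + 2), 16);
  }
  return bytes;
}

// ASSUMED cookie format: encodeURIComponent("ic_root_key=<hex>&KEY=value&...").
function parseIcEnvSketch(cookieValue: string): CanisterEnvSketch {
  const env: CanisterEnvSketch = {};
  for (const pair of decodeURIComponent(cookieValue).split("&")) {
    const separator = pair.indexOf("=");
    if (separator < 0) continue;
    const key = pair.slice(0, separator);
    const value = pair.slice(separator + 1);
    if (key === "ic_root_key") {
      // Lowercase cookie parameter becomes the uppercase property, as bytes.
      env.IC_ROOT_KEY = hexToBytes(value);
    } else {
      env[key] = value;
    }
  }
  return env;
}
```

The point of the sketch is that code reading env['ic_root_key'] gets undefined, and code expecting a hex string from IC_ROOT_KEY gets a Uint8Array instead.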


Observation 3: fetchRootKey() replacement pattern not shown concretely

Affected skills: internet-identity, icp-cli

What we observed: The project called agent.fetchRootKey() for local development. The II skill warns against this, but it does not say what to use instead, so the agent left the call in place.

Recommendation to review: Consider showing the concrete replacement:

// Before (security risk on mainnet)
const agent = new HttpAgent({ host });
if (host.includes('localhost')) await agent.fetchRootKey();

// After: root key from ic_env cookie
import { safeGetCanisterEnv } from '@icp-sdk/core/agent/canister-env';
const rootKey = safeGetCanisterEnv()?.['IC_ROOT_KEY']; // Uint8Array or undefined
const agent = new HttpAgent({ host, rootKey });
// Mainnet: rootKey undefined → uses hardcoded IC root key
// Local: rootKey from cookie → local replica's key
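
To make the mainnet/local branch testable without a browser cookie, the option-building step can be pulled into a pure function. This is a sketch under assumptions: AgentOptionsSketch and buildAgentOptions are hypothetical names, and the env argument stands in for whatever safeGetCanisterEnv() returns.

```typescript
// Hypothetical subset of the agent options used in the snippet above.
interface AgentOptionsSketch {
  host: string;
  rootKey?: Uint8Array;
}

// Includes rootKey only when the environment actually carries one.
// On mainnet the property is omitted, so the agent falls back to its
// built-in IC root key, as described above.
function buildAgentOptions(
  host: string,
  env: { IC_ROOT_KEY?: unknown } | undefined,
): AgentOptionsSketch {
  const rootKey = env?.IC_ROOT_KEY;
  return rootKey instanceof Uint8Array ? { host, rootKey } : { host };
}
```

Assuming HttpAgent accepts these options as in the snippet above, the call site would become new HttpAgent(buildAgentOptions(host, safeGetCanisterEnv())), with no host-based check and no fetchRootKey() call left.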

Observation 4: II URL can be runtime-derived but this isn't shown

Affected skills: internet-identity, icp-cli/references/dev-server.md

What we observed: The project used vite-plugin-environment to inject II_URL at build time. With ic_env, the II URL can be derived at runtime by checking for the presence of IC_ROOT_KEY (local dev has it, mainnet doesn't). The agent didn't make this connection and kept the build-time injection.

Recommendation to review: Consider showing the runtime pattern:

import { safeGetCanisterEnv } from '@icp-sdk/core/agent/canister-env';
const isLocal = safeGetCanisterEnv()?.['IC_ROOT_KEY'] != null;
const iiUrl = isLocal ? 'http://id.ai.localhost:8000' : 'https://id.ai';

Observation 5: icp.yaml networks syntax not shown in skill

Affected skills: icp-cli

What we observed: We wrote networks as a YAML map. icp-cli rejected it — networks is an array. We had to fetch the JSON schema to fix it.

Recommendation to review: Consider adding a networks example to the icp-cli skill:

networks:
    - name: local
      mode: managed
      ii: true

Observation 6: Custom Motoko builds need guidance on mops path resolution

Affected skills: icp-cli

What we observed: The backend used a custom build script (not the Motoko recipe) because it needs special flags. The original script fell back to $(dfx cache show)/moc. After removing dfx, the build failed. The fix was to resolve the compiler path with mops toolchain bin moc and the package paths with mops sources.

Recommendation to review: For custom Motoko builds (not using the recipe), consider showing:

MOC="$(mops toolchain bin moc)"
# Print only the base package path; lines for other packages are dropped
BASE="$(mops sources | sed -n 's/^--package base //p')"

Observation 7: Migration checklist may be missing frontend runtime items

Affected skills: icp-cli/references/dfx-migration.md

What we observed: The migration reference covers config files, packages, ports, and env vars. But the frontend runtime changes (env.json → ic_env, fetchRootKey removal, II URL derivation, vite-plugin-environment removal) were missed by the agent.

Recommendation to review: Consider adding to the verification checklist:

  • env.json fetches replaced with safeGetCanisterEnv() reads
  • fetchRootKey() calls removed — root key comes from ic_env cookie
  • II URL derived at runtime (not injected via build-time env vars)
  • Build-time env injection plugins removed if no longer needed
  • No remaining process.env.CANISTER_* in frontend code
  • Vite dev server simulates ic_env cookie (per dev-server reference)
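
Several of these checklist items can be checked mechanically. The following is a minimal sketch of a source scanner, not an existing tool: the names LEGACY_PATTERNS and findLegacyPatterns are hypothetical, and the regular expressions are rough heuristics that only cover the patterns named in this issue.

```typescript
interface LegacyPattern {
  label: string;
  regex: RegExp;
}

interface Finding {
  label: string;
  line: number; // 1-based line number in the scanned source
  text: string;
}

// Heuristic patterns for dfx-era frontend code that should be removed.
const LEGACY_PATTERNS: LegacyPattern[] = [
  { label: "env.json fetch", regex: /fetch\(\s*['"`][^'"`]*env\.json['"`]/ },
  { label: "fetchRootKey() call", regex: /\.fetchRootKey\s*\(/ },
  { label: "process.env.CANISTER_* reference", regex: /process\.env\.CANISTER_\w*/ },
  { label: "vite-plugin-environment usage", regex: /vite-plugin-environment/ },
];

// Scans one file's contents and reports each line that still matches.
function findLegacyPatterns(source: string): Finding[] {
  const findings: Finding[] = [];
  source.split(/\r?\n/).forEach((text, index) => {
    for (const { label, regex } of LEGACY_PATTERNS) {
      if (regex.test(text)) {
        findings.push({ label, line: index + 1, text: text.trim() });
      }
    }
  });
  return findings;
}
```

Running this over the frontend sources after a migration gives an explicit "remove this" list; an empty result does not prove the migration is complete, since the cookie simulation and II URL items still need manual review.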

General observation: agents tend to preserve existing patterns

Across all findings, the common pattern was that the agent preserved existing code rather than replacing it with icp-cli equivalents. When it saw pre-built WASM steps, env.json reads, fetchRootKey() calls, or env injection plugins, it kept them because they appeared to be working code.

Skills appear to be effective at telling agents what to add, but may be less effective at telling them what to remove. Migration guides may benefit from explicit "remove this" directives alongside "add this" ones. This observation may be worth considering when structuring migration-related skill content.

Labels: hallucination (Skill produces incorrect or fabricated output), skill-improvement (Improvement to an existing skill)
