State Tracking

rhoopr edited this page May 26, 2026 · 11 revisions

kei uses a SQLite database to track the state of every asset across sync runs.

Database Location

The state database is stored at {data_dir}/{username}.db (default: ~/.config/kei/{username}.db).
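As a rough illustration of the path rule above, the location can be computed as follows. The function name `state_db_path` is hypothetical and not part of kei; only the `{data_dir}/{username}.db` pattern and the default directory come from this page.

```python
from pathlib import Path
from typing import Optional


def state_db_path(username: str, data_dir: Optional[Path] = None) -> Path:
    """Return {data_dir}/{username}.db, defaulting data_dir to ~/.config/kei."""
    # Hypothetical helper: kei's real path resolution may differ.
    base = data_dir if data_dir is not None else Path.home() / ".config" / "kei"
    return base / f"{username}.db"
```

For example, `state_db_path("alice")` resolves to `alice.db` under `~/.config/kei`.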

What's Tracked

For each iCloud asset:

| Field | Description |
|-------|-------------|
| asset_id | Unique iCloud asset identifier |
| status | pending, downloaded, or failed |
| checksum | SHA256 checksum from iCloud |
| filename | Original filename |
| local_path | Where the file was downloaded |
| download_attempts | Number of retry attempts |
| last_error | Error message from last failure |
| created_at | Asset creation date in iCloud |
| downloaded_at | When the file was downloaded locally |

For each sync run:

| Field | Description |
|-------|-------------|
| started_at | When the sync began |
| completed_at | When the sync finished (null if interrupted) |
| assets_seen | Total assets enumerated from API |
| assets_downloaded | Successfully downloaded count |
| assets_failed | Failed download count |
| interrupted | Whether the run was interrupted (legacy boolean; kept alongside status for back-compat) |
| status | Lifecycle: running, complete, or interrupted. A SIGKILL'd process leaves the row at running, which the next startup promotes to interrupted. |
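The startup promotion of stale running rows can be sketched with Python's sqlite3 module. This is a minimal illustration, not kei's implementation: the column types in `SYNC_RUNS_DDL`, the function name `promote_stale_runs`, and the choice to also set the legacy `interrupted` flag are all assumptions.

```python
import sqlite3

# Hypothetical DDL: column names follow the sync-run fields listed above,
# but the types and defaults are assumed for illustration.
SYNC_RUNS_DDL = """
CREATE TABLE sync_runs (
    started_at        TEXT NOT NULL,
    completed_at      TEXT,
    assets_seen       INTEGER NOT NULL DEFAULT 0,
    assets_downloaded INTEGER NOT NULL DEFAULT 0,
    assets_failed     INTEGER NOT NULL DEFAULT 0,
    interrupted       INTEGER NOT NULL DEFAULT 0,
    status            TEXT NOT NULL DEFAULT 'running'
);
"""


def promote_stale_runs(conn: sqlite3.Connection) -> int:
    """Mark runs left at 'running' by a killed process as 'interrupted'.

    Returns the number of rows promoted. Setting the legacy boolean as well
    is an assumption made to keep both columns consistent.
    """
    cur = conn.execute(
        "UPDATE sync_runs SET status = 'interrupted', interrupted = 1 "
        "WHERE status = 'running'"
    )
    conn.commit()
    return cur.rowcount
```

Calling this once at startup, before a new row is inserted for the current run, avoids promoting the run that is actually in progress.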

Benefits

Near-Instant Subsequent Syncs

On the first run, every asset is enumerated and downloaded. On subsequent runs, assets already marked as downloaded are skipped without any filesystem checks. This makes re-running kei nearly instant for unchanged libraries.

Automatic Re-Download

If the database says a file is downloaded but the file is missing from disk, it's automatically re-downloaded. This handles cases where files were accidentally deleted or moved.
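The re-download rule can be expressed as a small predicate. This is a sketch under assumptions: the function name `needs_download` is hypothetical, and this page does not say when or how kei performs the on-disk check.

```python
from pathlib import Path
from typing import Optional


def needs_download(status: str, local_path: Optional[str]) -> bool:
    """Decide whether an asset should be (re-)downloaded.

    Anything not marked 'downloaded' is attempted. An asset marked
    'downloaded' is attempted again only if its recorded file is gone.
    """
    if status != "downloaded":
        return True
    return local_path is None or not Path(local_path).exists()
```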

Failed Asset Tracking

Assets that fail to download are marked as failed and their error message is recorded. This includes assets that exhaust the retry budget configured under [download.retry] - they're moved from pending to failed with a descriptive error rather than being silently skipped.

You can view failed assets with kei status --failed. On the next sync, all failed assets are automatically reset to pending and retried with fresh attempt counts - no manual intervention needed.
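In SQL terms, the automatic reset amounts to a single UPDATE. The sketch below assumes an `assets` table with the `status` and `download_attempts` fields listed earlier; the function name `reset_failed_assets` is hypothetical, and whether kei also clears `last_error` is not stated here, so it is left untouched.

```python
import sqlite3


def reset_failed_assets(conn: sqlite3.Connection) -> int:
    """Move every failed asset back to pending with a fresh attempt count.

    Returns the number of assets reset.
    """
    cur = conn.execute(
        "UPDATE assets SET status = 'pending', download_attempts = 0 "
        "WHERE status = 'failed'"
    )
    conn.commit()
    return cur.rowcount
```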

Resume After Interruption

If a sync is interrupted (Ctrl+C, crash, reboot), the next run picks up where it left off. Assets already downloaded are skipped, and only pending/failed assets are attempted.

Incremental Sync

After the first full sync, kei stores a CloudKit syncToken for each library zone. On the next run, it calls Apple's changes/database endpoint to check whether anything changed. If nothing did, the sync completes in 1-2 API calls instead of the ~75 needed for a full library enumeration.

When changes exist, the changes/zone endpoint returns only new, modified, or deleted records since the last token. Created assets are downloaded normally. Deletions and hidden assets are logged and skipped. The token is updated after each page of results and persisted to the state DB, so interrupted incremental syncs resume from the last completed page.

Tokens are stored per-zone in the metadata table as sync_token:{zone_name}.
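A minimal sketch of the per-zone key layout follows. Only the `sync_token:{zone_name}` key format comes from this page; the `metadata` column names (`key`, `value`), the primary key on `key`, and the helper names are assumptions.

```python
import sqlite3
from typing import Optional


def token_key(zone_name: str) -> str:
    """Build the metadata key for a zone's sync token."""
    return f"sync_token:{zone_name}"


def save_token(conn: sqlite3.Connection, zone_name: str, token: str) -> None:
    # Assumes metadata(key TEXT PRIMARY KEY, value TEXT); REPLACE overwrites
    # the previous token for the same zone.
    conn.execute(
        "INSERT OR REPLACE INTO metadata (key, value) VALUES (?, ?)",
        (token_key(zone_name), token),
    )
    conn.commit()


def load_token(conn: sqlite3.Connection, zone_name: str) -> Optional[str]:
    row = conn.execute(
        "SELECT value FROM metadata WHERE key = ?", (token_key(zone_name),)
    ).fetchone()
    return row[0] if row else None
```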

Token withholding on partial failure

The sync token is only advanced when all downloads succeed. If downloads partially fail or the session expires mid-sync, the token stays at its previous value. On the next run, the same change events replay from the last good token, so nothing is silently lost.
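The withholding rule reduces to a small decision function. This is an illustrative sketch only; the function and parameter names are hypothetical.

```python
from typing import Optional


def token_to_persist(
    previous: Optional[str],
    candidate: str,
    failed_downloads: int,
    session_expired: bool,
) -> Optional[str]:
    """Return the token that should be stored after a sync attempt.

    The candidate token is adopted only when every download succeeded and
    the session stayed valid; otherwise the previous token is kept so the
    same change events replay on the next run.
    """
    if failed_downloads > 0 or session_expired:
        return previous
    return candidate
```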

Fallback behavior

If Apple rejects a stored token (expired or invalid), kei logs a warning and automatically falls back to a full enumeration. The new token from the full sync is stored for next time. No manual intervention needed.

kei also falls back to full enumeration when pending assets exist from a previous sync. The changes/zone API only returns new modifications - it can't re-enumerate assets that were already seen but not yet downloaded. Once all pending assets are resolved, incremental sync resumes.

Transient zone-level errors (THROTTLED, RETRY_LATER, etc.) don't trigger a full re-enumeration. Only InvalidToken and ZoneNotFound cause a fallback. Other errors propagate and the sync retries with the existing token on the next cycle.
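The fallback rules in this section can be summarized as two small checks. The sketch below is an assumption-laden illustration: the enum, the function names, and the representation of error codes as plain strings are hypothetical; only the two fallback error names and the pending-asset rule come from the text above.

```python
from enum import Enum
from typing import Optional


class SyncMode(Enum):
    INCREMENTAL = "incremental"
    FULL = "full"


# Only these zone-level errors trigger a full re-enumeration.
FALLBACK_ERRORS = frozenset({"InvalidToken", "ZoneNotFound"})


def choose_mode(stored_token: Optional[str], pending_assets: int) -> SyncMode:
    """Pick the sync mode before contacting the API."""
    if stored_token is None or pending_assets > 0:
        return SyncMode.FULL
    return SyncMode.INCREMENTAL


def should_fall_back(error_code: str) -> bool:
    """True if a zone error should trigger full enumeration.

    Other errors (THROTTLED, RETRY_LATER, ...) propagate instead, and the
    existing token is reused on the next cycle.
    """
    return error_code in FALLBACK_ERRORS
```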

Forcing a full scan

Clear stored sync tokens, then run sync again:

kei reset sync-token
kei sync

See kei reset sync-token.

Subcommands

The state database enables several management commands:

| Command | Description |
|---------|-------------|
| status | Show sync status and database summary |
| status --failed | List failed assets with error messages |
| sync --retry-failed | Reset failed assets to pending and re-sync |
| reset state | Delete the database and start fresh |
| reset sync-token | Clear stored sync tokens |
| import-existing | Scan local files and mark matching assets as downloaded |
| verify | Check that downloaded files still exist |
| verify --checksums | Also verify SHA256 checksums |
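As an illustration of what a checksum verification pass involves, the sketch below hashes a file with SHA-256 and compares it to an expected hex digest. The function names are hypothetical, and which stored checksum kei compares against is not specified on this page.

```python
import hashlib
from pathlib import Path


def file_sha256(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA-256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_file(path: Path, expected_hex: str) -> bool:
    """True if the file exists and its digest matches expected_hex."""
    return path.is_file() and file_sha256(path) == expected_hex.lower()
```

Reading in chunks keeps memory use flat for large videos.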

Import Existing Files

If you have files from a previous tool (Python icloudpd, manual download, etc.), use import-existing to populate the database:

kei import-existing --config ~/.config/kei/config.toml

This scans the download directory, matches files to iCloud assets by filename and size, and marks them as downloaded. The next sync will skip these files.
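The filename-and-size matching can be sketched as follows. This is a simplified illustration, not kei's matching logic: the function name, the `(asset_id, filename, size)` tuple shape, and the recursive directory walk are assumptions.

```python
from pathlib import Path
from typing import Iterable, List, Tuple


def match_existing(
    download_dir: Path, assets: Iterable[Tuple[str, str, int]]
) -> List[str]:
    """Return the ids of assets whose (filename, size) is already on disk.

    `assets` yields (asset_id, filename, size_in_bytes) tuples.
    """
    on_disk = {
        (p.name, p.stat().st_size)
        for p in download_dir.rglob("*")
        if p.is_file()
    }
    return [
        asset_id
        for asset_id, filename, size in assets
        if (filename, size) in on_disk
    ]
```

Assets returned by such a function would then be marked as downloaded so the next sync skips them.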

Reset State

To start fresh and re-download everything:

kei reset state --yes

This deletes the database file. The next sync will treat all assets as new.

Database Schema

The database uses SQLite. Current schema version is v7. Migrations apply automatically when upgrading kei versions; each step runs inside a SAVEPOINT so a failure rolls back only that step.
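The per-step SAVEPOINT pattern can be sketched with Python's sqlite3 module. This is a minimal illustration under assumptions: tracking the schema version in `PRAGMA user_version`, the function name, and the shape of `steps` are hypothetical and may not match how kei records its version.

```python
import sqlite3
from typing import List, Sequence, Tuple


def apply_migrations(
    conn: sqlite3.Connection, steps: Sequence[Tuple[int, List[str]]]
) -> int:
    """Apply each pending (version, statements) step inside its own SAVEPOINT.

    The connection must be opened with isolation_level=None so that Python
    does not manage transactions implicitly. Returns the final version.
    """
    current = conn.execute("PRAGMA user_version").fetchone()[0]
    for version, statements in steps:
        if version <= current:
            continue  # already applied
        conn.execute("SAVEPOINT migration_step")
        try:
            for sql in statements:
                conn.execute(sql)
            # PRAGMA does not accept bound parameters; int() guards the value.
            conn.execute(f"PRAGMA user_version = {int(version)}")
        except sqlite3.Error:
            # Undo only this step; earlier steps stay applied.
            conn.execute("ROLLBACK TO migration_step")
            conn.execute("RELEASE migration_step")
            raise
        conn.execute("RELEASE migration_step")
        current = version
    return current
```

Because SQLite DDL is transactional, a failing step leaves no half-created tables or columns behind.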

Tables:

  • assets - one row per iCloud asset, keyed by (id, version_size)
  • sync_runs - one row per sync execution, including the status lifecycle column added in v7
  • metadata - key-value store for sync tokens, schema markers, and the shared-library notice latch
  • asset_albums - many-to-many between assets and album names (added v5)
  • asset_people - face-recognition labels per asset (added v5)

Schema-version highlights:

| Version | Change |
|---------|--------|
| v1 | assets (with (id, version_size) PK) and sync_runs |
| v2 | metadata key-value table |
| v3 | assets.local_checksum column (locally-computed SHA-256) |
| v4 | assets.download_checksum column (pre-EXIF download hash) |
| v5 | Provider-agnostic metadata columns (source, is_favorite, rating, GPS, media_subtype, keywords, description, provider_data, metadata_hash, etc.) plus the asset_albums and asset_people tables. Sync tokens are invalidated on the first crossing so the metadata backfill repopulates without re-downloading files. |
| v6 | assets.metadata_write_failed_at so the metadata-only rewrite path can re-drive failed EXIF/XMP embeds on subsequent syncs. |
| v7 | sync_runs.status lifecycle column (running / complete / interrupted); existing rows are backfilled from the (completed_at, interrupted) pair. |

Related

Commands

Getting Started

Features
