State Tracking
kei uses a SQLite database to track the state of every asset across sync runs.
The state database is stored at {data_dir}/{username}.db (default: ~/.config/kei/{username}.db).
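As a rough illustration of that path template, the sketch below builds the database path from a data directory and a username. The helper name `state_db_path` is hypothetical, and it assumes the username is used verbatim as the file stem; kei may normalize it differently.

```python
from pathlib import Path


def state_db_path(username: str, data_dir: str = "~/.config/kei") -> Path:
    """Build {data_dir}/{username}.db (hypothetical helper, not kei's API)."""
    # Assumption: the username is used as-is for the file name.
    return Path(data_dir).expanduser() / f"{username}.db"
```

For example, `state_db_path("alice", "/srv/kei")` yields `/srv/kei/alice.db`.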
For each iCloud asset:
| Field | Description |
|---|---|
| `library` | CloudKit library zone, such as PrimarySync or a SharedSync-* zone |
| `asset_id` | Unique iCloud asset identifier |
| `version_size` | iCloud resource/version identifier for the downloaded variant |
| `status` | `pending`, `downloaded`, or `failed` |
| `checksum` | Apple's MMCS checksum string from iCloud, kept for provider traceability |
| `download_checksum` | SHA-256 of the downloaded bytes before metadata writes |
| `local_checksum` | SHA-256 of the final local file, used by `verify --checksums` |
| `filename` | Original filename |
| `local_path` | Where the file was downloaded |
| `download_attempts` | Number of retry attempts |
| `last_error` | Error message from last failure |
| `created_at` | Asset creation date in iCloud |
| `downloaded_at` | When the file was downloaded locally |
| `metadata_hash` | Hash of captured provider metadata for rewrite/backfill checks |
| `imported_size`, `imported_mtime` | Import snapshot used to skip rehashing unchanged adopted files |
For each sync run:
| Field | Description |
|---|---|
| `started_at` | When the sync began |
| `completed_at` | When the sync finished (null if interrupted) |
| `assets_seen` | Total assets enumerated from API |
| `assets_downloaded` | Successfully downloaded count |
| `assets_failed` | Failed download count |
| `enumeration_errors` | Count of hard enumeration errors observed during the run |
| `interrupted` | Whether the run was interrupted (legacy boolean; kept alongside status for back-compat) |
| `status` | Lifecycle: running, complete, or interrupted. A SIGKILL'd process leaves the row at running, which the next startup promotes to interrupted. |
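The startup promotion described for `status` can be pictured as a single `UPDATE`. This is a minimal sketch against a simplified, assumed `sync_runs` table; the function name and the choice to also set the legacy `interrupted` flag are illustrative assumptions, not kei's actual code.

```python
import sqlite3


def promote_stale_runs(conn: sqlite3.Connection) -> int:
    """Mark runs left at 'running' by a killed process as 'interrupted'."""
    cur = conn.execute(
        # Assumption: the legacy boolean is kept in step with the status column.
        "UPDATE sync_runs SET status = 'interrupted', interrupted = 1 "
        "WHERE status = 'running'"
    )
    conn.commit()
    return cur.rowcount


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sync_runs (id INTEGER PRIMARY KEY, started_at TEXT, "
    "completed_at TEXT, interrupted INTEGER DEFAULT 0, status TEXT)"
)
conn.executemany(
    "INSERT INTO sync_runs (started_at, completed_at, status) VALUES (?, ?, ?)",
    [
        ("2024-01-01T00:00:00Z", "2024-01-01T00:05:00Z", "complete"),
        ("2024-01-02T00:00:00Z", None, "running"),  # left behind by a SIGKILL
    ],
)
promoted = promote_stale_runs(conn)
statuses = [row[0] for row in conn.execute("SELECT status FROM sync_runs ORDER BY id")]
```

Only the row that never reached a terminal state is touched; completed runs keep their status.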
On the first run, every asset is enumerated and downloaded. On subsequent runs, assets already marked as downloaded are skipped without any filesystem checks. This makes re-running kei nearly instant for unchanged libraries.
If the database says a file is downloaded but the file is missing from disk, it's automatically re-downloaded. This handles cases where files were accidentally deleted or moved.
Assets that fail to download are marked as failed with their error message. This includes assets that exhaust retry budgets from [download.retry] - they're moved from pending to failed with a descriptive error rather than being silently skipped.
You can view failed assets with kei status --failed. On the next sync, all failed assets are automatically reset to pending and retried with fresh attempt counts - no manual intervention needed.
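The automatic reset of failed assets amounts to one statement over the `assets` table. The sketch below uses a reduced, assumed schema and a hypothetical function name; whether kei also clears `last_error` at this point is not stated above, so the sketch leaves it untouched.

```python
import sqlite3


def reset_failed_assets(conn: sqlite3.Connection) -> int:
    """Move failed rows back to pending with a fresh attempt count."""
    cur = conn.execute(
        "UPDATE assets SET status = 'pending', download_attempts = 0 "
        "WHERE status = 'failed'"
    )
    conn.commit()
    return cur.rowcount


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE assets (library TEXT, id TEXT, version_size TEXT, status TEXT, "
    "download_attempts INTEGER DEFAULT 0, last_error TEXT, "
    "PRIMARY KEY (library, id, version_size))"
)
conn.executemany(
    "INSERT INTO assets VALUES (?, ?, ?, ?, ?, ?)",
    [
        ("PrimarySync", "a1", "original", "downloaded", 1, None),
        ("PrimarySync", "a2", "original", "failed", 3, "timeout"),
        ("PrimarySync", "a3", "original", "pending", 0, None),
    ],
)
reset_count = reset_failed_assets(conn)
rows = conn.execute(
    "SELECT id, status, download_attempts FROM assets ORDER BY id"
).fetchall()
```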
If a sync is interrupted (Ctrl+C, crash, reboot), the next run picks up where it left off. Assets already downloaded are skipped, and only pending/failed assets are attempted.
After the first full sync, kei stores a CloudKit syncToken for each library zone. On the next run, it calls Apple's changes/database endpoint to check if anything changed. If nothing did, the sync completes in 1-2 API calls instead of the ~75 needed for full library enumeration.
When changes exist, the changes/zone endpoint returns only new, modified, or deleted records since the last token. Created assets are downloaded normally. Deletions and hidden assets are written to state and skipped. kei only advances the stored token after the relevant pass and cycle finish safely.
Tokens are stored per-zone in the metadata table as sync_token:{zone_name}.
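To make the key layout concrete, here is a small sketch of reading and writing per-zone tokens in a key-value table. The `sync_token:{zone_name}` key format comes from the text above; the column names `key` and `value` and the helper names are assumptions.

```python
import sqlite3
from typing import Optional


def token_key(zone_name: str) -> str:
    """Key format described above: sync_token:{zone_name}."""
    return f"sync_token:{zone_name}"


def store_token(conn: sqlite3.Connection, zone_name: str, token: str) -> None:
    # Assumed columns: key (primary key) and value.
    conn.execute(
        "INSERT OR REPLACE INTO metadata (key, value) VALUES (?, ?)",
        (token_key(zone_name), token),
    )
    conn.commit()


def load_token(conn: sqlite3.Connection, zone_name: str) -> Optional[str]:
    row = conn.execute(
        "SELECT value FROM metadata WHERE key = ?", (token_key(zone_name),)
    ).fetchone()
    return row[0] if row else None


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE metadata (key TEXT PRIMARY KEY, value TEXT)")
store_token(conn, "PrimarySync", "token-1")
store_token(conn, "PrimarySync", "token-2")  # a later safe run replaces the token
primary = load_token(conn, "PrimarySync")
shared = load_token(conn, "SharedSync-EXAMPLE")  # zone without a stored token
```

A zone with no stored token returns `None`, which corresponds to the case where a full enumeration is needed.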
Watch mode also tracks Apple's database-level changes token separately as db_sync_token. It is advanced only after the selected-zone cycle completes safely. If Apple returns an empty complete precheck page, kei skips that wakeup but keeps the prior database token so the next wakeup rechecks from a known checkpoint.
The sync token is only advanced when the run is safe. Unsafe cases include:
- partial failures or interrupted shutdown
- dry runs, read-only filename output, and recent-limited runs
- hard enumeration errors, pagination shortfalls, or ambiguous empty pages
- source delete or hidden-state writes that fail or update zero rows
- full-query asset/master records that cannot be paired safely
- album relation hydration that has not completed
- path or enumeration config drift before a full reconciliation has run
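One way to think about this list is as a gate that collects every reason a run is unsafe and advances the token only when none apply. The sketch below covers a subset of the cases; the `RunOutcome` fields and the reason strings are hypothetical names chosen for illustration, not kei's internal types or report values.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class RunOutcome:
    """Hypothetical summary of a finished sync pass."""
    failed_downloads: int = 0
    interrupted: bool = False
    dry_run: bool = False
    recent_limited: bool = False
    enumeration_errors: int = 0
    albums_hydrated: bool = True
    config_drift_pending: bool = False


def token_block_reasons(run: RunOutcome) -> List[str]:
    """Return every reason the token must not advance (empty list means safe)."""
    checks = [
        (run.failed_downloads > 0, "partial_failure"),
        (run.interrupted, "interrupted"),
        (run.dry_run, "dry_run"),
        (run.recent_limited, "recent_limited"),
        (run.enumeration_errors > 0, "enumeration_errors"),
        (not run.albums_hydrated, "album_hydration_incomplete"),
        (run.config_drift_pending, "config_drift"),
    ]
    return [reason for blocked, reason in checks if blocked]


def may_advance_token(run: RunOutcome) -> bool:
    return not token_block_reasons(run)
```

Collecting all reasons rather than stopping at the first makes it easier to report why advancement was blocked.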
When token advancement is blocked, sync_report.json includes sync_token_blocked and reason fields. If kei collected token receiver telemetry, the report includes the expected receiver count, receivers with tokens, missing receivers, blank receivers, dropped receivers, and unique token count even when advancement was not blocked.
If Apple rejects a stored token (expired or invalid), kei logs a warning and automatically falls back to a full enumeration. The new token from the full sync is stored for next time. No manual intervention needed.
kei also falls back to full enumeration when pending or failed assets need another look. The changes/zone API only returns new modifications - it can't re-enumerate assets that were already seen but not yet downloaded. Once all pending assets are resolved, incremental sync resumes.
Path-affecting config drift clears stored zone tokens and forces a full reconciliation. That covers changes such as download directory, folder templates, filename policy, media/resource selection, and date/recent filters where known assets need to be planned into a new path shape.
Album-filtered runs can avoid a full library enumeration once trusted album membership snapshots exist. If a selected album lacks a trusted snapshot, kei can run a targeted album backfill to build one, then use incremental routing later.
Transient zone-level errors (THROTTLED, RETRY_LATER, etc.) don't trigger a full re-enumeration. Only InvalidToken and ZoneNotFound cause a fallback. Other errors propagate and the sync retries with the existing token on the next cycle.
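The error handling above reduces to a small classification. In this sketch the error code names are taken from the text, while the function name and the returned action strings are illustrative assumptions.

```python
# Only these zone-level errors trigger a fallback to full enumeration.
FULL_ENUMERATION_ERRORS = frozenset({"InvalidToken", "ZoneNotFound"})


def next_action(error_code: str) -> str:
    """Decide how the next cycle proceeds after a zone-level error."""
    if error_code in FULL_ENUMERATION_ERRORS:
        return "full_enumeration"
    # Transient errors such as THROTTLED or RETRY_LATER keep the stored token.
    return "retry_with_existing_token"
```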
When a run uses full enumeration instead of incremental sync, kei records a bounded full_enumeration_reason in the JSON report, structured logs, and Prometheus metrics. Current reasons include:
| Reason | Meaning |
|---|---|
| `no_stored_token` | First run, reset tokens, or no usable token for the zone |
| `retry_failed_rows` | Failed rows need re-enumeration before retry |
| `pending_rows` | Pending rows from a prior run need re-enumeration |
| `metadata_backfill` | Metadata rewrite/backfill work needs a full asset view |
| `album_relation_hydration_incomplete` | Album relation data is not trusted enough for incremental routing yet |
| `enum_config_hash_drift` | Enumeration-affecting config changed |
| `download_config_hash_drift` | Path-affecting download config changed |
| `explicit_retry_failed` | The run was started with `--retry-failed` |
| `other_static_reason` | A less specific safe fallback path was used |
Clear stored sync tokens, then run sync again:
```
kei reset sync-token
kei sync
```
See kei reset sync-token.
The state database enables several management commands:
| Command | Description |
|---|---|
| `status` | Show sync status and database summary |
| `status --failed` | List failed assets with error messages |
| `sync --retry-failed` | Reset failed assets to pending and re-sync |
| `reset state` | Delete the database and start fresh |
| `reset sync-token` | Clear stored sync tokens |
| `import-existing` | Scan local files and mark matching assets as downloaded |
| `verify` | Check that downloaded files still exist |
| `verify --checksums` | Also verify SHA-256 checksums |
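The two verification levels can be sketched as an existence check plus an optional SHA-256 comparison against the stored `local_checksum`. The function names and the returned strings are hypothetical, and the sketch assumes the checksum is stored as a lowercase hex digest.

```python
import hashlib
import os
import tempfile
from pathlib import Path
from typing import Optional


def sha256_file(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large videos are not read into memory at once."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_asset(local_path: str, local_checksum: Optional[str] = None) -> str:
    """Return 'missing', 'mismatch', or 'ok' for one downloaded asset."""
    path = Path(local_path)
    if not path.is_file():
        return "missing"
    # Passing a checksum mirrors the --checksums mode; None checks existence only.
    if local_checksum is not None and sha256_file(path) != local_checksum.lower():
        return "mismatch"
    return "ok"


with tempfile.TemporaryDirectory() as tmp:
    photo = os.path.join(tmp, "IMG_0001.JPG")
    with open(photo, "wb") as fh:
        fh.write(b"example bytes")
    good = hashlib.sha256(b"example bytes").hexdigest()
    results = [
        verify_asset(photo),
        verify_asset(photo, good),
        verify_asset(photo, "0" * 64),
        verify_asset(os.path.join(tmp, "gone.JPG")),
    ]
```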
If you have files from a previous tool (Python icloudpd, manual download, etc.), use import-existing to populate the database:
```
kei import-existing --config ~/.config/kei/config.toml
```
This scans the download directory, matches files to iCloud assets by filename and size, and marks them as downloaded. The next sync will skip these files.
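Matching by filename and size can be illustrated with a directory walk and a lookup keyed on `(filename, size)`. This is a simplified sketch: the `match_existing` helper and the shape of the asset index are assumptions, and it does not model how kei handles duplicate names or folder templates.

```python
import os
import tempfile
from typing import Dict, List, Tuple


def match_existing(download_dir: str, assets: Dict[Tuple[str, int], str]) -> List[str]:
    """Return ids of assets whose (filename, size in bytes) match a local file."""
    matched = []
    for root, _dirs, files in os.walk(download_dir):
        for name in files:
            size = os.path.getsize(os.path.join(root, name))
            asset_id = assets.get((name, size))
            if asset_id is not None:
                matched.append(asset_id)
    return sorted(matched)


with tempfile.TemporaryDirectory() as tmp:
    os.makedirs(os.path.join(tmp, "2024", "01"))
    with open(os.path.join(tmp, "2024", "01", "IMG_0001.JPG"), "wb") as fh:
        fh.write(b"12345")  # 5 bytes: matches asset-a
    with open(os.path.join(tmp, "IMG_0002.JPG"), "wb") as fh:
        fh.write(b"123")  # 3 bytes: same name as asset-b, different size
    known = {("IMG_0001.JPG", 5): "asset-a", ("IMG_0002.JPG", 9): "asset-b"}
    matched = match_existing(tmp, known)
```

Here only `asset-a` is matched; the second file has the right name but the wrong size, so it would still be downloaded on the next sync.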
To start fresh and re-download everything:
```
kei reset state --yes
```
This deletes the database file. The next sync will treat all assets as new.
The database uses SQLite. Current schema version is v12. Migrations apply automatically when upgrading kei versions; each step runs inside a SAVEPOINT so a failure rolls back only that step.
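The per-step rollback behaviour can be demonstrated with SQLite savepoints. The sketch below is an assumption-laden illustration: it tracks the version with `PRAGMA user_version` and uses made-up migration steps, whereas kei's own version marker and migration SQL are not shown here.

```python
import sqlite3
from typing import List, Tuple


def migrate(conn: sqlite3.Connection, steps: List[Tuple[int, List[str]]]) -> int:
    """Apply pending steps in order, each inside its own SAVEPOINT."""
    current = conn.execute("PRAGMA user_version").fetchone()[0]
    for version, statements in steps:
        if version <= current:
            continue
        conn.execute("SAVEPOINT migration_step")
        try:
            for sql in statements:
                conn.execute(sql)
            conn.execute(f"PRAGMA user_version = {int(version)}")
            conn.execute("RELEASE SAVEPOINT migration_step")
        except sqlite3.Error:
            # Undo only this step; earlier steps are already committed.
            conn.execute("ROLLBACK TO SAVEPOINT migration_step")
            conn.execute("RELEASE SAVEPOINT migration_step")
            raise
        current = version
    return current


# isolation_level=None disables implicit transactions, so the SAVEPOINTs
# above are the only transaction boundaries in play.
conn = sqlite3.connect(":memory:", isolation_level=None)
steps = [
    (1, ["CREATE TABLE assets (id TEXT, version_size TEXT, PRIMARY KEY (id, version_size))"]),
    (2, ["CREATE TABLE metadata (key TEXT PRIMARY KEY, value TEXT)"]),
    # Deliberately broken step: the second statement fails.
    (3, ["CREATE TABLE scratch (x TEXT)", "ALTER TABLE no_such_table ADD COLUMN y TEXT"]),
]
try:
    migrate(conn, steps)
except sqlite3.Error:
    pass
version_after = conn.execute("PRAGMA user_version").fetchone()[0]
tables = {row[0] for row in conn.execute("SELECT name FROM sqlite_master WHERE type = 'table'")}
```

After the failure, the database stays at version 2 and the `scratch` table created earlier in the failed step is gone, while the tables from steps 1 and 2 remain.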
Tables:
- `assets` - one row per iCloud asset version, keyed by `(library, id, version_size)`
- `sync_runs` - one row per sync execution, including lifecycle status and enumeration error counts
- `metadata` - key-value store for sync tokens, schema markers, and the shared-library notice latch
- `asset_albums` - many-to-many between assets and album names, scoped by library
- `asset_people` - face-recognition labels per asset, scoped by library
- `album_containers`, `album_membership_snapshots`, `asset_album_memberships` - trusted album membership cache for album-aware incremental routing
Schema-version highlights:
| Version | Change |
|---|---|
| v1 | `assets` (with `(id, version_size)` PK) and `sync_runs` |
| v2 | `metadata` key-value table |
| v3 | `assets.local_checksum` column (locally-computed SHA-256) |
| v4 | `assets.download_checksum` column (pre-EXIF download hash) |
| v5 | Provider-agnostic metadata columns (source, is_favorite, rating, GPS, media_subtype, keywords, description, provider_data, metadata_hash, etc.) plus the asset_albums and asset_people tables. Sync tokens are invalidated on the first crossing so the metadata backfill repopulates without re-downloading files. |
| v6 | `assets.metadata_write_failed_at` so the metadata-only rewrite path can re-drive failed EXIF/XMP embeds on subsequent syncs. |
| v7 | `sync_runs.status` lifecycle column (running / complete / interrupted); existing rows are backfilled from the `(completed_at, interrupted)` pair. |
| v8 | Adds library to the assets primary key so shared-library assets cannot collide with primary-library assets. |
| v9 | Adds library to asset_albums and asset_people primary keys. |
| v10 | Adds sync_runs.enumeration_errors for hard enumeration failure reporting. |
| v11 | Adds assets.imported_size and assets.imported_mtime so repeated import-existing runs can skip SHA-256 reads when size and mtime are unchanged. |
| v12 | Adds album container and membership snapshot tables used by album-aware incremental routing. |
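The v11 import snapshot suggests a cheap pre-check before hashing: compare the stored size and mtime with the file's current values. The sketch below assumes `imported_mtime` is stored as whole seconds and uses a hypothetical `needs_rehash` helper; kei's actual granularity and comparison rules may differ.

```python
import os
import tempfile
from typing import Optional


def needs_rehash(path: str, imported_size: Optional[int], imported_mtime: Optional[int]) -> bool:
    """True when the file must be hashed again because its snapshot no longer matches."""
    st = os.stat(path)
    # Assumption: mtime is compared at one-second resolution.
    return imported_size != st.st_size or imported_mtime != int(st.st_mtime)


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "IMG_0001.JPG")
    with open(path, "wb") as fh:
        fh.write(b"12345")
    os.utime(path, (1_700_000_000, 1_700_000_000))  # pin atime and mtime
    unchanged = needs_rehash(path, 5, 1_700_000_000)
    touched = needs_rehash(path, 5, 1_600_000_000)
    never_imported = needs_rehash(path, None, None)
```

A file with no stored snapshot always needs hashing, which matches the first `import-existing` run over a directory.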