harmonia: migrate subcommand for legacy libraries #163

@forkwright

Context

Cody migrated his existing music, ebook, and audiobook libraries off Lidarr/arr-stack to the canonical layout via one-shot Python scripts in `~/menos-ops/scratch/`. Long term, harmonia should own this migration as a first-class CLI so any future legacy library imports go through the same pipeline.

What

New subcommand: `harmonia migrate <type> --library /path/to/library`

Supported types: `music`, `ebooks`, `audiobooks`, `podcasts`

Behavior matches the Python migration scripts:

  • music: walk artist directories, read FLAC vorbis tags, parse existing `album.nfo`/`artist.nfo` for MB IDs, look up release type via MusicBrainz (`epignosis`), rename folders to `[YYYY] [TYPE] {Title}`, normalize track filenames to `DD-TT - {Title}.flac`, generate `album.toml` + `artist.toml`, back up old NFOs to `.migration-backup/`, delete loose Lidarr per-track `.xml` artifacts.
  • ebooks: walk author directories, fuzzy-group variants per book, score each variant (format > size > cleanliness), pick canonical (epub > azw3 > mobi > pdf), read OPF metadata via zipfile, generate `book.toml` + `author.toml`, move duplicates to `{library_root}/_quarantine/`.
  • audiobooks: walk author directories, parse m4b iTunes-style atoms, fall back to filename parsing for mp3, detect multi-file books, generate `audiobook.toml` + `author.toml`, rename single-file as `{Title}.m4b` and multi-file as `{Title} - NN.m4b`.
  • podcasts: enforce layout going forward; no migration needed (Cody's podcasts dir is empty).
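
The naming templates above can be sketched as plain formatting functions. This is a minimal illustration, not the Python scripts' logic: the function names are hypothetical, reading `DD-TT` as zero-padded disc and track numbers is an assumption, and titles are assumed to be already sanitized (for example by `taxis::template::sanitize_path_segment`, which is not called here).

```rust
/// Folder name for a music release: `[YYYY] [TYPE] {Title}`.
/// `release_type` is whatever label the MusicBrainz lookup yields (casing is an assumption).
fn album_folder_name(year: u16, release_type: &str, title: &str) -> String {
    format!("[{:04}] [{}] {}", year, release_type, title)
}

/// Track file name: `DD-TT - {Title}.flac`.
/// Assumes DD is the disc number and TT the track number, both padded to two digits.
fn track_file_name(disc: u8, track: u8, title: &str) -> String {
    format!("{:02}-{:02} - {}.flac", disc, track, title)
}

/// Audiobook file name: `{Title}.m4b` for a single-file book,
/// `{Title} - NN.m4b` for part NN of a multi-file book.
fn audiobook_file_name(title: &str, part: Option<u8>) -> String {
    match part {
        None => format!("{}.m4b", title),
        Some(n) => format!("{} - {:02}.m4b", title, n),
    }
}
```

Keeping these as pure functions would let `--dry-run` and `--verify` compute the same expected names without touching the filesystem.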

Modes:

  • `--dry-run` (default): emit a JSON plan and a human-readable summary; do not modify the library
  • `--apply`: execute the renames and generate the sidecars
  • `--verify`: walk the post-migration tree and assert the canonical invariants
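
A rough sketch of how the three modes could be resolved from raw flags, with dry-run as the fallback. The `Mode` enum and `parse_mode` function are hypothetical names, and rejecting conflicting mode flags is an assumed policy that this issue does not specify.

```rust
#[derive(Debug, PartialEq, Eq, Clone, Copy)]
enum Mode {
    DryRun,
    Apply,
    Verify,
}

/// Picks the mode from raw arguments. With no mode flag, falls back to dry-run.
/// Returns an error if two different mode flags are present (assumed policy).
fn parse_mode(args: &[&str]) -> Result<Mode, String> {
    let mut chosen: Option<Mode> = None;
    for arg in args {
        let mode = match *arg {
            "--dry-run" => Mode::DryRun,
            "--apply" => Mode::Apply,
            "--verify" => Mode::Verify,
            _ => continue, // not a mode flag; other arguments are handled elsewhere
        };
        match chosen {
            Some(prev) if prev != mode => {
                return Err(format!("conflicting modes: {:?} and {:?}", prev, mode));
            }
            _ => chosen = Some(mode),
        }
    }
    Ok(chosen.unwrap_or(Mode::DryRun))
}
```

Defaulting to `Mode::DryRun` means a bare `harmonia migrate music --library /path` never modifies anything unless `--apply` is passed explicitly.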

Why a Rust subcommand instead of keeping the Python scripts

  • The Python scripts are one-shot artifacts. Any future legacy library import (a friend's collection, an old backup tape) shouldn't require re-deriving the logic.
  • Living inside harmonia means it shares the same metadata providers (`epignosis`), quality scoring (`kritike`), DB schemas (`harmonia-db`), and sanitization (`taxis::template::sanitize_path_segment`) as the runtime ingest.
  • The Python scripts duplicate the smart sanitization logic; a Rust subcommand can call the canonical implementation directly.

Files

  • New crate or new subcommand under `harmonia-host`
  • Reuses: `taxis::import`, `epignosis`, `kritike`, sidecar reader/writer (issue: `taxis: TOML sidecar reader/writer`)

Acceptance

  • `harmonia migrate music --library /path --dry-run` produces a plan JSON identical in structure to the Python script's output
  • `--apply` executes the migration with atomic operations
  • `--verify` asserts all canonical invariants
  • Idempotent: re-running `--apply` on a migrated library is a no-op
  • Same flag set works for ebooks, audiobooks, podcasts
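
One way to get the idempotency criterion is to make the planner skip anything that already has its canonical name, so a migrated library produces an empty plan and `--apply` has nothing to do. The sketch below assumes a hypothetical `Rename` struct and `plan_renames` function; the real plan schema is whatever the Python scripts emit.

```rust
use std::path::PathBuf;

/// One planned rename. Hypothetical shape, not the actual plan JSON schema.
#[derive(Debug, PartialEq, Eq)]
struct Rename {
    from: PathBuf,
    to: PathBuf,
}

/// Builds a rename plan from (current path, canonical file name) pairs.
/// Entries whose path already ends in the canonical name are skipped,
/// so planning against an already-migrated library yields no operations.
fn plan_renames(entries: &[(PathBuf, String)]) -> Vec<Rename> {
    entries
        .iter()
        .filter_map(|(current, canonical_name)| {
            let target = current.with_file_name(canonical_name);
            if &target == current {
                None // already canonical: nothing to do
            } else {
                Some(Rename { from: current.clone(), to: target })
            }
        })
        .collect()
}
```

An apply step would then carry out each planned rename, for instance with `std::fs::rename`, and a second run would find nothing left to plan.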

Reference

The reference Python implementations live at:

  • `~/menos-ops/scratch/migrate-music-canonical.py`
  • `~/menos-ops/scratch/migrate-ebooks-canonical.py`
  • `~/menos-ops/scratch/migrate-audiobooks-canonical.py`

These were run once against Cody's existing libraries; this issue tracks the long-term Rust replacement.

Metadata

Labels

  • enhancement: New feature or request
  • storage-canonical: Canonical filesystem storage layout work