Merged
15 changes: 15 additions & 0 deletions mcp_plex/loader/AGENTS.md
@@ -11,4 +11,19 @@
- Qdrant upserts are batched and network errors are logged so large loads can continue even when individual batches fail.
- Qdrant model metadata is tracked locally to avoid relying on private client helpers.
- Qdrant collection setup happens before media ingestion, and the loader streams asynchronous upsert tasks once the configurable buffer fills so fetching can continue while points are written.
- The staged loader rewrite lives under `mcp_plex/loader/pipeline/`. The concrete classes that must be wired together are:
  - `IngestionStage` (`ingestion.py`)
  - `EnrichmentStage` (`enrichment.py`)
  - `PersistenceStage` (`persistence.py`)
  - `LoaderOrchestrator` (`orchestrator.py`)
- `mcp_plex/loader/pipeline/channels.py` defines the queue type aliases and sentinel tokens (`INGEST_DONE`, `PERSIST_DONE`) shared by the stages.
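
The sentinel handoff described above can be sketched as follows. This is a minimal illustration, not the contents of `channels.py`: the `_Sentinel` class, the three coroutine names, and the string payloads are all hypothetical stand-ins, and the real queue aliases and sentinel types may differ. It only shows the general pattern of one stage signalling completion to the next through a shared token.

```python
import asyncio
from typing import List


class _Sentinel:
    """Hypothetical marker type; the real tokens in channels.py may differ."""

    def __init__(self, name: str) -> None:
        self.name = name

    def __repr__(self) -> str:
        return self.name


# Stand-ins for the sentinel tokens named in the notes above.
INGEST_DONE = _Sentinel("INGEST_DONE")
PERSIST_DONE = _Sentinel("PERSIST_DONE")


async def ingest(items: List[str], ingest_queue: asyncio.Queue) -> None:
    """Push every item, then signal that ingestion has finished."""
    for item in items:
        await ingest_queue.put(item)
    await ingest_queue.put(INGEST_DONE)


async def enrich(ingest_queue: asyncio.Queue, persist_queue: asyncio.Queue) -> None:
    """Forward transformed items until INGEST_DONE arrives, then pass on PERSIST_DONE."""
    while True:
        item = await ingest_queue.get()
        if item is INGEST_DONE:
            await persist_queue.put(PERSIST_DONE)
            return
        # Upper-casing stands in for real enrichment work.
        await persist_queue.put(item.upper())


async def persist(persist_queue: asyncio.Queue) -> List[str]:
    """Collect items until PERSIST_DONE arrives."""
    stored: List[str] = []
    while True:
        item = await persist_queue.get()
        if item is PERSIST_DONE:
            return stored
        stored.append(item)


async def run_pipeline(items: List[str]) -> List[str]:
    """Run the three toy stages concurrently over two shared queues."""
    ingest_queue: asyncio.Queue = asyncio.Queue()
    persist_queue: asyncio.Queue = asyncio.Queue()
    _, _, stored = await asyncio.gather(
        ingest(items, ingest_queue),
        enrich(ingest_queue, persist_queue),
        persist(persist_queue),
    )
    return stored
```

Comparing with `is` rather than `==` keeps a sentinel from ever being confused with a payload that happens to compare equal. With several persistence workers, each worker would need its own `PERSIST_DONE` token (or the token would need to be re-queued), which this single-consumer sketch does not cover.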

## Loader CLI expectations
- `mcp_plex/loader/__init__.py` still contains the legacy `LoaderPipeline` implementation for reference. New work should instantiate the staged classes directly and coordinate them with `LoaderOrchestrator`.
- When constructing stages from the CLI:
  - `IngestionStage` must receive the Plex server (or `None` for sample mode), the list of sample items, the Plex chunk size for both movies and episodes, the enrichment batch size for sample batches, the ingest queue instance, and the `INGEST_DONE` sentinel.
  - `EnrichmentStage` requires a factory that returns an `httpx.AsyncClient` (or context manager), the TMDb API key (empty string when unused in sample mode), the ingest queue, the persistence queue, the shared `IMDbRetryQueue`, the enrichment batch size for movies and episodes, and the IMDb configuration derived from CLI flags.
  - `PersistenceStage` expects the `AsyncQdrantClient`, collection name, dense/sparse vector names, the persistence queue, the Qdrant retry queue, the semaphore limiting concurrent upserts, the upsert buffer size, and callables for performing the upsert as well as recording progress.
- `LoaderOrchestrator` must be initialised with the three stage instances, the ingest queue, the persistence queue, and the number of persistence workers (the CLI's `max_concurrent_upserts`).
- Convert `AggregatedItem` batches into Qdrant `PointStruct` objects with `build_point` before handing them to the persistence stage's `enqueue_points` helper.
- Prefer explicit keyword arguments when threading CLI options into stage constructors so the mapping is obvious to future readers.
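
The keyword-argument guidance above might look roughly like the sketch below. Everything here is a stand-in: `CliOptions`, `IngestionStageSketch`, `OrchestratorSketch`, `wire`, and all parameter names are hypothetical and do not reflect the real constructor signatures in `mcp_plex/loader/pipeline/`. Only the mapping itself (Plex chunk size used for both movies and episodes, enrichment batch size for sample batches, `max_concurrent_upserts` as the persistence worker count) comes from the notes above.

```python
import asyncio
from dataclasses import dataclass
from typing import Any, List, Optional

INGEST_DONE = object()  # stand-in for the sentinel exported by channels.py


@dataclass
class CliOptions:
    """Hypothetical bundle of parsed CLI flags."""

    plex_chunk_size: int = 200
    enrichment_batch_size: int = 100
    max_concurrent_upserts: int = 4


class IngestionStageSketch:
    """Stand-in for IngestionStage; parameter names are assumed, keyword-only."""

    def __init__(
        self,
        *,
        plex_server: Optional[Any],
        sample_items: List[Any],
        movie_batch_size: int,
        episode_batch_size: int,
        sample_batch_size: int,
        output_queue: asyncio.Queue,
        completion_sentinel: object,
    ) -> None:
        self.plex_server = plex_server
        self.sample_items = sample_items
        self.movie_batch_size = movie_batch_size
        self.episode_batch_size = episode_batch_size
        self.sample_batch_size = sample_batch_size
        self.output_queue = output_queue
        self.completion_sentinel = completion_sentinel


class OrchestratorSketch:
    """Stand-in for LoaderOrchestrator; parameter names are assumed."""

    def __init__(
        self,
        *,
        ingestion_stage: Any,
        enrichment_stage: Any,
        persistence_stage: Any,
        ingest_queue: asyncio.Queue,
        persistence_queue: asyncio.Queue,
        persistence_worker_count: int,
    ) -> None:
        self.ingestion_stage = ingestion_stage
        self.enrichment_stage = enrichment_stage
        self.persistence_stage = persistence_stage
        self.ingest_queue = ingest_queue
        self.persistence_queue = persistence_queue
        self.persistence_worker_count = persistence_worker_count


def wire(
    options: CliOptions,
    *,
    plex_server: Optional[Any] = None,
    sample_items: Optional[List[Any]] = None,
) -> OrchestratorSketch:
    """Thread CLI options into the stand-in constructors by keyword."""
    ingest_queue: asyncio.Queue = asyncio.Queue()
    persistence_queue: asyncio.Queue = asyncio.Queue()
    ingestion = IngestionStageSketch(
        plex_server=plex_server,  # None selects sample mode
        sample_items=list(sample_items or []),
        movie_batch_size=options.plex_chunk_size,
        episode_batch_size=options.plex_chunk_size,
        sample_batch_size=options.enrichment_batch_size,
        output_queue=ingest_queue,
        completion_sentinel=INGEST_DONE,
    )
    return OrchestratorSketch(
        ingestion_stage=ingestion,
        enrichment_stage=None,  # omitted in this sketch
        persistence_stage=None,  # omitted in this sketch
        ingest_queue=ingest_queue,
        persistence_queue=persistence_queue,
        persistence_worker_count=options.max_concurrent_upserts,
    )
```

Making the stand-in constructors keyword-only (the bare `*`) means a positional call fails immediately, so two integer options such as the chunk size and the batch size cannot be swapped silently.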