feat: add loader orchestrator builder #103
Conversation
Coverage Report
💡 Codex Review
Here are some automated review suggestions for this pull request.
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you:
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review"
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting …
```python
async def _upsert_aggregated(
    batch: Sequence[AggregatedItem],
) -> None:
    if not batch:
        return
    items.extend(batch)
    points = [
        build_point(item, dense_model_name, sparse_model_name)
        for item in batch
    ]
    await _upsert_in_batches(
        client,
        collection_name,
        points,
        retry_queue=qdrant_retry_queue,
    )
```
```python
def _record_upsert(worker_id: int, batch_size: int, queue_size: int) -> None:
    nonlocal upserted, upsert_start
    if upserted == 0:
        upsert_start = time.perf_counter()
    upserted += batch_size
    elapsed = time.perf_counter() - upsert_start
    rate = upserted / elapsed if elapsed > 0 else 0.0
    logger.info(
        "Upsert worker %d processed %d items (%.2f items/sec, queue size=%d)",
        worker_id,
        upserted,
        rate,
        queue_size,
    )
```
```python
ingestion_stage = IngestionStage(
    plex_server=plex_server,
    sample_items=sample_items,
    movie_batch_size=plex_chunk_size,
    episode_batch_size=plex_chunk_size,
    sample_batch_size=enrichment_batch_size,
    output_queue=ingest_queue,
    completion_sentinel=INGEST_DONE,
)
```
```python
enrichment_stage = EnrichmentStage(
    http_client_factory=lambda: httpx.AsyncClient(timeout=30),
    tmdb_api_key=tmdb_api_key or "",
    ingest_queue=ingest_queue,
    persistence_queue=persistence_queue,
    imdb_retry_queue=_imdb_retry_queue,
    movie_batch_size=enrichment_batch_size,
    episode_batch_size=enrichment_batch_size,
    imdb_cache=_imdb_cache,
    imdb_max_retries=_imdb_max_retries,
    imdb_backoff=_imdb_backoff,
    imdb_batch_limit=_imdb_batch_limit,
    imdb_requests_per_window=_imdb_requests_per_window,
    imdb_window_seconds=_imdb_window_seconds,
)
```
```python
persistence_stage = _PersistenceStage(
    client=client,
    collection_name=collection_name,
    dense_vector_name=dense_model_name,
    sparse_vector_name=sparse_model_name,
    persistence_queue=persistence_queue,
    retry_queue=qdrant_retry_queue,
    upsert_semaphore=upsert_capacity,
    upsert_buffer_size=upsert_buffer_size,
    upsert_fn=_upsert_aggregated,
    on_batch_complete=_record_upsert,
)
```
Treat retry payloads as `AggregatedItem` batches
The new orchestrator wiring feeds `EnrichmentStage`'s aggregated items directly into `PersistenceStage` and converts them to Qdrant points inside `_upsert_aggregated`. That works for the first attempt, but when `_upsert_in_batches` fails it enqueues the raw `PointStruct` batch onto the shared retry queue. On shutdown, `PersistenceStage._flush_retry_queue()` re-enqueues those point batches via `enqueue_points`, after which the persistence workers call `_upsert_aggregated` again. `_upsert_aggregated` always assumes `batch` contains `AggregatedItem` instances and passes each element to `build_point`, so any retried payload will raise attribute errors and the failed points can never be persisted. Any upsert failure therefore permanently breaks the retry path. The upsert helper needs to detect already-built `PointStruct`s or reuse `PersistenceStage`'s native point-based enqueueing instead.
Useful? React with 👍 / 👎.
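One possible shape for the first option, shown as a sketch rather than the PR's actual change: let the upsert helper accept either aggregated items or already-built points, so retried `PointStruct` batches skip `build_point`. The closure variables (`items`, `client`, `collection_name`, `build_point`, and so on) are the ones from the diff above; the `isinstance` check against `PointStruct` is an assumption about how retry payloads arrive.

```python
from typing import Sequence, Union

from qdrant_client.models import PointStruct

async def _upsert_aggregated(
    batch: Sequence[Union[AggregatedItem, PointStruct]],
) -> None:
    if not batch:
        return
    if isinstance(batch[0], PointStruct):
        # Retried payloads are already points; re-upsert them directly
        # instead of passing them through build_point again.
        points = list(batch)
    else:
        items.extend(batch)
        points = [
            build_point(item, dense_model_name, sparse_model_name)
            for item in batch
        ]
    await _upsert_in_batches(
        client,
        collection_name,
        points,
        retry_queue=qdrant_retry_queue,
    )
```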
What
- `_build_loader_orchestrator` helper that wires the staged ingestion, enrichment, and persistence components

Why

Affects

Testing

Documentation
https://chatgpt.com/codex/tasks/task_e_68e32f8f3fd88328a4e030e88d7a9c95
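The excerpt does not show `_build_loader_orchestrator` itself, only the stages it assembles. As a purely hypothetical sketch of what such a builder could look like: the stage objects come from the diff above, but the `run()` methods, the returned coroutine, and the builder signature are assumptions, not the PR's actual API.

```python
import asyncio

def _build_loader_orchestrator(ingestion_stage, enrichment_stage, persistence_stage):
    """Hypothetical shape only: bundle the three stages behind one coroutine."""

    async def run() -> None:
        # Assumes each stage exposes an async run() that consumes its input
        # queue and feeds its output queue until the completion sentinel
        # arrives; the real PR may structure this differently.
        await asyncio.gather(
            ingestion_stage.run(),
            enrichment_stage.run(),
            persistence_stage.run(),
        )

    return run
```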