Harden Gemini 404 model fallback and persist pipeline output incrementally by Copilot · Pull Request #58 · davidamitchell/Latest-developments-

Copilot · 2026-05-13T09:15:46Z

Pipeline failures were intermittently caused by Gemini 404 NOT_FOUND model errors that were not always classified as model-not-found, so cascade fallback did not reliably advance. The processing flow also behaved too all-or-nothing under interruption, increasing reprocessing risk.

Gemini 404 handling
- Extended _quota.is_model_not_found_error() to detect model-not-found from textual error payloads (NOT_FOUND, embedded code: 404, model-path not-found strings), not only structured code/status_code attributes.
- Tightened text-based matching with model-specific guards to reduce false positives from unrelated errors.
Incremental processed-data persistence
- Added an optional per-item callback in process(...) and wired main() to merge/write each completed ProcessedItem immediately.
- Callback failures are now logged at WARNING and processing continues, so persistence errors do not abort the batch.
- Avoided redundant final full merge/write when per-item persistence succeeds for all items.
Workflow commit behavior on failure paths
- Updated fetch-and-process.yml so the processed-data commit step runs with if: always().
- Ensured git push origin HEAD always executes in that step, so previously created local commits are not stranded when the run fails.
Regression coverage
- Expanded tests for model-not-found detection partitions:
  - code == 404 and status_code == 404
  - multiple supported text payload variants
  - negative cases (quota 429, generic transport errors, non-model not_found text)
- Added callback behavior tests:
  - per-item callback invocation
  - on_item_processed=None
  - raising callback logs and processing continues

# src/pipeline/run.py
processed, ai_failures = process(
    items,
    gemini_api_key=api_key,
    fetch_date=today,
    rpm=cfg.pipeline.gemini_rpm,
    enrich_max_output_tokens=cfg.pipeline.enrich_max_output_tokens,
    gemini_model=cfg.pipeline.gemini_model,
    on_item_processed=_persist_item,
)

…entally Agent-Logs-Url: https://github.com/davidamitchell/Latest-developments-/sessions/6ee60102-2fec-425c-a800-f640126e4124 Co-authored-by: davidamitchell <635733+davidamitchell@users.noreply.github.com>

Agent-Logs-Url: https://github.com/davidamitchell/Latest-developments-/sessions/6ee60102-2fec-425c-a800-f640126e4124 Co-authored-by: davidamitchell <635733+davidamitchell@users.noreply.github.com>

davidamitchell

Code Review — PR #58: Harden Gemini 404 model fallback + incremental persistence

The intent is sound and the three changes work together coherently. There are, however, a few correctness and resilience gaps that should be addressed before merge.

Overview

Three independent concerns addressed in one PR:

is_model_not_found_error() extended to detect text-payload 404s
process() gains a per-item callback for incremental persistence
Workflow commit step now runs with if: always()

Each is the right call. The issues below are in the execution, not the direction.

Issues requiring changes

1. `"not_found"` text match is too broad — false-positive risk (`_quota.py`)

text = str(exc).lower()
return (
    "not_found" in text   # ← this matches ANY error containing "not_found"
    or ...
)

"not_found" will match quota errors, transport errors, or any SDK error whose message happens to include that string. A false positive here advances the model cascade when it should not — that is a silent correctness failure, not a logged one.

The more specific patterns below it ("model is not found", "models/" + "not found") already cover the real Gemini model-not-found shape. The broad "not_found" check should be tightened to something like "not_found" in text and "model" in text, or removed in favour of the already-present specific patterns.

The '"code": 404' / "'code': 404" checks have the same issue — they match any error whose serialised payload contains a 404 code, not necessarily a model-not-found 404. At minimum add "model" in text as a guard.

2. Callback exception is unhandled — a write failure aborts the whole batch (`run.py`)

results.append(processed)
if on_item_processed is not None:
    on_item_processed(processed)   # raises → kills the loop

If _merge_and_write fails (disk full, permissions, etc.) the exception propagates and kills the remaining items in the batch. This contradicts the project rule: "Fetcher failures for a single source must not abort the entire run." The same resilience should apply here — catch, log at WARNING, and continue.

if on_item_processed is not None:
    try:
        on_item_processed(processed)
    except Exception as exc:  # noqa: BLE001 — callback is caller-supplied; must not abort batch
        logger.warning("on_item_processed callback failed for item %s: %s", processed.id, exc)

3. Split imports from the same module will fail `make check` (`run.py`)

from src.pipeline._gemini import (_ModelCascade,)
from src.pipeline._gemini import (make_gemini_client as _make_gemini_client,)
from src.pipeline._gemini import (resolve_cascade as _resolve_cascade,)

Ruff collapses repeated imports from the same module into one block. This will be rejected by make check. Consolidate into a single from src.pipeline._gemini import (...).

4. Workflow push step not verified for `if: always()` (checklist item 6)

The diff only shows the commit step gaining if: always(). Per the workflow validation checklist in the project instructions:

If a Python script commits via subprocess and a later step is the only place that pushes: the later step must always push — otherwise locally-committed-but-not-pushed commits are silently discarded when the job ends.

The push step must also carry if: always() (or if: always() && steps.<commit-step>.outcome != 'skipped'). If it doesn't, partial commits made after a process failure will be lost when the job exits — exactly the scenario this PR is trying to fix.

Moderate concerns

5. Final `_merge_and_write` after the loop is redundant (`run.py`)

# In main():
on_item_processed=lambda item: _merge_and_write(out_path, [item]),  # writes N times
...
merged = _merge_and_write(out_path, processed)  # then writes all N again

If the callback ran for every item, the final call re-reads the file, re-merges all items that are already present, and re-writes. This is O(n) redundant I/O per run and will compound on large batches. Either skip the final call when a callback is set, or make the callback accumulate in memory and only flush at the end. The current behaviour is correct but wasteful.

Minor / test coverage

6. Test partitions are thin (`test_pipeline_quota.py`)

Per the project TDD mandate — "every function with conditional logic requires partitions for: typical value, boundary, empty/zero, and invalid/malformed" — the two tests cover only code == 404 (attribute) and one text-payload format. Missing partitions:

status_code == 404 attribute path
Negative case: a quota-exhausted 429 must not be detected as model-not-found
Negative case: a generic transport error must not be detected as model-not-found
Each text-pattern variant individually ("model is not found", models/...not found, dict-serialised "code": 404)

The negative cases are especially important given the false-positive risk in issue #1 above.

7. Callback test missing the `None` and raise paths (`test_pipeline_run.py`)

The new test confirms the happy path. Two additional partitions are needed once issue #2 is fixed:

on_item_processed=None — must not raise
Callback that raises — must log and continue, not abort the batch

What's good

The text-based fallback detection is the right approach — SDK error shapes vary and relying solely on .code / .status_code has already proven fragile.
Callable[[ProcessedItem], None] | None typing is clean and correct.
CHANGELOG, PROGRESS, and Mini-Retro all updated per the project mandates — good hygiene.
The workflow if: always() fix is the correct lever for commit-on-failure.

Summary

Issue	Severity
`"not_found"` / `'"code": 404'` false-positive risk	High
Callback exception unhandled — batch abort on write failure	High
Split imports fail `make check`	High
Push step not verified for `if: always()`	High
Redundant final `_merge_and_write`	Moderate
Thin test partitions (missing negatives)	Moderate

Please address issues 1–4 before merge; 5–6 are strongly recommended.

Generated by Claude Code

Agent-Logs-Url: https://github.com/davidamitchell/Latest-developments-/sessions/4db7f054-7de2-4c60-a3c1-1d7429327fc9 Co-authored-by: davidamitchell <635733+davidamitchell@users.noreply.github.com>

Copilot · 2026-05-13T19:45:38Z

Code Review — PR #58: Harden Gemini 404 model fallback + incremental persistence

The intent is sound and the three changes work together coherently. There are, however, a few correctness and res...

Addressed in commit a8197bd.

Changes made:

Tightened is_model_not_found_error() text-path detection with model-specific guards and added compact/space code:404 variants.
Added broader partition coverage in tests/test_pipeline_quota.py, including negative cases.
Wrapped on_item_processed callback in process() with warning-log + continue behavior.
Added callback partition tests in tests/test_pipeline_run.py for None and raising callback paths.
Reduced redundant final merge/write by skipping full merge when incremental per-item persistence succeeded.
Updated fetch-and-process.yml commit step to always run git push origin HEAD under if: always().

Ruff flagged the I001 import-sort issue and its formatter wrapped four lines that exceeded the 100-char project limit. No logic changes. https://claude.ai/code/session_01QVcqWDPsVT7EJ6LYF6bzzq

github-actions · 2026-05-14T08:49:39Z

Dependency Review

✅ No vulnerabilities or license issues or OpenSSF Scorecard issues found.

Snapshot Warnings

⚠️: No snapshots were found for the head SHA 5733819.

Ensure that dependencies are being submitted on PR branches and consider enabling retry-on-snapshot-warnings. See the documentation for more information and troubleshooting advice.

Scanned Files

None

davidamitchell

Follow-up: Logic-against-intention verification + ruff fix

Checked out the branch, read the actual files (which include a8197bd — the review-feedback commit), and traced each change against its stated intent.

Fix applied: ruff line-length (commit `5733819`)

ruff check --fix && ruff format wrapped four lines that exceeded the 100-char project limit inside run.py. No logic changes. All 9 test_pipeline_quota.py tests pass. The test_pipeline_run.py failures are pre-existing baseline issues (httpx, google.genai not installed in this environment) — confirmed unrelated to this PR.

Logic trace: `is_model_not_found_error()` ✅

has_model_hint = "model" in text or "models/" in text
return (
    ("not_found" in text and has_model_hint)           # A: gRPC NOT_FOUND + model mention
    or "model is not found" in text                    # B: natural-language phrase
    or ("models/" in text and "not found" in text)     # C: model path format
    or (...'"code": 404'... and has_model_hint)        # D: JSON payload + model mention
)

Each pattern covers a distinct Gemini error shape:

A catches the gRPC NOT_FOUND status string (lowercased not_found with underscore, distinct from natural-language "not found" with a space). has_model_hint prevents matching generic resource-not-found errors.
B catches natural-language phrases like "model is not found" — not redundant with A because "not found" (space) ≠ "not_found" (underscore).
C catches the Gemini model-path format models/xxx is not found — specific and correct.
D catches serialised error payloads where the SDK stringifies the full JSON response into the exception message. Guarded by has_model_hint.

One residual edge case: an error like "API config for model 'x' returned code: 404" could trigger D. Theoretical false positive; in practice Gemini SDK only surfaces model-ID 404s through this path. Acceptable.

Logic trace: `on_item_processed` callback resilience ✅

In process(): item is appended to results before the callback fires — it is never lost from the in-memory batch even if the callback raises. The exception is caught, logged at WARNING, and swallowed — the loop continues to the next item.

In main() — the _persist_item/callback_failed design: the _persist_item closure sets callback_failed = True then re-raises; process() catches and swallows that raise. callback_failed bridges the swallowed exception back to main() to trigger the fallback _merge_and_write. Three scenarios all handled correctly:

Scenario	callback_failed	how merged
All writes succeed	False	`read_processed_jsonl` — no extra I/O
Write fails for item N, recovers later	True	`_merge_and_write` with full in-memory batch
All writes fail	True	`_merge_and_write` with full in-memory batch

Every processed item ends up on disk in all cases. The flag elegantly avoids a redundant write on the happy path — the concern about redundant I/O from the initial review is resolved.

Logic trace: workflow `if: always()` ✅

git push origin HEAD was already outside the if git diff --cached --quiet; fi block before this PR — it always pushes unconditionally. The PR adds if: always() to the step itself so it runs even when the process job fails. Together they satisfy workflow checklist item 6. My original concern that the push might be missing if: always() was unfounded — the push was already unconditional inside the script.

All original issues resolved

Issue	Status
`"not_found"` / `'"code": 404'` false-positive risk	Fixed — `has_model_hint` guard (`a8197bd`)
Callback exception unhandled — batch abort on write failure	Fixed — `try/except` in `process()` + `callback_failed` fallback (`a8197bd`)
Split imports failing ruff	Fixed — ruff auto-format, one-alias-per-block is its canonical form (`5733819`)
Push step `if: always()` not verified	Non-issue — push was already unconditional
Redundant final `_merge_and_write`	Resolved — `callback_failed` skips it on the happy path
Thin test partitions (missing negatives)	Fixed — 6 tests covering positives, negatives, boundary; `None` and raise paths for callback (`a8197bd`)

Logic matches intention across all three stated goals. Ready to merge.

Generated by Claude Code

Initial plan

e6ee205

Copilot AI assigned Copilot and davidamitchell May 13, 2026

Copilot started work on behalf of davidamitchell May 13, 2026 09:15 View session

Copilot AI linked an issue May 13, 2026 that may be closed by this pull request

Pipeline errors #57

Closed

Copilot AI and others added 2 commits May 13, 2026 09:27

Fix pipeline model 404 handling and persist processed progress increm…

8f33ea7

…entally Agent-Logs-Url: https://github.com/davidamitchell/Latest-developments-/sessions/6ee60102-2fec-425c-a800-f640126e4124 Co-authored-by: davidamitchell <635733+davidamitchell@users.noreply.github.com>

Polish progress note after validation feedback

e34254e

Agent-Logs-Url: https://github.com/davidamitchell/Latest-developments-/sessions/6ee60102-2fec-425c-a800-f640126e4124 Co-authored-by: davidamitchell <635733+davidamitchell@users.noreply.github.com>

Copilot AI changed the title ~~[WIP] Fix pipeline errors during AI enrichment process~~ Harden Gemini 404 model fallback and persist pipeline output incrementally May 13, 2026

Copilot AI requested a review from davidamitchell May 13, 2026 09:32

Copilot finished work on behalf of davidamitchell May 13, 2026 09:32

davidamitchell requested changes May 13, 2026

View reviewed changes

Copilot started work on behalf of davidamitchell May 13, 2026 19:39 View session

Address PR review feedback on quota detection and callback resilience

a8197bd

Agent-Logs-Url: https://github.com/davidamitchell/Latest-developments-/sessions/4db7f054-7de2-4c60-a3c1-1d7429327fc9 Co-authored-by: davidamitchell <635733+davidamitchell@users.noreply.github.com>

Copilot finished work on behalf of davidamitchell May 13, 2026 19:46

Copilot AI requested a review from davidamitchell May 13, 2026 19:46

Apply ruff auto-format: wrap lines exceeding 100-char limit in run.py

5733819

Ruff flagged the I001 import-sort issue and its formatter wrapped four lines that exceeded the 100-char project limit. No logic changes. https://claude.ai/code/session_01QVcqWDPsVT7EJ6LYF6bzzq

davidamitchell approved these changes May 14, 2026

View reviewed changes

davidamitchell marked this pull request as ready for review May 14, 2026 08:52

davidamitchell merged commit 5c4d819 into main May 14, 2026
5 of 7 checks passed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Harden Gemini 404 model fallback and persist pipeline output incrementally#58

Harden Gemini 404 model fallback and persist pipeline output incrementally#58
davidamitchell merged 5 commits into
mainfrom
copilot/fix-pipeline-errors

Copilot AI commented May 13, 2026 •

edited

Loading

Uh oh!

davidamitchell left a comment

Uh oh!

Copilot AI commented May 13, 2026

Code Review — PR #58: Harden Gemini 404 model fallback + incremental persistence

Uh oh!

github-actions Bot commented May 14, 2026

Uh oh!

davidamitchell left a comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

Conversation

Copilot AI commented May 13, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

davidamitchell left a comment

Choose a reason for hiding this comment

Code Review — PR #58: Harden Gemini 404 model fallback + incremental persistence

Overview

Issues requiring changes

1. "not_found" text match is too broad — false-positive risk (_quota.py)

2. Callback exception is unhandled — a write failure aborts the whole batch (run.py)

3. Split imports from the same module will fail make check (run.py)

4. Workflow push step not verified for if: always() (checklist item 6)

Moderate concerns

5. Final _merge_and_write after the loop is redundant (run.py)

Minor / test coverage

6. Test partitions are thin (test_pipeline_quota.py)

7. Callback test missing the None and raise paths (test_pipeline_run.py)

What's good

Summary

Uh oh!

Copilot AI commented May 13, 2026

Code Review — PR #58: Harden Gemini 404 model fallback + incremental persistence

Uh oh!

github-actions Bot commented May 14, 2026

Dependency Review

Snapshot Warnings

Scanned Files

Uh oh!

davidamitchell left a comment

Choose a reason for hiding this comment

Follow-up: Logic-against-intention verification + ruff fix

Fix applied: ruff line-length (commit 5733819)

Logic trace: is_model_not_found_error() ✅

Logic trace: on_item_processed callback resilience ✅

Logic trace: workflow if: always() ✅

All original issues resolved

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

Copilot AI commented May 13, 2026 •

edited

Loading

1. `"not_found"` text match is too broad — false-positive risk (`_quota.py`)

2. Callback exception is unhandled — a write failure aborts the whole batch (`run.py`)

3. Split imports from the same module will fail `make check` (`run.py`)

4. Workflow push step not verified for `if: always()` (checklist item 6)

5. Final `_merge_and_write` after the loop is redundant (`run.py`)

6. Test partitions are thin (`test_pipeline_quota.py`)

7. Callback test missing the `None` and raise paths (`test_pipeline_run.py`)

Fix applied: ruff line-length (commit `5733819`)

Logic trace: `is_model_not_found_error()` ✅

Logic trace: `on_item_processed` callback resilience ✅

Logic trace: workflow `if: always()` ✅