
tests: add tests for OCRDriver: [#2964] #3028

Merged: Salazareo merged 2 commits into main from DS/DS/2964 on May 9, 2026

@Salazareo
Member

Adds offline OCRDriver.test.ts covering both providers:

• test_mode short-circuit; argument validation (missing actor, missing
source, unknown provider, AWS/Mistral not configured)
• aws-textract: raw-bytes vs S3Object source selection (regional
client when fsEntry has a bucket), block normalisation (PAGE/WORD/
TABLE filtered, LINE/LAYOUT_TITLE → text/textract:* blocks),
402 on insufficient credits, per-page metering
• mistral: image vs PDF chunk packaging (image_url with base64 data
URL vs document_url with documentName), pass-through of pages /
annotation / image-limit options, markdown → LINE-block
normalisation with page indices, per-page metering, additional
annotations metering when bbox/document annotation formats are set
• default-provider selection (AWS preferred → Mistral fallback →
500 when neither is configured)
• getReportedCosts mirrors costs.ts
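
Two of the behaviours listed above (Textract block normalisation and default-provider selection) can be sketched as small pure functions. This is only an illustration of what the tests assert, not the driver's actual code: the names `RawBlock`, `NormalisedBlock`, `normaliseBlocks` and `pickDefaultProvider` are hypothetical, the output type is assumed to be the prefix `text/textract:` followed by the Textract block type, and the provider ids are assumed to match the bullet labels.

```typescript
// Hypothetical shapes; the real driver's types are not shown in this PR.
interface RawBlock {
  BlockType: string;
  Text?: string;
}

interface NormalisedBlock {
  type: string;
  text: string;
}

// Block types the description says are filtered out.
const DROPPED_TYPES = new Set<string>(['PAGE', 'WORD', 'TABLE']);

// Keep text-bearing blocks and tag them as text/textract:<BlockType>
// (assumed naming, inferred from "text/textract:*" in the description).
function normaliseBlocks(raw: RawBlock[]): NormalisedBlock[] {
  return raw
    .filter((b) => !DROPPED_TYPES.has(b.BlockType))
    .map((b) => ({ type: `text/textract:${b.BlockType}`, text: b.Text ?? '' }));
}

type Provider = 'aws-textract' | 'mistral';

// AWS preferred, Mistral as fallback, error with status 500 when neither is set up.
function pickDefaultProvider(cfg: {
  awsConfigured: boolean;
  mistralConfigured: boolean;
}): Provider {
  if (cfg.awsConfigured) return 'aws-textract';
  if (cfg.mistralConfigured) return 'mistral';
  // Hypothetical error shape: a plain Error carrying an HTTP status.
  throw Object.assign(new Error('no OCR provider configured'), { status: 500 });
}

const sample: RawBlock[] = [
  { BlockType: 'PAGE' },
  { BlockType: 'LINE', Text: 'Hello world' },
  { BlockType: 'WORD', Text: 'Hello' },
  { BlockType: 'LAYOUT_TITLE', Text: 'Invoice' },
  { BlockType: 'TABLE' },
];

console.log(normaliseBlocks(sample).map((b) => b.type));
console.log(pickDefaultProvider({ awsConfigured: false, mistralConfigured: true }));
```

Keeping both rules as pure functions makes them easy to assert on without any SDK in the loop; the real tests exercise the same behaviour through the driver instead.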

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@github-actions
Contributor

github-actions Bot commented May 9, 2026

Coverage Report

| Status | Category | Percentage | Covered / Total |
| --- | --- | --- | --- |
| 🔵 | Lines | 37.77% (⬆️ +0.51%) | 6676 / 17671 |
| 🔵 | Statements | 37.03% (⬆️ +0.51%) | 7054 / 19047 |
| 🔵 | Functions | 39.65% (⬆️ +0.47%) | 1184 / 2986 |
| 🔵 | Branches | 26.5% (⬆️ +0.52%) | 3717 / 14026 |
File Coverage: No changed files found.
Generated in workflow #89 for commit bb3b171 by the Vitest Coverage Report Action

Drops the manual config/clients/stores/services stub apparatus and
the loadFileInput mock in favour of the live wired driver from
server.drivers.aiOcr. The Textract and Mistral SDKs are still mocked
at the module boundary (the real network egress points); inputs go
through the real loadFileInput against real fs/store wiring (data
URLs for most cases; FSService.write produces a real fsEntry for the
PDF documentName test). Aligns with AGENTS.md: "Prefer test server
over mocking deps."

The S3Object-source/regional-client assertion was dropped because it
isn't deterministic against the in-memory S3 store and the driver's
per-region TextractClient cache leaks across tests. That branch is
better exercised by a real-cloud integration test.
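
The cache leak mentioned above can be illustrated in isolation. The sketch below assumes a module-level `Map` keyed by region, which is one common way such a cache is written; `FakeClient`, `clientFor` and `resetClientCache` are hypothetical names, and the real driver's cache may look different.

```typescript
// Hypothetical stand-in for an SDK client; no AWS SDK is involved here.
interface FakeClient {
  region: string;
  id: number;
}

let created = 0;

// Module-level cache: it lives as long as the module does, so it is
// shared by every test in the same file unless something clears it.
const clientCache = new Map<string, FakeClient>();

function clientFor(region: string): FakeClient {
  let client = clientCache.get(region);
  if (!client) {
    client = { region, id: ++created };
    clientCache.set(region, client);
  }
  return client;
}

// "Test 1" populates the cache for a region.
const first = clientFor('us-west-2');

// "Test 2" expects a fresh client but receives the cached instance.
const second = clientFor('us-west-2');
console.log(first === second); // true

// A hypothetical reset hook (e.g. called from beforeEach) would restore isolation.
function resetClientCache(): void {
  clientCache.clear();
}

resetClientCache();
const third = clientFor('us-west-2');
console.log(first === third); // false
```

A reset hook like this is one possible remedy; this PR instead drops the assertion and leaves the branch to a real-cloud integration test.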

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@Salazareo Salazareo merged commit 64402b0 into main May 9, 2026
4 checks passed
@Salazareo Salazareo deleted the DS/DS/2964 branch May 10, 2026 21:29