Port collection runner into self-healing path by giaphutran12 · Pull Request #43 · tinyfish-io/bigset

giaphutran12 · 2026-05-22T15:13:50Z

Summary

- vendor the collection pipeline runtime source needed by the BigSet populate stack
- add backend/src/pipeline/collection-agent-runner.ts, exporting runCollectionPopulatePipeline(input) for POPULATE_COLLECTION_RUNNER_MODULE
- map collection pipeline output into PopulateRuntimeResult rows, evidence, usage, metrics, and validation issues
- require COLLECTION_AGENT_PIPELINE_MODULE explicitly so the built backend does not silently import vendored TypeScript source
- fix collection metrics so initial and repair TinyFish Agent dispatches are counted together, without a fake agentRuns=1
- add a runner unit test proving that recipe instructions, benchmark metadata, required columns, output mapping, and repair metrics flow through the runner contract
- fix benchmark failure text so claim-support failures name the missing claim-support entities, not the missing entity-coverage entities
- add @tiny-fish/sdk to backend dependencies for the vendored TinyFish integration
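
A minimal sketch of two behaviors from the list above: refusing to run without an explicit COLLECTION_AGENT_PIPELINE_MODULE, and counting initial and repair Agent dispatches together. The function names, the Env type, and the DispatchCounts fields are assumptions made for illustration; the PR text does not show the actual implementation.

```typescript
// Hypothetical shapes: the real types live in the repository and are not shown in this PR text.
type Env = Record<string, string | undefined>;

interface DispatchCounts {
  initialAgentDispatches: number;
  repairAgentDispatches: number;
}

// Fail fast when the pipeline module is not configured, rather than
// silently falling back to the vendored TypeScript source.
function requirePipelineModule(env: Env): string {
  const modulePath = env.COLLECTION_AGENT_PIPELINE_MODULE?.trim();
  if (!modulePath) {
    throw new Error("COLLECTION_AGENT_PIPELINE_MODULE must be set explicitly");
  }
  return modulePath;
}

// Count initial and repair dispatches together. A run with no Agent
// dispatches reports 0 instead of a placeholder value of 1.
function countAgentRuns(counts: DispatchCounts): number {
  return counts.initialAgentDispatches + counts.repairAgentDispatches;
}
```

Taking the environment as a parameter, instead of reading a global, keeps the check easy to exercise in a unit test.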

Verification

- node --check benchmarks/dataset-agent/run-benchmark.mjs
- npm --prefix backend test -- test/collection-agent-runner.test.ts
- npm --prefix backend run build
- make verify-self-healing
- code-reviewer subagent re-review: no findings; prior P1/P2 blockers resolved
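
The kind of check the runner unit test performs can be pictured roughly as follows. This is a sketch under stated assumptions, not the test in test/collection-agent-runner.test.ts: FakePipelineOutput, FakeRuntimeResult, and mapOutput are hypothetical stand-ins for the real pipeline output, PopulateRuntimeResult, and the runner's mapping logic.

```typescript
// Hypothetical stand-ins for the repository's real types.
interface FakePipelineOutput {
  records: Array<{ values: Record<string, string>; quote: string; sourceUrl: string }>;
}

interface FakeRuntimeResult {
  rows: Array<Record<string, string>>;
  evidence: Array<{ rowIndex: number; quote: string; sourceUrl: string }>;
  validationIssues: string[];
}

// Map pipeline records to rows plus evidence, and flag rows that lack a required column.
function mapOutput(output: FakePipelineOutput, requiredColumns: string[]): FakeRuntimeResult {
  const rows = output.records.map((record) => record.values);
  const evidence = output.records.map((record, rowIndex) => ({
    rowIndex,
    quote: record.quote,
    sourceUrl: record.sourceUrl,
  }));
  const validationIssues: string[] = [];
  rows.forEach((row, rowIndex) => {
    for (const column of requiredColumns) {
      if (!row[column]) {
        validationIssues.push(`row ${rowIndex}: missing required column "${column}"`);
      }
    }
  });
  return { rows, evidence, validationIssues };
}

// Stubbed pipeline output: the second record lacks the required "price" column.
const stubOutput: FakePipelineOutput = {
  records: [
    { values: { name: "Croissant", price: "3.50" }, quote: "Croissant 3.50", sourceUrl: "https://example.com/menu" },
    { values: { name: "Baguette" }, quote: "Baguette", sourceUrl: "https://example.com/menu" },
  ],
};

const mapped = mapOutput(stubOutput, ["name", "price"]);
```

Feeding a stub through the mapping keeps the test independent of search, fetch, and LLM calls.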

Real Benchmark Evidence

With keys loaded for execution only:

- explicit pipeline env, no-Agent collection run, 2 prompts:
  - COLLECTION_AGENT_ENABLE_AGENT=false
  - COLLECTION_AGENT_PIPELINE_MODULE=./backend/BigSet_Data_Collection_Agent/src/orchestrator/pipeline.ts
  - BIGSET_COLLECTION_BENCHMARK_RUNNER_MODULE=./backend/src/pipeline/collection-agent-runner.ts
  - result: 2/2 passed, 7 rows, 13 evidence quotes, 13 source URLs, cost $0.010813
- earlier default collection run, 2 prompts:
  - result: 0/2 passed, 1 failed, 1 blocked by 10-minute timeout
  - saas-pricing-pages produced 3 rows, score 0.967, cost $0.006087
- explicit pipeline env, no-Agent collection run, full 16-prompt benchmark:
  - result: 4/16 passed, 12 failed, 0 blocked
  - cost $0.100698; wall time 16m 13s
  - output volume: 131 rows, 195 evidence quotes, 94 source URLs
  - calls/tokens: 93 search calls, 206 fetch calls, 1,020,923 total tokens, 0 browser/Agent runs
  - passed: hcmc-bakery-products, california-insurance-prices, la-coke-menu-lol, pastry-things-menlo
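
To put the full 16-prompt run in per-unit terms, the reported totals can be reduced to a pass rate and per-row averages. The numbers below are copied from the results above; the variable names are only illustrative.

```typescript
// Totals copied from the full 16-prompt, no-Agent run reported above.
const fullRun = {
  prompts: 16,
  passed: 4,
  costUsd: 0.100698,
  rows: 131,
  evidenceQuotes: 195,
  totalTokens: 1020923,
};

// Derived ratios.
const passRate = fullRun.passed / fullRun.prompts;
const costPerRowUsd = fullRun.costUsd / fullRun.rows;
const quotesPerRow = fullRun.evidenceQuotes / fullRun.rows;
const tokensPerRow = fullRun.totalTokens / fullRun.rows;

console.log({ passRate, costPerRowUsd, quotesPerRow, tokensPerRow });
```

The pass rate works out to 25%, and the cost stays under a tenth of a cent per row, which matches the description of this lane as cheap but not yet good enough on quality.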

Conclusion: the real collection runner now executes through the self-healing benchmark path. The cheap Search/Fetch + LLM lane produces rows and passes some prompts (4/16 on the full benchmark), but its quality is not default-ready yet. The Agent-enabled default path, which hit the 10-minute timeout in the 2-prompt run, still needs timeout/polling work before it should become the default.

Notes

No merge. This stays stacked after PR #42.

coderabbitai · 2026-05-22T15:13:59Z

Important

Review skipped

Draft detected.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 7a99fa27-cf6f-44b4-bff9-9909c4c96881

Port collection pipeline runner into self-healing path

ca90366

giaphutran12 self-assigned this May 22, 2026

Harden collection runner wiring

d476174

This was referenced May 22, 2026

Bound collection Agent runtime defaults #44

Draft

Document collection Agent canary result #47

Draft

giaphutran12 wants to merge 2 commits into codex/migration-plan-status-refresh from codex/collection-runner-port
