v0.3.1
Scraper Studio gains AI self-healing: when a saved scraper drifts —
selectors move, a page redesigns, output goes empty or partial — the agent
fixes it in place so the collector_id keeps working and improves,
instead of rebuilding from scratch. The fix is human-in-the-loop by default:
heal stops at an approval gate, and a new approve command commits it.
All changes are additive — existing scraper create and scraper run
invocations behave exactly as before. This delivers the self-healing path
that v0.2.0 listed as planned.
Features
scraper heal — AI self-healing in place (#11)
bdata scraper heal <collector_id> "<prompt>" is the maintenance twin of
scraper create. It triggers Bright Data's AI self-healing flow
(POST /dca/collectors/{id}/refactor_template) and polls progress, reusing
the same async trigger→poll machinery (429 backoff, retry forwarding) as
create.
bdata scraper heal c_xxx \
"Price stopped extracting after the page redesign — it's now in span.price-now" \
--url https://example.com/product/1 -o heal.json
- You are the detector. The CLI never decides on its own that a scraper is
broken — a heal is slow, billable, and mutating. You inspect the run output
and decide. scraper run stays read-only — there is no --heal flag.
- The collector_id is preserved — the scraper is improved, not replaced.
- Required <prompt> (≤1000 chars, validated up front); name what's wrong
and what the correct output should be.
- Carries over --timeout, --max-retries/--no-retry, and all output
flags (-o/--json/--pretty/--legacy-output/--timing/-k).
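Because a heal is slow and billable, a wrapper script may want to reject an unusable prompt before calling the CLI at all. The following is a minimal sketch of such a pre-flight check, not part of bdata: `check_heal_prompt` and `MAX_PROMPT_CHARS` are hypothetical names, and the sketch assumes the documented 1000-character limit is counted the way the shell's `${#var}` counts (which can mean bytes rather than characters, depending on shell and locale).

```shell
# Hypothetical pre-flight check. It mirrors the documented limit only;
# it is not the CLI's own validator.
MAX_PROMPT_CHARS=1000

check_heal_prompt() {
  prompt=$1
  if [ -z "$prompt" ]; then
    echo "heal prompt is required" >&2
    return 1
  fi
  if [ "${#prompt}" -gt "$MAX_PROMPT_CHARS" ]; then
    echo "heal prompt is ${#prompt} chars; limit is $MAX_PROMPT_CHARS" >&2
    return 1
  fi
  return 0
}

# Usage (not executed here):
#   check_heal_prompt "$PROMPT" && bdata scraper heal c_xxx "$PROMPT" -o heal.json
```

The CLI still validates the prompt itself; a check like this only saves a round trip inside a larger script.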
Human-in-the-loop approval gate
By default, heal runs the fix and then stops at an approval gate rather
than committing it — exiting 0 with a status: "awaiting_approval" envelope
that carries preview_result (sample rows the fixed scraper would produce)
and a next_step pointing at scraper approve:
{
"collector_id": "c_xxx",
"status": "awaiting_approval",
"preview_result": [ ... ],
"next_step": "bdata scraper approve c_xxx --url https://example.com/product/1"
}
awaiting_approval is not a failure — the fix is ready and waiting for
your decision.
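Since the gate exits 0, a script cannot tell "awaiting approval" apart from other outcomes by exit code alone and has to read the envelope. The sketch below shows one way to do that with no dependencies. `heal_status` and `heal_next_action` are hypothetical helper names, and the sed pattern assumes a flat, string-valued `status` key like the one in the envelope above; a JSON-aware tool such as jq would be more robust if it is available.

```shell
# Hypothetical helper: print the "status" field of a heal envelope file.
# Only handles a simple string-valued "status" key; not a general JSON parser.
heal_status() {
  sed -n 's/.*"status"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p' "$1" | head -n 1
}

# Map a heal envelope to what the caller should do next.
heal_next_action() {
  case "$(heal_status "$1")" in
    awaiting_approval) echo "review" ;;   # fix is ready: inspect preview_result, then approve or reject
    "")                echo "unknown" ;;  # no status found: treat the file as unreadable
    *)                 echo "other" ;;    # any other status is outside this sketch
  esac
}

# Usage (not executed here):
#   [ "$(heal_next_action heal.json)" = "review" ] && echo "see next_step in heal.json"
```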
scraper approve — commit or reject a fix (#11)
bdata scraper approve <collector_id> commits a fix that heal left awaiting
approval (POST /dca/collectors/{id}/resume_automation_job, then polls to
done). On success the envelope's next_step points at scraper run so
you can verify the committed fix.
# Commit the proposed fix
bdata scraper approve c_xxx --url https://example.com/product/1 -o approve.json
# Reject it and start over with a sharper prompt
bdata scraper approve c_xxx --reject
--auto-approve — fully autonomous heal
For unattended flows, heal --auto-approve approves the fix automatically and
polls through to done in one command:
bdata scraper heal c_xxx \
"Reviews stopped extracting after the page redesign" --auto-approve
The self-healing loop
The intended agent flow: run → inspect → heal → approve → re-run to verify.
bdata scraper run c_xxx https://example.com/product/1 -o run.json # 1. run
# 2. inspect run.json — if the data is wrong:
bdata scraper heal c_xxx "<what's wrong>" --url https://example.com/product/1 -o heal.json
# 3. review heal.json's preview_result, then commit:
bdata scraper approve c_xxx --url https://example.com/product/1 -o approve.json
# 4. re-run to verify the committed fix
Non-destructive failure
A failed heal (429 cap exhausted, timeout, or a terminal failed status) leaves the
existing scraper unchanged and still working — distinct from create,
where a failure can leave a half-built collector. The recovery note says so.
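Because a failed heal leaves the scraper untouched, a wrapper can treat the failure as recoverable and keep using the same collector_id. The following sketches one run → inspect → heal pass under stated assumptions: `heal_pass` and `looks_broken` are hypothetical names, the detector is a deliberately naive placeholder (the real decision is yours), and the sketch assumes bdata exits non-zero when a command fails, which these notes do not state explicitly.

```shell
# Placeholder detector (assumption): treat a missing or empty output file as broken.
# Replace this with a check that matches your own data.
looks_broken() {
  [ ! -s "$1" ]
}

# One run -> inspect -> heal pass. Writes run.json and heal.json in the current directory.
heal_pass() {
  id=$1
  url=$2
  prompt=$3

  bdata scraper run "$id" "$url" -o run.json || return 1

  if ! looks_broken run.json; then
    echo "output looks fine; nothing to heal"
    return 0
  fi

  if bdata scraper heal "$id" "$prompt" --url "$url" -o heal.json; then
    echo "heal finished; review heal.json before approving"
  else
    # A failed heal leaves the existing scraper as it was, so the same
    # collector_id can be run again or healed again with a sharper prompt.
    echo "heal failed; $id is unchanged" >&2
    return 2
  fi
}

# Usage (not executed here):
#   heal_pass c_xxx https://example.com/product/1 "Price is empty; expected a number"
```

The pass stops before approval on purpose, matching the human-in-the-loop default.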
Upgrade notes
- No action required — fully additive, backward compatible.
- scraper run is unchanged and remains read-only by design.
Full changelog: v0.3.0...v0.3.1 (PR #11)