feat: agent-native retrospective — analyze / verify guards / fixture content checks#1133
Merged
feat: agent-native retrospective — analyze / verify guards / fixture content checks#1133
Conversation
…content checks Post-mortem on slow 1point3acres + 51job adapter sessions, consolidated into one PR. Scope is "reduce uncertainty and catch silent failures" — the two things that sink agent success rate on first-time adapters. Changes: - `browser analyze <url>` — one command returns pattern (A/B/C/D), anti-bot vendor (Aliyun/Cloudflare/Akamai/Geetest), nearest adapter, and a single-sentence recommended_next_step. Replaces the three-step open/wait/network recon loop when it can reach a confident verdict. - `browser wait xhr <regex>` — poll for a specific XHR URL instead of blind `wait time N`, so SPA data-arrival barriers are deterministic. - Fixture `mustNotContain` / `mustBeTruthy` — catch two silent-failure modes `notEmpty` misses: content contamination (sibling DOM bleed) and `|| 0` / `|| false` fallbacks. - `browser verify` post-success site-memory check + `--strict-memory` — verify-green no longer hides the case where `~/.opencli/sites/` was never written back. Memory only materializes if authors write it. - CI: guard that committed `cli-manifest.json` matches a fresh build. Main was already drifted (#1118 left stale ordering + a missing arg); this PR regenerates the manifest and will catch the next drift. Docs (opencli-adapter-author + opencli-autofix skills): - `success-rate-pitfalls.md` — 10 concrete silent-failure scenarios seen in real adapter sessions, each with defense via fixture / adapter patterns. - `autofix` gains discipline rule #6: verify pattern failure means tighten the adapter, never loosen the fixture. - `site-recon.md` leads with `browser analyze`; `api-discovery.md` adds a §0 covering WAF vendor detection and cross-subdomain CORS (the two gotchas that burned the 51job session). - `wait time 3` → `wait time 2`, with `wait xhr` as the robust choice.
Three adapters (chatgpt/image, gemini/image, instagram/download) baked `path.join(os.homedir(), ...)` into the `default` field of their args. The committed manifest therefore carried my personal `/Users/jakevin/...` paths — which agents running on a different host saw as surprising defaults. The drift guard I just added to CI caught it on the first run. Runtime behavior is unchanged: each adapter still falls back to `path.join(os.homedir(), …)` inside `func` when the kwarg is absent. Only the displayed / registered default becomes a tilde-path.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Post-mortem on slow 1point3acres + 51job adapter sessions, consolidated into one PR. Scope = "reduce uncertainty and catch silent failures" — the two things that sink agent success rate on first-time adapters.
Principle guarding scope: agents do NOT fear repetition/boilerplate/handwriting; agents DO fear uncertainty (which path / which request / what to persist) and silent failures (verify-green but data is wrong). Every change here maps to one of those two.
What changed
New commands
browser analyze <url>— one-shot site recon. Returnspattern(A/B/C/D classification),anti_bot(Aliyun WAF / Cloudflare / Akamai / Geetest vendor detection + evidence),nearest_adapter(registry lookup for adapter reuse), and a single-sentencerecommended_next_step. Replaces the manualopen → wait → network → squint → guessloop.browser wait xhr <regex>— poll for a specific XHR URL. Replaces blindwait time 5on SPA pages; data-arrival barriers become deterministic.New fixture rules (catch silent failures)
mustNotContain: { col: [substrs] }— catches content contamination (sibling DOM bleed, breadcrumb prefixes, e.g.descriptionaccidentally containingaddress:/category:from a tag node).notEmptydoes not catch this.mustBeTruthy: [col]— catches|| 0/|| falsesilent fallbacks on numeric/boolean columns.notEmpty/typesdo not catch this.Guards
verifypost-success site-memory check +--strict-memoryflag. Verify-green now surfaces when~/.opencli/sites/<site>/was never written back;--strict-memorymakes the missing memory a hard failure. Memory only materializes if authors write it; previously there was no feedback loop.cli-manifest.jsondrift guard. Committed manifest must match a freshnpm run build. Main was already drifted (PR fix(bilibili): resolve full video URLs and preserve full description #1118 left bilibili/video reordering + a jianyusince_daysarg not committed to the manifest); this PR regenerates it and the guard prevents the next drift.Docs (
opencli-adapter-author+opencli-autofix)references/success-rate-pitfalls.md— 10 concrete silent-failure scenarios from real sessions, each with defense via fixture/adapter patterns.autofix: Discipline rule fix: escape evaluate args in browser scripts #6 — verify pattern failure = tighten adapter, never loosen the fixture. Fixture relaxation is the canonical way to silently accept broken data.site-recon.md: Leads withbrowser analyze; manual 3-step is the fallback.wait time 3→wait time 2(2s is enough;wait xhris more robust still).api-discovery.md: New §0 covering WAF vendor detection and cross-subdomain CORS — the two gotchas that burned the 51job session (acw_sc__v2cookie meant the Node fetch was always going to return slider HTML;cupid.51job.comfromjobs.51job.comwas CORS-blocked even withcredentials:'include').site-memory.md: Documents the new fixture fields; adds explicit warning not to loosenpatternsto pass verify.Test plan
npx tsc --noEmitcleannpx vitest run src/— 845 passed / 2 skippednpm run buildidempotent (only drift is from main's preexisting stale manifest — which this PR fixes)analyze.ts(WAF detection, pattern classification, nearest-adapter matching, top-level assembly) + 3 for fixture content rulesbrowser analyze https://www.51job.com/after merge — expectsanti_bot.vendor = aliyun_wafbrowser wait xhr '/api/...'on a slow SPA after mergeReview focus
Two pairs of reviewers worth: agent-UX (skill docs +
analyzeoutput shape) and CLI correctness (wait xhrpolling,verifymemory-write check, manifest drift guard).Background: one PR consolidation explicitly approved by WAWQAQ (msg 0de55bc8, "有必要的我们都做,全部一次性做掉").