# ClawLab

ClawLab is a persistent Auto Research Loop system built on top of a Claude Code CLI source snapshot. It is designed for real repo work: topic-driven research, existing-project improvement, SSH/GPU experiments, and now a practical rebuttal workflow.
These are the parts that are actually implemented and runnable in this repo today:

- `new_project` and `existing_project_improvement` research missions via `/research start`
- explicit state-machine-driven research loop under `src/research/**`
- local + SSH executor support for experiments
- model routing for `auto`, `anthropic_oauth`, `anthropic_api_key`, and `openai_compatible`
- native integration inspection/scaffolding for Codex, Claude Code, and OpenClaw
- a local rebuttal pipeline that reads paper/review files, scans repo evidence, applies venue policies, and drafts rebuttal artifacts
- a small executable local skill catalog that can be listed, shown, and run
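The `auto` routing mode can be pictured as a fallback chain over whatever credentials are present. A minimal sketch, assuming env-variable-based detection; the variables checked and the priority order here are assumptions, not ClawLab's exact selection logic:

```shell
# Hedged sketch of an `auto` provider-routing decision. The env vars and
# ordering are illustrative assumptions, not ClawLab's actual logic.
resolve_provider() {
  if [ -n "${ANTHROPIC_API_KEY:-}" ]; then
    echo "anthropic_api_key"
  elif [ -n "${OPENAI_API_KEY:-}" ]; then
    echo "openai_compatible"
  else
    # fall back to interactive OAuth (the `/login` flow)
    echo "anthropic_oauth"
  fi
}
```

The point is only that `auto` resolves to one of the four named modes deterministically from local state.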
These are deliberately not overstated:

- `/research team ...` is currently team scaffolding and role guidance, not a full embedded OMX runtime
- the rebuttal pipeline is artifact-first and locally runnable, but model-assisted drafting still depends on whatever model auth/provider is actually configured on your machine
- the repo-wide TypeScript baseline is still noisy because this source snapshot contains many unrelated upstream issues; the focused ClawLab tests are the reliable verification path right now
## Quick start

```
bun install
bun src/entrypoints/cli.tsx
```

Or use the binary alias:

```
clawlab
```

If you want Anthropic OAuth-backed model access inside the CLI:

```
/login
```

Initialize the local scaffold once:

```
/research setup
```

This creates:

```
.clawlab/tasks/
.clawlab/docs/
.clawlab/memory/
.clawlab/team/
.clawlab/skills/
.clawlab/rebuttal/
.clawlab/integrations/
```
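As a sanity check after `/research setup`, the expected layout can be verified with a small helper. A sketch; only the directory names are taken from the list above, the helper itself is not part of ClawLab:

```shell
# Verify the .clawlab scaffold exists under a project root.
# Directory names are the ones `/research setup` is documented to create.
clawlab_scaffold_ok() {
  root="$1"
  for d in tasks docs memory team skills rebuttal integrations; do
    if [ ! -d "$root/.clawlab/$d" ]; then
      echo "missing: .clawlab/$d"
      return 1
    fi
  done
  echo "ok"
}
```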
## Starting research

Start a new-topic mission:

```
/research start --mode new "test-time adaptation for multimodal agents"
```

Improve an existing project:

```
/research start \
  --mode improve \
  --repo /path/to/project \
  --problem "validation F1 is stuck around 0.72 after epoch 3" \
  --target-metric f1 \
  --current-metric f1=0.72 \
  --goal "push F1 beyond 0.76 without a large inference-cost regression"
```

Summarize results:

```
/research summarize report
/research summarize summary
/research summarize paper
```

## Integrations

ClawLab now has a real integration layer for three external ecosystems:

- `codex`
- `claude-code`
- `openclaw`
Commands:

```
/research integration status
/research integration doctor
/research integration doctor codex
/research integration init codex
/research integration init claude-code
/research integration init openclaw
```

What it does:

- detects CLI availability on `PATH`
- checks user-level config locations
- checks whether project-local adapter files exist
- performs conservative auth detection where that is statically safe
- writes project-local adapter templates under `.codex/`, `.claude/`, and `.openclaw/`
Current auth detection policy is intentionally conservative:

- Codex: detects `auth.json` or `OPENAI_API_KEY`
- Claude Code: can confirm env-backed auth, but does not claim an interactive Claude login is valid from static files alone
- OpenClaw: checks config/profile signals, not live gateway liveness
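The Codex check, for instance, reduces to two static signals. A sketch of that conservative logic; the `~/.codex` location for `auth.json` is an assumption, not something the source states:

```shell
# Conservative, static-only Codex auth detection: report which signal is
# present without validating it. CODEX_HOME / ~/.codex is an assumed location.
codex_auth_signal() {
  if [ -f "${CODEX_HOME:-$HOME/.codex}/auth.json" ]; then
    echo "auth.json"
  elif [ -n "${OPENAI_API_KEY:-}" ]; then
    echo "OPENAI_API_KEY"
  else
    echo "none"
  fi
}
```

Note that "detected" here means a credential file or variable exists, not that it is currently valid, which is exactly the conservatism described above.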
## Rebuttal pipeline

ClawLab now includes a runnable rebuttal path.

```
/research rebuttal init
/research rebuttal plan \
  --paper /path/to/paper.pdf \
  --review /path/to/review1.pdf \
  --review /path/to/review2.txt \
  --repo /path/to/repo \
  --venue neurips
/research rebuttal draft --run-dir /path/to/.clawlab/rebuttal/runs/run_...
/research rebuttal validate --draft /path/to/rebuttal_draft.md --venue neurips
```

Current built-in venue presets:

- `cvpr`
- `neurips`
- `iclr`
- `acl_arr`
- `generic`

Artifacts written per rebuttal run:

- `inputs.json`
- `paper.txt`
- `reviews.txt`
- `venue_policy.json`
- `concerns.json`
- `repo_evidence.json`
- `rebuttal_plan.json`
- `rebuttal_plan.md`
- `rebuttal_draft.md`
- `rebuttal_validation.json`
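Because every run writes the same artifact set, checking whether a run finished can be a plain file test. A sketch over the artifact names listed above, treating `rebuttal_draft.md` as the completion marker (that choice of marker is an assumption):

```shell
# Classify each rebuttal run directory by whether the final draft exists.
# Uses only artifact names from the documented per-run set.
check_rebuttal_runs() {
  runs_dir="$1"
  for run in "$runs_dir"/run_*/; do
    [ -d "$run" ] || continue
    if [ -f "${run}rebuttal_draft.md" ]; then
      echo "complete:   $run"
    else
      echo "incomplete: $run"
    fi
  done
}
```

An incomplete run usually just means `/research rebuttal draft` has not been run against that `--run-dir` yet.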
## Skills

ClawLab now exposes a small executable skill catalog instead of a giant fake list.

Commands:

```
/research skills list
/research skills show integration-doctor
/research skills run integration-doctor
/research skills run review-concern-extract --review /path/to/review.pdf
/research skills run venue-policy-check --draft /path/to/draft.md --venue neurips
```

Current executable built-ins:

- `integration-doctor`
- `review-concern-extract`
- `venue-policy-check`
- `repo-evidence-scan`
- `rebuttal-plan`
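Conceptually, `skills run` is a name-to-handler dispatch. A hypothetical sketch: the skill names mirror the built-ins above, but the handlers and their output are invented stand-ins, not ClawLab's implementation:

```shell
# Hypothetical dispatch over the built-in skill names; each handler here is a
# placeholder echo, not the real skill logic.
run_skill() {
  case "$1" in
    integration-doctor)     echo "inspect codex / claude-code / openclaw setup" ;;
    review-concern-extract) echo "extract concerns from: $2" ;;
    venue-policy-check)     echo "check draft against venue policy" ;;
    repo-evidence-scan)     echo "scan repo for supporting evidence" ;;
    rebuttal-plan)          echo "build a rebuttal plan" ;;
    *) echo "unknown skill: $1" >&2; return 1 ;;
  esac
}
```

The closed `case` list is what makes the catalog "small and executable": unknown names fail fast instead of silently pretending to exist.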
Curated external references are also listed, but they are clearly marked as references rather than pretending to be built-in local skills.
## Team surface

The current `/research team ...` surface is still useful, but be clear about what it is:

- role scaffolding
- team memory templates
- role switch/status commands
- role-oriented playbook recommendations

It is not the same thing as a fully embedded OMX `$team` runtime.

Available commands:

```
/research team init
/research team status
/research team roles
/research team switch reviewer
/research team skills --stage experiment
```

## Verification

These are the verification commands that currently give reliable signal for ClawLab work in this repo:

```
bun run lint:clawlab
bun run test:clawlab
bun run check:clawlab
```

What I have actually verified in this environment:

- `bun run test:clawlab` passes
- `bun run lint:clawlab` passes with complexity warnings only

The focused `typecheck:clawlab` script is still limited by inherited upstream TypeScript graph issues from this repository snapshot, so I do not treat it as the main pass/fail gate yet.
## References

- OpenAI Codex docs
- OpenAI Codex GitHub repo
- Claude Code docs
- Anthropic Claude Code docs
- OpenClaw docs
- OpenClaw GitHub repo
- oh-my-codex
- oh-my-claudecode
- Paper2Rebuttal
## Documentation

- English guide: docs/auto-research-loop/README.en.md
- Chinese guide: docs/auto-research-loop/README.zh-CN.md
- Docs landing page: docs/auto-research-loop/README.md
