A Lightless Labs research project studying AI agent behavior at scale through multi-disciplinary corpus analysis (basically, throwing stuff at the wall and seeing what sticks).
Named after Tiffany Aching's concept from Discworld: first thoughts (agents thinking), second thoughts (agents analyzing their thinking), third thoughts (this project analyzing that).
Third Thoughts has two halves that share a corpus and a methodology:
- middens/: a Rust CLI for extracting behavioral patterns from AI agent session logs. It parses transcripts from Claude Code, Codex, OpenClaw, and Pi coding-agent sessions (Gemini stub), classifies messages and sessions, and runs a battery of 23 analytical techniques (6 Rust-native + 17 Python, bundled via an embedded Python bridge). The CLI has three core commands: analyze (run techniques → Parquet storage), interpret (LLM-powered cross-technique narrative), and export (Jupyter notebook). See middens/README.md.
- Research artifacts: methods catalog, natural-language specs, replication studies, and documented findings in docs/. This is where the scientific claims live.
The corpus itself (corpus/, experiments/) is gitignored — the sessions contain private data and cannot be redistributed. The tooling and methodology are open; the raw data is not.
| Finding | Status | Scope |
|---|---|---|
| 100% risk-token suppression in paired thinking/text messages | Provisional | language=en ∧ thinking_visibility=Visible ∧ ¬contaminated_by_Boucle. N=828 sessions, 4,819 risk tokens, 209 paired messages. |
| HSMM pre-failure state (24.6× lift) | Robust (mixed corpus) | Pending re-run under 4-axis stratification. |
| MVT violated — agents under-explore | Robust | See experiments/full-corpus/information-foraging.md. |
| Session degradation (agents get worse over time) | Holds on interactive only | See experiments/interactive/survival_analysis.txt. |
| W10–W12 Boucle contamination in "interactive" bucket | Confirmed | 1,820/1,826 sessions carry autonomous-loop markers. |
Compound scoping rule: any headline finding on thinking or text behaviour must survive four axes — session_type, thinking_visibility, language, and a temporal window. A finding that doesn't survive all four is not a finding. More context in CLAUDE.md and docs/HANDOFF.md.
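The scoping rule above can be read as a simple gate. The sketch below is one possible reading, not the project's implementation: only the four axis names come from this README, while `failed_axes`, `is_finding`, the dictionary shapes, and the `W01-W09` window label are hypothetical names chosen for illustration. It assumes a finding declares the strata it claims to cover on each axis and must replicate in every one of them.

```python
# Axis names are taken from the compound scoping rule; everything else is hypothetical.
AXES = ("session_type", "thinking_visibility", "language", "temporal_window")


def failed_axes(claimed_scope, results):
    """Return the axes on which a finding does not survive.

    claimed_scope: dict mapping axis -> set of strata the finding claims to cover.
    results: dict mapping (axis, stratum) -> True if the effect replicated there.
    An axis fails if no scope is declared for it, or if any claimed stratum
    is untested or did not replicate.
    """
    failed = []
    for axis in AXES:
        strata = claimed_scope.get(axis, set())
        if not strata or not all(results.get((axis, s), False) for s in strata):
            failed.append(axis)
    return failed


def is_finding(claimed_scope, results):
    """A result counts as a finding only if it survives all four axes."""
    return not failed_axes(claimed_scope, results)


# Hypothetical example: an effect claimed for interactive, English sessions only.
scope = {
    "session_type": {"interactive"},
    "thinking_visibility": {"Visible"},
    "language": {"en"},
    "temporal_window": {"W01-W09"},  # illustrative label, not a real corpus window
}
results = {
    ("session_type", "interactive"): True,
    ("thinking_visibility", "Visible"): True,
    ("language", "en"): True,
    ("temporal_window", "W01-W09"): False,  # did not replicate in this window
}
print(failed_axes(scope, results))  # ['temporal_window']
print(is_finding(scope, results))   # False
```

Treating an undeclared or untested axis as a failure is a deliberate choice in this sketch: it makes "we never checked" indistinguishable from "it did not hold", which matches the spirit of the rule.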
    middens/                 Rust CLI: parser, classifiers, techniques, Python bridge
    docs/
      HANDOFF.md             Session-continuity document, read this first
      methods-catalog.md     20 method families, 80+ references
      examples/              Worked examples for the CLI triad workflow
      nlspecs/               Natural-language specs (Why / What / How / Done)
      reports/               Research reports
      reviews/               Multi-model peer reviews
      brainstorms/           Requirements docs
      plans/                 Implementation plans
      solutions/             Institutional knowledge: documented learnings
    scripts/                 Python analytical battery (26 scripts, mostly superseded by middens)
    todos/                   Individual todo files with YAML frontmatter
Gitignored: corpus/, corpus-full/, corpus-split/, corpus-frozen/, experiments/, data/labeled-messages.json.
Install the CLI with Homebrew on macOS or Linux:
    brew install lightless-labs/tap/middens
    middens --help

On Linux without Homebrew, grab the release tarball directly:
    # x86_64 Linux
    curl -LO https://github.com/Lightless-Labs/third-thoughts/releases/download/v0.0.1-beta.4/middens-0.0.1-beta.4-x86_64-unknown-linux-gnu.tar.gz
    tar xzf middens-0.0.1-beta.4-x86_64-unknown-linux-gnu.tar.gz
    ./middens-0.0.1-beta.4-x86_64-unknown-linux-gnu/middens --help

middens currently ships binaries for Apple Silicon macOS, x86_64 Linux, and arm64 Linux. Homebrew is the easiest path if you already use it; release tarballs and source builds are documented in middens/README.md.
If you want to run the CLI on your own session logs, head to middens/. If you want to read about the methodology and findings, start with docs/methods-catalog.md and the reports under docs/reports/.
middens archive copies local agent JSONL logs into a private, content-addressed archive before vendors, retention policies, or your own late-night cleanup impulses make them vanish. It stores raw transcripts, so treat the archive root as private data:
    middens archive --to ~/agent-session-archive --dry-run
    middens archive --to ~/agent-session-archive --yes

Self-contained automation plugins live under integrations/ and do not require the middens binary on PATH:
- integrations/pi/middens-archive/: Pi extension with /middens-archive-now and /middens-archive-status.
- integrations/claude-code/middens-archive/: Claude Code hooks plus /middens-archive-now.
- integrations/codex/middens-archive/: Codex hooks plus archive skills.
All three require an explicit MIDDENS_ARCHIVE_ROOT; no default archive path is chosen for you, because surprise raw-transcript folders are rude.
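For readers unfamiliar with content-addressed storage, the idea behind the archive can be sketched in a few lines of Python. This is an illustration under stated assumptions, not how middens or its plugins are implemented: the SHA-256 digest, the `<root>/<first two hex chars>/<digest>.jsonl` layout, and the function names `resolve_archive_root` and `archive_file` are all hypothetical. Only the `MIDDENS_ARCHIVE_ROOT` variable name and the "no default path" behaviour come from this README.

```python
import hashlib
import os
import shutil
from pathlib import Path


def resolve_archive_root(env=None):
    """Return the archive root from MIDDENS_ARCHIVE_ROOT, refusing to guess a default."""
    env = os.environ if env is None else env
    root = env.get("MIDDENS_ARCHIVE_ROOT", "").strip()
    if not root:
        raise RuntimeError("MIDDENS_ARCHIVE_ROOT is not set; refusing to pick a default")
    return Path(root).expanduser()


def archive_file(src, root, dry_run=True):
    """Copy src to a path derived from its SHA-256 digest and return that path.

    Identical content always maps to the same destination, so re-running is
    idempotent and duplicates are stored once. With dry_run=True nothing is written.
    """
    src = Path(src)
    digest = hashlib.sha256(src.read_bytes()).hexdigest()
    # Hypothetical layout: <root>/<first two hex chars>/<digest>.jsonl
    dest = Path(root) / digest[:2] / f"{digest}.jsonl"
    if not dry_run and not dest.exists():
        dest.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(src, dest)  # copy, never move: the original log stays in place
    return dest
```

Calling `archive_file(path, resolve_archive_root())` first with the default `dry_run=True` and then with `dry_run=False` mirrors the two-step `--dry-run` then `--yes` flow shown above: the first call only reports where the file would land.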
AGPL-3.0-or-later. See LICENSE.