A true AI agent that synthesizes qualitative research transcripts into structured insights reports.
github.com/rackumar21/insight-engine
Most "AI tools" make one API call: you give it input, it gives you output.
Insight Engine is different. It uses Claude's tool_use API to work step by step — planning its own analysis, reading files autonomously, checkpointing as it goes, and synthesizing across everything at the end. Claude decides the order of operations. If interrupted, it resumes from where it left off.
The agentic loop:
List files → Read each transcript → Extract + save themes → Load all themes → Synthesize → Write report
Each arrow is a tool call. Claude drives the process, not a hardcoded script.
Given a folder of interview or research transcripts, it produces:
- Executive summary — 3-5 bullets for someone who has 30 seconds
- Key themes — grouped by topic, with supporting quotes and participant counts
- Tensions and contradictions — where participants disagreed (often the most interesting signals)
- Underexplored areas — questions worth asking in follow-up research
- Methodology notes — how many transcripts, any gaps
| Language | Python 3.11+ |
| AI | Anthropic Claude API (claude-opus-4-6) |
| Tool use | Claude's native tool_use API |
git clone https://github.com/rackumar21/insight-engine
cd insight-engine
pip install -r requirements.txt
export ANTHROPIC_API_KEY=your_key_here# Run on sample transcripts (included)
python agent.py --folder transcripts/sample
# Run on your own transcripts
python agent.py --folder path/to/your/transcripts
# Specify output location
python agent.py --folder transcripts/ --output my_report.mdReports are saved to ./reports/insights_YYYY-MM-DD.md by default.
Per-transcript theme extractions are cached in ./analysis/. If the agent is interrupted mid-run, re-running it skips already-analyzed transcripts and picks up from where it left off.
- Agents need checkpoints. A script that fails halfway through loses all its work. An agent that saves intermediate state can resume. The
analysis/cache was the single most important design decision. - Tool design is product design. Writing the tool schemas felt exactly like writing a PRD: define the interface, anticipate edge cases, make the inputs unambiguous. Claude's ability to use a tool correctly depends almost entirely on how well the schema describes what it does.
- The reasoning is the output. Watching the agent work — seeing it decide to re-read a transcript before synthesizing, or group themes before writing — is more interesting than the report. The process reveals something about how good analysis actually works.