Parse Claude Code conversation JSONL logs into Markdown or LaTeX
for documentation and scientific papers.
Quick Start · Install · Usage · Output · Complexity Analysis
Companion tool for the agent template: after a researcher loop completes, point it at the `.claude/` directory to produce publication-ready conversation transcripts.
## Quick Start

The recommended way to run the parser is via the pre-built Docker image on a NERSC compute node:

```shell
claude-hpc -A m3246 -t 1:00:00 -w . --agent-image docker.io/nollde24/claude-parser
```

This launches an interactive Claude Code session with the parser pre-installed. Inside the session:
```shell
conversation-parser .demo -o out
```

## Install

```shell
pip install -e .                 # base install
pip install -e '.[complexity]'   # + LLM-based complexity analysis
```

## Usage

```shell
# Markdown output (default)
conversation-parser .demo -o out

# LaTeX fragments for papers
conversation-parser .demo -f latex -o out

# Include tool calls (Read, Write, Bash, ...) in the transcript
conversation-parser .demo -o out --include-tool-calls

# Semantic complexity analysis (requires ANTHROPIC_API_KEY or active Claude login)
conversation-parser .demo -o out --analyze-complexity
```

| Flag | Description |
|---|---|
| `-f, --format` | Output format: `markdown` (default) or `latex` |
| `-o, --output-dir` | Output directory (default: `conversation_<format>`) |
| `--include-tool-calls` | Include tool calls and results in the transcript |
| `--analyze-complexity` | Classify human messages by complexity and type via the Claude API |
| `--complexity-model` | Model for complexity analysis (default: `claude-haiku-4-5-20251001`) |
| `--complexity-context` | Messages above/below the target to include as context for classification; `-1` uses the full session transcript (default: `1`) |
## Output

Markdown output:

| File | Contents |
|---|---|
| `conversation_metrics.md` | Metrics summary table (sessions, tokens, tools, complexity) |
| `conversation_transcript.md` | Full conversation transcript |
| `conversation_highlights.md` | Session overview + human intervention list |
LaTeX output:

| File | Purpose | Where to `\input{}` |
|---|---|---|
| `conversation_preamble.tex` | Colour definitions, `\convmsg` / `\convsession` commands | Preamble |
| `conversation_metrics.tex` | `\newcommand` macros (`\convTotalSessions`, `\convHumanMessages`, ...) | Preamble |
| `conversation_transcript.tex` | Full conversation as a `longtable` | Body |
| `conversation_highlights.tex` | Session overview + human intervention list | Body |
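A minimal wrapper document might be assembled as follows. This is a sketch, not shipped with the tool; the `\usepackage` lines are assumptions inferred from the fragments' descriptions (colour commands, `longtable`), so adapt them to your paper's class and existing preamble:

```latex
\documentclass{article}
\usepackage{xcolor}     % assumed: the preamble fragment defines colours
\usepackage{longtable}  % assumed: the transcript is rendered as a longtable

% Preamble fragments
\input{conversation_preamble.tex}
\input{conversation_metrics.tex}

\begin{document}

% Body fragments
\input{conversation_transcript.tex}
\input{conversation_highlights.tex}

\end{document}
```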
After `\input{conversation_metrics.tex}` in your preamble, use the macros inline:
```latex
The analysis was completed in \convTotalSessions{} sessions with
\convHumanMessages{} human interventions. Of these,
\convComplexityEasy{} were simple corrections,
\convComplexityMedium{} required domain knowledge, and
\convComplexityHard{} involved deep methodological guidance.
```

## Complexity Analysis

With `--analyze-complexity`, each human message is classified along two axes by an LLM.
Complexity — how much thought the message required:
| Level | Description |
|---|---|
| easy | Simple error reports, approvals, trivial corrections |
| medium | Guidance requiring domain knowledge (dataset names, parameter values) |
| hard | Deep methodological or physics insights |
Type — the role of the intervention in a paper-reproduction task:
| Type | Description |
|---|---|
| essential | Necessary for the reproduction to be faithful to the paper (corrections, bugfix nudges, unblocking clarifications) |
| optional | Extra scope the user wants but not required by the paper (additional plots, style tweaks, exploratory asks) |
| meta | Process/workflow steering that does not touch the scientific content (approvals, "commit now", "move on") |
The classifier sees a configurable context window around each message (`--complexity-context`, default `1`; `-1` for the full session) so it can judge whether a request falls inside the paper's scope.
The assembled excerpt is capped at a model-derived character budget (half the model's token window, converted to characters). If the selected context window exceeds this budget, the farthest surrounding messages are dropped symmetrically from above and below until the excerpt fits. The target message itself is always kept in full.
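The trimming step can be sketched in Python. This is an illustration of the behaviour described above, not the tool's actual code; the function and variable names are hypothetical:

```python
def trim_context(messages, target_idx, char_budget):
    """Drop the farthest context messages, alternating between the
    top and bottom of the window, until the excerpt fits the budget.
    The target message itself is never dropped. (Illustrative sketch.)"""
    lo, hi = 0, len(messages) - 1
    drop_from_top = True
    while (sum(len(m) for m in messages[lo:hi + 1]) > char_budget
           and (lo < target_idx or hi > target_idx)):
        if drop_from_top and lo < target_idx:
            lo += 1   # drop the farthest message above the target
        elif hi > target_idx:
            hi -= 1   # drop the farthest message below the target
        else:
            lo += 1   # nothing left below the target; trim above
        drop_from_top = not drop_from_top
    return messages[lo:hi + 1]
```

With the budget derived from the model's token window, this keeps the excerpt roughly centred on the message being classified, even when `--complexity-context -1` selects a whole session.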