Paper Copilot

A local AI agent that reads a research paper (PDF) and produces a structured, Notion-ready markdown summary. Built with LangChain and Ollama, so all processing happens on your machine and no data leaves it.

What it does

Drop a PDF, get a markdown summary containing the sections below (a sketch of the output follows the list):

  • Metadata — title, authors, venue, year, DOI
  • Overview — what the paper is about
  • Contribution — what the authors propose and why it matters
  • State of the Art — related work landscape
  • Methodology — high-level overview + detailed walkthrough with formulae
  • Evaluation — datasets, metrics, baselines
  • Key Results — main findings as bullet points
  • References Analysis — pie chart of venue distribution
  • Figures — relevant images extracted from the PDF with captions
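
For orientation, the generated file might be skeletoned like this (section headings and file names are illustrative; the actual prose comes from the model):

  # <Paper Title>

  ## Metadata
  - Authors, venue, year, DOI

  ## Overview
  ...

  ## References Analysis
  ![Venue distribution](images/<chart>.png)

  ## Figures
  ![Figure 1](images/<figure>.png)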

Quickstart

Prerequisites

  • Python 3.12+
  • Ollama running locally
  • A model with tool-calling support, pulled via Ollama

Setup

# Install uv (if not already installed)
curl -LsSf https://astral.sh/uv/install.sh | sh

# Clone and install
git clone git@github.com:MarcosRodrigoT/Paper-Copilot.git
cd Paper-Copilot
uv sync

# Pull an Ollama model
ollama pull gpt-oss

Run via CLI

uv run python -m src.agent input/paper.pdf
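
The generated summary (and any extracted images) lands in the output/ directory; see Project structure below.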

Run via web UI

uv run streamlit run app.py

Then open http://localhost:8501, drag-and-drop a PDF, and click Generate Summary.

Architecture

PDF ──► extract_text ──────────────────────────────► LLM (per-section) ──► save_markdown ──► summary.md
    ├─► extract_images ──► figure selection (LLM) ─┘                       + images/
    └─► parse_references ──► generate_chart ────────────────────────────────┘

The pipeline uses a hybrid approach:

  • Code handles deterministic steps: PDF parsing, image extraction, reference parsing, chart generation, markdown assembly
  • LLM handles creative steps: writing each section summary individually with focused prompts and only the relevant paper text

This is more reliable than a pure ReAct agent, because local models tend to stop calling tools after seeing large tool results.
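
A minimal sketch of this orchestration pattern, assuming hypothetical prompt names (SECTION_PROMPTS here is an illustration, not the real src/prompts.py API) and the langchain-ollama ChatOllama client:

# sketch.py: illustrative only, not the real src/agent.py
from langchain_ollama import ChatOllama

llm = ChatOllama(model="gpt-oss", base_url="http://localhost:11434")

# Hypothetical per-section prompts; the real ones live in src/prompts.py
SECTION_PROMPTS = {
    "Overview": "In two paragraphs, explain what this paper is about:\n\n{text}",
    "Contribution": "What do the authors propose, and why does it matter?\n\n{text}",
}

def summarize_sections(paper_text: str) -> dict[str, str]:
    # Plain code drives the loop (deterministic); the LLM only writes prose
    return {
        name: llm.invoke(template.format(text=paper_text)).content
        for name, template in SECTION_PROMPTS.items()
    }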

Configuration

Edit src/config.py to change the model or paths:

OLLAMA_MODEL = "gpt-oss"              # Any tool-calling-capable Ollama model
OLLAMA_BASE_URL = "http://localhost:11434"

Good model choices for 2x RTX 4090:

  • gpt-oss — built for structured outputs and agentic use
  • mistral-small — best documented for function calling
  • mistral-small3.2 — improved function calling, vision capable
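
For example, switching to mistral-small is just a pull plus a one-line config change:

# Pull the model
ollama pull mistral-small

# Then in src/config.py
OLLAMA_MODEL = "mistral-small"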

Project structure

├── app.py                  # Streamlit web UI
├── src/
│   ├── agent.py            # Orchestration pipeline
│   ├── config.py           # Model and path settings
│   ├── prompts.py          # Per-section LLM prompts
│   └── tools/
│       ├── extract_text.py     # PDF → text sections
│       ├── extract_images.py   # PDF → image files + captions
│       ├── parse_references.py # References → structured data
│       ├── generate_chart.py   # Venue data → pie chart PNG
│       └── save_markdown.py    # Everything → final .md
├── input/                  # Drop PDFs here
├── output/                 # Generated summaries
└── docs/                   # Step-by-step explanations

Documentation

The docs/ folder contains step-by-step explanations of every component, written as a learning resource:

Doc                        Topic
00-project-overview.md     Project summary and plan
01-setup.md                uv, Ollama, and project setup
02-pdf-extraction.md       How PDF text extraction works
02b-image-extraction.md    How PDF image extraction works
03-tools.md                All five tools and how they connect
04-agent.md                The agent loop and LLM orchestration
05-ui.md                   Streamlit UI integration
