SlideNote

Coverage-aware course notes from lecture slides

Turn PPT/PDF into readable, traceable notes with images, OCR/vision, Lecture-Weave writing, and coverage checks.

Not just a slide summarizer, but a faithful study-document pipeline.

Quick Start

git clone https://github.com/Cat-blizzard/SlideNote.git
cd SlideNote
.\install.ps1
.\run_gui.ps1

The setup script creates .venv, installs SlideNote with GUI/LLM extras, and runs slidenote doctor. The GUI lets you paste API keys in the page for a single run, so you do not have to set terminal environment variables first.

Manual setup is still available:

python -m venv .venv
.\.venv\Scripts\Activate.ps1
python -m pip install -e ".[dev,llm]"
python -m slidenote doctor

For a local preview without API calls:

python -m slidenote build path\to\lecture.pdf --out outputs\local --preset local --export markdown-zip

After the first install, run the local preview command above before anything else. Confirm that notes.md and the shareable notes.zip are generated before switching to the lecture-quality workflow.
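
The check described above can be scripted. The following is a minimal sketch, assuming the output directory from the preview command holds `notes.md` and `notes.zip` at its top level. The helper name `check_preview_output` is hypothetical and not part of SlideNote.

```python
from pathlib import Path
import zipfile


def check_preview_output(out_dir):
    """Return a list of problems found in a SlideNote output directory.

    Hypothetical helper, not part of SlideNote. It only checks that
    notes.md and notes.zip exist and that the archive contains notes.md.
    """
    out = Path(out_dir)
    problems = []

    notes_md = out / "notes.md"
    if not notes_md.is_file():
        problems.append("notes.md is missing")
    elif notes_md.stat().st_size == 0:
        problems.append("notes.md is empty")

    notes_zip = out / "notes.zip"
    if not notes_zip.is_file():
        problems.append("notes.zip is missing")
    elif not zipfile.is_zipfile(notes_zip):
        problems.append("notes.zip is not a valid zip archive")
    else:
        with zipfile.ZipFile(notes_zip) as archive:
            names = archive.namelist()
        # Assumption: notes.md may sit at the archive root or inside a folder,
        # so only the final path component is compared.
        if not any(name.rsplit("/", 1)[-1] == "notes.md" for name in names):
            problems.append("notes.zip does not contain notes.md")

    return problems
```

Calling `check_preview_output(Path("outputs") / "local")` returns an empty list when both files look usable, and a list of short messages otherwise.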

For higher-quality notes with visual understanding:

$env:DASHSCOPE_API_KEY="..."
$env:DEEPSEEK_API_KEY="..."
python -m slidenote build path\to\lecture.pdf --out outputs\lecture --provider deepseek --export markdown-zip

Open outputs\lecture\notes.md after generation. Images are copied into outputs\lecture\notes.assets\ by default.
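
Before starting a long build, it can help to confirm that the keys are actually visible to the Python process. This is a small sketch, not SlideNote code. The variable names come from the commands above, and `missing_api_keys` is a hypothetical helper. Which pipeline stage uses which key is not stated here, so the sketch only checks that both are set.

```python
import os

# Variable names taken from the commands above.
REQUIRED_KEYS = ("DASHSCOPE_API_KEY", "DEEPSEEK_API_KEY")


def missing_api_keys(environ=None):
    """Return the names of required API-key variables that are unset or blank."""
    env = os.environ if environ is None else environ
    return [name for name in REQUIRED_KEYS if not env.get(name, "").strip()]
```

An empty result from `missing_api_keys()` means both variables are present in the current environment. It does not validate the keys themselves.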

Optional GUI

SlideNote Studio is a Streamlit interface around the same CLI pipeline. It supports uploading PPT/PDF files, entering API keys in the page, selecting presets, watching progress and ETA, reviewing token/cost reports, checking page-level sources, and downloading generated results.

.\run_gui.ps1

See gui/README_GUI.md and gui/README_GUI.zh-CN.md for GUI details.

SlideNote Pipeline

SlideNote is organized as a five-stage product pipeline. Low-level modules can stay fine-grained for caching, debugging, and partial refresh; the user-facing workflow should remain simple.

Ingest -> Understand -> Write -> Guard -> Export

| Stage | Purpose | Main artifacts |
| --- | --- | --- |
| 1. Ingest | Parse PPT/PDF into stable, traceable structure. | `content.json`, `element_ir.json`, `source_map.json`, screenshots, assets, parser adapters |
| 2. Understand | Decide what the courseware is teaching. | `deck_understanding.json`, `page_understanding.json`, `sections.json`, `deck_brief.json`, figure/table understanding |
| 3. Write | Turn structured material into readable study notes. | `notes.md`, Lecture-Weave page notes, teaching enrichment |
| 4. Guard | Check faithfulness, coverage, and study quality. | `coverage.json`, `coverage.md`, `content_guard.json`, `quality_report.json` |
| 5. Export | Publish notes and reports. | `notes.zip`, `notes.toc.md`, `notes.docx`, `notes.pdf`, `notes.tex`; review/exam packs are generated separately by `study-pack` |

More detail: SlideNote Pipeline (docs/pipeline.zh-CN.md).
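
To see how far a run got, the named artifacts can be grouped by stage and looked up in an output directory. The sketch below is illustrative only. It lists just the file names given in the stage table, and it assumes they may live anywhere under the output directory, so it searches recursively. `STAGE_ARTIFACTS` and `stage_report` are hypothetical names.

```python
from pathlib import Path

# Stage -> file names taken from the stage table.
# Where each file lands inside the output directory is an assumption.
STAGE_ARTIFACTS = {
    "Ingest": ["content.json", "element_ir.json", "source_map.json"],
    "Understand": [
        "deck_understanding.json",
        "page_understanding.json",
        "sections.json",
        "deck_brief.json",
    ],
    "Write": ["notes.md"],
    "Guard": [
        "coverage.json",
        "coverage.md",
        "content_guard.json",
        "quality_report.json",
    ],
    "Export": ["notes.zip", "notes.toc.md", "notes.docx", "notes.pdf", "notes.tex"],
}


def stage_report(out_dir):
    """Map each stage to (present, missing) artifact names under out_dir."""
    root = Path(out_dir)
    found = set()
    if root.is_dir():
        found = {p.name for p in root.rglob("*") if p.is_file()}
    report = {}
    for stage, names in STAGE_ARTIFACTS.items():
        present = [n for n in names if n in found]
        missing = [n for n in names if n not in found]
        report[stage] = (present, missing)
    return report
```

A missing file is not necessarily an error: which artifacts are written depends on the preset and export options chosen for the run.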

User Presets

Use top-level --preset for product workflows. Everyday users now only need two modes: the default lecture mode and the no-API local mode.

| Preset | Best for | Behavior |
| --- | --- | --- |
| `lecture` | Teacher-style detailed lecture notes. | Enables LLM, OCR auto, Vision auto, Lecture-Weave, deck brief, content guard, and teaching enrichment. |
| `local` | No API key, offline preview, parser checks. | Uses local rules only and does not call text, vision, or OCR APIs. |

python -m slidenote build lecture.pdf --out outputs\lecture --provider deepseek
python -m slidenote build lecture.pdf --out outputs\local --preset local

More detail: User Presets (docs/presets.zh-CN.md).
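
One way to picture the two presets is as small bundles of defaults. The sketch below is a mental model, not SlideNote's internal representation. The `Preset` class, its field names, and the `needs_api` method are all hypothetical, and the values mirror the Behavior column of the preset table.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Preset:
    """Illustrative model of a preset; field names are hypothetical."""

    llm: bool
    ocr: str      # "auto" or "off"
    vision: str   # "auto" or "off"
    features: tuple = ()

    def needs_api(self):
        """True if any enabled stage would call an external API."""
        return self.llm or self.ocr != "off" or self.vision != "off"


PRESETS = {
    "lecture": Preset(
        llm=True,
        ocr="auto",
        vision="auto",
        features=("Lecture-Weave", "deck brief", "content guard", "teaching enrichment"),
    ),
    # The table lists no extra features for local; treating that as "none"
    # is an assumption of this sketch.
    "local": Preset(llm=False, ocr="off", vision="off"),
}

# A per-run override, analogous to passing --vision off on the command line.
text_only = replace(PRESETS["lecture"], vision="off")
```

In this model `PRESETS["local"].needs_api()` is false, which matches the "No API key" description, while `text_only` still needs an API because LLM rewriting and OCR remain enabled.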

Origin

SlideNote started from a very personal learning problem.

I have never been the kind of student who learns best by simply listening to lectures. Sometimes I cannot fully follow a teacher's explanation in real time, and I usually learn more efficiently by reading. Reading lets me slow down, go back, skip ahead, and control the pace of understanding by myself.

But lecture slides are not the same as readable notes. After class, reading the PPT directly often feels incomplete: the bullets are fragmented, the logic is implicit, and many important details live in diagrams, screenshots, formulas, or the teacher's spoken explanation. Manually rewriting everything into notes is possible, but it is time-consuming, hard to keep complete, and not always pleasant to revisit later.

So I wanted to build a tool that could turn course slides into structured, readable, traceable notes: not just a summary, but a faithful learning document that preserves images, keeps page references, checks coverage, and helps convert lecture materials into something I can actually study from.

That idea became SlideNote.

Setup

SlideNote does not require a local GPU. The local parser can run with only Python dependencies; LLM rewriting, OCR, and visual understanding require API keys for the providers you choose.

Minimum setup:

Python 3.10 or newer.
A virtual environment is recommended.
New users can run .\install.ps1 and then .\run_gui.ps1.
Run python -m pip install -e ".[dev]" for local parsing only.
Run python -m pip install -e ".[dev,llm]" to add LLM providers.

Optional software:

| Software | Purpose |
| --- | --- |
| LibreOffice | Converts `.ppt` / `.pptx` to PDF and enables full-slide screenshots when PowerPoint is unavailable. |
| Microsoft PowerPoint + `pywin32` | Windows-only PPTX screenshot export route. |
| Pandoc | Word and LaTeX export. |
| LibreOffice + Pandoc | PDF export from `notes.docx`, usually more stable for CJK layout. |
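
The requirements above can be probed with the standard library alone. This sketch is independent of `slidenote doctor`, whose output is not described here. It assumes the usual executable names (`soffice` or `libreoffice` for LibreOffice, `pandoc` for Pandoc), so tools installed outside `PATH` will show as missing. `check_environment` is a hypothetical helper.

```python
import shutil
import sys


def check_environment():
    """Report the Python version check and optional tools found on PATH.

    Each tool entry is the resolved executable path, or None if not found.
    """
    return {
        "python_ok": sys.version_info >= (3, 10),
        "libreoffice": shutil.which("soffice") or shutil.which("libreoffice"),
        "pandoc": shutil.which("pandoc"),
    }
```

A `None` entry only means the optional export route that depends on that tool is likely unavailable; local parsing does not need either tool.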

Configuration details live in CONFIG.zh-CN.md. The build entrypoint is intentionally small; provider, OCR, Vision, and cache details are handled mostly through strong defaults and environment variables.

Common Workflows

Local rule-based draft:

python -m slidenote build path\to\lecture.pptx --out outputs\local --preset local --export markdown-zip

Teacher-style lecture notes:

python -m slidenote build path\to\lecture.pdf `
  --out outputs\lecture-notes `
  --provider deepseek `
  --export markdown-zip

Review and exam pack:

python -m slidenote build path\to\lecture.pdf `
  --out outputs\lecture-review `
  --provider deepseek
python -m slidenote study-pack outputs\lecture-review --question-count 20

Text-only lecture notes:

python -m slidenote build path\to\lecture.pdf `
  --out outputs\text-only `
  --provider deepseek `
  --vision off
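
The two-step review workflow above can also be driven from a script. The following sketch uses only the subcommands and flags shown in this section (`build`, `study-pack`, `--out`, `--provider`, `--question-count`). The function names are hypothetical, and the sketch assumes SlideNote is installed in the interpreter that runs it.

```python
import subprocess
import sys


def build_command(source, out_dir, provider="deepseek"):
    """Command line for the build step, using only flags shown above."""
    return [
        sys.executable, "-m", "slidenote", "build", str(source),
        "--out", str(out_dir),
        "--provider", provider,
    ]


def study_pack_command(out_dir, question_count=20):
    """Command line for the study-pack step on an existing build output."""
    return [
        sys.executable, "-m", "slidenote", "study-pack", str(out_dir),
        "--question-count", str(question_count),
    ]


def run_review_workflow(source, out_dir, question_count=20):
    """Run build, then study-pack; check=True stops at the first failure."""
    subprocess.run(build_command(source, out_dir), check=True)
    subprocess.run(study_pack_command(out_dir, question_count), check=True)
```

Passing argument lists instead of a single shell string avoids quoting problems with Windows paths that contain spaces.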

Technical Docs

This README is intentionally kept as a landing page. Detailed behavior lives in the docs:

| Topic | Link |
| --- | --- |
| Documentation index | docs/index.zh-CN.md |
| Pipeline stages | docs/pipeline.zh-CN.md |
| Presets | docs/presets.zh-CN.md |
| Coverage, content guard, quality report, review/exam packs | docs/quality-and-guard.zh-CN.md |
| Element IR, source map, assets | docs/ir-and-source-map.zh-CN.md |
| LLM providers, OCR, vision, cache, cost | docs/providers-and-cost.zh-CN.md |
| Roadmap design notes | docs/roadmap/extension-notes.zh-CN.md |

The main output is notes.md. To share Markdown notes with images, export notes.zip; it contains notes.md and the notes.assets/ image folder. Depending on options, SlideNote can also write content.json, deck_understanding.json, page_understanding.json, element_ir.json, source_map.json, coverage.md, quality_report.json, review.md, exam.md, exam.json, exam.html, notes.docx, notes.pdf, and other reports.

Future Outlook

SlideNote is built with a hopeful assumption: future AI systems will become stronger, faster, cheaper, and easier to orchestrate through mature open-source agent frameworks. If that happens, this project should not merely run the same prompts for less money. Its ceiling should rise.

Models and providers such as DeepSeek are one example of the direction that makes this exciting: better price/performance, broader access, and a more open ecosystem can make high-quality multi-pass workflows practical for ordinary study materials. When API latency drops and agent frameworks become more reliable, SlideNote can afford to run richer stages by default: deeper deck understanding, page-level visual reasoning, teacher-style section writing, teaching enrichment, coverage repair, exam generation, wrong-answer review, and source verification.

The reason this matters is that SlideNote's bottleneck is not only "can the model summarize a slide?" The harder problem is coordinating parsing, vision, writing, grounding, quality checks, and revision without losing traceability. That is why the project invests in element_ir.json, source_map.json, coverage reports, artifact registries, presets, cache keys, and review/exam packs. Those structures let SlideNote absorb future model gains without being tied to one model, provider, or agent runtime.

The long-term vision is:

SlideNote should grow from a courseware converter into a course learning operating system.

In that version, slides, readings, personal notes, figures, formulas, quizzes, mistakes, and revisions all live in one traceable learning workflow.

Design Principle

SlideNote deliberately avoids this shortcut:

PPT -> LLM -> Summary

Instead, it follows:

PPT/PDF -> structured extraction -> source inventory -> note generation -> coverage check -> export

The local rule-based draft is only a baseline for debugging extraction and coverage. Production notes should use the default lecture preset, while coverage checks still rely on element IDs so the model cannot silently summarize away details.
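
The idea behind an ID-based coverage check can be shown in a few lines. This is a conceptual sketch, not SlideNote's implementation. The schema of `element_ir.json` and the way notes reference elements are not specified here, so the function takes plain collections of ID strings, and the IDs in the usage note are made up.

```python
def coverage_report(source_ids, referenced_ids):
    """Compare source element IDs with the IDs that the notes reference.

    Both arguments are iterables of strings supplied by the caller.
    """
    source = set(source_ids)
    referenced = set(referenced_ids)
    covered = source & referenced
    return {
        # Share of source elements that the notes account for.
        "coverage": len(covered) / len(source) if source else 1.0,
        # Source elements the notes never mention: candidates for repair.
        "missing": sorted(source - referenced),
        # IDs cited by the notes that do not exist in the source inventory.
        "unknown": sorted(referenced - source),
    }
```

For example, with source IDs `["p1-e1", "p1-e2", "p2-e1"]` and referenced IDs `["p1-e1", "p2-e1", "p9-e9"]`, the report lists `p1-e2` as missing and `p9-e9` as unknown. Because the comparison is over IDs rather than text similarity, a fluent summary that drops an element still shows up as a gap.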

License

SlideNote uses a dual-license structure:

Source code is licensed under the GNU Affero General Public License v3.0 or later (AGPL-3.0-or-later). See LICENSE.
Documentation and example educational materials are licensed under Creative Commons Attribution 4.0 International (CC BY 4.0). See LICENSES/CC-BY-4.0.txt.

The SlideNote name, logo, and other brand assets are not licensed for standalone reuse. See NOTICE for the exact scope.

Acknowledgements

SlideNote's optional review/exam study-pack workflow was conceptually inspired by WUBING2023/ExamPass-Assistant and the extended MIKUZ12/ExamPass-Assistant fork. SlideNote does not reuse their code, templates, prompts, or assets.
GUI development contributions from hongzuoj-pixel.
Testing contributions from MOm0-000.
SlideNote's parser-adapter and document-IR roadmap is informed by prior art such as Microsoft MarkItDown, Docling, Marker, MinerU, and Unstructured.
SlideNote's future retrieval, source tracing, and post-generation QA direction is informed by systems such as RAGFlow. These projects are references and inspirations, not bundled dependencies unless explicitly listed elsewhere.
