SlideNote

Coverage-aware course notes from lecture slides

Turn PPT/PDF into readable, traceable notes with images, OCR/vision, Lecture-Weave writing, and coverage checks.

Not just a slide summarizer, but a faithful study-document pipeline.

English | 中文 | Docs | Config | Roadmap


Quick Start

git clone https://github.com/Cat-blizzard/SlideNote.git
cd SlideNote
.\install.ps1
.\run_gui.ps1

The setup script creates .venv, installs SlideNote with the GUI and LLM extras, and runs slidenote doctor. The GUI lets you paste API keys into the page for a single run, so you do not need to set environment variables in the terminal first.

Manual setup is still available:

python -m venv .venv
.\.venv\Scripts\Activate.ps1
python -m pip install -e ".[dev,llm]"
python -m slidenote doctor

For a local preview without API calls:

python -m slidenote build path\to\lecture.pdf --out outputs\local --preset local --export markdown-zip

After the first install, run this local preview command before anything else. Confirm that notes.md and the shareable notes.zip are generated, then switch to the lecture workflow.

For higher-quality notes with visual understanding:

$env:DASHSCOPE_API_KEY="..."
$env:DEEPSEEK_API_KEY="..."
python -m slidenote build path\to\lecture.pdf --out outputs\lecture --provider deepseek --export markdown-zip

Open outputs\lecture\notes.md after generation. Images are copied into outputs\lecture\notes.assets\ by default.

Optional GUI

SlideNote Studio is a Streamlit interface around the same CLI pipeline. It supports uploading PPT/PDF files, entering API keys in the page, selecting presets, watching progress and ETA, reviewing token/cost reports, checking page-level sources, and downloading generated results.

.\run_gui.ps1

See gui/README_GUI.md and gui/README_GUI.zh-CN.md for GUI details.

SlideNote Pipeline

SlideNote is organized as a five-stage product pipeline. Low-level modules can stay fine-grained for caching, debugging, and partial refresh; the user-facing workflow should remain simple.

Ingest -> Understand -> Write -> Guard -> Export

| Stage | Purpose | Main artifacts |
| --- | --- | --- |
| 1. Ingest | Parse PPT/PDF into stable, traceable structure. | content.json, element_ir.json, source_map.json, screenshots, assets, parser adapters |
| 2. Understand | Decide what the courseware is teaching. | deck_understanding.json, page_understanding.json, sections.json, deck_brief.json, figure/table understanding |
| 3. Write | Turn structured material into readable study notes. | notes.md, Lecture-Weave page notes, teaching enrichment |
| 4. Guard | Check faithfulness, coverage, and study quality. | coverage.json, coverage.md, content_guard.json, quality_report.json |
| 5. Export | Publish notes and reports. | notes.zip, notes.toc.md, notes.docx, notes.pdf, notes.tex; review/exam packs are generated separately by study-pack |

More detail: SlideNote Pipeline.
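
To see how far a run got, the artifact names in the table above can be checked against an output folder. The following is a minimal sketch, not part of SlideNote: stage_status is a hypothetical helper, it assumes the listed files are written flat at the top level of the --out folder, and it only covers the named files (not screenshots or assets). Which files actually appear depends on the preset and options used.

```python
from pathlib import Path

# Stage -> file names taken from the pipeline table.
# Assumption: artifacts sit directly inside the output folder.
STAGE_ARTIFACTS = {
    "Ingest": ["content.json", "element_ir.json", "source_map.json"],
    "Understand": ["deck_understanding.json", "page_understanding.json",
                   "sections.json", "deck_brief.json"],
    "Write": ["notes.md"],
    "Guard": ["coverage.json", "coverage.md", "content_guard.json",
              "quality_report.json"],
    "Export": ["notes.zip"],
}


def stage_status(out_dir):
    """Return {stage: [missing file names]} for one output folder."""
    root = Path(out_dir)
    return {
        stage: [name for name in names if not (root / name).exists()]
        for stage, names in STAGE_ARTIFACTS.items()
    }
```

Calling stage_status on an output folder such as outputs\local returns an empty list for each stage whose named files are all present, which makes it easy to spot the first stage that did not finish.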

User Presets

Use top-level --preset for product workflows. Everyday users now only need two modes: the default lecture mode and the no-API local mode.

| Preset | Best for | Behavior |
| --- | --- | --- |
| lecture | Teacher-style detailed lecture notes. | Enables LLM, OCR auto, Vision auto, Lecture-Weave, deck brief, content guard, and teaching enrichment. |
| local | No API key, offline preview, parser checks. | Uses local rules only and does not call text, vision, or OCR APIs. |

python -m slidenote build lecture.pdf --out outputs\lecture --provider deepseek
python -m slidenote build lecture.pdf --out outputs\local --preset local

More detail: User Presets.

Origin

SlideNote started from a very personal learning problem.

I have never been the kind of student who learns best by simply listening to lectures. Sometimes I cannot fully follow a teacher's explanation in real time, and I usually learn more efficiently by reading. Reading lets me slow down, go back, skip ahead, and control my own pace of understanding.

But lecture slides are not the same as readable notes. After class, reading the PPT directly often feels incomplete: the bullets are fragmented, the logic is implicit, and many important details live in diagrams, screenshots, formulas, or the teacher's spoken explanation. Manually rewriting everything into notes is possible, but it is time-consuming, hard to keep complete, and not always pleasant to revisit later.

So I wanted to build a tool that could turn course slides into structured, readable, traceable notes: not just a summary, but a faithful learning document that preserves images, keeps page references, checks coverage, and helps convert lecture materials into something I can actually study from.

That idea became SlideNote.

Setup

SlideNote does not require a local GPU. The local parser can run with only Python dependencies; LLM rewriting, OCR, and visual understanding require API keys for the providers you choose.

Minimum setup:

  • Python 3.10 or newer.
  • A virtual environment is recommended.
  • New users can run .\install.ps1 and then .\run_gui.ps1.
  • python -m pip install -e ".[dev]" for local parsing.
  • python -m pip install -e ".[dev,llm]" for LLM providers.

Optional software:

| Software | Purpose |
| --- | --- |
| LibreOffice | Converts .ppt / .pptx to PDF and enables full-slide screenshots when PowerPoint is unavailable. |
| Microsoft PowerPoint + pywin32 | Windows-only PPTX screenshot export route. |
| Pandoc | Word and LaTeX export. |
| LibreOffice + Pandoc | PDF export from notes.docx, usually more stable for CJK layout. |

Configuration details live in CONFIG.zh-CN.md. The build entrypoint is intentionally small; provider, OCR, Vision, and cache details are handled mostly through strong defaults and environment variables.
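
The minimum and optional requirements listed in this section can be checked by hand with a few lines of Python. This is a rough sketch under stated assumptions, not a description of what slidenote doctor does: python_ok and find_optional_tools are hypothetical helpers, and the executable names (soffice for LibreOffice, pandoc for Pandoc) are the usual ones but may differ on your system.

```python
import shutil
import sys

# Assumed executable names; adjust if your installation uses different ones.
OPTIONAL_TOOLS = {"LibreOffice": "soffice", "Pandoc": "pandoc"}


def python_ok(version=None, minimum=(3, 10)):
    """True when the interpreter meets the minimum version (3.10 or newer)."""
    version = sys.version_info if version is None else version
    return tuple(version[:2]) >= minimum


def find_optional_tools(which=shutil.which):
    """Map each optional tool to its resolved path, or None if not on PATH."""
    return {label: which(exe) for label, exe in OPTIONAL_TOOLS.items()}
```

A None value only means the tool was not found on PATH; the related export route may still work if the tool is installed elsewhere and SlideNote locates it another way.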

Common Workflows

Local rule-based draft:

python -m slidenote build path\to\lecture.pptx --out outputs\local --preset local --export markdown-zip

Teacher-style lecture notes:

python -m slidenote build path\to\lecture.pdf `
  --out outputs\lecture-notes `
  --provider deepseek `
  --export markdown-zip

Review and exam pack:

python -m slidenote build path\to\lecture.pdf `
  --out outputs\lecture-review `
  --provider deepseek
python -m slidenote study-pack outputs\lecture-review --question-count 20

Text-only lecture notes:

python -m slidenote build path\to\lecture.pdf `
  --out outputs\text-only `
  --provider deepseek `
  --vision off
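
When a whole course folder needs converting, the same commands can be driven from a script. The sketch below is one possible wrapper, not something shipped with SlideNote: build_command and build_folder are hypothetical names, it uses only the build flags shown in this README (--out, --preset, --provider, --export), and it assumes that a zero exit code means the build succeeded.

```python
import subprocess
import sys
from pathlib import Path


def build_command(source, out_dir, preset=None, provider=None, export=None):
    """Assemble a `python -m slidenote build` argument list."""
    cmd = [sys.executable, "-m", "slidenote", "build",
           str(source), "--out", str(out_dir)]
    if preset:
        cmd += ["--preset", preset]
    if provider:
        cmd += ["--provider", provider]
    if export:
        cmd += ["--export", export]
    return cmd


def build_folder(slides_dir, out_root, **options):
    """Run one build per PDF/PPTX file; return {file name: exit code}."""
    results = {}
    for source in sorted(Path(slides_dir).iterdir()):
        if source.suffix.lower() not in {".pdf", ".pptx"}:
            continue  # skip anything that is not a slide deck
        cmd = build_command(source, Path(out_root) / source.stem, **options)
        results[source.name] = subprocess.run(cmd).returncode
    return results
```

For example, build_folder("slides", "outputs", preset="local", export="markdown-zip") writes each deck to its own subfolder under outputs, named after the source file.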

Technical Docs

README is intentionally kept as a landing page. Detailed behavior lives in the docs:

| Topic | Link |
| --- | --- |
| Documentation index | docs/index.zh-CN.md |
| Pipeline stages | docs/pipeline.zh-CN.md |
| Presets | docs/presets.zh-CN.md |
| Coverage, content guard, quality report, review/exam packs | docs/quality-and-guard.zh-CN.md |
| Element IR, source map, assets | docs/ir-and-source-map.zh-CN.md |
| LLM providers, OCR, vision, cache, cost | docs/providers-and-cost.zh-CN.md |
| Roadmap design notes | docs/roadmap/extension-notes.zh-CN.md |

The main output is notes.md. To share Markdown notes with images, export notes.zip; it contains notes.md and the notes.assets/ image folder. Depending on options, SlideNote can also write content.json, deck_understanding.json, page_understanding.json, element_ir.json, source_map.json, coverage.md, quality_report.json, review.md, exam.md, exam.json, exam.html, notes.docx, notes.pdf, and other reports.
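
Before sharing notes.zip, it can be worth confirming that the archive really holds both the notes and their images. The following sketch uses only the Python standard library; inspect_notes_zip is a hypothetical helper, and it assumes notes.md and notes.assets/ sit at the root of the archive, which this README implies but does not spell out.

```python
import zipfile


def inspect_notes_zip(zip_path):
    """Report whether the archive has notes.md and how many asset files."""
    with zipfile.ZipFile(zip_path) as archive:
        names = archive.namelist()
    # Count files under notes.assets/, ignoring bare directory entries.
    assets = [n for n in names
              if n.startswith("notes.assets/") and not n.endswith("/")]
    return {"has_notes": "notes.md" in names, "asset_count": len(assets)}
```

An asset_count of 0 for a deck that clearly contains figures is a hint that images were not copied, or that the archive layout differs from the assumption above.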

Future Outlook

SlideNote is built with a hopeful assumption: future AI systems will become stronger, faster, cheaper, and easier to orchestrate through mature open-source agent frameworks. If that happens, this project should not merely run the same prompts for less money. Its ceiling should rise.

Models and providers such as DeepSeek are one example of the direction that makes this exciting: better price/performance, broader access, and a more open ecosystem can make high-quality multi-pass workflows practical for ordinary study materials. When API latency drops and agent frameworks become more reliable, SlideNote can afford to run richer stages by default: deeper deck understanding, page-level visual reasoning, teacher-style section writing, teaching enrichment, coverage repair, exam generation, wrong-answer review, and source verification.

The reason this matters is that SlideNote's bottleneck is not only "can the model summarize a slide?" The harder problem is coordinating parsing, vision, writing, grounding, quality checks, and revision without losing traceability. That is why the project invests in element_ir.json, source_map.json, coverage reports, artifact registries, presets, cache keys, and review/exam packs. Those structures let SlideNote absorb future model gains without being tied to one model, provider, or agent runtime.

The long-term vision is:

SlideNote should grow from a courseware converter into a course learning operating system.

In that version, slides, readings, personal notes, figures, formulas, quizzes, mistakes, and revisions all live in one traceable learning workflow.

Design Principle

SlideNote deliberately avoids this shortcut:

PPT -> LLM -> Summary

Instead, it follows:

PPT/PDF -> structured extraction -> source inventory -> note generation -> coverage check -> export

The local rule-based draft is only a baseline for debugging extraction and coverage. Production notes should use the default lecture preset. Even then, coverage checks rely on element IDs, so the model cannot silently summarize details away.
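
The idea behind an ID-based coverage check can be illustrated with plain set arithmetic. This is a conceptual sketch only: coverage_report is a hypothetical function, the ID strings are invented for the example, and the real schemas of element_ir.json and coverage.json are not assumed here.

```python
def coverage_report(source_ids, cited_ids):
    """Compare element IDs in the source inventory with IDs cited by the notes."""
    source = set(source_ids)
    cited = set(cited_ids)
    missing = sorted(source - cited)   # extracted but never used in the notes
    unknown = sorted(cited - source)   # cited but absent from the inventory
    ratio = 1.0 if not source else len(source & cited) / len(source)
    return {"coverage": ratio, "missing": missing, "unknown": unknown}


# Hypothetical IDs: four extracted elements, three of them cited in the notes.
report = coverage_report(
    ["p1-e1", "p1-e2", "p2-e1", "p2-e2"],
    ["p1-e1", "p2-e1", "p2-e2"],
)
print(report)
# {'coverage': 0.75, 'missing': ['p1-e2'], 'unknown': []}
```

Because the comparison works on IDs rather than on text similarity, a dropped element shows up as a concrete entry in missing instead of being hidden by a fluent summary.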

License

SlideNote uses a dual-license structure:

  • Source code is licensed under the GNU Affero General Public License v3.0 or later (AGPL-3.0-or-later). See LICENSE.
  • Documentation and example educational materials are licensed under Creative Commons Attribution 4.0 International (CC BY 4.0). See LICENSES/CC-BY-4.0.txt.

The SlideNote name, logo, and other brand assets are not licensed for standalone reuse. See NOTICE for the exact scope.

Acknowledgements

  • SlideNote's optional review/exam study-pack workflow was conceptually inspired by WUBING2023/ExamPass-Assistant and the extended MIKUZ12/ExamPass-Assistant fork. SlideNote does not reuse their code, templates, prompts, or assets.
  • GUI development contributions from hongzuoj-pixel.
  • Testing contributions from MOm0-000.
  • SlideNote's parser-adapter and document-IR roadmap is informed by prior art such as Microsoft MarkItDown, Docling, Marker, MinerU, and Unstructured.
  • SlideNote's future retrieval, source tracing, and post-generation QA direction is informed by systems such as RAGFlow. These projects are references and inspirations, not bundled dependencies unless explicitly listed elsewhere.
