Skip to content

eatoften/Video_Course_Cards

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

46 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

Video Course Cards

Turn long course videos into timestamp-grounded knowledge cards, local card memory, and Obsidian-friendly Markdown.

Download | Desktop build | Local LLM setup | Roadmap | RAG plan

Python 3.11 FastAPI React Tauri SQLite

Overview

Video Course Cards is a local-first AI learning workspace for lecture videos. It turns a video into a transcript, cuts the transcript into semantic chunks, drafts grounded knowledge cards with a local LLM, stores everything in SQLite, embeds cards for retrieval, and exports portable Markdown snapshots.

The project is not trying to be another generic "chat with your transcript" demo. The core object is a claim-grounded knowledge card: a structured learning unit whose claims point back to transcript evidence and timestamps. And the future plan is to turn these cards in to a graph which can serve as an external world model that a controller can plan over, while a decoder can take the plan and graph as an input to generate answers.

video -> transcript -> semantic chunks -> grounded cards -> card embeddings -> retrieval -> Markdown export

SQLite is the source of truth. Markdown is an export format.

Why This Exists

Long technical lectures contain more than raw transcript text. A useful learning system should preserve where an idea came from, what evidence supports it, how it connects to other cards, and how the user later edits or rejects it.

This repository explores that pipeline as a local desktop application:

  • Grounded generation: cards keep claims, evidence, and source timestamps.
  • Local-first storage: videos, transcripts, cards, embeddings, and notes stay on the user's machine.
  • Structured memory: cards are JSON/SQLite records before they become Markdown.
  • Retrieval baseline: card embeddings support ordinary dense retrieval before more advanced graph-guided methods.
  • Portable output: exports are Obsidian-friendly Markdown snapshots.

Current Demo

The current demo runs on Windows as a Tauri desktop app with a packaged FastAPI sidecar.

It can:

  • upload local videos;
  • validate media with ffprobe;
  • extract audio with FFmpeg;
  • transcribe with faster-whisper;
  • show transcript segments next to the course workspace;
  • create semantic transcript chunks with Sentence Transformer embeddings;
  • generate cards manually from selected transcript text or automatically from chunks;
  • save, edit, delete, tag, and review cards;
  • attach user notes to cards;
  • embed cards and run dense card retrieval;
  • export one job or all cards as Markdown folders;
  • check local runtime dependencies such as FFmpeg, Ollama/Qwen, and embedding models.

Still rough:

  • the installer is not code-signed;
  • Windows is the only packaged target currently exercised;
  • Ollama, Qwen, FFmpeg, and model caches are user-installed dependencies;
  • RAG currently retrieves cards, but answer generation with citations is still planned;
  • exported Markdown does not sync edits back into SQLite.

Install

Download the latest Windows installer from:

https://github.com/eatoften/Video_Course_Cards/releases/latest

The installer includes:

  • Tauri desktop shell;
  • React UI;
  • packaged FastAPI backend;
  • SQLite schema and app code.

The installer does not bundle large model assets. Install local AI dependencies separately:

ollama pull qwen3:4b

Desktop data is stored under:

C:\Users\<user>\AppData\Local\Video Course Cards\

See docs/local-llm.md for local model configuration.

Developer Setup

Clone the repository, then install backend and frontend dependencies.

git clone https://github.com/eatoften/Video_Course_Cards.git
cd Video_Course_Cards

Run the backend:

cd backend
$env:PYTHONUTF8='1'
$env:PYTHONDONTWRITEBYTECODE='1'
uv run python -B -m uvicorn app.main:app --host 127.0.0.1 --port 8001 --reload

Run the frontend:

cd frontend
npm.cmd install
npm.cmd run dev

Open:

http://127.0.0.1:5174

Desktop Build

Tauri requires Rust/Cargo and the Visual Studio C++ build tools on Windows.

Build the backend sidecar:

powershell -NoProfile -ExecutionPolicy Bypass -File .\scripts\build-desktop-backend.ps1

Run the desktop app in development:

cd frontend
npm.cmd run tauri:dev

Build the Windows installer:

powershell -NoProfile -ExecutionPolicy Bypass -File .\scripts\build-windows-installer.ps1

Output:

frontend\src-tauri\target\release\bundle\nsis\Video Course Cards_0.1.0_x64-setup.exe

GitHub Actions can build and attach the installer to a tag release:

git tag v0.1.0
git push origin v0.1.0

See docs/tauri-desktop.md.

Architecture

flowchart LR
    subgraph Desktop["Tauri desktop"]
        UI["React / TypeScript"]
        RUST["Rust shell"]
    end

    subgraph Backend["FastAPI sidecar"]
        API["HTTP routes"]
        JOBS["Job service"]
        PIPELINE["Video pipeline"]
        CARDS["Card services"]
        RAG["Retrieval services"]
    end

    subgraph LocalAI["Local AI runtime"]
        FFMPEG["FFmpeg / ffprobe"]
        WHISPER["faster-whisper"]
        EMB["Sentence Transformers"]
        LLM["Ollama / Qwen"]
    end

    subgraph Storage["Local data"]
        DB["SQLite"]
        FILES["uploads / transcripts / exports"]
    end

    UI <--> API
    RUST --> API
    API --> JOBS
    JOBS --> PIPELINE
    PIPELINE --> FFMPEG
    PIPELINE --> WHISPER
    CARDS --> LLM
    RAG --> EMB
    API <--> DB
    API <--> FILES
Loading

The backend is deliberately split by responsibility:

Layer Responsibility
main.py HTTP routes and response mapping
job_service.py video job orchestration
job_store.py SQLite CRUD for jobs
video_pipeline.py media probe, audio extraction, transcription
transcript_chunker.py semantic transcript chunking
knowledge_card_service.py card persistence and updates
card_embedding_service.py card text -> embedding workflow
rag_service.py card retrieval baseline
desktop_server.py packaged backend sidecar entrypoint

Knowledge Cards

A card is stored as structured data, not just markdown text.

{
  "title": "Singular Value Decomposition",
  "summary": "SVD factors a matrix into orthogonal and diagonal structure.",
  "tags": ["linear algebra", "matrix factorization"],
  "source_start_seconds": 724.0,
  "source_end_seconds": 738.0,
  "claims": [
    {
      "text": "SVD decomposes a matrix using orthogonal and diagonal components.",
      "evidence": [
        {
          "text": "called the singular value decomposition",
          "start_seconds": 724.0,
          "end_seconds": 738.0
        }
      ]
    }
  ],
  "question": "What structure does SVD use to factor a matrix?",
  "answer": "It uses orthogonal matrices and a diagonal matrix."
}

This shape makes later work possible: duplicate detection, related-card search, graph edges, citation-aware RAG, and feedback-based evaluation.

API Surface

Selected endpoints:

Endpoint Purpose
POST /videos upload and register a local video
POST /jobs/{job_id}/run run probe -> audio -> transcription
GET /jobs/{job_id}/transcript return timestamped transcript segments
POST /jobs/{job_id}/chunks generate semantic transcript chunks
POST /jobs/{job_id}/cards/auto-generate generate cards from chunks
GET /jobs/{job_id}/cards list cards for one video job
PATCH /cards/{card_id} edit a saved card
POST /cards/{card_id}/embedding embed one card
POST /courses/{course_id}/card-embeddings embed all cards in a course
POST /rag/retrieve retrieve relevant cards for a question
POST /jobs/{job_id}/cards/export/markdown/folder export one job as Markdown
POST /cards/export/markdown/folder export all cards as Markdown
GET /runtime/status inspect local runtime dependencies

Tests

Backend:

cd backend
uv run pytest

Frontend:

cd frontend
npm.cmd run build

Tauri:

cd frontend\src-tauri
cargo check

Sidecar smoke test:

powershell -NoProfile -ExecutionPolicy Bypass -File .\scripts\test-desktop-backend.ps1

Roadmap

Near term:

  • improve automatic card generation reliability;
  • improve semantic chunk boundary quality;
  • detect duplicate or near-duplicate cards with embeddings;
  • show related cards in the UI;
  • turn card retrieval into a citation-grounded answer assistant;
  • add evaluation records for latency, unsupported claims, duplicates, and retrieval misses.

Longer term:

  • build a card similarity graph;
  • add relation types such as prerequisite, example_of, contrast_with, and part_of;
  • support human-in-the-loop graph editing;
  • compare ordinary dense RAG against graph-guided retrieval;
  • use user edits and save/delete decisions as a feedback dataset for future agentic learning loops.

Project Principles

  • Local data should stay local by default.
  • Claims should be traceable to evidence.
  • SQLite should remain the durable source of truth.
  • Markdown should be portable, inspectable, and tool-friendly.
  • Advanced AI features should be compared against simple baselines.
  • User corrections should become evaluation data before they become training data.

License

To be determined before the first public release.

About

A local-first learning tool for technical course videos: import videos and subtitles, click terms to create timestamped concept cards with context, and review them through traceable RAG over course subtitles and notes.

Resources

Stars

Watchers

Forks

Packages

 
 
 

Contributors