AI-Powered YouTube Lecture Notes Generator with Multi-Agent Orchestration
Transform any YouTube educational video into comprehensive, structured lecture notes, plus a catalogue of the AI tools it mentions, streamed in real time.
- Real-Time Streaming: ChatGPT-style SSE streaming shows progress as notes generate
- Multi-Agent Orchestration: LangGraph coordinates 3 specialized agents; notes generation and tool extraction run in parallel once the transcript is fetched
- Smart Caching: 99% cost reduction on repeat videos (7-day PostgreSQL cache)
- AI Tool Extraction: Automatically identifies and catalogues AI tools mentioned in videos
- PDF Export: Download professional PDFs with markdown-formatted notes
- Processing History: Full history with search, pagination, and detailed result views
- Dark Mode: System-aware theme with animated toggle
- Preset Videos: 7 curated educational videos for instant demos
- Horizontal Scrolling: Touch-optimized card carousels
- Cross-Browser: Safari 16.4+, Chrome 120+, Firefox 128+, Edge 120+
- Mobile-Responsive: Optimized layouts for all screen sizes
- Accessible: WCAG-compliant typography and keyboard navigation
┌─────────────────────────────────────────────────────────────┐
│                     User (YouTube URL)                      │
└───────────────────────────┬─────────────────────────────────┘
                            │
                            ▼
┌─────────────────────────────────────────────────────────────┐
│  Frontend (Next.js 15 + React 19 + Tailwind v4)             │
│  ┌──────────────────────────────────────────────────────┐   │
│  │ - EventSource SSE streaming                          │   │
│  │ - Server Components for history pages                │   │
│  │ - Magic UI animations (beams, cards, theme toggle)   │   │
│  │ - react-markdown with @tailwindcss/typography        │   │
│  └──────────────────────────────────────────────────────┘   │
└───────────────────────────┬─────────────────────────────────┘
                            │ GET /api/process/stream
                            ▼
┌─────────────────────────────────────────────────────────────┐
│  Backend (FastAPI 0.115 + LangGraph)                        │
│  ┌──────────────────────────────────────────────────────┐   │
│  │ Smart Cache Check (PostgreSQL)                       │   │
│  │   ├─ CACHE HIT → Stream from DB (instant)            │   │
│  │   └─ CACHE MISS → Multi-Agent Orchestration          │   │
│  └──────────────────────────────────────────────────────┘   │
└───────────────────────────┬─────────────────────────────────┘
                            │
                            ▼
┌─────────────────────────────────────────────────────────────┐
│  LangGraph Multi-Agent StateGraph                           │
│                                                             │
│            Agent 1: Transcript Extractor                    │
│                          │                                  │
│               ┌──────────┴──────────┐                       │
│               ▼                     ▼                       │
│           Agent 2:              Agent 3:                    │
│           Notes Gen             Tool Extract                │
│           (Gemini)              (GPT-4o-mini)               │
│           [PARALLEL]            [PARALLEL]                  │
│               │                     │                       │
│               └──────────┬──────────┘                       │
│                          ▼                                  │
│                    Merge Results                            │
│                          │                                  │
│                          ▼                                  │
│                 Save to PostgreSQL                          │
│                          │                                  │
│                          ▼                                  │
│                 Stream to Frontend                          │
└─────────────────────────────────────────────────────────────┘
Technology Stack:
- Frontend: Next.js 15.5.6, React 19, Tailwind CSS v4, shadcn/ui, Magic UI
- Backend: FastAPI 0.115, LangGraph 1.0, SQLAlchemy 2.0 (async)
- AI/LLM: Gemini 2.5 Flash (notes), GPT-4o-mini (tool extraction)
- Database: PostgreSQL (Neon), Alembic migrations
- Deployment Ready: Vercel (frontend), Railway (backend)
- Python: 3.11+ (backend)
- Node.js: 18+ (frontend)
- PostgreSQL: Neon account (free tier) or local PostgreSQL
- API Keys:
  - GEMINI_API_KEY (or GOOGLE_API_KEY)
  - OPENAI_API_KEY
# 1. Navigate to backend directory
cd backend
# 2. Create virtual environment
python3.11 -m venv venv
source venv/bin/activate # On Windows: venv\Scripts\activate
# 3. Install dependencies
pip install -r requirements.txt
# 4. Configure environment variables
cp .env.example .env
# Edit .env with your API keys and DATABASE_URL
# 5. Run database migrations
alembic upgrade head
# 6. Start the server
uvicorn app.main:app --reload --port 8000
Backend will be available at http://localhost:8000
API docs (Swagger): http://localhost:8000/docs
# 1. Navigate to frontend directory
cd frontend
# 2. Install dependencies
npm install
# 3. Start development server
npm run dev
Frontend will be available at http://localhost:3000
- Enter YouTube URL in the input field on the home page
- Click "Generate Notes" or try a preset video
- Watch real-time streaming as:
  - Video metadata appears
  - Transcript is fetched
  - Notes generate chunk-by-chunk
  - AI tools are extracted
- Export to PDF or save to history
- Navigate to History page via top-right icon
- Search, sort, and filter past processing results
- Click any result to view full details
- Delete unwanted entries
If a video was cached, you'll see a "Force Reprocess" button to bypass the cache and regenerate notes fresh.
- GET /api/process/stream?video_url={url}&force={bool} - SSE streaming endpoint
  - Parameters:
    - video_url (required): YouTube video URL
    - force (optional): Bypass cache (default: false)
  - Events: metadata, transcript, notes_chunk, tools, complete, error
- GET /api/history?page={int}&page_size={int}&search={str} - Paginated history list
- GET /api/history/{result_id} - Single result details
- DELETE /api/history/{result_id} - Delete processing result
- GET /api/presets - List of 7 curated demo videos
- POST /api/export-pdf - Generate and download PDF
  - Body: { video_id, title, notes, tools }
- GET /health - API health status
- GET / - API information
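As an illustration of consuming the streaming endpoint outside the browser, here is a minimal Python sketch that builds the request URL and parses raw SSE text into events. The helper names (build_stream_url, parse_sse, SSEEvent) are hypothetical, and the payload carried in each data field is an assumption; only the endpoint path, the two query parameters, and the event names come from the list above.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator, List
from urllib.parse import urlencode


@dataclass
class SSEEvent:
    event: str  # e.g. "metadata", "notes_chunk", "complete"
    data: str   # raw payload text; its internal format is an assumption


def build_stream_url(base_url: str, video_url: str, force: bool = False) -> str:
    """Build the streaming URL with percent-encoded query parameters."""
    query = urlencode({"video_url": video_url, "force": str(force).lower()})
    return f"{base_url}/api/process/stream?{query}"


def parse_sse(lines: Iterable[str]) -> Iterator[SSEEvent]:
    """Yield one SSEEvent per blank-line-terminated block of SSE text."""
    event = "message"          # default event type when no "event:" field is sent
    data: List[str] = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if not line:
            # A blank line ends the current event block.
            if data:
                yield SSEEvent(event, "\n".join(data))
            event, data = "message", []
        elif line.startswith(":"):
            # Comment line, often used as a keep-alive ping.
            continue
        else:
            field, _, value = line.partition(":")
            if value.startswith(" "):
                value = value[1:]  # a single leading space is stripped
            if field == "event":
                event = value
            elif field == "data":
                data.append(value)
```

Any HTTP client that can iterate over response lines can feed parse_sse; in the browser, EventSource performs the same parsing natively.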
# Required
GEMINI_API_KEY=your_gemini_api_key_here
OPENAI_API_KEY=your_openai_api_key_here
DATABASE_URL=postgresql+asyncpg://user:pass@host/db
# Optional
ENVIRONMENT=development
LOG_LEVEL=INFO
- Cache Duration: 7 days
- Storage: PostgreSQL (no Redis needed)
- Cache Key: Video ID
- Behavior:
  - First request: Process via APIs → Save to DB → Stream results
  - Subsequent requests: Stream from DB instantly
- Cost Impact: 99% reduction on API costs for repeat videos
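The caching behavior above can be sketched roughly as follows. This is not the project's code: ResultCache is an in-memory stand-in for the PostgreSQL table, and extract_video_id, process, and run_pipeline are hypothetical names. Only the 7-day duration, the video-ID cache key, and the force bypass come from this README.

```python
from datetime import datetime, timedelta, timezone
from typing import Callable, Dict, Optional, Tuple
from urllib.parse import parse_qs, urlparse

CACHE_TTL = timedelta(days=7)
YOUTUBE_HOSTS = {"youtube.com", "www.youtube.com", "m.youtube.com"}


def extract_video_id(video_url: str) -> Optional[str]:
    """Return the video ID from common YouTube URL shapes, or None."""
    parsed = urlparse(video_url)
    host = (parsed.hostname or "").lower()
    if host == "youtu.be":
        return parsed.path.lstrip("/") or None
    if host in YOUTUBE_HOSTS:
        ids = parse_qs(parsed.query).get("v")
        return ids[0] if ids else None
    return None


class ResultCache:
    """In-memory stand-in for the PostgreSQL cache table (hypothetical)."""

    def __init__(self) -> None:
        self._rows: Dict[str, Tuple[datetime, dict]] = {}

    def get(self, video_id: str, now: datetime) -> Optional[dict]:
        row = self._rows.get(video_id)
        if row is None:
            return None
        saved_at, result = row
        # Entries older than the TTL count as a miss.
        return result if now - saved_at <= CACHE_TTL else None

    def put(self, video_id: str, result: dict, now: datetime) -> None:
        self._rows[video_id] = (now, result)


def process(
    video_url: str,
    cache: ResultCache,
    run_pipeline: Callable[[str], dict],
    force: bool = False,
    now: Optional[datetime] = None,
) -> Tuple[dict, bool]:
    """Return (result, cache_hit); force=True skips the cache lookup."""
    now = now or datetime.now(timezone.utc)
    video_id = extract_video_id(video_url)
    if video_id is None:
        raise ValueError(f"not a recognizable YouTube URL: {video_url}")
    if not force:
        cached = cache.get(video_id, now)
        if cached is not None:
            return cached, True
    result = run_pipeline(video_id)   # the only step that would call paid APIs
    cache.put(video_id, result, now)
    return result, False
```

Keying on the video ID rather than the raw URL means different URL shapes for the same video share one cache entry.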
- Agent 1 (Transcript Extractor):
  - Fetches YouTube transcript via youtube-transcript-api
  - Handles multiple languages, auto-generated captions
  - Returns structured transcript data
- Agent 2 (Notes Generator):
  - Uses Gemini 2.5 Flash for summarization
  - Generates markdown-formatted lecture notes
  - Follows ChatGPT-style voice and structure
- Agent 3 (Tool Extractor):
  - Uses GPT-4o-mini to identify AI tools mentioned
  - Returns structured JSON with tool names, URLs, descriptions
  - Runs in parallel with Agent 2 for efficiency
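The execution order described above (Agent 1 first, then Agents 2 and 3 concurrently, then a merge) can be illustrated with plain asyncio. This is an analogue of the flow, not the project's LangGraph StateGraph; the function names and return shapes are hypothetical, and the agent bodies are placeholders that make no network calls.

```python
import asyncio
from typing import Dict, List


async def extract_transcript(video_id: str) -> str:
    """Agent 1 placeholder: the real agent would call youtube-transcript-api."""
    await asyncio.sleep(0)
    return f"transcript for {video_id}"


async def generate_notes(transcript: str) -> str:
    """Agent 2 placeholder: the real agent would prompt Gemini 2.5 Flash."""
    await asyncio.sleep(0)
    return f"# Notes\n\n{transcript}"


async def extract_tools(transcript: str) -> List[Dict[str, str]]:
    """Agent 3 placeholder: the real agent would prompt GPT-4o-mini."""
    await asyncio.sleep(0)
    return [{"name": "ExampleTool", "url": "", "description": "placeholder"}]


async def run_pipeline(video_id: str) -> dict:
    # Step 1: both downstream agents need the transcript, so fetch it first.
    transcript = await extract_transcript(video_id)
    # Step 2: fan out; gather runs both coroutines concurrently and
    # returns their results in argument order.
    notes, tools = await asyncio.gather(
        generate_notes(transcript),
        extract_tools(transcript),
    )
    # Step 3: merge into one result for saving and streaming.
    return {"video_id": video_id, "notes": notes, "tools": tools}
```

Because the two LLM calls are independent and I/O-bound, running them concurrently should bring the wait close to that of the slower call rather than the sum of both.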
- Server-Sent Events for unidirectional server-to-client streaming
- Automatic reconnection on connection loss
- Progressive rendering of markdown as chunks arrive
- Loading states with skeleton components and pulsing animations
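On the server side, each event has to be serialized into the SSE wire format before it is written to the response. The sketch below shows one way to do that under stated assumptions: format_sse and stream_result are hypothetical helpers, the JSON payload shapes are invented for illustration, and the transcript and error events are omitted for brevity.

```python
import json
from typing import Any, Iterator


def format_sse(event: str, payload: Any) -> str:
    """Serialize one event as an SSE frame: event line, data line, blank line."""
    # json.dumps escapes newlines, so the payload always fits on one data line.
    return f"event: {event}\ndata: {json.dumps(payload)}\n\n"


def stream_result(result: dict, chunk_size: int = 200) -> Iterator[str]:
    """Yield SSE frames for a finished result, splitting notes into chunks."""
    yield format_sse("metadata", {"video_id": result["video_id"], "title": result["title"]})
    notes = result["notes"]
    for start in range(0, len(notes), chunk_size):
        yield format_sse("notes_chunk", {"text": notes[start:start + chunk_size]})
    yield format_sse("tools", result["tools"])
    yield format_sse("complete", {})
```

A generator like this could be handed to a streaming HTTP response with the text/event-stream content type; the same generator would serve both cache hits and freshly processed results.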
- Safari Optimizations:
  - 25 BackgroundBeams (vs 50 on Chrome) for performance
  - Zero initial animation delay for instant visual feedback
  - -webkit- prefixes for backdrop-filter and transform
  - GPU acceleration via translateZ(0)
- Browser Support Matrix:
  - Safari 16.4+, Chrome 120+, Firefox 128+, Edge 120+
  - Requires modern CSS: @property, color-mix(), OKLCH colors
  - Tailwind CSS v4 with @plugin directive
# Backend tests (if implemented)
cd backend
pytest
# Frontend tests (if implemented)
cd frontend
npm test
# Create a new migration
cd backend
alembic revision --autogenerate -m "Description of changes"
# Apply migrations
alembic upgrade head
# Rollback last migration
alembic downgrade -1
# Backend linting (if ruff/black configured)
cd backend
ruff check .
black .
# Frontend linting
cd frontend
npm run lint
- Fresh Processing: 10-20 seconds (varies by video length)
- Cached Processing: <1 second (instant streaming from DB)
- Cost Savings: 99% reduction on repeat videos
- Browser Performance:
  - Chrome/Firefox: 50 animated beams
  - Safari: 25 animated beams (50% reduction for smooth 60fps)
- Video Length: Works best with videos <2 hours (transcript API limits)
- Languages: Primarily English transcripts (multi-language support via youtube-transcript-api)
- Browser Requirements: Safari 16.4+, Chrome 120+, Firefox 128+, Edge 120+ (required for Tailwind v4 CSS features)
- Rate Limits: Dependent on Gemini/OpenAI API quotas
Contributions are welcome! Improvements, bug fixes, and feature suggestions are appreciated.
- Fork the repository
- Create a feature branch (git checkout -b feature/amazing-feature)
- Commit your changes (git commit -m 'Add amazing feature')
- Push to the branch (git push origin feature/amazing-feature)
- Open a Pull Request
- LangGraph for the multi-agent orchestration framework
- shadcn/ui for beautiful, accessible UI components
- Magic UI for animated components (beams, cards, theme toggle)
- Tailwind CSS for utility-first styling
- The Next.js and React teams for the modern web framework
- FastAPI for the fast Python API framework
- Neon for serverless PostgreSQL