Endor Teach

learn deeply · remember always

A local-first AI learning companion that adapts to your knowledge level, builds a personal knowledge base, and generates high-quality LLM training data as a side effect of learning. Everything runs on your machine via Ollama.

Quick Start

git clone https://github.com/your-org/endor-teach
cd endor-teach
./endor.sh

endor.sh handles everything automatically and explains every sudo command before running it:

Creates a Python virtual environment (via uv, falling back to python3 -m venv)
Installs all dependencies into .venv/ — never touches system Python
Installs Docker Engine if missing (APT/DNF/pacman, OS-detected)
Starts SearXNG in Docker for private, rate-limit-free web search
Checks Ollama and pulls the chat + embedding models
Launches the FastAPI server

Re-run ./endor.sh at any time. It is self-healing: it restarts stopped containers, repairs broken venvs, and reinstalls missing packages.

System Requirements

Component	Minimum	Notes
Python	3.10+	Managed in `.venv/` — system Python untouched
RAM	8 GB	4 GB for model + 4 GB headroom
GPU	Optional	Faster inference + transcription
Docker	Optional	Auto-installed if missing; DDG fallback if unavailable
Ollama	Required	`curl -fsSL https://ollama.ai/install.sh | sh`

Features

1. Discovery-First Search Interface

The home screen is a search-first discovery interface:

Type anything — vLLM, Attention mechanism, Roman history
Focus modes: 🌐 Web · 📖 Wikipedia · 📄 Official Docs · 📰 News · 🔬 Papers
Category shortcuts: AI/ML · Technical · Science · History · Business · Medicine · Philosophy · Law · Math · Art & Culture
Results grouped by type — Wikipedia always appears first
Inline [+ Add] buttons on each result — select exactly what to learn from
Knowledge Map (right panel): SVG graph generated by the LLM showing how topics relate
- Grey dashed = prerequisites (learn first)
- Cyan lines = concurrent (learn alongside)
- Green lines = advanced (explore after mastery)
- Click any node to search that topic
Create Topic from N selected sources — research uses only your confirmed sources

2. Topic Disambiguation

Before research begins, the app:

Queries the Wikipedia API directly (reliable, no rate limits, no site: operator)
Uses your description ("for model inference") to bias search queries
Shows a confirmation modal so you always verify what you're learning
Passes the confirmed URL(s) directly to research_and_embed() as source anchors
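The Wikipedia lookup in the first step can be sketched with the public MediaWiki `opensearch` action. This is a minimal illustration, not the app's actual code: the helper names (`build_opensearch_url`, `parse_candidates`, `fetch_candidates`) and the User-Agent string are hypothetical.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request, urlopen

WIKI_API = "https://en.wikipedia.org/w/api.php"


def build_opensearch_url(term: str, limit: int = 5) -> str:
    """Build a MediaWiki opensearch URL for the given term (article namespace only)."""
    params = {
        "action": "opensearch",
        "search": term,
        "limit": limit,
        "namespace": 0,
        "format": "json",
    }
    return f"{WIKI_API}?{urlencode(params)}"


def parse_candidates(payload: list) -> list[dict]:
    """Convert an opensearch response [query, titles, descriptions, urls] into candidates."""
    _query, titles, _descriptions, urls = payload
    return [{"title": t, "url": u} for t, u in zip(titles, urls)]


def fetch_candidates(term: str, limit: int = 5) -> list[dict]:
    """Network call: return candidates that a confirmation modal could display."""
    req = Request(
        build_opensearch_url(term, limit),
        headers={"User-Agent": "endor-teach-example/0.1"},  # hypothetical UA string
    )
    with urlopen(req, timeout=10) as resp:
        return parse_candidates(json.load(resp))
```

The URL the user confirms from such a candidate list is what would then be handed to research_and_embed() as a source anchor.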

3. Adaptive Learning (Dreyfus + Maslow Pedagogy Framework)

Every session is guided by tools/pedagogy.py — a silent teacher that adapts to your level:

Dreyfus Skill Acquisition Model (based on mastery score):

Stage	Mastery	LLM Teaching Style
Novice	0–20%	Clear rules, analogies, one concept at a time
Advanced Beginner	20–40%	Patterns, real-world context, common mistakes
Competent	40–60%	Tradeoffs, multi-step problems, Socratic guidance
Proficient	60–80%	Architecture discussions, open-ended evaluation
Expert	80–100%	Frontier challenges, cross-domain synthesis

Maslow Hierarchy of Learning Needs (maps to stage):

Foundation (Safety) — What is this?
Connection (Belonging) — How does it relate to what I know?
Confidence (Esteem) — Can I apply and explain it?
Fluency (Self-actualization) — Can I create and extend it?

The pedagogy hint bar in each topic shows your Dreyfus stage badge and recommends the most valuable next action (e.g. "Generate flashcards — lock in the core definitions").

Quiz questions are Bloom-leveled per stage: Novice → Remember/Understand; Expert → Evaluate/Create.
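The stage table above amounts to a lookup from mastery score to teaching style. The sketch below is one way to express it and is not taken from tools/pedagogy.py. Because the published ranges share their endpoints (20% appears in two rows), it assumes a boundary value belongs to the higher stage; the names `Stage` and `dreyfus_stage` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Stage:
    name: str
    upper: float  # exclusive upper bound on mastery, in percent
    style: str


# Names, ranges, and styles are copied from the stage table.
STAGES = [
    Stage("Novice", 20, "Clear rules, analogies, one concept at a time"),
    Stage("Advanced Beginner", 40, "Patterns, real-world context, common mistakes"),
    Stage("Competent", 60, "Tradeoffs, multi-step problems, Socratic guidance"),
    Stage("Proficient", 80, "Architecture discussions, open-ended evaluation"),
    Stage("Expert", 100, "Frontier challenges, cross-domain synthesis"),
]


def dreyfus_stage(mastery: float) -> Stage:
    """Map a mastery percentage (0-100) to a stage.

    Assumption: a boundary value such as 20% counts toward the higher stage.
    """
    if not 0 <= mastery <= 100:
        raise ValueError("mastery must be between 0 and 100")
    for stage in STAGES:
        if mastery < stage.upper:
            return stage
    return STAGES[-1]  # exactly 100%
```

Under this assumption a topic at exactly 80% is already Expert, which lines up with the 80% threshold used to unlock the Daily Quiz.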

4. Learning Modes

Tab	Description
💬 Chat	RAG-powered conversation; sources from your selected pages; pedagogy-adapted system prompt
🃏 Cards	Spaced-repetition flashcards with SM-2 scheduling (Again/Hard/Good/Easy)
📝 Quiz	Bloom's taxonomy-leveled questions; scores 0–5; auto-calculates mastery
🔗 Connect	AI-generated conceptual connections between two of your topics
🌐 Sources	All indexed pages for this topic; each source's embeddings are stored for RAG
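For readers unfamiliar with SM-2, the scheduling behind the Cards tab can be sketched as below. This follows one common reading of the published SM-2 rules. The mapping from the four buttons to SM-2 quality grades is an assumption, and `Card` and `review` are hypothetical names rather than the app's own.

```python
from dataclasses import dataclass

# Assumed mapping from the review buttons to SM-2 quality grades (0-5).
BUTTON_QUALITY = {"again": 1, "hard": 3, "good": 4, "easy": 5}


@dataclass
class Card:
    repetitions: int = 0    # consecutive successful reviews
    interval_days: int = 0  # days until the next review
    ease: float = 2.5       # SM-2 easiness factor, never below 1.3


def review(card: Card, button: str) -> Card:
    """Return the card's new scheduling state after one review."""
    q = BUTTON_QUALITY[button]
    if q < 3:
        # Failed recall: restart the sequence and leave the ease factor unchanged.
        return Card(repetitions=0, interval_days=1, ease=card.ease)
    if card.repetitions == 0:
        interval = 1
    elif card.repetitions == 1:
        interval = 6
    else:
        interval = round(card.interval_days * card.ease)
    # Standard SM-2 ease update, clamped at 1.3.
    ease = max(1.3, card.ease + 0.1 - (5 - q) * (0.08 + (5 - q) * 0.02))
    return Card(card.repetitions + 1, interval, ease)
```

Three consecutive "good" reviews on a new card give intervals of 1, 6, and 15 days; a single "again" resets the interval to 1 day.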

5. Word Lookup Popover (double-click any word in chat)

Double-click any word or phrase while reading chat responses:

Instant Wikipedia summary in a floating popover (anchored near selection)
"+ Learn this": adds the Wikipedia article as a live source to your current topic, and can optionally open the discovery search to create a full topic from it
"Open Wikipedia ↗" — full article in new tab
Dismisses on click outside; non-blocking

6. Daily Quiz

Unlocks when ≥1 topic reaches 80% mastery. Tests consolidated knowledge — prevents quizzing on topics you haven't actually learned yet. Maintains a daily streak counter.

7. Search Result Caching & Offline Retrieval

Every fetched web page is stored in search_cache with vector embeddings:

No re-fetching previously seen pages
Cross-topic vector search over all cached content
Semantic cache lookup runs before any live request
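A semantic cache check of this kind could look roughly like the following. It is a sketch only: the in-memory list stands in for the search_cache table, and the 0.85 similarity threshold and the function names are assumptions, not values from the project.

```python
import math


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def cache_lookup(query_vec, cache, threshold=0.85):
    """Return (url, text) of the most similar cached page, or None for a cache miss.

    cache: list of (url, embedding, text) tuples, a hypothetical stand-in
    for rows of search_cache.
    """
    best, best_score = None, threshold
    for url, embedding, text in cache:
        score = cosine(query_vec, embedding)
        if score >= best_score:
            best, best_score = (url, text), score
    return best
```

A `None` result is the signal to fall through to a live search and then store the fetched page with its embedding.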

8. Automatic Training Data Generation ("Sleep Consolidation")

Every quiz answer scoring 4–5/5 is automatically harvested into training_pairs as a gold fine-tuning pair. Over many sessions this builds a high-quality, Bloom-labeled, source-attributed dataset: the long-term layer of the knowledge architecture.

Export all data: Settings → Export Training Data
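The harvesting rule is simple enough to sketch with the standard sqlite3 module. Only the table name training_pairs and the 4-out-of-5 threshold come from this README. The column layout and the `init_db` and `harvest` helpers are hypothetical and may differ from tools/database.py.

```python
import sqlite3

GOLD_THRESHOLD = 4  # answers scoring 4-5 out of 5 are harvested


def init_db(conn: sqlite3.Connection) -> None:
    # Hypothetical columns; only the table name comes from the README.
    conn.execute(
        """
        CREATE TABLE IF NOT EXISTS training_pairs (
            id          INTEGER PRIMARY KEY,
            prompt      TEXT NOT NULL,
            completion  TEXT NOT NULL,
            bloom_level TEXT,
            source_url  TEXT,
            score       INTEGER NOT NULL
        )
        """
    )


def harvest(conn, prompt, completion, score, bloom_level=None, source_url=None) -> bool:
    """Store the pair only if it meets the gold threshold; return whether it was stored."""
    if score < GOLD_THRESHOLD:
        return False
    conn.execute(
        "INSERT INTO training_pairs (prompt, completion, bloom_level, source_url, score) "
        "VALUES (?, ?, ?, ?, ?)",
        (prompt, completion, bloom_level, source_url, score),
    )
    conn.commit()
    return True
```

Keeping the Bloom level and source URL on each row is what would make the exported dataset Bloom-labeled and source-attributed.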

Architecture

endor-teach/
├── endor.sh              # Self-healing launch script
├── requirements.txt      # Python deps (installed into .venv/)
├── .venv/                # Python virtual environment
│
├── tools/
│   ├── app.py            # FastAPI server — all endpoints
│   ├── database.py       # SQLite ORM — all persistence
│   ├── rag.py            # RAG pipeline (search, fetch, embed, generate)
│   ├── pedagogy.py       # Dreyfus/Maslow framework — teaching orchestrator
│   ├── transcriber.py    # faster-whisper voice transcription
│   └── static/
│       ├── index.html
│       ├── app.js
│       └── style.css
│
└── data/
    ├── endor_teach.db    # SQLite (WAL mode, foreign keys)
    ├── config.json       # Runtime config
    └── sessions/         # FLAC voice recordings

Cognitive Architecture Mapping

The database mirrors human memory architecture:

Human Memory	Database Layer	Description
Working memory	`source_chunks`	Active RAG context (12-message window)
Hippocampus	`search_cache`	Fast-write episodic store with embeddings
Neocortex	`knowledge_nodes`	Consolidated, SM-2 scheduled concepts
Long-term	`training_pairs`	Gold fine-tuning data (score ≥ 4/5)

Key API Endpoints

Endpoint	Purpose
`GET /api/search`	Discovery search — grouped by wikipedia/docs/papers/news/web
`GET /api/related-topics`	LLM knowledge map suggestions
`GET /api/wiki-peek`	Fast Wikipedia summary for word lookup
`GET /api/disambiguate`	Disambiguation candidates
`POST /api/topics`	Create topic (accepts `source_urls[]`)
`DELETE /api/topics/{id}`	Delete topic and all data
`GET /api/topics/{id}/pedagogy`	Dreyfus stage + next recommended action
`POST /api/topics/{id}/sources/add`	Add URL to topic knowledge base (async)
`POST /api/topics/{id}/flashcards/generate`	Generate flashcards
`POST /api/topics/{id}/quiz/generate`	Generate Bloom-leveled quiz
`POST /api/quiz/answer`	Score answer + consolidate training data
`POST /api/sessions/{id}/chat`	SSE streaming chat
`GET /api/daily-quiz/status`	Quiz availability (requires mastery ≥ 80%)
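As an illustration of calling these endpoints, the sketch below builds a `POST /api/topics` request with the standard library. The base URL and port are assumptions (this README does not state them), and the `name` field is hypothetical; only `source_urls` is documented in the table above.

```python
import json
from urllib.request import Request

BASE_URL = "http://localhost:8000"  # assumption: host and port are not stated in the README


def create_topic_request(name: str, source_urls: list[str]) -> Request:
    """Build (but do not send) a JSON request that creates a topic from confirmed sources."""
    body = json.dumps({"name": name, "source_urls": source_urls}).encode("utf-8")
    return Request(
        f"{BASE_URL}/api/topics",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

With the server running, the request can be sent with `urllib.request.urlopen(create_topic_request(...))`. Check tools/app.py for the actual request schema before relying on these field names.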

Configuration

./endor.sh --model   # re-select chat model

Model	VRAM	Notes
`mistral`	4 GB	Fast, excellent for teaching (default)
`llama3.2:3b`	2 GB	Low-resource machines
`gemma2:9b`	5.5 GB	Strong reasoning
`gemma2:27b`	15 GB	Best quality

Web Search Priority

SearXNG (local Docker) — private, no rate limits
ddgs package — real web results
DDG HTML scrape — requests + bs4 only
DDG Instant Answer API — last resort
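The priority order above is a fallback chain: try each backend in turn and stop at the first one that returns results. Here is a minimal sketch of that pattern, with backends passed in as plain callables. The function name and the log format are hypothetical, and real backends would wrap SearXNG, ddgs, and the two DDG fallbacks.

```python
from typing import Callable, Iterable

SearchFn = Callable[[str], list[dict]]


def search_with_fallback(
    query: str, backends: Iterable[tuple[str, SearchFn]]
) -> tuple[str, list[dict]]:
    """Try backends in priority order; return (backend_name, results) from the first success."""
    for name, backend in backends:
        try:
            results = backend(query)
        except Exception as exc:  # a failing backend must not stop the chain
            print(f"[search] {name} failed: {exc}")
            continue
        if results:
            return name, results
    return "none", []
```

An empty result list is treated the same as a failure, so a backend that is reachable but returns nothing still hands off to the next one.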

Troubleshooting

"No results found" in search → Re-run ./endor.sh to ensure SearXNG is running. DDG fallback activates automatically.

Flashcards not generating → Ollama must be running (curl http://localhost:11434/api/tags). Generation takes 30–90 seconds. Check terminal for [flashcards] log lines.

Wrong topic content (e.g. law degree vs ML framework) → Delete the topic (✕ on the card), recreate via the search interface, confirm the correct Wikipedia article in the disambiguation modal.

Docker permission denied → Log out and back in after being added to the docker group, or run newgrp docker to open a shell with the new group membership and then run ./endor.sh inside that shell.
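The Ollama check from the flashcards item can also be scripted. The sketch below queries the same /api/tags endpoint as the curl command and checks whether a model has been pulled. The helper names are hypothetical, and treating a bare name such as mistral as equivalent to mistral:latest is an assumption about how tags are listed.

```python
import json
from urllib.error import URLError
from urllib.request import urlopen

OLLAMA_TAGS_URL = "http://localhost:11434/api/tags"


def installed_models(tags_payload: dict) -> set[str]:
    """Extract model names from an /api/tags response body."""
    return {m["name"] for m in tags_payload.get("models", [])}


def has_model(tags_payload: dict, model: str) -> bool:
    """True if the model is listed; a bare name also matches its ':latest' tag."""
    names = installed_models(tags_payload)
    return model in names or f"{model}:latest" in names


def check_ollama(model: str = "mistral") -> str:
    """Return 'ok', or a short description of what is wrong."""
    try:
        with urlopen(OLLAMA_TAGS_URL, timeout=5) as resp:
            payload = json.load(resp)
    except (URLError, OSError) as exc:
        return f"Ollama not reachable: {exc}"
    return "ok" if has_model(payload, model) else f"model {model!r} not pulled"
```

If `check_ollama()` reports that the server is unreachable, start Ollama and re-run ./endor.sh, which pulls the chat and embedding models.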
