KnowBear is a personal full-stack AI explanation workspace focused on clean architecture, resilient streaming UX, and reliable runtime behavior.
This project is intentionally built with production-style concerns:
- Frontend with strong state/data boundaries (
zustand+zod) - Backend with FastAPI, structured logging, middleware, and consistent error contracts
- Streaming generation pipeline with fallback behavior
- Redis-backed caching and token rate limiting with in-memory fallback
- Dockerized local setup and one-command local startup
- Frontend: React, Vite, TypeScript, Tailwind, Zustand, Zod
- Backend: FastAPI, Pydantic, Uvicorn
- AI layer: Groq + Gemini via provider abstraction
- Web search & Retrieval: Tavily / Serper / Exa integration hooks
- Caching & limits: Upstash Redis (optional), in-memory fallback
+---------------------------+
| React Frontend |
| - Search + mode UI |
User Browser | - Zustand global store |
------------- | - Zod runtime parsing |
| +-------------+-------------+
| HTTP + SSE |
v v
+----+--------------------------------+----+
| FastAPI Backend |
| /api/query /api/query/stream |
| /api/pinned /api/health |
| /api/keep-alive |
| |
| Middleware: CORS, security headers, |
| request-id logging, exception handlers |
+----+-------------------------+------------+
| |
| |
v v
+----+----------------+ +---+----------------------+
| Inference Services | | Cache / Rate Limiting |
| - ensemble/fast | | - Upstash Redis (opt) |
| - provider routing | | - in-memory fallback |
| - stream generation | +--------------------------+
+----+----------------+
|
v
+----+----------------+ +------------------------+
| LLM Providers | | Retrieval Providers |
| Groq / Gemini | | Tavily / Serper / Exa |
+---------------------+ +------------------------+
GET /api/pinned- curated starter topicsPOST /api/query- generate one or more explanation levelsPOST /api/query/stream- stream explanations over SSEGET /api/health- service health metadataGET /api/keep-alive- lightweight warmup probe (includes Upstash ping/get if configured)
- Node.js 20+
- Python 3.10+
npm run local:startWhat it does:
- Installs frontend dependencies (
npm install) - Installs backend dependencies (
python -m pip install -r api/requirements.txt) - Starts frontend on
http://localhost:5173 - Starts backend on
http://localhost:8000
docker compose up --buildServices/ports:
- Frontend:
5173 - Backend:
8000
Create a .env file (or set environment variables directly).
Commonly used:
VITE_API_URL(example:http://localhost:8000)GROQ_API_KEYGEMINI_API_KEY(optional fallback/judge path)TAVILY_API_KEY/SERPER_API_KEY/EXA_API_KEY(optional retrieval providers)UPSTASH_REDIS_REST_URL/UPSTASH_REDIS_REST_TOKEN(optional distributed cache/limits)
- Runtime validation with Zod at frontend API boundaries to prevent invalid payload propagation.
- Centralized Zustand store with persisted user preferences and non-persisted transient runtime state.
- Streaming resilience with fallback to non-stream response path when SSE conditions are degraded.
- Regenerate path bypasses cache and forwards temperature, enabling controlled response variation.
npm run type-check
npm run test -- --run
python -m compileall -q apisrc/ # React frontend
api/ # FastAPI backend
Dockerfile
docker-compose.yml
KnowBear is a personal project that evolved into a focused, depth-first explanation system with practical reliability features across frontend and backend.
KnowBear began as a lean web application built around single query-response interactions: fast, stateless, and minimal by design. That core philosophy remains intact.
It later expanded into a more ambitious chat platform with three dedicated workspace modes, built for multi-turn conversations and collaborative workflows. The experiment was valuable, but it also made the product heavier and less focused than intended.
The project was then reoriented toward a clearer objective: a depth-first, explanation-driven system powered by RAG (Retrieval-Augmented Generation). That iteration remains a work in progress.
The current version reflects the lessons from each phase. It is a streamlined, production-ready API focused on delivering deeper answers with relevant live retrieval context, while preserving the speed, simplicity, and low-latency experience that made the original concept effective.