WikiBot AI is a sophisticated Retrieval-Augmented Generation (RAG) system that allows users to chat with Wikipedia content through a premium, modern web interface. It combines the power of vector databases with local LLMs (Large Language Models) to provide accurate, context-aware answers.
- 🔍 Advanced RAG: Intelligent retrieval from the Wikipedia database using ChromaDB and Sentence Transformers.
- 🧠 Context-Aware Chat: Remembers previous interactions for seamless multi-turn conversations.
- 🛑 Real-time Control: Stop generation mid-stream with a dedicated abort button.
- 🖼️ Premium UI: State-of-the-art glassmorphism design with responsive layouts and micro-animations.
- 🖨️ Professional Print: Export formatted chat histories directly to PDF or paper.
- ⚡ Streaming Responses: Watch the bot think and type in real-time.
- Backend: Python, FastAPI, Uvicorn
- Vector Database: ChromaDB
- Embeddings: `all-MiniLM-L6-v2` (Sentence Transformers)
- LLM Connection: OpenAI-compatible API (designed for LM Studio / llama.cpp)
- Frontend: Vanilla HTML5, CSS3 (Glassmorphism), Modern JavaScript (ES6+)
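To illustrate what the retrieval layer does at query time, here is a minimal pure-Python sketch of cosine-similarity search. This is an illustration only, not the actual ChromaDB or Sentence-Transformers API; the toy 3-dimensional vectors stand in for `all-MiniLM-L6-v2`'s real 384-dimensional embeddings.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

def retrieve(query_vec, store, k=2):
    """Return the k documents whose embeddings are most similar to the query."""
    ranked = sorted(store, key=lambda item: cosine(query_vec, item[0]), reverse=True)
    return [doc for _, doc in ranked[:k]]

# Toy store: (embedding, document) pairs, as a vector database would hold them.
store = [
    ([1.0, 0.0, 0.0], "Article about Paris"),
    ([0.9, 0.1, 0.0], "Article about France"),
    ([0.0, 0.0, 1.0], "Article about photosynthesis"),
]

results = retrieve([1.0, 0.05, 0.0], store)
print(results)  # the two geography articles rank first
```

In the real pipeline, ChromaDB performs this ranking over embeddings produced by the Sentence-Transformers model, and the top results are injected into the LLM prompt as context.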
- Python 3.9+
- LM Studio or another local LLM server running at `http://localhost:1234`
```bash
# Clone the repository
git clone https://github.com/yourusername/wikipediarag.git
cd wikipediarag

# Install dependencies
pip install -r requirements.txt
```

To populate the vector database with Wikipedia articles:
- Place your `simplewiki-latest-pages-articles.xml.bz2` in the root folder.
- Run the ingestion script:

```bash
python ingest.py
```

Then start the server:

```bash
python server.py
```

Open your browser and navigate to `http://localhost:8000`.
```
├── static/
│   ├── index.html       # Semantic UI structure
│   ├── index.css        # Premium glassmorphism styles
│   └── script.js        # Logic for streaming & history
├── ingest.py            # Wikipedia XML processor & embedder
├── query.py             # CLI testing tool for queries
├── server.py            # FastAPI backend with streaming
└── requirements.txt     # Python dependencies
```
- `GET /`: Serves the frontend application.
- `POST /chat`: Accepts `{query: string, history: Array}` and returns a text stream.
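The `/chat` round trip can be sketched as follows. This is a self-contained mock, assuming the server streams plain-text chunks from a generator (the real endpoint lives in `server.py`); the example answer text is invented for illustration.

```python
import json

# Shape of the body POSTed to /chat: the new query plus prior turns.
payload = {
    "query": "Who founded it?",
    "history": [
        {"role": "user", "content": "Tell me about Wikipedia."},
        {"role": "assistant", "content": "Wikipedia is a free online encyclopedia."},
    ],
}
body = json.dumps(payload)

# Server side (sketch): a streaming response wraps a generator like this,
# so the client sees tokens as they are produced rather than one final blob.
def token_stream(answer):
    for token in answer.split(" "):
        yield token + " "

# Client side (sketch): accumulate chunks as they arrive.
received = "".join(token_stream("Wikipedia was founded in 2001")).strip()
print(received)
```

Streaming the text chunk by chunk is what lets the frontend render the answer as it is generated instead of waiting for the full completion.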
- Stop Button: Uses `AbortController` to terminate server requests safely.
- Print Utility: Uses `@media print` queries to strip UI elements and format text for documentation.
- History Management: Local state tracking ensures the LLM understands pronouns and follow-up context.
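The history-management idea can be sketched in a few lines. These are hypothetical helpers, not code from `script.js` (the frontend keeps equivalent state in JavaScript); the `MAX_TURNS` limit is an assumed value chosen for illustration.

```python
# Keep only the most recent turns so the prompt stays within the model's
# context window while follow-ups ("she", "that city") still resolve.
MAX_TURNS = 6  # assumption: 6 user/assistant pairs of context

def trim_history(history, max_turns=MAX_TURNS):
    """Drop the oldest messages, keeping the last max_turns pairs."""
    return history[-2 * max_turns:]

def build_messages(history, query):
    """Combine trimmed history with the new query, OpenAI-message style."""
    return trim_history(history) + [{"role": "user", "content": query}]

history = [
    {"role": "user", "content": "Tell me about Marie Curie."},
    {"role": "assistant", "content": "Marie Curie was a physicist and chemist."},
]
messages = build_messages(history, "What did she win?")
print(len(messages))  # 3: two prior messages plus the follow-up
```

Sending the trimmed history alongside each query is what lets the LLM resolve "she" in the follow-up to Marie Curie without any server-side session state.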
MIT License - feel free to use this project for your own learning or internal tools!