Interactive 3D visualization of document chunk embeddings in semantic space for demonstrating Retrieval Augmented Generation (RAG) concepts.
NEW: Real-time interactive 3D visualization with dark mode, perfect for full-screen presentations!
- Document Chunking: Splits documents into overlapping chunks with configurable size
- Semantic Embeddings: Uses sentence-transformers to generate embeddings
- Dimensionality Reduction: Supports both UMAP and t-SNE for 2D/3D visualization
- Interactive Visualization: Hover over points to see chunk text and document source
- Color-coded by Document: Visually distinguish different source documents
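The overlapping chunking listed above can be illustrated with a minimal sketch. The function name `chunk_text` is hypothetical and is not taken from `rag_visualizer.py`; the sketch only assumes the character-based `chunk_size` and `overlap` semantics this README describes, and the project's own chunker may behave differently at the edges.

```python
def chunk_text(text, chunk_size=200, overlap=50):
    """Split text into character chunks; consecutive chunks share `overlap` characters.

    Hypothetical helper for illustration only.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must satisfy 0 <= overlap < chunk_size")

    step = chunk_size - overlap  # how far the window advances each time
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # this chunk already reaches the end of the text
        start += step
    return chunks


# Made-up sample text: 500 digits, so the overlap is easy to inspect.
sample = "".join(str(i % 10) for i in range(500))
pieces = chunk_text(sample, chunk_size=200, overlap=50)
print(len(pieces), [len(p) for p in pieces])
```

Because each chunk repeats the tail of the previous one, a sentence that straddles a chunk boundary still appears intact in at least one chunk, at the cost of embedding some text twice.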
Fastest way to get started:
# Launch Three.js version with UMAP
docker-compose up threejs-umap
# Access at http://localhost:5000
📘 See DOCKER.md for complete Docker documentation.
# Create virtual environment
python -m venv venv
source venv/bin/activate # Windows: venv\Scripts\activate
# Install dependencies
pip install -r requirements.txt
You have two visualization options:
Recommended for presentations - Full camera control, smooth interactions, no resets!
Launch the Three.js visualization:
# Simple launch (default: UMAP)
./launch_threejs.sh
# Or with PCA
./launch_threejs.sh --method pca
# Or manually
source venv/bin/activate
python server.py # Uses UMAP (default, port 5000)
python server.py --method pca --port 5000
Then open your browser to http://localhost:5000
Why Three.js is better:
- ✅ Perfect camera control - No unexpected resets
- ✅ Smooth keyboard navigation - WASD, QE, Arrow keys work flawlessly
- ✅ Better performance - Native WebGL rendering
- ✅ Professional look - Glow effects, smooth animations
- ✅ Full control - Mouse orbit, zoom, all interactions work perfectly
Launch the real-time interactive visualization:
# Simple launch (default: UMAP)
./launch.sh
# Or with PCA dimensionality reduction
./launch.sh --method pca
# Or manually
source venv/bin/activate
python interactive_rag_3d.py # Uses UMAP (default)
python interactive_rag_3d.py --method pca # Uses PCA
Then open your browser to http://localhost:8050
Dimensionality Reduction Methods:
- --method umap (default): Non-linear reduction that preserves both local and global structure, better for visualizing semantic clusters
- --method pca: Linear reduction that maximizes variance, faster and more deterministic but may not capture complex relationships as well
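For readers who want to wire similar flags into their own script, here is a minimal sketch of how `--method` and `--port` could be parsed with the standard library. `build_parser` is a hypothetical name, and this is not the code in `server.py` or `interactive_rag_3d.py`; it only mirrors the flags and defaults shown in this README (`--port` appears here only for `server.py`).

```python
import argparse


def build_parser():
    """Build a parser mirroring the flags shown in this README (hypothetical sketch)."""
    parser = argparse.ArgumentParser(
        description="Launch the 3D RAG visualization server"
    )
    parser.add_argument(
        "--method",
        choices=["umap", "pca"],
        default="umap",  # the README states UMAP is the default
        help="dimensionality reduction method",
    )
    parser.add_argument(
        "--port",
        type=int,
        default=5000,  # the README shows port 5000 for server.py
        help="port to serve on",
    )
    return parser


# Parse an explicit argument list instead of sys.argv so the example is repeatable.
args = build_parser().parse_args(["--method", "pca"])
print(args.method, args.port)  # pca 5000
```

Restricting `--method` with `choices` means a typo such as `--method pac` fails immediately with a usage message rather than partway through embedding.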
Features:
- 🌑 Dark mode theme - perfect for presentations
- ⚡ Real-time query embedding - type and see embeddings instantly
- 📊 79+ data points from extensive document collection
- 🎯 3D interactive plot - rotate, zoom, explore
- 💎 Query visualization - your queries appear as golden diamonds
- 🎨 Color-coded categories - ML, cooking, climate, history, quantum, etc.
Usage Tips:
- Type at least 3 characters to see your query embedded
- Hover over points to see the chunk text
- Rotate the 3D plot to explore semantic relationships
- Full-screen (F11) for best presentation experience
Run the demo to generate a static HTML file:
python rag_visualizer.py
This will generate rag_visualization.html which you can open in your browser.
from rag_visualizer import RAGVisualizer
# Create visualizer
visualizer = RAGVisualizer()
# Add your documents
documents = {
    "Document 1": "Your text here...",
    "Document 2": "More text...",
}
# Generate visualization
visualizer.visualize(
    documents=documents,
    chunk_size=200,  # Characters per chunk
    overlap=50,  # Overlapping characters
    method="umap",  # or "tsne"
    dimensions=2,  # 2D or 3D
    output_file="my_visualization.html",
)
- chunk_size: Number of characters per chunk (default: 200)
- overlap: Number of overlapping characters between chunks (default: 50)
- method: Dimensionality reduction method - "umap" or "tsne" (default: "umap")
- dimensions: 2 for 2D plot, 3 for 3D plot (default: 2)
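To get a feel for how `chunk_size` and `overlap` affect the number of points in the plot, the sketch below estimates the chunk count for a document of a given length. `estimate_chunk_count` is a hypothetical helper, and it assumes a sliding window that advances by `chunk_size - overlap` characters and stops once a chunk reaches the end of the text; the real chunker in `rag_visualizer.py` may count slightly differently.

```python
import math


def estimate_chunk_count(text_length, chunk_size=200, overlap=50):
    """Estimate how many chunks a sliding window yields for `text_length` characters.

    Hypothetical helper; assumes the window advances by (chunk_size - overlap).
    """
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must satisfy 0 <= overlap < chunk_size")
    if text_length <= 0:
        return 0
    if text_length <= chunk_size:
        return 1  # the whole text fits in a single chunk
    step = chunk_size - overlap
    # One chunk covers the first chunk_size characters; each further chunk
    # adds `step` new characters until the end of the text is reached.
    return math.ceil((text_length - chunk_size) / step) + 1


print(estimate_chunk_count(1000))               # 7
print(estimate_chunk_count(1000, overlap=100))  # 9
```

Under these assumptions, raising `overlap` from 50 to 100 on a 1,000-character document increases the count from 7 to 9 chunks, so larger overlaps mean more embeddings to compute and more points to render.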
The visualization will show:
- Documents about similar topics clustering together in semantic space
- Different colored clusters for different documents
- Interactive hover tooltips showing the actual chunk text
- Semantic relationships between document chunks
- Start with diverse documents: The sample includes ML, cooking, climate, history, and quantum computing to show clear separation
- Show the clustering: Point out how chunks from the same topic cluster together
- Demonstrate similarity: Show how related concepts (e.g., ML and quantum computing) are closer than unrelated ones
- Interactive exploration: During the talk, hover over points to show the actual text
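- Explain the implications: RAG retrieves relevant context by finding the query's nearest neighbors in the embedding space. The 2D/3D plot is a reduced projection of that space, so on-screen distances only approximate the real ones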
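The nearest-neighbor idea in the last tip can be shown with a small, dependency-free sketch. The vectors below are made-up 3-dimensional toy values, not real embeddings, and `cosine_similarity` and `top_k_chunks` are hypothetical helper names; a real pipeline would compare vectors produced by the sentence-transformers model, and may use a different similarity measure or a vector index.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k_chunks(query_vec, chunk_vecs, k=2):
    """Return the k (label, score) pairs most similar to query_vec, best first."""
    scored = [
        (label, cosine_similarity(query_vec, vec))
        for label, vec in chunk_vecs.items()
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Toy vectors invented for illustration; real embeddings have hundreds of dimensions.
chunks = {
    "ml: gradient descent": [0.9, 0.1, 0.0],
    "quantum: qubits": [0.6, 0.1, 0.5],
    "cooking: pasta": [0.0, 1.0, 0.1],
}
query = [1.0, 0.0, 0.1]  # stands in for an embedded ML question

for label, score in top_k_chunks(query, chunks, k=2):
    print(f"{score:.3f}  {label}")
```

With these toy values the ML chunk ranks first, the quantum chunk second, and the cooking chunk is left out, which mirrors the clustering the audience sees in the plot.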
For more control, use the class methods directly:
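from rag_visualizer import RAGVisualizer
# `documents` is a dict of {name: text}, as in the basic example
visualizer = RAGVisualizer(model_name="all-MiniLM-L6-v2")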
visualizer.add_documents(documents, chunk_size=200, overlap=50)
visualizer.generate_embeddings()
visualizer.reduce_dimensions(method="umap", n_components=2)
fig = visualizer.create_interactive_plot()
fig.show() # Display in Jupyter
fig.write_html("output.html") # Save to file
- Python 3.8+
- sentence-transformers: For generating embeddings
- plotly: For interactive visualization
- umap-learn: For UMAP dimensionality reduction
- scikit-learn: For t-SNE and utilities
- numpy: For numerical operations