A simple application for visualizing text embeddings generated from CSV files, using dimensionality reduction and clustering techniques.
- Upload CSV files containing text data
- Generate embeddings using the OpenAI API or the sentence-transformers library
- Visualize embeddings using different dimensionality reduction techniques (PCA, t-SNE, UMAP)
- Apply clustering algorithms (K-Means, DBSCAN, HDBSCAN) to identify patterns
- Interactive visualization with Plotly
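The reduction and clustering features listed above can be sketched roughly as follows. This is a minimal illustration and not the application's actual code: it assumes scikit-learn and NumPy are installed, uses synthetic vectors in place of real embeddings, and picks PCA and K-Means as representatives of the listed techniques. The choice to cluster in the original space and reduce only for plotting is also an assumption; the application may do it in a different order.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.decomposition import PCA

# Synthetic stand-in for real embeddings: two well-separated groups of
# 384-dimensional vectors (384 is the output size of all-MiniLM-L6-v2).
rng = np.random.default_rng(seed=0)
group_a = rng.normal(loc=0.0, scale=0.1, size=(20, 384))
group_b = rng.normal(loc=1.0, scale=0.1, size=(20, 384))
embeddings = np.vstack([group_a, group_b])

# Cluster in the original high-dimensional space.
kmeans = KMeans(n_clusters=2, n_init=10, random_state=0)
labels = kmeans.fit_predict(embeddings)

# Project to 2D purely for plotting; each row is an (x, y) point.
coords = PCA(n_components=2, random_state=0).fit_transform(embeddings)

print(coords.shape)         # one 2D point per input row
print(np.bincount(labels))  # number of points assigned to each cluster
```

The resulting `coords` and `labels` arrays are the kind of input a scatter plot needs: one point per row, colored by cluster label.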
- Clone this repository
- Install the required dependencies:
  pip install -r requirements.txt
- Create a .env file in the root directory and add your OpenAI API key:
  OPENAI_API_KEY=your_api_key_here
- Start the application:
  python app.py
- Open your browser and navigate to http://localhost:5000
- Upload a CSV file containing text data
- Select an embedding model (OpenAI or sentence-transformers) and clustering parameters
- Explore the visualizations
- OpenAI models:
  - text-embedding-3-small
  - text-embedding-3-large
  - text-embedding-ada-002
- sentence-transformers models:
  - all-MiniLM-L6-v2
  - all-mpnet-base-v2
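The model names above belong to two providers, and each produces vectors of a different default size. The lookup table below is a hypothetical helper, not part of the application: the names `MODELS` and `needs_api_key` are invented for illustration, and the dimensions are the published default output sizes of each model.

```python
# Model name -> (provider, default embedding dimension).
# MODELS and needs_api_key are hypothetical names used only in this sketch.
MODELS = {
    "text-embedding-3-small": ("openai", 1536),
    "text-embedding-3-large": ("openai", 3072),
    "text-embedding-ada-002": ("openai", 1536),
    "all-MiniLM-L6-v2": ("sentence-transformers", 384),
    "all-mpnet-base-v2": ("sentence-transformers", 768),
}


def needs_api_key(model_name):
    """Return True when the model is served through the OpenAI API."""
    provider, _dimension = MODELS[model_name]
    return provider == "openai"


for name, (provider, dimension) in MODELS.items():
    print(f"{name}: {provider}, {dimension} dimensions")
```

A table like this makes it easy to decide up front whether a chosen model calls a remote API or runs locally.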
Your CSV file should contain at least one column with text data. The application will allow you to select which column to use for generating embeddings.
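As a rough illustration of the column selection described above, the sketch below reads one text column from a CSV using only the standard library. The sample data, the column names, and the `read_text_column` helper are all hypothetical; the application's own loading code may differ, for example by using pandas.

```python
import csv
import io

# Hypothetical sample data; in practice this would be the uploaded file.
sample_csv = io.StringIO(
    "id,review,rating\n"
    "1,Great battery life,5\n"
    '2,"Screen cracked, very disappointed",1\n'
    "3,,3\n"
)


def read_text_column(handle, column):
    """Return the non-empty values of `column` from a CSV file handle."""
    reader = csv.DictReader(handle)
    if column not in (reader.fieldnames or []):
        raise KeyError(f"column {column!r} not found; available: {reader.fieldnames}")
    # Skip empty cells, since there is nothing to embed for them.
    return [row[column].strip() for row in reader if row[column] and row[column].strip()]


texts = read_text_column(sample_csv, "review")
print(texts)  # ['Great battery life', 'Screen cracked, very disappointed']
```

Quoted fields that contain commas are handled by the `csv` module, which is why the second review comes through intact.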
- Python 3.8+
- OpenAI API key
- Dependencies listed in requirements.txt
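Because an OpenAI API key is listed as a requirement, a startup check along these lines can fail early with a clear message. This is a standard-library sketch under stated assumptions: `load_env_file` and `resolve_api_key` are hypothetical helpers, and the parser handles only simple KEY=VALUE lines. Whether the application uses a package such as python-dotenv for this is not stated here.

```python
import os


def load_env_file(text):
    """Parse KEY=VALUE lines, ignoring blank lines and # comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def resolve_api_key(env_text, environ=os.environ):
    """Prefer a real environment variable, then fall back to .env contents."""
    key = environ.get("OPENAI_API_KEY") or load_env_file(env_text).get("OPENAI_API_KEY")
    if not key:
        raise RuntimeError("OPENAI_API_KEY is not set; add it to the .env file")
    return key


# Example .env contents, matching the placeholder used in the setup steps.
example = "# local settings\nOPENAI_API_KEY=your_api_key_here\n"
print(resolve_api_key(example, environ={}))  # your_api_key_here
```

In a real run, `env_text` would come from reading the .env file in the project root.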