A comprehensive search tutorial demonstrating advanced search capabilities using Flask and Elasticsearch. This project implements 5 different search modes including traditional BM25, dense vector search (kNN), sparse semantic search (ELSER), and hybrid approaches with full Elastic Cloud integration.
5 Search Modes:
- BM25: Traditional full-text search with multi-field matching
- kNN: Dense vector search using sentence transformers
- Hybrid: RRF combining BM25 + kNN with advanced ranking
- ELSER: Sparse semantic search using Elastic's learned sparse encoder
- Hybrid ELSER: ELSER search with fallback implementation
Advanced Features:
- Elastic Cloud Integration - Full support for Elastic Cloud deployments
- Faceted Search - Category and year filtering with aggregations
- Pagination - Complete pagination support for all search modes
- Dense Vector Embeddings - 384-dimensional semantic vectors
- Sparse Vector Tokens - ELSER-generated semantic understanding
- Reciprocal Rank Fusion (RRF) - Advanced hybrid search ranking
- Error Handling - Comprehensive fallbacks and graceful degradation
- Responsive UI - Modern Bootstrap interface with search mode selection
- Real-time Search - Live search with aggregations and filters
Prerequisites:
- Python 3.8+
- Elastic Cloud account (free trial available)
- Git
Installation:
- Clone the repository:
      git clone https://github.com/mshadmanrahman/flask-elasticsearch-search-tutorial.git
      cd flask-elasticsearch-search-tutorial
- Create a virtual environment:
      python -m venv .venv
      source .venv/bin/activate  # On Windows: .venv\Scripts\activate
- Install dependencies:
      pip install -r requirements.txt
- Configure Elastic Cloud: create a .env file with your Elastic Cloud credentials:
      ELASTIC_CLOUD_ID="your_cloud_id_here"
      ELASTIC_API_KEY="your_api_key_here"
- Deploy the ELSER model:
      flask deploy-elser
- Index your data:
      flask reindex
- Run the application:
      python app.py
- Open your browser and navigate to http://localhost:5001
BM25:
- Uses Elasticsearch's built-in BM25 algorithm
- Multi-field search across name, summary, and content
- Best for exact keyword matches and phrase queries
- Includes pagination, aggregations, and faceted search
- Supports category and year filtering
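For the BM25 mode, a request body along the following lines would cover multi-field matching, filtering, facets, and pagination. This is a minimal sketch, not the code in search.py: the searched fields (name, summary, content) and the page size of 5 come from this README, while `category.keyword`, `updated_at`, the aggregation names, and `build_bm25_body` itself are assumptions made for illustration.

```python
from typing import Any, Dict, List, Optional

PAGE_SIZE = 5  # page size stated in this README


def build_bm25_body(
    text: str,
    page: int = 1,
    category: Optional[str] = None,
    year: Optional[int] = None,
) -> Dict[str, Any]:
    """Build a search request body for the BM25 mode (hypothetical helper)."""
    # multi_match scores the query text against several fields at once.
    must = {
        "multi_match": {
            "query": text,
            "fields": ["name", "summary", "content"],
        }
    }

    filters: List[Dict[str, Any]] = []
    if category is not None:
        # Assumed field name: a keyword sub-field holding the category.
        filters.append({"term": {"category.keyword": {"value": category}}})
    if year is not None:
        # Assumed field name: a date field called updated_at.
        filters.append(
            {"range": {"updated_at": {"gte": f"{year}-01-01", "lt": f"{year + 1}-01-01"}}}
        )

    return {
        "query": {"bool": {"must": [must], "filter": filters}},
        # Facet counts for the sidebar; aggregation names are illustrative.
        "aggs": {
            "category-agg": {"terms": {"field": "category.keyword"}},
            "year-agg": {
                "date_histogram": {
                    "field": "updated_at",
                    "calendar_interval": "year",
                    "format": "yyyy",
                }
            },
        },
        # Pagination: skip the results of all earlier pages.
        "from": (page - 1) * PAGE_SIZE,
        "size": PAGE_SIZE,
    }
```

Filters sit in the `filter` clause of the `bool` query so they narrow the result set without affecting BM25 scores.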
kNN:
- Uses sentence-transformers for semantic similarity
- Model: all-MiniLM-L6-v2 (384 dimensions)
- Best for finding semantically similar content
- Understands meaning beyond exact word matches
- Excellent for conceptual queries
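For the kNN mode, the query text is first turned into a 384-dimensional vector and then sent as a top-level `knn` clause. The sketch below only builds the request body and takes the vector as an argument so that it runs without the model installed. The field name `embedding`, the default `k` and `num_candidates`, and the helper name are assumptions, not values taken from this project.

```python
from typing import Any, Dict, Sequence

VECTOR_DIMS = 384  # output size of all-MiniLM-L6-v2, as stated above


def build_knn_body(
    query_vector: Sequence[float],
    k: int = 10,
    num_candidates: int = 50,
    field: str = "embedding",  # assumed name of the dense_vector field
) -> Dict[str, Any]:
    """Build a kNN search request body (hypothetical helper)."""
    if len(query_vector) != VECTOR_DIMS:
        raise ValueError(
            f"expected {VECTOR_DIMS} dimensions, got {len(query_vector)}"
        )
    if num_candidates < k:
        raise ValueError("num_candidates must be at least k")

    return {
        "knn": {
            "field": field,
            "query_vector": list(query_vector),
            "k": k,  # number of nearest neighbours to return
            "num_candidates": num_candidates,  # candidates examined per shard
        }
    }
```

With sentence-transformers installed, the vector would typically come from `SentenceTransformer("all-MiniLM-L6-v2").encode(text)`, converted to a plain list before being passed in.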
Hybrid:
- Combines lexical and semantic search approaches
- Uses Reciprocal Rank Fusion for intelligent result merging
- Balances precision and recall for optimal results
- Best for most general search scenarios
- Provides comprehensive coverage
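Reciprocal Rank Fusion, used by the Hybrid mode, scores each document by summing 1 / (rank_constant + rank) over every result list in which it appears, so only rank positions matter and the raw BM25 and kNN scores never need to be put on a common scale. The following is a standalone illustration of that formula in plain Python, not the project's implementation (which may delegate the merge to Elasticsearch). The function name is hypothetical and the default of 60 is the commonly used rank constant.

```python
from collections import defaultdict
from typing import Dict, List, Sequence


def rrf_merge(rankings: Sequence[Sequence[str]], rank_constant: int = 60) -> List[str]:
    """Merge ranked lists of document IDs with Reciprocal Rank Fusion."""
    scores: Dict[str, float] = defaultdict(float)
    for ranking in rankings:
        # Ranks are 1-based: the top hit of each list contributes the most.
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (rank_constant + rank)
    # Highest fused score first; ties are broken by document ID for stability.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


bm25_hits = ["a", "b", "c"]
knn_hits = ["c", "a", "d"]
merged = rrf_merge([bm25_hits, knn_hits])
# "a" and "c" appear in both lists, so they outrank "b" and "d".
```

A larger rank constant flattens the difference between adjacent ranks, while a smaller one rewards top positions more strongly.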
ELSER:
- Uses Elastic's Learned Sparse EncodeR v2 model
- Generates sparse vector tokens for semantic understanding
- Best for complex semantic and conceptual queries
- Deployed with flask deploy-elser, then hosted and managed by Elasticsearch
- Requires Elastic Cloud or ML-enabled cluster
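For the ELSER mode, the query text is expanded into weighted tokens by the deployed model at search time. One way to express that is a `text_expansion` query, sketched below. The model ID `.elser_model_2` comes from this README; the field name `elser_embedding` and the helper name are assumptions. Newer Elasticsearch releases also provide a `sparse_vector` query for the same purpose, so check which one your cluster version expects.

```python
from typing import Any, Dict

ELSER_MODEL_ID = ".elser_model_2"  # model ID listed in this README


def build_elser_body(
    text: str,
    field: str = "elser_embedding",  # assumed name of the ELSER token field
    size: int = 5,
) -> Dict[str, Any]:
    """Build a sparse semantic search request body (hypothetical helper)."""
    return {
        "query": {
            "text_expansion": {
                field: {
                    "model_id": ELSER_MODEL_ID,
                    # The model expands this text into weighted tokens.
                    "model_text": text,
                }
            }
        },
        "size": size,
    }
```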
Hybrid ELSER:
- ELSER search with intelligent fallback mechanisms
- Handles sub_searches limitations gracefully
- Provides robust semantic search capabilities
- Ensures reliable search experience
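The fallback behaviour described for Hybrid ELSER can be pictured as a try-then-degrade wrapper: attempt the combined search first, and if the cluster rejects it (for example because sub_searches is unavailable), run a simpler search instead. This is a generic sketch of that pattern under assumed names; it does not reproduce how search.py handles the failure.

```python
import logging
from typing import Any, Callable, Dict

logger = logging.getLogger(__name__)

# A search function takes the query text and returns a response dict.
SearchFn = Callable[[str], Dict[str, Any]]


def search_with_fallback(text: str, primary: SearchFn, fallback: SearchFn) -> Dict[str, Any]:
    """Run the primary search and degrade to the fallback if it fails."""
    try:
        return primary(text)
    except Exception as exc:  # broad on purpose: any failure triggers the fallback
        logger.warning("Primary search failed (%s); using fallback", exc)
        return fallback(text)
```

In a real application it is usually better to catch the client's specific exception types rather than `Exception`, so that programming errors are not silently masked.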
Configuration:
- ELASTIC_CLOUD_ID: Your Elastic Cloud deployment ID
- ELASTIC_API_KEY: Your Elastic Cloud API key
- Page Size: 5 results per page (configurable)
- Vector Dimensions: 384 (sentence-transformers)
- Model ID: .elser_model_2 (ELSER v2)
- Minimum Score: Dynamic based on search mode
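The settings above could be gathered in one place and validated at startup, so that a missing credential fails fast with a clear message instead of surfacing later as a connection error. The sketch below assumes the .env values have already been loaded into the process environment (for instance by python-dotenv, which may or may not be what app.py uses). The `Settings` class and `load_settings` function are hypothetical.

```python
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class Settings:
    cloud_id: str
    api_key: str
    page_size: int = 5                      # results per page
    vector_dims: int = 384                  # sentence-transformers embedding size
    elser_model_id: str = ".elser_model_2"  # ELSER v2


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Read the required credentials and fail early if any are missing."""
    required = ("ELASTIC_CLOUD_ID", "ELASTIC_API_KEY")
    missing = [name for name in required if not env.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return Settings(cloud_id=env["ELASTIC_CLOUD_ID"], api_key=env["ELASTIC_API_KEY"])
```

With the official Python client, these two values are normally passed as `Elasticsearch(cloud_id=settings.cloud_id, api_key=settings.api_key)`.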
Project Structure:
├── app.py               # Flask application with all routes
├── search.py            # Core Elasticsearch search logic
├── data.json            # Sample documents for indexing
├── requirements.txt     # Python dependencies
├── .env                 # Environment variables (not in repo)
├── .gitignore           # Git ignore rules
├── templates/           # HTML templates
│   ├── base.html        # Base template with Bootstrap
│   ├── index.html       # Main search interface
│   └── document.html    # Document detail view
└── static/              # Static assets
    └── elastic-logo.svg # Elasticsearch logo
Basic Search:
- Enter a search query (e.g., "work from home", "team collaboration")
- Select a search mode from the dropdown
- Click "Search" to see results
- Use pagination to browse through results
Filtering:
- Use category:sharepoint to filter by category
- Use year:2023 to filter by year
- Combine filters with search terms: category:teams work from home
- View faceted search options in the sidebar
Comparing Search Modes:
- Try the same query with different modes to see variations
- Notice how BM25 finds exact matches while kNN finds semantic matches
- ELSER often provides more contextually relevant results
- Hybrid modes combine the best of both approaches
Example Queries:
- "remote work" - Compare BM25 vs ELSER results
- "employee benefits" - See semantic understanding differences
- "team collaboration" - Test conceptual search capabilities
- "HR policies" - Explore different search approaches
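The category: and year: filter syntax described above implies a small parsing step that separates filter tokens from the free-text part of the query before the search request is built. One possible way to do that is shown below; the regular expression and the `extract_filters` name are illustrative assumptions rather than the project's own code.

```python
import re
from typing import Dict, Tuple

# Matches tokens such as category:sharepoint or year:2023.
FILTER_RE = re.compile(r"\b(category|year):(\S+)")


def extract_filters(query: str) -> Tuple[Dict[str, str], str]:
    """Split a raw query into {filter name: value} and the remaining text."""
    filters = {key: value for key, value in FILTER_RE.findall(query)}
    # Remove the filter tokens, then collapse any leftover whitespace.
    text = " ".join(FILTER_RE.sub("", query).split())
    return filters, text


filters, text = extract_filters("category:teams work from home")
# filters == {"category": "teams"}, text == "work from home"
```

If the same filter appears twice, this version keeps the last value; a stricter parser could reject the query instead.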
Different search modes excel at different types of queries:
| Query Type | BM25 | kNN | ELSER | Hybrid |
|---|---|---|---|---|
| Exact keywords | ★★★ | ★★ | ★★ | ★★★ |
| Synonyms | ★ | ★★★ | ★★★ | ★★★ |
| Conceptual | ★ | ★★★ | ★★★ | ★★★ |
| Phrase matching | ★★★ | ★★ | ★★ | ★★★ |
| Semantic similarity | ★ | ★★★ | ★★★ | ★★★ |
Troubleshooting:
- Connection Errors: Verify your Elastic Cloud credentials in .env
- ELSER Not Working: Ensure you've run flask deploy-elser
- No Results: Try reindexing with flask reindex
- License Errors: Some features require Elastic Cloud trial or paid plan
- Check the Flask application logs for detailed error messages
- Verify your Elastic Cloud deployment is running
- Ensure all dependencies are installed correctly
Contributing:
- Fork the repository
- Create a feature branch (git checkout -b feature/amazing-feature)
- Commit your changes (git commit -m 'Add amazing feature')
- Push to the branch (git push origin feature/amazing-feature)
- Open a Pull Request
This project is licensed under the MIT License - see the LICENSE file for details.
Acknowledgments:
- Elasticsearch for the powerful search engine
- Flask for the lightweight web framework
- Sentence Transformers for dense embeddings
- Bootstrap for the responsive UI framework
- Elastic Cloud for managed Elasticsearch