This project demonstrates how to use Elasticsearch with the Python Elasticsearch client (elasticsearch-py) to build a powerful search and analytics application.
Elasticsearch is a distributed, open-source search and analytics engine built on top of Apache Lucene. It is designed for near real-time search, full-text search, and analytics.
Before you begin, ensure you have the following installed:
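- Download and install Elasticsearch from the official Elastic downloads page.
- Start Elasticsearch by running:

      ./bin/elasticsearch

  By default, Elasticsearch listens on port 9200, reachable at http://localhost:9200 when security is disabled. Version 8 and later enable TLS and authentication by default, in which case the URL is https://localhost:9200 and requests need credentials.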
- Ensure Python 3.6+ is installed. Download it from python.org.
- Install the library using pip:

      pip install elasticsearch

- Clone this repository:

      git clone https://github.com/your-username/your-repo-name.git
      cd your-repo-name

- Install the required Python packages:

      pip install -r requirements.txt

    from elasticsearch import Elasticsearch

    # Connect to a local Elasticsearch instance
    es = Elasticsearch("http://localhost:9200")

    # Check if the cluster is running
    if es.ping():
        print("Connected to Elasticsearch!")
    else:
        print("Could not connect to Elasticsearch.")

    index_name = "my_index"

    # Create an index with a custom mapping
    mapping = {
        "mappings": {
            "properties": {
                "title": {"type": "text"},
                "description": {"type": "text"},
                "timestamp": {"type": "date"}
            }
        }
    }
    es.indices.create(index=index_name, body=mapping)

    # Define a sample document
    document = {
        "title": "Introduction to Elasticsearch",
        "description": "Elasticsearch is a distributed search engine.",
        "timestamp": "2023-10-01"
    }

    # Index the document
    es.index(index=index_name, id=1, body=document)

    # Refresh the index so the new document is visible to the search below
    es.indices.refresh(index=index_name)

    # Search for documents whose title matches "Elasticsearch"
    query = {
        "query": {
            "match": {
                "title": "Elasticsearch"
            }
        }
    }
    response = es.search(index=index_name, body=query)

    # Print search results
    for hit in response["hits"]["hits"]:
        print(hit["_source"])

    # Partially update the document with id 1
    update_body = {
        "doc": {
            "description": "Elasticsearch is a powerful distributed search and analytics engine."
        }
    }
    es.update(index=index_name, id=1, body=update_body)

    # Delete the document with id 1
    es.delete(index=index_name, id=1)

Perform multiple indexing, updating, or deleting operations in a single request:

    from elasticsearch.helpers import bulk

    actions = [
        {"_index": index_name, "_id": 2, "_source": {"title": "Bulk Insert 1", "description": "First bulk document"}},
        {"_index": index_name, "_id": 3, "_source": {"title": "Bulk Insert 2", "description": "Second bulk document"}}
    ]
    bulk(es, actions)

Perform analytics on your data:
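
    # Average of the "timestamp" field across all documents; "size": 0 skips the hits
    aggregation_query = {
        "size": 0,
        "aggs": {
            "avg_timestamp": {
                "avg": {"field": "timestamp"}
            }
        }
    }
    response = es.search(index=index_name, body=aggregation_query)
    print(response["aggregations"])

Use AsyncElasticsearch for non-blocking operations (this requires the async extra, installed with pip install "elasticsearch[async]"):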
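
    import asyncio
    from elasticsearch import AsyncElasticsearch

    async def main():
        es = AsyncElasticsearch("http://localhost:9200")
        response = await es.search(index=index_name, body=query)
        print(response)
        await es.close()  # release the client's connections

    asyncio.run(main())  # asyncio.run requires Python 3.7+

Common use cases:

- Search Applications: Build full-text search for websites or applications.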
- Log Analysis: Use with Logstash and Kibana for log aggregation and visualization.
- E-commerce: Implement product search with filters and sorting.
- Data Analytics: Perform real-time analytics on large datasets.
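
The e-commerce use case above combines full-text matching with filters and sorting. As a minimal sketch, the helper below builds such a request body. The function name `build_product_query` and the field names `name`, `category`, and `price` are hypothetical; the sketch assumes an index whose mapping defines `category` as a `keyword` field and `price` as a numeric field.

```python
def build_product_query(text, category=None, max_price=None,
                        sort_field="price", sort_order="asc"):
    """Build a search body: full-text match plus optional filters and a sort.

    All field names here are illustrative assumptions, not part of this project.
    """
    filters = []
    if category is not None:
        # Exact match on a keyword field (assumed mapping)
        filters.append({"term": {"category": category}})
    if max_price is not None:
        # Upper bound on a numeric field (assumed mapping)
        filters.append({"range": {"price": {"lte": max_price}}})

    return {
        "query": {
            "bool": {
                # Scored full-text clause
                "must": [{"match": {"name": text}}],
                # Non-scoring clauses that only narrow the result set
                "filter": filters,
            }
        },
        "sort": [{sort_field: {"order": sort_order}}],
    }


body = build_product_query("wireless headphones", category="audio", max_price=100)
print(body)
```

The resulting dictionary can be passed to the client in the same way as the other queries in this document, for example `es.search(index=index_name, body=body)`. Placing the category and price conditions under `filter` rather than `must` keeps them out of relevance scoring.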
This project demonstrates how to integrate Elasticsearch with Python using the elasticsearch-py library. Elasticsearch is a powerful tool for search and analytics, and this library makes it easy to use in Python applications.
For more information, refer to the official Elasticsearch and elasticsearch-py documentation.