Elasticsearch Python Integration

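This project demonstrates how to use Elasticsearch from Python with the official client (elasticsearch-py): connecting to a cluster, creating an index, and indexing, searching, updating, and deleting documents, along with bulk operations, aggregations, and the asynchronous client.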

Introduction

Elasticsearch is a distributed search and analytics engine built on top of Apache Lucene. It is designed for near real-time full-text search and analytics. The examples below use the official elasticsearch-py library.

Prerequisites

Before you begin, ensure you have the following installed:

Elasticsearch:

  • Download and install Elasticsearch from the official Elastic downloads page.
  • Start Elasticsearch by running:
./bin/elasticsearch

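On an unsecured local setup, Elasticsearch listens on http://localhost:9200. Elasticsearch 8.x and later enable TLS and authentication by default, in which case the URL is https://localhost:9200 and the client needs credentials.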

Python:

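  • Ensure Python 3.6+ is installed. Recent releases of the client may require a newer Python version, so check the package requirements. Download Python from python.org.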

Elasticsearch Python Library:

  • Install the library using pip:
pip install elasticsearch

Installation

  1. Clone this repository:
git clone https://github.com/your-username/your-repo-name.git
cd your-repo-name
  2. Install the required Python packages:
pip install -r requirements.txt

Getting Started

Connecting to Elasticsearch

from elasticsearch import Elasticsearch

# Connect to a local Elasticsearch instance
es = Elasticsearch("http://localhost:9200")

# Check if the cluster is running
if es.ping():
    print("Connected to Elasticsearch!")
else:
    print("Could not connect to Elasticsearch.")
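The connection above assumes an unsecured local node. For a secured cluster, the client usually needs credentials and possibly a CA certificate. A minimal sketch, assuming the 8.x client (where `api_key`, `basic_auth`, and `ca_certs` are constructor arguments); the environment variable names `ES_URL`, `ES_API_KEY`, `ES_USERNAME`, `ES_PASSWORD`, and `ES_CA_CERTS` are hypothetical and not part of this project:

```python
import os


def client_kwargs_from_env(env=None):
    """Build keyword arguments for Elasticsearch(...) from environment variables.

    The variable names are illustrative assumptions, not part of this project.
    """
    env = os.environ if env is None else env
    kwargs = {"hosts": env.get("ES_URL", "http://localhost:9200")}
    if env.get("ES_API_KEY"):
        kwargs["api_key"] = env["ES_API_KEY"]
    elif env.get("ES_USERNAME") and env.get("ES_PASSWORD"):
        kwargs["basic_auth"] = (env["ES_USERNAME"], env["ES_PASSWORD"])
    if env.get("ES_CA_CERTS"):
        # Path to the CA certificate used to verify the server's TLS certificate
        kwargs["ca_certs"] = env["ES_CA_CERTS"]
    return kwargs


def connect(env=None):
    # Imported lazily so the helper above can be used without the client installed
    from elasticsearch import Elasticsearch

    return Elasticsearch(**client_kwargs_from_env(env))
```

Calling `connect()` with none of the variables set falls back to the same local URL used above.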

Creating an Index

index_name = "my_index"

# Create an index with custom mapping
mapping = {
    "mappings": {
        "properties": {
            "title": {"type": "text"},
            "description": {"type": "text"},
            "timestamp": {"type": "date"}
        }
    }
}

es.indices.create(index=index_name, body=mapping)
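Calling `es.indices.create` a second time fails because the index already exists, which makes the script awkward to re-run. One possible way around this is to check first. A minimal sketch; `ensure_index` is a hypothetical helper name, and there is a small window between the check and the create in which another process could create the index:

```python
def ensure_index(es, index_name, mapping):
    """Create the index only if it does not exist yet.

    Returns True if the index was created, False if it was already there.
    """
    if es.indices.exists(index=index_name):
        return False
    es.indices.create(index=index_name, body=mapping)
    return True
```

With the client and mapping defined above, usage would be `ensure_index(es, index_name, mapping)`.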

Indexing Documents

document = {
    "title": "Introduction to Elasticsearch",
    "description": "Elasticsearch is a distributed search engine.",
    "timestamp": "2023-10-01"
}

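# Index the document. refresh="wait_for" waits until the document is
# visible to searches, so the search example below can find it right away.
es.index(index=index_name, id=1, body=document, refresh="wait_for")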

Searching Documents

query = {
    "query": {
        "match": {
            "title": "Elasticsearch"
        }
    }
}

response = es.search(index=index_name, body=query)

# Print search results
for hit in response["hits"]["hits"]:
    print(hit["_source"])
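Real searches often combine a full-text match with a filter and a sort order. The sketch below builds such a request body for the `title` and `timestamp` fields from the mapping above; the helper names `build_search_body` and `extract_sources` are hypothetical:

```python
def build_search_body(text, since=None, size=10):
    """Return a search body: full-text match on title, optional date filter."""
    bool_query = {"must": [{"match": {"title": text}}]}
    if since is not None:
        # Filter clauses restrict the results without affecting relevance scores
        bool_query["filter"] = [{"range": {"timestamp": {"gte": since}}}]
    return {
        "size": size,
        "query": {"bool": bool_query},
        # Newest documents first, relevance score as tie-breaker
        "sort": [{"timestamp": {"order": "desc"}}, "_score"],
    }


def extract_sources(response):
    """Pull the original documents out of a search response."""
    return [hit["_source"] for hit in response["hits"]["hits"]]
```

A call could look like `extract_sources(es.search(index=index_name, body=build_search_body("Elasticsearch", since="2023-01-01")))`.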

Updating Documents

update_body = {
    "doc": {
        "description": "Elasticsearch is a powerful distributed search and analytics engine."
    }
}

es.update(index=index_name, id=1, body=update_body)
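`es.update` fails with a not-found error when the document does not exist. If "update or create" is the intended behavior, the update API's `doc_as_upsert` flag can cover both cases. A minimal sketch; `upsert_document` is a hypothetical helper name:

```python
def upsert_document(es, index_name, doc_id, fields):
    """Update the given fields, or create the document if it is missing."""
    body = {"doc": fields, "doc_as_upsert": True}
    return es.update(index=index_name, id=doc_id, body=body)
```

For example, `upsert_document(es, index_name, 1, {"description": "Updated text"})` merges the field into document 1 or creates it with just that field.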

Deleting Documents

es.delete(index=index_name, id=1)

Advanced Features

Bulk Operations

Perform multiple indexing, updating, or deleting operations in a single request:

from elasticsearch.helpers import bulk

actions = [
    {"_index": index_name, "_id": 2, "_source": {"title": "Bulk Insert 1", "description": "First bulk document"}},
    {"_index": index_name, "_id": 3, "_source": {"title": "Bulk Insert 2", "description": "Second bulk document"}}
]

bulk(es, actions)
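The example above only indexes documents. Updates and deletes can go into the same `bulk` call by setting `_op_type` on each action. A sketch of a generator that mixes all three; `bulk_actions` and its parameter names are hypothetical:

```python
def bulk_actions(index_name, to_index=(), to_update=(), to_delete=()):
    """Yield actions for elasticsearch.helpers.bulk.

    to_index:  iterable of (doc_id, document) pairs
    to_update: iterable of (doc_id, partial_document) pairs
    to_delete: iterable of document ids
    """
    for doc_id, document in to_index:
        yield {"_op_type": "index", "_index": index_name, "_id": doc_id, "_source": document}
    for doc_id, partial in to_update:
        # Partial updates carry the changed fields under "doc"
        yield {"_op_type": "update", "_index": index_name, "_id": doc_id, "doc": partial}
    for doc_id in to_delete:
        yield {"_op_type": "delete", "_index": index_name, "_id": doc_id}
```

The generator can be passed straight to the helper, for example `success, errors = bulk(es, bulk_actions(index_name, to_delete=[3]), raise_on_error=False)`, which returns the number of successful actions and a list of errors instead of raising on the first failure.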

Aggregations

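Run aggregations to compute analytics over your data. The example below averages the timestamp field; for a date field, the average is returned as epoch milliseconds: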

aggregation_query = {
    "size": 0,
    "aggs": {
        "avg_timestamp": {
            "avg": {"field": "timestamp"}
        }
    }
}

response = es.search(index=index_name, body=aggregation_query)
print(response["aggregations"])
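For a date field, grouping documents into time buckets is often more informative than a single metric. The sketch below counts documents per calendar month with a `date_histogram` aggregation and flattens the buckets into a dictionary; the aggregation name `docs_per_month` and both helper names are hypothetical:

```python
def docs_per_month_body():
    """Aggregation body: count documents per calendar month of `timestamp`."""
    return {
        "size": 0,  # skip the hits, return only aggregation results
        "aggs": {
            "docs_per_month": {
                "date_histogram": {"field": "timestamp", "calendar_interval": "month"}
            }
        },
    }


def bucket_counts(response, agg_name="docs_per_month"):
    """Map each bucket's key_as_string to its doc_count."""
    buckets = response["aggregations"][agg_name]["buckets"]
    return {b["key_as_string"]: b["doc_count"] for b in buckets}
```

Usage would be `bucket_counts(es.search(index=index_name, body=docs_per_month_body()))`.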

Asynchronous Client

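Use AsyncElasticsearch for non-blocking operations. It requires the async extra, installed with pip install "elasticsearch[async]":

import asyncio
from elasticsearch import AsyncElasticsearch

async def main():
    es = AsyncElasticsearch("http://localhost:9200")
    # Reuses index_name and query from the earlier examples
    response = await es.search(index=index_name, body=query)
    print(response)
    await es.close()  # release the connection pool

asyncio.run(main())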
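The async client pays off mainly when several requests are in flight at once. A minimal sketch that runs multiple searches concurrently with `asyncio.gather`; `search_many` is a hypothetical helper, and `es` is assumed to be an `AsyncElasticsearch` instance:

```python
import asyncio


async def search_many(es, index_name, bodies):
    """Run several searches concurrently and return the responses in order."""
    tasks = [es.search(index=index_name, body=body) for body in bodies]
    return await asyncio.gather(*tasks)
```

Inside a coroutine this could be called as `responses = await search_many(es, index_name, [query, aggregation_query])`, where the responses come back in the same order as the request bodies.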

Use Cases

  • Search Applications: Build full-text search for websites or applications.
  • Log Analysis: Use with Logstash and Kibana for log aggregation and visualization.
  • E-commerce: Implement product search with filters and sorting.
  • Data Analytics: Perform real-time analytics on large datasets.

Conclusion

This project covers the core workflow for using Elasticsearch from Python with the elasticsearch-py library: connecting, creating an index, working with documents, and running bulk operations, aggregations, and asynchronous searches.

For more information, refer to the official Elasticsearch and elasticsearch-py documentation.