# Programming Elasticsearch with Python

While `curl` is excellent for testing and exploration, real applications need to interact with Elasticsearch programmatically. This notebook covers how to use the official **Elasticsearch Python client** to connect to a cluster, manage data, and perform searches.

We will cover:
1.  Installing and connecting with the Python client.
2.  Performing CRUD and search operations in Python.
3.  Practical demos for indexing and searching different types of documents (books, emails, and tweets).

--- 
## 1. Setting up the Python Client

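First, install the official library from PyPI. As a rule of thumb, use a client whose major version matches your cluster's (for example, an 8.x client for an 8.x cluster).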

**Command:**
```bash
pip install elasticsearch
```

Once installed, you can import the client and create an instance to connect to your cluster. We will then check the connection.

**Code:**
```python
from elasticsearch import Elasticsearch
import json # To pretty-print the results

# This assumes Elasticsearch is running on localhost:9200
try:
    es = Elasticsearch("http://localhost:9200")
    if es.ping():
        print("Successfully connected to Elasticsearch!")
    else:
        print("Could not connect to Elasticsearch.")
except Exception as e:
    print(f"An error occurred: {e}")
```

**Expected Output:**
```
Successfully connected to Elasticsearch!
```
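The connection above assumes an unsecured cluster on `localhost`. Recent Elasticsearch versions often ship with security enabled, in which case the client also needs credentials and a CA certificate. A minimal sketch of collecting those settings from environment variables follows. The variable names (`ES_URL`, `ES_USER`, `ES_PASSWORD`, `ES_CA_CERTS`) and the helper `connection_kwargs` are hypothetical; `hosts`, `basic_auth`, and `ca_certs` are keyword arguments of the 8.x client constructor.

```python
import os

def connection_kwargs(env=os.environ):
    """Build keyword arguments for Elasticsearch(...) from environment variables.

    The variable names used here are illustrative assumptions, not a convention
    of the client library.
    """
    kwargs = {"hosts": env.get("ES_URL", "http://localhost:9200")}
    user, password = env.get("ES_USER"), env.get("ES_PASSWORD")
    if user and password:
        kwargs["basic_auth"] = (user, password)
    if env.get("ES_CA_CERTS"):
        kwargs["ca_certs"] = env["ES_CA_CERTS"]
    return kwargs

# With no variables set, this falls back to the local default used above.
print(connection_kwargs({}))  # {'hosts': 'http://localhost:9200'}

# Usage (needs the elasticsearch package and a running cluster):
#   es = Elasticsearch(**connection_kwargs())
```

Keeping credentials out of the notebook itself also makes it safer to share.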

--- 
## 2. CRUD and Search in Python

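The Python client provides methods that map directly to the REST API endpoints. The cell below walks one document through its full lifecycle with `index()`, `get()`, `search()`, and `delete()`.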

**Code:**
```python
# The document we want to index
doc = {
    'title': 'The Hobbit',
    'author': 'J.R.R. Tolkien',
    'year': 1937
}

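# Index the document in the 'books' index with ID 5.
# refresh=True makes the document searchable immediately; without it, the
# search below may run before the next periodic refresh and return 0 hits.
response = es.index(index='books', id=5, document=doc, refresh=True)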
print("--- Indexing Document ---")
print(f"Result: {response['result']}")

# Retrieve the document
response = es.get(index='books', id=5)
print("\n--- Retrieving Document ---")
print(json.dumps(response['_source'], indent=2))

# Search for the document
query = {
    'match': {
        'author': 'Tolkien'
    }
}
response = es.search(index='books', query=query)
print("\n--- Searching for Document ---")
print(f"Found {response['hits']['total']['value']} hit(s).")
for hit in response['hits']['hits']:
    print(json.dumps(hit['_source'], indent=2))

# Delete the document
response = es.delete(index='books', id=5)
print("\n--- Deleting Document ---")
print(f"Result: {response['result']}")
```

**Expected Output:**
```
--- Indexing Document ---
Result: created

--- Retrieving Document ---
{
  "title": "The Hobbit",
  "author": "J.R.R. Tolkien",
  "year": 1937
}

--- Searching for Document ---
Found 1 hit(s).
{
  "title": "The Hobbit",
  "author": "J.R.R. Tolkien",
  "year": 1937
}

--- Deleting Document ---
Result: deleted
```
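The search call returns a nested structure: the match count sits under `hits.total.value` and each matching document under `hits.hits[]._source`. As a rough sketch, a small helper can flatten that into something easier to work with. The helper name `summarize_hits` and the hand-written `sample_response` are illustrative stand-ins, trimmed to the fields used here; a real response carries more metadata.

```python
def summarize_hits(response):
    """Return (total, sources) from a search response shaped like the one above."""
    hits = response["hits"]
    total = hits["total"]["value"]
    sources = [hit["_source"] for hit in hits["hits"]]
    return total, sources

# Hand-written stand-in for a real response (assumption: only these fields matter here).
sample_response = {
    "hits": {
        "total": {"value": 1, "relation": "eq"},
        "hits": [
            {
                "_index": "books",
                "_id": "5",
                "_source": {"title": "The Hobbit", "author": "J.R.R. Tolkien", "year": 1937},
            },
        ],
    }
}

total, sources = summarize_hits(sample_response)
print(total, [s["title"] for s in sources])  # 1 ['The Hobbit']
```

Note that `total` is exact only when `relation` is `"eq"`; for large result sets Elasticsearch may report a lower bound with `relation` set to `"gte"`.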

--- 
## 3. Practical Demos

Let's use the Python client to index and search two more realistic datasets: emails and tweets.

### Demo 1: Indexing and Searching Emails

We will index a few emails and then find those from a specific sender that contain the word "report".

**Code:**
```python
from elasticsearch.helpers import bulk

emails = [
    {'_index': 'emails', '_id': 1, '_source': {'sender': 'boss@example.com', 'subject': 'Urgent: Project Update', 'body': 'We need the final report by Friday.'}},
    {'_index': 'emails', '_id': 2, '_source': {'sender': 'hr@example.com', 'subject': 'Holiday Schedule', 'body': 'A reminder about the upcoming holiday.'}},
    {'_index': 'emails', '_id': 3, '_source': {'sender': 'boss@example.com', 'subject': 'Re: Yesterday\'s Meeting', 'body': 'Please send me the sales report as soon as possible.'}}
]

# Bulk index the documents
es.indices.delete(index='emails', ignore_unavailable=True) # Clear old index
bulk(es, emails, refresh=True) # Refresh so the documents are searchable right away
print("Emails indexed successfully.")

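# Define a bool query: 'must' scores the full-text match, 'filter' restricts by sender.
# 'sender.keyword' is the exact-match sub-field that default dynamic mapping adds to strings.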
query = {
    "bool": {
        "must": [
            { "match": { "body": "report" } }
        ],
        "filter": [
            { "term": { "sender.keyword": "boss@example.com" } }
        ]
    }
}

# Execute the search
response = es.search(index='emails', query=query)

print("\n--- Search Results ---")
for hit in response['hits']['hits']:
    print(f"Found email with subject: {hit['_source']['subject']}")
```

**Expected Output:**
```
Emails indexed successfully.

--- Search Results ---
Found email with subject: Urgent: Project Update
Found email with subject: Re: Yesterday's Meeting
```
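The query in this demo combines two kinds of clause: `must` clauses contribute to the relevance score, while `filter` clauses only include or exclude documents. Because the Query DSL is a plain dictionary, the pattern is easy to wrap in a function. The sketch below is one possible way to do that; `filtered_match` is a hypothetical helper, not part of the client library.

```python
def filtered_match(text_field, text, filter_field, filter_value):
    """Build a bool query: full-text match on one field, exact filter on another."""
    return {
        "bool": {
            "must": [{"match": {text_field: text}}],
            "filter": [{"term": {filter_field: filter_value}}],
        }
    }

# Rebuilds the email query from this demo.
query = filtered_match("body", "report", "sender.keyword", "boss@example.com")
print(query["bool"]["filter"])  # [{'term': {'sender.keyword': 'boss@example.com'}}]

# With a live cluster:
#   response = es.search(index='emails', query=query)
```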

### Demo 2: Indexing and Searching Tweets

We will index tweets and search for one containing a specific phrase and hashtag.

**Code:**
```python
tweets = [
    {'_index': 'tweets', '_id': 1, '_source': {'user': 'user_a', 'timestamp': '2025-09-30T12:00:00', 'text': 'I love learning about PostgreSQL!', 'hashtags': ['database', 'postgres']}},
    {'_index': 'tweets', '_id': 2, '_source': {'user': 'user_b', 'timestamp': '2025-09-30T12:05:00', 'text': 'Elasticsearch is so fast for text search.', 'hashtags': ['database', 'search']}}
]

es.indices.delete(index='tweets', ignore_unavailable=True) # Clear old index
bulk(es, tweets, refresh=True) # Refresh so the documents are searchable right away
print("Tweets indexed successfully.")

query = {
    "bool": {
        "must": [
            { "match_phrase": { "text": "love learning" } } 
        ],
        "filter": [
            { "term": { "hashtags.keyword": "database" } }
        ]
    }
}

response = es.search(index='tweets', query=query)
print("\n--- Search Results ---")
for hit in response['hits']['hits']:
    print(f"Found tweet from {hit['_source']['user']}: {hit['_source']['text']}")
```

**Expected Output:**
```
Tweets indexed successfully.

--- Search Results ---
Found tweet from user_a: I love learning about PostgreSQL!
```
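Both demos write out every bulk action by hand, which is fine for two or three documents. The `bulk()` helper accepts any iterable of actions, so for larger datasets a generator can wrap plain documents in the `_index` / `_id` / `_source` envelope on the fly. A minimal sketch, assuming sequential integer IDs are acceptable; `generate_actions` and `tweets_data` are illustrative names.

```python
def generate_actions(index, docs, start_id=1):
    """Yield one bulk action per document, numbering IDs from start_id."""
    for doc_id, doc in enumerate(docs, start=start_id):
        yield {"_index": index, "_id": doc_id, "_source": doc}

# Plain documents without the bulk envelope (shortened versions of the tweets above).
tweets_data = [
    {"user": "user_a", "text": "I love learning about PostgreSQL!"},
    {"user": "user_b", "text": "Elasticsearch is so fast for text search."},
]

actions = list(generate_actions("tweets", tweets_data))
print(actions[0]["_id"], actions[1]["_source"]["user"])  # 1 user_b

# With a live cluster, pass the generator straight to the helper:
#   bulk(es, generate_actions("tweets", tweets_data))
```

Because the generator yields actions lazily, the full list of documents never has to be held in memory in its wrapped form.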

--- 
## Conclusion

This notebook demonstrated how to move from command-line tools to a real programming language for interacting with Elasticsearch. We learned that:

- The official **Python client** provides a clean, programmatic interface to the entire REST API.
- The JSON **Query DSL** is passed as a simple Python dictionary to the `search()` method.
- The `bulk()` helper from `elasticsearch.helpers` sends many documents in batched requests, which is far more efficient than indexing them one at a time.

With these skills, you can now integrate Elasticsearch's powerful search capabilities into any Python application.