## Documentation

To read more about the search API, see the Elasticsearch guide on [searching your data](https://www.elastic.co/guide/en/elasticsearch/reference/current/search-your-data.html) and the [search API reference](https://www.elastic.co/guide/en/elasticsearch/reference/current/search-search.html).

![search_api_docs](../images/search_api_docs.png)

## Connect to Elasticsearch

In [1]:
from pprint import pprint
from elasticsearch import Elasticsearch

es = Elasticsearch('http://localhost:9200')
client_info = es.info()
print('Connected to Elasticsearch!')
pprint(client_info.body)

Connected to Elasticsearch!
{'cluster_name': 'es-docker-cluster',
 'cluster_uuid': '68vsKryIR7Ss49bLh7mz5Q',
 'name': 'es01',
 'tagline': 'You Know, for Search',
 'version': {'build_date': '2025-12-16T10:09:08.849001802Z',
             'build_flavor': 'default',
             'build_hash': 'd8972a71dbbd64ff17f2f4dba9ca2c3fe09fb100',
             'build_snapshot': False,
             'build_type': 'docker',
             'lucene_version': '10.3.2',
             'minimum_index_compatibility_version': '8.0.0',
             'minimum_wire_compatibility_version': '8.19.0',
             'number': '9.2.3'}}


## Inserting documents

In [2]:
es.indices.delete(index='index_1', ignore_unavailable=True)
es.indices.create(index='index_1')

es.indices.delete(index='index_2', ignore_unavailable=True)
es.indices.create(index='index_2')

ObjectApiResponse({'acknowledged': True, 'shards_acknowledged': True, 'index': 'index_2'})

Let's index the documents sequentially in both indices.

In [3]:
import json
from tqdm import tqdm


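# Load the sample documents; the context manager closes the file afterwards
with open("../data/dummy_data.json") as f:
    dummy_data = json.load(f)

# Index every document into index_1, then into index_2
for document in tqdm(dummy_data, total=len(dummy_data)):
    response = es.index(index='index_1', document=document)

for document in tqdm(dummy_data, total=len(dummy_data)):
    response = es.index(index='index_2', document=document)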

100%|██████████| 3/3 [00:00<00:00, 33.24it/s]
100%|██████████| 3/3 [00:00<00:00, 32.52it/s]
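
Indexing one document per request works for three documents, but larger datasets are usually sent in batches. The Python client provides `elasticsearch.helpers.bulk` for this. The following is a minimal sketch and not part of the original notebook: `generate_actions` is a hypothetical helper name, and the sample documents are made up for illustration. The `bulk` call itself is left as a comment because it needs the live `es` client created earlier.

```python
def generate_actions(documents, index_names):
    """Yield one bulk action per (index, document) pair."""
    for index_name in index_names:
        for document in documents:
            # '_index' selects the target index, '_source' holds the document body
            yield {"_index": index_name, "_source": document}


# Made-up sample documents standing in for dummy_data
sample_docs = [{"title": "doc one"}, {"title": "doc two"}]
actions = list(generate_actions(sample_docs, ["index_1", "index_2"]))
print(len(actions))

# With a running cluster (assumes the `es` client from the connection step):
# from elasticsearch import helpers
# helpers.bulk(es, actions)
```

Each action names its own target index, so a single batch can write to both `index_1` and `index_2`.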


## Searching

We can search one index at a time by passing a single index name to the `index` argument.

In [4]:
response = es.search(
    index='index_1',
    body={
        "query": {"match_all": {}}
    }
)

n_hits = response['hits']['total']['value']
print(f"Found {n_hits} documents in index_1")

Found 3 documents in index_1


In [None]:
# Perform a search in index_1
GET /index_1/_search
{
  "query": {
    "match_all": {}
  }
}

In [5]:
response = es.search(
    index='index_2',
    body={
        "query": {"match_all": {}}
    }
)

n_hits = response['hits']['total']['value']
print(f"Found {n_hits} documents in index_2")

Found 3 documents in index_2
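
The cells above read `response['hits']['total']['value']`. The same object also carries a `relation` field: `eq` means the count is exact, while `gte` means the value is only a lower bound (by default Elasticsearch stops counting precisely at 10,000 hits). A small sketch of a helper that reports both, where `describe_total` is a hypothetical name and `sample_response` is a hand-written dictionary mimicking the shape of a search response:

```python
def describe_total(response):
    """Return a readable hit count that respects hits.total.relation."""
    total = response["hits"]["total"]
    # 'eq' is an exact count; 'gte' means "this many or more"
    prefix = "at least " if total["relation"] == "gte" else ""
    return f"{prefix}{total['value']} documents"


# Hand-written stand-ins for real responses (illustrative values only)
sample_response = {"hits": {"total": {"value": 3, "relation": "eq"}, "hits": []}}
capped_response = {"hits": {"total": {"value": 10000, "relation": "gte"}, "hits": []}}

print(describe_total(sample_response))
print(describe_total(capped_response))
```

With only a few documents per index, as in this notebook, the relation is `eq` and the value can be used directly.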


Or we can search multiple indices at once by passing their names to the `index` argument as a comma-separated string.

In [6]:
response = es.search(
    index='index_1,index_2',
    body={
        "query": {"match_all": {}}
    }
)

n_hits = response['hits']['total']['value']
print(f"Found {n_hits} documents in index_1 and index_2")

Found 6 documents in index_1 and index_2
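
When the index names live in a Python list, the comma-separated string used in the search above can be built programmatically. This is a minimal sketch; `build_index_expression` is a hypothetical helper name, not something provided by the client:

```python
def build_index_expression(index_names):
    """Join index names into the comma-separated form, e.g. 'index_1,index_2'."""
    # Drop surrounding whitespace and skip empty entries
    cleaned = [name.strip() for name in index_names if name.strip()]
    if not cleaned:
        raise ValueError("at least one index name is required")
    return ",".join(cleaned)


index_expression = build_index_expression(["index_1", " index_2 "])
print(index_expression)
```

The resulting string can then be passed as `index=index_expression` to `es.search`.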


In [None]:
GET /index_1,index_2/_search
{
  "query": {
    "match_all": {}
  }
}

We can also use wildcards `*` to match multiple indices without listing them individually, such as `"index*"`.

In [7]:
response = es.search(
    index='index*',
    body={
        "query": {"match_all": {}}
    }
)

n_hits = response['hits']['total']['value']
print(f"Found {n_hits} documents in all indexes with name starting with 'index'")

Found 6 documents in all indexes with name starting with 'index'


In [None]:
GET /index*/_search
{
  "query": {
    "match_all": {}
  }
}
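
To get a feel for which names a pattern such as `index*` selects, the `*` wildcard can be approximated locally. This sketch only illustrates the idea: Elasticsearch resolves index patterns on the server, and `matches_index_pattern` and the name `my_documents` are made up for this example.

```python
import re


def matches_index_pattern(pattern, index_name):
    """Return True if index_name matches pattern, where '*' matches any run of characters."""
    # Escape the literal pieces and let every '*' become '.*'
    regex = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.fullmatch(regex, index_name) is not None


# 'my_documents' is a made-up index name used as a non-matching example
candidate_names = ["index_1", "index_2", "my_documents"]
matched = [name for name in candidate_names if matches_index_pattern("index*", name)]
print(matched)
```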

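Or, to search every index in the cluster, we use `_all`. The count is higher than the 6 documents in `index_1` and `index_2` because the cluster also contains other indices.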

In [8]:
response = es.search(
    index='_all',
    body={
        "query": {"match_all": {}}
    }
)

n_hits = response['hits']['total']['value']
print(f"Found {n_hits} documents in all indexes")

Found 17 documents in all indexes


In [None]:
GET /_all/_search
{
  "query": {
    "match_all": {}
  }
}
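
When a search spans several indices, each hit records the index it came from in its `_index` field. The sketch below tallies hits per index; `count_hits_by_index` is a hypothetical helper and `sample_response` is a hand-written dictionary that only mimics the shape of a real response. Note that it counts the hits returned in the response (10 by default, controlled by `size`), not the overall total.

```python
from collections import Counter


def count_hits_by_index(response):
    """Count the returned hits per source index."""
    return Counter(hit["_index"] for hit in response["hits"]["hits"])


# Hand-written stand-in for a multi-index search response (illustrative values only)
sample_response = {
    "hits": {
        "total": {"value": 3, "relation": "eq"},
        "hits": [
            {"_index": "index_1", "_id": "a", "_source": {}},
            {"_index": "index_2", "_id": "b", "_source": {}},
            {"_index": "index_1", "_id": "c", "_source": {}},
        ],
    }
}

counts = count_hits_by_index(sample_response)
for index_name, n in sorted(counts.items()):
    print(f"{index_name}: {n}")
```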