## Connect to Elasticsearch

In [1]:
pip install "elasticsearch==8.*"

Note: you may need to restart the kernel to use updated packages.


In [2]:
from pprint import pprint
from elasticsearch import Elasticsearch

es = Elasticsearch('http://localhost:9200')
client_info = es.info()
print('Connected to Elasticsearch!')
pprint(client_info.body)

Connected to Elasticsearch!
{'cluster_name': 'docker-cluster',
 'cluster_uuid': '0cXT2SaPTdGHk8cC2kcjuQ',
 'name': '71a7b5e51874',
 'tagline': 'You Know, for Search',
 'version': {'build_date': '2024-08-05T10:05:34.233336849Z',
             'build_flavor': 'default',
             'build_hash': '1a77947f34deddb41af25e6f0ddb8e830159c179',
             'build_snapshot': False,
             'build_type': 'docker',
             'lucene_version': '9.11.1',
             'minimum_index_compatibility_version': '7.0.0',
             'minimum_wire_compatibility_version': '7.17.0',
             'number': '8.15.0'}}


## Create Index

### 1. Simplest Way

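In this method no mappings are supplied. Mappings define the structure of the documents in an index; when they are omitted, Elasticsearch infers each field's type dynamically the first time a document containing that field is indexed.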

In [2]:
es.indices.delete(index='my_index', ignore_unavailable=True)
es.indices.create(index='my_index')

ObjectApiResponse({'acknowledged': True, 'shards_acknowledged': True, 'index': 'my_index'})
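To see what "inferred" means in practice, one option is to index a sample document and read the mapping back. The sketch below assumes the `es` client from the connection cell is passed in; the helper name `inferred_mapping` and the sample document's fields are hypothetical and only serve as illustration.

```python
def inferred_mapping(es, index="my_index"):
    """Index one sample document and return the mapping Elasticsearch inferred."""
    # Hypothetical sample document; the field names are illustrative only.
    doc = {"title": "Hello", "views": 42, "published": True}
    # Indexing a document with unseen fields triggers dynamic mapping.
    es.index(index=index, document=doc)
    # get_mapping returns {index_name: {"mappings": {...}}}.
    return es.indices.get_mapping(index=index)[index]["mappings"]
```

Calling `pprint(inferred_mapping(es))` against a cluster with default dynamic-mapping settings would typically show `title` as `text` with a `keyword` sub-field, `views` as `long`, and `published` as `boolean`.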

### 2. Specify the number of replicas and shards

Shards: Elasticsearch divides the data in an index into one or more primary shards. Each shard is a self-contained Lucene index that Elasticsearch can place on any node in the cluster. Elasticsearch allocates and balances shards automatically, but the number of primary shards is set when the index is created and cannot be changed in place afterwards.

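Replicas: For fault tolerance and high availability, each primary shard can have one or more replica shards, which are copies of it. A replica is never allocated to the same node as its primary, and unlike the primary shard count, the number of replicas can be changed at any time.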
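The relationship between the two settings is simple arithmetic, sketched here in plain Python. The function names are hypothetical, and the example values 3 and 2 are the ones used in the cell that follows.

```python
def total_shard_copies(number_of_shards, number_of_replicas):
    """Each primary shard exists once, plus once per replica."""
    return number_of_shards * (1 + number_of_replicas)


def min_nodes_for_full_allocation(number_of_replicas):
    """Copies of the same shard never share a node, so every copy needs its own node."""
    return number_of_replicas + 1


print(total_shard_copies(3, 2))          # 3 primaries + 6 replicas
print(min_nodes_for_full_allocation(2))  # one node per copy of a shard
```

With 3 primary shards and 2 replicas the index holds 9 shard copies in total, and at least 3 nodes are needed before every replica can be assigned.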

In [4]:
es.indices.delete(index='my_index', ignore_unavailable=True)
es.indices.create(
    index="my_index",
    settings={
        "index": {
            "number_of_shards": 3,  # primary shards the index is split into (fixed at creation)
            "number_of_replicas": 2  # replica copies kept for each primary shard
        }
    },
)

ObjectApiResponse({'acknowledged': True, 'shards_acknowledged': True, 'index': 'my_index'})
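An `acknowledged` response only confirms that the index was created, not that every shard copy found a node. A minimal sketch for checking this is shown below; it assumes the `es` client from the connection cell, and the helper name `shard_report` is hypothetical.

```python
def shard_report(es, index="my_index"):
    """Summarise the shard settings and allocation health of one index."""
    # Settings values come back as strings, e.g. "3", so convert them.
    settings = es.indices.get_settings(index=index)[index]["settings"]["index"]
    # Restricting cluster health to one index reports only that index's shards.
    health = es.cluster.health(index=index)
    return {
        "primaries": int(settings["number_of_shards"]),
        "replicas": int(settings["number_of_replicas"]),
        "status": health["status"],
        "unassigned_shards": health["unassigned_shards"],
    }
```

If the cluster has a single node, as a local Docker setup often does, `pprint(shard_report(es))` would be expected to report a `yellow` status, because the replica shards have no second node to live on. Lowering the replica count with `es.indices.put_settings(index="my_index", settings={"index": {"number_of_replicas": 0}})` is one way to bring such an index back to `green`.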