## **What is Elasticsearch?**

Elasticsearch is a distributed search and analytics engine, scalable data store, and vector database built on Apache Lucene. It’s optimized for speed and relevance on production-scale workloads. Use Elasticsearch to search, index, store, and analyze data of all shapes and sizes in near real time.

Official documentation: https://www.elastic.co/guide/en/elasticsearch/reference/current/index.html

## **Connect to Elasticsearch**

In [8]:
from pprint import pprint
from elasticsearch import Elasticsearch
from dotenv import load_dotenv
import os
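load_dotenv()  # with no arguments, this searches for a file named ".env"

# Reading the variable did not work here, most likely because it is defined in
# ".env.local", which load_dotenv() does not pick up unless given the path.
# LOCALHOST = os.getenv('LOCALHOST')
# print(LOCALHOST)

# Fall back to a hard-coded URL for now.
LOCALHOST = "http://localhost:9200/"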

es = Elasticsearch(LOCALHOST)
client_info = es.info()
print('Connected to Elasticsearch!')
pprint(client_info.body)

Connected to Elasticsearch!
{'cluster_name': 'docker-cluster',
 'cluster_uuid': 'AKPh90H1StWquQfBPE4Chw',
 'name': 'b66a5ae1a4a1',
 'tagline': 'You Know, for Search',
 'version': {'build_date': '2024-08-05T10:05:34.233336849Z',
             'build_flavor': 'default',
             'build_hash': '1a77947f34deddb41af25e6f0ddb8e830159c179',
             'build_snapshot': False,
             'build_type': 'docker',
             'lucene_version': '9.11.1',
             'minimum_index_compatibility_version': '7.0.0',
             'minimum_wire_compatibility_version': '7.17.0',
             'number': '8.15.0'}}


## **Connect to Elasticsearch using dotenv**

In [16]:
from pprint import pprint
from elasticsearch import Elasticsearch
from dotenv import load_dotenv
import os

# Use absolute path to load .env.local
env_path = os.path.join('e:\\', 'Study Space', 'Python Workspace', 'ELastic Search', '.env.local')
load_dotenv(dotenv_path=env_path)

# Print debugging information
print("Current working directory:", os.getcwd())
print("Environment file path:", env_path)

# Get the LOCALHOST variable
LOCALHOST = os.getenv('LOCALHOST')
print("Raw LOCALHOST value:", repr(LOCALHOST))

# Remove quotes if present and ensure it's a valid string
if LOCALHOST:
    LOCALHOST = LOCALHOST.strip('"')
    print("Processed LOCALHOST value:", repr(LOCALHOST))

# Connect to Elasticsearch
try:
    es = Elasticsearch(LOCALHOST)
    client_info = es.info()
    print('Connected to Elasticsearch!')
    pprint(client_info.body)
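except Exception as e:
    # Report the failure instead of leaving the handler empty
    print('Failed to connect to Elasticsearch:', e)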

Current working directory: e:\Study Space\Python Workspace\ELastic Search
Environment file path: e:\Study Space\Python Workspace\ELastic Search\.env.local
Raw LOCALHOST value: 'http://localhost:9200/'
Processed LOCALHOST value: 'http://localhost:9200/'
Connected to Elasticsearch!
{'cluster_name': 'docker-cluster',
 'cluster_uuid': 'AKPh90H1StWquQfBPE4Chw',
 'name': 'b66a5ae1a4a1',
 'tagline': 'You Know, for Search',
 'version': {'build_date': '2024-08-05T10:05:34.233336849Z',
             'build_flavor': 'default',
             'build_hash': '1a77947f34deddb41af25e6f0ddb8e830159c179',
             'build_snapshot': False,
             'build_type': 'docker',
             'lucene_version': '9.11.1',
             'minimum_index_compatibility_version': '7.0.0',
             'minimum_wire_compatibility_version': '7.17.0',
             'number': '8.15.0'}}
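
The quote-stripping and fallback logic from the two connection cells can be collected into one small helper. This is only a sketch: `resolve_es_url` and `DEFAULT_URL` are hypothetical names, and it assumes the variable is still called `LOCALHOST` and that the hard-coded address from the first cell is an acceptable default.

```python
import os

# Assumption: same address as the hard-coded value in the first cell.
DEFAULT_URL = "http://localhost:9200/"


def resolve_es_url(env=None, default=DEFAULT_URL):
    """Return the Elasticsearch URL from LOCALHOST, or `default` if unset or empty."""
    env = os.environ if env is None else env
    raw = env.get("LOCALHOST")
    if raw is None:
        return default
    # Drop surrounding whitespace and any stray quotes left over from the .env file.
    cleaned = raw.strip().strip('"').strip("'")
    return cleaned or default


# Works with a plain dict, so it can be tried without touching the real environment.
print(resolve_es_url({"LOCALHOST": '"http://localhost:9200/"'}))
print(resolve_es_url({}))
```

After `load_dotenv(dotenv_path=env_path)` has run, `Elasticsearch(resolve_es_url())` would connect with whichever value was found.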


## **Create an index**

In Elasticsearch, an index is the structure that stores and organizes documents, roughly analogous to a database in a relational system. It lets Elasticsearch search, filter, and retrieve relevant data efficiently.

### **Analogy:**

Think of an index in Elasticsearch like a book's index:

* The index helps locate information quickly (efficient searching).
* The documents are the pages of the book (data records).
* The fields in a document are like headings under which the information is categorized.

### **Example:**

If you're creating a blog search engine, you might create an index named `blogs` where:

* Each document represents a blog post.
* Fields in each document might include `title`, `author`, `content`, and `published_date`.
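
As a rough illustration of the blog example, the index could be described with an explicit mapping and one sample document. The field types chosen here (`text`, `keyword`, `date`), the sample values, and the helper name `create_blogs_index` are assumptions made for this sketch, not something the notebook defines.

```python
# Hypothetical mapping for the `blogs` index: one entry per field.
blog_mappings = {
    "properties": {
        "title": {"type": "text"},           # full-text searchable
        "author": {"type": "keyword"},       # exact-match filtering
        "content": {"type": "text"},         # full-text searchable
        "published_date": {"type": "date"},  # range queries and sorting
    }
}

# One document represents one blog post (values are made up).
sample_post = {
    "title": "Getting started with Elasticsearch",
    "author": "jane",
    "content": "An index stores documents; each document has fields.",
    "published_date": "2024-08-05",
}


def create_blogs_index(client, index_name="blogs"):
    """Create the index with the mapping above and store the sample post.

    `client` is expected to be an `Elasticsearch` instance such as `es`.
    """
    client.indices.create(index=index_name, mappings=blog_mappings)
    return client.index(index=index_name, document=sample_post)
```

With a running cluster, `create_blogs_index(es)` would create the index and index the single post. Without a mapping, Elasticsearch would infer field types from the first documents it receives.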

In [23]:
# Drop the index if it already exists, then create it with default settings
es.indices.delete(index='test-index', ignore_unavailable=True)
es.indices.create(index='test-index')

ObjectApiResponse({'acknowledged': True, 'shards_acknowledged': True, 'index': 'test-index'})
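
Since the delete-then-create pair is repeated across cells, it could be wrapped in a small helper. A minimal sketch, assuming `client` is an `Elasticsearch` instance like `es`; the name `recreate_index` is hypothetical.

```python
def recreate_index(client, name, **create_kwargs):
    """Delete `name` if it exists, then create it again.

    Extra keyword arguments (for example `settings=` or `mappings=`)
    are passed straight through to `client.indices.create`.
    """
    # ignore_unavailable=True keeps the delete from failing when the index is missing
    client.indices.delete(index=name, ignore_unavailable=True)
    return client.indices.create(index=name, **create_kwargs)
```

For example, `recreate_index(es, 'test-index')` would reproduce the cell above. Note that this pattern discards any data already in the index, so it suits experiments rather than production use.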

In [24]:
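# Recreate the index with explicit shard and replica counts
es.indices.delete(index='test-index', ignore_unavailable=True)
es.indices.create(
    index='test-index',
    settings={
        'number_of_shards': 3,    # primary shards
        'number_of_replicas': 2,  # replica copies of each primary shard
    },
)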

ObjectApiResponse({'acknowledged': True, 'shards_acknowledged': True, 'index': 'test-index'})
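
To make the settings above concrete: `number_of_replicas` counts copies per primary shard, so the total number of shard copies is primaries × (1 + replicas). The helper name `total_shard_copies` below is made up for this sketch.

```python
def total_shard_copies(number_of_shards, number_of_replicas):
    """Each primary shard exists once, plus `number_of_replicas` copies of it."""
    return number_of_shards * (1 + number_of_replicas)


settings = {"number_of_shards": 3, "number_of_replicas": 2}

primaries = settings["number_of_shards"]
total = total_shard_copies(**settings)
replicas = total - primaries

print(primaries, replicas, total)  # 3 6 9
```

Elasticsearch does not place a replica on the same node as its primary, so if this Docker setup runs a single node, the six replica shards would likely remain unassigned even though the index is created successfully.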