# [Elasticsearch](https://www.elastic.co/products/elasticsearch)
Elasticsearch is a highly scalable, open-source full-text search and analytics engine that simplifies storing, retrieving and deleting large datasets. This tutorial covers the basics of Elasticsearch and shows how to use it from Python applications.

### CRUD
The API of the Elasticsearch Python client is described in the [elasticsearch-py documentation](https://elasticsearch-py.readthedocs.io/en/master/api.html#). We'll create a single Elasticsearch index and work on it.
First, we'll create multiple documents, then read, update and delete them from that index.

In [1]:
from elasticsearch import Elasticsearch
from elasticsearch import helpers
from datetime import datetime
import json
import ast

### Initialising the object and creating an index

In [2]:
es = Elasticsearch()
es.indices.create(index= 'test', ignore= 400)

{'acknowledged': True, 'shards_acknowledged': True, 'index': 'test'}
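
The `ignore=400` argument above asks the client not to raise on the HTTP 400 error that Elasticsearch returns when the index already exists, so the cell can be re-run safely. An alternative is to check for the index explicitly, roughly as sketched below. `ensure_index` is a hypothetical helper name, not part of the client library; the sketch only assumes that `client` exposes `indices.exists` and `indices.create`, as the `es` object does.

```python
def ensure_index(client, index_name):
    """Create index_name only if it does not exist yet.

    Returns True when the index was created, False when it was already there.
    """
    if client.indices.exists(index=index_name):
        return False
    client.indices.create(index=index_name)
    return True
```

With the client from this tutorial it would be called as `ensure_index(es, 'test')`.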

In [3]:
print(es.cat.indices(v=True))                    # Listing the status of indices

health status index    uuid                   pri rep docs.count docs.deleted store.size pri.store.size
yellow open   my_index p8P485TkRuqUDHZGMEjAbA   5   1          1            0      6.4kb          6.4kb
yellow open   test     2ib_u-xPSu6wfEL2Qt6m4A   5   1          0            0       631b           631b
green  open   .kibana  PVqMSgdWRN-xD8cRZuDylg   1   0          2            0     10.6kb         10.6kb
yellow open   udit     85JdcJ1WSESvqfuIPnr5yA   5   1          0            0      1.2kb          1.2kb
yellow open   data     5htQFDExSmypAAbsm-uRWg   5   1          0            0      1.2kb          1.2kb



Here we can see the health, status and other attributes of each index.
To know more: [cat indices](https://www.elastic.co/guide/en/elasticsearch/reference/current/cat-indices.html)
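
If the table needs to be inspected from code rather than read by eye, it can be split into rows, as in the minimal sketch below. `parse_cat_table` is a hypothetical helper, and it assumes every cell is non-empty and contains no spaces, which holds for the output shown above but may not hold for every index.

```python
def parse_cat_table(text):
    """Turn the plain-text table from es.cat.indices(v=True) into a list of dicts."""
    lines = [line for line in text.splitlines() if line.strip()]
    if not lines:
        return []
    header = lines[0].split()          # column names from the first row
    return [dict(zip(header, line.split())) for line in lines[1:]]


# Two rows copied from the output above.
sample = """\
health status index    uuid                   pri rep docs.count docs.deleted store.size pri.store.size
yellow open   my_index p8P485TkRuqUDHZGMEjAbA   5   1          1            0      6.4kb          6.4kb
green  open   .kibana  PVqMSgdWRN-xD8cRZuDylg   1   0          2            0     10.6kb         10.6kb
"""

rows = parse_cat_table(sample)
yellow = [row["index"] for row in rows if row["health"] == "yellow"]
print(yellow)  # ['my_index']
```

All values come back as strings, so counts such as `docs.count` need an explicit `int(...)` before comparing them.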

## CREATE

In [5]:
json_string = {"username":"udit","sex": "male","occupation": "eveloper","status": "dead", "age": 99 }

print(json_string)
es.index(index='my_index', doc_type='users', id='01', body=json_string)  # Testing with dummy values

{'username': 'udit', 'sex': 'male', 'occupation': 'eveloper', 'status': 'dead', 'age': 99}


{'_index': 'my_index',
 '_type': 'users',
 '_id': '01',
 '_version': 2,
 'result': 'updated',
 '_shards': {'total': 2, 'successful': 1, 'failed': 0},
 '_seq_no': 2,
 '_primary_term': 1}

#### Time to index some real data from a sample JSON file I made using [json-generator](https://www.json-generator.com/)

In [6]:
with open("../datasets/sample.json") as read_file:
    data = json.load(read_file)  # the file holds a JSON array of user objects

In [7]:
for i, doc in enumerate(data):
    es.index(index='test', doc_type='users', id=str(i), body=doc)  # list position becomes the document id
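
The loop above sends one request per document. The `helpers` module imported earlier offers `helpers.bulk`, which takes an iterable of action dictionaries and sends them in batches. The sketch below only builds those actions, so it runs without a cluster; `bulk_actions` is a hypothetical helper name, and the `_type` key is included only because this tutorial's cluster still uses mapping types.

```python
def bulk_actions(docs, index_name, doc_type=None):
    """Yield one bulk action per document, using the list position as the id."""
    for i, doc in enumerate(docs):
        action = {"_index": index_name, "_id": str(i), "_source": doc}
        if doc_type is not None:
            # Assumption: an older cluster with mapping types, as in this tutorial.
            action["_type"] = doc_type
        yield action


actions = list(bulk_actions([{"name": "a"}, {"name": "b"}], "test", doc_type="users"))
print(actions[0])

# With a live cluster the generator can be passed straight to the helper:
# helpers.bulk(es, bulk_actions(data, "test", doc_type="users"))
```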

## READ
To search documents in Elasticsearch we write queries in its JSON-based Query DSL. You can learn more about it in the [Query DSL documentation](https://www.elastic.co/guide/en/elasticsearch/reference/current/query-filter-context.html).

In [8]:
res = es.search(index='test', doc_type='users',body='{"query" : { "match_all": {} } }')

In [9]:
res['hits']['hits']

[{'_index': 'test',
  '_type': 'users',
  '_id': '0',
  '_score': 1.0,
  '_source': {'name': 'Rose Singleton',
   'isActive': True,
   'balance': '$1,691.11',
   'picture': 'http://placehold.it/32x32',
   'age': 32,
   'eyeColor': 'green',
   'gender': 'male',
   'company': 'ZANILLA',
   'email': 'rosesingleton@zanilla.com',
   'phone': '+1 (949) 556-3050',
   'address': '384 Bristol Street, Churchill, Tennessee, 5071',
   'about': 'Aliqua elit sint ullamco magna incididunt esse fugiat minim id culpa sit velit et. Velit enim velit laboris eiusmod do ut sunt eiusmod officia ad qui veniam quis minim. Dolor laboris eu cillum exercitation. Ea et ipsum consectetur dolore mollit labore ipsum ad adipisicing dolor duis exercitation minim labore. Deserunt irure sint velit enim eiusmod laborum aliquip Lorem. Anim do adipisicing laboris exercitation velit.\r\n',
   'registered': '2014-05-01T02:45:16 -06:-30',
   'latitude': 24.460019,
   'longitude': 1.16329,
   'tags': ['enim',
    'exercitation

Hence we can see the objects we inserted, 20 in total. You can refine the search query to express more specific conditions, such as finding the users who are male and whose favorite fruit is apple.

In [10]:
query = """
{
    "query": {
        "bool": {
            "must": [
                { "match": { "gender": "male" } },
                { "match": { "favoriteFruit": "apple" } }
            ]
        }
    }
}
"""

res = es.search(index='test', doc_type='users', body=query)

In [11]:
[hit['_source']['name'] for hit in res['hits']['hits']]

['Brock Bernard', 'Best Oliver', 'Mccullough Stafford']

Looks like there are 3 male users whose favorite fruit is apple.
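
The query above is written as a JSON string. The same structure can be built as a Python dictionary, which makes it easier to vary the conditions; the client version used in this tutorial accepts a dictionary for `body` as well. The sketch below is one way to do it, with `match_all_of` being a hypothetical helper name.

```python
import json


def match_all_of(**fields):
    """Build a bool/must query with one match clause per keyword argument."""
    clauses = [{"match": {name: value}} for name, value in fields.items()]
    return {"query": {"bool": {"must": clauses}}}


query = match_all_of(gender="male", favoriteFruit="apple")
print(json.dumps(query, indent=2))

# With a live cluster: es.search(index="test", doc_type="users", body=query)
```

Every clause inside `must` has to match, so adding another keyword argument narrows the result further.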

## UPDATE
To show how to update existing documents in Elasticsearch, we will change the 'eyeColor' field of the document with id 3.

In [12]:
query = """
{
    "query": {
        "term": {
            "_id": "3"
        }
    }
}
"""
res = es.search(index='test', doc_type='users', body=query)
res['hits']['hits'][0]['_source']['eyeColor']

'brown'

We can see that it is 'brown'. Now let's change it to 'blue'.

In [13]:
query = """
{
    "doc": {
        "eyeColor": "blue"
    }
}
"""
res = es.update(index='test', doc_type='users', id='3', body=query)

res

{'_index': 'test',
 '_type': 'users',
 '_id': '3',
 '_version': 2,
 'result': 'updated',
 '_shards': {'total': 2, 'successful': 1, 'failed': 0},
 '_seq_no': 3,
 '_primary_term': 1}

The operation succeeded and the value has been updated.
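
To confirm the new value, a document whose id is known can be read directly with the client's `get` method instead of a search. The sketch below wraps that call; `read_field` is a hypothetical helper, and passing `doc_type` is an assumption that matches the older client used in this tutorial (newer client releases no longer take it).

```python
def read_field(client, index_name, doc_id, field, doc_type="users"):
    """Fetch one document by id and return a single field from its _source."""
    # Assumption: a client version that still accepts doc_type, as in this tutorial.
    doc = client.get(index=index_name, doc_type=doc_type, id=doc_id)
    return doc["_source"][field]
```

Calling `read_field(es, 'test', '3', 'eyeColor')` after the update should then return the new eye color.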

## DELETE
There are two ways to delete documents in Elasticsearch:

1. deleting a specific document by its id
2. deleting document(s) that match a query

First I will show you how to do it by query; for that we'll delete the data of the users who are male and whose favorite fruit is apple.

In [14]:
query = """
{
    "query": {
        "bool": {
            "must": [
                { "match": { "gender": "male" } },
                { "match": { "favoriteFruit": "apple" } }
            ]
        }
    }
}
"""
res = es.delete_by_query(index='test', doc_type='users', body=query)
print(res)

{'took': 24, 'timed_out': False, 'total': 3, 'deleted': 3, 'batches': 1, 'version_conflicts': 0, 'noops': 0, 'retries': {'bulk': 0, 'search': 0}, 'throttled_millis': 0, 'requests_per_second': -1.0, 'throttled_until_millis': 0, 'failures': []}
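
The response can also be checked in code rather than by reading it. The sketch below treats a delete-by-query as successful when nothing timed out, no failures or version conflicts were reported, and every matched document was deleted. `delete_succeeded` is a hypothetical helper, and the sample dictionary is a shortened copy of the response above.

```python
def delete_succeeded(response):
    """Return True when every matched document was deleted without errors."""
    return (
        not response["timed_out"]
        and not response["failures"]
        and response["version_conflicts"] == 0
        and response["deleted"] == response["total"]
    )


response = {"took": 24, "timed_out": False, "total": 3, "deleted": 3,
            "version_conflicts": 0, "failures": []}
print(delete_succeeded(response))  # True
```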


We can see that the documents returned by the earlier search query have now been deleted. To confirm, you can re-run that search cell.

Now, we'll delete a document by its specific id.

In [15]:
es.delete(index='test',doc_type='users',id='5')

{'_index': 'test',
 '_type': 'users',
 '_id': '5',
 '_version': 2,
 'result': 'deleted',
 '_shards': {'total': 2, 'successful': 1, 'failed': 0},
 '_seq_no': 7,
 '_primary_term': 1}

Now, I'll delete all the remaining documents in my 'test' index.

In [16]:
es.delete_by_query(index='test', doc_type='users', body='{"query" : { "match_all": {} } }')

{'took': 29,
 'timed_out': False,
 'total': 16,
 'deleted': 16,
 'batches': 1,
 'version_conflicts': 0,
 'noops': 0,
 'retries': {'bulk': 0, 'search': 0},
 'throttled_millis': 0,
 'requests_per_second': -1.0,
 'throttled_until_millis': 0,
 'failures': []}

# BYE BYE !!