Inspired by this [tutorial](https://qbox.io/blog/building-an-elasticsearch-index-with-python), I continued exploring [Elasticsearch](https://www.elastic.co/products/elasticsearch), because I want a fast indexing tool for the data I am gathering and the applications I am developing.

## Install the Python library for Elasticsearch
https://elasticsearch-py.readthedocs.io/en/master/
``` shell
$ pip install elasticsearch
```

Note: on my Mac I installed Elasticsearch itself through Homebrew:
``` shell
$ brew install elasticsearch
$ brew services start elasticsearch
```

## Creating the data
### Read the CSV files
Read the character data

In [1]:
import pandas as pd
character_df = pd.read_csv('data/nintendo_characters.csv')
character_df

| | id | name | description | color | occupation | picture |
|---|---|---|---|---|---|---|
| 0 | 2 | Luigi | This is Luigi | green | plumber | https://upload.wikimedia.org/wikipedia/en/f/f1... |
| 1 | 1 | Mario | This is Mario | red | plumber | https://upload.wikimedia.org/wikipedia/en/9/99... |
| 2 | 3 | Peach | My name is Peach | pink | princess | https://s-media-cache-ak0.pinimg.com/originals... |
| 3 | 4 | Toad | I like funghi | red | | https://upload.wikimedia.org/wikipedia/en/d/d1... |


Toad has no occupation, which pandas reads as `NaN`. Replace it with an empty string.

In [2]:
character_df.occupation = character_df.occupation.fillna('')
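
A likely reason for this step is that `NaN` has no valid JSON representation, so a document that still contains it may be rejected when it is indexed. The sketch below illustrates the problem and the fix with the standard library only. The `toad` record and the `blank_missing` helper are hypothetical stand-ins for the dataframe row and for `fillna('')`.

```python
import json
import math

# Hypothetical record mirroring the Toad row, where occupation is missing.
toad = {"id": 4, "name": "Toad", "occupation": float("nan")}

# By default Python's json module emits the bare token NaN, which is not valid JSON.
print(json.dumps(toad))

def blank_missing(record):
    """Return a copy of record with NaN floats replaced by an empty string."""
    return {
        key: "" if isinstance(value, float) and math.isnan(value) else value
        for key, value in record.items()
    }

clean = blank_missing(toad)

# allow_nan=False makes json.dumps raise ValueError on NaN; the cleaned record passes.
print(json.dumps(clean, allow_nan=False))
```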

Read the world data

In [3]:
world_df = pd.read_csv('data/super_mario_3_worlds.csv', sep=';')
world_df

| | id | world | name | image | description | picture |
|---|---|---|---|---|---|---|
| 0 | 1 | World 1 | Grass Land | Grass Land.PNG | Grass Land is the first world of the game. It ... | https://www.mariowiki.com/images/thumb/f/fa/Gr... |
| 1 | 2 | World 2 | Desert Land | World2SMB3.PNG | Desert Land is the second world of the game. I... | https://www.mariowiki.com/images/thumb/d/d1/Wo... |
| 2 | 3 | World 3 | Water Land | Sea Side.PNG | Water Land is a water-themed region that was r... | https://www.mariowiki.com/images/thumb/b/b7/Se... |
| 3 | 4 | World 4 | Giant Land | SMAS-Big Island Map.PNG | Giant Land is mainly composed of an island in ... | https://www.mariowiki.com/images/thumb/9/9c/SM... |
| 4 | 5 | World 5 | Sky Land | Sky world.PNG | Sky Land is the world that has been conquered ... | https://www.mariowiki.com/images/thumb/6/69/Sk... |
| 5 | 6 | World 6 | Ice Land | SMB36.PNG | Ice Land is an area covered in snow and ice. T... | https://www.mariowiki.com/images/thumb/4/40/SM... |
| 6 | 7 | World 7 | Pipe Land | Pipe maze.PNG | Pipe Land is a series of small islands in a ne... | https://www.mariowiki.com/images/thumb/a/aa/Pi... |
| 7 | 8 | World 8 | Dark Land | Dark land2.PNG | The eighth and final world is ruled by King Bo... | https://www.mariowiki.com/images/thumb/0/01/Da... |
| 8 | 9 | World 9 | Warp Zone | World 9.PNG | World 9 is only accessible by a Warp Whistle. ... | https://www.mariowiki.com/images/thumb/0/09/Wo... |


## Set up Elasticsearch
### Create the parameters

In [4]:
ES_HOST = {"host" : "localhost", "port" : 9200}
INDEX_NAME = 'nintendo'
TYPE_NAME = 'character'
ID_FIELD = 'id'

### Set up the Elasticsearch connector

In [5]:
from elasticsearch import Elasticsearch
es = Elasticsearch(hosts = [ES_HOST])

### Create the index
Delete the `nintendo` index if it already exists, then create it from scratch.

In [6]:
if es.indices.exists(index = INDEX_NAME):
    print("Deleting the '%s' index" % (INDEX_NAME))
    res = es.indices.delete(index = INDEX_NAME)
    print("Acknowledged: '%s'" % (res['acknowledged']))

request_body = {
    "settings" : {
        "number_of_shards": 1,
        "number_of_replicas": 0
    }
}
print("Creating the '%s' index!" % (INDEX_NAME))
res = es.indices.create(index = INDEX_NAME, body = request_body)
print("Acknowledged: '%s'" % (res['acknowledged']))

Deleting the 'nintendo' index
Acknowledged: 'True'
Creating the 'nintendo' index!
Acknowledged: 'True'


### Create the bulk data
Loop through each dataframe and build the list of actions and documents to send in the bulk request.

In [7]:
bulk_data = []

In [8]:
for _, row in character_df.iterrows():
    # One document per row, keyed by column name
    data_dict = row.to_dict()
    # Action line telling Elasticsearch where the document goes
    op_dict = {
        "index": {
            "_index": INDEX_NAME,
            "_type": TYPE_NAME,
            "_id": data_dict[ID_FIELD]
        }
    }
    bulk_data.append(op_dict)
    bulk_data.append(data_dict)

In [9]:
for _, row in world_df.iterrows():
    # One document per row, keyed by column name
    data_dict = row.to_dict()
    # Worlds go into the same index under their own type
    op_dict = {
        "index": {
            "_index": INDEX_NAME,
            "_type": 'world',
            "_id": data_dict[ID_FIELD]
        }
    }
    bulk_data.append(op_dict)
    bulk_data.append(data_dict)
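
The bulk API expects an alternating sequence: an action entry saying where a document goes, followed by the document itself. The sketch below shows that structure with plain dictionaries instead of the dataframes. `build_bulk_actions` and the two sample records are illustrative names and values, not part of the Elasticsearch client.

```python
import json

def build_bulk_actions(records, index_name, type_name, id_field="id"):
    """Return the alternating action/document list used for a bulk index request."""
    actions = []
    for record in records:
        # Action entry: target index, type and document id.
        actions.append({
            "index": {
                "_index": index_name,
                "_type": type_name,
                "_id": record[id_field],
            }
        })
        # The document itself follows its action entry.
        actions.append(record)
    return actions

# Two shortened sample records based on the character table.
characters = [
    {"id": 1, "name": "Mario", "color": "red"},
    {"id": 2, "name": "Luigi", "color": "green"},
]

bulk_actions = build_bulk_actions(characters, "nintendo", "character")

# The request body is newline-delimited JSON, one object per line.
ndjson = "\n".join(json.dumps(item) for item in bulk_actions) + "\n"
print(ndjson)
```

Every document therefore contributes two entries to the list, which is why the loops above call `append` twice per row.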

## Insert the data into the index

In [10]:
print("Bulk indexing...")
# refresh = True makes the documents searchable as soon as the call returns
res = es.bulk(index = INDEX_NAME, body = bulk_data, refresh = True)

Bulk indexing...
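
The call stores the response in `res` but never inspects it. A bulk request can succeed as a whole while individual documents fail, so it may be worth checking the `errors` flag and the per-item `status`. The sketch below assumes the usual bulk response layout (an `errors` boolean plus an `items` list keyed by action name). `failed_items` is a hypothetical helper and `sample_response` is a hand-written stand-in, not output from this notebook.

```python
def failed_items(bulk_response):
    """Return (id, status, error) for every item that did not succeed."""
    failures = []
    if not bulk_response.get("errors"):
        return failures
    for item in bulk_response.get("items", []):
        # Each item holds a single entry keyed by the action name, e.g. "index".
        for result in item.values():
            if result.get("status", 200) >= 300:
                failures.append(
                    (result.get("_id"), result.get("status"), result.get("error"))
                )
    return failures

# Hand-written stand-in response with hypothetical values.
sample_response = {
    "took": 7,
    "errors": True,
    "items": [
        {"index": {"_index": "nintendo", "_id": "1", "status": 201}},
        {"index": {"_index": "nintendo", "_id": "4", "status": 400,
                   "error": {"type": "mapper_parsing_exception"}}},
    ],
}

for doc_id, status, error in failed_items(sample_response):
    print("Document %s failed with status %s: %s" % (doc_id, status, error))
```

In the notebook this would be called as `failed_items(res)` right after `es.bulk`.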


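## Query using curl
A search without an index in the URL runs across every index on the node, so the hit count below covers more than the 13 documents indexed above.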

In [11]:
!curl -XGET 'http://localhost:9200/_search?pretty'

{
  "took" : 5,
  "timed_out" : false,
  "_shards" : {
    "total" : 2,
    "successful" : 2,
    "failed" : 0
  },
  "hits" : {
    "total" : 3295,
    "max_score" : 1.0,
    "hits" : [
      {
        "_index" : "nintendo",
        "_type" : "character",
        "_id" : "2",
        "_score" : 1.0,
        "_source" : {
          "id" : 2,
          "name" : "Luigi",
          "description" : "This is Luigi",
          "color" : "green",
          "occupation" : "plumber",
          "picture" : "https://upload.wikimedia.org/wikipedia/en/f/f1/LuigiNSMBW.png"
        }
      },
      {
        "_index" : "nintendo",
        "_type" : "character",
        "_id" : "1",
        "_score" : 1.0,
        "_source" : {
          "id" : 1,
          "name" : "Mario",
          "description" : "This is Mario",
          "color" : "red",
          "occupation" : "plumber",
          "picture" : "https://upload.wikimedia.org/wikipedia/en/9/99/MarioSMBW.png"
 

Search all worlds:
``` shell
curl -XGET 'http://localhost:9200/nintendo/world/_search?pretty'
```
Pagination:
``` shell
curl -XGET 'http://localhost:9200/nintendo/world/_search?size=2&from=2&pretty'
```
Specify the fields you want to be returned:
``` shell
curl -XGET 'http://localhost:9200/nintendo/character/_search?pretty&q=name:Luigi&fields=name,occupation'
```
Search for the word 'pipe':
``` shell
curl -XGET 'http://localhost:9200/nintendo/world/_search?pretty&q=pipe'
```
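
The query-string parameters used in these requests (`size`, `from`, `q`) can also be assembled from Python. This is a small sketch using only the standard library. `search_url` and `page_params` are hypothetical helpers, and the host is assumed to be the local node used throughout.

```python
from urllib.parse import urlencode

BASE_URL = "http://localhost:9200"  # assumed local node

def page_params(page, size):
    """Translate a zero-based page number into size/from parameters."""
    return {"size": size, "from": page * size}

def search_url(index, doc_type, **params):
    """Build a URI search URL such as /nintendo/world/_search?q=pipe."""
    url = "%s/%s/%s/_search" % (BASE_URL, index, doc_type)
    if params:
        url += "?" + urlencode(params)
    return url

# Second page of two worlds, matching size=2&from=2.
print(search_url("nintendo", "world", **page_params(page=1, size=2)))
# Free-text search for 'pipe'.
print(search_url("nintendo", "world", q="pipe"))
# Field-scoped query; urlencode escapes the colon as %3A.
print(search_url("nintendo", "character", q="name:Luigi"))
```

Computing `from` as `page * size` keeps the pagination arithmetic in one place instead of hand-editing each URL.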