<a href="https://colab.research.google.com/github/gmossy/AIsearch/blob/main/Using_Elasticsearch_from_Colab_using_Bonsai.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Using Elasticsearch from Colab using [Bonsai](https://bonsai.io)

This notebook comes from [this blog post](https://softwaredoug.com/blog/2022/09/11/using-elasticsearch-from-colab.html) and demonstrates how to use a free-tier Bonsai Elasticsearch cluster from a Colab notebook.

## Install Elasticsearch Client, get Retrotech Dataset

In [None]:
!pip install elasticsearch==7.10.1
![ ! -d 'retrotech' ] && git clone https://github.com/ai-powered-search/retrotech.git
! cd retrotech && git pull
! cd retrotech && tar -xvf products.tgz  && tar -xvf signals.tgz

Looking in indexes: https://pypi.org/simple, https://us-python.pkg.dev/colab-wheels/public/simple/
Collecting elasticsearch==7.10.1
  Downloading elasticsearch-7.10.1-py2.py3-none-any.whl (322 kB)
     |████████████████████████████████| 322 kB 4.1 MB/s
Installing collected packages: elasticsearch
Successfully installed elasticsearch-7.10.1
Cloning into 'retrotech'...
remote: Enumerating objects: 33, done.
remote: Total 33 (delta 0), reused 0 (delta 0), pack-reused 33
Unpacking objects: 100% (33/33), done.
Already up to date.
products.csv
signals.csv


## Paste in your "Full URL" from Bonsai

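See how to set this up in [this blog post](https://softwaredoug.com/blog/2022/09/11/using-elasticsearch-from-colab.html). The URL is read with `getpass` so that the credentials embedded in it are not echoed into the notebook output.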

In [None]:
import getpass
es_url = getpass.getpass("Paste in your Elasticsearch URL")

Paste in your Elasticsearch URL··········
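
Before connecting, it can help to sanity-check the pasted value without printing the secret. The sketch below is not from the original post. It assumes the "Full URL" follows the usual `https://user:password@host:port` shape, and both the `describe_es_url` helper and the example URL are hypothetical.

```python
from urllib.parse import urlsplit

def describe_es_url(url):
    """Summarize an Elasticsearch URL without revealing its password."""
    parts = urlsplit(url.strip())
    return {
        'scheme': parts.scheme,
        'host': parts.hostname,
        'port': parts.port,
        # Only report whether credentials are present, never their values
        'has_credentials': bool(parts.username and parts.password),
    }

# Hypothetical URL for illustration only; this is not a real cluster
example_url = 'https://user123:secret456@my-cluster-1234.us-east-1.bonsaisearch.net:443'
summary = describe_es_url(example_url)
print(summary)
```

In the notebook you would pass `es_url` instead of `example_url`. A missing scheme or `has_credentials` being `False` usually means that only part of the URL was pasted.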


## Set up Elasticsearch Client

In [None]:
from elasticsearch import Elasticsearch
es = Elasticsearch(es_url)
es.ping()

True

## Index retrotech data (downloaded in first cell)

In [None]:
import csv
from elasticsearch.helpers import bulk
from elasticsearch import RequestError

def retrotech_data():
  with open('retrotech/products.csv') as csv_file:
    products_reader = csv.DictReader(csv_file)
    for row in products_reader:
      yield {
        '_source': row,
        '_index': 'retrotech',
        '_id': row['upc']
      }

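try:
  # Create the index, then bulk-load every product row into it
  es.indices.create(index='retrotech')
  bulk(es, retrotech_data())
except RequestError as e:
  # Only swallow the "index already exists" error; re-raise anything else
  if e.error != 'resource_already_exists_exception':
    raise
  print("Not recreating index that already exists")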



Not recreating index that already exists
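
To see what the generator above hands to `bulk`, the following sketch builds the same kind of action dictionaries from a small in-memory CSV, so it runs without a cluster. The sample rows and the `bulk_actions` helper are hypothetical. Only the `upc` and `name` column names come from the notebook.

```python
import csv
import io

# Hypothetical two-row sample standing in for retrotech/products.csv
sample_csv = io.StringIO(
    "upc,name\n"
    "001,Sample Product A\n"
    "002,Sample Product B\n"
)

def bulk_actions(csv_file, index_name):
    """Yield one bulk action per CSV row, using the row's UPC as the document ID."""
    for row in csv.DictReader(csv_file):
        yield {'_index': index_name, '_id': row['upc'], '_source': row}

actions = list(bulk_actions(sample_csv, 'retrotech'))
for action in actions:
    print(action['_id'], action['_source']['name'])
```

Because the UPC becomes the `_id`, re-indexing a row with the same UPC overwrites the existing document rather than creating a duplicate.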


## Search!

In [None]:
hits = es.search(index='retrotech', body={'query': {'match': {'name': 'transformers'}}})
hits = hits['hits']['hits']
for hit in hits:
  print(hit['_source']['name'])


Transformers - DVD
Transformers - Original Soundtrack - CD
The Transformers: The Movie - DVD
Nintendo - Transformers 3 Cybertanium Case
Transformers Japanese Collection: Headmasters - DVD
Transformers: War for Cybertron - Windows
Transformers: Cybertron Adventures - Nintendo Wii
Transformers: The Game - PlayStation 3
Transformers/Transformers: Revenge of the Fallen: Two-Movie Mega Collection [2 Discs] - Widescreen - DVD
Transformers Prime: Darkness Rising - Fullscreen - DVD
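
The search returns ten products because Elasticsearch returns ten hits by default when no `size` is given. As a rough sketch, the query body and the result handling can be pulled into small helpers. `match_query`, `hit_names`, and the trimmed `fake_response` below are hypothetical names and data, used here so that the example runs without a cluster.

```python
def match_query(field, text, size=10):
    """Build a search body that matches `text` against a single field."""
    return {'size': size, 'query': {'match': {field: text}}}

def hit_names(response):
    """Pull the 'name' field out of each hit in a search response."""
    return [hit['_source']['name'] for hit in response['hits']['hits']]

# Hypothetical, trimmed response in the shape that es.search returns
fake_response = {
    'hits': {
        'hits': [
            {'_id': '1', '_score': 2.0, '_source': {'name': 'Transformers - DVD'}},
            {'_id': '2', '_score': 1.5, '_source': {'name': 'The Transformers: The Movie - DVD'}},
        ]
    }
}

body = match_query('name', 'transformers', size=5)
names = hit_names(fake_response)
print(body)
print(names)
```

Against a live cluster, the equivalent call would be `hit_names(es.search(index='retrotech', body=body))`.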


## Clean up when done

In [None]:
es.indices.delete('retrotech')

{'acknowledged': True}