# ChatGPT and Elasticsearch: <font color='orange'>The RAG Really Tied the App Together</font>


## This notebook will show you how to:
- Create an Elastic Serverless project
- Set up an Inference API
 - This will download and deploy ELSER for embedding inference
- Create an index template
 - This will use `semantic_text` which will auto-chunk and embed the body of text
- Use the Elastic Open Crawler to crawl the Elastic Search/Observability/Security Labs
<br>
<br>

## The [accompanying blog](https://www.elastic.co/search-labs/blog/app/search-labs/blog/chatgpt-elasticsearch-rag-enhancements) takes it further by showing you how to:
- Use Playground to test chat prompts and configurations
 - Then generate queries for our RAG app
- Use the queries from Playground to finish out a RAG Chatbot app
 - Python FastAPI backend with React frontend

In [None]:
!pip install elasticsearch

In [2]:
import requests
import getpass
from pprint import pprint
from elasticsearch import Elasticsearch
from elasticsearch.exceptions import ConnectionTimeout
from time import sleep
from IPython.display import clear_output

# Project Setup

## Enter your Cloud API Key

Generate your secret API key at https://cloud.elastic.co/account/keys

In [3]:
# Prompt the user for input while masking it for security
api_key = getpass.getpass("Enter your API key: ")

print("API key successfully entered!")

Enter your API key: ··········
API key successfully entered!


## Create Elasticsearch project
[Serverless API Docs](https://www.elastic.co/docs/api/doc/elastic-cloud-serverless/operation/operation-createelasticsearchproject#operation-createelasticsearchproject-body-application-json-optimized_for)

In [4]:
url = "https://api.elastic-cloud.com/api/v1/serverless/projects/elasticsearch"

project_data = {
    "name": "The RAG Really Tied the App Together",
    "region_id": "aws-us-east-1",
    "optimized_for": "vector",
}

auth_header = f"ApiKey {api_key}"
headers = {"Content-Type": "application/json", "Authorization": auth_header}

es_project = requests.post(url, json=project_data, headers=headers)

if 200 <= es_project.status_code < 300:
    es_project_keys = es_project.json()
    prg_name = es_project_keys["name"]
    print(f"Project {prg_name} creation started")

    # wait for the project to be initialized and ready
    project_id = es_project.json()["id"]
    print("Checking if project is created and ready")
    loop = 1
    while True:
        es_project_check = requests.get(url + f"/{project_id}/status", headers=headers)
        if es_project_check.json()["phase"] == "initialized":
            break
        else:
            clear_output(wait=True)
            print(
                f"Waiting for project to be ready. Current status:{es_project_check.json()['phase']} - Loop {loop} Sleeping 10 seconds"
            )
            sleep(10)
            loop += 1

    print("Project is ready")

else:
    print(es_project.text)

Waiting for project to be ready. Current status:initializing - Loop 7 Sleeping 10 seconds
Project is ready
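
The polling loop above keeps running until the project reports `initialized`, with no upper bound on the wait. A minimal sketch of a bounded variant follows. The names `wait_for_phase` and `fetch_phase` and the timeout values are hypothetical, chosen for illustration; they are not part of the Elastic Cloud API.

```python
import time


def wait_for_phase(fetch_phase, target="initialized", timeout_s=600,
                   interval_s=10, sleep=time.sleep, clock=time.monotonic):
    """Poll fetch_phase() until it returns target or timeout_s elapses.

    fetch_phase is any zero-argument callable that returns the current
    phase as a string. Returns the number of polls made, or raises
    TimeoutError if the target phase is not reached in time.
    """
    deadline = clock() + timeout_s
    polls = 0
    while True:
        polls += 1
        phase = fetch_phase()
        if phase == target:
            return polls
        if clock() >= deadline:
            raise TimeoutError(f"phase still {phase!r} after {timeout_s}s")
        sleep(interval_s)
```

In this notebook it could be called as `wait_for_phase(lambda: requests.get(url + f"/{project_id}/status", headers=headers).json()["phase"])`. The `sleep` and `clock` parameters exist only so the helper can be exercised without real waiting.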


## Create Elasticsearch client

In [5]:
es = Elasticsearch(
    es_project_keys["endpoints"]["elasticsearch"],
    basic_auth=(
        es_project_keys["credentials"]["username"],
        es_project_keys["credentials"]["password"],
    ),
)

## Project API Key
Create a [project-level API key](https://www.elastic.co/guide/en/elasticsearch/reference/current/security-api-create-api-key.html)

In [6]:
project_key_response = es.security.create_api_key(
    name="full_access_key",
    metadata={"description": "API key for full access"},
    expiration="14d",
)

project_api_key = project_key_response["encoded"]
print(f"{project_key_response['name']} has been created")

full_access_key has been created
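
The `encoded` value returned by the create API key call is the Base64 encoding of `id:api_key`, and it is the string that follows `ApiKey ` in the `Authorization` header. The sketch below illustrates that relationship with made-up values. The helper names `encode_api_key` and `api_key_header` are hypothetical and not part of the Elasticsearch client.

```python
import base64


def encode_api_key(key_id: str, secret: str) -> str:
    """Return the Base64 form of 'id:api_key' used for API key auth."""
    raw = f"{key_id}:{secret}".encode("utf-8")
    return base64.b64encode(raw).decode("ascii")


def api_key_header(encoded: str) -> dict:
    """Build the HTTP Authorization header for an already-encoded key."""
    return {"Authorization": f"ApiKey {encoded}"}


# Made-up values for illustration only; avoid printing a real key.
example = encode_api_key("my-key-id", "my-secret")
print(api_key_header(example))
```

Because `project_api_key` is already in this encoded form, it can be passed to clients and tools as-is.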


# Inference API and Index Setup

## Inference API
This will:
- Create an inference API endpoint
- Download ELSER model (if not already downloaded)
- Deploy ELSER model with `service_settings` configs

Note: this cell waits until ELSER has been downloaded and deployed.

In [7]:
model_config = {
    "service": "elser",
    "service_settings": {"num_allocations": 8, "num_threads": 1},
}

inference_id = "my-elser-model"

try:
    create_endpoint = es.inference.put_model(
        inference_id=inference_id, task_type="sparse_embedding", body=model_config
    )

except ConnectionTimeout:
    print(
        "Connection timed out. This can happen while waiting for the Inference model to fully deploy and start."
    )
finally:
    print("Waiting for inference model to be fully deployed")
    inf_info = es.inference.get_model(inference_id=inference_id)
    model_id = inf_info.body["endpoints"][0]["service_settings"]["model_id"]

    while True:
        try:
            model_stats = es.ml.get_trained_models_stats(model_id=model_id)
            routing_state = model_stats.body["trained_model_stats"][0][
                "deployment_stats"
            ]["nodes"][0]["routing_state"]["routing_state"]

            if routing_state == "started":
                print("Inference API created and Inference model is fully deployed.")
                break
            else:
                clear_output(wait=True)
                print("Waiting for inference model to be fully deployed")
                sleep(5)
        except (IndexError, KeyError):  # Handle missing data in the response
            clear_output(wait=True)
            print("Still waiting for model deployment...")
            sleep(5)

Waiting for inference model to be fully deployed
Inference API created and Inference model is fully deployed.
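
The wait loop above reads a deeply nested field from the trained model stats and treats a missing key or empty list as "not ready yet". A small sketch of that lookup pulled into a helper is shown below. The name `get_routing_state` is hypothetical, and the sample dictionary is hand-written to contain only the fields the loop reads, not a full API response.

```python
def get_routing_state(stats: dict):
    """Return the routing state of the first node of the first model.

    Returns None when the response does not yet contain deployment
    details, which is the case while the model is still starting.
    """
    try:
        first_model = stats["trained_model_stats"][0]
        first_node = first_model["deployment_stats"]["nodes"][0]
        return first_node["routing_state"]["routing_state"]
    except (IndexError, KeyError):
        return None


# Hand-written sample with only the fields read above.
sample = {
    "trained_model_stats": [
        {"deployment_stats": {"nodes": [
            {"routing_state": {"routing_state": "started"}}
        ]}}
    ]
}
print(get_routing_state(sample))
```

With a helper like this, the loop body reduces to comparing `get_routing_state(model_stats.body)` against `"started"`.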


## Create index template
The two key fields here are:
- body
 - Holds the body text of each page; `copy_to` copies it into the `semantic_body` field
- semantic_body
 - A `semantic_text` field that automatically handles chunking and generating embeddings

In [8]:
template_body = {
    "index_patterns": ["elastic-labs*"],
    "template": {
        "mappings": {
            "properties": {
                "body": {"type": "text", "copy_to": "semantic_body"},
                "semantic_body": {
                    "type": "semantic_text",
                    "inference_id": "my-elser-model",
                },
                "headings": {"type": "text"},
                "id": {"type": "keyword"},
                "meta_description": {"type": "text"},
                "title": {"type": "text"},
            }
        }
    },
}

template_resp = es.indices.put_index_template(name="labs_template", body=template_body)

print(template_resp.body)

{'acknowledged': True}
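
Once documents are indexed, a `semantic_text` field such as `semantic_body` can be searched with a `semantic` query. The sketch below only builds the request body. The helper name `build_semantic_query`, the default `size`, and the example question are assumptions for illustration, and the search call itself is left as a comment because it needs the live project and crawled data.

```python
def build_semantic_query(question: str, field: str = "semantic_body",
                         size: int = 3) -> dict:
    """Build a search body that runs a semantic query against one field."""
    return {
        "size": size,
        "_source": ["title"],  # keep the response small; title is mapped above
        "query": {"semantic": {"field": field, "query": question}},
    }


body = build_semantic_query("How do I deploy ELSER?")
print(body)

# Against the live project, after the crawl has finished:
# response = es.search(index="elastic-labs", body=body)
```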


# Crawl the docs

# Open Crawler
<font color='red'>This HAS TO BE RUN on a Linux/Mac/Windows host/VM, NOT in Colab</font>

The [blog details the steps](https://www.elastic.co/search-labs/blog/app/search-labs/blog/rag-ties-the-room-together#crawl-all-the-labs) below, as run on a MacBook.

You can also review the [Open Crawler setup](https://github.com/elastic/crawler?tab=readme-ov-file#setup).

## High level steps to configure and run crawler
*This HAS TO BE RUN on a Linux/Mac/Windows host/VM, NOT in Colab*

- Clone the repo
 - `git clone git@github.com:elastic/crawler.git`
- Build the Open Crawler Docker container
 - `docker build -t crawler-image . && docker run -i -d --name crawler crawler-image`
- Create a new config file
 - `vi config/elastic-labs.yml`
 - Run the _Generate Config_ cell below, then paste its output into the config file and save.
- Copy the new local config into the container
 - `docker cp config/elastic-labs.yml crawler:/app/config/elastic-labs.yml`
- Run the crawler
 - `docker exec -it crawler bin/crawler crawl config/elastic-labs.yml`

## Generate Config
Run the cell below to generate the YAML config file.

In [None]:
config = f"""
domains:
  - url: https://www.elastic.co
    seed_urls:
      - https://www.elastic.co/search-labs
      - https://www.elastic.co/observability-labs
      - https://www.elastic.co/security-labs
    crawl_rules:
      - policy: allow
        type: begins
        pattern: /search-labs
      - policy: allow
        type: begins
        pattern: /observability-labs
      - policy: allow
        type: begins
        pattern: /security-labs
      - policy: deny
        type: regex
        pattern: .*/author/.*
      - policy: deny
        type: regex
        pattern: .*

output_sink: elasticsearch
output_index: elastic-labs
max_crawl_depth: 25

elasticsearch:
  host: "{es_project.json()['endpoints']['elasticsearch']}"
  port: "443"
  api_key: "{project_api_key}"
  bulk_api.max_items: 10
"""

print(config)
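
If the notebook runs on the same machine as the crawler checkout, the `config` string can be written straight to disk instead of being pasted by hand. This is a minimal sketch under that assumption. The helper name `write_crawler_config` is hypothetical, and the default path simply mirrors the `config/elastic-labs.yml` file used in the steps above.

```python
from pathlib import Path


def write_crawler_config(config_text: str,
                         path: str = "config/elastic-labs.yml") -> Path:
    """Write the generated YAML to path, creating parent folders if needed."""
    target = Path(path)
    target.parent.mkdir(parents=True, exist_ok=True)
    # Trim the blank lines left by the triple-quoted string.
    target.write_text(config_text.strip() + "\n", encoding="utf-8")
    return target
```

A call such as `write_crawler_config(config)` would then replace the manual paste. The file contains the project API key, so keep it out of version control.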

## Confirm the docs have been crawled

First, look at the document count for each Labs site.

In [20]:
query = {
    "size": 0,
    "aggs": {"url_path_dir1": {"terms": {"field": "url_path_dir1.keyword"}}},
}

response = es.search(index="elastic-labs", body=query)
pprint(response.body)

{'_shards': {'failed': 0, 'skipped': 0, 'successful': 5, 'total': 5},
 'aggregations': {'url_path_dir1': {'buckets': [{'doc_count': 216,
                                                 'key': 'search-labs'},
                                                {'doc_count': 214,
                                                 'key': 'security-labs'},
                                                {'doc_count': 158,
                                                 'key': 'observability-labs'}],
                                    'doc_count_error_upper_bound': 0,
                                    'sum_other_doc_count': 0}},
 'hits': {'hits': [],
          'max_score': None,
          'total': {'relation': 'eq', 'value': 588}},
 'timed_out': False,
 'took': 6}
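
As a quick sanity check, the bucket counts can be flattened into a dictionary and compared with the total hit count. The sketch below uses the values from the output above, trimmed to the fields it needs. The helper name `bucket_counts` is hypothetical.

```python
def bucket_counts(response_body: dict, agg_name: str = "url_path_dir1") -> dict:
    """Map each terms-aggregation bucket key to its document count."""
    buckets = response_body["aggregations"][agg_name]["buckets"]
    return {bucket["key"]: bucket["doc_count"] for bucket in buckets}


# Values copied from the response shown above.
sample_body = {
    "aggregations": {"url_path_dir1": {"buckets": [
        {"doc_count": 216, "key": "search-labs"},
        {"doc_count": 214, "key": "security-labs"},
        {"doc_count": 158, "key": "observability-labs"},
    ]}},
    "hits": {"total": {"relation": "eq", "value": 588}},
}

counts = bucket_counts(sample_body)
print(counts)
# True when every document falls into one of the listed buckets.
print(sum(counts.values()) == sample_body["hits"]["total"]["value"])
```

In the notebook, `bucket_counts(response.body)` would do the same against the live response.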


Next, review a sample doc.

In [23]:
query = {"size": 1, "query": {"match": {"url_path_dir2": "blog"}}}

response = es.search(index="elastic-labs", body=query)
pprint(response.body)

Streaming output truncated to the last 5000 lines.
                                                                                    'autoscaling '
                                                                                    'metrics '
                                                                                    'API '
                                                                                    'exposes '
                                                                                    'a '
                                                                                    'list '
                                                                                    'of '
                                                                                    'ingestion '
                                                                                    'load '
                                                                                    'values, '
