# Lab 2

## Setup Environment
The following code loads the environment variables required to run this notebook.


In [None]:
FILE="Elastic_GenAI_Workshop_session_2"

! pip install -qqq git+https://github.com/elastic/notebook-workshop-loader.git@main
from notebookworkshoploader import loader
import os
from dotenv import load_dotenv

if os.path.isfile("../env"):
    load_dotenv("../env", override=True)
    print('Successfully loaded environment variables from local env file')
else:
    loader.load_remote_env(file=FILE, env_url="https://notebook-workshop-setup.elasticsa.co")

In [None]:
! pip install -q streamlit "openai<1.0.0" elasticsearch elastic-apm inquirer python-dotenv

import os, inquirer, re, secrets, requests
import streamlit as st
import openai

from IPython.display import display
from ipywidgets import widgets
from pprint import pprint
from elasticsearch import Elasticsearch
from string import Template
from requests.auth import HTTPBasicAuth

# If using the Elastic AI proxy, remove the API type and generate a unique user hash
if os.environ.get('ELASTIC_PROXY') == "True":

  # The proxy requires that OPENAI_API_TYPE is not set, so remove it if present
  os.environ.pop("OPENAI_API_TYPE", None)

  # Generate and display your unique user hash
  os.environ['USER_HASH'] = secrets.token_hex(nbytes=6)
  print(f"Your unique user hash is: {os.environ['USER_HASH']}")


## Create Elasticsearch client connection

In [None]:
if 'ELASTIC_CLOUD_ID' in os.environ:
  es = Elasticsearch(
    cloud_id=os.environ['ELASTIC_CLOUD_ID'],
    api_key=(os.environ['ELASTIC_APIKEY_ID'], os.environ['ELASTIC_APIKEY_SECRET']),
    request_timeout=30
  )
elif 'ELASTIC_URL' in os.environ:
  es = Elasticsearch(
    os.environ['ELASTIC_URL'],
    api_key=(os.environ['ELASTIC_APIKEY_ID'], os.environ['ELASTIC_APIKEY_SECRET']),
    request_timeout=30
  )
else:
  # Fail early so later cells do not run with an undefined `es` client
  raise RuntimeError("Set either ELASTIC_CLOUD_ID or ELASTIC_URL in the environment")

# Lab 2-1
- Chunking (simplified example)
- Generating embeddings
- Performing kNN search

## Chunking
A simplified example.

In [None]:
body_content = "Elastic Docs › Elasticsearch Guide [8.8] « Searchable snapshots Elasticsearch security principles » Secure the Elastic Stack edit The Elastic Stack is comprised of many moving parts. There are the Elasticsearch nodes that form the cluster, plus Logstash instances, Kibana instances, Beats agents, and clients all communicating with the cluster. To keep your cluster safe, adhere to the Elasticsearch security principles . The first principle is to run Elasticsearch with security enabled. Configuring security can be complicated, so we made it easy to start the Elastic Stack with security enabled and configured . For any new clusters, just start Elasticsearch to automatically enable password protection, secure internode communication with Transport Layer Security (TLS), and encrypt connections between Elasticsearch and Kibana. If you have an existing, unsecured cluster (or prefer to manage security on your own), you can manually enable and configure security to secure Elasticsearch clusters and any clients that communicate with your clusters. You can also implement additional security measures, such as role-based access control, IP filtering, and auditing. Enabling security protects Elasticsearch clusters by: Preventing unauthorized access with password protection, role-based access control, and IP filtering. Preserving the integrity of your data with SSL/TLS encryption. Maintaining an audit trail so you know who’s doing what to your cluster and the data it stores. If you plan to run Elasticsearch in a Federal Information Processing Standard (FIPS) 140-2 enabled JVM, see FIPS 140-2 . Preventing unauthorized access edit To prevent unauthorized access to your Elasticsearch cluster, you need a way to authenticate users in order to validate that a user is who they claim to be. For example, making sure that only the person named Kelsey Andorra can sign in as the user kandorra . 
The Elasticsearch security features provide a standalone authentication mechanism that enables you to quickly password-protect your cluster. If you’re already using LDAP, Active Directory, or PKI to manage users in your organization, the security features integrate with those systems to perform user authentication. In many cases, authenticating users isn’t enough. You also need a way to control what data users can access and what tasks they can perform. By enabling the Elasticsearch security features, you can authorize users by assigning access privileges to roles and assigning those roles to users. Using this role-based access control mechanism (RBAC), you can limit the user kandorra to only perform read operations on the events index restrict access to all other indices. The security features also enable you to restrict the nodes and clients that can connect to the cluster based on IP filters . You can block and allow specific IP addresses, subnets, or DNS domains to control network-level access to a cluster. See User authentication and User authorization . Preserving data integrity and confidentiality edit A critical part of security is keeping confidential data secured. Elasticsearch has built-in protections against accidental data loss and corruption. However, there’s nothing to stop deliberate tampering or data interception. The Elastic Stack security features use TLS to preserve the integrity of your data against tampering, while also providing confidentiality by encrypting communications to, from, and within the cluster. For even greater protection, you can increase the encryption strength . See Configure security for the Elastic Stack . Maintaining an audit trail edit Keeping a system secure takes vigilance. By using Elastic Stack security features to maintain an audit trail, you can easily see who is accessing your cluster and what they’re doing. You can configure the audit level, which accounts for the type of events that are logged. 
These events include failed authentication attempts, user access denied, node connection denied, and more. By analyzing access patterns and failed attempts to access your cluster, you can gain insights into attempted attacks and data breaches. Keeping an auditable log of the activity in your cluster can also help diagnose operational issues. See Enable audit logging . « Searchable snapshots Elasticsearch security principles » Most Popular Video Get Started with Elasticsearch Video Intro to Kibana Video ELK for Logs & Metrics"
print(f"The length of the paragraph is {len(body_content)} characters")

There are many ways to split text: on individual characters, on whitespace, at a fixed length, with a library such as LangChain, or with a tokenizer, to name a few.
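
As one illustration of the "fixed length" option, here is a minimal sketch that cuts text into fixed-size character windows with some overlap between neighbours. The function name `chunk_by_length` and the `chunk_size` and `overlap` values are hypothetical choices for this example, not part of the workshop code.

```python
def chunk_by_length(text, chunk_size=200, overlap=50):
    """Split text into windows of chunk_size characters that overlap by `overlap`."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be in [0, chunk_size)")
    step = chunk_size - overlap  # how far each window advances
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last window already reached the end of the text
    return chunks


# Illustrative input only; any string works.
sample = "Elasticsearch security protects your cluster. " * 10
pieces = chunk_by_length(sample, chunk_size=100, overlap=20)
print(len(pieces), [len(p) for p in pieces])
```

Overlap keeps a sentence that straddles a boundary at least partly visible in both neighbouring chunks, at the cost of storing some text twice.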

For this simple example we are going to split on dot+space (". "), which roughly splits the text into individual sentences.

In [None]:
# Split on a literal ". " (a raw string avoids an invalid escape sequence warning)
chunked_content = re.split(r'\. ', body_content)
chunk = chunked_content[0] # We'll use this later
print("There are now %s sentence chunks.\nThe first element is: '%s'" % (len(chunked_content), chunk))

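Embedding models do not read raw characters or whole words. A tokenizer first breaks the text into tokens, which are often subword pieces, and the model's limits are counted in those tokens. Splitting on whitespace only gives a rough approximation of what a real tokenizer produces.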

In [None]:
# Show the whitespace-separated "tokens" from the first chunk (a rough approximation)
chunk.split()

## Generate embeddings

We need to pass our text to an embedding model to generate vectors.

Models have predefined token limits, which restrict how much text (measured in tokens) can be processed into a vector.
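
One way to respect such a limit is to group sentences into chunks that stay under a budget. The sketch below counts whitespace-separated words as a rough stand-in for tokens; the real limit depends on the model's own tokenizer, and both `pack_sentences` and `max_words` are hypothetical names and values chosen for illustration.

```python
import re


def pack_sentences(text, max_words=100):
    """Group consecutive sentences into chunks of at most max_words words.

    A single sentence longer than max_words is kept whole in its own chunk.
    """
    # Split after each period that is followed by whitespace.
    sentences = [s.strip() for s in re.split(r"(?<=\.)\s+", text) if s.strip()]
    chunks, current, count = [], [], 0
    for sentence in sentences:
        n = len(sentence.split())
        if current and count + n > max_words:
            chunks.append(" ".join(current))  # budget exceeded: close the chunk
            current, count = [], 0
        current.append(sentence)
        count += n
    if current:
        chunks.append(" ".join(current))
    return chunks


demo_text = "One two three. Four five. Six seven eight nine. Ten."
demo_chunks = pack_sentences(demo_text, max_words=5)
print(demo_chunks)
```

Because word counts usually undercount tokens, a budget chosen this way should leave some headroom below the model's actual limit.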

In [None]:
es_model_id = 'sentence-transformers__msmarco-minilm-l-12-v3'

In [None]:
chunk

In [None]:
docs =  [
    {
      "text_field": chunk
    }
]

In [None]:
chunk_vector = es.ml.infer_trained_model(model_id=es_model_id, docs=docs)

In [None]:
vector_doc = {
  "_index": "chunker",
  "_id": "64837860d86b1293a9a5f620-0",
  "_source": {
      "chunk" : chunk,
      "chunk-vector" : chunk_vector['inference_results'][0]['predicted_value'],
      "body_content" : body_content
  }
}

pprint(vector_doc)
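
The document above has the `_index`, `_id`, and `_source` shape used for bulk actions in the Python client. As a sketch, the same shape could be built for every chunk of a page. The helper name `build_chunk_actions` is hypothetical, and the `<parent_id>-<n>` ID scheme is an assumption inferred from the example ID; the vectors below are made-up toy values rather than real model output.

```python
def build_chunk_actions(parent_id, chunks, vectors, body_content, index="chunker"):
    """Build one bulk action per chunk, with IDs of the form '<parent_id>-<n>'."""
    if len(chunks) != len(vectors):
        raise ValueError("chunks and vectors must have the same length")
    return [
        {
            "_index": index,
            "_id": f"{parent_id}-{n}",
            "_source": {
                "chunk": text,
                "chunk-vector": vector,
                "body_content": body_content,
            },
        }
        for n, (text, vector) in enumerate(zip(chunks, vectors))
    ]


# Toy data: two chunks with made-up 3-dimensional vectors.
actions = build_chunk_actions(
    "64837860d86b1293a9a5f620",
    ["First sentence", "Second sentence"],
    [[0.1, 0.2, 0.3], [0.4, 0.5, 0.6]],
    "First sentence. Second sentence",
)
print([a["_id"] for a in actions])
```

With a live cluster, a list like `actions` could then be passed to `elasticsearch.helpers.bulk(es, actions)`; that step is left out here so the sketch runs without a connection.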

## Exceeding the model's token limit

Let's take a look at what happens when we exceed the model's token limit.

In [None]:
full_paragraph =  [
    {
      "text_field": body_content
    }
]

In [None]:
chunk_vector = es.ml.infer_trained_model(model_id=es_model_id, docs=full_paragraph)
print("When the token size exceeds the model's max token limit, the value of `is_truncated` will return True")
print('We exceeded the model token limit: %s' % chunk_vector['inference_results'][0]['is_truncated'])

We see that the model still processed the tokens up to its limit, then simply truncated (ignored) any tokens beyond it.

Elasticsearch returns an `is_truncated: True` key-value pair to let you know that the returned embedding does not cover the full text.
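
When embedding many documents, it can help to flag the truncated ones so they can be re-chunked. The sketch below is a hypothetical helper (`truncated_indexes`) that works on a plain dict with the same `inference_results` shape used above. The mock response is hand-written, and the use of `.get` assumes the key may be absent for inputs that were not truncated.

```python
def truncated_indexes(inference_response):
    """Return the positions of results whose input was truncated by the model."""
    return [
        i
        for i, result in enumerate(inference_response.get("inference_results", []))
        if result.get("is_truncated", False)
    ]


# Hand-written stand-in for a response; vectors shortened for readability.
mock_response = {
    "inference_results": [
        {"predicted_value": [0.1, 0.2]},
        {"predicted_value": [0.3, 0.4], "is_truncated": True},
    ]
}
print(truncated_indexes(mock_response))
```

The returned positions line up with the order of the `docs` list sent to the model, so they can be used to pick out the inputs that need smaller chunks.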

## Querying with hybrid vector search

We will run through an example that combines approximate kNN vector search with BM25 text search, merging the two result sets with Reciprocal Rank Fusion (RRF).

This is the type of query that will power the UI we will use in Lab 2-2.
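
To build intuition for how RRF merges two result lists, here is a minimal pure-Python sketch of the general technique: each document scores the sum of `1 / (rank_constant + rank)` over the lists in which it appears. The function name, the document IDs, and the two hit lists are made up for illustration, and `rank_constant=60` is assumed here as a commonly used default. Elasticsearch performs this fusion server-side, so this code is not what the query itself runs.

```python
def reciprocal_rank_fusion(rankings, rank_constant=60):
    """Fuse ranked lists of document IDs; each list is ordered best first."""
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (rank_constant + rank)
    # Highest fused score first.
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical result lists from a BM25 query and a kNN query.
bm25_hits = ["doc_a", "doc_b", "doc_c"]
knn_hits = ["doc_c", "doc_a", "doc_d"]
fused = reciprocal_rank_fusion([bm25_hits, knn_hits])
for doc_id, score in fused:
    print(f"{doc_id}: {score:.5f}")
```

Because only ranks are used, the BM25 scores and vector similarities never need to be put on a common scale, which is the main appeal of RRF for hybrid search.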

In [None]:
def search_with_knn(query_text, es):
    # Elasticsearch query (BM25) and kNN configuration for rrf hybrid search

    query = {
        "bool": {
            "must": [{
                "match": {
                    "body_content": {
                        "query": query_text
                    }
                }
            }],
            "filter": [{
              "term": {
                "url_path_dir3": "elasticsearch"
              }
            }]
        }
    }

    knn = [
    {
      "field": "chunk-vector",
      "k": 10,
      "num_candidates": 10,
      "filter": {
        "bool": {
          "filter": [
            {
              "range": {
                "chunklength": {
                  "gte": 0
                }
              }
            },
            {
              "term": {
                "url_path_dir3": "elasticsearch"
              }
            }
          ]
        }
      },
      "query_vector_builder": {
        "text_embedding": {
          "model_id": "sentence-transformers__msmarco-minilm-l-12-v3",
          "model_text": query_text
        }
      }
    }
  ]

    rank = {
        "rrf": {}
    }

    fields = [
        "title",
        "url",
        "body_content"
    ]

    resp = es.search(index=os.environ['ELASTIC_INDEX_DOCS'],
                     query=query,
                     knn=knn,
                     rank=rank,
                     fields=fields,
                     size=1,
                     source=False)

    return resp

query = 'How do I start Elastic with Security Enabled?'
response = search_with_knn(query, es)
pprint(response['hits'])

# Lab 2-2
Retrieval-Augmented Generation (RAG)

## Verify our Elasticsearch connection is still active
If you receive an error, rerun the cells in the Setup section above.

In [None]:
print(es.info()['tagline']) # should print the cluster tagline: "You Know, for Search"

## Main Script
We've placed the sample code in the streamlit folder of this repository.

Take a look at the code in [streamlit/app.py](../streamlit/app.py).

## Streamlit
To start the Streamlit app, use the `streamlit run` command from the streamlit folder. You can do this either from this notebook or from the Visual Studio Code terminal provided in GitHub Codespaces.

In [None]:
! cd ../streamlit && streamlit run app.py