====================================================================================================================================

<h1><font color="blue">Retrieval Augmented Generation with OCI OpenSearch and GenAI service</font></h1>

====================================================================================================================================

![Alt text](https://www.oracle.com/a/ocom/img/rc10-oci-opensearch.png)

## **Perform these steps before running this notebook**
- Create a VCN with a private subnet. Make sure a NAT gateway is attached.
- Add ingress rules to the security list: TCP ports 9200 and 5601, source 0.0.0.0/0
- Create the OpenSearch cluster in the public subnet
- Create the OCI Data Science notebook session in the private subnet
- Add the config file (API Key) and private key to this notebook in the .oci directory
- Create an object storage bucket

In [None]:
import oci
import os
import json
from urllib.request import urlopen 
import subprocess

## **Add your credentials here**

Most steps run automatically once you add your credentials below. A few steps require manual action; these are explained where they occur in the notebook.

In [None]:
## The OpenSearch API endpoint, including ":9200"
api_endpoint = ""

## Your user name in OpenSearch
username = ""

## Your password
password = ""

## Object storage bucket and namespace
bucket_name = ""
namespace = ""

#The compartment you are working in
compartment_id = ""

# **1. Register and Store a Custom Embedding Model in OpenSearch**

## **1.1 Connect to OpenSearch Cluster and perform health check**

In [None]:
output = os.popen(f"curl -XGET '{api_endpoint}/_cluster/health?pretty' -k -u {username}:{password}").read()
json_output = json.loads(output)

print(f"Status of the cluster is {json_output['status']}")
json_output
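
The `status` field returned by the health endpoint is `green`, `yellow`, or `red`. As a minimal sketch, the hypothetical helper below (not part of the original notebook) treats `green` and `yellow` as usable, since a `yellow` cluster still has all primary shards allocated; the sample dictionary is illustrative only.

```python
def cluster_is_usable(health):
    """Return True when the reported cluster status can serve requests."""
    # green: all shards allocated
    # yellow: all primary shards allocated, some replicas are not
    # red: at least one primary shard is unallocated
    return health.get("status") in ("green", "yellow")


# Illustrative response, not real cluster output
sample_health = {"cluster_name": "demo", "status": "yellow", "number_of_nodes": 1}
print(cluster_is_usable(sample_health))
```

In the notebook, the same check could be applied to `json_output` before continuing with the remaining steps.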

## **1.2 Change the OpenSearch Cluster settings**

In [None]:
output = subprocess.check_output("""curl -XPUT '%s/_cluster/settings' -u %s:%s -H 'Content-Type: application/json' -d'
{
  "persistent": {
    "plugins": {
      "ml_commons": {
        "only_run_on_ml_node": "false",
        "model_access_control_enabled": "true",
        "native_memory_threshold": "99",
        "rag_pipeline_feature_enabled": "true",
        "memory_feature_enabled": "true",
        "allow_registering_model_via_local_file": "true",
        "allow_registering_model_via_url": "true",
        "model_auto_redeploy.enable": "true",
        "model_auto_redeploy.lifetime_retry_times": 10
      }
    }
  }
}'
"""%(api_endpoint, username, password), shell=True, text=True)

json_output = json.loads(output)
json_output
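
The cells in this notebook call `curl` through the shell, which means the JSON body has to survive shell quoting. As an alternative sketch, the request can be built with the Python standard library. The helper name `build_request` and the host `example-host` are assumptions for illustration; the path and settings key come from the cell above.

```python
import base64
import json
import urllib.request


def build_request(api_endpoint, path, username, password, body=None, method="GET"):
    """Build (but do not send) an authenticated JSON request to OpenSearch."""
    # HTTP basic authentication: base64 of "username:password"
    token = base64.b64encode(f"{username}:{password}".encode("utf-8")).decode("ascii")
    data = json.dumps(body).encode("utf-8") if body is not None else None
    request = urllib.request.Request(
        url=f"{api_endpoint.rstrip('/')}{path}",
        data=data,
        method=method,
    )
    request.add_header("Authorization", f"Basic {token}")
    request.add_header("Content-Type", "application/json")
    return request


# Only one of the settings is shown here to keep the example short
settings = {"persistent": {"plugins": {"ml_commons": {"only_run_on_ml_node": "false"}}}}
req = build_request("https://example-host:9200", "/_cluster/settings",
                    "user", "secret", settings, method="PUT")
print(req.get_method(), req.full_url)

# To actually send the request:
# json_output = json.loads(urllib.request.urlopen(req).read())
```

Because the body is produced by `json.dumps`, quotes inside values are escaped automatically and the credentials never pass through a shell command line.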

## **1.3 Review the settings**

In [None]:
output = os.popen(f"curl -XGET '{api_endpoint}/_cluster/settings?include_defaults=true' -u {username}:{password}").read()

json_output = json.loads(output)
json_output

## **1.4. Download the custom embedding model and load into Bucket**

**1. The cell below downloads the model from OpenSearch; the next step adds the model to the Object Storage bucket**

You can find the supported models here: https://opensearch.org/docs/latest/ml-commons-plugin/pretrained-models/#supported-pretrained-models

In [None]:
!wget -O st_all-MiniLM-torch.zip https://artifacts.opensearch.org/models/ml-models/huggingface/sentence-transformers/all-MiniLM-L12-v2/1.0.1/torch_script/sentence-transformers_all-MiniLM-L12-v2-1.0.1-torch_script.zip

**2. Push the model zip file to the bucket**

In [None]:
## when you are using a different model, change the below name to the correct .zip file
filename = "st_all-MiniLM-torch.zip"

In [None]:
#authentication using config file
config = oci.config.from_file()

object_storage_client = oci.object_storage.ObjectStorageClient(config)

#upload .zip model to bucket; open the file so its contents (not the file name string) are uploaded
with open(filename, "rb") as model_file:
    put_object_response = object_storage_client.put_object(
        namespace_name=namespace,
        bucket_name=bucket_name,
        object_name=filename,
        put_object_body=model_file)

print(put_object_response.headers) 
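
The registration requests later in this notebook include a `model_content_hash_value`. To my understanding, OpenSearch expects this to be the SHA-256 hash of the model .zip file. The sketch below computes that hash with the standard library; the function name `sha256_of_file` is an assumption, and the demo hashes a small temporary file rather than a real model.

```python
import hashlib
import os
import tempfile


def sha256_of_file(path, chunk_size=1024 * 1024):
    """Return the hex SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # iter() with a sentinel stops when read() returns empty bytes
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


# Demo on a throwaway file; in the notebook you would pass `filename`
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"abc")
demo_hash = sha256_of_file(tmp.name)
os.remove(tmp.name)
print(demo_hash)
```

Running `sha256_of_file(filename)` on the downloaded .zip gives the value to compare against, or paste into, the registration request.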

## **1.5. Create a Model Group**

In [None]:
output = subprocess.check_output("""curl -XPOST '%s/_plugins/_ml/model_groups/_register' -u %s:%s -H 'Content-Type: application/json' -d'
{
    "name": "model_group_1",
    "description": "A public group",
    "access_mode": "public"
}'
"""%(api_endpoint, username, password), shell=True, text=True)

json_output = json.loads(output)

#define the model_group_id
model_group_id = json_output['model_group_id']
json_output

In [None]:
# Example response: {'model_group_id': 'Wt8e3Y8Br9Xc3t7fqub4', 'status': 'CREATED'}

## **1.6. Register the model to the OpenSearch Cluster**

**1. Load the associated model config file from the model artifact URL**

In [None]:
## Define the full path here
the_config_json = "https://artifacts.opensearch.org/models/ml-models/huggingface/sentence-transformers/all-MiniLM-L12-v2/1.0.1/torch_script/config.json"
  
#get the model config file
response = urlopen(the_config_json) 
model_config = json.loads(response.read()) 
model_config

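**2. Copy the values from the above response into the `model_config` section of the registration request below.** The example requests below contain the values for the all-MiniLM-L6-v2 model; replace them when you register a different model.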
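
Instead of copying values by hand, the registration body can be assembled in Python. This is a sketch under assumptions: `build_register_body` is a hypothetical helper, and `config_section` is assumed to be a dictionary with the same four keys that appear under `model_config` in the registration requests in this notebook. All values in the demo are placeholders.

```python
import json


def build_register_body(name, model_group_id, content_hash, config_section, url):
    """Assemble the JSON body for the model registration request."""
    all_config = config_section["all_config"]
    # The API expects all_config as a JSON *string*, so serialize dicts
    if not isinstance(all_config, str):
        all_config = json.dumps(all_config, separators=(",", ":"))
    return {
        "name": name,
        "version": "1.0.0",
        "model_format": "TORCH_SCRIPT",
        "model_group_id": model_group_id,
        "model_content_hash_value": content_hash,
        "model_config": {
            "model_type": config_section["model_type"],
            "embedding_dimension": config_section["embedding_dimension"],
            "framework_type": config_section["framework_type"],
            "all_config": all_config,
        },
        "url": url,
    }


# Placeholder values for illustration only
sample_section = {
    "model_type": "bert",
    "embedding_dimension": 384,
    "framework_type": "sentence_transformers",
    "all_config": {"hidden_size": 384, "model_type": "bert"},
}
body = build_register_body("my-model", "<model_group_id>", "<sha256>",
                           sample_section, "<model_zip_url>")
print(json.dumps(body, indent=2))
```

The resulting dictionary can be serialized with `json.dumps` and sent as the request body, which avoids hand-escaping the nested `all_config` string.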

## **1.7 - Version 1: Using Object Storage Bucket**

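**Before running, change the following in the command below:**
- Add your API endpoint
- Add your username and password
- Add the model group id
- Replace the Object Storage URL with the URL of your uploaded .zip file
- Run the command in a terminal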

In [None]:
curl -XPOST '<api_endpoint>/_plugins/_ml/models/_register' -u <username>:<password> -H 'Content-Type: application/json' -d'
{
    "name": "all-MiniLM-L6-v2_test",
    "version": "1.0.0",
    "description": "All miniLm-L6-v2 model",
    "model_format": "TORCH_SCRIPT",
    "model_group_id": "<model_group_id>",
    "model_content_hash_value": "c15f0d2e62d872be5b5bc6c84d2e0f4921541e29fefbef51d59cc10a8ae30e0f",
    "model_config": {
        "model_type": "bert",
        "embedding_dimension": 384,
        "framework_type": "sentence_transformers",
       "all_config": "{\"_name_or_path\":\"nreimers/MiniLM-L6-H384-uncased\",\"architectures\":[\"BertModel\"],\"attention_probs_dropout_prob\":0.1,\"gradient_checkpointing\":false,\"hidden_act\":\"gelu\",\"hidden_dropout_prob\":0.1,\"hidden_size\":384,\"initializer_range\":0.02,\"intermediate_size\":1536,\"layer_norm_eps\":1e-12,\"max_position_embeddings\":512,\"model_type\":\"bert\",\"num_attention_heads\":12,\"num_hidden_layers\":6,\"pad_token_id\":0,\"position_embedding_type\":\"absolute\",\"transformers_version\":\"4.8.2\",\"type_vocab_size\":2,\"use_cache\":true,\"vocab_size\":30522}"
    },
    "url_connector": {
        "protocol": "oci_sigv1",
        "parameters": {
            "auth_type": "resource_principal"
        },
        "actions": [
            {
                "method": "GET",
                "action_type": "DOWNLOAD",
                "url": "https://fro8fl9kuqli.objectstorage.us-ashburn-1.oci.customer-oci.com/n/fro8fl9kuqli/b/oci_opensearch_custom_model/o/st_all-MiniLM-torch.zip"
            }
        ]
    }
}'

In [None]:
# Example response {"task_id":"Md_MzY8Br9Xc3t7fn-YC","status":"CREATED"}

## **1.7 - Version 2: Using OpenSearch directory**

In [None]:
curl -XPOST '<api_endpoint>/_plugins/_ml/models/_register' -u <username>:<password> -H 'Content-Type: application/json' -d'
{
    "name": "all-MiniLM-L6-v2_open2",
    "version": "1.0.0",
    "description": "test model",
    "model_format": "TORCH_SCRIPT",
    "model_group_id": "<model_group_id>",
    "model_content_hash_value": "c15f0d2e62d872be5b5bc6c84d2e0f4921541e29fefbef51d59cc10a8ae30e0f",
    "model_config": {
        "model_type": "bert",
        "embedding_dimension": 384,
        "framework_type": "sentence_transformers",
       "all_config": "{\"_name_or_path\":\"nreimers/MiniLM-L6-H384-uncased\",\"architectures\":[\"BertModel\"],\"attention_probs_dropout_prob\":0.1,\"gradient_checkpointing\":false,\"hidden_act\":\"gelu\",\"hidden_dropout_prob\":0.1,\"hidden_size\":384,\"initializer_range\":0.02,\"intermediate_size\":1536,\"layer_norm_eps\":1e-12,\"max_position_embeddings\":512,\"model_type\":\"bert\",\"num_attention_heads\":12,\"num_hidden_layers\":6,\"pad_token_id\":0,\"position_embedding_type\":\"absolute\",\"transformers_version\":\"4.8.2\",\"type_vocab_size\":2,\"use_cache\":true,\"vocab_size\":30522}"
    },
    "url": "https://artifacts.opensearch.org/models/ml-models/huggingface/sentence-transformers/all-MiniLM-L6-v2/1.0.1/torch_script/sentence-transformers_all-MiniLM-L6-v2-1.0.1-torch_script.zip"
}'

In [None]:
# Example response: {"task_id":"Lt-1zY8Br9Xc3t7f0-ZJ","status":"CREATED"}
# Example response:{"task_id":"Mt_NzY8Br9Xc3t7fYuZN","status":"CREATED"}

## **1.8 Check the status of the model registration**

**Before running the cells below:**
- Add the task_id returned by the registration request

In [None]:
task_id = ""

In [None]:
output = os.popen(f"curl -XGET '{api_endpoint}/_plugins/_ml/tasks/{task_id}' -u {username}:{password}").read()

json_output = json.loads(output)

#get the model id from the response
model_id = json_output['model_id']
json_output
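
Registration runs asynchronously, so the task may not be finished the first time you query it, in which case `model_id` can be missing from the response. A minimal polling sketch is shown below. It assumes the task response carries a `state` field with values such as `CREATED`, `RUNNING`, `COMPLETED`, and `FAILED`; the helper `wait_for_task` and its parameters are hypothetical, and the demo uses canned responses instead of a real cluster.

```python
import time


def wait_for_task(fetch_task, poll_seconds=5.0, max_polls=60, sleep=time.sleep):
    """Call fetch_task() until the task completes, fails, or the budget runs out."""
    for _ in range(max_polls):
        task = fetch_task()
        state = task.get("state")
        if state == "COMPLETED":
            return task
        if state == "FAILED":
            raise RuntimeError(f"Task failed: {task.get('error', 'no error message')}")
        sleep(poll_seconds)
    raise TimeoutError("Task did not complete within the polling budget")


# Canned responses standing in for repeated calls to the tasks endpoint
fake_responses = iter([
    {"state": "CREATED"},
    {"state": "RUNNING"},
    {"state": "COMPLETED", "model_id": "example-model-id"},
])
finished = wait_for_task(lambda: next(fake_responses), sleep=lambda seconds: None)
print(finished["model_id"])
```

In the notebook, `fetch_task` would wrap the `curl` call above and return the parsed JSON.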

In [None]:
# Example response = {'model_id': 'FqTNzY8B7gfZNJU_Yr1w'}

## **1.9 Deploy the model using the model_id**

In [None]:
output = os.popen(f"curl -XPOST '{api_endpoint}/_plugins/_ml/models/{model_id}/_deploy' -H 'Content-Type: application/json' -u {username}:{password}").read()

json_output = json.loads(output)
json_output

## **1.10. Invoke the custom embedding model / Testing**

In [None]:
output = subprocess.check_output("""curl -XPOST '%s/_plugins/_ml/_predict/text_embedding/%s' -u %s:%s -H 'Content-Type: application/json' -d'
{
  "text_docs":["hello world"]
}'
"""%(api_endpoint, model_id, username, password), shell=True, text=True)

json_output = json.loads(output)
json_output
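
A quick sanity check is to confirm that the returned vector has the expected 384 dimensions. The sketch below assumes the predict response nests the vector under `inference_results[0].output[0].data`; verify this against your own `json_output` before relying on it. The helper name and the sample response are illustrative.

```python
def extract_embedding(response):
    """Return the first embedding vector from a text_embedding predict response."""
    # Assumed layout: {"inference_results": [{"output": [{"data": [...]}]}]}
    return response["inference_results"][0]["output"][0]["data"]


# Synthetic response with the assumed layout
sample_response = {
    "inference_results": [
        {"output": [{"name": "sentence_embedding", "shape": [384], "data": [0.0] * 384}]}
    ]
}
embedding = extract_embedding(sample_response)
print(len(embedding))
```

The length should match the `embedding_dimension` used at registration and the `dimension` of the `knn_vector` field created later.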

# **2. Create full Retrieval Augmented Generation Pipeline**

## **2.1 Connector to Cohere in GenAI Service**

**Before running, change the following in the command below:**
- Change the compartment_ocid
- Change the username
- Change the password
- Change the API endpoint
- Run the command in a terminal and make note of the connector_id

In [None]:
curl -XPOST '<api_endpoint>/_plugins/_ml/connectors/_create' -u <username>:<password> -H 'Content-Type: application/json' -d'
{
     "name": "llm_cohere_command",
     "description": "llm_cohere_command",
     "version": 2,
     "protocol": "oci_sigv1",
     "parameters": {
         "endpoint": "inference.generativeai.us-chicago-1.oci.oraclecloud.com",
         "auth_type": "resource_principal"
     },
     "credential": {
     },
     "actions": [
         {
             "action_type": "predict",
             "method": "POST",
             "url": "https://${parameters.endpoint}/20231130/actions/generateText",
             "request_body": "{\"compartmentId\":\"<compartment_ocid>\",\"servingMode\":{\"modelId\":\"cohere.command\",\"servingType\":\"ON_DEMAND\"},\"inferenceRequest\":{\"prompt\":\"${parameters.prompt}\",\"maxTokens\":600,\"temperature\":1,\"frequencyPenalty\":0,\"presencePenalty\":0,\"topP\":0.75,\"topK\":0,\"returnLikelihoods\":\"GENERATION\",\"isStream\":false ,\"stopSequences\":[],\"runtimeType\":\"COHERE\"}}"
         }
     ]
 }'
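
The `request_body` in the connector is a JSON document stored as a string inside another JSON document, which is why it is full of escaped quotes. The sketch below generates that string with `json.dumps` so the escaping does not have to be written by hand. The function name is hypothetical; the field values are copied from the command above.

```python
import json


def build_generate_text_body(compartment_id):
    """Serialize the generateText request body used by the connector action."""
    body = {
        "compartmentId": compartment_id,
        "servingMode": {"modelId": "cohere.command", "servingType": "ON_DEMAND"},
        "inferenceRequest": {
            # Placeholder that the connector substitutes at request time
            "prompt": "${parameters.prompt}",
            "maxTokens": 600,
            "temperature": 1,
            "frequencyPenalty": 0,
            "presencePenalty": 0,
            "topP": 0.75,
            "topK": 0,
            "returnLikelihoods": "GENERATION",
            "isStream": False,
            "stopSequences": [],
            "runtimeType": "COHERE",
        },
    }
    return json.dumps(body, separators=(",", ":"))


connector_action = {
    "action_type": "predict",
    "method": "POST",
    "url": "https://${parameters.endpoint}/20231130/actions/generateText",
    "request_body": build_generate_text_body("<compartment_ocid>"),
}
# Serializing the outer document escapes the inner quotes automatically
print(json.dumps(connector_action, indent=2))
```

Printing the outer document shows the same backslash-escaped `request_body` as in the hand-written command.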

In [None]:
# Example response = {"connector_id":"N99Ozo8Br9Xc3t7fJeYF"}
# Example response ={"connector_id":"X9833Y8Br9Xc3t7fu-aO"}

## **2.2 Register the Cohere model**

In [None]:
connector_id = ""

In [None]:
output = subprocess.check_output("""curl -XPOST '%s/_plugins/_ml/models/_register' -u %s:%s -H 'Content-Type: application/json' -d'
{
   "name": "oci-genai-conversation",
   "function_name": "remote",
   "model_group_id": "%s",
   "description": "oci-gen-ai-cohere",
   "connector_id": "%s"
 }'
"""%(api_endpoint, username, password, model_group_id, connector_id), shell=True, text=True)

json_output = json.loads(output)

#get the model_id for the cohere model
model_id_cohere = json_output['model_id']

json_output

In [None]:
# Example response = {"task_id":"ON9Rzo8Br9Xc3t7fnuZo","status":"CREATED","model_id":"Od9Rzo8Br9Xc3t7fnuaS"}
# Example response = {'task_id': 'YN853Y8Br9Xc3t7fP-ZE', 'status': 'CREATED', 'model_id': 'Yd853Y8Br9Xc3t7fP-Zf'}

## **2.3 Deploy the Cohere model as API connection**

In [None]:
output = os.popen(f"curl -XPOST '{api_endpoint}/_plugins/_ml/models/{model_id_cohere}/_deploy' -H 'Content-Type: application/json' -u {username}:{password}").read()

json_output = json.loads(output)
json_output

In [None]:
# {'task_id': 'Ot9Wzo8Br9Xc3t7f_eYn', 'task_type': 'DEPLOY_MODEL', 'status': 'COMPLETED'}
# {'task_id': 'Yt863Y8Br9Xc3t7fL-Yj', 'task_type': 'DEPLOY_MODEL', 'status': 'COMPLETED'}

## **2.4 Create the RAG Pipeline**

The cell below creates a RAG pipeline that contains only the Cohere model. Later steps add:
- Convert input text to embeddings using the custom embedding model
- Perform kNN to fetch most similar texts
- Add conversational memory
- Invoke GenAI/Cohere

In [None]:
output = subprocess.check_output("""curl -XPUT '%s/_search/pipeline/demo_rag_pipeline' -u %s:%s -H 'Content-Type: application/json' -d'
{
  "response_processors": [
    {
      "retrieval_augmented_generation": {
        "tag": "genai_conversational_search_demo",
        "description": "Demo pipeline for conversational search Using Genai Connector",
        "model_id": "%s",
        "context_field_list": ["text"],
        "system_prompt":"helpful assistant",
        "user_instructions":"generate concise answer"
      }
    }
  ]
}'
"""%(api_endpoint, username, password, model_id_cohere), shell=True, text=True)

json_output = json.loads(output)
json_output

## **2.5. Create the ingest pipeline for the kNN index - using model_id (Custom Embedding model)**

In [None]:
output = subprocess.check_output("""curl -XPUT '%s/_ingest/pipeline/minil12-test-pipeline' -u %s:%s -H 'Content-Type: application/json' -d'
{
  "description": "pipeline for RAG demo index",
  "processors" : [
    {
      "text_embedding": {
        "model_id": "%s",
        "field_map": {
           "text": "passage_embedding"
        }
      }
    }
  ]
}'
"""%(api_endpoint, username, password, model_id), shell=True, text=True)

json_output = json.loads(output)
json_output

## **2.6. Create the index (table)**

In [None]:
output = subprocess.check_output("""curl -XPUT '%s/conversation-demo-index-knn-new' -u %s:%s -H 'Content-Type: application/json' -d'
{
    "settings": {
        "index.knn": true,
        "default_pipeline": "minil12-test-pipeline"
    },
    "mappings": {
        "properties": {
            "passage_embedding": {
                "type": "knn_vector",
                "dimension": 384,
                "method": {
                    "name":"hnsw",
                    "engine":"lucene",
                    "space_type": "l2",
                    "parameters":{
                        "m":512,
                        "ef_construction": 245
                    }
                }
            },
            "text": {
                "type": "text"
            }
        }
    }
}'
"""%(api_endpoint, username, password), shell=True, text=True)

json_output = json.loads(output)
json_output
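
The index above uses `"space_type": "l2"`, meaning similarity is based on Euclidean distance between vectors. To illustrate what a k-nearest-neighbour lookup computes, here is a brute-force sketch in plain Python. It is not the HNSW algorithm the index uses (HNSW is an approximate graph-based search); the toy 2-dimensional vectors and function names are made up for illustration.

```python
import math


def l2_distance(a, b):
    """Euclidean (L2) distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def nearest_neighbors(query, vectors, k):
    """Return the ids of the k vectors closest to the query (exact search)."""
    ranked = sorted(vectors.items(), key=lambda item: l2_distance(query, item[1]))
    return [doc_id for doc_id, _ in ranked[:k]]


# Toy 2-d vectors; real embeddings in this notebook have 384 dimensions
vectors = {"doc1": [0.0, 0.0], "doc2": [1.0, 1.0], "doc3": [3.0, 4.0]}
print(nearest_neighbors([0.9, 1.2], vectors, k=2))  # ['doc2', 'doc1']
```

An exact search like this compares the query with every document, which is why large indexes rely on approximate structures such as HNSW instead.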

In [None]:
# {"acknowledged":true,"shards_acknowledged":true,"index":"conversation-demo-index-knn-new"}
# {'acknowledged': True, 'shards_acknowledged': True, 'index': 'conversation-demo-index-knn-new2'}

## **2.7. Add data to the index/table**

In [None]:
output = subprocess.check_output("""curl -XPUT '%s/conversation-demo-index-knn-new/_doc/1' -u %s:%s -H 'Content-Type: application/json' -d'
{
    "text": "The emergence of resistance of bacteria to antibiotics is a common phenomenon. Emergence of resistance often reflects evolutionary processes that take place during antibiotic therapy. The antibiotic treatment may select for bacterial strains with physiologically or genetically enhanced capacity to survive high doses of antibiotics. Under certain conditions, it may result in preferential growth of resistant bacteria, while growth of susceptible bacteria is inhibited by the drug. For example, antibacterial selection for strains having previously acquired antibacterial-resistance genes was demonstrated in 1943 by the Luria–Delbrück experiment. Antibiotics such as penicillin and erythromycin, which used to have a high efficacy against many bacterial species and strains, have become less effective, due to the increased resistance of many bacterial strains."
}'
"""%(api_endpoint, username, password), shell=True, text=True)
json_output = json.loads(output)
print(json_output)
#######################################################

output = subprocess.check_output("""curl -XPUT '%s/conversation-demo-index-knn-new/_doc/2' -u %s:%s -H 'Content-Type: application/json' -d'
{
  "text": "The successful outcome of antimicrobial therapy with antibacterial compounds depends on several factors. These include host defense mechanisms, the location of infection, and the pharmacokinetic and pharmacodynamic properties of the antibacterial. A bactericidal activity of antibacterials may depend on the bacterial growth phase, and it often requires ongoing metabolic activity and division of bacterial cells. These findings are based on laboratory studies, and in clinical settings have also been shown to eliminate bacterial infection. Since the activity of antibacterials depends frequently on its concentration, in vitro characterization of antibacterial activity commonly includes the determination of the minimum inhibitory concentration and minimum bactericidal concentration of an antibacterial. To predict clinical outcome, the antimicrobial activity of an antibacterial is usually combined with its pharmacokinetic profile, and several pharmacological parameters are used as markers of drug efficacy."
}'
"""%(api_endpoint, username, password), shell=True, text=True)
json_output = json.loads(output)
print(json_output)

#######################################################

output = subprocess.check_output("""curl -XPUT '%s/conversation-demo-index-knn-new/_doc/3' -u %s:%s -H 'Content-Type: application/json' -d'
{
  "text": "Antibacterial antibiotics are commonly classified based on their mechanism of action, chemical structure, or spectrum of activity. Most target bacterial functions or growth processes. Those that target the bacterial cell wall (penicillins and cephalosporins) or the cell membrane (polymyxins), or interfere with essential bacterial enzymes (rifamycins, lipiarmycins, quinolones, and sulfonamides) have bactericidal activities. Those that target protein synthesis (macrolides, lincosamides and tetracyclines) are usually bacteriostatic (with the exception of bactericidal aminoglycosides). Further categorization is based on their target specificity. Narrow-spectrum antibacterial antibiotics target specific types of bacteria, such as Gram-negative or Gram-positive bacteria, whereas broad-spectrum antibiotics affect a wide range of bacteria. Following a 40-year hiatus in discovering new classes of antibacterial compounds, four new classes of antibacterial antibiotics have been brought into clinical use in the late 2000s and early 2010s: cyclic lipopeptides (such as daptomycin), glycylcyclines (such as tigecycline), oxazolidinones (such as linezolid), and lipiarmycins (such as fidaxomicin)"
}'
"""%(api_endpoint, username, password), shell=True, text=True)
json_output = json.loads(output)
print(json_output)
#######################################################

output = subprocess.check_output("""curl -XPUT '%s/conversation-demo-index-knn-new/_doc/4' -u %s:%s -H 'Content-Type: application/json' -d'
{
  "text": "The Desert Land Act of 1877 was passed to allow settlement of arid lands in the west and allotted 640 acres (2.6 km2) to settlers for a fee of $.25 per acre and a promise to irrigate the land. After three years, a fee of one dollar per acre would be paid and the land would be owned by the settler. This act brought mostly cattle and sheep ranchers into Montana, many of whom grazed their herds on the Montana prairie for three years, did little to irrigate the land and then abandoned it without paying the final fees. Some farmers came with the arrival of the Great Northern and Northern Pacific Railroads throughout the 1880s and 1890s, though in relatively small numbers"
}' 
"""%(api_endpoint, username, password), shell=True, text=True)
json_output = json.loads(output)
print(json_output)
#######################################################

output = subprocess.check_output("""curl -XPUT '%s/conversation-demo-index-knn-new/_doc/5' -u %s:%s -H 'Content-Type: application/json' -d'
{
  "text": "In June 1917, the U.S. Congress passed the Espionage Act of 1917 which was later extended by the Sedition Act of 1918, enacted in May 1918. In February 1918, the Montana legislature had passed the Montana Sedition Act, which was a model for the federal version. In combination, these laws criminalized criticism of the U.S. government, military, or symbols through speech or other means. The Montana Act led to the arrest of over 200 individuals and the conviction of 78, mostly of German or Austrian descent. Over 40 spent time in prison. In May 2006, then-Governor Brian Schweitzer posthumously issued full pardons for all those convicted of violating the Montana Sedition Act."
}'
"""%(api_endpoint, username, password), shell=True, text=True)
json_output = json.loads(output)
print(json_output)
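
The five requests above differ only in document id and text. For larger data sets, one option is the OpenSearch `_bulk` API, which accepts newline-delimited JSON: one action line followed by one document line per document. The sketch below only builds that body; the helper name is an assumption, and it assumes the index's default ingest pipeline is applied to bulk-indexed documents as well.

```python
import json


def build_bulk_body(index_name, texts):
    """Build a newline-delimited JSON body for the _bulk API."""
    lines = []
    for doc_id, text in enumerate(texts, start=1):
        # Action line: which index and id the next document goes to
        lines.append(json.dumps({"index": {"_index": index_name, "_id": str(doc_id)}}))
        # Document line: the source document itself
        lines.append(json.dumps({"text": text}))
    # The bulk body must end with a newline
    return "\n".join(lines) + "\n"


bulk_body = build_bulk_body(
    "conversation-demo-index-knn-new",
    ["First example passage.", "Second example passage."],
)
print(bulk_body)
```

The body would be sent with `POST <api_endpoint>/_bulk`, replacing the repeated per-document cells.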

## **2.8. Check a document in the index - it should include the generated embedding**

In [None]:
output = os.popen(f"curl -XGET '{api_endpoint}/conversation-demo-index-knn-new/_doc/2' -H 'Content-Type: application/json' -u {username}:{password}").read()

json_output = json.loads(output)
json_output

## **2.9. Create a memory id - to store conversational history when interacting in a chat-like function**

In [None]:
output = subprocess.check_output("""curl -XPOST '%s/_plugins/_ml/memory/conversation' -u %s:%s -H 'Content-Type: application/json' -d'
{
  "name": "rag-conversation"
}'
"""%(api_endpoint, username, password), shell=True, text=True)

json_output = json.loads(output)

#get the conversation id
conversation_id = json_output['conversation_id']

print(json_output)

In [None]:
#{"conversation_id":"O9-Mzo8Br9Xc3t7fs-Zn"}
#{'conversation_id': 'Y99I3Y8Br9Xc3t7fmOZH'}

## **2.10. Test the full RAG Pipeline**

#### **Add your question here. The question should relate to the texts stored in the index**

Example questions:
- Explain the Espionage Act of 1917 
- What are Antibacterial antibiotics?

In [None]:
question = "What are Antibacterial antibiotics?"

In [None]:
output = subprocess.check_output("""curl -XGET '%s/conversation-demo-index-knn-new/_search?search_pipeline=demo_rag_pipeline' -u %s:%s -H 'Content-Type: application/json' -d'
{
  "query": {
    "bool" : {
      "should" : [
        {
          "script_score": {
            "query": {
              "neural": {
                "passage_embedding": {
                  "query_text": "%s",
                  "model_id": "%s",
                  "k": 3
                }
              }
            },
            "script": {
              "source": "_score * 1.5"
            }
          }
        }
      ]
    }
  },
    "ext": {
        "generative_qa_parameters": {
            "llm_model": "oci_genai/cohere.command",
            "llm_question": "%s answer only in two sentences using provided context.",
            "conversation_id": "%s",
            "context_size": 2,
            "interaction_size": 1,
            "timeout": 15
        }
    }
}'
"""%(api_endpoint, username, password, question, model_id, question, conversation_id), shell=True, text=True)

json_output = json.loads(output)
print(json_output)
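
The cell above substitutes the question directly into a JSON string with `%`, so a question containing a double quote would produce invalid JSON. As a sketch, the same query can be built as a Python dictionary and serialized with `json.dumps`, which escapes the text correctly. The function name `build_rag_query` is hypothetical; the query structure mirrors the cell above.

```python
import json


def build_rag_query(question, model_id, conversation_id, k=3):
    """Build the neural search + RAG query as a dictionary."""
    return {
        "query": {
            "bool": {
                "should": [
                    {
                        "script_score": {
                            "query": {
                                "neural": {
                                    "passage_embedding": {
                                        "query_text": question,
                                        "model_id": model_id,
                                        "k": k,
                                    }
                                }
                            },
                            "script": {"source": "_score * 1.5"},
                        }
                    }
                ]
            }
        },
        "ext": {
            "generative_qa_parameters": {
                "llm_model": "oci_genai/cohere.command",
                "llm_question": f"{question} answer only in two sentences using provided context.",
                "conversation_id": conversation_id,
                "context_size": 2,
                "interaction_size": 1,
                "timeout": 15,
            }
        },
    }


# A question with double quotes, which would break manual string formatting
query = build_rag_query('What does "bactericidal" mean?', "<model_id>", "<conversation_id>")
print(json.dumps(query, indent=2))
```

Note that passing the serialized body through a single-quoted shell argument would still break on apostrophes, so sending it as an HTTP request body from Python is the safer route.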

# **3. Example Chatbot interface**

For your reference, the texts stored in the table/index:
- "The emergence of resistance of bacteria to antibiotics is a common phenomenon. Emergence of resistance often reflects evolutionary processes that take place during antibiotic therapy. The antibiotic treatment may select for bacterial strains with physiologically or genetically enhanced capacity to survive high doses of antibiotics. Under certain conditions, it may result in preferential growth of resistant bacteria, while growth of susceptible bacteria is inhibited by the drug. For example, antibacterial selection for strains having previously acquired antibacterial-resistance genes was demonstrated in 1943 by the Luria–Delbrück experiment. Antibiotics such as penicillin and erythromycin, which used to have a high efficacy against many bacterial species and strains, have become less effective, due to the increased resistance of many bacterial strains."
- "The successful outcome of antimicrobial therapy with antibacterial compounds depends on several factors. These include host defense mechanisms, the location of infection, and the pharmacokinetic and pharmacodynamic properties of the antibacterial. A bactericidal activity of antibacterials may depend on the bacterial growth phase, and it often requires ongoing metabolic activity and division of bacterial cells. These findings are based on laboratory studies, and in clinical settings have also been shown to eliminate bacterial infection. Since the activity of antibacterials depends frequently on its concentration, in vitro characterization of antibacterial activity commonly includes the determination of the minimum inhibitory concentration and minimum bactericidal concentration of an antibacterial. To predict clinical outcome, the antimicrobial activity of an antibacterial is usually combined with its pharmacokinetic profile, and several pharmacological parameters are used as markers of drug efficacy."
- "Antibacterial antibiotics are commonly classified based on their mechanism of action, chemical structure, or spectrum of activity. Most target bacterial functions or growth processes. Those that target the bacterial cell wall (penicillins and cephalosporins) or the cell membrane (polymyxins), or interfere with essential bacterial enzymes (rifamycins, lipiarmycins, quinolones, and sulfonamides) have bactericidal activities. Those that target protein synthesis (macrolides, lincosamides and tetracyclines) are usually bacteriostatic (with the exception of bactericidal aminoglycosides). Further categorization is based on their target specificity. Narrow-spectrum antibacterial antibiotics target specific types of bacteria, such as Gram-negative or Gram-positive bacteria, whereas broad-spectrum antibiotics affect a wide range of bacteria. Following a 40-year hiatus in discovering new classes of antibacterial compounds, four new classes of antibacterial antibiotics have been brought into clinical use in the late 2000s and early 2010s: cyclic lipopeptides (such as daptomycin), glycylcyclines (such as tigecycline), oxazolidinones (such as linezolid), and lipiarmycins (such as fidaxomicin)"
- "The Desert Land Act of 1877 was passed to allow settlement of arid lands in the west and allotted 640 acres (2.6 km2) to settlers for a fee of $.25 per acre and a promise to irrigate the land. After three years, a fee of one dollar per acre would be paid and the land would be owned by the settler. This act brought mostly cattle and sheep ranchers into Montana, many of whom grazed their herds on the Montana prairie for three years, did little to irrigate the land and then abandoned it without paying the final fees. Some farmers came with the arrival of the Great Northern and Northern Pacific Railroads throughout the 1880s and 1890s, though in relatively small numbers"
- "In June 1917, the U.S. Congress passed the Espionage Act of 1917 which was later extended by the Sedition Act of 1918, enacted in May 1918. In February 1918, the Montana legislature had passed the Montana Sedition Act, which was a model for the federal version. In combination, these laws criminalized criticism of the U.S. government, military, or symbols through speech or other means. The Montana Act led to the arrest of over 200 individuals and the conviction of 78, mostly of German or Austrian descent. Over 40 spent time in prison. In May 2006, then-Governor Brian Schweitzer posthumously issued full pardons for all those convicted of violating the Montana Sedition Act."

**Example questions:**
- What did the Desert Land Act do exactly?
- What are arid lands?
- Explain the Espionage Act of 1917 
- What are Antibacterial antibiotics?

## **3.1 Define a small helper function**

In [None]:
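def rag_chain(question, history):
    """Answer a question through the RAG search pipeline.

    `history` is supplied by Gradio but not used here; the conversation
    memory is kept in OpenSearch under `conversation_id`.
    """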
    print(f"The question = {question}")
    
    output = subprocess.check_output("""curl -XGET '%s/conversation-demo-index-knn-new/_search?search_pipeline=demo_rag_pipeline' -u %s:%s -H 'Content-Type: application/json' -d'
    {
      "query": {
        "bool" : {
          "should" : [
            {
              "script_score": {
                "query": {
                  "neural": {
                    "passage_embedding": {
                      "query_text": "%s",
                      "model_id": "%s",
                      "k": 3
                    }
                  }
                },
                "script": {
                  "source": "_score * 1.5"
                }
              }
            }
          ]
        }
      },
        "ext": {
            "generative_qa_parameters": {
                "llm_model": "oci_genai/cohere.command",
                "llm_question": "%s. answer only in two sentences using provided context.",
                "conversation_id": "%s",
                "context_size": 2,
                "interaction_size": 1,
                "timeout": 15
            }
        }
    }'
    """%(api_endpoint, username, password, question, model_id, question, conversation_id), shell=True, text=True)
    
    json_output = json.loads(output)

    answer = json_output['ext']['retrieval_augmented_generation']['answer']
    print(f"The answer = {answer}")
    
    return answer

In [None]:
#answer = rag_chain("What are Antibacterial antibiotics?", "nothing")
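
`rag_chain` reads the answer with direct key lookups, which raises a `KeyError` if the search fails or the pipeline returns no answer. A more forgiving sketch is shown below; `extract_answer` is a hypothetical helper, and the assumption that failures carry a top-level `error` key should be checked against real responses.

```python
def extract_answer(response):
    """Return the generated answer, or a readable message when it is missing."""
    try:
        return response["ext"]["retrieval_augmented_generation"]["answer"]
    except (KeyError, TypeError):
        # Assumed: error responses expose details under a top-level "error" key
        error = response.get("error") if isinstance(response, dict) else None
        if error:
            return f"No answer returned ({error})"
        return "No answer returned."


ok = {"ext": {"retrieval_augmented_generation": {"answer": "Example answer."}}}
print(extract_answer(ok))
print(extract_answer({"error": "timeout"}))
```

Returning a message instead of raising keeps the chat interface responsive when a single request fails.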

## **3.2 Install and run Gradio**

In [None]:
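# The ChatInterface call below uses retry_btn, undo_btn, and clear_btn, which Gradio 5 removed; pin to 4.x
!pip install --upgrade "gradio<5"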

In [None]:
import gradio as gr

gr.ChatInterface(
    rag_chain,
    chatbot=gr.Chatbot(height=500),
    textbox=gr.Textbox(placeholder="Ask me a question", container=False, scale=7),
    title="OCI OpenSearch - GenAI",
    theme="soft",
    examples=["What did the Desert Land Act do exactly?", "What are Antibacterial antibiotics?", "Explain the Espionage Act of 1917"],
    cache_examples=False,
    retry_btn=None,
    undo_btn="Delete Previous",
    clear_btn="Clear",
).launch(share=True, debug=True)