# 🚀 RAG with Generative AI (Deployment)

The `ads.model.generic_model.GenericModel` class in ADS can serialize almost any model class. This section uses `GenericModel` to wrap a custom RAG model, prepare its artifacts, verify it, save it to the model catalog, deploy it, and run predictions against the model deployment endpoint.

[![Notebook Examples](https://img.shields.io/badge/docs-notebook--examples-blue)](https://accelerated-data-science.readthedocs.io/en/latest/user_guide/model_registration/frameworks/genericmodel.html)
[![Conda Environments](https://img.shields.io/badge/docs-conda--environments-blue)](https://docs.oracle.com/en-us/iaas/data-science/using/conda_understand_environments.htm)
[![Source Code](https://img.shields.io/badge/source-accelerated--datascience-blue)](https://github.com/oracle/accelerated-data-science)

##### [Step-01] CustomModel

<details>
<summary><font size="2">Install Pre-Requirements</font></summary>
<font size="1">

```Install Libraries
(base) bash-4.2$ odsc conda create -f environment.yaml -n langchain_env -v 1.5
```
    
</font>    
</details>

In [1]:
import oci
import ads
import os
import requests
import json
import tempfile
from ads.model.generic_model import GenericModel
from langchain_community.embeddings.oci_generative_ai import OCIGenAIEmbeddings
from langchain_text_splitters import RecursiveCharacterTextSplitter
from langchain_community.vectorstores import FAISS
from langchain_community.chat_models.oci_generative_ai import ChatOCIGenAI
from langchain.prompts import PromptTemplate
from langchain.schema.runnable import RunnablePassthrough
from langchain_core.output_parsers import StrOutputParser
from langchain_core.prompts import ChatPromptTemplate

class CustomModel:
    def __init__(self):
        # oci: Generative AI
        self.compartment_id   = os.environ['NB_SESSION_COMPARTMENT_OCID']
        self.service_endpoint = "https://inference.generativeai.us-chicago-1.oci.oraclecloud.com"
        self.genai_embeddings = 'cohere.embed-multilingual-v3.0'
        self.genai_inference  = "ocid1.generativeaimodel.oc1.us-chicago-1.***"
        self.auth_type        = "RESOURCE_PRINCIPAL"
        # oci: OpenSearch
        self.apiEndpoint      = "http://***.***.***.***:9200"
        self.username         = "opensearch"
        self.password         = "*************"
        self.searchIndex      = "oci_documents"
    
    # Function to retrieve documents from OpenSearch based on the provided obj_url
    def get_documents_opensearch(self, obj_url):
        content = ""
        # Construct the search URL for the OpenSearch endpoint
        queryurl = f"{self.apiEndpoint}/{self.searchIndex}/_search"
        # Set the authentication credentials for the OpenSearch request
        auth = (self.username, self.password)
        headers = {"Content-Type": "application/json"}
        # Construct the OpenSearch query
        query = {
            "query": {
                "bool": {
                    "must": [
                        { "term": { "obj_url.keyword": obj_url } }
                    ]
                }
            },
            "_source": ["obj_content"]  # Fetch only the obj_content field
        }

        try:
            # Send a GET request to the OpenSearch endpoint with the constructed query
            resp = requests.get(queryurl, auth=auth, headers=headers, data=json.dumps(query))
            # Raise an exception if the request returned an unsuccessful status code
            resp.raise_for_status()
            # Parse the JSON response from the request
            response = resp.json()

            # Iterate over the search results in the response
            for hit in response['hits']['hits']:
                content += hit['_source']['obj_content'] + "\n\n"

            return content
        except requests.exceptions.RequestException as e:
            print(f"[get_documents_opensearch] {str(e)}.")
            raise
            
    # Function to retrieve embeddings from OCI Generative AI service
    def get_embeddings(self):
        try:
            # Initialize the OCIGenAIEmbeddings object with the specified parameters
            embeddings = OCIGenAIEmbeddings(
                model_id         = self.genai_embeddings, # Specify the model ID for embeddings
                service_endpoint = self.service_endpoint,
                compartment_id   = self.compartment_id,
                auth_type        = self.auth_type # Authentication type
            )

            return embeddings
        except Exception as e:
            # Building the embeddings client can fail for reasons other than an HTTP request error
            print(f"[get_embeddings] {str(e)}.")
            raise
            
    # Function to split the content into chunks, embed them, and create a retriever for searching the content
    def get_content(self, text, embeddings):    
        try:
            # Configure the text splitter used before generating the content embeddings
            text_splitter = RecursiveCharacterTextSplitter(
                chunk_size         = 1000,  # Size of each text chunk
                chunk_overlap      = 50,    # Overlap between chunks
                length_function    = len,   # Function to measure the length of text
                is_separator_regex = False  # Whether the separator is a regex
            )

            # Split the text into chunks
            chunks = text_splitter.split_text(text)

            # Create the FAISS vector store with the embeddings
            VectorStore = FAISS.from_texts(chunks, embeddings)

            # Convert the vector store to a retriever
            retriever = VectorStore.as_retriever(search_kwargs={"k": 1000}) # Return up to 1000 chunks per query

            return retriever
        except Exception as e:
            print(f"[get_content] {str(e)}.")
            raise
    
    def predict(self, data):
        # Define question for LLM
        question = data["input"][0]
        
        # Define the object URL for the document in OCI Object Storage
        obj_url  = data["input"][1]
        
        # Retrieve the document text from OpenSearch based on the object URL
        text = self.get_documents_opensearch(obj_url)
        
        # Get the embeddings using the specified model and configuration 
        embeddings = self.get_embeddings()

        # Generate content embeddings and create a retriever for searching the content
        retriever   = self.get_content(text, embeddings)

        chat = ChatOCIGenAI(
            model_id         = "cohere.command-r-plus",
            service_endpoint = self.service_endpoint,
            compartment_id   = self.compartment_id,
            provider         = "cohere",
            is_stream        = True,
            auth_type        = self.auth_type,
            model_kwargs     = {
                "max_tokens": 512,
                "temperature": 0.6,
                "top_p": 0.9,
                "top_k": 20,
                "frequency_penalty": 1
            }
        )
        
        # Define a prompt template for the chat model. The Spanish instruction reads:
        # "{query}, based only on the following context: {context}"
        prompt_template = ChatPromptTemplate.from_template("{query}, basándose únicamente en el siguiente contexto: {context}")
        
        # Create a processing chain with the query, retriever, prompt template, chat model, and output parser
        chain = (
            {"query": RunnablePassthrough(), "context": retriever}  # Pass the query and retriever context
            | prompt_template                                       # Apply the prompt template
            | chat                                                  # Invoke the chat model
            | StrOutputParser()                                     # Parse the output as a string
        )
        
        # Invoke the chain with the query (max: 250 characters)
        return chain.invoke(question)
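
The retrieval step in `get_documents_opensearch` can be exercised without a running cluster. The sketch below rebuilds the same term query and the same concatenation of `obj_content` fields as two standalone functions. The names `build_obj_url_query`, `join_hit_contents`, and `sample_response` are hypothetical and exist only for this illustration, and the sample response is made-up data shaped like the `hits.hits` structure the class reads.

```python
import json


def build_obj_url_query(obj_url):
    """Build the same term query body that get_documents_opensearch sends."""
    return {
        "query": {"bool": {"must": [{"term": {"obj_url.keyword": obj_url}}]}},
        "_source": ["obj_content"],  # Fetch only the obj_content field
    }


def join_hit_contents(response):
    """Concatenate obj_content from every hit, each followed by a blank line."""
    hits = response.get("hits", {}).get("hits", [])
    return "".join(hit["_source"]["obj_content"] + "\n\n" for hit in hits)


# Made-up response with the shape the class expects (assumption, not real data)
sample_response = {
    "hits": {
        "hits": [
            {"_source": {"obj_content": "row 1"}},
            {"_source": {"obj_content": "row 2"}},
        ]
    }
}

print(json.dumps(build_obj_url_query("<obj_url>"), indent=2))
print(repr(join_hit_contents(sample_response)))
```

Unlike the method in the class, `join_hit_contents` returns an empty string when the response has no `hits` key instead of raising `KeyError`; whether that is preferable depends on how strictly an empty result should be treated.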

In [2]:
# Initialize the custom model object
model = CustomModel()

# Use the model to make a prediction with the given input
model.predict({"input":
     ["Generar un resumen",  # The first input is the task or query ("Generate a summary")
      # The second input is the Object Storage URL of the document to query
      "https://objectstorage.us-chicago-1.oraclecloud.com/n/idi1o0a010nx/b/DLK1LAGDEV/o/example.xlsx"]})

INFO:faiss.loader:Loading faiss with AVX2 support.
INFO:faiss.loader:Successfully loaded faiss with AVX2 support.


'Aquí está un resumen del contexto proporcionado:\n\nEl documento parece ser una lista de artículos y repuestos con sus descripciones, cantidades y precios ofrecidos. Hay un total de 40 artículos en la lista, que van desde retenes de aceite y pernos hasta filtros de aire y aceite, descalcificadores, fusibles y varios tipos de cables. Los precios ofrecidos varían, con algunos artículos cotizados en cantidades más grandes. El subtotal, IVA y total están listados como $0, lo que sugiere que este es un listado de precios sin un total calculado. La lista parece estar dirigida a repuestos y accesorios para automóviles, con referencias a marcas específicas como Hyundai, Kia, Toyota y Yohama.'
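
The summary above is built from overlapping chunks of the document text. To make the effect of `chunk_size = 1000` and `chunk_overlap = 50` in `get_content` concrete, the sketch below uses a plain fixed-size sliding window. This is a simplification and an assumption for illustration: `RecursiveCharacterTextSplitter` also tries to break on separators such as paragraphs and lines, so its chunk boundaries will differ. The function name `sliding_window_chunks` is hypothetical.

```python
def sliding_window_chunks(text, chunk_size=1000, chunk_overlap=50):
    """Split text into fixed-size chunks where neighbours share chunk_overlap characters."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be in [0, chunk_size)")

    step = chunk_size - chunk_overlap  # How far the window advances each time
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # The last window already reached the end of the text
    return chunks


demo_chunks = sliding_window_chunks("abcdefghij", chunk_size=4, chunk_overlap=1)
print(demo_chunks)
```

With the defaults, each new chunk starts 950 characters after the previous one, so the last 50 characters of one chunk are repeated at the start of the next.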

##### [Step-02] Prepare Model

<details>
<summary><font size="2">Pre-Requirements</font></summary>
<font size="1">

```Terminal
(base) bash-4.2$ odsc conda init -b <bucket-name> -n <namespace> -a <api_key or resource_principal>
(base) bash-4.2$ odsc conda init -b <bucket-name> -n <namespace> -a resource_principal
(base) bash-4.2$ odsc conda publish -s <slug>
(base) bash-4.2$ odsc conda publish -s langchain_env_v1_5
<pack> = "oci://<bucket-name>@<namespace>/conda_environments/cpu/langchain_env/1.5/langchain_env_v1_5"
```    
</font>    
</details>
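
The `<pack>` value is the string later passed as `inference_conda_env`. As a minimal sketch, the helper below assembles it from its parts so the bucket, namespace, and version are not retyped by hand. The function name `conda_pack_uri` is hypothetical, and the slug rule (`<name>_v<version with dots replaced by underscores>`) is inferred only from the single example above, so treat the path that `odsc conda publish` reports as the authoritative value.

```python
def conda_pack_uri(bucket, namespace, env_name, version, arch="cpu"):
    """Assemble an oci:// conda pack path following the layout shown in the example."""
    # Slug rule inferred from "langchain_env" + "1.5" -> "langchain_env_v1_5" (assumption)
    slug = f"{env_name}_v{version.replace('.', '_')}"
    return (
        f"oci://{bucket}@{namespace}/conda_environments/"
        f"{arch}/{env_name}/{version}/{slug}"
    )


pack = conda_pack_uri("<bucket-name>", "<namespace>", "langchain_env", "1.5")
print(pack)
```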

<details>
<summary><font size="2">Optional</font></summary>
<font size="1">

```Terminal
(base) bash-4.2$ odsc conda delete -s <slug>
(base) bash-4.2$ odsc conda delete -s langchain_env_v1_5
```
    
</font>    
</details>

<details>
<summary><font size="2">generic_model.prepare</font></summary>
<font size="1">

```Terminal
generic_model.prepare(
    inference_conda_env = <pack>,
    inference_python_version = "3.8",
    model_file_name = "langchain.pkl",
    force_overwrite = True
)
```    
</font>    
</details>

In [3]:
import oci
import ads
import tempfile
from ads.model.generic_model import GenericModel

ads.set_auth("resource_principal")

generic_model = GenericModel(
    estimator = model,
    artifact_dir=tempfile.mkdtemp(),
    model_save_serializer="cloudpickle",
    model_input_serializer="json"
)
generic_model.summary_status()

generic_model.prepare(
    inference_conda_env="oci://DLK4CUR@idi1o0a010nx/conda_environments/cpu/langchain_env/1.5/langchain_env_v1_5",
    inference_python_version="3.8",
    model_file_name="langchain.pkl",
    force_overwrite=True
)


algorithm: null
artifact_dir:
  /tmp/tmp1761pyeq:
  - - .model-ignore
    - langchain.pkl
    - runtime.yaml
    - score.py
framework: null
model_deployment_id: null
model_id: null

##### [Step-03] Check Model
The ``verify`` method invokes the ``predict`` function defined inside ``score.py`` in the artifact_dir, so the serialized model can be checked before it is saved.

In [4]:
# The verify method invokes the ``predict`` function defined inside ``score.py`` in the artifact_dir
generic_model.verify({"input": 
     ["Generar un resumen",  # The first input is the task or query
      "https://objectstorage.us-chicago-1.oraclecloud.com/n/idi1o0a010nx/b/DLK1LAGDEV/o/example.xlsx"]})

Start loading langchain.pkl from model directory /tmp/tmp1761pyeq ...
Model is successfully loaded.


{'prediction': 'Aquí está un resumen del contexto proporcionado:\n\nEl documento parece ser una lista de artículos y repuestos con sus descripciones, cantidades y precios ofrecidos. Hay un total de 40 artículos en la lista, que van desde retenes de aceite y pernos hasta filtros de aire y aceite, fusibles y varios tipos de cintas aisladoras. Los precios ofrecidos varían, pero en este caso específico, el subtotal, el IVA y el total son todos $0, lo que sugiere que esta es solo una lista de artículos sin precios definitivos o que faltan detalles de precios. La lista incluye una variedad de piezas para automóviles y equipos eléctricos, con cantidades que van desde unos pocos hasta varios cientos.'}
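
Because the model was wrapped with `model_input_serializer="json"`, it helps to confirm that a payload survives a JSON round trip and that the result has the `{'prediction': ...}` shape shown above. The helpers below are a hedged sketch with hypothetical names (`json_round_trips`, `has_prediction`); they assume the endpoint receives the JSON-decoded form of the payload.

```python
import json


def json_round_trips(payload):
    """Return True if payload is unchanged after being encoded to JSON and decoded again."""
    try:
        return json.loads(json.dumps(payload)) == payload
    except (TypeError, ValueError):
        return False  # Not JSON-serializable at all


def has_prediction(result):
    """Return True if result looks like {'prediction': <non-empty string>}."""
    if not isinstance(result, dict):
        return False
    prediction = result.get("prediction")
    return isinstance(prediction, str) and prediction.strip() != ""


payload = {"input": ["Generar un resumen", "<obj_url>"]}
print(json_round_trips(payload))
print(has_prediction({"prediction": "Aquí está un resumen"}))
```

A tuple in place of the list would fail the round-trip check, since JSON decodes arrays to lists.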

##### [Step-04] Save Model
Save the generic model to the model catalog with the specified display name. The call returns the OCID of the saved model.

In [5]:
generic_model.save(display_name="RAG Generative AI v1.8.1")

Start loading langchain.pkl from model directory /tmp/tmp1761pyeq ...
Model is successfully loaded.
['.model-ignore', 'langchain.pkl', 'runtime.yaml', 'score.py']



'ocid1.datasciencemodel.oc1.iad.amaaaaaafioir7ia233xjyp6t3iifz3y22s5p2zadhdj2qqdghcbfmii6xzq'
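
The returned OCID encodes the resource type and region, which is handy here because the model lives in a different region (`iad`) than the Generative AI endpoint (`us-chicago-1`). The sketch below splits an OCID according to the documented `ocid1.<resource type>.<realm>.[region].<unique id>` layout. The names `Ocid` and `parse_ocid` are hypothetical helpers for this illustration.

```python
from typing import NamedTuple


class Ocid(NamedTuple):
    resource_type: str
    realm: str
    region: str  # May be empty for resources that are not tied to a region
    unique_id: str


def parse_ocid(ocid):
    """Split an OCID into its dot-separated components."""
    parts = ocid.split(".")
    if len(parts) < 5 or parts[0] != "ocid1":
        raise ValueError(f"not a recognizable OCID: {ocid!r}")
    return Ocid(parts[1], parts[2], parts[3], parts[-1])


model_ocid = parse_ocid(
    "ocid1.datasciencemodel.oc1.iad."
    "amaaaaaafioir7ia233xjyp6t3iifz3y22s5p2zadhdj2qqdghcbfmii6xzq"
)
print(model_ocid.resource_type, model_ocid.region)
```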

##### [Step-05] Deployment
Deploy the saved model from the Oracle Cloud console using the parameters below.

<details>
<summary><font size="2">Oracle Cloud</font></summary>
<font size="1">

```Steps
> Data Science
> Projects (Select Project)
> Model Deployments
> Create Model Deployment
```
</font>    
</details>


<details>
<summary><font size="2">Create Model Deployment</font></summary>
<font size="1">

```Generate Pack
> Name: Deploy RAG Generative AI v1.8.1
> Models: RAG Generative AI v1.8.1
> Shape: VM.Standard.E4.Flex
> Networking resources: Custom networking (VCN/Public Subnet)
```    
</font>    
</details>

<details>
<summary><font size="2">Deploy RAG Generative AI v1.8.1 > Invoking Your Model</font></summary>
<img src="img/deployment.png" alt="Invoking your Model" width="90%">
</details>
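
The same deployment can also be started from the notebook rather than the console, since `GenericModel` exposes a `deploy` method. The sketch below wraps that call with the name and shape used above. Treat it as an outline under assumptions: `deploy_rag_model` and `_RecordingModel` are hypothetical, the OCPU and memory values are illustrative and not taken from this document, the subnet OCID is a placeholder, and the keyword names should be checked against the installed ADS version. The stand-in class only records the arguments so the block runs without ADS.

```python
def deploy_rag_model(
    generic_model,
    subnet_id,
    display_name="Deploy RAG Generative AI v1.8.1",
    shape="VM.Standard.E4.Flex",
    ocpus=1,            # Illustrative value (assumption)
    memory_in_gbs=16,   # Illustrative value (assumption)
):
    """Call generic_model.deploy with the settings from the console walkthrough."""
    return generic_model.deploy(
        display_name=display_name,
        deployment_instance_shape=shape,
        deployment_ocpus=ocpus,
        deployment_memory_in_gbs=memory_in_gbs,
        deployment_instance_subnet_id=subnet_id,  # Custom networking subnet
    )


class _RecordingModel:
    """Hypothetical stand-in for GenericModel that records deploy() arguments."""

    def __init__(self):
        self.deploy_kwargs = None

    def deploy(self, **kwargs):
        self.deploy_kwargs = kwargs
        return self


fake = _RecordingModel()
deploy_rag_model(fake, subnet_id="<subnet-ocid>")
print(fake.deploy_kwargs["deployment_instance_shape"])
```

In the notebook, the first argument would be the `generic_model` object after `save` has completed.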

##### [Step-06] Test
This example requires the OCI SDK (`oci`). It signs the request with a resource principal signer and posts the payload to the deployment's `/predict` endpoint.

In [6]:
import oci
import ads
import requests

# Supported values: resource_principal, api_key
ads.set_auth("resource_principal") 
signer = oci.auth.signers.get_resource_principals_signer()

endpoint = "https://modeldeployment.us-ashburn-1.oci.customer-oci.com/ocid1.datasciencemodeldeployment.oc1.iad.amaaaaaafioir7iat3zjurhoqro7e2kx76stea2iv2bupali55l2l5inwscq/predict"
body = {
    "input": [
        "Generar un resumen",
        "https://objectstorage.us-chicago-1.oraclecloud.com/n/idi1o0a010nx/b/DLK1LAGDEV/o/example.xlsx"
    ]
}
headers = {}  # Optional extra request headers

requests.post(endpoint, json=body, auth=signer, headers=headers).json()

{'prediction': 'Aquí está un resumen del contexto proporcionado:\n\nEl documento parece ser una lista de artículos y repuestos con sus descripciones, cantidades y precios ofrecidos. Hay 40 artículos en total, que van desde retenes de aceite y pernos hasta filtros de aire y aceite, fusibles y varios tipos de cables. Los precios ofrecidos varían, pero el subtotal, el IVA y el total en la parte inferior del documento están establecidos en $0, lo que sugiere que este es solo un listado de artículos sin precios definitivos.'}
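
The request pieces used above can be factored into small helpers so the endpoint URL and body are built consistently. This is a minimal sketch with hypothetical names (`build_predict_url`, `build_body`, `extract_prediction`, `MAX_QUESTION_CHARS`). The URL pattern is copied from the endpoint in this example and the 250-character limit comes from the comment in Step-01; both are assumptions about this setup rather than general rules.

```python
MAX_QUESTION_CHARS = 250  # Limit mentioned in the Step-01 comment (assumption)


def build_predict_url(region, deployment_ocid):
    """Build the /predict URL following the pattern of the endpoint used above."""
    return (
        f"https://modeldeployment.{region}.oci.customer-oci.com/"
        f"{deployment_ocid}/predict"
    )


def build_body(question, obj_url):
    """Build the JSON body: the question first, then the document URL."""
    if len(question) > MAX_QUESTION_CHARS:
        raise ValueError(f"question exceeds {MAX_QUESTION_CHARS} characters")
    return {"input": [question, obj_url]}


def extract_prediction(payload):
    """Return the prediction text, or raise if the response has a different shape."""
    if not isinstance(payload, dict) or "prediction" not in payload:
        raise KeyError(f"no 'prediction' key in response: {payload!r}")
    return payload["prediction"]


print(build_predict_url("us-ashburn-1", "<deployment-ocid>"))
print(build_body("Generar un resumen", "<obj_url>"))
```

With these in place, the call becomes `requests.post(build_predict_url(...), json=build_body(...), auth=signer)`, and `extract_prediction` can be applied to the decoded JSON response.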