# Initial Framework RAG Model Support

## Pre-requisites

In [1]:
%pip install -q qdrant-client

Note: you may need to restart the kernel to use updated packages.


In [2]:
# load openai api key
import os

from dotenv import load_dotenv
load_dotenv()

if 'OPENAI_API_KEY' not in os.environ:
    raise ValueError('OPENAI_API_KEY is not set')

## Dataset Loader

In [3]:
# load documents
import os
from csv import DictReader
from uuid import uuid4

import pandas as pd


column_map = {"RFP_Question": "question", "RFP_Answer": "ground_truth"}


def load_documents(prefix):
    documents = []
    root_dir = "datasets/rag/"
    for file in os.listdir(root_dir):
        if file.startswith(prefix) and file.endswith(".csv"):
            # use csv dict reader to load the csv file
            with open(os.path.join(root_dir, file)) as f:
                reader = DictReader(f)
                for row in reader:
                    # add a unique id to the row
                    row["id"] = str(uuid4())
                    documents.append(row)

    df = pd.DataFrame(documents)
    df = df[["id", "RFP_Question", "RFP_Answer"]]
    # df.rename(columns=column_map, inplace=True)

    return df

def load_dataset_split(limit=None):
    df = load_documents("rfp_existing_questions")

    if limit:
        df = df.head(limit)

    # split the dataset into a "train" - which gets inserted into the vector store
    # and a "test" - which is used to evaluate the search results
    train_df = df.sample(frac=0.8)
    test_df = df.drop(train_df.index)

    return train_df, test_df
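
Because `df.sample(frac=0.8)` is called without a seed, the train/test split changes on every run. The following is a minimal, pandas-free sketch of the same 80/20 split made repeatable with a seeded shuffle. The helper name `split_rows` and the seed value are assumptions for illustration only.

```python
import random


def split_rows(rows, train_frac=0.8, seed=42):
    """Split rows into (train, test) using a seeded shuffle so reruns match."""
    rng = random.Random(seed)
    indices = list(range(len(rows)))
    rng.shuffle(indices)

    # the first `cutoff` shuffled indices form the train split
    cutoff = int(round(len(rows) * train_frac))
    train_idx = set(indices[:cutoff])

    # keep the original row order within each split
    train = [row for i, row in enumerate(rows) if i in train_idx]
    test = [row for i, row in enumerate(rows) if i not in train_idx]
    return train, test


# hypothetical rows standing in for the RFP questions
rows = [{"id": i, "RFP_Question": f"question {i}"} for i in range(20)]
train_rows, test_rows = split_rows(rows)
print(len(train_rows), len(test_rows))
```

With pandas, passing `random_state` to `DataFrame.sample` should give the same kind of repeatability.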

## Embedding Model Selection

First, let's set up our embedding model and run some tests to make sure it's working well.

In [4]:
from openai import OpenAI

from validmind.models import FunctionModel

client = OpenAI()


def embed(input):
    """Adds a text embedding of the row's `RFP_Question` and returns the row"""
    input["embedding"] = (
        client.embeddings.create(
            input=input["RFP_Question"],
            model="text-embedding-3-small",
        )
        .data[0]
        .embedding
    )

    return input

vm_embedder = FunctionModel(input_id="embedding_model", predict_fn=embed)

Let's create our test dataset so we can run it through our different models.

In [5]:
import validmind as vm

train_df, test_df = load_dataset_split(20)

vm_test_ds = vm.init_dataset(
    test_df,
    text_column="RFP_Question",  # some NLP tests that work with text data require a `text_column`
    target_column="RFP_Answer",
    __log=False,
)

vm_test_ds.df.head()

2024-05-07 13:42:58,375 - INFO(validmind.client): Pandas dataset detected. Initializing VM Dataset instance...


| | id | RFP_Question | RFP_Answer |
|---|---|---|---|
| 4 | 227914ef-6a4a-4dba-913d-36c26de755a6 | What considerations do you take into account f... | Our design philosophy centers on simplicity an... |
| 9 | 6594068a-f252-4443-888b-cd1252cab55d | How do your LLMs continuously learn and update... | We implement advanced continuous learning mech... |
| 13 | e8dd0378-ebe2-47c6-8006-b6a6c787328f | Describe your strategy for integrating LLMs in... | Our approach involves conducting a thorough an... |
| 15 | b37ae0ec-b7ce-41f3-8779-991caf8b428e | How does your AI strategy align with the NIST ... | Our AI solution is meticulously designed to al... |


In [6]:
vm_test_ds.assign_predictions(vm_embedder)

2024-05-07 13:42:58,382 - INFO(validmind.vm_models.dataset.utils): Running predict_proba()... This may take a while
2024-05-07 13:42:58,383 - INFO(validmind.vm_models.dataset.utils): Not running predict_proba() for unsupported models.
2024-05-07 13:42:58,383 - INFO(validmind.vm_models.dataset.utils): Running predict()... This may take a while
2024-05-07 13:42:59,108 - INFO(validmind.vm_models.dataset.utils): Done running predict()


In [7]:
vm_test_ds.df.head()

| | id | RFP_Question | RFP_Answer | embedding_model_prediction |
|---|---|---|---|---|
| 4 | 227914ef-6a4a-4dba-913d-36c26de755a6 | What considerations do you take into account f... | Our design philosophy centers on simplicity an... | {'embedding': [-0.002952731214463711, -0.00326... |
| 9 | 6594068a-f252-4443-888b-cd1252cab55d | How do your LLMs continuously learn and update... | We implement advanced continuous learning mech... | {'embedding': [-0.010829819366335869, 0.029368... |
| 13 | e8dd0378-ebe2-47c6-8006-b6a6c787328f | Describe your strategy for integrating LLMs in... | Our approach involves conducting a thorough an... | {'embedding': [0.010061484761536121, 0.0223792... |
| 15 | b37ae0ec-b7ce-41f3-8779-991caf8b428e | How does your AI strategy align with the NIST ... | Our AI solution is meticulously designed to al... | {'embedding': [-0.0057174162939190865, 0.02251... |


Let's go ahead and run one of the ValidMind embeddings stability analysis tests to make sure our embeddings model is working properly.

In [8]:
from validmind.tests import run_test

# result = run_test(
#     "validmind.model_validation.embeddings.StabilityAnalysisRandomNoise",
#     inputs={"model": vm_embedder, "dataset": vm_test_ds},
#     params={"probability": 0.3},
# )
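
The commented-out test above takes a `probability` parameter. Its internals are not shown in this notebook, so the following is only an illustrative sketch of the general idea behind a random-noise stability check: perturb the input text with some probability, embed both versions, and compare the two embeddings. The helper `add_random_noise` and its behavior are assumptions, not the ValidMind implementation.

```python
import random
import string


def add_random_noise(text, probability=0.3, seed=0):
    """Replace each letter with a random lowercase letter with the given probability."""
    rng = random.Random(seed)
    noisy = []
    for char in text:
        if char.isalpha() and rng.random() < probability:
            noisy.append(rng.choice(string.ascii_lowercase))
        else:
            # spaces, digits and punctuation are left untouched
            noisy.append(char)
    return "".join(noisy)


original = "How do your LLMs continuously learn and update?"
perturbed = add_random_noise(original, probability=0.3)
print(perturbed)
```

A stable embedding model would be expected to map `original` and `perturbed` to nearby vectors, for example as measured by cosine similarity.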

## Setup Vector Store

#### Generate embeddings for the questions

> Note: We use the name `train_df` for the dataset that is loaded into the vector store and used as context. The name is not ideal, but it is consistent with data science terminology.

In [9]:
train_df["embedding"] = [embed(row)["embedding"] for _, row in train_df.iterrows()]
train_df.head()

| | id | RFP_Question | RFP_Answer | embedding |
|---|---|---|---|---|
| 1 | b2e16d39-1e04-42d7-8867-ebc19fb55934 | How do you maintain your AI applications with ... | We maintain a dedicated R&D team focused on in... | [0.011783392168581486, 0.010354681871831417, 0... |
| 14 | 1cc1d530-0143-421a-9481-adf4d906006d | What is your approach to maintaining and suppo... | Our post-deployment support is designed to ens... | [-0.004095795564353466, 0.04930035024881363, 0... |
| 18 | bc286719-ba2c-400b-a9bb-738d10ae5409 | What steps do you take to ensure the transpare... | We prioritize transparency by incorporating ex... | [-0.0011347628897055984, -0.009663973934948444... |
| 16 | 16dd571d-1e83-4545-92b1-77de4e013313 | Can you discuss your governance framework for ... | We have established an AI Risk Council that pl... | [0.014880189672112465, 0.03505474328994751, 0.... |
| 3 | b1b93f5f-c16d-4cd3-b0e1-4e792e148ad4 | What actions do you undertake to secure user d... | User privacy and data security are paramount. ... | [0.007698851637542248, 0.007591660600155592, 0... |


#### Insert embeddings and questions into Vector DB

In [10]:
from qdrant_client import QdrantClient
from qdrant_client.models import Distance, PointStruct, VectorParams

qdrant = QdrantClient(":memory:")
qdrant.recreate_collection(
    "rfp_rag_collection",
    vectors_config=VectorParams(size=1536, distance=Distance.COSINE),
)
qdrant.upsert(
    "rfp_rag_collection",
    points=[
        PointStruct(
            id=row["id"],
            vector=row["embedding"],
            payload={"RFP_Question": row["RFP_Question"], "RFP_Answer": row["RFP_Answer"]},
        )
        for _, row in train_df.iterrows()
    ],
)

UpdateResult(operation_id=0, status=<UpdateStatus.COMPLETED: 'completed'>)
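
The collection above is configured with `Distance.COSINE`, so stored points are ranked by cosine similarity to the query vector. As a rough mental model, here is a brute-force sketch of that ranking in plain Python. A real vector store uses a more sophisticated index, and the three-dimensional vectors and `doc_*` payloads are made up for illustration.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def top_k(query, points, k=2):
    """Return the k (score, payload) pairs most similar to the query."""
    scored = [(cosine_similarity(query, vec), payload) for payload, vec in points]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]


# hypothetical stored points: (payload, vector)
points = [
    ("doc_a", [1.0, 0.0, 0.0]),
    ("doc_b", [0.0, 1.0, 0.0]),
    ("doc_c", [0.7, 0.7, 0.0]),
]
results = top_k([1.0, 0.1, 0.0], points, k=2)
print([payload for _, payload in results])
```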

## Setup Retrieval Model

In [11]:
def retrieve(input):
    input["contexts"] = []

    for result in qdrant.search(
        "rfp_rag_collection",
        query_vector=input["embedding"],
        limit=input.get(
            "limit", 10
        ),  # a `limit` column could be added to the dataset to override this default
    ):
        context = f"Q: {result.payload['RFP_Question']}\n"
        context += f"A: {result.payload['RFP_Answer']}\n"

        input["contexts"].append(context)

    return input


vm_retriever = FunctionModel(input_id="retrieval_model", predict_fn=retrieve)

In [12]:
vm_test_ds.assign_predictions(vm_retriever)

2024-05-07 13:43:03,648 - INFO(validmind.vm_models.dataset.utils): Running predict_proba()... This may take a while
2024-05-07 13:43:03,648 - INFO(validmind.vm_models.dataset.utils): Not running predict_proba() for unsupported models.
2024-05-07 13:43:03,648 - INFO(validmind.vm_models.dataset.utils): Running predict()... This may take a while
2024-05-07 13:43:03,650 - INFO(validmind.vm_models.dataset.utils): Done running predict()


In [13]:
vm_test_ds.df.head()

| | id | RFP_Question | RFP_Answer | embedding_model_prediction | retrieval_model_prediction |
|---|---|---|---|---|---|
| 4 | 227914ef-6a4a-4dba-913d-36c26de755a6 | What considerations do you take into account f... | Our design philosophy centers on simplicity an... | {'embedding': [-0.002952731214463711, -0.00326... | {'contexts': ['Q: How do you maintain your AI ... |
| 9 | 6594068a-f252-4443-888b-cd1252cab55d | How do your LLMs continuously learn and update... | We implement advanced continuous learning mech... | {'embedding': [-0.010829819366335869, 0.029368... | {'contexts': ['Q: How do you ensure your LLMs ... |
| 13 | e8dd0378-ebe2-47c6-8006-b6a6c787328f | Describe your strategy for integrating LLMs in... | Our approach involves conducting a thorough an... | {'embedding': [0.010061484761536121, 0.0223792... | {'contexts': ['Q: How do you ensure your LLMs ... |
| 15 | b37ae0ec-b7ce-41f3-8779-991caf8b428e | How does your AI strategy align with the NIST ... | Our AI solution is meticulously designed to al... | {'embedding': [-0.0057174162939190865, 0.02251... | {'contexts': ['Q: Can you discuss your governa... |


## Setup Generation Model

In [14]:
system_prompt = """
You are an expert RFP AI assistant.
You are tasked with answering new RFP questions based on existing RFP questions and answers.
You will be provided with the existing RFP questions and answer pairs that are the most relevant to the new RFP question.
After that you will be provided with a new RFP question.
You will generate an answer and respond only with the answer.
Ignore your pre-existing knowledge and answer the question based on the provided context.
""".strip()


def generate(input):
    response = client.chat.completions.create(
        model="gpt-3.5-turbo",
        messages=[
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": "\n\n".join(input["contexts"])},
            {"role": "user", "content": input["RFP_Question"]},
        ],
    )

    input["answer"] = response.choices[0].message.content

    return input

vm_generator = FunctionModel(input_id="generation_model", predict_fn=generate)
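
To inspect what `generate` sends to the chat model without making an API call, the message assembly can be pulled into a small helper. This is a sketch only: `build_messages` is a hypothetical name, and the contexts and question below are invented examples in the same `Q:`/`A:` format that `retrieve` produces.

```python
def build_messages(system_prompt, contexts, question):
    """Assemble the chat messages: system prompt, joined contexts, then the question."""
    return [
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": "\n\n".join(contexts)},
        {"role": "user", "content": question},
    ]


# invented contexts, formatted like the retrieval model's output
example_contexts = [
    "Q: How do you secure user data?\nA: We encrypt data at rest.\n",
    "Q: Do you support SSO?\nA: Yes, via SAML.\n",
]
messages = build_messages(
    "You are an expert RFP AI assistant.",
    example_contexts,
    "How is data protected?",
)
for message in messages:
    print(message["role"], "->", repr(message["content"][:40]))
```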

In [15]:
vm_test_ds.assign_predictions(vm_generator)

2024-05-07 13:43:03,763 - INFO(validmind.vm_models.dataset.utils): Running predict_proba()... This may take a while
2024-05-07 13:43:03,765 - INFO(validmind.vm_models.dataset.utils): Not running predict_proba() for unsupported models.
2024-05-07 13:43:03,767 - INFO(validmind.vm_models.dataset.utils): Running predict()... This may take a while
2024-05-07 13:43:17,842 - INFO(validmind.vm_models.dataset.utils): Done running predict()


In [16]:
vm_test_ds.df.head()

| | id | RFP_Question | RFP_Answer | embedding_model_prediction | retrieval_model_prediction | generation_model_prediction |
|---|---|---|---|---|---|---|
| 4 | 227914ef-6a4a-4dba-913d-36c26de755a6 | What considerations do you take into account f... | Our design philosophy centers on simplicity an... | {'embedding': [-0.002952731214463711, -0.00326... | {'contexts': ['Q: How do you maintain your AI ... | {'answer': 'We prioritize user interface and u... |
| 9 | 6594068a-f252-4443-888b-cd1252cab55d | How do your LLMs continuously learn and update... | We implement advanced continuous learning mech... | {'embedding': [-0.010829819366335869, 0.029368... | {'contexts': ['Q: How do you ensure your LLMs ... | {'answer': 'Our LLMs are designed to continuou... |
| 13 | e8dd0378-ebe2-47c6-8006-b6a6c787328f | Describe your strategy for integrating LLMs in... | Our approach involves conducting a thorough an... | {'embedding': [0.010061484761536121, 0.0223792... | {'contexts': ['Q: How do you ensure your LLMs ... | {'answer': 'Our strategy for integrating Large... |
| 15 | b37ae0ec-b7ce-41f3-8779-991caf8b428e | How does your AI strategy align with the NIST ... | Our AI solution is meticulously designed to al... | {'embedding': [-0.0057174162939190865, 0.02251... | {'contexts': ['Q: Can you discuss your governa... | {'answer': 'Our AI strategy aligns closely wit... |


## Setup RAG Model (Pipeline of "Component" Models)

Now that our individual models are set up, let's create a `PipelineModel` instance that chains them together and gives us a single RAG model that can be evaluated end-to-end.

In [21]:
from validmind.models import PipelineModel

vm_output_parser = FunctionModel(
    input_id="output_parser",
    predict_fn=lambda input: input["answer"],
)
vm_rag_model = PipelineModel(vm_embedder | vm_retriever | vm_generator | vm_output_parser)

Let's run the test dataset through the entire pipeline. This overwrites the predictions generated earlier by the individual models. The key point is that calling `predict` on the pipeline model runs every stage and stores the intermediate predictions in the dataframe.

In [22]:
result_df = vm_rag_model.predict(test_df)

ValueError: FunctionModel `predict_fn` must return the input dictionary
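
The error message suggests that every `predict_fn` is expected to return the input dictionary, whereas the `output_parser` lambda returns a bare string. How `PipelineModel` enforces this is not shown here, so the following library-free sketch only illustrates the dict-in/dict-out contract. All function names, including `run_pipeline` and the `fake_*` stages, are hypothetical.

```python
def run_pipeline(stages, row):
    """Pass a row dict through each stage, requiring every stage to return a dict."""
    state = dict(row)
    for stage in stages:
        result = stage(state)
        if not isinstance(result, dict):
            raise ValueError(f"{stage.__name__} must return the input dictionary")
        state = result
    return state


def fake_embed(state):
    # stand-in for a real embedding call
    state["embedding"] = [float(len(state["RFP_Question"]))]
    return state


def fake_retrieve(state):
    state["contexts"] = ["Q: example question\nA: example answer\n"]
    return state


def fake_generate(state):
    state["answer"] = f"Answer drawing on {len(state['contexts'])} context(s)"
    return state


def parse_output(state):
    # store the parsed value under a new key instead of returning a bare string
    state["output"] = state["answer"]
    return state


final = run_pipeline(
    [fake_embed, fake_retrieve, fake_generate, parse_output],
    {"RFP_Question": "What is your SLA?"},
)
print(sorted(final))
```

Because each stage adds a key and hands the dictionary on, the intermediate results (`embedding`, `contexts`, `answer`) remain available next to the final output.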

In [19]:
result_df

[{'answer': 'We prioritize user interface and user experience design in our AI applications by focusing on intuitive navigation, clear information presentation, and seamless interaction flows. Our design process incorporates user feedback and usability testing to ensure that the interface is user-friendly and that the overall experience is engaging and efficient. Additionally, we leverage technologies such as React for responsive design, allowing for a consistent experience across different devices. By emphasizing user-centric design principles, we aim to create AI applications that not only deliver powerful functionalities but also provide a satisfying and intuitive user experience.\n'},
 {'answer': 'Our LLMs are equipped with continuous learning mechanisms that enable them to adapt to new data and evolving user expectations. We implement reinforcement learning algorithms that allow the models to adjust their behaviors based on real-time feedback and interactions. By leveraging techni

## Experiment with some RAGAS Metrics

The cells below experiment with how the RAGAS metrics can work with the RAG pipeline model. This is a proof of concept, not a full implementation of the RAGAS metrics. We'll want to generalize it so that columns are properly mapped from the user-provided `predict_col` (or the default `predict_col`) to the column names that RAGAS expects, i.e. `question`, `contexts`, `answer`, and `ground_truth`.

In [20]:
vm_ragas_ds = vm.init_dataset(result_df, __log=False)

UnsupportedDatasetError: Only Pandas datasets and Tensor Datasets are supported at the moment.
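
The error above is consistent with `result_df` being a list of dictionaries rather than a pandas DataFrame. One possible way forward, sketched here under that assumption, is to rename each record's keys to the column names RAGAS expects before building a DataFrame. The helper `to_ragas_rows` and the sample record are hypothetical.

```python
# maps this notebook's column names to the names RAGAS expects
RAGAS_COLUMN_MAP = {
    "RFP_Question": "question",
    "RFP_Answer": "ground_truth",
}
RAGAS_COLUMNS = ("question", "contexts", "answer", "ground_truth")


def to_ragas_rows(records, column_map=RAGAS_COLUMN_MAP):
    """Rename keys per column_map and keep only the RAGAS columns."""
    rows = []
    for record in records:
        renamed = {column_map.get(key, key): value for key, value in record.items()}
        missing = [col for col in RAGAS_COLUMNS if col not in renamed]
        if missing:
            raise KeyError(f"missing columns: {missing}")
        rows.append({col: renamed[col] for col in RAGAS_COLUMNS})
    return rows


# invented record shaped like one pipeline result
records = [
    {
        "RFP_Question": "How is data protected?",
        "RFP_Answer": "We encrypt data at rest.",
        "embedding": [0.1, 0.2],
        "contexts": ["Q: How do you secure user data?\nA: We encrypt data at rest.\n"],
        "answer": "Data is encrypted at rest.",
    }
]
ragas_rows = to_ragas_rows(records)
print(list(ragas_rows[0]))
```

The resulting list can be turned into a DataFrame with `pd.DataFrame(ragas_rows)`, which matches the dataset type the error message says is supported.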

In [None]:
import plotly.express as px

def plot_distribution(scores):
    # plot distribution of scores (0-1) from ragas metric
    # scores is a list of floats
    fig = px.histogram(x=scores, nbins=10)
    fig.show()

In [None]:
import warnings

warnings.filterwarnings("ignore")

In [None]:
result = run_test(
    "validmind.model_validation.ragas.AnswerSimilarity",
    inputs={"dataset": vm_ragas_ds},
    show=False,
)
plot_distribution(result.metric.summary.results[0].data)

In [None]:
result = run_test(
    "validmind.model_validation.ragas.ContextEntityRecall",
    inputs={"dataset": vm_ragas_ds},
    show=False,
)
plot_distribution(result.metric.summary.results[0].data)

In [None]:
result = run_test(
    "validmind.model_validation.ragas.ContextPrecision",
    inputs={"dataset": vm_ragas_ds},
    show=False,
)
plot_distribution(result.metric.summary.results[0].data)

In [None]:
result = run_test(
    "validmind.model_validation.ragas.ContextRelevancy",
    inputs={"dataset": vm_ragas_ds},
    show=False,
)
plot_distribution(result.metric.summary.results[0].data)