# Retrieval and Generation using Bedrock Models for Ground Truth

### Overview

This notebook demonstrates Retrieval-Augmented Generation (RAG) with Amazon Bedrock foundation models, using FloTorch for ground truth evaluation. It retrieves relevant information from a knowledge base and then generates responses grounded in the retrieved context.

### Prerequisites

1.  Confirm that all prerequisites outlined in the `1.1 Prerequisites.ipynb` notebook from Lab 1 have been completed.
2.  Ensure that at least one of the knowledge base creation notebooks (`1.2`, `1.3`, `1.4`, or `1.5`) from Lab 1 has been successfully executed.

### Load variables from Lab 1

In [None]:
import json
with open("../Lab 1/variables.json", "r") as f:
    variables = json.load(f)

variables
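
Because later cells depend on specific entries in `variables.json`, a quick check right after loading can surface an incomplete Lab 1 run early. The sketch below is illustrative only: the key names are the ones this notebook reads, while `find_missing_keys` and `sample_variables` are hypothetical names introduced for the example.

```python
# Keys that the cells in this notebook read from variables.json.
REQUIRED_KEYS = [
    "kbFixedChunk",
    "s3_ground_truth_path",
    "regionName",
    "bedrockExecutionRoleArn",
]


def find_missing_keys(loaded, required=REQUIRED_KEYS):
    """Return the required keys that are absent or empty in the loaded dict."""
    return [key for key in required if not loaded.get(key)]


# Hypothetical sample standing in for the real contents of variables.json.
sample_variables = {"kbFixedChunk": "KB123", "regionName": "us-east-1"}
missing = find_missing_keys(sample_variables)
print(missing)
```

In the notebook itself, calling `find_missing_keys(variables)` and getting an empty list would suggest the Lab 1 prerequisites were completed.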

### Load prompt.json

In [None]:
prompt_file_path = '../data/prompt.json'
with open(prompt_file_path, 'r') as f:
    prompt = json.load(f)

### Running the evaluation against Fixed Chunking KB

**Important:** This step assumes that your knowledge base has already been created in Lab 1. Please ensure that you have completed the knowledge base creation as part of Lab 1 before proceeding.

Inference models considered: Amazon Nova Lite, Amazon Nova Pro, Claude 3.5 Haiku, and Claude 3.5 Sonnet

In [None]:
bedrock_kb = variables['kbFixedChunk']

inference_models = [
    "us.amazon.nova-lite-v1:0",
    "us.amazon.nova-pro-v1:0",
    "us.anthropic.claude-3-5-haiku-20241022-v1:0",
    "us.anthropic.claude-3-5-sonnet-20241022-v2:0",
]

### Experiment Config

* **Bedrock KB Id:** ID of the knowledge base to query
* **KNN (k-Nearest Neighbors):** 5
* **Rerank Model:** Amazon Rerank
* **N-Shot Prompt:** 1
* **Temperature:** 0.1


In [None]:
exp_config_data = {
    "bedrock_kb_id": bedrock_kb,
    "temp_retrieval_llm": "0.1",
    "gt_data": variables["s3_ground_truth_path"],
    "rerank_model_id": "amazon.rerank-v1:0",
    "retrieval_service": "bedrock",
    "knn_num": "5",
    "retrieval_model": "us.amazon.nova-lite-v1:0",
    "aws_region": variables['regionName'],
    "n_shot_prompt_guide_obj": prompt,
    "n_shot_prompts": 1
}
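
The config stores `knn_num` and `temp_retrieval_llm` as strings, so they need to be cast before numeric use (the inference cell later does this with `int(...)` and `float(...)`). A minimal sketch of that casting, assuming the same key names; `typed_settings` and `sample_config` are hypothetical and not part of `flotorch-core`:

```python
def typed_settings(config):
    """Cast the string-valued retrieval settings to numeric types."""
    return {
        "knn_num": int(config.get("knn_num", 0)),
        "n_shot_prompts": int(config.get("n_shot_prompts", 0)),
        "temperature": float(config.get("temp_retrieval_llm", 0)),
    }


# Hypothetical sample mirroring the string/number mix in the config above.
sample_config = {"knn_num": "5", "temp_retrieval_llm": "0.1", "n_shot_prompts": 1}
settings = typed_settings(sample_config)
print(settings)
```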

### Load ground truth data

In [None]:
from flotorch_core.storage.storage_provider_factory import StorageProviderFactory
from flotorch_core.reader.json_reader import JSONReader
from flotorch_rag_utils import Question

gt_data = exp_config_data['gt_data']
storage = StorageProviderFactory.create_storage_provider(gt_data)
gt_data_path = storage.get_path(gt_data)
json_reader = JSONReader(storage)
questions = json_reader.read_as_model(gt_data_path, Question)

### Initialize vector storage (in this case, Amazon Bedrock Knowledge Bases)

In [None]:
from flotorch_core.storage.db.vector.vector_storage_factory import VectorStorageFactory

vector_storage = VectorStorageFactory.create_vector_storage(
    knowledge_base=True,
    use_bedrock_kb=True,
    embedding=None,
    knowledge_base_id=exp_config_data.get("bedrock_kb_id"),
    aws_region=exp_config_data.get("aws_region"),
)

### Initialize Reranker

In [None]:
from flotorch_core.rerank.rerank import BedrockReranker

# Skip reranking when the model id is missing or set to "none".
rerank_model_id = exp_config_data.get("rerank_model_id") or "none"
reranker = BedrockReranker(exp_config_data.get("aws_region"), rerank_model_id) \
    if rerank_model_id.lower() != "none" \
    else None

### Execute RAG against all the inference models

For each model, initialize an inferencer, then run the retrieval, reranking, and inference steps using the `flotorch-core` library.

In [None]:
from flotorch_core.inferencer.inferencer_provider_factory import InferencerProviderFactory
from flotorch_rag_utils import rag_with_flotorch

rag_response_dict = {}

# Run time depends on the number of questions and the number of models being evaluated.
# Larger evaluations take longer, generally around 5-6 minutes.
for inference_model in inference_models:
    inferencer = InferencerProviderFactory.create_inferencer_provider(
        False, "", "",
        exp_config_data.get("retrieval_service"),
        inference_model, 
        exp_config_data.get("aws_region"), 
        variables['bedrockExecutionRoleArn'],
        int(exp_config_data.get("n_shot_prompts", 0)), 
        float(exp_config_data.get("temp_retrieval_llm", 0)), 
        exp_config_data.get("n_shot_prompt_guide_obj")
    )

    responses = rag_with_flotorch(exp_config_data, vector_storage, reranker, inferencer, questions)
    rag_response_dict[inference_model] = responses
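
To make the retrieve, rerank, and generate flow concrete, here is a toy, self-contained sketch of the three stages. It is not how `rag_with_flotorch` is implemented: the word-overlap retriever, the overlap-density reranker, and the echo "generator" are all stand-ins invented for illustration, and every function name is hypothetical.

```python
import re


def tokenize(text):
    """Lowercase a string and split it into word tokens."""
    return set(re.findall(r"\w+", text.lower()))


def retrieve(question, corpus, k):
    """Toy retriever: return the k passages sharing the most words with the question."""
    q_words = tokenize(question)
    ranked = sorted(corpus, key=lambda p: len(q_words & tokenize(p)), reverse=True)
    return ranked[:k]


def rerank(question, passages):
    """Toy reranker: reorder by the fraction of passage words that match the question."""
    q_words = tokenize(question)

    def density(passage):
        words = tokenize(passage)
        return len(q_words & words) / len(words) if words else 0.0

    return sorted(passages, key=density, reverse=True)


def generate(question, passages):
    """Stand-in for a model call: return the top passage as the answer."""
    return {"question": question, "answer": passages[0] if passages else "", "context": passages}


def run_rag(questions, corpus, k=2, use_reranker=True):
    """Run retrieve -> (optional) rerank -> generate for each question."""
    results = []
    for question in questions:
        passages = retrieve(question, corpus, k)
        if use_reranker:
            passages = rerank(question, passages)
        results.append(generate(question, passages))
    return results


corpus = [
    "Bedrock hosts foundation models.",
    "Knowledge bases store document chunks for retrieval.",
    "Rerankers reorder retrieved chunks.",
]
results = run_rag(["What do knowledge bases store?"], corpus)
print(results[0]["answer"])
```

The optional `use_reranker` flag mirrors the notebook's pattern of passing `reranker=None` when no rerank model is configured.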


### Write the results to a JSON file

In [None]:
import json

filename = "../results/ragas_evaluation_responses_for_different_models.json"

# Save to JSON with proper formatting
with open(filename, 'w', encoding='utf-8') as f:
    json.dump(rag_response_dict, f, indent=4, ensure_ascii=False)
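
As a sanity check after writing, the file can be read back and summarized per model. The sketch below assumes each model ID maps to a list of responses, which this notebook does not confirm; `count_responses_per_model` is a hypothetical helper, and the demo uses a temporary file rather than the real results path.

```python
import json
import os
import tempfile


def count_responses_per_model(path):
    """Load a results file and return the number of responses stored per model ID."""
    with open(path, "r", encoding="utf-8") as f:
        results = json.load(f)
    # Assumes each value is a list (or other sized collection) of responses.
    return {model_id: len(responses) for model_id, responses in results.items()}


# Demo with hypothetical data written to a temporary file.
with tempfile.TemporaryDirectory() as tmp_dir:
    demo_path = os.path.join(tmp_dir, "demo_responses.json")
    with open(demo_path, "w", encoding="utf-8") as f:
        json.dump({"model-a": [{"answer": "x"}], "model-b": []}, f)
    counts = count_responses_per_model(demo_path)

print(counts)
```

A model with a count of zero, or a count lower than the number of ground truth questions, would be worth investigating before moving on to evaluation.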