# BeyondLLM Framework: A Different Approach to RAG and Its Evaluation
We included this notebook in our project to offer another perspective on evaluating RAG models. The framework explored is **BeyondLLM**, and this notebook is an adaptation of the tutorial found [here](https://colab.research.google.com/drive/1vf9ebWWmZDn6vP9uqy-Zgzt8RSUSV-1o?usp=sharing&source=post_page-----c3cd710a6357--------------------------------).

## Build - Rapid Experiment - Evaluate - Repeat

BeyondLLM is a framework for developing, testing, and evaluating Retrieval-Augmented Generation (RAG) systems. It streamlines the process with automated integration, customizable evaluation metrics, and support for various Large Language Models (LLMs), so a pipeline can be tailored to the user's requirements. The goal is to reduce the risk of hallucinations in LLMs and improve their overall reliability.

### Useful Links:
- [Documentation](https://beyondllm.aiplanet.com/)
- [GitHub Repo](https://github.com/aiplanethub/beyondllm)

## Install the packages

Make sure to **restart session** after installing the packages.

In [1]:
! pip install beyondllm llama_index.embeddings.huggingface



## Overview

In this notebook, we'll develop a RAG pipeline with BeyondLLM that lets us query a PDF document (the text of Regulation (EU) 2017/745) and then evaluate the pipeline's performance. The code includes:

- Getting data from source
- Creating embeddings
- Retrieving documents
- Generating LLM responses
- Evaluating responses

In [2]:
import os
import Constants
os.environ['LANGCHAIN_TRACING_V2'] = 'true'
os.environ['LANGCHAIN_ENDPOINT'] = 'https://api.smith.langchain.com'
os.environ['LANGCHAIN_API_KEY'] = Constants.LANGCHAIN_API_KEY

In [3]:
HFHUB_API_KEY = Constants.HFHUB_API_KEY
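
`Constants` is a local module that is not shown in this notebook. As a minimal sketch, assuming it only exposes the two keys as module-level strings, it could read them from environment variables so that no secret is committed to the repository. The file contents and the `read_key` helper below are hypothetical.

```python
# Constants.py (hypothetical contents; the real file is not shown in this notebook)
import os


def read_key(name):
    """Return the value of an environment variable, or an empty string if it is unset."""
    return os.environ.get(name, "")


# Names match the attributes used in the cells above
LANGCHAIN_API_KEY = read_key("LANGCHAIN_API_KEY")
HFHUB_API_KEY = read_key("HFHUB_API_KEY")
```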

## Import BeyondLLM

In [4]:
from beyondllm import source,retrieve,embeddings,llms,generator

[nltk_data] Downloading package punkt_tab to
[nltk_data]     c:\Users\tomas\Anaconda3\envs\beyondllm\Lib\site-
[nltk_data]     packages\llama_index\core\_static/nltk_cache...
[nltk_data]   Package punkt_tab is already up-to-date!


## Fit the Data

In [5]:
data = source.fit(
    path="../Data/CELEX_02017R0745-20240709_EN_TXT.pdf", 
    dtype="pdf",
    chunk_size=512,     
    chunk_overlap=128   
)
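
To illustrate what `chunk_size=512` and `chunk_overlap=128` mean, here is a rough sketch of overlapping windows over a token list. It is not BeyondLLM's actual splitter: the function `chunk_tokens` is hypothetical, and whether the library counts tokens or characters depends on the splitter it uses internally.

```python
def chunk_tokens(tokens, chunk_size=512, chunk_overlap=128):
    """Split a token list into overlapping windows (illustrative only)."""
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap  # each new chunk starts this many tokens later
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + chunk_size])
        if start + chunk_size >= len(tokens):
            break  # this window already reaches the end of the text
    return chunks


# Toy input: 1000 placeholder tokens
tokens = [f"t{i}" for i in range(1000)]
chunks = chunk_tokens(tokens)
```

With these settings, consecutive chunks share 128 tokens, so a sentence that falls on a chunk boundary still appears whole in at least one chunk.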

## Embedding Model

We use the "BAAI/bge-small-en-v1.5" embedding model from the HuggingFace Hub. This is an open-source embedding model.

Here's the [link](https://huggingface.co/BAAI/bge-small-en-v1.5) to the HuggingFace Hub repo of the model.

In [6]:
model_name='BAAI/bge-small-en-v1.5'

embed_model = embeddings.HuggingFaceEmbeddings(
    model_name=model_name
)
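
Embeddings matter for retrieval because chunks are ranked by how close their vectors are to the query vector. The sketch below shows cosine similarity, a common closeness measure, on toy 3-dimensional vectors. The vectors and names are made up for illustration, and the similarity measure BeyondLLM uses internally is an assumption here.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat a zero vector as similar to nothing
    return dot / (norm_a * norm_b)


# Toy vectors standing in for real embeddings
query_vec = [0.9, 0.1, 0.0]
chunk_vecs = {"chunk_a": [0.8, 0.2, 0.1], "chunk_b": [0.0, 0.1, 0.9]}

# Rank chunk names from most to least similar to the query
ranked = sorted(
    chunk_vecs,
    key=lambda name: cosine_similarity(query_vec, chunk_vecs[name]),
    reverse=True,
)
```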

## Define the Retriever

Here we are using the "cross-rerank" type of retriever.

You can look into the types of retrievers that BeyondLLM has to offer [here](https://beyondllm.aiplanet.com/core-components/auto-retriever).

In [7]:
retriever = retrieve.auto_retriever(
    data=data,
    embed_model=embed_model,
    type="cross-rerank",
    mode="OR",
    top_k=2)
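
The general idea behind a rerank-style retriever is a two-stage search: a cheap first pass shortlists candidates, then a more careful scorer reorders the shortlist and keeps the best `top_k`. The sketch below shows that flow only. Both scoring functions are toy word-overlap heuristics, whereas a real cross-encoder reranker would score each (query, chunk) pair with a model. All names and the sample strings are hypothetical.

```python
def retrieve_then_rerank(query, chunks, first_stage_score, rerank_score,
                         candidates=3, top_k=2):
    """Shortlist with a cheap scorer, then reorder the shortlist with a second scorer."""
    # Stage 1: score every chunk cheaply and keep the best `candidates`
    shortlist = sorted(chunks, key=lambda c: first_stage_score(query, c),
                       reverse=True)[:candidates]
    # Stage 2: rescore only the shortlist and keep the best `top_k`
    return sorted(shortlist, key=lambda c: rerank_score(query, c),
                  reverse=True)[:top_k]


def word_overlap(query, chunk):
    """Toy first-stage score: number of distinct shared words."""
    return len(set(query.lower().split()) & set(chunk.lower().split()))


def overlap_ratio(query, chunk):
    """Toy rerank score: shared words relative to chunk length."""
    words = chunk.lower().split()
    return word_overlap(query, chunk) / len(words) if words else 0.0


# Made-up sample chunks, not quotations from the regulation
sample_chunks = [
    "importers place only compliant devices on the market",
    "manufacturers draw up technical documentation",
    "importers verify that the device is marked",
    "the rules apply to medical devices",
]
top = retrieve_then_rerank("what must importers verify", sample_chunks,
                           word_overlap, overlap_ratio)
```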

## LLM

Initialize the Large Language Model that will be used to generate the responses to our questions.

In this case we will use the "mistralai/Mistral-7B-Instruct-v0.2" model.

[Link to repo](https://huggingface.co/mistralai/Mistral-7B-Instruct-v0.2)

In [8]:
llm = llms.HuggingFaceHubModel(
    model="mistralai/Mistral-7B-Instruct-v0.2",
    token=HFHUB_API_KEY
)

## Define the Prompt and questions

In the cell below, we define the questions for the LLM and the system prompt. The system prompt is required for open-source LLMs such as "mistralai/Mistral-7B-Instruct-v0.2".

In [9]:
questions = [
    "What types of products does Regulation (EU) 2017/745 apply to?",
    "According to Regulation (EU) 2017/745, what are the responsibilities of a manufacturer placing a device on the market?",
    "Define 'implantable device' as per Article 2 of Regulation (EU) 2017/745.",
    "What is required of importers before placing a device on the EU market under this regulation?",
    "How does the regulation define a 'serious incident,' and what obligations arise from it?",
    "What is the 'Unique Device Identifier' (UDI) system, and who is responsible for it?",
    "What steps must a health institution take to manufacture and use a device within its own operations?",
    "Describe the role and responsibilities of an 'authorised representative' for a manufacturer not based in the EU."
]

system_prompt = """
<s>[INST]
You are an AI Assistant.
Please provide direct answers to questions.
[/INST]
</s>
"""

In [10]:
# Run each question through the RAG pipeline and collect its answer and evaluations
def process_questions(questions, retriever, system_prompt, llm):
    results = []
    
    for question in questions:
        # Generate response for each question
        pipeline = generator.Generate(
            question=question,
            retriever=retriever,
            system_prompt=system_prompt,
            llm=llm
        )
        
        # Execute the pipeline and collect results
        response = pipeline.call()
        evaluations = pipeline.get_rag_triad_evals()
        
        # Store the response and evaluations
        results.append({
            'question': question,
            'response': response,
            'evaluations': evaluations
        })
    
    return results
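
Because each run calls the LLM once per question and again for the evaluations, it can be convenient to keep the results for later comparison. The sketch below serialises the list returned by `process_questions` to JSON. The helper `results_to_json` is hypothetical, and the `str()` calls are a precaution because the exact return types of the pipeline calls are an assumption here.

```python
import json


def results_to_json(results):
    """Serialise a list of question/response/evaluations dicts to a JSON string."""
    rows = [
        {
            "question": r["question"],
            # str() guards against non-string objects returned by the pipeline
            "response": str(r["response"]),
            "evaluations": str(r["evaluations"]),
        }
        for r in results
    ]
    return json.dumps(rows, ensure_ascii=False, indent=2)


# Toy stand-in for real pipeline output
demo = [{"question": "q1", "response": "a1", "evaluations": "placeholder"}]
demo_json = results_to_json(demo)
```

The resulting string can be written to a file with a plain `open(path, "w", encoding="utf-8")` call.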



## Evaluate RAG

In [11]:
import textwrap
# Run the pipeline on the questions defined above
results = process_questions(questions, retriever, system_prompt, llm)

# Print results with text wrapping
for result in results:
    print(f"Question: {result['question']}")
    print('\n')
    print(textwrap.fill(result['response'], width=100))  # Wrap the response text
    print('\n')
    print("Evaluations:")
    print(result['evaluations'])
    print("\n-----------------------\n")  # Separator for readability


Executing RAG Triad Evaluations...
Executing RAG Triad Evaluations...
Executing RAG Triad Evaluations...
Executing RAG Triad Evaluations...
Executing RAG Triad Evaluations...
Executing RAG Triad Evaluations...
Executing RAG Triad Evaluations...
Executing RAG Triad Evaluations...
Question: What types of products does Regulation (EU) 2017/745 apply to?


         ANSWER: Regulation (EU) 2017/745 applies to medical devices for human use and accessories
for such devices in the Union. It also applies to clinical investigations concerning these medical
devices and accessories conducted in the Union. Additionally, it applies to certain groups of
products without an intended medical purpose that are listed in Annex XVI, subject to the adoption
of common specifications addressing application of risk management and, where necessary, clinical
evaluation regarding safety. These common specifications will apply from six months after their
entry into force or from May 26, 2021, whichever is the late