<a href="https://colab.research.google.com/github/sunilravilla/sunilravilla/blob/main/beyondllm_application_for_youtube_video.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# BeyondLLM

## Build - Rapid Experiment - Evaluate - Repeat

BeyondLLM is a comprehensive framework for developing, testing, and evaluating Retrieval-Augmented Generation (RAG) systems. It streamlines the process with automated integration, customizable evaluation metrics, and support for various Large Language Models (LLMs), so pipelines can be tailored to a user's specific requirements. The goal is to minimize the risk of hallucinations in LLMs and improve their overall reliability.

### Useful Links:
- [Documentation](https://beyondllm.aiplanet.com/)
- [GitHub Repo](https://github.com/aiplanethub/beyondllm)

## Install the packages

Make sure to **restart the session** after installing the packages.

In [None]:
! pip install beyondllm youtube_transcript_api llama-index-readers-youtube-transcript llama_index.embeddings.huggingface

Collecting beyondllm
  Downloading beyondllm-0.2.0-py3-none-any.whl (46 kB)
Collecting youtube_transcript_api
  Downloading youtube_transcript_api-0.6.2-py3-none-any.whl (24 kB)
Collecting llama-index-readers-youtube-transcript
  Downloading llama_index_readers_youtube_transcript-0.1.4-py3-none-any.whl (3.7 kB)
Collecting llama_index.embeddings.huggingface
  Downloading llama_index_embeddings_huggingface-0.2.1-py3-none-any.whl (7.1 kB)
Collecting llama-index==0.10.27 (from beyondllm)
  Downloading llama_index-0.10.27-py3-none-any.whl (6.9 kB)
Collecting llama-index-embeddings-gemini==0.1.6 (from beyondllm)
  Downloading llama_index_embeddings_gemini-0.1.6-py3-none-any.whl (2.9 kB)
Collecting numpy==1.26.4 (from beyondllm)
  Downloading numpy-1.26.4-c

## Overview

In this notebook, we'll build a RAG pipeline with BeyondLLM that lets us chat with a YouTube video, and then evaluate its performance. The code covers:

- Getting data from source
- Creating embeddings
- Retrieving documents
- Generating LLM responses
- Evaluating responses

## Providing Access Token

- To get your personal access token from HuggingFace Hub, visit the [token settings page](https://huggingface.co/settings/tokens).
- **Note**: if you do not have a HuggingFace Hub account, create one by signing up.
- Click the "New Token" button at the bottom to create a new token. Copy the token and paste it when prompted after running the next code block.

In [None]:
from getpass import getpass
import os

hf_token = getpass('Enter Your HuggingfaceHub Token')

os.environ['HF_TOKEN'] = hf_token

Enter Your HuggingfaceHub Token··········
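
When the notebook is re-run often, it can be convenient to prompt only if the token is not already set. The helper below is a minimal sketch of that idea; `load_token` and its `prompt_fn` parameter are hypothetical names introduced here, not part of BeyondLLM.

```python
import os
from getpass import getpass


def load_token(env_var="HF_TOKEN", prompt_fn=getpass):
    """Return the token from the environment, prompting only if it is missing.

    `load_token` and `prompt_fn` are illustrative names, not a BeyondLLM API.
    """
    token = os.environ.get(env_var, "").strip()
    if not token:
        # Fall back to an interactive prompt; getpass hides the typed value.
        token = prompt_fn(f"Enter your token for {env_var}: ").strip()
        if not token:
            raise ValueError(f"No token provided for {env_var}")
        # Cache the value so later cells can read it from the environment.
        os.environ[env_var] = token
    return token
```

Calling `load_token()` in place of the `getpass` call would leave `os.environ['HF_TOKEN']` populated either way, so the rest of the notebook would not need to change.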


## Import BeyondLLM

In [None]:
from beyondllm import source, retrieve, embeddings, llms, generator

## Fit the Data
In this case, the data will be the following YouTube video.

[Watch the Video Here](https://youtu.be/oJJyTztI_6g?si=ufZSB6qWUZsSUCDa)

In [None]:
data = source.fit(
    path="https://www.youtube.com/watch?v=oJJyTztI_6g",
    dtype="youtube",
    chunk_size=1024,
    chunk_overlap=0)

In [None]:
print(data)

[TextNode(id_='7aa9267e-9d9f-44ab-9f84-b7e7d68aa4fa', embedding=None, metadata={'video_id': 'oJJyTztI_6g'}, excluded_embed_metadata_keys=[], excluded_llm_metadata_keys=[], relationships={<NodeRelationship.SOURCE: '1'>: RelatedNodeInfo(node_id='aad9a3bc-23fc-4ba6-9fb6-1eb2f6cf0d4a', node_type=<ObjectType.TEXT: '1'>, metadata={'video_id': 'oJJyTztI_6g'}, hash='01e83d0060377cdc09be5d73e4b038357a35c1f541dcc791f953d29f65a7f61f')}, text="hi everyone uh have you ever struggled\nto learn some complex data science\ntopics\nwell I I did I personally did and I used\nto resort to online forums and not just\nthat like you know I've seen several\nthousands of um community members at AI\nPlanet struggle with complex topics\neither they resort to online forums\nDiscord or they either go to some\nmentors who can help them in the\nthat's what we always thought how can we\nsolve this problem and I'm very excited\nto introduce you to Jupiter\nwhich is the AI Guru that would that\nwould rise and also simpl
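
To build intuition for how `chunk_size` and `chunk_overlap` interact, here is a toy splitter. It is only a sketch: it counts whitespace-separated words, which is an assumption made for readability and may not match the unit or the splitting rules that BeyondLLM applies internally. `chunk_text` is a hypothetical helper, not a BeyondLLM function.

```python
def chunk_text(text, chunk_size, chunk_overlap=0):
    """Split text into windows of `chunk_size` words.

    Consecutive windows share `chunk_overlap` words. This is an illustrative
    word-based splitter, not the splitter BeyondLLM uses.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be in [0, chunk_size)")

    words = text.split()
    step = chunk_size - chunk_overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):
            break  # the window already reached the end of the text
    return chunks


sample = "a b c d e f g h i j"
no_overlap = chunk_text(sample, chunk_size=4, chunk_overlap=0)
with_overlap = chunk_text(sample, chunk_size=4, chunk_overlap=2)
```

With `chunk_overlap=0`, as in the cell above, neighbouring chunks share no text. A positive overlap repeats the tail of one chunk at the head of the next, which can help when an answer straddles a chunk boundary, at the cost of more chunks to embed.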

## Embedding Model

We use the open-source "BAAI/bge-small-en-v1.5" embedding model from HuggingFace Hub.

Here's the [link](https://huggingface.co/BAAI/bge-small-en-v1.5) to the HuggingFace Hub repo of the model.
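
An embedding model maps each chunk and each question to a vector, and retrieval then compares those vectors. The sketch below shows the comparison step with cosine similarity. The three-dimensional vectors are made up for illustration (real vectors from this model have 384 dimensions), and the variable names are assumptions.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Made-up vectors standing in for real embeddings.
query_vec = [1.0, 0.0, 1.0]
related_chunk_vec = [0.9, 0.1, 0.8]
unrelated_chunk_vec = [0.0, 1.0, 0.0]

related_score = cosine_similarity(query_vec, related_chunk_vec)
unrelated_score = cosine_similarity(query_vec, unrelated_chunk_vec)
```

A chunk whose vector points in nearly the same direction as the question's vector gets a score close to 1, so it ranks ahead of a chunk pointing elsewhere.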

In [None]:
model_name='BAAI/bge-small-en-v1.5'

embed_model = embeddings.HuggingFaceEmbeddings(
    model_name=model_name
)

## Define the Retriever

Here we are using the "cross-rerank" type of retriever.

You can look into the types of retrievers that BeyondLLM has to offer [here](https://beyondllm.aiplanet.com/core-components/auto-retriever).
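
The general idea behind a reranking retriever is two stages: a cheap first pass selects candidates, and a more careful scorer reorders them. The sketch below illustrates that pattern with toy word-overlap scorers. It is not BeyondLLM's implementation: the function names, the `fetch_k` parameter, and both scoring functions are assumptions chosen so that the example runs without downloading a model.

```python
def tokenize(text):
    """Lowercase the text and return its distinct whitespace-separated words."""
    return set(text.lower().split())


def overlap_score(query, chunk):
    """Cheap first-stage score: number of distinct words shared with the query."""
    return len(tokenize(query) & tokenize(chunk))


def density_score(query, chunk):
    """Toy stand-in for a reranker: shared words divided by the chunk's distinct words."""
    chunk_words = tokenize(chunk)
    if not chunk_words:
        return 0.0
    return len(tokenize(query) & chunk_words) / len(chunk_words)


def retrieve_then_rerank(query, chunks, fetch_k=3, top_k=2):
    """Keep the `fetch_k` best chunks by the cheap score, then rerank and keep `top_k`."""
    candidates = sorted(chunks, key=lambda c: overlap_score(query, c), reverse=True)[:fetch_k]
    reranked = sorted(candidates, key=lambda c: density_score(query, c), reverse=True)
    return reranked[:top_k]


chunks = [
    "the video introduces a tool",
    "in the video the speaker is talking about how community members "
    "struggle with complex topics and the forums they use",
    "a recipe for pancakes",
    "which tool is mentioned",
]
query = "which tool is mentioned in the video"
top = retrieve_then_rerank(query, chunks, fetch_k=3, top_k=2)
```

In this toy run the long second chunk leads the first stage because it shares many common words with the question, but the second stage pushes it down in favour of shorter, more focused chunks. That reordering is the effect a reranker is meant to provide.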

In [None]:
retriever = retrieve.auto_retriever(
    data=data,
    embed_model=embed_model,
    type="cross-rerank",
    mode="OR",
    top_k=2)

In [None]:
# validate the retriever
retrieved_nodes = retriever.retrieve("which tool is mentioned in the video?")
print(retrieved_nodes)

[NodeWithScore(node=TextNode(id_='7aa9267e-9d9f-44ab-9f84-b7e7d68aa4fa', embedding=None, metadata={'video_id': 'oJJyTztI_6g'}, excluded_embed_metadata_keys=[], excluded_llm_metadata_keys=[], relationships={<NodeRelationship.SOURCE: '1'>: RelatedNodeInfo(node_id='aad9a3bc-23fc-4ba6-9fb6-1eb2f6cf0d4a', node_type=<ObjectType.TEXT: '1'>, metadata={'video_id': 'oJJyTztI_6g'}, hash='01e83d0060377cdc09be5d73e4b038357a35c1f541dcc791f953d29f65a7f61f')}, text="hi everyone uh have you ever struggled\nto learn some complex data science\ntopics\nwell I I did I personally did and I used\nto resort to online forums and not just\nthat like you know I've seen several\nthousands of um community members at AI\nPlanet struggle with complex topics\neither they resort to online forums\nDiscord or they either go to some\nmentors who can help them in the\nthat's what we always thought how can we\nsolve this problem and I'm very excited\nto introduce you to Jupiter\nwhich is the AI Guru that would that\nwould 

## LLM

Initialize the Large Language Model that will be used to generate the responses to our questions.

In this case we will use the "mistralai/Mistral-7B-Instruct-v0.2" model.

[Link to repo](https://huggingface.co/mistralai/Mistral-7B-Instruct-v0.2)

In [None]:
llm = llms.HuggingFaceHubModel(
    model="mistralai/Mistral-7B-Instruct-v0.2",
    token=os.environ.get('HF_TOKEN')
)

## Define the Prompt

In the cell below, we define the system prompt and the questions for the LLM. The system prompt is required for open-source LLMs such as "mistralai/Mistral-7B-Instruct-v0.2".
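
In a RAG pipeline, the system prompt, the retrieved chunks, and the question are eventually combined into a single input for the model. The sketch below shows one plausible way to do that. The layout and the `build_prompt` helper are assumptions for illustration only; BeyondLLM assembles its prompt internally and may format it differently.

```python
def build_prompt(system_prompt, context_chunks, question):
    """Join a system prompt, numbered context chunks, and a question into one string.

    Hypothetical helper: the layout is illustrative, not BeyondLLM's format.
    """
    context = "\n\n".join(
        f"[{i}] {chunk.strip()}" for i, chunk in enumerate(context_chunks, start=1)
    )
    return (
        f"{system_prompt.strip()}\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question.strip()}\n"
        "Answer:"
    )


example_prompt = build_prompt(
    "You are an AI Assistant. Please provide direct answers to questions.",
    ["first retrieved chunk ", "second retrieved chunk"],
    "what tool is mentioned?",
)
```

Numbering the chunks makes it easier to check by eye which passage an answer draws on when inspecting the prompt.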

In [None]:
question1 = "what organization is the video mentioning about?"
question2 = "what tool is mentioned?"

system_prompt = """
<s>[INST]
You are an AI Assistant.
Please provide direct answers to questions.
[/INST]
</s>
"""

## First Question - **question1**

In [None]:
pipeline = generator.Generate(
    question=question1,
    retriever=retriever,
    system_prompt=system_prompt,
    llm=llm)

In [None]:
# execute the pipeline
print(pipeline.call())


Answer: AI Planet.


### Evaluate using RAG Triads

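The RAG triad scores three things: whether the retrieved context is relevant to the question (context relevancy), whether the answer addresses the question (answer relevancy), and whether the answer is supported by the retrieved context (groundedness). More information about evaluating RAG pipelines using BeyondLLM can be found [here](https://beyondllm.aiplanet.com/core-components/evaluation).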

In [None]:
print(pipeline.get_rag_triad_evals())

Executing RAG Triad Evaluations...
Context relevancy Score: 10.0
This response meets the evaluation threshold. It demonstrates strong comprehension and coherence.
Answer relevancy Score: 10.0
This response meets the evaluation threshold. It demonstrates strong comprehension and coherence.
Groundness score: 5.0
This response does not meet the evaluation threshold. Consider refining the structure and content for better clarity and effectiveness.
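
When comparing several questions, it can help to collect the three scores in one structure and flag the ones that fall short. The sketch below does that with the scores printed above. The threshold of 8.0 is an assumption: the output only shows that 10.0 passes and 5.0 does not, and it does not state the cutoff BeyondLLM uses. `summarize_triad` is a hypothetical helper.

```python
def summarize_triad(scores, threshold):
    """Map each metric name to (score, passed), where passed means score >= threshold."""
    return {name: (score, score >= threshold) for name, score in scores.items()}


# Scores copied from the evaluation output above.
first_question_scores = {
    "context_relevancy": 10.0,
    "answer_relevancy": 10.0,
    "groundedness": 5.0,
}

ASSUMED_THRESHOLD = 8.0  # assumption: the real cutoff is not shown in the output
summary = summarize_triad(first_question_scores, ASSUMED_THRESHOLD)
failing = [name for name, (_, passed) in summary.items() if not passed]
```

A low groundedness score alongside high relevancy scores suggests the retrieved context was on topic but the evaluator did not find the answer fully supported by it, so the generated answer is the part worth inspecting first.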


## Second Question - **question2**

In [None]:
pipeline = generator.Generate(
    question=question2,
    retriever=retriever,
    system_prompt=system_prompt,
    llm=llm)

In [None]:
# execute the pipeline
print(pipeline.call())


Answer: Jupiter, an AI tool for explaining complex data science topics in various formats including as if you are Phi or in the form of a movie plot.


In [None]:
print(pipeline.get_rag_triad_evals())

Executing RAG Triad Evaluations...
Context relevancy Score: 10.0
This response meets the evaluation threshold. It demonstrates strong comprehension and coherence.
Answer relevancy Score: 10.0
This response meets the evaluation threshold. It demonstrates strong comprehension and coherence.
Groundness score: 7.0
This response does not meet the evaluation threshold. Consider refining the structure and content for better clarity and effectiveness.
