In [None]:
!pip install beyondllm

In [None]:
!pip install llama-index-finetuning
!pip install llama-index-embeddings-huggingface

## 1. Import Required Libraries

First, import the necessary libraries and set up the environment.

In [None]:
from beyondllm import source, retrieve, llms, generator
from beyondllm.embeddings import FineTuneEmbeddings
import os

In [None]:
from getpass import getpass

os.environ['GOOGLE_API_KEY'] = getpass("API key:")

API key:··········


## 2. Load and Prepare Data

Point `path` at your document and pass it to `source.fit`, which loads the file and splits it into chunks. This example uses a PDF (`dtype="pdf"`) with `chunk_size=1024` and `chunk_overlap=0`, so consecutive chunks share no content.

In [None]:
path = "Tarun_CV.pdf"

In [None]:
data = source.fit(path, dtype="pdf", chunk_size=1024, chunk_overlap=0)
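To make the roles of `chunk_size` and `chunk_overlap` concrete, here is a minimal sketch of fixed-size chunking. It is an illustration only: it counts characters, while beyondllm's own splitter may count tokens or respect sentence boundaries. The function name `chunk_text` is hypothetical and not part of beyondllm.

```python
def chunk_text(text: str, chunk_size: int = 1024, chunk_overlap: int = 0) -> list[str]:
    """Split text into fixed-size character windows with optional overlap.

    Illustrative only; this is not how beyondllm splits documents.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be in [0, chunk_size)")
    # Each new window starts `step` characters after the previous one.
    step = chunk_size - chunk_overlap
    return [text[i:i + chunk_size] for i in range(0, len(text), step)]


sample = "x" * 2500
chunks = chunk_text(sample, chunk_size=1024, chunk_overlap=0)
print([len(c) for c in chunks])
```

With zero overlap the chunks concatenate back to the original text. A non-zero overlap repeats the tail of each chunk at the head of the next, which can help when an answer straddles a chunk boundary.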

## 3. Initialize the Language Model

Initialize the language model from beyondllm. Here, we are using the GeminiModel.

In [None]:
llm = llms.GeminiModel()

## 4. Fine-Tune the Embeddings

### Why Fine-Tune Embeddings?

Fine-tuning embeddings with your specific dataset enhances the contextual understanding of the language model, making it better suited for retrieving and generating relevant information. This process adapts the embeddings to the nuances and domain-specific language present in your data, thereby improving the accuracy and relevance of RAG pipelines.

Fine-tuning is handled by the `FineTuneEmbeddings` class, which trains an embedding model on your own documents.

To build the training set, the LLM generates a dataset of pairs from the given data. The embedding model being fine-tuned here is `BAAI/bge-small-en-v1.5`, an open-source model hosted on Hugging Face.
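As a rough picture of what a dataset of pairs can look like, the sketch below links each generated question to the chunk it was written from. The `queries` / `corpus` / `relevant_docs` layout is loosely modeled on the one used by LlamaIndex's fine-tuning utilities; whether beyondllm stores its pairs exactly this way is an assumption. `build_pair_dataset` and `fake_llm` are hypothetical names, and `fake_llm` is a stand-in for a real LLM call.

```python
from typing import Callable


def build_pair_dataset(chunks: list[str], ask: Callable[[str], list[str]]) -> dict:
    """Pair every generated question with the chunk it came from."""
    corpus = {f"node-{i}": chunk for i, chunk in enumerate(chunks)}
    queries: dict[str, str] = {}
    relevant_docs: dict[str, list[str]] = {}
    for node_id, chunk in corpus.items():
        for question in ask(chunk):
            query_id = f"query-{len(queries)}"
            queries[query_id] = question
            # The positive example for this query is its source chunk.
            relevant_docs[query_id] = [node_id]
    return {"queries": queries, "corpus": corpus, "relevant_docs": relevant_docs}


def fake_llm(chunk: str) -> list[str]:
    # Placeholder: a real LLM would write questions answerable from the chunk.
    return [f"What does this passage say about '{chunk.split()[0]}'?"]


dataset = build_pair_dataset(
    ["Embeddings map text to vectors.", "Retrievers rank chunks by similarity."],
    fake_llm,
)
print(dataset["relevant_docs"])
```

Training then pulls each query's embedding toward its paired chunk, which is what adapts the model to the vocabulary of your documents.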

In [None]:
fine_tuned_model = FineTuneEmbeddings()

In [None]:
embed_model = fine_tuned_model.train([path], "BAAI/bge-small-en-v1.5", llm, "fintune")


> Note: The last argument to `train` is the path where the fine-tuned model is saved. In this example it is set to `"fintune"`, and the same path is passed to `load_model` to load the fine-tuned embedding model.


## 5. Load the Fine-Tuned Model

The fine-tuned Hugging Face embedding model has been saved, so load it from that path and use it in the retrieval and generation pipeline.

In [None]:
embed_model = fine_tuned_model.load_model("fintune")

## 6. Set Up the Retriever

Set up the retriever using the fine-tuned embeddings. The retriever is responsible for fetching relevant documents based on the query.

In [None]:
retriever = retrieve.auto_retriever(data, embed_model, type="normal", top_k=4)
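Conceptually, a top-k retriever embeds the query, scores every chunk against it, and returns the `top_k` best matches. The sketch below shows that idea with hand-written two-dimensional vectors and cosine similarity. The scoring function, the helper names `cosine` and `top_k`, and the toy vectors are all assumptions for illustration, not beyondllm internals.

```python
import math


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(query_vec: list[float], chunk_vecs: dict[str, list[float]], k: int = 4):
    """Return the k chunk ids most similar to the query, best first."""
    scored = [(cosine(query_vec, vec), chunk_id) for chunk_id, vec in chunk_vecs.items()]
    scored.sort(reverse=True)
    return [(chunk_id, score) for score, chunk_id in scored[:k]]


# Toy embeddings; real ones come from the embedding model.
chunk_vecs = {"a": [1.0, 0.0], "b": [0.0, 1.0], "c": [0.7, 0.7]}
results = top_k([1.0, 0.1], chunk_vecs, k=2)
print([chunk_id for chunk_id, _ in results])
```

Fine-tuning changes the vectors, not this ranking step: a better-adapted embedding model places relevant chunks closer to the query, so they rise in the ordering.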

## 7. Retrieve Information

You can now use the retriever to fetch relevant information. For example, to retrieve information about "Tarun's role at AI Planet":

In [None]:
retriever.retrieve("what is Tarun role in AI planet")[0].text

"Greetings\nDemetrios\nBrinkmann,\nI\nhope\nyou\nare\ndoing\nwell.\nI\nam\nwriting\nto\nexpress\nmy\ninterest\nin\nthe\nDeveloper\nAdvocate\nposition\nthat\nI\nsaw\non\nQdrant\npage.\nI\nhave\nover\n2\nyears\nexperience\nin\nDeveloper\nAdvocacy\nwith\none\nyear\nindustry\nexperience\nat\nan\nAI\nstartup\necosystem.\nI\nhave\na\nstrong\ntrack\nrecord\nof\nsuccess\nin\nbuilding\nand\nengaging\ndeveloper\ncommunities,\nas\nwell\nas\nin\ndeveloping\nand\ndeploying\nstate-of-the-art\nAI\nmodels.\nIn\nmy\ncurrent\nrole\nat\nAl\nPlanet\n,\nI\nwear\nmultiple\nhats\nby\nbeing\npart\nof\nthe\nData\nScience\nteam\nand\nhandling\nthe\ncommunity.\nI\nhave\nworked\non\nFine\nTuning\nLLMs,\nbuilding\nConsultant\nPOC\nto\nmigrate\nthe\nenterprise\nand\nbusiness\ninto\nAI,\nand\ndeploying\n6+\nstate-of-the-art\nmodels\non\nAl\nPlanet's\nAI\nMarketplace.\nI\nhave\nalso\norganised\n30+\nlive\nsessions\nwith\nexperts\nfrom\nGoogle,\nWeights\n&\nBiases,\nIntel,\nand\nmore.\nAI\nPlanet\ncurrently\nhas\n16K+

## 8. Generate Responses with RAG

Finally, set up the generation pipeline using the retriever and the language model. This pipeline will use the retrieved information to generate a coherent response.

### Evaluating RAG Performance

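The RAG triad assesses the pipeline with three metrics: Context Relevance (are the retrieved chunks relevant to the question?), Answer Relevance (does the response address the question?), and Groundedness (is the response supported by the retrieved context?). Ground Truth is a separate, fourth metric that compares the response against a reference answer. Together they measure how well the system retrieves, understands, and generates responses that are accurate, factual, and contextually appropriate.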

In [None]:
pipeline = generator.Generate(question="Tarun's role at AI Planet summarize it",retriever=retriever,llm=llm)

In [None]:
print(pipeline.call())

Tarun wears multiple hats at AI Planet, being part of the Data Science team and handling the community. He has worked on Fine Tuning LLMs, building Consultant POC to migrate the enterprise and business into AI, and deploying 6+ state-of-the-art models on AI Planet's AI Marketplace. He has also organised 30+ live sessions with experts from Google, Weights & Biases, Intel, and more.
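To see how retrieved text can feed the generation step, here is a minimal sketch of prompt assembly: the retrieved chunks are numbered and placed ahead of the question. The template wording and the `build_prompt` helper are hypothetical; beyondllm's actual prompt is not shown in this notebook and may differ.

```python
def build_prompt(question: str, contexts: list[str]) -> str:
    """Combine retrieved chunks and the user question into one prompt string."""
    numbered = "\n\n".join(
        f"[{i}] {chunk.strip()}" for i, chunk in enumerate(contexts, start=1)
    )
    return (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{numbered}\n\n"
        f"Question: {question}\nAnswer:"
    )


prompt = build_prompt(
    "Tarun's role at AI Planet summarize it",
    ["First retrieved chunk.  ", "Second retrieved chunk."],
)
print(prompt)
```

Restricting the model to the supplied context is what lets the groundedness metric check the answer against the retrieved chunks.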


In [None]:
print(pipeline.get_rag_triad_evals())

Executing RAG Triad Evaluations...
Context relevancy Score: 8.0
This response meets the evaluation threshold. It demonstrates strong comprehension and coherence.
Answer relevancy Score: 9.0
This response meets the evaluation threshold. It demonstrates strong comprehension and coherence.
Groundness score: 10.0
This response meets the evaluation threshold. It demonstrates strong comprehension and coherence.
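When comparing runs, for example before and after fine-tuning, it can help to reduce the three scores to one summary. The sketch below does that for the scores printed above. The threshold of 8.0 is an assumption chosen for illustration, since the library's actual threshold is not stated here, and `summarize_triad` is a hypothetical helper.

```python
def summarize_triad(scores: dict[str, float], threshold: float) -> dict:
    """Report the mean score and which metrics fall below the threshold."""
    if not scores:
        raise ValueError("scores must not be empty")
    failing = [name for name, score in scores.items() if score < threshold]
    return {
        "mean": sum(scores.values()) / len(scores),
        "passed": not failing,
        "failing": failing,
    }


# Scores copied from the evaluation output above.
scores = {"context_relevancy": 8.0, "answer_relevancy": 9.0, "groundedness": 10.0}
summary = summarize_triad(scores, threshold=8.0)  # threshold is an assumed value
print(summary)
```

The scores are themselves produced by an LLM judge, so small differences between runs are better read as a trend than as a precise measurement.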
