<a href="https://colab.research.google.com/github/towardsai/ai-tutor-rag-system/blob/main/notebooks/Larger_Context_Larger_N.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Install Packages and Set Up Variables

In [None]:
!pip install -q llama-index==0.10.37 openai==1.30.1 tiktoken==0.7.0 chromadb==0.5.0 llama-index-llms-gemini==0.1.10 llama-index-vector-stores-chroma==0.1.7

In [None]:
import os

# Set the API keys in the Python environment. The OpenAI and Gemini clients read them later.
os.environ["OPENAI_API_KEY"] = "[OPENAI_API_KEY]"
os.environ["GOOGLE_API_KEY"] = "[GOOGLE_API_KEY]"
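
The cell above leaves bracketed placeholders where the real keys belong. As a minimal sketch (the helper name `require_api_key` is hypothetical and not part of any library used in this notebook), a small check can fail early if a key is missing or still a placeholder, assuming placeholders keep the `[NAME]` form shown above:

```python
import os


def require_api_key(name: str) -> str:
    """Return the value of environment variable `name`.

    Raises RuntimeError if the variable is unset, empty, or still looks
    like a bracketed placeholder such as "[OPENAI_API_KEY]".
    """
    value = os.environ.get(name, "")
    looks_like_placeholder = value.startswith("[") and value.endswith("]")
    if not value or looks_like_placeholder:
        raise RuntimeError(
            f"{name} is not set; replace the placeholder with a real key."
        )
    return value
```

Calling `require_api_key("OPENAI_API_KEY")` and `require_api_key("GOOGLE_API_KEY")` before creating any client would surface a missing key immediately rather than at the first API request.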

# Load Gemini Model

In [None]:
from llama_index.llms.gemini import Gemini

llm = Gemini(model="models/gemini-pro")

# Download the Vector Store

In [None]:
!wget https://github.com/AlaFalaki/tutorial_notebooks/raw/main/data/vectorstore.zip

--2024-06-05 18:22:24--  https://github.com/AlaFalaki/tutorial_notebooks/raw/main/data/vectorstore.zip
Resolving github.com (github.com)... 140.82.121.3
Connecting to github.com (github.com)|140.82.121.3|:443... connected.
HTTP request sent, awaiting response... 302 Found
Location: https://raw.githubusercontent.com/AlaFalaki/tutorial_notebooks/main/data/vectorstore.zip [following]
--2024-06-05 18:22:24--  https://raw.githubusercontent.com/AlaFalaki/tutorial_notebooks/main/data/vectorstore.zip
Resolving raw.githubusercontent.com (raw.githubusercontent.com)... 185.199.111.133, 185.199.109.133, 185.199.110.133, ...
Connecting to raw.githubusercontent.com (raw.githubusercontent.com)|185.199.111.133|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 1479982 (1.4M) [application/zip]
Saving to: ‘vectorstore.zip’


2024-06-05 18:22:24 (54.1 MB/s) - ‘vectorstore.zip’ saved [1479982/1479982]



In [None]:
!unzip vectorstore.zip

Archive:  vectorstore.zip
   creating: mini-llama-articles/
   creating: mini-llama-articles/a361e92f-9895-41b6-ba72-4ad38e9875bd/
  inflating: mini-llama-articles/a361e92f-9895-41b6-ba72-4ad38e9875bd/data_level0.bin  
  inflating: mini-llama-articles/a361e92f-9895-41b6-ba72-4ad38e9875bd/header.bin  
 extracting: mini-llama-articles/a361e92f-9895-41b6-ba72-4ad38e9875bd/link_lists.bin  
  inflating: mini-llama-articles/a361e92f-9895-41b6-ba72-4ad38e9875bd/length.bin  
  inflating: mini-llama-articles/chroma.sqlite3  


In [None]:
import chromadb
from llama_index.vector_stores.chroma import ChromaVectorStore

# Load the vector store from the local storage.
db = chromadb.PersistentClient(path="./mini-llama-articles")
chroma_collection = db.get_or_create_collection("mini-llama-articles")
vector_store = ChromaVectorStore(chroma_collection=chroma_collection)

In [None]:
from llama_index.core import VectorStoreIndex

# Create the index based on the vector store.
index = VectorStoreIndex.from_vector_store(vector_store, llm=llm)

In [None]:
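# Pass the Gemini model explicitly so the query engine does not fall back to the default LLM.
query_engine = index.as_query_engine(llm=llm)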

res = query_engine.query("How many parameters does the LLaMA 2 model have?")

In [None]:
res.response

'The Llama 2 model has four different model sizes: 7 billion, 13 billion, 34 billion, and 70 billion parameters.'

In [None]:
# Show the retrieved nodes
for src in res.source_nodes:
    print("Node ID\t", src.node_id)
    print("Title\t", src.metadata["title"])
    print("Text\t", src.text)
    print("Score\t", src.score)
    print("-_" * 20)

Node ID	 d6f533e5-fef8-469c-a313-def19fd38efe
Title	 Meta's Llama 2: Revolutionizing Open Source Language Models for Commercial Use
Text	 I. Llama 2: Revolutionizing Commercial Use Unlike its predecessor Llama 1, which was limited to research use, Llama 2 represents a major advancement as an open-source commercial model. Businesses can now integrate Llama 2 into products to create AI-powered applications. Availability on Azure and AWS facilitates fine-tuning and adoption. However, restrictions apply to prevent exploitation. Companies with over 700 million active daily users cannot use Llama 2. Additionally, its output cannot be used to improve other language models.  II. Llama 2 Model Flavors Llama 2 is available in four different model sizes: 7 billion, 13 billion, 34 billion, and 70 billion parameters. While 7B, 13B, and 70B have already been released, the 34B model is still awaited. The pretrained variant, trained on a whopping 2 trillion tokens, boasts a context window of 4096 toke

# Evaluate

In [None]:
!wget https://raw.githubusercontent.com/AlaFalaki/tutorial_notebooks/main/data/rag_eval_dataset.json

--2024-06-05 19:43:23--  https://raw.githubusercontent.com/AlaFalaki/tutorial_notebooks/main/data/rag_eval_dataset.json
Resolving raw.githubusercontent.com (raw.githubusercontent.com)... 185.199.108.133, 185.199.109.133, 185.199.110.133, ...
Connecting to raw.githubusercontent.com (raw.githubusercontent.com)|185.199.108.133|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 476714 (466K) [text/plain]
Saving to: ‘rag_eval_dataset.json’


2024-06-05 19:43:24 (25.0 MB/s) - ‘rag_eval_dataset.json’ saved [476714/476714]



In [None]:
# Load the evaluation dataset from the JSON file downloaded above.
from llama_index.core.evaluation import EmbeddingQAFinetuneDataset

rag_eval_dataset = EmbeddingQAFinetuneDataset.from_json(
    "./rag_eval_dataset.json"
)

In [None]:
from llama_index.core.evaluation import RelevancyEvaluator, FaithfulnessEvaluator, BatchEvalRunner
from llama_index.llms.openai import OpenAI

llm_gpt4 = OpenAI(temperature=0, model="gpt-4o")

faithfulness_evaluator = FaithfulnessEvaluator(llm=llm_gpt4)
relevancy_evaluator = RelevancyEvaluator(llm=llm_gpt4)

# Run the evaluation on the first 20 queries of the dataset.
queries = list(rag_eval_dataset.queries.values())
batch_eval_queries = queries[:20]

runner = BatchEvalRunner(
    {"faithfulness": faithfulness_evaluator, "relevancy": relevancy_evaluator},
    workers=32,
)

for i in [2, 4, 6, 8, 10, 15, 20, 25, 30]:
    # Build a query engine that retrieves the top-i most similar nodes.
    query_engine = index.as_query_engine(llm=llm, similarity_top_k=i)

    eval_results = await runner.aevaluate_queries(
        query_engine, queries=batch_eval_queries
    )
    faithfulness_score = sum(result.passing for result in eval_results['faithfulness']) / len(eval_results['faithfulness'])
    print(f"top_{i} faithfulness_score: {faithfulness_score}")

    relevancy_score = sum(result.passing for result in eval_results['relevancy']) / len(eval_results['relevancy'])
    print(f"top_{i} relevancy_score: {relevancy_score}")

top_2 faithfulness_score: 1.0
top_2 relevancy_score: 1.0
top_4 faithfulness_score: 1.0
top_4 relevancy_score: 0.95
top_6 faithfulness_score: 1.0
top_6 relevancy_score: 0.95
top_8 faithfulness_score: 1.0
top_8 relevancy_score: 1.0
top_10 faithfulness_score: 1.0
top_10 relevancy_score: 1.0
top_15 faithfulness_score: 0.95
top_15 relevancy_score: 0.95
top_20 faithfulness_score: 1.0
top_20 relevancy_score: 0.95
top_25 faithfulness_score: 0.95
top_25 relevancy_score: 1.0
top_30 faithfulness_score: 0.95
top_30 relevancy_score: 0.95
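
Because only 20 queries are evaluated, each score moves in steps of 0.05, so a score of 0.95 corresponds to a single failed query. The following sketch makes that explicit. The `recorded_scores` values are copied from the output above; the helper names `pass_rate` and `failures` are hypothetical, and `SimpleNamespace` objects stand in for evaluation results that expose a `passing` attribute.

```python
from types import SimpleNamespace
from typing import Iterable


def pass_rate(results: Iterable) -> float:
    """Fraction of results whose `passing` attribute is truthy."""
    results = list(results)
    if not results:
        raise ValueError("no evaluation results to score")
    return sum(bool(r.passing) for r in results) / len(results)


def failures(score: float, n_queries: int = 20) -> int:
    """Number of failed queries implied by a pass rate over n_queries."""
    return round((1 - score) * n_queries)


# Scores copied from the output above: top_k -> (faithfulness, relevancy).
recorded_scores = {
    2: (1.0, 1.0), 4: (1.0, 0.95), 6: (1.0, 0.95),
    8: (1.0, 1.0), 10: (1.0, 1.0), 15: (0.95, 0.95),
    20: (1.0, 0.95), 25: (0.95, 1.0), 30: (0.95, 0.95),
}

# Stand-in results: 19 passing and 1 failing out of 20 queries.
demo = [SimpleNamespace(passing=True)] * 19 + [SimpleNamespace(passing=False)]
demo_rate = pass_rate(demo)

for top_k, (faith, rel) in recorded_scores.items():
    print(
        f"top_{top_k:<2}  faithfulness={faith:.2f} ({failures(faith)} failed)  "
        f"relevancy={rel:.2f} ({failures(rel)} failed)"
    )
```

Read this way, no configuration in the recorded run fails more than one query per metric, so a larger query sample would be needed before ranking the `similarity_top_k` values against each other.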
