<a href="https://colab.research.google.com/github/Shariar076/notebook-snapshots/blob/main/rag_with_llama_or_gemma_and_faiss.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

## Installing Libraries


In [None]:
%%capture
! pip install datasets pandas faiss-cpu sentence_transformers
! pip install "transformers>=4.45.1"
# Install the packages below if using a GPU
! pip install -U accelerate
! pip install -U git+https://github.com/huggingface/peft
! pip install bitsandbytes
! pip install git+https://github.com/huggingface/transformers

In [None]:
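import os
# Kill the current process so that Colab restarts the runtime and picks up the newly installed packages
os.kill(os.getpid(), 9)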

## Data sourcing and preparation


The data used in this tutorial comes from Hugging Face Datasets, specifically the `distractor` configuration of the
[hotpotqa/hotpot_qa dataset](https://huggingface.co/datasets/hotpotqa/hotpot_qa). Only the first 500 validation examples are loaded.

In [None]:
# Load Dataset
from datasets import load_dataset
import pandas as pd
import itertools

fullwiki_dataset = load_dataset("hotpotqa/hotpot_qa", 'distractor', trust_remote_code=True, split="validation[:500]")

# Convert the dataset to a pandas dataframe
dataset_df = pd.DataFrame(fullwiki_dataset)
# Convert the context column into a 1d list of sentences
dataset_df['context'] = dataset_df['context'].apply(lambda x: list(itertools.chain.from_iterable(x['sentences'])))
dataset_df.head(5)



Unnamed: 0,id,question,answer,type,level,supporting_facts,context
0,5a8b57f25542995d1e6f1371,Were Scott Derrickson and Ed Wood of the same ...,yes,comparison,hard,"{'title': ['Scott Derrickson', 'Ed Wood'], 'se...",[Ed Wood is a 1994 American biographical perio...
1,5a8c7595554299585d9e36b6,What government position was held by the woman...,Chief of Protocol,bridge,hard,"{'title': ['Kiss and Tell (1945 film)', 'Shirl...","[Meet Corliss Archer, a program from radio's G..."
2,5a85ea095542994775f606a8,"What science fantasy young adult series, told ...",Animorphs,bridge,hard,"{'title': ['The Hork-Bajir Chronicles', 'The H...",[The Andre Norton Award for Young Adult Scienc...
3,5adbf0a255429947ff17385a,Are the Laleli Mosque and Esma Sultan Mansion ...,no,comparison,hard,"{'title': ['Laleli Mosque', 'Esma Sultan Mansi...",[Esma Sultan (21 March 1873 – 7 May 1899) was ...
4,5a8e3ea95542995a26add48d,"The director of the romantic comedy ""Big Stone...","Greenwich Village, New York City",bridge,hard,"{'title': ['Big Stone Gap (film)', 'Adriana Tr...",[Just Another Romantic Wrestling Comedy is a 2...
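
The `context` conversion above relies on `itertools.chain.from_iterable` to turn a list of per-title sentence lists into one flat list. The sketch below shows that step on a single toy record. The record is made up for illustration and only mimics the `title` / `sentences` layout that the loading code assumes.

```python
import itertools

# Toy record shaped like one HotpotQA `context` entry (contents are made up).
context = {
    "title": ["Page A", "Page B"],
    "sentences": [
        ["Sentence one of A.", " Sentence two of A."],
        ["Only sentence of B."],
    ],
}

# Concatenate the per-title sentence lists into a single 1D list,
# which is what the lambda applied to the `context` column does per row.
flat_sentences = list(itertools.chain.from_iterable(context["sentences"]))
print(flat_sentences)
```

Note that the titles are dropped by this step, so each sentence later stands on its own as a retrievable document.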


## Create a document dataframe using the contexts

In [None]:

# Find the unique docs by further flattening the contexts
document_df = pd.DataFrame()
document_df['doc'] =  pd.Series(itertools.chain.from_iterable(dataset_df['context'].values)).unique()
document_df = document_df[document_df['doc'].str.len() > 0]
document_df

Unnamed: 0,doc
0,Adam Collis is an American filmmaker and actor.
1,He attended the Duke University from 1986 to ...
2,He also studied cinema at the University of S...
3,Collis first work was the assistant director ...
4,"In 1998, he played ""Crankshaft"" in Eric Koyan..."
...,...
21011,Ako recently played “Klook” in “Klook’s Last ...
21012,He also recently played esteemed British acto...
21013,Ako has also worked at London’s Donmar Wareho...
21014,Ako’s credits also include: Pilot in Nick Llo...
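
As a rough illustration of what the cell above computes, the sketch below deduplicates sentences across several contexts while keeping their order of first appearance (which is also how `pandas.Series.unique` orders its result) and then drops empty strings. The toy sentences are made up, and plain Python is used here in place of pandas.

```python
import itertools

# Toy flattened contexts for two questions (contents are made up).
contexts = [
    ["Alpha is a letter.", " It comes first.", ""],
    ["Beta is a letter.", "Alpha is a letter."],
]

# Flatten across questions, like chaining over dataset_df['context'].values.
all_sentences = itertools.chain.from_iterable(contexts)

# dict.fromkeys keeps the first occurrence of each sentence in order;
# the length filter mirrors the `str.len() > 0` check.
unique_docs = [s for s in dict.fromkeys(all_sentences) if len(s) > 0]
print(unique_docs)
```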


## Data Preparation

In [None]:
# Remove data points where the context column is missing
dataset_df = dataset_df.dropna(subset=["context"])
print("\nNumber of missing values in each column after removal:")
print(dataset_df.isnull().sum())

# The dataset has no precomputed embedding column, so there is nothing to drop here;
# new embeddings are created below with an open-source embedding model from Hugging Face


Number of missing values in each column after removal:
id                  0
question            0
answer              0
type                0
level               0
supporting_facts    0
context             0
dtype: int64


## Generating embeddings

**The steps in the code snippet are as follows:**
1. Import `faiss` and the `SentenceTransformer` class to access the vector index and the embedding models.
2. Load the `gte-large` embedding model (`thenlper/gte-large`) using the `SentenceTransformer` constructor.
3. Encode every sentence in the `doc` column of `document_df` with a single `encode` call, which returns a 2D array holding one 1024-dimensional embedding per sentence.
4. Create a `faiss.IndexFlatL2` index with the embedding dimension and add all embeddings to it, so that queries can later be matched by exact L2 distance.

*Note: It's not necessary to chunk the text, as each document is a single sentence and its length stays within a manageable range.*



In [None]:
import faiss

from sentence_transformers import SentenceTransformer

# https://huggingface.co/thenlper/gte-large
embedding_model = SentenceTransformer("thenlper/gte-large")

docs = document_df['doc'].to_numpy()
embeddings = embedding_model.encode(docs)
print(embeddings.shape)
dimension = embeddings.shape[1]
index = faiss.IndexFlatL2(dimension)
index.add(embeddings)


(21015, 1024)


Unnamed: 0,doc,embedding
0,Adam Collis is an American filmmaker and actor.,0.001379
1,He attended the Duke University from 1986 to ...,-0.011071
2,He also studied cinema at the University of S...,0.013232
3,Collis first work was the assistant director ...,-0.007571
4,"In 1998, he played ""Crankshaft"" in Eric Koyan...",0.015223
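
To make the role of the flat index concrete, here is a minimal pure-Python sketch of exhaustive L2 search: every stored vector is compared with the query and the `k` closest are returned. It is only an illustration of the idea, not Faiss code. The function name `flat_l2_search` and the toy vectors are hypothetical, and like `IndexFlatL2` it reports squared distances.

```python
def flat_l2_search(vectors, query, k):
    """Return (distances, indices) of the k stored vectors closest to the query."""
    scored = []
    for idx, vec in enumerate(vectors):
        # Squared Euclidean distance between the stored vector and the query.
        dist = sum((a - b) ** 2 for a, b in zip(vec, query))
        scored.append((dist, idx))
    scored.sort()  # ascending distance, ties broken by position
    top = scored[:k]
    return [d for d, _ in top], [i for _, i in top]


# Toy 2D "embeddings" (made up); real ones here have 1024 dimensions.
vectors = [[0.0, 0.0], [1.0, 0.0], [0.0, 2.0], [3.0, 3.0]]
distances, indices = flat_l2_search(vectors, [1.0, 0.5], k=2)
print(indices, distances)
```

The returned indices are positions in insertion order, which is why they can be mapped back to rows of the document table by position.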


In [None]:
import pickle

# Persist the embeddings so they can be reloaded without re-encoding the documents
with open("hotpotqa_embeddings.pkl", "wb") as f:
    pickle.dump(embeddings, f)

## Perform Vector Search on User Queries


In [None]:
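def vector_index_search(user_query, k):
    # Embed the query with the same model that embedded the documents
    query_embedding = embedding_model.encode([user_query])
    # D holds the L2 distances and I the positions of the k nearest documents
    D, I = index.search(query_embedding, k)
    # I has shape (1, k); map the positions back to rows of document_df
    return document_df.iloc[I[0]].values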


dataset_df['top_1'] = dataset_df['question'].apply(vector_index_search, k=1)
dataset_df['top_3'] = dataset_df['question'].apply(vector_index_search, k=3)
dataset_df['top_5'] = dataset_df['question'].apply(vector_index_search, k=5)


dataset_df

Unnamed: 0,id,question,answer,type,level,supporting_facts,context,top_1,top_3,top_5
0,5a8b57f25542995d1e6f1371,Were Scott Derrickson and Ed Wood of the same ...,yes,comparison,hard,"{'title': ['Scott Derrickson', 'Ed Wood'], 'se...",[Adam Collis is an American filmmaker and acto...,[[Ed Wood is a 1994 American biographical peri...,[[Ed Wood is a 1994 American biographical peri...,[[Ed Wood is a 1994 American biographical peri...
1,5a8c7595554299585d9e36b6,What government position was held by the woman...,Chief of Protocol,bridge,hard,"{'title': ['Kiss and Tell (1945 film)', 'Shirl...",[A Kiss for Corliss is a 1949 American comedy ...,[[Kiss and Tell is a 1945 American comedy film...,[[Kiss and Tell is a 1945 American comedy film...,[[Kiss and Tell is a 1945 American comedy film...
2,5a85ea095542994775f606a8,"What science fantasy young adult series, told ...",Animorphs,bridge,hard,"{'title': ['The Hork-Bajir Chronicles', 'The H...",[Animorphs is a science fantasy series of youn...,"[[ It is told in first person, with all six ma...","[[ It is told in first person, with all six ma...","[[ It is told in first person, with all six ma..."
3,5adbf0a255429947ff17385a,Are the Laleli Mosque and Esma Sultan Mansion ...,no,comparison,hard,"{'title': ['Laleli Mosque', 'Esma Sultan Mansi...",[Esma Sultan is the name of three daughters of...,"[[The Esma Sultan Mansion (Turkish: ""Esma Sult...","[[The Esma Sultan Mansion (Turkish: ""Esma Sult...","[[The Esma Sultan Mansion (Turkish: ""Esma Sult..."
4,5a8e3ea95542995a26add48d,"The director of the romantic comedy ""Big Stone...","Greenwich Village, New York City",bridge,hard,"{'title': ['Big Stone Gap (film)', 'Adriana Tr...","[Great Eastern Conventions, Inc. was an entert...",[[Big Stone Gap is a 2014 American drama roman...,[[Big Stone Gap is a 2014 American drama roman...,[[Big Stone Gap is a 2014 American drama roman...
...,...,...,...,...,...,...,...,...,...,...
495,5a79b7f6554299029c4b5f6f,How many restaurants comprise the quick servic...,4613,bridge,hard,"{'title': ['Ron Joyce', 'Ron Joyce', 'Tim Hort...",[Arcos Dorados Holdings Inc. is McDonald’s lar...,[[ It is also Canada's largest quick service r...,[[ It is also Canada's largest quick service r...,[[ It is also Canada's largest quick service r...
496,5ab626d555429953192ad279,Anthony Avent played basketball fo a High Scho...,lower Manhattan,bridge,hard,"{'title': ['Anthony Avent', 'Anthony Avent', '...",[Locust Lake State Park is a Pennsylvania stat...,"[[Anthony Avent (born October 18, 1969) is a r...","[[Anthony Avent (born October 18, 1969) is a r...","[[Anthony Avent (born October 18, 1969) is a r..."
497,5a84873e5542997175ce1eec,What edition of tennis' US Open was the 2017 U...,137th,bridge,hard,"{'title': ['Petra Kvitová career statistics', ...",[The 2017 Winston–Salem Open was a men's tenni...,[[ It was the last event of the 2017 US Open S...,[[ It was the last event of the 2017 US Open S...,[[ It was the last event of the 2017 US Open S...
498,5ac537975542996feb3fea3c,Walt Zeboski photographed which 40th President...,Ronald Wilson Reagan,bridge,hard,"{'title': ['Walt Zeboski', 'Ronald Reagan'], '...",[Bernhard Vogel (born 19 December 1932) is a G...,[[ Zeboski extensively photographed Ronald Rea...,[[ Zeboski extensively photographed Ronald Rea...,[[ Zeboski extensively photographed Ronald Rea...


In [None]:
dataset_df.to_csv("hotpotqa_retieval_results.csv", index=False)

## Evaluation


In [None]:
# from transformers import AutoTokenizer, AutoModelForCausalLM

# tokenizer = AutoTokenizer.from_pretrained("google/gemma-2b-it")
# model = AutoModelForCausalLM.from_pretrained("google/gemma-2b-it", device_map="auto")

In [None]:
import huggingface_hub
huggingface_hub.login('')  # pass your Hugging Face access token here

The token has not been saved to the git credentials helper. Pass `add_to_git_credential=True` in this function directly or `--add-to-git-credential` if using via `huggingface-cli` if you want to set the git credential as well.
Token is valid (permission: write).
Your token has been saved to /root/.cache/huggingface/token
Login successful


In [None]:
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM, BitsAndBytesConfig
from peft import PeftModel

# model_name = "meta-llama/Llama-3.1-8B-Instruct"
model_name = "google/gemma-2-2b-it"

# adapter_name = "shariar076/Llama-3.1-8B-Instruct-hotpotqa-raft-1000_adapter"
adapter_name = "shariar076/gemma-2-2b-hotpotqa-raft-1k_adapter"

tokenizer = AutoTokenizer.from_pretrained(model_name)

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    # load_in_8bit=use_8_bit,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
    bnb_4bit_use_double_quant=True,
)

model = AutoModelForCausalLM.from_pretrained(model_name,
                                             quantization_config=bnb_config,
                                             torch_dtype=torch.float16,
                                             # attn_implementation="flash_attention_2",
                                             device_map='auto')

model = PeftModel.from_pretrained(
        model,
        adapter_name,
        torch_dtype=torch.float16
    )

Loading checkpoint shards:   0%|          | 0/2 [00:00<?, ?it/s]

## Default Dataset

In [None]:
tokenizer.pad_token = tokenizer.eos_token


def batch_generate(rows):
  user_queries = rows['question']
  search_results = rows['context']
  instruction = "Given the question and context above, please provide one logical reasoning and one answer. Please use the format:\n\n##Reasoning: {reasoning}\n\n##Answer: {answer}."
  prompts = [f"Question: {q}\n\nContext: {c}\n\nInstruction: {instruction}" for q, c in zip(user_queries, search_results)]
  input_ids = tokenizer(prompts, padding=True, return_tensors="pt").to("cuda")
  with torch.no_grad():
    generated_ids = model.generate(**input_ids, max_new_tokens=512)
  # Each decoded string contains the prompt followed by the generated continuation
  outputs = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)

  # return {'generated_answer': outputs}
  return outputs

In [None]:
import re
from tqdm.notebook import tqdm
from datasets import Dataset

all_generated_answers = []
batch_size = 8

for i in tqdm(range(0, len(dataset_df), batch_size)):
    batch = dataset_df[i:i + batch_size]
    generated_answers = batch_generate(batch)
    all_generated_answers.extend(generated_answers)
# dataset = Dataset.from_pandas(dataset_df)
# dataset = dataset.map(batch_generate, batched=True, batch_size=batch_size)
# dataset

answers = []
for i, output in enumerate(all_generated_answers):  # `i` avoids shadowing the FAISS `index`
  # if i==1:
  #   print(output)
  answer = re.split(r'[.\n]', output.split('##Answer:')[-1])[0].strip()
  answers.append(answer)

dataset_df['answer_pred'] = answers
dataset_df[['answer', 'answer_pred']]

  0%|          | 0/63 [00:00<?, ?it/s]

Unnamed: 0,answer,answer_pred
0,yes,Both were American
1,Chief of Protocol,Meet Corliss Archer
2,Animorphs,{answer}
3,no,Yes
4,"Greenwich Village, New York City",The context does not provide any specific info...
...,...,...
495,4613,1
496,lower Manhattan,"Seton Hall University in Newark, New Jersey"
497,137th,2017 US Open
498,Ronald Wilson Reagan,Ronald Reagan
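
The `{answer}` prediction in the table above is worth a closer look. Because the decoded output contains the prompt as well as the continuation, splitting on `##Answer:` falls back to the format template inside the instruction whenever the model never writes that marker itself. The sketch below reproduces this with hypothetical completions. The helper names `build_prompt`, `extract_answer` and `extract_generated_answer` are illustrative, and the last one is only one possible way to avoid the fallback.

```python
import re

INSTRUCTION = (
    "Given the question and context above, please provide one logical reasoning "
    "and one answer. Please use the format:\n\n##Reasoning: {reasoning}\n\n##Answer: {answer}."
)


def build_prompt(question, context):
    # Same layout as the prompts built in batch_generate.
    return f"Question: {question}\n\nContext: {context}\n\nInstruction: {INSTRUCTION}"


def extract_answer(decoded_output):
    # Parsing as done in the notebook: text after the last marker,
    # cut at the first period or newline.
    tail = decoded_output.split("##Answer:")[-1]
    return re.split(r"[.\n]", tail)[0].strip()


def extract_generated_answer(decoded_output, prompt):
    # Hypothetical variant: look for the marker only in the generated part.
    completion = decoded_output[len(prompt):]
    if "##Answer:" not in completion:
        return ""
    return extract_answer(completion)


prompt = build_prompt("Who wrote it?", ["Some context sentence."])
# Hypothetical model outputs (prompt + continuation), made up for illustration.
with_answer = prompt + "\n\n##Reasoning: The context names the author.\n\n##Answer: Jane Doe."
without_answer = prompt + "\n\nI am not sure."

print(extract_answer(with_answer))
print(extract_answer(without_answer))
print(repr(extract_generated_answer(without_answer, prompt)))
```

Slicing by `len(prompt)` assumes the decoded text starts with the prompt exactly as it was written, which may not hold if the tokenizer normalizes whitespace.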


In [None]:
ds_acc = (dataset_df.answer_pred.str.extract(r'(\w+)', expand=False).str.lower()==dataset_df.answer.str.extract(r'(\w+)', expand=False).str.lower()).mean()
print("ds_acc", ds_acc*100)

ds_acc 34.0
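
The accuracy above compares only the first word of each prediction with the first word of the reference, ignoring case. The sketch below restates that rule in plain Python and contrasts it with a SQuAD-style normalized exact match, which is shown purely as a point of comparison and is not what this notebook computes. The function names are hypothetical and the `"The Animorphs"` prediction is made up.

```python
import re
import string


def first_word(text):
    # First run of word characters, lowercased (None if there is none).
    match = re.search(r"\w+", text)
    return match.group(0).lower() if match else None


def first_word_match(pred, gold):
    # The rule used for the accuracy in this notebook.
    p, g = first_word(pred), first_word(gold)
    return p is not None and p == g


def normalize_answer(text):
    # Lowercase, strip punctuation and articles, collapse whitespace.
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())


def exact_match(pred, gold):
    return normalize_answer(pred) == normalize_answer(gold)


pairs = [
    ("Ronald Reagan", "Ronald Wilson Reagan"),  # counted correct by first word only
    ("The Animorphs", "Animorphs"),             # counted correct by exact match only
    ("4613", "4613"),
    ("Yes", "no"),
]
for pred, gold in pairs:
    print(pred, "|", gold, "|", first_word_match(pred, gold), exact_match(pred, gold))
```

The first-word rule is lenient towards partial answers that start correctly and strict towards answers with a leading article, so the reported percentages are best read as a rough signal.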


In [None]:
dataset_df.to_csv("hotpotqa_gemma-2-2b-it_distractor_cot_ft_results.csv", index = False)

## FAISS Retrieved

In [None]:
import pandas as pd

dataset_df = pd.read_csv("hotpotqa_retieval_results.csv", index_col='id')
# dataset_df = pd.read_csv("hotpotqa_llama-3.1-8B-instruct_no_cot_results.csv")
dataset_df.head(5)

Unnamed: 0_level_0,question,answer,type,level,supporting_facts,context,top_1,top_3,top_5
id,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1,Unnamed: 7_level_1,Unnamed: 8_level_1,Unnamed: 9_level_1
5a8b57f25542995d1e6f1371,Were Scott Derrickson and Ed Wood of the same ...,yes,comparison,hard,"{'title': ['Scott Derrickson', 'Ed Wood'], 'se...",['Adam Collis is an American filmmaker and act...,[['Ed Wood is a 1994 American biographical per...,[['Ed Wood is a 1994 American biographical per...,[['Ed Wood is a 1994 American biographical per...
5a8c7595554299585d9e36b6,What government position was held by the woman...,Chief of Protocol,bridge,hard,"{'title': ['Kiss and Tell (1945 film)', 'Shirl...",['A Kiss for Corliss is a 1949 American comedy...,[['Kiss and Tell is a 1945 American comedy fil...,[['Kiss and Tell is a 1945 American comedy fil...,[['Kiss and Tell is a 1945 American comedy fil...
5a85ea095542994775f606a8,"What science fantasy young adult series, told ...",Animorphs,bridge,hard,"{'title': ['The Hork-Bajir Chronicles', 'The H...",['Animorphs is a science fantasy series of you...,"[[' It is told in first person, with all six m...","[[' It is told in first person, with all six m...","[[' It is told in first person, with all six m..."
5adbf0a255429947ff17385a,Are the Laleli Mosque and Esma Sultan Mansion ...,no,comparison,hard,"{'title': ['Laleli Mosque', 'Esma Sultan Mansi...",['Esma Sultan is the name of three daughters o...,"[['The Esma Sultan Mansion (Turkish: ""Esma Sul...","[['The Esma Sultan Mansion (Turkish: ""Esma Sul...","[['The Esma Sultan Mansion (Turkish: ""Esma Sul..."
5a8e3ea95542995a26add48d,"The director of the romantic comedy ""Big Stone...","Greenwich Village, New York City",bridge,hard,"{'title': ['Big Stone Gap (film)', 'Adriana Tr...","['Great Eastern Conventions, Inc. was an enter...",[['Big Stone Gap is a 2014 American drama roma...,[['Big Stone Gap is a 2014 American drama roma...,[['Big Stone Gap is a 2014 American drama roma...


In [None]:
import re
def generate_answer(row, k):
  user_query = row['question']
  search_results = row[f'top_{k}']
  # prompt = f"Query: {user_query}\nProvide answer to the query by using the Search Results:\n{search_results}\n\n."+" Please use the format: Answer:{only answer with no explanation}"
  instruction = "Given the question and context above, please provide one logical reasoning and one answer. Please use the format:\n\n##Reasoning: {reasoning}\n\n##Answer: {answer}."
  prompt = f"Question: {user_query}\n\nContext: {search_results}\n\nInstruction: {instruction}"
  input_ids = tokenizer(prompt, return_tensors="pt").to("cuda")
  response = model.generate(**input_ids, max_new_tokens=512)
  # The decoded text contains the prompt followed by the generated continuation
  output = tokenizer.decode(response[0], skip_special_tokens=True)
  # Keep the text after the last '##Answer:' marker, up to the first period or newline
  answer = re.split(r'[.\n]', output.split('##Answer:')[-1])[0].strip()
  return answer



In [None]:
tokenizer.pad_token = tokenizer.eos_token

def batch_generate(rows, k):
  user_queries = rows['question']
  search_results = rows[f'top_{k}']
  instruction = "Given the question and context above, please provide one logical reasoning and one answer. Please use the format:\n\n##Reasoning: {reasoning}\n\n##Answer: {answer}."
  prompts = [f"Question: {q}\n\nContext: {c}\n\nInstruction: {instruction}" for q, c in zip(user_queries, search_results)]
  input_ids = tokenizer(prompts, padding=True, return_tensors="pt").to("cuda")
  with torch.no_grad():
    generated_ids = model.generate(**input_ids, max_new_tokens=512)
  outputs = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)

  return outputs




In [None]:
batch_generate(dataset_df[100:101], 1)
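
`batch_generate` is called on consecutive fixed-size slices of the DataFrame. As a small sketch of that slicing, assuming 500 rows (the size requested by the `validation[:500]` split), the example below counts the batches produced for the two batch sizes used in this notebook. The helper `iter_batches` is hypothetical and works on a plain list instead of a DataFrame.

```python
def iter_batches(items, batch_size):
    # Yield consecutive slices; the last one may be shorter than batch_size.
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]


rows = list(range(500))  # stand-in for the 500 evaluation rows

batches_of_8 = list(iter_batches(rows, 8))
batches_of_64 = list(iter_batches(rows, 64))

print(len(batches_of_8), len(batches_of_8[-1]))    # 63 4
print(len(batches_of_64), len(batches_of_64[-1]))  # 8 52
```

These counts agree with the 63 and 8 steps reported by the progress bars. Larger batches mean fewer `generate` calls but more padding, since every prompt in a batch is padded to the longest one.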

In [None]:
# dataset_df['answer_k_1'] = dataset_df.apply(generate_answer, k=1, axis=1)
import re
from tqdm.notebook import tqdm

all_generated_answers = []
batch_size = 64
for i in tqdm(range(0, len(dataset_df), batch_size)):
    batch = dataset_df[i:i + batch_size]
    generated_answers = batch_generate(batch, k=1)
    all_generated_answers.extend(generated_answers)

answers = []
for index, output in enumerate(all_generated_answers):
  # if index==1:
  #   print(output)
  answer = re.split('[.\n]', output.split('##Answer:')[-1])[0].strip()
  answers.append(answer)

dataset_df['answer_k_1'] = answers
dataset_df[['answer', 'answer_k_1']]

# dataset_df.answer.str.extract(r'(\w+)', expand=False).str.lower().head(10)

  0%|          | 0/8 [00:00<?, ?it/s]

Unnamed: 0_level_0,answer,answer_k_1
id,Unnamed: 1_level_1,Unnamed: 2_level_1
5a8b57f25542995d1e6f1371,yes,Ed Wood is American
5a8c7595554299585d9e36b6,Chief of Protocol,Shirley Temple
5a85ea095542994775f606a8,Animorphs,"The series you are describing is likely ""The H..."
5adbf0a255429947ff17385a,no,"No, the Laleli Mosque is not located in the sa..."
5a8e3ea95542995a26add48d,"Greenwich Village, New York City",New York City
...,...,...
5a79b7f6554299029c4b5f6f,4613,4613
5ab626d555429953192ad279,lower Manhattan,
5a84873e5542997175ce1eec,137th,2017
5ac537975542996feb3fea3c,Ronald Wilson Reagan,Ronald Reagan


In [None]:
k_1_acc = (dataset_df.answer_k_1.str.extract(r'(\w+)', expand=False).str.lower()==dataset_df.answer.str.extract(r'(\w+)', expand=False).str.lower()).mean()
print("k_1_acc", k_1_acc*100)

k_1_acc 22.0


In [None]:
# dataset_df['answer_k_3'] = dataset_df.apply(generate_answer, k=3, axis=1)
import re
from tqdm.notebook import tqdm

all_generated_answers = []
batch_size = 64
for i in tqdm(range(0, len(dataset_df), batch_size)):
    batch = dataset_df[i:i + batch_size]
    generated_answers = batch_generate(batch, k=3)
    all_generated_answers.extend(generated_answers)

answers = []
for index, output in enumerate(all_generated_answers):
  # if index==1:
  #   print(output)
  answer = re.split('[.\n]', output.split('##Answer:')[-1])[0].strip()
  answers.append(answer)

dataset_df['answer_k_3'] = answers
dataset_df[['answer', 'answer_k_3']]

  0%|          | 0/8 [00:00<?, ?it/s]

Unnamed: 0_level_0,answer,answer_k_3
id,Unnamed: 1_level_1,Unnamed: 2_level_1
5a8b57f25542995d1e6f1371,yes,
5a8c7595554299585d9e36b6,Chief of Protocol,Shirley Temple was a child actress who was fam...
5a85ea095542994775f606a8,Animorphs,
5adbf0a255429947ff17385a,no,
5a8e3ea95542995a26add48d,"Greenwich Village, New York City",We cannot answer this question based on the pr...
...,...,...
5a79b7f6554299029c4b5f6f,4613,4613
5ab626d555429953192ad279,lower Manhattan,
5a84873e5542997175ce1eec,137th,
5ac537975542996feb3fea3c,Ronald Wilson Reagan,


In [None]:
k_3_acc = (dataset_df.answer_k_3.str.extract(r'(\w+)', expand=False).str.lower()==dataset_df.answer.str.extract(r'(\w+)', expand=False).str.lower()).mean()
print("k_3_acc", k_3_acc*100)

k_3_acc 17.2


In [None]:
# dataset_df['answer_k_5'] = dataset_df.apply(generate_answer, k=5, axis=1)
import re
from tqdm.notebook import tqdm

all_generated_answers = []
batch_size = 64
for i in tqdm(range(0, len(dataset_df), batch_size)):
    batch = dataset_df[i:i + batch_size]
    generated_answers = batch_generate(batch, k=5)
    all_generated_answers.extend(generated_answers)

answers = []
for index, output in enumerate(all_generated_answers):
  # if index==1:
  #   print(output)
  answer = re.split('[.\n]', output.split('##Answer:')[-1])[0].strip()
  answers.append(answer)

dataset_df['answer_k_5'] = answers
dataset_df[['answer', 'answer_k_5']]

  0%|          | 0/8 [00:00<?, ?it/s]

Unnamed: 0_level_0,answer,answer_k_5
id,Unnamed: 1_level_1,Unnamed: 2_level_1
5a8b57f25542995d1e6f1371,yes,
5a8c7595554299585d9e36b6,Chief of Protocol,
5a85ea095542994775f606a8,Animorphs,
5adbf0a255429947ff17385a,no,
5a8e3ea95542995a26add48d,"Greenwich Village, New York City",We cannot determine the director's location ba...
...,...,...
5a79b7f6554299029c4b5f6f,4613,27
5ab626d555429953192ad279,lower Manhattan,"Newark, New Jersey"
5a84873e5542997175ce1eec,137th,
5ac537975542996feb3fea3c,Ronald Wilson Reagan,


In [None]:
k_5_acc = (dataset_df.answer_k_5.str.extract(r'(\w+)', expand=False).str.lower()==dataset_df.answer.str.extract(r'(\w+)', expand=False).str.lower()).mean()
print("k_5_acc", k_5_acc*100)

k_5_acc 17.4


In [None]:
dataset_df.to_csv("hotpotqa_gemma-2-2b-it_no_cot_results.csv")  # default index=True keeps the `id` index in the file