# RAG + LLM Assessment

Your task is to create a Retrieval-Augmented Generation (RAG) system using a Large Language Model (LLM). The RAG system should be able to retrieve relevant information from a knowledge base and generate coherent and informative responses to user queries.

Steps:

1. Choose a domain and collect a suitable dataset of documents (at least 5 documents - PDFs or HTML pages) to serve as the knowledge base for your RAG system. Select one of the following topics:
   * latest scientific papers from arxiv.org,
   * fiction books released,
   * legal documents or,
   * social media posts.

   Make sure that the documents are newer then the training dataset of the applied LLM. (20 points)

2. Create three relevant prompts to the dataset, and one irrelevant prompt. (20 points)

3. Load an LLM with at least 5B parameters. (10 points)

4. Test the LLM with your prompts. The goal should be that without the collected dataset your model is unable to answer the question. If it gives you a good answer, select another question to answer and maybe a different dataset. (10 points)

5. Create a LangChain-based RAG system by setting up a vector database from the documents. (20 points)

6. Provide your three relevant and one irrelevant prompts to your RAG system. For the relevant prompts, your RAG system should return relevant answers, and for the irrelevant prompt, an empty answer. (20 points)


For my dataset, I've chosen to focus on the Met Gala 2024, which took place on May 2, 2024.

Prompts:

- What are the appearance of Indian designers at Met Gala 2024?
- What is Jennie Blackpink wearing to Met Gala 2024?
- Did Taylor Swift go to Met Gala 2024?



Irrelevant Prompt:

- Summarize Bridgerton Season 3



In [1]:
!pip install transformers>=4.32.0 optimum>=1.12.0 > null
!pip install auto-gptq --extra-index-url https://huggingface.github.io/autogptq-index/whl/cu118/ > null
!pip install langchain > null
!pip install chromadb > null
!pip install sentence_transformers > null # ==2.2.2
!pip install unstructured > null
!pip install pdf2image > null
!pip install pdfminer.six > null
!pip install unstructured-pytesseract > null
!pip install unstructured-inference > null
!pip install faiss-gpu > null
!pip install pikepdf > null
!pip install pypdf > null
!pip install accelerate > null
!pip install pillow_heif > null
!pip install -i https://pypi.org/simple/ bitsandbytes > null

In [3]:
import locale
from textwrap import fill

from huggingface_hub import login
from langchain.chains import RetrievalQA
from langchain.document_loaders import UnstructuredURLLoader
from langchain.embeddings import HuggingFaceEmbeddings
from langchain.llms import HuggingFacePipeline
from langchain.prompts import PromptTemplate
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain.vectorstores import FAISS
from langchain.vectorstores.utils import (
    filter_complex_metadata,  # 'filter_complex_metadata' removes complex metadata that are not in str, int, float or bool format
)
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    BitsAndBytesConfig,
    GenerationConfig,
    pipeline,
)

locale.getpreferredencoding = lambda: "UTF-8"

# you need to define your private User Access Token from Huggingface
# to be able to access models with accepted licence
HUGGINGFACE_UAT = "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"
login(HUGGINGFACE_UAT)

  from .autonotebook import tqdm as notebook_tqdm


The token has not been saved to the git credentials helper. Pass `add_to_git_credential=True` in this function directly or `--add-to-git-credential` if using via `huggingface-cli` if you want to set the git credential as well.
Token is valid (permission: read).
Your token has been saved to /home/lehoangchibach/.cache/huggingface/token
Login successful


In [4]:
import torch

model_name = "meta-llama/Meta-Llama-3-8B-Instruct"

quantization_config = BitsAndBytesConfig(load_in_4bit=True)

tokenizer = AutoTokenizer.from_pretrained(model_name, use_fast=True)

model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.bfloat16,
    quantization_config=quantization_config,
    device_map="auto",
)

gen_cfg = GenerationConfig.from_pretrained(model_name)
gen_cfg.max_new_tokens = 512
gen_cfg.temperature = (
    0.0000001  # 0.0 # For RAG we would like to have determenistic answers
)
gen_cfg.return_full_text = True
gen_cfg.do_sample = True
gen_cfg.repetition_penalty = 1.11

pipe = pipeline(
    task="text-generation", model=model, tokenizer=tokenizer, generation_config=gen_cfg
)

llm = HuggingFacePipeline(pipeline=pipe)

Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
Loading checkpoint shards: 100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 4/4 [00:08<00:00,  2.04s/it]


In [5]:
template = """
<|begin_of_text|><|start_header_id|>system<|end_header_id|>

If you can't find the relevant information in the context, just say you don't have enough information to answer the question. Don't try to make up an answer.

<|begin_of_text|><|start_header_id|>user<|end_header_id|>

{text}<|eot_id|><|start_header_id|>assistant<|end_header_id|>
"""

prompt = PromptTemplate(
    input_variables=["text"],
    template=template,
)

In [6]:
text = "What are the appearance of Indian designers at Met Gala 2024?"
result = llm(prompt.format(text=text))
print(fill(result.strip(), width=100))

  warn_deprecated(
Setting `pad_token_id` to `eos_token_id`:128001 for open-end generation.


<|begin_of_text|><|start_header_id|>system<|end_header_id|>  If you can't find the relevant
information in the context, just say you don't have enough information to answer the question. Don't
try to make up an answer.  <|begin_of_text|><|start_header_id|>user<|end_header_id|>  What are the
appearance of Indian designers at Met Gala
2024?<|eot_id|><|start_header_id|>assistant<|end_header_id|> I don't have enough information to
answer this question as it is about a future event (Met Gala 2024) that has not yet occurred. The
Metropolitan Museum of Art's Costume Institute Benefit, commonly known as the Met Gala, typically
takes place annually in May and features a theme and guest list that is announced well in advance.
Since the event for 2024 has not been officially announced, I do not have any information on the
appearances of Indian designers or anyone else who may be attending.


In [7]:
web_loader = UnstructuredURLLoader(
    urls=[
        "https://www.harpersbazaar.com/celebrity/latest/a60701816/blackpink-jennie-kim-red-carpet-photos-met-gala-2024/",
        "https://www.bbc.com/news/live/world-us-canada-68955283",
        "https://edition.cnn.com/style/gallery/met-gala-2024-red-carpet-fashion/index.html",
        "https://www.vogue.com/article/everything-to-know-met-gala",
        "https://www.cosmopolitan.com/uk/entertainment/g60711944/who-skipped-the-2024-met-gala/",
    ],
    mode="elements",
    strategy="fast",
)
web_doc = web_loader.load()
updated_web_doc = filter_complex_metadata(web_doc)

In [8]:
text_splitter = RecursiveCharacterTextSplitter(chunk_size=2048, chunk_overlap=512)
chunked_web_doc = text_splitter.split_documents(updated_web_doc)
len(chunked_web_doc)

557

In [9]:
embeddings = HuggingFaceEmbeddings()



In [10]:
%%time

# Create the vectorized db with FAISS

db_web = FAISS.from_documents(chunked_web_doc, embeddings)

CPU times: user 868 ms, sys: 2.76 ms, total: 871 ms
Wall time: 705 ms


In [11]:
prompt_template = """
<|begin_of_text|><|start_header_id|>system<|end_header_id|>

Use the following context to answer the question at the end. Do not use any other information. If you can't find the relevant information in the context, just say you don't have enough information to answer the question. Don't try to make up an answer.

{context}<|eot_id|><|start_header_id|>user<|end_header_id|>

{question}<|eot_id|><|start_header_id|>assistant<|end_header_id|>
"""

In [12]:
prompt = PromptTemplate(
    template=prompt_template, input_variables=["context", "question"]
)
Chain_web = RetrievalQA.from_chain_type(
    llm=llm,
    chain_type="stuff",
    # retriever=db.as_retriever(search_type="similarity_score_threshold", search_kwargs={'k': 5, 'score_threshold': 0.8})
    # Similarity Search is the default way to retrieve documents relevant to a query, but we can use MMR by setting search_type = "mmr"
    # k defines how many documents are returned; defaults to 4.
    # score_threshold allows to set a minimum relevance for documents returned by the retriever, if we are using the "similarity_score_threshold" search type.
    # return_source_documents=True, # Optional parameter, returns the source documents used to answer the question
    retriever=db_web.as_retriever(
        search_type="similarity_score_threshold",
        search_kwargs={"k": 10, "score_threshold": 0.1},
    ),
    chain_type_kwargs={"prompt": prompt},
)

In [13]:
query = "What are the appearance of Indian designers at Met Gala 2024?"
result = Chain_web.invoke(query)
print(fill(result["result"].strip(), width=100))

Setting `pad_token_id` to `eos_token_id`:128001 for open-end generation.


<|begin_of_text|><|start_header_id|>system<|end_header_id|>  Use the following context to answer the
question at the end. Do not use any other information. If you can't find the relevant information in
the context, just say you don't have enough information to answer the question. Don't try to make up
an answer.  Posted at 2:122:12Indian designers in full bloom at the Met GalaSuranjana TewariAsia
business reporterGetty ImagesCopyright: Getty ImagesNatasha PoonawallaImage caption: Natasha
PoonawallaFor Indian celebrities, the Met Gala is an opportunity to shine on the global stage. Some
attendees like Natasha Poonawalla are regulars. This year the socialite – dubbed Mrs Vaccine on
social media owing to her Covid vaccine-making billionaire husband - served a custom look from
Maison Margiela’s Artisanal Collection designed by John Galliano.Indian designers also continue to
have their moment in the sun. Actress and producer Mindy Kaling stunned in a champagne ensemble
crafted by Gaurav Gup

In [15]:
query = "What is Jennie Blackpink wearing at Met Gala 2024?"
result = Chain_web.invoke(query)
print(fill(result["result"].strip(), width=100))

Setting `pad_token_id` to `eos_token_id`:128001 for open-end generation.


<|begin_of_text|><|start_header_id|>system<|end_header_id|>  Use the following context to answer the
question at the end. Do not use any other information. If you can't find the relevant information in
the context, just say you don't have enough information to answer the question. Don't try to make up
an answer.  Tonight marks Jennie’s second appearance at the Met Gala; she first attended last year.
At the time, she wore a vintage Chanel dress plucked from the fashion house’s Fall/Winter 1990
collection: a strapless white minidress with a scalloped neckline and a pleated hem. Coordinating
with a black ribbon tied around the bustier, she finished off the look with black gloves, opaque
black tights, and black platform heels.  Blackpink’s Jennie Looks Sensational in a Fluid Cobalt
Minidress at the 2024 Met Gala  Blackpink’s Jennie Looks Sensational in a Fluid Cobalt Minidress at
the 2024 Met Gala  FashionTonight’s 2024 Met Gala Dress Code, ExplainedBy Lilah Ramzi  The Best
After-Party Loo

In [17]:
query = "Did Taylor Swift go to Met Gala 2024?"
result = Chain_web.invoke(query)
print(fill(result["result"].strip(), width=100))

Setting `pad_token_id` to `eos_token_id`:128001 for open-end generation.


<|begin_of_text|><|start_header_id|>system<|end_header_id|>  Use the following context to answer the
question at the end. Do not use any other information. If you can't find the relevant information in
the context, just say you don't have enough information to answer the question. Don't try to make up
an answer.  Several outlets reported that Taylor Swift "passed on" the event this year despite
receiving an invitation. Tay hasn't attended since she was one of the gala's co-chairs in 2016
(pictured here), and according to her Eras Tour schedule, she's set to continue her international
leg on 9th May — just three days after the Met Gala.  Per TMZ, “sources with direct knowledge” say
Travis and Taylor were “each issued individual invitations to the Met Gala."  Taylor Russell Is an
It Girl at the Met Gala  Where is the Met Gala held?  What happens at the Met Gala?  Who is invited
to the Met Gala?  A refresher that the actor skipped the Met Gala in 2023 because she had just
welcomed her fou

In [18]:
#irrelevant prompt

query = "Summarize Bridgerton season 3 for me"
result = Chain_web.invoke(query)
print(fill(result["result"].strip(), width=100))

Setting `pad_token_id` to `eos_token_id`:128001 for open-end generation.


<|begin_of_text|><|start_header_id|>system<|end_header_id|>  Use the following context to answer the
question at the end. Do not use any other information. If you can't find the relevant information in
the context, just say you don't have enough information to answer the question. Don't try to make up
an answer.  <|eot_id|><|start_header_id|>user<|end_header_id|>  Summarize Bridgerton season 3 for
me<|eot_id|><|start_header_id|>assistant<|end_header_id|> I apologize, but I don't have enough
information to summarize Bridgerton season 3 because there is no mention of a third season in the
provided context. The text does not contain any information about the show "Bridgerton" or its
seasons.
