In [1]:
!pip install langchain langchain-community langchain-huggingface chromadb pypdf transformers torch sentence-transformers wikipedia bitsandbytes



## Load and process the PDF

In [2]:
from langchain_community.document_loaders import PyPDFLoader

pdf_path = "data/rendang.pdf"
loader = PyPDFLoader(pdf_path)
documents = loader.load()

In [3]:
print(documents[0])

page_content='Review Article
Rendang: The treasure of Minangkabau
Muthia Nurmuﬁda, Gervasius H. Wangrimen*, Risty Reinalta, Kevin Leonardi
Nutrition and Food Technology Department, Faculty of Life Science, Surya University, Indonesia
article info
Article history:
Received 19 July 2017
Received in revised form
11 October 2017
Accepted 11 October 2017
Available online 19 October 2017
Keywords:
Cuisine
Culture
History
Minangkabau
Rendang
abstract
Rendang is a traditional food originating from West Sumatra and prepared by Minangkabau people.
Rendang is commonly made with beef (especially tenderloin) with special sauce containing a high
amount of coconut milk. In the past, Minangkabau people preparedrendang in such a way that it has long
shelf life and could be stored during long journeys. The long shelf life ofrendang is thought to be
contributed by the spices used during the cooking process. Nowadays,rendang is known worldwide, but
its history and cultural signiﬁcance are given less atten

## Split documents into chunks

In [4]:
from langchain_text_splitters import RecursiveCharacterTextSplitter

text_splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=200)
texts = text_splitter.split_documents(documents)
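
To make the effect of `chunk_size` and `chunk_overlap` concrete, here is a minimal pure-Python sketch of overlapping character windows. It is a simplification, not the library's algorithm: `RecursiveCharacterTextSplitter` first tries to break on separators such as paragraph breaks, newlines, and spaces, and only falls back to raw characters when needed. The helper name `sliding_window_chunks` and the toy sizes are hypothetical and chosen only for illustration.

```python
def sliding_window_chunks(text: str, chunk_size: int, chunk_overlap: int) -> list[str]:
    """Split text into fixed-size character windows that share an overlap.

    Illustration only: the real splitter prefers natural separators and
    therefore produces chunks of varying length.
    """
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap  # how far each new window advances
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last window already reached the end of the text
    return chunks


sample = "abcdefghijklmnopqrstuvwxyz"
chunks = sliding_window_chunks(sample, chunk_size=10, chunk_overlap=4)
for chunk in chunks:
    print(chunk)
```

Each window repeats the last four characters of the previous one. In the notebook the same idea applies at a larger scale: up to 200 characters are shared between neighbouring 1000-character chunks, so a sentence cut at a chunk boundary is more likely to appear whole in one of them.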

## Create embeddings and the ChromaDB vector store

In [5]:
from langchain_huggingface import HuggingFaceEmbeddings
from langchain_community.vectorstores import Chroma

embedding_model = HuggingFaceEmbeddings(model_name="sentence-transformers/all-MiniLM-L6-v2")
vectorstore = Chroma.from_documents(documents=texts, embedding=embedding_model, persist_directory="./chroma_db")
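
The vector store answers a query by comparing the query's embedding with the stored chunk embeddings and returning the closest ones. The sketch below illustrates that idea with hand-written three-dimensional toy vectors and cosine similarity. Both are assumptions for illustration: real `all-MiniLM-L6-v2` embeddings have 384 dimensions, and the distance metric Chroma actually uses depends on how the collection is configured. The names `cosine_similarity`, `top_k`, and `toy_vectors` are hypothetical.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors (1.0 means same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def top_k(query_vec: list[float], doc_vecs: dict[str, list[float]], k: int = 2) -> list[str]:
    """Return the names of the k stored vectors most similar to the query."""
    ranked = sorted(
        doc_vecs.items(),
        key=lambda item: cosine_similarity(query_vec, item[1]),
        reverse=True,
    )
    return [name for name, _ in ranked[:k]]


# Toy "embeddings" standing in for three text chunks (made-up values).
toy_vectors = {
    "rendang history": [0.9, 0.1, 0.0],
    "coconut milk": [0.6, 0.4, 0.1],
    "journal license": [0.0, 0.1, 0.9],
}
query_vector = [0.8, 0.2, 0.0]

nearest = top_k(query_vector, toy_vectors, k=2)
print(nearest)
```

With `k=2`, only the two chunks pointing in roughly the same direction as the query are returned, and the unrelated third chunk is dropped.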



## Set up the Llama 3 Hugging Face model with quantization

In [None]:
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig, pipeline
from langchain_huggingface import HuggingFacePipeline

model_name = "glaiveai/Llama-3-8B-RAG-v1"
tokenizer = AutoTokenizer.from_pretrained(model_name)

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
    bnb_4bit_use_double_quant=True
)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    quantization_config=bnb_config,
    device_map="auto"
)

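llm_pipeline = pipeline(
    "text-generation",
    model=model,  # already placed on devices by device_map="auto" above
    tokenizer=tokenizer,
    max_new_tokens=512,
    do_sample=True,  # sampling must be enabled for temperature and top_p to apply
    temperature=0.7,
    top_p=0.9,
    pad_token_id=tokenizer.eos_token_id,
    return_full_text=False  # return only the generated answer, not the prompt
)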
llm = HuggingFacePipeline(pipeline=llm_pipeline)
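
As a rough guide to why 4-bit loading matters here, the back-of-the-envelope sketch below estimates the memory taken by the model weights alone. It assumes about 8 billion parameters (inferred from the "8B" in the model name) and ignores quantization constants, layers that stay unquantized, activations, and the KV cache, so real usage will be higher. The helper `estimate_weight_memory_gb` is hypothetical.

```python
def estimate_weight_memory_gb(num_params: float, bits_per_param: float) -> float:
    """Approximate weight storage in gigabytes (1 GB = 1e9 bytes)."""
    return num_params * bits_per_param / 8 / 1e9


NUM_PARAMS = 8e9  # assumption: roughly 8 billion parameters

fp16_gb = estimate_weight_memory_gb(NUM_PARAMS, 16)  # half-precision weights
nf4_gb = estimate_weight_memory_gb(NUM_PARAMS, 4)    # 4-bit quantized weights

print(f"fp16 weights:  ~{fp16_gb:.0f} GB")
print(f"4-bit weights: ~{nf4_gb:.0f} GB")
```

Under these assumptions the weights shrink to about a quarter of their half-precision size, which is what makes an 8B model a plausible fit for a single consumer or Colab GPU.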

## Set up Wikipedia tool as another data source

In [27]:
from langchain_community.tools import WikipediaQueryRun
from langchain_community.utilities import WikipediaAPIWrapper

wikipedia = WikipediaQueryRun(api_wrapper=WikipediaAPIWrapper())

## Define prompt template and create RAG chain

In [35]:
from langchain_core.prompts import PromptTemplate

prompt_template = PromptTemplate(
    input_variables=["pdf_context", "wiki_context", "query"],
    template="""Using the following PDF and Wikipedia contexts, provide a concise answer to the query. Output only the answer, without repeating the query or context.

PDF Context:
{pdf_context}

Wikipedia Context:
{wiki_context}

Query: {query}

Answer:"""
)

retriever = vectorstore.as_retriever(search_kwargs={"k": 2})

def rag_chain(query):
    # Retrieve PDF context
    pdf_docs = retriever.invoke(query)
    pdf_context = "\n".join([doc.page_content for doc in pdf_docs])

    # Retrieve Wikipedia context
    wiki_context = wikipedia.run(query)

    # Fill the prompt template with both contexts and the query
    prompt = prompt_template.format(
        pdf_context=pdf_context,
        wiki_context=wiki_context,
        query=query
    )

    # Generate the answer; invoke() replaces the deprecated direct call llm(prompt)
    answer = llm.invoke(prompt).strip()

    return answer, pdf_docs, wiki_context

## First question

In [41]:
query = "What is the most delicious food in the world?"
answer, pdf_sources, wiki_result = rag_chain(query)
print("Question:")
print(query)
print("\n" + "="*80)
print("ANSWER:")
print("="*80)
print(answer)

Question:
What is the most delicious food in the world?

ANSWER:
According to a survey conducted by Cable News Network in 2011 and 2017, Rendang, a traditional dish of West Sumatra, was voted as the most delicious food in the world based on readers' choice. Rendang is known for its generous amounts of different kinds of spices and ingredients including meat and coconut milk. It is typically made with basic ingredients such as coconut milk, beef, or buffalo meat, which are common in this area. The dish is a staple in the region, especially in Padang, the capital of West Sumatra.


In [42]:
print("\n" + "="*80)
print("PDF SOURCES")
print("="*80)
for i, source in enumerate(pdf_sources, 1):
    print(f"PDF Source {i}: {source.page_content[:200]}...")
print("\n" + "="*80)
print("WIKIPEDIA RESULT")
print("="*80)
print(wiki_result)


PDF SOURCES
PDF Source 1: Gusti Asnan, Professor of History at Andalas University, Padang,
West Sumatra, Indonesia; Prof. Dr. Nur Indrawaty Lipoeto, MSc,
Ph.D., SpGK, Professor of Nutritional Sciences at Andalas University,
Pa...
PDF Source 2: © 2017 Korea Food Research Institute. Published by Elsevier B.V. This is an open access article under the
CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/).
1. Introduction
Indon...

WIKIPEDIA RESULT
Page: Delicious in Dungeon
Summary: Delicious in Dungeon (Japanese: ダンジョン飯, Hepburn: Danjon Meshi, lit. "Dungeon Meal") is a Japanese manga series written and illustrated by Ryoko Kui. It was serialized in Enterbrain's seinen manga magazine Harta from February 2014 to September 2023, with its chapters collected in 14 tankōbon volumes. The story follows a group of adventurers in a fantasy world who, after failing to defeat a dragon that consumed one of their own, embark on a journey through a dungeon to revive her, surviving by 

## Second question

In [40]:
query = "What is the history of rendang?"
answer, pdf_sources, wiki_result = rag_chain(query)
print("Question:")
print(query)
print("\n" + "="*80)
print("ANSWER:")
print("="*80)
print(answer)

Question:
What is the history of rendang?

ANSWER:
Rendang is believed to have originated from a combination of meat and spices prepared in North India known as curry, which was brought to West Sumatra by Indian merchants. This dish was then adapted and further cooked by the Minang people, transforming it into the unique rendang we know today. The dish's development is closely tied to the cultural exchange and interactions between the Indian merchants and the Minang people, reflecting the historical and culinary influences that have shaped it. Rendang is not only a food but also a symbol of cultural identity, reflecting the aesthetic of the food as well as the native cultural identity of the Minang people. The history of rendang also highlights the role of acculturation and the spread of culinary practices through trade and cultural exchange. The essential esthetic of the food, as well as its strong fragrant aroma, are a result of the slow and long cooking process that is unique to ren

In [39]:
print("\n" + "="*80)
print("PDF SOURCES")
print("="*80)
for i, source in enumerate(pdf_sources, 1):
    print(f"PDF Source {i}: {source.page_content[:200]}...")
print("\n" + "="*80)
print("WIKIPEDIA RESULT")
print("="*80)
print(wiki_result)


PDF SOURCES
PDF Source 1: world based on readers' choice.Rendang is a traditional dish of
West Sumatra with generous amounts of different kinds of spices
and ingredients including meat and coconut milk. Usually, the
Minang peo...
PDF Source 2: humans utilize food, from how it is selected, obtained, and
distributed to who prepares, serves, and eats it. These processes are
unique to humankind. Kittler and Kathryn [5] added that the
essential ...

WIKIPEDIA RESULT
Page: Rendang
Summary: Rendang is a fried meat or dry curry made of meat stewed in coconut milk and spices, widely popular across Brunei, Indonesia, Malaysia, Singapore, and the Philippines, where each version is considered local cuisine. It refers to both a cooking method of frying and the dish cooked in that way. The process involves slowly cooking meat in spiced coconut milk in an uncovered pot or pan until the oil separates, allowing the dish to fry in its own sauce, coating the meat in a rich, flavorful glaze.
Rooted in Ma