# Setting up Google Colab and Hugging Face API

Open this notebook in [Colab](https://colab.research.google.com/github/Chair-of-Banking-and-Finance/Bachelor_thesis_24_25_Template/blob/main/Llama_RAG/LAMA_3_local_RAG.ipynb).

## Getting a Hugging Face API Token
1. **Create a Hugging Face account**: Go to [Hugging Face](https://huggingface.co/) and create an account if you don’t already have one.
2. **Open your settings**: After logging in, click your profile icon in the top-right corner and go to "Settings".
3. **Access Tokens**: On the settings page, navigate to the "Access Tokens" tab.
4. **Create a new token**: Click "New Token", give it a name, and set the role to "write" (a "read" token is also sufficient for downloading models). This token is used to authenticate and download models.
5. **Copy the Token**: Copy the generated token and assign it to the `Hugging_face_token` variable in the script, replacing the placeholder value.

---


In [1]:
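import os
from huggingface_hub import login

# Replace the placeholder with your own Hugging Face access token
Hugging_face_token = "XXXXXXXXXXXXXXXXXXXXXX"

# Authenticate this session and expose the token to downstream libraries
login(token=Hugging_face_token)
os.environ["HF_TOKEN"] = Hugging_face_token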
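
Hard-coding the token in a notebook makes it easy to leak when the notebook is shared. The following is a minimal sketch of an alternative, assuming the token has been stored in an environment variable named `HF_TOKEN`. The helper name `resolve_hf_token` and its `prompt` parameter are illustrative and not part of the original notebook.

```python
import os
from getpass import getpass


def resolve_hf_token(environ=os.environ, prompt=getpass):
    """Return the token from HF_TOKEN, or ask for it interactively (hypothetical helper)."""
    token = environ.get("HF_TOKEN", "").strip()
    if not token:
        # Fall back to a hidden prompt so the token never appears in the notebook source
        token = prompt("Hugging Face token: ").strip()
    if not token:
        raise ValueError("No Hugging Face token provided")
    return token


# Usage in the notebook (commented out because it needs a real token):
# Hugging_face_token = resolve_hf_token()
# login(token=Hugging_face_token)
```

With this approach the token value stays out of the saved notebook, while the rest of the cells can keep using the `Hugging_face_token` variable unchanged.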


Llama 2 is a gated model: visit [Hugging Face's model page for Llama 2](https://huggingface.co/meta-llama/Llama-2-7b-chat-hf) and request access with the same account that owns the token before running the cells below.

### Install Required Libraries

In [2]:
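!pip install faiss-cpu
!pip install -U bitsandbytes
!pip install PyPDF2
# Also needed by the cells below (sentence_transformers import, device_map='auto');
# these may already be present on Colab
!pip install sentence-transformers accelerate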
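
After the install cell has run, it can help to confirm which versions ended up in the environment before importing anything. The sketch below uses the standard-library `importlib.metadata` module; the helper name `installed_versions` is an illustrative assumption, and the package list simply mirrors the three packages installed above.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_versions(packages):
    """Map each distribution name to its installed version, or None if it is missing."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


print(installed_versions(["faiss-cpu", "bitsandbytes", "PyPDF2"]))
```

A `None` entry indicates that the corresponding `pip install` did not succeed in the current runtime.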



### Import Libraries

In [3]:
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM, BitsAndBytesConfig
from sentence_transformers import SentenceTransformer
import faiss
import numpy as np
import os
import PyPDF2

In [4]:
class LlamaRAGPipeline:
    def __init__(self,
                 model_name='meta-llama/Llama-2-7b-chat-hf',
                 embedding_model='BAAI/bge-small-en-v1.5'):
        """
        Initialize Llama RAG Pipeline

        Args:
            model_name (str): Hugging Face Llama model path
            embedding_model (str): Sentence transformer for embeddings
        """
        # Authentication relies on the earlier login() call and the
        # HF_TOKEN environment variable set in the first cell

        # Quantization configuration for memory efficiency
        quantization_config = BitsAndBytesConfig(
            load_in_8bit=True  # Load weights in 8-bit to reduce GPU memory use
        )

        # Initialize tokenizer and model
        self.tokenizer = AutoTokenizer.from_pretrained(
            model_name,
            use_fast=True,
            trust_remote_code=True
        )

        self.model = AutoModelForCausalLM.from_pretrained(
            model_name,
            device_map='auto',
            torch_dtype=torch.float16,
            quantization_config=quantization_config
        )

        # Initialize embedding model
        self.embedding_model = SentenceTransformer(embedding_model)

        # Initialize vector store
        self.vector_store = None
        self.documents = []

    def add_documents(self, documents):
        """
        Add documents to the RAG system. Handles text and PDF files.

        Args:
            documents (list): List of document file paths (can be .txt or .pdf)
        """
        text_documents = []

        for doc in documents:
            if doc.endswith(".pdf"):
                # Convert PDF to text
                try:
                    with open(doc, 'rb') as pdf_file:
                        reader = PyPDF2.PdfReader(pdf_file)
                        pdf_text = ""
                        for page in reader.pages:
                            # Guard against pages that yield no extractable text
                            pdf_text += page.extract_text() or ""
                        text_documents.append(pdf_text)
                except Exception as e:
                    print(f"Error reading PDF {doc}: {e}")
            elif doc.endswith(".txt"):
                # Read text file
                try:
                    with open(doc, 'r', encoding='utf-8') as text_file:
                        text_documents.append(text_file.read())
                except Exception as e:
                    print(f"Error reading text file {doc}: {e}")
            else:
                print(f"Unsupported file format: {doc}")

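        # Nothing to index if no file could be read
        if not text_documents:
            print("No supported documents were loaded.")
            return

        # Extend the documents list with the processed text
        self.documents.extend(text_documents)

        # Create embeddings (FAISS expects float32 arrays)
        embeddings = np.asarray(self.embedding_model.encode(text_documents), dtype="float32")

        # Create the FAISS index on first use, then append to it so that
        # index positions stay aligned with self.documents across calls
        if self.vector_store is None:
            self.vector_store = faiss.IndexFlatL2(embeddings.shape[1])
        self.vector_store.add(embeddings)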

    def retrieve_context(self, query, top_k=3):
        """
        Retrieve most relevant documents

        Args:
            query (str): Search query
            top_k (int): Number of documents to retrieve

        Returns:
            list: Most relevant documents
        """
        # Embed query
        query_embedding = self.embedding_model.encode([query])

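        # Search vector store; never request more neighbours than stored documents
        top_k = min(top_k, len(self.documents))
        distances, indices = self.vector_store.search(query_embedding, top_k)

        # Retrieve documents (FAISS pads missing results with -1)
        retrieved_docs = [self.documents[i] for i in indices[0] if i >= 0]
        return retrieved_docs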

    def format_prompt(self, query, context):
        """
        Format prompt for Llama with retrieved context

        Args:
            query (str): User query
            context (list): Retrieved context documents

        Returns:
            str: Formatted prompt
        """
        context_str = "\n".join(context)
        prompt = f"""[INST]
        Context: {context_str}

        Question: {query}

        Based on the context, provide a comprehensive and precise answer. [/INST]"""

        return prompt

    def generate_response(self, prompt):
        """
        Generate response using Llama model

        Args:
            prompt (str): Formatted prompt

        Returns:
            str: Generated response
        """
        # Tokenize input
        input_ids = self.tokenizer(prompt, return_tensors="pt", truncation=True, max_length=2048).input_ids.to(self.model.device)

        # Generate response
        output_ids = self.model.generate(
            input_ids,
            max_new_tokens=300,  # Limit the number of newly generated tokens
            num_return_sequences=1,
            temperature=0.7,
            top_p=0.9,
            do_sample=True
        )

        # Decode response
        response = self.tokenizer.decode(output_ids[0], skip_special_tokens=True)
        return response

    def rag_pipeline(self, query):
        """
        Complete RAG pipeline

        Args:
            query (str): User query

        Returns:
            str: Generated response
        """
        # Retrieve context
        context = self.retrieve_context(query)

        # Format prompt
        prompt = self.format_prompt(query, context)

        # Generate response
        response = self.generate_response(prompt)

        return response
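
`generate_response` decodes the full output sequence, which for causal language models starts with the prompt tokens, so the returned string repeats the prompt before the answer. Below is a minimal sketch of one way to keep only the newly generated part, assuming the output sequence begins with the prompt tokens unchanged. `strip_prompt_tokens` is a hypothetical helper, shown here on plain lists with made-up token ids so that it runs without a model.

```python
def strip_prompt_tokens(output_ids, prompt_length):
    """Return only the token ids generated after the prompt (hypothetical helper).

    Works on any sliceable sequence, e.g. a list or a 1-D tensor.
    """
    if prompt_length < 0 or prompt_length > len(output_ids):
        raise ValueError("prompt_length must lie within the output sequence")
    return output_ids[prompt_length:]


# Toy example with made-up token ids: 3 prompt tokens followed by 2 generated ones
prompt_ids = [1, 518, 25580]
full_output = prompt_ids + [450, 5917]
print(strip_prompt_tokens(full_output, len(prompt_ids)))  # [450, 5917]
```

Inside `generate_response`, the same idea would amount to decoding `output_ids[0][input_ids.shape[-1]:]` instead of `output_ids[0]`.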


In [5]:
# Define the text to be written to the file
roman_empire_text = """
The Roman Empire: An Overview
The Roman Empire was one of the most influential civilizations in human history, spanning over a millennium and leaving a legacy that shaped the world in areas such as governance, architecture, engineering, and law. Officially beginning in 27 BCE with the rise of Augustus Caesar, Rome transitioned from a republic to an empire, dominating vast territories that stretched from Britain in the northwest to Egypt in the southeast.

Formation and Expansion
The Roman Empire's foundation was built on centuries of conquest during the Roman Republic. Under Augustus, the empire ushered in a period of peace and stability known as the Pax Romana (Roman Peace), lasting about 200 years. During this time, Rome expanded its borders, solidifying control over Europe, North Africa, and parts of the Middle East.

The empire was characterized by a vast network of cities connected by advanced roads and aqueducts, facilitating trade, military movements, and cultural exchange. Notable conquests include Gaul (modern-day France) under Julius Caesar, the annexation of Egypt after Cleopatra's defeat, and the consolidation of power in regions such as Spain and the Balkans.

Culture and Society
Roman society was highly stratified, with a clear distinction between the elite patricians, common plebeians, and enslaved individuals. Roman culture blended Latin traditions with influences from Greece and the regions it conquered. This fusion led to remarkable achievements in literature (Virgil’s Aeneid), philosophy (Cicero, Seneca), and architecture (the Colosseum, aqueducts, and the Pantheon).

The Roman Empire was also a melting pot of religions. Initially polytheistic, it later became a cradle for Christianity, with Emperor Constantine legalizing the faith in 313 CE and Emperor Theodosius I declaring it the state religion by 380 CE.

Governance and Law
Rome was renowned for its administrative prowess and legal systems. The empire was divided into provinces, each governed by an appointed official. Roman law, codified in the Twelve Tables and later expanded, formed the foundation for many modern legal systems. Concepts like innocent until proven guilty and legal representation have their roots in Roman jurisprudence.

Decline and Fall
The decline of the Roman Empire was a gradual process influenced by internal and external factors. Political instability, economic struggles, and military overreach weakened the empire. The division of the empire into Eastern and Western halves in 395 CE further strained its cohesion. While the Western Roman Empire fell in 476 CE after being overrun by Germanic tribes, the Eastern Roman Empire, known as the Byzantine Empire, endured for another thousand years until the fall of Constantinople in 1453.

Legacy
The Roman Empire profoundly shaped Western civilization. Its contributions to governance, infrastructure, and culture remain influential today. Latin, the language of Rome, evolved into the Romance languages (Italian, French, Spanish, etc.), and Roman architecture inspired countless generations. The very concept of a republic and the rule of law owe much to Rome’s enduring influence.

In essence, the Roman Empire stands as a testament to humanity’s capacity for organization, innovation, and adaptation, making it a cornerstone of global history.
"""

# Specify the directory and file name
output_dir = "./data"
file_name = "roman_empire_overview.txt"
file_path = os.path.join(output_dir, file_name)

# Ensure the output directory exists; if not, create it
os.makedirs(output_dir, exist_ok=True)

# Write the text to the file with UTF-8 encoding
with open(file_path, 'w', encoding='utf-8') as file:
    file.write(roman_empire_text)
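
`add_documents` embeds each file as a single vector, and sentence-embedding models truncate inputs beyond their maximum sequence length, so long files such as PDFs may be represented only by their opening section. The following is a minimal sketch of splitting text into overlapping word-based chunks before embedding. The function name `chunk_text` and the chunk sizes are illustrative assumptions, not values from the original notebook.

```python
def chunk_text(text, chunk_size=200, overlap=50):
    """Split text into overlapping chunks of at most chunk_size words."""
    if chunk_size <= 0 or not 0 <= overlap < chunk_size:
        raise ValueError("require chunk_size > 0 and 0 <= overlap < chunk_size")
    words = text.split()
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):
            break  # the last chunk already reaches the end of the text
    return chunks


# Small demonstration on ten placeholder words
sample = " ".join(f"w{i}" for i in range(10))
print(chunk_text(sample, chunk_size=4, overlap=2))
```

Each chunk would then be stored and embedded in place of the full file text, so that retrieval returns the most relevant passage rather than an entire document.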

In [6]:
# Initialize RAG pipeline
rag = LlamaRAGPipeline()

The secret `HF_TOKEN` does not exist in your Colab secrets.
To authenticate with the Hugging Face Hub, create a token in your settings tab (https://huggingface.co/settings/tokens), set it as secret in your Google Colab and restart your session.
You will be able to reuse this secret in all of your notebooks.
Please note that authentication is recommended but still optional to access public models or datasets.



In [7]:
import os

# Automatically retrieve all files in the 'data' folder
data_folder = "data"
documents = [os.path.join(data_folder, file) for file in os.listdir(data_folder) if os.path.isfile(os.path.join(data_folder, file))]

# Print the documents to verify
print("Documents loaded:", documents)

# Add documents
rag.add_documents(documents)

# Example queries
queries = [
    "Tell me about the founding of the Roman Empire",
    "What made Roman engineering significant?",
    "What's the revenue in the quarterly statement?"
]

# Run queries
for query in queries:
    print(f"Query: {query}")
    response = rag.rag_pipeline(query)
    print(f"Response: {response}\n")


Documents loaded: ['data/18Q1-aic-Quarterly-Statement.pdf', 'data/roman_empire_overview.txt']
Query: Tell me about the founding of the Roman Empire
Response: [INST]
        Context: 
The Roman Empire: An Overview
The Roman Empire was one of the most influential civilizations in human history, spanning over a millennium and leaving a legacy that shaped the world in areas such as governance, architecture, engineering, and law. Officially beginning in 27 BCE with the rise of Augustus Caesar, Rome transitioned from a republic to an empire, dominating vast territories that stretched from Britain in the northwest to Egypt in the southeast.

Formation and Expansion
The Roman Empire's foundation was built on centuries of conquest during the Roman Republic. Under Augustus, the empire ushered in a period of peace and stability known as the Pax Romana (Roman Peace), lasting about 200 years. During this time, Rome expanded its borders, solidifying control over Europe, North Africa, and parts of th