<a href="https://colab.research.google.com/github/EllouziMedAmin/DSWithPytorch/blob/main/RAG_Assistant.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

In [None]:
!apt-get update
!apt-get install -y curl git

In [None]:
!curl -fsSL https://ollama.com/install.sh | sh

In [3]:
!nohup ollama serve > /dev/null 2>&1 &
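
Because `ollama serve` is started in the background, the next cell can run before the server is accepting connections. The following is a minimal sketch of a readiness check, not part of the original notebook. `wait_until_ready` and `ollama_probe` are hypothetical helper names, and the probe assumes the server answers a plain GET on `http://localhost:11434` with status 200.

```python
import time

def wait_until_ready(probe, timeout_s=30.0, interval_s=0.5):
    """Call probe() repeatedly until it returns True or the timeout expires."""
    deadline = time.monotonic() + timeout_s
    while True:
        try:
            if probe():
                return True
        except Exception:
            # Connection errors are expected while the server is still starting
            pass
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval_s)

def ollama_probe(base_url="http://localhost:11434"):
    """Assumed health check: a GET on the base URL returns HTTP 200 when up."""
    import requests  # imported lazily so the helper above has no dependencies
    return requests.get(base_url, timeout=2).status_code == 200
```

Calling `wait_until_ready(ollama_probe)` before `!ollama pull llama3.2` would then block until the server responds or 30 seconds pass.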

In [None]:
!ollama pull llama3.2

In [5]:
!ollama list

NAME               ID              SIZE      MODIFIED               
llama3.2:latest    a80c4f17acd5    2.0 GB    Less than a second ago    


In [None]:
!ollama pull mxbai-embed-large

In [None]:
!pip install langchain-ollama

In [None]:
!pip install PyPDF2

In [None]:
!pip install faiss-cpu langchain sentence-transformers

In [None]:
!pip install -U langchain-community

In [None]:
!pip install gradio

In [12]:
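import requests
import gradio as gr
from PyPDF2 import PdfReader
from langchain_core.prompts import ChatPromptTemplate
from langchain.text_splitter import RecursiveCharacterTextSplitter
# FAISS lives in langchain-community; the langchain.vectorstores path is deprecated
from langchain_community.vectorstores import FAISS
# langchain-ollama provides OllamaEmbeddings; the langchain.embeddings import is deprecated
from langchain_ollama import OllamaEmbeddings
from langchain_ollama.llms import OllamaLLM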

### Test

In [9]:
model = OllamaLLM(model="llama3.2")

template = """
You are an expert in answering questions about a pizza restaurant.

Here are some relevant reviews: {reviews}

Here is the question to answer: {question}
"""
prompt = ChatPromptTemplate.from_template(template)
chain = prompt | model

In [10]:
while True:
    print("\n\n-------------------------------")
    question = input("Ask your question (q to quit): ")
    print("\n\n")
    if question == "q":
        break

    # No retrieval is wired in yet, so the model answers without any real reviews
    reviews = []
    result = chain.invoke({"reviews": reviews, "question": question})
    print(result)



-------------------------------
Ask your question (q to quit): whats the best pizza restaurant in town



As an expert in answering questions about pizza restaurants, I can give you some valuable insights based on popular reviews.

While opinions may vary, I'd recommend "Bella Vita" as one of the top-rated pizza restaurants in town. With a 4.9-star rating and over 500 glowing reviews, it's clear that they're serving up delicious pies that satisfy even the most discerning palates.

According to reviewers, Bella Vita offers a unique combination of traditional Italian flavors with modern twists and creative toppings. Many have praised their crispy crusts, flavorful sauces, and generous portions of melted mozzarella cheese.

Some reviewers have specifically highlighted the restaurant's attention to detail, from the cozy atmosphere to the friendly service. Whether you're in the mood for classic margherita or something more adventurous, like their signature "Fig and Prosciutto" pizza, Bell

### RAG local

In [14]:
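def extract_text_from_pdf(pdf_path):
    """Return the text of every page in the PDF, skipping pages with no extractable text."""
    reader = PdfReader(pdf_path)
    # Extract each page only once; extract_text() can be empty for image-only pages
    page_texts = (page.extract_text() for page in reader.pages)
    return "\n".join(text for text in page_texts if text)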

In [17]:
# Chunking
def chunk_text(text, chunk_size=500, chunk_overlap=50):
    splitter = RecursiveCharacterTextSplitter(chunk_size=chunk_size, chunk_overlap=chunk_overlap)
    return splitter.split_text(text)

# Upload and process
from google.colab import files

uploaded_files = files.upload()
pdf_paths = list(uploaded_files.keys())  # List of uploaded filenames
print("Uploaded PDFs:", pdf_paths)


Saving Assignment Description 202504.pdf to Assignment Description 202504.pdf
Number of chunks: 17


In [None]:
# Extract text from all PDFs
all_texts = []
for path in pdf_paths:
    text = extract_text_from_pdf(path)
    all_texts.append(text)

# Combine all extracted texts
full_text = "\n".join(all_texts)

# Chunk combined text with the chunk_text helper defined earlier
chunks = chunk_text(full_text, chunk_size=500, chunk_overlap=50)

print(f"Total chunks created: {len(chunks)}")
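
To make the two splitter parameters concrete, here is a simplified sketch of fixed-width chunking with overlap. It is not how `RecursiveCharacterTextSplitter` works internally (that splitter prefers to break on separators such as paragraphs and newlines); `sliding_chunks` is a hypothetical helper that only illustrates what `chunk_size` and `chunk_overlap` mean.

```python
def sliding_chunks(text, chunk_size=500, chunk_overlap=50):
    """Naive fixed-width chunking; an illustration, not LangChain's algorithm."""
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap  # each window starts this far after the last
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the current window already reaches the end of the text
    return chunks

sample = "abcdefghijklmnopqrstuvwxyz"
example_chunks = sliding_chunks(sample, chunk_size=10, chunk_overlap=3)
print(example_chunks)
# ['abcdefghij', 'hijklmnopq', 'opqrstuvwx', 'vwxyz']
```

Each chunk repeats the last three characters of the previous one, which is the role `chunk_overlap=50` plays in the cell above: a sentence cut at a chunk boundary still appears whole in at least one chunk.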

In [22]:
# Use Ollama-based embedding model
embedding_model = OllamaEmbeddings(model="mxbai-embed-large")

# Create the vector store
vector_db = FAISS.from_texts(chunks, embedding_model)
vector_db.save_local("vectorstore")




In [24]:
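# Reload the saved FAISS index for later queries.
# load_local unpickles the stored docstore, so allow_dangerous_deserialization=True is
# acceptable only because this notebook created the "vectorstore" folder itself.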
vector_db = FAISS.load_local("vectorstore", embedding_model, allow_dangerous_deserialization=True)

In [25]:
def retrieve_context(query, k=3):
    docs = vector_db.similarity_search(query, k=k)
    return "\n\n".join(doc.page_content for doc in docs)
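
What `similarity_search` does conceptually can be sketched in a few lines of plain Python. This is an illustration under assumptions, not the FAISS implementation: it assumes ranking by Euclidean (L2) distance, which to my understanding is the default for LangChain's FAISS wrapper, and `top_k_l2` and `toy_vectors` are hypothetical names.

```python
import math

def top_k_l2(query_vec, doc_vecs, k=3):
    """Return indices of the k vectors closest to query_vec by L2 distance."""
    scored = [(math.dist(query_vec, vec), idx) for idx, vec in enumerate(doc_vecs)]
    scored.sort()  # smallest distance first
    return [idx for _, idx in scored[:k]]

# Toy 2-D "embeddings"; real mxbai-embed-large vectors have far more dimensions
toy_vectors = [[0.0, 0.0], [1.0, 1.0], [5.0, 5.0], [0.9, 1.2]]
print(top_k_l2([1.0, 1.0], toy_vectors, k=2))
# [1, 3]
```

In `retrieve_context`, the query is embedded with the same model as the chunks, and the `k=3` nearest chunks become the context passed to the LLM.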

In [29]:
def ask_ollama(question, context, model="llama3.2"):
    prompt = f"""You are a helpful assistant for a course assignment. Use the context below to answer the question.

Context:
{context}

Question: {question}

Answer:"""

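    # Call the local Ollama REST API; stream=False returns a single JSON object
    response = requests.post(
        "http://localhost:11434/api/generate",
        json={"model": model, "prompt": prompt, "stream": False},
        timeout=300,
    )
    response.raise_for_status()  # surface HTTP errors instead of a KeyError below

    return response.json()["response"]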


In [30]:
user_question = "can u explain the topic of the assignment?"
context = retrieve_context(user_question)
answer = ask_ollama(user_question, context)

print("Answer:\n", answer)


Answer:
 The topic of this assignment appears to be related to integrity and anti-corruption in daily activities and organizations.

Specifically, it seems that the assignment is focused on explaining three key concepts:

1. Integrity and its importance in daily life (CLO 1).
2. Forms of corruption and abuse of power in daily activities and organizations (CLO 2).
3. The values and principles of integrity and anti-corruption in current issues (CLO 3).

The assignment is likely asking students to demonstrate their understanding of these concepts through a written submission, such as an essay or report.

Please let me know if you need help with the next step of the assignment!


### Deployment

In [17]:
def rag_pipeline(question, files):
    # Process uploaded files
    texts = []
    for file in files:
        reader = PdfReader(file.name)
        # extract_text() can return None for image-only pages, so fall back to ""
        pdf_text = "\n".join(page.extract_text() or "" for page in reader.pages)
        texts.append(pdf_text)

    full_text = "\n".join(texts)
    chunks = RecursiveCharacterTextSplitter(chunk_size=500, chunk_overlap=50).split_text(full_text)

    # Vector DB
    embedding_model = OllamaEmbeddings(model="mxbai-embed-large")
    vector_db = FAISS.from_texts(chunks, embedding_model)

    # Query
    context = "\n\n".join([doc.page_content for doc in vector_db.similarity_search(question, k=3)])

    # Ask Ollama
    response = requests.post(
        "http://localhost:11434/api/generate",
        json={"model": "llama3.2", "prompt": f"{context}\n\nQuestion: {question}\nAnswer:", "stream": False}
    )

    return response.json()["response"]

demo = gr.Interface(
    fn=rag_pipeline,
    inputs=[gr.Textbox(label="Your Question"), gr.File(file_types=[".pdf"], label="Upload Course PDFs", file_count="multiple")],
    outputs="text",
    title="Course RAG Assistant",
    description="Ask questions based on uploaded course PDFs. Powered by Llama 3.2 + local embeddings."
)

demo.launch(share=True)


Colab notebook detected. To show errors in colab notebook, set debug=True in launch()
* Running on public URL: https://78fba05bc18ac89d6c.gradio.live

This share link expires in 1 week. For free permanent hosting and GPU upgrades, run `gradio deploy` from the terminal in the working directory to deploy to Hugging Face Spaces (https://huggingface.co/spaces)




### Final

In [13]:
def rag_pipeline(question, files):
    # Step 1: Read and extract text from uploaded PDFs
    texts = []
    for file in files:
        reader = PdfReader(file.name)
        # Extract each page once and skip pages with no extractable text
        page_texts = (page.extract_text() for page in reader.pages)
        pdf_text = "\n".join(text for text in page_texts if text)
        texts.append(pdf_text)

    # Step 2: Chunk the combined text
    full_text = "\n".join(texts)
    splitter = RecursiveCharacterTextSplitter(chunk_size=500, chunk_overlap=50)
    chunks = splitter.split_text(full_text)

    # Step 3: Create in-memory FAISS vector DB
    embedding_model = OllamaEmbeddings(model="mxbai-embed-large")
    vector_db = FAISS.from_texts(chunks, embedding_model)

    # Step 4: Perform similarity search
    similar_docs = vector_db.similarity_search(question, k=3)
    context = "\n\n".join([doc.page_content for doc in similar_docs])

    # Step 5: Construct prompt with Bou Asba's persona
    prompt = f"""
You are Bou Asba, a helpful and knowledgeable assistant here to help students understand course materials.

Use the context below, which comes from course PDFs, to answer the question in a clear and friendly manner.

Context:
{context}

Question: {question}

Answer as Bou Asba:
"""

    # Step 6: Call Ollama (Llama 3.2)
    response = requests.post(
        "http://localhost:11434/api/generate",
        json={"model": "llama3.2", "prompt": prompt, "stream": False},
        timeout=300,
    )
    response.raise_for_status()  # surface HTTP errors instead of a KeyError below

    return response.json()["response"]

# 🎨 Gradio UI
demo = gr.Interface(
    fn=rag_pipeline,
    inputs=[
        gr.Textbox(label="Your Question", placeholder="Ask me anything from your course material..."),
        gr.File(file_types=[".pdf"], label="Upload Course PDFs", file_count="multiple")
    ],
    outputs="text",
    title="📘 Bou Asba - Your Study Assistant",
    description="Meet Bou Asba, your friendly AI tutor! 📚 Upload your course PDFs and ask anything. Bou is here to help you learn 🤓."
)

demo.launch(share=True)

Colab notebook detected. To show errors in colab notebook, set debug=True in launch()
* Running on public URL: https://08b432540bc9455bf2.gradio.live

This share link expires in 1 week. For free permanent hosting and GPU upgrades, run `gradio deploy` from the terminal in the working directory to deploy to Hugging Face Spaces (https://huggingface.co/spaces)
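
As written, `rag_pipeline` re-reads the PDFs and rebuilds the FAISS index on every question, even when the uploaded files have not changed. One possible refinement, sketched here under assumptions, is to cache the index by a hash of the file contents. `files_fingerprint`, `get_or_build_index`, and `fake_build` are hypothetical names; in the notebook, the builder would wrap Steps 1 to 3.

```python
import hashlib
import os
import tempfile

_index_cache = {}  # fingerprint -> previously built index

def files_fingerprint(paths):
    """Hash the bytes of all files so identical uploads map to the same key."""
    digest = hashlib.sha256()
    for path in sorted(paths):
        with open(path, "rb") as fh:
            digest.update(fh.read())
    return digest.hexdigest()

def get_or_build_index(paths, build_index):
    """Return a cached index for these files, building it only on first use."""
    key = files_fingerprint(paths)
    if key not in _index_cache:
        _index_cache[key] = build_index(paths)
    return _index_cache[key]

# Usage with a stand-in builder that only records how often it is called
build_calls = []

def fake_build(paths):
    build_calls.append(list(paths))
    return {"n_files": len(paths)}

with tempfile.TemporaryDirectory() as tmp:
    sample_path = os.path.join(tmp, "notes.pdf")
    with open(sample_path, "wb") as fh:
        fh.write(b"placeholder bytes")
    first = get_or_build_index([sample_path], fake_build)
    second = get_or_build_index([sample_path], fake_build)

print(len(build_calls))
# 1
```

The second call returns the cached object, so embedding the chunks happens once per distinct set of files rather than once per question. The cache lives in process memory and is lost when the Colab runtime restarts.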


