<center><img src="https://upload.wikimedia.org/wikipedia/commons/e/e9/4_RGB_McCombs_School_Brand_Branded.png" width="300" height="100"/>
  <img src="https://mma.prnewswire.com/media/1458111/Great_Learning_Logo.jpg?p=facebook" width="200" height="100"/></center>

<center><font size=10>Artificial Intelligence and Machine Learning</font></center>
<center><font size=6>Natural Language Processing with Generative AI - Retrieval Augmented Generation</font></center>

<center><img src="https://i.ibb.co/pBF9nKpf/apple.png" width="720"></center>

<center><font size=6>Apple HBR Report Document Q&A</font></center>

# RAG LLM Application Notebook

This notebook demonstrates the complete workflow of a Retrieval-Augmented Generation (RAG) LLM application. It covers data loading, chunking, embedding, vector database setup, question answering, and evaluation.

## 1. Setup and Library Installation

First, we install the Python libraries needed for data processing, LLM interaction, and vector database operations. The `%pip install` line in the cell below is commented out; uncomment it if the packages are not already installed in your environment (for example, in a fresh Colab session).

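**Note**: If you are running this locally, ensure you have Ollama installed and the required model pulled, for example `ollama pull llama3.2`. The outputs recorded in this notebook were generated with `deepseek-r1`, so pull whichever model your configuration specifies.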

In [1]:
# Install necessary packages
# %pip install -U -q huggingface_hub tiktoken pymupdf langchain-community langchain langchain-chroma langchain-huggingface ollama


print("All specified packages installed (or upgraded) successfully!")
print("Please ensure the model is pulled for Ollama with the command:")

All specified packages installed (or upgraded) successfully!
Please ensure the model is pulled for Ollama with the command:


## 2. Import Necessary Modules

We import the `RAG_LLM` class from `functions.py` and constants from `config.py`. Make sure `functions.py`, `config.py`, and `prompt_templates.py` are in the same directory as this notebook.

In [2]:
from functions import RAG_LLM
from config import APPLE_PDF_PATH, DEFAULT_K_RETRIEVER, DEFAULT_MAX_TOKENS, DEFAULT_TEMPERATURE
import os

print("Modules imported successfully.")

Modules imported successfully.


## 3. Initialize the RAG_LLM System

Here, we create an instance of our `RAG_LLM` class. This object will manage the entire RAG pipeline, including data loading, processing, retrieval, and LLM interaction.

In [3]:
# Initialize the RAG_LLM class
rag_system = RAG_LLM()
print("RAG_LLM system initialized.")

RAG_LLM initialized.
RAG_LLM system initialized.


## 4. Load the PDF Document

We load the `HBR_How_Apple_Is_Organized_For_Innovation.pdf` document. Ensure this PDF file is available in the working directory.

In [4]:
# Load the PDF document
documents = rag_system.load_data(pdf_path=APPLE_PDF_PATH)
if not documents:
    print("Failed to load documents. Please check the PDF path and file existence.")
else:
    print("PDF document loaded successfully.")

Loading data from: HBR_How_Apple_Is_Organized_For_Innovation.pdf
Successfully loaded 11 pages.
PDF document loaded successfully.


## 5. Chunk the Loaded Data

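The loaded document is split into smaller, overlapping segments (the run below uses `chunk_size=1024` and `chunk_overlap=20`). Smaller chunks make retrieval more precise and keep the retrieved context within the LLM's context window, while the overlap reduces the chance that a sentence is cut off at a chunk boundary.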
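
To make the idea of overlapping chunks concrete, here is a minimal sketch of a sliding-window splitter. It is an illustration only: `chunk_text` is a hypothetical helper, it counts characters rather than tokens, and it is not the implementation behind `rag_system.chunk_data`, whose internals are not shown in this notebook.

```python
def chunk_text(text: str, chunk_size: int = 1024, chunk_overlap: int = 20) -> list[str]:
    """Split text into windows of chunk_size characters that overlap by chunk_overlap.

    Hypothetical helper for illustration; sizes are measured in characters here.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be in [0, chunk_size)")

    step = chunk_size - chunk_overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the window already reached the end of the text
    return chunks


# Small demonstration with tiny sizes so the overlap is easy to see.
sample = "abcdefghijklmnopqrstuvwxyz"
demo_chunks = chunk_text(sample, chunk_size=10, chunk_overlap=3)
print(demo_chunks)
```

With `chunk_size=10` and `chunk_overlap=3`, each chunk starts with the last three characters of the previous one, which is the same effect the larger settings have on the document text.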

In [5]:
# Chunk the loaded data
document_chunks = rag_system.chunk_data(documents)
if not document_chunks:
    print("Failed to chunk documents.")
else:
    print("Documents chunked successfully.")

Chunking data with chunk_size=1024, chunk_overlap=20
Created 16 chunks.
Documents chunked successfully.


## 6. Create Embedding Model

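An embedding model (a SentenceTransformer model; the run below uses `mixedbread-ai/mxbai-embed-large-v1`) is initialized to convert text chunks into numerical vectors. Texts with similar meaning map to nearby vectors, which is what enables semantic search.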
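
Semantic search usually compares embeddings with a similarity measure such as cosine similarity. The sketch below shows that comparison on hand-made three-dimensional vectors. The vectors are invented for illustration (real embeddings from the model have many more dimensions), and whether this pipeline uses cosine similarity or another distance is an assumption.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat a zero vector as similar to nothing
    return dot / (norm_a * norm_b)


# Invented toy "embeddings": the query points in nearly the same
# direction as chunk_a and in a very different direction from chunk_b.
query_vec = [0.9, 0.1, 0.0]
chunk_a = [0.8, 0.2, 0.1]
chunk_b = [0.0, 0.1, 0.9]

sim_a = cosine_similarity(query_vec, chunk_a)
sim_b = cosine_similarity(query_vec, chunk_b)
print(f"query vs chunk_a: {sim_a:.3f}")
print(f"query vs chunk_b: {sim_b:.3f}")
```

A retriever built on this measure would rank `chunk_a` above `chunk_b` for this query.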

In [6]:
# Create embedding model
rag_system.create_embeddings()
if not rag_system.embedding_model:
    print("Failed to create embedding model.")
else:
    print("Embedding model created successfully.")

Initializing embedding model: mixedbread-ai/mxbai-embed-large-v1
Embedding model initialized successfully.
Embedding model created successfully.


## 7. Set Up the Vector Database

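The Chroma vector database is set up using the chunked documents and the embedding model. It stores the chunk embeddings and returns the chunks most similar to a query. If a persisted database directory already exists (as in the run below, which uses `vector_db_1024`), it is loaded from disk rather than rebuilt.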
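
Conceptually, a vector store keeps (text, vector) pairs and answers a query by returning the `k` stored items whose vectors are most similar to the query vector. The following in-memory sketch illustrates that behaviour. `TinyVectorStore` is a hypothetical class written for this example; it is not Chroma's API and omits persistence, metadata, and approximate indexing. The default `k=3` mirrors the "Retrieving 3 relevant documents" message in the outputs of the later cells.

```python
import math
from dataclasses import dataclass, field


def _normalize(vector: list[float]) -> list[float]:
    """Scale a vector to unit length so a dot product equals cosine similarity."""
    norm = math.sqrt(sum(x * x for x in vector))
    if norm == 0:
        raise ValueError("cannot use a zero vector")
    return [x / norm for x in vector]


@dataclass
class TinyVectorStore:
    """Hypothetical in-memory stand-in for a vector database (not Chroma's API)."""
    texts: list = field(default_factory=list)
    vectors: list = field(default_factory=list)

    def add(self, text: str, vector: list[float]) -> None:
        self.texts.append(text)
        self.vectors.append(_normalize(vector))

    def search(self, query_vector: list[float], k: int = 3) -> list[tuple[float, str]]:
        """Return up to k (score, text) pairs, highest similarity first."""
        q = _normalize(query_vector)
        scored = [
            (sum(a * b for a, b in zip(q, vec)), text)
            for vec, text in zip(self.vectors, self.texts)
        ]
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return scored[:k]


# Invented two-dimensional vectors, for illustration only.
store = TinyVectorStore()
store.add("org structure", [1.0, 0.0])
store.add("leadership", [0.0, 1.0])
store.add("mixed", [1.0, 1.0])
store.add("opposite", [-1.0, 0.0])

results = store.search([1.0, 0.2], k=3)
for score, text in results:
    print(f"{score:.3f}  {text}")
```

A production vector database adds persistence and faster indexing, but the contract is the same: a query vector goes in and the closest stored chunks come out.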

In [7]:
# Set up the vector database
rag_system.setup_vector_database(document_chunks=document_chunks)
if not rag_system.vectorstore:
    print("Failed to set up vector database.")
else:
    print("Vector database set up and retriever initialized.")

Setting up vector database in: vector_db_1024
Vector database loaded from existing directory.
Retriever initialized.
Vector database set up and retriever initialized.


## 8. Demonstrate Question Answering with RAG

Now, we can ask questions and see how the RAG system retrieves relevant information and generates answers based on the loaded document.
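
Each call to `get_answer` logs three steps: retrieve documents, create a RAG prompt, and generate a response. The sketch below illustrates the middle step, combining retrieved chunks and the question into a single prompt. `build_rag_prompt` and its wording are hypothetical; the actual template lives in `prompt_templates.py`, which is not shown here.

```python
def build_rag_prompt(question: str, context_chunks: list[str]) -> str:
    """Assemble a prompt that asks the model to answer only from the given context.

    Hypothetical template for illustration; not the one in prompt_templates.py.
    """
    numbered = "\n\n".join(
        f"[{i}] {chunk.strip()}" for i, chunk in enumerate(context_chunks, start=1)
    )
    return (
        "Answer the question using only the context below. "
        "If the context does not contain the answer, reply \"I don't know\".\n\n"
        f"Context:\n{numbered}\n\n"
        f"Question: {question.strip()}\n"
        "Answer:"
    )


# Placeholder chunks standing in for retriever output.
demo_prompt = build_rag_prompt(
    "Who published this article?",
    ["First retrieved chunk of text.", "Second retrieved chunk of text."],
)
print(demo_prompt)
```

Numbering the chunks makes it easier to trace which piece of context an answer relied on, and the explicit fallback instruction discourages answers that go beyond the document.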

In [8]:
# Example Query 1
user_input_1 = "Who are the authors of this article and who published this article ?"
print(f"\nQuery 1: {user_input_1}")
llm_response_1 = rag_system.get_answer(user_input_1)
print(f"Response 1: \n{llm_response_1}")


Query 1: Who are the authors of this article and who published this article ?
Retrieving 3 relevant documents for the query.
RAG prompt created.
Generating LLM response using model: deepseek-r1
LLM response generated.
Response 1: 
<think>
Okay, let's look at the user's question carefully. The user wants to know who the authors of the article are and who published it. 

First, I'll scan through the provided context for any information about authors or publishers. In the first part of the document header, there's a line that says "AUTHORS FOR ARTICLE REPRINTS CALL..." which clearly lists the names: Joel M. Podolny and Morten T. Hansen. Then in another section, it mentions Harvard Business Review as the publisher with the date November–December 2020.

The user seems to be asking for straightforward factual information from the text. They might need this for citation purposes or just general knowledge about the article's origins. 

Since both pieces of information are explicitly stated in

In [9]:
# Example Query 2
user_input_2 = "List down the three leadership characteristics in bulleted points and explain each one of the characteristics under two lines."
print(f"\nQuery 2: {user_input_2}")
# Set max_tokens and temperature explicitly for this query (the <think> reasoning in the output also counts toward max_tokens)
llm_response_2 = rag_system.get_answer(user_input_2, max_tokens=150, temperature=0.1)
print(f"Response 2: \n{llm_response_2}")


Query 2: List down the three leadership characteristics in bulleted points and explain each one of the characteristics under two lines.
Retrieving 3 relevant documents for the query.
RAG prompt created.
Generating LLM response using model: deepseek-r1
LLM response generated.
Response 2: 
<think>
Okay, let's tackle this query step by step. The user wants a list of three leadership characteristics from Apple's context, presented as bullet points with brief explanations for each.

First, scanning through the provided text about Apple's organizational structure and Tim Cook's changes... I see mentions of "leadership approach described above" which includes experts leading experts, immersion in details, and collaborative debate. That seems like a solid starting point.

Then there's this part about Rosner evolving his leadership role: he had been immersed in details especially concerning top-level software aspects but now spends more time teaching others... This suggests that "teaching box"

In [10]:
# Example Query 3 (expected to be "I don't know" if not in context)
user_input_3 = "Can you explain specific examples from the article where Apple's approach to leadership has led to successful innovations?"
print(f"\nQuery 3: {user_input_3}")
llm_response_3 = rag_system.get_answer(user_input_3)
print(f"Response 3: \n{llm_response_3}")


Query 3: Can you explain specific examples from the article where Apple's approach to leadership has led to successful innovations?
Retrieving 3 relevant documents for the query.
RAG prompt created.
Generating LLM response using model: deepseek-r1
LLM response generated.
Response 3: 
<think>
Okay, let me analyze this carefully. The user wants specific examples from the article about how Apple's leadership approach contributed to successful innovations. 

Looking at the context provided, I see several key points that directly address this question. First, there's mention of Rosner (VP of applications) managing a diverse portfolio including News, Clips, Books and Final Cut Pro - showing leaders need expertise beyond their core areas as the company grows complex.

Then there are concrete examples like: 
- Adding AI/ML as a functional area
- Merging human interface with industrial design

These changes demonstrate how Apple adapted its leadership structure to enable collaboration across d
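
All three responses above end inside the model's `<think>` block, which suggests the token limit was reached before a final answer was produced. Raising `max_tokens` is one remedy. Separately, when a reasoning model does finish, it can help to remove the reasoning text before showing or evaluating the answer. The sketch below is one way to do that; `strip_reasoning` is a hypothetical helper and not part of `RAG_LLM`.

```python
import re

# Matches a complete <think>...</think> block, including newlines inside it.
_THINK_BLOCK = re.compile(r"<think>.*?</think>", flags=re.DOTALL)


def strip_reasoning(response: str) -> str:
    """Remove <think> blocks from a model response and return the remaining answer.

    Returns an empty string when the response was cut off before the
    reasoning block closed, since no final answer exists in that case.
    """
    cleaned = _THINK_BLOCK.sub("", response)
    open_idx = cleaned.find("<think>")
    if open_idx != -1:
        # Unclosed <think>: everything after it is unfinished reasoning.
        cleaned = cleaned[:open_idx]
    return cleaned.strip()


complete = "<think>\nCheck the context for authors.\n</think>\nThe answer goes here."
truncated = "<think>\nOkay, let's look at the user's question carefully."

print(repr(strip_reasoning(complete)))
print(repr(strip_reasoning(truncated)))
```

An empty result is a useful signal in its own right: it indicates that the response was truncated and the query should be rerun with a larger token budget.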

## 9. Demonstrate Output Evaluation (LLM-as-a-Judge)

Finally, we demonstrate how the RAG system can evaluate its own answers for 'groundedness' (adherence to context) and 'relevance' (how well it answers the question).
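
In an LLM-as-a-judge setup, the judge model typically replies in free text, so the numeric rating has to be extracted from that reply. The sketch below shows one way to do this. It is illustrative only: `parse_rating`, the 1 to 5 scale, and the "rating"/"score" keywords are assumptions, since the rubric and parsing used by `calculate_rating` are not shown in this notebook.

```python
import re
from typing import Optional

# Looks for "rating" or "score" followed by an integer, e.g. "Rating: 4" or "score = 2".
_RATING_PATTERN = re.compile(r"(?:rating|score)\s*[:=]?\s*(\d+)", flags=re.IGNORECASE)


def parse_rating(judge_output: str, low: int = 1, high: int = 5) -> Optional[int]:
    """Return the first labelled rating within [low, high], or None if there is none.

    The default 1-5 scale is an assumption made for this example.
    """
    for match in _RATING_PATTERN.finditer(judge_output):
        value = int(match.group(1))
        if low <= value <= high:
            return value
    return None


print(parse_rating("Rating: 4 out of 5. The answer is supported by the context."))
print(parse_rating("Groundedness score = 2"))
print(parse_rating("The answer seems fine."))
```

Returning `None` instead of a default number keeps unparseable judge replies visible, so they can be inspected rather than silently counted as a score.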

In [11]:
# Evaluate Query 1
user_input_1 = "Who are the authors of this article and who published this article ?"
print("\nEvaluating Query 1:")
rag_system.calculate_rating(question=user_input_1)


Evaluating Query 1:

--- Calculating Ratings for Question: 'Who are the authors of this article and who published this article ?' ---
Retrieving 3 relevant documents for the query.
RAG prompt created.
Generating LLM response using model: deepseek-r1
LLM response generated.
Rating overall answer quality (groundedness and relevance)...
Rating groundedness...
Retrieving 3 relevant documents for the query.
Generating LLM response using model: deepseek-r1
LLM response generated.
Rating relevance...
Retrieving 3 relevant documents for the query.
Generating LLM response using model: deepseek-r1
LLM response generated.

--- Results ---
Question: 
 Who are the authors of this article and who published this article ?

Answer: 
 <think>
Okay, let's look at the user's question carefully. The user wants to know who the authors of the article are and who published it. 

First, I'll scan through the provided context for any information about authors or publishers. In the first part of the document h

In [12]:
# Evaluate Query 2
user_input_2 = "List down the three leadership characteristics in bulleted points and explain each one of the characteristics under two lines."
print("\nEvaluating Query 2:")
# No parameter overrides are passed here, so the answer is regenerated with the default settings
rag_system.calculate_rating(question=user_input_2)


Evaluating Query 2:

--- Calculating Ratings for Question: 'List down the three leadership characteristics in bulleted points and explain each one of the characteristics under two lines.' ---
Retrieving 3 relevant documents for the query.
RAG prompt created.
Generating LLM response using model: deepseek-r1
LLM response generated.
Rating overall answer quality (groundedness and relevance)...
Rating groundedness...
Retrieving 3 relevant documents for the query.
Generating LLM response using model: deepseek-r1
LLM response generated.
Rating relevance...
Retrieving 3 relevant documents for the query.
Generating LLM response using model: deepseek-r1
LLM response generated.

--- Results ---
Question: 
 List down the three leadership characteristics in bulleted points and explain each one of the characteristics under two lines.

Answer: 
 <think>
Okay, let's tackle this query step by step. The user wants a list of three leadership characteristics from Apple's context, presented in bulleted p

In [14]:
# Evaluate Query 3
user_input_3 = "Can you explain specific examples from the article where Apple's approach to leadership has led to successful innovations?"
print("\nEvaluating Query 3:")
rag_system.calculate_rating(question=user_input_3)


Evaluating Query 3:

--- Calculating Ratings for Question: 'Can you explain specific examples from the article where Apple's approach to leadership has led to successful innovations?' ---
Retrieving 3 relevant documents for the query.
RAG prompt created.
Generating LLM response using model: deepseek-r1
LLM response generated.
Rating overall answer quality (groundedness and relevance)...
Rating groundedness...
Retrieving 3 relevant documents for the query.
Generating LLM response using model: deepseek-r1
LLM response generated.
Rating relevance...
Retrieving 3 relevant documents for the query.
Generating LLM response using model: deepseek-r1
LLM response generated.

--- Results ---
Question: 
 Can you explain specific examples from the article where Apple's approach to leadership has led to successful innovations?

Answer: 
 <think>
Okay, let me analyze this carefully. The user wants specific examples from the article about how Apple's leadership approach contributed to successful inno

## Conclusion

This notebook provides a comprehensive walkthrough of setting up and using a RAG LLM system for document Q&A and evaluation. You can modify the parameters and queries to experiment further.