# <span style="color:blue">Section II: Capstone Project: Automated Technical Document Summarization using GenAI</span>


**Capstone Requirements**:    

This notebook demonstrates several required GenAI capabilities, including:

<span style="color:red">**Document Understanding, Embeddings, Vector Search, RAG, Agents (LangGraph), Structured Output (JSON), Few-Shot Prompting, and Gen AI Evaluation.**</span> 

## <span style="color:green">Step 1: Setup and Dependencies</span>
 
**Explanation**: This section handles the initial setup required for the project. It installs the necessary Python libraries, such as langchain, langgraph, google-genai, chromadb, and PyMuPDF (for PDF parsing), and then imports all the required modules into the notebook environment.
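
Once the install cell has finished, it can help to confirm which versions actually ended up in the environment, since `pip check` only reports dependency conflicts. The following is a minimal sketch using only the standard library; the `PACKAGES` list and the helper name `installed_versions` are illustrative assumptions, not part of the original notebook.

```python
from importlib import metadata

# Package names copied from the install cell; adjust to your environment.
PACKAGES = ["google-genai", "langchain", "langchain-google-genai",
            "langgraph", "chromadb", "PyMuPDF"]

def installed_versions(names):
    """Return {package: version or None} without importing the packages."""
    versions = {}
    for name in names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None  # not installed in this environment
    return versions

for name, version in installed_versions(PACKAGES).items():
    print(f"{name}: {version or 'NOT INSTALLED'}")
```

Reading the distribution metadata avoids importing heavy packages just to inspect their version.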

In [1]:
# Uninstall potentially conflicting base packages MORE aggressively
print("Uninstalling potentially conflicting packages...")
!pip uninstall -qqy kfp jupyterlab libpysal thinc spacy fastai ydata-profiling google-cloud-bigquery google-cloud-aiplatform tensorflow tensorflow-decision-forests pandas-gbq gcsfs google-api-core google-cloud-translate google-cloud-bigtable opentelemetry-proto protobuf

# Install core requirements with specific versions where needed
print("\nInstalling required packages...")
!pip install --upgrade --quiet \
    "google-genai==1.7.0" \
    "langchain" \
    "langchain-google-genai==2.1.2" \
    "langgraph==0.3.21" \
    "chromadb" \
    "PyMuPDF" \
    "requests" \
    "beautifulsoup4" \
    "faiss-cpu" 

# Check for broken dependencies
print("\nChecking for broken dependencies...")
!pip check

print("\n--- Installs Attempted ---")

Uninstalling potentially conflicting packages...

Installing required packages...

In [2]:
# PyMuPDF
import fitz  

# Import other libraries
import requests
import numpy as np
import os
import json
from tqdm import tqdm
from bs4 import BeautifulSoup

# Display and Secrets
from IPython.display import display, Markdown
from kaggle_secrets import UserSecretsClient

# LangChain & Google GenAI specific
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain_google_genai import ChatGoogleGenerativeAI
from langchain.agents import Tool 
from google import genai
from google.genai import types

# ChromaDB specific imports
import chromadb
from chromadb import Documents, EmbeddingFunction, Embeddings
from chromadb.config import Settings

# LangGraph specific
from typing import TypedDict, Annotated, Sequence, List
from langgraph.graph import StateGraph, END
from langgraph.graph.message import add_messages
from langchain_core.messages import HumanMessage, AIMessage, ToolMessage
from langchain_core.runnables import RunnableLambda

# Utilities
import operator
import pprint

print("--- Imports Loaded ---")

--- Imports Loaded ---


## <span style="color:green">Step 2: API Configuration</span>

**Explanation**: Securely access the Google API Key using Kaggle Secrets. This key is essential for authenticating requests to the Gemini API (both Pro and Flash models) and the embedding model. An instance of the genai.Client is created using this key.
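
Outside Kaggle, the secrets client is not available, so a fallback to an environment variable can be convenient. The sketch below illustrates one way to do this; `resolve_api_key` and its `secret_getter` parameter are hypothetical names introduced for this example only.

```python
import os

def resolve_api_key(secret_name="GOOGLE_API_KEY", secret_getter=None):
    """Return the API key from a secrets provider, else from the environment.

    `secret_getter` is a hypothetical hook: any callable that takes the secret
    name and returns its value (for example UserSecretsClient().get_secret).
    """
    if secret_getter is not None:
        try:
            key = secret_getter(secret_name)
            if key:
                return key
        except Exception:
            pass  # provider unavailable; fall through to the environment
    key = os.environ.get(secret_name)
    if not key:
        raise RuntimeError(f"{secret_name} not found in secrets or environment")
    return key
```

On Kaggle this could be called as `resolve_api_key(secret_getter=UserSecretsClient().get_secret)`; elsewhere, calling it with no getter reads the environment variable only.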


In [3]:
from kaggle_secrets import UserSecretsClient


GOOGLE_API_KEY = UserSecretsClient().get_secret("GOOGLE_API_KEY")
client = genai.Client(api_key=GOOGLE_API_KEY)

## <span style="color:green">Step 3: Load Documents</span>

**Explanation**: This part focuses on acquiring the raw data. It defines functions to:

- Load text content from PDF files using PyMuPDF.
- Fetch and extract textual content from web URLs using requests and BeautifulSoup, removing common non-content HTML tags.

It then iterates through the specified PDF directory (/kaggle/input/pdf-files) and a list of web links, loading the text from each source into a list of dictionaries (raw_documents).

**GenAI Problem Solving**: This step addresses the challenge of handling diverse input formats, such as PDFs and web pages, and extracting the core text needed for AI processing.
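
Because the loaders in this step have no explicit error handling, a single unreachable URL or unreadable PDF stops the whole loading loop. A minimal sketch of a more tolerant loop is shown below; `load_all` and `fake_loader` are hypothetical helpers, and the stand-in loader replaces the real PDF and web loaders so the example runs offline.

```python
def load_all(sources, loader, kind):
    """Apply `loader` to each source, skipping (and recording) failures."""
    documents, failures = [], []
    for source in sources:
        try:
            text = loader(source)
        except Exception as exc:  # e.g. network errors, HTTP 4xx/5xx, bad PDFs
            failures.append((source, repr(exc)))
            continue
        if text:
            # Unlike the notebook, the source is stored as given (no basename)
            documents.append({"type": kind, "source": source, "text": text})
    return documents, failures

def fake_loader(source):
    # Stand-in for load_pdf_text / load_web_text so the sketch runs offline
    if "broken" in source:
        raise IOError("simulated failure")
    return f"text from {source}"

docs, failed = load_all(["a.pdf", "broken.pdf"], fake_loader, kind="pdf")
print(len(docs), "loaded;", len(failed), "failed")
```

Collecting the failures rather than silently dropping them makes it easy to report which sources are missing from the index.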


In [4]:
def load_pdf_text(pdf_path: str) -> str:
    """
    Extracts all text content from a given PDF file using PyMuPDF.
    Args:
        pdf_path: The file system path to the PDF file.
    Returns:
        A single string containing all extracted text from the PDF.
    """
    doc = fitz.open(pdf_path)
    text = "\n".join(page.get_text() for page in doc)
    
    doc.close()
    return text

def load_web_text(url: str) -> str:
    """
    Extracts the main textual content from a given URL using requests and BeautifulSoup.
    Removes common script, style, nav, footer, header, and aside tags.
    Args:
        url: The URL of the web page.
    Returns:
        A string containing the extracted text.
    Raises:
        requests.RequestException: If the request fails or returns a 4xx/5xx status
        (this simplified version has no explicit error handling).
    """
    response = requests.get(url, timeout=15)
    response.raise_for_status() # Will raise HTTPError for bad responses (4xx or 5xx)
    soup = BeautifulSoup(response.text, "html.parser")
    for tag in soup(["script", "style", "nav", "footer", "header", "aside"]):
        tag.decompose()
    text = soup.get_text(separator=' ', strip=True)
    return text

# --- Define Data Sources ---
pdf_dir = "/kaggle/input/pdf-files"
pdf_files = [os.path.join(pdf_dir, f) for f in os.listdir(pdf_dir) if f.endswith(".pdf")]

web_links = [
    "https://medium.com/thedeephub/building-vision-transformer-from-scratch-using-pytorch-an-image-worth-16x16-words-24db5f159e27",
    "https://medium.com/data-science/attention-for-vision-transformers-explained-70f83984c673",
    "https://medium.com/data-science/vision-transformers-explained-a9d07147e4c8",
    "https://medium.com/analytics-vidhya/understanding-the-vision-transformer-and-counting-its-parameters-988a4ea2b8f3",
    "https://becominghuman.ai/transformers-in-vision-e2e87b739feb",
    "https://tintn.github.io/Implementing-Vision-Transformer-from-Scratch/",
    "https://sh-tsang.medium.com/review-bridging-the-gap-between-vision-transformers-and-convolutional-neural-networks-on-small-faa0bc8e50ad",
    "https://arxiv.org/html/2504.06158",
    "https://arxiv.org/html/2504.03108",
    "https://doi.org/10.48550/arXiv.2504.04749",
    "https://arxiv.org/html/2504.07468",
    "https://arxiv.org/html/2504.08481",
]

# --- Load and Wrap Documents ---
raw_documents = []
for pdf in pdf_files: 
    text = load_pdf_text(pdf)
    if text: 
        raw_documents.append({"type": "pdf", "source": os.path.basename(pdf), "text": text})

for url in web_links:
    text = load_web_text(url)
    if text: 
        raw_documents.append({"type": "web", "source": url, "text": text})

In [5]:
# To check resources loaded to the raw_documents
raw_documents[15]["source"]

'https://arxiv.org/html/2504.03108'

In [6]:
len(raw_documents[15]["text"])

63183

## <span style="color:green">Step 4: Chunking and Metadata</span>

**Explanation**: Large documents need to be broken down into smaller, manageable pieces (chunks) for effective processing by embedding models and LLMs, which have context window limitations. This section uses LangChain's RecursiveCharacterTextSplitter to split the text loaded in Step 3. Each chunk retains metadata linking it back to its original source document.

**GenAI Problem Solving**: Chunking is a standard technique in RAG to prepare data for vectorization and retrieval, enabling the system to pinpoint relevant sections within large documents.
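
To make the roles of `chunk_size` and `chunk_overlap` concrete, here is a deliberately simplified fixed-window chunker. It is not how RecursiveCharacterTextSplitter works internally (that splitter prefers to break at separators such as paragraph and line boundaries); the function name `sliding_window_chunks` is an assumption for illustration.

```python
def sliding_window_chunks(text, chunk_size=1500, chunk_overlap=200):
    """Split text into fixed-size windows that share `chunk_overlap` characters."""
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap  # how far each window advances
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last window already reaches the end of the text
    return chunks

# Tiny parameters so the overlap is easy to see
print(sliding_window_chunks("abcdefghij", chunk_size=4, chunk_overlap=1))
```

Each window starts `chunk_size - chunk_overlap` characters after the previous one, so the tail of one chunk reappears at the head of the next and sentences cut at a boundary remain retrievable.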

In [11]:
splitter = RecursiveCharacterTextSplitter(
    chunk_size=1500,   
    chunk_overlap=200,
)

chunked_documents = []
for doc in raw_documents:
    if 'text' in doc and isinstance(doc['text'], str):
        # Split the text content
        chunks = splitter.split_text(doc['text']) 
        # Iterate through the TEXT chunks
        for i, chunk_text in enumerate(chunks): 
            chunk_id = f"{doc['source']}_chunk_{i}"
            chunked_documents.append({
                'text': chunk_text,
                'metadata': {
                    'source': doc['source'],
                    'type': doc['type'],
                    'chunk_id': chunk_id
                }
            })
    else:
         print(f"Warning: Skipping document with missing or invalid text field: {doc.get('source', 'Unknown')}")

## <span style="color:green">Step 5: ChromaDB Setup and Indexing</span>

**Explanation**: This section converts the text chunks into numerical representations (embeddings) using the Gemini embedding model (models/text-embedding-004), called through the google-genai client.    
These embeddings capture the semantic meaning of the text. The generated embeddings are then stored in a persistent ChromaDB collection, which allows for efficient similarity searching.

**GenAI Problem Solving**: This is the core of the retrieval mechanism. By representing text as vectors, we can mathematically find chunks whose meaning is closest to a user's query vector.
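
The idea of finding the chunks "closest" to a query vector can be illustrated with a brute-force search over toy vectors. This is a sketch only: the three-dimensional vectors are made up, Euclidean distance is used as one common choice, and a real vector store relies on an approximate index rather than a full sort.

```python
import math

def l2_distance(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def top_k(query_vec, indexed, k=2):
    """indexed: list of (chunk_id, vector). Return the k closest chunk ids."""
    ranked = sorted(indexed, key=lambda item: l2_distance(query_vec, item[1]))
    return [chunk_id for chunk_id, _ in ranked[:k]]

# Made-up embeddings; real ones have hundreds of dimensions
toy_index = [
    ("attention_chunk", [0.9, 0.1, 0.0]),
    ("cnn_chunk", [0.1, 0.9, 0.0]),
    ("optimizer_chunk", [0.0, 0.1, 0.9]),
]

print(top_k([0.8, 0.2, 0.0], toy_index, k=2))
```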

In [14]:
class GeminiEmbeddingFunction(EmbeddingFunction):
    """
    Custom embedding function for ChromaDB using the Google Gemini API.
    Always embeds with the "retrieval_document" task type (queries included)
    and respects the API's batch size limit.
    """
    def __init__(self):
        """Initializes the embedding function."""
        super().__init__() 

    def __call__(self, input_texts: Documents) -> Embeddings:
        """
        Generates embeddings for a list of texts, handling API batch limits.
        Args:
            input_texts: A list of strings (documents or queries) to embed.
        Returns:
            A list of embeddings (each embedding is a list of floats).
        """
        embedding_task = "retrieval_document" 

        all_embeddings = []
        API_BATCH_SIZE = 100 

        for i in range(0, len(input_texts), API_BATCH_SIZE):
            batch = input_texts[i : i + API_BATCH_SIZE]
            response = client.models.embed_content(
                model="models/text-embedding-004",
                contents=batch,
                config=types.EmbedContentConfig(task_type=embedding_task)
                 )
            all_embeddings.extend([e.values for e in response.embeddings])
        return all_embeddings

# Initialize ChromaDB 
DB_NAME = "capstone_rag_db" 
#chroma_client = chromadb.Client()
db_directory = "./chroma_capstone_db"
chroma_client = chromadb.PersistentClient(path=db_directory)

# Get or create the collection
print(f"Creating/loading ChromaDB collection: {DB_NAME}")
db_collection = chroma_client.get_or_create_collection(
    name=DB_NAME,
    embedding_function=GeminiEmbeddingFunction()
)

doc_texts = [doc['text'] for doc in chunked_documents]

doc_ids = [doc['metadata']['chunk_id'] for doc in chunked_documents] 
doc_metadatas = [{'source': doc['metadata']['source'], 'type': doc['metadata']['type']} for doc in chunked_documents]

print(f"Adding {len(doc_texts)} documents to ChromaDB...")
batch_size = 500 
for i in tqdm(range(0, len(doc_texts), batch_size), desc="Adding to ChromaDB"):
    db_collection.add(
        documents=doc_texts[i:i+batch_size], 
        metadatas=doc_metadatas[i:i+batch_size],
        ids=doc_ids[i:i+batch_size]
    )

print(f"--- Added {db_collection.count()} documents to ChromaDB collection '{DB_NAME}' ---")



Creating/loading ChromaDB collection: capstone_rag_db
Adding 663 documents to ChromaDB...


Adding to ChromaDB: 100%|██████████| 2/2 [00:07<00:00,  3.62s/it]

--- Added 663 documents to ChromaDB collection 'capstone_rag_db' ---





## <span style="color:green">Step 6: Evaluation Tool Setup</span>

**Explanation**: To fulfill the "Gen AI Evaluation" requirement, we define a custom evaluation mechanism. This involves:
- A Python function (evaluate_summary) that takes a generated summary (as a JSON string) and the original context.
- Inside the function, a different, typically faster and cheaper LLM (Gemini Flash) is prompted to assess the summary on criteria such as faithfulness, completeness, conciseness, and reference accuracy.
- The function is then wrapped in LangChain's Tool class, making it callable within the agent workflow.

**GenAI Problem Solving**: Provides an automated way to assess the quality of the AI-generated summary, adding a layer of quality control.
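
The LLM-as-judge pattern can be separated into two parts: building the prompt and calling a judge model. The sketch below keeps the judge as an injectable callable so the logic can be exercised without an API key; `build_eval_prompt`, `evaluate_with`, and the stub judge are hypothetical names for this example.

```python
CRITERIA = ("Faithfulness", "Completeness", "Conciseness", "Reference Check")

def build_eval_prompt(summary_json_str, context, max_context_len=8000):
    """Assemble the evaluation prompt, truncating the context to a fixed budget."""
    truncated = context[:max_context_len]
    return (
        "Evaluate the following summary based on the context.\n"
        f"Context (potentially truncated):\n---\n{truncated}\n---\n"
        f"Generated Summary (JSON String):\n---\n{summary_json_str}\n---\n"
        f"Criteria: {', '.join(CRITERIA)}.\n"
        "Provide a brief assessment."
    )

def evaluate_with(judge, summary_json_str, context):
    """`judge` is any callable mapping a prompt string to a response string."""
    return judge(build_eval_prompt(summary_json_str, context))

# Stub judge standing in for a real model call
print(evaluate_with(lambda prompt: f"received {len(prompt)} characters",
                    '{"answer": "x", "sources": []}', "some context"))
```

In the notebook, the judge role is played by the Gemini Flash call; swapping in a stub like this makes the prompt construction easy to unit test.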

In [15]:
def evaluate_summary(summary_json_str: str, context: str) -> str:
    """
    Evaluates a summary JSON string against its original context using Gemini Flash.
    Args:
        summary_json_str: The summary to evaluate (as a JSON string).
        context: The original context used to generate the summary.
    Returns:
        A string containing evaluation feedback from Gemini Flash.
    """
    summary_text_for_eval = summary_json_str 

    max_context_len = 8000 
    eval_prompt = f"""
    Evaluate the following summary based on the context.
    Context (potentially truncated):
    ---
    {context[:max_context_len]}
    ---
    Generated Summary (JSON String):
    ---
    {summary_text_for_eval}
    ---
    Criteria: Faithfulness, Completeness, Conciseness, Reference Check.
    Provide a brief assessment.
    """
    eval_llm = ChatGoogleGenerativeAI(
        model="models/gemini-1.5-flash",
        google_api_key=GOOGLE_API_KEY,
    )
    response = eval_llm.invoke(eval_prompt)
    return response.content

# Wrap in a LangChain Tool
evaluation_tool = Tool(
    name="evaluate_summary_quality",
    func=evaluate_summary,
    description="Evaluates summary quality. Input requires 'summary_json_str' and 'context'."
)

## <span style="color:green">Step 7: LangGraph State and Nodes</span>

**Explanation**: This section defines the agent's structure and processing steps using LangGraph.  

**State Definition**: An AgentState class (using TypedDict) defines the structure of data passed between steps. Key fields include:
- **messages**: A sequence holding the conversation history (human queries, AI responses, evaluation results), managed using add_messages.
- **context_string**: Stores the formatted context retrieved from the vector store.
- **initial_summary_json**: Holds the structured summary generated by the primary LLM.
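
The difference between a field managed by a reducer (messages) and plain fields (context_string, initial_summary_json) can be sketched in a few lines of pure Python. This is a simplified stand-in, not LangGraph's implementation: `apply_update` and `append_messages` are hypothetical helpers, and the real add_messages also merges messages by ID.

```python
def apply_update(state, update, reducers):
    """Merge a node's partial update into the state.

    Keys with a reducer are combined as reducer(old, new);
    all other keys are simply overwritten.
    """
    merged = dict(state)
    for key, value in update.items():
        if key in reducers and key in merged:
            merged[key] = reducers[key](merged[key], value)
        else:
            merged[key] = value
    return merged

def append_messages(existing, new):
    # Simplified: append only. The real add_messages also handles message IDs.
    return list(existing) + list(new)

reducers = {"messages": append_messages}
state = {"messages": ["human: query"]}
state = apply_update(state, {"context_string": "ctx"}, reducers)
state = apply_update(state, {"messages": ["ai: evaluation"]}, reducers)
print(state)
```

This is why a node can return only the keys it changed: the message history grows, while the other fields are replaced.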

**Node Definitions**: Each core step is defined as a Python function (node):
- **embed_and_search_node**: Embeds the latest user query and searches the ChromaDB vector store. It retrieves relevant document chunks, formats them into a context_string (stored in the state), and appends an AIMessage to the history containing this context and instructions for the summarization node.
- **generate_summary_node**: Calls the Gemini 1.5 Pro model using the message history (which includes the query and the context/instructions from the previous node). It requests and parses the structured JSON output, storing it in the initial_summary_json state field. *(Note: The output-format instructions are part of the prompt message prepared by the embed_and_search_node; the prompt shown in this notebook does not include explicit few-shot examples.)*
- **evaluate_node**: Directly calls the underlying Python evaluation function. This function uses Gemini 1.5 Flash to assess the quality of the summary (from initial_summary_json) based on the retrieved context (from context_string). The textual evaluation result is then added back into the messages sequence as a new AIMessage.

**GenAI Problem Solving**: LangGraph orchestrates these nodes, managing the state transitions and enabling a complex RAG and evaluation process to be implemented as a manageable, potentially conditional, flow.
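
Parsing the model's structured output is the most fragile part of the summarization node, since a reply may arrive wrapped in a Markdown fence or be missing a field. The following sketch shows one defensive way to handle this; `parse_json_reply` is a hypothetical helper, and the required field names are taken from the prompt used in this notebook.

```python
import json
import re

FENCE = "`" * 3  # built at runtime so this example contains no literal fence
FENCE_RE = re.compile(
    r"^" + FENCE + r"[a-zA-Z]*\s*(.*?)\s*" + FENCE + r"\s*$", re.DOTALL
)
REQUIRED_FIELDS = {"answer", "sources"}

def parse_json_reply(raw):
    """Strip an optional Markdown fence, parse JSON, and check required fields."""
    text = raw.strip()
    match = FENCE_RE.match(text)
    if match:
        text = match.group(1)
    data = json.loads(text)  # raises json.JSONDecodeError (a ValueError) if invalid
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    missing = REQUIRED_FIELDS - data.keys()
    if missing:
        raise ValueError(f"reply is missing fields: {sorted(missing)}")
    return data

fenced_reply = FENCE + 'json\n{"answer": "x", "sources": ["a.pdf"]}\n' + FENCE
print(parse_json_reply(fenced_reply))
```

Raising a ValueError for both malformed JSON and missing fields gives the caller a single exception type to catch, for example to retry the generation.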

In [16]:
# --- Define the State ---
class AgentState(TypedDict):
    messages: Annotated[Sequence[HumanMessage | AIMessage | ToolMessage], add_messages]
    # TypedDict fields cannot carry default values; None is allowed through the type instead
    context_string: str | None
    initial_summary_json: dict | None

# --- Define Nodes ---
def embed_and_search_node(state: AgentState) -> AgentState:
    """
    Embeds the query and searches the ChromaDB vector store for relevant documents.
    Args:
        state (AgentState): The current state of the agent, containing the messages.
    Returns:
        AgentState: The updated state with the search results added.
    """
    print("---EMBEDDING AND SEARCHING---")
    messages = state["messages"]
    query = messages[-1].content  

    # Ensure db_collection is initialized and accessible
    if 'db_collection' not in globals():
        print("Error: db_collection is not initialized.")
       
        state["messages"].append(
            AIMessage(content="Error: Vector database collection is not available.")
        )
        return state

    print(f"Querying ChromaDB collection '{DB_NAME}' for: '{query}'")

    # Query ChromaDB 
    results = db_collection.query(
        query_texts=[query],
        n_results=10,  
        include=['documents', 'metadatas', 'distances'] 
    )

    # Process results
    documents = results.get('documents', [[]])[0] 
    metadatas = results.get('metadatas', [[]])[0] 
    
    print(f"Found {len(documents)} relevant documents.")
    
    # Construct context string from retrieved documents and their metadata
    context_items = []
    unique_sources_list = set() 
    for doc, meta in zip(documents, metadatas):
        source_name = meta.get('source', 'Unknown') # Get source name
        unique_sources_list.add(source_name) # Add to set
        context_items.append(doc)

    # Join the document text snippets
    context_str = "\n\n---\n\n".join(context_items)
    # Convert the set of unique sources to a sorted list for the prompt
    sources_for_prompt = sorted(list(unique_sources_list))

    # Add the context string directly to the state for other nodes
    state["context_string"] = context_str

    prompt_instruction = f"""You are an AI assistant. You will be provided with text snippets retrieved from various documents below. Use ONLY this provided text content to answer the user's query. Do not attempt to access external websites or files, even if they are mentioned.

    User Query: '{query}'
    --- START OF PROVIDED TEXT CONTEXT ---
    {context_str}
    --- END OF PROVIDED TEXT CONTEXT ---
    Based *only* on the text provided above, generate a JSON object containing the fields 'answer' and 'sources'.
    - The 'answer' field should contain a comprehensive, synthesized answer to the User Query.
    - The 'sources' field MUST be a list of strings containing ALL unique source document names that were provided in the context above. The source names identified in the context were: {sources_for_prompt}. List only these names.
    Respond ONLY with the valid JSON object.
    """
    
    # Append the results as context for the LLM
    message = AIMessage(content=prompt_instruction, name="EmbedAndSearch")
    state["messages"].append(message)
    return state

def generate_summary_node(state: AgentState) -> dict:
    """
    Generates the initial summary JSON using the Gemini Pro model. It takes
    the prepared prompt message from the state.

    Args:
        state (AgentState): The current graph state containing the message history.
                            The last message includes the prompt for summarization.

    Returns:
        dict: A dictionary containing the update for the state:
              - 'initial_summary_json': The parsed JSON summary dictionary.
    """
    messages = state['messages']
    llm = ChatGoogleGenerativeAI(
        model="models/gemini-1.5-pro", google_api_key=GOOGLE_API_KEY,
        generation_config={"response_mime_type": "application/json"}
    )

    # Invoke the LLM with the message history (Query + Context/Instructions)
    response_message = llm.invoke(messages)
    raw_content = response_message.content
    
    # Basic cleaning of markdown fences
    if raw_content.startswith("```json"):
        raw_content = raw_content[len("```json"):].strip()
    if raw_content.endswith("```"):
        raw_content = raw_content[:-len("```")].strip()

    # Parse JSON - assumes valid JSON is returned
    summary_json = json.loads(raw_content) 

    # Return the parsed JSON summary
    return {"initial_summary_json": summary_json}


def evaluate_node(state: AgentState) -> dict:
    """
    Directly calls the evaluation function (which uses Gemini Flash) using
    the generated summary and context stored in the state. Adds the evaluation
    result as a new AIMessage to the message history.
    Args:
        state (AgentState): The current graph state, expected to contain 'initial_summary_json' and 'context_string'.
    Returns:
        dict: A dictionary containing the update for the state:
              - 'messages': A list containing a new AIMessage with the evaluation result prefixed.
    """
    summary_data = state["initial_summary_json"]
    full_context = state["context_string"]
    summary_json_str = json.dumps(summary_data)

    eval_result = evaluation_tool.func(summary_json_str=summary_json_str, context=full_context)

    # Return result wrapped in an AIMessage
    return {"messages": [AIMessage(content=f"EVALUATION_RESULT:\n{eval_result}")]}

## <span style="color:green">Step 8: Graph Compilation </span>

**Explanation**: The nodes defined previously are assembled into a computational graph using StateGraph. Edges define the flow: the user query triggers the search, the search results feed the summarizer, and the summary feeds the evaluator. The graph is then compiled into an executable agent_executor.
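
Conceptually, a linear graph is just a mapping from each node to its successor, walked until an end marker is reached. The sketch below imitates that flow with stub nodes in pure Python; `run_linear_graph`, the local `END` sentinel, and the stub functions are illustrative assumptions and do not reflect how LangGraph is implemented.

```python
END = "__end__"  # local sentinel for this sketch only

def run_linear_graph(nodes, edges, entry, state):
    """Follow `edges` from `entry` until END, merging each node's update."""
    current, visited = entry, []
    while current != END:
        update = nodes[current](state)   # each node returns a partial update
        state = {**state, **update}      # plain overwrite merge (no reducers)
        visited.append(current)
        current = edges[current]
    return state, visited

# Stub nodes standing in for the real search, summarize, and evaluate steps
nodes = {
    "embed_and_search": lambda s: {"context_string": f"context for {s['query']}"},
    "generate_summary": lambda s: {
        "initial_summary_json": {"answer": s["context_string"].upper()}
    },
    "evaluate_summary": lambda s: {"evaluation": "ok"},
}
edges = {
    "embed_and_search": "generate_summary",
    "generate_summary": "evaluate_summary",
    "evaluate_summary": END,
}

final_state, order = run_linear_graph(
    nodes, edges, "embed_and_search", {"query": "attention"}
)
print(order)
```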

In [17]:
graph_builder = StateGraph(AgentState)

# Add nodes using simplified functions
graph_builder.add_node("embed_and_search", embed_and_search_node)
graph_builder.add_node("generate_summary", generate_summary_node)
graph_builder.add_node("evaluate_summary", evaluate_node)

# Define the simple linear flow
graph_builder.set_entry_point("embed_and_search")
graph_builder.add_edge("embed_and_search", "generate_summary")
graph_builder.add_edge("generate_summary", "evaluate_summary")
graph_builder.add_edge("evaluate_summary", END)

# Compile the graph
agent_executor = graph_builder.compile()

## <span style="color:green">Step 9: Agent Execution and Output Display </span>

**Explanation**: This is the final step, where the compiled agent is invoked with a sample query about attention blocks in Transformer-based architectures. The agent executes the defined workflow (search -> summarize -> evaluate). The output section then extracts and displays:
- The structured summary generated by Gemini Pro (formatted as Markdown for readability).
- The evaluation feedback provided by Gemini Flash (retrieved from the agent's final state).

**GenAI Problem Solving**: Demonstrates the end-to-end execution of the GenAI pipeline, delivering the final summary and its quality assessment.
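
Reading the evaluation from the final state relies on the last message carrying the EVALUATION_RESULT prefix. A slightly more defensive lookup is sketched below; `find_evaluation` is a hypothetical helper, and SimpleNamespace objects stand in for real message objects that expose a `content` attribute.

```python
from types import SimpleNamespace

PREFIX = "EVALUATION_RESULT:\n"

def find_evaluation(messages):
    """Return the text after PREFIX from the most recent matching message, else None."""
    for message in reversed(messages):
        content = getattr(message, "content", "")
        if isinstance(content, str) and content.startswith(PREFIX):
            return content[len(PREFIX):]
    return None

# Stand-ins for HumanMessage / AIMessage objects
messages = [
    SimpleNamespace(content="Please explain attention blocks"),
    SimpleNamespace(content=PREFIX + "Faithful but incomplete."),
]
print(find_evaluation(messages))
```

Returning None instead of raising lets the display code show a fallback note when no evaluation message is present.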

In [18]:
query = "Please explain attention blocks in Transformer-based architectures"
initial_state = {"messages": [HumanMessage(content=query)]}

# Invoke the agent - Assumes successful execution up to this point
final_state = agent_executor.invoke(initial_state, {"recursion_limit": 15})

# --- Format and Display the Final Output ---

print("\n--- Generated Summary (Formatted) ---")
summary_data = final_state["initial_summary_json"]

markdown_output_lines = []

# Add Title 
markdown_output_lines.append("# Summary")

# Add Answer Content
markdown_output_lines.append(summary_data['answer']) 

# Add References
if 'sources' in summary_data and summary_data['sources']:
    markdown_output_lines.append("\n## References")
    # Clean up and deduplicate the sources list
    unique_refs = sorted(list(set(ref.split(", Chunk:")[0] for ref in summary_data['sources'])))
    
    for ref in unique_refs:
        markdown_output_lines.append(f"- <font color='blue'>{ref}</font>")
        

# Join lines with double newlines for paragraph spacing and display
formatted_summary = "\n\n".join(markdown_output_lines)
display(Markdown(formatted_summary))

# ---- Display Evaluation Result ----
display(Markdown("## <font color='red'><b>--- Evaluation Result ---</b></font>"))

last_msg_content = final_state['messages'][-1].content

evaluation_result_text = last_msg_content.split("EVALUATION_RESULT:\n", 1)[1]
display(Markdown(evaluation_result_text))
print("\n##--- End of the Summary Results ---")

---EMBEDDING AND SEARCHING---
Querying ChromaDB collection 'capstone_rag_db' for: 'Please explain attention blocks in Transformer-based architectures'
Found 10 relevant documents.

--- Generated Summary (Formatted) ---


# Summary

Attention blocks are crucial components within Transformer architectures.  Each Transformer layer consists of a multi-head attention module and a feed-forward network, supplemented by Layer normalization layers and skip connections for stability and scaling.  The multi-head attention mechanism allows the model to focus on relevant words in a sequence when processing a specific word, enhancing performance by capturing relationships between words regardless of distance. The attention mechanism involves Query, Key, and Value parameters derived from the input sequence, which are processed through separate linear layers and combined using an attention formula to produce attention scores. In decoders, self-attention and encoder-decoder attention mechanisms are present. Self-attention relates every word in the target sequence to every other word in the same sequence.  Encoder-decoder attention integrates information from both the input and target sequences.  This attention block structure is key to the Transformer's parallel processing capability, enabling faster computation compared to sequential models.


## References

- <font color='blue'>Transformers Explained Visually (Part 1)_ Overview of Functionality _ Towards Data Science.pdf</font>

- <font color='blue'>Transformers Explained Visually (Part 2)_ How it works step-by-step _ Towards Data Science.pdf</font>

- <font color='blue'>Transformers Explained Visually (Part 3)_ Multi-head Attention deep dive _ Towards Data Science.pdf</font>

- <font color='blue'>Visual attention network.pdf</font>

- <font color='blue'>https://tintn.github.io/Implementing-Vision-Transformer-from-Scratch/</font>

## <font color='red'><b>--- Evaluation Result ---</b></font>

The summary is fairly faithful to the provided text, accurately describing the core components of the Transformer architecture (multi-head attention, feed-forward networks, skip connections, layer normalization).  It correctly explains the roles of Query, Key, and Value in the attention mechanism and the distinction between self-attention and encoder-decoder attention. The explanation of parallel processing and its speed advantage over sequential models is also accurate.

However, the summary lacks completeness.  While it covers the main elements, it omits crucial details like the specific attention formula used and the internal workings of the MLP.  The reference check is partially incomplete; while some sources seem relevant (based on titles), the actual content of those sources isn't verifiable from what's given. The provided JSON lacks specific page numbers or sections within the documents, making verification difficult.  Finally, the sources listed seem to be related to a specific blog post series rather than the research papers cited in the provided context.  The summary could be more concise by removing some slightly redundant phrasing.

In short:  Good overview, but needs more detail, better source referencing, and improved conciseness to be considered excellent.


##--- End of the Summary Results ---
