## Week 3 Lab Manual
### Foundations of Deep Learning & AI Functionality

**Instructor Note**: This lab manual provides the aim, code, and explanation for each practical task. Focus on the architectural patterns and the transition from theoretical concepts to functional AI implementations.

---

# Week 3: Attention Mechanisms & Structured Output
## Turn Unstructured Text into Actionable Data

###  Weekly Table of Contents
1. [HuggingFace Pipelines](#HuggingFace-Pipelines)
2. [Comparing Text Generation (HuggingFace vs Gemini vs Ollama)](#-Lab-3.1:-Comparing-Text-Generation)
3. [Tokenizer Comparison & Model Performance](#-Lab-3.2:-Model-Comparison:-HuggingFace-vs-Gemini-vs-Ollama)
4. [Automated Meeting Minutes Creator](#-Lab-3.3:-Automated-Meeting-Minutes-Creator)
   - Structured Output with Pydantic
   - Multi-Model Analysis (Cloud vs Local)

###  Learning Objectives
LLMs are great at chatting, but they are *amazing* at data extraction. This week we master:
1.  **Transformers Core**: Understanding the "Attention Mechanism" that powers Gemini.
2.  **Structured Output**: Using Pydantic models to force LLMs to return JSON.
3.  **The Parser Pattern**: Transitioning from string responses to Python objects.
4.  **Hands-on Project**: Designing an automated **Meeting Minutes Extractor** using Gemini 1.5 Flash.

---

###  3.1 Deep Learning Deep Dive: Attention

#### What is Self-Attention?
In a sentence like *"The animal didn't cross the street because **it** was too tired"*, how do we know **"it"** refers to the animal and not the street?
*   **Self-Attention**: While processing a word, the model assigns a weight to every word in the sentence, including the word itself. Here, "it" places most of its weight on "animal".
*   **Multi-Head Attention**: The model looks at the sentence through multiple "lenses" at once: one head might track grammar, another logic, another pronoun references, and so on.
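
The weighting idea above can be sketched in plain Python. This is a minimal illustration of scaled dot-product attention, assuming made-up 2-dimensional embeddings and using the same vectors as queries, keys, and values; real models apply learned projections first and run many heads in parallel.

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def self_attention(queries, keys, values):
    """Scaled dot-product attention: softmax(Q K^T / sqrt(d)) V."""
    d = len(keys[0])
    outputs, all_weights = [], []
    for q in queries:
        # Similarity of this query with every key, scaled by sqrt(d)
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d) for k in keys]
        weights = softmax(scores)
        # The output is the weighted average of the value vectors
        out = [sum(w * v[j] for w, v in zip(weights, values))
               for j in range(len(values[0]))]
        outputs.append(out)
        all_weights.append(weights)
    return outputs, all_weights

# Toy embeddings (invented for illustration, not taken from a real model)
tokens = ["animal", "street", "it"]
vectors = [[1.0, 0.0], [0.0, 1.0], [0.9, 0.1]]

_, weights = self_attention(vectors, vectors, vectors)
for tok, w in zip(tokens, weights[2]):
    print(f"it -> {tok}: {w:.2f}")
```

Because the toy vector for "it" was chosen to lie close to "animal", the largest weight in the last row falls on "animal"; in a trained model that closeness is learned rather than hand-set.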

#### Tokenization: The Gatekeeper
Modern models like **Gemini** and **Gemma** use subword tokenization (SentencePiece). This allows the model to understand rare words by breaking them into smaller meaningful chunks.
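
To make "breaking rare words into chunks" concrete, here is a toy greedy longest-match splitter. It is only a sketch: the vocabulary is invented, and the greedy strategy resembles WordPiece-style matching rather than the training and segmentation algorithms SentencePiece actually uses.

```python
def subword_tokenize(word, vocab, unk="<unk>"):
    """Greedy longest-match split of `word` into pieces found in `vocab`."""
    pieces, start = [], 0
    while start < len(word):
        end = len(word)
        # Shrink the candidate until it matches a vocabulary entry
        while end > start and word[start:end] not in vocab:
            end -= 1
        if end == start:
            return [unk]  # no piece matches here: treat the whole word as unknown
        pieces.append(word[start:end])
        start = end
    return pieces

# Hypothetical vocabulary, chosen by hand for this example
toy_vocab = {"token", "ization", "trans", "form", "ers"}

print(subword_tokenize("tokenization", toy_vocab))  # ['token', 'ization']
print(subword_tokenize("transformers", toy_vocab))  # ['trans', 'form', 'ers']
print(subword_tokenize("xyz", toy_vocab))           # ['<unk>']
```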

---

## Setup: Gemini 1.5 Flash Configuration

Configures Google's Gemini 1.5 Flash API for online AI operations.

In [None]:
# 📦 WEEK 3 INITIALIZATION
import os
import json
import torch
import requests
import pandas as pd
from datetime import datetime
from dotenv import load_dotenv
import google.generativeai as genai
import ollama
from IPython.display import Markdown, display

# HUGGINGFACE & LANGCHAIN
from transformers import pipeline, AutoTokenizer, AutoModel, AutoConfig
from langchain_google_genai import ChatGoogleGenerativeAI
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.output_parsers import JsonOutputParser, StrOutputParser
from pydantic import BaseModel, Field
from typing import List, Optional

# --- CONFIGURATION ---
load_dotenv(override=True)
MODEL = 'gemini-1.5-flash'
LOCAL_MODEL = 'gemma2:2b'

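GEMINI_API_KEY = os.getenv('GEMINI_API_KEY') or os.getenv('GOOGLE_API_KEY')
if GEMINI_API_KEY:
    genai.configure(api_key=GEMINI_API_KEY)
    gemini_model = genai.GenerativeModel(MODEL)  # shared by the labs below
else:
    gemini_model = None
    print("⚠️ WARNING: GEMINI_API_KEY not found.")

# The labs call ollama_client.is_running() and ollama_client.generate(model, prompt).
# Neither is part of the ollama package, so this thin wrapper (an added helper) provides them.
class OllamaClient:
    def is_running(self):
        """Return True if the local Ollama server answers a list request."""
        try:
            ollama.list()
            return True
        except Exception:
            return False

    def generate(self, model, prompt):
        """Generate a completion; the text is available under the 'response' key."""
        return ollama.generate(model=model, prompt=prompt)

ollama_client = OllamaClient()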

# HuggingFace Setup
device = 0 if torch.cuda.is_available() else -1
classifier_hf = pipeline("sentiment-analysis", model="distilbert-base-uncased-finetuned-sst-2-english", device=device)

print(f"✅ Week 3 Ready. Cloud Model: {MODEL} | Local Model: {LOCAL_MODEL}")


##  Lab 3.1: Comparing Text Generation (HuggingFace vs Gemini vs Ollama)
**Aim**: To contrast high-level results across three distinct environments: Local Transformers (GPT-2), Cloud APIs (Gemini 1.5 Flash), and Local LLM runners (Ollama/Gemma 2:2b).

**Explanation**:
This lab evaluates both **Creative Generation** and **Sentiment Analysis** to see how reasoning depth varies by model size and hosting style. It highlights the differences between:
1.  **HuggingFace**: Running open-source models (like GPT-2) directly in your memory.
2.  **Gemini Cloud**: High-performance, large-scale reasoning via API.
3.  **Local Runners**: Managing local models via Ollama for privacy and offline use.
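
When comparing several hosting styles, it can help to route every backend through one function that records latency and errors the same way. The sketch below assumes each backend is a plain callable taking a prompt and returning text; `compare_backends`, `fake_local`, and `fake_cloud` are hypothetical names used only for illustration.

```python
import time

def compare_backends(prompt, backends):
    """Run `prompt` through each backend and record output, latency, and errors."""
    rows = []
    for name, generate in backends.items():
        start = time.perf_counter()
        try:
            text, error = generate(prompt), None
        except Exception as exc:  # one failing backend should not stop the others
            text, error = None, str(exc)
        rows.append({
            "backend": name,
            "seconds": round(time.perf_counter() - start, 3),
            "output": text,
            "error": error,
        })
    return rows

# Stand-in backends (hypothetical); replace with real calls once models are loaded
def fake_local(prompt):
    return prompt + " writing tests."

def fake_cloud(prompt):
    raise RuntimeError("API key missing")

rows = compare_backends(
    "The most important habit for an engineer to develop is",
    {"local": fake_local, "cloud": fake_cloud},
)
for row in rows:
    print(row["backend"], row["seconds"], row["error"] or row["output"])
```

Keeping the error in the result row, rather than printing it and moving on, makes it easy to load the rows into a DataFrame and see at a glance which environment failed.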

In [None]:
# 1. Initialize HuggingFace Pipelines
print("Initializing HuggingFace pipelines...")
generator_hf = pipeline("text-generation", model="gpt2", device=device)
classifier_hf = pipeline("text-classification", model="cardiffnlp/twitter-roberta-base-sentiment-latest", device=device)

# 2. Text Generation Comparison
def compare_text_generation():
    prompt = "The most important habit for an engineer to develop is"
    print(f"=== Text Generation Comparison ===\nPrompt: {prompt}\n")
    
    # HuggingFace GPT-2
    print("--- HuggingFace GPT-2 ---")
    try:
        res = generator_hf(prompt, max_length=100, num_return_sequences=1)
        print(res[0]['generated_text'])
    except Exception as e: print(f"Error: {e}")
    
    # Gemini 1.5 Flash
    print("\n--- Gemini 1.5 Flash ---")
    if gemini_model:
        try:
            res = gemini_model.generate_content(prompt)
            print(res.text)
        except Exception as e: print(f"Error: {e}")
    
    # Ollama gemma2:2b
    print(f"\n--- Ollama {LOCAL_MODEL} ---")
    if ollama_client.is_running():
        try:
            res = ollama_client.generate(LOCAL_MODEL, prompt)
            print(res.get('response', 'No response'))
        except Exception as e: print(f"Error: {e}")

# 3. Sentiment Analysis Comparison
def compare_sentiment(text):
    print(f"--- Sentiment Test: '{text}' ---")
    
    # HF
    hf_res = classifier_hf(text)[0]
    print(f"HF: {hf_res['label']} ({hf_res['score']:.2f})")
    
    # Gemini
    if gemini_model:
        prompt = f'Identify sentiment of: "{text}". Reply with just POSITIVE, NEGATIVE, or NEUTRAL.'
        try:
            print(f"Gemini: {gemini_model.generate_content(prompt).text.strip()}")
        except Exception as e: print(f"Gemini error: {e}")
        
    # Ollama
    if ollama_client.is_running():
        prompt = f'Identify sentiment of: "{text}". Reply with just POSITIVE/NEGATIVE/NEUTRAL.'
        try:
            print(f"Ollama: {ollama_client.generate(LOCAL_MODEL, prompt).get('response', '').strip()}")
        except Exception as e: print(f"Ollama error: {e}")

# Run Comparisons
compare_text_generation()
print("\n" + "="*50 + "\n")
for t in ["I love working with LLMs!", "This error message is terrible.", "The sky is blue today."]:
    compare_sentiment(t)
    print("-" * 30)

##  Lab 3.2: Tokenizer Comparison & Model Performance
**Aim**: To understand how different sub-word tokenization strategies (BPE, WordPiece, SentencePiece) impact model performance and how model architecture (Total Parameters vs Layers) correlates with reasoning capabilities.

**Explanation**:
This lab involves:
1.  **Tokenizer Inspection**: Comparing how different models split text into numeric tokens.
2.  **Architecture Analysis**: Examining model configurations (hidden layers, attention heads) to understand the "brain" of the LLM.
3.  **Performance Trade-offs**: Evaluating why some models are better at logic while others are better at speed.
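
A back-of-the-envelope calculation shows how layer count and hidden size translate into parameters and memory. This is a rough sketch, assuming a GPT-style decoder with a 4x feed-forward expansion and ignoring biases, layer norms, and position embeddings; the function names are illustrative.

```python
def estimate_decoder_params(n_layers, hidden_size, vocab_size):
    """Rough parameter count for a GPT-style decoder.

    Each layer holds about 4*d^2 attention weights and 8*d^2 feed-forward
    weights (assuming a 4x expansion), so roughly 12*d^2 per layer.
    """
    per_layer = 12 * hidden_size ** 2
    embeddings = vocab_size * hidden_size
    return n_layers * per_layer + embeddings

def size_mb(n_params, bytes_per_param=4):
    """Memory footprint in MiB; 4 bytes per parameter corresponds to float32."""
    return n_params * bytes_per_param / 1024 ** 2

# GPT-2 small: 12 layers, hidden size 768, vocabulary of 50,257 tokens
params = estimate_decoder_params(12, 768, 50257)
print(f"Estimated parameters: {params:,}")
print(f"float32: {size_mb(params):.0f} MB | float16: {size_mb(params, 2):.0f} MB")
```

The estimate lands near the roughly 124M parameters usually quoted for GPT-2 small, which is a useful sanity check against the exact count the lab code prints.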

In [None]:
# --- Lab 3.2: Tokenizer Comparison & Analysis ---

def compare_tokenizers(text):
    tokenizer_models = ["gpt2", "bert-base-uncased", "t5-small"]
    results = []
    
    print(f"Analyzing text: '{text}'\n")
    for model_name in tokenizer_models:
        try:
            tokenizer = AutoTokenizer.from_pretrained(model_name)
            tokens = tokenizer.tokenize(text)
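            # encode() also adds any special tokens (e.g. [CLS]/[SEP] for BERT)
            token_ids = tokenizer.encode(text)
            results.append({
                'Model': model_name,
                'Tokens': len(tokens),
                'IDs (with special tokens)': len(token_ids),
                'Sample': tokens[:10]
            })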
            print(f"{model_name} ({len(tokens)} tokens): {tokens}")
        except Exception as e:
            print(f"Error loading {model_name}: {e}")
    
    return pd.DataFrame(results)

# Run Comparison
test_text = "Language Modelling with Transformers is revolutionary! Isn't it?"
df_tok = compare_tokenizers(test_text)
display(df_tok)

# Gemini & Ollama Tokenization Insight
def ai_token_insight(text):
    print(f"\n--- AI Insights for: '{text}' ---")
    # Gemini
    try:
        model_gemini = genai.GenerativeModel(MODEL)
        res = model_gemini.generate_content(f"How do you tokenize the word '{text}'? Answer in 1 sentence.")
        print(f"Gemini: {res.text.strip()}")
    except Exception as e: print(f"Gemini error: {e}")
    
    # Ollama
    try:
        res = ollama.generate(model=LOCAL_MODEL, prompt=f"Explain your tokenization of '{text}' in 1 sentence.")
        print(f"Ollama: {res['response'].strip()}")
    except Exception as e: print(f"Ollama error: {e}")

ai_token_insight("Transformers")


In [None]:
# Model Architecture Analysis
from transformers import AutoModel, AutoConfig
import torch

def explore_model_architecture(model_name):
    """Analyze model architecture and parameters"""
    
    print(f"=== {model_name} ===")
    
    try:
        # Load configuration
        config = AutoConfig.from_pretrained(model_name)
        
        print(f"Architecture: {config.model_type}")
        print(f"Hidden size: {config.hidden_size}")
        print(f"Layers: {config.num_hidden_layers}")
        print(f"Attention heads: {config.num_attention_heads}")
        print(f"Vocab size: {config.vocab_size}")
        
        # Load model for parameter count
        model = AutoModel.from_pretrained(model_name)
        total_params = sum(p.numel() for p in model.parameters())
        
        print(f"Parameters: {total_params:,}")
        # Assumes float32 weights, i.e. 4 bytes per parameter
        print(f"Size (approx, float32): {total_params * 4 / 1024**2:.1f} MB")
        
        # Model structure
        print("Layers:")
        for name, module in model.named_children():
            print(f"  {name}: {type(module).__name__}")
        
    except Exception as e:
        print(f"Error: {e}")
    
    print("-" * 40)

# Analyze different architectures
models = [
    "distilbert-base-uncased",    # Smaller BERT
    "gpt2",                       # GPT-2
    "t5-small",                   # T5 encoder-decoder
    "facebook/bart-base",         # BART encoder-decoder
]

for model_name in models:
    explore_model_architecture(model_name)

##  Lab 3.3: Automated Meeting Minutes Creator (Mini-Project)

**Aim**: To extract structured, actionable data from unstructured meeting transcripts using Pydantic-based output parsing and the LangChain Expression Language (LCEL).

**Explanation**:
1.  **Schema Definition**: Uses `Pydantic` to define the exact shape of the desired output (Summary, Decisions, Action Items).
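2.  **Output Parsing**: Employs `JsonOutputParser` to convert the LLM's text output into a Python dictionary shaped by the Pydantic schema. The parser does not validate the result; use `PydanticOutputParser` when a validated model instance is needed.
3.  **Zero-Shot Extraction**: Leverages the large context window of Gemini 1.5 Flash to process long transcripts in a single prompt, without chunking.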
4.  **Actionable Results**: The output is ready for direct integration into project management tools or calendar systems.
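
Smaller local models often wrap their JSON in extra prose, so a plain `json.loads` on the reply can fail. The sketch below shows one way to recover the first JSON object from such a reply using only the standard library; `extract_json` is a hypothetical helper, and the sample reply is hand-written rather than real model output.

```python
import json

def extract_json(reply):
    """Return the first JSON object found in `reply`, or None if there is none."""
    decoder = json.JSONDecoder()
    for i, ch in enumerate(reply):
        if ch != "{":
            continue
        try:
            # raw_decode parses one JSON value starting at index i
            obj, _ = decoder.raw_decode(reply, i)
        except json.JSONDecodeError:
            continue  # this brace did not start valid JSON; try the next one
        if isinstance(obj, dict):
            return obj
    return None

# Hand-written reply in the style a chatty model might produce
reply = (
    "Sure! Here are the minutes:\n"
    '{"summary": "Status sync", "decisions": ["Ship Friday"]}\n'
    "Hope that helps."
)
print(extract_json(reply))
print(extract_json("no json here"))
```

A helper like this could be applied to the text returned by the local model before comparing it with the structured cloud result.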

In [None]:
# Meeting Minutes Creator Class with Structured Output
import json
from datetime import datetime
from pydantic import BaseModel, Field
from typing import List, Optional

# Define the schema for meeting minutes
class ActionItem(BaseModel):
    owner: str = Field(description="The person responsible for the task")
    task: str = Field(description="The description of the task")
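    # default=None makes the field truly optional; without it Pydantic still requires the key
    deadline: Optional[str] = Field(default=None, description="The deadline for the task, if mentioned")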

class MeetingMinutes(BaseModel):
    summary: str = Field(description="A 2-3 sentence summary of the meeting")
    key_points: List[str] = Field(description="List of key discussion points")
    action_items: List[ActionItem] = Field(description="List of tasks assigned during the meeting")
    decisions: List[str] = Field(description="List of decisions made during the meeting")

class MeetingMinutesCreator:
    def __init__(self, gemini_model=None, ollama_client=None):
        self.gemini_model = gemini_model
        self.ollama_client = ollama_client
        # Setup LangChain JSON parser
        self.parser = JsonOutputParser(pydantic_object=MeetingMinutes)
    
    def analyze_with_gemini(self, transcript):
        """Analyze transcript using Gemini 1.5 Flash with structured output"""
        if not GEMINI_API_KEY:
            return "Gemini not configured"
        
        prompt = f"""
        Analyze this meeting transcript and create structured minutes in JSON format.
        {self.parser.get_format_instructions()}

        TRANSCRIPT: {transcript}
        """
        
        try:
            # We can use LangChain for structured output more easily
            chat_model = ChatGoogleGenerativeAI(model=MODEL, google_api_key=GEMINI_API_KEY, temperature=0)
            chain = chat_model | self.parser
            return chain.invoke(prompt)
        except Exception as e:
            # Fallback to simple text if JSON fails
            print(f"Structured Gemini error: {e}")
            try:
                resp = self.gemini_model.generate_content("Summarize this meeting: " + transcript)
                return {"summary": resp.text, "error": "Structured output failed"}
            except Exception:
                return {"error": str(e)}
    
    def analyze_with_ollama(self, transcript):
        """Analyze transcript using Ollama local model"""
        if not self.ollama_client or not self.ollama_client.is_running():
            return "Ollama not available"
        
        prompt = f"""
        Create meeting minutes from this transcript. Respond only with a JSON object.
        Template: {{"summary": "...", "key_points": ["..."], "action_items": [{{"owner": "...", "task": "..."}}], "decisions": ["..."]}}
        
        TRANSCRIPT:
        {transcript}
        """
        
        try:
            response = self.ollama_client.generate(LOCAL_MODEL, prompt)
            if response and 'response' in response:
                # Attempt to find JSON in response if it's not clean
                text = response['response']
                return text # Simple text return for local comparison
            return "No response"
        except Exception as e:
            return f"Ollama error: {e}"
    
    def create_minutes(self, transcript):
        """Create meeting minutes using both models"""
        
        print("=== Creating Meeting Minutes ===")
        print(f"Transcript length: {len(transcript)} characters")
        print("=" * 50)
        
        # Gemini analysis (Structured)
        print("\n--- Gemini 1.5 Flash (Structured) ---")
        gemini_result = self.analyze_with_gemini(transcript)
        print(json.dumps(gemini_result, indent=2) if isinstance(gemini_result, dict) else gemini_result)
        
        # Ollama analysis (Text-based)
        print(f"\n--- OLLAMA {LOCAL_MODEL} ---")
        ollama_result = self.analyze_with_ollama(transcript)
        print(ollama_result)
        
        return {
            'gemini': gemini_result,
            'ollama': ollama_result,
            'timestamp': datetime.now().isoformat()
        }

# Initialize meeting minutes creator
minutes_creator = MeetingMinutesCreator(gemini_model, ollama_client)
print("✅ Meeting Minutes Creator ready!")


## Demo: Processing Sample Meeting

Demonstrates the meeting minutes creator with a sample project status meeting transcript.

In [None]:
# Sample meeting transcript for demonstration
sample_transcript = """
Good morning everyone. I'm Sarah, project manager. We have John from development, 
Lisa from QA, and Mike from marketing.

John: We've completed 80% of the user authentication module. Login and registration 
work well, but password reset has API integration issues that will delay us 2-3 days.

Lisa: I've tested completed features - found 3 minor bugs in login that are fixed. 
Password reset is critical, so we should prioritize this.

Mike: Marketing materials are ready, but if there's a delay, we need to adjust 
our launch timeline. Can we get a firm delivery date?

John: I'm confident we can finish by next Friday if we focus on password reset. 
I'll work overtime if needed.

Action items: John prioritizes password reset fix, Lisa prepares for immediate testing, 
Mike prepares for potential one-week campaign delay. Next meeting Thursday.
"""

print("=== SAMPLE MEETING TRANSCRIPT ===")
print(sample_transcript)
print("\n" + "="*60)

# Process the transcript
results = minutes_creator.create_minutes(sample_transcript)

# Save results
output_file = "meeting_minutes_results.json"
with open(output_file, 'w') as f:
    json.dump(results, f, indent=2)

print(f"\n✅ Results saved to {output_file}")
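
Once the minutes are a dictionary, turning them into something a task tracker accepts is a small formatting step. The sketch below renders a Markdown checklist; `minutes_to_markdown` is a hypothetical helper, and the example dictionary is hand-written in the shape of the `MeetingMinutes` schema, not actual model output.

```python
def minutes_to_markdown(minutes):
    """Render a structured-minutes dict as a Markdown summary with a task checklist."""
    lines = ["## Summary", minutes.get("summary", ""), "", "## Action Items"]
    for item in minutes.get("action_items", []):
        deadline = item.get("deadline")
        suffix = f" (due {deadline})" if deadline else ""
        lines.append(f"- [ ] {item['owner']}: {item['task']}{suffix}")
    return "\n".join(lines)

# Hand-written example following the MeetingMinutes schema
example = {
    "summary": "Password reset is delayed by API integration issues.",
    "action_items": [
        {"owner": "John", "task": "Prioritize password reset fix", "deadline": "next Friday"},
        {"owner": "Lisa", "task": "Prepare for immediate testing", "deadline": None},
    ],
}
print(minutes_to_markdown(example))
```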

In [None]:
# Audio processing capability (requires audio file)
import speech_recognition as sr
import os

def process_audio_file(file_path):
    """Process audio file and create meeting minutes"""
    
    if not os.path.exists(file_path):
        print(f"Audio file not found: {file_path}")
        return None
    
    print(f"Processing: {file_path}")
    
    # Transcribe audio
    recognizer = sr.Recognizer()
    try:
        with sr.AudioFile(file_path) as source:
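            # Calibration reads (and consumes) audio from the start of the file,
            # so keep it short to avoid losing the opening words.
            recognizer.adjust_for_ambient_noise(source, duration=0.5)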
            audio = recognizer.record(source)
        
        transcript = recognizer.recognize_google(audio)
        print(f"Transcription complete: {len(transcript)} characters")
        
        # Create meeting minutes
        results = minutes_creator.create_minutes(transcript)
        return results
        
    except sr.UnknownValueError:
        print("Could not understand audio")
        return None
    except sr.RequestError as e:
        print(f"Speech recognition error: {e}")
        return None

print("Audio processing function ready!")
print("Usage: process_audio_file('path/to/your/audio.wav')")

# Example usage (uncomment when you have an audio file):
# results = process_audio_file("meeting_recording.wav")

---

##  Instructor's Evaluation & Lab Summary

###  Assessment Criteria
1. **Technical Implementation**: Adherence to the lab objectives and code functionality.
2. **Logic & Reasoning**: Clarity in the explanation of the underlying AI principles.
3. **Best Practices**: Use of secure environment variables and structured prompts.

**Lab Completion Status: Verified**
**Focus Area**: Language Modelling & Deep Learning Systems.