# Session 1: AI/GenAI Fundamentals - Interactive Demos

**Audience:** Banking Technologists  
**Duration:** 60-90 minutes (hands-on exploration)  
**Approach:** INTERACTIVE - You experiment, modify, and discover!

---

## 🎯 Learning by Doing

This notebook is designed for **active participation**:
- ✅ Run cells and see results
- ✅ Modify parameters and re-run
- ✅ Answer questions by experimenting
- ✅ Break things and fix them

**Don't just read - INTERACT!**

---

## Setup

In [None]:
# Install required packages
!pip install -q tiktoken scikit-learn numpy pandas matplotlib plotly ipywidgets

print("✅ Packages installed!")

In [None]:
import warnings
warnings.filterwarnings('ignore')

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import plotly.graph_objects as go
import plotly.express as px
from sklearn.manifold import TSNE
from sklearn.decomposition import PCA
from sklearn.metrics.pairwise import cosine_similarity
import tiktoken
import ipywidgets as widgets
from IPython.display import display, HTML, Markdown, clear_output
import json
import time
from collections import Counter

# Nice display settings
plt.style.use('seaborn-v0_8-darkgrid')
pd.set_option('display.max_colwidth', None)

print("\n✅ All imports ready!")
print("📊 Interactive visualizations enabled")
print("🎮 Let's explore AI fundamentals!\n")

---

# 🧪 Interactive Demo 1: Word Embeddings Exploration

## Understanding How AI "Sees" Words

**Key Question:** How does AI understand that "bank" and "finance" are related?

**Answer:** Embeddings - vectors that capture meaning!

In [None]:
# Simulated word embeddings (in reality these are 768+ dimensions)
# We'll use 5 dimensions for easy visualization

np.random.seed(42)  # For reproducibility

# Banking/Finance cluster
banking_words = {
    'bank': np.array([0.8, 0.7, 0.1, 0.2, 0.1]),
    'finance': np.array([0.75, 0.72, 0.15, 0.18, 0.12]),
    'loan': np.array([0.78, 0.68, 0.12, 0.25, 0.08]),
    'credit': np.array([0.82, 0.73, 0.09, 0.22, 0.14]),
    'mortgage': np.array([0.76, 0.69, 0.11, 0.28, 0.16]),
    'deposit': np.array([0.79, 0.71, 0.13, 0.19, 0.11]),
}

# Tech cluster
tech_words = {
    'computer': np.array([0.2, 0.1, 0.85, 0.78, 0.12]),
    'software': np.array([0.18, 0.12, 0.82, 0.75, 0.15]),
    'algorithm': np.array([0.15, 0.08, 0.88, 0.80, 0.10]),
    'code': np.array([0.22, 0.14, 0.80, 0.77, 0.13]),
}

# Food cluster
food_words = {
    'apple': np.array([0.1, 0.15, 0.2, 0.1, 0.82]),
    'banana': np.array([0.12, 0.18, 0.18, 0.12, 0.85]),
    'orange': np.array([0.08, 0.16, 0.22, 0.14, 0.80]),
    'grape': np.array([0.14, 0.20, 0.19, 0.11, 0.88]),
}

# Combine all words
word_embeddings = {**banking_words, **tech_words, **food_words}

print("📦 Created simulated embeddings for demonstration")
print(f"   Total words: {len(word_embeddings)}")
print(f"   Embedding dimensions: 5 (real models use 768-1536!)\n")

# Show example
print("Example - 'bank' embedding:")
print(f"  {word_embeddings['bank']}")
print("\nEach number captures different aspects of meaning!")

## 🎮 Interactive Activity 1.1: Compare Words

**Your Turn!** Pick two words and see how similar they are.

In [None]:
def calculate_similarity(word1, word2):
    """Calculate cosine similarity between two words"""
    if word1 not in word_embeddings or word2 not in word_embeddings:
        return None
    
    vec1 = word_embeddings[word1].reshape(1, -1)
    vec2 = word_embeddings[word2].reshape(1, -1)
    
    similarity = cosine_similarity(vec1, vec2)[0][0]
    return similarity

def compare_words_interactive():
    """Interactive word comparison"""
    word_list = list(word_embeddings.keys())
    
    # Create dropdown widgets
    word1_dropdown = widgets.Dropdown(
        options=word_list,
        value='bank',
        description='Word 1:',
    )
    
    word2_dropdown = widgets.Dropdown(
        options=word_list,
        value='finance',
        description='Word 2:',
    )
    
    output = widgets.Output()
    
    def on_change(change):
        with output:
            clear_output(wait=True)
            word1 = word1_dropdown.value
            word2 = word2_dropdown.value
            
            similarity = calculate_similarity(word1, word2)
            
            print(f"\n{'='*50}")
            print(f"  Comparing: '{word1}' vs '{word2}'")
            print(f"{'='*50}\n")
            
            print(f"Embedding '{word1}':")
            print(f"  {word_embeddings[word1]}\n")
            
            print(f"Embedding '{word2}':")
            print(f"  {word_embeddings[word2]}\n")
            
            print(f"Cosine Similarity: {similarity:.4f}")
            
            # Interpretation
            if similarity > 0.9:
                print("\n✅ VERY SIMILAR - These words are highly related!")
            elif similarity > 0.7:
                print("\n✅ SIMILAR - These words are related")
            elif similarity > 0.5:
                print("\n⚠️  MODERATELY SIMILAR - Some relationship")
            else:
                print("\n❌ NOT SIMILAR - These words are unrelated")
    
    word1_dropdown.observe(on_change, names='value')
    word2_dropdown.observe(on_change, names='value')
    
    # Initial display
    on_change(None)
    
    display(word1_dropdown, word2_dropdown, output)

print("🎮 Interactive Word Comparison Ready!")
print("   Try comparing:")
print("   • bank vs finance (should be similar)")
print("   • bank vs apple (should be different)")
print("   • computer vs software (should be similar)\n")

compare_words_interactive()

### 💡 Questions to Explore:

1. **Which pair has the HIGHEST similarity?** (Try different combinations!)
2. **Which pair has the LOWEST similarity?**
3. **Do banking words cluster together?**

**Write your findings here:**
- Highest similarity: _____________ (similarity: _____)
- Lowest similarity: _____________ (similarity: _____)
- Observation: _________________________________
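
The similarity score in this activity is cosine similarity: the dot product of two vectors divided by the product of their lengths. As a minimal sketch, here is the same calculation written by hand with only the standard library, using three of the simulated vectors from this notebook. The helper name `cosine` is chosen here for illustration.

```python
import math

def cosine(u, v):
    """Cosine similarity: dot(u, v) / (|u| * |v|)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

# Vectors copied from the simulated embeddings defined in this notebook
bank = [0.8, 0.7, 0.1, 0.2, 0.1]
finance = [0.75, 0.72, 0.15, 0.18, 0.12]
apple = [0.1, 0.15, 0.2, 0.1, 0.82]

print(f"bank vs finance: {cosine(bank, finance):.4f}")
print(f"bank vs apple:   {cosine(bank, apple):.4f}")
```

Because cosine similarity ignores vector length and looks only at direction, two words score high when their values are large in the same dimensions.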

## 📊 Interactive Activity 1.2: Visualize Embeddings

Let's see word embeddings in 2D space!

In [None]:
# Reduce 5D embeddings to 2D using PCA
words = list(word_embeddings.keys())
vectors = np.array([word_embeddings[w] for w in words])

pca = PCA(n_components=2)
vectors_2d = pca.fit_transform(vectors)

# Create interactive plot
fig = go.Figure()

# Add points for each cluster
categories = {
    'Banking': list(banking_words.keys()),
    'Tech': list(tech_words.keys()),
    'Food': list(food_words.keys())
}

colors = {'Banking': 'blue', 'Tech': 'red', 'Food': 'green'}

for category, word_list in categories.items():
    indices = [words.index(w) for w in word_list]
    
    fig.add_trace(go.Scatter(
        x=vectors_2d[indices, 0],
        y=vectors_2d[indices, 1],
        mode='markers+text',
        name=category,
        text=word_list,
        textposition='top center',
        marker=dict(size=15, color=colors[category]),
        textfont=dict(size=12)
    ))

fig.update_layout(
    title="Word Embeddings Visualization (2D Projection)",
    xaxis_title="Dimension 1",
    yaxis_title="Dimension 2",
    height=600,
    showlegend=True
)

fig.show()

print("\n🔍 Observe the clustering:")
print("   • Banking words (blue) cluster together")
print("   • Tech words (red) form another cluster")
print("   • Food words (green) are separate\n")
print("💡 This is how AI 'understands' semantic relationships!")

## 🧮 Interactive Activity 1.3: Word Arithmetic

**The Famous Example:** king - man + woman = queen

Let's try it with banking terms!
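
To make the idea concrete, here is a small sketch of how an analogy "a is to b as c is to ?" can be computed: take `b - a + c` and pick the nearest remaining word by cosine similarity. The 3-D vectors below are hypothetical and hand-built so that the banking analogy works exactly; real embeddings only give approximate answers. The names `toy`, `cosine`, and `analogy` are assumptions for illustration.

```python
import math

# Hypothetical toy vectors, hand-built for this illustration.
# Assumed axes: [inflow(+)/outflow(-), ledger entry, cash movement]
toy = {
    "credit":     [1.0, 1.0, 0.0],
    "debit":      [-1.0, 1.0, 0.0],
    "deposit":    [1.0, 0.0, 1.0],
    "withdrawal": [-1.0, 0.0, 1.0],
    "loan":       [0.2, 0.5, 0.5],
}

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.hypot(*u) * math.hypot(*v))

def analogy(a, b, c, vocab):
    """Solve 'a is to b as c is to ?' via b - a + c, then nearest neighbour."""
    target = [vb - va + vc for va, vb, vc in zip(vocab[a], vocab[b], vocab[c])]
    # Exclude the three input words, as analogy evaluations usually do
    candidates = {w: v for w, v in vocab.items() if w not in (a, b, c)}
    return max(candidates, key=lambda w: cosine(target, candidates[w]))

print(analogy("credit", "debit", "deposit", toy))
```

With these toy vectors the result is `withdrawal`, because `debit - credit` isolates the "direction of money flow" axis and adding it to `deposit` flips that axis.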

In [None]:
# Hard-coded analogies for illustration (nothing is computed in this cell)
# The gender-bias examples follow biases documented by Bolukbasi et al. (2016)

analogies = {
    # Banking analogies
    ('credit', 'debit'): [
        ('deposit', 'withdrawal'),
        ('loan', 'repayment'),
        ('income', 'expense'),
    ],
    
    # Gender bias examples (REAL Word2Vec)
    ('programmer', 'man'): [
        ('woman', 'homemaker'),  # Documented bias!
        ('woman', 'nurse'),
        ('woman', 'receptionist'),
    ],
    
    ('doctor', 'man'): [
        ('woman', 'nurse'),  # Documented bias!
        ('woman', 'midwife'),
    ],
}

def word_arithmetic_demo():
    """Interactive word arithmetic explorer"""
    
    print("═" * 60)
    print("  WORD ARITHMETIC: Exploring Relationships & Biases")
    print("═" * 60)
    print()
    print("📐 If embeddings capture meaning, then:")
    print("     debit - credit + deposit = ?")
    print()
    
    # Example 1: Banking (works well)
    print("\n✅ EXAMPLE 1: Banking Relationship")
    print("-" * 60)
    print("  Given: credit is to debit")
    print("         as deposit is to _______?")
    print()
    print("  Answer: withdrawal")
    print("  Logic: credit/debit are opposites, deposit/withdrawal are opposites")
    print("  ✓ Makes sense!")
    
    # Example 2: Gender bias (problematic)
    print("\n\n⚠️  EXAMPLE 2: Gender Bias (REAL Word2Vec)")
    print("-" * 60)
    print("  Given: programmer is to man")
    print("         as _______ is to woman?")
    print()
    print("  Word2Vec says: homemaker, nurse, receptionist")
    print()
    print("  🚨 PROBLEM: This reflects gender bias in training data!")
    print("     Real programmers are all genders, but word2vec")
    print("     learned stereotypes from its training text (news articles).")
    print()
    print("  📚 Source: Bolukbasi et al. (2016)")
    print("     'Man is to Computer Programmer as Woman is to Homemaker?")
    print("      Debiasing Word Embeddings'")
    
    # Example 3: More bias
    print("\n\n⚠️  EXAMPLE 3: More Gender Bias")
    print("-" * 60)
    print("  Given: doctor is to man")
    print("         as _______ is to woman?")
    print()
    print("  Word2Vec says: nurse, midwife")
    print()
    print("  🚨 PROBLEM: Reinforces stereotypes about medical professions")
    print()
    
    print("\n" + "═" * 60)
    print("  KEY LESSON: AI learns from data - including biases!")
    print("═" * 60)
    print()
    print("Banking Implication:")
    print("  • Credit scoring models using embeddings may inherit biases")
    print("  • Fraud detection may profile based on biased patterns")
    print("  • Resume screening AI may favor certain demographics")
    print()
    print("⚖️  Always audit AI systems for fairness!")

word_arithmetic_demo()

### 💭 Reflection Questions:

1. **How could gender bias in embeddings affect banking?**
   - Think about: loan approvals, fraud detection, customer service routing
   
2. **What other biases might exist in word embeddings?**
   - Race? Age? Geography?
   
3. **How would you test for bias in a banking AI system?**

**Write your thoughts:**
- _________________________________________________
- _________________________________________________

---

# 🧪 Interactive Demo 2: Tokenization Explorer

## How LLMs Break Down Text

Before processing, LLMs convert text into **tokens**. Let's explore!
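
To build intuition for subword tokens before using a real tokenizer, here is a toy sketch that splits text by greedy longest match against a tiny vocabulary. This is not how production tokenizers work in detail (they typically learn a byte-pair-encoding vocabulary from data); the function `greedy_tokenize` and the mini-vocabulary are made up for this example.

```python
def greedy_tokenize(text, vocab):
    """Split text into the longest vocabulary pieces, left to right.

    Toy illustration only: real LLM tokenizers learn their vocabulary
    from data and usually operate on bytes.
    """
    tokens = []
    i = 0
    while i < len(text):
        # Try the longest possible piece first
        for j in range(len(text), i, -1):
            if text[i:j] in vocab:
                tokens.append(text[i:j])
                i = j
                break
        else:
            # Unknown character: emit it as its own token
            tokens.append(text[i])
            i += 1
    return tokens

# Hypothetical mini-vocabulary (made up for this example)
vocab = {"bank", "ing", "s", " ", "deposit", "ed"}

print(greedy_tokenize("banking", vocab))
print(greedy_tokenize("banks deposited", vocab))
```

The point to notice is that related words share pieces: "banking" and "banks" both start with the token "bank", which is one reason subword vocabularies handle unseen words gracefully.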

In [None]:
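# Load tokenizer
# cl100k_base is the OpenAI encoding used by GPT-4 and GPT-3.5-turbo.
# Claude uses a different tokenizer, so these counts only approximate Claude's.
tokenizer = tiktoken.get_encoding("cl100k_base")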

def tokenize_interactive():
    """Interactive tokenization explorer"""
    
    text_input = widgets.Textarea(
        value='The customer deposited $5000 into their savings account.',
        placeholder='Enter any text...',
        description='Your Text:',
        layout=widgets.Layout(width='80%', height='100px')
    )
    
    button = widgets.Button(
        description='Tokenize!',
        button_style='success',
        icon='check'
    )
    
    output = widgets.Output()
    
    def on_button_click(b):
        with output:
            clear_output(wait=True)
            text = text_input.value
            
            # Tokenize
            tokens = tokenizer.encode(text)
            token_strings = [tokenizer.decode([t]) for t in tokens]
            
            print("\n" + "═" * 70)
            print("  TOKENIZATION RESULT")
            print("═" * 70 + "\n")
            
            print(f"Original Text: \"{text}\"\n")
            print(f"Total Tokens: {len(tokens)}")
            print(f"Total Characters: {len(text)}")
            # max(..., 1) avoids a division by zero when the text box is empty
            print(f"Tokens per Character: {len(tokens) / max(len(text), 1):.2f}\n")
            
            print("Token Breakdown:")
            print("-" * 70)
            for i, (token_id, token_str) in enumerate(zip(tokens, token_strings), 1):
                # Show token with visual separator
                display_str = token_str.replace(' ', '␣')  # Show spaces
                display_str = display_str.replace('\n', '↵')  # Show newlines
                print(f"  Token {i:2d}: [{token_id:5d}] = '{display_str}'")
            
            # Cost calculation
            print("\n" + "═" * 70)
            print("  COST ESTIMATION (Claude Sonnet 4.5)")
            print("═" * 70 + "\n")
            
            input_cost_per_1m = 3.00
            output_cost_per_1m = 15.00
            
            input_cost = (len(tokens) / 1_000_000) * input_cost_per_1m
            output_cost = (len(tokens) / 1_000_000) * output_cost_per_1m
            
            print(f"If this was INPUT (prompt):")
            print(f"  Cost: ${input_cost:.6f} ({len(tokens)} tokens × $3/1M)\n")
            
            print(f"If this was OUTPUT (response):")
            print(f"  Cost: ${output_cost:.6f} ({len(tokens)} tokens × $15/1M)\n")
            
            # Banking scale
            print("At Banking Scale (1000 similar texts/day):")
            print(f"  Monthly INPUT cost: ${input_cost * 1000 * 30:.2f}")
            print(f"  Monthly OUTPUT cost: ${output_cost * 1000 * 30:.2f}")
    
    button.on_click(on_button_click)
    
    display(text_input, button, output)
    
    # Initial tokenization
    on_button_click(None)

print("🎮 Interactive Tokenizer Ready!\n")
print("Try these examples:")
print("  • Simple: 'Hello world'")
print("  • Banking: 'The customer deposited $5000'")
print("  • Code: 'function calculateInterest(principal, rate) {'")
print("  • Numbers: '1234567890'")
print("  • Special: 'résumé café naïve'\n")

tokenize_interactive()

### 🔬 Experiment Time!

**Try tokenizing these and observe patterns:**

1. **"bank" vs "banking" vs "banks"** - How many tokens each?
2. **"$1000" vs "$1,000" vs "one thousand dollars"** - Which uses fewer tokens?
3. **"AI" vs "artificial intelligence"** - Token efficiency?
4. **Your own banking text** - How many tokens?

**Observations:**
- _________________________________________________
- _________________________________________________
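
Token counts matter mainly because they drive cost. As a rough sketch of the arithmetic, cost is total tokens divided by one million, times the price per million tokens. The workload figures below (500-token prompts, 200-token replies, 1,000 requests per day) are assumptions for illustration; the prices are the $3 and $15 per million used elsewhere in this notebook.

```python
def estimate_monthly_cost(tokens_per_request, requests_per_day,
                          price_per_million, days=30):
    """Cost in dollars = total tokens / 1,000,000 * price per million tokens."""
    total_tokens = tokens_per_request * requests_per_day * days
    return total_tokens / 1_000_000 * price_per_million

# Assumed workload, priced at this notebook's example rates
input_cost = estimate_monthly_cost(500, 1000, 3.00)    # prompts
output_cost = estimate_monthly_cost(200, 1000, 15.00)  # responses

print(f"Input:  ${input_cost:.2f}/month")
print(f"Output: ${output_cost:.2f}/month")
```

Under these assumptions the shorter responses still cost more than the longer prompts, because output tokens are priced five times higher.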

---

# 🧪 Interactive Demo 3: Semantic Search Simulator

## Find Documents by Meaning (Not Keywords!)

This is the retrieval step behind RAG (Retrieval-Augmented Generation) - coming in Session 3!

In [None]:
# Simulated banking knowledge base
documents = [
    {"id": 1, "title": "How to Deposit a Check", 
     "content": "Mobile check deposit allows customers to photograph checks using the mobile app. Simply open the app, select deposit, take photos of front and back.",
     "embedding": np.array([0.8, 0.7, 0.3, 0.2, 0.4])},
    
    {"id": 2, "title": "Opening a Savings Account",
     "content": "To open a new savings account, you'll need government ID, proof of address, and an initial deposit of $25 minimum.",
     "embedding": np.array([0.6, 0.5, 0.4, 0.3, 0.5])},
    
    {"id": 3, "title": "Credit Card Rewards Program",
     "content": "Earn 2% cash back on all purchases with our premium credit card. No annual fee for the first year. Redeem points for cash or travel.",
     "embedding": np.array([0.3, 0.4, 0.2, 0.8, 0.7])},
    
    {"id": 4, "title": "Wire Transfer Instructions",
     "content": "To send a domestic wire transfer, you'll need the recipient's account number, routing number, and bank name. Fees apply.",
     "embedding": np.array([0.7, 0.6, 0.5, 0.3, 0.3])},
    
    {"id": 5, "title": "Mobile Check Capture Tutorial",
     "content": "Step-by-step guide to depositing checks remotely using your smartphone camera. Works for personal and business checks up to $5000.",
     "embedding": np.array([0.82, 0.72, 0.28, 0.22, 0.38])},
    
    {"id": 6, "title": "Fraud Protection Services",
     "content": "Our 24/7 fraud monitoring watches for suspicious activity. Get instant alerts via text or email. Zero liability protection included.",
     "embedding": np.array([0.2, 0.3, 0.6, 0.7, 0.8])},
]

# Query embeddings (what user searches for)
query_embeddings = {
    "How do I deposit a check?": np.array([0.78, 0.68, 0.32, 0.24, 0.42]),
    "Opening new account": np.array([0.62, 0.52, 0.42, 0.32, 0.48]),
    "Credit card benefits": np.array([0.32, 0.42, 0.22, 0.78, 0.68]),
    "Send money to someone": np.array([0.72, 0.62, 0.48, 0.32, 0.28]),
}

def semantic_search_demo():
    """Interactive semantic search"""
    
    query_dropdown = widgets.Dropdown(
        options=list(query_embeddings.keys()),
        value=list(query_embeddings.keys())[0],
        description='Your Query:',
        style={'description_width': 'initial'},
        layout=widgets.Layout(width='50%')
    )
    
    output = widgets.Output()
    
    def on_change(change):
        with output:
            clear_output(wait=True)
            query = query_dropdown.value
            query_vec = query_embeddings[query].reshape(1, -1)
            
            print("\n" + "═" * 70)
            print("  SEMANTIC SEARCH RESULTS")
            print("═" * 70 + "\n")
            print(f"🔍 Query: \"{query}\"\n")
            
            # Calculate similarities
            results = []
            for doc in documents:
                doc_vec = doc['embedding'].reshape(1, -1)
                similarity = cosine_similarity(query_vec, doc_vec)[0][0]
                results.append((doc, similarity))
            
            # Sort by similarity
            results.sort(key=lambda x: x[1], reverse=True)
            
            print("Top 3 Results (by semantic similarity):\n")
            print("-" * 70)
            
            for rank, (doc, similarity) in enumerate(results[:3], 1):
                print(f"\n{rank}. {doc['title']}")
                print(f"   Similarity: {similarity:.4f} {'🔥' if similarity > 0.9 else '✓' if similarity > 0.7 else ''}")
                print(f"   Content: {doc['content'][:100]}...")
            
            # Show what keyword search would miss
            print("\n\n" + "═" * 70)
            print("  KEYWORD SEARCH COMPARISON")
            print("═" * 70 + "\n")
            
            # Simple keyword matching (strip punctuation so "check?" matches "check")
            def to_words(s):
                return {w.strip('.,?!:;"') for w in s.lower().split()}

            query_words = to_words(query)
            keyword_matches = []
            for doc in documents:
                all_words = to_words(doc['title']) | to_words(doc['content'])
                
                matches = query_words.intersection(all_words)
                if matches:
                    keyword_matches.append((doc, len(matches)))
            
            if keyword_matches:
                keyword_matches.sort(key=lambda x: x[1], reverse=True)
                print("Keyword search found:")
                for doc, match_count in keyword_matches[:3]:
                    print(f"  • {doc['title']} ({match_count} matching words)")
            else:
                print("❌ Keyword search found NOTHING!")
                print("   (No exact word matches)\n")
                print("But semantic search found relevant docs because it")
                print("understands MEANING, not just keywords! 🎯")
    
    query_dropdown.observe(on_change, names='value')
    on_change(None)  # Initial display
    
    display(query_dropdown, output)

print("🎮 Semantic Search Ready!\n")
print("Try different queries and see how semantic search")
print("finds relevant documents even without keyword matches!\n")

semantic_search_demo()

### 💡 Observations:

1. **Try "How do I deposit a check?"**
   - Does it find "Mobile Check Capture Tutorial"?
   - Even though the title doesn't contain "deposit"!

2. **Try "Send money to someone"**
   - Should find "Wire Transfer Instructions"
   - Semantic understanding: "send money" = "wire transfer"

**Why This Matters for Banking:**
- Customers don't use technical terms ("wire transfer")
- They use natural language ("send money")
- Semantic search bridges this gap!
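
Stripped of the widgets, the ranking step is short: score every document against the query with cosine similarity, sort, and keep the top k. The sketch below uses three of the simulated document vectors and the simulated query vector for "Send money to someone" from this notebook. The helper names `cosine` and `top_k` are chosen here for illustration; a real system would get its vectors from an embedding model.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.hypot(*u) * math.hypot(*v))

def top_k(query_vec, docs, k=2):
    """Return the k (title, score) pairs most similar to the query."""
    scored = [(title, cosine(query_vec, vec)) for title, vec in docs.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:k]

# Titles and vectors copied from the simulated knowledge base in this demo
docs = {
    "Wire Transfer Instructions":  [0.7, 0.6, 0.5, 0.3, 0.3],
    "Credit Card Rewards Program": [0.3, 0.4, 0.2, 0.8, 0.7],
    "Fraud Protection Services":   [0.2, 0.3, 0.6, 0.7, 0.8],
}
query = [0.72, 0.62, 0.48, 0.32, 0.28]  # simulated "Send money to someone"

for title, score in top_k(query, docs):
    print(f"{score:.4f}  {title}")
```

The wire transfer document ranks first even though the query never mentions "wire" or "transfer", because the ranking depends only on how close the vectors are.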

**Your observations:**
- _________________________________________________

---

# 🧪 Interactive Demo 4: Prompt Engineering Playground

## See How Temperature Affects Output

Let's simulate different temperature settings!
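
One way to put a number on "randomness" is entropy. The sketch below rescales a next-word distribution by temperature T (each probability raised to the power 1/T, then renormalised) and reports the Shannon entropy in bits. The probabilities are the simulated ones used in this section's demo; the function names are assumptions for illustration, and the formula requires T > 0.

```python
import math

def apply_temperature(probs, temperature):
    """Rescale a distribution: p_i^(1/T), then renormalise (requires T > 0)."""
    scaled = {w: math.exp(math.log(p) / temperature) for w, p in probs.items()}
    total = sum(scaled.values())
    return {w: s / total for w, s in scaled.items()}

def entropy_bits(probs):
    """Shannon entropy in bits: higher means a more random choice."""
    return -sum(p * math.log2(p) for p in probs.values() if p > 0)

# Simulated next-word probabilities, as in this section's demo
base = {"approved": 0.45, "accepted": 0.30, "granted": 0.15,
        "confirmed": 0.08, "finalized": 0.02}

for t in (0.3, 1.0, 2.0):
    adjusted = apply_temperature(base, t)
    print(f"T={t}: top={adjusted['approved']:.1%}, "
          f"entropy={entropy_bits(adjusted):.2f} bits")
```

At T = 1.0 the distribution is unchanged; lower values concentrate probability on "approved" and lower the entropy, while higher values flatten the distribution and raise it.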

In [None]:
def simulate_temperature_effect():
    """Simulate how temperature affects next-word prediction"""
    
    # Simulated next-word probabilities
    base_probs = {
        'approved': 0.45,
        'accepted': 0.30,
        'granted': 0.15,
        'confirmed': 0.08,
        'finalized': 0.02
    }
    
    temp_slider = widgets.FloatSlider(
        value=0.7,
        min=0.0,
        max=2.0,
        step=0.1,
        description='Temperature:',
        continuous_update=False,
        readout=True,
        readout_format='.1f',
    )
    
    output = widgets.Output()
    
    def on_change(change):
        with output:
            clear_output(wait=True)
            temp = temp_slider.value
            
            print("\n" + "═" * 70)
            print("  TEMPERATURE EFFECT SIMULATOR")
            print("═" * 70 + "\n")
            print(f"Prompt: \"Your loan application has been _______\"\n")
            print(f"Temperature: {temp}\n")
            
            # Apply temperature (simplified simulation)
            if temp == 0.0:
                # Deterministic - always pick highest
                adjusted_probs = {k: (1.0 if k == 'approved' else 0.0) for k in base_probs}
            else:
                # Adjust probabilities based on temperature
                import math
                logits = {k: math.log(v) / temp for k, v in base_probs.items()}
                exp_logits = {k: math.exp(v) for k, v in logits.items()}
                sum_exp = sum(exp_logits.values())
                adjusted_probs = {k: v/sum_exp for k, v in exp_logits.items()}
            
            # Display probabilities
            print("Next Word Probabilities:\n")
            sorted_probs = sorted(adjusted_probs.items(), key=lambda x: x[1], reverse=True)
            
            for word, prob in sorted_probs:
                bar_length = int(prob * 50)
                bar = '█' * bar_length
                print(f"  {word:12s} │{bar:<50s}│ {prob:.1%}")
            
            # Interpretation
            print("\n" + "-" * 70)
            if temp == 0.0:
                print("\n🎯 DETERMINISTIC (temp=0.0):")
                print("   • Always picks 'approved' (highest probability)")
                print("   • Same input = same output every time")
                print("   • Use for: compliance docs, calculations, factual answers")
            elif temp < 0.5:
                print("\n📊 LOW TEMPERATURE (temp<0.5):")
                print("   • Strongly favors top choices")
                print("   • High consistency, low creativity")
                print("   • Use for: policy Q&A, structured outputs")
            elif temp < 1.0:
                print("\n⚖️  MODERATE TEMPERATURE (0.5-1.0):")
                print("   • Balanced randomness")
                print("   • Some variety while staying coherent")
                print("   • Use for: customer emails, content generation")
            else:
                print("\n🎲 HIGH TEMPERATURE (>=1.0):")
                print("   • High randomness")
                print("   • Creative but potentially incoherent")
                print("   • Use for: brainstorming, creative writing")
                print("   • AVOID for banking (too unpredictable)")
    
    temp_slider.observe(on_change, names='value')
    on_change(None)
    
    display(temp_slider, output)

print("🎮 Temperature Simulator Ready!\n")
print("🔬 Experiment with different values:")
print("   • 0.0 - Completely deterministic")
print("   • 0.3 - Banking standard (consistency)")
print("   • 0.7 - Balanced (default)")
print("   • 1.5 - Very creative (risky!)\n")

simulate_temperature_effect()

### 🎯 Banking Recommendations:

| Use Case | Recommended Temperature | Why? |
|----------|------------------------|------|
| Compliance documents | 0.0 - 0.1 | Need exact, consistent output |
| Loan calculations | 0.0 | Deterministic results required |
| Policy Q&A | 0.1 - 0.3 | Factual, minimal variation |
| Customer emails | 0.5 - 0.7 | Professional with some personality |
| Marketing content | 0.7 - 0.9 | Creative but coherent |
| Brainstorming | 1.0 - 1.2 | Maximum creativity |

**Never use >1.5 for banking!** (Too unpredictable)

---

# 🧪 Interactive Demo 5: Few-Shot Learning Explorer

## Teaching AI Through Examples

In [None]:
def few_shot_demo():
    """Compare zero-shot vs few-shot learning"""
    
    num_examples = widgets.IntSlider(
        value=0,
        min=0,
        max=3,
        step=1,
        description='Examples:',
        continuous_update=False,
    )
    
    output = widgets.Output()
    
    examples_data = [
        {
            "transaction": "$45 at GroceryStore, 2pm, home city",
            "classification": "LEGITIMATE",
            "reason": "Normal amount, normal time, normal location"
        },
        {
            "transaction": "$3000 at OnlineGambling, 4am, foreign country",
            "classification": "FRAUDULENT",
            "reason": "High amount, unusual time, customer never traveled"
        },
        {
            "transaction": "$20 at CoffeeShop, 8am, near workplace",
            "classification": "LEGITIMATE",
            "reason": "Small amount, normal time, known location"
        },
    ]
    
    test_case = {
        "transaction": "$1200 at ElectronicsStore, 3am, overseas IP",
        "expected": "FRAUDULENT",
        "reason": "High amount + unusual time + suspicious location"
    }
    
    def on_change(change):
        with output:
            clear_output(wait=True)
            n = num_examples.value
            
            print("\n" + "═" * 70)
            print(f"  {'ZERO-SHOT' if n == 0 else f'{n}-SHOT'} LEARNING DEMONSTRATION")
            print("═" * 70 + "\n")
            
            print("Task: Classify transaction as LEGITIMATE or FRAUDULENT\n")
            print("-" * 70)
            
            if n == 0:
                print("\n📝 ZERO-SHOT: No examples provided\n")
                print("Prompt to AI:")
                print(f"  \"Classify: {test_case['transaction']}\"\n")
                print("Expected behavior:")
                print("  ⚠️  AI might guess or use general knowledge")
                print("  ⚠️  May not align with YOUR fraud criteria")
                print("  ⚠️  Inconsistent results\n")
                
            else:
                print(f"\n📝 {n}-SHOT: Providing {n} example(s)\n")
                print("Examples given to AI:\n")
                
                for i, ex in enumerate(examples_data[:n], 1):
                    print(f"  Example {i}:")
                    print(f"    Transaction: {ex['transaction']}")
                    print(f"    Classification: {ex['classification']}")
                    print(f"    Reason: {ex['reason']}\n")
                
                print("-" * 70)
                print("\nNow classify:")
                print(f"  Transaction: {test_case['transaction']}\n")
                
                print("Expected behavior:")
                print("  ✅ AI learns pattern from examples")
                print("  ✅ Applies YOUR fraud criteria")
                print("  ✅ More consistent results")
                print(f"  ✅ Likely classifies as: {test_case['expected']}\n")
            
            # Show improvement
            print("\n" + "═" * 70)
            print("  ACCURACY IMPROVEMENT (simulated figures)")
            print("═" * 70 + "\n")
            
            accuracies = [65, 78, 88, 94]  # Simulated, not measured
            
            print(f"Zero-shot accuracy: {accuracies[0]}%")
            if n > 0:
                print(f"{n}-shot accuracy: {accuracies[n]}%")
                improvement = accuracies[n] - accuracies[0]
                print(f"\n✨ Improvement: +{improvement} percentage points!")
    
    num_examples.observe(on_change, names='value')
    on_change(None)
    
    display(num_examples, output)

print("🎮 Few-Shot Learning Explorer Ready!\n")
print("🔬 Experiment:")
print("   • 0 examples (zero-shot) - How does AI perform?")
print("   • 1 example - Does it help?")
print("   • 2-3 examples - Even better?\n")

few_shot_demo()

### 💡 Key Lessons:

1. **Few-shot learning is powerful** - Just 2-3 examples can noticeably improve accuracy (the percentages in this demo are simulated)
2. **Examples teach YOUR standards** - Not generic AI knowledge
3. **Banking application** - Use few-shot for:
   - Fraud classification
   - Document categorization
   - Tone/style matching
   - Custom business logic
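
In practice, "providing examples" means placing them in the prompt text ahead of the new case. The sketch below assembles such a prompt from two of the example transactions used in this demo. The function `build_few_shot_prompt` and the exact prompt wording are assumptions for illustration, not a prescribed format.

```python
def build_few_shot_prompt(examples, transaction):
    """Assemble a classification prompt: instruction, worked examples, new case."""
    lines = ["Classify each transaction as LEGITIMATE or FRAUDULENT.", ""]
    for ex in examples:
        lines.append(f"Transaction: {ex['transaction']}")
        lines.append(f"Classification: {ex['classification']}")
        lines.append("")
    # The prompt ends on an open label so the model completes it
    lines.append(f"Transaction: {transaction}")
    lines.append("Classification:")
    return "\n".join(lines)

# Examples taken from this demo's transactions
examples = [
    {"transaction": "$45 at GroceryStore, 2pm, home city",
     "classification": "LEGITIMATE"},
    {"transaction": "$3000 at OnlineGambling, 4am, foreign country",
     "classification": "FRAUDULENT"},
]

prompt = build_few_shot_prompt(examples, "$1200 at ElectronicsStore, 3am, overseas IP")
print(prompt)
```

Passing an empty list of examples produces the zero-shot version of the same prompt, which makes it easy to compare the two approaches on identical test cases.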

**Your observations:**
- _________________________________________________

---

# 📊 Summary Dashboard

## What You Experienced Today

In [None]:
print("\n" + "═" * 70)
print("  SESSION 1 SUMMARY: What You Learned Through Interaction")
print("═" * 70 + "\n")

summary = {
    "1. Embeddings": [
        "✅ Words are converted to vectors (dense arrays of numbers)",
        "✅ Similar meanings → similar vectors (clustering)",
        "✅ Can do arithmetic: king - man + woman = queen",
        "⚠️  Embeddings can contain biases from training data",
        "🏦 Banking use: Semantic search, fraud detection, clustering"
    ],
    "2. Tokenization": [
        "✅ LLMs break text into tokens (subwords)",
        "✅ ~0.75 words per token on average",
        "✅ Tokens = billing unit ($3-$15 per million)",
        "💡 Shorter text = lower cost",
        "🏦 Banking use: Cost optimization, context management"
    ],
    "3. Semantic Search": [
        "✅ Find by meaning, not just keywords",
        "✅ Uses cosine similarity between embeddings",
        "✅ Works even without exact word matches",
        "💡 Much better than keyword search for natural language",
        "🏦 Banking use: Knowledge bases, policy docs, FAQs"
    ],
    "4. Temperature": [
        "✅ Controls randomness (0 = deterministic, 2 = chaotic)",
        "✅ Banking standard: 0.0-0.3 (consistency matters!)",
        "✅ Higher temp = more creative, less predictable",
        "⚠️  Never use >1.5 for banking (too unpredictable)",
        "🏦 Banking use: 0.0 for compliance, 0.7 for customer emails"
    ],
    "5. Few-Shot Learning": [
        "✅ Examples teach AI your specific standards",
        "✅ 2-3 examples often improve accuracy (demo figures: +23 to +29 points, simulated)",
        "✅ Zero-shot = generic, Few-shot = customized",
        "💡 Always provide examples for business-critical tasks",
        "🏦 Banking use: Fraud classification, categorization, QA"
    ]
}

for topic, points in summary.items():
    print(f"\n{topic}")
    print("-" * 70)
    for point in points:
        print(f"  {point}")

print("\n\n" + "═" * 70)
print("  NEXT: Session 2 - GenAI vs Agentic AI")
print("═" * 70)
print("\nYou'll learn:")
print("  • Reactive (GenAI) vs Proactive (Agentic) systems")
print("  • Autonomy levels (0-4)")
print("  • Agent capabilities: Planning, Tools, Memory, Reflection")
print("  • When to use GenAI vs when to use Agentic AI")
print("\n🎉 Great job on Session 1! See you in Session 2!\n")

---

# 🎯 Self-Assessment Quiz

Test your understanding! (Answers at the bottom)

In [None]:
quiz = [
    {
        "question": "What is an embedding?",
        "options": [
            "A) A type of database",
            "B) A dense vector representing word meaning",
            "C) A compression algorithm",
            "D) A programming language"
        ],
        "answer": "B",
        "explanation": "Embeddings are dense vectors (arrays of numbers) that capture semantic meaning of words/sentences."
    },
    {
        "question": "For banking compliance documents, what temperature should you use?",
        "options": [
            "A) 0.0 - 0.3 (deterministic)",
            "B) 0.7 - 1.0 (balanced)",
            "C) 1.5 - 2.0 (very creative)",
            "D) It doesn't matter"
        ],
        "answer": "A",
        "explanation": "Compliance documents need consistency and accuracy, so use low temperature (0.0-0.3)."
    },
    {
        "question": "What's the main advantage of semantic search over keyword search?",
        "options": [
            "A) It's faster",
            "B) It finds documents by meaning, not just exact words",
            "C) It's cheaper",
            "D) It requires less storage"
        ],
        "answer": "B",
        "explanation": "Semantic search uses embeddings to find conceptually similar content, even without keyword matches."
    },
    {
        "question": "Approximately how many words is 1000 tokens?",
        "options": [
            "A) 500 words",
            "B) 750 words",
            "C) 1000 words",
            "D) 1500 words"
        ],
        "answer": "B",
        "explanation": "Rule of thumb: 1 token ≈ 0.75 words, so 1000 tokens ≈ 750 words."
    },
    {
        "question": "What problem does few-shot learning solve?",
        "options": [
            "A) High API costs",
            "B) Slow response times",
            "C) Teaching AI your specific standards/patterns",
            "D) Data privacy concerns"
        ],
        "answer": "C",
        "explanation": "Few-shot learning provides examples that teach the AI YOUR specific criteria, improving accuracy."
    }
]

print("\n" + "═" * 70)
print("  SELF-ASSESSMENT QUIZ")
print("═" * 70 + "\n")

for i, q in enumerate(quiz, 1):
    print(f"Question {i}: {q['question']}\n")
    for opt in q['options']:
        print(f"  {opt}")
    print()

print("\n" + "-" * 70)
print("  ANSWERS (scroll down after you've tried!)")
print("-" * 70 + "\n")
print("\n" * 10)  # Space

for i, q in enumerate(quiz, 1):
    print(f"Q{i}: {q['answer']}")
    print(f"    {q['explanation']}\n")

print("\n🎉 How did you do? Reviewing anything you missed is encouraged!")