# Day 1, Session 1 - Lab: Hands-on with HuggingFace Pipelines

## Building Your First Invoice Processing Components

Welcome to your first hands-on lab! In this exercise, you'll implement the fundamental components that we'll use to build our invoice processing agent. By the end of this lab, you'll have practical experience with HuggingFace pipelines and understand how they can be applied to document processing tasks.

### Lab Objectives

By completing this lab, you will:
1. Set up Google Colab with GPU acceleration
2. Install and configure the transformers library
3. Implement three different pipelines for invoice processing
4. Monitor and optimize memory usage
5. Compare CPU vs GPU performance

### Success Criteria

You've successfully completed this lab when you can:
- ✅ Run all three pipeline tasks without errors
- ✅ Achieve under 1 second per inference call on GPU
- ✅ Extract correct information from sample invoice text
- ✅ Demonstrate GPU memory management

### Time Estimate: 55 minutes

---

## Part 1: Environment Setup (10 minutes)

First, we need to set up our environment properly. You'll repeat this step at the start of every project, so it is worth getting right.

### Task 1.1: Verify GPU Access

**Your Task**: Check if GPU is available and enabled in your Colab environment.

**Hints**:
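- Run `nvidia-smi` to display GPU information (`!nvidia-smi` in a notebook cell)
- Call `torch.cuda.is_available()` to verify that PyTorch can reach CUDA
- If no GPU is found, go to Runtime → Change runtime type, select a GPU hardware accelerator, and rerun the cell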

In [None]:
# TODO: Import necessary modules (subprocess, torch)
import subprocess
import torch

# TODO: Check if GPU is available using nvidia-smi
# Hint: Use subprocess.check_output() or !nvidia-smi

# Your code here:


# TODO: Verify CUDA availability with PyTorch
# Expected output: True if GPU is available

# Your code here:
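
One possible approach to Task 1.1, offered as a minimal sketch rather than the reference solution. The helper name `check_gpu` and the keys of the returned dict are assumptions made for illustration. `torch` is imported lazily so the check still runs on a machine where PyTorch is not installed yet.

```python
import shutil
import subprocess


def check_gpu():
    """Report GPU visibility from nvidia-smi and from PyTorch (hypothetical helper)."""
    status = {"nvidia_smi": False, "cuda": False, "device_name": None}

    # nvidia-smi is only on PATH when an NVIDIA driver is installed
    if shutil.which("nvidia-smi"):
        try:
            subprocess.check_output(["nvidia-smi"], stderr=subprocess.STDOUT, timeout=30)
            status["nvidia_smi"] = True
        except (subprocess.SubprocessError, OSError):
            pass  # the command exists but could not query a GPU

    # Lazy import: PyTorch may not be installed at this point
    try:
        import torch

        status["cuda"] = bool(torch.cuda.is_available())
        if status["cuda"]:
            status["device_name"] = torch.cuda.get_device_name(0)
    except ImportError:
        pass

    return status


print(check_gpu())
```

If `cuda` is `False` in Colab, switch the runtime type to a GPU and run the cell again.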


### Task 1.2: Install Required Packages

**Your Task**: Install the necessary packages for this lab.

**Required packages**:
- transformers (latest version)
- torch (if not already installed)
- pillow (for image handling)
- accelerate (for optimization)

In [None]:
# TODO: Install required packages
# In a notebook cell, run pip as `!pip install -q <packages>` or `%pip install -q <packages>`; the -q flag keeps the output quiet

# Your code here:


### Task 1.3: Import Libraries and Check Versions

**Your Task**: Import all necessary libraries and verify their versions.

In [None]:
# TODO: Import the following:
# - transformers (and check version)
# - torch (and check version)
# - PIL.Image
# - requests
# - json
# - time

# Your code here:


# TODO: Print versions of transformers and torch
# Expected output format: "Transformers version: X.X.X"

# Your code here:
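
A minimal sketch for the version check, assuming you only need the version strings. It reads installed versions with the standard library's `importlib.metadata` instead of importing each package, so a missing package is reported rather than raising. The helper name `report_versions` is hypothetical; printing `transformers.__version__` and `torch.__version__` after a normal import works just as well.

```python
from importlib import metadata


def report_versions(packages=("transformers", "torch", "pillow", "requests")):
    """Map each distribution name to its installed version, or None if it is missing."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


for name, version in report_versions().items():
    print(f"{name.capitalize()} version: {version or 'not installed'}")
```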


---

## Part 2: Task 1 - Text Classification Pipeline (10 minutes)

In this task, you'll build a text classifier to categorize invoice-related communications.

### Task 2.1: Create Text Classification Pipeline

**Your Task**: Create a sentiment analysis pipeline that can classify invoice-related text.

**Requirements**:
- Use the default sentiment-analysis model (at the time of writing, `distilbert-base-uncased-finetuned-sst-2-english`)
- Configure it to use GPU if available
- Test on invoice-related messages

**Test Messages**:
1. "Please find attached the invoice for your recent order"
2. "Payment overdue - urgent action required"
3. "Thank you for your prompt payment"
4. "Invoice contains errors and needs correction"
5. "Discount applied to your invoice as discussed"

In [None]:
from transformers import pipeline
import time

# TODO: Create a sentiment analysis pipeline
# Hint: pipeline("sentiment-analysis", device=...) where device=0 is the first GPU and device=-1 is the CPU

# Your code here:


# Test messages for classification
test_messages = [
    "Please find attached the invoice for your recent order",
    "Payment overdue - urgent action required",
    "Thank you for your prompt payment",
    "Invoice contains errors and needs correction",
    "Discount applied to your invoice as discussed"
]

# TODO: Classify each message and measure time
# For each message:
# 1. Record start time
# 2. Run classification
# 3. Record end time
# 4. Print message, result, and time taken

# Your code here:
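
The classify-and-time loop could be sketched as follows. The function names `build_classifier` and `classify_with_timing` are assumptions for illustration, not part of the lab. The classifier is passed in as an argument so the timing loop can be tried with any callable that returns the same shape as a text-classification pipeline (a list containing one `{"label", "score"}` dict per input string).

```python
import time


def build_classifier():
    """Create the default sentiment pipeline, on GPU when one is visible."""
    import torch
    from transformers import pipeline

    device = 0 if torch.cuda.is_available() else -1
    return pipeline("sentiment-analysis", device=device)


def classify_with_timing(classifier, messages):
    """Classify each message and record the wall-clock time of the call."""
    rows = []
    for message in messages:
        start = time.perf_counter()
        prediction = classifier(message)[0]  # one dict per input string
        elapsed = time.perf_counter() - start
        rows.append({
            "text": message,
            "label": prediction["label"],
            "score": prediction["score"],
            "seconds": elapsed,
        })
    return rows


# In the notebook:
# for row in classify_with_timing(build_classifier(), test_messages):
#     print(f'{row["text"]!r}: {row["label"]} ({row["score"]:.3f}) in {row["seconds"]:.3f}s')
```

The first call is usually slower than the rest because it includes one-time setup, so consider discarding it when you judge inference speed.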


### Task 2.2: Analyze Results

**Your Task**: Answer these questions based on your results:

1. Which messages were classified as POSITIVE vs NEGATIVE?
2. Do the classifications make business sense?
3. What was the average inference time?
4. How might you use this in an invoice processing system?

In [None]:
# TODO: Calculate and print statistics
# - Count of positive vs negative classifications
# - Average confidence score
# - Average inference time

# Your code here:
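
A small sketch of the statistics step, assuming each result has been collected into a dict with `label`, `score`, and `seconds` keys (that row shape and the name `summarize` are assumptions for this example).

```python
from collections import Counter
from statistics import mean


def summarize(rows):
    """Summarize rows shaped like {"label": str, "score": float, "seconds": float}."""
    if not rows:
        return {"counts": {}, "avg_score": None, "avg_seconds": None}
    return {
        "counts": dict(Counter(row["label"] for row in rows)),
        "avg_score": mean(row["score"] for row in rows),
        "avg_seconds": mean(row["seconds"] for row in rows),
    }
```

Keeping the summary separate from the classification loop makes it easy to reuse the same function for the CPU and GPU runs later in the lab.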


---

## Part 3: Task 2 - Question Answering Pipeline (10 minutes)

Now you'll implement a QA system to extract specific information from invoice text.

### Task 3.1: Create QA Pipeline

**Your Task**: Implement a question-answering pipeline for invoice information extraction.

**Model to use**: `distilbert-base-cased-distilled-squad` (faster) or `deepset/roberta-base-squad2` (more accurate)

**Context**: Use this sample invoice text:
```
INVOICE #2024-5678
Date: March 15, 2024
Due Date: April 14, 2024

Bill To: ABC Corporation
Address: 123 Main Street, New York, NY 10001

Services Provided:
- Consulting Services: 40 hours @ $150/hour = $6,000
- Software Development: 60 hours @ $200/hour = $12,000
- Project Management: 20 hours @ $125/hour = $2,500

Subtotal: $20,500
Tax (8%): $1,640
Total Amount Due: $22,140

Payment Terms: Net 30 days
Late Fee: 2% per month after due date
Accepted Payment Methods: Wire transfer, Check, Credit Card
```

In [None]:
# Invoice context
invoice_context = """INVOICE #2024-5678
Date: March 15, 2024
Due Date: April 14, 2024

Bill To: ABC Corporation
Address: 123 Main Street, New York, NY 10001

Services Provided:
- Consulting Services: 40 hours @ $150/hour = $6,000
- Software Development: 60 hours @ $200/hour = $12,000
- Project Management: 20 hours @ $125/hour = $2,500

Subtotal: $20,500
Tax (8%): $1,640
Total Amount Due: $22,140

Payment Terms: Net 30 days
Late Fee: 2% per month after due date
Accepted Payment Methods: Wire transfer, Check, Credit Card"""

# TODO: Create a question-answering pipeline
# Hint: pipeline("question-answering", model=..., device=...)

# Your code here:


# Questions to ask
questions = [
    "What is the invoice number?",
    "What is the total amount due?",
    "When is the payment due?",
    "What is the late fee percentage?",
    "How many hours of software development were provided?",
    "Which payment methods are accepted?"
]

# TODO: For each question:
# 1. Ask the question
# 2. Get the answer
# 3. Print question, answer, and confidence score

# Your code here:
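
One way to structure the question loop, shown as a sketch under stated assumptions. `build_qa` and `ask_questions` are hypothetical names. The sketch relies on a question-answering pipeline returning a dict with `answer` and `score` keys when called with a single `question` and `context`.

```python
def build_qa(model_name="distilbert-base-cased-distilled-squad"):
    """Create a question-answering pipeline, on GPU when one is visible."""
    import torch
    from transformers import pipeline

    device = 0 if torch.cuda.is_available() else -1
    return pipeline("question-answering", model=model_name, device=device)


def ask_questions(qa, context, questions):
    """Ask every question against the same context and keep answer plus confidence."""
    answers = {}
    for question in questions:
        result = qa(question=question, context=context)
        answers[question] = {"answer": result["answer"], "score": result["score"]}
    return answers


# In the notebook:
# for question, info in ask_questions(build_qa(), invoice_context, questions).items():
#     print(f'{question} -> {info["answer"]} (confidence {info["score"]:.3f})')
```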


### Task 3.2: Accuracy Assessment

**Your Task**: Evaluate the QA system's performance.

Create a simple accuracy checker:
1. Define expected answers for each question
2. Compare model answers with expected answers
3. Calculate accuracy percentage

In [None]:
# Expected answers (for validation)
expected_answers = {
    "What is the invoice number?": "2024-5678",
    "What is the total amount due?": "$22,140",
    "When is the payment due?": "April 14, 2024",
    "What is the late fee percentage?": "2%",
    "How many hours of software development were provided?": "60",
    "Which payment methods are accepted?": "Wire transfer, Check, Credit Card"
}

# TODO: Compare model answers with expected answers
# Calculate how many answers contain the expected information

# Your code here:
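
The three steps above can be sketched as a containment check: an answer counts as correct when it contains the expected text after light normalization. The function names and the normalization rule (lowercase, collapsed whitespace) are assumptions for this example; stricter or looser matching may suit your data better.

```python
def normalize(text):
    """Lowercase and collapse whitespace so formatting differences are not counted as errors."""
    return " ".join(text.lower().split())


def containment_accuracy(model_answers, expected_answers):
    """Return (fraction correct, per-question booleans).

    model_answers maps each question to the model's answer string.
    """
    if not expected_answers:
        return 0.0, {}
    details = {}
    for question, expected in expected_answers.items():
        predicted = model_answers.get(question, "")
        details[question] = normalize(expected) in normalize(predicted)
    return sum(details.values()) / len(expected_answers), details


# In the notebook, with `answers` mapping each question to the model's answer string:
# accuracy, details = containment_accuracy(answers, expected_answers)
# print(f"Accuracy: {accuracy:.0%}")
```

Containment is deliberately forgiving: a model answer of `#2024-5678` still matches the expected `2024-5678`.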


---

## Part 4: Task 3 - Document Image Analysis (10 minutes)

Learn to analyze document images - a key skill for real invoice processing.

### Task 4.1: Image Classification Pipeline

**Your Task**: Create an image classifier to identify document types.

**Test with these image types**:
- Invoice image: Use one from the generated images or a public URL
- Receipt image: Use one from the generated images or a public URL
- General document: Any PDF icon or document image

**Note**: In production, you would use the images we generated earlier. For this lab, any publicly available document image works. Keep in mind that `google/vit-base-patch16-224` was trained on ImageNet classes, so expect generic object labels rather than document types such as "invoice" or "receipt".

In [None]:
from PIL import Image
import requests
from io import BytesIO

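# Download an image from a URL and return it as an RGB PIL image
def download_image(url):
    response = requests.get(url, timeout=30)
    response.raise_for_status()  # fail on HTTP errors instead of parsing an error page
    return Image.open(BytesIO(response.content)).convert("RGB")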

# TODO: Create an image classification pipeline
# Hint: pipeline("image-classification", model="google/vit-base-patch16-224", device=...)

# Your code here:


# Sample image URLs (these may no longer resolve; replace them with URLs you have verified)
image_urls = {
    "invoice": "https://www.invoicesimple.com/wp-content/uploads/2018/06/Invoice-Template-Google-Docs.png",
    "receipt": "https://www.smartsheet.com/sites/default/files/2020-09/IC-Receipt-Template.png",
    # Add more URLs as needed
}

# TODO: For each image:
# 1. Download the image
# 2. Classify it
# 3. Print top 3 predictions
# 4. Measure inference time

# Your code here:
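
A sketch of the classification loop, assuming the images have already been downloaded into a dict that maps a name to a PIL image. `build_image_classifier` and `classify_images` are hypothetical names. The `top_k` argument is passed to the pipeline call to limit the number of predictions returned.

```python
import time


def build_image_classifier(model_name="google/vit-base-patch16-224"):
    """Create an image-classification pipeline, on GPU when one is visible."""
    import torch
    from transformers import pipeline

    device = 0 if torch.cuda.is_available() else -1
    return pipeline("image-classification", model=model_name, device=device)


def classify_images(classifier, images, top_k=3):
    """Classify each named image and record its top_k predictions and timing."""
    report = {}
    for name, image in images.items():
        start = time.perf_counter()
        predictions = classifier(image, top_k=top_k)
        report[name] = {
            "predictions": predictions,
            "seconds": time.perf_counter() - start,
        }
    return report


# In the notebook:
# images = {name: download_image(url) for name, url in image_urls.items()}
# for name, info in classify_images(build_image_classifier(), images).items():
#     print(name, f'{info["seconds"]:.3f}s')
#     for p in info["predictions"]:
#         print(f'  {p["label"]}: {p["score"]:.3f}')
```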


### Task 4.2: Document Question Answering (Advanced)

**Your Task**: If time permits, try document question-answering on an invoice image.

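**Model**: `impira/layoutlm-document-qa`

This combines OCR with question answering - exactly what we need for invoice processing! The pipeline runs OCR through `pytesseract`, so both that package and the Tesseract binary must be installed before you create it.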

In [None]:
# TODO (Optional Advanced Task):
# 1. Create a document-question-answering pipeline
# 2. Load an invoice image
# 3. Ask questions about the invoice
# 4. Compare with text-based QA results

# Your code here:
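
A rough sketch of the optional task, assuming `pytesseract` and the Tesseract binary are installed. The helper names are hypothetical. The document-question-answering pipeline may return either a single dict or a list of candidate dicts, so the sketch accepts both shapes and keeps the first candidate.

```python
def build_document_qa(model_name="impira/layoutlm-document-qa"):
    """Create a document QA pipeline (requires pytesseract for OCR)."""
    from transformers import pipeline

    return pipeline("document-question-answering", model=model_name)


def ask_document(doc_qa, image, questions):
    """Ask each question about one document image and keep the best candidate."""
    answers = {}
    for question in questions:
        result = doc_qa(image=image, question=question)
        if isinstance(result, list):
            result = result[0] if result else {"answer": "", "score": 0.0}
        answers[question] = {"answer": result["answer"], "score": result["score"]}
    return answers


def compare_answers(text_answers, doc_answers):
    """Pair text-QA and document-QA answers for the questions both systems saw."""
    return {
        question: (text_answers[question]["answer"], doc_answers[question]["answer"])
        for question in text_answers
        if question in doc_answers
    }
```

`compare_answers` expects both inputs in the `{"answer", "score"}` shape, which makes it straightforward to line up image-based answers against the text-based ones from Part 3.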


---

## Part 5: Performance Optimization (10 minutes)

Understanding performance is crucial for production systems.

### Task 5.1: CPU vs GPU Comparison

**Your Task**: Compare the performance difference between CPU and GPU inference.

**Steps**:
1. Create two identical pipelines (one on CPU, one on GPU)
2. Run the same task 10 times on each
3. Compare average times
4. Calculate speedup factor

In [None]:
import time

# Test text for comparison
test_text = "This invoice requires immediate payment to avoid late fees."

# TODO: Create CPU pipeline
# Hint: pipeline("sentiment-analysis", device=-1)

# Your code here:


# TODO: Create GPU pipeline (if available)
# Hint: pipeline("sentiment-analysis", device=0)

# Your code here:


# TODO: Benchmark both pipelines
# Run each 10 times and calculate average time

# Your code here:


# TODO: Calculate and print speedup
# Format: "GPU is X.Xx faster than CPU"

# Your code here:
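
The benchmark steps above can be sketched as a generic timing helper. The names `benchmark` and `format_speedup` and the single warm-up call are assumptions for this example. The warm-up keeps one-time setup cost out of the average.

```python
import time
from statistics import mean


def benchmark(fn, *args, runs=10, warmup=1):
    """Average wall-clock seconds over `runs` calls, after `warmup` untimed calls."""
    for _ in range(warmup):
        fn(*args)
    timings = []
    for _ in range(runs):
        start = time.perf_counter()
        fn(*args)
        timings.append(time.perf_counter() - start)
    return mean(timings)


def format_speedup(cpu_seconds, gpu_seconds):
    """Express the CPU/GPU time ratio in the format the task asks for."""
    return f"GPU is {cpu_seconds / gpu_seconds:.1f}x faster than CPU"


# In the notebook, with cpu_pipe and gpu_pipe created as in the hints:
# cpu_avg = benchmark(cpu_pipe, test_text)
# gpu_avg = benchmark(gpu_pipe, test_text)
# print(format_speedup(cpu_avg, gpu_avg))
```

For a single short sentence the ratio can come out below 1.0, because moving data to the GPU costs more than the computation saves. Larger batches usually show the GPU advantage more clearly.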


### Task 5.2: Memory Management

**Your Task**: Monitor and manage GPU memory usage.

**Learn to**:
1. Check current GPU memory usage
2. Load a model and see memory increase
3. Clear memory properly
4. Verify memory is released

In [None]:
import torch
import gc

# TODO: Create a function to get GPU memory usage
def get_gpu_memory():
    if torch.cuda.is_available():
        # Return allocated and reserved memory in MB
        # Hint: torch.cuda.memory_allocated() and torch.cuda.memory_reserved() return bytes; divide by 1024**2 for MB
        pass
    return 0, 0

# Your code here:


# TODO: Demonstrate memory lifecycle
# 1. Check initial memory
# 2. Load a model (e.g., "gpt2", about 124M parameters)
# 3. Check memory after loading
# 4. Delete the model
# 5. Run garbage collection and clear cache
# 6. Check final memory

# Your code here:
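
A minimal sketch of the memory helpers, assuming a single CUDA device. `release_gpu_memory` is a hypothetical name. `torch` is imported lazily so the functions degrade to zeros on a machine without PyTorch or without a GPU.

```python
import gc


def get_gpu_memory():
    """Return (allocated_mb, reserved_mb) for the current CUDA device, or zeros without CUDA."""
    try:
        import torch
    except ImportError:
        return 0.0, 0.0
    if not torch.cuda.is_available():
        return 0.0, 0.0
    mb = 1024 ** 2
    return torch.cuda.memory_allocated() / mb, torch.cuda.memory_reserved() / mb


def release_gpu_memory():
    """Collect unreachable Python objects, then release cached CUDA blocks."""
    gc.collect()
    try:
        import torch
    except ImportError:
        return
    if torch.cuda.is_available():
        torch.cuda.empty_cache()


# Lifecycle in the notebook (commented out because it downloads gpt2):
# from transformers import pipeline
# print("initial: ", get_gpu_memory())
# generator = pipeline("text-generation", model="gpt2", device=0)
# print("loaded:  ", get_gpu_memory())
# del generator          # drop the last reference before collecting
# release_gpu_memory()
# print("released:", get_gpu_memory())
```

Memory is only returned if no other variable still references the model, so delete every reference (including notebook outputs that hold it) before calling `release_gpu_memory()`.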


---

## Part 6: Integration Challenge (5 minutes)

Bring it all together!

### Final Challenge: Invoice Processing Pipeline

**Your Task**: Create a simple invoice processing function that:
1. Takes invoice text as input
2. Classifies the sentiment and maps it to urgency (urgent vs normal), for example by treating a NEGATIVE label as urgent
3. Extracts key information (amount, due date, invoice number)
4. Returns a structured summary

**Bonus**: Add error handling and performance metrics

In [None]:
def process_invoice(invoice_text):
    """
    Process an invoice and extract key information.
    
    Args:
        invoice_text (str): The invoice text to process
        
    Returns:
        dict: Extracted information including sentiment, amount, due date, etc.
    """
    result = {}
    
    # TODO: Implement the following:
    # 1. Classify sentiment/urgency
    # 2. Extract invoice number
    # 3. Extract total amount
    # 4. Extract due date
    # 5. Measure total processing time
    
    # Your code here:
    
    
    return result

# Test your function
test_invoice = """URGENT - FINAL NOTICE
INVOICE #2024-9999
Date: March 1, 2024
Due Date: March 31, 2024
Total Amount Due: $5,000
This invoice is now 30 days overdue. Immediate payment required."""

# TODO: Process the test invoice and print results

# Your code here:
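
One possible shape for the final challenge, offered as a sketch rather than the expected solution. It differs from the stub above in two labeled ways: the classifier and QA pipelines are passed in as extra arguments (an assumption that avoids reloading models on every call), and a NEGATIVE sentiment label is treated as a proxy for urgency (also an assumption). The `FIELD_QUESTIONS` mapping and the keys of the result dict are hypothetical.

```python
import time

# Hypothetical mapping from output field to the question used to extract it
FIELD_QUESTIONS = {
    "invoice_number": "What is the invoice number?",
    "total_amount": "What is the total amount due?",
    "due_date": "When is the payment due?",
}


def process_invoice(invoice_text, classifier, qa):
    """Classify urgency, extract key fields, and report processing time and errors."""
    start = time.perf_counter()
    result = {"errors": []}

    try:
        sentiment = classifier(invoice_text)[0]
        result["sentiment"] = sentiment["label"]
        # Assumption: NEGATIVE sentiment stands in for "urgent"
        result["urgent"] = sentiment["label"] == "NEGATIVE"
    except Exception as exc:
        result["errors"].append(f"classification failed: {exc}")

    for field, question in FIELD_QUESTIONS.items():
        try:
            answer = qa(question=question, context=invoice_text)
            result[field] = answer["answer"]
        except Exception as exc:
            result[field] = None  # keep the key so callers see a stable shape
            result["errors"].append(f"{field} failed: {exc}")

    result["seconds"] = time.perf_counter() - start
    return result


# In the notebook, with the pipelines from Parts 2 and 3:
# print(process_invoice(test_invoice, classifier, qa_pipeline))
```

Catching errors per field means one failed extraction does not discard the fields that succeeded.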


---

## Lab Summary and Self-Assessment

### What You've Accomplished

If you've completed all tasks, you've successfully:
- ✅ Set up a GPU-accelerated environment
- ✅ Implemented text classification for invoice categorization
- ✅ Built a QA system for information extraction
- ✅ Explored image classification for documents
- ✅ Measured and optimized performance
- ✅ Created an integrated invoice processing function

### Self-Assessment Questions

Answer these to check your understanding:

1. **Why is GPU acceleration important for production systems?**
   - Your answer:

2. **What's the difference between sentiment analysis and question answering?**
   - Your answer:

3. **How would you handle a 100-page invoice document?**
   - Your answer:

4. **What are the memory implications of loading multiple models?**
   - Your answer:

5. **How could you improve the accuracy of information extraction?**
   - Your answer:

### Next Steps

In the next session, you'll learn how to:
- Combine these pipelines into an agent
- Add reasoning capabilities with LLMs
- Implement the ReAct pattern
- Handle errors and edge cases

### Additional Challenges (Optional)

If you finish early, try these:
1. Implement a pipeline for different languages
2. Create a batch processing function for multiple invoices
3. Add a caching mechanism to avoid reprocessing
4. Build a simple UI with ipywidgets
5. Export your processing function as a reusable module