In [None]:
print('Setup complete.')

# Extend Agents Safely - Hands-on Lab

## Lab Objectives
In this hands-on lab, you will learn to safely extend AI agents by implementing new tools with proper security controls. You'll build a complete safety framework including feature flags, rate limiting, input validation, and comprehensive testing.

## What You'll Build
- A new AI agent tool (choose from Weather API, CSV Database, or Log Analyzer)
- Feature flag system for controlled rollouts
- Rate limiting to prevent abuse
- Input validation and error handling
- Comprehensive test suite
- Safety policy documentation

## Lab Structure
1. **Setup** - Install dependencies and configure environment
2. **Tool Selection** - Choose your tool to implement
3. **Safety Framework** - Build core safety controls
4. **Tool Implementation** - Implement your chosen tool safely
5. **Feature Flags** - Configure gradual rollout system
6. **Testing** - Create comprehensive test suite
7. **Policy Documentation** - Generate safety policy
8. **Validation** - Confirm lab completion

## Success Criteria
- ✅ Tool implemented with all safety controls
- ✅ Feature flags functional with rollout percentages
- ✅ Rate limiting prevents abuse
- ✅ Test suite passing (success rate of at least 80%)
- ✅ Complete safety policy documented
- ✅ Google Colab compatible

**Note:** This lab is designed for Google Colab. The setup cell in Step 1 installs the dependencies when you run it.

## Step 1: Environment Setup

**Instructions:** Run this cell to install required dependencies for Google Colab.

**What this does:**
- Detects if running in Google Colab
- Installs required packages: `pytest`, `requests`, `pandas`
- Sets up imports and logging

**Your task:** Execute this setup cell before proceeding.
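
As a starting point, the detection and logging steps listed above might look like the sketch below. The flag names `IN_COLAB` and `HAS_REQUESTS` are assumptions made for illustration, and the `!pip install` lines are left out because they are notebook magics rather than plain Python.

```python
import logging

# Detect Colab by attempting the import; outside Colab this raises ImportError.
try:
    import google.colab  # noqa: F401
    IN_COLAB = True
except ImportError:
    IN_COLAB = False

# Optional package: record availability instead of failing at import time.
try:
    import requests  # noqa: F401
    HAS_REQUESTS = True
except ImportError:
    HAS_REQUESTS = False

logging.basicConfig(
    level=logging.INFO,
    format="%(asctime)s - %(levelname)s - %(message)s",
    force=True,  # replace handlers the notebook runtime may already have installed
)
logging.info("Setup complete (Colab: %s, requests: %s)", IN_COLAB, HAS_REQUESTS)
```

`force=True` requires Python 3.8 or later; without it, `basicConfig` is a no-op when the runtime has already attached a handler to the root logger.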

In [None]:
# TODO: Check if we're running in Google Colab
# Hint: try importing google.colab and handle ImportError

# TODO: If in Colab, install required packages:
# - pytest
# - requests 
# - pandas
# Use: !pip install package_name

# TODO: Import required standard libraries:
# - unittest, json, os, time, logging
# - hashlib, dataclasses, datetime, enum, typing

# TODO: Try importing optional packages (requests, pandas)
# Handle ImportError gracefully

# TODO: Configure logging at INFO level
# Set format: '%(asctime)s - %(levelname)s - %(message)s'

# TODO: Print setup completion message

## Step 2: Tool Selection

**Instructions:** Choose ONE tool to implement by uncommenting the appropriate line below.

**Tool Options:**
- **Weather API**: Simulated weather data retrieval
- **CSV Database**: Read and query CSV files
- **Log Analyzer**: Parse and analyze log files

**Your task:** Uncomment ONE line and run this cell.
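
One possible shape for the selection check is sketched below. The helper `validate_tool_choice` and the set `VALID_TOOLS` are hypothetical names, not part of the lab skeleton.

```python
VALID_TOOLS = {"weather_api", "csv_database", "log_analyzer"}


def validate_tool_choice(choice):
    """Return the normalized tool name, or raise ValueError if it is unusable."""
    if not isinstance(choice, str) or not choice.strip():
        raise ValueError("No tool selected: uncomment one CHOSEN_TOOL line.")
    name = choice.strip()
    if name not in VALID_TOOLS:
        raise ValueError(f"Unknown tool {name!r}; expected one of {sorted(VALID_TOOLS)}")
    return name


print(validate_tool_choice("weather_api"))  # weather_api
```

In the notebook, passing `globals().get("CHOSEN_TOOL")` to this helper avoids a `NameError` when no line has been uncommented yet.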

In [None]:
# TODO: Uncomment ONE of the following tool choices:
# CHOSEN_TOOL = "weather_api"      # For weather data simulation
# CHOSEN_TOOL = "csv_database"     # For CSV file operations
# CHOSEN_TOOL = "log_analyzer"     # For log file analysis

# TODO: Add validation to ensure a tool was selected
# Check if CHOSEN_TOOL variable exists and is not empty
# Print confirmation message with selected tool

# TODO: Initialize global variables for lab tracking:
# - TEST_RESULTS = {} (will store test outcomes)
# - my_tool = None (will hold the tool instance)

## Step 3: Safety Framework Implementation

**Instructions:** Implement the core safety framework with feature flags, rate limiting, and error handling.

**What you'll build:**
- `FeatureFlag` class for rollout control
- `ToolResult` dataclass for structured responses
- `SafeToolBase` class with safety controls
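
A minimal sketch of the first two pieces follows, under a few assumptions: the bucket is derived from a SHA-256 digest, the flag name is mixed into the hashed string so that different flags bucket users independently, and `data` defaults to `None`. None of these choices is prescribed by the lab.

```python
import hashlib
from dataclasses import dataclass, field
from datetime import datetime
from typing import Any


@dataclass
class FeatureFlag:
    name: str
    enabled: bool = False
    rollout_percentage: int = 0  # expected range: 0-100

    def is_enabled_for_user(self, user_id: str) -> bool:
        if not self.enabled:
            return False
        # hashlib yields the same digest in every session; the built-in hash()
        # is salted per process for strings, so buckets would shift on restart.
        digest = hashlib.sha256(f"{self.name}:{user_id}".encode("utf-8")).hexdigest()
        bucket = int(digest, 16) % 100  # stable bucket in 0..99
        return bucket < self.rollout_percentage


@dataclass
class ToolResult:
    success: bool
    data: Any = None  # assumption: default added for convenience
    error_message: str = ""
    # default_factory runs once per instance; a plain datetime.now() default
    # would be evaluated a single time, when the class is defined.
    timestamp: datetime = field(default_factory=datetime.now)


flag = FeatureFlag(name="weather_api", enabled=True, rollout_percentage=100)
print(flag.is_enabled_for_user("user1"))  # True: every bucket is below 100
```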

**Your task:** Complete the framework classes below.
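
For the rate-limiting part, a sliding-window check can be sketched on its own before it is folded into `SafeToolBase`. The class name `SlidingWindowLimiter` and the injectable `clock` parameter are assumptions, added so the behavior can be demonstrated without sleeping.

```python
import time
from collections import deque


class SlidingWindowLimiter:
    def __init__(self, max_calls_per_minute=10, window_seconds=60.0, clock=time.monotonic):
        self.max_calls = max_calls_per_minute
        self.window = window_seconds
        self.clock = clock
        self.call_timestamps = deque()

    def allow(self):
        """Record the call and return True if it fits in the current window."""
        now = self.clock()
        # Drop timestamps that have aged out of the window.
        while self.call_timestamps and now - self.call_timestamps[0] >= self.window:
            self.call_timestamps.popleft()
        if len(self.call_timestamps) >= self.max_calls:
            return False
        self.call_timestamps.append(now)
        return True


fake_now = [0.0]  # hypothetical controllable clock for the demo
limiter = SlidingWindowLimiter(max_calls_per_minute=2, clock=lambda: fake_now[0])
print(limiter.allow(), limiter.allow(), limiter.allow())  # True True False
fake_now[0] = 61.0
print(limiter.allow())  # True: the earlier calls have aged out
```

Rejected calls are not recorded here, so a client that keeps retrying does not extend its own lockout; whether that is the right policy depends on the tool.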

In [None]:
# TODO: Create FeatureFlag class
# Include these attributes:
# - enabled: bool
# - rollout_percentage: int (0-100)
# - name: str

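# TODO: Add method is_enabled_for_user(user_id: str) -> bool
# Return False immediately if enabled is False
# Hash user_id with hashlib (e.g. SHA-256), not the built-in hash():
# str hashes are randomized per Python process, so assignments would change between sessions
# Return True if int(hex_digest, 16) % 100 < rollout_percentage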

# TODO: Create ToolResult dataclass with:
# - success: bool
# - data: Any (use typing.Any)
# - error_message: str = ""
# - timestamp: datetime (use field(default_factory=datetime.now) so each result gets its own time)

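# TODO: Create SafeToolBase class (a dataclass, since it uses field()) with:
# - name: str
# - feature_flag: FeatureFlag
# - max_calls_per_minute: int = 10
# - call_timestamps: list = field(default_factory=list)
# - call_count: int = 0
# - success_count: int = 0 (needed to compute success_rate in get_stats)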

# TODO: Add _check_rate_limit() method
# Remove timestamps older than 1 minute
# Return True if under rate limit, False otherwise

# TODO: Add _log_call() method
# Record timestamp and increment call_count
# Log the call with logging.info()

# TODO: Add get_stats() method returning dict with:
# - total_calls, success_rate, feature_flag_status

## Step 4: Tool Implementation

**Instructions:** Implement your chosen tool class that inherits from SafeToolBase.

**What you'll build:**
- Tool-specific class with input validation
- Mock data for testing
- Safe execution method
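
For the Log Analyzer option, one conservative way to satisfy the "no ReDoS" requirement is to avoid compiling user-supplied regex syntax at all. The sketch below swaps in that simpler technique: it treats the pattern as a literal substring via `re.escape` and caps its length. The function name, the length cap, and the sample log lines are all made up for illustration.

```python
import re

MAX_PATTERN_LENGTH = 100  # assumption: arbitrary cap for the sketch

MOCK_LOG_DATA = [
    "2024-01-01 10:00:00 INFO Service started",
    "2024-01-01 10:05:00 ERROR Database timeout",
    "2024-01-01 10:06:00 WARNING Retrying connection",
]


def search_logs(pattern, lines=MOCK_LOG_DATA):
    """Return the lines containing `pattern` as a case-insensitive literal."""
    if not isinstance(pattern, str) or not pattern:
        raise ValueError("Pattern must be a non-empty string.")
    if len(pattern) > MAX_PATTERN_LENGTH:
        raise ValueError(f"Pattern longer than {MAX_PATTERN_LENGTH} characters.")
    # re.escape turns every metacharacter into a literal, so the compiled
    # regex has no quantifiers or groups that could backtrack.
    literal = re.compile(re.escape(pattern), re.IGNORECASE)
    return [line for line in lines if literal.search(line)]


print(search_logs("error"))  # ['2024-01-01 10:05:00 ERROR Database timeout']
```

This gives up real regex matching in exchange for predictable run time; supporting full patterns safely would need a different approach.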

**Your task:** Complete the implementation for your chosen tool.
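
For the Weather API option, the validation rule (1 to 50 characters, alphanumeric plus spaces) can be sketched as a standalone function before it is moved into the tool class. The mock values and the plain-dict return shape are assumptions; in the lab the method would return a `ToolResult` instead.

```python
import re

# Letters, digits, and spaces only, 1 to 50 characters in total.
CITY_PATTERN = re.compile(r"[A-Za-z0-9 ]{1,50}")

# Made-up sample values for illustration only.
MOCK_WEATHER_DATA = {
    "london": {"temp_c": 15, "humidity": 80, "conditions": "cloudy"},
    "tokyo": {"temp_c": 22, "humidity": 60, "conditions": "clear"},
}


def get_weather(city):
    """Validate the city name, then look it up in the mock data."""
    if not isinstance(city, str):
        return {"success": False, "data": None, "error_message": "City must be a string."}
    cleaned = city.strip()
    if not CITY_PATTERN.fullmatch(cleaned):
        return {
            "success": False,
            "data": None,
            "error_message": "City must be 1-50 letters, digits, or spaces.",
        }
    record = MOCK_WEATHER_DATA.get(cleaned.lower())
    if record is None:
        return {"success": False, "data": None, "error_message": f"No mock data for {cleaned!r}."}
    return {"success": True, "data": record, "error_message": ""}


print(get_weather("London")["success"])        # True
print(get_weather("London; DROP")["success"])  # False: ';' is not allowed
```

`fullmatch` is used rather than `match` so that trailing characters outside the allowed set cannot slip through.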

In [None]:
# TODO: Create WeatherAPITool class (if weather_api chosen)
# Inherit from SafeToolBase
# Add method: get_weather(city: str) -> ToolResult
# Validate city name (length 1-50, alphanumeric + spaces)
# Return mock weather data: temp, humidity, conditions

# TODO: Create CSVDatabaseTool class (if csv_database chosen) 
# Inherit from SafeToolBase
# Add method: query_csv(query: str) -> ToolResult
# Validate query (no SQL injection patterns)
# Return mock CSV data with employee records

# TODO: Create LogAnalyzerTool class (if log_analyzer chosen)
# Inherit from SafeToolBase
# Add method: analyze_logs(pattern: str) -> ToolResult
# Validate pattern (safe regex, no ReDoS)
# Return mock log analysis results

# TODO: Create mock data dictionaries for each tool:
# - MOCK_WEATHER_DATA: dict of cities to weather info
# - MOCK_CSV_DATA: list of employee records
# - MOCK_LOG_DATA: list of log entries with different levels

# TODO: Initialize the chosen tool instance:
# Create FeatureFlag with name=CHOSEN_TOOL, enabled=False, rollout_percentage=0
# Instantiate the appropriate tool class
# Assign to global variable 'my_tool'

## Step 5: Feature Flag Configuration

**Instructions:** Configure and test your feature flag system for controlled rollouts.

**What you'll learn:**
- How to safely disable features
- Gradual rollout strategies
- User-based feature targeting

**Your task:** Test different feature flag configurations.
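
A property worth checking while you experiment is that hash-based rollouts are monotonic: anyone enabled at 25% stays enabled at 50% and at 100%. The sketch below demonstrates this with standalone helpers; `bucket` and `in_rollout` are hypothetical names that mirror, but are separate from, your `FeatureFlag` class.

```python
import hashlib


def bucket(user_id):
    """Map a user ID to a stable bucket in 0..99."""
    digest = hashlib.sha256(user_id.encode("utf-8")).hexdigest()
    return int(digest, 16) % 100


def in_rollout(user_id, percentage):
    return bucket(user_id) < percentage


users = [f"user{i}" for i in range(1, 201)]
at_25 = {u for u in users if in_rollout(u, 25)}
at_50 = {u for u in users if in_rollout(u, 50)}
at_100 = {u for u in users if in_rollout(u, 100)}

print(len(at_25), len(at_50), len(at_100))
print(at_25 <= at_50 <= at_100)  # True: raising the percentage only adds users
```

The share of enabled users only approximates the configured percentage, so with just four test IDs a 25% rollout may enable none of them or more than one.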

In [None]:
# TODO: Test 1 - Feature Completely Disabled
# Set my_tool.feature_flag.enabled = False
# Set rollout_percentage = 0
# Try calling your tool method with test data
# Verify it returns appropriate "feature disabled" message

# TODO: Test 2 - Partial Rollout (25%)
# Set my_tool.feature_flag.enabled = True
# Set rollout_percentage = 25
# Test with multiple user IDs: "user1", "user2", "user3", "user4"
# Check which users get the feature using is_enabled_for_user()
# Print results for each user

# TODO: Test 3 - Full Rollout (100%)
# Set rollout_percentage = 100
# Test that all users now get the feature
# Make actual tool calls to verify functionality

## Step 6: Comprehensive Testing

**Instructions:** Build a complete test suite to validate all safety controls.

**What you'll test:**
- Feature flag functionality
- Rate limiting behavior
- Input validation
- Error handling
- Tool functionality

**Your task:** Implement the test class and run comprehensive tests.
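
A bare-bones version of the runner might look like the following sketch. The result keys (`total_tests`, `passed`, `failed`, `errors`, `failed_tests`, `success_rate`) are assumptions chosen to line up with the checks in Step 8, and `ExampleTests` is a throwaway class used only to exercise the runner.

```python
class ColabTestRunner:
    def __init__(self):
        self.results = {
            "total_tests": 0, "passed": 0, "failed": 0, "errors": 0, "failed_tests": [],
        }

    def run_test(self, test_method, test_name):
        self.results["total_tests"] += 1
        try:
            test_method()
        except AssertionError as exc:  # the test ran but an expectation failed
            self.results["failed"] += 1
            self.results["failed_tests"].append(test_name)
            print(f"❌ {test_name}: {exc}")
        except Exception as exc:  # the test itself crashed
            self.results["errors"] += 1
            self.results["failed_tests"].append(test_name)
            print(f"❌ {test_name}: unexpected {type(exc).__name__}: {exc}")
        else:
            self.results["passed"] += 1
            print(f"✅ {test_name}")

    def run_all_tests(self, test_class_instance):
        for name in sorted(dir(test_class_instance)):
            method = getattr(test_class_instance, name)
            if name.startswith("test_") and callable(method):
                self.run_test(method, name)
        total = self.results["total_tests"]
        self.results["success_rate"] = 100.0 * self.results["passed"] / total if total else 0.0
        return self.results


class ExampleTests:
    def test_passes(self):
        assert 1 + 1 == 2

    def test_fails(self):
        assert "a" == "b", "strings differ"


results = ColabTestRunner().run_all_tests(ExampleTests())
print(results["success_rate"])  # 50.0
```

Separating assertion failures from unexpected exceptions makes it easier to tell a wrong expectation from a broken test.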

In [None]:
# TODO: Create ColabTestRunner class for Google Colab compatibility
# Include methods:
# - run_test(test_method, test_name)
# - run_all_tests(test_class_instance)
# Print results with ✅ for pass, ❌ for fail

# TODO: Create ToolSafetyTests class with these test methods:

# test_feature_flag_disabled()
# - Disable feature flag
# - Verify tool calls are blocked
# - Check appropriate error message returned

# test_feature_flag_rollout()
# - Set rollout to 50%
# - Test multiple user IDs
# - Verify consistent user assignment

# test_rate_limiting()
# - Enable feature flag fully
# - Make calls up to rate limit
# - Verify additional calls are blocked
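# - Verify the rate limit resets once the recorded calls are older than 1 minute
#   (backdate call_timestamps instead of sleeping for 60 seconds)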

# test_input_validation()
# - Test with invalid inputs (empty, too long, special chars)
# - Verify appropriate validation errors
# - Test with valid inputs for success

# test_error_handling()
# - Simulate various error conditions
# - Verify graceful error responses
# - Check error logging works

# test_tool_functionality()
# - Test core tool functionality with valid inputs
# - Verify expected data structure returned
# - Check success/failure status correctly set

In [None]:
# TODO: Run the comprehensive test suite
# - Create instance of ToolSafetyTests
# - Create instance of ColabTestRunner
# - Run all tests using the runner
# - Store results in TEST_RESULTS global variable

# TODO: Calculate and display test statistics:
# - Total tests run
# - Number passed/failed
# - Success rate percentage
# - List any failed tests

# TODO: Print final test summary with pass/fail status

## Step 7: Safety Policy Documentation

**Instructions:** Generate a comprehensive safety policy document for your tool.

**What you'll document:**
- Tool purpose and functionality
- Security controls implemented
- Risk assessment and mitigations
- Monitoring requirements
- Deployment checklist

**Your task:** Complete the policy generator function and create your policy document.
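
The generator can start as a simple template over a description dictionary. The sketch below covers only a few of the sections you will need, and the risk entries and the "Public" classification are illustrative placeholders rather than an actual assessment.

```python
from datetime import date

# Illustrative placeholder content; replace with your own assessment.
TOOL_DESCRIPTIONS = {
    "weather_api": {
        "purpose": "Simulated weather data retrieval",
        "risks": ["Malformed city names", "Excessive call volume"],
        "data_sensitivity": "Public",
    },
}


def generate_safety_policy(tool_name, test_results):
    """Build a short markdown policy document for one tool."""
    if tool_name not in TOOL_DESCRIPTIONS:
        raise ValueError(f"No description for tool {tool_name!r}")
    info = TOOL_DESCRIPTIONS[tool_name]
    risk_lines = [f"- {risk}" for risk in info["risks"]]
    passed = test_results.get("passed", 0)
    total = test_results.get("total_tests", 0)
    lines = [
        f"# Safety Policy: {tool_name}",
        f"Date: {date.today().isoformat()}",
        "",
        "## Purpose",
        info["purpose"],
        "",
        "## Security Controls",
        "- Feature flag with percentage rollout",
        "- Rate limiting",
        "- Input validation",
        "- Structured error handling",
        "",
        "## Risks",
        *risk_lines,
        "",
        "## Data Classification",
        info["data_sensitivity"],
        "",
        "## Testing Status",
        f"{passed} of {total} tests passed",
    ]
    return "\n".join(lines)


print(generate_safety_policy("weather_api", {"passed": 5, "total_tests": 6}))
```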

In [None]:
# TODO: Create generate_safety_policy function
# Parameters: tool_name (str), test_results (dict)
# Returns: formatted policy document (str)

# TODO: Define tool_descriptions dictionary with:
# For each tool type include:
# - purpose: what the tool does
# - risks: list of potential security risks
# - data_sensitivity: classification level

# TODO: Generate policy document with these sections:
# - Header with tool name, author, date, version
# - Purpose section
# - Security Controls (feature flags, rate limiting, validation, error handling)
# - Risk Assessment table with likelihood/impact/mitigation
# - Data Classification and retention policy
# - Monitoring and alerting requirements
# - Testing status and results
# - Deployment checklist
# - Dependencies and requirements
# - Approval sign-off section

# TODO: Call the function to generate policy for your chosen tool
# Store result in SAFETY_POLICY_CONTENT variable
# Display the generated policy

## Step 8: Lab Completion and Validation

**Instructions:** Complete the final validation to confirm your lab implementation.

**What you'll validate:**
- All components implemented correctly
- Tests passing with good success rate
- Safety policy generated
- Deliverables ready for submission

**Your task:** Run validation checks and generate final deliverables.
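
The checklist can be expressed as a dictionary of boolean checks, as in the sketch below. The function `run_checklist`, the stub tool built with `SimpleNamespace`, and the result keys are assumptions standing in for your own `my_tool`, `TEST_RESULTS`, and `SAFETY_POLICY_CONTENT`.

```python
import json
from datetime import datetime
from types import SimpleNamespace


def run_checklist(tool, test_results, policy_text, min_success_rate=80.0):
    """Print each check with a status mark and return True if all pass."""
    checks = {
        "Tool implemented": tool is not None,
        "Feature flag functional": hasattr(tool, "feature_flag"),
        "Rate limiting active": getattr(tool, "max_calls_per_minute", 0) > 0,
        "Tests created": test_results.get("total_tests", 0) > 0,
        "Tests passing": test_results.get("success_rate", 0.0) >= min_success_rate,
        "Policy documented": bool(policy_text),
    }
    for label, ok in checks.items():
        print("✅" if ok else "❌", label)
    return all(checks.values())


# Stub objects for the demo only.
stub_tool = SimpleNamespace(feature_flag=object(), max_calls_per_minute=10)
stub_results = {"total_tests": 6, "success_rate": 100.0}

ok = run_checklist(stub_tool, stub_results, "# Safety Policy")
print("COMPLETED SUCCESSFULLY" if ok else "NEEDS ATTENTION")

# Deliverables serialize cleanly once timestamps are converted to strings.
payload = json.dumps(
    {"test_results": stub_results, "timestamp": datetime.now().isoformat()}, indent=2
)
print(payload)
```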

In [None]:
# TODO: Display lab completion validation header
# Print "LAB COMPLETION VALIDATION" with separator line

# TODO: Get final tool statistics
# Call my_tool.get_stats() and store result
# Display tool summary including:
# - Tool name
# - Rate limit setting
# - Feature flag status (enabled/disabled)
# - Rollout percentage
# - Total calls made
# - Success rate

# TODO: Display testing summary
# Show from TEST_RESULTS:
# - Total tests implemented
# - Tests passing percentage
# - Number of failures
# - Number of errors

# TODO: Run validation checklist
# Check each item and mark with ✅ or ❌:
# - Tool implemented (my_tool is not None)
# - Feature flag functional (has feature_flag attribute)
# - Rate limiting active (max_calls_per_minute > 0)
# - Tests created (TEST_RESULTS['total_tests'] > 0)
# - Tests passing (success_rate >= 80%)
# - Policy documented (SAFETY_POLICY_CONTENT exists)
# - Google Colab compatible (always true if setup ran)

# TODO: Display final lab status
# Show "COMPLETED SUCCESSFULLY" or "NEEDS ATTENTION"
# based on validation results

In [None]:
# TODO: Generate final deliverables
# Display "Final Deliverables Summary" header

# TODO: Create deliverables dictionary containing:
# - safety_policy: SAFETY_POLICY_CONTENT
# - test_results: dict with tool_name, test_results, tool_stats, timestamp, platform

# TODO: Handle Google Colab file creation
# If running in Colab:
# - Import google.colab.files
# - Save policy as markdown file: {CHOSEN_TOOL}_safety_policy.md
# - Save test results as JSON file: {CHOSEN_TOOL}_test_results.json
# - Print file creation confirmation
# - Show download instructions

# TODO: Handle non-Colab environment
# Print that policy and test results are available in variables:
# - SAFETY_POLICY_CONTENT
# - TEST_RESULTS

# TODO: Display congratulations message
# Show completion celebration
# List key concepts learned:
# - Feature flag implementation and rollout strategies
# - Rate limiting and abuse prevention
# - Comprehensive input validation
# - Safety-first tool development
# - Test-driven development for AI tools
# - Security policy documentation
# - Google Colab deployment considerations

# TODO: Display final statistics
# Show test success rate and total tool calls made