# 🧙‍♂️ SageMaker Cluster Recommendation Wizard

**Intelligent cluster sizing for SageMaker workloads**

This notebook provides the same functionality as the desktop version, optimized for the Amazon SageMaker environment.

## Features:
- 300+ synthetic workload examples
- TF-IDF similarity matching
- LLM analysis via AWS Bedrock
- SageMaker-specific recommendations
- Public URL sharing

---

## 📦 Step 1: Install Dependencies

First, let's install the required packages in the SageMaker environment:

In [None]:
# Install required packages
# Quote each specifier: an unquoted ">" is treated by the shell as output redirection
!pip install "gradio>=4.0.0" "pandas>=1.5.0" "numpy>=1.21.0" "scikit-learn>=1.0.0" --quiet

print("✅ Dependencies installed successfully!")
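
The success message above prints whether or not pip actually succeeded, so it can help to confirm what is installed in the kernel. A minimal sketch using only the standard library; `installed_versions` is a hypothetical helper, not part of the wizard:

```python
from importlib.metadata import PackageNotFoundError, version


def installed_versions(packages):
    """Return {distribution name: installed version, or None if missing}."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


# Distribution names as they appear on the pip command line
report = installed_versions(["gradio", "pandas", "numpy", "scikit-learn"])
for name, ver in report.items():
    print(f"{name}: {ver or 'not installed'}")
```

If a package shows as not installed, re-run the install cell without `--quiet` to see pip's output.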

## 🔧 Step 2: Import and Setup

Import the cluster wizard and verify everything is working:

In [None]:
# Import the SageMaker cluster wizard
import sys
import os

# If running from uploaded file, adjust path as needed
# sys.path.append('/path/to/your/files')

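# Run the wizard source in this notebook's namespace so that its classes and
# functions (e.g. SageMakerClusterWizard) are available to later cells.
# Alternatively, paste the contents of sagemaker_cluster_wizard.py into a cell.
with open('sagemaker_cluster_wizard.py', encoding='utf-8') as source_file:
    exec(source_file.read())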

print("✅ Cluster Wizard imported successfully!")
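
As an alternative to executing the file's source directly, the file can be loaded as a regular module, which keeps its names in their own namespace and skips any `if __name__ == "__main__":` block. A sketch under the assumption that `sagemaker_cluster_wizard.py` sits next to the notebook; `load_module_from_path` is a hypothetical helper:

```python
import importlib.util
import sys
from pathlib import Path


def load_module_from_path(module_name, file_path):
    """Import a .py file as a module instead of running it in notebook globals."""
    file_path = Path(file_path)
    spec = importlib.util.spec_from_file_location(module_name, file_path)
    if spec is None or spec.loader is None:
        raise ImportError(f"Cannot load {file_path}")
    module = importlib.util.module_from_spec(spec)
    sys.modules[module_name] = module  # register so later imports reuse it
    spec.loader.exec_module(module)
    return module


# Usage (commented out because it needs the wizard file on disk):
# wizard_module = load_module_from_path(
#     "sagemaker_cluster_wizard", "sagemaker_cluster_wizard.py"
# )
# SageMakerClusterWizard = wizard_module.SageMakerClusterWizard
```

With this approach, later cells would need the names bound explicitly, as in the commented usage lines.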

## 🧪 Step 3: Test the Wizard

Let's test the core functionality before launching the full interface:

In [None]:
# Test the wizard functionality
print("🧪 Testing SageMaker Cluster Wizard...")

# Initialize wizard
wizard = SageMakerClusterWizard()

# Test with a sample workload
test_input = "SageMaker training job for deep learning model with 64 CPU cores, 256GB RAM, and 10TB of image data"

recommendation, llm_analysis, similar_workloads = wizard.recommend_cluster(test_input)

print("\n📊 Sample Recommendation:")
print(recommendation[:500] + "..." if len(recommendation) > 500 else recommendation)

print("\n✅ Wizard is working correctly!")
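
The sample input packs several numeric requirements into free text. How the wizard parses them is not shown in this notebook; the sketch below is one plausible approach with regular expressions. The patterns and the name `extract_requirements_sketch` are assumptions for illustration only:

```python
import re

# Hypothetical patterns; the wizard's own extract_requirements may differ.
PATTERNS = {
    "cpu_cores": re.compile(r"(\d+)\s*(?:CPU\s+)?cores?", re.IGNORECASE),
    "ram_gb": re.compile(r"(\d+)\s*GB\s+(?:of\s+)?RAM", re.IGNORECASE),
    "data_tb": re.compile(r"(\d+)\s*TB", re.IGNORECASE),
    "gpus": re.compile(r"(\d+)\s*GPUs?", re.IGNORECASE),
}


def extract_requirements_sketch(text):
    """Return the first numeric value found for each known requirement."""
    result = {}
    for key, pattern in PATTERNS.items():
        match = pattern.search(text)
        if match:
            result[key] = int(match.group(1))
    return result


sample = (
    "SageMaker training job for deep learning model with 64 CPU cores, "
    "256GB RAM, and 10TB of image data"
)
print(extract_requirements_sketch(sample))
# {'cpu_cores': 64, 'ram_gb': 256, 'data_tb': 10}
```

Requirements that are not mentioned (here, GPUs) are simply absent from the result rather than set to zero.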

## 🚀 Step 4: Launch the Full Application

Now let's launch the complete Gradio interface. With `share=True`, Gradio creates a temporary public URL that anyone with the link can open:

In [None]:
# Launch the SageMaker Cluster Wizard with public sharing
print("🚀 Launching SageMaker Cluster Recommendation Wizard...")
print("📡 Generating publicly shareable URL...")
print("🔗 You'll be able to share this URL with colleagues!")

# Launch with public sharing enabled
interface = launch_sagemaker_wizard(share=True, debug=False)

print("\n✅ Application launched successfully!")
print("📱 Access the application using the URLs displayed above")
print("🌐 The public URL can be shared with others")

## 🎯 Step 5: Quick Launch (Alternative)

For convenience, you can also use the quick launch function:

In [None]:
# Alternative: Quick launch with default settings
# Uncomment the line below if you prefer this method

# quick_launch()

## 📋 Example SageMaker Workloads

Here are some example workload descriptions you can try in the application:

In [None]:
# Example workloads for testing
examples = [
    "SageMaker training job for computer vision model on 15TB image dataset. Need 128 CPU cores, 512GB RAM, 8 GPUs, distributed training for 4 data scientists.",
    
    "SageMaker processing job for real-time fraud detection. Processing 2TB transaction data daily, sub-50ms inference latency, 2000 concurrent API calls.",
    
    "SageMaker batch transform job for NLP model inference on 50TB text data. Need 64 CPU cores, 256GB RAM, process 1M documents per hour.",
    
    "SageMaker hyperparameter tuning job testing 100 model configurations. Need 32 CPU cores per job, 128GB RAM, GPU acceleration, parallel execution.",
    
    "SageMaker multi-model endpoint serving 20 models simultaneously. Need 16 CPU cores, 64GB RAM, auto-scaling for variable traffic."
]

print("📋 Example SageMaker Workloads:")
for i, example in enumerate(examples, 1):
    print(f"\n{i}. {example}")

print("\n💡 Copy any of these examples into the application to see recommendations!")

## 🔧 Advanced Configuration

You can customize the wizard behavior for specific needs:

In [None]:
# Advanced configuration options

# 1. Launch without public sharing (local only)
def launch_local_only():
    return launch_sagemaker_wizard(share=False, debug=False)

# 2. Launch with debug mode enabled
def launch_debug_mode():
    return launch_sagemaker_wizard(share=True, debug=True)

# 3. Test individual components
def test_components():
    wizard = SageMakerClusterWizard()
    
    # Test requirement extraction
    test_text = "Need 64 CPU cores and 256GB RAM for ML training"
    requirements = wizard.extract_requirements(test_text)
    print(f"Extracted requirements: {requirements}")
    
    # Test similarity matching
    similar = wizard.find_similar_workloads(test_text, top_k=3)
    print(f"Found {len(similar)} similar workloads")
    
    return wizard

print("🔧 Advanced configuration functions defined")
print("   - launch_local_only(): Launch without public sharing")
print("   - launch_debug_mode(): Launch with debug enabled")
print("   - test_components(): Test individual wizard components")
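
The notebook describes similarity matching as TF-IDF vectorization with cosine similarity. The wizard's own `find_similar_workloads` is not shown here and presumably relies on scikit-learn; the sketch below is a dependency-free stand-in that illustrates the idea. All names are hypothetical, and unlike a fit-then-transform pipeline it includes the query when computing document frequencies:

```python
import math
import re
from collections import Counter


def tokenize(text):
    return re.findall(r"[a-z0-9]+", text.lower())


def tfidf_vectors(documents):
    """Return one {term: weight} dict per document, L2-normalized."""
    tokenized = [tokenize(doc) for doc in documents]
    n_docs = len(tokenized)
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        # Smoothed IDF: terms present in every document keep a small weight
        weights = {
            term: tf * (math.log((1 + n_docs) / (1 + doc_freq[term])) + 1)
            for term, tf in counts.items()
        }
        norm = math.sqrt(sum(w * w for w in weights.values())) or 1.0
        vectors.append({term: w / norm for term, w in weights.items()})
    return vectors


def cosine(a, b):
    # Vectors are unit length, so the dot product is the cosine similarity
    return sum(weight * b.get(term, 0.0) for term, weight in a.items())


def find_similar_sketch(query, corpus, top_k=3):
    vectors = tfidf_vectors(corpus + [query])
    query_vec = vectors[-1]
    scored = [(cosine(query_vec, vec), doc) for vec, doc in zip(vectors, corpus)]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)[:top_k]


corpus = [
    "GPU training job for image classification with large RAM",
    "Batch transform for NLP text inference",
    "Real-time fraud detection endpoint with low latency",
]
query = "Need 64 CPU cores and 256GB RAM for ML training"
for score, doc in find_similar_sketch(query, corpus, top_k=2):
    print(f"{score:.3f}  {doc}")
```

The first corpus entry ranks highest because it shares the most terms with the query ("training", "RAM", "for"); plain TF-IDF matches on exact tokens only, so "256GB" and "RAM" are treated as unrelated terms.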

## 🛡️ SageMaker IAM Permissions

For full LLM analysis, your SageMaker execution role needs Bedrock permissions. The cell below checks the Bedrock setup and lists the required permissions if the check fails:

In [None]:
# Check current AWS credentials and permissions
import boto3

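try:
    # Creating a client makes no AWS call, so on its own it cannot confirm permissions
    bedrock = boto3.client('bedrock-runtime', region_name='us-east-1')
    print("✅ Bedrock client initialized successfully")

    # Listing models is a real API call and requires bedrock:ListFoundationModels
    # (bedrock:InvokeModel is not exercised by this check)
    boto3.client('bedrock', region_name='us-east-1').list_foundation_models()
    print("🔑 SageMaker execution role can list Bedrock foundation models")
    
except Exception as e:
    print(f"⚠️  Bedrock access limited: {e}")
    print("\n📋 Required IAM permissions for full functionality:")
    print("   - bedrock:InvokeModel")
    print("   - bedrock:ListFoundationModels")
    print("\n🔧 Add these to your SageMaker execution role for LLM analysis")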

# Check current region (note: the Bedrock client above is pinned to us-east-1)
session = boto3.Session()
current_region = session.region_name or "not configured"
print(f"\n🌍 Current AWS region: {current_region}")
print("💡 Ensure Claude 3.7 Sonnet is available in the region used for Bedrock")
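
The two permissions named in the cell above could be granted with an IAM policy along the following lines. This is a sketch, not a vetted policy: the `Sid` is an invented label, and `"Resource": "*"` is an assumption made for brevity that you may want to narrow for `bedrock:InvokeModel`:

```python
import json

# Assumption: a single broad statement; scope "Resource" down where possible.
bedrock_policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "WizardBedrockAccess",  # hypothetical label
            "Effect": "Allow",
            "Action": [
                "bedrock:InvokeModel",
                "bedrock:ListFoundationModels",
            ],
            "Resource": "*",
        }
    ],
}

# Print as JSON, ready to paste into the IAM console's policy editor
print(json.dumps(bedrock_policy, indent=2))
```

The printed JSON can be attached to the execution role as an inline policy.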

## 📊 Dataset Information

Information about the synthetic dataset used for recommendations:

In [None]:
# Display dataset information
wizard = SageMakerClusterWizard()

print("📊 Dataset Statistics:")
print(f"   Total samples: {len(wizard.dataset)}")
print(f"   Workload types: {wizard.dataset['workload_type'].nunique()}")
print(f"   Cluster sizes: {wizard.dataset['recommended_cluster_size'].nunique()}")

print("\n🏷️  Workload Type Distribution:")
print(wizard.dataset['workload_type'].value_counts())

print("\n⚙️  Cluster Size Distribution:")
print(wizard.dataset['recommended_cluster_size'].value_counts())

print("\n💰 Cost Range:")
print(f"   Min: ${wizard.dataset['estimated_cost_per_hour'].min():.2f}/hour")
print(f"   Max: ${wizard.dataset['estimated_cost_per_hour'].max():.2f}/hour")
print(f"   Avg: ${wizard.dataset['estimated_cost_per_hour'].mean():.2f}/hour")

## 🔄 Restart/Reset

If you need to restart the application or clear any issues:

In [None]:
# Restart/reset functions

def restart_wizard():
    """Restart the wizard with fresh data"""
    global wizard
    wizard = SageMakerClusterWizard()
    print("✅ Wizard restarted successfully")

def clear_gradio_cache():
    """Clear Gradio temporary files"""
    import os
    import shutil
    import tempfile
    
    temp_dir = tempfile.gettempdir()
    gradio_dirs = [d for d in os.listdir(temp_dir) if d.startswith('gradio')]
    
    for gradio_dir in gradio_dirs:
        try:
            shutil.rmtree(os.path.join(temp_dir, gradio_dir))
            print(f"✅ Cleared {gradio_dir}")
        except OSError:
            # Skip entries that are plain files, already gone, or not removable
            pass
    
    print("🧹 Gradio cache cleared")

print("🔄 Restart functions available:")
print("   - restart_wizard(): Reinitialize the wizard")
print("   - clear_gradio_cache(): Clear temporary files")

## 📞 Support & Troubleshooting

Run the diagnostic checks below to narrow down common issues (missing packages, AWS credentials, network connectivity):

In [None]:
# Troubleshooting helper
def run_diagnostics():
    """Run diagnostic checks"""
    print("🔍 Running SageMaker Cluster Wizard Diagnostics...")
    print("=" * 50)
    
    # Check Python version (imported here so the cell also runs in a fresh kernel)
    import sys
    print(f"🐍 Python version: {sys.version}")
    
    # Check required packages
    packages = ['gradio', 'pandas', 'numpy', 'sklearn', 'boto3']
    for pkg in packages:
        try:
            __import__(pkg)
            print(f"✅ {pkg}")
        except ImportError:
            print(f"❌ {pkg} - not installed")
    
    # Check AWS credentials
    try:
        import boto3  # imported here so a missing package is reported clearly
        session = boto3.Session()
        credentials = session.get_credentials()
        if credentials:
            print("✅ AWS credentials available")
        else:
            print("⚠️  AWS credentials not found")
    except Exception as e:
        print(f"❌ AWS credentials error: {e}")
    
    # Check network connectivity
    try:
        import urllib.request
        urllib.request.urlopen('https://www.google.com', timeout=5)
        print("✅ Internet connectivity")
    except OSError:  # covers URLError, timeouts, and TLS failures
        print("❌ No internet connectivity")
    
    print("\n🎯 Diagnostics complete!")

# Run diagnostics
run_diagnostics()

---

## 🎉 Ready to Use!

Your SageMaker Cluster Recommendation Wizard is now ready. The application provides:

- **🎯 Intelligent Recommendations** - Based on 300+ workload examples
- **🤖 LLM Analysis** - Powered by Claude 3.7 Sonnet via Bedrock
- **📊 Similarity Matching** - TF-IDF vectorization with cosine similarity
- **💰 Cost Estimation** - Hourly and daily pricing estimates
- **🌐 Public Sharing** - Shareable URLs for collaboration
- **⚙️ SageMaker Integration** - Optimized for SageMaker workloads

**Next Steps:**
1. Use the application via the URLs provided above
2. Try the example workloads to see recommendations
3. Share the public URL with your team
4. Customize for your specific use cases

**Happy cluster sizing! 🚀**