[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/ContextLab/clustrix/blob/master/docs/ssh_key_automation_tutorial.ipynb)

# 🔑 SSH Key Automation Tutorial

**Transform 15-30 minutes of manual SSH setup into 15 seconds of automated bliss!**

This tutorial demonstrates how to use Clustrix's automated SSH key setup feature to quickly and securely connect to remote clusters.

## 🚀 What You'll Learn

- ⚡ **Speed**: Setup in <15 seconds vs 15-30 minutes manually
- 🔒 **Security**: Ed25519 keys with proper permissions
- 🎯 **Simplicity**: One-click setup in Jupyter, single CLI command
- 🧹 **Cleanup**: Automatic removal of conflicting old keys
- 🔄 **Rotation**: Force refresh to generate new keys
- 🌐 **Cross-platform**: Works on Windows, macOS, Linux

## 📦 Installation

First, let's install Clustrix. If it isn't installed yet (for example, in a fresh Google Colab runtime), uncomment the `pip install` line in the cell below.

In [None]:
# Install Clustrix (uncomment if not already installed)
# !pip install clustrix

# Import Clustrix - the widget will appear automatically!
import clustrix

print("✅ Clustrix imported successfully!")
print("📱 Look for the interactive widget that appeared above or below this cell.")
print("🔑 Find the 'SSH Key Setup' section in the widget interface.")

## 🎯 Method 1: Interactive Widget (Recommended)

**This is the easiest method!** When you imported Clustrix above, an interactive widget should have appeared. Look for the **"SSH Key Setup"** section.

### 📋 Widget Steps:
1. **Enter your cluster hostname** (e.g., `cluster.university.edu`)
2. **Enter your username**
3. **Enter your password** (will be securely handled)
4. **Optional**: Check "Force refresh SSH keys" to generate new keys
5. **Click "Setup SSH Keys"**

The widget will show real-time progress and success/error messages.

### 💡 Colab Secret Storage (Recommended)
Instead of entering your password each time, store it securely in Colab secrets:

1. Click the **key icon** (🔑) in the Colab sidebar
2. Add a secret with key: `CLUSTER_PASSWORD_HOSTNAME` 
   - Example: `CLUSTER_PASSWORD_CLUSTER_UNIVERSITY_EDU`
3. Clustrix will automatically retrieve it!
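
Judging from the example above, the secret name appears to be the hostname upper-cased with dots replaced by underscores. A minimal sketch of that mapping follows; `secret_name_for_host` is a hypothetical helper, not part of Clustrix, and how Clustrix treats hyphens or other characters is not stated here.

```python
def secret_name_for_host(hostname: str, prefix: str = "CLUSTER_PASSWORD") -> str:
    """Build a secret name such as CLUSTER_PASSWORD_CLUSTER_UNIVERSITY_EDU."""
    # Assumption: upper-case the hostname and replace dots with underscores,
    # mirroring the single example given in this tutorial.
    normalized = hostname.strip().upper().replace(".", "_")
    return f"{prefix}_{normalized}"


print(secret_name_for_host("cluster.university.edu"))
# CLUSTER_PASSWORD_CLUSTER_UNIVERSITY_EDU
```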

---

## 🖥️ Method 2: Python API

For programmatic access or when you want more control:

In [None]:
from clustrix import setup_ssh_keys_with_fallback
from clustrix.config import ClusterConfig

# 🔧 Configure your cluster details
# Replace these with your actual cluster information
config = ClusterConfig(
    cluster_type="slurm",  # or "pbs", "sge", "ssh", etc.
    cluster_host="cluster.university.edu",  # Your cluster hostname
    username="your_username"  # Your cluster username
)

print("✅ Cluster configuration created!")
print(f"🎯 Target: {config.cluster_host}")
print(f"👤 User: {config.username}")
print("\n📝 Ready to run SSH key setup. Execute the next cell when ready.")

### 🔐 Password Handling Options

You have several options for providing your password:

In [None]:
# Option 1: Let Clustrix automatically handle password retrieval
# This will check environment variables, Colab secrets, and prompt if needed
print("🔄 Running SSH key setup with automatic password fallback...")

result = setup_ssh_keys_with_fallback(
    config=config,
    # password="your_password",  # Uncomment and fill if you want to provide directly
    cluster_alias="my_cluster",  # Creates SSH config alias for easy access
    key_type="ed25519",  # Modern, secure key type (recommended)
    force_refresh=False,  # Set True to generate new keys
)

# 📊 Display results
print("\n" + "="*50)
print("📋 SSH KEY SETUP RESULTS")
print("="*50)

if result["success"]:
    print("✅ SUCCESS! SSH key setup completed")
    print(f"🔑 Key path: {result['key_path']}")
    print(f"📦 Key already existed: {result['key_already_existed']}")
    print(f"🚀 Key deployed: {result['key_deployed']}")
    print(f"🔗 Connection tested: {result['connection_tested']}")
    
    # Use .get() so a missing "details" entry or a False flag is handled correctly
    details = result.get("details", {})
    if "message" in details:
        print(f"💬 Message: {details['message']}")
    if details.get("ssh_config_updated"):
        print("⚙️ SSH config updated with alias")
            
    print("\n🎉 You can now connect to your cluster with:")
    if details.get("ssh_config_updated"):
        print("   ssh my_cluster")
    else:
        print(f"   ssh {config.username}@{config.cluster_host}")
        
else:
    print("❌ FAILED! SSH key setup encountered an error")
    print(f"🔍 Error: {result.get('error', 'Unknown error')}")
    
    # 🩺 Show details for troubleshooting
    if "details" in result:
        print("\n🔧 Troubleshooting details:")
        for key, value in result["details"].items():
            print(f"   {key}: {value}")
            
print("\n" + "="*50)
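
For orientation, an SSH config alias corresponds to a `Host` stanza in `~/.ssh/config`. The sketch below only formats such a stanza as text; the exact entry Clustrix writes is not shown in this tutorial, so `ssh_config_entry` and the key file name are illustrative assumptions. The keywords themselves (`Host`, `HostName`, `User`, `IdentityFile`, `IdentitiesOnly`) are standard OpenSSH client options.

```python
def ssh_config_entry(alias: str, host: str, user: str, key_path: str) -> str:
    """Format an OpenSSH client config stanza for one host alias."""
    return "\n".join([
        f"Host {alias}",
        f"    HostName {host}",
        f"    User {user}",
        f"    IdentityFile {key_path}",
        "    IdentitiesOnly yes",  # offer only the key listed above
    ])


print(ssh_config_entry(
    alias="my_cluster",
    host="cluster.university.edu",
    user="your_username",
    key_path="~/.ssh/id_ed25519_example",  # hypothetical key file name
))
```

With a stanza like this in place, `ssh my_cluster` resolves to the full hostname, username, and key.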

## 🛠️ Advanced Features Demo

Let's explore some advanced features of the SSH key automation system:

### 🔄 Key Rotation (Force Refresh)

Sometimes you need to generate fresh keys (e.g., for security rotation or fixing conflicts):

In [None]:
# 🔄 Force generation of new SSH keys
print("🔄 Demonstrating key rotation with force refresh...")

# This will remove old keys and generate fresh ones
refresh_result = setup_ssh_keys_with_fallback(
    config=config,
    cluster_alias="my_cluster_fresh",
    force_refresh=True,  # 🔥 This forces new key generation
    key_type="ed25519"
)

if refresh_result["success"]:
    print("✅ Fresh SSH keys generated and deployed!")
    print(f"🆕 New key path: {refresh_result['key_path']}")
    if refresh_result.get("details", {}).get("key_generated"):
        print("🔨 Confirmed: New key was generated (not reused)")
else:
    print(f"❌ Key refresh failed: {refresh_result.get('error')}")
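
For comparison, generating a fresh Ed25519 key by hand uses `ssh-keygen`. The sketch below builds that command without running it; it is a manual equivalent under stated assumptions, not Clustrix's internal implementation, and the file name and comment are placeholders.

```python
import shlex
from pathlib import Path


def build_keygen_command(key_path: Path, comment: str) -> list[str]:
    """Return an ssh-keygen argument list for a new, passphrase-less Ed25519 key."""
    return [
        "ssh-keygen",
        "-t", "ed25519",      # key type
        "-f", str(key_path),  # private key file; the public key goes to <key_path>.pub
        "-N", "",             # empty passphrase
        "-C", comment,        # comment stored with the public key
    ]


# Hypothetical file name and comment, for illustration only
cmd = build_keygen_command(Path.home() / ".ssh" / "id_ed25519_example", "example-key")
print(shlex.join(cmd))
# To actually create the key: subprocess.run(cmd, check=True)
```

Note that `ssh-keygen` asks for confirmation before overwriting an existing key file, so a manual rotation typically removes or renames the old key first.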

### 🔍 SSH Key Discovery

Let's explore what SSH keys Clustrix can find on your system:

In [None]:
from clustrix import find_ssh_keys, list_ssh_keys

print("🔍 Discovering SSH keys on your system...")
print()

# Find all SSH keys
keys = find_ssh_keys()
print(f"🔑 Found {len(keys)} SSH private keys:")
for i, key in enumerate(keys, 1):
    print(f"   {i}. {key}")

print()

# Get detailed information about SSH keys
key_info = list_ssh_keys()
print("📋 Detailed SSH key information:")
print()

for info in key_info:
    if info["exists"]:
        print(f"🔑 Key: {info['path']}")
        print(f"   Type: {info.get('type', 'Unknown')}")
        print(f"   Size: {info.get('bit_size', 'Unknown')} bits")
        print(f"   Fingerprint: {info.get('fingerprint', 'Unknown')[:50]}...")
        if info.get('comment'):
            print(f"   Comment: {info['comment'][:60]}...")
        print()

if not key_info:
    print("ℹ️ No SSH keys found. The setup process will generate new ones!")
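
To give a rough idea of what key discovery involves, the sketch below treats any file that has a matching `.pub` file next to it as a private key. This is a simplified heuristic with a hypothetical helper name; `find_ssh_keys()` may use different rules, and private keys without a public counterpart would be missed here.

```python
from pathlib import Path


def find_private_keys(ssh_dir: Path) -> list[Path]:
    """Return files in ssh_dir that have a matching '<name>.pub' next to them."""
    if not ssh_dir.is_dir():
        return []
    return sorted(
        path for path in ssh_dir.iterdir()
        if path.is_file()
        and path.suffix != ".pub"
        and path.with_name(path.name + ".pub").exists()
    )


for key in find_private_keys(Path.home() / ".ssh"):
    print(key)
```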

## 🏢 Enterprise Cluster Support

Many university and enterprise clusters use **Kerberos authentication**. Here's how Clustrix handles this:

In [None]:
# 🏫 Example: University cluster with Kerberos authentication
print("🏢 Demonstrating enterprise cluster setup...")
print()

# Many university clusters use Kerberos (like Dartmouth's Discovery cluster)
university_config = ClusterConfig(
    cluster_type="slurm",
    cluster_host="ndoli.dartmouth.edu",  # Example Kerberos cluster
    username="your_netid"
)

print("📝 For Kerberos clusters, SSH key deployment will succeed,")
print("   but authentication requires Kerberos tickets:")
print()
print("   # Setup Kerberos ticket")
print("   kinit your_netid@UNIVERSITY.EDU")
print()
print("   # Now SSH will work")
print("   ssh your_netid@ndoli.dartmouth.edu")
print()
print("💡 Clustrix gracefully handles this and provides clear guidance!")

# The setup will deploy keys successfully but note the authentication method
# Uncomment to test with a real Kerberos cluster:
# kerberos_result = setup_ssh_keys_with_fallback(university_config)
# print(f"Result: {kerberos_result}")
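
Before connecting to a Kerberos cluster, it can help to check whether a valid ticket is already present. The sketch below assumes the MIT Kerberos client tools, where `klist -s` prints nothing and reports the result through its exit status; `has_kerberos_ticket` is a hypothetical helper and not part of Clustrix.

```python
import subprocess


def has_kerberos_ticket() -> bool:
    """Return True if `klist -s` reports a readable, unexpired ticket cache."""
    try:
        completed = subprocess.run(
            ["klist", "-s"],
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            check=False,
        )
    except FileNotFoundError:
        # Kerberos client tools are not installed on this machine
        return False
    return completed.returncode == 0


if has_kerberos_ticket():
    print("Kerberos ticket found")
else:
    print("No valid Kerberos ticket; run kinit first")
```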

## 🚨 Troubleshooting Common Issues

Let's demonstrate how to handle common SSH key setup challenges:

In [None]:
def troubleshoot_ssh_setup(result):
    """Helper function to provide troubleshooting guidance"""
    
    print("🔧 TROUBLESHOOTING GUIDE")
    print("="*40)
    
    if not result["success"]:
        error = result.get("error", "")
        
        if "cluster_host must be specified" in error:
            print("❌ Missing hostname")
            print("💡 Solution: Provide cluster_host in your configuration")
            
        elif "username must be specified" in error:
            print("❌ Missing username")
            print("💡 Solution: Provide username in your configuration")
            
        elif "Connection refused" in error:
            print("❌ Connection refused")
            print("💡 Solutions:")
            print("   - Check if hostname is correct")
            print("   - Verify the cluster is accessible")
            print("   - Try different port (default is 22)")
            
        elif "Permission denied" in error:
            print("❌ Permission denied")
            print("💡 Solutions:")
            print("   - Verify username is correct")
            print("   - Check password is correct")
            print("   - Try force_refresh=True to clean old keys")
            
        else:
            print(f"❌ General error: {error}")
            print("💡 Solutions:")
            print("   - Check network connectivity")
            print("   - Verify cluster allows SSH key authentication")
            print("   - Try manual SSH connection first")
    
    elif not result.get("connection_tested"):
        print("⚠️ Keys deployed but connection test failed")
        print("💡 This is normal for:")
        print("   - Kerberos/GSSAPI clusters (university HPC)")
        print("   - Clusters with key propagation delays")
        print("   - Multi-factor authentication clusters")
        
    else:
        print("✅ SSH key setup successful!")
        print("🎉 No troubleshooting needed.")
    
    print("\n📚 For more help, see:")
    print("   - Clustrix documentation")
    print("   - GitHub issues: https://github.com/ContextLab/clustrix/issues")

# Example: Show troubleshooting for a mock error
mock_error_result = {
    "success": False,
    "error": "Permission denied (publickey,password)",
    "details": {}
}

print("🎭 Example troubleshooting for common error:")
troubleshoot_ssh_setup(mock_error_result)
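
When the error is "Connection refused" or a timeout, a quick check of whether the SSH port is reachable at all can narrow things down before looking at keys or passwords. This is a small standard-library sketch; `is_port_open` is a hypothetical helper, and the hostname in the usage comment is a placeholder.

```python
import socket


def is_port_open(host: str, port: int = 22, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS lookup failures
        return False


# Example (placeholder hostname):
# is_port_open("cluster.university.edu", port=22)
```

A `False` result points to a network, firewall, or hostname problem rather than an authentication problem.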

## 🧪 Environment Detection Demo

Clustrix automatically detects your environment and adapts its behavior:

In [None]:
from clustrix.auth_fallbacks import detect_environment, get_cluster_password

print("🔍 Environment Detection:")
print()

env = detect_environment()
print(f"📱 Current environment: {env}")
print()

env_descriptions = {
    "colab": "🟢 Google Colab - Uses userdata secrets for passwords",
    "notebook": "📓 Jupyter Notebook - Uses GUI popups for passwords", 
    "cli": "💻 Command Line - Uses terminal prompts for passwords",
    "script": "📜 Python Script - Uses input prompts for passwords"
}

print(f"💡 This means: {env_descriptions.get(env, 'Unknown environment')}")
print()

if env == "colab":
    print("🔑 Colab Secret Storage Available!")
    print("   Store your cluster password in Colab secrets with key:")
    print("   CLUSTER_PASSWORD_HOSTNAME or CLUSTER_PASSWORD")
    print()
    
print("🌍 Environment Variables Checked:")
checked_vars = [
    "CLUSTRIX_PASSWORD_*",
    "CLUSTER_PASSWORD_*", 
    "CLUSTRIX_DEFAULT_PASSWORD",
    "CLUSTER_PASSWORD"
]
for var in checked_vars:
    print(f"   - {var}")

print()
print("✨ This automatic detection ensures the best user experience")
print("   for your specific environment!")
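
The sketch below shows one common way to tell these environments apart using only standard heuristics. It is not the implementation of `detect_environment()`; the function name and the exact rules, especially the split between "cli" and "script", are assumptions made for illustration.

```python
import sys


def guess_environment() -> str:
    """Return 'colab', 'notebook', 'cli', or 'script' using common heuristics."""
    # Colab kernels import the google.colab package at startup
    if "google.colab" in sys.modules:
        return "colab"

    try:
        from IPython import get_ipython
        shell = get_ipython()  # None outside an IPython session
    except ImportError:
        shell = None
    # Jupyter kernels use the ZMQ-based interactive shell
    if shell is not None and type(shell).__name__ == "ZMQInteractiveShell":
        return "notebook"

    try:
        # An interactive terminal has a TTY attached to stdin
        if sys.stdin is not None and sys.stdin.isatty():
            return "cli"
    except (AttributeError, ValueError):
        pass
    return "script"


print(guess_environment())
```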

## 🎯 Integration with Clustrix Workflows

Now let's see how SSH key automation integrates with actual cluster computing:

In [None]:
# 🔧 Complete workflow: Setup SSH keys + Run cluster job
def setup_and_run_cluster_job():
    """Demonstrates end-to-end workflow with SSH automation"""
    
    print("🚀 COMPLETE CLUSTRIX WORKFLOW")
    print("="*50)
    print()
    
    # Step 1: Configure cluster
    print("📋 Step 1: Configure cluster connection")
    config = ClusterConfig(
        cluster_type="slurm",  # Adjust for your cluster
        cluster_host="your-cluster.edu",  # Replace with real hostname
        username="your_username",  # Replace with real username
        default_cores=4,
        default_memory="8GB"
    )
    print(f"   ✅ Configured: {config.cluster_host}")
    print()
    
    # Step 2: Set up SSH keys automatically
    print("🔑 Step 2: Set up SSH keys")
    # ssh_result = setup_ssh_keys_with_fallback(config)
    # Commented out to avoid requiring real cluster access
    print("   ✅ SSH keys would be set up here")
    print()
    
    # Step 3: Configure Clustrix with the validated setup
    print("⚙️ Step 3: Apply cluster configuration")
    clustrix.configure(
        cluster_type=config.cluster_type,
        cluster_host=config.cluster_host,
        username=config.username,
        default_cores=config.default_cores,
        default_memory=config.default_memory
    )
    print("   ✅ Clustrix configured for remote execution")
    print()
    
    # Step 4: Define and run cluster function
    print("🔬 Step 4: Define cluster computation")
    
    from clustrix import cluster
    
    @cluster(cores=4, memory="8GB", time="00:10:00")
    def scientific_computation(n_samples=1000):
        """Example scientific computation that benefits from cluster resources"""
        import numpy as np
        import time
        
        # Simulate computational work
        data = np.random.randn(n_samples, n_samples)
        
        # Compute eigenvalues (computationally intensive)
        eigenvalues = np.linalg.eigvals(data)
        
        # Return summary statistics
        return {
            "n_samples": n_samples,
            "mean_eigenvalue": float(np.mean(eigenvalues.real)),
            "max_eigenvalue": float(np.max(eigenvalues.real)),
            "computation_node": "cluster_node",  # Would show actual node
            "timestamp": time.time()
        }
    
    print("   ✅ Defined @cluster decorated function")
    print()
    
    # Step 5: Execute on cluster (would happen here)
    print("🚀 Step 5: Execute on cluster")
    print("   📤 Function would be submitted to cluster queue")
    print("   ⏳ Job would execute on cluster nodes")
    print("   📥 Results would be retrieved automatically")
    # result = scientific_computation(n_samples=500)
    # print(f"   ✅ Result: {result}")
    print()
    
    print("🎉 Workflow complete! SSH automation enabled seamless cluster access.")
    
# Run the demonstration
setup_and_run_cluster_job()

## 🔐 Security Best Practices

Let's review the security features built into Clustrix SSH automation:

In [None]:
print("🔐 SECURITY FEATURES IN CLUSTRIX SSH AUTOMATION")
print("="*60)
print()

security_features = [
    ("🔑", "Ed25519 Keys (Default)", "Modern elliptic-curve signatures with small keys and fast verification"),
    ("🔒", "Proper Permissions", "Private keys: 600, Public keys: 644, SSH dir: 700"),
    ("🧹", "Automatic Cleanup", "Removes conflicting old keys before deployment"),
    ("💾", "Secure Storage", "Keys stored in standard ~/.ssh/ with proper permissions"),
    ("🔄", "Key Rotation", "Force refresh generates new keys for security rotation"),
    ("🚫", "No Plain Text", "Passwords cleared from memory after use"),
    ("📝", "Informative Comments", "Keys tagged with timestamp and generator info"),
    ("🔍", "Connection Testing", "Verifies keys work before reporting success"),
    ("⚡", "SSH Agent Compatible", "Works with SSH agents for additional security"),
    ("🏢", "Enterprise Ready", "Graceful handling of Kerberos and MFA clusters")
]

for icon, feature, description in security_features:
    print(f"{icon} **{feature}**")
    print(f"   {description}")
    print()

print("🛡️ **Key Security Recommendations:**")
print()
recommendations = [
    "Use Ed25519 keys (default) - smaller and faster than RSA at comparable security",
    "Store cluster passwords in environment variables or Colab secrets",
    "Regularly rotate keys using force_refresh=True",
    "Use cluster aliases to avoid exposing hostnames in commands",
    "Keep your local system secure - SSH keys are only as safe as your device",
    "Monitor SSH key usage and remove unused keys periodically"
]

for i, rec in enumerate(recommendations, 1):
    print(f"   {i}. {rec}")

print()
print("✅ Following these practices helps keep your cluster access secure!")
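
The permission values listed above (700 for the SSH directory, 600 for private keys, 644 for public keys) can be checked with the standard library. This sketch applies to POSIX systems only, since Windows does not use these mode bits in the same way; `mode_matches` is a hypothetical helper.

```python
import stat
from pathlib import Path


def mode_matches(path: Path, expected: int) -> bool:
    """Return True if the permission bits of path equal expected (e.g. 0o600)."""
    return stat.S_IMODE(path.stat().st_mode) == expected


ssh_dir = Path.home() / ".ssh"
if ssh_dir.is_dir():
    print(f"{ssh_dir}: 700 -> {mode_matches(ssh_dir, 0o700)}")
    for pub in sorted(ssh_dir.glob("*.pub")):
        private = pub.with_suffix("")  # id_ed25519.pub -> id_ed25519
        if private.exists():
            print(f"{private.name}: 600 -> {mode_matches(private, 0o600)}")
        print(f"{pub.name}: 644 -> {mode_matches(pub, 0o644)}")
```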

## 🎓 Summary and Next Steps

Congratulations! You've learned how to use Clustrix's SSH key automation feature. Here's what we covered:

In [None]:
print("🎓 TUTORIAL SUMMARY")
print("="*50)
print()

topics_covered = [
    "✅ Interactive widget SSH setup (recommended method)",
    "✅ Python API for programmatic access", 
    "✅ Advanced features: key rotation, discovery, aliases",
    "✅ Enterprise cluster support (Kerberos/GSSAPI)",
    "✅ Troubleshooting common issues",
    "✅ Environment detection and password fallbacks",
    "✅ Integration with cluster computing workflows",
    "✅ Security best practices and features"
]

print("📚 **What You Learned:**")
for topic in topics_covered:
    print(f"   {topic}")

print()
print("🚀 **Next Steps:**")
next_steps = [
    "Set up SSH keys for your actual cluster",
    "Configure Clustrix for your cluster environment", 
    "Start using @cluster decorator for your computations",
    "Explore cloud provider integrations (AWS, GCP, Azure)",
    "Try advanced features like cost monitoring",
    "Check out filesystem utilities for data management"
]

for i, step in enumerate(next_steps, 1):
    print(f"   {i}. {step}")

print()
print("📖 **Additional Resources:**")
resources = [
    "Clustrix Documentation: https://clustrix.readthedocs.io",
    "GitHub Repository: https://github.com/ContextLab/clustrix",
    "Issue Tracker: https://github.com/ContextLab/clustrix/issues",
    "SSH Key Technical Design: docs/ssh_key_automation_technical_design.md"
]

for resource in resources:
    print(f"   • {resource}")

print()
print("🎉 **Happy cluster computing with automated SSH setup!** 🚀")
print()
print("💡 Remember: 15 seconds of automation beats 15-30 minutes of manual setup!")

## 🆘 Support and Feedback

If you encounter any issues or have suggestions for improvement:

- 🐛 **Report bugs**: [GitHub Issues](https://github.com/ContextLab/clustrix/issues)
- 💡 **Request features**: [GitHub Discussions](https://github.com/ContextLab/clustrix/discussions)
- 📚 **Documentation**: [Read the Docs](https://clustrix.readthedocs.io)
- 🔑 **SSH Key Issues**: [Issue #57](https://github.com/ContextLab/clustrix/issues/57)

---

**✨ Thank you for using Clustrix! We hope this SSH automation feature makes your cluster computing experience much smoother. ✨**