# 🚀 Self-Healing Platform - Environment Setup

**Phase**: Setup (00)  
**Objective**: Verify and configure the workbench environment  
**Time**: 5-10 minutes  
**Status**: ✅ Ready

## Overview

This notebook verifies that your Red Hat OpenShift Data Science (RHODS) workbench is properly configured to run all 30 notebooks in the Self-Healing Platform.

### What This Notebook Does

1. ✅ Verifies Python and PyTorch installation
2. ✅ Checks GPU availability
3. ✅ Verifies persistent storage volumes
4. ✅ Tests required dependencies
5. ✅ Creates necessary directories
6. ✅ Generates setup summary report

### Prerequisites

- You're in the RHODS workbench
- Repository is cloned to `/opt/app-root/src/openshift-aiops-platform/`
- You have terminal access
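
The repository prerequisite above can also be checked from a cell. This is a minimal sketch, not part of the original notebook: `repo_is_cloned` is a hypothetical helper, and it assumes the clone contains a top-level `notebooks/` directory.

```python
from pathlib import Path

# Clone location stated in the prerequisites.
REPO_PATH = Path("/opt/app-root/src/openshift-aiops-platform")


def repo_is_cloned(repo_path=REPO_PATH):
    """Return True if repo_path contains a notebooks/ directory.

    Assumption: the repository keeps its notebooks under notebooks/.
    """
    return (Path(repo_path) / "notebooks").is_dir()


print(f"Repository found: {repo_is_cloned()}")
```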

---

## Step 1: Verify Python and PyTorch Installation

In [None]:
import sys
import os

print("=" * 80)
print("PYTHON & PYTORCH VERIFICATION")
print("=" * 80)

# Python version
print(f"\n✓ Python Version: {sys.version}")
print(f"✓ Python Executable: {sys.executable}")

# PyTorch version
import torch
print(f"\n✓ PyTorch Version: {torch.__version__}")
print(f"✓ PyTorch Location: {torch.__file__}")

## Step 2: Check GPU Availability

In [None]:
print("\n" + "=" * 80)
print("GPU AVAILABILITY CHECK")
print("=" * 80)

cuda_available = torch.cuda.is_available()
print(f"\n✓ CUDA Available: {cuda_available}")

if cuda_available:
    device_count = torch.cuda.device_count()
    print(f"✓ GPU Device Count: {device_count}")
    
    for i in range(device_count):
        print(f"  - GPU {i}: {torch.cuda.get_device_name(i)}")
        print(f"    Memory: {torch.cuda.get_device_properties(i).total_memory / 1e9:.2f} GB")
    
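    # Test GPU with a small matrix multiplication
    x = torch.randn(1000, 1000, device="cuda")
    y = torch.randn(1000, 1000, device="cuda")
    z = torch.matmul(x, y)
    torch.cuda.synchronize()  # wait for the kernel so any failure surfaces here
    print("\n✓ GPU Test: PASSED (matrix multiplication successful)")
else:
    print("\n⚠️  GPU Not Available - Will use CPU")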
    print("   Note: Phase 2 LSTM notebook requires GPU")
    print("   You can still run other notebooks on CPU")
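
Later notebooks that should run on either GPU or CPU can select a device once and reuse it. The sketch below is illustrative only: `pick_device` is a hypothetical helper name, and the fallback to `"cpu"` when PyTorch is not importable is an assumption made so that the cell never fails.

```python
def pick_device():
    """Return "cuda" when PyTorch reports a usable GPU, otherwise "cpu"."""
    try:
        import torch
    except ImportError:
        # PyTorch is not importable: fall back to CPU instead of raising.
        return "cpu"
    return "cuda" if torch.cuda.is_available() else "cpu"


device = pick_device()
print(f"Selected device: {device}")
```

With PyTorch installed, tensors can then be created with `torch.randn(1000, 1000, device=device)` rather than an unconditional `.cuda()` call.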

## Step 3: Verify Persistent Storage Volumes

In [None]:
import os
import shutil

print("\n" + "=" * 80)
print("PERSISTENT STORAGE VERIFICATION")
print("=" * 80)

# Check data volume
data_path = '/opt/app-root/src/data'
models_path = '/opt/app-root/src/models'

print(f"\n✓ Data Volume: {data_path}")
if os.path.exists(data_path):
    print(f"  Status: EXISTS")
    stat = shutil.disk_usage(data_path)
    print(f"  Total: {stat.total / 1e9:.2f} GB")
    print(f"  Used: {stat.used / 1e9:.2f} GB")
    print(f"  Free: {stat.free / 1e9:.2f} GB")
else:
    print(f"  Status: NOT FOUND")

print(f"\n✓ Models Volume: {models_path}")
if os.path.exists(models_path):
    print(f"  Status: EXISTS")
    stat = shutil.disk_usage(models_path)
    print(f"  Total: {stat.total / 1e9:.2f} GB")
    print(f"  Used: {stat.used / 1e9:.2f} GB")
    print(f"  Free: {stat.free / 1e9:.2f} GB")
else:
    print(f"  Status: NOT FOUND")
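
The two volume checks above repeat the same logic, so they could be folded into one helper that also flags low free space. This is a sketch under stated assumptions: `check_volume` is a hypothetical name and the `min_free_gb` threshold of 1.0 is an arbitrary illustrative value, not a platform requirement.

```python
import os
import shutil


def check_volume(path, min_free_gb=1.0):
    """Summarize a mount point; min_free_gb is an illustrative threshold."""
    if not os.path.isdir(path):
        return {"path": path, "exists": False, "free_gb": 0.0, "ok": False}
    usage = shutil.disk_usage(path)
    free_gb = usage.free / 1e9
    return {
        "path": path,
        "exists": True,
        "free_gb": round(free_gb, 2),
        "ok": free_gb >= min_free_gb,
    }


for volume in ("/opt/app-root/src/data", "/opt/app-root/src/models"):
    print(check_volume(volume))
```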

## Step 4: Test Required Dependencies

### Optional: Install Missing Dependencies

Run the cell below to install any missing packages. This is optional but recommended.

In [None]:
import subprocess
import sys

print("\n" + "=" * 80)
print("CHECKING/INSTALLING DEPENDENCIES")
print("=" * 80)

# Packages that may need to be installed
packages_to_check = {
    'statsmodels': 'statsmodels',
    'prophet': 'prophet',
    'pyod': 'pyod',
    'xgboost': 'xgboost',
    'lightgbm': 'lightgbm',
    'kserve': 'kserve',
    'seaborn': 'seaborn',
    'yaml': 'pyyaml',  # import name differs from pip name
}

# Check which packages are missing
missing_packages = []
for import_name, pip_name in packages_to_check.items():
    try:
        __import__(import_name)
        print(f"✓ {pip_name} already installed")
    except ImportError:
        missing_packages.append(pip_name)
        print(f"✗ {pip_name} not found")

if missing_packages:
    print(f"\nInstalling missing packages: {', '.join(missing_packages)}")
    print("This may take a few minutes...\n")
    try:
        subprocess.check_call([sys.executable, '-m', 'pip', 'install', '--quiet'] + missing_packages)
        print("\n✅ Installation complete!")
    except subprocess.CalledProcessError as e:
        print(f"\n⚠️  pip install failed (exit code {e.returncode})")
        print("   This is expected in read-only container environments.")
        print("   If running in validation mode, packages should be pre-installed in the image.")
else:
    print("\n✅ All dependencies already installed - skipping pip install")
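
The check above imports each package, which can be slow for heavy libraries. An alternative sketch uses `importlib.util.find_spec` to locate a top-level module without importing it. `find_missing` is a hypothetical helper; it assumes every key is a top-level import name, as in `packages_to_check`.

```python
import importlib.util


def find_missing(packages):
    """Return pip names whose top-level import name cannot be located."""
    missing = []
    for import_name, pip_name in packages.items():
        # find_spec returns None for an absent top-level module
        # and does not execute the module's code.
        if importlib.util.find_spec(import_name) is None:
            missing.append(pip_name)
    return missing


print(find_missing({"json": "json", "yaml": "pyyaml"}))
```

A located spec only shows that the module can be found; a package with a broken native extension would still fail at import time, so the import-based check remains the stricter test.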

In [None]:
print("\n" + "=" * 80)
print("DEPENDENCY VERIFICATION")
print("=" * 80)

# Core dependencies (required for all notebooks)
core_dependencies = {
    'numpy': 'NumPy',
    'pandas': 'Pandas',
    'sklearn': 'Scikit-learn',
    'statsmodels': 'Statsmodels',
    'prophet': 'Prophet',
    'pyod': 'PyOD',
    'xgboost': 'XGBoost',
    'lightgbm': 'LightGBM',
    'prometheus_client': 'Prometheus Client',
    'matplotlib': 'Matplotlib',
    'seaborn': 'Seaborn',
    'plotly': 'Plotly',
    'requests': 'Requests',
    'yaml': 'PyYAML',  # Note: import as 'yaml', not 'pyyaml'
}

# Optional dependencies (nice to have, but not critical)
optional_dependencies = {
    'kubernetes': 'Kubernetes',  # Has dependency conflicts, optional
    'kserve': 'KServe',  # Has dependency conflicts, optional
}

missing = []
installed = []
optional_missing = []

print("\nCore Dependencies:")
for module, name in core_dependencies.items():
    try:
        __import__(module)
        installed.append(name)
        print(f"✓ {name}")
    except ImportError:
        missing.append(name)
        print(f"✗ {name} - NOT INSTALLED")

print("\nOptional Dependencies:")
for module, name in optional_dependencies.items():
    try:
        __import__(module)
        print(f"✓ {name}")
    except ImportError:
        optional_missing.append(name)
        print(f"⚠️  {name} - NOT INSTALLED (optional)")

print(f"\n" + "=" * 80)
print(f"Summary: {len(installed)}/{len(core_dependencies)} core dependencies installed")
print(f"=" * 80)


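if missing:
    print(f"\n❌ Missing core dependencies: {', '.join(missing)}")
    # Display names become pip names once lowercased and hyphenated
    # (e.g. 'Prometheus Client' -> 'prometheus-client').
    pip_names = [name.lower().replace(' ', '-') for name in missing]
    print("\nTo install missing packages, run in terminal:")
    print(f"pip install --user {' '.join(pip_names)}")
else:
    print("\n✅ All core dependencies installed!")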

if optional_missing:
    print(f"\n⚠️  Optional dependencies not installed: {', '.join(optional_missing)}")
    print(f"   These are optional and may have dependency conflicts.")
    print(f"   You can still run all notebooks without them.")
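
Beyond presence, it can help to record which versions are installed. The sketch below uses the standard-library `importlib.metadata` module, which looks packages up by distribution (pip) name rather than import name. `installed_version` is a hypothetical helper and the three names in the loop are only examples.

```python
from importlib import metadata


def installed_version(dist_name):
    """Return the installed version string, or None if the distribution is absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


# Distribution names can differ from import names (sklearn vs scikit-learn).
for dist in ("numpy", "scikit-learn", "PyYAML"):
    print(f"{dist}: {installed_version(dist) or 'not installed'}")
```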

## Step 5: Create Necessary Directories

In [None]:
import os

print("\n" + "=" * 80)
print("CREATING NECESSARY DIRECTORIES")
print("=" * 80)

directories = [
    '/opt/app-root/src/data/processed',
    '/opt/app-root/src/data/training',
    '/opt/app-root/src/data/reports',
    '/opt/app-root/src/models/anomaly-detection',
    '/opt/app-root/src/models/serving',
    '/opt/app-root/src/models/checkpoints',
]

for directory in directories:
    os.makedirs(directory, exist_ok=True)
    print(f"✓ {directory}")

print("\n✅ All directories created/verified!")
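
`os.makedirs(..., exist_ok=True)` succeeds when a directory already exists, even if the current user cannot write to it. A possible follow-up check, sketched here with a hypothetical `is_writable` helper, tries to create a temporary file in the directory.

```python
import os
import tempfile


def is_writable(directory):
    """Return True if a temporary file can be created inside directory."""
    if not os.path.isdir(directory):
        return False
    try:
        # The file is removed automatically when the context exits.
        with tempfile.TemporaryFile(dir=directory):
            pass
    except OSError:
        return False
    return True


print(is_writable("/opt/app-root/src/data/processed"))
```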

## Step 6: Generate Setup Summary Report

In [None]:
import json
from datetime import datetime

print("\n" + "=" * 80)
print("SETUP SUMMARY REPORT")
print("=" * 80)

summary = {
    'timestamp': datetime.now().isoformat(),
    'python_version': sys.version.split()[0],
    'pytorch_version': torch.__version__,
    'cuda_available': torch.cuda.is_available(),
    'gpu_count': torch.cuda.device_count() if torch.cuda.is_available() else 0,
    'dependencies_installed': len(installed),
    'dependencies_missing': len(missing),
    'data_volume_exists': os.path.exists('/opt/app-root/src/data'),
    'models_volume_exists': os.path.exists('/opt/app-root/src/models'),
}

print(f"\nSetup Timestamp: {summary['timestamp']}")
print(f"Python Version: {summary['python_version']}")
print(f"PyTorch Version: {summary['pytorch_version']}")
print(f"CUDA Available: {summary['cuda_available']}")
print(f"GPU Count: {summary['gpu_count']}")
print(f"Dependencies: {summary['dependencies_installed']}/{len(core_dependencies)} installed")
print(f"Data Volume: {'✓' if summary['data_volume_exists'] else '✗'}")
print(f"Models Volume: {'✓' if summary['models_volume_exists'] else '✗'}")

# Save summary
summary_path = '/opt/app-root/src/data/setup_summary.json'
with open(summary_path, 'w') as f:
    json.dump(summary, f, indent=2)

print(f"\n✅ Setup summary saved to: {summary_path}")
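
Later notebooks could read the saved summary back and stop early if the environment is incomplete. This is a minimal sketch, assuming the JSON keys written above; `load_summary` and `readiness_issues` are hypothetical helpers, and treating a missing GPU as a reported issue rather than a hard failure is a design choice made here for illustration.

```python
import json


def load_summary(path):
    """Load a setup summary written with json.dump."""
    with open(path) as f:
        return json.load(f)


def readiness_issues(summary):
    """Return human-readable issues; an empty list means nothing was flagged."""
    issues = []
    if summary.get("dependencies_missing", 0) > 0:
        issues.append("core dependencies missing")
    if not summary.get("data_volume_exists", False):
        issues.append("data volume not found")
    if not summary.get("models_volume_exists", False):
        issues.append("models volume not found")
    if not summary.get("cuda_available", False):
        issues.append("no GPU detected (the Phase 2 LSTM notebook requires one)")
    return issues


# Example with an in-memory summary; in the workbench, pass the result of
# load_summary('/opt/app-root/src/data/setup_summary.json') instead.
example = {
    "dependencies_missing": 0,
    "data_volume_exists": True,
    "models_volume_exists": True,
    "cuda_available": False,
}
print(readiness_issues(example))
```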

## ✅ Setup Complete!

If every check above passed, your environment is ready for the Self-Healing Platform notebooks!

### Next Steps

1. **Review the summary above** - Check that all components are verified
2. **If GPU is available** - Great! You can run all notebooks including Phase 2 LSTM
3. **If GPU is not available** - You can still run all notebooks except Phase 2 LSTM on CPU
4. **If dependencies are missing** - Run the pip install command shown above

### Start Executing Notebooks

Now you're ready to start with Phase 1:

1. Navigate to: `notebooks/01-data-collection/`
2. Open: `01-prometheus-metrics-collection.ipynb`
3. Run all cells with the "Run All" button
4. Follow the execution checklist: `docs/NOTEBOOK-EXECUTION-CHECKLIST.md`

### Execution Timeline

- **Phase 1**: Data Collection (2-3 hours)
- **Phase 2**: Anomaly Detection (3-4 hours)
- **Phase 3**: Self-Healing Logic (2-3 hours)
- **Phase 4**: Model Serving (2-3 hours)
- **Phase 5**: End-to-End Scenarios (2-3 hours)
- **Phase 6**: MCP & Lightspeed (2 hours)
- **Phase 7**: Monitoring & Operations (2 hours)
- **Phase 8**: Advanced Scenarios (2-3 hours)

**Total: 17-23 hours**

---

**Happy Learning! 🚀**