# Neuronas AI Benchmarking - Environment Setup Example

This notebook demonstrates how to set up and validate your environment for AI benchmarking with the Neuronas repository.

## Table of Contents
1. [Environment Setup](#environment-setup)
2. [Dependency Verification](#dependency-verification)
3. [Resource Availability](#resource-availability)
4. [Dataset Loading Examples](#dataset-loading-examples)
5. [Basic AI Model Examples](#basic-ai-model-examples)
6. [Validation Summary](#validation-summary)
7. [Next Steps](#next-steps)

## Environment Setup

First, let's check our Python environment and install any missing dependencies.

In [None]:
import sys
print(f"Python version: {sys.version}")
print(f"Python executable: {sys.executable}")

# Check if we're running in Google Colab
IN_COLAB = 'google.colab' in sys.modules
print(f"Running in Google Colab: {IN_COLAB}")

In [None]:
# Install dependencies if in Colab or if packages are missing
try:
    import numpy  # quick proxy for the rest of requirements.txt
    print("✓ NumPy found; assuming dependencies are already installed")
except ImportError:
    print("Installing dependencies...")
    if IN_COLAB:
        !pip install -q -r requirements.txt
    else:
        print("Please run: pip install -r requirements.txt")
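
Importing a single package says little about the rest of the requirements. The following is a minimal sketch, assuming a hypothetical `REQUIRED_MODULES` list (it is not read from `requirements.txt`), that uses `importlib.util.find_spec` to report which modules cannot be located, without importing them:

```python
from importlib.util import find_spec
from typing import Iterable, List

# Hypothetical list for illustration; adjust it to match requirements.txt.
REQUIRED_MODULES = ["numpy", "pandas", "sklearn", "torch"]


def find_missing(modules: Iterable[str]) -> List[str]:
    """Return the top-level module names that cannot be located."""
    missing = []
    for name in modules:
        try:
            if find_spec(name) is None:
                missing.append(name)
        except (ImportError, ValueError):
            # Treat lookup errors (e.g. a module without a valid __spec__) as missing.
            missing.append(name)
    return missing


missing = find_missing(REQUIRED_MODULES)
if missing:
    print(f"Missing modules: {', '.join(missing)}")
else:
    print("✓ All listed modules found")
```

Because `find_spec` only locates a module, this check stays fast even for heavy frameworks, but it cannot detect a package that is present yet fails at import time.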

## Dependency Verification

Let's verify that all key dependencies are available and check their versions.

In [None]:
import importlib
from typing import Optional

def check_package(package_name: str, import_name: Optional[str] = None) -> Optional[str]:
    """Check if a package is available and return its version."""
    if import_name is None:
        import_name = package_name
    
    try:
        module = importlib.import_module(import_name)
        version = getattr(module, '__version__', 'unknown')
        print(f"✓ {package_name}: {version}")
        return version
    except ImportError:
        print(f"✗ {package_name}: Not available")
        return None

# Check core dependencies
packages = {
    'NumPy': 'numpy',
    'Pandas': 'pandas', 
    'Scikit-learn': 'sklearn',
    'Matplotlib': 'matplotlib',
    'PyTorch': 'torch',
    'TensorFlow': 'tensorflow',
    'Transformers': 'transformers',
    'Datasets': 'datasets',
    'Qiskit': 'qiskit',
    'Jupyter': 'jupyter'
}

print("Package Versions:")
print("=" * 40)
for display_name, import_name in packages.items():
    check_package(display_name, import_name)
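
Some packages do not expose a `__version__` attribute, in which case the check above reports `unknown`. As a possible fallback, the sketch below queries installed package metadata through the standard library's `importlib.metadata` (Python 3.8+). The helper name `installed_version` is hypothetical, and note that the distribution name (for example `scikit-learn`) can differ from the import name (`sklearn`):

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(distribution: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if absent."""
    try:
        return version(distribution)
    except PackageNotFoundError:
        return None


# Distribution names as published on PyPI, not import names.
for dist in ("numpy", "scikit-learn", "torch"):
    found = installed_version(dist)
    print(f"{dist}: {found if found else 'not installed'}")
```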

## Resource Availability

Check system resources and hardware acceleration availability.

In [None]:
# Check GPU availability
def check_gpu_resources():
    """Check for GPU availability across different frameworks."""
    print("GPU Resource Check:")
    print("=" * 30)
    
    # PyTorch CUDA check
    try:
        import torch
        if torch.cuda.is_available():
            gpu_count = torch.cuda.device_count()
            gpu_name = torch.cuda.get_device_name(0)
            print(f"✓ PyTorch CUDA: {gpu_count} GPU(s) - {gpu_name}")
            print(f"  CUDA Version: {torch.version.cuda}")
        else:
            print("○ PyTorch CUDA: Not available (CPU mode)")
    except ImportError:
        print("✗ PyTorch: Not installed")
    
    # TensorFlow GPU check
    try:
        import tensorflow as tf
        gpus = tf.config.list_physical_devices('GPU')
        if gpus:
            print(f"✓ TensorFlow GPU: {len(gpus)} GPU(s) detected")
            for i, gpu in enumerate(gpus):
                print(f"  GPU {i}: {gpu.name}")
        else:
            print("○ TensorFlow GPU: Not available (CPU mode)")
    except ImportError:
        print("✗ TensorFlow: Not installed")

check_gpu_resources()
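
Once availability is known, later cells usually need a single device string. The sketch below is one way to pick it, assuming PyTorch is the framework in use. The helper `pick_device` is hypothetical, and the Apple `mps` branch goes beyond the CUDA-only check above:

```python
def pick_device() -> str:
    """Return 'cuda', 'mps', or 'cpu' depending on what PyTorch can use."""
    try:
        import torch
    except ImportError:
        return "cpu"  # PyTorch not installed, so fall back to CPU

    if torch.cuda.is_available():
        return "cuda"

    # MPS is the Apple Silicon backend; guard for builds that lack it.
    mps = getattr(torch.backends, "mps", None)
    if mps is not None and mps.is_available():
        return "mps"

    return "cpu"


device = pick_device()
print(f"Selected device: {device}")
```

The returned string can be passed to `torch.device(...)` or to `.to(...)` on tensors and modules.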

In [None]:
# Check system memory
try:
    import psutil
    
    memory = psutil.virtual_memory()
    print("\nSystem Memory:")
    print(f"Total: {memory.total / (1024**3):.1f} GB")
    print(f"Available: {memory.available / (1024**3):.1f} GB")
    print(f"Usage: {memory.percent}%")
    
    # CPU info
    print(f"\nCPU Cores: {psutil.cpu_count(logical=False)} physical, {psutil.cpu_count(logical=True)} logical")
    
except ImportError:
    print("psutil not available - cannot check system resources")
    # Fallback using basic Python
    import os
    print(f"CPU count (logical): {os.cpu_count()}")
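
The fallback above reports only the CPU count. On many Unix-like systems, total physical memory can also be estimated without `psutil`. The sketch below uses `os.sysconf`, which is unavailable on Windows and whose supported names vary by platform, so the hypothetical helper `total_memory_bytes` returns `None` when the value cannot be determined:

```python
import os
from typing import Optional


def total_memory_bytes() -> Optional[int]:
    """Best-effort total physical memory without psutil; None if unknown."""
    try:
        page_size = os.sysconf("SC_PAGE_SIZE")
        page_count = os.sysconf("SC_PHYS_PAGES")
    except (AttributeError, ValueError, OSError):
        # os.sysconf is missing on Windows; names differ across Unix systems.
        return None
    if page_size <= 0 or page_count <= 0:
        return None
    return page_size * page_count


total = total_memory_bytes()
if total is None:
    print("Total memory: unknown on this platform")
else:
    print(f"Total memory: {total / (1024**3):.1f} GB")
```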

## Dataset Loading Examples

Demonstrate loading datasets for different AI tasks.

In [None]:
# Example 1: Load a simple dataset using scikit-learn
try:
    from sklearn.datasets import load_iris, load_digits
    
    print("Loading Iris dataset...")
    iris = load_iris()
    print(f"✓ Iris dataset loaded: {iris.data.shape[0]} samples")
    print(f"  Features: {len(iris.feature_names)}")
    print(f"  Classes: {len(iris.target_names)}")
    
    print("\nLoading Digits dataset...")
    digits = load_digits()
    print(f"✓ Digits dataset loaded: {digits.data.shape[0]} samples")
    print(f"  Image size: {digits.images[0].shape}")
    
except ImportError as e:
    print(f"✗ Cannot load sklearn datasets: {e}")

In [None]:
# Example 2: Load dataset using Hugging Face datasets (if available)
try:
    from datasets import load_dataset
    
    print("Loading sample dataset from Hugging Face...")
    # Load a small, quick dataset for demonstration
    dataset = load_dataset("imdb", split="train[:100]")  # Just first 100 samples
    print(f"✓ IMDB dataset sample loaded: {len(dataset)} samples")
    print(f"  Features: {list(dataset.features.keys())}")
    
    # Show a sample
    print(f"\nSample text (truncated): {dataset[0]['text'][:100]}...")
    
except ImportError:
    print("✗ Hugging Face datasets not available")
except Exception as e:
    print(f"○ Could not load Hugging Face dataset (may require internet): {e}")

## Basic AI Model Examples

Demonstrate basic AI model functionality with the available frameworks.

In [None]:
# Example 1: Simple scikit-learn model
try:
    from sklearn.datasets import load_iris
    from sklearn.model_selection import train_test_split
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.metrics import accuracy_score
    
    print("Training a simple Random Forest classifier...")
    
    # Load data
    iris = load_iris()
    X_train, X_test, y_train, y_test = train_test_split(
        iris.data, iris.target, test_size=0.2, random_state=42
    )
    
    # Train model
    clf = RandomForestClassifier(n_estimators=10, random_state=42)
    clf.fit(X_train, y_train)
    
    # Evaluate
    y_pred = clf.predict(X_test)
    accuracy = accuracy_score(y_test, y_pred)
    
    print("✓ Model trained successfully!")
    print(f"  Training samples: {len(X_train)}")
    print(f"  Test samples: {len(X_test)}")
    print(f"  Accuracy: {accuracy:.3f}")
    
except ImportError as e:
    print(f"✗ Cannot run scikit-learn example: {e}")

In [None]:
# Example 2: Simple PyTorch tensor operations
try:
    import torch
    import torch.nn as nn
    
    print("Testing PyTorch tensor operations...")
    
    # Create random tensors
    x = torch.randn(3, 4)
    y = torch.randn(4, 2)
    
    # Matrix multiplication
    z = torch.mm(x, y)
    
    print("✓ PyTorch operations successful!")
    print(f"  Input tensor shape: {x.shape}")
    print(f"  Weight tensor shape: {y.shape}")
    print(f"  Output tensor shape: {z.shape}")
    print(f"  Device: {z.device}")
    
    # Simple neural network
    model = nn.Sequential(
        nn.Linear(4, 8),
        nn.ReLU(),
        nn.Linear(8, 3)
    )
    
    output = model(x)
    print(f"  Neural network output shape: {output.shape}")
    
except ImportError as e:
    print(f"✗ Cannot run PyTorch example: {e}")

In [None]:
# Example 3: Simple TensorFlow operations  
try:
    import tensorflow as tf
    
    print("Testing TensorFlow operations...")
    
    # Create random tensors
    x = tf.random.normal([3, 4])
    y = tf.random.normal([4, 2])
    
    # Matrix multiplication
    z = tf.matmul(x, y)
    
    print("✓ TensorFlow operations successful!")
    print(f"  Input tensor shape: {x.shape}")
    print(f"  Weight tensor shape: {y.shape}")
    print(f"  Output tensor shape: {z.shape}")
    
    # Simple neural network
    model = tf.keras.Sequential([
        tf.keras.Input(shape=(4,)),  # explicit input layer; avoids the input_shape warning in newer Keras
        tf.keras.layers.Dense(8, activation='relu'),
        tf.keras.layers.Dense(3)
    ])
    
    output = model(x)
    print(f"  Neural network output shape: {output.shape}")
    
except ImportError as e:
    print(f"✗ Cannot run TensorFlow example: {e}")

## Validation Summary

Run the repository's complete validation script, `validation/validate_environment.py`, from within the notebook.

In [None]:
# Run the validation script from the validation directory
import sys
import os

# Add the repository root to the path
repo_root = os.path.abspath('..')  # Assuming notebook is in notebooks/ subdirectory
if repo_root not in sys.path:
    sys.path.insert(0, repo_root)

try:
    from validation.validate_environment import run_validation
    
    print("Running complete environment validation...")
    print("=" * 50)
    
    success = run_validation()
    
    if success:
        print("\n🎉 Environment is ready for AI benchmarking!")
    else:
        print("\n⚠️  Some validation checks failed. Please review the output above.")
        
except ImportError as e:
    print(f"Cannot import validation module: {e}")
    print("Please ensure you're running from the correct directory.")
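
The cell above assumes the notebook sits exactly one level below the repository root. If the working directory may vary, a more forgiving approach is to walk upward until a marker directory is found. The sketch below assumes the `validation` directory serves as that marker, and the helper `find_repo_root` is hypothetical:

```python
from pathlib import Path
from typing import Optional


def find_repo_root(start: Path, marker: str = "validation") -> Optional[Path]:
    """Walk upward from start until a directory containing marker is found."""
    start = start.resolve()
    for candidate in (start, *start.parents):
        if (candidate / marker).is_dir():
            return candidate
    return None


root = find_repo_root(Path.cwd())
print(f"Repository root: {root if root else 'not found'}")
```

If a root is found, `str(root)` can be inserted into `sys.path` in place of the hard-coded `'..'`.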

## Next Steps

If your environment validation was successful, you're ready to:

1. **Explore AI Benchmarks**: Check out the other notebooks in the `notebooks/` directory
2. **Load Datasets**: Use the scripts in `datasets/` to download and prepare data
3. **Run Tests**: Execute `pytest tests/` to run the full test suite
4. **Contribute**: Follow the guidelines in `.github/copilot-instructions.md`

### Useful Commands

```bash
# Install dependencies
pip install -r requirements.txt

# Run environment validation
python validation/validate_environment.py

# Run tests
pytest tests/ -v

# Format code
black .

# Lint code
flake8 .
```

Happy benchmarking! 🚀