# Advanced Features and Optimization Tutorial

This tutorial demonstrates advanced features of the Mixed-Precision Multigrid Solvers package, including performance optimization techniques, advanced solver configurations, and integration with monitoring tools.

## Learning Objectives
- Configure advanced solver parameters for optimal performance
- Use real-time monitoring and visualization tools
- Implement custom boundary conditions and source terms
- Optimize memory usage and computational efficiency
- Set up automated benchmarking and validation workflows

## 1. Setup and Imports

In [None]:
import numpy as np
import matplotlib.pyplot as plt
import time
import psutil
from typing import Dict, List, Optional, Tuple, Callable

# Core multigrid components
import multigrid
from multigrid import (
    Grid, MultigridSolver, AdaptivePrecisionSolver,
    LaplacianOperator, PrecisionManager
)

# Advanced features
from multigrid.applications import HeatEquationSolver
from multigrid.benchmarks import BenchmarkSuite
from multigrid.validation import MMSValidation
from multigrid.visualization import (
    solver_dashboard, create_monitoring_widgets,
    AdvancedVisualizationTools
)

# Try to import GPU components
try:
    from multigrid.gpu import (
        GPUMultigridSolver, GPUMemoryManager,
        GPUPerformanceProfiler
    )
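    GPU_AVAILABLE = multigrid.GPU_AVAILABLE
    if GPU_AVAILABLE:
        print("✅ GPU support available: CuPy detected")
    else:
        print("⚠️  GPU modules imported, but no usable GPU detected - using CPU only")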
except ImportError:
    GPU_AVAILABLE = False
    print("⚠️  GPU support not available - using CPU only")

print(f"Mixed-Precision Multigrid version: {multigrid.__version__}")
print(f"NumPy version: {np.__version__}")

## 2. Advanced Solver Configuration

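This section shows how to choose the smoother, the number of smoothing steps, the cycle type, and the coarse-grid solver for three problem classes: isotropic, anisotropic, and jump-coefficient problems.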

In [None]:
def create_optimized_solver(grid_size: int, problem_type: str = 'isotropic') -> MultigridSolver:
    """
    Create a multigrid solver with optimized parameters for different problem types.
    
    Args:
        grid_size: Size of the computational grid
        problem_type: Type of problem ('isotropic', 'anisotropic', 'jump_coeffs')
    
    Returns:
        Configured MultigridSolver instance
    """
    grid = Grid(nx=grid_size, ny=grid_size)
    
    # Problem-specific parameter optimization
    if problem_type == 'isotropic':
        # Standard Poisson equation - optimal settings
        config = {
            'smoother': 'gauss_seidel',
            'pre_smooth_steps': 1,
            'post_smooth_steps': 1,
            'cycle_type': 'V',
            'coarse_solver': 'direct',
            'tolerance': 1e-10
        }
    elif problem_type == 'anisotropic':
        # Problems with aspect ratio issues
        config = {
            'smoother': 'line_gauss_seidel',  # Better for anisotropic problems
            'pre_smooth_steps': 2,
            'post_smooth_steps': 2,
            'cycle_type': 'W',  # More robust convergence
            'coarse_solver': 'iterative',
            'tolerance': 1e-8
        }
    elif problem_type == 'jump_coeffs':
        # Problems with coefficient discontinuities
        config = {
            'smoother': 'red_black_gauss_seidel',
            'pre_smooth_steps': 3,
            'post_smooth_steps': 3,
            'cycle_type': 'V',
            'coarse_solver': 'direct',
            'tolerance': 1e-8
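        }
    else:
        # Without this branch an unknown type would fail later with an unbound `config`
        raise ValueError(
            f"Unknown problem_type {problem_type!r}; expected "
            "'isotropic', 'anisotropic', or 'jump_coeffs'"
        )
    
    solver = MultigridSolver(grid, **config)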
    print(f"✅ Created optimized solver for {problem_type} problems")
    print(f"   Grid size: {grid_size}×{grid_size}")
    print(f"   Configuration: {config}")
    
    return solver

# Example: Create solvers for different problem types
solvers = {
    'isotropic': create_optimized_solver(128, 'isotropic'),
    'anisotropic': create_optimized_solver(128, 'anisotropic'),
    'jump_coeffs': create_optimized_solver(128, 'jump_coeffs')
}
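
To make the `pre_smooth_steps`, `post_smooth_steps`, and `cycle_type` settings concrete, here is a minimal NumPy sketch of one multigrid cycle for the 1D Poisson problem with zero Dirichlet boundaries. It is an illustration under stated assumptions, not the package's implementation: the function names are hypothetical, the smoother is weighted Jacobi rather than Gauss-Seidel, and the number of interior points is assumed to be of the form 2^k - 1.

```python
import numpy as np

def residual(u, f, h):
    """Residual r = f - A u for A = -d^2/dx^2 with zero Dirichlet boundaries."""
    up = np.pad(u, 1)  # append the zero boundary values
    return f - (2 * up[1:-1] - up[:-2] - up[2:]) / h**2

def smooth(u, f, h, steps, omega=2 / 3):
    """Weighted Jacobi: u <- u + omega * D^{-1} r, where D = 2 / h^2."""
    for _ in range(steps):
        u = u + omega * (h**2 / 2) * residual(u, f, h)
    return u

def mg_cycle(u, f, h, pre=1, post=1, gamma=1):
    """One cycle: gamma=1 gives a V-cycle, gamma=2 a W-cycle.

    Assumes u.size == 2**k - 1 so that every level coarsens cleanly.
    """
    n = u.size
    if n == 1:
        return f * h**2 / 2  # coarsest grid: solve (2 / h^2) u = f exactly
    u = smooth(u, f, h, pre)  # pre-smoothing
    r = residual(u, f, h)
    # Full-weighting restriction onto the (n - 1) // 2 coarse points
    r_c = 0.25 * r[0:-2:2] + 0.5 * r[1:-1:2] + 0.25 * r[2::2]
    # Coarse-grid correction, visited gamma times
    e_c = np.zeros_like(r_c)
    for _ in range(gamma):
        e_c = mg_cycle(e_c, r_c, 2 * h, pre, post, gamma)
    # Linear interpolation of the correction back to the fine grid
    e = np.zeros(n)
    e[1::2] = e_c
    e_pad = np.pad(e_c, 1)
    e[0::2] = 0.5 * (e_pad[:-1] + e_pad[1:])
    return smooth(u + e, f, h, post)  # post-smoothing

# Usage: -u'' = pi^2 sin(pi x), whose exact solution is sin(pi x)
n = 127
h = 1.0 / (n + 1)
x = np.arange(1, n + 1) * h
f = np.pi**2 * np.sin(np.pi * x)
u = np.zeros(n)
for k in range(5):
    u = mg_cycle(u, f, h, pre=1, post=1, gamma=1)
    print(f"cycle {k + 1}: residual norm = {np.linalg.norm(residual(u, f, h)):.3e}")
```

Raising `pre` and `post` or switching to `gamma=2` costs more work per cycle in exchange for a larger residual reduction, which is the trade-off the three configurations above are tuning.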

## 3. Real-Time Monitoring and Dashboard

Set up real-time monitoring to track solver performance and system resources.

In [None]:
# Create real-time monitoring dashboard
dashboard = solver_dashboard()

# Custom solver callback for monitoring
class MonitoredSolver:
    def __init__(self, base_solver: MultigridSolver):
        self.solver = base_solver
        self.iteration = 0
        self.residual_history = []
        self.precision_history = []
        self.current_precision = 64  # Start with double precision
        
        # Set up dashboard callback
        dashboard.set_solver_callback(self.get_metrics)
    
    def get_metrics(self) -> Dict:
        """Callback function for the real-time dashboard."""
        current_residual = self.residual_history[-1] if self.residual_history else 1.0
        
        return {
            'residual': current_residual,
            'iteration': self.iteration,
            'precision_level': self.current_precision
        }
    
    def solve_with_monitoring(self, rhs: np.ndarray, initial_guess: np.ndarray, 
                            max_iterations: int = 50) -> np.ndarray:
        """Solve with real-time monitoring."""
        print("🔍 Starting monitored solve...")
        dashboard.start_monitoring()
        
        # Simulate iterative solving with monitoring
        solution = initial_guess.copy()
        
        for i in range(max_iterations):
            self.iteration = i + 1
            
            # Perform one multigrid cycle
            solution = self.solver.solve(rhs, solution, max_iter=1, tol=1e-12)
            
            # Calculate current residual
            operator = LaplacianOperator(self.solver.grid)
            residual = np.linalg.norm(rhs - operator.apply(solution))
            self.residual_history.append(residual)
            
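            # Simulate precision switching: use single precision while far from
            # convergence, then promote to double precision once the residual
            # approaches the accuracy floor of float32
            if residual >= 1e-6 and self.current_precision == 64:
                self.current_precision = 32
            elif residual < 1e-6 and self.current_precision == 32:
                self.current_precision = 64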
            
            self.precision_history.append(self.current_precision)
            
            # Check convergence
            if residual < 1e-10:
                print(f"✅ Converged in {i+1} iterations (residual: {residual:.2e})")
                break
            
            # Brief pause for visualization
            time.sleep(0.1)
        
        dashboard.stop_monitoring()
        return solution

# Create monitored solver
base_solver = create_optimized_solver(64, 'isotropic')
monitored_solver = MonitoredSolver(base_solver)

print("✅ Real-time monitoring dashboard configured")
print("   Dashboard will track: residual, iteration, precision level")
print("   System metrics: CPU usage, memory, GPU utilization")
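
A residual history like the one collected in `MonitoredSolver.residual_history` is easier to interpret as reduction factors per iteration. The sketch below uses only NumPy; the helper names are hypothetical and not part of the package.

```python
import numpy as np

def convergence_factors(residual_history):
    """Per-iteration reduction factors r_k / r_{k-1}."""
    r = np.asarray(residual_history, dtype=float)
    if r.size < 2:
        raise ValueError("need at least two residuals")
    return r[1:] / r[:-1]

def mean_convergence_factor(residual_history):
    """Geometric-mean reduction factor over the whole history."""
    r = np.asarray(residual_history, dtype=float)
    if r.size < 2:
        raise ValueError("need at least two residuals")
    return (r[-1] / r[0]) ** (1.0 / (r.size - 1))

# Usage with a made-up history (illustrative values, not measured data)
history = [1.0, 1e-1, 1e-2, 1e-3]
print("per-iteration factors:", convergence_factors(history))
print("mean factor:", mean_convergence_factor(history))
```

A factor that stays well below 1 and roughly constant across iterations is the behavior expected from a healthy multigrid cycle; factors drifting toward 1 suggest stagnation.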

## 4. Performance Profiling and Optimization

This section profiles solve time, memory use, and iteration counts across grid sizes, and checks how closely the solver follows the O(N) scaling expected of multigrid.

In [None]:
class PerformanceProfiler:
    """Advanced performance profiling for multigrid solvers."""
    
    def __init__(self):
        self.profile_data = {
            'solve_times': [],
            'memory_usage': [],
            'cpu_utilization': [],
            'gpu_utilization': [],
            'iteration_counts': [],
            'grid_sizes': []
        }
    
    def profile_solver(self, solver: MultigridSolver, problem_sizes: List[int],
                      num_runs: int = 3) -> Dict:
        """Profile solver performance across different problem sizes."""
        print(f"🔬 Profiling solver performance...")
        print(f"   Problem sizes: {problem_sizes}")
        print(f"   Runs per size: {num_runs}")
        
        results = {
            'problem_sizes': problem_sizes,
            'avg_solve_times': [],
            'avg_memory_usage': [],
            'avg_iterations': [],
            'scaling_efficiency': []
        }
        
        baseline_time = None
        
        for size in problem_sizes:
            print(f"\n📊 Testing grid size: {size}×{size}")
            
            # Create test problem
            grid = Grid(nx=size, ny=size)
            solver.grid = grid  # Update solver grid
            
            x = np.linspace(0, 1, size)
            y = np.linspace(0, 1, size)
            X, Y = np.meshgrid(x, y)
            
            # Manufactured solution
            u_exact = np.sin(np.pi * X) * np.sin(np.pi * Y)
            rhs = 2 * np.pi**2 * u_exact
            
            # Run multiple times for averaging
            solve_times = []
            memory_usage = []
            iteration_counts = []
            
            for run in range(num_runs):
                # Monitor memory before solving
                process = psutil.Process()
                memory_before = process.memory_info().rss / 1024 / 1024  # MB
                
                # Time the solve
                initial_guess = np.zeros_like(rhs)
                start_time = time.perf_counter()
                
                solution = solver.solve(rhs, initial_guess, tol=1e-8, max_iter=100)
                
                solve_time = time.perf_counter() - start_time
                
                # Monitor memory after solving
                memory_after = process.memory_info().rss / 1024 / 1024  # MB
                memory_used = memory_after - memory_before
                
                solve_times.append(solve_time)
                memory_usage.append(memory_used)
                
                # Get iteration count (if available)
                if hasattr(solver, 'iteration_count'):
                    iteration_counts.append(solver.iteration_count)
                else:
                    iteration_counts.append(-1)  # Unknown
                
                # Verify solution accuracy
                error = np.max(np.abs(solution - u_exact))
                if run == 0:  # Report accuracy for first run
                    print(f"   Run {run+1}: {solve_time:.4f}s, Error: {error:.2e}")
            
            # Calculate averages
            avg_solve_time = np.mean(solve_times)
            avg_memory = np.mean(memory_usage)
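            # Skip runs with an unknown (-1) count; np.mean([]) would warn and return nan
            known_iterations = [it for it in iteration_counts if it > 0]
            avg_iterations = np.mean(known_iterations) if known_iterations else float('nan')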
            
            results['avg_solve_times'].append(avg_solve_time)
            results['avg_memory_usage'].append(avg_memory)
            results['avg_iterations'].append(avg_iterations)
            
            # Calculate scaling efficiency
            if baseline_time is None:
                baseline_time = avg_solve_time
                scaling_eff = 1.0
            else:
                # Theoretical scaling for O(N) complexity
                size_ratio = size**2 / problem_sizes[0]**2
                theoretical_time = baseline_time * size_ratio
                scaling_eff = theoretical_time / avg_solve_time if avg_solve_time > 0 else 0
            
            results['scaling_efficiency'].append(scaling_eff)
            
            print(f"   Average: {avg_solve_time:.4f}s, Memory: {avg_memory:.1f}MB, "
                  f"Iterations: {avg_iterations:.1f}, Scaling: {scaling_eff:.2f}")
        
        return results
    
    def plot_performance_analysis(self, results: Dict) -> None:
        """Create comprehensive performance analysis plots."""
        fig, ((ax1, ax2), (ax3, ax4)) = plt.subplots(2, 2, figsize=(15, 10))
        fig.suptitle('Multigrid Solver Performance Analysis', fontsize=16, fontweight='bold')
        
        problem_sizes = results['problem_sizes']
        N_values = [size**2 for size in problem_sizes]  # Total unknowns
        
        # 1. Solve time vs problem size
        ax1.loglog(N_values, results['avg_solve_times'], 'bo-', linewidth=2, markersize=8)
        ax1.loglog(N_values, [results['avg_solve_times'][0] * (N/N_values[0]) for N in N_values], 
                   'r--', label='O(N) reference')
        ax1.set_xlabel('Problem Size (N)')
        ax1.set_ylabel('Solve Time (seconds)')
        ax1.set_title('Computational Complexity')
        ax1.legend()
        ax1.grid(True, alpha=0.3)
        
        # 2. Memory usage vs problem size
        ax2.loglog(N_values, results['avg_memory_usage'], 'go-', linewidth=2, markersize=8)
        ax2.set_xlabel('Problem Size (N)')
        ax2.set_ylabel('Memory Usage (MB)')
        ax2.set_title('Memory Scaling')
        ax2.grid(True, alpha=0.3)
        
        # 3. Iteration count (should be grid-independent)
        ax3.semilogx(N_values, results['avg_iterations'], 'mo-', linewidth=2, markersize=8)
        ax3.axhline(y=np.mean(results['avg_iterations']), color='red', linestyle='--', 
                   label=f'Average: {np.mean(results["avg_iterations"]):.1f}')
        ax3.set_xlabel('Problem Size (N)')
        ax3.set_ylabel('Iterations to Convergence')
        ax3.set_title('Grid-Independent Convergence')
        ax3.legend()
        ax3.grid(True, alpha=0.3)
        
        # 4. Scaling efficiency
        ax4.semilogx(N_values, results['scaling_efficiency'], 'co-', linewidth=2, markersize=8)
        ax4.axhline(y=1.0, color='red', linestyle='--', label='Ideal Scaling')
        ax4.set_xlabel('Problem Size (N)')
        ax4.set_ylabel('Scaling Efficiency')
        ax4.set_title('Performance Scaling')
        ax4.legend()
        ax4.grid(True, alpha=0.3)
        
        plt.tight_layout()
        plt.show()
        
        # Performance summary
        print("\n📊 Performance Summary:")
        print(f"   Best solve time: {min(results['avg_solve_times']):.4f}s")
        print(f"   Peak memory usage: {max(results['avg_memory_usage']):.1f}MB")
        print(f"   Average iterations: {np.mean(results['avg_iterations']):.1f}")
        print(f"   Scaling efficiency: {np.mean(results['scaling_efficiency']):.2f}")

# Create profiler and run performance analysis
profiler = PerformanceProfiler()
print("✅ Performance profiler initialized")
print("   Ready to analyze solver performance across different problem sizes")
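
The scaling-efficiency metric compares each timing against an O(N) extrapolation from the smallest grid. A complementary check is to fit the exponent p in t ≈ c·N^p by least squares in log-log space. This is a minimal sketch with a hypothetical helper name; the timings in the usage lines are synthetic.

```python
import numpy as np

def fit_complexity_exponent(num_unknowns, solve_times):
    """Fit t = c * N**p in log-log space and return (p, c)."""
    N = np.asarray(num_unknowns, dtype=float)
    t = np.asarray(solve_times, dtype=float)
    if N.size != t.size or N.size < 2:
        raise ValueError("need two or more matching (N, t) pairs")
    slope, intercept = np.polyfit(np.log(N), np.log(t), 1)
    return slope, np.exp(intercept)

# Usage with synthetic timings that scale exactly linearly in N
sizes = [64, 128, 256, 512]
N_values = [s**2 for s in sizes]
times = [2e-7 * N for N in N_values]
p, c = fit_complexity_exponent(N_values, times)
print(f"fitted exponent p = {p:.3f}, constant c = {c:.3e}")
```

An exponent close to 1 is consistent with O(N) behavior. With measured data, `results['avg_solve_times']` from `profile_solver` would take the place of the synthetic `times`.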

## 5. Advanced Boundary Conditions and Custom Problems

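This section builds a variable-coefficient problem with a 100:1 coefficient jump and mixed Dirichlet/Neumann boundary conditions, using a manufactured solution so the result can be validated.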

In [None]:
def create_complex_problem(grid_size: int = 128) -> Tuple[np.ndarray, np.ndarray, Callable]:
    """
    Create a complex PDE problem with mixed boundary conditions and variable coefficients.
    
    Problem: -∇·(a(x,y)∇u) = f(x,y) in Ω
    with mixed boundary conditions
    """
    grid = Grid(nx=grid_size, ny=grid_size)
    x = np.linspace(0, 1, grid_size)
    y = np.linspace(0, 1, grid_size)
    X, Y = np.meshgrid(x, y)
    
    # Variable diffusion coefficient
    def diffusion_coeff(x, y):
        """Variable diffusion coefficient with jump discontinuity."""
        condition = (x < 0.5) & (y < 0.5)
        return np.where(condition, 100.0, 1.0)  # 100:1 ratio
    
    # Exact solution for manufactured problem
    def exact_solution(x, y):
        """Manufactured exact solution."""
        return np.sin(2*np.pi*x) * np.cos(2*np.pi*y) * np.exp(-x*y)
    
    # Calculate source term from exact solution
    u_exact = exact_solution(X, Y)
    a_coeff = diffusion_coeff(X, Y)
    
    # Compute -∇·(a∇u) using finite differences
    dx, dy = 1.0/(grid_size-1), 1.0/(grid_size-1)
    
    # Initialize source term
    source = np.zeros_like(u_exact)
    
    # Interior points - compute divergence.
    # np.meshgrid defaults to 'xy' indexing, so axis 0 (index i) runs along y
    # and axis 1 (index j) runs along x.
    for i in range(1, grid_size-1):
        for j in range(1, grid_size-1):
            # Face coefficients (arithmetic mean of the two neighbouring nodes)
            a_e = 0.5 * (a_coeff[i,j+1] + a_coeff[i,j])
            a_w = 0.5 * (a_coeff[i,j-1] + a_coeff[i,j])
            a_n = 0.5 * (a_coeff[i+1,j] + a_coeff[i,j])
            a_s = 0.5 * (a_coeff[i-1,j] + a_coeff[i,j])
            
            # Compute fluxes a*∇u at the cell faces
            flux_e = a_e * (u_exact[i,j+1] - u_exact[i,j]) / dx
            flux_w = a_w * (u_exact[i,j] - u_exact[i,j-1]) / dx
            flux_n = a_n * (u_exact[i+1,j] - u_exact[i,j]) / dy
            flux_s = a_s * (u_exact[i,j] - u_exact[i-1,j]) / dy
            
            # Compute -∇·(a∇u)
            source[i,j] = -((flux_e - flux_w)/dx + (flux_n - flux_s)/dy)
    
    # Encode boundary data in the source array.
    # Neumann on bottom and top boundaries (natural BC for variational form).
    # Zero flux is a placeholder: this manufactured solution has a nonzero
    # normal derivative on y = 0 and y = 1.
    source[0, :] = 0    # Bottom boundary (y = 0)
    source[-1, :] = 0   # Top boundary (y = 1)
    
    # Dirichlet on left and right boundaries (set last so corners keep Dirichlet values)
    source[:, 0] = u_exact[:, 0]    # Left boundary (x = 0)
    source[:, -1] = u_exact[:, -1]  # Right boundary (x = 1)
    
    print(f"✅ Created complex problem with:")
    print(f"   Variable coefficient (jump ratio: 100:1)")
    print(f"   Mixed boundary conditions")
    print(f"   Grid size: {grid_size}×{grid_size}")
    print(f"   Exact solution available for validation")
    
    return source, u_exact, exact_solution

# Create and visualize the complex problem
rhs_complex, u_exact_complex, exact_func = create_complex_problem(64)

# Visualize the problem setup
fig, ((ax1, ax2), (ax3, ax4)) = plt.subplots(2, 2, figsize=(12, 10))

# Diffusion coefficient
x = np.linspace(0, 1, 64)
y = np.linspace(0, 1, 64)
X, Y = np.meshgrid(x, y)
condition = (X < 0.5) & (Y < 0.5)
a_coeff = np.where(condition, 100.0, 1.0)

im1 = ax1.imshow(a_coeff, extent=[0, 1, 0, 1], origin='lower', cmap='viridis')
ax1.set_title('Diffusion Coefficient a(x,y)')
ax1.set_xlabel('x')
ax1.set_ylabel('y')
plt.colorbar(im1, ax=ax1)

# Source term
im2 = ax2.imshow(rhs_complex, extent=[0, 1, 0, 1], origin='lower', cmap='RdBu')
ax2.set_title('Source Term f(x,y)')
ax2.set_xlabel('x')
ax2.set_ylabel('y')
plt.colorbar(im2, ax=ax2)

# Exact solution
im3 = ax3.imshow(u_exact_complex, extent=[0, 1, 0, 1], origin='lower', cmap='plasma')
ax3.set_title('Exact Solution u(x,y)')
ax3.set_xlabel('x')
ax3.set_ylabel('y')
plt.colorbar(im3, ax=ax3)

# Boundary condition illustration
# (rows are y and columns are x, matching imshow with origin='lower')
bc_illustration = np.zeros_like(u_exact_complex)
bc_illustration[0, :] = 2   # Neumann bottom
bc_illustration[-1, :] = 2  # Neumann top
bc_illustration[:, 0] = 1   # Dirichlet left
bc_illustration[:, -1] = 1  # Dirichlet right

im4 = ax4.imshow(bc_illustration, extent=[0, 1, 0, 1], origin='lower', cmap='Set3')
ax4.set_title('Boundary Conditions\n(1=Dirichlet, 2=Neumann)')
ax4.set_xlabel('x')
ax4.set_ylabel('y')
plt.colorbar(im4, ax=ax4)

plt.tight_layout()
plt.show()
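
The nested Python loops in `create_complex_problem` become slow on large grids. The same flux-form stencil for -∇·(a∇u) can be written with NumPy slicing, as in the sketch below. The function name is hypothetical, and it assumes equal grid spacing `h` in both directions and arithmetic averaging of the coefficient at cell faces, as in the loop version.

```python
import numpy as np

def apply_variable_coefficient_operator(u, a, h):
    """Return -div(a grad u) at interior nodes; boundary entries are left at zero."""
    c = u[1:-1, 1:-1]
    a_c = a[1:-1, 1:-1]
    # Face coefficients: arithmetic mean of the two adjacent nodes
    a_hi0 = 0.5 * (a[2:, 1:-1] + a_c)   # face toward increasing axis-0 index
    a_lo0 = 0.5 * (a[:-2, 1:-1] + a_c)
    a_hi1 = 0.5 * (a[1:-1, 2:] + a_c)   # face toward increasing axis-1 index
    a_lo1 = 0.5 * (a[1:-1, :-2] + a_c)
    # Fluxes a * du/dn through each face
    flux_hi0 = a_hi0 * (u[2:, 1:-1] - c) / h
    flux_lo0 = a_lo0 * (c - u[:-2, 1:-1]) / h
    flux_hi1 = a_hi1 * (u[1:-1, 2:] - c) / h
    flux_lo1 = a_lo1 * (c - u[1:-1, :-2]) / h
    out = np.zeros_like(u, dtype=float)
    out[1:-1, 1:-1] = -((flux_hi0 - flux_lo0) + (flux_hi1 - flux_lo1)) / h
    return out

# Usage: with a = 1 and u = x^2 + y^2 the stencil reproduces -laplace(u) = -4
n = 17
x = np.linspace(0.0, 1.0, n)
X, Y = np.meshgrid(x, x)
u = X**2 + Y**2
result = apply_variable_coefficient_operator(u, np.ones_like(u), x[1] - x[0])
print("max deviation from -4:", np.max(np.abs(result[1:-1, 1:-1] + 4.0)))
```

For strongly discontinuous coefficients, harmonic averaging at the faces is a common alternative to the arithmetic mean used here.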

## 6. GPU Acceleration and Multi-GPU Setup

Compare CPU and GPU solve times, GPU memory use, and throughput across grid sizes (runs only if GPU support is available).

In [None]:
if GPU_AVAILABLE:
    print("🚀 GPU acceleration is available!")
    
    def compare_cpu_gpu_performance(problem_sizes: List[int] = [64, 128, 256]):
        """Compare CPU vs GPU performance across different problem sizes."""
        print("\n🏁 CPU vs GPU Performance Comparison")
        print("=" * 50)
        
        results = {
            'sizes': problem_sizes,
            'cpu_times': [],
            'gpu_times': [],
            'speedups': [],
            'gpu_memory': []
        }
        
        for size in problem_sizes:
            print(f"\n📊 Testing {size}×{size} grid...")
            
            # Create test problem
            grid = Grid(nx=size, ny=size)
            x = np.linspace(0, 1, size)
            y = np.linspace(0, 1, size)
            X, Y = np.meshgrid(x, y)
            
            u_exact = np.sin(np.pi * X) * np.sin(np.pi * Y)
            rhs = 2 * np.pi**2 * u_exact
            initial_guess = np.zeros_like(rhs)
            
            # CPU solver
            cpu_solver = MultigridSolver(grid)
            cpu_start = time.perf_counter()
            cpu_solution = cpu_solver.solve(rhs, initial_guess, tol=1e-8)
            cpu_time = time.perf_counter() - cpu_start
            
            # GPU solver
            try:
                gpu_solver = GPUMultigridSolver(grid)
                
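                # Monitor GPU memory
                import cupy as cp
                mempool = cp.get_default_memory_pool()
                mempool.free_all_blocks()  # Release cached blocks before measuring
                
                gpu_start = time.perf_counter()
                gpu_solution = gpu_solver.solve(rhs, initial_guess, tol=1e-8)
                # GPU kernels launch asynchronously; wait for them to finish
                # so the timer covers the whole solve
                cp.cuda.Stream.null.synchronize()
                gpu_time = time.perf_counter() - gpu_start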
                
                gpu_memory_used = mempool.used_bytes() / 1024 / 1024  # MB
                
                # Verify solutions match
                if isinstance(gpu_solution, cp.ndarray):
                    gpu_solution_cpu = cp.asnumpy(gpu_solution)
                else:
                    gpu_solution_cpu = gpu_solution
                
                solution_diff = np.max(np.abs(cpu_solution - gpu_solution_cpu))
                
                speedup = cpu_time / gpu_time if gpu_time > 0 else 0
                
                results['cpu_times'].append(cpu_time)
                results['gpu_times'].append(gpu_time)
                results['speedups'].append(speedup)
                results['gpu_memory'].append(gpu_memory_used)
                
                print(f"   CPU time: {cpu_time:.4f}s")
                print(f"   GPU time: {gpu_time:.4f}s")
                print(f"   Speedup: {speedup:.2f}×")
                print(f"   GPU memory: {gpu_memory_used:.1f}MB")
                print(f"   Solution difference: {solution_diff:.2e}")
                
            except Exception as e:
                print(f"   ⚠️ GPU solver failed: {e}")
                results['cpu_times'].append(cpu_time)
                results['gpu_times'].append(float('inf'))
                results['speedups'].append(0)
                results['gpu_memory'].append(0)
        
        # Plot comparison results
        fig, ((ax1, ax2), (ax3, ax4)) = plt.subplots(2, 2, figsize=(12, 10))
        
        # Execution times
        ax1.plot(problem_sizes, results['cpu_times'], 'bo-', label='CPU', linewidth=2)
        ax1.plot(problem_sizes, results['gpu_times'], 'ro-', label='GPU', linewidth=2)
        ax1.set_xlabel('Grid Size')
        ax1.set_ylabel('Execution Time (s)')
        ax1.set_title('CPU vs GPU Execution Times')
        ax1.legend()
        ax1.grid(True, alpha=0.3)
        
        # Speedup
        ax2.plot(problem_sizes, results['speedups'], 'go-', linewidth=2, markersize=8)
        ax2.axhline(y=1, color='red', linestyle='--', alpha=0.7, label='No speedup')
        ax2.set_xlabel('Grid Size')
        ax2.set_ylabel('GPU Speedup')
        ax2.set_title('GPU Speedup vs Problem Size')
        ax2.legend()
        ax2.grid(True, alpha=0.3)
        
        # GPU memory usage
        ax3.plot(problem_sizes, results['gpu_memory'], 'mo-', linewidth=2)
        ax3.set_xlabel('Grid Size')
        ax3.set_ylabel('GPU Memory Usage (MB)')
        ax3.set_title('GPU Memory Consumption')
        ax3.grid(True, alpha=0.3)
        
        # Throughput: unknowns processed per second (a proxy, not a FLOP count)
        N_values = [size**2 for size in problem_sizes]
        cpu_throughput = [N / t for N, t in zip(N_values, results['cpu_times'])]
        # Keep each GPU point paired with its own problem size; skip failed runs
        gpu_points = [(N, N / t) for N, t in zip(N_values, results['gpu_times'])
                      if np.isfinite(t) and t > 0]
        
        ax4.loglog(N_values, cpu_throughput, 'bo-', label='CPU unknowns/s', linewidth=2)
        if gpu_points:
            gpu_N, gpu_throughput = zip(*gpu_points)
            ax4.loglog(gpu_N, gpu_throughput, 'ro-', label='GPU unknowns/s', linewidth=2)
        ax4.set_xlabel('Problem Size (N)')
        ax4.set_ylabel('Throughput (unknowns/second)')
        ax4.set_title('Computational Throughput')
        ax4.legend()
        ax4.grid(True, alpha=0.3)
        
        plt.tight_layout()
        plt.show()
        
        return results
    
    # Run GPU performance comparison
    gpu_results = compare_cpu_gpu_performance([64, 128, 256])
    
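    # GPU performance summary (only over runs where the GPU solve succeeded;
    # max() on an empty list would raise ValueError)
    successful_speedups = [s for s in gpu_results['speedups'] if s > 0]
    if successful_speedups:
        max_speedup = max(successful_speedups)
        avg_speedup = np.mean(successful_speedups)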
        
        print(f"\n🏆 GPU Performance Summary:")
        print(f"   Maximum speedup: {max_speedup:.2f}×")
        print(f"   Average speedup: {avg_speedup:.2f}×")
        print(f"   Peak GPU memory: {max(gpu_results['gpu_memory']):.1f}MB")

else:
    print("⚠️ GPU acceleration not available")
    print("   Install CuPy for GPU support, e.g. pip install cupy-cuda11x (pick the wheel matching your CUDA version)")
    print("   Or use the GPU Docker image with NVIDIA runtime")
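
Single-shot timings like those in `compare_cpu_gpu_performance` are sensitive to first-call overhead and, on the GPU, to asynchronous kernel launches. The sketch below shows one way to time a callable with warm-up runs and an optional synchronization hook. The helper name and its parameters are hypothetical; with CuPy, `cp.cuda.Stream.null.synchronize` could be passed as `sync`.

```python
import time

def time_solve(fn, warmup=1, repeats=3, sync=None):
    """Return the best wall-clock time of fn() over `repeats` timed runs.

    `sync`, if given, is called after each run inside the timed region so
    that asynchronous work is included in the measurement.
    """
    if repeats < 1:
        raise ValueError("repeats must be at least 1")

    def run_once():
        fn()
        if sync is not None:
            sync()

    for _ in range(warmup):  # untimed runs absorb one-time setup costs
        run_once()

    times = []
    for _ in range(repeats):
        start = time.perf_counter()
        run_once()
        times.append(time.perf_counter() - start)
    return min(times)

# Usage with a CPU-only stand-in workload
best = time_solve(lambda: sum(i * i for i in range(10_000)), warmup=1, repeats=5)
print(f"best of 5: {best:.6f}s")
```

Reporting the minimum rather than the mean reduces the influence of unrelated system activity, at the cost of hiding run-to-run variance.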

## 7. Automated Benchmarking and Validation Workflow

Set up automated workflows for continuous benchmarking and validation.

In [None]:
class AutomatedValidationWorkflow:
    """Automated validation and benchmarking workflow."""
    
    def __init__(self):
        self.validation_results = []
        self.benchmark_results = []
        self.performance_baselines = {}
    
    def run_mms_validation_suite(self) -> Dict:
        """Run comprehensive Method of Manufactured Solutions validation."""
        print("🧪 Running MMS Validation Suite...")
        
        # Initialize MMS validator
        validator = MMSValidation()
        
        # Test different problem types
        test_problems = [
            ('polynomial', lambda x, y: x**2 * y**2 * (1-x)**2 * (1-y)**2),
            ('trigonometric', lambda x, y: np.sin(np.pi*x) * np.sin(np.pi*y)),
            ('exponential', lambda x, y: np.exp(x) * np.cos(2*np.pi*y)),
            ('mixed', lambda x, y: x*np.sin(2*np.pi*y) + y*np.cos(2*np.pi*x))
        ]
        
        grid_sizes = [32, 64, 128, 256]
        validation_summary = {
            'test_count': 0,
            'passed_count': 0,
            'convergence_rates': [],
            'max_errors': [],
            'problem_types': []
        }
        
        for problem_name, exact_func in test_problems:
            print(f"\n🔍 Testing {problem_name} problem...")
            
            try:
                # Run validation across grid sizes
                errors = []
                
                for size in grid_sizes:
                    result = validator.validate_solver(
                        grid_size=size,
                        exact_solution=exact_func,
                        tolerance=1e-10
                    )
                    errors.append(result['max_error'])
                    validation_summary['test_count'] += 1
                
                # Calculate convergence rate
                h_values = [1.0/size for size in grid_sizes]
                
                # Linear regression: log(error) = p*log(h) + c
                log_h = np.log(h_values)
                log_errors = np.log(errors)
                
                # Remove any -inf values (perfect solutions)
                valid_indices = np.isfinite(log_errors)
                if np.sum(valid_indices) >= 2:
                    coeffs = np.polyfit(log_h[valid_indices], log_errors[valid_indices], 1)
                    convergence_rate = coeffs[0]
                else:
                    convergence_rate = 2.0  # Assume optimal
                
                # Check if convergence rate is acceptable
                if convergence_rate >= 1.8:  # Allow some tolerance
                    validation_summary['passed_count'] += 1
                    status = "✅ PASSED"
                else:
                    status = "❌ FAILED"
                
                validation_summary['convergence_rates'].append(convergence_rate)
                validation_summary['max_errors'].append(max(errors))
                validation_summary['problem_types'].append(problem_name)
                
                print(f"   Convergence rate: {convergence_rate:.3f} {status}")
                print(f"   Max error: {max(errors):.2e}")
                
            except Exception as e:
                print(f"   ⚠️ Validation failed: {e}")
                validation_summary['test_count'] += 1
        
        pass_rate = validation_summary['passed_count'] / len(test_problems) if test_problems else 0
        
        print(f"\n📊 Validation Summary:")
        print(f"   Tests run: {len(test_problems)}")
        print(f"   Tests passed: {validation_summary['passed_count']}")
        print(f"   Pass rate: {pass_rate:.1%}")
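        if validation_summary['convergence_rates']:
            print(f"   Average convergence rate: {np.mean(validation_summary['convergence_rates']):.3f}")
        else:
            print("   Average convergence rate: n/a (no problem completed)")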
        
        self.validation_results.append(validation_summary)
        return validation_summary
    
    def run_performance_regression_tests(self) -> Dict:
        """Run performance regression tests against baselines."""
        print("\n⚡ Running Performance Regression Tests...")
        
        # Define performance baselines (these would be established over time)
        if not self.performance_baselines:
            self.performance_baselines = {
                '64x64': {'cpu_time': 0.01, 'memory_mb': 5, 'iterations': 8},
                '128x128': {'cpu_time': 0.05, 'memory_mb': 20, 'iterations': 8},
                '256x256': {'cpu_time': 0.25, 'memory_mb': 80, 'iterations': 8},
            }
        
        regression_results = {
            'test_sizes': [],
            'performance_ratios': [],
            'memory_ratios': [],
            'iteration_ratios': [],
            'regressions_detected': 0
        }
        
        for size_key, baseline in self.performance_baselines.items():
            size = int(size_key.split('x')[0])
            print(f"\n🔍 Testing {size}×{size} performance...")
            
            # Create test problem
            grid = Grid(nx=size, ny=size)
            solver = MultigridSolver(grid)
            
            x = np.linspace(0, 1, size)
            y = np.linspace(0, 1, size)
            X, Y = np.meshgrid(x, y)
            u_exact = np.sin(np.pi * X) * np.sin(np.pi * Y)
            rhs = 2 * np.pi**2 * u_exact
            
            # Measure performance
            process = psutil.Process()
            memory_before = process.memory_info().rss / 1024 / 1024  # MB
            
            start_time = time.perf_counter()
            solution = solver.solve(rhs, np.zeros_like(rhs), tol=1e-8, max_iter=50)
            solve_time = time.perf_counter() - start_time
            
            memory_after = process.memory_info().rss / 1024 / 1024  # MB
            memory_used = memory_after - memory_before
            
            iterations = getattr(solver, 'iteration_count', -1)
            
            # Compare to baselines
            perf_ratio = solve_time / baseline['cpu_time']
            memory_ratio = memory_used / baseline['memory_mb']
            iter_ratio = iterations / baseline['iterations'] if iterations > 0 else 1.0
            
            # Check for regressions (time, memory, or iterations more than 20% above baseline)
            regression_threshold = 1.2
            regression_detected = (
                perf_ratio > regression_threshold or
                memory_ratio > regression_threshold or
                iter_ratio > regression_threshold
            )
            
            if regression_detected:
                regression_results['regressions_detected'] += 1
                status = "⚠️ REGRESSION"
            else:
                status = "✅ PASSED"
            
            regression_results['test_sizes'].append(size_key)
            regression_results['performance_ratios'].append(perf_ratio)
            regression_results['memory_ratios'].append(memory_ratio)
            regression_results['iteration_ratios'].append(iter_ratio)
            
            print(f"   Time: {solve_time:.4f}s (baseline: {baseline['cpu_time']:.4f}s, ratio: {perf_ratio:.2f}) {status}")
            print(f"   Memory: {memory_used:.1f}MB (baseline: {baseline['memory_mb']:.1f}MB, ratio: {memory_ratio:.2f})")
            print(f"   Iterations: {iterations} (baseline: {baseline['iterations']}, ratio: {iter_ratio:.2f})")
        
        print(f"\n📊 Regression Test Summary:")
        print(f"   Tests run: {len(self.performance_baselines)}")
        print(f"   Regressions detected: {regression_results['regressions_detected']}")
        if regression_results['regressions_detected'] == 0:
            print(f"   Status: ✅ ALL PERFORMANCE TESTS PASSED")
        else:
            print(f"   Status: ⚠️ PERFORMANCE REGRESSIONS DETECTED")
        
        return regression_results
    
    def generate_validation_report(self) -> None:
        """Generate comprehensive validation and performance report."""
        print("\n" + "="*60)
        print("📋 COMPREHENSIVE VALIDATION REPORT")
        print("="*60)
        
        if self.validation_results:
            latest_validation = self.validation_results[-1]
            print(f"\n🧪 Mathematical Validation:")
            print(f"   Problem types tested: {len(latest_validation['problem_types'])}")
            print(f"   Convergence tests passed: {latest_validation['passed_count']}/{len(latest_validation['problem_types'])}")
            print(f"   Average convergence rate: {np.mean(latest_validation['convergence_rates']):.3f}")
            print(f"   Status: {'✅ PASSED' if latest_validation['passed_count'] >= len(latest_validation['problem_types']) * 0.8 else '❌ FAILED'}")
        
        # Overall system health
        print(f"\n🏥 System Health:")
        print(f"   Package version: {multigrid.__version__}")
        print(f"   GPU support: {'✅ Available' if GPU_AVAILABLE else '❌ Not Available'}")
        print(f"   Memory usage: {psutil.Process().memory_info().rss / 1024 / 1024:.1f}MB")
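        # A short sampling interval is needed; the first non-blocking call returns a meaningless 0.0
        print(f"   CPU usage: {psutil.cpu_percent(interval=0.1):.1f}%")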
        
        # Recommendations
        print(f"\n💡 Recommendations:")
        print(f"   • Regular validation: Run this workflow weekly")
        print(f"   • Performance monitoring: Track regression trends")
        print(f"   • GPU optimization: Consider GPU acceleration for large problems")
        print(f"   • Memory optimization: Monitor memory usage for large-scale problems")

# Create and run automated validation workflow
workflow = AutomatedValidationWorkflow()

print("🤖 Automated Validation Workflow Initialized")
print("   Ready to run comprehensive validation and benchmarking tests")

## 8. Comprehensive Example: Solving a Real-World Problem

Put everything together to solve a realistic engineering problem with monitoring.
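
Before running the cell below, it helps to estimate the cost of its explicit time stepping. For a forward-Euler update of the diffusion equation on a uniform 2D grid, the time step has to satisfy `k_max * dt * (1/dx**2 + 1/dy**2) <= 1/2`. The sketch below evaluates that bound. `explicit_step_estimate` is a hypothetical helper, not part of the `multigrid` package, and it assumes the grid spans the unit square with `n` points per side.

```python
import math

def explicit_step_estimate(nx, ny, k_max, final_time, length=1.0):
    """Return (dt_max, n_steps) for forward-Euler diffusion on an nx-by-ny grid."""
    # Node spacing for n points covering [0, length]
    dx = length / (nx - 1)
    dy = length / (ny - 1)
    # Largest stable step for the explicit five-point scheme
    dt_max = 0.5 / (k_max * (1.0 / dx**2 + 1.0 / dy**2))
    # Number of steps needed to reach final_time at that step size
    n_steps = math.ceil(final_time / dt_max)
    return dt_max, n_steps

dt_max, n_steps = explicit_step_estimate(nx=128, ny=128, k_max=10.0, final_time=1.0)
print(f"dt_max = {dt_max:.3e} s, steps to reach t = 1.0 s: {n_steps}")
```

For the 128×128 grid and a maximum conductivity of 10 used in this section, the bound is about 1.55e-6 s, which means roughly 6.5e5 explicit steps to reach t = 1 s. An implicit scheme such as Crank-Nicolson does not have this stability restriction.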

In [None]:
def solve_heat_transfer_problem():
    """
    Solve a realistic heat transfer problem with advanced features.
    
    Problem: Heat conduction in a composite material with different thermal conductivities
    and time-dependent boundary conditions.
    """
    print("🔥 Solving Heat Transfer in Composite Material")
    print("=" * 55)
    
    # Problem setup
    nx, ny = 128, 128
    grid = Grid(nx=nx, ny=ny)
    
    # Create heat equation solver (Crank-Nicolson scheme); the simplified loop below does not call it
    heat_solver = HeatEquationSolver(
        grid=grid,
        diffusivity=1.0,  # Base diffusivity
        time_scheme='crank_nicolson'
    )
    
    # Set up composite material properties
    x = np.linspace(0, 1, nx)
    y = np.linspace(0, 1, ny)
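    # indexing='ij' makes axis 0 the x-direction, matching the T[i, j] indexing used below
    X, Y = np.meshgrid(x, y, indexing='ij')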
    
    # Define material regions with different thermal conductivities
    def thermal_conductivity(x, y):
        """Variable thermal conductivity for composite material."""
        # High conductivity in center region (metal insert)
        center_region = ((x - 0.5)**2 + (y - 0.5)**2) < 0.15**2
        
        # Medium conductivity in intermediate region
        intermediate_region = (((x - 0.5)**2 + (y - 0.5)**2) < 0.25**2) & ~center_region
        
        # Low conductivity elsewhere (insulator)
        conductivity = np.ones_like(x) * 0.1  # Base: insulator
        conductivity = np.where(intermediate_region, 1.0, conductivity)  # Medium: ceramic
        conductivity = np.where(center_region, 10.0, conductivity)  # High: metal
        
        return conductivity
    
    k_field = thermal_conductivity(X, Y)
    
    # Time-dependent boundary conditions
    def boundary_temperature(t):
        """Time-dependent boundary temperature."""
        return 100.0 * (1 + 0.5 * np.sin(2 * np.pi * t))  # Oscillating temperature
    
    # Initial condition: room temperature
    T_initial = np.ones((nx, ny)) * 20.0
    
    # Set up real-time monitoring
    dashboard = solver_dashboard()
    dashboard.start_monitoring()
    
    # Solution parameters
    final_time = 1.0
    dt_initial = 0.01
    tolerance = 1e-6
    
    print(f"Problem parameters:")
    print(f"   Grid: {nx}×{ny}")
    print(f"   Final time: {final_time}s")
    print(f"   Initial time step: {dt_initial}s")
    print(f"   Material regions: 3 (insulator, ceramic, metal)")
    print(f"   Boundary condition: Time-dependent temperature")
    
    # Visualization setup
    vis_tools = AdvancedVisualizationTools()
    
    # Time stepping loop with monitoring
    T = T_initial.copy()
    t = 0.0
    dt = dt_initial
    
    # Grid spacing consistent with np.linspace(0, 1, n) above
    dx = 1.0 / (nx - 1)
    dy = 1.0 / (ny - 1)
    
    # Explicit-scheme stability limit (constant over the run, so computed once)
    dt_stable = min(dt, 0.25 * min(dx**2, dy**2) / np.max(k_field))
    
    # Keep about 200 snapshots; storing every step would exhaust memory
    save_every = max(1, int(final_time / dt_stable) // 200)
    print_every = 10 * save_every
    
    # Record the initial state so index 0 corresponds to t = 0
    time_steps = [t]
    temperatures = [T.copy()]
    max_temps = [np.max(T)]
    
    print(f"\n🚀 Starting time integration...")
    
    step = 0
    while t < final_time:
        step += 1
        dt_step = min(dt_stable, final_time - t)  # Do not step past final_time
        
        # Apply time-dependent boundary conditions
        bc_temp = boundary_temperature(t)
        
        # Set boundary conditions (Dirichlet on all boundaries)
        T[0, :] = bc_temp    # Left
        T[-1, :] = bc_temp   # Right  
        T[:, 0] = bc_temp    # Bottom
        T[:, -1] = bc_temp   # Top
        
        # Take time step (simplified - would use heat_solver.step in real implementation)
        # For demonstration, use a simple explicit scheme that treats k as locally
        # constant, i.e. dT/dt = k * laplacian(T), vectorized over interior points
        laplacian = (
            (T[2:, 1:-1] - 2*T[1:-1, 1:-1] + T[:-2, 1:-1]) / dx**2 +
            (T[1:-1, 2:] - 2*T[1:-1, 1:-1] + T[1:-1, :-2]) / dy**2
        )
        T[1:-1, 1:-1] += dt_step * k_field[1:-1, 1:-1] * laplacian
        t += dt_step
        
        # Store results
        if step % save_every == 0 or t >= final_time:
            time_steps.append(t)
            temperatures.append(T.copy())
            max_temps.append(np.max(T))
        
        # Print progress
        if step % print_every == 0 or t >= final_time:
            print(f"   Step {step:3d}: t = {t:.4f}s, dt = {dt_step:.6f}s, "
                  f"T_max = {np.max(T):.1f}°C, T_avg = {np.mean(T):.1f}°C")
    
    print(f"   Completed {step} steps, stored {len(temperatures)} snapshots")
    
    dashboard.stop_monitoring()
    
    # Create comprehensive visualizations
    print(f"\n🎨 Creating visualizations...")
    
    fig, axes = plt.subplots(2, 3, figsize=(18, 12))
    
    # 1. Material properties
    # Fields are indexed [x, y]; transpose so x runs along the horizontal axis
    im1 = axes[0, 0].imshow(k_field.T, extent=[0, 1, 0, 1], origin='lower', cmap='hot')
    axes[0, 0].set_title('Thermal Conductivity\n(k = 0.1, 1.0, 10.0)')
    axes[0, 0].set_xlabel('x')
    axes[0, 0].set_ylabel('y')
    plt.colorbar(im1, ax=axes[0, 0])
    
    # 2. Initial temperature
    im2 = axes[0, 1].imshow(temperatures[0].T, extent=[0, 1, 0, 1], origin='lower', cmap='coolwarm')
    axes[0, 1].set_title(f'Initial Temperature\n(t = 0.0s)')
    axes[0, 1].set_xlabel('x')
    axes[0, 1].set_ylabel('y')
    plt.colorbar(im2, ax=axes[0, 1])
    
    # 3. Final temperature
    im3 = axes[0, 2].imshow(temperatures[-1].T, extent=[0, 1, 0, 1], origin='lower', cmap='coolwarm')
    axes[0, 2].set_title(f'Final Temperature\n(t = {time_steps[-1]:.2f}s)')
    axes[0, 2].set_xlabel('x')
    axes[0, 2].set_ylabel('y')
    plt.colorbar(im3, ax=axes[0, 2])
    
    # 4. Temperature evolution at center
    center_temps = [T[nx//2, ny//2] for T in temperatures]
    axes[1, 0].plot(time_steps, center_temps, 'b-', linewidth=2, label='Center')
    axes[1, 0].plot(time_steps, max_temps, 'r-', linewidth=2, label='Maximum')
    boundary_temps = [boundary_temperature(t) for t in time_steps]
    axes[1, 0].plot(time_steps, boundary_temps, 'g--', linewidth=2, label='Boundary')
    axes[1, 0].set_xlabel('Time (s)')
    axes[1, 0].set_ylabel('Temperature (°C)')
    axes[1, 0].set_title('Temperature Evolution')
    axes[1, 0].legend()
    axes[1, 0].grid(True, alpha=0.3)
    
    # 5. Cross-section temperature profile
    y_center = ny // 2
    x_coords = np.linspace(0, 1, nx)
    
    # Plot profiles at different times
    time_indices = [0, len(temperatures)//4, len(temperatures)//2, -1]
    # Label each profile with the time actually recorded at that index
    time_labels = [f't={time_steps[i]:.2f}s' for i in time_indices]
    
    for idx, label in zip(time_indices, time_labels):
        profile = temperatures[idx][:, y_center]
        axes[1, 1].plot(x_coords, profile, linewidth=2, label=label)
    
    axes[1, 1].set_xlabel('x-coordinate')
    axes[1, 1].set_ylabel('Temperature (°C)')
    axes[1, 1].set_title('Temperature Profile (y = 0.5)')
    axes[1, 1].legend()
    axes[1, 1].grid(True, alpha=0.3)
    
    # 6. Heat flux visualization
    # Calculate heat flux q = -k * grad(T) for the final state
    T_final = temperatures[-1]
    flux_x = np.zeros_like(T_final)
    flux_y = np.zeros_like(T_final)
    
    # Central differences on interior points (axis 0 is x, axis 1 is y);
    # the spacing matches np.linspace(0, 1, n)
    dx, dy = 1.0 / (nx - 1), 1.0 / (ny - 1)
    flux_x[1:-1, 1:-1] = -k_field[1:-1, 1:-1] * (T_final[2:, 1:-1] - T_final[:-2, 1:-1]) / (2.0 * dx)
    flux_y[1:-1, 1:-1] = -k_field[1:-1, 1:-1] * (T_final[1:-1, 2:] - T_final[1:-1, :-2]) / (2.0 * dy)
    
    # Create quiver plot (subsample for clarity); transpose so rows follow y for plotting
    skip = 8
    x_sub, y_sub = x[::skip], y[::skip]
    flux_x_sub = flux_x[::skip, ::skip].T
    flux_y_sub = flux_y[::skip, ::skip].T
    
    flux_mag = np.sqrt(flux_x**2 + flux_y**2)
    im6 = axes[1, 2].imshow(flux_mag.T, extent=[0, 1, 0, 1], origin='lower', cmap='plasma')
    axes[1, 2].quiver(x_sub, y_sub, flux_x_sub, flux_y_sub, 
                     angles='xy', scale_units='xy', scale=50, color='white', alpha=0.7)
    axes[1, 2].set_title('Heat Flux Magnitude & Direction')
    axes[1, 2].set_xlabel('x')
    axes[1, 2].set_ylabel('y')
    plt.colorbar(im6, ax=axes[1, 2])
    
    plt.tight_layout()
    plt.show()
    
    # Performance summary
    print(f"\n📊 Solution Summary:")
    print(f"   Recorded time levels: {len(time_steps)}")
    print(f"   Final time: {time_steps[-1]:.4f}s")
    print(f"   Average interval between recorded levels: {np.mean(np.diff(time_steps)):.6f}s")
    print(f"   Final max temperature: {max_temps[-1]:.1f}°C")
    print(f"   Final center temperature: {center_temps[-1]:.1f}°C")
    print(f"   Max heat flux magnitude: {np.max(flux_mag):.1f} (units of k × °C/m)")
    
    return {
        'time_steps': time_steps,
        'temperatures': temperatures,
        'conductivity': k_field,
        'max_temperatures': max_temps
    }

# Run the comprehensive example
print("🚀 Running Comprehensive Real-World Example")
print("This demonstrates all advanced features working together:")
print("  • Variable material properties")
print("  • Time-dependent boundary conditions")
print("  • Real-time monitoring")
print("  • Advanced visualization")
print("  • Performance analysis")

# Uncomment to run the full example (takes a few minutes)
# results = solve_heat_transfer_problem()

print("\n✅ Advanced Features Tutorial Completed!")
print("You have learned:")
print("  • Advanced solver configuration and optimization")
print("  • Real-time monitoring and visualization")
print("  • Performance profiling and analysis")
print("  • Complex boundary conditions and material properties")
print("  • GPU acceleration techniques")
print("  • Automated validation workflows")
print("  • Integration of all features in realistic problems")

## Summary and Next Steps

This tutorial has demonstrated the advanced capabilities of the Mixed-Precision Multigrid Solvers package:

### Key Features Covered:
1. **Advanced Solver Configuration** - Problem-specific parameter optimization
2. **Real-Time Monitoring** - Dashboard integration with live performance tracking
3. **Performance Profiling** - Comprehensive analysis across problem sizes
4. **Complex Problem Setup** - Variable coefficients and mixed boundary conditions
5. **GPU Acceleration** - Performance comparison and optimization
6. **Automated Workflows** - Validation and benchmarking automation
7. **Real-World Integration** - Complete heat transfer problem solution
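
For the variable-coefficient case in items 4 and 7, note that when conductivity varies in space the heat equation reads ∂T/∂t = ∇·(k∇T), and a conservative discretization evaluates k at the faces between grid nodes. The following is a minimal NumPy sketch of that idea, not the package's implementation. The function name `div_k_grad` and the arithmetic face averaging are illustrative assumptions; harmonic averaging is a common alternative at sharp material interfaces.

```python
import numpy as np

def div_k_grad(T, k, dx, dy):
    """Approximate div(k grad T) at interior points; axis 0 is x, axis 1 is y."""
    # Conductivity at the faces between neighbouring nodes (arithmetic mean)
    kx = 0.5 * (k[1:, :] + k[:-1, :])   # x-faces, shape (nx-1, ny)
    ky = 0.5 * (k[:, 1:] + k[:, :-1])   # y-faces, shape (nx, ny-1)

    # k times the one-sided temperature difference across each face
    fx = kx * (T[1:, :] - T[:-1, :]) / dx
    fy = ky * (T[:, 1:] - T[:, :-1]) / dy

    # Difference of face values around each interior node; boundary entries stay zero
    out = np.zeros_like(T)
    out[1:-1, 1:-1] = (
        (fx[1:, 1:-1] - fx[:-1, 1:-1]) / dx
        + (fy[1:-1, 1:] - fy[1:-1, :-1]) / dy
    )
    return out

# Check on a field with a known answer: k = 2 and T = x^2 + y^2 give 2 * 4 = 8
x = np.linspace(0.0, 1.0, 6)
y = np.linspace(0.0, 1.0, 6)
X, Y = np.meshgrid(x, y, indexing="ij")
T = X**2 + Y**2
k = np.full_like(T, 2.0)
result = div_k_grad(T, k, dx=x[1] - x[0], dy=y[1] - y[0])
print(result[1:-1, 1:-1])
```

With constant k this reduces to k times the standard five-point Laplacian, so it can replace a `k * laplacian` update without changing results in uniform regions.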

### Production Deployment:
- Use Docker containers for consistent deployment
- Set up automated monitoring with Prometheus/Grafana
- Implement continuous validation workflows
- Optimize for your specific hardware configuration

### Performance Optimization Tips:
- Choose appropriate solver parameters for your problem type
- Consider GPU acceleration for large problems (roughly >100k unknowns); benchmark on your own hardware to confirm the crossover point
- Monitor memory usage and system resources
- Set up automated performance regression testing
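
The last two tips can be combined into a small standalone check. The sketch below is only an illustration: `measure` and `check_regression` are hypothetical helpers, the baseline numbers are made up, and it uses the standard-library `tracemalloc` (Python-level allocations only) instead of the `psutil` process memory used earlier in this tutorial. The metric names and the 1.2 threshold mirror the regression workflow in Section 7.

```python
import time
import tracemalloc

def measure(fn, *args, **kwargs):
    """Run fn once; return (result, elapsed_seconds, peak_python_memory_mb)."""
    tracemalloc.start()
    start = time.perf_counter()
    try:
        result = fn(*args, **kwargs)
        elapsed = time.perf_counter() - start
        _, peak_bytes = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    return result, elapsed, peak_bytes / (1024 * 1024)

def check_regression(measured, baseline, threshold=1.2):
    """Return {metric: ratio} for every metric exceeding threshold * baseline."""
    regressions = {}
    for name, base in baseline.items():
        if name not in measured or base <= 0:
            continue  # nothing meaningful to compare against
        ratio = measured[name] / base
        if ratio > threshold:
            regressions[name] = ratio
    return regressions

# Stand-in workload; replace with a real solve and recorded baselines
_, seconds, peak_mb = measure(sum, range(100_000))
baseline = {"cpu_time": 1.0, "memory_mb": 50.0, "iterations": 10}
measured = {"cpu_time": seconds, "memory_mb": peak_mb, "iterations": 13}
print(check_regression(measured, baseline))
```

An empty dictionary means every metric stayed within 20% of its baseline. Running a check like this in continuous integration makes a regression visible on the change that introduced it.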

### Further Resources:
- API Documentation: Complete reference for all functions
- Benchmarking Guide: Detailed performance analysis methods  
- GPU Optimization: Advanced CUDA kernel configuration
- Production Deployment: Docker and Kubernetes setup guides