# SciTeX Parallel Tutorial

This notebook demonstrates how to use the `scitex.parallel` module to run functions in parallel on a thread pool.

## Features Covered

* Parallel function execution with ThreadPoolExecutor
* Automatic CPU core detection and utilization
* Progress tracking with tqdm integration
* Handling multiple argument functions
* Error handling and performance optimization
* Scientific computing applications

## 1. Basic Setup and Imports

In [None]:
import numpy as np
import time
import multiprocessing
from scitex import parallel as stx_parallel
import matplotlib.pyplot as plt
from concurrent.futures import ThreadPoolExecutor

print("SciTeX Parallel Tutorial")
print("Available functions:", dir(stx_parallel))
print(f"Available CPU cores: {multiprocessing.cpu_count()}")

## 2. Basic Parallel Processing

### Simple Mathematical Operations

In [None]:
# Define a simple function to run in parallel
def square_number(x):
    """Square a number with a small delay to simulate computation."""
    time.sleep(0.1)  # Simulate computational work
    return x ** 2

# Create arguments list - each tuple contains arguments for one function call
numbers = list(range(1, 11))
args_list = [(num,) for num in numbers]  # Convert to tuple format

print(f"Numbers to process: {numbers}")
print(f"Arguments list format: {args_list[:3]}...")  # Show first 3

# Time sequential execution
start_time = time.time()
sequential_results = [square_number(num) for num in numbers]
sequential_time = time.time() - start_time

print(f"\nSequential execution:")
print(f"  Time: {sequential_time:.3f} seconds")
print(f"  Results: {sequential_results}")

# Time parallel execution
start_time = time.time()
parallel_results = stx_parallel.run(square_number, args_list, n_jobs=4, desc="Squaring numbers")
parallel_time = time.time() - start_time

print(f"\nParallel execution (4 workers):")
print(f"  Time: {parallel_time:.3f} seconds")
print(f"  Results: {parallel_results}")
print(f"  Speedup: {sequential_time/parallel_time:.2f}x")

# Verify results are identical
print(f"\nResults match: {sequential_results == parallel_results}")
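
The speedup printed above can be compared against a simple upper bound. The sketch below assumes equal-length tasks that run in "waves" of `n_workers` with no scheduling overhead; `ideal_parallel_time` is a hypothetical helper, not part of `scitex.parallel`.

```python
import math

def ideal_parallel_time(n_tasks, task_seconds, n_workers):
    """Best-case wall time when equal-length tasks run in waves of n_workers."""
    waves = math.ceil(n_tasks / n_workers)
    return waves * task_seconds

# Ten 0.1 s tasks on 4 workers need three waves (4 + 4 + 2 tasks)
sequential = 10 * 0.1
parallel = ideal_parallel_time(10, 0.1, 4)
print(f"Ideal speedup: {sequential / parallel:.2f}x")
```

A measured speedup noticeably below this bound usually points to thread start-up or progress-bar overhead rather than to the tasks themselves.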

### Multiple Argument Functions

In [None]:
# Define a function with multiple arguments
def compute_distance(x1, y1, x2, y2):
    """Compute Euclidean distance between two points."""
    time.sleep(0.05)  # Simulate computation
    return np.sqrt((x2 - x1)**2 + (y2 - y1)**2)

# Generate random point pairs
np.random.seed(42)
n_pairs = 20
point_pairs = []
for i in range(n_pairs):
    x1, y1 = np.random.uniform(-10, 10, 2)
    x2, y2 = np.random.uniform(-10, 10, 2)
    point_pairs.append((x1, y1, x2, y2))

print(f"Computing distances for {n_pairs} point pairs")
print(f"Sample point pair: ({point_pairs[0][0]:.2f}, {point_pairs[0][1]:.2f}) to ({point_pairs[0][2]:.2f}, {point_pairs[0][3]:.2f})")

# Run in parallel
start_time = time.time()
distances = stx_parallel.run(compute_distance, point_pairs, n_jobs=-1, desc="Computing distances")
parallel_time = time.time() - start_time

print(f"\nParallel computation completed in {parallel_time:.3f} seconds")
print(f"Sample distances: {[f'{d:.2f}' for d in distances[:5]]}")
print(f"Average distance: {np.mean(distances):.2f}")
print(f"Min/Max distances: {np.min(distances):.2f} / {np.max(distances):.2f}")

## 3. Functions Returning Multiple Values

In [None]:
# Define a function that returns multiple values
import math

def analyze_number(x):
    """Analyze a number and return multiple statistics."""
    time.sleep(0.02)  # Simulate computation
    square = x ** 2
    cube = x ** 3
    # Negative inputs get an imaginary square root
    sqrt = np.sqrt(x) if x >= 0 else np.sqrt(-x) * 1j
    # np.math was removed in NumPy 2.0; use the standard-library math module
    factorial = math.factorial(x) if 0 <= x <= 10 else None
    return square, cube, sqrt, factorial

# Test with a range of numbers
test_numbers = list(range(-5, 11))
args_list = [(num,) for num in test_numbers]

print(f"Analyzing numbers: {test_numbers}")

# Run parallel analysis
results = stx_parallel.run(analyze_number, args_list, n_jobs=4, desc="Analyzing numbers")

print(f"\nResults type: {type(results)}")
print(f"Number of result arrays: {len(results)}")

# Extract individual result arrays
squares, cubes, sqrts, factorials = results

print(f"\nSquares: {squares}")
print(f"Cubes: {cubes}")
print(f"Square roots: {[f'{abs(s):.2f}' + ('i' if isinstance(s, complex) else '') for s in sqrts]}")
print(f"Factorials: {factorials}")

# Visualize results
fig, axes = plt.subplots(2, 2, figsize=(12, 8))
fig.suptitle('Parallel Number Analysis Results')

# Plot squares
axes[0, 0].plot(test_numbers, squares, 'bo-')
axes[0, 0].set_title('Squares')
axes[0, 0].set_xlabel('Number')
axes[0, 0].set_ylabel('Square')
axes[0, 0].grid(True, alpha=0.3)

# Plot cubes
axes[0, 1].plot(test_numbers, cubes, 'ro-')
axes[0, 1].set_title('Cubes')
axes[0, 1].set_xlabel('Number')
axes[0, 1].set_ylabel('Cube')
axes[0, 1].grid(True, alpha=0.3)

# Plot square roots (magnitude)
sqrt_magnitudes = [abs(s) for s in sqrts]
axes[1, 0].plot(test_numbers, sqrt_magnitudes, 'go-')
axes[1, 0].set_title('Square Root Magnitudes')
axes[1, 0].set_xlabel('Number')
axes[1, 0].set_ylabel('|√x|')
axes[1, 0].grid(True, alpha=0.3)

# Plot factorials (for valid values)
valid_factorials = [(n, f) for n, f in zip(test_numbers, factorials) if f is not None]
if valid_factorials:
    valid_numbers, facts = zip(*valid_factorials)  # Avoid shadowing `numbers` from Section 2
    axes[1, 1].semilogy(valid_numbers, facts, 'mo-')
axes[1, 1].set_title('Factorials (log scale)')
axes[1, 1].set_xlabel('Number')
axes[1, 1].set_ylabel('n! (log scale)')
axes[1, 1].grid(True, alpha=0.3)

plt.tight_layout()
plt.show()

## 4. Scientific Computing Applications

### Monte Carlo Simulation

In [None]:
def monte_carlo_pi(n_samples, seed):
    """Estimate π using Monte Carlo method."""
    # Use a per-call generator: np.random.seed() mutates global state that
    # is shared between threads, so seeded runs would not be reproducible.
    rng = np.random.default_rng(seed)
    
    # Generate random points in the square [-1, 1] x [-1, 1]
    x = rng.uniform(-1, 1, n_samples)
    y = rng.uniform(-1, 1, n_samples)
    
    # Count points inside unit circle
    inside_circle = (x**2 + y**2) <= 1
    n_inside = np.sum(inside_circle)
    pi_estimate = 4 * n_inside / n_samples
    
    return pi_estimate, n_inside, n_samples

# Run multiple Monte Carlo simulations in parallel
n_simulations = 20
samples_per_sim = 100000

# Create arguments: (n_samples, seed) for each simulation
mc_args = [(samples_per_sim, i) for i in range(n_simulations)]

print(f"Running {n_simulations} Monte Carlo simulations")
print(f"Each simulation uses {samples_per_sim:,} samples")

start_time = time.time()
mc_results = stx_parallel.run(monte_carlo_pi, mc_args, n_jobs=-1, desc="Monte Carlo π estimation")
parallel_time = time.time() - start_time

# Extract results
pi_estimates, points_inside, total_points = mc_results

print(f"\nParallel execution completed in {parallel_time:.3f} seconds")
print(f"π estimates: {[f'{est:.4f}' for est in pi_estimates[:5]]}...")
print(f"Average π estimate: {np.mean(pi_estimates):.6f}")
print(f"Standard deviation: {np.std(pi_estimates):.6f}")
print(f"True π value: {np.pi:.6f}")
print(f"Error: {abs(np.mean(pi_estimates) - np.pi):.6f}")

# Visualize the spread of the estimates
plt.figure(figsize=(12, 5))

plt.subplot(1, 2, 1)
plt.plot(pi_estimates, 'bo-', alpha=0.7, label='π estimates')
plt.axhline(y=np.pi, color='r', linestyle='--', label='True π')
plt.axhline(y=np.mean(pi_estimates), color='g', linestyle='--', label='Average estimate')
plt.xlabel('Simulation number')
plt.ylabel('π estimate')
plt.title('Monte Carlo π Estimates')
plt.legend()
plt.grid(True, alpha=0.3)

plt.subplot(1, 2, 2)
plt.hist(pi_estimates, bins=10, alpha=0.7, edgecolor='black')
plt.axvline(x=np.pi, color='r', linestyle='--', label='True π')
plt.axvline(x=np.mean(pi_estimates), color='g', linestyle='--', label='Average estimate')
plt.xlabel('π estimate')
plt.ylabel('Frequency')
plt.title('Distribution of π Estimates')
plt.legend()
plt.grid(True, alpha=0.3)

plt.tight_layout()
plt.show()
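
The spread of the estimates can be checked against theory. Each sample is a Bernoulli trial with success probability π/4, so the estimator `4 * hits / n_samples` has standard deviation `4 * sqrt(p * (1 - p) / n_samples)`. The sketch below evaluates that formula for the settings used in this section; `pi_estimate_std_error` is an illustrative name, not a SciTeX function.

```python
import math

def pi_estimate_std_error(n_samples):
    """Theoretical standard deviation of the estimator 4 * (hits / n_samples)."""
    p = math.pi / 4  # chance that a uniform point in the square lands in the circle
    return 4 * math.sqrt(p * (1 - p) / n_samples)

se_single = pi_estimate_std_error(100_000)
se_mean = se_single / math.sqrt(20)  # averaging 20 independent simulations
print(f"Per-run standard error: {se_single:.4f}")
print(f"Standard error of the 20-run mean: {se_mean:.4f}")
```

The per-run value is roughly 0.005, so a printed standard deviation far from that would suggest the simulations are not independent, for example because they share a seed.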

### Signal Processing in Parallel

In [None]:
def analyze_signal(signal_params):
    """Analyze a synthetic signal."""
    freq, amplitude, noise_level, duration = signal_params
    
    # Generate time vector (1000 samples per second, endpoint excluded)
    t = np.linspace(0, duration, int(1000 * duration), endpoint=False)
    
    # Generate signal with noise
    signal = amplitude * np.sin(2 * np.pi * freq * t)
    noise = np.random.normal(0, noise_level, len(t))
    noisy_signal = signal + noise
    
    # Compute statistics
    mean_power = np.mean(noisy_signal**2)
    peak_amplitude = np.max(np.abs(noisy_signal))
    snr = 20 * np.log10(amplitude / noise_level) if noise_level > 0 else np.inf
    
    # Simple frequency analysis (find dominant frequency)
    fft = np.fft.fft(noisy_signal)
    freqs = np.fft.fftfreq(len(t), t[1] - t[0])
    dominant_freq = freqs[np.argmax(np.abs(fft[:len(fft)//2]))]
    
    return {
        'frequency': freq,
        'amplitude': amplitude, 
        'noise_level': noise_level,
        'mean_power': mean_power,
        'peak_amplitude': peak_amplitude,
        'snr_db': snr,
        'detected_freq': dominant_freq,
        'freq_error': abs(dominant_freq - freq)
    }

# Generate signal parameters for analysis
signal_configs = []
for freq in [1, 2, 5, 10, 15]:  # Different frequencies
    for amp in [1.0, 2.0]:      # Different amplitudes
        for noise in [0.1, 0.3, 0.5]:  # Different noise levels
            signal_configs.append((freq, amp, noise, 2.0))  # 2 second duration

print(f"Analyzing {len(signal_configs)} signal configurations")
print(f"Sample configuration: freq={signal_configs[0][0]}Hz, amp={signal_configs[0][1]}, noise={signal_configs[0][2]}")

# Run parallel signal analysis
start_time = time.time()
signal_results = stx_parallel.run(analyze_signal, [(config,) for config in signal_configs], 
                                 n_jobs=-1, desc="Analyzing signals")
analysis_time = time.time() - start_time

print(f"\nSignal analysis completed in {analysis_time:.3f} seconds")
print(f"Average frequency detection error: {np.mean([r['freq_error'] for r in signal_results]):.3f} Hz")

# Analyze results
import pandas as pd

# Convert results to DataFrame for analysis
df = pd.DataFrame(signal_results)
print("\nSignal Analysis Summary:")
print(df.groupby(['frequency', 'noise_level']).agg({
    'snr_db': 'mean',
    'freq_error': 'mean',
    'peak_amplitude': 'mean'
}).round(3))

# Visualize results
fig, axes = plt.subplots(2, 2, figsize=(15, 10))
fig.suptitle('Parallel Signal Analysis Results')

# SNR vs Noise Level
for freq in df['frequency'].unique():
    freq_data = df[df['frequency'] == freq]
    axes[0, 0].plot(freq_data['noise_level'], freq_data['snr_db'], 'o-', label=f'{freq} Hz')
axes[0, 0].set_xlabel('Noise Level')
axes[0, 0].set_ylabel('SNR (dB)')
axes[0, 0].set_title('SNR vs Noise Level')
axes[0, 0].legend()
axes[0, 0].grid(True, alpha=0.3)

# Frequency Detection Error
axes[0, 1].scatter(df['frequency'], df['freq_error'], c=df['noise_level'], 
                  cmap='viridis', alpha=0.7)
axes[0, 1].set_xlabel('True Frequency (Hz)')
axes[0, 1].set_ylabel('Frequency Error (Hz)')
axes[0, 1].set_title('Frequency Detection Error')
cbar = plt.colorbar(axes[0, 1].collections[0], ax=axes[0, 1])
cbar.set_label('Noise Level')
axes[0, 1].grid(True, alpha=0.3)

# Mean Power vs Amplitude
axes[1, 0].scatter(df['amplitude'], df['mean_power'], c=df['frequency'], 
                  cmap='plasma', alpha=0.7)
axes[1, 0].set_xlabel('Signal Amplitude')
axes[1, 0].set_ylabel('Mean Power')
axes[1, 0].set_title('Mean Power vs Amplitude')
cbar = plt.colorbar(axes[1, 0].collections[0], ax=axes[1, 0])
cbar.set_label('Frequency (Hz)')
axes[1, 0].grid(True, alpha=0.3)

# Distribution of SNR values
axes[1, 1].hist(df['snr_db'], bins=15, alpha=0.7, edgecolor='black')
axes[1, 1].set_xlabel('SNR (dB)')
axes[1, 1].set_ylabel('Count')
axes[1, 1].set_title('Distribution of SNR Values')
axes[1, 1].grid(True, alpha=0.3)

plt.tight_layout()
plt.show()
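
The frequency detection error reported above is bounded by the FFT bin spacing, which equals the sample rate divided by the number of samples. The sketch below works through that arithmetic, assuming a 1000 Hz sample rate over 2 seconds (2000 samples); both helper names are hypothetical.

```python
def fft_bin_spacing(sample_rate_hz, n_samples):
    """Spacing between adjacent FFT frequency bins, in Hz."""
    return sample_rate_hz / n_samples

def nearest_bin_error(freq_hz, spacing_hz):
    """Distance from freq_hz to the closest FFT bin centre."""
    nearest = round(freq_hz / spacing_hz) * spacing_hz
    return abs(freq_hz - nearest)

spacing = fft_bin_spacing(1000, 2000)  # assumed: 1000 Hz for 2 s -> 2000 samples
for f in [1, 2, 5, 10, 15]:
    print(f"{f} Hz -> nearest-bin error {nearest_bin_error(f, spacing):.3f} Hz")
```

Under these assumptions every test frequency falls on a bin centre, so any remaining detection error comes from noise shifting the spectral peak rather than from the frequency grid.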

## 5. Performance Analysis and Optimization

### Worker Count Optimization

In [None]:
def cpu_intensive_task(n):
    """A CPU-intensive task for benchmarking."""
    # Pure-Python loop: it holds the GIL, so thread-based workers
    # may show little or no speedup on this task.
    result = 0
    for i in range(n * 1000):
        result += np.sin(i) * np.cos(i)
    return result

# Test different numbers of workers
max_workers = multiprocessing.cpu_count()
worker_counts = [1, 2, 4, max_workers//2, max_workers, max_workers*2]
worker_counts = [w for w in worker_counts if w >= 1]  # Drop invalid values (e.g. 0 on a single-core machine)
worker_counts = sorted(set(worker_counts))  # Remove duplicates

# Create workload
task_sizes = [500] * 20  # 20 tasks of moderate size
args_list = [(size,) for size in task_sizes]

print(f"Benchmarking with {len(task_sizes)} tasks")
print(f"Testing worker counts: {worker_counts}")

performance_results = []

for n_workers in worker_counts:
    print(f"\nTesting with {n_workers} workers...")
    
    start_time = time.time()
    results = stx_parallel.run(cpu_intensive_task, args_list, 
                              n_jobs=n_workers, desc=f"Workers: {n_workers}")
    execution_time = time.time() - start_time
    
    # The first run (1 worker) is the baseline for speedup and efficiency
    baseline_time = performance_results[0]['time'] if performance_results else execution_time
    speedup = baseline_time / execution_time
    efficiency = speedup / n_workers
    
    performance_results.append({
        'workers': n_workers,
        'time': execution_time,
        'speedup': speedup,
        'efficiency': efficiency
    })
    
    print(f"  Execution time: {execution_time:.3f} seconds")
    print(f"  Speedup: {speedup:.2f}x")
    print(f"  Efficiency: {efficiency:.2f}")

# Visualize performance results
perf_df = pd.DataFrame(performance_results)

fig, axes = plt.subplots(1, 3, figsize=(18, 5))

# Execution time vs workers
axes[0].plot(perf_df['workers'], perf_df['time'], 'bo-', linewidth=2, markersize=8)
axes[0].set_xlabel('Number of Workers')
axes[0].set_ylabel('Execution Time (seconds)')
axes[0].set_title('Execution Time vs Workers')
axes[0].grid(True, alpha=0.3)

# Speedup vs workers
axes[1].plot(perf_df['workers'], perf_df['speedup'], 'ro-', linewidth=2, markersize=8, label='Actual')
axes[1].plot(perf_df['workers'], perf_df['workers'], 'k--', alpha=0.5, label='Ideal Linear')
axes[1].set_xlabel('Number of Workers')
axes[1].set_ylabel('Speedup')
axes[1].set_title('Speedup vs Workers')
axes[1].legend()
axes[1].grid(True, alpha=0.3)

# Efficiency vs workers
axes[2].plot(perf_df['workers'], perf_df['efficiency'], 'go-', linewidth=2, markersize=8)
axes[2].axhline(y=1.0, color='k', linestyle='--', alpha=0.5, label='Perfect Efficiency')
axes[2].set_xlabel('Number of Workers')
axes[2].set_ylabel('Efficiency')
axes[2].set_title('Efficiency vs Workers')
axes[2].legend()
axes[2].grid(True, alpha=0.3)

plt.tight_layout()
plt.show()

# Find optimal worker count
optimal_idx = perf_df['time'].idxmin()
optimal_workers = perf_df.loc[optimal_idx, 'workers']
optimal_time = perf_df.loc[optimal_idx, 'time']

print(f"\nOptimal configuration:")
print(f"  Workers: {optimal_workers}")
print(f"  Execution time: {optimal_time:.3f} seconds")
print(f"  Speedup: {perf_df.loc[optimal_idx, 'speedup']:.2f}x")
print(f"  Efficiency: {perf_df.loc[optimal_idx, 'efficiency']:.2f}")

## 6. Error Handling and Edge Cases

In [None]:
def function_with_errors(x):
    """Function that may raise errors for demonstration."""
    if x < 0:
        raise ValueError(f"Negative input not allowed: {x}")
    elif x == 0:
        return float('inf')  # Special case
    else:
        return 1.0 / x

def safe_function_wrapper(x):
    """Wrapper that handles errors gracefully."""
    try:
        result = function_with_errors(x)
        return result, None  # (result, error)
    except Exception as e:
        return None, str(e)  # (result, error)

# Test with problematic inputs
test_inputs = [-2, -1, 0, 0.5, 1, 2, 5]
args_list = [(x,) for x in test_inputs]

print(f"Testing error handling with inputs: {test_inputs}")

# Run with safe wrapper
results = stx_parallel.run(safe_function_wrapper, args_list, n_jobs=4, desc="Error handling test")

# Extract results and errors
values, errors = results

print("\nResults:")
for inp, val, err in zip(test_inputs, values, errors):
    if err is None:
        print(f"  Input {inp}: Result = {val}")
    else:
        print(f"  Input {inp}: Error = {err}")

# Test edge cases
print("\n" + "="*50)
print("Testing Edge Cases")
print("="*50)

# Test with empty input
try:
    empty_result = stx_parallel.run(square_number, [], n_jobs=2)
    print("Empty input test: Unexpected success")
except ValueError as e:
    print(f"Empty input test: Correctly caught error - {e}")

# Test with invalid function
try:
    invalid_result = stx_parallel.run("not_a_function", [(1,)], n_jobs=2)
    print("Invalid function test: Unexpected success")
except ValueError as e:
    print(f"Invalid function test: Correctly caught error - {e}")

# Test with too many workers
import warnings
with warnings.catch_warnings(record=True) as w:
    warnings.simplefilter("always")
    result = stx_parallel.run(square_number, [(1,), (2,)], n_jobs=1000)
    if w:
        print(f"Too many workers test: Warning correctly issued - {w[0].message}")
    else:
        print("Too many workers test: No warning (unexpected)")

print("\nEdge case testing completed successfully!")
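
An alternative to wrapping the function is to inspect each future for an exception after submission. The sketch below uses only the standard library's `concurrent.futures`; `reciprocal` and `run_collecting_errors` are hypothetical names, and this is not necessarily how `scitex.parallel` handles errors internally.

```python
from concurrent.futures import ThreadPoolExecutor

def reciprocal(x):
    """Toy task that rejects negative input."""
    if x < 0:
        raise ValueError(f"Negative input not allowed: {x}")
    return float("inf") if x == 0 else 1.0 / x

def run_collecting_errors(func, inputs, max_workers=4):
    """Return (value, error_message) pairs in input order."""
    outcomes = []
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = [pool.submit(func, x) for x in inputs]
        for future in futures:
            error = future.exception()  # blocks until the task finishes
            if error is None:
                outcomes.append((future.result(), None))
            else:
                outcomes.append((None, str(error)))
    return outcomes

outcomes = run_collecting_errors(reciprocal, [-1, 0, 2])
print(outcomes)
```

This keeps the task function free of error-handling code, at the cost of managing the executor directly.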

## 7. Best Practices and Tips

In [None]:
print("Best Practices for SciTeX Parallel Processing")
print("=" * 45)

practices = [
    {
        "title": "1. Choose Optimal Worker Count",
        "description": "Use n_jobs=-1 for automatic CPU detection, or tune based on your workload",
        "example": "n_jobs=-1  # Auto-detect CPUs\nn_jobs=4   # Fixed worker count"
    },
    {
        "title": "2. Argument List Format",
        "description": "Always use tuple format for arguments, even for single arguments",
        "example": "args_list = [(arg1,), (arg2,)]  # Single argument\nargs_list = [(a1, b1), (a2, b2)]  # Multiple arguments"
    },
    {
        "title": "3. Handle Multiple Return Values",
        "description": "Functions returning tuples are automatically transposed",
        "example": "results = run(func_returning_tuple, args)\nval1_list, val2_list = results  # Automatic unpacking"
    },
    {
        "title": "4. Error Handling",
        "description": "Wrap functions in try/except for robust error handling",
        "example": "def safe_func(x):\n    try:\n        return func(x), None\n    except Exception as e:\n        return None, str(e)"
    },
    {
        "title": "5. Progress Monitoring",
        "description": "Use descriptive names for progress bars",
        "example": "run(func, args, desc='Processing data batch 1/5')"
    },
    {
        "title": "6. Memory Considerations",
        "description": "Be aware of memory usage with large datasets",
        "example": "# Process in chunks for large datasets\nfor chunk in chunks(large_dataset):\n    results.extend(run(func, chunk))"
    }
]

for practice in practices:
    print(f"\n{practice['title']}")
    print("-" * len(practice['title']))
    print(practice['description'])
    print(f"Example: {practice['example']}")

# Demonstrate chunking for large datasets
print("\n" + "="*50)
print("Memory-Efficient Processing Example")
print("="*50)

def chunk_processor(chunk_data):
    """Process a chunk of data."""
    chunk_id, data_size = chunk_data
    # Simulate processing
    processed_data = np.random.randn(data_size).sum()
    return chunk_id, processed_data, data_size

# Simulate large dataset processing
total_data_size = 1000000  # 1M data points
chunk_size = 50000         # 50K per chunk
n_chunks = total_data_size // chunk_size

# Create chunk specifications
chunk_specs = [(i, chunk_size) for i in range(n_chunks)]

print(f"Processing {total_data_size:,} data points in {n_chunks} chunks")
print(f"Chunk size: {chunk_size:,} points")

start_time = time.time()
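# chunk_processor takes a single tuple argument, so wrap each spec in a 1-tuple
chunk_results = stx_parallel.run(chunk_processor, [(spec,) for spec in chunk_specs], 
                                n_jobs=-1, desc="Processing chunks")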
chunk_time = time.time() - start_time

chunk_ids, processed_values, sizes = chunk_results

print(f"\nChunk processing completed in {chunk_time:.3f} seconds")
print(f"Processed {sum(sizes):,} total data points")
print(f"Processing rate: {sum(sizes)/chunk_time:,.0f} points/second")
print(f"Sample processed values: {[f'{v:.2f}' for v in processed_values[:5]]}")
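
The best-practice example above calls a `chunks(...)` helper that this notebook does not define. A minimal sketch of such a helper follows, assuming the dataset is a sequence that supports `len()` and slicing; the name and signature are illustrative.

```python
def chunks(items, chunk_size):
    """Yield successive slices of items with at most chunk_size elements."""
    if chunk_size < 1:
        raise ValueError("chunk_size must be at least 1")
    for start in range(0, len(items), chunk_size):
        yield items[start:start + chunk_size]

batches = list(chunks(list(range(10)), 4))
print(batches)
```

Because it is a generator, only one slice needs to be materialized at a time when the result is consumed in a loop.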

## 8. Summary and Comparison

### Performance Summary

In [None]:
print("SciTeX Parallel Module Performance Summary")
print("=" * 45)

# Create a comprehensive benchmark
def benchmark_task(task_info):
    """Benchmark task for comparison."""
    task_type, size = task_info
    
    if task_type == 'cpu':
        # CPU-intensive task
        result = sum(np.sin(i) for i in range(size))
    elif task_type == 'memory':
        # Memory-intensive task (seeded so sequential and parallel runs match)
        rng = np.random.default_rng(size)
        arr = rng.standard_normal(size)
        result = np.sum(arr ** 2)
    elif task_type == 'io':
        # I/O simulation
        time.sleep(0.01)  # Simulate I/O wait
        result = size * 2
    else:
        raise ValueError(f"Unknown task type: {task_type}")
    
    return result

# Test different task types
task_types = ['cpu', 'memory', 'io']
task_sizes = [1000, 5000, 10000]

benchmark_results = {}

for task_type in task_types:
    print(f"\nBenchmarking {task_type.upper()} tasks...")
    
    # Create task list
    tasks = [(task_type, size) for size in task_sizes]
    
    # Sequential execution
    start_time = time.time()
    seq_results = [benchmark_task(task) for task in tasks]
    seq_time = time.time() - start_time
    
    # Parallel execution
    start_time = time.time()
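    # benchmark_task takes a single tuple argument, so wrap each task in a 1-tuple
    par_results = stx_parallel.run(benchmark_task, [(task,) for task in tasks], n_jobs=-1, desc=f"{task_type} tasks")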
    par_time = time.time() - start_time
    
    speedup = seq_time / par_time
    benchmark_results[task_type] = {
        'sequential_time': seq_time,
        'parallel_time': par_time,
        'speedup': speedup,
        'results_match': seq_results == par_results
    }
    
    print(f"  Sequential: {seq_time:.3f}s")
    print(f"  Parallel:   {par_time:.3f}s")
    print(f"  Speedup:    {speedup:.2f}x")
    print(f"  Results match: {seq_results == par_results}")

# Create summary table
print("\n" + "="*60)
print("Benchmark Summary")
print("="*60)

print(f"{'Task Type':<12} {'Sequential':<12} {'Parallel':<12} {'Speedup':<10} {'Correct':<8}")
print("-" * 60)

for task_type, results in benchmark_results.items():
    # str() keeps the bool readable; a width spec on a bool would print 1 or 0
    print(f"{task_type.upper():<12} {results['sequential_time']:<12.3f} {results['parallel_time']:<12.3f} {results['speedup']:<10.2f} {str(results['results_match']):<8}")

# Visualize benchmark results
task_names = list(benchmark_results.keys())
speedups = [benchmark_results[task]['speedup'] for task in task_names]
seq_times = [benchmark_results[task]['sequential_time'] for task in task_names]
par_times = [benchmark_results[task]['parallel_time'] for task in task_names]

fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(15, 6))

# Speedup comparison
bars = ax1.bar(task_names, speedups, color=['skyblue', 'lightgreen', 'salmon'])
ax1.set_ylabel('Speedup Factor')
ax1.set_title('Parallel Speedup by Task Type')
ax1.axhline(y=1, color='k', linestyle='--', alpha=0.5, label='No speedup')
ax1.legend()

# Add value labels on bars
for bar, speedup in zip(bars, speedups):
    height = bar.get_height()
    ax1.text(bar.get_x() + bar.get_width()/2., height + 0.05,
             f'{speedup:.1f}x', ha='center', va='bottom')

# Execution time comparison
x = np.arange(len(task_names))
width = 0.35

ax2.bar(x - width/2, seq_times, width, label='Sequential', color='lightcoral')
ax2.bar(x + width/2, par_times, width, label='Parallel', color='lightblue')

ax2.set_ylabel('Execution Time (seconds)')
ax2.set_title('Execution Time Comparison')
ax2.set_xticks(x)
ax2.set_xticklabels([t.upper() for t in task_names])
ax2.legend()

plt.tight_layout()
plt.show()

print(f"\nOverall Performance:")
avg_speedup = np.mean(speedups)
print(f"  Average speedup: {avg_speedup:.2f}x")
print(f"  Best for: {task_names[np.argmax(speedups)].upper()} tasks ({max(speedups):.2f}x)")
print(f"  System specs: {multiprocessing.cpu_count()} CPU cores")

## 9. Summary

The `scitex.parallel` module provides efficient parallel processing capabilities with the following key features:

### Core Function: `run(func, args_list, n_jobs=-1, desc="Processing")`

**Parameters:**
- `func`: Function to execute in parallel
- `args_list`: List of tuples, each containing arguments for one function call
- `n_jobs`: Number of workers (-1 for auto-detection)
- `desc`: Description for progress bar
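
To make the parameter semantics concrete, here is a minimal sketch of a thread-based function with the same calling convention. It is a hypothetical stand-in built on the standard library, not the SciTeX source: `run_sketch` is an invented name, the progress bar (`desc`) is omitted, and the `ValueError` checks are assumptions modelled on the edge cases tested in Section 6.

```python
import os
from concurrent.futures import ThreadPoolExecutor

def run_sketch(func, args_list, n_jobs=-1):
    """Hypothetical stand-in for a thread-based run(); not the SciTeX source."""
    if not callable(func):
        raise ValueError("func must be callable")
    if not args_list:
        raise ValueError("args_list must not be empty")
    if n_jobs == -1:
        n_jobs = os.cpu_count() or 1  # assumed meaning of "auto-detection"
    with ThreadPoolExecutor(max_workers=n_jobs) as pool:
        # One task per argument tuple, unpacked into positional arguments
        futures = [pool.submit(func, *args) for args in args_list]
        # Collect in submission order so results line up with args_list
        return [future.result() for future in futures]

def add(a, b):
    return a + b

print(run_sketch(add, [(1, 2), (3, 4), (5, 6)], n_jobs=2))
```

The `*args` unpacking is why a function that takes one tuple as its only parameter needs each tuple wrapped again, as in `[(config,) for config in signal_configs]`.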

**Key Features:**
1. **Automatic CPU Detection**: Uses all available cores when `n_jobs=-1`
2. **Progress Tracking**: Built-in tqdm progress bars
3. **Multiple Return Values**: Automatically handles functions returning tuples
4. **Error Handling**: Graceful handling of exceptions and edge cases
5. **ThreadPoolExecutor**: Uses efficient thread-based parallelism
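
The handling of multiple return values amounts to transposing a list of per-call tuples into one sequence per return position. A sketch of that transformation, assuming every call returns a tuple of the same length (`transpose_results` is a hypothetical helper, and the exact container types returned by `scitex.parallel` may differ):

```python
def transpose_results(results):
    """Turn a list of equal-length tuples into one list per tuple position."""
    return tuple(list(column) for column in zip(*results))

per_call = [(1, 1, "a"), (4, 8, "b"), (9, 27, "c")]
squares, cubes, labels = transpose_results(per_call)
print(squares, cubes, labels)
```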

### Best Use Cases:
- **I/O-bound tasks**: File processing, network requests
- **CPU-bound tasks**: Numerical code that releases the GIL (for example, large NumPy operations)
- **Batch processing**: Large datasets, simulation runs
- **Scientific computing**: Monte Carlo simulations, signal analysis

### Performance Characteristics:
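- I/O-bound tasks typically show the highest speedups and can benefit from more workers than CPU cores
- CPU-bound tasks written in pure Python are limited by the Global Interpreter Lock (GIL) and may show little or no speedup with threads
- NumPy-heavy tasks can scale with worker count where the underlying operations release the GIL
- The best worker count depends on the workload; benchmark it as in Section 5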
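
Because the module is described as thread-based, its behavior on waiting-dominated work can be illustrated with the standard library alone. The sketch below uses `time.sleep` as a stand-in for blocking I/O and compares a plain loop with a thread pool; `fake_io` is a hypothetical task, and absolute timings will vary by machine.

```python
import time
from concurrent.futures import ThreadPoolExecutor

def fake_io(delay):
    time.sleep(delay)  # sleeping releases the GIL, like a blocking read
    return delay

delays = [0.05] * 8

start = time.perf_counter()
for d in delays:
    fake_io(d)
sequential_s = time.perf_counter() - start

start = time.perf_counter()
with ThreadPoolExecutor(max_workers=8) as pool:
    list(pool.map(fake_io, delays))
threaded_s = time.perf_counter() - start

print(f"Sequential: {sequential_s:.2f}s, threaded: {threaded_s:.2f}s")
```

Replacing `fake_io` with a pure-Python arithmetic loop would be expected to shrink this gap considerably on standard CPython, since only one thread can execute Python bytecode at a time.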

### Integration Benefits:
- Seamless integration with other SciTeX modules
- Consistent API design with other SciTeX tools
- Built-in progress feedback for scientific workflows
- Robust error handling for production use

The module is designed to make parallel processing accessible and efficient for scientific computing workflows, with minimal code changes required to parallelize existing sequential operations.