# HPXPy Heat Diffusion Simulation

This notebook solves the 1D heat equation using explicit finite differences, demonstrating parallel scalability.

## The Heat Equation

$$\frac{\partial u}{\partial t} = \alpha \frac{\partial^2 u}{\partial x^2}$$

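This is a classic HPC benchmark because:
1. Each grid point's update depends only on its nearest neighbors, so the element-wise arithmetic parallelizes trivially
2. The grid splits naturally into contiguous subdomains for distributed computing
3. It can be used to measure both strong scaling (fixed grid, more workers) and weak scaling (grid grows with the worker count)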

**Note:** For proper scalability testing with multiple thread counts, run the `heat_diffusion_demo.py` script instead.

In [None]:
import time
import numpy as np
import hpxpy as hpx

hpx.init(num_threads=4)

## Physical Setup

We simulate a rod with:
- Initial condition: Gaussian hot spot in the middle
- Boundary conditions: Fixed at 0 (Dirichlet)
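
For reference, the explicit update applied later in this notebook follows from replacing the time derivative with a forward difference and the space derivative with a central difference. The notation $u_i^n$ (temperature at grid point $i$ and time step $n$), $\Delta x$, and $\Delta t$ is introduced here for illustration and corresponds to `dx` and `dt` in the code:

$$\frac{u_i^{n+1} - u_i^n}{\Delta t} = \alpha \, \frac{u_{i-1}^n - 2u_i^n + u_{i+1}^n}{\Delta x^2}
\quad\Longrightarrow\quad
u_i^{n+1} = u_i^n + r \left(u_{i-1}^n - 2u_i^n + u_{i+1}^n\right), \qquad r = \frac{\alpha \, \Delta t}{\Delta x^2}$$

This scheme is stable when $r \le 1/2$, which is why the time step is tied to $\Delta x^2$.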

In [None]:
# Physical parameters
L = 1.0           # Domain length
alpha = 0.01      # Thermal diffusivity
grid_size = 100_000
n_steps = 100

dx = L / (grid_size - 1)    # Matches the spacing of np.linspace(0, L, grid_size)
dt = 0.4 * dx * dx / alpha  # Explicit scheme is stable for alpha*dt/dx^2 <= 0.5

print(f"Grid size: {grid_size:,} points")
print(f"Time steps: {n_steps}")
print(f"dx = {dx:.3e}, dt = {dt:.3e}")  # Scientific notation: dt is far below 1e-6
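
The factor 0.4 in `dt` keeps the stencil coefficient `alpha * dt / dx**2` below 0.5, the stability limit of this explicit scheme. The following is a minimal NumPy-only sketch of what happens on either side of that limit. The helper names `explicit_step` and `peak_after`, the 101-point grid, and the 300 steps are assumptions chosen for illustration and are not part of HPXPy.

```python
import numpy as np

def explicit_step(u, r):
    """One explicit update with zero ghost values outside the rod."""
    p = np.pad(u, 1)  # pad one zero on each side
    return u + r * (p[:-2] - 2 * u + p[2:])

def peak_after(r, n_points=101, n_steps=300):
    """Largest |u| after n_steps, starting from the Gaussian pulse."""
    x = np.linspace(0.0, 1.0, n_points)
    u = np.exp(-100 * (x - 0.5) ** 2)
    for _ in range(n_steps):
        u = explicit_step(u, r)
    return float(np.max(np.abs(u)))

peak_stable = peak_after(0.4)    # below the limit: the pulse only spreads out
peak_unstable = peak_after(0.6)  # above the limit: high-frequency noise is amplified

print(f"r = 0.4 -> peak {peak_stable:.3e}")
print(f"r = 0.6 -> peak {peak_unstable:.3e}")
```

With `r = 0.4` the peak stays at or below its initial value of 1, while with `r = 0.6` round-off errors grow every step and the solution blows up.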

## Initial Condition: Gaussian Pulse

In [None]:
# Initial condition: hot spot in the middle
x = np.linspace(0, L, grid_size)
u_initial = np.exp(-100 * (x - 0.5)**2)  # Gaussian pulse

print(f"Initial total heat: {np.sum(u_initial):.4f}")
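
The printed sum is a proxy for the total heat; multiplying it by the grid spacing approximates the integral of `u` over the rod. As a sanity check, this NumPy-only sketch compares that approximation with the closed-form integral of the Gaussian. The smaller grid of 10,001 points is an assumption made only to keep the check fast.

```python
import math
import numpy as np

L = 1.0
n_points = 10_001  # assumption: smaller than the notebook's grid, for speed
x = np.linspace(0.0, L, n_points)
dx = x[1] - x[0]

u0 = np.exp(-100 * (x - 0.5) ** 2)

# Riemann-sum approximation of the integral of u over the rod
heat_numeric = float(np.sum(u0) * dx)

# The integral of exp(-100 (x - 0.5)^2) over the whole real line is sqrt(pi / 100);
# the tails outside [0, 1] contribute only about 1e-12 of the total
heat_exact = math.sqrt(math.pi / 100)

print(f"numeric = {heat_numeric:.6f}, analytic = {heat_exact:.6f}")
```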

## Simulation with HPXPy

In [None]:
# Convert to HPXPy
u_hpx = hpx.from_numpy(u_initial.copy())

# Coefficient for stencil
r = alpha * dt / (dx * dx)

# Time stepping
start = time.perf_counter()

for step in range(n_steps):
    # Get numpy array for boundary handling
    u_np = u_hpx.to_numpy()
    
    # Create shifted arrays for stencil (u[i-1], u[i], u[i+1])
    u_left = np.roll(u_np, 1)
    u_right = np.roll(u_np, -1)
    
    # np.roll wraps around, so zero the wrapped entries: the ghost values
    # just outside the rod are held at 0 (Dirichlet)
    u_left[0] = 0
    u_right[-1] = 0
    
    # Convert to HPXPy for parallel computation
    u_l = hpx.from_numpy(u_left)
    u_r = hpx.from_numpy(u_right)
    
    # Apply stencil: u_new = u + r * (u_left - 2*u + u_right)
    u_hpx = u_hpx + r * (u_l - 2 * u_hpx + u_r)

elapsed = time.perf_counter() - start

# Get final state
u_final = u_hpx.to_numpy()
total_heat = float(hpx.sum(u_hpx))

print("Simulation complete!")
print(f"Time: {elapsed:.4f} seconds")
print(f"Final total heat: {total_heat:.4f}")
print(f"Heat conservation: {total_heat / np.sum(u_initial) * 100:.2f}%")
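
The loop above builds the shifted arrays with `np.roll` and then overwrites the wrapped-around entries. An equivalent formulation pads the array with zeros and takes slices, so there is no wrap-around to undo. The sketch below uses NumPy only and checks that the two formulations agree. The function names `step_roll` and `step_padded` and the 1,001-point grid are assumptions for illustration, and nothing here calls HPXPy.

```python
import numpy as np

def step_roll(u, r):
    """One update written the way the loop above does it, in NumPy only."""
    u_left = np.roll(u, 1)
    u_right = np.roll(u, -1)
    u_left[0] = 0.0    # ghost value left of the rod
    u_right[-1] = 0.0  # ghost value right of the rod
    return u + r * (u_left - 2 * u + u_right)

def step_padded(u, r):
    """Same update using a zero-padded copy and slices."""
    p = np.pad(u, 1)  # the zeros on both sides act as the ghost cells
    return u + r * (p[:-2] - 2 * u + p[2:])

x = np.linspace(0.0, 1.0, 1_001)
u = np.exp(-100 * (x - 0.5) ** 2)
r = 0.4

a, b = u.copy(), u.copy()
for _ in range(50):
    a = step_roll(a, r)
    b = step_padded(b, r)

max_diff = float(np.max(np.abs(a - b)))
print(f"max difference between the two formulations: {max_diff:.2e}")
```

A plain NumPy version like this also serves as a reference result when validating the HPXPy run.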

## Visualization

In [None]:
try:
    import matplotlib.pyplot as plt
    
    # Subsample for plotting (named `stride` to avoid reusing the loop variable `step`)
    stride = max(1, grid_size // 1000)
    x_plot = x[::stride]
    u_initial_plot = u_initial[::stride]
    u_final_plot = u_final[::stride]
    
    plt.figure(figsize=(10, 4))
    plt.plot(x_plot, u_initial_plot, 'b-', label='Initial')
    plt.plot(x_plot, u_final_plot, 'r-', label=f'After {n_steps} steps')
    plt.xlabel('x')
    plt.ylabel('Temperature')
    plt.title('1D Heat Diffusion')
    plt.legend()
    plt.grid(True, alpha=0.3)
    plt.show()
except ImportError:
    print("matplotlib not available for visualization")

## Distributed Computing Projection

In a distributed HPX deployment, this simulation would work as follows.

### 1. Domain Decomposition
- Grid of M points partitioned across N localities (nodes)
- Each locality owns a contiguous chunk
- Block distribution: locality i owns elements `[i*M/N, (i+1)*M/N)`
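
The block distribution can be sketched in plain Python. The helper `block_range` and its parameter names are hypothetical and are not an HPX or HPXPy API. Integer division is an assumption that keeps the ranges contiguous when the grid size is not a multiple of the locality count.

```python
def block_range(i, n_points, n_localities):
    """Half-open index range [start, stop) owned by locality i."""
    start = i * n_points // n_localities
    stop = (i + 1) * n_points // n_localities
    return start, stop

n_points, n_localities = 1_000_000, 8
ranges = [block_range(i, n_points, n_localities) for i in range(n_localities)]

for i, (start, stop) in enumerate(ranges):
    print(f"locality {i}: [{start}, {stop})  ({stop - start} points)")
```

Each range starts where the previous one stops, so every grid point has exactly one owner.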

### 2. Communication Pattern
- Each step requires boundary exchange (ghost cells)
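- Each pair of neighboring localities exchanges only 2 values per step (one in each direction)
- Communication: O(N) values in total per step, where N = number of localities, and O(1) per locality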
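
The ghost-cell exchange can be imitated in a single process with NumPy by splitting the grid into chunks and handing each chunk one boundary value from each neighbor. This is only a sketch of the communication pattern, not HPX code. The names `global_step` and `decomposed_step` and the 1,000-point grid are assumptions for illustration.

```python
import numpy as np

def global_step(u, r):
    """Reference update on the whole grid with zero ghost values at both ends."""
    p = np.pad(u, 1)
    return u + r * (p[:-2] - 2 * u + p[2:])

def decomposed_step(chunks, r):
    """Update each chunk using one ghost value from each neighbor.

    Returns the new chunks and the number of values that crossed a chunk boundary.
    """
    exchanged = 0
    new_chunks = []
    for i, c in enumerate(chunks):
        left = 0.0   # physical boundary unless a left neighbor exists
        right = 0.0  # physical boundary unless a right neighbor exists
        if i > 0:
            left = chunks[i - 1][-1]  # received from the left neighbor
            exchanged += 1
        if i < len(chunks) - 1:
            right = chunks[i + 1][0]  # received from the right neighbor
            exchanged += 1
        p = np.concatenate(([left], c, [right]))
        new_chunks.append(c + r * (p[:-2] - 2 * c + p[2:]))
    return new_chunks, exchanged

x = np.linspace(0.0, 1.0, 1_000)
u = np.exp(-100 * (x - 0.5) ** 2)
r = 0.4
n_localities = 4

chunks = np.array_split(u, n_localities)
new_chunks, exchanged = decomposed_step(chunks, r)
matches = bool(np.allclose(np.concatenate(new_chunks), global_step(u, r)))

print(f"values exchanged: {exchanged}, matches global step: {matches}")
```

For 4 chunks the sketch counts 6 exchanged values, that is, 2 per interior boundary, and the decomposed result agrees with the single-grid update.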

### 3. Expected Scaling

| Localities | Grid Size | Values exchanged per step | Ideal speedup |
|------------|-----------|---------------------------|---------------|
| 1 | 1,000,000 | 0 | 1x |
| 2 | 1,000,000 | 2 | ~2x |
| 4 | 1,000,000 | 6 | ~4x |
| 8 | 1,000,000 | 14 | ~8x |
| 16 | 1,000,000 | 30 | ~16x |

### 4. HPX Advantages
- Asynchronous communication overlaps with computation
- AGAS (Active Global Address Space) enables transparent data access across localities
- Futurization allows pipelining of time steps

In [None]:
hpx.finalize()
print("Demo complete!")