# Distributed JuMP Optimization Demo

This notebook demonstrates how to use Distributed.jl to solve several independent JuMP optimization runs in parallel across multiple Julia processes. Each run solves the same simple linear program (LP), with a linear objective, variable bounds, and linear constraints, from a different starting point. The notebook collects the termination status, optimal objective value, and solution from each run on the main process and displays them.

Key features:
- Each optimization uses a different starting point
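- Uses JuMP for modeling and Ipopt as the solver (Ipopt is an interior-point solver for nonlinear programs, which also handles this LP)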
- Can be run on a single machine or across multiple nodes in an HPC environment
- Uses the Project.toml environment provided

## Setup Distributed Environment

First, we'll set up our distributed computing environment by adding worker processes and loading the necessary packages on all processes.

In [None]:
using Pkg;
println("Active project: ", Pkg.project().path)

In [None]:
using Distributed

println("Distributed JuMP Optimization Demo")
println("="^40)

# Add worker processes if not already started
if nprocs() == 1
    # Use up to 4 workers, but always at least 1 (even on a single-core machine)
    n_workers = max(1, min(Sys.CPU_THREADS - 1, 4))
    addprocs(n_workers)
end

println("Main process: $(myid())")
println("Worker processes: $(workers())")
println("Total processes: $(nprocs())")

## Load Required Packages on All Processes

It's critical to load the required packages on all worker processes before using them. The `@everywhere` macro ensures that the code is executed on all processes.

In [None]:
# Important: load the packages on all processes (main and workers)
# This must be done AFTER the workers are added but BEFORE they are used
@everywhere begin
    using JuMP
    using Ipopt
end
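The following is a minimal sketch of why `@everywhere` matters, using only the Distributed standard library. The function names `local_only` and `shared` are hypothetical and exist only for this illustration. A function defined without `@everywhere` lives on the main process only, so calling it on a worker is expected to raise an error there.

```julia
using Distributed

# Make sure at least one worker exists for the demonstration
nprocs() == 1 && addprocs(1)

# Hypothetical function defined on the main process only
local_only(x) = x + 1

# Hypothetical function defined on every process
@everywhere shared(x) = x + 1

w = first(workers())

# The worker knows `shared`, so this call succeeds
ok = remotecall_fetch(shared, w, 1)

# The worker has no definition of `local_only`, so this call is expected to throw
failed = try
    remotecall_fetch(local_only, w, 1)
    false
catch err
    true
end

println("shared(1) on worker $w returned $ok; local_only failed on worker: $failed")
```

The same reasoning applies to `using JuMP` and `using Ipopt`: a package loaded only on the main process is not available to code that runs on a worker.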

## Define the Optimization Function

Next, we'll define the function that builds and solves our JuMP model. This function needs to be defined on all worker processes using the `@everywhere` macro.

### Mathematical Formulation of the Optimization Problem

The JuMP model we'll solve is a simple linear programming problem that can be expressed mathematically as:

$$
\begin{align}
\min_{x,y} \quad & x + 2y \\
\text{subject to} \quad & x + y \geq 5 \\
& x - y \leq 3 \\
& 0 \leq x \leq 10 \\
& 0 \leq y \leq 10
\end{align}
$$

This is a classic linear programming problem with:
- A linear objective function that we want to minimize
- Two linear inequality constraints 
- Box constraints (bounds) on both variables

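The problem has a unique optimal solution regardless of the starting point. We'll solve it with different initial guesses on different worker processes, using the same starting value for both $x$ and $y$ in each run, and check that every run reaches the same optimal solution. Because Ipopt is an interior-point method, the reported values agree only up to the solver tolerance rather than exactly.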
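As a sanity check, the optimum can be worked out by hand. This derivation is not part of the solver workflow; it only uses the two inequality constraints above. Writing the objective as a weighted combination of the constraint expressions gives a lower bound:

$$
\begin{align}
x + 2y &= \tfrac{3}{2}(x + y) - \tfrac{1}{2}(x - y) \\
&\geq \tfrac{3}{2} \cdot 5 - \tfrac{1}{2} \cdot 3 = 6
\end{align}
$$

Equality holds only when both constraints are active, that is $x + y = 5$ and $x - y = 3$, which gives $x = 4$ and $y = 1$. This point satisfies the bounds, so the minimum objective value is $6$, attained only at $(x, y) = (4, 1)$.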

In [None]:
# Build and solve the LP, using start_x as the initial guess for both x and y
@everywhere function solve_lp(start_x::Float64)
    model = Model(Ipopt.Optimizer)
    set_silent(model)  # suppress Ipopt's iteration log
    # Bounded variables; the same initial guess is used for x and y
    @variable(model, 0 <= x <= 10, start = start_x)
    @variable(model, 0 <= y <= 10, start = start_x)
    @objective(model, Min, x + 2y)
    @constraint(model, x + y >= 5)
    @constraint(model, x - y <= 3)
    optimize!(model)
    # Return the termination status, objective value, and solution
    status = termination_status(model)
    obj = objective_value(model)
    return (status, obj, value(x), value(y))
end
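The function above reads the objective and variable values without checking whether the solve succeeded. The following sketch shows one way to guard against that; `solve_lp_checked` is a hypothetical name, and returning `NaN` on failure is an assumption made for this illustration. Ipopt typically reports `LOCALLY_SOLVED` rather than `OPTIMAL` on success, so both are accepted here.

```julia
using JuMP
using Ipopt

# Hypothetical variant of solve_lp that only reads the solution after a successful solve
function solve_lp_checked(start_x::Real)
    model = Model(Ipopt.Optimizer)
    set_silent(model)
    @variable(model, 0 <= x <= 10, start = start_x)
    @variable(model, 0 <= y <= 10, start = start_x)
    @objective(model, Min, x + 2y)
    @constraint(model, x + y >= 5)
    @constraint(model, x - y <= 3)
    optimize!(model)

    status = termination_status(model)
    solved = status in (OPTIMAL, LOCALLY_SOLVED) &&
             primal_status(model) == FEASIBLE_POINT
    if solved
        return (status, objective_value(model), value(x), value(y))
    end
    # Assumption for this sketch: signal failure with NaN instead of throwing
    return (status, NaN, NaN, NaN)
end

status, obj, xval, yval = solve_lp_checked(2.0)
println("status=$status, obj=$obj, x=$xval, y=$yval")
```

To use this variant with `pmap`, it would need to be defined with `@everywhere`, like `solve_lp`.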

## Set Up Different Starting Points

We'll create a set of starting points, half evenly spaced across the variable bounds and half random. Each selected starting point initializes a separate optimization run, and the runs are distributed across the worker processes.

In [None]:
# Generate two starting points per process: one evenly spaced, one random
starting_points = vcat(
    # Regular spacing across the domain
    range(0.0, 10.0, length=nprocs()),
    # Random points for additional diversity
    rand(nprocs()) * 10.0
)

# Cap the number of instances at the number of processes.
# With this cap only the evenly spaced points are solved; raise it to include the random ones.
n_instances = min(length(starting_points), nprocs())

println("Generated $(length(starting_points)) starting points")
println("Will solve $n_instances optimization problems in parallel on $(nworkers()) worker processes")
println("Starting points: $starting_points")

## Run Distributed Optimization

Now we'll solve the optimization problems in parallel using the `pmap` function, which hands each instance to the next available worker process and returns the results in input order.

In [None]:
# pmap schedules each instance on the next available worker process
println("Starting distributed optimization...")
results = pmap((sp, idx) -> begin
        println("[Process $(myid())] Solving instance $idx with start_x=$sp")
        solve_lp(sp)
    end, starting_points[1:n_instances], 1:n_instances)
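To see how `pmap` spreads work, the sketch below replaces the optimization with a trivial task that returns the id of the process that handled it. It uses only the Distributed standard library; the task count of 6 and the variable names are arbitrary choices for this illustration.

```julia
using Distributed

# Make sure some workers exist for the demonstration
nprocs() == 1 && addprocs(2)

# Each task returns its input together with the id of the process that ran it
assignments = pmap(x -> (x, myid()), 1:6)

# Count how many tasks each worker handled
counts = Dict{Int,Int}()
for (_, id) in assignments
    counts[id] = get(counts, id, 0) + 1
end

println("Task-to-process assignments: $assignments")
println("Tasks per worker: $counts")
```

The results come back in input order, but which worker handles which task depends on availability, so a worker may process more than one instance when there are more instances than workers.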

## Display Results

After all optimizations have completed, we'll display the results that have been collected on the main process.

In [None]:
println("\nResults (collected on main process):")
for (i, (status, obj, xval, yval)) in enumerate(results)
    println("Instance $i: status=$status, optimal obj=$obj, x=$xval, y=$yval")
end
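The printed results can also be checked programmatically. The sketch below assumes result tuples of the form `(status, obj, x, y)` as returned by `solve_lp`, and compares them with the optimum worked out by hand: solving $x + y = 5$ and $x - y = 3$ gives $(x, y) = (4, 1)$ with objective value $6$. The name `all_converged`, the tolerance, and the sample tuples are hypothetical placeholders, not output from an actual run.

```julia
# Return true if every (status, obj, x, y) tuple lies within `atol` of the expected optimum.
# `expected` is (objective, x, y); the default is the hand-derived optimum of this LP.
function all_converged(results; expected=(6.0, 4.0, 1.0), atol=1e-4)
    obj_opt, x_opt, y_opt = expected
    return all(results) do (status, obj, x, y)
        isapprox(obj, obj_opt; atol=atol) &&
            isapprox(x, x_opt; atol=atol) &&
            isapprox(y, y_opt; atol=atol)
    end
end

# Placeholder data for illustration only (not real solver output)
sample_good = [(:LOCALLY_SOLVED, 6.0, 4.0, 1.0), (:LOCALLY_SOLVED, 6.00001, 3.99999, 1.00001)]
sample_bad = [(:LOCALLY_SOLVED, 10.0, 0.0, 5.0)]

println("good sample converged: ", all_converged(sample_good))
println("bad sample converged: ", all_converged(sample_bad))
```

In the notebook, calling `all_converged(results)` after the `pmap` cell would apply the same check to the actual runs.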

println("\nDemo complete.")

## Visualize the Results

Let's create a visualization of our optimization results. We'll plot the optimal solutions to visualize how they converge to the same point regardless of starting position.

In [None]:
using Plots

# Extract results
x_vals = [r[3] for r in results]
y_vals = [r[4] for r in results]
obj_vals = [r[2] for r in results]

# Create a scatter plot of solutions
p = scatter(x_vals, y_vals, 
    label="Optimal Solutions", 
    title="Optimization Results from Different Starting Points",
    xlabel="x", 
    ylabel="y",
    markersize=8,
    legend=:topright)

# Add starting points
scatter!(starting_points[1:n_instances], starting_points[1:n_instances], 
    label="Starting Points", 
    markersize=6, 
    markershape=:cross,
    markercolor=:red)

# Draw the constraints
x_range = 0:0.1:10
plot!(x_range, 5 .- x_range, label="x + y = 5", linestyle=:dash)
plot!(x_range, x_range .- 3, label="x - y = 3", linestyle=:dash)

# Show the plot
display(p)

## Running on HPC Clusters

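To run this demo on an HPC cluster, launch Julia with a machine file that lists the nodes to use. Julia starts the workers on those hosts over SSH, so passwordless SSH access and a Julia installation on every node are required. When the notebook runs under Jupyter, the kernel is started by Jupyter rather than from your shell, so the flag has to be part of the kernel's launch command, or the workers have to be added from inside the notebook with `addprocs`.

1. Create a file `machines.txt` with one hostname or IP address per line
2. Start Julia with `julia --machine-file machines.txt`
3. Run the code from this notebook in that session. Because workers already exist, `nprocs()` is greater than 1 and the setup cell skips its local `addprocs` call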
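As an alternative to the command-line flag, workers can be added from inside a running session. The sketch below reads a machine file and passes the hosts to `addprocs`. The helper names `parse_machine_line` and `read_machines` are hypothetical, the file name `machines.txt` follows the steps above, and skipping blank lines and `#` comment lines is a convenience of this helper only. Passing `--project` through `exeflags` assumes that the project directory is visible at the same path on every node, for example on a shared filesystem.

```julia
using Distributed

# Parse one machine-file line of the form "[count*]host" into a (host, count) tuple
function parse_machine_line(line::AbstractString)
    parts = split(strip(line), '*'; limit=2)
    if length(parts) == 2
        return (String(strip(parts[2])), parse(Int, strip(parts[1])))
    end
    return (String(parts[1]), 1)
end

# Read all machine specifications from a file, skipping blank and comment lines
function read_machines(path::AbstractString)
    specs = Tuple{String,Int}[]
    for line in eachline(path)
        s = strip(line)
        (isempty(s) || startswith(s, "#")) && continue
        push!(specs, parse_machine_line(s))
    end
    return specs
end

# Only try to add remote workers if the assumed machine file exists
machine_file = "machines.txt"
if isfile(machine_file)
    # Assumption: the active project path is valid on every remote node
    addprocs(read_machines(machine_file);
             exeflags="--project=$(Base.active_project())")
end
```

After the workers are added this way, the `@everywhere` cells still need to run so that the packages and `solve_lp` are available on the remote processes.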

## Clean Up

Finally, clean up by removing worker processes when you're done. This is optional in notebooks but recommended in scripts.

In [None]:
# Optional: Remove worker processes when done
# Note: In a notebook environment, you may want to keep workers for further computations
rmprocs(workers())
println("✓ Worker processes removed")