# SSSP/APSP HPC: A Comparative Performance Analysis (Colab)

This notebook provides a complete workflow to set up the environment, build the project, run benchmarks, and analyze the performance of multiple shortest-path algorithms (Dijkstra, Bellman-Ford, Floyd-Warshall, Johnson's) and their HPC variants (Serial, OpenMP, CUDA, Hybrid).

## 1. Environment Setup

First, let's set up the environment. This involves checking for a GPU, cloning the project repository from GitHub, and installing Python dependencies.

### 1.1 Check GPU Availability

Ensure that a GPU is available for the CUDA/hybrid builds. Go to **Runtime -> Change runtime type** and select **GPU** as the hardware accelerator. The following cell should show your assigned GPU.

In [None]:
!nvidia-smi
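
If you prefer a programmatic check over reading the `nvidia-smi` table, a minimal sketch along these lines may help. The helper name `gpu_available` is hypothetical and not part of the project; it only assumes that `nvidia-smi -L` prints one line per detected GPU when a driver is present.

```python
import shutil
import subprocess

def gpu_available():
    """Return True if nvidia-smi is on PATH and reports at least one GPU."""
    if shutil.which("nvidia-smi") is None:
        return False
    try:
        result = subprocess.run(
            ["nvidia-smi", "-L"], capture_output=True, text=True, timeout=30
        )
    except (OSError, subprocess.TimeoutExpired):
        return False
    # Each detected device is listed on a line starting with "GPU <index>:"
    return result.returncode == 0 and any(
        line.startswith("GPU ") for line in result.stdout.splitlines()
    )

print("GPU detected" if gpu_available() else "No GPU detected; CUDA targets will not run")
```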

### 1.2 Clone the Repository

In [None]:
!git clone https://www.github.com/UchihaIthachi/bellman-ford-hpc-openmp-cuda.git
%cd bellman-ford-hpc-openmp-cuda

### 1.3 Install Dependencies

In [None]:
%pip install pandas matplotlib seaborn

## 2. Build the Executables

Next, we compile all the C/C++ and CUDA source code. The `Makefile` builds every target and places the binaries in the `bin/` directory. If `nvcc` is not found, the CUDA-based targets are skipped.

In [None]:
!make clean && make all
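
Because CUDA targets are skipped when `nvcc` is missing, it can be useful to confirm which binaries were actually produced before benchmarking. The sketch below is illustrative only: `find_missing` is a hypothetical helper, and the binary names are copied from the `executables` dictionary used in Section 3.

```python
import os
import shutil

def find_missing(bin_dir, names):
    """Return the entries of `names` that have no file under `bin_dir`."""
    return [n for n in names if not os.path.isfile(os.path.join(bin_dir, n))]

# Binary names taken from the benchmark cell in Section 3
expected = [
    'dijkstra_serial', 'dijkstra_openmp', 'dijkstra_cuda', 'dijkstra_hybrid',
    'BF_serial', 'BF_openmp', 'BF_cuda', 'BF_hybrid',
    'floyd_serial', 'floyd_openmp', 'floyd_cuda',
    'johnson_serial', 'johnson_openmp', 'johnson_cuda', 'johnson_hybrid',
]

print("nvcc on PATH:", shutil.which("nvcc") is not None)
missing = find_missing("bin", expected)
print("Missing binaries:", missing if missing else "none")
```

Any name reported as missing will simply be recorded as `None` by the benchmark loop, so a missing CUDA binary does not stop the run.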

## 3. Run the Benchmarks

Now, we'll run a series of benchmarks directly from the notebook. The code below will execute each compiled binary across a range of graph sizes and collect the timing results into a pandas DataFrame.

You can customize the vertex counts and other parameters in the `benchmark_params` dictionary.

In [None]:
import subprocess
import re
import pandas as pd
import os

def run_command(command):
    """Run a shell command and return its stdout, or None if it exits non-zero."""
    try:
        print(f"  Executing: {command}")
        return subprocess.run(command, shell=True, capture_output=True, text=True, check=True).stdout
    except subprocess.CalledProcessError as e:
        print(f"    Error running command. Stderr: {e.stderr.strip()}")
        return None

def parse_time(output):
    """Extract the elapsed seconds from a line containing 'time: <number> s'."""
    match = re.search(r"time: ([\d.]+) s", output)
    return float(match.group(1)) if match else None

benchmark_params = {
    'sssp_vertices': [500, 1000, 2000, 5000],
    'apsp_vertices': [50, 100, 200, 400],
    'min_w': -10,
    'max_w': 50,
    'density': 0.1,
    'threads': 4,
    'split_ratio': 0.5
}

executables = {
    'sssp': ['dijkstra_serial', 'dijkstra_openmp', 'dijkstra_cuda', 'dijkstra_hybrid', 'BF_serial', 'BF_openmp', 'BF_cuda', 'BF_hybrid'],
    'apsp': ['floyd_serial', 'floyd_openmp', 'floyd_cuda', 'johnson_serial', 'johnson_openmp', 'johnson_cuda', 'johnson_hybrid']
}

all_results = []
for group, vertices_list in [('sssp', benchmark_params['sssp_vertices']), ('apsp', benchmark_params['apsp_vertices'])]:
    for v in vertices_list:
        print(f"\nRunning {group.upper()} benchmarks for {v} vertices...")
        result_row = {'vertices': v, 'group': group}
        for exe in executables[group]:
            path = f"./bin/{exe}"
            if not os.path.exists(path):
                result_row[exe] = None
                continue
            
            # Base arguments: <binary> <vertices> <min_w> <max_w>
            cmd_parts = [path, v, benchmark_params['min_w'], benchmark_params['max_w']]
            # Dijkstra requires non-negative weights, so override min_w with 1
            if 'dijkstra' in exe:
                cmd_parts[2] = 1

            cmd_parts.append(benchmark_params['density'])

            # Hybrid runs get split_ratio inserted at index 4, just before density
            if 'hybrid' in exe:
                cmd_parts.insert(4, benchmark_params['split_ratio'])

            # OpenMP and hybrid runs get the thread count as the last argument
            if 'openmp' in exe or 'hybrid' in exe:
                cmd_parts.append(benchmark_params['threads'])
            
            cmd = ' '.join(map(str, cmd_parts))
            output = run_command(cmd)

            # Record None when the run failed or no timing line was found
            elapsed = parse_time(output) if output else None
            result_row[exe] = elapsed
            if elapsed is not None:
                print(f"    {exe}: {elapsed:.6f}s")
        all_results.append(result_row)

df = pd.DataFrame(all_results)
df.to_json("benchmark_results.json", orient='records', indent=4)
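
To make the argument handling in the loop above easier to follow, the sketch below isolates it into a pure function and prints the command strings it yields. `build_command` is a hypothetical helper that mirrors the loop's logic; the resulting argument order reflects what this notebook passes, not a documented CLI. The sample output line fed to the regular expression is likewise an assumed format.

```python
import re

def build_command(exe, v, params):
    """Mirror the benchmark loop: assemble the command string for one binary."""
    # Dijkstra variants get min_w = 1 to avoid negative weights
    min_w = 1 if 'dijkstra' in exe else params['min_w']
    parts = [f"./bin/{exe}", v, min_w, params['max_w'], params['density']]
    if 'hybrid' in exe:
        parts.insert(4, params['split_ratio'])  # lands just before density
    if 'openmp' in exe or 'hybrid' in exe:
        parts.append(params['threads'])         # thread count goes last
    return ' '.join(map(str, parts))

params = {'min_w': -10, 'max_w': 50, 'density': 0.1, 'threads': 4, 'split_ratio': 0.5}

print(build_command('BF_serial', 500, params))        # ./bin/BF_serial 500 -10 50 0.1
print(build_command('dijkstra_openmp', 500, params))  # ./bin/dijkstra_openmp 500 1 50 0.1 4
print(build_command('johnson_hybrid', 100, params))   # ./bin/johnson_hybrid 100 -10 50 0.5 0.1 4

# The same pattern parse_time uses, applied to an assumed sample line
sample = "Execution time: 0.012345 s"
match = re.search(r"time: ([\d.]+) s", sample)
print(float(match.group(1)))  # 0.012345
```

If a binary prints its timing in a different format (for example in milliseconds or scientific notation), the pattern will not match and the result is recorded as `None`.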

## 4. Analyze the Results

With the benchmarks complete, let's load the results into a pandas DataFrame and examine the raw data.

In [None]:
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
import numpy as np

try:
    df = pd.read_json('benchmark_results.json')
    print("Benchmark Results:")
    display(df.set_index('vertices'))
except FileNotFoundError:
    print("benchmark_results.json not found. Make sure the previous step ran successfully.")
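
Raw timings are often easier to compare as speedups over the serial baseline. The sketch below works on one row of the list-of-records layout that `to_json(orient='records')` writes, using only the standard library. The helper `speedups` is hypothetical, and the sample timings are made up purely to show the arithmetic; they are not benchmark results.

```python
def speedups(record, family):
    """Return {variant: serial_time / variant_time} for one algorithm family.

    `family` is a column prefix such as 'BF' or 'dijkstra'. Variants with a
    missing (None) time are skipped, and an empty dict is returned when the
    serial baseline itself is missing.
    """
    base = record.get(f"{family}_serial")
    if not base:
        return {}
    out = {}
    for key, value in record.items():
        if key.startswith(f"{family}_") and key != f"{family}_serial" and value:
            out[key] = base / value
    return out

# Made-up timings in seconds, for illustration only
row = {'vertices': 1000, 'group': 'sssp',
       'BF_serial': 2.0, 'BF_openmp': 0.5, 'BF_cuda': None}

print(speedups(row, 'BF'))  # {'BF_openmp': 4.0}
```

With real data, the rows can be obtained by reading `benchmark_results.json` with `json.load` and calling `speedups` on each record.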

### Performance Visualization

Now, let's plot the results to visualize the performance differences. We will create separate plots for SSSP and APSP algorithms, as their runtimes are on different scales.

In [None]:
if 'df' in locals():
    plt.style.use('seaborn-v0_8-whitegrid')
    fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(20, 8))
    fig.suptitle('Algorithm Performance Comparison', fontsize=18)

    # SSSP Algorithms
    df_sssp = df[df['group'] == 'sssp'].drop(columns='group').melt(id_vars=['vertices'], var_name='Algorithm', value_name='Time (s)').dropna()
    sns.lineplot(data=df_sssp, x='vertices', y='Time (s)', hue='Algorithm', marker='o', ax=ax1)
    ax1.set_title('SSSP Algorithm Performance', fontsize=16)
    ax1.set_xlabel('Number of Vertices', fontsize=12)
    ax1.set_ylabel('Execution Time (s) [Log Scale]', fontsize=12)
    ax1.set_yscale('log')
    ax1.legend(title='SSSP Variants')

    # APSP Algorithms
    df_apsp = df[df['group'] == 'apsp'].drop(columns='group').melt(id_vars=['vertices'], var_name='Algorithm', value_name='Time (s)').dropna()
    sns.lineplot(data=df_apsp, x='vertices', y='Time (s)', hue='Algorithm', marker='o', ax=ax2)
    ax2.set_title('APSP Algorithm Performance', fontsize=16)
    ax2.set_xlabel('Number of Vertices', fontsize=12)
    ax2.set_ylabel('Execution Time (s) [Log Scale]', fontsize=12)
    ax2.set_yscale('log')
    ax2.legend(title='APSP Variants')
    
    plt.tight_layout(rect=[0, 0, 1, 0.96])
    plt.show()

### Analysis

From the plots, we can draw several conclusions:

- **SSSP (Dijkstra vs. Bellman-Ford):** Dijkstra's algorithm should consistently outperform Bellman-Ford, as expected from their complexities: O((V + E) log V) with a binary heap or O(V^2) with an array for Dijkstra, versus O(VE) for Bellman-Ford. Bellman-Ford's advantage is its ability to handle negative weights, which comes at a performance cost. Note that the two are not timed on identical inputs here: the Dijkstra binaries run with `min_w` set to 1, while Bellman-Ford uses the configured `min_w` of -10.

- **APSP (Floyd-Warshall vs. Johnson's):** Floyd-Warshall runs in O(V^3) regardless of the number of edges, which can be competitive on dense graphs. Johnson's algorithm, with a complexity of O(VE + V^2 log V) when Dijkstra uses a Fibonacci heap, is typically better suited to sparse graphs. The benchmark results should show which side of this trade-off the graphs generated here (`density` of 0.1) fall on.

- **Parallelism (OpenMP/CUDA):** The parallel implementations (OpenMP, CUDA, Hybrid) should show speedups over their serial counterparts that grow with graph size. The massive parallelism of the GPU should make the CUDA variants the fastest for large problem sizes, though the overhead of host-device data transfer can outweigh the gains on smaller graphs.
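
The complexity figures above can be turned into rough operation counts for the benchmark sizes. The sketch below assumes that `density` is the fraction of the V(V-1) possible directed edges that are present, which is an assumption about the graph generator rather than something this notebook states. The counts drop constant factors, so they indicate growth trends, not predicted runtimes.

```python
import math

def edge_count(v, density):
    """Assumed edge model: density * V * (V - 1) directed edges."""
    return density * v * (v - 1)

def estimates(v, density):
    """Leading-order operation counts (constants dropped) per algorithm."""
    e = edge_count(v, density)
    log_v = math.log2(v)
    return {
        'dijkstra_heap': (v + e) * log_v,   # O((V + E) log V)
        'bellman_ford': v * e,              # O(VE)
        'floyd_warshall': v ** 3,           # O(V^3)
        'johnson': v * e + v * v * log_v,   # O(VE + V^2 log V)
    }

# APSP sizes and density taken from benchmark_params
for v in (50, 100, 200, 400):
    est = estimates(v, 0.1)
    ratio = est['floyd_warshall'] / est['johnson']
    print(f"V={v}: Floyd-Warshall / Johnson op-count ratio = {ratio:.1f}")
```

Under this assumed edge model, Johnson's count stays below V^3 for all four APSP sizes at a density of 0.1. Whether the measured times follow the same ordering depends on the constant factors and memory-access patterns that these estimates ignore.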

## 5. Conclusion

This analysis demonstrates the performance characteristics of various SSSP and APSP algorithms and their HPC implementations. The choice of algorithm depends heavily on the graph's properties (e.g., presence of negative weights, density), while the choice of implementation depends on the available hardware and the desired level of performance. For maximum speed on large-scale problems, GPU-accelerated solutions using CUDA are highly effective.