<a href="https://colab.research.google.com/github/ja390/Parallel-Computing-Assignment3/blob/main/MPI.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

In [2]:
!sudo apt-get update -qq

W: Skipping acquire of configured file 'main/source/Sources' as repository 'https://r2u.stat.illinois.edu/ubuntu jammy InRelease' does not seem to provide it (sources.list entry misspelt?)


In [3]:
!sudo apt-get install -y openmpi-bin libopenmpi-dev > /dev/null

In [5]:
print("✔ OpenMPI Installed")
!mpirun --version

✔ OpenMPI Installed
mpirun (Open MPI) 4.1.2

Report bugs to http://www.open-mpi.org/community/help/


In [6]:
print("\nChecking nvcc (CUDA compiler):")
!nvcc --version


Checking nvcc (CUDA compiler):
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2024 NVIDIA Corporation
Built on Thu_Jun__6_02:18:23_PDT_2024
Cuda compilation tools, release 12.5, V12.5.82
Build cuda_12.5.r12.5/compiler.34385749_0


In [1]:
print("\nChecking GPU:")
!nvidia-smi


Checking GPU:
Wed Dec  3 07:01:35 2025       
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 550.54.15              Driver Version: 550.54.15      CUDA Version: 12.4     |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|   0  Tesla T4                       Off |   00000000:00:04.0 Off |                    0 |
| N/A   42C    P8              9W /   70W |       0MiB /  15360MiB |      0%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
                                 

In [7]:
%%writefile kmeans_mpi.c
#include <stdio.h>
#include <stdlib.h>
#include <math.h>
#include <mpi.h>

#define MAX_ITER 100
#define K 3   // Number of clusters

typedef struct {
    double x, y;
    int cluster;
} Point;

typedef struct {
    double x, y;
} Centroid;

// Euclidean distance
double distance(Point p, Centroid c) {
    return sqrt((p.x - c.x)*(p.x - c.x) + (p.y - c.y)*(p.y - c.y));
}

int main(int argc, char *argv[]) {
    MPI_Init(&argc, &argv);

    int rank, size;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    double start_time, end_time;

    // Dataset (can scale up for performance testing)
    int n = 9;
    Point *allPoints = NULL;

    if (rank == 0) {
        allPoints = (Point*)malloc(n * sizeof(Point));
        Point temp[9] = {
            {1,2}, {2,1}, {3,3},
            {8,8}, {9,8}, {8,9},
            {4,5}, {5,6}, {6,5}
        };
        for (int i = 0; i < n; i++) allPoints[i] = temp[i];
    }

    Centroid centroids[K] = {{2,2}, {8,8}, {5,5}};

    // Calculate number of points per process
    int base_n = n / size;
    int remainder = n % size;
    int local_n = (rank < remainder) ? base_n + 1 : base_n;

    Point *localPoints = (Point*)malloc(local_n * sizeof(Point));

    // Scatter points
    int *sendcounts = NULL;
    int *displs = NULL;

    if (rank == 0) {
        sendcounts = (int*)malloc(size * sizeof(int));
        displs = (int*)malloc(size * sizeof(int));
        int offset = 0;
        for (int i = 0; i < size; i++) {
            sendcounts[i] = (i < remainder) ? base_n + 1 : base_n;
            displs[i] = offset;
            offset += sendcounts[i];
        }
    }

    MPI_Scatterv(
        allPoints, sendcounts, displs, MPI_BYTE,
        localPoints, local_n * sizeof(Point), MPI_BYTE,
        0, MPI_COMM_WORLD
    );

    MPI_Barrier(MPI_COMM_WORLD);
    start_time = MPI_Wtime();

    // Main iteration
    for (int iter = 0; iter < MAX_ITER; iter++) {
        // Assign points to nearest cluster
        for (int i = 0; i < local_n; i++) {
            double minDist = INFINITY;
            int best = 0;
            for (int j = 0; j < K; j++) {
                double d = distance(localPoints[i], centroids[j]);
                if (d < minDist) {
                    minDist = d;
                    best = j;
                }
            }
            localPoints[i].cluster = best;
        }

        // local sums and counts
        double local_sumX[K] = {0}, local_sumY[K] = {0};
        int local_count[K] = {0};
        for (int i = 0; i < local_n; i++) {
            int c = localPoints[i].cluster;
            local_sumX[c] += localPoints[i].x;
            local_sumY[c] += localPoints[i].y;
            local_count[c]++;
        }

        // Reduce to global sums
        double global_sumX[K], global_sumY[K];
        int global_count[K];
        MPI_Allreduce(local_sumX,  global_sumX,  K, MPI_DOUBLE, MPI_SUM, MPI_COMM_WORLD);
        MPI_Allreduce(local_sumY,  global_sumY,  K, MPI_DOUBLE, MPI_SUM, MPI_COMM_WORLD);
        MPI_Allreduce(local_count, global_count, K, MPI_INT,    MPI_SUM, MPI_COMM_WORLD);

        // Update centroids on root
        if (rank == 0) {
            for (int j = 0; j < K; j++) {
                if (global_count[j] > 0) {
                    centroids[j].x = global_sumX[j] / global_count[j];
                    centroids[j].y = global_sumY[j] / global_count[j];
                }
            }
        }

        // Broadcast updated centroids
        MPI_Bcast(centroids, K * sizeof(Centroid), MPI_BYTE, 0, MPI_COMM_WORLD);
    }

    MPI_Barrier(MPI_COMM_WORLD);
    end_time = MPI_Wtime();

    if (rank == 0) {
        printf("\n MPI K-Means Results\n");
        printf("Processes Used: %d\n", size);
        printf("Execution Time: %f seconds\n", end_time - start_time);
        printf("\nFinal Centroids:\n");
        for (int j = 0; j < K; j++) {
            printf("Centroid %d: (%.2f, %.2f)\n", j, centroids[j].x, centroids[j].y);
        }
    }

    free(localPoints);
    if (rank == 0) {
        free(allPoints);
        free(sendcounts);
        free(displs);
    }

    MPI_Finalize();
    return 0;
}


Writing kmeans_mpi.c


In [8]:
!mpicc kmeans_mpi.c -o kmeans_mpi -lm


In [10]:
!OMPI_ALLOW_RUN_AS_ROOT=1 OMPI_ALLOW_RUN_AS_ROOT_CONFIRM=1 mpirun --allow-run-as-root --oversubscribe -np 1 ./kmeans_mpi


 MPI K-Means Results
Processes Used: 1
Execution Time: 0.000044 seconds

Final Centroids:
Centroid 0: (0.11, 0.00)
Centroid 1: (8.00, 8.00)
Centroid 2: (5.00, 5.00)


In [9]:
!OMPI_ALLOW_RUN_AS_ROOT=1 OMPI_ALLOW_RUN_AS_ROOT_CONFIRM=1 mpirun --allow-run-as-root --oversubscribe -np 2 ./kmeans_mpi


 MPI K-Means Results
Processes Used: 2
Execution Time: 0.001236 seconds

Final Centroids:
Centroid 0: (1144280264902606642906518901936106413685994075226799276395750030922870048613948610338828670840897309381417757991890339060005125589496811823373049042558183634386958059683963143898247586149042679608255770206244586335497317218300341457509599412224.00, 0.00)
Centroid 1: (8.00, 8.00)
Centroid 2: (0.00, 0.00)


In [11]:
!OMPI_ALLOW_RUN_AS_ROOT=1 OMPI_ALLOW_RUN_AS_ROOT_CONFIRM=1 mpirun --allow-run-as-root --oversubscribe -np 4 ./kmeans_mpi


 MPI K-Means Results
Processes Used: 4
Execution Time: 0.002475 seconds

Final Centroids:
Centroid 0: (0.00, 0.00)
Centroid 1: (8.00, 8.00)
Centroid 2: (5.00, 5.00)


In [12]:
!OMPI_ALLOW_RUN_AS_ROOT=1 OMPI_ALLOW_RUN_AS_ROOT_CONFIRM=1 mpirun --allow-run-as-root --oversubscribe -np 8 ./kmeans_mpi


 MPI K-Means Results
Processes Used: 8
Execution Time: 0.015567 seconds

Final Centroids:
Centroid 0: (0.00, 0.00)
Centroid 1: (8.00, 8.00)
Centroid 2: (5.00, 5.00)


In [13]:
!OMPI_ALLOW_RUN_AS_ROOT=1 OMPI_ALLOW_RUN_AS_ROOT_CONFIRM=1 mpirun --allow-run-as-root --oversubscribe -np 16 ./kmeans_mpi


 MPI K-Means Results
Processes Used: 16
Execution Time: 0.055851 seconds

Final Centroids:
Centroid 0: (0.00, 0.00)
Centroid 1: (8.00, 8.00)
Centroid 2: (5.00, 5.00)
