### Parallel Average with p Workers (C) [6 points]

The task is to compute the average of `n` numbers `a(0)`, ..., `a(n - 1)`. For example, for `n = 5`, the average can be computed in different ways:

      (a(0) + a(1) + a(2) + a(3) + a(4)) / 5
    = a(0) / 5 + a(1) / 5 + a(2) / 5 + a(3) / 5 + a(4) / 5
    = (a(0) + a(1) + a(2)) / 5 + (a(3) + a(4)) / 5

The last variant suggests a computation in parallel: one thread computes `(a(0) + a(1) + a(2)) / 5`, and a second thread computes `(a(3) + a(4)) / 5`; the main program collects the results of the two threads and adds them.

The program below computes the average of `n` random integers sequentially; you are asked to complete the parallel computation with `p` workers. The average is computed in both ways, and the times the sequential and parallel computation take are printed. The program reads `n` and `p` from the command line to make testing easier. [4 points]

In [1]:
%%writefile Average.c
#include <pthread.h>
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/time.h>
#include <time.h>

#define SHARED 1

struct Args {int *a; int l; int u; int n; double avg;};

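/* Thread body: averages the slice a[l..u-1] relative to the total count n. */
void *worker(void *p) {
    struct Args *arg = p;
    double s = 0;
    for (int i = arg->l; i < arg->u; i++) s += arg->a[i];
    arg->avg = s / arg->n;   /* partial contribution to the overall average */
    return NULL;
}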

double sequentialaverage(int a[], int n) {
    double s = 0;
    for (int i = 0; i < n; i++) s += a[i];
    return s / n;
}

static double parallelaverage(int a[], int n, int p) {
    pthread_t threads[p];
    pthread_attr_t attr;
    pthread_attr_init(&attr);
    pthread_attr_setscope(&attr, PTHREAD_SCOPE_SYSTEM);

    int perThread = n/p;
    struct Args workerList[p];
    for (int i = 0; i < p; i++){
        workerList[i].a = a;
        workerList[i].l = i*perThread;
        workerList[i].u = i<(p-1) ? (i+1)*perThread : n;
        workerList[i].n = n;
    }

    for(int i = 0; i<p; i++) pthread_create(&threads[i], &attr, worker, &workerList[i]);
    for(int i = 0; i<p; i++) pthread_join(threads[i], NULL);
    pthread_attr_destroy(&attr);

    double avg = 0;
    for(int i=0; i<p; i++) avg += workerList[i].avg;

    return avg;
}

/* main program: read command line and create threads */
int main(int argc, char *argv[]) {
    
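    if (argc < 3) {
        fprintf(stderr, "usage: %s n p\n", argv[0]);
        return 1;
    }
    int n = atoi(argv[1]);
    int p = atoi(argv[2]);
    if (n <= 0 || p <= 0) {
        fprintf(stderr, "n and p must be positive\n");
        return 1;
    }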
    int a[n];
    srand(time(NULL));
    for (int i = 0; i < n; i++) a[i] = rand() % 10000;
    
    struct timeval start, end;
    gettimeofday(&start, 0);
    double avg = sequentialaverage(a, n);
    gettimeofday(&end, 0);
    long seconds = end.tv_sec - start.tv_sec;
    long microseconds = end.tv_usec - start.tv_usec;
    long elapsed = seconds * 1000000L + microseconds;
    printf("Sequential: %f Time: %ld microseconds\n", avg, elapsed);
    
    gettimeofday(&start, 0);
    avg = parallelaverage(a, n, p);
    gettimeofday(&end, 0);
    seconds = end.tv_sec - start.tv_sec;
    microseconds = end.tv_usec - start.tv_usec;
    elapsed = seconds * 1000000L + microseconds;
    printf("Parallel:   %f Time: %ld microseconds\n", avg, elapsed);
}

Overwriting Average.c


In [2]:
!gcc Average.c -lpthread -Wno-incompatible-pointer-types -o Average

Run your implementation with the following values of `n`; you may also include more values. As each run can produce different timing results, run your implementation with the same value of `n` several times. The above program measures the elapsed time, not the CPU time. If there are other processes (users) on the same CPU, the elapsed time will be larger than the CPU time. If you are using a server, choose a time of the day with few other users. In multiple runs with the same parameter, smaller times approximate the CPU time better.

In [9]:
!./Average 1000000 1

Sequential: 4999.081910 Time: 2876 microseconds
Parallel:   4999.081910 Time: 3611 microseconds


In [10]:
!./Average 1000000 2

Sequential: 5000.099400 Time: 2844 microseconds
Parallel:   5000.099400 Time: 2155 microseconds


In [11]:
!./Average 1000000 4

Sequential: 5000.099400 Time: 2878 microseconds
Parallel:   5000.099400 Time: 1473 microseconds


In [12]:
!./Average 1000000 8

Sequential: 4997.073334 Time: 2846 microseconds
Parallel:   4997.073334 Time: 1168 microseconds


In [13]:
!./Average 1000000 16

Sequential: 4997.073334 Time: 2841 microseconds
Parallel:   4997.073334 Time: 1654 microseconds


In [15]:
!./Average 1000000 32

Sequential: 4996.406079 Time: 2837 microseconds
Parallel:   4996.406079 Time: 2371 microseconds


In [19]:
!./Average 500000 5

Sequential: 4996.197162 Time: 1436 microseconds
Parallel:   4996.197162 Time: 1051 microseconds


Run your implementation with different values of `n` and `p`; summarize and explain your observations. For each pair of values, run multiple times and take the smallest execution time, as those executions had less interference. Is the result of the sequential and parallel computation always the same? [2 points]

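While running the tests, I noticed that threads in C give a larger speedup than in Java. For arrays of 500,000 or more elements, the parallel computation is noticeably faster than the sequential one once `p` is at least 2; for smaller arrays, the sequential computation becomes faster, just as in Java. With `p = 1` the parallel version is slower than the sequential one (3611 vs. 2876 microseconds for `n = 1000000`), since it does the same work plus the cost of creating and joining a thread. For `n = 1000000`, the parallel time falls as `p` grows from 2 to 8 (2155, 1473, 1168 microseconds) and rises again for `p = 16` and `p = 32` (1654, 2371 microseconds), which suggests that beyond some number of threads the overhead outweighs the gain; the best `p` likely depends on the number of available cores. In all runs shown, the sequential and parallel computations print the same average to six decimal places, although adding separately rounded partial quotients could in principle differ from the sequential result in the last digits.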