Sorting Algorithm Optimization using OpenMP and MPI

This project aims to optimize classical sorting algorithms using OpenMP and MPI. The project contains two folders, one for OpenMP and the other for MPI.

Aims

This project aims to design and develop serial and parallel solutions for three sorting algorithms: fast sort, enumeration sort, and merge sort using OpenMP and MPI to accelerate codes.

Tasks

Random generation of an array with double-precision floating-point numbers.
A serial solution for each sorting algorithm that sorts the generated array.
A parallel solution for each sorting algorithm that sorts the array on the CPU using OpenMP or MPI.

OpenMP

Introduction

QuickSort

Principle：The main principle of quick sorting is left and right exchange sorting, first, we determine a middle value, the array is divided into left and right two parts, from right to left to find smaller than the middle value, from left to right to find larger than the middle value, exchange each other's position, and so on, and so on, finally get a sequence that the middle value to the left is all smaller than the middle value, the middle value to the right is larger than the middle value. At this point, the left and right sides can be seen as two new arrays, continue to use the above principle until the end of the loop.
EnumerationSort

Principle：Enumeration sorting is to compare the size of all elements with all other elements and find out how many elements are smaller than themselves, so as to obtain the position (rank) of the elements. The position of each element is unique.
MergeSort

Principle: The array is split into individual elements, so that relative to itself is ordered, two adjacent elements compare size, so that the loop, and finally merged into a whole orderly sequence.

Requirement

Random generation of an array with double-precision floating-point numbers:

std::random_device rd;
std::mt19937 gen(rd());
std::uniform_real_distribution<double> random(1.0, 100000.0);

for (int i = 0; i < n; i++) {
    a[i] = random(gen);
    // std::cout << a[i] << ", ";
}

All of my code uses the same random function as shown in the figure to display double random numbers.

liuyu@Ames-MacBookPro 5507 % ./a.out
3.14555 1.57566 2.67936 4.57623 3.99636 2.41742 8.83001 727.742 3.27268 7.91091

A serial solution for each sorting algorithm to sort the generated array:

[1] QuickSort

Liuyu @ Ames-macbookpro 5507 g++-11 -openmp Quicksortopenmp2.cpp
Liuyu @ Ames-macbookpro 5507 % ./a.out
1.54195 3.08108 3.4174 4.60761 5.46417 7.97852 8.76398 8.84981 9.66999 9.84632

[2] EnumerationSort

[3] MergeSort

A parallel solution for each sorting algorithm to sort the array with OpenMP on CPU;

The parallel solution for each arrangement algorithm will be described in detail in the Report section.

OpenMP Codes Algorithm

The C++ codes of your serial and parallel solutions.

[1] QuickSort

serial codes:

// QuickSort function to sort an array in ascending order
void QuickSort(double *arr, int low, int high) {
    // Get the base value from the array
    double temp = arr[low];

    // Initialize two pointers, i and j, to traverse the array from both ends
    int i = low;
    int j = high;

    // While i and j haven't crossed and the current element is greater than the base value, move j left
    while (i < j && arr[j] > temp) {
        j--;
    }
    // When the current element is less than or equal to the base value, fill it in at i and move i right
    if (i < j) {
        arr[i] = arr[j];
        i++;
    }

    // While i and j haven't crossed and the current element is less than or equal to the base value, move i right
    while (i < j && arr[i] <= temp) {
        i++;
    }
    // When the current element is greater than the base value, fill it in at j and move j left
    if (i < j) {
        arr[j] = arr[i];
        j--;
    }

    // After i and j have crossed, fill in the base value at the correct position in the array
    arr[i] = temp;

    // Recursively sort the subarrays on the left and right of the base value
    if (low < i - 1) {
        QuickSort(arr, low, i - 1);
    }
    if (i + 1 < high) {
        QuickSort(arr, i + 1, high);
    }
}

parallel codes:

My parallel idea is to divide the original array into two and sort them quickly respectively. After the quick sorting, the merging sorting of the two arrays can make the algorithm with the lowest time complexity and the most eﬃcient algorithm.

#pragma omp parallel sections
{
    #pragma omp section
    MergeSort(d, 0, m-1); // perform MergeSort on the left array
    
    #pragma omp section
    MergeSort(e, 0, m-1); // perform MergeSort on the right array
}

int left = 0, right = 0, k = 0;
while (left < m && right < m) // continue while neither array is finished
{
    // compare the leftmost elements of the two arrays
    if (d[left] <= e[right])
    {
        b[k++] = d[left++]; // add the smaller element to the output array
    }
    else
    {
        b[k++] = e[right++]; // add the smaller element to the output array
    }
}

while (left < m) // if the left array is unfinished
{
    b[k++] = d[left++]; // add the remaining elements to the output array
}

while (right < m) // if the right array is unfinished
{
    b[k++] = e[right++]; // add the remaining elements to the output array
}

[2] EnumerationSort

serial codes:

void enumSort(double *a, double *b, int n)
{
    // Iterate over each element in the array
    for (int i = 0; i < n; i++)
    {
        int k = 0;
        
        // Compare element i with every other element in the array
        for (int j = 0; j < n; j++)
        {
            if (a[i] > a[j])
                k++;
        }
        
        // Find the next available slot in array b
        while (b[k] != 0)
            k++;
        
        // Insert the element from array a into array b at position k
        b[k] = a[i];
    }
}

Create two arrays. Array A is the sequence to be sorted, and array B is used to store the sorted sequence.

Two for loops are used to perform enumeration sorting. The external for loop (i = 0... i++) selects a value to compare with each number in the sequence. The purpose of this loop is to iterate through all values in array A.

The purpose of the second for loop (j = 0... j++) is to select a value to compare with each number in the sequence, which is "each number". After executing the for loop twice, we can compare each number in the sequence with all the other numbers in the sequence.

If a[i] > a[j], then the k++ operation is performed, which means to move the position of a[i] to the right. For example, if the leftmost number a[0] in the sequence is larger than the five numbers in the sequence, k++ is executed five times.

Starting with the first number a[i], we compare it with all the numbers in the sequence. If a[i] is smaller than the numbers in the sequence, we put a[i] in the position of the corresponding array B.

The purpose of the while loop is to determine whether the position of array B has a value at this time. If it does, the old original value will be replaced by the new value. For example, 33325 may be arranged as 23 empty 5.

Finally, b[k] = a[i] puts the compared number in the correct position in array B.

void paEnumSort(double *a, double *b, int n) 
{
#pragma omp parallel
    {
#pragma omp for
        for (int i = 0; i < n; i++) 
        {
            int k = 0;
            for (int j = 0; j < n; j++) 
            {
                if (a[i] > a[j])
                    k++;
            }
            while (b[k] != 0)
                k++;
            b[k] = a[i];
        }
    }
}

[3] MergeSort

serial codes:

void MergeSort(double arr[], int low, int high)
{
    if (low >= high) { return; } // base case: stop if sequence length is 1
    
    int mid = low + (high - low) / 2; // get middle element
    
    MergeSort(arr, low, mid); // recursive call on left half
    MergeSort(arr, mid + 1, high); // recursive call on right half
    Merge(arr, low, mid, high); // merge the two halves
}

void Merge(double arr[], int low, int mid, int high)
{
    int i = low, j = mid + 1, k = 0; // initialize indices for left and right subarrays and temp array
    double *temp = new double[high - low + 1]; // create temporary array to store merged ordered sequence
    
    while (i <= mid && j <= high) { // compare elements in both subarrays and place them in temp in order
        if (arr[i] <= arr[j]) {
            temp[k++] = arr[i++];
        } else {
            temp[k++] = arr[j++];
        }
    }
    
    while (i <= mid) { // copy remaining elements in left subarray to temp
        temp[k++] = arr[i++];
    }
    
    while (j <= high) { // copy remaining elements in right subarray to temp
        temp[k++] = arr[j++];
    }
    
    for (i = low, k = 0; i <= high; i++, k++) { // copy elements from temp back to original array
        arr[i] = temp[k];
    }
    
    delete[] temp; // free up memory
}

parallel codes:

The parallel idea of merge sort is similar to that of quicksort. I choose to divide the array into two, and then run merge sort twice in parallel, and then sort and merge the two arrays after sorting.

#pragma omp parallel sections
{
#pragma omp section
    MergeSort(d, 0, m-1);
#pragma omp section
    MergeSort(e, 0, m-1);
}
while (left < m && right < m) // When the left and right arrays are not finished.
{
    if (d[left] <= e[right])
        b[k++] = d[left++]; // If the smaller value is in the left array, move the left pointer one unit to the right.
    else
        b[k++] = e[right++]; // If the smaller value is in the right array, move the right pointer one unit to the right.
}
while (left < m)
    b[k++] = d[left++]; // Copy the remaining values from the left array to the merged array.
while (right < m)
    b[k++] = e[right++]; // Copy the remaining values from the right array to the merged array.

Experimental Data for OpenMP Codes

Time	Sort	Types	Sequences	Rating
0.0155	QuickSort	Serial	1	1.8452381
0.0159	QuickSort	Serial	2	1.82758621
0.0162	QuickSort	Serial	3	1.90588235
0.0157	QuickSort	Serial	4	1.8045977
0.0158	QuickSort	Serial	5	1.8372093
0.0084	QuickSort	Parallel	1
0.0087	QuickSort	Parallel	2
0.0085	QuickSort	Parallel	3
0.0087	QuickSort	Parallel	4
0.0086	QuickSort	Parallel	5	8.23889189
45.8	EnumerationSort	Serial	1	8.69720624
42.338	EnumerationSort	Serial	2	8.77505155
42.559	EnumerationSort	Serial	3	8.6644898
42.456	EnumerationSort	Serial	4	8.65991
42.3383	EnumerationSort	Serial	5
5.559	EnumerationSort	Parallel	1
4.868	EnumerationSort	Parallel	2
4.85	EnumerationSort	Parallel	3
4.9	EnumerationSort	Parallel	4
4.889	EnumerationSort	Parallel	5
0.0291	MergeSort	Serial	1
0.0344	MergeSort	Serial	2
0.034	MergeSort	Serial	3
0.0341	MergeSort	Serial	4	1.02352941
0.0344	MergeSort	Serial	5	1.97109827
0.015	MergeSort	Parallel	1
0.017	MergeSort	Parallel	2
0.0176	MergeSort	Parallel	3
0.0173	MergeSort	Parallel	4
0.0172	MergeSort	Parallel	5

Conclusion

The improvement ratio of parallel quicksort and parallel mergesort is about 100%, doubling the speed; The parallel enumeration sort is improved by about 750%. We find that concurrent programs greatly improve the enumeration sort of serial programs; Finally, on the whole, quick sort is the best sort;

MPI

Introduction to MPI

OpenMP and MPI are two means of parallel programming. The comparison is as follows: OpenMP: thread level; Shared storage. MPI: process level; Distributed storage. The last project optimized OpenMP for three serial sorting, and this project optimized MPI.

MPI Function

MPI_Init(int* argc, char*** argv)

This function initializes the MPI environment, creates MPI global or internal variables, and assigns a unique rank to each process.
```
#include <mpi.h>

int main(int argc, char** argv) {
    MPI_Init(&argc, &argv);
    // ...
    MPI_Finalize();
    return 0;
}
```

MPI_Wtime()

This function returns the wall-clock time in seconds.

#include <mpi.h>
#include <iostream>

int main(int argc, char** argv) {
    MPI_Init(&argc, &argv);
    double start_time = MPI_Wtime();
    // ...
    double end_time = MPI_Wtime();
    std::cout << "Elapsed time: " << end_time - start_time << " seconds\n";
    MPI_Finalize();
    return 0;
}

MPI_Comm_size(MPI_Comm communicator, int* size)

This function returns the size of the communicator, i.e., the number of processes available in the communicator.

#include <mpi.h>
#include <iostream>

int main(int argc, char** argv) {
    MPI_Init(&argc, &argv);
    MPI_Comm comm = MPI_COMM_WORLD;
    int size;
    MPI_Comm_size(comm, &size);
    std::cout << "Number of processes: " << size << std::endl;
    MPI_Finalize();
    return 0;
}

MPI_Comm_rank(MPI_Comm communicator, int* rank)

This function returns the rank of the current process in the communicator, i.e., a unique number starting from 0 that specifies the corresponding process when sending or receiving information.

#include <mpi.h>
#include <iostream>

int main(int argc, char** argv) {
    MPI_Init(&argc, &argv);
    MPI_Comm comm = MPI_COMM_WORLD;
    int rank;
    MPI_Comm_rank(comm, &rank);
    std::cout << "Rank of this process: " << rank << std::endl;
    MPI_Finalize();
    return 0;
}

MPI_Send(void* data, int count, MPI_Datatype datatype, int destination, int tag, MPI_Comm communicator)

: This function sends count data of type datatype to the destination process with rank destination. The data tag is tag and the communicator is communicator.

#include <mpi.h>
#include <iostream>

int main(int argc, char** argv) {
    MPI_Init(&argc, &argv);
    MPI_Comm comm = MPI_COMM_WORLD;
    int data = 42;
    int dest = 1;
    int tag = 0;
    MPI_Send(&data, 1, MPI_INT, dest, tag, comm);
    MPI_Finalize();
    return 0;
}

MPI_Recv(void* data, int count, MPI_Datatype datatype, int source, int tag, MPI_Comm communicator, MPI_Status* status)

This function receives data of type datatype with tag tag from the source process with rank source. The maximum length of the data is count, and the actually received data length and tag are saved in status. The rank of the source process and the communicator are also specified.
```
#include <mpi.h>
#include <iostream>

int main(int argc, char** argv) {
    MPI_Init(&argc, &argv);
    MPI_Comm comm = MPI_COMM_WORLD;
    int data
```
MPI_Comm_rank

This code gets the rank of the current process in the communicator MPI_COMM_WORLD and saves it in the variable rank.
```
int rank;
MPI_Comm_rank(MPI_COMM_WORLD, &rank);
```
MPI_Send

This code sends count number of data of type datatype from the buffer pointed to by &data to the process with rank destination in the communicator MPI_COMM_WORLD. The message is tagged with tag.
```
MPI_Send(&data, count, datatype, destination, tag, MPI_COMM_WORLD);
```
MPI_Recv

This code receives a message of up to count elements of type datatype from the process with rank source in the communicator MPI_COMM_WORLD. The message must have tag tag. The received data is stored in the buffer pointed to by &data. Information about the received message is stored in the MPI_Status object pointed to by &status.
```
MPI_Recv(&data, count, datatype, source, tag, MPI_COMM_WORLD, &status);
```
MPI_Bcast

This code broadcasts a message from the process with rank source_proc to all other processes in the communicator comm. The message consists of count elements of type datatype stored in the buffer pointed to by &data.
```
MPI_Bcast(&data, count, datatype, source_proc, comm);
```

MPI Codes Algorithm

Design idea

Sort Algorithms The design principle for all three sorting algorithms is to first divide the unordered array into two equal parts, as it is optimized for dual-core CPU with dual-process design. Then sort the two processes with rank == 1 and rank == 0 respectively. Finally, send the ordered array with rank == 1 to rank == 0 and perform the final merge at rank 0.
Random Array Generator and Verification Program The design principle for generating the random array program is to first save the generated random array in binary format to unsorted_array.txt. Then read and verify the random array, with rank 1 being responsible for this task. The data is then sent to rank 0, which is responsible for verifying that the randomly generated array is the same as the data in unsorted_array.txt.
Serial and Parallel Sorting Verification The principle for verifying serial and parallel sorting is similar to that of the random array generator and verification program. First, save the two sorted data in binary format into a text file. Then call the verification program, which is similar to the random array verification program. If the output is "All things same", it proves that the two sorting results are the same.
Mixed Programming of MPI and OpenMP First, the multi-processes of MPI are used for common execution, and OpenMP optimizes the for loop from the thread to realize mixed programming. All three sorts use this idea.

MPI Codes

Quick sort:

if(rank==1){
    QuickSort(b1,0,m-1);
    MPI_Send(b1,m,MPI_DOUBLE,0,0,MPI_COMM_WORLD);
}
else if(rank==0){
    MPI_Recv(b1,m,MPI_DOUBLE,1,0,MPI_COMM_WORLD,MPI_STATUS_IGNORE);
    QuickSort(b2,0,m-1);
}
// Initialize a counter k, left pointer, and right pointer.
int k = 0, left = 0, right = 0;
while(left<m && right<m)
{
    // Compare the first element of the two arrays and put the smaller one into the result array.
    if(b1[left]<=b2[right])
        b[k++]=b1[left++];
    else
        b[k++]=b2[right++];
}
// If the left array has any remaining elements, append them to the result array.
while(left<m)
    b[k++]=b1[left++];
// If the right array has any remaining elements, append them to the result array.
while(right<m)
    b[k++]=b2[right++];
// The final array b[N] is the final QuickSort_MPI sorted array.

If the rank is 1, it performs QuickSort on the left half of the array and sends the sorted array to rank 0 using MPI_Send().

If the rank is 0, it receives the sorted array from rank 1 using MPI_Recv() and performs QuickSort on the right half of the array.

The merged array is constructed by comparing the first element of the left and right subarrays and selecting the smaller one.

If there are any remaining elements in the left or right subarray, they are appended to the result array. The final array b[N] is the sorted array after QuickSort_MPI.

Merge sort:
```
if (rank == 1) {
    MergeSort(b1, 0, m - 1);
    MPI_Send(b1, m, MPI_DOUBLE, 0, 0, MPI_COMM_WORLD);
}
else if (rank == 0) {
    MPI_Recv(b1, m, MPI_DOUBLE, 1, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
    MergeSort(b2, 0, m - 1);
}

// Merge the two sorted subarrays
while (left < m && right < m) {
    if (b1[left] <= b2[right])
        b[k++] = b1[left++];
    else
        b[k++] = b2[right++];
}

// Copy any remaining elements from left or right subarray
while (left < m)
    b[k++] = b1[left++];
while (right < m)
    b[k++] = b2[right++];

// The final array b[N] is the sorted array after MergeSort_MPI.
```
The MergeSort_MPI algorithm is a parallel sorting algorithm that utilizes both MPI and MergeSort to efficiently sort a large dataset. The algorithm is divided into two main parts:
1. Parallel Sorting: If the rank is 1, it sorts the left half of the array using MergeSort and sends the sorted array to rank 0 using MPI_Send(). If the rank is 0, it receives the sorted array from rank 1 using MPI_Recv() and sorts the right half of the array using MergeSort.
2. Merging: The merged array is constructed by comparing the first element of the left and right subarrays and selecting the smaller one. If there are any remaining elements in the left or right subarray, they are appended to the result array. The final array b[N] is the sorted array after MergeSort_MPI.
This algorithm is highly efficient for large datasets and takes advantage of both parallel processing and the efficient MergeSort algorithm.

Enumeration sort

if (rank == 1) {
    // Sort the left half of the array using Enumeration Sort
    enumSort(b1, c1, m);
    // Send the sorted array to rank 0
    MPI_Send(c1, m + 1, MPI_DOUBLE, 0, 0, MPI_COMM_WORLD);
}
else if (rank == 0) {
    // Receive the sorted array from rank 1
    MPI_Recv(c1, m + 1, MPI_DOUBLE, 1, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
    // Sort the right half of the array using Enumeration Sort
    enumSort(b2, c2, m);
}

// Merge the two sorted arrays
while (left < m && right < m) {
    if (c1[left] <= c2[right])
        b[k++] = c1[left++];
    else
        b[k++] = c2[right++];
}
while (left < m)
    b[k++] = c1[left++];
while (right < m)
    b[k++] = c2[right++];

// The final array b[N] is the sorted array after EnumerationSort_MPI

The code performs Enumeration Sort on the left half of the array if the rank is 1, and sends the sorted array to rank 0 using MPI_Send(). If the rank is 0, it receives the sorted array from rank 1 using MPI_Recv() and performs Enumeration Sort on the right half of the array.

The merged array is constructed by comparing the first element of the left and right subarrays and selecting the smaller one. If there are any remaining elements in the left or right subarray, they are appended to the result array.

The final array b[N] is the sorted array after EnumerationSort_MPI.

MPI Experiment Data

10000	Type	Average Time	Acceleration ratio	Promotion rate
QuickSort	Serial	0.0011985	1.0626	6.26%
QuickSort	Parallel	0.0012735	-	-
EnumerationSort	Serial	0.21271175	1.524	52.40%
EnumerationSort	Parallel	0.13957278	-	-
MergeSort	Serial	0.001887	1.0728	7.28%
MergeSort	Parallel	0.001759	-	-
-------	---------	------------	------------------	--------------
100000	Type	Average Time	Acceleration ratio	Promotion rate
QuickSort	Serial	0.01341475	1.12528	12.53%
QuickSort	Parallel	0.01192125	-	-
EnumerationSort	Serial	10.626163	1.995	99.50%
EnumerationSort	Parallel	5.3267382	-	-
MergeSort	Serial	0.0194187	1.1023	10.23%
MergeSort	Parallel	0.0176165	-	-

Conclusion

From the final tables and charts we can find: Quick_Sort the dual-core CPU improvement rate in MPI parallel programs that process big data is more than 10%, and so on, the multicore CPU improvement rate should be higher. Enumeration_Sort the dual-core CPU improvement rate in MPI parallel programs that process big data is close to 100%, and so on the multicore CPU improvement rate should be higher. Merge_Sort the two-core CPU boost rate in MPI parallel programs that handle big data should be higher at 10% for life.

Name		Name	Last commit message	Last commit date
Latest commit History 41 Commits
MPI		MPI
OpenMP		OpenMP
README.md		README.md

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Sorting Algorithm Optimization using OpenMP and MPI

Aims

Tasks

OpenMP

Introduction

Requirement

OpenMP Codes Algorithm

Experimental Data for OpenMP Codes

Conclusion

MPI

Introduction to MPI

MPI Function

MPI Codes Algorithm

Design idea

MPI Codes

MPI Experiment Data

Conclusion

Experimental Environment

About

Releases

Packages

Languages

Ameame1/Optimizing_Classical_Sorting_by_OpenMP-MPI

Folders and files

Latest commit

History

Repository files navigation

Sorting Algorithm Optimization using OpenMP and MPI

Aims

Tasks

OpenMP

Introduction

Requirement

OpenMP Codes Algorithm

Experimental Data for OpenMP Codes

Conclusion

MPI

Introduction to MPI

MPI Function

MPI Codes Algorithm

Design idea

MPI Codes

MPI Experiment Data

Conclusion

Experimental Environment

About

Topics

Resources

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages