<a href="https://colab.research.google.com/github/anama-1104/cis677/blob/main/openmp_reduction_or_criticalsection.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Comparing reduction and critical section in OpenMP

In OpenMP there is support for:


1.   Reduction operations in for statements
2.   Critical sections

There are cases when it is possible to achieve the same goals by using either one of two alternatives:

1. Using a reduction and then looking for the element of interest
2. Using a critical section

This example illustrates both alternatives.



In [1]:
%%writefile reduc_vs_critical.cc
//
// Comparing reduce vs. critical section in OpenMP
// Create a vector of integers with 10,000,000 entries
// Fill with random integer positive values
// Find the largest element and its position in the vector
// Use two different approaches:
// 1. Perform a reduction and then look for the entry
// 2. Use a critical section to update the max value and its entry
// Time the different parts of the code
//
#include <omp.h>
#include <cstdlib>
#include <algorithm>
#include <iostream>
#include <vector>

const int SIZE = 10000000;
int main(int argc,char *argv[])
{
  std::vector < int > array(SIZE);

  double startGenerating,endGenerating;
  startGenerating = omp_get_wtime();
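  // Note: rand() is not required to be thread-safe. Implementations that
  // protect its internal state with a lock serialize these calls, so this
  // loop may gain little from running in parallel.
#pragma omp parallel for
  for(int i = 0;i < SIZE;i++) {
    array[i] = rand();
  }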
  endGenerating = omp_get_wtime();

  // Now compare two approaches to finding
  // the largest element in the array
  // 1. Perform a reduction and then find the corresponding element
  // 2. Use a critical section
  double startReduction,endReduction;
  startReduction = omp_get_wtime();
  int maxValue = 0;
#pragma omp parallel for reduction(max:maxValue)
  for(int i = 0;i < SIZE;i++) {
    maxValue = std::max(maxValue,array[i]);
  }
  endReduction = omp_get_wtime();
  double startFindMaxEntry,endFindMaxEntry;
  startFindMaxEntry = omp_get_wtime();
  int maxIndex = -1;
  for(int i = 0;i < SIZE;i++) {
    if (array[i] == maxValue) {
      maxIndex = i;
      break;
    }
  }
  endFindMaxEntry = omp_get_wtime();
  std::cout << "Max value: " << maxValue
       << " at position: " << maxIndex << std::endl;
  double startCriticalSection,endCriticalSection;
  startCriticalSection = omp_get_wtime();
  maxIndex = -1;
  maxValue = 0;
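#pragma omp parallel for shared(maxValue,maxIndex)
  for(int i = 0;i < SIZE;i++) {
    // Every iteration enters the critical section, so the threads
    // take turns and the comparisons are effectively serialized
    #pragma omp critical
    {
      if (array[i] > maxValue) {
        maxValue = array[i];
        maxIndex = i;
      }
    }
  }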
  endCriticalSection = omp_get_wtime();
  std::cout << "Max value: " << maxValue
	    << " at position: " << maxIndex << std::endl;
  std::cout << "Times: " << std::endl;
  std::cout << "Generation: " << (endGenerating - startGenerating)
	    << std::endl;
  std::cout << "Reduction: " << (endReduction - startReduction)
	    << std::endl;
  std::cout << "FindMaxEntry: " << (endFindMaxEntry - startFindMaxEntry)
	    << std::endl;
  std::cout << "Reduction + FindMaxEntry: " <<
    (endReduction - startReduction)+(endFindMaxEntry - startFindMaxEntry)
	    <<std::endl;
  std::cout << "Critical Section: " << (endCriticalSection - startCriticalSection)
	    << std::endl;
}


Writing reduc_vs_critical.cc


In [2]:
!g++ reduc_vs_critical.cc -o reduc_vs_critical -fopenmp -O3

In [3]:
!./reduc_vs_critical

Max value: 2147483025 at position: 4488871
Max value: 2147483025 at position: 4488871
Times: 
Generation: 0.285828
Reduction: 0.00546451
FindMaxEntry: 0.0133595
Reduction + FindMaxEntry: 0.018824
Critical Section: 0.342475


Which approach is faster, using a critical section or using reduction?


In the run above, the reduction took about 0.0055 s and the follow-up search for the index about 0.0134 s, roughly 0.019 s in total. The critical-section version took about 0.34 s, which is roughly 18 times slower.

Reduction is faster because each thread works on its own private copy of `maxValue` without any synchronization, and the partial results are combined once at the end of the loop. In the critical-section version, every one of the 10,000,000 iterations has to wait its turn to check and update the shared variables, so the comparisons are effectively serialized. The cost of reduction is a little extra memory, since each thread needs its own copy of the variable, but that is usually negligible next to the performance gain, so reduction is typically preferred whenever the task fits a reduction pattern. Critical sections are still important for more complex updates that a reduction cannot express, but they quickly become a bottleneck when they are entered frequently, and the contention tends to grow with the number of threads.
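A middle ground between the two approaches is to let each thread track its own maximum and index privately and enter the critical section only once, after its share of the loop is done. The following is a minimal sketch of that idea, not code from the program above; the function name `maxWithLocalCritical` and its return convention are assumptions made for illustration. It compiles with the same `-fopenmp` flag and also runs correctly (serially) without it.

```cpp
#include <utility>
#include <vector>

// Hypothetical helper, not part of the original program.
// Returns {maximum value, lowest index at which it occurs},
// or {0, -1} for an empty vector.
std::pair<int, long> maxWithLocalCritical(const std::vector<int>& data) {
  const long n = static_cast<long>(data.size());
  int maxValue = 0;
  long maxIndex = -1;

  #pragma omp parallel
  {
    // Private to each thread: no synchronization needed while scanning.
    int localMax = 0;
    long localIndex = -1;

    #pragma omp for nowait
    for (long i = 0; i < n; i++) {
      if (localIndex < 0 || data[i] > localMax ||
          (data[i] == localMax && i < localIndex)) {
        localMax = data[i];
        localIndex = i;
      }
    }

    // Entered once per thread instead of once per element.
    #pragma omp critical
    {
      if (localIndex >= 0 &&
          (maxIndex < 0 || localMax > maxValue ||
           (localMax == maxValue && localIndex < maxIndex))) {
        maxValue = localMax;
        maxIndex = localIndex;
      }
    }
  }
  return {maxValue, maxIndex};
}
```

Calling `maxWithLocalCritical(array)` on the vector from the program would return the maximum in `.first` and its position in `.second`. The number of critical-section entries drops from one per element to one per thread, which is the same reason the built-in reduction performs well.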





Which approach uses more memory?


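The reduction approach uses more memory, because each thread is given its own private copy of the reduction variable, and these copies are combined into the final result after the loop. A critical section works on a single shared copy, so it uses less. In this example the difference is one `int` per thread, which is negligible next to the 10,000,000-element array itself (about 40 MB, assuming a 4-byte `int`).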
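To make the per-thread private copy concrete, the sketch below uses a user-defined reduction (`declare reduction`, available since OpenMP 4.0) whose private copy is a small struct holding both the value and its index. This is an illustration under stated assumptions rather than part of the program above: the names `MaxLoc`, `combine`, `maxloc`, and `findMaxLoc` are all hypothetical. Each thread gets exactly one `MaxLoc`, so the extra memory is one small struct per thread.

```cpp
#include <vector>

// Hypothetical type, not part of the original program:
// a value together with its position.
struct MaxLoc {
  int value;
  long index;  // -1 means "no element seen yet"
};

// Keep the larger value; on a tie keep the lower index.
inline MaxLoc combine(const MaxLoc& a, const MaxLoc& b) {
  if (a.index < 0) return b;
  if (b.index < 0) return a;
  if (a.value != b.value) return (a.value > b.value) ? a : b;
  return (a.index < b.index) ? a : b;
}

// Tell OpenMP how to merge two private copies and how to initialize one.
#pragma omp declare reduction(maxloc : MaxLoc : omp_out = combine(omp_out, omp_in)) \
    initializer(omp_priv = MaxLoc{0, -1})

MaxLoc findMaxLoc(const std::vector<int>& data) {
  const long n = static_cast<long>(data.size());
  MaxLoc best{0, -1};
  #pragma omp parallel for reduction(maxloc : best)
  for (long i = 0; i < n; i++) {
    best = combine(best, MaxLoc{data[i], i});
  }
  return best;
}
```

With this form the maximum and its position come out of a single parallel pass, so no separate search loop or critical section is needed. Whether it beats the built-in `max` reduction followed by a serial search would have to be measured.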
