PAPI Parallel Programs

Treece-Burgess edited this page Jan 30, 2024 · 16 revisions

Using PAPI with Parallel Programs

Threads

A thread is an independent flow of instructions that can be scheduled to run by the operating system. Multi-threaded programming is a form of parallel programming where several controlled threads are executing concurrently in the program. All threads execute in the same memory space, and can therefore work concurrently on shared data. Threads can run in parallel on several processors, allowing a single program to divide its work between several processors, thus running faster than a single-threaded program, which runs on only one processor at a time.

In PAPI, each thread is responsible for the creation, start, stop, and read of its own counters. When a thread is created, it inherits no PAPI information or state from the calling thread unless explicitly specified.

For those using Pthreads, take care to set the contention scope of each thread to PTHREAD_SCOPE_SYSTEM, unless the system is known to have a non-hybrid thread library implementation. PAPI does not explicitly support unbound or user threads, but it should work, and the counts will reflect totals for the underlying bound thread. PAPI supports threading agnostically by allowing the user to specify the function that returns the current thread ID. For nearly all platforms, this will be the pthread_self function. If your system has some other way of identifying a unique kernel thread with a PMU context, it should be specified here.
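As a minimal sketch of requesting system contention scope with Pthreads, the following creates one thread with PTHREAD_SCOPE_SYSTEM set in its attributes. The names worker and run_system_scope_thread are hypothetical, and the thread body is a placeholder for the per-thread PAPI calls.

```c
#include <pthread.h>

/* Placeholder thread body; per-thread PAPI calls would go here. */
static void *worker(void *arg)
{
    (void) arg;
    return NULL;
}

/* Create and join one thread whose attributes request system
   contention scope. Returns 0 on success, or the error number
   of the first pthread call that failed. */
static int run_system_scope_thread(void)
{
    pthread_attr_t attr;
    pthread_t thread;
    int rc;

    rc = pthread_attr_init(&attr);
    if (rc != 0)
        return rc;

    /* Ask for the thread to be scheduled against all threads on the system */
    rc = pthread_attr_setscope(&attr, PTHREAD_SCOPE_SYSTEM);
    if (rc == 0)
        rc = pthread_create(&thread, &attr, worker, NULL);
    if (rc == 0)
        rc = pthread_join(thread, NULL);

    pthread_attr_destroy(&attr);
    return rc;
}
```

POSIX lists contention scope among the attributes governed by the inherit-scheduler attribute, so on some systems it may also be necessary to call pthread_attr_setinheritsched with PTHREAD_EXPLICIT_SCHED before the scope in attr takes effect.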

Initialization of Thread Support

Thread support in the PAPI library can be initialized by calling the following low-level function:

C:

int retval = PAPI_thread_init(pthread_self);

Arguments for PAPI_thread_init:

  • pthread_self -- function that returns the current thread ID. Unless you know otherwise, you should always use pthread_self as the argument to PAPI_thread_init. Never use omp_get_thread_num.
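For systems that identify kernel threads some other way, the ID function only needs to take no arguments and return an unsigned long that is unique per thread. The sketch below is a Linux-specific assumption, not a PAPI recommendation: it uses the gettid system call, and the names kernel_thread_id, record_id, and thread_ids_differ are hypothetical. pthread_self remains the usual choice.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE   /* for syscall() on glibc */
#endif
#include <pthread.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Hypothetical ID function: no arguments, returns an unsigned long.
   On Linux, the kernel thread ID is unique per live thread. */
static unsigned long kernel_thread_id(void)
{
    return (unsigned long) syscall(SYS_gettid);
}

/* Thread body: store this thread's ID where the creator can read it. */
static void *record_id(void *arg)
{
    *(unsigned long *) arg = kernel_thread_id();
    return NULL;
}

/* Returns 1 if a new thread reports an ID different from the caller's,
   0 if the IDs match, and -1 if the thread could not be created or joined. */
static int thread_ids_differ(void)
{
    pthread_t thread;
    unsigned long worker_id = 0;

    if (pthread_create(&thread, NULL, record_id, &worker_id) != 0)
        return -1;
    if (pthread_join(thread, NULL) != 0)
        return -1;

    return worker_id != kernel_thread_id();
}
```

A function like kernel_thread_id could be passed to PAPI_thread_init in place of pthread_self; the check in thread_ids_differ illustrates the uniqueness property that an ID function must provide.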

Fortran:

use iso_c_binding
use omp_lib
integer(c_int) check
call PAPIF_thread_init(omp_get_thread_num, check)

Fortran arguments for PAPIF_thread_init:

  • omp_get_thread_num -- returns current thread ID.
  • check -- an error return value for Fortran.

This function should be called only once, after PAPI_library_init and before any other PAPI calls. Applications that make no use of threads do not need to call this function.

Thread ID

The identifier of the current thread can be obtained by calling the following low-level function:

C:

unsigned long retval = PAPI_thread_id();

No arguments for PAPI_thread_id.

Fortran:

use iso_c_binding
integer(c_int) thread_id
call PAPIF_thread_id(thread_id)

Fortran arguments for PAPIF_thread_id:

  • thread_id -- thread identifier of the current thread.

Note: If a segmentation fault occurs with the above example, try replacing integer(c_int) with either integer(c_long) or integer(c_long_long).

This function calls the thread id function registered by PAPI_thread_init and returns an unsigned long integer containing the thread identifier.

Example

In the following code example, PAPI_thread_init and PAPI_thread_id are used to initialize thread support in the PAPI library and to acquire the identifier of the current thread, respectively. Unless you know otherwise, you should always use pthread_self as an argument to PAPI_thread_init. Never use omp_get_thread_num.

#include <papi.h>
#include <stdio.h>
#include <stdlib.h>
#include <pthread.h>

void handle_error (int retval)
{
    printf("PAPI error %d: %s\n", retval, PAPI_strerror(retval));
    exit(1);
}

int main()
{
    int retval;
    unsigned long tid_retval;
 
    /* Initialize the PAPI library */
    retval = PAPI_library_init(PAPI_VER_CURRENT);
    if (retval != PAPI_VER_CURRENT)
        handle_error(retval);

    /* Initialize thread support in the PAPI library */
    retval = PAPI_thread_init(pthread_self);
    if (retval != PAPI_OK)
        handle_error(retval);

    /* Obtaining thread identifier for current thread */
    tid_retval = PAPI_thread_id(); 
    if (tid_retval == (unsigned long int)-1)
        handle_error(tid_retval);
 
    printf("Initial thread id is: %lu\n",tid_retval);

    /* Executes if all low-level PAPI
    function calls returned PAPI_OK */
    printf("\033[0;32m\n\nPASSED\n\033[0m");
    exit(0); 
}

Possible Output

Initial thread id is: 0


PASSED

On success, PAPI_thread_id returns a valid thread identifier and the program prints output similar to the above. On error, it returns an error value instead; the example checks for (unsigned long) -1.

Thread Utilities

Four more utility functions related to threads are available in PAPI. These functions allow you to register a newly created thread to make it available for reference by PAPI, to remove a registered thread in cases where thread IDs may be reused by the system, and to create and access thread-specific storage in a platform-independent fashion for use with PAPI. These functions are shown below in both C and Fortran. Note that only the first two functions are available in Fortran:

C:

int retval = PAPI_register_thread();

No arguments for PAPI_register_thread.

int retval = PAPI_unregister_thread();

No arguments for PAPI_unregister_thread.

int tag;
void *ptr;
int retval = PAPI_set_thr_specific(tag, ptr);

Arguments for PAPI_set_thr_specific:

  • tag -- an identifier, the value of which is either PAPI_USR1_TLS or PAPI_USR2_TLS. This identifier indicates which of several data structures associated with this thread is to be accessed.
  • ptr -- pointer to the memory containing the data structure.

int tag;
void **ptr;
int retval = PAPI_get_thr_specific(tag, ptr);

Arguments for PAPI_get_thr_specific:

  • tag -- an identifier, the value of which is either PAPI_USR1_TLS or PAPI_USR2_TLS. This identifier indicates which of several data structures associated with this thread is to be accessed.
  • ptr -- address of a pointer that receives the location of the memory containing the data structure.

Fortran:

use iso_c_binding
integer(c_int) check
call PAPIF_register_thread(check)

Fortran arguments for PAPIF_register_thread:

  • check -- an error return value for Fortran.

use iso_c_binding
integer(c_int) check
call PAPIF_unregister_thread(check)

Fortran arguments for PAPIF_unregister_thread:

  • check -- an error return value for Fortran.

For more code examples of using Pthreads and OpenMP with PAPI, see ctests/zero_pthreads.c and ctests/zero_omp.c, respectively. Also, for a code example of using SMP with PAPI, see ctests/zero_smp.c.

MPI

MPI is an acronym for Message Passing Interface. MPI is a library specification for message-passing, proposed as a standard by a broadly based committee of vendors, implementers, and users. MPI was designed for high performance on both massively parallel machines and on workstation clusters.

PAPI supports MPI. When timers are used in applications that contain multiplexing, profiling, and overflow, MPI defaults to a virtual timer, which must be converted to a real timer for the application to work properly. Otherwise, the application will exit.

Optionally, several supported tools including TAU and Score-P can be used to implement PAPI with MPI. The following is a code example of using MPI’s PI program with PAPI:

#include <papi.h>
#include <stdio.h>
#include <stdlib.h>
#include <mpi.h>
#include <math.h>

void handle_error (int retval)
{
    printf("PAPI error %d: %s\n", retval, PAPI_strerror(retval));
    exit(1);
}

int main(int argc, char **argv)
{
    int done = 0, n, myid, numprocs, i, rc, retval, EventSet = PAPI_NULL;
    double PI25DT = 3.141592653589793238462643;
    double mypi, pi, h, sum, x, a;
    long_long values[1] = {(long_long) 0}; 
 
    MPI_Init(&argc,&argv);
    MPI_Comm_size(MPI_COMM_WORLD,&numprocs);
    MPI_Comm_rank(MPI_COMM_WORLD,&myid);
 
    /*Initialize the PAPI library */
    retval = PAPI_library_init(PAPI_VER_CURRENT);
    if (retval != PAPI_VER_CURRENT)
        handle_error(retval);
 
    /* Create an EventSet */
    retval = PAPI_create_eventset(&EventSet);
    if (retval != PAPI_OK)
        handle_error(retval);
 
    /* Add Total Instructions Executed to our EventSet */
    retval = PAPI_add_event(EventSet, PAPI_TOT_INS);
    if (retval != PAPI_OK)
        handle_error(retval);
 
    /* Start counting */
    retval = PAPI_start(EventSet);
    if (retval != PAPI_OK)
        handle_error(retval);
 
    while (!done)
    {
        if (myid == 0) {
            printf("Enter the number of intervals: (0 quits) ");
            scanf("%d",&n);
        }   
        MPI_Bcast(&n, 1, MPI_INT, 0, MPI_COMM_WORLD);
        if (n == 0)
            break;
 
        h = 1.0 / (double) n;
        sum = 0.0;
        for (i = myid + 1; i <= n; i += numprocs) {
            x = h * ((double)i - 0.5);
            sum += 4.0 / (1.0 + x*x);
        }   

        mypi = h * sum;
        MPI_Reduce(&mypi, &pi, 1, MPI_DOUBLE, MPI_SUM, 0,MPI_COMM_WORLD);
 
        if (myid == 0)
            printf("pi is approximately %.16f, Error is %.16f\n",
                   pi, fabs(pi - PI25DT));
    }   
 
    /* Read the counters */
    retval = PAPI_read(EventSet, values);
    if (retval != PAPI_OK)
        handle_error(retval);
 
    printf("After reading counters: %lld\n",values[0]);
 
    /* Stop the counters */
    retval = PAPI_stop(EventSet, values);
    if (retval != PAPI_OK)
        handle_error(retval);

    printf("After stopping counters: %lld\n",values[0]);

    MPI_Finalize();

    /* Executes if all low-level PAPI
    function calls returned PAPI_OK */
    printf("\033[0;32m\n\nPASSED\n\033[0m");
    exit(0); 
}

Possible Output

(after entering 50, 75, and 100 as input)

Enter the number of intervals: (0 quits) 50
pi is approximately 3.1416259869230028, Error is 0.0000333333332097
Enter the number of intervals: (0 quits) 75
pi is approximately 3.1416074684045965, Error is 0.0000148148148034
Enter the number of intervals: (0 quits) 100
pi is approximately 3.1416009869231254, Error is 0.0000083333333323
Enter the number of intervals: (0 quits) 0
After reading counters: 117393
After stopping counters: 122921


PASSED

On success, PAPI_library_init returns PAPI_VER_CURRENT, every other PAPI function in the example returns PAPI_OK, and the program prints output similar to the above. On error, a non-zero error code is returned.