To successfully complete this assignment, you need to participate both individually and in groups during class. Have one of the instructors check your notebook and sign you out before leaving class. Turn in your assignment using D2L no later than **11:59pm**.

---


# In-Class Assignment: MPI Errors

<img src="https://cdn.pixabay.com/photo/2016/10/04/13/52/fail-1714367_960_720.jpg" width=30%>
<p style="text-align: right;">Animation from: [Pixabay](https://pixabay.com/)</p>

### Agenda for today's class (70 minutes)


1. (10 minutes) Pre-class Review 
2. (30 minutes) MPI Error Example
3. (20 minutes) Rumor Mill
4. (10 minutes) Quiz Prep

---
# 1. Pre-class Review 

&#9989; **<font color=red>DO THIS:</font>** Discuss the references you found for error handling in MPI with your group. Below, summarize your findings for handling errors in MPI.  

By default, communicators use the MPI_ERRORS_ARE_FATAL error handler, which aborts every process in the program when an MPI call fails. To prevent this, you can switch to MPI_ERRORS_RETURN, which makes MPI functions return an error code instead of aborting.

You can translate these codes into readable messages with MPI_Error_string. For more control, you can create custom error handlers with MPI_Comm_create_errhandler and attach them using MPI_Comm_set_errhandler.

Error handlers can also be attached to files and windows (with MPI_File_set_errhandler and MPI_Win_set_errhandler), not only to communicators. Overall, it’s best to check return values and handle errors gracefully to avoid crashes and make debugging easier.

---

# 2. MPI Error Example

As a class, let's look at our code from Friday and add error checking.


How do we check if error handling is working?

```c
#include <mpi.h>
#include <stdio.h>

static long num_steps = 100000;
double step;

int main(int argc, char** argv)
{ 
    int i, nthreads;
    double pi, sum;
    step = 1.0 / (double) num_steps;
    int rank, size, err;
    MPI_Status status;

    // Initialize MPI
    MPI_Init(&argc, &argv);
    
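    // Use the predefined MPI_ERRORS_RETURN handler so MPI calls return error codes instead of aborting
    // (MPI_Errhandler_set is deprecated and was removed in MPI-3.0; MPI_Comm_set_errhandler replaces it)
    MPI_Comm_set_errhandler(MPI_COMM_WORLD, MPI_ERRORS_RETURN);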

    // Get rank and size
    err = MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    if (err != MPI_SUCCESS) {
        char errString[MPI_MAX_ERROR_STRING];
        int len;
        MPI_Error_string(err, errString, &len);
        printf("MPI_Comm_rank error: %s\n", errString);
    }

    err = MPI_Comm_size(MPI_COMM_WORLD, &size);
    if (err != MPI_SUCCESS) {
        char errString[MPI_MAX_ERROR_STRING];
        int len;
        MPI_Error_string(err, errString, &len);
        printf("MPI_Comm_size error: %s\n", errString);
    }

    // Pi calculation
    {
        int id, nthrds;
        double x;
        id = rank;
        nthrds = size;
        nthreads = nthrds;

        for (i = id, sum = 0.0; i < num_steps; i += nthrds) {
            x = (i + 0.5) * step;
            sum += 4.0 / (1.0 + x * x);
        }
    }

    if (rank == 0) {
        double procsum;
        pi = sum * step;

        for (int proc = 1; proc < nthreads; proc++) {
            err = MPI_Recv(&procsum, 1, MPI_DOUBLE, proc, 1, MPI_COMM_WORLD, &status);
            if (err != MPI_SUCCESS) {
                char errString[MPI_MAX_ERROR_STRING];
                int len;
                MPI_Error_string(err, errString, &len);
                printf("MPI_Recv error from process %d: %s\n", proc, errString);
            } else {
                pi += procsum * step;
            }
        }

        printf("Pi = %f\n", pi);

        // 🔴 Test forced error: send to invalid rank
        double dummy = 0.0;
        int badRank = size + 10;  // invalid rank
        err = MPI_Send(&dummy, 1, MPI_DOUBLE, badRank, 1, MPI_COMM_WORLD);
        if (err != MPI_SUCCESS) {
            char errString[MPI_MAX_ERROR_STRING];
            int len;
            MPI_Error_string(err, errString, &len);
            printf("Forced MPI_Send error: %s\n", errString);
        }

    } else {
        // Send partial sum to process 0
        err = MPI_Send(&sum, 1, MPI_DOUBLE, 0, 1, MPI_COMM_WORLD);
        if (err != MPI_SUCCESS) {
            char errString[MPI_MAX_ERROR_STRING];
            int len;
            MPI_Error_string(err, errString, &len);
            printf("Process %d: MPI_Send error: %s\n", rank, errString);
        }
    }

    MPI_Finalize();
    return 0;
}
```

---
<a name=Rumor-Example-Continued></a>
# 3. Rumor Example Continued

Use the rest of the class to continue working on the rumor example. If you get a solution working, discuss your solution with your group. 

---
<a name=Quiz></a>
# 4. Quiz Prep

Use the remainder of class to review for the MPI quiz.

Next class we will have an MPI quiz to check your understanding of the MPI content we have covered so far. Be sure to review the general MPI workflow, and ask questions if any of that content is still confusing.


-----
### Congratulations, we're done!

Have one of the instructors check your notebook and sign you out before leaving class. Turn in your assignment using D2L.

Written by Dr. Dirk Colbry, Michigan State University (Updated by Dr. Nathan Haut in Spring 2025)
<a rel="license" href="http://creativecommons.org/licenses/by-nc/4.0/"><img alt="Creative Commons License" style="border-width:0" src="https://i.creativecommons.org/l/by-nc/4.0/88x31.png" /></a><br />This work is licensed under a <a rel="license" href="http://creativecommons.org/licenses/by-nc/4.0/">Creative Commons Attribution-NonCommercial 4.0 International License</a>.

----