Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Is passing NULL as choice argument legal? #533

Open
devreal opened this issue Dec 14, 2021 · 3 comments
Open

Is passing NULL as choice argument legal? #533

devreal opened this issue Dec 14, 2021 · 3 comments
Labels
chap-environment MPI Environmental Management Chapter Committee mpi-5 For inclusion in the MPI 5.0 standard

Comments

@devreal
Copy link

devreal commented Dec 14, 2021

tl/dr: MPI should specify whether passing an invalid pointer (NULL in C) as choice argument is valid if zero elements are to be received/sent.

Problem

Passing null-pointer to memcpy is undefined behavior in C [1]. The C standard states (N1570, 7.1.4, 1): [emphasize mine]

If an argument to a function has an invalid value (such as a value outside the domain of the function, or a pointer outside the address space of the program, or a null pointer, or a pointer to non-modifiable storage when the corresponding parameter is not const-qualified) or a type (after promotion) not expected by a function with variable number of arguments, the behavior is undefined.

We probably all know that UB is among the worst areas of the C language as it allows compilers to do all kinds of optimizations that users do not expect.

In contrast tomemcpy, it is not as easy to eliminate a receive or send call because, well, there are two dependent calls on two different processes and leaving out one will cause a deadlock. The MPI standard does not mention whether passing invalid pointers as choice arguments is valid if the message is empty. Section 2.5.5. simply states: [emphasize mine]

MPI functions sometimes use arguments with a choice (or union) data type. Distinct calls to the same routine may pass by reference actual arguments of different types. [Fortran stuff]; for C, we use void*.

Consider this example:

/* Receive cnt integer elemenets into buf.
 * Passing buf == NULL and cnt == 0 is valid to receive empty messages.
 */
void recv(int buf[], int cnt){
  // need to call MPI_Recv to get the message out of the way
  MPI_Recv(buf, cnt, MPI_INT, ...);
  if (NULL != buf) {
    assert(cnt != 0);
  }
}

Now consider a simplified implemenation of MPI_Recv that eventually performs a memcpy into buf:

int MPI_Recv(void *buf, int cnt, MPI_Datatype type, ...) {
  //....
  memcpy(buf, tmp_recv_buf, size_of_type(type)*cnt); // UB if buf == NULL
  return MPI_SUCCESS;
}

Last, consider that the compiler is actually able to see that implementation, e.g., because everything was compiled with link-time-optimization enabled. In that case, the assert in recv will trigger if buf == NULL && cnt == 0 because the compiler may assume that buf != NULL after the inlined call to memcpy (because UB). For a demonstration, see this example (notice how the conditional before the assert after the memcpy is optimized out and thus will abort the program if buf == NULL and cnt == 0).

Looking at the relevant sentence in the standard, such an implementation of MPI_Recv might be permissible under a strict interpretation:

Distinct calls to the same routine may pass by reference actual arguments of different types.

The NULL pointer is not a reference to an actual object (or argument, for that matter) so there is no reason to ever expect it.

Note, that the C standard requires pointers to point to valid objects even if zero elements are copied:

(N1570, 7.24.1, 2)

Where an argument declared as size_t n specifies the length of the array for a function, n can have the value zero on a call to that function. Unless explicitly stated otherwise in the description of a particular function in this subclause, pointer arguments on such a call shall still have valid values, as described in 7.1.4. On such a call, a function that locates a character finds no occurrence, a function that compares two character sequences returns zero, and a function that copies characters copies zero characters.

Given that, the call to memcpy in the above implementation of MPI_Recv is legal under the strict interpretation outlined above.

Proposal

Add verbiage to the standard requiring implementations to ignore choice arguments if the pointer can be expected to point to zero elements:

MPI functions sometimes use arguments with a choice (or union) data type. Distinct calls to the same routine may pass by reference actual arguments of different types. Choice arguments are ignored if zero objects are referenced.

This would cover all relevant function (P2P, collective, RMA) where zero can be passed as the number of elements.

Changes to the Text

See above

Impact on Implementations

Implementations may not use application-provided invalid pointers for anything other than comparison. In other words, implementations have to make sure that a pointer is non-NULL before passing it into memcpy (or any similar C standard function to which (N1570, 7.1.4, 1) cited above applies) or otherwise de-referencing. The simplified implementation could then be:

int MPI_Recv(void *buf, int cnt, MPI_Datatype type, ...) {
  //....
  // make sure message is not truncated
  //...
  if (cnt > 0) {
    memcpy(buf, tmp_recv_buf, size_of_type(type)*cnt); // UB if buf == NULL
  }
  return MPI_SUCCESS;
}

The conditional before the call to memcpy prevents UB and thus the assert from triggering if buf == NULL.

This seems more prudent than requiring applications to pass non-NULL pointers to MPI even if zero elements are to be transferred. I assume that most implementations are already performing that check (or do not use function that trigger such UB), so the impact on implementations should be minimal.

Impact on Users

Clarification that passing invalid pointers to MPI is legal if no elements are actually accessed and the guarantee that this does not create undefined behavior, under no circumstance.

References and Pull Requests

[1] https://en.cppreference.com/w/c/string/byte/memcpy

@VictorEijkhout
Copy link

VictorEijkhout commented Dec 14, 2021 via email

@jeffhammond
Copy link
Member

I've been passing null and count=0 for years. If count is zero, buffer should not be referenced at all. I do not want to force programmers to pass valid buffers for zero count. That's just going to make code ugly.

@devreal
Copy link
Author

devreal commented Jan 8, 2022

From the discussion in the Languages WG: @martinruefenacht seems to be working on contracts as part of the semantic terms chapter. The issue of passing NULL seems to fit there. I don't see any harm in stating this explicitly in one place in the standard, even if it's obvious for people here.

@wesbland wesbland added chap-environment MPI Environmental Management Chapter Committee mpi-5 For inclusion in the MPI 5.0 standard labels Jun 14, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
chap-environment MPI Environmental Management Chapter Committee mpi-5 For inclusion in the MPI 5.0 standard
Projects
Status: To Do
Development

No branches or pull requests

4 participants