MPI support in WAMR #731

Merged
csegarragonz merged 13 commits into main from wamr-mpi on Mar 28, 2023

Conversation

csegarragonz (Collaborator) commented on Mar 22, 2023:

In this PR I implement MPI support for WAMR.

Recently, execution of large applications in WAVM (e.g. LAMMPS) has been a bit flaky. I have also spotted areas of WAVM's WASI support that are out of sync with the spec, which is increasingly worrying now that we have rebased to the latest wasi-libc.

This PR does not mean we are dumping WAVM, but it will help cross-check the origin of certain runtime errors. Also (and this speaks very well to the existing WAVM MPI implementation and the faasm/faabric split), implementing this took only around a day's worth of work.

Most of the code is boilerplate or WAMR-specific pointer/offset checking and translation. As an implementation note for the future: some of the MPI calls use double pointers. When registering native symbols in WAMR, the * character in the signature string indicates that a parameter (an int32_t) is in fact an offset into WASM memory; the WAMR runtime converts it into a native pointer and does the corresponding bounds checks so that it is ready to be read from or written to. However, when dealing with a double pointer, the value behind that first pointer is still an offset, not a readily usable native pointer, so the inner translation has to be done by hand.
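To make this concrete, below is a minimal sketch of how a wrapper can validate and translate those inner offsets itself using WAMR's address-translation helpers. The MPI-style function, wrapper name, and error handling are illustrative assumptions, not the code in this PR:

#include <wasm_export.h>

// Hypothetical wrapper illustrating the double-pointer caveat. With a "(*i)i"
// signature, WAMR bounds-checks and translates the first argument, so
// `requests` arrives as a usable native pointer. The values it points to are
// still offsets into the module's linear memory and must be translated by hand.
static int32_t __mpi_example_wrapper(wasm_exec_env_t execEnv,
                                     int32_t* requests,
                                     int32_t count)
{
    wasm_module_inst_t moduleInst = wasm_runtime_get_module_inst(execEnv);

    for (int32_t i = 0; i < count; i++) {
        uint32_t innerOffset = (uint32_t)requests[i];

        // Bounds-check the inner offset before turning it into a native pointer
        if (!wasm_runtime_validate_app_addr(moduleInst, innerOffset, sizeof(int32_t))) {
            return 1; // a real wrapper would return an MPI error code here
        }

        int32_t* nativeInner =
          (int32_t*)wasm_runtime_addr_app_to_native(moduleInst, innerOffset);
        (void)nativeInner; // a real wrapper would pass this to the MPI backend
    }

    return 0; // MPI_SUCCESS
}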

NB: To run LAMMPS in WAMR we still need a couple more pieces that will come in a subsequent PR.

REG_NATIVE_FUNC(__faasm_await_call, "(i)i"),
REG_NATIVE_FUNC(__faasm_chain_ptr, "(i$i)i"),
REG_NATIVE_FUNC(__faasm_pull_state, "(*i)"),
csegarragonz (Collaborator, Author) commented:

I have had to add a couple of state functions as well for the MPI tests to pass.
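As a side note on the signature strings above (the second argument to REG_NATIVE_FUNC): 'i' denotes an i32, '$' a string that WAMR bounds-checks and translates, and '*' a checked pointer into WASM memory. The following is a rough sketch of how such symbols are registered against the plain WAMR API; the wrapper body, helper name, and "env" module name are assumptions for illustration, not this PR's code:

#include <wasm_export.h>

// Illustrative wrapper body only; the real implementations live in Faasm's
// WAMR bindings.
static int32_t __faasm_await_call_wrapper(wasm_exec_env_t execEnv, int32_t callId)
{
    (void)callId; // illustrative stub
    return 0;
}

static NativeSymbol faasmSymbols[] = {
    { "__faasm_await_call", (void*)__faasm_await_call_wrapper, "(i)i", nullptr },
};

bool registerFaasmNatives()
{
    return wasm_runtime_register_natives(
      "env", faasmSymbols, sizeof(faasmSymbols) / sizeof(NativeSymbol));
}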

/**
 * Chain a function by function pointer
 */
static int32_t __faasm_chain_ptr_wrapper(wasm_exec_env_t exec_env,
csegarragonz (Collaborator, Author) commented:

These two methods were just moved somewhere else in the file to preserve alphabetical order.

{
    SPDLOG_DEBUG("S - poll_oneoff");
    throw std::runtime_error("poll_oneoff not implemented");

    WAMRWasmModule* module = getExecutingWAMRModule();
csegarragonz (Collaborator, Author) commented:

We also needed this function for the MPI tests.

@@ -168,9 +168,10 @@ TEST_CASE_METHOD(OpenMPTestFixture,
doOmpTestLocal("default_shared");
}

// 23/03/2023 - This test has become very flaky.
csegarragonz (Collaborator, Author) commented:

The fact that this test has started failing more and more often recently also makes me suspicious about the state of WAVM.

csegarragonz self-assigned this on Mar 23, 2023
Shillaker (Collaborator) left a comment:

Awesome, surprisingly clean, we clearly did a good job of encapsulating stuff elsewhere. It is a shame about WAVM, but I do agree that it seems to be slowly dying off, so perhaps we do need to think about going full WAMR.

Anywho, nice work 👍

csegarragonz merged commit 00c729e into main on Mar 28, 2023
csegarragonz deleted the wamr-mpi branch on March 28, 2023 at 10:18