[feature request] very basic support for MPI debugging multiple processes at the same time #1723

Open
davydden opened this issue Mar 22, 2018 · 27 comments

Comments

@davydden

davydden commented Mar 22, 2018

It would be good to have very basic support for debugging MPI programs.

Outside of VSCode I can attach a debugger to each terminal via

mpirun -np 2 xterm -e lldb -f my_exec

and then do the standard serial debugging in each terminal window.

An alternative is to hack into the source a piece of code that prints the process ID and waits, so that a user can attach a debugger to some or all processes; see https://www.open-mpi.org/faq/?category=debugging#serial-debuggers .

With vscode-cpptools one can already adopt the latter via launch.json and its "processId": "${command:pickProcess}" option.
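The wait-and-attach approach mentioned above can be sketched roughly as follows. This is a minimal illustration along the lines of the Open MPI FAQ snippet, not code from vscode-cpptools; the function name `wait_for_debugger` and its parameters are hypothetical.

```cpp
#include <unistd.h>

#include <cassert>
#include <cstdio>

// Hypothetical helper: prints this process's PID and host name, then polls
// until `*released` becomes non-zero. The flag is meant to be flipped from an
// attached debugger. Returns the number of polls that were needed.
int wait_for_debugger(volatile int* released, unsigned poll_seconds) {
    assert(released != nullptr);

    char hostname[256] = {0};
    gethostname(hostname, sizeof(hostname) - 1);
    std::printf("PID %d on %s ready for attach\n",
                static_cast<int>(getpid()), hostname);
    std::fflush(stdout);

    int polls = 0;
    while (*released == 0) {  // volatile read, so the loop is not optimized away
        sleep(poll_seconds);
        ++polls;
    }
    return polls;
}
```

A rank would call this early in main() with a flag that starts at 0, for example `volatile int released = 0; wait_for_debugger(&released, 5);`. After attaching with "processId": "${command:pickProcess}", select the `wait_for_debugger` frame and release the process from the debug console, e.g. `set var *released = 1` in gdb or `expr *released = 1` in lldb.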

It would be great if vscode-cpptools went one step further by extending the launch config with

    {
      "num_mpi_processes" : "4",
      "mpi_exec": "<path-to-mpirun>"
    },

and, in the background, simply attached a gdb/lldb debugger to each MPI process, perhaps appending a number to the "name": "(lldb) Attach", so that one can use the available GUI.

One would, of course, need to manually switch between debuggers for each process in the GUI but this should be fine for up to 4-8 processes, which is often enough to find bugs in MPI programs.

p.s. AFAIK the only other non-commercial project with an MPI debugger is Eclipse PTP, but last time I checked it was limited to Linux.

@pieandcakes
Contributor

One would, of course, need to manually switch between debuggers for each process in the GUI but this should be fine for up to 4-8 processes, which is often enough to find bugs in MPI programs.

@davydden I don't think the UI would support what you are proposing. Unfortunately, VS Code will only allow one debug instance, so even if we attach to 4 separate processes (which we can't because VS Code won't spawn the debug adapter 4 times) there isn't a UI to allow this change. With the way the debugAdapter is authored, it is a one to one relationship with the debuggee. Your feature request would need to start with VS Code for them to allow simultaneous multi-process debugging.

@davydden
Author

davydden commented Mar 23, 2018

@pieandcakes
that's unfortunate, but thanks for the prompt and detailed comment, though 👍

EDIT: i guess the only way to hack this is separate attach configs:

    {
      "name": "(lldb) Attach MPI 0",
      "type": "cppdbg",
      "request": "attach",
      "program": "${workspaceFolder}/some.debug",
      "processId": "${command:pickProcess}",
      "MIMode": "lldb"
    },
    {
      "name": "(lldb) Attach MPI 1",
      "type": "cppdbg",
      "request": "attach",
      "program": "${workspaceFolder}/some.debug",
      "processId": "${command:pickProcess}",
      "MIMode": "lldb"
    },
    {
      "name": "(lldb) Attach MPI 2",
      "type": "cppdbg",
      "request": "attach",
      "program": "${workspaceFolder}/some.debug",
      "processId": "${command:pickProcess}",
      "MIMode": "lldb"
    },

@pieandcakes
Contributor

@davydden Even then, you can only attach one process at a time so it won't get you too far. Once you have started debugging, you can't really debug again.

@davydden
Author

Once you have started debugging, you can't really debug again.

I see, thanks for clarifying.

@davydden
Author

@pieandcakes I was told upstream that support for multiple processes is already there:

The VS Code debugger supports multiple processes. See https://code.visualstudio.com/updates/v1_20#_node-debugging.
The animated gif shows VS Code debugging a master process and 5 child processes.

do you think it's possible for cpptools to use this functionality and enable debugging with multiple processes?

@davydden davydden reopened this Mar 26, 2018
@davydden
Author

davydden commented Mar 26, 2018

based on the discussion in vscode issue, I would say one needs to do two things:

  1. from cpptools debugger configuration, run a given executable with mpirun -np X my_executable and somehow pause until debuggers are attached (X is number of processes).
  2. look for processes with the name my_executable (there should be as many as X) and attach lldb debugger to each one
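The process lookup in the second step could look roughly like the sketch below. It is an illustration only and rests on several assumptions: it is Linux-specific (it scans /proc), it matches on /proc/<pid>/comm, which the kernel truncates to 15 characters, and the function names are made up for this example.

```cpp
#include <dirent.h>
#include <unistd.h>

#include <cassert>
#include <cctype>
#include <cstdlib>
#include <fstream>
#include <string>
#include <vector>

// Returns the short command name of a process as reported by the Linux
// /proc/<pid>/comm file, or an empty string if it cannot be read.
std::string process_name(pid_t pid) {
    std::ifstream comm("/proc/" + std::to_string(static_cast<long>(pid)) + "/comm");
    std::string name;
    std::getline(comm, name);
    return name;
}

// Hypothetical lookup: collects the PIDs of all processes whose command name
// equals `name`. For "mpirun -np X my_executable" one would expect X matches.
std::vector<pid_t> find_pids_by_name(const std::string& name) {
    assert(!name.empty());
    std::vector<pid_t> pids;

    DIR* proc = opendir("/proc");
    if (proc == nullptr) return pids;  // no /proc: not Linux, or not mounted

    while (dirent* entry = readdir(proc)) {
        const std::string entry_name = entry->d_name;
        bool numeric = !entry_name.empty();
        for (char c : entry_name)
            numeric = numeric && std::isdigit(static_cast<unsigned char>(c));
        if (!numeric) continue;  // only the numeric directories are processes

        const pid_t pid =
            static_cast<pid_t>(std::strtol(entry_name.c_str(), nullptr, 10));
        if (process_name(pid) == name) pids.push_back(pid);
    }
    closedir(proc);
    return pids;
}
```

Each returned PID could then be placed into the "processId" field of an attach configuration instead of going through ${command:pickProcess}. The lookup does not solve the pausing problem; it only finds the processes.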

Personally, I have no idea how one would pause[1] processes to wait until debugger is attached, but apart from this technical issue, this should be doable and everything is there in VSCode to support it.

[1] without injecting any code like

while (0 == i) 
  sleep(5);

@pieandcakes
Contributor

@davydden we can take that as a suggested feature request. Can you provide a project with a repro? I don't have experience with mpirun.

@pieandcakes pieandcakes changed the title [feature request] very basic support for MPI debugging [feature request] very basic support for MPI debugging multiple processes at the same time Mar 26, 2018
@davydden
Author

davydden commented Mar 26, 2018

@pieandcakes sure, here's a simple example

// compile with:
//   mpic++ -std=c++11 mpi_example.cc -o mpi_example
// run with:
//   mpirun -np 2 mpi_example
// debug manually with different terminals for different processes:
//   mpirun -np 2 xterm -e lldb mpi_example

#include <mpi.h>
#include <stdio.h>
#include <iostream>
#include <unistd.h>

int main(int argc, char** argv) {
    // Initialize the MPI environment
    MPI_Init(NULL, NULL);

    // Get the number of processes
    int world_size;
    MPI_Comm_size(MPI_COMM_WORLD, &world_size);

    // Get the rank of the process
    int world_rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &world_rank);

    for (int p = 0; p < world_size; ++p)
      {
        if (p == world_rank)
          std::cout << "process rank " << p << " out of " << world_size << std::endl;

        // uncomment this if you want all processes to wait and check PID and alike
        /*
        while (p==0)
          sleep(5);
        */

        MPI_Barrier(MPI_COMM_WORLD);
      }

    // Finalize the MPI environment.
    MPI_Finalize();
}

On macOS you can use Homebrew or Spack to build an MPI provider, e.g. Open MPI (do not confuse it with OpenMP; those are different things). On Linux you can get MPI from the standard repositories (e.g. sudo apt-get install openmpi-bin openmpi-common libopenmpi-dev should do). I have no experience with building MPI on Windows, though.

Let me know if I could further help somehow, though I have no experience whatsoever with TypeScript / JavaScript used in VSCode.

p.s. this is of course a personal opinion, but I think VS Code could attract a fair amount of computational scientists with this feature as not all universities have licenses for dedicated commercial debuggers, that cost upwards from around 700$. It's probably also not something an individual would buy himself. A good summary of debugger options for MPI is in this stackoverflow post and also in deal.II FAQ wiki.

@WardenGnaw
Member

Thanks @davydden. We played around with this a bit and I think I can see how someone could make this work.

Below is a basic outline in case anyone wants to try and take this on. In the interest of full disclosure, this is enough work that we would need to see significant interest in using this from the MPI community before we would take on this work.

At any rate, here is how we think it could work:

  1. A VS Code extension that does the following:
    a. It opens a unix domain socket (or whatever IPC you want) to listen for messages from a ‘launch helper’ program (see item 2 below).
    b. It kicks off an MPI session by executing something like mpirun -n <num_nodes> /example/path/to/launch/helper <path-to-app>
    c. When the launch helper program reports process IDs back, it builds a JSON object with the content that would normally be in a launch.json configuration, telling VS Code to debug using GDB/LLDB and attach to the specified process ID.
    d. It sends this launch command to VS Code using vscode.debug.startDebugging(undefined, DebugConfiguration) // see https://code.visualstudio.com/docs/extensionAPI/vscode-api#_debug
    e. In startDebugging.then, it tells the launch helper program to run the target app.
  2. We need a little ‘launch helper’ program that is written in C/C++. I will call this 'MPIDebugLauncher'. It does the following:
    a. When it starts up, it sends a message to the extension with its process ID.
    b. It waits for the extension to tell the helper that launch is ready (step 1.e).
    c. It execv's the target app.
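The launch helper half of this outline could be sketched roughly as below. This is not an existing MPIDebugLauncher implementation: the function names, the "pid <n>\n" message, and the single 'g' ("go") reply byte are all assumptions made for illustration, and the extension side is not shown.

```cpp
#include <sys/socket.h>
#include <sys/un.h>
#include <unistd.h>

#include <cassert>
#include <cstdio>
#include <cstring>
#include <string>
#include <vector>

// Connects to the unix domain socket the extension is assumed to listen on.
// Returns a connected file descriptor, or -1 on failure.
int connect_to_extension(const std::string& socket_path) {
    sockaddr_un addr{};
    addr.sun_family = AF_UNIX;
    if (socket_path.size() >= sizeof(addr.sun_path)) return -1;
    std::strncpy(addr.sun_path, socket_path.c_str(), sizeof(addr.sun_path) - 1);

    const int fd = socket(AF_UNIX, SOCK_STREAM, 0);
    if (fd < 0) return -1;
    if (connect(fd, reinterpret_cast<sockaddr*>(&addr), sizeof(addr)) != 0) {
        close(fd);
        return -1;
    }
    return fd;
}

// Reports our PID over `fd`, then blocks until the peer answers with 'g'.
// The wire format is invented for this sketch.
bool report_pid_and_wait(int fd) {
    const std::string msg =
        "pid " + std::to_string(static_cast<long>(getpid())) + "\n";
    if (write(fd, msg.data(), msg.size()) != static_cast<ssize_t>(msg.size()))
        return false;
    char go = 0;
    return read(fd, &go, 1) == 1 && go == 'g';
}

// Replaces the helper with the target app. execv keeps the PID, so the
// process ID reported earlier stays valid. Returns only if execv fails.
int exec_target(const std::vector<std::string>& args) {
    assert(!args.empty());
    std::vector<char*> argv;
    for (const std::string& a : args) argv.push_back(const_cast<char*>(a.c_str()));
    argv.push_back(nullptr);
    execv(argv[0], argv.data());
    std::perror("execv");
    return 127;
}
```

A main() for the helper would chain the three calls: connect, report and wait, then exec the path given on the command line. Whether GDB stays happy across the execv is exactly the open question raised in the notes that follow, so this only shows the handshake, not a verified debugging flow.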

A few other notes:

  1. It might be possible to have GDB launch the process instead of having it attach if GDB is unhappy with attaching past the execv. But I am not sure if it is just environment variables that need to be propagated from MPI to the target app, or if things will only work if the parent/child process relationship is preserved.
  2. When we implemented MPI debugging in full VS many years ago, there were many scalability issues and races that we needed to deal with. So while this seems like it should work, it's hard to say for sure.
  3. As an alternative, if the target app processes are all spawned as child processes of mpirun/exec, it might be possible to implement this as just general child process debugging

@davydden
Author

@WardenGnaw thanks for entertaining the idea, appreciate it.

this is enough work that we would need to see significant interest in using this from the MPI community before we would take on this work.

That's perfectly understandable. I will ping a few folks in case they want to add something on the prospect of debugging MPI with a GUI in Visual Studio Code @alalazo @bangerth @tjhei @BarrySmith @tgamblin @goxberry @jthies .

@goxberry

Thanks for the ping @davydden!

I agree with both of the statements @davydden mentioned re: market:

VS Code could attract a fair amount of computational scientists with this feature as not all universities have licenses for dedicated commercial debuggers, that cost upwards from around 700$. It's probably also not something an individual would buy himself.

and scoping a "parallel debugging" feature:

up to 4-8 processes, which is often enough to find bugs in MPI programs

The two most important things that make dedicated commercial debuggers (e.g., TotalView) easier to use are:

  • a single window can toggle over each MPI process
  • the current active process number is prominently labeled

It's possible to do many of the same sorts of things that these debuggers do by spawning multiple debugger instances in separate windows, but it's a kludgy setup that tends to reduce productivity because a user has to spend time figuring out which context (i.e., which window) they want to operate on.

If VS Code were able to toggle among MPI processes in a single window and display the appropriate debugging info, it would be a killer feature for developers who don't have access to dedicated commercial debuggers, and it would make it easier to demo and explain to beginners how to debug parallel programs that use MPI.

@pramodk

pramodk commented Mar 27, 2018

VS Code could attract a fair amount of computational scientists with this feature as not all universities have licenses for dedicated commercial debuggers, that cost upwards from around 700$. It's probably also not something an individual would buy himself.

I agree with @davydden. We pay a lot of money for site licenses of parallel debuggers. Often it's sufficient for many users to debug with a few MPI processes. If a parallel debugging feature becomes available, I (and my colleagues) will definitely give it a try!

@fouriaux

That will be a very practical feature for daily use before jumping to more sophisticated debuggers.

@jsquyres

Have you talked to the Microsoft HPC/MPI developers? I thought they developed exactly this kind of functionality (on the MPI side) several years ago. I could be wrong -- I don't follow the Microsoft side of HPC very much. Best to check with them to see exactly what they did.

Additionally, be aware that the MPI community defined a mechanism for the MPI implementation and a tool (e.g., a debugger) to attach to each other -- it's called the MPIR process acquisition interface. Both Open MPI and MPICH have supported MPIR for... er... I don't know offhand, but it's measured in decades. It's the same mechanism that is used by DDT, TotalView, ...etc. If you're going to make new tools that attach to MPI processes, you might as well use the pre-existing infrastructure for it.

That being said, Open MPI is deprecating its MPIR interface and is (literally) in the middle of replacing it with a more modern/flexible PMIx interface for tool attachment. MPIR is fine, but it's showing its age; PMIx is the New Hotness. That might be worth investigating, too.

Just my $0.02.

@ggouaillardet

Also, debugging a parallel application is quite different from debugging several independent processes.

Actions (continue, step, next, ...) and breakpoints/watchpoints/tracepoints should be applicable to all the tasks, a subset of tasks (e.g. a communicator/group) or an individual task.

Automatically attaching to several independent processes is obviously better than nothing, especially if it is free (as in beer), but I am afraid it is a bit too rusty to be effective.

Obviously, a parallel debugger is not a trivial development.

@omidmeh

omidmeh commented Mar 29, 2018

This would be extremely useful. For people who can't purchase something that costs about $1000 to use on research projects or at home, and there are quite a few that I know of personally, this would be an amazing feature.
In fact, I am interested enough that, depending on what's needed, I would be willing to contribute some too.

@WardenGnaw
Member

@TheMeh I listed a description of what needs to be done in this comment.

Instructions for creating a VS Code extension can be found here.

@Lupum1001

Lupum1001 commented Sep 2, 2019

@WardenGnaw
I seem to have gotten very close to getting this to work (a 'launch' configuration, not 'attach') by just changing a bunch of gdb settings in the launch.json file. At the moment, there appears to be one or two bugs getting in the way, but it seems to work in principle:

My Application:
There's a single parent process responsible for launching all of the child processes using fork() and execv().
System Control Task (parent)
-- InitTask (child)
-- ConfigTask (child)
-- StorageManager (child)
-- SBServer (child)
-- etc...

Normal behavior without gdb (for reference):

  1. Launch the Control Task
  2. The C.T. launches the InitTask to completion (waits for it to exit)
  3. The C.T. launches the rest of the child processes in turn, and they stay running
  4. All of the log messages from the child processes are fed to the terminal where the C.T. was run.
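For readers unfamiliar with this launch pattern, the fork() plus execv() step that the control task performs for a run-to-completion child such as the InitTask can be sketched as below. This is a generic illustration, not the application's actual code; `run_to_completion` is a made-up name.

```cpp
#include <sys/wait.h>
#include <unistd.h>

#include <cassert>
#include <string>
#include <vector>

// Forks, replaces the child with args[0] via execv, and waits for it to exit.
// Returns the child's exit status, or -1 if fork/waitpid failed or the child
// was terminated by a signal.
int run_to_completion(const std::vector<std::string>& args) {
    assert(!args.empty());
    std::vector<char*> argv;
    for (const std::string& a : args) argv.push_back(const_cast<char*>(a.c_str()));
    argv.push_back(nullptr);

    const pid_t child = fork();  // gdb sees a second inferior from here on
    if (child < 0) return -1;
    if (child == 0) {
        execv(argv[0], argv.data());  // the inferior becomes the new program
        _exit(127);                   // only reached if execv failed
    }

    int status = 0;
    if (waitpid(child, &status, 0) != child) return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

The long-running children would be started the same way but without the blocking waitpid(). The fork and the execv are the two events that the gdb settings further down (detach-on-fork, follow-fork-mode, follow-exec-mode) control.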

What works when launched with gdb in vscode:

  1. The C.T. starts using a simple 'launch' style debugging configuration.
  2. The C.T. forks itself, and you get 2 call stacks and 2 identical "inferiors" listed in the debug console.
    [2 images]
  3. The execv() is run, and the second inferior becomes the InitTask.
    [2 images]
  4. Note that the child task appears to have automatically been selected when it hit its own break points, despite "follow-fork-mode" being set to "parent" (see below config). This only works if you have the "schedule-multiple" and "non-stop" settings set to "on". Without them, the focus stays on Inferior 1 (the parent), and the application hangs.
    Default wrong behavior ('non-stop' and 'schedule-multiple' aren't turned on):
    [2 images]

Observed bugs:

  1. The entire debugging instance is torn down once the InitTask closes itself (despite the 3rd task starting to launch successfully). The only way I've avoided this is to detach the process manually, before it exits:
    a) insert a breakpoint at the very end of the InitTask main()
    b) use the debug console to switch to a different active inferior.
    c) detach the InitTask inferior (which is trying to close) and remove it. That way, it'll continue running, and close itself on its own.
    When I enabled engine logging, it appeared as if the extension was sending an "exit" command when it detected the child process close (while still attached).
    [image]
    (here, I skipped a bunch of "paragraph-sized" messages related to unloading/re-loading libraries, breakpoints, etc)
    [image]
    Below is a stripped-down version of the log output:

[2019-09-02 07:09:39.881] I SCT >>> SCT starting on PID 30603 <<<
[2019-09-02 07:09:40.306] I SCT starting 'Init' (AppControl/debug/bin/PrimeInit) on PID 30611
[2019-09-02 07:10:05.903] D INIT no previous network/interfaces file found - skipping network configuration
[2019-09-02 07:10:19.567] I SCT 'Init' (AppControl/debug/bin/PrimeInit) exited - status 0
[2019-09-02 07:10:19.969] I SCT starting 'Config' (Configuration/debug/bin/ConfigTask) on PID 30615

  2. Another observed bug was that break points are misplaced in gdb if processes have conflicting names for source files (e.g. if both have a main.cpp). I just posted about this on this ticket: About breakpoint #3268

This is my current entry for "setupCommands" in my launch.json file. The rest was just a typical launch config (define an executable, working directory, etc).
NOTE: The farther down this list you go, the less sure I am about the behavior (and/or "correctness") of each setting.

"setupCommands": [
    {
        "description": "Enable pretty-printing for gdb",
        "text": "-enable-pretty-printing",
        "ignoreFailures": false
    },
    {
        "text": "-gdb-set follow-fork-mode parent", // parent or child both seem to work if 'schedule-multiple' is enabled.
        "ignoreFailures": false
    },
    {
        "description": "On a fork, keep gdb attached to both processes.",
        "text": "-gdb-set detach-on-fork off", // set to off
        "ignoreFailures": false
    },
    {
        "description": "PID stays the same after renaming forked process",
        "text": "-gdb-set follow-exec-mode same", // 'same' or 'new' works
        "ignoreFailures": false
    },
    {
        "text": "-gdb-set schedule-multiple on", // must be set to 'on' if follow-fork-mode is 'parent', or child will never run.
        "ignoreFailures": false
    },
    {
        "description": "Let other processes continue if one hits a break point",
        "text": "-gdb-set non-stop on", // this is probably optional, depending on desired behavior
        "ignoreFailures": false
    },
    {
        "description": "needs to be off for 'non-stop' mode?",
        "text": "-gdb-set pagination off", // unsure if this is needed
        "ignoreFailures": false
    },
    {
        "description": "Need this 'on' to ensure 'non-stop on' works",
        "text": "-gdb-set target-async on",
        "ignoreFailures": false
    }
]

@rasmus98

rasmus98 commented Aug 9, 2020

Any update on this? This is still the top result on Google for "debug mpi in vscode", so it seems there is significant interest. As a stop-gap solution, has anyone been able to successfully create a launch.json file that launches and automatically attaches to e.g. the first MPI rank? Naively, it seems to me that this should be significantly more doable, and still extremely useful...

@ghost

ghost commented Nov 16, 2020

Debugging would be fantastic, but I can't figure out how to include the "mpiexec -n " prefix in launch.json. Surely there is some way to do this, yes? I tried adding it as a prelaunch task, but the system requires that I include it as a prefix to running a Python script. Any help would be greatly appreciated.

@phil-blain

phil-blain commented Nov 16, 2020

Tangentially related: have a look at tmpi. It's not as cool as debugging MPI executables in VSCode could be, but it's a major improvement over mpirun -np $n xterm -e gdb --args ./my_executable since keyboard input is multiplexed to each process (by default). Note that it only works for OpenMPI (not MPICH) for the moment though.

@robertu94

Hi, I've written a program I call mpigdb that allows debugging MPI programs with a simple wrapper around GDB, gdbserver, and the standard MPI process startup mechanisms. I tried to do some basic enablement to get this working in VS Code (on the branch paper), but it looks like the extension tries to restart and overwrite the commands I use to configure the underlying GDB instance. It registers the connections to the processes

[image]

but then seems to hang almost immediately. Without VS Code, this finishes nearly instantaneously.

I had to do some hacky things in the launch.json to get this to launch.

{
    // Use IntelliSense to learn about possible attributes.
    // Hover to view descriptions of existing attributes.
    // For more information, visit: https://go.microsoft.com/fwlink/?linkid=830387
    "version": "0.2.0",
    "configurations": [
        {
            "name": "mpigdb",
            "logging": { "engineLogging": true, "trace": true, "traceResponse": true },
            "type": "cppdbg",
            "request": "launch",
            "program": "${workspaceFolder}/play/build/crash",
            "args": ["${workspaceFolder}/play/build/crash"],
            "stopAtEntry": false,
            "cwd": "${fileDirname}",
            "environment": [],
            "externalConsole": false,
            "MIMode": "gdb",
            "miDebuggerPath": "/home/runderwood/.cargo/bin/mpigdb",
            "miDebuggerArgs": "--mpigdb_verbose -np 4 -- ${workspaceFolder}/play/build/crash",
            "setupCommands": [
                {
                    "description": "Enable pretty-printing for gdb",
                    "text": "-enable-pretty-printing",
                    "ignoreFailures": true
                },
                {
                    "description": "Set Disassembly Flavor to Intel",
                    "text": "-gdb-set disassembly-flavor intel",
                    "ignoreFailures": true
                }
            ]
        }

    ]


}

Here is the debug console output: debug.txt

@mystery-e204

It's not an integrated solution, but the way I debug MPI programs with vscode is using mpidb. It automates the spawning of gdbserver instances with some customization and prints the information required to attach with gdb. You can pipe this information into a helper tool that generates a launch.json file for vscode. It includes a compound configuration that lets you debug the specified MPI ranks as a "multi-process" debugging session.

@mredenti

mredenti commented Feb 3, 2024

Are there any updates regarding this? Do we finally have MPI debugging support in VScode that allows us to switch MPI process from the UI?

@robertu94

robertu94 commented Feb 19, 2024

@mredenti, I've used the technique from mpidb suggested by @mystery-e204 , and I added --mpigdb_frontend vscode to my mpigdb tool. It's not fully compatible with the UI (e.g. the processes still need to be started with the cmdline), and there are some bugs in multi-process debugging in vscode that make it suboptimal, but it may be better than nothing for you.

@MaximusXIV2020

MaximusXIV2020 commented Mar 7, 2024

I figured it out (no special VS Code extensions or other tools required):
You have to use remote debugging, combined with the Multiple Instruction Multiple Data (MIMD) mode of mpiexec (Form A). Then you can start the debugging servers in parallel, handing over a different TCP port on each rank as the argument.

Example with two MPI ranks:

  1. Run this in a terminal (or wrap it into a VS Code task, using tasks.json). It will start the debugging servers (one for each rank):
mpiexec -n 1 gdbserver :20000 ./my_prog : -n 1 gdbserver :20001 ./my_prog
  2. In VS Code, launch "connect C++ (NProcs=2)", using a launch.json configuration as follows. It will connect the debugging clients for all ranks at once.
{
  "version": "0.2.0",
  "compounds": [
    {
      "name": "connect C++ (NProcs=2)",
      "configurations": [
        "_rank0",
        "_rank1"
      ],
      "stopAll": true
    }
  ],
  "configurations": [
    {
      "name": "_rank0",
      "type": "cppdbg",
      "request": "launch",
      "program": "/abs/path/to/my_prog",
      "miDebuggerServerAddress": "localhost:20000",
      "MIMode": "gdb",
      "miDebuggerPath": "/abs/path/to/gdb",
      "cwd": "${workspaceRoot}"
    },
    {
      "name": "_rank1",
      "type": "cppdbg",
      "request": "launch",
      "program": "/abs/path/to/my_prog",
      "miDebuggerServerAddress": "localhost:20001",
      "MIMode": "gdb",
      "miDebuggerPath": "/abs/path/to/gdb",
      "cwd": "${workspaceRoot}"
    }
  ]
}
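For more than two ranks, both the mpiexec line and the launch.json above become repetitive, so they could be generated. The sketch below is one possible way to do that and is not part of any tool mentioned in this thread; the function names are hypothetical, and it assumes the paths contain no characters that need JSON escaping.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Builds the MIMD-style mpiexec line: one gdbserver per rank, each listening
// on its own TCP port (base_port + rank).
std::string build_mpiexec_command(int num_ranks, int base_port,
                                  const std::string& program) {
    assert(num_ranks > 0);
    std::ostringstream cmd;
    cmd << "mpiexec";
    for (int rank = 0; rank < num_ranks; ++rank) {
        if (rank > 0) cmd << " :";
        cmd << " -n 1 gdbserver :" << (base_port + rank) << " " << program;
    }
    return cmd.str();
}

// Builds a launch.json with one cppdbg configuration per rank plus a compound
// that starts them all, mirroring the hand-written two-rank file.
std::string build_launch_json(int num_ranks, int base_port,
                              const std::string& program,
                              const std::string& gdb_path) {
    assert(num_ranks > 0);
    std::ostringstream names, configs;
    for (int rank = 0; rank < num_ranks; ++rank) {
        names << (rank > 0 ? ", " : "") << "\"_rank" << rank << "\"";
        configs << (rank > 0 ? ",\n" : "")
                << "    {\n"
                << "      \"name\": \"_rank" << rank << "\",\n"
                << "      \"type\": \"cppdbg\",\n"
                << "      \"request\": \"launch\",\n"
                << "      \"program\": \"" << program << "\",\n"
                << "      \"miDebuggerServerAddress\": \"localhost:"
                << (base_port + rank) << "\",\n"
                << "      \"MIMode\": \"gdb\",\n"
                << "      \"miDebuggerPath\": \"" << gdb_path << "\",\n"
                << "      \"cwd\": \"${workspaceRoot}\"\n"
                << "    }";
    }

    std::ostringstream json;
    json << "{\n"
         << "  \"version\": \"0.2.0\",\n"
         << "  \"compounds\": [\n"
         << "    {\n"
         << "      \"name\": \"connect C++ (NProcs=" << num_ranks << ")\",\n"
         << "      \"configurations\": [" << names.str() << "],\n"
         << "      \"stopAll\": true\n"
         << "    }\n"
         << "  ],\n"
         << "  \"configurations\": [\n"
         << configs.str() << "\n"
         << "  ]\n"
         << "}\n";
    return json.str();
}
```

Calling `build_mpiexec_command(2, 20000, "./my_prog")` reproduces the command from step 1, and `build_launch_json(2, 20000, "/abs/path/to/my_prog", "/abs/path/to/gdb")` yields a file equivalent to the one shown above. The port on each gdbserver must match the port in the corresponding "miDebuggerServerAddress".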

Works like a charm. You can also use other debuggers, if remote debugging is supported by both the debugger in general and the VS Code implementation. I tried debugpy, which works perfectly fine for me. The mpiexec call for Python debugging could look like this:

mpiexec -n 1 python -m debugpy --listen :10000 --wait-for-client my_script.py : -n 1 python -m debugpy --listen :10001 --wait-for-client my_script.py

You can even combine Python and C++ debugging, if you start the servers like this:

mpiexec -n 1 gdbserver :20000 python -m debugpy --listen :10000 my_script.py : -n 1 gdbserver :20001 python -m debugpy --listen :10001 my_script.py

@robertu94

This is actually what mpigdb does internally when in vscode mode. Last time I checked (a few months ago), I found some edge cases with some of the more exotic features.
