This repository has been archived by the owner on Aug 2, 2023. It is now read-only.

subprocess: Support for fork in subprocess debugging #943

Closed
karthiknadig opened this issue Oct 19, 2018 · 21 comments

Comments

@karthiknadig
Member

karthiknadig commented Oct 19, 2018

Two possible options here:
Option 1: refactor main, daemon, and session code to allow teardown and restart.

  • Pro: Cleaner solution. We are going to refactor the above-mentioned items irrespective of option 1 or 2.
  • Con: This is a much bigger work item.

Option 2: Delete all loaded ptvsd and pydevd modules, and attempt a fresh ptvsd attach.

  • Pro: Smaller work item.
  • Con: Potentially long bug tail due to unpredictable state when fork occurs.
@karthiknadig
Member Author

To do this right, we need a clean shutdown of ptvsd. #799 tracks the work needed for that.

@sjdv1982

As long as this is unsupported, could you make it fail loudly? For now, it just hangs.

@fabioz
Contributor

fabioz commented Apr 8, 2019

As a note for anyone using multiprocessing arriving here, on Python 3 it's usually possible to add:

import multiprocessing
multiprocessing.set_start_method('spawn', True)

to the start of the program to make multiprocessing work (because that way multiprocessing will not use fork).
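
For illustration, a minimal sketch of that workaround (the worker function and the __main__ guard are illustrative additions; the guard matters because spawn re-imports the main module in each child):

import multiprocessing

def work(x):
    return x * x

if __name__ == '__main__':
    # Force 'spawn' so the debugger never has to follow a fork().
    multiprocessing.set_start_method('spawn', True)
    with multiprocessing.Pool(2) as pool:
        print(pool.map(work, range(10)))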

@supertaz

supertaz commented Apr 9, 2019

(because that way multiprocessing will not use fork).

True, but incompatible with many basic use cases for multiprocessing.Pool, and it can lead to running out of memory pretty quickly due to recursive process spawning and unexpected memory usage patterns. spawn() starts a fresh interpreter process that re-imports the main module and has to pickle over whatever state the child needs, whereas fork() splits a child off from the parent, sharing the parent's memory copy-on-write, and continues execution with the next instruction. This difference is why spawn() is slow and fork() is fast. It also means that changing the setting usually results in processes that execute very different code paths from the moment they are invoked.
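
A minimal sketch (with illustrative names) that makes the entry-point difference visible -- under spawn the module-level print runs again in every worker process, under fork it runs only once in the parent:

import multiprocessing
import os

print("module-level code running in pid", os.getpid())

def work(x):
    return x + 1

if __name__ == '__main__':
    multiprocessing.set_start_method('spawn')  # change to 'fork' to compare
    with multiprocessing.Pool(2) as pool:
        print(pool.map(work, [1, 2, 3]))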

The following common pattern is used in scripts that expect to run a single thread, except to occasionally fork() n scope-limited processes that allow data to be processed in parallel, and is based on a basic use case in the multiprocessing docs:

import numpy as np
import pandas as pd
from multiprocessing import Pool

def parallel_df_func(df, func, cores=4, partitions=10):
    # Split the DataFrame into chunks and map func over them in worker processes.
    df_chunks = np.array_split(df, partitions)
    with Pool(cores) as pool:
        result = pool.map(func, df_chunks)
    return pd.concat(result)

The above code is written for fork(), and trying to debug it with spawn() creates a spawn bomb (under spawn, each worker re-imports the main module, so any unguarded module-level code that kicks off the parallel work runs again in every child). Luckily spawn() is slow, so even recursively spawning ~20 more processes per iteration, each loading a fresh copy of a multi-million row dataset from storage, only leaks an average of 2-3 GB per second until around 10-15 seconds in, when the exponential growth really starts to accelerate. A smaller dataset would lead to faster growth, though it would eventually bottleneck on I/O if you had enough RAM.

This workaround is a Bad Idea(tm) to try if you're not running (and watching) top, with killall python ready in a terminal, on a machine with copious amounts of RAM, unless your script is actually designed to use spawn() or to run in its entirety in multiple processes. The debugger doesn't catch and stop the runaway processes, so if you're not watching for this behavior, you'll get a nasty surprise pretty quickly.

@int19h
Contributor

int19h commented Apr 9, 2019

We're still actively working on the code refactoring that is necessary for us to support fork properly. It's a tricky thing to get right because of the issues fork has with multi-threading (orphaned locks etc.) when it's not immediately followed by exec - and we use threads heavily in the debugger itself, so this applies even when debugging single-threaded programs.
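
As an illustrative, POSIX-only sketch of the orphaned-lock problem (the names and timing are made up, not debugger code): a lock acquired by another thread in the parent stays locked forever in the forked child, because the owning thread simply doesn't exist there:

import os
import threading
import time

lock = threading.Lock()
threading.Thread(target=lock.acquire, daemon=True).start()
time.sleep(0.1)  # let the helper thread grab the lock

pid = os.fork()
if pid == 0:
    # Child: only the forking thread survives, so nothing can ever release the lock.
    print("child sees lock held:", lock.locked())
    print("child could acquire it:", lock.acquire(timeout=0.5))
    os._exit(0)
os.waitpid(pid, 0)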

I'm not sure I quite follow your code example, though. Wouldn't it create the same number of processes regardless of how they're spawned? Or are you saying that it's not such a big deal with fork, because they're all going to share most of their memory pages?

@fabioz
Contributor

fabioz commented Apr 10, 2019

@supertaz I do agree with you that there may be caveats when switching from fork to spawn, especially if you have a use case optimized for fork (so, thanks for that note, and sorry if I didn't give the proper warnings before).

I'd also like to point out that it's very easy to shoot yourself in the foot with fork unless you really understand its pitfalls, so my recommendation would be to use spawn as the default unless you have a case which absolutely needs fork.

That's especially true on CPython: although fork gives you copy-on-write memory, the fact that CPython is reference-counted ends up making it effectively copy-on-read -- merely reading an object changes its reference count, so the memory holding it has to be copied at that point. See https://pycon-2012-notes.readthedocs.io/en/latest/python_linkers_and_virtual_memory.html for a PyCon presentation on it.
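
A tiny sketch of why even reads dirty memory pages under CPython's reference counting (illustrative only):

import sys

obj = object()
print(sys.getrefcount(obj))  # e.g. 2: the variable plus getrefcount's own argument
alias = obj                  # merely taking another reference to obj...
print(sys.getrefcount(obj))  # ...bumps the counter stored inside the object, so the
                             # page holding it is written to; in a forked child that
                             # write is what forces the copy-on-write copy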

And that's besides the regular caveats with locks/uis/file descriptors/buffers/threads/etc ;)

p.s.: that's not to say there aren't cases where fork is the better option, nor that the debugger won't support it in the future -- as @int19h pointed out, we're working on that; I just wanted to present the current workaround for using the debugger until that work is finished.

@supertaz

Or are you saying that it's not such a big deal with fork, because they're all going to share most of their memory pages?

It's got a lot to do with the problem at hand, where very large data structures are split into chunks for processing. Because fork() is CoW (even if reference counting makes it CoR, the point still holds), when you're trying to accelerate processing of large datasets on NUMA systems with many cores (especially with HT pipelines), and you aren't overwriting data but instead aggregating results and then handling them in the parent at completion, you get a near-linear improvement in processing time with fork() because you're not copying much (if any) memory into the child. The same can't be said of spawn(), where each child has to rebuild or reload the state it needs from scratch, which is slow.

Also, spawn() and fork() differ in where the entry point for the child is. Because Python's multiprocessing package abstracts some of this away (though it's accessible), and because the code is designed to be short-lived and operate as parallel inline execution of a single function, rather than as part of a long-lived pool operating in a dispatcher-worker pattern, there are some pretty big differences in how the code behaves. fork() followed by exec() behaves much like spawn(), but other fork() usage patterns do not, and I think this is where any confusion is coming in. The above code isn't resetting state and entry point via exec(), because the children have one line of code to execute (which calls others, but there is no branching) before they return data to the parent and die. This is efficient with fork() only, and is a workaround for Python's GIL hamstringing threading.

The pattern I supplied is used in data science and data engineering to handle processing of very large datasets via pandas DataFrames and other similar structures. I fully acknowledge that someone who didn't know how to properly use fork() or spawn() could wreak havoc on a system (typically their own, as most multi-occupancy systems limit a user's resources), but I've been using both for decades and understand the case for one over the other. spawn() is useful where it is appropriate, it's just not useful in these types of cases, and I was trying to illustrate that blindly switching to spawn() in code not written for it can be just as bad as not knowing how to use fork(), just slower. It's a good illustration, however, of how either fork() or spawn() can be abused if the wrong one is used in code designed for a use case where only the other is appropriate.

Someone who doesn't truly understand how fork() and spawn() work, who was trying to debug something based on a common pattern often recommended for speeding up processing in Python, and who wasn't watching system resources (or didn't know to), might not realize they were creating a recursive memory monster. So I thought it important to point out that the workaround isn't a workaround for anything that uses multiprocessing for short-lifetime, inline purposes where fork() without exec() is the appropriate solution. I was also trying to stress the importance of an actual solution to the debugging issue (as difficult as it may be to create one), over offering a workaround without explaining why that workaround is limited in scope and applicability.

Sorry it took me so long to respond; I hadn't noticed the notification icon, and I missed the notification emails. I hope I addressed the questions, and feel free to ask for more clarifications, as mine may have further obfuscated the point (my concentration is flagging at present).

@memeplex

For me, setting the start method to spawn or forkserver just avoids the exception thrown for fork, but breakpoints in subprocesses are still not working at all. Do they work for you?

@memeplex

memeplex commented Sep 29, 2019

Ah, it seems that I have to create a launch configuration and enable subProcess. Is that true, or can I avoid this step?

@karthiknadig
Member Author

@memeplex that is correct. "subProcess": true has to be set in the debug config for the workaround.
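
For example, a minimal launch.json entry with that flag (the name and program values are just placeholders):

{
    "name": "Python: Current File",
    "type": "python",
    "request": "launch",
    "program": "${file}",
    "console": "integratedTerminal",
    "subProcess": true
}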

@Breich90

Breich90 commented Nov 7, 2019

Any update here? Spawning isn't a feasible alternative.

@int19h
Contributor

int19h commented Nov 8, 2019

@Breich90 My apologies - we haven't done a good job of tracking the work on the new implementation in a way that's easy to follow. At this point, it's mostly centralized in this issue, which references a bunch of others: #1706

TL;DR is that the new implementation is already committed and does fix the fork issue, but we need to do a few more bug fixes and polish before we can ship it as stable. There's a pre-release build for it, ptvsd 5.0.0a7, but it has to be mated with a supporting build of VSCode - and there isn't a ready-made one yet, so there's no easy way to test it without building things locally. We'll have something available for testing soon, though.

@AirVetra

Hello, I'm very sorry, I just don't understand what we should do - just wait for the solution and follow #1706?

The error is the following:

Traceback (most recent call last):
  File "/home/airvetra/.vscode-server/extensions/ms-python.python-2019.11.50794/pythonFiles/ptvsd_launcher.py", line 43, in <module>
    main(ptvsdArgs)
  File "/home/airvetra/.vscode-server/extensions/ms-python.python-2019.11.50794/pythonFiles/lib/python/old_ptvsd/ptvsd/__main__.py", line 432, in main
    run()
  File "/home/airvetra/.vscode-server/extensions/ms-python.python-2019.11.50794/pythonFiles/lib/python/old_ptvsd/ptvsd/__main__.py", line 316, in run_file
    runpy.run_path(target, run_name='__main__')
  File "/usr/lib/python3.7/runpy.py", line 263, in run_path
    pkg_name=pkg_name, script_name=fname)
  File "/usr/lib/python3.7/runpy.py", line 96, in _run_module_code
    mod_name, mod_spec, pkg_name, script_name)
  File "/usr/lib/python3.7/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "/home/airvetra/ml/ML/autots/run_local_test.py", line 118, in <module>
    main()
  File "/home/airvetra/ml/ML/autots/run_local_test.py", line 114, in main
    run(dataset_dir, code_dir)
  File "/home/airvetra/ml/ML/autots/run_local_test.py", line 80, in run
    ingestion_process.start()
  File "/usr/lib/python3.7/multiprocessing/process.py", line 112, in start
    self._popen = self._Popen(self)
  File "/usr/lib/python3.7/multiprocessing/context.py", line 223, in _Popen
    return _default_context.get_context().Process._Popen(process_obj)
  File "/usr/lib/python3.7/multiprocessing/context.py", line 284, in _Popen
    return Popen(process_obj)
  File "/usr/lib/python3.7/multiprocessing/popen_spawn_posix.py", line 32, in __init__
    super().__init__(process_obj)
  File "/usr/lib/python3.7/multiprocessing/popen_fork.py", line 20, in __init__
    self._launch(process_obj)
  File "/usr/lib/python3.7/multiprocessing/popen_spawn_posix.py", line 47, in _launch
    reduction.dump(process_obj, fp)
  File "/usr/lib/python3.7/multiprocessing/reduction.py", line 60, in dump
    ForkingPickler(file, protocol).dump(obj)
AttributeError: Can't pickle local object 'run.<locals>.run_ingestion'

@int19h
Contributor

int19h commented Nov 26, 2019

In general, you'll need a recent pre-release of ptvsd, and the corresponding version of VSCode, to have fork supported. While the issue is still open because we're still working on some bits and pieces, the bulk of it is already there in ptvsd 5 alphas.

However, this particular call stack doesn't look to me like a typical ptvsd failure due to fork. It mentions pickling and popen_spawn_posix, so this sounds more like what happens after you do set_start_method("spawn"). In this case the error is due to some data that you're trying to share between your processes not being picklable.
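
For illustration, a sketch of that failure mode and the usual fix (the function names loosely mirror the traceback but are reconstructions, not the actual project code):

from multiprocessing import get_context

def run():
    def run_ingestion():          # local function: spawn has to pickle the Process
        print("ingesting")        # target, and local objects can't be pickled
    p = get_context('spawn').Process(target=run_ingestion)
    p.start()                     # AttributeError: Can't pickle local object ...

# The usual fix is to move the target to module level so it can be pickled by name:
def ingestion(dataset_dir, code_dir):
    print("ingesting", dataset_dir, code_dir)

if __name__ == '__main__':
    p = get_context('spawn').Process(target=ingestion, args=('data/', 'code/'))
    p.start()
    p.join()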

@AirVetra

Pavel, you are definitely right - I added set_start_method("spawn"), and after adding it the above error started to occur. Before that, there was the multiprocessing error RuntimeError: already started.

So, could you advise how to solve the issue with debugging such code? I could share it if that would help...

@int19h
Contributor

int19h commented Dec 10, 2019

This is complete. The remaining work to ship the new adapter is being tracked by #1706.

@gauravmunjal13

Hi Team,

I came across a strange issue using VS Code to debug PyTorch code, on the enumerate(data_loader) line.

The error is:
RuntimeError: already started
E00019.065: Exception escaped from start_client
...
AssertionError: can only join a child process

This is happening because of the multiprocessing done in the data loader. It started specifically when I updated VS Code to 1.46.1 yesterday.

Searching the net, I figured out that the solution is to set num_workers=0 or to use the following code before enumerating the data loader:
import multiprocessing
multiprocessing.set_start_method('spawn', True)

Is there another way to resolve this issue?

Thanks and regards,
Gaurav Kumar

@galfaroth

galfaroth commented Jun 22, 2020

+1 @gauravmunjal13. After updating VS Code I have the same exception. Adding his multiprocessing lines doesn't work for me. Before the update, debugging was working correctly. Setting num_workers to 0 works but is too slow for my training.
