cx_Freeze with torch.multiprocessing using wrong source in child processes #2376
Re-opening, as the fix I previously posted doesn't actually work (unless the source is present in the launch folder). |
That path information is only for debugging; it can be changed with replace_paths. The real bug, however, must be the use of multiprocessing. With the stdlib's multiprocessing we need to call freeze_support, but torch.multiprocessing does not expose this function, so a workaround needs to be analyzed. |
I tried freeze_support(), which works/is needed on Windows, but not Linux. I'm not sure what paths I would replace. It appears to look for the Python files of the app in the folder that the executable is run from, and throws an error that it can't find them, e.g. running from the build folder: FileNotFoundError: [Errno 2] No such file or directory: '/some/folder/build/exe.linux-x86_64-3.11/Minimal.py' If you run it from another location, it complains they are not in that location (always with the full path of that location). D. |
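For reference, the stdlib pattern under discussion looks like this — a minimal sketch using plain multiprocessing, not the torch reproducer from this issue:

```python
from multiprocessing import Pool, freeze_support

def square(x):
    return x * x

if __name__ == "__main__":
    # In a frozen executable, freeze_support() must be the first call under
    # the main guard: when a re-launched child detects the special
    # multiprocessing arguments, it hands control back to the worker
    # machinery instead of re-running the script body.
    freeze_support()
    with Pool(2) as pool:
        print(pool.map(square, [1, 2, 3]))
```

On POSIX with the default fork start method this call is a no-op, which matches the observation above that it helps on Windows but not Linux.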
Can you test with cx_Freeze 7.0 and with the dev release? You can test with the latest development build: |
There's still an issue with finding the source: |
From what I understand you are using conda for Linux. What command did you use to install this specific version of Torch? |
Actually I set up the environment with conda, but used pip to install the modules as I couldn't get the versions I needed with conda. I think the command was: pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu121 |
Using your Minimal.py and the command line:

cxfreeze --script Minimal.py build_exe --replace-paths '*='

This will be available in cx_Freeze 7.1.0.dev16. |
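For context, replace_paths works by rewriting the source path stored in each compiled code object. A rough pure-Python sketch of the idea (an illustration of the mechanism, not cx_Freeze's actual implementation):

```python
import types

def replace_code_path(code: types.CodeType, new_path: str) -> types.CodeType:
    # CodeType.replace (Python 3.8+) returns a copy of the code object with
    # the given fields substituted; co_filename is what tracebacks display.
    return code.replace(co_filename=new_path)

# A build-machine path compiled into a module...
code = compile("x = 1\n", "/my/home/folder/minimal_bug/Minimal.py", "exec")
# ...can be stripped, which is roughly what --replace-paths '*=' asks for.
stripped = replace_code_path(code, "Minimal.py")
print(stripped.co_filename)  # Minimal.py
```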
https://cx-freeze--2382.org.readthedocs.build/en/2382/faq.html#multiprocessing-support You can test the patch in the latest development build: |
So I did: cxfreeze --script Minimal.py build_exe --replace-paths '*=' And, I now get the error: FileNotFoundError: [Errno 2] No such file or directory: '/my/home/folder/minimal_bug/build/exe.linux-x86_64-3.11/=/Minimal.py' |
Please check if you have cx_Freeze 7.1.0.dev16 with: |
Actually it was cxfreeze 7.1.0-dev15. Not sure how that happened, as I followed your instructions. I just tried it again and now I have 7.1.0-dev16. However, same output: FileNotFoundError: [Errno 2] No such file or directory: '/my/home/folder/minimal_bug/build/exe.linux-x86_64-3.11/=/Minimal.py' |
Uninstall cx_Freeze and reinstall. Are you using the pip or conda version? Probably some conflict. |
I was using pip. I uninstalled and re-installed via pip: same error. I then tried uninstalling via pip and installing via conda, and I get: $ conda install -y --no-channel-priority -S -c https://marcelotduarte.github.io/packages/conda cx_Freeze UnavailableInvalidChannel: HTTP 404 NOT FOUND for channel packages/conda https://marcelotduarte.github.io/packages/conda The channel is not accessible or is invalid. You will need to adjust your conda configuration to proceed. |
Initially, I did two tests; if you can do the same, it would help eliminate any bugs. I created one new environment using the system Python and another using conda. If you test this second option the way I tested it, that's already good.
Note that I used the CPU version; please use that too. Then I will test using CUDA. |
In the meantime I re-installed using conda by doing: wget https://marcelotduarte.github.io/packages/conda/linux-64/cx_freeze-7.1.0.dev16-py311h459d7ec_0.conda I also got the same error. Note: The whole point of torch.multiprocessing is to use multiple GPUs, so it working just on CPU isn't that useful. I'll try to create an entirely new environment from scratch with conda and see if it works... |
I created an entirely new environment with just cx_Freeze and torch (GPU version) and hit the same issue. This is my history: conda create --name cxtest python=3.11 Output (note: ever so slightly different from before, as I'm now getting a SIGTERM that I didn't get before, though it's the same missing-file error): $ ./Minimal |
The conda version has a bug, I'll try to solve it.
I had told you to use replace_paths exactly to remove the complete path information in the traceback, but I see that it now causes the (previous) error or the SIGTERM. I'll investigate it. But, using only: |
Using (cxtest) $ ./Minimal |
|
I can't see a difference! (cxtest) $ python -VV cx_Freeze 7.1.0.dev16 |
Are you sure you don't have the source files in the folder you are running the executable from? It's the only thing I can think of. |
Now, I understand the situation. |
This version just hangs. You run the program and it outputs absolutely nothing to the screen, and doesn't return. EDIT: If you leave it long enough, it does actually run OK. I'm timing it now to see how long, but it was more than a few minutes. History: |
I changed the code a bit to check:

```python
import torch

def per_device_launch_fn(current_gpu_index, num_gpu):
    for i in range(1, 1000):
        print("Train...")

num_gpu = 4

if __name__ == "__main__":
    print("Starting multiprocessing:", num_gpu, __file__)
    torch.multiprocessing.start_processes(
        per_device_launch_fn,
        args=(num_gpu,),
        nprocs=num_gpu,
        join=True,
        start_method="spawn",
    )
```

$ time python Minimal.py
$ cxfreeze --script Minimal.py build_exe
In the next run, the time is similar to the time used by the python command:
And next time too:
But, using to build:
|
I think the timing thing may have been system related (it's a shared computer): one run took 1.5 hours last week, but today it's not taking that long.

One other issue I did notice is that in the frozen version, `if __name__ == "__main__":` is true in every sub-process, resulting in that code being called N times, whereas in the python version it's only called once. This doesn't matter in the minimal example (the training loop is called 4 times with different values of current_gpu_index), but in my real program the logic in main() is a bit more complex, as it checks sys.argv in the main process, which results in different behaviour between the python and frozen versions. I may be able to re-write the code to get round this, but it does strike me as a bug: presumably torch.multiprocessing must be doing something to ensure per_device_launch_fn() is called directly in the python version, whereas in the frozen version it is being reached via main(). I'm doing some testing to see if this is significant.

Edit: Child processes seem to be called with the following (additional*?) arguments: --multiprocessing-fork tracker_fd=XX pipe_handle=YY where XX is the same for all children, and YY is different for each child. I'm assuming that in the python version the torch.multiprocessing code reads these and puts sys.argv back how you might expect. [* My program has no arguments, so it's not clear if they are additional or replacements.] |
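Based on the arguments observed above, one way to keep argv-dependent logic out of the children is a guard like the following (is_multiprocessing_child is a hypothetical helper for illustration, not part of cx_Freeze or torch):

```python
import sys

def is_multiprocessing_child(argv):
    # Frozen child processes were observed to receive arguments of the form:
    #   --multiprocessing-fork tracker_fd=XX pipe_handle=YY
    return any(a.startswith("--multiprocessing-fork") for a in argv)

if __name__ == "__main__":
    if not is_multiprocessing_child(sys.argv):
        # Parent-only logic, e.g. parsing sys.argv, goes here.
        pass
```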
The hook that I used to patch multiprocessing is based on #264, and later I discovered a similar patch (#501 (comment)); there is even an open upstream issue, python/cpython#104607. |
I don't think it's very different, see how the spawn is described. |
Update: simply doing this works around it:
To be clear, this is not necessary in the python version. |
Release 7.1.0 is out! I'll continue to work on pytorch hook to optimize it. |
Based on information from you and others, I improved the hook for multiprocessing.

```python
import torch
from multiprocessing import freeze_support

def per_device_launch_fn(current_gpu_index, num_gpu):
    for i in range(1, 1000):
        print("Train...")

num_gpu = 4

if __name__ == "__main__":
    freeze_support()
    print("Starting multiprocessing:", num_gpu, __file__)
    torch.multiprocessing.start_processes(
        per_device_launch_fn,
        args=(num_gpu,),
        nprocs=num_gpu,
        join=True,
        start_method="spawn",
    )
```
|
Release 7.1.1 is out! |
Prerequisite
Describe the bug
On Linux, when I use cx_Freeze with a Python script that uses torch.multiprocessing to create multiple processes (which essentially wraps the stdlib multiprocessing), the child processes seem to try to use the original Python files (for the program) and the original Python environment (for the modules), not the ones in the build directory. The initial result of this is errors about the program's source .py files not being found. Other errors can occur if the source is copied into the build folder.
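The likely mechanism: with the spawn start method, each child starts a fresh interpreter and re-imports the parent's main module by its original file path, which no longer exists in a frozen build. A minimal stdlib illustration of the pattern (plain multiprocessing standing in for torch.multiprocessing):

```python
import multiprocessing as mp

def worker(i):
    print("child", i)

if __name__ == "__main__":
    # "spawn" serializes enough state for each child to re-import __main__
    # from its file path; in a frozen build that path (e.g. .../Minimal.py)
    # is absent, consistent with the FileNotFoundError reported in this issue.
    mp.set_start_method("spawn", force=True)
    procs = [mp.Process(target=worker, args=(i,)) for i in range(2)]
    for p in procs:
        p.start()
    for p in procs:
        p.join()
```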
To Reproduce
Environment is Linux, Python 3.11, PyTorch v2.2.2+cu121. [Note: this problem does not occur on Windows.]
Minimal source (Minimal.py):
build script is
Expected behavior
I would expect the pyc versions of code in the build folder to be used under all circumstances (even by child processes), not the original ones.