Horovod fails to install via Poetry #3697

Open
joeyearsley opened this issue Sep 12, 2022 · 17 comments · May be fixed by #3991

Comments

@joeyearsley

Environment:

  1. Framework: Tensorflow
  2. Framework version: 1.8.1
  3. Horovod version: 0.24.2
  4. MPI version: -
  5. CUDA version: -
  6. NCCL version: -
  7. Python version: 3.8
  8. Spark / PySpark version: -
  9. Ray version: -
  10. OS and version: Ubuntu 22.04
  11. GCC version: -
  12. CMake version: -

Checklist:

  1. Did you search issues to find if somebody asked this question before? Yes
  2. If your question is about hang, did you read this doc?
  3. If your question is about docker, did you read this doc?
  4. Did you check if your question is answered in the troubleshooting guide?

Bug report:
I am trying to install Horovod via Poetry and keep receiving:

        ModuleNotFoundError: No module named 'tensorflow'

If I install via pip directly into the virtual env, it works. However, this adds extra load on the team, as they have to remember to do it on top of poetry install.

CMake output:

cmake /tmp/pip-req-build-69ukzo_1 -DCMAKE_BUILD_TYPE=RelWithDebInfo -DCMAKE_LIBRARY_OUTPUT_DIRECTORY_RELWITHDEBINFO=/tmp/pip-req-build-69ukzo_1/build/lib.linux-x86_64-3.8 -DPYTHON_EXECUTABLE:FILEPATH=/root/.cache/pypoetry/virtualenvs/kheironml-VsnhxLU2-py3.8/bin/python

I can import tensorflow both by running that interpreter path directly and from inside the virtual env.
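
For anyone wanting to repeat that check, one option is to ask a specific interpreter whether it can import a module. This is only a sketch: can_import is a hypothetical helper, not part of Horovod or Poetry, and the interpreter path has to be filled in for your own machine.

```python
import subprocess
import sys


def can_import(python: str, module: str) -> bool:
    """Return True if `<python> -c "import <module>"` exits with status 0."""
    result = subprocess.run(
        [python, "-c", f"import {module}"],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0


if __name__ == "__main__":
    # Replace sys.executable with the virtual env's interpreter path
    # (the value passed to -DPYTHON_EXECUTABLE) to test that environment.
    print("tensorflow importable:", can_import(sys.executable, "tensorflow"))
```

A True result here only says the virtual env itself is fine; it does not say anything about the temporary environment pip creates for the build.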

@joeyearsley
Author

If I clone and build from source, it works correctly. I'll open a ticket with Poetry as well, in case it is an issue with their installer.

@joeyearsley
Author

I spoke to the Poetry team, and they believe the issue lies in Horovod not being compliant with PEP 517/518. Is this correct?

If so, are there any plans to resolve this?

@maxhgerlach
Collaborator

maxhgerlach commented Sep 12, 2022

> Having spoken to the poetry team they believe the issue lies in Horovod not being compliant with PEP-517/518, is this correct?

Would that be related to the observations in issue #3483 and its comments? I think it would be great to clear that up, and assistance would be appreciated!

@joeyearsley
Author

The best I can say is that it might be. However, TensorFlow is already installed on my system, and Poetry appears to resolve and install it before Horovod.
So I believe the cause is PEP 517: building from source in isolation requires the build requirements to be declared up front in setup_requires or pyproject.toml, as you suggested.

I wonder whether another PEP is needed to standardise a form of isolation that also exposes the required extras, rather than only the requirements declared in setup.py/pyproject.toml.
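
For context, a PEP 517 frontend such as pip builds the wheel in a temporary environment that contains only the declared build requirements, so packages installed in the project's virtual env are not importable from the build script. A small diagnostic along these lines could be dropped into a build script to see what the build process can actually find. missing_modules is a hypothetical helper, not Horovod code, and it only handles top-level module names.

```python
import importlib.util


def missing_modules(names):
    """Return the top-level module names that this process cannot locate."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # Inside an isolated build this would list anything that was not declared
    # as a build requirement, even if it is installed in the target virtual env.
    print("not visible to this process:", missing_modules(["tensorflow", "packaging"]))
```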

@geroldcsendes

I am facing the same issue. I raised a ticket with Poetry and they said Horovod's build is not compliant with PEP 517, as mentioned above by @joeyearsley.

@itsdani

itsdani commented Oct 27, 2022

Installing with the latest Poetry now fails with:

[...]
      ModuleNotFoundError: No module named 'packaging'
      [end of output]
  
  note: This error originates from a subprocess, and is likely not a problem with pip.
  ERROR: Failed building wheel for horovod
Failed to build horovod
ERROR: Could not build wheels for horovod, which is required to install pyproject.toml-based projects

It also fails with the same error using pip install --use-pep517 horovod (which is what poetry uses in the background)
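
To reproduce this outside Poetry, the pip invocation can be assembled by hand. The sketch below mirrors the flag mentioned above; pip_install_command is a hypothetical helper. The --no-build-isolation switch is pip's option for building in the current environment instead of an isolated one, which only helps if the build requirements are already installed there. Whether that is an acceptable workaround for this issue is an assumption, not something confirmed in this thread.

```python
import sys


def pip_install_command(target, python=sys.executable, isolated=True):
    """Build the argument list for a PEP 517 install of `target` with pip."""
    cmd = [python, "-m", "pip", "install", "--use-pep517"]
    if not isolated:
        # Build in the current environment rather than a temporary one.
        cmd.append("--no-build-isolation")
    cmd.append(target)
    return cmd


if __name__ == "__main__":
    print(" ".join(pip_install_command("horovod")))
    print(" ".join(pip_install_command("horovod", isolated=False)))
```

The returned list can be passed to subprocess.run as is, which makes it easy to compare the isolated and non-isolated builds side by side.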

@itsdani

itsdani commented Oct 27, 2022

The current issue was introduced in 0.26.0, as poetry add horovod==0.25.0 works fine with Poetry 1.2.2.

It might have been introduced by #3700, which started to use packaging.version, and the error is No module named 'packaging'
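
If that is the cause, the failure mode is a build-time import of a package that is not guaranteed to exist in the isolated build environment. A minimal sketch of a defensive pattern follows. parse_version is a hypothetical helper and not Horovod's actual code, and its fallback only understands plain dotted numeric versions. The other route is to declare packaging under [build-system] requires in pyproject.toml, which is the PEP 518 mechanism for build-time dependencies.

```python
import re


def parse_version(text):
    """Return a comparable key for a dotted version string such as '3.13.0'."""
    try:
        # Preferred path: only works when 'packaging' is installed in the
        # environment that runs the build.
        from packaging.version import Version
        return Version(text)
    except ModuleNotFoundError:
        # Fallback: compare the leading numeric components as a tuple.
        # Unlike packaging, this treats '3.13' as lower than '3.13.0'.
        match = re.match(r"\d+(?:\.\d+)*", text)
        if match is None:
            raise ValueError(f"not a dotted numeric version: {text!r}")
        return tuple(int(part) for part in match.group(0).split("."))


if __name__ == "__main__":
    print(parse_version("3.13.0") > parse_version("3.5"))
```

Values from the two branches are not comparable with each other, but within one process every call takes the same branch, so comparisons stay consistent.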

Here is the full stack trace, in case it helps:

Just the stack trace:
running build_ext
        Traceback (most recent call last):
          File "/home/dsegesdi/work/horovodtest/.venv/lib/python3.10/site-packages/pip/_vendor/pep517/in_process/_in_process.py", line 351, in <module>
            main()
          File "/home/dsegesdi/work/horovodtest/.venv/lib/python3.10/site-packages/pip/_vendor/pep517/in_process/_in_process.py", line 333, in main
            json_out['return_val'] = hook(**hook_input['kwargs'])
          File "/home/dsegesdi/work/horovodtest/.venv/lib/python3.10/site-packages/pip/_vendor/pep517/in_process/_in_process.py", line 249, in build_wheel
            return _build_backend().build_wheel(wheel_directory, config_settings,
          File "/tmp/pip-build-env-rbzyecma/overlay/lib/python3.10/site-packages/setuptools/build_meta.py", line 412, in build_wheel
            return self._build_with_temp_dir(['bdist_wheel'], '.whl',
          File "/tmp/pip-build-env-rbzyecma/overlay/lib/python3.10/site-packages/setuptools/build_meta.py", line 397, in _build_with_temp_dir
            self.run_setup()
          File "/tmp/pip-build-env-rbzyecma/overlay/lib/python3.10/site-packages/setuptools/build_meta.py", line 483, in run_setup
            super(_BuildMetaLegacyBackend,
          File "/tmp/pip-build-env-rbzyecma/overlay/lib/python3.10/site-packages/setuptools/build_meta.py", line 335, in run_setup
            exec(code, locals())
          File "<string>", line 213, in <module>
          File "/tmp/pip-build-env-rbzyecma/overlay/lib/python3.10/site-packages/setuptools/__init__.py", line 87, in setup
            return distutils.core.setup(**attrs)
          File "/tmp/pip-build-env-rbzyecma/overlay/lib/python3.10/site-packages/setuptools/_distutils/core.py", line 185, in setup
            return run_commands(dist)
          File "/tmp/pip-build-env-rbzyecma/overlay/lib/python3.10/site-packages/setuptools/_distutils/core.py", line 201, in run_commands
            dist.run_commands()
          File "/tmp/pip-build-env-rbzyecma/overlay/lib/python3.10/site-packages/setuptools/_distutils/dist.py", line 968, in run_commands
            self.run_command(cmd)
          File "/tmp/pip-build-env-rbzyecma/overlay/lib/python3.10/site-packages/setuptools/dist.py", line 1217, in run_command
            super().run_command(command)
          File "/tmp/pip-build-env-rbzyecma/overlay/lib/python3.10/site-packages/setuptools/_distutils/dist.py", line 987, in run_command
            cmd_obj.run()
          File "/tmp/pip-build-env-rbzyecma/overlay/lib/python3.10/site-packages/wheel/bdist_wheel.py", line 299, in run
            self.run_command('build')
          File "/tmp/pip-build-env-rbzyecma/overlay/lib/python3.10/site-packages/setuptools/_distutils/cmd.py", line 319, in run_command
            self.distribution.run_command(command)
          File "/tmp/pip-build-env-rbzyecma/overlay/lib/python3.10/site-packages/setuptools/dist.py", line 1217, in run_command
            super().run_command(command)
          File "/tmp/pip-build-env-rbzyecma/overlay/lib/python3.10/site-packages/setuptools/_distutils/dist.py", line 987, in run_command
            cmd_obj.run()
          File "/tmp/pip-build-env-rbzyecma/overlay/lib/python3.10/site-packages/setuptools/_distutils/command/build.py", line 132, in run
            self.run_command(cmd_name)
          File "/tmp/pip-build-env-rbzyecma/overlay/lib/python3.10/site-packages/setuptools/_distutils/cmd.py", line 319, in run_command
            self.distribution.run_command(command)
          File "/tmp/pip-build-env-rbzyecma/overlay/lib/python3.10/site-packages/setuptools/dist.py", line 1217, in run_command
            super().run_command(command)
          File "/tmp/pip-build-env-rbzyecma/overlay/lib/python3.10/site-packages/setuptools/_distutils/dist.py", line 987, in run_command
            cmd_obj.run()
          File "/tmp/pip-build-env-rbzyecma/overlay/lib/python3.10/site-packages/setuptools/command/build_ext.py", line 84, in run
            _build_ext.run(self)
          File "/tmp/pip-build-env-rbzyecma/overlay/lib/python3.10/site-packages/setuptools/_distutils/command/build_ext.py", line 346, in run
            self.build_extensions()
          File "<string>", line 106, in build_extensions
          File "<string>", line 67, in get_cmake_bin
        ModuleNotFoundError: No module named 'packaging'
        [end of output]
    
    note: This error originates from a subprocess, and is likely not a problem with pip.
    ERROR: Failed building wheel for horovod
  Failed to build horovod
  ERROR: Could not build wheels for horovod, which is required to install pyproject.toml-based projects

Full Poetry output with stack trace:
Loading configuration file /home/dsegesdi/.config/pypoetry/config.toml
Loading configuration file /home/dsegesdi/.config/pypoetry/auth.toml
Disabling source caches
Creating virtualenv horovodtest in /home/dsegesdi/work/horovodtest/.venv
Using virtualenv: /home/dsegesdi/work/horovodtest/.venv
Using version ^0.26.1 for horovod

Updating dependencies
Resolving dependencies...
   1: fact: horovodtest is 0.1.0
   1: derived: horovodtest
   1: fact: horovodtest depends on horovod (^0.26.1)
   1: selecting horovodtest (0.1.0)
   1: derived: horovod (>=0.26.1,<0.27.0)
   1: fact: horovod (0.26.1) depends on cloudpickle (*)
   1: fact: horovod (0.26.1) depends on psutil (*)
   1: fact: horovod (0.26.1) depends on pyyaml (*)
   1: fact: horovod (0.26.1) depends on packaging (*)
   1: fact: horovod (0.26.1) depends on cffi (>=1.4.0)
   1: selecting horovod (0.26.1)
   1: derived: cffi (>=1.4.0)
   1: derived: packaging
   1: derived: pyyaml
   1: derived: psutil
   1: derived: cloudpickle
   1: selecting pyyaml (6.0)
   1: fact: cffi (1.15.1) depends on pycparser (*)
   1: selecting cffi (1.15.1)
   1: derived: pycparser
   1: selecting pycparser (2.21)
   1: selecting cloudpickle (2.2.0)
   1: fact: packaging (21.3) depends on pyparsing (>=2.0.2,<3.0.5 || >3.0.5)
   1: selecting packaging (21.3)
   1: derived: pyparsing (>=2.0.2,!=3.0.5)
   1: selecting pyparsing (3.0.9)
   1: selecting psutil (5.9.3)
   1: Version solving took 12.439 seconds.
   1: Tried 1 solutions.

Writing lock file

Finding the necessary packages for the current system

Package operations: 8 installs, 0 updates, 0 removals

  • Installing pycparser (2.21)
  • Installing pyparsing (3.0.9)
  • Installing cffi (1.15.1)
  • Installing cloudpickle (2.2.0)
  • Installing packaging (21.3)
  • Installing psutil (5.9.3)
  • Installing pyyaml (6.0)
  • Installing horovod (0.26.1): Failed

  Stack trace:

  2  ~/.pyenv/versions/3.10.7/lib/python3.10/site-packages/poetry/utils/env.py:1472 in _run
      1470│                 )
      1471│             else:
    → 1472│                 output = subprocess.check_output(
      1473│                     command, stderr=subprocess.STDOUT, env=env, **kwargs
      1474│                 )

  1  ~/.pyenv/versions/3.10.7/lib/python3.10/subprocess.py:420 in check_output
       418│         kwargs['input'] = empty
       419│ 
    →  420│     return run(*popenargs, stdout=PIPE, timeout=timeout, check=True,
       421│                **kwargs).stdout
       422│ 

  CalledProcessError

  Command '['/home/dsegesdi/work/horovodtest/.venv/bin/python', '-m', 'pip', 'install', '--use-pep517', '--disable-pip-version-check', '--prefix', '/home/dsegesdi/work/horovodtest/.venv', '--no-deps', '/home/dsegesdi/.cache/pypoetry/artifacts/22/3f/52/5ac958d5d9895637e320ab9c30f2728b984c4e6220527a35a0078abf96/horovod-0.26.1.tar.gz']' returned non-zero exit status 1.

  at ~/.pyenv/versions/3.10.7/lib/python3.10/subprocess.py:524 in run
       520│             # We don't call process.wait() as .__exit__ does that for us.
       521│             raise
       522│         retcode = process.poll()
       523│         if check and retcode:
    →  524│             raise CalledProcessError(retcode, process.args,
       525│                                      output=stdout, stderr=stderr)
       526│     return CompletedProcess(process.args, retcode, stdout, stderr)
       527│ 
       528│ 

The following error occurred when trying to handle this error:


  Stack trace:

  3  ~/.pyenv/versions/3.10.7/lib/python3.10/site-packages/poetry/utils/pip.py:49 in pip_install
       47│ 
       48│     try:
    →  49│         return environment.run_pip(*args)
       50│     except EnvCommandError as e:
       51│         raise PoetryException(f"Failed to install {path.as_posix()}") from e

  2  ~/.pyenv/versions/3.10.7/lib/python3.10/site-packages/poetry/utils/env.py:1435 in run_pip
      1433│         pip = self.get_pip_command()
      1434│         cmd = pip + list(args)
    → 1435│         return self._run(cmd, **kwargs)
      1436│ 
      1437│     def run_python_script(self, content: str, **kwargs: Any) -> int | str:

  1  ~/.pyenv/versions/3.10.7/lib/python3.10/site-packages/poetry/utils/env.py:1712 in _run
      1710│     def _run(self, cmd: list[str], **kwargs: Any) -> int | str:
      1711│         kwargs["env"] = self.get_temp_environ(environ=kwargs.get("env"))
    → 1712│         return super()._run(cmd, **kwargs)
      1713│ 
      1714│     def get_temp_environ(

  EnvCommandError

  Command ['/home/dsegesdi/work/horovodtest/.venv/bin/python', '-m', 'pip', 'install', '--use-pep517', '--disable-pip-version-check', '--prefix', '/home/dsegesdi/work/horovodtest/.venv', '--no-deps', '/home/dsegesdi/.cache/pypoetry/artifacts/22/3f/52/5ac958d5d9895637e320ab9c30f2728b984c4e6220527a35a0078abf96/horovod-0.26.1.tar.gz'] errored with the following return code 1, and output: 
  Processing /home/dsegesdi/.cache/pypoetry/artifacts/22/3f/52/5ac958d5d9895637e320ab9c30f2728b984c4e6220527a35a0078abf96/horovod-0.26.1.tar.gz
    Installing build dependencies: started
    Installing build dependencies: finished with status 'done'
    Getting requirements to build wheel: started
    Getting requirements to build wheel: finished with status 'done'
    Preparing metadata (pyproject.toml): started
    Preparing metadata (pyproject.toml): finished with status 'done'
  Building wheels for collected packages: horovod
    Building wheel for horovod (pyproject.toml): started
    Building wheel for horovod (pyproject.toml): finished with status 'error'
    error: subprocess-exited-with-error
    
    × Building wheel for horovod (pyproject.toml) did not run successfully.
    │ exit code: 1
    ╰─> [236 lines of output]
        running bdist_wheel
        running build
        running build_py
        creating build
        creating build/lib.linux-x86_64-cpython-310
        creating build/lib.linux-x86_64-cpython-310/horovod
        copying horovod/__init__.py -> build/lib.linux-x86_64-cpython-310/horovod
        creating build/lib.linux-x86_64-cpython-310/horovod/torch
        copying horovod/torch/sync_batch_norm.py -> build/lib.linux-x86_64-cpython-310/horovod/torch
        copying horovod/torch/optimizer.py -> build/lib.linux-x86_64-cpython-310/horovod/torch
        copying horovod/torch/mpi_ops.py -> build/lib.linux-x86_64-cpython-310/horovod/torch
        copying horovod/torch/functions.py -> build/lib.linux-x86_64-cpython-310/horovod/torch
        copying horovod/torch/compression.py -> build/lib.linux-x86_64-cpython-310/horovod/torch
        copying horovod/torch/__init__.py -> build/lib.linux-x86_64-cpython-310/horovod/torch
        creating build/lib.linux-x86_64-cpython-310/horovod/tensorflow
        copying horovod/tensorflow/util.py -> build/lib.linux-x86_64-cpython-310/horovod/tensorflow
        copying horovod/tensorflow/sync_batch_norm.py -> build/lib.linux-x86_64-cpython-310/horovod/tensorflow
        copying horovod/tensorflow/mpi_ops.py -> build/lib.linux-x86_64-cpython-310/horovod/tensorflow
        copying horovod/tensorflow/gradient_aggregation_eager.py -> build/lib.linux-x86_64-cpython-310/horovod/tensorflow
        copying horovod/tensorflow/gradient_aggregation.py -> build/lib.linux-x86_64-cpython-310/horovod/tensorflow
        copying horovod/tensorflow/functions.py -> build/lib.linux-x86_64-cpython-310/horovod/tensorflow
        copying horovod/tensorflow/elastic.py -> build/lib.linux-x86_64-cpython-310/horovod/tensorflow
        copying horovod/tensorflow/compression.py -> build/lib.linux-x86_64-cpython-310/horovod/tensorflow
        copying horovod/tensorflow/__init__.py -> build/lib.linux-x86_64-cpython-310/horovod/tensorflow
        creating build/lib.linux-x86_64-cpython-310/horovod/spark
        copying horovod/spark/runner.py -> build/lib.linux-x86_64-cpython-310/horovod/spark
        copying horovod/spark/mpi_run.py -> build/lib.linux-x86_64-cpython-310/horovod/spark
        copying horovod/spark/gloo_run.py -> build/lib.linux-x86_64-cpython-310/horovod/spark
        copying horovod/spark/conf.py -> build/lib.linux-x86_64-cpython-310/horovod/spark
        copying horovod/spark/__init__.py -> build/lib.linux-x86_64-cpython-310/horovod/spark
        creating build/lib.linux-x86_64-cpython-310/horovod/runner
        copying horovod/runner/task_fn.py -> build/lib.linux-x86_64-cpython-310/horovod/runner
        copying horovod/runner/run_task.py -> build/lib.linux-x86_64-cpython-310/horovod/runner
        copying horovod/runner/mpi_run.py -> build/lib.linux-x86_64-cpython-310/horovod/runner
        copying horovod/runner/launch.py -> build/lib.linux-x86_64-cpython-310/horovod/runner
        copying horovod/runner/js_run.py -> build/lib.linux-x86_64-cpython-310/horovod/runner
        copying horovod/runner/gloo_run.py -> build/lib.linux-x86_64-cpython-310/horovod/runner
        copying horovod/runner/__init__.py -> build/lib.linux-x86_64-cpython-310/horovod/runner
        creating build/lib.linux-x86_64-cpython-310/horovod/ray
        copying horovod/ray/worker.py -> build/lib.linux-x86_64-cpython-310/horovod/ray
        copying horovod/ray/utils.py -> build/lib.linux-x86_64-cpython-310/horovod/ray
        copying horovod/ray/strategy.py -> build/lib.linux-x86_64-cpython-310/horovod/ray
        copying horovod/ray/runner.py -> build/lib.linux-x86_64-cpython-310/horovod/ray
        copying horovod/ray/ray_logger.py -> build/lib.linux-x86_64-cpython-310/horovod/ray
        copying horovod/ray/elastic_v2.py -> build/lib.linux-x86_64-cpython-310/horovod/ray
        copying horovod/ray/elastic.py -> build/lib.linux-x86_64-cpython-310/horovod/ray
        copying horovod/ray/driver_service.py -> build/lib.linux-x86_64-cpython-310/horovod/ray
        copying horovod/ray/adapter.py -> build/lib.linux-x86_64-cpython-310/horovod/ray
        copying horovod/ray/__init__.py -> build/lib.linux-x86_64-cpython-310/horovod/ray
        creating build/lib.linux-x86_64-cpython-310/horovod/mxnet
        copying horovod/mxnet/mpi_ops.py -> build/lib.linux-x86_64-cpython-310/horovod/mxnet
        copying horovod/mxnet/functions.py -> build/lib.linux-x86_64-cpython-310/horovod/mxnet
        copying horovod/mxnet/compression.py -> build/lib.linux-x86_64-cpython-310/horovod/mxnet
        copying horovod/mxnet/__init__.py -> build/lib.linux-x86_64-cpython-310/horovod/mxnet
        creating build/lib.linux-x86_64-cpython-310/horovod/keras
        copying horovod/keras/elastic.py -> build/lib.linux-x86_64-cpython-310/horovod/keras
        copying horovod/keras/callbacks.py -> build/lib.linux-x86_64-cpython-310/horovod/keras
        copying horovod/keras/__init__.py -> build/lib.linux-x86_64-cpython-310/horovod/keras
        creating build/lib.linux-x86_64-cpython-310/horovod/data
        copying horovod/data/data_loader_base.py -> build/lib.linux-x86_64-cpython-310/horovod/data
        copying horovod/data/__init__.py -> build/lib.linux-x86_64-cpython-310/horovod/data
        creating build/lib.linux-x86_64-cpython-310/horovod/common
        copying horovod/common/util.py -> build/lib.linux-x86_64-cpython-310/horovod/common
        copying horovod/common/process_sets.py -> build/lib.linux-x86_64-cpython-310/horovod/common
        copying horovod/common/exceptions.py -> build/lib.linux-x86_64-cpython-310/horovod/common
        copying horovod/common/elastic.py -> build/lib.linux-x86_64-cpython-310/horovod/common
        copying horovod/common/basics.py -> build/lib.linux-x86_64-cpython-310/horovod/common
        copying horovod/common/__init__.py -> build/lib.linux-x86_64-cpython-310/horovod/common
        creating build/lib.linux-x86_64-cpython-310/horovod/_keras
        copying horovod/_keras/elastic.py -> build/lib.linux-x86_64-cpython-310/horovod/_keras
        copying horovod/_keras/callbacks.py -> build/lib.linux-x86_64-cpython-310/horovod/_keras
        copying horovod/_keras/__init__.py -> build/lib.linux-x86_64-cpython-310/horovod/_keras
        creating build/lib.linux-x86_64-cpython-310/horovod/torch/elastic
        copying horovod/torch/elastic/state.py -> build/lib.linux-x86_64-cpython-310/horovod/torch/elastic
        copying horovod/torch/elastic/sampler.py -> build/lib.linux-x86_64-cpython-310/horovod/torch/elastic
        copying horovod/torch/elastic/__init__.py -> build/lib.linux-x86_64-cpython-310/horovod/torch/elastic
        creating build/lib.linux-x86_64-cpython-310/horovod/tensorflow/keras
        copying horovod/tensorflow/keras/elastic.py -> build/lib.linux-x86_64-cpython-310/horovod/tensorflow/keras
        copying horovod/tensorflow/keras/callbacks.py -> build/lib.linux-x86_64-cpython-310/horovod/tensorflow/keras
        copying horovod/tensorflow/keras/__init__.py -> build/lib.linux-x86_64-cpython-310/horovod/tensorflow/keras
        creating build/lib.linux-x86_64-cpython-310/horovod/tensorflow/data
        copying horovod/tensorflow/data/compute_worker.py -> build/lib.linux-x86_64-cpython-310/horovod/tensorflow/data
        copying horovod/tensorflow/data/compute_service.py -> build/lib.linux-x86_64-cpython-310/horovod/tensorflow/data
        copying horovod/tensorflow/data/__init__.py -> build/lib.linux-x86_64-cpython-310/horovod/tensorflow/data
        creating build/lib.linux-x86_64-cpython-310/horovod/spark/torch
        copying horovod/spark/torch/util.py -> build/lib.linux-x86_64-cpython-310/horovod/spark/torch
        copying horovod/spark/torch/remote.py -> build/lib.linux-x86_64-cpython-310/horovod/spark/torch
        copying horovod/spark/torch/estimator.py -> build/lib.linux-x86_64-cpython-310/horovod/spark/torch
        copying horovod/spark/torch/__init__.py -> build/lib.linux-x86_64-cpython-310/horovod/spark/torch
        creating build/lib.linux-x86_64-cpython-310/horovod/spark/tensorflow
        copying horovod/spark/tensorflow/compute_worker.py -> build/lib.linux-x86_64-cpython-310/horovod/spark/tensorflow
        copying horovod/spark/tensorflow/__init__.py -> build/lib.linux-x86_64-cpython-310/horovod/spark/tensorflow
        creating build/lib.linux-x86_64-cpython-310/horovod/spark/task
        copying horovod/spark/task/task_service.py -> build/lib.linux-x86_64-cpython-310/horovod/spark/task
        copying horovod/spark/task/task_info.py -> build/lib.linux-x86_64-cpython-310/horovod/spark/task
        copying horovod/spark/task/mpirun_exec_fn.py -> build/lib.linux-x86_64-cpython-310/horovod/spark/task
        copying horovod/spark/task/gloo_exec_fn.py -> build/lib.linux-x86_64-cpython-310/horovod/spark/task
        copying horovod/spark/task/__init__.py -> build/lib.linux-x86_64-cpython-310/horovod/spark/task
        creating build/lib.linux-x86_64-cpython-310/horovod/spark/lightning
        copying horovod/spark/lightning/util.py -> build/lib.linux-x86_64-cpython-310/horovod/spark/lightning
        copying horovod/spark/lightning/remote.py -> build/lib.linux-x86_64-cpython-310/horovod/spark/lightning
        copying horovod/spark/lightning/legacy.py -> build/lib.linux-x86_64-cpython-310/horovod/spark/lightning
        copying horovod/spark/lightning/estimator.py -> build/lib.linux-x86_64-cpython-310/horovod/spark/lightning
        copying horovod/spark/lightning/datamodule.py -> build/lib.linux-x86_64-cpython-310/horovod/spark/lightning
        copying horovod/spark/lightning/__init__.py -> build/lib.linux-x86_64-cpython-310/horovod/spark/lightning
        creating build/lib.linux-x86_64-cpython-310/horovod/spark/keras
        copying horovod/spark/keras/util.py -> build/lib.linux-x86_64-cpython-310/horovod/spark/keras
        copying horovod/spark/keras/tensorflow.py -> build/lib.linux-x86_64-cpython-310/horovod/spark/keras
        copying horovod/spark/keras/remote.py -> build/lib.linux-x86_64-cpython-310/horovod/spark/keras
        copying horovod/spark/keras/optimizer.py -> build/lib.linux-x86_64-cpython-310/horovod/spark/keras
        copying horovod/spark/keras/estimator.py -> build/lib.linux-x86_64-cpython-310/horovod/spark/keras
        copying horovod/spark/keras/datamodule.py -> build/lib.linux-x86_64-cpython-310/horovod/spark/keras
        copying horovod/spark/keras/bare.py -> build/lib.linux-x86_64-cpython-310/horovod/spark/keras
        copying horovod/spark/keras/__init__.py -> build/lib.linux-x86_64-cpython-310/horovod/spark/keras
        creating build/lib.linux-x86_64-cpython-310/horovod/spark/driver
        copying horovod/spark/driver/rsh.py -> build/lib.linux-x86_64-cpython-310/horovod/spark/driver
        copying horovod/spark/driver/rendezvous.py -> build/lib.linux-x86_64-cpython-310/horovod/spark/driver
        copying horovod/spark/driver/mpirun_rsh.py -> build/lib.linux-x86_64-cpython-310/horovod/spark/driver
        copying horovod/spark/driver/job_id.py -> build/lib.linux-x86_64-cpython-310/horovod/spark/driver
        copying horovod/spark/driver/host_discovery.py -> build/lib.linux-x86_64-cpython-310/horovod/spark/driver
        copying horovod/spark/driver/driver_service.py -> build/lib.linux-x86_64-cpython-310/horovod/spark/driver
        copying horovod/spark/driver/__init__.py -> build/lib.linux-x86_64-cpython-310/horovod/spark/driver
        creating build/lib.linux-x86_64-cpython-310/horovod/spark/data_loaders
        copying horovod/spark/data_loaders/pytorch_data_loaders.py -> build/lib.linux-x86_64-cpython-310/horovod/spark/data_loaders
        copying horovod/spark/data_loaders/__init__.py -> build/lib.linux-x86_64-cpython-310/horovod/spark/data_loaders
        creating build/lib.linux-x86_64-cpython-310/horovod/spark/common
        copying horovod/spark/common/util.py -> build/lib.linux-x86_64-cpython-310/horovod/spark/common
        copying horovod/spark/common/store.py -> build/lib.linux-x86_64-cpython-310/horovod/spark/common
        copying horovod/spark/common/serialization.py -> build/lib.linux-x86_64-cpython-310/horovod/spark/common
        copying horovod/spark/common/params.py -> build/lib.linux-x86_64-cpython-310/horovod/spark/common
        copying horovod/spark/common/estimator.py -> build/lib.linux-x86_64-cpython-310/horovod/spark/common
        copying horovod/spark/common/datamodule.py -> build/lib.linux-x86_64-cpython-310/horovod/spark/common
        copying horovod/spark/common/constants.py -> build/lib.linux-x86_64-cpython-310/horovod/spark/common
        copying horovod/spark/common/cache.py -> build/lib.linux-x86_64-cpython-310/horovod/spark/common
        copying horovod/spark/common/backend.py -> build/lib.linux-x86_64-cpython-310/horovod/spark/common
        copying horovod/spark/common/_namedtuple_fix.py -> build/lib.linux-x86_64-cpython-310/horovod/spark/common
        copying horovod/spark/common/__init__.py -> build/lib.linux-x86_64-cpython-310/horovod/spark/common
        creating build/lib.linux-x86_64-cpython-310/horovod/runner/util
        copying horovod/runner/util/threads.py -> build/lib.linux-x86_64-cpython-310/horovod/runner/util
        copying horovod/runner/util/streams.py -> build/lib.linux-x86_64-cpython-310/horovod/runner/util
        copying horovod/runner/util/remote.py -> build/lib.linux-x86_64-cpython-310/horovod/runner/util
        copying horovod/runner/util/network.py -> build/lib.linux-x86_64-cpython-310/horovod/runner/util
        copying horovod/runner/util/lsf.py -> build/lib.linux-x86_64-cpython-310/horovod/runner/util
        copying horovod/runner/util/cache.py -> build/lib.linux-x86_64-cpython-310/horovod/runner/util
        copying horovod/runner/util/__init__.py -> build/lib.linux-x86_64-cpython-310/horovod/runner/util
        creating build/lib.linux-x86_64-cpython-310/horovod/runner/task
        copying horovod/runner/task/task_service.py -> build/lib.linux-x86_64-cpython-310/horovod/runner/task
        copying horovod/runner/task/__init__.py -> build/lib.linux-x86_64-cpython-310/horovod/runner/task
        creating build/lib.linux-x86_64-cpython-310/horovod/runner/http
        copying horovod/runner/http/http_server.py -> build/lib.linux-x86_64-cpython-310/horovod/runner/http
        copying horovod/runner/http/http_client.py -> build/lib.linux-x86_64-cpython-310/horovod/runner/http
        copying horovod/runner/http/__init__.py -> build/lib.linux-x86_64-cpython-310/horovod/runner/http
        creating build/lib.linux-x86_64-cpython-310/horovod/runner/elastic
        copying horovod/runner/elastic/worker.py -> build/lib.linux-x86_64-cpython-310/horovod/runner/elastic
        copying horovod/runner/elastic/settings.py -> build/lib.linux-x86_64-cpython-310/horovod/runner/elastic
        copying horovod/runner/elastic/rendezvous.py -> build/lib.linux-x86_64-cpython-310/horovod/runner/elastic
        copying horovod/runner/elastic/registration.py -> build/lib.linux-x86_64-cpython-310/horovod/runner/elastic
        copying horovod/runner/elastic/driver.py -> build/lib.linux-x86_64-cpython-310/horovod/runner/elastic
        copying horovod/runner/elastic/discovery.py -> build/lib.linux-x86_64-cpython-310/horovod/runner/elastic
        copying horovod/runner/elastic/constants.py -> build/lib.linux-x86_64-cpython-310/horovod/runner/elastic
        copying horovod/runner/elastic/__init__.py -> build/lib.linux-x86_64-cpython-310/horovod/runner/elastic
        creating build/lib.linux-x86_64-cpython-310/horovod/runner/driver
        copying horovod/runner/driver/driver_service.py -> build/lib.linux-x86_64-cpython-310/horovod/runner/driver
        copying horovod/runner/driver/__init__.py -> build/lib.linux-x86_64-cpython-310/horovod/runner/driver
        creating build/lib.linux-x86_64-cpython-310/horovod/runner/common
        copying horovod/runner/common/__init__.py -> build/lib.linux-x86_64-cpython-310/horovod/runner/common
        creating build/lib.linux-x86_64-cpython-310/horovod/runner/common/util
        copying horovod/runner/common/util/tiny_shell_exec.py -> build/lib.linux-x86_64-cpython-310/horovod/runner/common/util
        copying horovod/runner/common/util/timeout.py -> build/lib.linux-x86_64-cpython-310/horovod/runner/common/util
        copying horovod/runner/common/util/settings.py -> build/lib.linux-x86_64-cpython-310/horovod/runner/common/util
        copying horovod/runner/common/util/secret.py -> build/lib.linux-x86_64-cpython-310/horovod/runner/common/util
        copying horovod/runner/common/util/safe_shell_exec.py -> build/lib.linux-x86_64-cpython-310/horovod/runner/common/util
        copying horovod/runner/common/util/network.py -> build/lib.linux-x86_64-cpython-310/horovod/runner/common/util
        copying horovod/runner/common/util/hosts.py -> build/lib.linux-x86_64-cpython-310/horovod/runner/common/util
        copying horovod/runner/common/util/host_hash.py -> build/lib.linux-x86_64-cpython-310/horovod/runner/common/util
        copying horovod/runner/common/util/env.py -> build/lib.linux-x86_64-cpython-310/horovod/runner/common/util
        copying horovod/runner/common/util/config_parser.py -> build/lib.linux-x86_64-cpython-310/horovod/runner/common/util
        copying horovod/runner/common/util/codec.py -> build/lib.linux-x86_64-cpython-310/horovod/runner/common/util
        copying horovod/runner/common/util/__init__.py -> build/lib.linux-x86_64-cpython-310/horovod/runner/common/util
        creating build/lib.linux-x86_64-cpython-310/horovod/runner/common/service
        copying horovod/runner/common/service/task_service.py -> build/lib.linux-x86_64-cpython-310/horovod/runner/common/service
        copying horovod/runner/common/service/driver_service.py -> build/lib.linux-x86_64-cpython-310/horovod/runner/common/service
        copying horovod/runner/common/service/compute_service.py -> build/lib.linux-x86_64-cpython-310/horovod/runner/common/service
        copying horovod/runner/common/service/__init__.py -> build/lib.linux-x86_64-cpython-310/horovod/runner/common/service
        running build_ext
        Traceback (most recent call last):
          File "/home/dsegesdi/work/horovodtest/.venv/lib/python3.10/site-packages/pip/_vendor/pep517/in_process/_in_process.py", line 351, in <module>
            main()
          File "/home/dsegesdi/work/horovodtest/.venv/lib/python3.10/site-packages/pip/_vendor/pep517/in_process/_in_process.py", line 333, in main
            json_out['return_val'] = hook(**hook_input['kwargs'])
          File "/home/dsegesdi/work/horovodtest/.venv/lib/python3.10/site-packages/pip/_vendor/pep517/in_process/_in_process.py", line 249, in build_wheel
            return _build_backend().build_wheel(wheel_directory, config_settings,
          File "/tmp/pip-build-env-rbzyecma/overlay/lib/python3.10/site-packages/setuptools/build_meta.py", line 412, in build_wheel
            return self._build_with_temp_dir(['bdist_wheel'], '.whl',
          File "/tmp/pip-build-env-rbzyecma/overlay/lib/python3.10/site-packages/setuptools/build_meta.py", line 397, in _build_with_temp_dir
            self.run_setup()
          File "/tmp/pip-build-env-rbzyecma/overlay/lib/python3.10/site-packages/setuptools/build_meta.py", line 483, in run_setup
            super(_BuildMetaLegacyBackend,
          File "/tmp/pip-build-env-rbzyecma/overlay/lib/python3.10/site-packages/setuptools/build_meta.py", line 335, in run_setup
            exec(code, locals())
          File "<string>", line 213, in <module>
          File "/tmp/pip-build-env-rbzyecma/overlay/lib/python3.10/site-packages/setuptools/__init__.py", line 87, in setup
            return distutils.core.setup(**attrs)
          File "/tmp/pip-build-env-rbzyecma/overlay/lib/python3.10/site-packages/setuptools/_distutils/core.py", line 185, in setup
            return run_commands(dist)
          File "/tmp/pip-build-env-rbzyecma/overlay/lib/python3.10/site-packages/setuptools/_distutils/core.py", line 201, in run_commands
            dist.run_commands()
          File "/tmp/pip-build-env-rbzyecma/overlay/lib/python3.10/site-packages/setuptools/_distutils/dist.py", line 968, in run_commands
            self.run_command(cmd)
          File "/tmp/pip-build-env-rbzyecma/overlay/lib/python3.10/site-packages/setuptools/dist.py", line 1217, in run_command
            super().run_command(command)
          File "/tmp/pip-build-env-rbzyecma/overlay/lib/python3.10/site-packages/setuptools/_distutils/dist.py", line 987, in run_command
            cmd_obj.run()
          File "/tmp/pip-build-env-rbzyecma/overlay/lib/python3.10/site-packages/wheel/bdist_wheel.py", line 299, in run
            self.run_command('build')
          File "/tmp/pip-build-env-rbzyecma/overlay/lib/python3.10/site-packages/setuptools/_distutils/cmd.py", line 319, in run_command
            self.distribution.run_command(command)
          File "/tmp/pip-build-env-rbzyecma/overlay/lib/python3.10/site-packages/setuptools/dist.py", line 1217, in run_command
            super().run_command(command)
          File "/tmp/pip-build-env-rbzyecma/overlay/lib/python3.10/site-packages/setuptools/_distutils/dist.py", line 987, in run_command
            cmd_obj.run()
          File "/tmp/pip-build-env-rbzyecma/overlay/lib/python3.10/site-packages/setuptools/_distutils/command/build.py", line 132, in run
            self.run_command(cmd_name)
          File "/tmp/pip-build-env-rbzyecma/overlay/lib/python3.10/site-packages/setuptools/_distutils/cmd.py", line 319, in run_command
            self.distribution.run_command(command)
          File "/tmp/pip-build-env-rbzyecma/overlay/lib/python3.10/site-packages/setuptools/dist.py", line 1217, in run_command
            super().run_command(command)
          File "/tmp/pip-build-env-rbzyecma/overlay/lib/python3.10/site-packages/setuptools/_distutils/dist.py", line 987, in run_command
            cmd_obj.run()
          File "/tmp/pip-build-env-rbzyecma/overlay/lib/python3.10/site-packages/setuptools/command/build_ext.py", line 84, in run
            _build_ext.run(self)
          File "/tmp/pip-build-env-rbzyecma/overlay/lib/python3.10/site-packages/setuptools/_distutils/command/build_ext.py", line 346, in run
            self.build_extensions()
          File "<string>", line 106, in build_extensions
          File "<string>", line 67, in get_cmake_bin
        ModuleNotFoundError: No module named 'packaging'
        [end of output]
    
    note: This error originates from a subprocess, and is likely not a problem with pip.
    ERROR: Failed building wheel for horovod
  Failed to build horovod
  ERROR: Could not build wheels for horovod, which is required to install pyproject.toml-based projects
  

  at ~/.pyenv/versions/3.10.7/lib/python3.10/site-packages/poetry/utils/env.py:1476 in _run
      1472│                 output = subprocess.check_output(
      1473│                     command, stderr=subprocess.STDOUT, env=env, **kwargs
      1474│                 )
      1475│         except CalledProcessError as e:
    → 1476│             raise EnvCommandError(e, input=input_)
      1477│ 
      1478│         return decode(output)
      1479│ 
      1480│     def execute(self, bin: str, *args: str, **kwargs: Any) -> int:

The following error occurred when trying to handle this error:


  Stack trace:

  5  ~/.pyenv/versions/3.10.7/lib/python3.10/site-packages/poetry/installation/executor.py:261 in _execute_operation
      259│ 
      260│             try:
    → 261│                 result = self._do_execute_operation(operation)
      262│             except EnvCommandError as e:
      263│                 if e.e.returncode == -2:

  4  ~/.pyenv/versions/3.10.7/lib/python3.10/site-packages/poetry/installation/executor.py:334 in _do_execute_operation
      332│             return 0
      333│ 
    → 334│         result: int = getattr(self, f"_execute_{method}")(operation)
      335│ 
      336│         if result != 0:

  3  ~/.pyenv/versions/3.10.7/lib/python3.10/site-packages/poetry/installation/executor.py:454 in _execute_install
      452│ 
      453│     def _execute_install(self, operation: Install | Update) -> int:
    → 454│         status_code = self._install(operation)
      455│ 
      456│         self._save_url_reference(operation)

  2  ~/.pyenv/versions/3.10.7/lib/python3.10/site-packages/poetry/installation/executor.py:496 in _install
      494│         )
      495│         self._write(operation, message)
    → 496│         return self.pip_install(archive, upgrade=operation.job_type == "update")
      497│ 
      498│     def _update(self, operation: Install | Update) -> int:

  1  ~/.pyenv/versions/3.10.7/lib/python3.10/site-packages/poetry/installation/executor.py:123 in pip_install
      121│     ) -> int:
      122│         try:
    → 123│             pip_install(req, self._env, upgrade=upgrade, editable=editable)
      124│         except EnvCommandError as e:
      125│             output = decode(e.e.output)

  PoetryException

  Failed to install /home/dsegesdi/.cache/pypoetry/artifacts/22/3f/52/5ac958d5d9895637e320ab9c30f2728b984c4e6220527a35a0078abf96/horovod-0.26.1.tar.gz

  at ~/.pyenv/versions/3.10.7/lib/python3.10/site-packages/poetry/utils/pip.py:51 in pip_install
       47│ 
       48│     try:
       49│         return environment.run_pip(*args)
       50│     except EnvCommandError as e:
    →  51│         raise PoetryException(f"Failed to install {path.as_posix()}") from e
       52│ 

@itsdani

itsdani commented Oct 27, 2022

I'm wondering if this should be a new issue. The original error no longer occurs with Poetry 1.2.2, and this new error does not occur with Horovod 0.24.2, the version given in the issue description.

Environment:

  1. Framework: -
  2. Framework version: -
  3. Horovod version: 0.26.0, 0.26.1
  4. MPI version: -
  5. CUDA version: -
  6. NCCL version: -
  7. Python version: 3.10.7
  8. Spark / PySpark version: -
  9. Ray version: -
  10. OS and version: Ubuntu 20.04 and Endeavour OS
  11. GCC version: 12.2.0
  12. CMake version: 3.24.2

@EnricoMi
Collaborator

Error No module named 'packaging' with Horovod 0.26.0 is a different issue: #3744
That has been fixed with release 0.26.1: https://github.com/horovod/horovod/releases/tag/v0.26.1

@itsdani

itsdani commented Oct 27, 2022

This still happens with Poetry for 0.26.1; I explicitly tried poetry add horovod==0.26.1

It also fails with pip install --use-pep517 --no-cache-dir horovod==0.26.1

@EnricoMi
Collaborator

@itsdani, are you saying that the problem described in this issue's description is fixed with Poetry 1.2.2?

@itsdani

itsdani commented Oct 27, 2022

Yes, the No module named 'tensorflow' error doesn't happen with Poetry 1.2.2.

@maxhgerlach
Collaborator

If the packaging import is a problem now, I suspect that the install process just doesn't get far enough to surface the No module named 'tensorflow' error.

As a user of Horovod, I'd suggest separately building a Horovod wheel adapted to your environment (with all flags configured appropriately and all external dependencies installed with the right versions; HOROVOD_LOCAL_VERSION set to some internal value) and then using that wheel in your Python dependency management system.
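
One way to read the suggestion above, as a rough sketch rather than an official recipe: build the wheel once with pip wheel in an environment that already contains the framework and build tools, then point Poetry or pip at the resulting file. The helper names, the output directory and the local version string below are hypothetical; only HOROVOD_LOCAL_VERSION and the Horovod version come from this thread.

```python
import os
import sys


def wheel_build_command(version, wheel_dir):
    """Return a `pip wheel` command line that builds Horovod from source.

    --no-build-isolation makes pip build in the current environment, so
    TensorFlow, packaging, CMake etc. must already be installed there.
    """
    return [
        sys.executable, "-m", "pip", "wheel",
        "--no-deps",              # only build Horovod itself
        "--no-build-isolation",   # reuse the packages of this environment
        "--wheel-dir", wheel_dir,
        f"horovod=={version}",
    ]


def wheel_build_env(local_version, base_env=None):
    """Copy the environment and set HOROVOD_LOCAL_VERSION (value is an assumption)."""
    env = dict(os.environ if base_env is None else base_env)
    env["HOROVOD_LOCAL_VERSION"] = local_version
    return env


# Usage (not executed here, since it triggers a full source build):
#   import subprocess
#   subprocess.run(wheel_build_command("0.26.1", "dist"),
#                  env=wheel_build_env("internal1"), check=True)
```

The wheel written to the output directory can then be referenced as a file dependency, so the team's regular install step does not have to compile Horovod.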

@itsdani

itsdani commented Oct 27, 2022

The No module named 'tensorflow' problem doesn't happen on Poetry 1.2.2 for Horovod 0.25.0; the install succeeds without any errors.

I have commented on the other, closed issue (#3744) regarding No module named 'packaging', because my issue is the same and the current fix doesn't work with pip install --use-pep517.
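
For context: with --use-pep517, pip runs the build in a temporary isolated environment (the /tmp/pip-build-env-... overlay paths in the traceback above), so a module that setup.py imports at build time is only available if it is declared as a build requirement. A minimal sketch for checking which modules a given interpreter can import; the module list is an assumption based on the traceback, and running it inside the virtualenv reports on the virtualenv, not on pip's isolated build environment.

```python
import importlib.util


def missing_modules(names):
    """Return the top-level module names that cannot be imported here."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# Modules that appear in the traceback, plus the one reported missing.
print(missing_modules(["setuptools", "wheel", "packaging"]))
```

If packaging is listed as missing in the environment that runs the build, the ModuleNotFoundError above is the expected result.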

@maxhgerlach
Collaborator

maxhgerlach commented Oct 27, 2022

OK. I don't think it's helpful to spread this over multiple semi-related issues, though.

Would you know how to configure the dependence on packaging in setup.py in a way that it works with pip install --use-pep517?

@itsdani

itsdani commented Oct 27, 2022

> Would you know how to configure the dependence on packaging in setup.py in a way that it works with pip install --use-pep517?

Sadly I have no idea how any of this works :(

@itsdani

itsdani commented Oct 27, 2022

I have found a solution. Adding a pyproject.toml with build-system dependencies to Horovod fixes the issue, since PEP 517 and PEP 518 specify the use of pyproject.toml for declaring build requirements like this.
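
As an illustration of what such a file could contain, here is a small sketch that renders a [build-system] table. The requirement list is an assumption drawn from the traceback (setuptools and wheel are used, packaging is missing); the actual contents of the PR are not shown in this thread and may differ, and render_build_system is a hypothetical helper.

```python
def render_build_system(requires, backend="setuptools.build_meta"):
    """Render a minimal PEP 518 [build-system] table as TOML text."""
    quoted = ", ".join(f'"{req}"' for req in requires)
    return (
        "[build-system]\n"
        f"requires = [{quoted}]\n"
        f'build-backend = "{backend}"\n'
    )


# Assumed build requirements; adjust to whatever setup.py really imports.
print(render_build_system(["setuptools", "wheel", "packaging"]))
```

With a table like this in the source distribution, pip installs the listed packages into the isolated build environment before running setup.py.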

I'm not entirely sure which issue I should reference in the PR (#3758) at this point.
