unable to install tf 1.15.2 with dgx-a100 #4

Closed
jackyko1991 opened this issue Sep 14, 2020 · 2 comments

Comments

@jackyko1991
Contributor

jackyko1991 commented Sep 14, 2020

I have just set up a DGX-A100 server. To set up a bare-metal TensorFlow environment, I am using Anaconda to manage virtual environments.

To reproduce the problem, run the following commands:

Installation of Anaconda

wget https://repo.anaconda.com/archive/Anaconda3-2020.07-Linux-x86_64.sh
chmod 777 ./Anaconda3-2020.07-Linux-x86_64.sh
bash ./Anaconda3-2020.07-Linux-x86_64.sh
source ~/.bashrc

Prepare environment

conda create -n tf_1.15.2 python=3.7
conda activate tf_1.15.2

Install NVIDIA TensorFlow 1.15.2

pip install --user nvidia-pyindex
pip install --user nvidia-tensorflow[horovod]

The following error appears when installing nvidia-tensorflow[horovod]:

Collecting nvidia-tensorflow[horovod]
  Downloading nvidia-tensorflow-0.0.1.dev0.tar.gz (3.4 kB)
    ERROR: Command errored out with exit status 1:
     command: /home/jacky/anaconda3/envs/tf_1.15.2/bin/python -c 'import sys, setuptools, tokenize; sys.argv[0] = '"'"'/tmp/pip-install-i7t5p2bw/nvidia-tensorflow/setup.py'"'"'; __file__='"'"'/tmp/pip-install-i7t5p2bw/nvidia-tensorflow/setup.py'"'"';f=getattr(tokenize, '"'"'open'"'"', open)(__file__);code=f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'"');f.close();exec(compile(code, __file__, '"'"'exec'"'"'))' egg_info --egg-base /tmp/pip-pip-egg-info-8fko_xuv
         cwd: /tmp/pip-install-i7t5p2bw/nvidia-tensorflow/
    Complete output (7 lines):
    Traceback (most recent call last):
      File "<string>", line 1, in <module>
      File "/tmp/pip-install-i7t5p2bw/nvidia-tensorflow/setup.py", line 130, in <module>
        raise RuntimeError("This package should not be installed.\nPlease refer "
    RuntimeError: This package should not be installed.
    Please refer to NVIDIA instructions: https://github.com/nvidia/tensorflow.
    Your PIP command defaults to the official PyPI as a package repository.
    ----------------------------------------
ERROR: Command errored out with exit status 1: python setup.py egg_info Check the logs for full command output.

The NGC Docker container is not an option on my machine. Any help would be appreciated.
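
The traceback shows pip downloading nvidia-tensorflow-0.0.1.dev0 from PyPI, and that package's setup.py raises the RuntimeError quoted above. A quick way to see which indexes pip is actually configured to use is sketched below. `show_pip_indexes` is a hypothetical helper name, and the expectation that the NVIDIA index would show up as an `index-url` or `extra-index-url` entry is an assumption, not something confirmed in this thread.

```shell
#!/usr/bin/env bash
# Sketch: list the index-related settings pip would use in the active environment.
# "show_pip_indexes" is a hypothetical helper name, not part of pip.
show_pip_indexes() {
  # "pip config list" prints the merged pip configuration, one key=value per line.
  # Keep only index-url / extra-index-url entries; fall back to a note if none exist.
  python -m pip config list 2>/dev/null | grep -i 'index-url' \
    || echo "no custom index configured (pip defaults to PyPI)"
}

show_pip_indexes
```

If only the fallback note is printed, pip is resolving packages from PyPI alone, which matches the last line of the error message.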

@jackyko1991
Contributor Author

jackyko1991 commented Sep 14, 2020

The install just succeeded with Python 3.6. Will the upcoming release support a newer version of Python?
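
Since the only change reported here is the Python version, a small guard that refuses to proceed unless the active interpreter is 3.6 may help avoid repeating the failed install. This is a minimal sketch: `is_python36` is a hypothetical helper, and the environment name `tf_1.15.2_py36` in the usage comments is an assumption chosen for illustration.

```shell
#!/usr/bin/env bash
# Sketch: succeed only when the given interpreter (default: python) is Python 3.6.
# "is_python36" is a hypothetical helper name.
is_python36() {
  local interpreter="${1:-python}"
  local version
  # Ask the interpreter for its major.minor version, e.g. "3.6" or "3.7".
  version="$("$interpreter" -c 'import sys; print("%d.%d" % sys.version_info[:2])')"
  [ "$version" = "3.6" ]
}

# Usage (environment name is an assumption; packages are those from the report):
#   conda create -n tf_1.15.2_py36 python=3.6
#   conda activate tf_1.15.2_py36
#   is_python36 && pip install nvidia-pyindex \
#     && pip install 'nvidia-tensorflow[horovod]'
```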

@DEKHTIARJonathan
Contributor

DEKHTIARJonathan commented Oct 2, 2020

@jackyko1991 the problem comes from using Anaconda. With a standard python:3.6 distribution, it will work perfectly.

I suggest using a Docker container as follows, if you can:

docker pull python:3.6

docker run --gpus all -it --rm \
  -v $(pwd):/workspace --workdir /workspace \
  -e NVIDIA_DRIVER_CAPABILITIES=compute,utility \
  python:3.6 bash

If you really want to make it work with Anaconda, here is a workaround:

Use this command to install TensorFlow:

pip install \
    --extra-index-url=https://pypi.ngc.nvidia.com \
    --trusted-host pypi.ngc.nvidia.com \
    nvidia-tensorflow[horovod]

If it works for you, please close the issue :)
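
To confirm whether either route worked, one option is to check that TensorFlow imports in the active environment and print its version. This is only a sketch: `check_tf` is a hypothetical helper name, and the thread does not say which version string a successful install reports.

```shell
#!/usr/bin/env bash
# Sketch: print the installed TensorFlow version, or a note if the import fails.
# "check_tf" is a hypothetical helper name.
check_tf() {
  # Import warnings go to stderr and are suppressed; only the version is printed.
  python -c 'import tensorflow as tf; print(tf.__version__)' 2>/dev/null \
    || echo "tensorflow is not importable in this environment"
}

check_tf
```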
