[ALGO-935] Create Tensorflow 2.4 Python 3.8 environment with updated base image [WIP] #188

Closed

aslisabanci

I couldn't test this on deep purple yet, and I'm a bit frustrated with the errors I've been getting so far. I'm opening this PR for your review in case you notice something missing. Any help testing it would also be appreciated and would speed things up.

aslisabanci commented Feb 13, 2021

Trying to validate the environment with this command:
./tools/environment_validator.py -b nvidia/cuda:11.2.0-cudnn8-devel-ubuntu20.04 -g python3 -s python38 -d tensorflow-gpu-2.4 -t dependency -n tensorflow-gpu-2.4 --nvidia-support 1

fails with the following error on deep purple (where we have CUDA 10.2 installed):
docker.errors.APIError: 500 Server Error for http+docker://localhost/v1.40/containers/8914501b9ed4ebd13c13dd2d79c053ae88a6abcf2f17975382d3ee720cae1fea/start: Internal Server Error ("OCI runtime create failed: container_linux.go:349: starting container process caused "process_linux.go:449: container init caused \"process_linux.go:432: running prestart hook 0 caused \\\"error running hook: exit status 1, stdout: , stderr: nvidia-container-cli: requirement error: unsatisfied condition: cuda>=11.2, please update your driver to a newer version, or use an earlier cuda container\\\\n\\\"\"": unknown")

I'm holding off on publishing this to test until we address this.
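
The mismatch in the error above (the image requires cuda>=11.2, while the host driver only supports CUDA 10.2) could be caught before starting a container. A minimal sketch, assuming the version in the image tag equals the CUDA version the image requires and that the host's maximum supported CUDA version is known (for example from the nvidia-smi header). The helper names `image_cuda_version` and `host_supports` are hypothetical and not part of environment_validator.py:

```python
import re


def image_cuda_version(image_tag):
    """Extract (major, minor) from a tag such as
    nvidia/cuda:11.2.0-cudnn8-devel-ubuntu20.04."""
    match = re.search(r":(\d+)\.(\d+)", image_tag)
    if match is None:
        raise ValueError(f"no CUDA version found in {image_tag!r}")
    return int(match.group(1)), int(match.group(2))


def host_supports(image_tag, host_cuda):
    """Return True when the host's maximum supported CUDA version
    is at least the version named in the image tag."""
    return host_cuda >= image_cuda_version(image_tag)


# Assumption: the host value is entered by hand; (10, 2) matches the
# CUDA 10.2 setup described for deep purple in this thread.
HOST_CUDA = (10, 2)
print(host_supports("nvidia/cuda:11.2.0-cudnn8-devel-ubuntu20.04", HOST_CUDA))
```

With the values from this thread, the check reports that the 11.2.0 image is not supported by a host limited to CUDA 10.2, which is consistent with the nvidia-container-cli requirement error.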

@aslisabanci

With the updated CUDA drivers, we can now validate this package with:
./tools/environment_validator.py -b nvidia/cuda:11.0.3-cudnn8-runtime-ubuntu20.04 -g python3 -s python38 -d tensorflow-gpu-2.4 -t dependency -n tensorflow-gpu-2.4 --nvidia-support 1

Things to note above:

aslisabanci changed the title from "[ALGO-935] Create Tensorflow 2.4 Python 3.8 environment with updated base image" to "[ALGO-935] Create Tensorflow 2.4 Python 3.8 environment with updated base image [WIP]" on Feb 16, 2021
@aslisabanci

Deleting the branch and closing the PR, since these changes were already merged into develop from another branch by Daniel.

aslisabanci deleted the ALGO-935 branch on April 19, 2021, 17:10