Use an NVIDIA base image. #3177

Merged
merged 5 commits into from Sep 6, 2019

schmmd commented Aug 20, 2019

This yields a smaller image and reduces the complexity around NVIDIA/CUDA (we simply inherit it from the base image), but it increases the complexity around which version of Python we install: we have to manage that more manually instead of getting it from the base image.

This change would make it easier to run the allennlp Docker image on a GPU because it wouldn't require the CUDA 10.0 drivers to be installed on the host; the image packages the drivers itself. For example, allennlp-server-3 has the 10.1 drivers installed. With the current Docker image we get a CUDNN_STATUS_EXECUTION_FAILED error there, but training works successfully on a GPU with this image.
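
One way to check whether an image hits the cuDNN failure described above is a small GPU smoke test run inside the container. The following is a minimal sketch, not part of this PR: the function name `gpu_smoke_test` and the choice of a tiny LSTM as the cuDNN-backed operation are assumptions made for illustration. It returns a report instead of raising, so it also runs on machines without PyTorch or a GPU.

```python
def gpu_smoke_test():
    """Collect basic CUDA/cuDNN facts and try one small cuDNN-backed op."""
    try:
        import torch
    except ImportError:
        # PyTorch is not installed; nothing else can be checked.
        return {"torch_installed": False}

    report = {
        "torch_installed": True,
        "cuda_available": torch.cuda.is_available(),
        "cuda_version": torch.version.cuda,               # None on CPU-only builds
        "cudnn_version": torch.backends.cudnn.version(),  # None if cuDNN is unavailable
        "cudnn_op_ok": None,                              # stays None when no GPU is visible
    }

    if report["cuda_available"]:
        try:
            # A tiny LSTM forward pass on the GPU (illustrative choice of op).
            lstm = torch.nn.LSTM(input_size=8, hidden_size=8).cuda()
            lstm(torch.randn(4, 2, 8, device="cuda"))
            torch.cuda.synchronize()  # surface asynchronous CUDA errors here
            report["cudnn_op_ok"] = True
        except RuntimeError as err:
            report["cudnn_op_ok"] = False
            report["error"] = str(err)
    return report


print(gpu_smoke_test())
```

Running this in both the current image and the new one on the same host would show whether `cudnn_op_ok` differs between them.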

allennlp                                                  nvidia-base-18.04                          b8b191a7a08d        2 hours ago         3.66GB
allennlp/allennlp                                         v0.8.4                                     ed8b963a800a        2 months ago        3.93GB
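
For a quick sense of the size difference in the listing above, the arithmetic can be sketched as follows. The helper name `parse_size_gb` is hypothetical, and it only handles sizes written with a `GB` suffix, as in the two rows shown.

```python
def parse_size_gb(size: str) -> float:
    """Parse a size string such as '3.66GB' into gigabytes (GB suffix only)."""
    if not size.endswith("GB"):
        raise ValueError(f"unsupported size format: {size!r}")
    return float(size[:-2])


new_gb = parse_size_gb("3.66GB")  # allennlp:nvidia-base-18.04
old_gb = parse_size_gb("3.93GB")  # allennlp/allennlp:v0.8.4

saved_gb = round(old_gb - new_gb, 2)
saved_pct = round(100 * (old_gb - new_gb) / old_gb, 1)
print(f"{saved_gb} GB smaller ({saved_pct}% reduction)")  # 0.27 GB smaller (6.9% reduction)
```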
@schmmd schmmd requested a review from joelgrus Aug 20, 2019

schmmd commented Aug 20, 2019

Another way I could have set this up is to rely more on conda, using it to manage both the PyTorch and CUDA versions.

I could easily apply the same change to the Dockerfile we use for CI, if desired.

build-essential \
cmake \
git \
python3-dev \

joelgrus Aug 20, 2019

Collaborator

can we be more specific about what version of python to install? off the top of my head I have no idea what version this would result in

schmmd Aug 20, 2019

Author Member

I'd have to use Conda (which is an option). I previously looked into that.

# python --version
Python 3.6.8

schmmd Aug 21, 2019

Author Member

I pushed a commit that forces 3.6 at least.
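
A pin like this can also be checked at runtime, for example in a container entrypoint or a CI step. The sketch below is an illustration only; `is_pinned_python` is a hypothetical helper, not something this PR adds.

```python
import sys


def is_pinned_python(version_info=None, pinned=(3, 6)) -> bool:
    """Return True if the interpreter's major.minor matches the pinned pair."""
    if version_info is None:
        version_info = sys.version_info
    return tuple(version_info[:2]) == tuple(pinned)


print(is_pinned_python((3, 6, 8)))  # True: the version reported earlier in the thread
print(is_pinned_python((3, 7, 4)))  # False: a 3.7 interpreter would not match
```

Comparing only major and minor keeps the check tolerant of patch releases, which matches pinning to 3.6.x rather than to one exact build.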

schmmd commented Aug 22, 2019

@joelgrus does this look good or should we try Conda? We're now pinning to 3.6.x.

joelgrus left a comment

it looks good, at some point we should consider standardizing on 3.7, but that's a bigger discussion

schmmd and others added 2 commits Aug 22, 2019
@schmmd schmmd merged commit b1caa9e into master Sep 6, 2019
3 checks passed
Pull Requests (AllenNLP Library) TeamCity build finished
codecov/patch Coverage not affected when comparing 155a94e...5a3fde3
codecov/project 92% remains the same compared to 155a94e
schmmd added a commit that referenced this pull request Sep 6, 2019
This reverts commit b1caa9e.
schmmd added a commit that referenced this pull request Sep 6, 2019
This reverts commit b1caa9e.
@schmmd schmmd mentioned this pull request Sep 10, 2019
reiyw added a commit to reiyw/allennlp that referenced this pull request Nov 12, 2019
* Remove some environment variables that are set in the NVIDIA base image.

* Use the nvidia base image in Dockerfile.pip.

* Use Python 3.6.
reiyw added a commit to reiyw/allennlp that referenced this pull request Nov 12, 2019