Add cirun.io GPU runners. #9

Merged on Jun 30, 2021 (32 commits)

hodgestar
Contributor

This is a trial run of using cirun.io self-hosted runners to run tests on a real GPU.

@coveralls

coveralls commented Jun 24, 2021

Pull Request Test Coverage Report for Build 983112386

Warning: This coverage report may be inaccurate.

This pull request's base commit is no longer the HEAD commit of its target branch. This means it includes changes from outside the original pull request, including, potentially, unrelated coverage changes.

Details

  • 0 of 0 changed or added relevant lines in 0 files are covered.
  • No unchanged relevant lines lost coverage.
  • Overall coverage remained the same at 100.0%

Totals Coverage Status
Change from base Build 944397557: 0.0%
Covered Lines: 4
Relevant Lines: 4

💛 - Coveralls

@hodgestar
Contributor Author

@MrRobot2211 @jakelishman The self-host GPU runner appears to be working! Ready for review.

There is one minor issue I'm aware of -- coveralls is failing to find the git commit information to associate with the coverage upload. This happens because the NVidia AMI is based on Ubuntu 18.04 which comes with git 2.17. GitHub requires git >= 2.18 to do its fancy "single commit checkout" and falls back to creating just a tarball snapshot of the repository which doesn't contain the commit information.

We could try to fix this by adding a step that does some version of:

sudo add-apt-repository ppa:git-core/ppa -y
sudo apt-get update
sudo apt-get install git -y
git --version

but I think that should either happen in another PR, or we should solve the issue by seeing if we can get an NVidia AMI based on Ubuntu 20.04.
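A minimal sketch of how such a step could be guarded so the upgrade only runs when the installed git is older than 2.18. The helper name `version_lt` is hypothetical, and the comparison assumes GNU `sort -V` is available on the runner image; neither is part of this PR:

```shell
#!/bin/sh
# Hypothetical helper: succeeds when version $1 is strictly older than version $2.
# Assumes GNU `sort -V` (version sort) is available on the runner.
version_lt() {
    [ "$1" != "$2" ] && \
        [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$1" ]
}

# `git --version` prints e.g. "git version 2.17.1"; take the third field.
installed="$(git --version 2>/dev/null | awk '{print $3}')"

if [ -z "$installed" ] || version_lt "$installed" "2.18"; then
    echo "git ${installed:-(missing)} is older than 2.18; upgrade needed"
    # The PPA upgrade commands quoted above would go here.
else
    echo "git $installed is new enough"
fi
```

Guarding the step this way would keep it a no-op on images that already ship a recent git, such as an Ubuntu 20.04 based AMI.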

@hodgestar
Contributor Author

@aktech Thanks for all your help. Looks like it's working nicely!

Do you perhaps know if there is an AMI we can use that gives us the NVidia drivers and Ubuntu 20.04?

@MrRobot2211
Collaborator

> @aktech Thanks for all your help. Looks like it's working nicely!
>
> Do you perhaps know if there is an AMI we can use that gives us the NVidia drivers and Ubuntu 20.04?

This one seems to have what we need ami-0887cec38e5c0ab84

@hodgestar
Contributor Author

> This one seems to have what we need ami-0887cec38e5c0ab84

@MrRobot2211 The last test run seems to have the same git version issue as before. What is the AMI you've switched to and do you have a link to a description of what is in it?

@MrRobot2211
Collaborator

MrRobot2211 commented Jun 30, 2021

> > This one seems to have what we need ami-0887cec38e5c0ab84
>
> @MrRobot2211 The last test run seems to have the same git version issue as before. What is the AMI you've switched to and do you have a link to a description of what is in it?

Sorry for that.

The description of this one is:
Deep Learning AMI GPU CUDA 11.2.1 (Ubuntu 20.04) 20210625 - ami-0887cec38e5c0ab84
Built with NVIDIA CUDA, cuDNN, NCCL, GPU Driver, Docker, NVIDIA-Docker and EFA support. For a fully managed experience, check: https://aws.amazon.com/sagemaker
Root device type: ebs Virtualization type: hvm ENA Enabled: Yes

The links seem to be stateless, so you will have to enter here and then search for "ubuntu 20.04 nvidia". There are some more options, but most of them come with some DL libraries pre-installed.

@hodgestar
Contributor Author

@MrRobot2211 Ah, that AMI is for us-east-2. I am trying the same AMI for eu-west-1.
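Because AMI IDs are region-specific, the matching image in another region can be looked up by name. A sketch, assuming the AWS CLI is installed and configured, that the image is published under the `amazon` owner alias, and that it carries the same name in every region (none of which is confirmed in this thread). `find_ami` and the `AWS_CLI` override are hypothetical names:

```shell
# Hypothetical helper: print the AMI ID(s) in region $1 whose name equals $2.
# AWS_CLI can be set to point at a different executable; it defaults to `aws`.
find_ami() {
    region="$1"
    name="$2"
    "${AWS_CLI:-aws}" ec2 describe-images \
        --region "$region" \
        --owners amazon \
        --filters "Name=name,Values=$name" \
        --query 'Images[].ImageId' \
        --output text
}

# Example call (requires AWS credentials, so it is left commented out):
# find_ami eu-west-1 "Deep Learning AMI GPU CUDA 11.2.1 (Ubuntu 20.04) 20210625"
```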

@hodgestar
Contributor Author

@MrRobot2211 Nice! It looks like the eu-west-1 version of the AMI you suggested sorted the problem out. If you're happy with this PR now, I will merge it and we can start using it.
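One way to confirm this on a runner is to check that the checkout actually carries commit metadata rather than being a bare snapshot. A small sketch; the helper name `has_commit_info` is hypothetical, and `GITHUB_WORKSPACE` is assumed to point at the checked-out repository:

```shell
# Hypothetical helper: succeeds when directory $1 (default: current directory)
# is a git checkout with a resolvable HEAD commit.
has_commit_info() {
    git -C "${1:-.}" rev-parse --verify --quiet HEAD >/dev/null 2>&1
}

workspace="${GITHUB_WORKSPACE:-.}"
if has_commit_info "$workspace"; then
    echo "commit: $(git -C "$workspace" rev-parse HEAD)"
else
    echo "no commit metadata found in $workspace"
fi
```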

@aktech

aktech commented Jun 30, 2021

Nice one guys in finding out the right AMI for Ubuntu 20! I'll add it to the docs in here.


5 participants