
Add unittest job to CircleCI #2328

Merged
merged 1 commit into master from add_win_ci_unittest on Jun 30, 2020

Conversation

Contributor

@guyang3532 guyang3532 commented Jun 18, 2020

Remove the unit tests from the conda build, and add them as independent unittest jobs.
This way, unit tests are no longer triggered at every conda build, and they run in an independent test environment.
test at #2332
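
For context, a minimal sketch of what one of these standalone unittest jobs could look like in the CircleCI config (the job name and image match the hunks reviewed below, but the step names and script paths are illustrative assumptions, not necessarily the exact config in this PR):

  unittest_linux_cpu:
    <<: *binary_common
    docker:
      - image: "pytorch/manylinux-cuda102"
    steps:
      - checkout
      - run:
          name: Setup test environment
          # path is an assumption for illustration
          command: .circleci/unittest/linux/scripts/setup_env.sh
      - run:
          name: Run unit tests
          # path is an assumption for illustration
          command: .circleci/unittest/linux/scripts/run_test.sh
      - store_test_results:
          path: test-results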

@guyang3532 guyang3532 force-pushed the add_win_ci_unittest branch 7 times, most recently from fc8b338 to 374d0c6 on June 18, 2020 07:49
@guyang3532 guyang3532 changed the title Add unittest to Windows CI Add unittest job to CircleCI Jun 18, 2020
Contributor

@peterjc123 peterjc123 left a comment

Generally LGTM, please remember to remove the legacy Windows build jobs and their job spec.

@codecov

codecov bot commented Jun 18, 2020

Codecov Report

Merging #2328 into master will increase coverage by 2.24%.
The diff coverage is n/a.

Impacted file tree graph

@@            Coverage Diff             @@
##           master    #2328      +/-   ##
==========================================
+ Coverage   68.49%   70.73%   +2.24%     
==========================================
  Files          93       93              
  Lines        7655     7706      +51     
  Branches     1177     1177              
==========================================
+ Hits         5243     5451     +208     
+ Misses       2075     1903     -172     
- Partials      337      352      +15     
Impacted Files Coverage Δ
torchvision/models/detection/transform.py 95.55% <0.00%> (+0.02%) ⬆️
torchvision/ops/poolers.py 97.02% <0.00%> (+0.02%) ⬆️
torchvision/models/inception.py 84.64% <0.00%> (+0.05%) ⬆️
torchvision/models/googlenet.py 82.32% <0.00%> (+0.09%) ⬆️
torchvision/models/detection/roi_heads.py 82.34% <0.00%> (+0.11%) ⬆️
torchvision/models/detection/generalized_rcnn.py 88.33% <0.00%> (+0.19%) ⬆️
torchvision/models/detection/_utils.py 67.34% <0.00%> (+0.22%) ⬆️
torchvision/datasets/imagenet.py 82.45% <0.00%> (+0.31%) ⬆️
torchvision/ops/boxes.py 84.44% <0.00%> (+0.35%) ⬆️
torchvision/models/densenet.py 82.81% <0.00%> (+0.41%) ⬆️
... and 16 more

Continue to review full report at Codecov.

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 446eac6...80c34fd.

@guyang3532
Contributor Author

guyang3532 commented Jun 18, 2020

Generally LGTM, please remember to remove the legacy Windows build jobs and their job spec.

Thank you, they have been removed.

Member

@fmassa fmassa left a comment

Thanks for the PR!

I think that we might now be missing our single CUDA CI that was effectively running CUDA tests, which was torchvision_linux_py3.8_cu102_cuda. So we would need to add back a CUDA CI test before we can merge this, otherwise we won't be able to check for CUDA correctness (now it only compiles torchvision, but doesn't run any tests).

@seemethere could you also have a look?

conda activate ./env

python -m torch.utils.collect_env
pytest --cov=torchaudio --junitxml=test-results/junit.xml -v --durations 20 test
Member

nit: torchaudio
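
Presumably the fix is just to swap the coverage target; a sketch, assuming the package under test here is torchvision:

pytest --cov=torchvision --junitxml=test-results/junit.xml -v --durations 20 test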

# This script is for setting up environment in which unit test is ran.
# To speed up the CI time, the resulting environment is cached.
#
# Do not install PyTorch and torchaudio here, otherwise they also get cached.
Member

nit: torchaudio

# This script is for setting up environment in which unit test is ran.
# To speed up the CI time, the resulting environment is cached.
#
# Do not install PyTorch and torchaudio here, otherwise they also get cached.
Member

nit: torchaudio

unittest_linux_cpu:
<<: *binary_common
docker:
- image: "pytorch/manylinux-cuda102"
Contributor

Is there a reason why a CUDA image is used to run CPU tests?

Contributor

When I added the original version of this to torchaudio, I could not find a CPU version. If you know an appropriate image for CPU, let me know and I will also update it on the torchaudio side.
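
For illustration, switching would only mean pointing the CPU job at a different image; the tag below is a hypothetical placeholder, not a verified published image:

  unittest_linux_cpu:
    <<: *binary_common
    docker:
      # hypothetical CPU-only image name; the PR keeps pytorch/manylinux-cuda102 here
      - image: "pytorch/manylinux-cpu"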

image: ubuntu-1604-cuda-10.1:201909-23
resource_class: gpu.small
environment:
image_name: "pytorch/manylinux-cuda101"
Contributor

Is there a reason why you prefer CUDA 10.1 over CUDA 10.2?

executor:
name: windows-gpu
environment:
CUDA_VERSION: "10.1"
Contributor

Same as above: why CUDA 10.1 rather than CUDA 10.2?

Contributor

We used CUDA 10.1 before this change just because it is already installed in the image. If you want, we can switch to CUDA 10.2 in a follow-up PR.

<<: *binary_common
machine:
image: ubuntu-1604-cuda-10.1:201909-23
resource_class: gpu.small
Member

Aren't this job and the unittest_linux_cpu basically the same job?

Should we just consolidate them and then add an extra parameter to designate resource_class?

Contributor

@seemethere When I worked on this, I could not find a Docker runner with GPU. My understanding is that for GPU, CircleCI gives a virtual machine, so we need to run the same script inside a container via docker run, whereas for non-GPU jobs CircleCI gives a Docker container directly.
Let me know if you know a way to request a GPU Docker environment (not a VM).
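
A rough sketch of the pattern being described (a machine executor with GPU that runs the test script inside a container via docker run); the mount path, script path, and GPU flag are assumptions for illustration:

  unittest_linux_gpu:
    <<: *binary_common
    machine:
      image: ubuntu-1604-cuda-10.1:201909-23
    resource_class: gpu.small
    steps:
      - checkout
      - run:
          name: Run unit tests inside a CUDA container
          command: |
            # --gpus all assumes a Docker version with built-in GPU support;
            # older machine images may need nvidia-docker instead.
            docker run --gpus all -v "$PWD:/app" -w /app \
              "${image_name}" .circleci/unittest/linux/scripts/run_test.sh

In contrast, the CPU job can run directly in a Docker executor, so the two jobs cannot simply share one executor definition.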

path: test-results

unittest_windows_gpu:
<<: *binary_common
Member

Same comment here about consolidating build jobs

@guyang3532 guyang3532 force-pushed the add_win_ci_unittest branch 3 times, most recently from 61ef663 to 87eff97 on June 22, 2020 07:21
@peterjc123
Contributor

Thanks for the PR!

I think that we might now be missing our single CUDA CI that was effectively running CUDA tests, which was torchvision_linux_py3.8_cu102_cuda. So we would need to add back a CUDA CI test before we can merge this, otherwise we won't be able to check for CUDA correctness (now it only compiles torchvision, but doesn't run any tests).

@seemethere could you also have a look?

Do you mean to run unit tests with CUDA? There is already one named unittest_linux_gpu that is testing with CUDA 10.1. Or do you just want to add one for CUDA 10.2? I don't think we should run tests through conda-build anymore. It's just a waste of time.

@fmassa
Member

fmassa commented Jun 22, 2020

@peterjc123

There is already one named unittest_linux_gpu that is testing with CUDA 10.1.

I might be missing something, but I believe the CI is not running any test named unittest_*_gpu?

@peterjc123
Contributor

peterjc123 commented Jun 22, 2020

@fmassa They run only on master. Maybe we should make at least one of them run in PRs, right?

@fmassa
Member

fmassa commented Jun 22, 2020

They run only on master. Maybe we should make at least one of them run in PRs, right?

Yes, they should run in PRs as well. I see that it has been updated now, thanks!

@fmassa
Member

fmassa commented Jun 22, 2020

Maybe we don't need all CUDA tests running on every PR (would be expensive?), but only a small subset -- before we only had one CI running CUDA tests on Linux and Windows, maybe we could keep the same here?

cc @seemethere for thoughts

@guyang3532
Contributor Author

Maybe we don't need all CUDA tests running on every PR (would be expensive?), but only a small subset -- before we only had one CI running CUDA tests on Linux and Windows, maybe we could keep the same here?

cc @seemethere for thoughts

What if we only put unittest_gpu_py3.8 on PRs and the others on master?

@fmassa
Member

fmassa commented Jun 23, 2020

@guyang3532

What if we only put unittest_gpu_py3.8 on PRs and the others on master?

That sounds good to me, although I would also put unittest_windows_gpu_py3.8 on every PR as well.
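
A sketch of how this split could be expressed with CircleCI branch filters, where jobs without a filter run on every PR and the rest are restricted to master (the exact job names and Python versions here are assumptions following the naming discussed above):

  workflows:
    unittest:
      jobs:
        # run on every PR
        - unittest_linux_gpu:
            name: unittest_linux_gpu_py3.8
            python_version: "3.8"
        - unittest_windows_gpu:
            name: unittest_windows_gpu_py3.8
            python_version: "3.8"
        # run only on master
        - unittest_linux_gpu:
            name: unittest_linux_gpu_py3.6
            python_version: "3.6"
            filters:
              branches:
                only: master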

Member

@fmassa fmassa left a comment

Changes look good to me, thanks a lot!

I think we can clean up one more build which is not needed anymore, but otherwise this seems OK to me.

I'll let @seemethere merge this one, as he is maintaining CI and releases for torchvision.

Comment on lines 518 to 521
- binary_linux_conda_cuda:
name: torchvision_linux_py3.8_cu102_cuda
python_version: "3.8"
cu_version: "cu102"
Member

I believe this build is not needed anymore, and I think the corresponding binary_linux_conda_cuda job definition is not needed anymore either.
