Mark tests that should only be run nightly. #8689
Mark tests that should only be run nightly. #8689
Conversation
|
||
|
||
# This test takes several minutes to run, marking as nightly. | ||
@attr('nightly') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can't move the sparse tests to nightly. Try reduce the array dimensions.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will do...
@@ -1266,6 +1267,7 @@ def test_bneq(a, b): | |||
test_bneq(a, b) | |||
|
|||
|
|||
@attr('nightly') | |||
def test_broadcast_binary_op(): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This needs to stay in unittests
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will do...
I understand the motivation for moving tests to nightly. The concern I still have is if we disable certain tests on PR-level and only run them nightly, the problems in code may not surface in PR as they should. One might submit a patch for optimization or fix on existing op which actually contains bug, whose tests only run nightly. Reviewers might assume that the tests are run by the CI while in fact they are not. In general, removing tests from CI in PR is equivalent of removing the guarantee that the particular functionality still works after the PR. Admittedly some tests, including some of the tests I wrote, may be simple fuzzy tests with randomized inputs, that should have been designed as small fixtures to catch specific edge cases. That way, the each of the tests won't run for longer than seconds which is ideal. I agree we should do better at writing tests. Given the current situation in the test cases, it will be major endeavor moving from where we are to where the ideal world is. Given the constraint that we shouldn't be breaking mxnet in doing so, I think we should only replace existing long-running tests with carefully designed test cases instead of disabling them. |
@szha For what it's worth I agree with all of your points. I hope we can work together on updating tests soon (I plan to try and work on this full time). |
@KellenSunderland thanks. The information you collected in this PR while analyzing the tests are valuable. It would be great if we could have a summary and a checklist of the test problems in an issue, so that everyone can pitch in on helping and see the progress. |
fcbfca8
to
807e965
Compare
@piiswrong updated. |
@@ -3365,6 +3368,7 @@ def test_log_softmax(): | |||
check_numeric_gradient(sym, [data], rtol=0.05, atol=1e-3) | |||
|
|||
|
|||
@attr('nightly') | |||
def test_pick(): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
please leave this test as is. it's an operator that's not in contrib, so we shouldn't leave any chance of breaking it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Roger.
@@ -18,6 +18,7 @@ | |||
import os | |||
import mxnet as mx | |||
import numpy as np | |||
from nose.plugins.attrib import attr |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
unused import
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ty
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually attr is used on line 192 right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
oops, wrong file. I meant for the sparse tests.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Gotcha. Updated.
807e965
to
6cb6e16
Compare
@szha updated. |
6cb6e16
to
4d065b5
Compare
FYI gpu test runtimes are: 14m20.971s on a p3.2xlarge. |
@KellenSunderland please rebase and resolve conflict. |
4d065b5
to
31c066c
Compare
Setup Jenkins to ignore slow tests during PR builds. Also marking one crashing test. Github issue has been raised.
31c066c
to
7657c37
Compare
closing due to outdated |
Sorry for some reason I can't re-open #8554. Here's a new PR.
As discussed here in a few threads, segmenting tests will go a long way in stabilizing our CI. This PR is a WIP that aims to remove some of the most problematic tests. All tests ignored here either crashed on a P3 instance with the release version (r 0.12) of mxnet from the deep learning AMI, or they took longer to run than a minute. With these tests removed all tests were running in less than 2 minutes on a P3.
Tests are removed with test annotations (aka decorators). Initially I've only used @nightly to indicate a test that would best be run on a nightly basis and @crashing to indicate a crashing test. Once we've decorated tests with annotations we can selectively choose to run them by passing the -a arg to nose (as shown in the new Jenkinsfile).
Also made a few changes to speed up GPU builds on CI.
Checklist
Essentials
make lint
)