
Deciding for a testing policy #67

Closed
dnouri opened this issue Dec 13, 2014 · 12 comments

dnouri commented Dec 13, 2014

Discussion started in #66:

@dnouri said: "At some point we should make a decision about whether or not we're committing to tests. If we want tests, then pull requests that add functionality should add tests, too. get_output is (was) thoroughly tested, thus adding a test for this extra feature is trivial."

@benanne responded: "On the one hand tests are obviously a great thing to have - on the other hand they may create a bit of a barrier for potential contributors. I think it might actually be a good idea to require tests for modifications to core functionality like this one, but if someone decides to add a new layer class or something, maybe we shouldn't be so strict. We should probably flesh out an official policy, maybe we can create a separate issue for it."

Let's continue the discussion here.

dnouri commented Dec 13, 2014

Why should our policy differ from Theano's? They seem to have plenty of contributors.

@eickenberg

If I may chime in: in my experience with experienced/productive developers, the absence of testing can be a much bigger barrier than strictness. If you have full unit tests, full regression tests for bugfixes, and rudimentary smoke tests that cover common use cases (by verifying that they run through), then you can be pretty sure you won't break anything existing simply by getting all the tests to pass in your contribution branch. If this is guaranteed, it gives a feeling of safety that is worth quite a lot, IMO.

By its very nature, this package consists of building blocks that are easy to unit test, so it shouldn't really take that much effort to do it properly. Often one can also compare against a numpy/scipy implementation of the same functionality (see the rough sketch below).
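(As a concrete illustration of the numpy-comparison idea, a minimal test could look roughly like this sketch. It assumes nntools.nonlinearities provides a rectify function that operates on Theano expressions; if the names differ, any other small building block can be tested the same way.)

import numpy as np
import theano
import theano.tensor as T

from nntools import nonlinearities


def test_rectify_matches_numpy():
    # Wrap the building block under test in a small Theano function ...
    x = T.matrix('x')
    f = theano.function([x], nonlinearities.rectify(x))
    # ... and compare its output against a plain numpy reference.
    data = np.random.randn(5, 4).astype(theano.config.floatX)
    np.testing.assert_allclose(f(data), np.maximum(data, 0), rtol=1e-6)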

Also, as an increasingly regular user (and contributor, when I find something I can be useful for) of this toolbox, I like being able to rely on it working, and testably so ...

In any case, I'd be very, very wary of potential code rot and the broken-window effect: if you decide now to be relaxed about testing, it will be very difficult to change policies later on.

It may also be a point of distinction from other Python DL packages that are cropping up here and there: being able to say yes, we do all that, but we also have tests.

For new contributors this may also be a great opportunity to learn good practice ...

Following up on @dnouri's comment: there doesn't seem to be a problem with contributor numbers in sklearn either.

benanne commented Dec 13, 2014

All good points. Then I guess we need some kind of code sprint to get the current code base up to speed with tests. We can't expect contributors to provide tests if we don't do it ourselves :) And I guess the same thing goes for documentation, actually.

dnouri commented Dec 13, 2014

Cool. Yes, let's try to improve that test and docs coverage. I've been falling behind a little bit lately. But I think in the meantime we can ask contributors to add tests for new stuff where tests for that class already exist, and update docs where they're already written.

f0k commented Dec 13, 2014

we can ask contributors to [...] update docs where they're already written.

Yes, I totally agree with that. I cannot stand outdated docstrings, and I feel bad when modifying undocumented code. I will try to develop the same strong feelings for tests.

About the "this might put up a barrier for new contributors" argument: We don't have to force contributors to add all the tests and docstrings that come to our mind when reading a PR, we can ask them nicely, and do them ourselves if they're not able to comply and the contribution is worth it. There's a nice article about that: http://blog.spreedly.com/2014/06/24/merge-pull-request-considered-harmful (the first section). I guess that could be a good compromise between being overly strict with contributors and inviting the broken window effect.

benanne commented Dec 13, 2014

Very good read! I agree that that's a good compromise.

@erikbern

To chime in with my experience from maintaining Luigi: I think requiring tests actually lowers the barrier to contribution, since the tests provide a safety guarantee. A ratio of roughly 1:1 between test code and production code in terms of line count is generally a good goal.

dnouri commented Dec 15, 2014

Regarding the 1:1 ratio: There's also Python's nice coverage module, which can be easily used with py.test. It shows you which lines of your production code weren't executed by any of the tests.
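(For reference, a rough sketch of how such a run could be invoked, assuming the pytest-cov plugin is installed; the exact flags may differ if the coverage tool is used directly:)

py.test --cov=nntools --cov-report=term-missing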

benanne commented Dec 15, 2014

Sounds like this could be useful for tracking our progress writing tests for the base library. Can it be integrated into our current Travis setup?

f0k commented Dec 15, 2014

There's Coveralls, which can be integrated into Travis CI; scikit-learn uses it, for example.
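(For reference, a rough sketch of what such a Travis integration could look like, assuming the coveralls Python package and pytest-cov; the package names and steps here are illustrative, not our actual configuration:)

# .travis.yml (sketch)
install:
  - pip install pytest pytest-cov coveralls
script:
  - py.test --cov=nntools
after_success:
  - coveralls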

dnouri commented Dec 15, 2014

coveralls looks cool, never tried it. I just checked in something that prints out a coverage report at the end of every test run. Right now it looks pretty messy. ;-)

--------------- coverage: platform linux2, python 2.7.6-final-0 ----------------
Name                                 Stmts   Miss  Cover   Missing
------------------------------------------------------------------
nntools/__init__                         7      0   100%   
nntools/init                            49      2    96%   17, 44
nntools/layers/__init__                  3      0   100%   
nntools/layers/base                    432    212    51%   59, 337, 366, 371, 391, 394, 407-408, 429, 500-520, 523-524, 527, 530, 533-542, 547-570, 577-597, 600-601, 604, 607, 610-622, 627-651, 660-662, 665-674, 677, 697-704, 708-710, 713-724, 741-747, 751-773, 787-806, 809, 812, 815, 818-832, 840-841, 844, 847, 854, 857, 864-867, 870-877, 880, 893-896, 933-936, 940-942
nntools/layers/corrmm                   68     51    25%   26-65, 68-69, 72, 75, 78-82, 85-100
nntools/layers/cuda_convnet            174    140    20%   25-80, 83-88, 91, 94, 97-110, 113-138, 143-164, 167-182, 185-194, 209, 212, 223, 226, 241-260, 263, 266, 269, 272-284
nntools/nonlinearities                  10      2    80%   22, 26
nntools/objectives                      15      0   100%   
nntools/regularization                   8      4    50%   8-13
nntools/tests/conftest                   6      0   100%   
nntools/tests/test_examples             24      9    63%   22-23, 29-36
nntools/tests/test_init                 39      0   100%   
nntools/tests/test_layers              227      0   100%   
nntools/tests/test_objectives           36      0   100%   
nntools/theano_extensions/__init__       0      0   100%   
nntools/theano_extensions/conv         113    104     8%   17-35, 42-56, 63-77, 86-116, 123-170, 177-211
nntools/theano_extensions/padding       20     16    20%   16-38
nntools/updates                         60     51    15%   12-18, 22-31, 37-48, 56-65, 76-85, 97-115
nntools/utils                           45      9    80%   20-21, 35-36, 48-51, 98, 108
------------------------------------------------------------------
TOTAL                                 1336    600    55%   

benanne added this to the First release milestone on Dec 18, 2014

benanne commented Jan 24, 2015

I think our policy has been pretty much decided: we want pull requests to come with tests wherever possible (and if they don't, we should encourage the submitter to add them, or add them ourselves before merging). So I'll close this issue.

The first step to implementing this policy is getting our test coverage up to an acceptable level, so contributors will be encouraged to write tests themselves as well. I made a new issue for this: #112.

benanne closed this as completed on Jan 24, 2015