Test improvements #17

Merged
merged 7 commits into from Jan 12, 2012

2 participants

@kmike
Natural Language Toolkit member

In order to run tests across multiple python interpreters it is very convenient to use the http://tox.testrun.org/ tool. I've added a tox config and a TESTING.rst file with the draft info about how to run tests.

Some doctests are also fixed (there are still a lot of them failing though).

@kmike
Natural Language Toolkit member

just discovered a better way to fix the doctests, please delay merging

@stevenbird
Natural Language Toolkit member

OK. Thanks for the tox config. Please note that some doctests are user documentation and not really regression tests, and will eventually need to live in a docstring.

chunkscore.recall()
0.33333333333333331

In the docstring we permit ellipses, and can write the above result instead as 0.33333...

However, it's a higher priority to get all our tests working where they are, so using float_equals() is fine.

@kmike
Natural Language Toolkit member

Yeah, but this may be 0.49999999999 vs 0.5 where ellipses won't help
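
To make the two cases concrete, here is a minimal sketch using modern Python 3. The functions one_third() and nearly_half() are hypothetical stand-ins, not NLTK code, and math.isclose is used only as an assumed substitute for a tolerance helper such as float_equals(), whose actual signature is not shown in this thread.

```python
import doctest
import math


def one_third():
    """Hypothetical stand-in for chunkscore.recall(); not NLTK code.

    With ELLIPSIS enabled, the trailing digits can be elided:

    >>> one_third()  # doctest: +ELLIPSIS
    0.33333...
    """
    return 1.0 / 3.0


def nearly_half():
    """Return a value just below 0.5, as rounding error might produce.

    An expected output of "0.5..." cannot match a printed 0.49999999999,
    so the doctest compares numerically with a tolerance instead:

    >>> math.isclose(nearly_half(), 0.5, abs_tol=1e-9)
    True
    """
    return 0.49999999999


def run_docstring(func):
    """Run the doctest examples in func's docstring; return the results."""
    parser = doctest.DocTestParser()
    runner = doctest.DocTestRunner(verbose=False)
    # Give the examples access to this module's names (math, the functions).
    test = parser.get_doctest(func.__doc__, dict(globals()), func.__name__, None, 0)
    runner.run(test)
    return runner.summarize(verbose=False)


if __name__ == "__main__":
    print(run_docstring(one_third))
    print(run_docstring(nearly_half))
```

The ellipsis only absorbs extra trailing characters, so it handles 0.333... but cannot reconcile two different leading digits; the tolerance check covers both.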

kmike added some commits Jan 12, 2012
@kmike kmike Doctests in docstrings are executed by tox via sphinx 1ecad43
@kmike kmike A lot of doctests are fixed (or skipped), a couple of imports are fixed.
Some nltk.align tests are marked as +SKIP because they consume huge amount of memory;
AnnotationTask doctest is marked as +SKIP because __file__ doesn't work in doctests;
some other tests are marked as +SKIP because they require external software;

RegexpTokenizer is fixed (finditer should be re.finditer);
nltk.metrics.confusionmatrix is fixed (FreqDist import was missing).
0542ec4
@kmike kmike numpy is also seems to be a requirement for tests 372059b
@kmike kmike no, numpy is not a requirement 9402f34
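
For readers unfamiliar with the +SKIP marker mentioned in the commit messages above, the following is a small illustrative sketch of how the standard doctest directive behaves. The function needs_external_tool() and the SAMPLE text are hypothetical and do not come from the NLTK test suite.

```python
import doctest

calls = []  # records whether the skipped example was ever executed


def needs_external_tool():
    """Hypothetical example that would fail without external software."""
    calls.append("ran")
    raise RuntimeError("external software is not installed")


# A doctest text with one ordinary example and one marked +SKIP.
SAMPLE = """
>>> 1 + 1
2
>>> needs_external_tool()  # doctest: +SKIP
'some output'
"""


def run_sample():
    """Run SAMPLE and return the doctest results."""
    parser = doctest.DocTestParser()
    runner = doctest.DocTestRunner(verbose=False)
    globs = {"needs_external_tool": needs_external_tool}
    test = parser.get_doctest(SAMPLE, globs, "sample", None, 0)
    runner.run(test)
    return runner.summarize(verbose=False)


if __name__ == "__main__":
    print(run_sample())
    print("skipped example executed:", bool(calls))
```

The skipped example stays visible as documentation but is never executed, so it cannot fail on machines that lack the external dependency.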
@stevenbird
Natural Language Toolkit member

Ready to merge?

NB I intend to tweak docstring imports of the form "from nltk.x.y import z" -> "from nltk.x import z" wherever possible. In general, x/__init__.py explicitly imports all names from x/y.py that are to be made "public".

@kmike
Natural Language Toolkit member

Yes, I think this is ready and can be merged. There are still a lot of tests failing, but the number has been reduced by a good amount.

The missing numpy causes 8 test failures, but numpy is not available for pypy (that's why python 2.7 has 13 failed doctests and pypy has 21). We can either live with this or somehow tweak the tox.ini.
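
One possible way to "live with this" on the Python side, sketched under assumptions: filter out doctest files whose optional dependencies are missing before running them. The helper names and the file names below are hypothetical and are not part of this pull request or of tox.

```python
import importlib.util


def module_available(name):
    """Return True if a top-level module can be found without importing it."""
    return importlib.util.find_spec(name) is not None


def runnable_doctests(requirements):
    """Return the doctest files whose required modules are all available.

    `requirements` maps a doctest file name to the list of top-level
    modules it needs; both the mapping and the file names are hypothetical.
    """
    return sorted(
        path
        for path, modules in requirements.items()
        if all(module_available(m) for m in modules)
    )


if __name__ == "__main__":
    # Hypothetical mapping; the real set of numpy-dependent tests is not
    # listed in this thread.
    wanted = {
        "tokenize.doctest": [],
        "cluster.doctest": ["numpy"],
    }
    print(runnable_doctests(wanted))
```

On an interpreter without numpy, the numpy-dependent file would simply be left out of the run rather than being counted as a failure.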

@stevenbird stevenbird merged commit e2aea09 into nltk:master Jan 12, 2012