Skip to content

Improve test coverage by moving self-tests #820

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
peterjc opened this issue May 11, 2016 · 65 comments
Open

Improve test coverage by moving self-tests #820

peterjc opened this issue May 11, 2016 · 65 comments
Labels
good first issue This should be an easy fix, suitable for beginners help wanted

Comments

@peterjc
Copy link
Member

peterjc commented May 11, 2016

Many of the Python files in Biopython still have self-tests at the end which are executed if the file is run directly via the idiom:

if __name__ == "__main__":
    print("Running a quick self-test")
    ...

This is bad for several reasons - these tests are not being run routinely via our buildbot [Update: We stopped using buildbot, but now have AppVeyor instead] or TravisCI (and therefore if the test breaks, we don't know about it), they are not counted on the coverage report https://codecov.io/github/biopython/biopython/ (and in fact actively reduce the coverage score [Update: We configured codecov to ignore the self tests]), and moreover with Python 3 making subtle changes to imports, some of these self tests have accidentally become Python 2 only.

This is a tracking issue to encourage more commits like these which move these self-tests under Tests/test_XXX.py using the unittest framework instead:

Or, in some cases using a doctest instead makes sense:

In those cases, we're left with a stub like this which is useful when working directly on a file's doctests:

if __name__ == "__main__":
    from Bio._utils import run_doctest
    run_doctest(verbose=0)
@peterjc
Copy link
Member Author

peterjc commented May 11, 2016

Here's an indication of some obvious targets to tackle next:

$ python count_main.py Bio BioSQL
Walking Bio
26  726 Bio/bgzf.py
48  258 Bio/MaxEntropy.py
32  194 Bio/NaiveBayes.py
44  622 Bio/Align/AlignInfo.py
246 574 Bio/AlignIO/FastaIO.py
65  179 Bio/AlignIO/NexusIO.py
25  535 Bio/Blast/NCBIXML.py
358 1730    Bio/GenBank/Scanner.py
11  180 Bio/Graphics/GenomeDiagram/_Colors.py
16  184 Bio/Graphics/GenomeDiagram/_FeatureSet.py
8   172 Bio/Graphics/GenomeDiagram/_GraphSet.py
26  302 Bio/Graphics/GenomeDiagram/_Track.py
29  217 Bio/KDTree/KDTree.py
29  161 Bio/KEGG/KGML/KGML_parser.py
57  1760    Bio/Nexus/Nexus.py
6   61  Bio/PDB/Dice.py
15  344 Bio/PDB/DSSP.py
10  276 Bio/PDB/FragmentMapper.py
26  301 Bio/PDB/HSExposure.py
25  91  Bio/PDB/MMCIF2Dict.py
13  196 Bio/PDB/MMCIFParser.py
9   163 Bio/PDB/NACCESS.py
12  118 Bio/PDB/NeighborSearch.py
12  245 Bio/PDB/parse_pdb_header.py
16  202 Bio/PDB/PDBIO.py
44  303 Bio/PDB/PDBList.py
19  280 Bio/PDB/PDBParser.py
24  408 Bio/PDB/Polypeptide.py
8   87  Bio/PDB/PSEA.py
9   138 Bio/PDB/ResidueDepth.py
27  117 Bio/PDB/StructureAlignment.py
17  66  Bio/PDB/Superimposer.py
55  319 Bio/PDB/Vector.py
42  160 Bio/PDB/QCPSuperimposer/__init__.py
15  329 Bio/SeqIO/SeqXmlIO.py
186 1366    Bio/SeqIO/SffIO.py
18  146 Bio/SeqIO/SwissIO.py
13  113 Bio/SeqUtils/CheckSum.py
30  136 Bio/SVDSuperimposer/__init__.py
16  536 Bio/SwissProt/__init__.py
11  351 Bio/UniProt/GOA.py
Walking BioSQL
Done

This was a quick script to count the number of (non-blank) lines after the __main__ trick (and the total number of non-blank lines) per file, reporting any over 5:

import sys
import os

def count(filename):
    total = 0
    in_main = False
    main = 0
    with open(filename) as in_handle:
        for line in in_handle:
            if not line.strip():
                continue
            total += 1
            if line.startswith("if ") and "__main__" in line:
                in_main = True
            if in_main:
                main += 1
    if main > 5:
        print("%i\t%i\t%s" % (main, total, filename))
    return main, total

for path in sys.argv[1:]:
    print("Walking %s" % path)
    for (dirpath, dirnames, filenames) in os.walk(path):
        for f in filenames:
            if f.endswith(".py"):
                count(os.path.join(dirpath, f))
print("Done")

peterjc added a commit that referenced this issue May 24, 2016
This was covered by test_SwissProt.py anyway, see GitHub issue #820.
peterjc added a commit that referenced this issue May 24, 2016
Parsing this file was covered in test_SeqXmlIO.py anyway.
See GitHub issue #820.
peterjc added a commit that referenced this issue May 24, 2016
Already covered by test_SeqUtils.py - see GitHub issue #820
@vincentdavis
Copy link
Contributor

@peterjc you might want to label this easy.

@peterjc peterjc added the good first issue This should be an easy fix, suitable for beginners label Sep 28, 2016
@peterjc
Copy link
Member Author

peterjc commented Sep 28, 2016

Good point Vincent - it will vary case by case, but a lot of these should be fairly easy jobs - and the task is well defined.

@vincentdavis
Copy link
Contributor

For review: Considering coverage of SffIO docstests (56%) VS test_SffIO (57%) VS selftest(73%).
Attached are coverage reports for each.
Coverage_SffIO_docs.pdf
Coverage_SffIO___main__.pdf
Coverage_SffIO nostest.pdf

@peterjc
Copy link
Member Author

peterjc commented Sep 29, 2016

Those figures are certainly strong motivation to move the SFF self-tests into our unit-test framework. Looking back it was a mistake to write the tests that way in the first place, but I've grown to appreciated automated testing more and more over the years.

@souravsingh
Copy link
Contributor

@peterjc I would like to work on the issue

@peterjc
Copy link
Member Author

peterjc commented Oct 5, 2016

Hi @souravsingh - how about working on Bio/AlignIO/NexusIO.py moving the self-test into Tests/test_Nexus.py? Are you clear about the task, and how to submit a pull request? Thanks!

peterjc pushed a commit that referenced this issue Oct 6, 2016
Squashed commit from pull request #948.

Part of work to address #820.
@peterjc
Copy link
Member Author

peterjc commented Oct 6, 2016

#948 from @vincentdavis did this for Bio/SeqIO/SffIO.py

@vincentdavis
Copy link
Contributor

vincentdavis commented Oct 7, 2016

Updated list.
(excludes not yet KGML_parser, Maxentropy, SVDSuperimposer/init.py fix)

__main__    Total   File
26          732     Bio/bgzf.py
246         574     Bio/AlignIO/FastaIO.py
65          179     Bio/AlignIO/NexusIO.py
25          535     Bio/Blast/NCBIXML.py
358         1746    Bio/GenBank/Scanner.py
11          180     Bio/Graphics/GenomeDiagram/_Colors.py
16          184     Bio/Graphics/GenomeDiagram/_FeatureSet.py
8           172     Bio/Graphics/GenomeDiagram/_GraphSet.py
26          302     Bio/Graphics/GenomeDiagram/_Track.py
29          217     Bio/KDTree/KDTree.py
57          1761    Bio/Nexus/Nexus.py
6           61      Bio/PDB/Dice.py
15          384     Bio/PDB/DSSP.py
10          276     Bio/PDB/FragmentMapper.py
26          301     Bio/PDB/HSExposure.py
25          91      Bio/PDB/MMCIF2Dict.py
13          377     Bio/PDB/MMCIFParser.py
9           163     Bio/PDB/NACCESS.py
12          118     Bio/PDB/NeighborSearch.py
12          244     Bio/PDB/parse_pdb_header.py
16          202     Bio/PDB/PDBIO.py
44          303     Bio/PDB/PDBList.py
19          280     Bio/PDB/PDBParser.py
24          408     Bio/PDB/Polypeptide.py
8           87      Bio/PDB/PSEA.py
9           138     Bio/PDB/ResidueDepth.py
27          117     Bio/PDB/StructureAlignment.py
17          66      Bio/PDB/Superimposer.py
55          319     Bio/PDB/Vector.py
42          156     Bio/PDB/QCPSuperimposer/__init__.py
11          372     Bio/UniProt/GOA.py

@peterjc
Copy link
Member Author

peterjc commented Oct 10, 2016

KGML_parser, Maxentropy, SVDSuperimposer and UniProt/GOA done via #950, with a start on Nexus in #957.

@amblina
Copy link

amblina commented Oct 30, 2016

Heya, I'm giving Bio/AlignIO/FastaIO.py a go 😄

@peterjc
Copy link
Member Author

peterjc commented Oct 31, 2016

@amblina - great - when you're ready, open a pull request and mention #820 please :)

Note we're asking all new contributors to agree to using both the historic Biopython User Agreement AND the 3-clause BSD license, see #898

@Gasta88
Copy link
Contributor

Gasta88 commented Feb 13, 2017

Hi, I'm having a look at Bio/Blast/NCBIXML.py.

Ideally the class should be tested with every kind of blast XML output found in Test/Blast? Ex: blastn, blastp, etc?

@peterjc
Copy link
Member Author

peterjc commented Feb 13, 2017

@Gasta88 that's an interesting one, right now https://github.com/biopython/biopython/blob/master/Bio/Blast/NCBIXML.py#L646 (link points at current latest code) when run directly takes a filename via the command line, and tries parsing it. I think we/you could just delete that if __name__ == '__main__': block.

In terms of test coverage, other than this NCBIXML.py looks good - eg https://codecov.io/gh/biopython/biopython/src/06b6cd64d42662ef208e86be713e5c2b5865c5ca/Bio/Blast/NCBIXML.py shows just a couple of error conditions and special cases are not being tested.

If you wanted to look at that, adding tests to Tests/test_NCBIXML.py would be idea.

peterjc added a commit that referenced this issue Feb 22, 2017
See previous commit from pull request 1063,
working towards issue #820 on improving test
coverage.
Gasta88 added a commit to Gasta88/biopython that referenced this issue Mar 9, 2017
in reference to issue biopython#820, moved self_test into doctest format
This was referenced Mar 9, 2017
@Gasta88
Copy link
Contributor

Gasta88 commented Feb 7, 2018

Hi @peterjc , I see that AlignIO/FastaIO need to be sorted out, but the test_AlignIO_FastaIO.py script has been already developed (by you).

Is it ok if I edit this script to use unittest, or shall i just remove the self-test from FastaIO?

@peterjc
Copy link
Member Author

peterjc commented Feb 7, 2018

@Gasta88 from a quick check by eye, the current Tests/test_AlignIO_FastaIO.py covers everything in the self-test in Bio/AlignIO/FastaIO.py, so a pull request to remove the self-test seems fine.

If you also wanted to convert Tests/test_AlignIO_FastaIO.py from its current print-and-compare style to a unittest based style, that would also be very welcome. Thanks!

@peterjc
Copy link
Member Author

peterjc commented Jul 4, 2018

New numbers:

$ python count_main.py Bio BioSQL
Walking Bio
34	769	Bio/bgzf.py
358	1872	Bio/GenBank/Scanner.py
16	186	Bio/Graphics/GenomeDiagram/_FeatureSet.py
8	139	Bio/Graphics/GenomeDiagram/_GraphSet.py
26	269	Bio/Graphics/GenomeDiagram/_Track.py
6	62	Bio/PDB/Dice.py
15	443	Bio/PDB/DSSP.py
10	166	Bio/PDB/NACCESS.py
13	118	Bio/PDB/NeighborSearch.py
61	429	Bio/PDB/PDBList.py
8	86	Bio/PDB/PSEA.py
Walking BioSQL
Done

These are executable code so a false positive:

  • Bio/bgzf.py
  • Bio/PDB/PDBList.py

These are all self tests to review:

  • Bio/GenBank/Scanner.py
  • Bio/Graphics/GenomeDiagram/_FeatureSet.py
  • Bio/Graphics/GenomeDiagram/_GraphSet.py
  • Bio/Graphics/GenomeDiagram/_Track.py
  • Bio/PDB/Dice.py
  • Bio/PDB/DSSP.py
  • Bio/PDB/NACCESS.py
  • Bio/PDB/NeighborSearch.py
  • Bio/PDB/PSEA.py

@svalqui
Copy link
Contributor

svalqui commented Jan 15, 2019

I'll have a look at Bio/PDB/Dice.py

@svalqui
Copy link
Contributor

svalqui commented Jan 31, 2019

I'll have a look at Bio/PDB/NeighborSearch.

@svalqui
Copy link
Contributor

svalqui commented Feb 11, 2019

* Bio/PDB/NeighborSearch.py, is commented above and here #1366  , changes out of the scope of this PR.
* Bio/PDB/PSEA.py, Had some work done here #1336 , unclear current state, leaving it as is.
* Bio/PDB/NACCESS.py, requires a key from the author to work, leaving it as is.

I'll work on Bio/PDB/DSSP.py, also relates to #1288.

@peterjc
Copy link
Member Author

peterjc commented Feb 11, 2019

@svalqui you'll need to install the command line tool DSSP to try this.

It may turn out that the self-test functionality is already covered in https://github.com/biopython/biopython/blob/master/Tests/test_DSSP_tool.py (in which case we can remove the if __name__ == "__main__": bit from https://github.com/biopython/biopython/blob/master/Bio/PDB/DSSP.py

@svalqui
Copy link
Contributor

svalqui commented Feb 11, 2019

@peterjc Thank you for that I had missed test_DSSP_tool, which takes care of dssp; I have removed if name is main. I will have a look at Bio/Graphics/GenomeDiagram/_FeatureSet.py.
Cheers,

peterjc pushed a commit that referenced this issue Feb 12, 2019
Already covered with test_DSSP_tool.py - see issue #820.
@svalqui
Copy link
Contributor

svalqui commented Feb 12, 2019

Self test on:

  • Bio/Graphics/GenomeDiagram/_FeatureSet.py
  • Bio/Graphics/GenomeDiagram/_GraphSet.py
  • Bio/Graphics/GenomeDiagram/_Track.py
    are covered on Tests/test_GenomeDiagram.py
    I'll remove self test.

@svalqui
Copy link
Contributor

svalqui commented Feb 12, 2019

I'll have a look at Bio/GenBank/Scanner.py

peterjc pushed a commit that referenced this issue Feb 13, 2019
Some bits of these self tests no longer worked, and it looks like
everything else is covered in Tests/test_GenomeDiagram.py

Partly addresses issue #820.
peterjc pushed a commit that referenced this issue Feb 27, 2019
Testing the same method in Tests/test_GenBank_unitest.py instead,
using existing example files. Addresses part of GitHub issue #820.
@peterjc
Copy link
Member Author

peterjc commented Feb 27, 2019

$ python count_main.py Bio BioSQL
Walking Bio
34	764	Bio/bgzf.py
8	89	Bio/PDB/PSEA.py
61	439	Bio/PDB/PDBList.py
10	170	Bio/PDB/NACCESS.py
15	120	Bio/PDB/NeighborSearch.py
Walking BioSQL
Done

These are executable code so a false positive:

  • Bio/bgzf.py
  • Bio/PDB/PDBList.py

These are all self tests to review:

@valentin994
Copy link
Contributor

Are any files left to be done? I tried to grep self-test but only test_SffiO.py file came out. I noticed a dangling #print which I guess could be removed.
image
But if there is more to do I would gladly help.

@peterjc
Copy link
Member Author

peterjc commented May 23, 2022

From my comments earlier, it looks like just Bio/PDB/PSEA.py (see #1336) and Bio/PDB/NeighborSearch.py (see #1366) need addressing.

@valentin994 valentin994 mentioned this issue Jul 11, 2022
3 tasks
BabaYaga1221 added a commit to BabaYaga1221/biopython that referenced this issue May 5, 2023
@peterjc
Copy link
Member Author

peterjc commented May 2, 2025

Running that script again now against the current default branch:

python count_main.py Bio BioSQL
Walking Bio
30      825     Bio/bgzf.py
103     687     Bio/PDB/PDBList.py
10      179     Bio/PDB/NACCESS.py
Walking BioSQL
Done

Those three are not self-tests, but simple command line functionality.

Closing this issue as completed.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
good first issue This should be an easy fix, suitable for beginners help wanted
Projects
None yet
Development

No branches or pull requests

10 participants