Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

test suite failure with biopython 1.79 #23

Closed
emollier opened this issue Oct 20, 2021 · 8 comments
Closed

test suite failure with biopython 1.79 #23

emollier opened this issue Oct 20, 2021 · 8 comments
Assignees
Labels

Comments

@emollier
Copy link

Greetings,

While working on updating Biopython to version 1.79 in Debian Experimental, we noticed that the new version of Biopython broke the test suite of ncbi-acc-download. Excerpt from the full log from Debian continuous integration, there are variations around the following symptom, affecting the download_wgs_parts function of wgs.py:

__________________ test_download_wgs_parts_supercontig_retry ___________________

req = <requests_mock.mocker.Mocker object at 0x7fc14613fb20>

    def test_download_wgs_parts_supercontig_retry(req):
        cfg = Config(format="genbank")
        supercontig = open(full_path('supercontig.gbk'), 'rt')
        req.get(ENTREZ_URL, response_list=[
            {"text": u'Whoa, slow down', "status_code": 429, "headers": {"Retry-After": "0"}},
            {"body": open(full_path('supercontig_full.gbk'), 'rt')}
        ])
    
        outhandle = wgs.download_wgs_parts(supercontig, cfg)
        supercontig_full = open(full_path('supercontig_full.gbk'), 'rt')
>       assert outhandle.getvalue() == supercontig_full.read()
E       AssertionError: assert 'LOCUS       ...671808)\n//\n' == 'LOCUS       ...ccctaac\n//\n'
E         - LOCUS       NC_007194                 60 bp    DNA     linear   CON 03-APR-2018
E         ?                                  ^^^^^^^                            ^^ ^^^   ^^
E         + LOCUS       NC_007194            4918979 bp    DNA     linear   CON 11-NOV-2009
E         ?                                  ^^^^^^^                            ^^ ^^^   ^^
E           DEFINITION  Aspergillus fumigatus Af293 chromosome 1, whole genome shotgun
E                       sequence.
E           ACCESSION   NC_007194...
E         
E         ...Full output truncated (22 lines hidden), use '-vv' to show

test_wgs.py:145: AssertionError

After some digging, the issue turned out to be related to the deprecation of Bio.Seq.UnknownSeq in Biopython 1.79. Apparently, unknown sequences are now created as regular sequences of class Seq with data being None and non-zero length, and thus the test on record.seq being an instance of UnknownSeq always fails.

The problem was initially discussed on Debian bug tracker issue #996794, where I devised on a patch with which I tried to remain functionally equivalent to the existing code while extending support to both Biopython 1.78 and 1.79. I didn't prepare a merge request though, as I'm still feeling a bit uncomfortable about my approach which I deem heavy, and believe you probably would have a more elegant approach to fix the issue. Of course feel free to use the patch as is if you feel it is appropriate.

In hope this helps,
Have a nice day :)

@kblin
Copy link
Owner

kblin commented Oct 20, 2021

I'm not sure there's a nice way to support both 1.78 and 1.79 from the same code base.

@kblin
Copy link
Owner

kblin commented Oct 20, 2021

I've had a discussion with the Biopython authors about the pain points of their 1.79 changes here, but then got sidetracked and just pinned to 1.78 for local use. I lost track of how many of my python packages got broken by that change.

I guess it's time for a new release that just bumps the minimum required version to 1.79 and builds on the new code, though even for that I really hate the code necessary.

@kblin
Copy link
Owner

kblin commented Oct 20, 2021

Ok, this is also breaking PR #22

@kblin kblin self-assigned this Oct 20, 2021
@kblin kblin added the bug label Oct 20, 2021
@emollier
Copy link
Author

emollier commented Oct 20, 2021

I just recalled to mention in case you were to use it: I wrote the patch in Debian context, and the writing to /dev/null to make sure the error is raised wouldn't be portable on non-Unix systems.

Otherwise thanks for the context, this was informative.

@kblin
Copy link
Owner

kblin commented Oct 20, 2021

It looks like I have started to fix the issue on this machine three months ago and forgot about it.

@kblin
Copy link
Owner

kblin commented Oct 20, 2021

Should be fixed by dcba66f, I'll release a new version of ncbi-acc-download once PR #22 is merged.

@emollier
Copy link
Author

emollier commented Oct 21, 2021 via email

@kblin
Copy link
Owner

kblin commented Oct 22, 2021

Fixed in release 0.2.8 that's now on pypi (and here).

@kblin kblin closed this as completed Oct 22, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

2 participants