Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add support for NCBI BLAST XML v2 #1997

Merged
merged 7 commits into from Apr 23, 2019

Conversation

rtf-const
Copy link
Contributor

This pull request addresses issue #1839

  • I hereby agree to dual licence this and any previous contributions under both
    the Biopython License Agreement AND the BSD 3-Clause License.

  • I have read the CONTRIBUTING.rst file and understand that AppVeyor and
    TravisCI will be used to confirm the Biopython unit tests and flake8 style
    checks pass with these changes.

  • I have added my name to the alphabetical contributors listings in the files
    NEWS.rst and CONTRIB.rst as part of this pull request, am listed
    already, or do not wish to be listed. (This acknowledgement is optional.)

@codecov
Copy link

codecov bot commented Apr 3, 2019

Codecov Report

Merging #1997 into master will increase coverage by 0.03%.
The diff coverage is 98%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master    #1997      +/-   ##
==========================================
+ Coverage   85.07%   85.11%   +0.03%     
==========================================
  Files         314      314              
  Lines       50311    50438     +127     
==========================================
+ Hits        42802    42929     +127     
  Misses       7509     7509
Impacted Files Coverage Δ
Bio/Blast/Record.py 92.02% <100%> (+0.92%) ⬆️
Bio/Blast/NCBIXML.py 89.82% <97.74%> (+0.33%) ⬆️
Bio/Entrez/Parser.py 81.75% <0%> (+0.17%) ⬆️
Bio/File.py 87.27% <0%> (+0.25%) ⬆️
Bio/Application/__init__.py 87.74% <0%> (+1.18%) ⬆️
Bio/_py3k/__init__.py 76.19% <0%> (+3.57%) ⬆️
Bio/AlignIO/EmbossIO.py 80% <0%> (+4.81%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 58861fe...d678d2f. Read the comment docs.

@peterjc peterjc requested a review from bow April 4, 2019 10:46
@peterjc
Copy link
Member

peterjc commented Apr 4, 2019

This looks impressive - I see you've split the existing BLAST XML parser into two using subclasses, did that seem easiest?

Does this mean adding support for BLAST XML v2 to Bio.SearchIO should be really simple to include?

@rtf-const
Copy link
Contributor Author

I do not use subclasses for parsing different XML versions, because parser doesn't know XML version beforehand. Instead actuall version is determined in the process of parsing based on root tag name. Blast XML v1 and v2 are very similar in structure and mainly differ just by tags names. So the process is very simple: start reading XML, look at the name of root node, determine version and then code setups the dictionary that matches XML tag name with the method that process it.

As I can see Bio.SearchIO already uses some constant dictionaries that match tag name with a resulting objects properties. So for adding xml v2 support in SerachIO we can use different dictionaries depending on XML version.

@peterjc
Copy link
Member

peterjc commented Apr 17, 2019

TravisCI failed on a ReportLab issue, since fixed on the master - see #1995. If you add yourself to the NEWS.rst file or made any other change that would trigger a rebuild.

@bow are you available to review this work?

@bow
Copy link
Member

bow commented Apr 21, 2019

@peterjc I'm less familiar with the internals of Bio.Blast, but from what I understand, the changes look ok.

And yes, SearchIO does indeed match XML tag names with object properties, since XML parsing is handled by xml.etree.ElementTree. This can be extended to v2 as well.

@peterjc
Copy link
Member

peterjc commented Apr 22, 2019

Rebased to deal with a trivial merge conflict in the NEWS file with the contributor names.

I think this deserves a short NEWS.rst entry too - although if it is now trivial to add BLAST v2 XML to SearchIO, we could merge this as is, and do that next?

@bow
Copy link
Member

bow commented Apr 22, 2019

Sounds like a good idea. It's also nice that @rtf-const already added the test files for v2 ~ which we can use for SearchIO as well :).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants