Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[REVIEW]: BAMnostic: an OS-agnostic toolkit for genomic sequence analysis #826

Closed
whedon opened this Issue Jul 18, 2018 · 36 comments

Comments

Projects
None yet
7 participants
@whedon
Copy link
Collaborator

whedon commented Jul 18, 2018

Submitting author: @betteridiot (Marcus)
Repository: https://github.com/betteridiot/bamnostic
Version: 0.8.4
Editor: @pjotrp
Reviewer: @luizirber, @peterjc
Archive: 10.5281/zenodo.1341915

Status

status

Status badge code:

HTML: <a href="http://joss.theoj.org/papers/9952b35bbb30ca6c01e6a27b80006bd8"><img src="http://joss.theoj.org/papers/9952b35bbb30ca6c01e6a27b80006bd8/status.svg"></a>
Markdown: [![status](http://joss.theoj.org/papers/9952b35bbb30ca6c01e6a27b80006bd8/status.svg)](http://joss.theoj.org/papers/9952b35bbb30ca6c01e6a27b80006bd8)

Reviewers and authors:

Please avoid lengthy details of difficulties in the review thread. Instead, please create a new issue in the target repository and link to those issues (especially acceptance-blockers) in the review thread below. (For completists: if the target issue tracker is also on GitHub, linking the review thread in the issue or vice versa will create corresponding breadcrumb trails in the link target.)

Reviewer instructions & questions

@luizirber & @peterjc, please carry out your review in this issue by updating the checklist below. If you cannot edit the checklist please:

  1. Make sure you're logged in to your GitHub account
  2. Be sure to accept the invite at this URL: https://github.com/openjournals/joss-reviews/invitations

The reviewer guidelines are available here: https://joss.theoj.org/about#reviewer_guidelines. Any questions/concerns please let @pjotrp know.

Please try and complete your review in the next two weeks

Review checklist for @luizirber

Conflict of interest

Code of Conduct

General checks

  • Repository: Is the source code for this software available at the repository url?
  • License: Does the repository contain a plain-text LICENSE file with the contents of an OSI approved software license?
  • Version: Does the release version given match the GitHub release (0.8.4)?
  • Authorship: Has the submitting author (@betteridiot) made major contributions to the software? Does the full list of paper authors seem appropriate and complete?

Functionality

  • Installation: Does installation proceed as outlined in the documentation?
  • Functionality: Have the functional claims of the software been confirmed?
  • Performance: If there are any performance claims of the software, have they been confirmed? (If there are no claims, please check off this item.)

Documentation

  • A statement of need: Do the authors clearly state what problems the software is designed to solve and who the target audience is?
  • Installation instructions: Is there a clearly-stated list of dependencies? Ideally these should be handled with an automated package management solution.
  • Example usage: Do the authors include examples of how to use the software (ideally to solve real-world analysis problems).
  • Functionality documentation: Is the core functionality of the software documented to a satisfactory level (e.g., API method documentation)?
  • Automated tests: Are there automated tests or manual steps described so that the function of the software can be verified?
  • Community guidelines: Are there clear guidelines for third parties wishing to 1) Contribute to the software 2) Report issues or problems with the software 3) Seek support

Software paper

  • Authors: Does the paper.md file include a list of authors with their affiliations?
  • A statement of need: Do the authors clearly state what problems the software is designed to solve and who the target audience is?
  • References: Do all archival references that should have a DOI list one (e.g., papers, datasets, software)?

Review checklist for @peterjc

Conflict of interest

Code of Conduct

General checks

  • Repository: Is the source code for this software available at the repository url?
  • License: Does the repository contain a plain-text LICENSE file with the contents of an OSI approved software license?
  • Version: Does the release version given match the GitHub release (0.8.4)?
  • Authorship: Has the submitting author (@betteridiot) made major contributions to the software? Does the full list of paper authors seem appropriate and complete?

Functionality

  • Installation: Does installation proceed as outlined in the documentation?
  • Functionality: Have the functional claims of the software been confirmed?
  • Performance: If there are any performance claims of the software, have they been confirmed? (If there are no claims, please check off this item.)

Documentation

  • A statement of need: Do the authors clearly state what problems the software is designed to solve and who the target audience is?
  • Installation instructions: Is there a clearly-stated list of dependencies? Ideally these should be handled with an automated package management solution.
  • Example usage: Do the authors include examples of how to use the software (ideally to solve real-world analysis problems).
  • Functionality documentation: Is the core functionality of the software documented to a satisfactory level (e.g., API method documentation)?
  • Automated tests: Are there automated tests or manual steps described so that the function of the software can be verified?
  • Community guidelines: Are there clear guidelines for third parties wishing to 1) Contribute to the software 2) Report issues or problems with the software 3) Seek support

Software paper

  • Authors: Does the paper.md file include a list of authors with their affiliations?
  • A statement of need: Do the authors clearly state what problems the software is designed to solve and who the target audience is?
  • References: Do all archival references that should have a DOI list one (e.g., papers, datasets, software)?
@whedon

This comment has been minimized.

Copy link
Collaborator Author

whedon commented Jul 18, 2018

Hello human, I'm @whedon. I'm here to help you with some common editorial tasks. @luizirber, it looks like you're currently assigned as the reviewer for this paper 🎉.

⭐️ Important ⭐️

If you haven't already, you should seriously consider unsubscribing from GitHub notifications for this (https://github.com/openjournals/joss-reviews) repository. As a reviewer, you're probably currently watching this repository which means for GitHub's default behaviour you will receive notifications (emails) for all reviews 😿

To fix this do the following two things:

  1. Set yourself as 'Not watching' https://github.com/openjournals/joss-reviews:

watching

  1. You may also like to change your default settings for this watching repositories in your GitHub profile here: https://github.com/settings/notifications

notifications

For a list of things I can do to help you, just type:

@whedon commands
@whedon

This comment has been minimized.

Copy link
Collaborator Author

whedon commented Jul 18, 2018

Attempting PDF compilation. Reticulating splines etc...
@whedon

This comment has been minimized.

Copy link
Collaborator Author

whedon commented Jul 18, 2018

@pjotrp

This comment has been minimized.

Copy link
Collaborator

pjotrp commented Jul 18, 2018

@betteridiot we are starting review here. To expedite the review do you mind going through above list of check boxes and make sure they can be ticked (you can't tick them). Also check the PDF output carefully. Ping us here when you are done.

@pjotrp

This comment has been minimized.

Copy link
Collaborator

pjotrp commented Jul 18, 2018

Also address @peterjc concerns raised #779 (comment) here (I mean in this issue tracker):

One thing on the text: "This is a significant limitation as no other Python implementation (besides pysam) can perform random access operation on BAM files." maybe insert the rider "published" or "released" as there are several, including my old Python code can do this but has not been formally published or released, https://github.com/JohnLonginotto/pybam ("A simple, 100% python, BAM file reader.") and some work at https://github.com/nijibabulu/pypysam/

As a potential user I would also want to know about plans for CRAM support (logged as betteridiot/bamnostic#4), CSI indexing (betteridiot/bamnostic#3), and BAM output (betteridiot/bamnostic#5).

I know this is a lot to ask, and perhaps out of scope for JoSS, but as a user I would worry about the long term support as the SAM/BAM file format continues to evolve, (e.g. the overdue fix for long CIGAR strings last year, important with long reads - logged as betteridiot/bamnostic#6). In contrast, by building on top of the well supported HTSlib, pysam gets a lot of this maintenance work "for free", but a re-implementation would not (as I wrote earlier, this was a factor in me not formally releasing my own Python SAM/BAM code a years back).

@betteridiot

This comment has been minimized.

Copy link
Collaborator

betteridiot commented Jul 19, 2018

The PR: betteridiot/bamnostic#7 addresses the first issue raised by @peterjc.

Also, a couple of these points were also addressed in my last comment in #779. I will reiterate it here though:

Thank you greatly for your points. Regardless of @pjotrp decision to keep you on as the reviewer, you made great points that I would like to take the opportunity to address.

Your points regarding BAM output and long CIGAR strings are definitely revisions that can (and should) be made to bamnostic. The CIGAR string revision is still a relatively recent change, and was not written into bamnostic yet. Thank you for pointing this out, and that pull request should come very soon. The BAM output point is a bigger issue, but still fair. For the purposes of the original project that birthed bamnostic, BAM output was not desired, and therefore fell victim to oversight. However, due to incorporating @peterjc BGZF module, the framework for connecting the pieces is there. They just have not been formally implemented. Making this change should be relatively straight-forward (much like you stated in the respective issue thread).

I also agree with you that CRAM support would be a great boon for bamnostic. However, like you stated in betteridiot/bamnostic#4, it is a much more complex format that would require a non-trivial expenditure of time and resources to write up. Not saying that this feature is not in the future of bamnostic, but we believe that the present form of bamnostic could still be a very impactful tool for current pipelines and analysis. The same applies to CSI indexing. While it is a successor to BAI indexing, it has not permeated the domain to the degree that BAI has yet.

With regard to the phrasing: "This is a significant limitation as no other Python implementation (besides pysam) can perform random access operation on BAM files.", your point is fair that published should be added.

Additionally, in the vein of JoSS' purpose, this is an open-source project as well, and contributions are not only welcomed, they are encouraged. If the community would like to add features into bamnostic, pull requests are fully appreciated.

@betteridiot

This comment has been minimized.

Copy link
Collaborator

betteridiot commented Jul 19, 2018

The points about CSI indexing and CRAM support were addressed similarly in betteridiot/bamnostic#3 and betteridiot/bamnostic#4 (respectively).

@pjotrp

This comment has been minimized.

Copy link
Collaborator

pjotrp commented Jul 19, 2018

Thanks, can you confirm you agree that all check boxes can be ticked by the reviewer(s)?

@betteridiot

This comment has been minimized.

Copy link
Collaborator

betteridiot commented Jul 19, 2018

I will have to explicitly add a section on Community guidelines for all items to be checked off.

@betteridiot

This comment has been minimized.

Copy link
Collaborator

betteridiot commented Jul 19, 2018

Current version is 0.8.7b2
Community guidelines and versioning were added in betteridiot/bamnostic@05b00e1

Addressed @peterjc's comments on CRAM and CSI support is betteridiot/bamnostic#4 and betteridiot/bamnostic#3 (respectively) and added documentation about limitations in betteridiot/bamnostic@14747e7

CIGARs > 65535 support was added in betteridiot/bamnostic@6e55d64

@betteridiot

This comment has been minimized.

Copy link
Collaborator

betteridiot commented Jul 19, 2018

At this point, I can confirm all items on the checklist can be checked off. The only point made by @peterjc that has yet to be addressed with increased functionality has been on BAM output. This is feature that will be added in the very near future and this issue can be found at betteridiot/bamnostic#5

@pjotrp

This comment has been minimized.

Copy link
Collaborator

pjotrp commented Jul 19, 2018

Thanks. @luizirber over to you!

@peterjc

This comment has been minimized.

Copy link
Collaborator

peterjc commented Jul 22, 2018

BAM output betteridiot/bamnostic#5 was a feature request, quite a logical extension and one which will increase the utility of BAMnostic, but I did not mean to suggest it should delay or block this JOSS publication.

@betteridiot

This comment has been minimized.

Copy link
Collaborator

betteridiot commented Jul 22, 2018

Thank you @peterjc . I fully understood it as such. I was just being explicit with my intentions is all.

@pjotrp

This comment has been minimized.

Copy link
Collaborator

pjotrp commented Jul 22, 2018

Thanks @betteridiot. I think we are good to go. @luizirber would you be so kind to start reviewing this submission? @peterjc can help out if there are any issues. Thanks!

@luizirber

This comment has been minimized.

Copy link
Collaborator

luizirber commented Jul 23, 2018

on it

@pjotrp

This comment has been minimized.

Copy link
Collaborator

pjotrp commented Aug 7, 2018

@luizirber what is the latest update here?

@luizirber

This comment has been minimized.

Copy link
Collaborator

luizirber commented Aug 7, 2018

Hello @betteridiot! Great work on the software =]

Some comments and two small things:

  • Version
  • References

Checklist comments

General checks

Documentation

  • Automated tests: while there two tests, they don't cover large chunks of the codebase and API (especially the utils module).

    $ pytest --cov
    Name                    Stmts   Miss  Cover
    -------------------------------------------
    bamnostic/__init__.py       4      0   100%
    bamnostic/bai.py          159     63    60%
    bamnostic/bgzf.py         506    280    45%
    bamnostic/core.py         291    146    50%
    bamnostic/utils.py        246    178    28%
    tests/__init__.py           0      0   100%
    tests/test_build.py        11      0   100%
    -------------------------------------------
    TOTAL                    1217    667    45%
    

    Bringing in the doctests is already a nice improvement (but two tests need
    to be fixed):

    $ pytest --doctest-modules --cov
    Name                    Stmts   Miss  Cover
    -------------------------------------------
    bamnostic/__init__.py       4      0   100%
    bamnostic/bai.py          159     25    84%
    bamnostic/bgzf.py         506    169    67%
    bamnostic/core.py         291    114    61%
    bamnostic/utils.py        246     57    77%
    docs/source/conf.py        27      0   100%
    tests/__init__.py           0      0   100%
    tests/test_build.py        11      0   100%
    -------------------------------------------
    TOTAL                    1244    365    71%
    
  • Community guidelines: There are issue templates for bugs and features, and a comment
    at the end of the README. A plus would be to collect this info in a CONTRIBUTING file.

Software paper

  • References: there are two links to PySAM (repo and an issue) that don't have DOIs. But I'm not sure how to properly cite this... @pjotrp

Fixes

@betteridiot

This comment has been minimized.

Copy link
Collaborator

betteridiot commented Aug 8, 2018

betteridiot/bamnostic@fbd5e89 addresses the issues raises by @luizirber regarding the versions releases, the documentation error of bs.example_bam, and fixes to the two failing doctests.

The CONTRIBUTING.md file will take more time than these quick corrections.

Regarding the comment about the automated tests: @luizirber , are you requesting more testing coverage is required before publication? Or, do the doctests fulfill the needs of the review at this point?

@luizirber

This comment has been minimized.

Copy link
Collaborator

luizirber commented Aug 8, 2018

Thanks for the quick answer!

It is not required, I think it's already pretty good. Having 100% coverage is desirable but not that useful if you're writing tests just to hit the target, but I was a bit uncomfortable with only two tests. After I found the doctests this looked better, and I would suggest (but not require 😸) that you add a few more tests, but up to you.

I checked the version box above, last question is about the references, but I'll wait @pjotrp for that.

@pjotrp

This comment has been minimized.

Copy link
Collaborator

pjotrp commented Aug 8, 2018

@peterjc how would you like the paper to cite PySAM?

@betteridiot

This comment has been minimized.

Copy link
Collaborator

betteridiot commented Aug 8, 2018

I know that according to APA citations style guides, when the DOI is not available for electronic resources, the reference should cite the URL (as done in paper.bib).

Furthermore, the maintainers of PySAM requests that the original SAM/BAM format paper be cited in its stead (as done in paper.bib). This is also seen in pysam's Bioconda page, where the DOI points back to the SAM/BAM format paper again.

@pjotrp

This comment has been minimized.

Copy link
Collaborator

pjotrp commented Aug 8, 2018

OK, sounds fine to me. @luizirber when you think it is done we can start the publication process.

@betteridiot

This comment has been minimized.

Copy link
Collaborator

betteridiot commented Aug 8, 2018

betteridiot/bamnostic#13 adds a CONTRIBUTING.md and Contributor Covenant Code of Conduct to the project. The initial commit had some typos in the CONTRIBUTING.md file. These were addressed in betteridiot/bamnostic@6bbe4cd

@luizirber

This comment has been minimized.

Copy link
Collaborator

luizirber commented Aug 8, 2018

@pjotrp All box checked, ready to go!

@betteridiot

This comment has been minimized.

Copy link
Collaborator

betteridiot commented Aug 8, 2018

For expediency sake, the updated release (0.8.10c1) was updated to Zenodo under the DOI: 10.5281/zenodo.1341915

EDIT:
To make sure my GitHub release matched the PyPI and conda-forge versioning, a new release was created (0.8.11) which was updated on Zenodo under the DOI:10.5281/zenodo.1341959

@pjotrp

This comment has been minimized.

Copy link
Collaborator

pjotrp commented Aug 9, 2018

Cool. @arfon another publication to send into the ether!

@pjotrp

This comment has been minimized.

Copy link
Collaborator

pjotrp commented Aug 9, 2018

@whedon generate pdf

@whedon

This comment has been minimized.

Copy link
Collaborator Author

whedon commented Aug 9, 2018

Attempting PDF compilation. Reticulating splines etc...
@whedon

This comment has been minimized.

Copy link
Collaborator Author

whedon commented Aug 9, 2018

@pjotrp

This comment has been minimized.

Copy link
Collaborator

pjotrp commented Aug 9, 2018

@whedon set 10.5281/zenodo.1341915 as archive

@whedon

This comment has been minimized.

Copy link
Collaborator Author

whedon commented Aug 9, 2018

OK. 10.5281/zenodo.1341915 is the archive.

@arfon

This comment has been minimized.

Copy link
Member

arfon commented Aug 9, 2018

@luizirber, @peterjc - many thanks for your reviews here and to @pjotrp for editing this submission

@betteridiot - your paper is now accepted into JOSS and your DOI is https://doi.org/10.21105/joss.00826 ⚡️ 🚀 💥

@arfon arfon added the accepted label Aug 9, 2018

@arfon arfon closed this Aug 9, 2018

@whedon

This comment has been minimized.

Copy link
Collaborator Author

whedon commented Aug 9, 2018

🎉🎉🎉 Congratulations on your paper acceptance! 🎉🎉🎉

If you would like to include a link to your paper from your README use the following code snippets:

Markdown:
[![DOI](http://joss.theoj.org/papers/10.21105/joss.00826/status.svg)](https://doi.org/10.21105/joss.00826)

HTML:
<a style="border-width:0" href="https://doi.org/10.21105/joss.00826">
  <img src="http://joss.theoj.org/papers/10.21105/joss.00826/status.svg" alt="DOI badge" >
</a>

This is how it will look in your documentation:

DOI

We need your help!

Journal of Open Source Software is a community-run journal and relies upon volunteer effort. If you'd like to support us please consider doing either one (or both) of the the following:

@betteridiot

This comment has been minimized.

Copy link
Collaborator

betteridiot commented Aug 9, 2018

Thank you @pjotrp for keeping the review process moving. Thank you @peterjc and @luizirber for your insights. Lastly, thank you JoSS for this outlet.

@remills

This comment has been minimized.

Copy link

remills commented Aug 9, 2018

Agreed, this process was great and we will definitely consider publishing here again in the future!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.