[REVIEW]: shmlast: An improved implementation of Conditional Reciprocal Best Hits with LAST and Python #142

whedon · 2016-12-18T22:36:51Z

Submitting author: @camillescott (Camille Scott)
Repository: https://github.com/camillescott/shmlast
Version: v1.1
Editor: @biorelated
Reviewer: @genomematt
Archive: 10.5281/zenodo.260437

Status

Status badge code:

HTML: <a href="http://joss.theoj.org/papers/3cde54de7dfbcada7c0fc04f569b36c7"><img src="http://joss.theoj.org/papers/3cde54de7dfbcada7c0fc04f569b36c7/status.svg"></a>
Markdown: [![status](http://joss.theoj.org/papers/3cde54de7dfbcada7c0fc04f569b36c7/status.svg)](http://joss.theoj.org/papers/3cde54de7dfbcada7c0fc04f569b36c7)

Reviewers and authors:

Please avoid lengthy details of difficulties in the review thread. Instead, please create a new issue in the target repository and link to those issues (especially acceptance-blockers) in the review thread below. (For completists: if the target issue tracker is also on GitHub, linking the review thread in the issue or vice versa will create corresponding breadcrumb trails in the link target.)

Reviewer questions

Conflict of interest

As the reviewer I confirm that there are no conflicts of interest for me to review this work (such as being a major contributor to the software).

General checks

Repository: Is the source code for this software available at the repository url?
License: Does the repository contain a plain-text LICENSE file with the contents of an OSI approved software license?
Version: Does the release version given match the GitHub release (v1.0.2)?
Authorship: Has the submitting author (@camillescott) made major contributions to the software?

Functionality

Installation: Does installation proceed as outlined in the documentation?
Functionality: Have the functional claims of the software been confirmed?
Performance: Have any performance claims of the software been confirmed?

Documentation

A statement of need: Do the authors clearly state what problems the software is designed to solve and who the target audience is?
Installation instructions: Is there a clearly-stated list of dependencies? Ideally these should be handled with an automated package management solution.
Example usage: Do the authors include examples of how to use the software (ideally to solve real-world analysis problems).
Functionality documentation: Is the core functionality of the software documented to a satisfactory level (e.g. API method documentation)?
Automated tests: Are there automated tests or manual steps described so that the function of the software can be verified?
Community guidelines: Are there clear guidelines for third parties wishing to 1) Contribute to the software 2) Report issues or problems with the software 3) Seek support

Software paper

Authors: Does the paper.md file include a list of authors with their affiliations?
A statement of need: Do the authors clearly state what problems the software is designed to solve and who the target audience is?
References: Do all archival references that should have a DOI list one (e.g. papers, datasets, software)?

The text was updated successfully, but these errors were encountered:

whedon · 2016-12-18T22:36:54Z

Hello human, I'm @whedon. I'm here to help you with some common editorial tasks for JOSS. @genomematt it looks like you're currently assigned as the reviewer for this paper 🎉.

⭐ Important ⭐

If you haven't already, you should seriously consider unsubscribing from GitHub notifications for this (https://github.com/openjournals/joss-reviews) repository. As as reviewer, you're probably currently watching this repository which means for GitHub's default behaviour you will receive notifications (emails) for all JOSS reviews 😿

To fix this do the following two things:

Set yourself as 'Not watching' https://github.com/openjournals/joss-reviews:

You may also like to change your default settings for this watching repositories in your GitHub profile here: https://github.com/settings/notifications

For a list of things I can do to help you, just type:

@whedon commands

george-githinji · 2016-12-31T11:07:45Z

@genomematt Where are we at with this review? Please let us know.

genomematt · 2017-01-04T04:35:25Z

I am still working my way through this, but some initial comments @camillescott.

I think that you need a more from first principles statement of need that outlines the sequence matching problem that CRBH is addressing (remembering that this is a generalist scientific computing journal) so that readers don't need to chase a reference to get a basic idea of what the software does.

You need contribution, bug report and support guidelines - check out some other JOSS articles for what this should look like.

A real world (ish) worked example, ideally as a jupyter notebook or doctest file would greatly enhance, and would help with explaining the statement of need.

Very pleased to see coverage as part of your testing. Your exclusions all look very sensible on first pass. You may wish to add #pragma no cover directives marking the code you don't test. This makes it easier to track in your code and makes full coverage 100%. (suggestion not a requirement)

As you make a claim of improved performance (speed) it would be good to quantify that improvement.

I had an install issue but I am pretty sure it was related to a prior install not yours. I will redo on a clean VM to confirm.

camillescott · 2017-01-09T21:08:30Z

Thanks @genomematt. I'll add a section to the paper describing the approach in a bit more detail. I can easily add a figure with performance comps as well.

camillescott · 2017-01-11T00:34:06Z

Update: I have added a performance comparison figure, a section describing the method in a bit more detail, and contribution guidelines.

genomematt · 2017-01-14T00:02:41Z

Ok @camillescott the added bits are good. My install issue was a conda problem and all was fine on a clean vm.

Given that there are multiple dependencies I think there should be a test step at the end of your install procedure. This would not need to be comprehensive testing of code, just a 'if you have installed it correctly this input will give this output.'

I am giving you a scraping pass on the real world application documentation, but an additional notebook that does not hide the actual running in a script like the performance notebook, showing real transcripts against a real db and then interpreting the csv to answer the biological question would enhance.
As it stands the documentation is good for anyone who has done this sort of procedure before, and not as easy for someone who has a set of transcripts and is asking 'what is the best way to match these to the most closely related model organism'. e.g. if I have 10 transcripts will this program work, will the results be valid? I really care about gene family X, how do I interpret the output to find my family X transcripts?

Neither of those would stop me recommending accepting, but the complete lack of community guidelines does.
Community guidelines: Are there clear guidelines for third parties wishing to 1) Contribute to the software 2) Report issues or problems with the software 3) Seek support

camillescott · 2017-01-17T18:38:09Z

What is expected for guidelines? I did add a CONTRIBUTING.md: https://github.com/camillescott/shmlast/blob/master/CONTRIBUTING.md

I followed the example of several other accepted projects on that -- should I add more?

genomematt · 2017-01-17T22:56:06Z

Sorry @camillescott I missed the CONTRIBUTING.md. I suspect a lot of other people especially those with bugs or support questions would as well. At least link it from the readme, and personally I would be inclined to put such a short section just in the readme.

genomematt · 2017-01-17T22:57:03Z

Ok @biorelated I consider this ready to accept.

camillescott · 2017-01-18T00:27:15Z

Copy that @genomematt -- added it to the readme :)

george-githinji · 2017-01-18T13:19:22Z

Many thanks @genomematt.

george-githinji · 2017-01-18T13:20:08Z

@arfon can we prepare this work for publication please?

arfon · 2017-01-20T21:02:44Z

@camillescott - At this point could you make an archive of the reviewed software in Zenodo/figshare/other service and update this thread with the DOI of the archive? I can then move forward with accepting the submission.

camillescott · 2017-01-27T19:52:04Z

@arfon: I've archived on Zenodo; here is the DOI: http://doi.org/10.5281/zenodo.260437

note that the final published version after the requested changes is v1.1, not v1.0.2 as in Whedon's first post in this issue

arfon · 2017-01-28T08:53:05Z

@whedon set 10.5281/zenodo.260437 as archive

whedon · 2017-01-28T08:53:08Z

OK. 10.5281/zenodo.260437 is the archive.

arfon · 2017-01-28T09:03:47Z

Many thanks for your review @genomematt and thanks for editing this one @biorelated ✨

@camillescott - your paper is now accepted into JOSS and your DOI is http://dx.doi.org/10.21105/joss.00142 ⚡️ 🚀 💥

whedon added the review label Dec 18, 2016

whedon mentioned this issue Dec 18, 2016

[PRE REVIEW]: shmlast: An improved implementation of Conditional Reciprocal Best Hits with LAST and Python #122

Closed

arfon added the accepted label Jan 28, 2017

arfon closed this as completed Jan 28, 2017

whedon added published Papers published in JOSS recommend-accept Papers recommended for acceptance in JOSS. labels Mar 2, 2020

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[REVIEW]: shmlast: An improved implementation of Conditional Reciprocal Best Hits with LAST and Python #142

[REVIEW]: shmlast: An improved implementation of Conditional Reciprocal Best Hits with LAST and Python #142

whedon commented Dec 18, 2016 •

edited

whedon commented Dec 18, 2016

george-githinji commented Dec 31, 2016

genomematt commented Jan 4, 2017

camillescott commented Jan 9, 2017

camillescott commented Jan 11, 2017

genomematt commented Jan 14, 2017 •

edited

camillescott commented Jan 17, 2017

genomematt commented Jan 17, 2017

genomematt commented Jan 17, 2017

camillescott commented Jan 18, 2017

george-githinji commented Jan 18, 2017

george-githinji commented Jan 18, 2017

arfon commented Jan 20, 2017

camillescott commented Jan 27, 2017

arfon commented Jan 28, 2017

whedon commented Jan 28, 2017

arfon commented Jan 28, 2017

[REVIEW]: shmlast: An improved implementation of Conditional Reciprocal Best Hits with LAST and Python #142

[REVIEW]: shmlast: An improved implementation of Conditional Reciprocal Best Hits with LAST and Python #142

Comments

whedon commented Dec 18, 2016 • edited

Status

Reviewer questions

Conflict of interest

General checks

Functionality

Documentation

Software paper

whedon commented Dec 18, 2016

george-githinji commented Dec 31, 2016

genomematt commented Jan 4, 2017

camillescott commented Jan 9, 2017

camillescott commented Jan 11, 2017

genomematt commented Jan 14, 2017 • edited

camillescott commented Jan 17, 2017

genomematt commented Jan 17, 2017

genomematt commented Jan 17, 2017

camillescott commented Jan 18, 2017

george-githinji commented Jan 18, 2017

george-githinji commented Jan 18, 2017

arfon commented Jan 20, 2017

camillescott commented Jan 27, 2017

arfon commented Jan 28, 2017

whedon commented Jan 28, 2017

arfon commented Jan 28, 2017

whedon commented Dec 18, 2016 •

edited

genomematt commented Jan 14, 2017 •

edited