JOSS review #4

@will-rowe

Related to: openjournals/joss-reviews#1540

General feedback

Firstly, I really enjoyed this, so thank you for asking me to review. As I'm late to the party, I see that you have already substantially improved the documentation and paper thanks to a great review by Gavin.

I have no major objections to this paper and I think the software is a valuable contribution to the field. I've enjoyed playing with it so far. I would have liked a comparison against other tools, partly as I think this would attract users and increase the impact of sourcepredict, but I accept this is outside the scope of the paper.

Minor comments

  • File format descriptor for the taxonomic classification tables
    You use several terms to refer to the same thing throughout the documentation, such as OTU count table, TAXID abundance count table and abundance table. Please unify these.

  • Input filename in example
    Rather than calling the example input data dog_example.csv, could you make it a more informative filename? Even something like dog_sink_sample.csv, to tie it in with your source-sink narrative.

  • Reference formatting issue in the paper
    The formatting for the Kraken reference in the second paragraph of the paper needs fixing. At the moment the authors are outside the reference parentheses.

Suggestions

  • Continuity in syntax/documentation
    To be clear, I don't need to see this changed, but I wanted to raise it. I found the usage documentation a little confusing when you used the term abundance_table, for instance as the positional argument tag. Throughout the paper and documentation you have nicely set up the source-to-sink logic, and I think it is a shame not to have this continuity here.

  • Standardised format for input taxonomic classification tables
    Again, just a suggestion for a future release. It may be worth considering a standardised format, e.g. biom format. I only suggest this because the parsing required to get the TAXID and abundance values from different taxonomic classifiers may make this tool less appealing or inaccessible to some users. If you could take the standard output from one or more taxonomic classifiers (kraken/kaiju/metaphlan etc.) and run it straight into sourcepredict, that would lower the barrier to entry. Your kraken pipeline is a great step in this direction. You could maybe think about adding the kraken_parse script to this repo too - particularly as people often receive kraken reports as part of sequencing results, so they may not want or need to re-run kraken (or use mini kraken) before using sourcepredict.
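
To illustrate the kind of parsing step I mean, here is a minimal sketch in Python. It is not your kraken_parse script and I have not checked it against sourcepredict's expected input. It assumes the standard six-column, tab-separated Kraken report (percentage, clade reads, directly assigned reads, rank code, NCBI taxid, name) and writes a two-column TAXID/sample table. The function names, the choice of directly assigned reads rather than clade reads, the output header, and the toy report values are all my own assumptions.

```python
import csv
import io


def kraken_report_to_counts(report_text):
    """Map NCBI taxid -> reads assigned directly to that taxon.

    Assumes the standard six-column, tab-separated Kraken report.
    Using column 3 (direct reads) instead of column 2 (clade reads)
    is an illustrative choice, not sourcepredict's documented behaviour.
    """
    counts = {}
    for row in csv.reader(io.StringIO(report_text), delimiter="\t"):
        if len(row) < 6:
            continue  # skip blank or malformed lines
        direct_reads = int(row[2])
        taxid = int(row[4])
        if direct_reads > 0:
            counts[taxid] = direct_reads
    return counts


def counts_to_csv(counts, sample_name):
    """Render the counts as CSV text with a hypothetical TAXID,<sample> header."""
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["TAXID", sample_name])
    for taxid in sorted(counts):
        writer.writerow([taxid, counts[taxid]])
    return out.getvalue()


# Toy report with made-up read counts, for illustration only.
example_report = (
    " 10.00\t10\t10\tU\t0\tunclassified\n"
    " 90.00\t90\t0\tR\t1\troot\n"
    " 90.00\t90\t5\tD\t2\t  Bacteria\n"
    " 85.00\t85\t85\tS\t562\t      Escherichia coli\n"
)

example_counts = kraken_report_to_counts(example_report)
example_csv = counts_to_csv(example_counts, "dog_sink_sample")
```

Something along these lines, shipped with the repo, would let users who already have Kraken reports skip re-running the classifier.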
