SRST2 & groot parsers #9

fmaguire · 2020-05-19T22:53:06Z

See a5e6d39 for caveats of groot parser and 674f14d for srst2 caveats

- The CIGAR string could be used to calculate some standardised metrics (e.g. %ID), but this is not implemented.
- Certain databases contain information related to other fields in their gene name (e.g. groot run with ARG-ANNOT has an accession and genomic location for the AMR gene), but not all databases do.
- Databases pre-prepared by groot also contain the `reference_database_id` in the `gene_name`, but this isn't guaranteed; it depends on how the database has been prepared for indexing.
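As a rough illustration of the unimplemented %ID idea, here is a minimal sketch rather than anything in the parser itself. It assumes identity is defined as matching columns over all alignment columns (one of several common definitions). It also assumes the CIGAR either uses the extended `=`/`X` operators or comes with an edit distance such as the SAM `NM` tag, because a plain `M` operator does not separate matches from mismatches. The function name `percent_identity` and its `edit_distance` parameter are hypothetical.

```python
import re

# One CIGAR operation: a run length followed by a SAM operator code.
_CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")


def percent_identity(cigar, edit_distance=None):
    """Approximate %ID as matching columns / alignment columns.

    Hypothetical helper: `edit_distance` is assumed to be the SAM NM
    value (mismatches + inserted + deleted bases). It is required
    whenever the CIGAR contains 'M', which hides mismatches.
    """
    ops = _CIGAR_OP.findall(cigar)
    # Reject strings that are not made up entirely of valid operations.
    if not ops or "".join(n + op for n, op in ops) != cigar:
        raise ValueError(f"not a valid CIGAR string: {cigar!r}")

    counts = {}
    for length, op in ops:
        counts[op] = counts.get(op, 0) + int(length)

    # Alignment columns: aligned bases plus insertions and deletions.
    columns = sum(counts.get(op, 0) for op in "M=XID")
    if columns == 0:
        raise ValueError("CIGAR has no aligned columns")

    if counts.get("M", 0):
        if edit_distance is None:
            raise ValueError("'M' operations need an edit distance")
        matches = columns - edit_distance
    else:
        matches = counts.get("=", 0)
    return 100.0 * matches / columns


print(percent_identity("8=2X"))                   # 80.0
print(percent_identity("10M", edit_distance=1))   # 90.0
```

Soft clips, hard clips, and skipped regions are ignored here, so the value describes the aligned region only.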

- Parser assumes the "fullgenes" gene typing output file from srst2.
- Occasionally, the annotation field contains more information depending on the database, but not in a consistent format across the dbs used.
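To make the annotation caveat concrete, here is a hedged sketch of one way to read such a report, not the parser from this PR. It assumes the file is tab-delimited with a header row, and it keeps the annotation as opaque free text instead of trying to split it. The function name `read_fullgenes`, the column names, and the sample rows are illustrative assumptions.

```python
import csv
import io


def read_fullgenes(handle):
    """Yield one dict per row of a tab-delimited report with a header.

    Hypothetical helper: the 'annotation' column name is an assumption.
    Its content is kept as free text (or None when empty) because the
    format is not consistent across databases.
    """
    reader = csv.DictReader(handle, delimiter="\t")
    for row in reader:
        annotation = (row.get("annotation") or "").strip()
        row["annotation"] = annotation or None
        yield row


# Made-up rows for illustration only; real reports have more columns.
sample = (
    "Sample\tgene\tannotation\n"
    "s1\tgeneA\tsome free text\n"
    "s2\tgeneB\t\n"
)
for record in read_fullgenes(io.StringIO(sample)):
    print(record["gene"], record["annotation"])
```

Leaving the annotation unparsed avoids guessing at a structure that only some databases provide.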

dfornika · 2020-05-21T00:28:59Z

Looks great, @fmaguire. Are you planning on doing any more work on this before we merge? The CI setup still needs some work, so we don't expect to get green checkmarks there.

fmaguire · 2020-05-21T18:32:11Z

I was planning on adding a couple more but won't get around to it until next week. So merge away!

fmaguire added 3 commits May 18, 2020 01:22

Add SRST2 parser

674f14d

- Parser assumes the "fullgenes" gene typing output file from srst2.
- Occasionally, the annotation field contains more information depending on the database, but not in a consistent format across the dbs used.

Tidy up and rename global variable

07b03c1

dfornika merged commit b9fa1b9 into pha4ge:master May 22, 2020

dfornika mentioned this pull request May 22, 2020

Add Parser: srst2 #2

Closed
