Using new ARG-ANNOT v3 database #94
Comments
What command are you attempting to run?
Hi,
I am running resistance gene detection, so:
srst2 --input_pe strainA_1.fastq.gz strainA_2.fastq.gz --output strainA_test --log --gene_db resistance.fasta
Thanks
Hayley
…________________________________
From: Kat Holt [notifications@github.com]
Sent: 29 September 2017 18:16
To: katholt/srst2
Cc: Hayley Wilson; Author
Subject: Re: [katholt/srst2] Using new ARG-ANNOT v3 database (#94)
What command are you attempting to run?
@hjb60 May I ask where you got the resistance.fasta from? The sequence header "AGly-Aac3-IIa:X51534:91-951:861" does not look like the one in the ARG-ANNOT v3 database, "(AGly)Aac3-IIa:X51534:91-951:861", or the one in our curated version of the same database, "203__Aac3-IIa_AGly__Aac3-IIa__882 no;no;Aac3-IIa;AGly;X51534;91-951;861". SRST2 extracts information from sequence headers that follow a specific format, as in the curated example above (please also refer to "Generating SRST2-compatible clustered database from raw sequences" for more details). An unknown-key error arises when this requirement is not fulfilled. You may want to compare your resistance database with the formal release of the ARG-ANNOT v3 database, or try our curated version, which has already been tested with SRST2.
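The header requirement described above can be checked before running SRST2. The following is a minimal sketch, not SRST2's own parsing code: it assumes the four-field, double-underscore layout `[clusterID]__[gene]__[allele]__[seqID]` followed by an optional annotation, as described in the SRST2 README, and the function names `parse_srst2_header` and `nonconforming_headers` are hypothetical.

```python
def parse_srst2_header(header):
    """Split a FASTA header into (cluster_id, gene, allele, seq_id).

    Returns None when the header does not fit the assumed layout
    [clusterID]__[gene]__[allele]__[seqID] [optional annotation].
    """
    tokens = header.lstrip(">").split(None, 1)
    if not tokens:
        return None
    # Only the first whitespace-delimited token carries the four fields.
    fields = tokens[0].split("__")
    if len(fields) != 4 or not all(fields):
        return None
    return tuple(fields)


def nonconforming_headers(fasta_lines):
    """Return every header line that does not fit the assumed layout."""
    return [
        line.rstrip("\n")
        for line in fasta_lines
        if line.startswith(">") and parse_srst2_header(line) is None
    ]


if __name__ == "__main__":
    # The three headers quoted in this thread.
    examples = [
        ">AGly-Aac3-IIa:X51534:91-951:861",
        ">(AGly)Aac3-IIa:X51534:91-951:861",
        ">203__Aac3-IIa_AGly__Aac3-IIa__882 no;no;Aac3-IIa;AGly;X51534;91-951;861",
    ]
    for bad in nonconforming_headers(examples):
        print("Not in the assumed SRST2 layout:", bad)
```

To check a whole database, the lines of the FASTA file can be passed in, for example `nonconforming_headers(open("resistance.fasta"))`. An empty result only means the headers fit this assumed layout; it does not guarantee that the clustering behind the IDs is sensible.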
Thanks for the advice - I had downloaded the fasta file of the resistance database from here:
http://en.mediterranee-infection.com/arkotheque/client/ihumed/_depot_arko/articles/1425/argannot-aa-v3-march2017_doc.fasta
When it initially didn't work, I tried removing the parentheses from the headers. I will try the curated version you linked to, and hopefully that will work.
Many thanks for the response
Hayley
…________________________________
From: Yu Wan [notifications@github.com]
Sent: 02 October 2017 12:07
To: katholt/srst2
Cc: Hayley Wilson; Author
Subject: Re: [katholt/srst2] Using new ARG-ANNOT v3 database (#94)
May I ask where you got the resistance.fasta from? The sequence header "AGly-Aac3-IIa:X51534:91-951:861" does not look like the one in the ARG-ANNOT v3 database, "(AGly)Aac3-IIa:X51534:91-951:861", or the one in our curated version of the same database, "203__Aac3-IIa_AGly__Aac3-IIa__882 no;no;Aac3-IIa;AGly;X51534;91-951;861".
SRST2 extracts information from sequence headers that follow a specific format, as in the curated example above (please also refer to Generating SRST2-compatible clustered database from raw sequences<https://github.com/katholt/srst2#generating-srst2-compatible-clustered-database-from-raw-sequences> for more details). An unknown-key error arises when this requirement is not fulfilled.
You may want to compare your resistance database with the formal release of the ARG-ANNOT v3 database<http://en.mediterranee-infection.com/arkotheque/client/ihumed/_depot_arko/articles/1424/arg-annot-nt-v3-march2017_doc.fasta>, or try our curated version<https://github.com/katholt/srst2/blob/master/data/ARGannot_r2.fasta>, which has already been tested on SRST2.
Unless you have a specific reason to do otherwise, I would suggest using our pre-formatted version of this resistance database (ARGannot_r2.fasta), which is in the /data directory.
I would like to use the newest version of the ARG-ANNOT database, but when I run the script I do not get the full genes output file. The error file contains the following:
09/20/2017 20:47:42 Processing SAMtools pileup...
Traceback (most recent call last):
  File "/software/pathogen/external/apps/usr/local/Python-2.7.13/bin/srst2", line 11, in <module>
    load_entry_point('srst2==0.2.0', 'console_scripts', 'srst2')()
  File "/software/pathogen/external/apps/usr/local/Python-2.7.13/lib/python2.7/site-packages/srst2/srst2.py", line 1729, in main
    db_reports, db_results = run_srst2(args,fileSets,args.gene_db,"genes")
  File "/software/pathogen/external/apps/usr/local/Python-2.7.13/lib/python2.7/site-packages/srst2/srst2.py", line 1264, in run_srst2
    db_results_list, fasta)
  File "/software/pathogen/external/apps/usr/local/Python-2.7.13/lib/python2.7/site-packages/srst2/srst2.py", line 1327, in process_fasta_db
    results,gene_list, db_report, cluster_symbols, max_mismatch)
  File "/software/pathogen/external/apps/usr/local/Python-2.7.13/lib/python2.7/site-packages/srst2/srst2.py", line 1422, in map_fileSet_to_db
    read_pileup_data(pileup_file, size, args.prob_err)
  File "/software/pathogen/external/apps/usr/local/Python-2.7.13/lib/python2.7/site-packages/srst2/srst2.py", line 337, in read_pileup_data
    allele_size = size[allele]
KeyError: 'AGly-Aac3-IIa:X51534:91-951:861'
Is there some additional formatting I should be doing before trying to use it? Sorry if this is obvious - this area is not my forte!