-
Notifications
You must be signed in to change notification settings - Fork 65
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Crash when trying to create novel allele consensus fasta from ARG database #40
Comments
Hi Scott, I'm not sure what's going wrong here. |
It may be a problem with the relative path and the way the cluster that I'm running it on handles it with respect to my job submission script... I'm going to take a look in the morning and see if that's the issue and if I can fix it. Thanks for taking a look for me, too. |
It is a problem in the python code where it tries to prefix the supplied output on line 765. It works if the supplied output is a filename, but fails if it has directory path. I have edited the code and created a pull request to fix this issue. Thanks! |
Thanks to Yunyun for sorting this out! |
Thanks this is now tested and is being merged into the new release |
Howdy all,
Quick question, I am trying to recover the novel allele sequences from some samples being run against the ARG database included with SRST2. I'm getting the following error when trying to create the allele pileup:
06/17/2015 12:44:06 Generate pileup...
06/17/2015 12:44:06 Running: samtools mpileup -L 1000 -f /home/mresswl/archive/ARGFinder_DB/ARGannot.fasta -Q 20 -q 1 -B data_out/KPN166ec__KPN166ec.ARGannot.sorted.bam
[mpileup] 1 samples in 1 input files
Set max per-file depth to 8000
06/17/2015 12:44:11 Processing SAMtools pileup...
06/17/2015 12:44:34 Scoring alleles...
Traceback (most recent call last):
File "/share/apps/srst2/bin/srst2.py", line 1592, in
main()
File "/share/apps/srst2/bin/srst2.py", line 1548, in main
db_reports, db_results = run_srst2(args,fileSets,args.gene_db,"genes")
File "/share/apps/srst2/bin/srst2.py", line 1102, in run_srst2
db_reports, db_results_list = process_fasta_db(args, fileSets, run_type, db_reports, db_results_list, fasta)
File "/share/apps/srst2/bin/srst2.py", line 1164, in process_fasta_db
unique_gene_symbols, unique_allele_symbols,run_type,ST_db,results,gene_list,db_report,cluster_symbols,max_mismatch)
File "/share/apps/srst2/bin/srst2.py", line 1275, in map_fileSet_to_db
unique_gene_symbols, unique_allele_symbols, pileup_file)
File "/share/apps/srst2/bin/srst2.py", line 859, in parse_scores
allele_pileup_file = create_allele_pileup(top_allele, pileup_file) # XXX Creates a new pileup file for that allele. Not currently cleaned up
File "/share/apps/srst2/bin/srst2.py", line 765, in create_allele_pileup
with open(outpileup, 'w') as allele_pileup:
IOError: [Errno 2] No such file or directory: '77__OqxA_Flq__OqxA__1044.data_out/KPN166ec__KPN166ec.ARGannot.pileup'
So it seems like I'm having a problem creating a file / directory but I'm not certain why that is... the script is able to generate all the other output files it needs to... this is only an error when I try to run with --report_new_consensus. Any help is appreciated. This is happening for every paired end readset I attempt to generate the --report_new_consensus reports for.
Full program output is appended below, in case it is helpful.
06/17/2015 12:39:54 program started
06/17/2015 12:39:54 command line: /share/apps/srst2/bin/srst2.py --input_pe data_in/KPN166ec_1.fastq.gz data_in/KPN166ec_2.fastq.gz --report_new_consensus --gene_db /home/mresswl/archive/ARGFinder_DB/ARGannot.fasta --output data_out/KPN166ec
06/17/2015 12:39:54 Total paired readsets found:1
06/17/2015 12:39:54 Index for /home/mresswl/archive/ARGFinder_DB/ARGannot.fasta is already built...
06/17/2015 12:39:54 Processing database /home/mresswl/archive/ARGFinder_DB/ARGannot.fasta
06/17/2015 12:39:54 Non-unique:55, CmlA_PheCmlA5
06/17/2015 12:39:54 Non-unique:114, CfrA_Phe
06/17/2015 12:39:54 Processing sample KPN166ec
06/17/2015 12:39:54 Output prefix set to: data_out/KPN166ec__KPN166ec.ARGannot
06/17/2015 12:39:54 Aligning reads to index /home/mresswl/archive/ARGFinder_DB/ARGannot.fasta using bowtie2...
06/17/2015 12:39:54 Running: bowtie2 -1 data_in/KPN166ec_1.fastq.gz -2 data_in/KPN166ec_2.fastq.gz -S data_out/KPN166ec__KPN166ec.ARGannot.sam -q --very-sensitive-local --no-unal -a -x /home/mresswl/archive/ARGFinder_DB/ARGannot.fasta
1764353 reads; of these:
1764353 (100.00%) were paired; of these:
1760384 (99.78%) aligned concordantly 0 times
2397 (0.14%) aligned concordantly exactly 1 time
1572 (0.09%) aligned concordantly >1 times
----
1760384 pairs aligned concordantly 0 times; of these:
156 (0.01%) aligned discordantly 1 time
----
1760228 pairs aligned 0 times concordantly or discordantly; of these:
3520456 mates make up the pairs; of these:
3517876 (99.93%) aligned 0 times
1544 (0.04%) aligned exactly 1 time
1036 (0.03%) aligned >1 times
0.31% overall alignment rate
06/17/2015 12:43:47 Processing Bowtie2 output with SAMtools...
06/17/2015 12:43:47 Generate and sort BAM file...
06/17/2015 12:43:47 Running: samtools view -b -o data_out/KPN166ec__KPN166ec.ARGannot.unsorted.bam -q 1 -S data_out/KPN166ec__KPN166ec.ARGannot.sam.mod
[samopen] SAM header is present: 1683 sequences.
06/17/2015 12:43:50 Running: samtools sort data_out/KPN166ec__KPN166ec.ARGannot.unsorted.bam data_out/KPN166ec__KPN166ec.ARGannot.sorted
06/17/2015 12:44:06 Deleting sam and bam files that are not longer needed...
06/17/2015 12:44:06 Deleting data_out/KPN166ec__KPN166ec.ARGannot.sam
06/17/2015 12:44:06 Deleting data_out/KPN166ec__KPN166ec.ARGannot.sam.mod
06/17/2015 12:44:06 Deleting data_out/KPN166ec__KPN166ec.ARGannot.unsorted.bam
06/17/2015 12:44:06 Generate pileup...
06/17/2015 12:44:06 Running: samtools mpileup -L 1000 -f /home/mresswl/archive/ARGFinder_DB/ARGannot.fasta -Q 20 -q 1 -B data_out/KPN166ec__KPN166ec.ARGannot.sorted.bam
[mpileup] 1 samples in 1 input files
Set max per-file depth to 8000
06/17/2015 12:44:11 Processing SAMtools pileup...
06/17/2015 12:44:34 Scoring alleles...
Traceback (most recent call last):
File "/share/apps/srst2/bin/srst2.py", line 1592, in
main()
File "/share/apps/srst2/bin/srst2.py", line 1548, in main
db_reports, db_results = run_srst2(args,fileSets,args.gene_db,"genes")
File "/share/apps/srst2/bin/srst2.py", line 1102, in run_srst2
db_reports, db_results_list = process_fasta_db(args, fileSets, run_type, db_reports, db_results_list, fasta)
File "/share/apps/srst2/bin/srst2.py", line 1164, in process_fasta_db
unique_gene_symbols, unique_allele_symbols,run_type,ST_db,results,gene_list,db_report,cluster_symbols,max_mismatch)
File "/share/apps/srst2/bin/srst2.py", line 1275, in map_fileSet_to_db
unique_gene_symbols, unique_allele_symbols, pileup_file)
File "/share/apps/srst2/bin/srst2.py", line 859, in parse_scores
allele_pileup_file = create_allele_pileup(top_allele, pileup_file) # XXX Creates a new pileup file for that allele. Not currently cleaned up
File "/share/apps/srst2/bin/srst2.py", line 765, in create_allele_pileup
with open(outpileup, 'w') as allele_pileup:
IOError: [Errno 2] No such file or directory: '77__OqxA_Flq__OqxA__1044.data_out/KPN166ec__KPN166ec.ARGannot.pileup'
The text was updated successfully, but these errors were encountered: