Assembling failed #3

Closed
SolayMane opened this issue Jul 23, 2018 · 8 comments

@SolayMane

I am trying to assemble a chloroplast genome from raw reads using:
get_organelle_reads.py -1 /sanhome2/trimmed/out2045_1.clean.fastq -2 /sanhome2/trimmed/out2045_2.clean.fastq -s pc_ref.fa -w 103 -J 3 -M 5 -o chloro_out -R 5 -k 75,85,95,105 -P 1000000
I have about 70,000,000 150 bp paired-end reads.

I got the "Assembling failed" error, and I think it is because the filtered paired-read files were empty, but I don't know why.
Below are some log excerpts to help understand this issue:

2018-07-20 17:50:45,343 - INFO: Separating filtered fastq file finished!
2018-07-20 17:50:47,679 - INFO: Assembling using SPAdes ...
2018-07-20 17:50:48,293 - ERROR: Error in SPAdes:
== Error == system call for: "['/home1/software/SPAdes-3.11.1-linux/bin/hammer', '/sanhome2/Organnelle/chloro_out/filtered_spades/corrected/configs/config.info']" finished abnormally, err code: 255

2018-07-20 17:50:48,298 - ERROR: Assembling failed.

Total Calc-cost 20067.3784549
Thanks you!
#############
config.info
; = HAMMER =
; input options: working dir, input files, offset, and possibly kmers
dataset /sanhome2/Organnelle/chloro_out/filtered_spades/input_dataset.yaml
input_working_dir /sanhome2/Organnelle/chloro_out/filtered_spades/tmp/hammer_BH7RTS
input_trim_quality 4
input_qvoffset
output_dir /sanhome2/Organnelle/chloro_out/filtered_spades/corrected

; == HAMMER GENERAL ==
; general options
general_do_everything_after_first_iteration 1
general_hard_memory_limit 250
general_max_nthreads 4
general_tau 1
general_max_iterations 1
general_debug 0

; count k-mers
count_do 1
count_numfiles 16
count_merge_nthreads 4
count_split_buffer 0
count_filter_singletons 0

; hamming graph clustering
hamming_do 1
hamming_blocksize_quadratic_threshold 50

; bayesian subclustering
bayes_do 1
bayes_nthreads 4
bayes_singleton_threshold 0.995
bayes_nonsingleton_threshold 0.9
bayes_use_hamming_dist 0
bayes_discard_only_singletons 0
bayes_debug_output 0
bayes_hammer_mode 0
bayes_write_solid_kmers 0
bayes_write_bad_kmers 0
bayes_initial_refine 1

; iterative expansion step
expand_do 1
expand_max_iterations 25
expand_nthreads 4
expand_write_each_iteration 0
expand_write_kmers_result 0

; read correction
correct_do 1
correct_discard_bad 0
correct_use_threshold 1
correct_threshold 0.98
correct_nthreads 4
correct_readbuffer 100000
correct_stats 1

Thank you for your help!

@Kinggerm
Owner

Can you show me the complete log file?
As you said, the filtered paired-read files were empty, so it could be a problem with read extending. I need your complete log to help you.
By the way, you have about 10G of raw data, which is really more than necessary. But that is another matter and should not be the reason for the failure.

@SolayMane
Author

Below is the content of the log file:

GetOrganelle v1.0.3a

This pipeline get organelle reads and genomes from genome skimming data by extending.
Find updates in https://github.com/Kinggerm/GetOrganelle and see README.md for more information.

/home1/software/GetOrganelle/get_organelle_reads.py -1 /sanhome2/trimmed/out2045_1.clean.fastq -2 /sanhome2/trimmed/out2045_2.clean.fastq -s pc_ref.fa -w 103 -J 3 -M 5 -o chloro_out -R 5 -k 75,85,95,105 -P 1000000

2018-07-20 12:16:20,997 - INFO: Unzipping reads ...
2018-07-20 12:16:20,997 - INFO: Unzipping reads finished.

2018-07-20 12:16:20,998 - INFO: Reading seeds ...
2018-07-20 12:16:20,998 - INFO: Making seed - bowtie2 index ...
2018-07-20 12:16:21,195 - INFO: Making seed - bowtie2 index finished.
2018-07-20 12:16:21,195 - INFO: Mapping reads to seed - bowtie2 index ...
2018-07-20 12:42:44,225 - INFO: Mapping finished.
2018-07-20 12:42:44,225 - INFO: Reading seeds finished.

2018-07-20 12:42:44,226 - INFO: Pre-reading fastq ...
2018-07-20 13:01:24,167 - INFO: 133898628 candidates in all 154713318 reads
2018-07-20 13:01:24,629 - INFO: Pre-reading fastq finished.

2018-07-20 13:01:24,630 - INFO: Pre-grouping reads...
2018-07-20 13:01:38,120 - INFO: 1000000/9475444 used/duplicated
2018-07-20 13:05:27,871 - INFO: 53791 groups made.

2018-07-20 13:05:56,287 - INFO: Adding initial words ...
2018-07-20 13:16:32,772 - INFO: Adding initial words finished.

2018-07-20 13:16:32,773 - INFO: Extending ...
2018-07-20 13:36:12,155 - INFO: Round 1: 133898628/133898628 AI 20604128 AW 331308344
2018-07-20 14:11:32,280 - INFO: Round 2: 133898628/133898628 AI 30262507 AW 431633632
2018-07-20 14:32:01,445 - INFO: Round 3: 133898628/133898628 AI 37325034 AW 507733592
2018-07-20 14:53:01,108 - INFO: Round 4: 133898628/133898628 AI 42667564 AW 566007848
2018-07-20 15:14:54,873 - INFO: Round 5: 133898628/133898628 AI 46775908 AW 611042474
2018-07-20 15:14:54,875 - INFO: Hit the round limit 5 and terminated ...
2018-07-20 17:46:45,566 - INFO: Extending finished.

2018-07-20 17:46:45,567 - INFO: Separating filtered fastq file ...
2018-07-20 17:50:45,343 - INFO: Separating filtered fastq file finished!
2018-07-20 17:50:47,679 - INFO: Assembling using SPAdes ...
2018-07-20 17:50:48,293 - ERROR: Error in SPAdes:
== Error == system call for: "['/home1/software/SPAdes-3.11.1-linux/bin/hammer', '/sanhome2/Organnelle/chloro_out/filtered_spades/corrected/configs/config.info']" finished abnormally, err code: 255

2018-07-20 17:50:48,298 - ERROR: Assembling failed.

Total Calc-cost 20067.3784549
Thanks you!

@Kinggerm
Owner

Kinggerm commented Jul 24, 2018

Sorry for the trouble and thanks a lot for the information!
As far as I can tell from the log file, the extending is normal. But you mentioned that the filtered paired-read files were empty, so could you show me a few reads from both out2045_1.clean.fastq and out2045_2.clean.fastq by running:

head -n 8 /sanhome2/trimmed/out2045_1.clean.fastq
head -n 8 /sanhome2/trimmed/out2045_2.clean.fastq

I want to check whether the problem is with the read-format detection.

@SolayMane
Author

head -n 8 /sanhome2/trimmed/out2045_1.clean.fastq:

@SRR6062045.201.1 201 length=151
TGGAGAACAAAGGATTTTATGTGCCAGTGGTGATCCTTTTTCAAATCTTGCTTTCTTCTAACTCTGGTTATTGCTTTTGTAGTGGTGGTGAGGTGCTCTGTGATTTTGTACTTCTAACTCTTCTTTCTCGTCTGTATGTGCACGTACAA
+
AFFJJJJJJJJJJAJJ7-FFJFJJJAFJAJ-<--<J<FAFAJFJJJFJJJFAF-FJJJF<FAAFFJJA-JA777<<AF<77A7<JF7JFJFAAFJFJ<F-<7A-FJFJFFJAJFAAFFAFJ7AJJ7FAFJ7AAJJ7A-<7<<FF<FJJF
@SRR6062045.202.1 202 length=151
TGCTTAAAGTTCATTCAAATTACAAAAATTAATTTAAGAAATTATGTAAAAATATCTACACAAAAATTAATTTCTTTCCCCTTTTTTTGTTTTTTAAACTAAACTAACCCTAAATTAACTTGTGCATACTGTCATCTGGAGCAAAAAAG
+
AFFJJJJFJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJFJJJJJJJJJJJJJJJJJJJJJJJJJJJJFJJJJJJJJJJJJJJJJJFJJ7FFJ<FJJJJJ<JJFJJJJFFFAJJ-AAFJJ<FJJJJFFFJJJ-<AJJJJJJA

head -n 8 /sanhome2/trimmed/out2045_2.clean.fastq:
@SRR6062045.201.2 201 length=151
AGGAAGCAATAGATCTCTCTCTCAACGACAAGAGAGTGCTCCCTTCCCCTTCTACTTTACAAAAACCGTGAAACGTAAGCATCTGCAAACCACAAATCTACCCCCTGAATTGAAATCAAAATTAAAAGACTAGTTGTACGTGCACATACAG
+
AAFFFJJJJJJ7JFJ<FJFAF-FJJJFJA<AAJFFF7FFJFJJJJAJJJJFJFFJJFJFJJJJJJJFF-AFJJJJJJJFAJAFFFJAJF<FJJFA-<FAJFF7FFAJJAAAFFJF-7<FJAF-<<-<777<<7<F7<<---<-A7F<F7F-
@SRR6062045.202.2 202 length=151
GGTGGGTTCCTTGTGGCAACCGGTCAATGCCAACCCCCTTGGTGGGGCCGCTGGCTTCCTGATTTACCTCTCACACGATGCTCGTGGGGCCGGGTGGTGTGGGCCGGTTAGGGTGTCCGGACAATACACCTTTTTTGCTCCAGATGACAGT
+
A-FFFJJJJJJJJJJFJJJJJJJJFJJJJJJJJJJJJJF<FJFJJJJJJJJJJJJ<JJJJJJJJJJJJJJFJJFJJJJJJJ-FJ<<JFJJJJJJ-7AAJFJJFFJJJAJAJJJ7J<AJJJFJF7-<-<<AA<<FJJ-A-7<FJ<FF-A-A-

@Kinggerm
Owner

Thanks!
Now I see the problem. The read headers are not compatible with this GetOrganelle version.
I am going to fix this later for your type of data and will let you know once it is done. It should be quick.
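
For reference, the incompatibility is visible in the read names themselves. Here is a minimal check; the exact format expected for pair matching is assumed here (the common "/1"/"/2" or Casava 1.8 style conventions), not quoted from the code:

# Compare the first whitespace-delimited token of the first read name in each file.
head -n 1 /sanhome2/trimmed/out2045_1.clean.fastq | cut -d' ' -f1
# @SRR6062045.201.1
head -n 1 /sanhome2/trimmed/out2045_2.clean.fastq | cut -d' ' -f1
# @SRR6062045.201.2
# These SRA-dumped names end in ".1"/".2" rather than "/1"/"/2" (or the
# Casava 1.8 "NAME 1:N:0:..." form), so pair-name detection written for
# those conventions would treat the two files as unmatched.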

@SolayMane
Author

Thank you very much!

@Kinggerm
Owner

Hi, I made a few changes so that it works for a small test dataset. You can simply use git pull to update GetOrganelle.
Let me know whether you can get through your data with the new version. Also, I strongly suggest you reduce your dataset to about 2G per end for plastome assembly; 2G is really enough and much faster. A rough sketch of both steps follows below.
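
The sketch below assumes GetOrganelle was installed via git clone into /home1/software/GetOrganelle and that seqtk is available (both are assumptions, and the subsampled file names are just examples). With 151 bp reads, about 13 million reads per end is roughly 2G of bases per file:

# Update GetOrganelle in place (assumes a git clone at this path).
cd /home1/software/GetOrganelle && git pull

# Downsample each end to ~13 million reads; using the same seed (-s100)
# for both files keeps the read pairs in sync.
seqtk sample -s100 /sanhome2/trimmed/out2045_1.clean.fastq 13000000 > out2045_1.sub.fastq
seqtk sample -s100 /sanhome2/trimmed/out2045_2.clean.fastq 13000000 > out2045_2.sub.fastq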

@SolayMane
Author

Hi, thank you for your help, the issue is solved!
