Automatic expansion triggered when load factor was below minimum threshold #39

Closed
hermeseduardo opened this issue Feb 22, 2018 · 8 comments

Comments

@hermeseduardo

Hi there, I got this error message. Any ideas about what could have caused it?

[2018-02-22 08:36:13] INFO: Running Flye 2.3.2-gd46edb7
[2018-02-22 08:36:13] INFO: Assembling reads
[2018-02-22 08:36:13] INFO: Reading sequences
[2018-02-22 08:44:22] INFO: Generating solid k-mer index
[2018-02-22 08:46:32] INFO: Counting kmers (1/2):
0% 10% 20% 30% 40% 50% 60% 70% 80% 90% 100%
[2018-02-22 08:54:26] INFO: Counting kmers (2/2):
0% 10% 20% 30% 40% 50% 60% 70% 80% 90% 100%
[2018-02-22 09:21:11] INFO: Filling index table
0% 10% 20% 30% 40% 50% 60% 70% 80% 90% 100%
[2018-02-22 10:00:35] INFO: Extending reads
0% [2018-02-22 10:14:55] ERROR: Caught unhandled exception: Automatic expansion triggered when load factor was below minimum threshold
[2018-02-22 10:14:55] ERROR: flye-assemble(_Z16exceptionHandlerv+0x2d) [0x43c73d]
[2018-02-22 10:14:55] ERROR: /usr/lib64/libstdc++.so.6(+0x96706) [0x2aaaab277706]
[2018-02-22 10:14:55] ERROR: /usr/lib64/libstdc++.so.6(+0x96751) [0x2aaaab277751]
[2018-02-22 10:14:55] ERROR: /usr/lib64/libstdc++.so.6(+0xc1708) [0x2aaaab2a2708]
[2018-02-22 10:14:55] ERROR: /lib64/libpthread.so.0(+0x8744) [0x2aaaab789744]
[2018-02-22 10:14:55] ERROR: /lib64/libc.so.6(clone+0x6d) [0x2aaaaba87aad]
[2018-02-22 10:15:15] ERROR: Command '['flye-assemble', '-l', '/flush1/esc003/Flye_cynegetis_assembly/flye/flye.log', '-t', '20', '-v', '5000', '/flush2/esc003/Pacbio_subreads_smartbellremoved.fasta', '/flush1/esc003/Flye_cynegetis_assembly/flye/0-assembly/draft_assembly.fasta', '576716800', '/data/esc003/apps/Flye/flye/resource/asm_raw_reads.cfg']' returned non-zero exit status 1

mikolmogorov added a commit that referenced this issue Feb 23, 2018
@mikolmogorov
Owner

Hi,

Thanks for the report. It looks like there is a rare issue with the hash function for sequence ids: it does not distribute the keys uniformly in this case. I have incorporated a better hash function, which should solve the problem. Please try the updated version from the flye-devel branch. You can either check it out with git or download a zip archive: https://github.com/fenderglass/Flye/archive/flye-devel.zip

Let me know if this helps,

Mikhail
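
To illustrate the kind of failure described above (a sketch, not Flye's actual code): in a bucketed hash table, a hash function that sends many keys to the same few buckets can fill those buckets while most of the table stays empty, so the table is forced to grow even though its overall load factor is low. The functions `weak_hash` and `mixed_hash`, the table size, and the id-like integer keys below are all hypothetical, chosen only for the demonstration.

```python
from collections import Counter

NUM_BUCKETS = 1024  # hypothetical table size (a power of two)


def weak_hash(key: int) -> int:
    """Identity hash: preserves whatever structure the keys have."""
    return key


def mixed_hash(key: int) -> int:
    """64-bit finalizer from MurmurHash3 (fmix64); mixes all bits."""
    key &= 0xFFFFFFFFFFFFFFFF
    key ^= key >> 33
    key = (key * 0xFF51AFD7ED558CCD) & 0xFFFFFFFFFFFFFFFF
    key ^= key >> 33
    key = (key * 0xC4CEB9FE1A85EC53) & 0xFFFFFFFFFFFFFFFF
    key ^= key >> 33
    return key


def bucket_stats(keys, hash_fn):
    """Return (number of buckets used, largest bucket occupancy)."""
    counts = Counter(hash_fn(k) % NUM_BUCKETS for k in keys)
    return len(counts), max(counts.values())


# Hypothetical ids that are all multiples of 256, for example ids
# whose low bits are reserved for flags.
keys = [i * 256 for i in range(512)]

weak_used, weak_max = bucket_stats(keys, weak_hash)
mixed_used, mixed_max = bucket_stats(keys, mixed_hash)
print("identity hash:", weak_used, "buckets used, max occupancy", weak_max)
print("mixed hash:   ", mixed_used, "buckets used, max occupancy", mixed_max)
```

With these keys the identity hash uses only 4 of the 1024 buckets, with 128 keys in each, while the mixing hash spreads the same keys over far more buckets. A table with small fixed-capacity buckets would overflow in the first case long before it is anywhere near full.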

@hermeseduardo
Author

Thanks Mikhail, I tried it, but the problem is still there. Please let me know if you need additional information from my side.
Best,

Hermes

@mikolmogorov
Owner

Could you send me the log file and give some information about the dataset: technology, genome size, coverage, etc.?

@hermeseduardo
Author

Sure: PacBio subreads (RSII) in FASTA format, coverage about 40X, genome size around 500 Mb. Below is some info about the sbatch script, followed by the whole log file.

HE

#SBATCH --mem=1000GB
#SBATCH --cpus-per-task=20

/data/esc003/apps/Flye-flye-devel/bin/flye --pacbio-raw pacbio.fasta --genome-size 550m --threads 20 -m 5000 -o flye

[2018-02-24 01:57:36] root: DEBUG: Genome size: 576716800
[2018-02-24 01:57:36] root: INFO: Running Flye 2.3.2-release
[2018-02-24 01:57:36] root: DEBUG: Cmd: /data/esc003/apps/Flye-flye-devel/bin/flye --pacbio-raw /flush2/esc003/Pacbio_subreads_smartbellremoved_headers_removed.fasta --genome-size 550m --threads 20 -m 5000 -o flye
[2018-02-24 01:57:36] root: INFO: Assembling reads
[2018-02-24 01:57:36] root: DEBUG: -----Begin assembly log------
[2018-02-24 01:57:36] root: DEBUG: Running: flye-assemble -l /flush1/esc003/Flye_cynegetis_assembly/flye/flye.log -t 20 -v 5000 /flush2/esc003/Pacbio_subreads_smartbellremoved_headers_removed.fasta /flush1/esc003/Flye_cynegetis_assembly/flye/0-assembly/draft_assembly.fasta 576716800 /data/esc003/apps/Flye-flye-devel/flye/resource/asm_raw_reads.cfg
[2018-02-24 01:57:36] DEBUG: Build date: Feb 23 2018 21:50:23
[2018-02-24 01:57:36] DEBUG: Parameters:
[2018-02-24 01:57:36] DEBUG: kmer_size=15
[2018-02-24 01:57:36] DEBUG: kmer_size_big=17
[2018-02-24 01:57:36] DEBUG: big_genome_threshold=50000000
[2018-02-24 01:57:36] DEBUG: maximum_jump=1500
[2018-02-24 01:57:36] DEBUG: maximum_overhang=1500
[2018-02-24 01:57:36] DEBUG: hard_min_coverage_rate=10
[2018-02-24 01:57:36] DEBUG: repeat_coverage_rate=10
[2018-02-24 01:57:36] DEBUG: jump_divergence_rate=2
[2018-02-24 01:57:36] DEBUG: overlap_divergence_rate=5
[2018-02-24 01:57:36] DEBUG: penalty_window=100
[2018-02-24 01:57:36] DEBUG: max_coverage_drop_rate=5
[2018-02-24 01:57:36] DEBUG: chimera_window=100
[2018-02-24 01:57:36] DEBUG: min_reads_in_contig=4
[2018-02-24 01:57:36] DEBUG: max_inner_reads=10
[2018-02-24 01:57:36] DEBUG: max_inner_fraction=0.25
[2018-02-24 01:57:36] DEBUG: max_separation=500
[2018-02-24 01:57:36] DEBUG: tip_length_threshold=20000
[2018-02-24 01:57:36] DEBUG: unique_edge_length=50000
[2018-02-24 01:57:36] DEBUG: min_repeat_res_support=0.5
[2018-02-24 01:57:36] DEBUG: out_paths_ratio=5
[2018-02-24 01:57:36] DEBUG: graph_cov_drop_rate=10
[2018-02-24 01:57:36] DEBUG: coverage_estimate_window=100
[2018-02-24 01:57:36] DEBUG: low_cutoff_warning=1
[2018-02-24 01:57:36] DEBUG: assemble_kmer_sample=1
[2018-02-24 01:57:36] DEBUG: assemble_gap=500
[2018-02-24 01:57:36] DEBUG: repeat_graph_kmer_sample=5
[2018-02-24 01:57:36] DEBUG: repeat_graph_gap=100
[2018-02-24 01:57:36] DEBUG: repeat_graph_max_kmer=500
[2018-02-24 01:57:36] DEBUG: read_align_kmer_sample=1
[2018-02-24 01:57:36] DEBUG: read_align_gap=500
[2018-02-24 01:57:36] DEBUG: read_align_max_kmer=500
[2018-02-24 01:57:36] DEBUG: Running with k-mer size: 17
[2018-02-24 01:57:36] INFO: Reading sequences
[2018-02-24 02:05:51] DEBUG: Mean read length: 8958
[2018-02-24 02:05:51] DEBUG: Estimated coverage: 46
[2018-02-24 02:05:51] INFO: Generating solid k-mer index
[2018-02-24 02:05:51] DEBUG: Hard threshold set to 4
[2018-02-24 02:05:51] DEBUG: Started kmer counting
[2018-02-24 02:08:01] INFO: Counting kmers (1/2):
[2018-02-24 02:15:22] INFO: Counting kmers (2/2):
[2018-02-24 02:33:14] DEBUG: Filtered 1247354 repetitive kmers
[2018-02-24 02:33:14] DEBUG: Estimated minimum kmer coverage: 8, 537665765 unique kmers selected
[2018-02-24 02:33:14] INFO: Filling index table
[2018-02-24 02:34:25] DEBUG: Solid kmers: 537665765
[2018-02-24 02:34:25] DEBUG: Kmer index size: 11555644070
[2018-02-24 02:56:33] INFO: Extending reads
[2018-02-24 03:07:29] DEBUG: Mean read coverage: 9
[2018-02-24 03:07:39] DEBUG: Assembled contig 1
With 7 reads
Start read: +m170808_50191
At position: 2
leftTip: 1 rightTip: 1
Suspicios: 2
Mean extensions: 4
Inner reads: 0
Length: 31157
[2018-02-24 03:07:39] DEBUG: Inner: 16 covered: 88 total: 6005312
[2018-02-24 03:07:42] DEBUG: Assembled contig 2
With 13 reads
Start read: -m170713_180044
At position: 1
leftTip: 1 rightTip: 1
Suspicios: 1
Mean extensions: 3
Inner reads: 0
Length: 79404
[2018-02-24 03:07:42] DEBUG: Inner: 82 covered: 200 total: 6005312
[2018-02-24 03:07:44] DEBUG: Assembled contig 3
With 12 reads
Start read: +m170515_163220
At position: 9
leftTip: 1 rightTip: 1
Suspicios: 2
Mean extensions: 2
Inner reads: 0
Length: 83051
[2018-02-24 03:07:44] DEBUG: Inner: 124 covered: 306 total: 6005312
[2018-02-24 03:07:45] DEBUG: Assembled contig 4
With 11 reads
Start read: +m170712_152947
At position: 2
leftTip: 1 rightTip: 1
Suspicios: 2
Mean extensions: 4
Inner reads: 0
Length: 75947
[2018-02-24 03:07:45] DEBUG: Inner: 196 covered: 429 total: 6005312
[2018-02-24 03:07:49] DEBUG: Assembled contig 5
With 20 reads
Start read: +m170516_157558
At position: 18
leftTip: 1 rightTip: 1
Suspicios: 2
Mean extensions: 4
Inner reads: 0
Length: 116272
[2018-02-24 03:07:49] DEBUG: Inner: 312 covered: 658 total: 6005312
[2018-02-24 03:07:58] DEBUG: Assembled contig 6
With 30 reads
Start read: -m170713_72622
At position: 16
leftTip: 1 rightTip: 1
Suspicios: 5
Mean extensions: 3
Inner reads: 0
Length: 157831
[2018-02-24 03:07:58] DEBUG: Inner: 444 covered: 972 total: 6005312
[2018-02-24 03:08:00] DEBUG: Assembled contig 7
With 25 reads
Start read: +m170516_207012
At position: 12
leftTip: 1 rightTip: 1
Suspicios: 2
Mean extensions: 5
Inner reads: 0
Length: 170570
[2018-02-24 03:08:00] DEBUG: Inner: 600 covered: 1393 total: 6005312
[2018-02-24 03:08:01] DEBUG: Assembled contig 8
With 22 reads
Start read: -m170713_59169
At position: 13
leftTip: 1 rightTip: 1
Suspicios: 2
Mean extensions: 4
Inner reads: 0
Length: 139444
[2018-02-24 03:08:01] DEBUG: Inner: 734 covered: 1720 total: 6005312
[2018-02-24 03:08:04] DEBUG: Assembled contig 9
With 23 reads
Start read: -m170809_48452
At position: 10
leftTip: 1 rightTip: 0
Suspicios: 2
Mean extensions: 4
Inner reads: 0
Length: 147044
[2018-02-24 03:08:04] DEBUG: Inner: 852 covered: 1976 total: 6005312
[2018-02-24 03:08:04] DEBUG: Assembled contig 10
With 22 reads
Start read: -m170712_111658
At position: 8
leftTip: 1 rightTip: 1
Suspicios: 2
Mean extensions: 6
Inner reads: 0
Length: 120833
[2018-02-24 03:08:04] DEBUG: Inner: 980 covered: 2388 total: 6005312
[2018-02-24 03:08:08] DEBUG: Assembled contig 11
With 14 reads
Start read: -m170609_42878
At position: 8
leftTip: 1 rightTip: 1
Suspicios: 3
Mean extensions: 5
Inner reads: 0
Length: 81458
[2018-02-24 03:08:08] ERROR: Caught unhandled exception: Automatic expansion triggered when load factor was below minimum threshold
[2018-02-24 03:08:08] ERROR: flye-assemble(_Z16exceptionHandlerv+0x2d) [0x43cc8d]
[2018-02-24 03:08:08] ERROR: /usr/lib64/libstdc++.so.6(+0x96706) [0x2aaaab277706]
[2018-02-24 03:08:08] ERROR: /usr/lib64/libstdc++.so.6(+0x96751) [0x2aaaab277751]
[2018-02-24 03:08:08] ERROR: /usr/lib64/libstdc++.so.6(+0xc1708) [0x2aaaab2a2708]
[2018-02-24 03:08:08] ERROR: /lib64/libpthread.so.0(+0x8744) [0x2aaaab789744]
[2018-02-24 03:08:08] ERROR: /lib64/libc.so.6(clone+0x6d) [0x2aaaaba87aad]
-----------End assembly log------------
[2018-02-24 03:08:24] root: ERROR: Command '['flye-assemble', '-l', '/flush1/esc003/Flye_cynegetis_assembly/flye/flye.log', '-t', '20', '-v', '5000', '/flush2/esc003/Pacbio_subreads_smartbellremoved_headers_removed.fasta', '/flush1/esc003/Flye_cynegetis_assembly/flye/0-assembly/draft_assembly.fasta', '576716800', '/data/esc003/apps/Flye-flye-devel/flye/resource/asm_raw_reads.cfg']' returned non-zero exit status 1
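
A quick check of one number in the log above, as a sketch: the command passes `--genome-size 550m`, while the log reports `Genome size: 576716800`. That value equals 550 × 1024 × 1024, which suggests (an inference from the log, not a statement about Flye's parser) that the `m` suffix is treated as a binary multiplier. The helper `parse_size` below is hypothetical and only reproduces this arithmetic.

```python
def parse_size(text: str) -> int:
    """Hypothetical parser: 'k', 'm', 'g' suffixes as powers of 1024."""
    multipliers = {"k": 1024, "m": 1024 ** 2, "g": 1024 ** 3}
    suffix = text[-1].lower()
    if suffix in multipliers:
        return int(text[:-1]) * multipliers[suffix]
    return int(text)  # plain number of bases, no suffix


genome_size = parse_size("550m")
print(genome_size)  # 576716800, the "Genome size" value in the log
```

So the size used internally is about 5% larger than 550,000,000, which is worth keeping in mind when comparing the log's coverage estimate with a hand calculation.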

@mikolmogorov
Owner

Thanks,

That seems strange; we have never encountered issues with the newer hash function before. It could potentially be machine- or OS-specific. Could you provide the output of "uname -a"? Are you also able to run the dataset on a different machine?

@hermeseduardo
Author

sure:
uname -a
Linux XXX 4.4.59-92.17-default #1 SMP Thu Apr 6 14:16:09 UTC 2017 (7bc489d) x86_64 x86_64 x86_64 GNU/Linux
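
When a cluster job cannot easily run interactive commands, roughly the same details can be gathered from Python's standard library. This is only a convenience sketch: `platform.uname()` and `platform.python_version()` are standard-library calls, while the `environment_report` helper name is made up for illustration.

```python
import platform


def environment_report() -> dict:
    """Collect OS and machine details similar to `uname -a`."""
    info = platform.uname()
    return {
        "system": info.system,
        "node": info.node,
        "release": info.release,
        "version": info.version,
        "machine": info.machine,
        "python": platform.python_version(),
    }


report = environment_report()
for field, value in report.items():
    print(f"{field}: {value}")
```

The node name can be redacted before posting, as was done with `XXX` above.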

@mikolmogorov
Owner

Thanks,

I have not been able to reproduce this problem on our datasets so far, which makes it hard to debug. Would it be possible for you to share the data? That would make debugging much easier. If so, you can write to me at fenderglass@gmail.com

@hermeseduardo
Author

It looks like it had something to do with the installation on the cluster; it seems to be working now (v 2.3.2).

thanks
