Unable to assemble the D.melanogaster genome from 24kb HiFi reads #21

Closed
sebschmi opened this issue Feb 1, 2022 · 5 comments

@sebschmi
Contributor

sebschmi commented Feb 1, 2022

thread 'main' panicked at 'called Result::unwrap() on an Err value: Error { kind: BufferLimit }', src/main.rs:187:33

Reads taken from: https://sra-downloadb.be-md.ncbi.nlm.nih.gov/sos2/sra-pub-run-17/SRR1023860/SRR10238607.1

Command: utils/multik <reads> <output prefix> 56

Happens both with and without homopolymer compression.

@ekimb ekimb added the bug Something isn't working label Feb 1, 2022
@rchikhi rchikhi self-assigned this Feb 1, 2022
@rchikhi
Collaborator

rchikhi commented Feb 1, 2022

Hmm, weird, it worked for me. I did:

wget https://sra-downloadb.be-md.ncbi.nlm.nih.gov/sos2/sra-pub-run-17/SRR1023860/SRR10238607.1
fastq-dump SRR10238607.1
path/to/rust-mdbg/utils/multik SRR10238607.1.fastq prefix 6

It finished with an N50 of 1.5 Mbp and an assembly size of 262 Mbp.

@rchikhi
Collaborator

rchikhi commented Feb 1, 2022

Can you please paste the first 5 lines of your FASTQ file? It seems the crash is due to a problem parsing the beginning of the input.

@sebschmi sebschmi changed the title Unable to assembler the D.melanogaster genome from 24kb HiFi reads Unable to assemble the D.melanogaster genome from 24kb HiFi reads Feb 2, 2022
@sebschmi
Contributor Author

sebschmi commented Feb 2, 2022

I am using a FASTA file, as in the quick start example. Here are its first few lines:

>file.1 1 length=25473
TATAAATAGAAAATGAAAATAATAAAACAGCTACGGAAACTGAATCGGCAAACTGAAATGGAAACTGAAA
AGTTGCTGCCCAAGTTGAGCGGCATAAATTTGCGGCGTGTTTAGTGTTTAGTCAGCTAACTTTTTGCAGC
CACTGTCCCAGGGACATTTGCGAAAGTTGGCCAACTGAAACAAATGGCAACAAATTGTAAATAAATAAAA
GTTTTTCGGCCACTCCGGGGCTGCATACATAAATCCCATTGAATAATGCCGATCAATTTCGGACTTTTGT
GCCATGCCGTGCCGGACAAAGTTTTTGCAGCAGTGTACACATACAGTGGCGGCCGCGTGAATAGCGCTGG
AAATTTTAACTTGCCATTCGCTGTAATTAAAATGTATTAGAATAATTGAAAATGAAAATGTATTGGGTTA
TATTTTATAGCCATTTAAAATATGATAATAAGTTTATATTTCATTGTTCATATTATTGTATTATTACTTT
CATTTCTACCGCACTTGCACACTGTCCCACATAATTTACAATTAGATTGTGTCTGCCGGAGACTTGGAAA
CTCGATGCTGAACTTGATTATGCTACCGAAACGGTTTTGTTAATGTTTATGGAAATTGTTTTTCCTCCAG

I am on the head of master, by the way, at commit 9026156.

@rchikhi
Collaborator

rchikhi commented Feb 2, 2022

I think I see the problem: rust-mdbg dislikes multi-line FASTAs for now. Please convert your reads:

seqtk seq -AU reads.unformatted.fa > reads.singleline.fa
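
For readers without seqtk at hand, here is a minimal Python sketch of the same kind of conversion: it joins each record's wrapped sequence lines into one line and uppercases the bases. The function name `unwrap_fasta` is made up for illustration, and this is not how seqtk or rust-mdbg work internally; it only mimics the visible effect of the command above on a plain FASTA file.

```python
from typing import Iterable, Iterator


def unwrap_fasta(lines: Iterable[str]) -> Iterator[str]:
    """Yield FASTA lines with each record's sequence joined onto one line.

    Hypothetical helper for illustration. Sequence text that appears
    before the first '>' header is dropped.
    """
    seq_parts = []
    have_record = False
    for raw in lines:
        line = raw.strip()
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if have_record:
                yield "".join(seq_parts)
            yield line
            seq_parts = []
            have_record = True
        elif line:
            # Collect wrapped sequence chunks, uppercased.
            seq_parts.append(line.upper())
    if have_record:
        yield "".join(seq_parts)


if __name__ == "__main__":
    wrapped = [">read1 length=8\n", "acgt\n", "ACGT\n", ">read2\n", "TT\n"]
    # Prints four lines: ">read1 length=8", "ACGTACGT", ">read2", "TT"
    print("\n".join(unwrap_fasta(wrapped)))
```

To convert a real file, one could pass an open file handle as `lines` and write each yielded line, followed by a newline, to the output file.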

@sebschmi
Contributor Author

sebschmi commented Feb 2, 2022

Thank you very much, it seems to work now after converting to single-line FASTA!
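
A quick way to confirm that a file really is single-line before running the assembler is sketched below. The name `is_single_line_fasta` is hypothetical, and the check only reflects the symptom described in this thread (more than one sequence line per record); it is an assumption that this is the only property of the input that matters here.

```python
from typing import Iterable


def is_single_line_fasta(lines: Iterable[str]) -> bool:
    """Return True if no record has more than one non-empty sequence line."""
    seq_lines_in_record = 0
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # ignore blank lines
        if line.startswith(">"):
            seq_lines_in_record = 0  # a header starts a new record
        else:
            seq_lines_in_record += 1
            if seq_lines_in_record > 1:
                return False  # sequence is wrapped over several lines
    return True


if __name__ == "__main__":
    print(is_single_line_fasta([">a\n", "ACGT\n", ">b\n", "TT\n"]))  # True
    print(is_single_line_fasta([">a\n", "AC\n", "GT\n"]))  # False
```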
