CANU is failing - bogart issue #1323
Can you provide more details on how it is failing, and post the unitigger.err log? I would guess it is similar to #1281: because the genome size is so low, bogart tries to load lots of overlaps and doesn't have enough memory (note that the metagenomic FAQ parameters explicitly increase bogart memory to avoid this). If it is the same issue, you can edit unitigger.sh similarly (increase the genome size, increase the -M option) and resume canu. As for total runtime, it depends on how much data you have; you're also dropping the minimum overlap and read lengths much lower than the defaults, which will add to runtime.
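The edit-and-resume step described above can be sketched in shell. This is an illustrative sketch only: the 4-unitigger directory path and the "-gs" (genome size) flag are assumptions, while "-M" (memory limit) is the option named in the comment. A stand-in unitigger.sh is created here just so the edit can be demonstrated end to end; in a real run you would edit the script canu already wrote.

```shell
# Sketch only: the 4-unitigger path and the "-gs" flag are assumptions;
# "-M" (memory limit, in GB) is the option named in the comment above.
# A stand-in unitigger.sh is created so the edit can be demonstrated.
mkdir -p run1/unitigging/4-unitigger
cat > run1/unitigging/4-unitigger/unitigger.sh <<'EOF'
bogart -gs 10000 -M 16 -o ../run1
EOF

# Raise the genome size to 1 Mbp and the memory limit to 64 GB.
sed -i -e 's/-gs 10000/-gs 1000000/' \
       -e 's/-M 16/-M 64/' \
       run1/unitigging/4-unitigger/unitigger.sh

cat run1/unitigging/4-unitigger/unitigger.sh
```

After an edit like this, re-issuing the original canu command resumes the run from the failed step.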
Many thanks. I will try to change genomeSize and minOverlapLength. The thing is that the genome sizes really are around 10k; if I increased it, might the assembly be incorrect? The unitigger.err says what you assumed: not enough memory. Do you have any idea how much memory would be enough? I could use 500.

unitigger.err:

==> PARAMETERS.

Resources:
Lengths:
Overlap Error Rates:
Deviations:
Edge Confusion:
Unitig Construction:
Debugging Enabled:

==> LOADING AND FILTERING OVERLAPS.

ReadInfo()-- Using 140318 reads, no minimum read length used.
OverlapCache()-- limited to 16384MB memory (user supplied).
OverlapCache()-- 1MB for read data.
The genome size won't make the assembly wrong; it's just used to compute some statistics and to guess at the coverage in your dataset. You can of course increase the memory, and I expect 500 will be enough. However, the results of increasing the genome size or the memory won't be very different: you don't really need 5000 overlaps per read to assemble the amplicon. I would increase the genome size to 1m and see if it runs in the current memory.
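To see why a tiny genomeSize inflates the overlap load: the coverage guess is roughly total read bases divided by genomeSize, so a 10k setting makes the same data look thousands of times deeper than a 1m setting. A toy calculation, with a made-up read total (not a figure from this run):

```shell
# Illustrative only: total_bases is a hypothetical figure, not from the log.
total_bases=1400000000   # ~1.4 Gbp of reads, assumed for the example

echo "coverage at genomeSize=10k: $(( total_bases / 10000 ))x"
echo "coverage at genomeSize=1m:  $(( total_bases / 1000000 ))x"
```

At 10k the estimated coverage is 100x higher than at 1m, which is why bogart tries to keep far more overlaps per read and runs out of memory.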
Dear Sergey!
I was trying different things and Canu was still failing in the last stages. Could I share some data with you, so you could maybe see what is wrong with it?
I would be really happy if you could help me. Trying different things is really time-consuming, since it takes days to fail again.
Sorry for bothering you and many thanks!
Anja
___________________________________________________
Anja Pecman
Mlada raziskovalka / PhD Student
Nacionalni inštitut za biologijo / National Institute of Biology (http://www.nib.si/eng/)
Oddelek za biotehnologijo in sistemsko biologijo / Department of Biotechnology and Systems Biology
Večna pot 111, SI-1000 Ljubljana, Slovenia
Phone: + 386 (0)59 232 823
Fax: + 386 (0)1 257 38 47
E-mail: anja.pecman@nib.si
Sure, you can upload your full run directory, or just the reads, following the instructions in the FAQ. It shouldn't take days to re-run anything in the bogart step; that's all you need to run to test changing the memory/genome size.
Did you ever upload any data? I don't see anything on the FTP site.
Dear Sergey,
I have just uploaded the testAP.fastq file. I was re-basecalling the data, which is why it took so long.
I would like to take a metagenomic approach, and when running Canu I tried these options:
overlapper=mhap utgReAlign=true corOutCoverage=10000 corMhapSensitivity=high minReadLength=100 minOverlapLength=100 corMinCoverage=0 (found in canu manual)
or this command
obtOverlapper=mhap obtReAlign=raw utgOverlapper=mhap utgReAlign=raw corOutCoverage=10000 corMhapSensitivity=high minReadLength=100 minOverlapLength=100 corMinCoverage=0 (found on github).
Do you think that this could work?
As usual, bogart failed.
Many thanks for any help!
Best regards,
Anja
The docs are going to be more up to date, since there are GitHub issues referring to canu versions that no longer exist. Your second command seems reasonable, though the metagenomic options also increase the bat memory, which you've omitted. However, I am not sure what assembling direct RNA means; aren't these already full-length transcripts, so is there anything to assemble? Are you trying to see if you can assemble RNA viruses?
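For reference, the second option set with the bat memory added back might look like the sketch below. The batMemory=200 value is an assumption for a 250 GB machine, not a value from this thread, and the input path is illustrative; check the metagenomic FAQ for current recommendations.

```shell
# Sketch only: batMemory=200 is an assumed value for a 250 GB machine,
# and the input path is illustrative; the other options are the ones
# quoted in the thread.
canu -p run1 -d run1 genomeSize=1m \
  -nanopore-raw /DATA/run1.fa \
  obtOverlapper=mhap obtReAlign=raw \
  utgOverlapper=mhap utgReAlign=raw \
  corOutCoverage=10000 corMhapSensitivity=high \
  minReadLength=100 minOverlapLength=100 corMinCoverage=0 \
  batMemory=200
```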
Dear all,
I am trying to run Canu on my data, and it keeps failing before the assembly step (bogart fails), so I have corrected and trimmed data as the output. I have direct RNA data, and I am trying a metagenomic approach to see what is in a sample of plant material infected with viruses/viroids. The transcripts and genomes should not be longer than 20k.
The command I was running was:
/canu -d run1 -p run1 genomeSize=20k -nanopore-raw /DATA/run1.fa overlapper=mhap utgReAlign=true corOutCoverage=10000 corMhapSensitivity=high minReadLength=100 minOverlapLength=100 corMinCoverage=0 minMemory=100 maxMemory=200 maxThreads=24
I also tried some other variants of the command, which I found by reading the various "bogart failed" issues, but without success. Also, the assembly takes 10 days (we have a server with 36 threads and 250 GB of memory): is this normal?
Many thanks for any help!