Full run command & expected output #21
Comments
Hi, I recommend reading through the README file. I have copied an excerpt for you here:
-l: controls the length cutoff for anchor contigs. A good rule of thumb is to start with the N50 of the self assembly. E.g. if the N50 of your self assembly is 2Mb, then use 2000000 as your cutoff. Lowering this value may lead to more merging but may increase the probability of mis-joins.
-ml: controls the minimum alignment length to be considered for merging. This is especially helpful for repeat-rich genomes. Default is 0, but higher values (>5000) are recommended.
Added note: we recommend choosing a higher -ml value based on the expected repeat lengths. The value should be large enough that an alignment can span an entire repeat.
Most of the time you will only want to modify the -l and -ml parameters.
I also recommend reading through our paper. Link below:
If you still have questions after reading through the paper, please feel free to follow up with us again. Thank you |
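As a rough illustration of the N50 rule of thumb for -l, the following Python sketch computes the N50 of a self assembly from its FASTA file. The helper names `read_contig_lengths` and `n50` are hypothetical and not part of quickmerge; the file name in the usage note is a placeholder.

```python
def read_contig_lengths(fasta_path):
    """Return the length of every sequence in a FASTA file, in file order."""
    lengths = []
    current = None  # length of the record being read, None before the first header
    with open(fasta_path) as handle:
        for line in handle:
            line = line.strip()
            if line.startswith(">"):
                if current is not None:
                    lengths.append(current)
                current = 0
            elif current is not None:
                current += len(line)
    if current is not None:
        lengths.append(current)
    return lengths


def n50(lengths):
    """Length of the contig at which the running total, longest first,
    reaches at least half of the assembly size."""
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length
    return 0  # empty assembly
```

For example, if `n50(read_contig_lengths("self_assembly.fasta"))` came out near 2000000, that value would be a reasonable starting point for -l under the rule of thumb quoted above.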
Also, to answer the first two questions: the full run command is listed in the README. The expected output is a FASTA file named merge.fasta. There are also summary files: aln_summary.tsv. These contain summary information about alignments and overlaps. |
I'm getting a merged FASTA file, but all of the summary files are empty. I also don't see the -d out.JLXCBDrx.delta folder?
./quickmerge -d /home/tools/quickmerge/quickmerge/out.JLxCBDrx.delta -q /home/genome/V6_Paper_3.8Mb_181018/Jamaican_Lion_polished_V6_181018.merged.fa -r /home/genome/CBDrx/CBDrx/CBDrx_consensus_polished.contigs.fasta -hco 5.0 -c 1.5 -l 3800000 -ml 5000 -p JL_X_CBDrx |
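For readers scripting this step, here is a minimal sketch that assembles the same argument list in Python. It only mirrors the flags that appear in the command above (-d, -q, -r, -hco, -c, -l, -ml, -p); the function name `build_quickmerge_command` and its parameter names are hypothetical, and nothing here checks that the delta file exists.

```python
def build_quickmerge_command(delta, query, reference, prefix,
                             length_cutoff, min_aln_length=5000,
                             hco=5.0, c=1.5, binary="./quickmerge"):
    """Return an argv list for quickmerge using the flags shown in this thread."""
    return [
        binary,
        "-d", delta,                  # delta file from nucmer / delta-filter
        "-q", query,                  # query assembly FASTA
        "-r", reference,              # reference assembly FASTA
        "-hco", str(hco),
        "-c", str(c),
        "-l", str(length_cutoff),     # anchor contig length cutoff
        "-ml", str(min_aln_length),   # minimum alignment length for merging
        "-p", prefix,                 # prefix used in output file names
    ]
```

The returned list can be handed to `subprocess.run(cmd, check=True)`, which raises if the program exits with a non-zero status.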
/home/tools/quickmerge/quickmerge$ ls |
I think nucmer and/or delta-filter did not run properly. Without the delta
file, quickmerge has no alignment data to process. Hence no output in the
summary files.
Could you check if you can run nucmer manually?
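One quick way to confirm whether alignment data exists is to look inside the delta file before running quickmerge. The sketch below assumes the usual MUMmer delta layout, in which every aligned reference/query pair starts with a line beginning with ">"; the helper name `count_delta_records` is hypothetical.

```python
import os


def count_delta_records(delta_path):
    """Count '>' header lines in a MUMmer delta file.

    Returns 0 if the file is missing or holds no aligned sequence pairs.
    """
    if not os.path.isfile(delta_path):
        return 0
    with open(delta_path) as handle:
        return sum(1 for line in handle if line.startswith(">"))
```

A result of 0 would be consistent with summary files that contain only their header lines.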
…On Sun, Dec 9, 2018, 20:29 KevinMcKernan ***@***.*** wrote:
/home/tools/quickmerge/quickmerge$ ls
LICENSE aln_summary_JL_X_CBDrx.tsv merge_wrapper.py
param_summary_JL_X_CBDrx.txt
MUMmer3.23 anchor_summary_JL_X_CBDrx.txt merged_JL_X_CBDrx.fasta
quast_results
README.md make_merger.sh merger quickmerge
/home/tools/quickmerge/quickmerge$ more aln_summary_JL_X_CBDrx.tsv
REF QUERY REF-LEN Q-LEN REF-ST REF-END Q-ST Q-END
/home/tools/quickmerge/quickmerge$ more anchor_summary_JL_X_CBDrx.txt
REF_NAME Q_NAME REF_LENGTH Q_LENGTH REF-ST REF-END Q-ST Q-END
/home/tools/quickmerge/quickmerge$ more param_summary_JL_X_CBDrx.txt
REF QUERY REF_START REF_END Q_START Q_END ORIENTATION INNIE(1/0) OVERLAP_LEN OVERLAP_PROP NO_OVERLAP_AT_ENDS OVERHANG
|
nucmer doesn't fire on the command line. I have MUMmer installed and can fire it off with ./nucmer. USAGE: nucmer [options] Try './nucmer -h' for more information. |
nucmer is now running on the command line, but the results are the same as above. How do I test delta-filter? |
I tried using the Python script and it may have provided more diagnostics.
/home/tools/quickmerge/quickmerge$ python merge_wrapper.py /home/genome/V6_Paper_3.8Mb_181018/Jamaican_Lion_polished_V6_181018.merged.fa /home/genome/CBDrx/CBDrx/CBDrx_consensus_polished.contigs.fasta
reading input file "out.ntref" of length 746105410
construct suffix tree for sequence of length 746105410
(maximum reference length is 536870908)
(maximum query length is 4294967295)
process 7461054 characters per dot
/usr/bin/mummer: suffix tree construction failed: textlen=746105410 larger than maximal textlen=536870908 |
Do you have MUMmer v4 installed on your system? Quickmerge ships with MUMmer v3,
and the build that ran here is unable to handle the length of your sequences.
The 64-bit build of MUMmer v3 should be able to handle the length, and I thought
that's the version we have. But maybe not.
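The length limit can be checked before launching a long run. This is a minimal sketch, assuming the 536870908 figure printed in the error message earlier in this thread is the ceiling for whichever mummer binary gets picked up; the constant and helper names are hypothetical.

```python
# Taken from the "maximal textlen" value in the mummer error message above.
# This is an assumption about the binary in use, not a fixed MUMmer constant.
REPORTED_MAX_TEXTLEN = 536870908


def total_bases(fasta_path):
    """Sum the sequence characters in a FASTA file, skipping header lines."""
    total = 0
    with open(fasta_path) as handle:
        for line in handle:
            if not line.startswith(">"):
                total += len(line.strip())
    return total


def exceeds_limit(n_bases, limit=REPORTED_MAX_TEXTLEN):
    """True if the sequence length is larger than the reported maximum."""
    return n_bases > limit
```

With the numbers from the log, `exceeds_limit(746105410)` is true, which matches the suffix tree failure reported by /usr/bin/mummer.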
…On Sun, Dec 9, 2018, 21:37 KevinMcKernan ***@***.*** wrote:
I tried using the python script and it may have provided more diagnostics.
/home/tools/quickmerge/quickmerge$ python merge_wrapper.py
/home/genome/V6_Paper_3.8Mb_181018/Jamaican_Lion_polished_V6_181018.merged.fa
/home/genome/CBDrx/CBDrx/CBDrx_consensus_polished.contigs.fasta
1: PREPARING DATA
2,3: RUNNING mummer AND CREATING CLUSTERS
reading input file "out.ntref" of length 746105410 construct suffix tree
for sequence of length 746105410 (maximum reference length is 536870908) (maximum
query length is 4294967295) process 7461054 characters per dot
/usr/bin/mummer: suffix tree construction failed: textlen=746105410 larger
than maximal textlen=536870908
ERROR: mummer and/or mgaps returned non-zero
ERROR: Could not parse delta file, out.delta
error no: 400
Traceback (most recent call last):
File "merge_wrapper.py", line 176, in
subprocess.call(mergercall)
File "/usr/lib/python2.7/subprocess.py", line 172, in call
return Popen(*popenargs, **kwargs).wait()
File "/usr/lib/python2.7/subprocess.py", line 394, in __init__
errread, errwrite)
File "/usr/lib/python2.7/subprocess.py", line 1047, in _execute_child
raise child_exception
OSError: [Errno 2] No such file or directory
|
Hi
It looks like the version of MUMmer you are using is from your /usr/bin
folder and not the 64-bit version of MUMmer that is compiled with quickmerge
when you initially compile it. After compilation, there should be a
.qmbashrc script that you can source prior to running nucmer along with the
other binaries. Could you please run ls -al inside your quickmerge directory
and let us know the contents of that folder?
Thank you,
Edwin
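To see which copies of the MUMmer tools the shell would actually run, the PATH lookup can be reproduced with Python's standard `shutil.which`. This is a small sketch; the helper name `locate_tools` and the default tool list are assumptions for illustration.

```python
import shutil


def locate_tools(names=("mummer", "nucmer", "delta-filter"), search_path=None):
    """Map each tool name to the first executable found on the search path.

    search_path=None uses the current PATH; a missing tool maps to None.
    """
    return {name: shutil.which(name, path=search_path) for name in names}


if __name__ == "__main__":
    for tool, location in locate_tools().items():
        print(tool, "->", location)
```

If the printed locations point at /usr/bin rather than the MUMmer directory inside the quickmerge folder, the system copy is taking precedence.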
|
drwxrwxr-x 6 ubuntu ubuntu 4096 Dec 9 16:13 . |
We do have another copy of MUMmer installed. |
Yes that seems to be the issue. You should have a .qmbashrc in the
quickmerge folder where you ran the quickmerge install bash script
|
Please try sourcing .quickmergerc
|
Thank you! It seems to be running now. How much compute is required for 1Gb X 1Gb genome? |
Hi
The longest step is the nucmer and delta-filter step. This can take hours.
The quickmerge step itself is fairly quick, but the whole run will take a while
since your genome is 1Gb. I believe for a 600Mb genome it took about 6-8 hours for
everything. I can check my logs and give you a better estimate if you like.
Thank you,
Edwin
|
I'll let it go overnight. Same box is assembling the organelle genomes on Canu. |
Awesome, sounds good. Looking forward to hearing about a successful run.
…On Sun, Dec 9, 2018, 3:48 PM KevinMcKernan ***@***.*** wrote:
I'll let it go overnight. Same box is assembling the organelle genomes on
Canu.
If it's not done tomorrow AM, I'll fire it off on another box.
|
Is there anywhere with a full run command including all the parameters that need to be set? Also, what is the nature and format of the output?
The -h information is pretty limited. I can see that I need to set -l seed_length_cutoff -ml merging_length_cutoff but don't really know how these are used to make an educated guess as to what to set.