Full run command & expected output #21

slimsuite · 2017-07-18T06:54:24Z

Is there anywhere with a full run command including all the parameters that need to be set? Also, what is the nature and format of the output?

The -h information is pretty limited. I can see that I need to set -l seed_length_cutoff -ml merging_length_cutoff but don't really know how these are used to make an educated guess as to what to set.

esolares · 2017-07-18T22:10:18Z

Hi,

I recommend reading through the readme file. I have copied and pasted an excerpt for you here:

-l: controls the length cutoff for anchor contigs. A good rule of thumb is to start with the N50 of the self assembly. E.g. if the N50 of your self assembly is 2Mb then use 2000000 as your cutoff. Lowering this value may lead to more merging but may increase the probability of mis-joins.

-ml: controls the minimum alignment length to be considered for merging. This is especially helpful for repeat-rich genomes. Default is 0 but higher values (>5000) are recommended.

added note: we recommend using a higher -ml value based on what the expected repeat lengths will be. It is recommended for the length to be larger so that it can span an entire repeat.

Most of the time you will want to only modify the -l and -ml parameters.
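
As a rough illustration of the N50 rule of thumb for -l, here is a minimal sketch that computes the N50 of a FASTA file. The function names are made up for this example and are not part of quickmerge.

```python
def read_contig_lengths(fasta_path):
    """Return the length of every sequence in a FASTA file."""
    lengths = []
    current = None  # None until the first header line is seen
    with open(fasta_path) as handle:
        for line in handle:
            line = line.strip()
            if line.startswith(">"):
                if current is not None:
                    lengths.append(current)
                current = 0
            elif current is not None:
                current += len(line)
    if current is not None:
        lengths.append(current)
    return lengths


def n50(lengths):
    """Length L such that contigs of length >= L hold at least half of all bases."""
    if not lengths:
        raise ValueError("no contigs given")
    half = sum(lengths) / 2
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if running >= half:
            return length
```

Calling `n50(read_contig_lengths("self_assembly.fasta"))` would give a starting value for -l; whether to lower it afterwards is the trade-off described in the readme excerpt.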

I also recommend reading through our paper, linked below:
http://nar.oxfordjournals.org/content/early/2016/07/25/nar.gkw654.full

If you still have questions after reading through the paper, please feel free to follow up with us again.

Thank you

esolares · 2017-07-18T22:20:41Z

Also, to answer the first two questions:

The full run command is also listed in the readme, where L and M are positive integer values:
quickmerge -d out.rq.delta -q hybrid_assembly.fasta -r self_assembly.fasta -hco 5.0 -c 1.5 -l $L -ml $M
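
The quickmerge command above expects a filtered delta file, so a full run has three steps. The sketch below only assembles the three command lines. The quickmerge arguments come from this thread; the nucmer and delta-filter options (-l 100, -i 95 -r -q) are an assumption based on the usual quickmerge workflow and should be checked against your copy of the readme. `build_commands` and `run_pipeline` are hypothetical helper names.

```python
import subprocess


def build_commands(self_fasta, hybrid_fasta, l_cutoff, ml_cutoff, prefix="out"):
    """Return (nucmer, delta_filter, quickmerge, filtered_delta_path)."""
    delta = f"{prefix}.delta"
    filtered = f"{prefix}.rq.delta"
    # Assumed alignment settings; the self assembly is the reference.
    nucmer = ["nucmer", "-l", "100", "-p", prefix, self_fasta, hybrid_fasta]
    # delta-filter writes to stdout, so the caller must redirect it.
    delta_filter = ["delta-filter", "-i", "95", "-r", "-q", delta]
    # Arguments as given in the thread.
    quickmerge = [
        "quickmerge", "-d", filtered,
        "-q", hybrid_fasta, "-r", self_fasta,
        "-hco", "5.0", "-c", "1.5",
        "-l", str(l_cutoff), "-ml", str(ml_cutoff),
    ]
    return nucmer, delta_filter, quickmerge, filtered


def run_pipeline(self_fasta, hybrid_fasta, l_cutoff, ml_cutoff):
    """Run the three steps in order, stopping at the first failure."""
    nucmer, delta_filter, quickmerge, filtered = build_commands(
        self_fasta, hybrid_fasta, l_cutoff, ml_cutoff
    )
    subprocess.run(nucmer, check=True)
    with open(filtered, "w") as out:
        subprocess.run(delta_filter, stdout=out, check=True)
    subprocess.run(quickmerge, check=True)
```

Using `check=True` means a failed nucmer or delta-filter step raises an error instead of letting quickmerge run on a missing or empty delta file.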

The expected output is a fasta file named merge.fasta

There are also summary files:

aln_summary.tsv
anchor_summary.txt
summaryOut.txt

These contain summary information of alignments and overlaps.

KevinMcKernan · 2018-12-09T14:56:04Z

I'm getting a merged fasta file but all of the summary files are empty. I also don't see the -d out.JLXCBDrx.delta folder?

./quickmerge -d /home/tools/quickmerge/quickmerge/out.JLxCBDrx.delta -q /home/genome/V6_Paper_3.8Mb_181018/Jamaican_Lion_polished_V6_181018.merged.fa -r /home/genome/CBDrx/CBDrx/CBDrx_consensus_polished.contigs.fasta -hco 5.0 -c 1.5 -l 3800000 -ml 5000 -p JL_X_CBDrx

KevinMcKernan · 2018-12-09T14:58:39Z

/home/tools/quickmerge/quickmerge$ ls
LICENSE aln_summary_JL_X_CBDrx.tsv merge_wrapper.py param_summary_JL_X_CBDrx.txt
MUMmer3.23 anchor_summary_JL_X_CBDrx.txt merged_JL_X_CBDrx.fasta quast_results
README.md make_merger.sh merger quickmerge
/home/tools/quickmerge/quickmerge$ more aln_summary_JL_X_CBDrx.tsv
REF QUERY REF-LEN Q-LEN REF-ST REF-END Q-ST Q-END
/home/tools/quickmerge/quickmerge$ more anchor_summary_JL_X_CBDrx.txt
REF_NAME Q_NAME REF_LENGTH Q_LENGTH REF-ST REF-END Q-ST Q-END
/home/tools/quickmerge/quickmerge$ more param_summary_JL_X_CBDrx.txt
REF QUERY REF_START REF_END Q_START Q_END ORIENTATION INNIE(1/0) OVERLAP_LEN OVERLAP_PROP NO_OVERLAP_AT_ENDS OVERHANG

mahulchak · 2018-12-09T15:08:44Z

I think nucmer and/or delta-filter did not run properly. Without the delta file, quickmerge has no alignment data to process. Hence no output in the summary files. Could you check if you can run nucmer manually?
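
A quick way to confirm this diagnosis is to check whether the delta file exists and has content, and whether a summary file holds anything beyond its header. This is a minimal sketch with hypothetical helper names; it does not parse the delta format.

```python
import os


def delta_status(delta_path):
    """Classify a delta file as 'missing', 'empty' or 'present'."""
    if not os.path.exists(delta_path):
        return "missing"
    if os.path.getsize(delta_path) == 0:
        return "empty"
    return "present"


def header_only(summary_path):
    """True if the summary file has at most one non-blank line (the header)."""
    with open(summary_path) as handle:
        non_blank = [line for line in handle if line.strip()]
    return len(non_blank) <= 1
```

A result of "missing" or "empty" for the file passed to -d, together with header-only summaries, points at the alignment step rather than at quickmerge itself.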


KevinMcKernan · 2018-12-09T15:53:09Z

nucmer doesn't fire on the command line. I have MUMmer installed and can fire it off with ./nucmer.
PATH problem?
/home/tools/quickmerge/quickmerge/MUMmer3.23$ ./nucmer

USAGE: nucmer [options]

Try './nucmer -h' for more information.

KevinMcKernan · 2018-12-09T15:59:32Z

nucmer is now running on the command line but the results are the same as above. How do I test delta-filter?

KevinMcKernan · 2018-12-09T16:06:11Z

I tried using the python script and it may have provided more diagnostics.

/home/tools/quickmerge/quickmerge$ python merge_wrapper.py /home/genome/V6_Paper_3.8Mb_181018/Jamaican_Lion_polished_V6_181018.merged.fa /home/genome/CBDrx/CBDrx/CBDrx_consensus_polished.contigs.fasta
1: PREPARING DATA
2,3: RUNNING mummer AND CREATING CLUSTERS

reading input file "out.ntref" of length 746105410

construct suffix tree for sequence of length 746105410

(maximum reference length is 536870908)

(maximum query length is 4294967295)

process 7461054 characters per dot

/usr/bin/mummer: suffix tree construction failed: textlen=746105410 larger than maximal textlen=536870908
ERROR: mummer and/or mgaps returned non-zero
ERROR: Could not parse delta file, out.delta
error no: 400
Traceback (most recent call last):
  File "merge_wrapper.py", line 176, in <module>
    subprocess.call(mergercall)
  File "/usr/lib/python2.7/subprocess.py", line 172, in call
    return Popen(*popenargs, **kwargs).wait()
  File "/usr/lib/python2.7/subprocess.py", line 394, in __init__
    errread, errwrite)
  File "/usr/lib/python2.7/subprocess.py", line 1047, in _execute_child
    raise child_exception
OSError: [Errno 2] No such file or directory

mahulchak · 2018-12-09T16:40:48Z

Do you have MUMmer v4 installed on your system? Quickmerge ships with MUMmer v3, and the build that ran here is unable to handle the length of your sequences. The 64-bit build of MUMmer v3 should be able to handle the length, and I thought that's the version we have. But maybe not.
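
To see ahead of time whether an assembly would hit this limit, the total sequence length can be compared with the maximum reported in the log above (536870908). This is only an approximate sketch: the text length mummer reports can differ slightly from the raw base count, and the helper names are made up for this example.

```python
# Limit printed by the mummer build in the log above.
MUMMER_MAX_TEXTLEN = 536870908


def total_bases(fasta_path):
    """Sum the lengths of all sequence lines in a FASTA file."""
    total = 0
    with open(fasta_path) as handle:
        for line in handle:
            if not line.startswith(">"):
                total += len(line.strip())
    return total


def exceeds_limit(length, limit=MUMMER_MAX_TEXTLEN):
    """True if a sequence of this length is too long for that build."""
    return length > limit
```

For the reference in this thread, `exceeds_limit(746105410)` is true, which matches the "suffix tree construction failed" error.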


esolares · 2018-12-09T17:06:51Z

Hi,

It looks like the version of MUMmer you are using is from your /usr/bin folder and not the 64-bit version of MUMmer that is compiled with quickmerge when you initially compile it. After compilation, there should be a .qmbashrc script that you can source prior to running nucmer along with the other binaries. Could you please run ls -al inside your quickmerge directory and let us know the contents of that folder?

Thank you,
Edwin


KevinMcKernan · 2018-12-09T19:11:49Z

drwxrwxr-x 6 ubuntu ubuntu 4096 Dec 9 16:13 .
drwxrwxr-x 3 ubuntu ubuntu 4096 Dec 9 14:28 ..
drwxrwxr-x 8 ubuntu ubuntu 4096 Dec 9 14:28 .git
-rw-rw-r-- 1 ubuntu ubuntu 97 Dec 9 14:28 .quickmergerc
-rw-rw-r-- 1 ubuntu ubuntu 35142 Dec 9 14:28 LICENSE
drwxrwxr-x 6 ubuntu ubuntu 4096 Dec 9 14:28 MUMmer3.23
-rw-rw-r-- 1 ubuntu ubuntu 7125 Dec 9 14:28 README.md
-rw-rw-r-- 1 ubuntu ubuntu 50 Dec 9 16:13 aln_summary_out.tsv
-rw-rw-r-- 1 ubuntu ubuntu 62 Dec 9 16:13 anchor_summary_out.txt
-rw-rw-r-- 1 ubuntu ubuntu 1333462610 Dec 9 16:12 hybrid_oneline.fa
-rw-rw-r-- 1 ubuntu ubuntu 823 Dec 9 14:28 make_merger.sh
-rwxrwxr-x 1 ubuntu ubuntu 6434 Dec 9 14:28 merge_wrapper.py
-rw-rw-r-- 1 ubuntu ubuntu 1333462610 Dec 9 16:13 merged_out.fasta
drwxrwxr-x 2 ubuntu ubuntu 4096 Dec 9 14:28 merger
-rw-rw-r-- 1 ubuntu ubuntu 69 Dec 9 16:13 nucmer.error
-rw-rw-r-- 1 ubuntu ubuntu 1 Dec 9 16:13 out.mgaps
-rw-rw-r-- 1 ubuntu ubuntu 758540563 Dec 9 16:13 out.ntref
-rw-rw-r-- 1 ubuntu ubuntu 0 Dec 9 16:13 out.rq.delta
-rw-rw-r-- 1 ubuntu ubuntu 118 Dec 9 16:13 param_summary_out.txt
drwxrwxr-x 4 ubuntu ubuntu 4096 Dec 9 14:51 quast_results
lrwxrwxrwx 1 ubuntu ubuntu 17 Dec 9 14:28 quickmerge -> merger/quickmerge
-rw-rw-r-- 1 ubuntu ubuntu 746129254 Dec 9 16:13 self_oneline.fa

KevinMcKernan · 2018-12-09T19:12:39Z

We do have another copy of MUMmer installed.

esolares · 2018-12-09T19:18:24Z

Yes, that seems to be the issue. You should have a .qmbashrc in the quickmerge folder where you ran the quickmerge install bash script.


esolares · 2018-12-09T19:19:39Z

Please try sourcing .quickmergerc
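
After sourcing the file, it can help to confirm which binaries the shell will actually pick up. The sketch below resolves tool names against a PATH string and checks whether the result lives inside a given directory, such as the bundled MUMmer3.23 folder. What .quickmergerc sets is not shown in this thread, so the sketch only inspects the result; the helper names are hypothetical.

```python
import os
import shutil


def resolve_tools(names, path_env=None):
    """Map each tool name to the executable found on PATH, or None."""
    return {name: shutil.which(name, path=path_env) for name in names}


def uses_bundled(resolved_path, bundled_dir):
    """True if the resolved executable sits inside bundled_dir."""
    if resolved_path is None:
        return False
    real_tool = os.path.realpath(resolved_path)
    real_dir = os.path.realpath(bundled_dir)
    return real_tool.startswith(real_dir + os.sep)
```

For example, `resolve_tools(["nucmer", "mummer", "delta-filter"])` run from the same shell shows whether /usr/bin/mummer is still the one being found.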


KevinMcKernan · 2018-12-09T23:17:56Z

Thank you! It seems to be running now. How much compute is required for 1Gb X 1Gb genome?

esolares · 2018-12-09T23:33:13Z

Hi,

The longest step is the nucmer and delta-filter step. This can take hours. The quickmerge step is fairly quick, but everything will take a while since your genome is 1Gb. I believe for a 600Mb genome it took about 6-8 hours for everything. I can check my logs and give you a better estimate if you like.

Thank you,
Edwin


KevinMcKernan · 2018-12-09T23:48:29Z

I'll let it go overnight. Same box is assembling the organelle genomes on Canu.
If it's not done tomorrow AM, I'll fire it off on another box.

esolares · 2018-12-10T00:11:04Z

Awesome, sounds good. Looking forward to hearing about a successful run.


mahulchak closed this as completed Feb 24, 2019
