implement MEGAHIT #244

Closed
levinas opened this issue Oct 14, 2014 · 7 comments
@levinas
Contributor

levinas commented Oct 14, 2014

No description provided.

@sebhtml
Contributor

sebhtml commented Feb 12, 2015

root@kbase-devel:/kbase/arast/assembly# git log --oneline|grep megahit
7bcc026 megahit: normalize memory limit using the total thread count
af59dd0 plugins: remove foo=bar option for megahit
9b39c30 plugins: add megahit plugin
e4600be plugins: add megahit configuration file
8c5061c Remove megahit version and release numbers
1c13a6e tools: add megahit package

@sebhtml sebhtml closed this as completed Feb 12, 2015
@levinas
Contributor Author

levinas commented Feb 12, 2015

The test completed successfully. I'm reopening the issue to make sure the logging is correct.

Here's the megahit.out file I see in the work directory (note that the --input-cmd parameter contains only one file):

Command: megahit --cpu-only --num-cpu-threads 4 -l 512 -m 131448646144 --input-cmd p1.fq -o megahit
MEGAHIT v0.2.0
[Thu Feb 12 16:25:56 2015] Start assembly. Number of CPU threads 4.
[Thu Feb 12 16:25:56 2015] Extracting solid (k+1)-mers for k = 21
[Thu Feb 12 16:26:01 2015] Building graph for k = 21
[Thu Feb 12 16:26:05 2015] Assembling contigs from SdBG for k = 21
[Thu Feb 12 16:26:11 2015] Extracting iterative edges from k = 21 to 31
[Thu Feb 12 16:26:12 2015] Building graph for k = 31
[Thu Feb 12 16:26:13 2015] Assembling contigs from SdBG for k = 31
[Thu Feb 12 16:26:14 2015] Extracting iterative edges from k = 31 to 41
[Thu Feb 12 16:26:14 2015] Building graph for k = 41
[Thu Feb 12 16:26:14 2015] Assembling contigs from SdBG for k = 41
[Thu Feb 12 16:26:14 2015] Extracting iterative edges from k = 41 to 51
[Thu Feb 12 16:26:15 2015] Building graph for k = 51
[Thu Feb 12 16:26:15 2015] Assembling contigs from SdBG for k = 51
[Thu Feb 12 16:26:15 2015] Extracting iterative edges from k = 51 to 61
[Thu Feb 12 16:26:15 2015] Merging to output final contigs.
[Thu Feb 12 16:26:15 2015] ALL DONE.

The actual command seems correct:

['/home/ubuntu/assembly/third_party/megahit/megahit', '--cpu-only', '--num-cpu-threads', '4', '-l', '512', '-m', '131448646144', '--input-cmd', u'cat /mnt/data/fangfang/119/raw/p2.fq /mnt/data/fangfang/119/raw/p1.fq', '-o', u'/mnt/data/fangfang/119/107/megahit_fff2ac6b-67d3-42d6-b160-d4ca6b04a211/megahit']

@levinas levinas reopened this Feb 12, 2015
@sebhtml
Contributor

sebhtml commented Feb 12, 2015

The argument in the list after "input-cmd" is basically "cat " + " ".join(files).
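As a rough illustration of that construction (a sketch only, not the actual plugin code): the function name `build_megahit_argv` and its parameter names are hypothetical, while the flags and default values are copied from the argument list quoted above.

```python
def build_megahit_argv(executable, files, out_dir,
                       threads=4, min_len=512, memory=131448646144):
    """Build an argument list shaped like the one shown in the log above.

    Hypothetical helper: the name and parameters are assumptions made
    for illustration; only the flags come from the quoted command.
    """
    # The whole "cat f1 f2 ..." string is ONE list element, so the
    # assembler receives it as a single --input-cmd value.
    input_cmd = "cat " + " ".join(files)
    return [
        executable,
        "--cpu-only",
        "--num-cpu-threads", str(threads),
        "-l", str(min_len),
        "-m", str(memory),
        "--input-cmd", input_cmd,
        "-o", out_dir,
    ]


argv = build_megahit_argv("megahit", ["p2.fq", "p1.fq"], "megahit")
print(argv)
```

Note that a plain `" ".join(files)` assumes the paths contain no spaces or shell metacharacters; if they might, each path would need quoting first.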

Do you want me to test with more than one file before closing the issue?

@levinas
Contributor Author

levinas commented Feb 12, 2015

I think the command is probably passed correctly. Can you look into how to get the "Command: megahit ..." line in the output file to reflect the "cat ..." subcommand correctly?

@sebhtml
Contributor

sebhtml commented Feb 12, 2015

This can be done by printing the command from within the megahit assembly service plugin.
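One possible way to do that, sketched under assumptions: the helper names `format_command` and `log_command` are hypothetical and not part of the existing plugin. The idea is to shell-quote each argument before joining, so the multi-word "cat ..." value stays visibly grouped in the "Command:" line.

```python
import shlex


def format_command(argv):
    """Render an argument list as a single copy-pasteable shell line.

    Hypothetical helper for illustration; shlex.quote wraps any
    argument containing spaces in single quotes.
    """
    return " ".join(shlex.quote(str(arg)) for arg in argv)


def log_command(argv, log_file):
    """Write a 'Command: ...' line to an open, writable text file."""
    log_file.write("Command: " + format_command(argv) + "\n")


example = ["megahit", "--input-cmd", "cat p2.fq p1.fq", "-o", "megahit"]
print(format_command(example))
```

Written this way, the logged line keeps the full `cat` subcommand as one quoted argument instead of showing only part of it.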

@levinas
Contributor Author

levinas commented Feb 12, 2015

MEGAHIT is now part of dev recipes and testing. The issue will be closed when Seb completes the test on big files.

@sebhtml
Contributor

sebhtml commented Feb 12, 2015

I tested with a pair of files.

seb@kbase-devel:~/bug-167/job-80$ arast get -j 80
File downloaded: 80_1.megahit_contigs.fa
File downloaded: 80_report.txt
File downloaded: 80_analysis.tar.gz
HTML extracted: 80_analysis/report.html

seb@kbase-devel:~/bug-167/job-80$ ls -lh
total 2.4M
-rw-rw-r-- 1 seb seb 2.3M Feb 12 23:08 80_1.megahit_contigs.fa
drwxrwxr-x 5 seb seb 4.0K Feb 12 23:08 80_analysis
-rw-rw-r-- 1 seb seb 6.3K Feb 12 23:08 80_report.txt

seb@kbase-devel:~/bug-167/job-80$ head 80_report.txt
All statistics are based on contigs of size >= 500 bp, unless otherwise noted (e.g., "# contigs (>= 0 bp)" and "Total length (>= 0 bp)" include all contigs).

Assembly                   megahit_contigs
# contigs (>= 0 bp)        2003
# contigs (>= 1000 bp)     77
Total length (>= 0 bp)     2300567
Total length (>= 1000 bp)  1807716
# contigs                  115
Largest contig             132002
Total length               1831638

seb@kbase-devel:~/bug-167/job-80$ arast stat -d|grep 80
| 80 | 4 | Complete | 1:20:47 | None | -p megahit |

@sebhtml sebhtml closed this as completed Feb 12, 2015