Evaluation of large input files #45
Large input files (500 MB–1 GB) are working with VirSorter and PPR-Meta. I will test the other tools. However, the r_plot process takes a lot of time and for some files seems unable to terminate at all. Besides, the visualization is not really useful for large input sets, so I will deactivate it in a separate branch for my test runs.
Update: Now testing Marvel.
So you dropped all 29 metagenomes with >1 million contigs (per sample) on it? :D
Yeah... I thought the EBI cluster is huge, so just go for it, WtP! :D At the moment I am just running one sample with the -resume option, adding more and more tools (currently Marvel is running).
Marvel is super difficult to implement here, as it's analysing "bins" by default. So I need to split each contig into a separate FASTA file, and you have 1–2 million contigs per file.
Uff, I see. Maybe skip Marvel if too many contigs are provided? I mean, it's just due to how Marvel is implemented and not really an issue of WtP.
Yep, I was thinking about an "autoconfig" depending on the "assemblystats" of the input.
I think that is a good idea, and it should report back to the user what was deactivated and why.
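A minimal sketch of what such an autoconfig could do, assuming a per-tool contig-count limit (the `TOOL_LIMITS` thresholds below are made-up placeholders, not actual WtP defaults):

```python
# Hypothetical per-tool contig-count limits -- NOT real WtP defaults.
TOOL_LIMITS = {
    "marvel": 100_000,   # splits every contig into its own file
    "r_plot": 500_000,   # visualization is slow/unhelpful at this scale
}

def autoconfig(contig_count, limits=TOOL_LIMITS):
    """Decide which tools stay enabled for a given assembly size.

    Returns (enabled, disabled) where disabled maps each deactivated
    tool to a human-readable reason, so the pipeline can report back
    to the user what was turned off and why.
    """
    enabled, disabled = [], {}
    for tool, limit in limits.items():
        if contig_count > limit:
            disabled[tool] = (
                f"deactivated: {contig_count:,} contigs exceeds "
                f"the limit of {limit:,}"
            )
        else:
            enabled.append(tool)
    return enabled, disabled
```

The reason strings are the important part: the user sees exactly which tool was skipped and which threshold triggered it.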
This information belongs in #47.
This issue is for documenting the behavior of WtP for large input files. Based on this, @replikation might implement FASTA chunking to increase the speed of the pipeline.
Case 1: AquaDiva sample
(excluded MetaPhinder because of a previous issue)
started: Dec 31 12:50
Tools completed:
The job was aborted by the cluster after 2.5 days for an unclear reason; no stats for DeepVirFinder and Marvel.