Does NanoSim work for large genomes? #18
Comments
Yes, NanoSim works fine with large genomes such as human. The run time should range from minutes to hours, depending on the desired number of reads. If it has already been running for 3 days, you can stop it now, because something is going wrong. Please make sure you have version 1.1.0, as it fixed several bugs. If you still run into the problem, let me know which dataset you used to generate the models, and you can also email me the log file so I have a better understanding of the problem. Thanks!
I'm using the latest commit. And other tools meet the requirements:
I obtained the dataset from the Nanopore WGS Consortium repository. Here is the link to the fastq file: Here are the steps I followed:
But the simulator.py script keeps running without generating any simulated reads. Also, please let me know if you need any of the files.
Hi there, did you have a chance to look into this issue? Thanks
I repeated your steps and didn't notice any error. One thing to note is that it takes a long time to read in the reference genome, given its size. It hasn't finished reading in yet, but the
If you do have such a log file, it means it's trying to read in the reference genome, and it is probably better to choose a faster machine, or just wait. If your log file is empty, unfortunately I cannot reproduce this problem.
I get the same log when I kill the process. But I don't think reading the reference genome should take more than 3 days, as I am using a powerful workstation.
NanoSim reads in the genome and converts all bases to upper case. I think this is the most time-consuming part for large genomes right now. I'll fix the code and let you know. You don't have to wait for it to run. Thanks for letting me know about this!
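For readers wondering why reading a genome and upper-casing it can be slow at human scale, here is a minimal sketch of one common way to keep that step linear in the genome size. It is not NanoSim's actual code: the function name `read_fasta_upper` and the dictionary layout are hypothetical, chosen only for illustration. The idea is to collect the sequence lines of each record in a list, join them once, and call `upper()` once per record, rather than growing a string base by base or line by line.

```python
import io


def read_fasta_upper(handle):
    """Parse FASTA text from an open handle into {record_name: UPPERCASE_SEQUENCE}.

    Hypothetical helper for illustration; not part of NanoSim.
    """
    genome = {}
    name = None
    chunks = []  # sequence lines of the current record, joined once at the end
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # Flush the previous record before starting a new one.
            if name is not None:
                genome[name] = "".join(chunks).upper()
            fields = line[1:].split()
            name = fields[0] if fields else ""
            chunks = []
        else:
            chunks.append(line)
    # Flush the last record.
    if name is not None:
        genome[name] = "".join(chunks).upper()
    return genome


# Small in-memory example standing in for a reference genome file.
example = io.StringIO(">chr1 test\nacgtn\nACgt\n>chr2\nttaa\n")
genome = read_fasta_upper(example)
print(genome)
```

For a real file, the same function could be called as `read_fasta_upper(open(path))`. Note that this still holds the whole genome in memory; it only addresses the time spent building and upper-casing the sequences.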
I actually fixed the code and will send a pull request. There were two very slow parts in the code.
Note that this code is still not optimized for memory usage, which can be fixed.
Hello,
I tried to simulate reads using NanoSim for the human genome.
The read_analysis.py script worked and generated the required models.
However, the simulator.py script has been running for 3 days now and is not generating any output or simulated reads.
Is there anything special I need to do for large genomes?
Can you please comment on how NanoSim scales for human-sized genomes?