Problem speed of mapping #44
Comments
Hi, there are a number of issues here. The toml file you provided (human_chr_selection.toml.txt) won't pass validation as it has no targets, and it isn't the one passed in the command shown in ru_test.log (that one is human_chr_selection.toml). ru_test.log shows that the toml file you actually used has two targets, neither of which is found in the reference, so reads will either always be off target or not map at all. If they are not mapping (and I suspect that is the case here) you will collect more and more data, and so your basecalling will take longer and longer. In essence, I'm not sure you have configured this experiment properly. If you can provide further information, including the source of the data (are you playing back a bulkfile here, or something else?) and the correct toml file, we might be able to help further. Matt
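For context, a selection TOML of the kind discussed here needs a targets list whose names match contigs in the reference index exactly. Below is a minimal sketch assuming the readfish/ru TOML layout; the paths, port, and contig names are illustrative and must be replaced with your own:

```toml
# Sketch of a minimal selection TOML; paths, port, and contig
# names are illustrative and must match your own setup/reference.
[caller_settings]
config_name = "dna_r9.4.1_450bps_fast"
host = "127.0.0.1"
port = 5555

[conditions]
reference = "/path/to/GRCh38.mmi"  # minimap2 index of the reference

[conditions.0]
name = "chr_selection"
control = false
min_chunks = 0
max_chunks = inf
# These names must appear verbatim in the reference index,
# e.g. "chr20"/"chr21" for a UCSC-style GRCh38 analysis set.
targets = ["chr20", "chr21"]
single_on = "stop_receiving"
multi_on = "stop_receiving"
single_off = "unblock"
multi_off = "unblock"
no_seq = "proceed"
no_map = "proceed"
```

If a target name does not occur in the reference, every read is treated as off-target, which matches the failure mode described above.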
Thanks for the answer. I changed the reference and the file passed the test. After starting, it still shows a long time. To start playback, I used the bulk file from the attached files.
Last time I attached the wrong TOML file; the correct one is attached.
Thanks for the update. That is a lot slower than I would expect, so I would check a few things here. First, how quickly can your GPU call reads when running standalone? You may need to play with guppy parameters to tune your basecaller optimally. However, we need to see whether it is the GPU or the CPU that is limiting here: how big is the reference file that you are mapping to? And what sort of power is your CPU? Have you tried the fast basecalling model instead of the high-accuracy model? If you see an improvement in speed there, then we can pinpoint the source of the problem a little. Thanks
Thanks. I launched it on the high-accuracy model; the speed for the first 2 minutes was normal, but then everything started to slow down again to 1 second or more. I have a Ryzen 7 3800X CPU (8 cores, 16 threads). As a reference I use the indexed file from ftp://ftp.ncbi.nlm.nih.gov/genomes/all/GCA/000/001/405/GCA_000001405.15_GRCh38/seqs_for_alignment_pipelines.ucsc_ids/GCA_000001405.15_GRCh38_no_alt_analysis_set.fna.gz
This doesn't really make sense then. Can you please try setting the max chunks to 8 rather than infinite and see what happens? Also, please leave it running for more than 15 minutes and check the resulting data to see if the selection is working.
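Capping the chunk count is a one-line change in the condition table of the TOML (a sketch; the table name `[conditions.0]` is illustrative and depends on your file):

```toml
[conditions.0]
# Stop re-evaluating a read after 8 chunks instead of unbounded growth
max_chunks = 8
```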
Also, please can you try it with the FAST model and not the High Accuracy model? Running on the fast model will tell us something about where the lag is.
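Switching models is done in the caller settings of the TOML. A sketch, assuming an R9.4.1 flowcell; the exact config name depends on your flowcell and kit:

```toml
[caller_settings]
# fast model; the high-accuracy equivalent is dna_r9.4.1_450bps_hac
config_name = "dna_r9.4.1_450bps_fast"
```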
Dear Matt, I started the process on the fast and hac models with max_chunks = 8. Thanks.
Can I check what operating system you are on? And can you also provide a metric for how quickly you can basecall standard reads on your current setup?
Dear Matt, here are the system details (Ubuntu 18.04.4 LTS).
I have observed the same problem here: when using the hac model in the toml file, I obtained very slow mapping times (>1 s).
Hi all, a quick question: could people confirm the version of guppy they are using? Thanks.
If you are on version 3.6 it may be worth trying guppy 3.4.5; it is available from: https://mirror.oxfordnanoportal.com/software/analysis/ont-guppy_3.4.5_linux64.tar.gz It looks as though there is a change in guppy performance that may be negatively impacting the speed of read until.
Thanks Matt - will try 3.4.5 later today.
Hi Chris, if you can let us know how 3.4.5 goes - the accuracy differences aren't key here, but the speed is, so you should find that it gives you better performance. We're really keen to resolve this ASAP! Best, Matt
By the way, you can see my previous results using guppy 3.5.2 in Question #39.
Thanks @tchrisboles. We're just running some equivalence tests across a few GPUs here. All our work was reported using 3.4.5 - we will investigate the issues with guppy > 3.4.5 with ONT.
Dear Matt, I got similar results to Chris using guppy 3.4.5.
Hi - you have to record a bulkfile from a run; you cannot use any fast5 file. Look under the advanced file save options. For depletion of a human genome you just need to configure your toml file to reject anything that maps to the reference you want to get rid of. Have a look at our paper for details.
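A depletion condition inverts the usual actions: reads that map to the reference are unblocked, and everything else is sequenced. A sketch, assuming the readfish/ru TOML layout with illustrative paths and contig names:

```toml
# Sketch of a human-depletion condition; paths and contig names are
# illustrative. Reads mapping to the reference are rejected (unblocked).
[conditions]
reference = "/path/to/GRCh38.mmi"

[conditions.0]
name = "human_depletion"
control = false
min_chunks = 0
max_chunks = 8
targets = ["chr1", "chr2"]     # list every contig you want to deplete
single_on = "unblock"          # on-target -> eject the read
multi_on = "unblock"
single_off = "stop_receiving"  # off-target -> keep sequencing it
multi_off = "stop_receiving"
no_seq = "proceed"
no_map = "proceed"
```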
* Update README.md. Closes #44. Uses BETA syntax (see https://github.com/orgs/community/discussions/16925). Adds a link to the Sphinx documentation for readfish on the looselab github pages.
* Exclude README.md from the trailing-whitespace pre-commit hook; trailing whitespace is needed to render the warning boxes.
* Invert the notes about the FAQ and README.
Dear Matt, I tried to run Read Until and went through the testing stages (up to the "Testing basecalling and mapping" stage, point 6). Basecalling is launched on the GPU (RTX 2080Ti), but the mapping speed shows more than 1-3 seconds. What could be the problem?
Thanks.
I did the launch from a file, following the "Testing" example.
I attach the files:
human_chr_selection.toml.txt
chunk_log.log
ru_test.log
guppy_basecall_server_log-2020-05-04_15-07-45.log