Segmentation fault when match is at the very first location #5
Comments
First, the first line of the input file should be the location of the FASTA file, not a pattern. Also, each sequence in the FASTA file should start with a '>' header line. |
Hi, sorry for the confusion. I actually have the correct input files, I was only providing the info. cas-offinder> cat t.in
|
Some other suggestions: (1) cas-offinder currently accepts only a directory name as the first input line; it would be really helpful if it could take a sequence file name instead, because we keep hg38.fa in a folder that contains many other sequence files. I tried making a folder with a symbolic link pointing to hg38.fa, but cas-offinder does not seem to follow symbolic links. (2) It would be great if we could specify the number of CPUs/GPUs to use. We have a shared GPU cluster, and when some GPUs are already in use, we would like to tell cas-offinder not to use all GPU devices. Thank you for your kind consideration! |
Thank you so much for your bug report and suggestions.
You can check out and pull the latest version of the develop branch to get the above changes. They will be merged into the master branch and released publicly after some testing. Best, |
Thank you so much for implementing all these requests so quickly!! #ifdef _WIN32 (3) Since you are so open to suggestions, I have one last one. I am trying to use cas-offinder to implement what the Broad group did in their 2015 paper. Since they allow up to 5 mismatches, cas-offinder will generate a huge amount of output (for the many sgRNAs I will be looking at, GBs of output). So I plan to modify cas-offinder so that it can write its output to stdout; I can then pipe stdout into Broad's cfd-score-calculation and keep only the lines that have significant off-target scores. E.g., I will run: Thanks again for your effort in providing such a high-performance program! |
|
Done. Would you test it again? |
Thanks! It works great. "-" triggers output to stdout! $ ./cas_dev t.in G out |
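For readers following along, the "-" convention confirmed above can be sketched roughly as follows. This is only an illustration, not the code that went into Cas-OFFinder: the class name `OutputSink` and its methods are hypothetical, and the sketch assumes that any output argument other than "-" is a file path.

```cpp
#include <fstream>
#include <iostream>
#include <memory>
#include <ostream>
#include <string>

// Hypothetical helper (not from the Cas-OFFinder source): picks std::cout
// when the output argument is "-", otherwise opens the named file.
class OutputSink {
public:
    explicit OutputSink(const std::string& path) {
        if (path != "-") {
            file_.reset(new std::ofstream(path));
        }
    }

    // True when results go to standard output.
    bool to_stdout() const { return file_ == nullptr; }

    // True when the sink is usable (stdout, or a file that opened successfully).
    bool ok() const { return to_stdout() || file_->is_open(); }

    // The stream that result lines should be written to.
    std::ostream& stream() {
        if (file_) return *file_;
        return std::cout;
    }

private:
    std::unique_ptr<std::ofstream> file_;  // empty when writing to stdout
};
```

With something like this in place, a result line would be written as `sink.stream() << line << '\n';`, and a shell pipeline can filter the output without a multi-GB intermediate file ever being written.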
I fixed it, there was an error while counting the number of available devices for each platform. |
Perfect, works as expected! (The only minor thing is that you might want to update the usage text to show the new features.) Also, one interesting observation (I tested querying 6 sequences against the human exome): Thanks again for making all the improvements so quickly, this tool is perfect now! $ ./cas_dev t.in G6 - > /dev/null |
Maybe there is a bug: does G1 actually mean using all GPUs, and G6 only one? The performance data above seems odd. |
I think I/O latency dominates when Cas-OFFinder analyzes such a small number of targets, so I expect a different result when the number of targets becomes large. Well, I will test it anyway. |
Hi, I have trouble with GRID K520. I added a few cerr lines to initOpenCL()
Here is the output (there are only 2 GPUs; however, m_devnum is 10 instead of 1 after processing i=0) [ec2-user@ip-172-31-14-64 cas-offinder-develop_new]$ ./cas-offinder ../t.in G - |
Hi, could you add the same cerr lines to the old version of Cas-OFFinder? |
With old version: I added the following debugging lines: With the new version: [ec2-user@ip-172-31-14-64 cas-offinder-develop_new]$ ./cas-offinder ../t.in G1 - |
It looks like device_cnt should be initialized to zero before running clGetDeviceIDs. |
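A minimal sketch of that pattern, for illustration only. To keep it compilable without an OpenCL SDK, it uses a hypothetical stand-in, `mock_query_device_count`, in place of `clGetDeviceIDs`; the assumption being illustrated is that a failing query may leave the output count unmodified, so a stale or uninitialized value would otherwise be added to the total.

```cpp
// Hypothetical stand-in for a C-style query such as clGetDeviceIDs:
// on failure it returns a non-zero status and leaves *count untouched.
int mock_query_device_count(int platform, unsigned int* count) {
    // Pretend platform 1 has no matching devices (-1 marks the failure case).
    static const int devices_per_platform[] = {2, -1, 1};
    if (platform < 0 || platform >= 3) return -1;
    if (devices_per_platform[platform] < 0) return -1;
    *count = static_cast<unsigned int>(devices_per_platform[platform]);
    return 0;
}

// Sums the devices over all platforms, ignoring platforms whose query fails.
unsigned int total_devices(int num_platforms) {
    unsigned int total = 0;
    for (int i = 0; i < num_platforms; ++i) {
        unsigned int device_cnt = 0;  // reset before every query
        int status = mock_query_device_count(i, &device_cnt);
        if (status != 0) continue;    // do not trust device_cnt on failure
        total += device_cnt;
    }
    return total;
}
```

Both safeguards matter here: zero-initializing `device_cnt` on each iteration and checking the returned status. Either one alone would already avoid counting garbage for a platform with no usable devices.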
Hi, I rebooted my Amazon instance and just tested it. Somehow the error is gone. I am no longer able to reproduce it, so I can only assume it was my own fault somehow. Thank you so much for your patience, I really appreciate all your help! |
I am now testing with 100 gRNAs as input; the background sequence is about 10% of the genome, so there is a decent amount of computation. Running one G6 job on 100 sgRNAs takes 332.941 sec. This is very strange compared to the numbers below. It looks like there is a huge benefit to running jobs as G1, if we can solve the clEnqueueNDRangeKernel issue. |
No, I don't think so. This is obviously a bug in Cas-OFFinder. The 'device_cnt' variable must be initialized first, according to the OpenCL spec. It might sometimes work without proper initialization, but that is not a complete solution. The speed of computation depends on many things: memory size, vectorization, the interface speed of the PCIe slots, the driver, and so on. In other words, computation time does not always decrease as the number of devices increases. So it is reasonable to include a feature that lets the user limit the number of computing devices in a future version of Cas-OFFinder. In this case, all Cas-OFFinder processes try to use the first device simultaneously, so clEnqueueNDRangeKernel must fail (I usually don't recommend running more than 2 instances of Cas-OFFinder on one device). Hmm, maybe it would be better to change the meaning of the number after 'G', e.g. maximum number of GPUs -> index of GPU? |
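Whichever meaning the number ends up with, the argument first has to be split into a device letter and an optional number. A rough sketch, assuming only the forms seen in this thread ("G", "G1", "G6"); `DeviceSpec` and `parse_device_spec` are hypothetical names, not Cas-OFFinder functions, and how `number` is interpreted (maximum count or device index) is deliberately left to the caller.

```cpp
#include <cctype>
#include <string>

// Hypothetical result type: a device letter plus the optional number after it.
struct DeviceSpec {
    char type;    // first character of the argument, e.g. 'G'
    int number;   // value after the letter, or -1 when no number was given
    bool valid;   // false when the argument could not be parsed
};

// Parses arguments of the form <letter>[<digits>], e.g. "G" or "G6".
DeviceSpec parse_device_spec(const std::string& arg) {
    DeviceSpec spec = {'\0', -1, false};
    if (arg.empty() || arg.size() > 7) return spec;  // cap length to avoid overflow
    if (!std::isalpha(static_cast<unsigned char>(arg[0]))) return spec;
    spec.type = arg[0];
    if (arg.size() == 1) {  // bare letter: no number supplied
        spec.valid = true;
        return spec;
    }
    int value = 0;
    for (std::string::size_type i = 1; i < arg.size(); ++i) {
        if (!std::isdigit(static_cast<unsigned char>(arg[i]))) return spec;
        value = value * 10 + (arg[i] - '0');
    }
    spec.number = value;
    spec.valid = true;
    return spec;
}
```

Keeping the parsing separate from the interpretation means a switch from "maximum number of GPUs" to "index of GPU" would only touch the code that consumes `number`.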
|
Yes, but I would like to suggest the syntax below for specifying devices. examples) rules)
What do you think? |
Looks good to me. I assume others won't have a case where they need 1 GPU, but don't know which one is available. In our GPU cluster, we can get the GPU ID of the free devices, so this is not a problem for us. |
Well, I think it would be useful when one needs to run several processes simultaneously. But I am not sure how many people have such a big cluster with lots of GPUs installed. Anyway, would you mind if I ask the name of the GPU cluster that you use? Is it commercially available? It looks pretty nice. |
To reproduce, use Fasta file:
Use input file:
NNNNNNNNNNNNNNNNNNNNNNN
GGCCGACCTGTCGCTGACGCAGG 5
This results in a segmentation fault. However, if we use Fasta file, it works.
Could you help take a look? Thanks!