FPS slow down #4
Comments
Training also slows down over time when I run it.
The performance of the CL implementation is lower than that of the corresponding CUDA implementation on the same device.
Hi @Kylin-PHYTIUM
I am not an expert in either YOLO or CL, but while debugging the code a bit I found that the part that slows down over time is https://github.com/ganyc717/Darknet-On-OpenCL/blob/master/darknet_cl/src/network.cpp#L781. In particular, with tiny_yolo_v2, the layer that causes the issue is layer number 15, which seems to be the region_layer. The furthest I traced it was this line
I modified the code to enable CL_QUEUE_PROFILING_ENABLE. Then I added the following lines to profile the enqueue call:
These time measurements remain stable, which confuses me even more. It seems that the GPU run itself takes the same time, but the time measured around the call on the CPU side grows over time:
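For anyone who wants to reproduce the CPU-side measurement described above, here is a minimal sketch that needs no OpenCL calls. It is not the code used in this thread: `HostTimer`, `measure`, `record` and `slowdown` are hypothetical names invented for illustration. It only assumes that the call being timed blocks until the layer has finished.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <numeric>
#include <vector>

// Hypothetical helper (not part of Darknet-On-OpenCL): records how long each
// call takes on the host and reports whether later calls are slower.
class HostTimer {
public:
    // Run `fn` once and store its wall-clock duration in milliseconds.
    // `fn` is assumed to block until the work is done.
    template <typename Fn>
    void measure(Fn&& fn) {
        const auto start = std::chrono::steady_clock::now();
        fn();
        const auto stop = std::chrono::steady_clock::now();
        record(std::chrono::duration<double, std::milli>(stop - start).count());
    }

    // Store an already measured duration, e.g. a value copied from a log.
    void record(double ms) { samples_.push_back(ms); }

    // Mean of the last `window` samples divided by the mean of the first
    // `window` samples. A value well above 1.0 means the call got slower.
    double slowdown(std::size_t window) const {
        assert(window > 0 && samples_.size() >= 2 * window);
        const auto n = static_cast<std::ptrdiff_t>(window);
        const double first =
            std::accumulate(samples_.begin(), samples_.begin() + n, 0.0);
        const double last =
            std::accumulate(samples_.end() - n, samples_.end(), 0.0);
        return last / first;
    }

    std::size_t size() const { return samples_.size(); }

private:
    std::vector<double> samples_;
};
```

One possible use is to wrap the forward call of the suspicious layer in `measure` for every frame and print `slowdown(100)` every few hundred frames. If that ratio keeps climbing while the times reported through CL_QUEUE_PROFILING_ENABLE stay flat, the extra time is probably being spent on the host side rather than in the kernel itself.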
…th tiny yolov2. The reason for the "hack" is explained in ganyc717#4. In summary, the time spent in that layer was increasing drastically for some reason. The current version is stable and actually faster than the GPU implementation on an MSI laptop with an i7.
Hi @ganyc717
I found one more problem. After running
darknet_cl.exe detector demo data/voc.data yolo-voc.cfg yolo-voc.weights test.mp4
In the first frames I get 7.1 FPS, but after several minutes I get less than 5 FPS.