About Speed on CPU and GPU #1398

akde · 2018-08-14T11:55:39Z

Hi @AlexeyAB, thanks a lot for the repository, especially for the tracking part, and of course for the quick responses.

I switched from the pjreddie repository to the AlexeyAB repository because of the tracking feature. Now I am trying to maximize the FPS I get with the AlexeyAB repository.

I have tested AlexeyAB and pjreddie repositories on the same PC and video (1920 x 1080, avi format with yolov3-tiny).
Here are the results:

computer	specifications
GPU	GT 730
CPU	i7-4790 CPU @ 3.60GHz
CUDA	8.0
Ubuntu	16.04

Repository	pjreddie	pjreddie	pjreddie	AlexeyAB	AlexeyAB	AlexeyAB
GPU	0	1	1	0	1	1
AVX	-	-	-	0	0	1
OPENMP	0	0	1	0	0	1
LIBSO	-	-	-	0	0	1
FPS	1.1	10.4	10.4	1.2	3.3	3.3

As you can see from the table, without GPU support both repositories perform about the same. However, with GPU support enabled, the maximum FPS I can get with the AlexeyAB repository is 3.3, while it was 10.4 with the pjreddie repository. Am I doing something wrong while using @AlexeyAB's repository? What can I do to get a higher FPS?
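To put the two GPU numbers from the table side by side, the arithmetic can be sketched as follows. The helper names `frame_time_ms` and `speed_ratio` are hypothetical and exist only for this illustration; the inputs are the FPS values reported above.

```c
/* Time spent on one frame, in milliseconds, at a given frame rate. */
double frame_time_ms(double fps)
{
    return 1000.0 / fps;
}

/* How many times faster fast_fps is compared to slow_fps. */
double speed_ratio(double fast_fps, double slow_fps)
{
    return fast_fps / slow_fps;
}
```

With the values from the table, `speed_ratio(10.4, 3.3)` is a little over 3.1, and the per-frame times are roughly 96 ms for 10.4 FPS versus roughly 303 ms for 3.3 FPS.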

AlexeyAB · 2018-08-14T12:00:50Z

@akde Hi,

What command do you use for both repositories?
Be sure to use DEBUG=0
Try to install cuDNN and set CUDNN=1 in the Makefile.
AVX=1 and OPENMP=1 have an effect only if GPU=0 (i.e. only if the CPU is used instead of the GPU).
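One way to double-check which of these options a build actually picked up is a small helper compiled with the same flags as the project. This is only a sketch under assumptions: it assumes the Makefile passes `-DGPU` and `-DCUDNN` when GPU=1 and CUDNN=1, and that OPENMP=1 and AVX=1 add `-fopenmp` and `-mavx`, so that the compiler itself defines `_OPENMP` and `__AVX__`. The function `describe_build` is hypothetical and not part of darknet.

```c
#include <stdio.h>
#include <stddef.h>

/* Hypothetical helper, not part of darknet: writes a one-line summary of
 * the compile-time options into buf and returns the snprintf() result. */
int describe_build(char *buf, size_t n)
{
#ifdef GPU          /* assumed to come from -DGPU when GPU=1 */
    const int gpu = 1;
#else
    const int gpu = 0;
#endif
#ifdef CUDNN        /* assumed to come from -DCUDNN when CUDNN=1 */
    const int cudnn = 1;
#else
    const int cudnn = 0;
#endif
#ifdef _OPENMP      /* defined by the compiler under -fopenmp */
    const int openmp = 1;
#else
    const int openmp = 0;
#endif
#ifdef __AVX__      /* defined by the compiler under -mavx */
    const int avx = 1;
#else
    const int avx = 0;
#endif
    return snprintf(buf, n, "GPU=%d CUDNN=%d OPENMP=%d AVX=%d",
                    gpu, cudnn, openmp, avx);
}
```

Calling `describe_build(buf, sizeof buf)` and printing `buf` shows the flags the translation unit was compiled with; it does not show which code path runs at inference time.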

akde · 2018-08-14T12:12:33Z

Hi @AlexeyAB, thanks for the immediate response:

> What command do you use for both repositories?
./darknet detector demo cfg/coco.data cfg/yolov3-tiny.cfg yolov3-tiny.weights ~/darknet/teklerbad.mp4
> Be sure to use DEBUG=0
DEBUG = 0 in both Makefiles.
> Try to install cuDNN and set CUDNN=1 in the Makefile.
Sadly, the compute capability of my GPU is not high enough for cuDNN (to be specific, it is 2.1, but cuDNN requires at least 3.0).
> AVX=1 and OPENMP=1 have an effect only if GPU=0 (i.e. only if the CPU is used instead of the GPU)
Just as you said, when GPU is enabled, AVX and OPENMP have no effect on FPS: it is 3.3 in both cases.

AlexeyAB · 2018-08-14T12:18:38Z

> computer specifications
>
> GPU GT 730

> Try to install cuDNN and set CUDNN=1 in the Makefile.
> Sadly, the compute capability of my GPU is not high enough for cuDNN (to be specific, it is 2.1, but cuDNN requires at least 3.0).

As far as I can see, the GeForce GT 730 has compute capability 3.5: https://en.wikipedia.org/wiki/CUDA#GPUs_supported

akde · 2018-08-14T12:35:36Z

@AlexeyAB Thx for the reply!

At first I thought the same! But then I realized that it has multiple versions:

https://developer.nvidia.com/cuda-gpus

AlexeyAB · 2018-08-14T12:57:35Z

@akde Yes, looks like it is 2.1

Try to change this line:

darknet/Makefile

Line 15 in e548489

ARCH= -gencode arch=compute_30,code=sm_30 \

to this:
ARCH= -gencode arch=compute_21,code=sm_21 \

It may not work properly on an old GPU.

akde · 2018-08-14T13:28:28Z

@AlexeyAB thx for immediate response!

in my case it was
ARCH= -gencode arch=compute_20,code=sm_20

then I changed it into
ARCH= -gencode arch=compute_21,code=sm_21
and it gave me this error.

nvcc fatal : Unsupported gpu architecture 'compute_21'
Makefile:136: recipe for target 'obj/convolutional_kernels.o' failed
make: *** [obj/convolutional_kernels.o] Error 1

so I believe the correct way is ARCH= -gencode arch=compute_20,code=sm_20
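The error above is consistent with nvcc having no `compute_21` virtual architecture: for Fermi, the virtual architecture is `compute_20`, even on 2.1 devices. The sketch below builds a `-gencode` option from a compute capability under that assumption. `gencode_flag` is a hypothetical helper, and the claim that the toolkit accepts `sm_21` as a real architecture is an assumption about CUDA 8 that should be checked against `nvcc --help`; `code=sm_20`, as used above, is the conservative choice. Fermi support was removed from later toolkits (CUDA 9 onward).

```c
#include <stdio.h>
#include <stddef.h>

/* Hypothetical helper: builds an nvcc -gencode option for a device with
 * compute capability major.minor and returns the snprintf() result.
 * Assumption: 2.x devices share the virtual architecture compute_20,
 * because nvcc does not provide compute_21. */
int gencode_flag(int major, int minor, char *buf, size_t n)
{
    int virt_minor = (major == 2) ? 0 : minor;
    return snprintf(buf, n, "-gencode arch=compute_%d%d,code=sm_%d%d",
                    major, virt_minor, major, minor);
}
```

For a 2.1 device this yields `-gencode arch=compute_20,code=sm_21`; for a 3.0 device it yields the Makefile default `-gencode arch=compute_30,code=sm_30`.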

I am attaching the Makefiles for both (AlexeyAB and pjreddie) repositories. Everything looks the same to me except the FPS.
https://drive.google.com/open?id=1HQevPS6fgRzNCtN4yV2Dk_bwTuqO6mCp

It is a bit weird to get a 3x difference in performance with such similar setups.

I have read other issues and saw that there is a rescale line that is not in the pjreddie repository. Could this line cause the 3x FPS reduction?

AlexeyAB · 2018-08-14T14:05:26Z

> I have read other issues and saw that there is a rescale line that is not in the pjreddie repository. Could this line cause the 3x FPS reduction?

What is the rescale line?

I can only recommend using a modern GPU. I don't have an old GPU, so I can't test or tune the code for it.

akde · 2018-08-14T14:41:18Z

@AlexeyAB You are totally right, I need a more recent setup. But for now this is the best I can get.

The rescale line is mentioned here

darknet/src/detector.c

Line 1124 in 3e856ec
image sized = resize_image(im, net.w, net.h);

AlexeyAB · 2018-08-14T15:25:56Z

@akde No. I don't use the resize_image() function for detector demo on video. I use the very fast OpenCV function cvResize() (AVX/OpenMP optimized) instead of resize_image():

darknet/src/image.c

Line 1069 in a9fef1b

cvResize(src, *in_img, CV_INTER_LINEAR);

In pjreddie/darknet, by contrast, detector demo on video uses letterbox_image_into(): https://github.com/pjreddie/darknet/blob/9a4b19c4158b064a164e34a83ec8a16401580850/src/demo.c#L144
which calls the slow resize_image() function internally: https://github.com/pjreddie/darknet/blob/9a4b19c4158b064a164e34a83ec8a16401580850/src/image.c#L959

Your issue is related to some CUDA functions that aren't optimized for very old GPUs.
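For readers unfamiliar with the difference between the two resize styles, the geometry of a letterbox-style resize can be sketched as below. This is an illustration only, not the code of either repository: `size2d` and `letterbox_size` are hypothetical names, and the 416 x 416 network input used in the example is an assumption, since the thread does not state the network size. A plain resize stretches the frame to the full network size, whereas a letterbox resize scales it to fit while keeping the aspect ratio and pads the rest.

```c
/* Hypothetical type for this sketch: width and height in pixels. */
typedef struct {
    int w;
    int h;
} size2d;

/* Size of the scaled frame inside a net_w x net_h canvas when the aspect
 * ratio is preserved; the remaining area would be filled with padding. */
size2d letterbox_size(int img_w, int img_h, int net_w, int net_h)
{
    size2d s;
    if ((float)net_w / img_w < (float)net_h / img_h) {
        /* Width is the limiting side. */
        s.w = net_w;
        s.h = (img_h * net_w) / img_w;
    } else {
        /* Height is the limiting side. */
        s.h = net_h;
        s.w = (img_w * net_h) / img_h;
    }
    return s;
}
```

For the 1920 x 1080 video in this thread and an assumed 416 x 416 input, `letterbox_size(1920, 1080, 416, 416)` gives 416 x 234. The sketch shows only the geometry and says nothing about the relative speed of the two resize functions.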

akde · 2018-08-15T07:29:42Z

@AlexeyAB Hi, thanks for the detailed explanation and for all your help during this conversation. As you suggest, I apparently need to update my setup.

akde closed this as completed Aug 15, 2018
