No detections with CUDNN=1 and tiny-yolo #405

Open
jacobsuh opened this issue Jan 7, 2018 · 16 comments

Comments

@jacobsuh

jacobsuh commented Jan 7, 2018

When I had CUDNN=0 and GPU=1, both yolo.weights and tiny-yolo.weights worked fine, but when I recompiled with CUDNN=1 and GPU=1, tiny-yolo.weights no longer has detections (even with a very low threshold). Strangely enough, the normal yolo.weights still works. Any help on why this could be?

Also, is the only difference between the two setups that the latter now uses the cuDNN library in addition to CUDA? How much of a performance improvement would there realistically be between the two?

@ivpusic

ivpusic commented Mar 5, 2018

I have the same issue

@ahsan856jalal

You can try one thing:
with CUDNN=1, GPU=1, OPENCV=1,
train one small model with a single class of ~1000 items for around 2k iterations, and then test the detection with the same configuration.

@ahsan856jalal

I believe you are using this sort of command in the first place to test the image.
./darknet detector test data/coco.data cfg/yolo.cfg yolo.weights data/dog.jpg -i 0 -thresh 0.2

@kausb

kausb commented Mar 19, 2018

Hi Jacob/IvPusic,
I am facing a similar issue with custom-trained weights and tiny yolo (CUDNN=1, GPU=1, OPENCV=0).
With CUDNN=0, GPU=1, inference works fine. Have you been able to debug or resolve this issue?

@barkermlpg

barkermlpg commented Apr 7, 2018

Very interesting post and I would also like to know how to resolve this issue and help if possible. I'm using Yolo_v2 and ran into the same issue on one of two systems:

  • System1: Ubuntu 14.04, cuda 8.0 toolkit, cudnn 7.0.3, Tesla K80 = issue occurs as described above
  • System2: Ubuntu 16.04, cuda 8.0 toolkit, cudnn 7.0.3, GeForce GTX 1080 = no issue, everything works fine

What other system information would be useful to debug the issue? Please share, thanks.

@chiefkarlin

Also running into this issue on a machine with a Tesla K80, Ubuntu 16.04, cuda 9.1, cudnn 7.12 using a custom trained model.

Compiling with CUDNN=0, detections work fine.

@cometyang

I had a similar issue on a P100 (Ubuntu 16.04, CUDA 8.0, cuDNN 7.05) when I ran the example script
./darknet classifier predict cfg/imagenet1k.data cfg/extraction.cfg extraction.weights data/dog.jpg
GPU=0, CUDNN=0
GPU=1, CUDNN=0
both work, but
GPU=1, CUDNN=1 fails.
Output:


Loading weights from extraction.weights...Done!
data/dog.jpg: Predicted in 0.003717 seconds.
0.41%: bucket
0.39%: hook
0.39%: tennis ball
0.35%: paper towel
0.35%: water bottle
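
For anyone comparing builds: the output above is a failure because the top prediction sits at well under 1%, which is close to chance for a 1000-class model. A minimal sketch of a sanity check follows. It assumes the classifier prints lines shaped like `0.41%: bucket`, as in the output above; the function names and the 1% cutoff are hypothetical and chosen only for illustration.

```python
import re

# Matches lines such as "0.41%: bucket" (format assumed from the output above).
_PRED = re.compile(r"^\s*([0-9]*\.?[0-9]+)%:\s*(.+?)\s*$")


def parse_predictions(output):
    """Return (label, percent) pairs found in darknet classifier output."""
    preds = []
    for line in output.splitlines():
        m = _PRED.match(line)
        if m:
            preds.append((m.group(2), float(m.group(1))))
    return preds


def looks_flat(preds, min_top1_percent=1.0):
    """True when even the best score is below the (hypothetical) cutoff."""
    return bool(preds) and max(p for _, p in preds) < min_top1_percent


sample = """Loading weights from extraction.weights...Done!
data/dog.jpg: Predicted in 0.003717 seconds.
0.41%: bucket
0.39%: hook
0.39%: tennis ball
0.35%: paper towel
0.35%: water bottle
"""
print(looks_flat(parse_predictions(sample)))  # True
```

A near-flat result like this suggests the network output is effectively garbage rather than a genuinely low-confidence prediction.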

@TetsuakiBaba

TetsuakiBaba commented Jun 14, 2018

Hi, I got the same issue with both the classifier and detector options. I resolved it by editing the cfg file.

I hit the issue when running the command below:
./darknet classifier predict cfg/imagenet1k.data cfg/extraction.cfg extraction.weights data/dog.jpg

But after changing the batch and subdivisions parameters in extraction.cfg, I got a correct recognition result.

I think that whenever we run predict, test, or demo in darknet on the GPU, we have to make sure the cfg file is in test mode (i.e. batch=1, subdivisions=1). The default setting in extraction.cfg is batch=128, subdivisions=8, which is a training-mode setting.

It runs correctly in CPU mode either way, but on the GPU we have to change batch and subdivisions. I hope this helps.
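
A quick way to check whether a cfg is in test mode before running inference is sketched below. It is a minimal sketch under assumptions: `active_net_options` and `is_test_mode` are hypothetical helper names, lines starting with `#` or `;` are treated as comments, the first uncommented occurrence of a key is taken as the active one, and a missing key is treated as 1. Those last two points are assumptions about darknet's cfg parser, not something confirmed in this thread.

```python
def active_net_options(cfg_text, keys=("batch", "subdivisions")):
    """Return the first uncommented value of each key in the first cfg section."""
    found = {}
    sections_seen = 0
    for raw in cfg_text.splitlines():
        line = raw.strip()
        if not line or line[0] in "#;":
            continue  # blank line or comment
        if line.startswith("["):
            sections_seen += 1
            if sections_seen > 1:
                break  # only the first section ([net]) holds these options
            continue
        key, sep, value = line.partition("=")
        key = key.strip()
        if sep and key in keys and key not in found:
            found[key] = int(value.strip())
    return found


def is_test_mode(cfg_text):
    """True when batch and subdivisions are both 1 (missing keys assumed to be 1)."""
    opts = active_net_options(cfg_text)
    return opts.get("batch", 1) == 1 and opts.get("subdivisions", 1) == 1


training_cfg = "[net]\nbatch=128\nsubdivisions=8\n"
print(active_net_options(training_cfg))  # {'batch': 128, 'subdivisions': 8}
print(is_test_mode(training_cfg))        # False
```

To check a real file, read it first, for example `is_test_mode(open("cfg/extraction.cfg").read())`.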

@barkermlpg

That is it! I just confirmed on my system. This solves it, thank you!

@fspeed

fspeed commented Jul 11, 2018

I had a similar issue on a Titan X (Ubuntu 18.04, CUDA 9.0, cuDNN 7.1) when I ran the example script
./darknet classifier predict cfg/imagenet1k.data cfg/extraction.cfg extraction.weights data/eagle.jpg
GPU=0, CUDNN=0
GPU=1, CUDNN=0
both work, but
GPU=0, CUDNN=1 fails.
GPU=1, CUDNN=1 fails.
Output:

Loading weights from extraction.weights...Done!
data/dog.jpg: Predicted in 0.004810 seconds.
0.41%: bucket
0.39%: hook
0.39%: tennis ball
0.35%: paper towel
0.35%: water bottle

@jerinka

jerinka commented Jul 19, 2018

I tried tiny darknet for classification and got an error when I set CUDNN=1; it works fine with CUDNN=0. I am using CUDA 8 and cuDNN 6. I tried setting batch=1 and subdivisions=1, but the result is still wrong (it always shows the same values).

@sharowyeh

sharowyeh commented Jul 20, 2018

darknet pre-allocates GPU memory for each layer when built with GPU=1 or CUDNN=1, and the amount depends on the batch and subdivisions settings in the cfg file.
For training, batch sets how many images are processed per iteration, and a larger value can reduce training time. subdivisions splits each batch into smaller groups, which avoids running out of memory when the GPU cannot hold a whole batch at once.
Keep both at 1 for detection, because these settings are mainly for training the network.
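
To make the arithmetic concrete, here is a rough sketch. It assumes darknet sends batch / subdivisions images to the GPU per forward pass and sizes each layer's output buffer as width x height x channels x images-per-pass 32-bit floats. Both are assumptions about darknet's internals, and the function names and the 13x13x1024 layer are made up for illustration.

```python
def images_per_pass(batch, subdivisions):
    """Images loaded onto the GPU per forward pass (assumed: batch // subdivisions)."""
    return batch // subdivisions


def output_buffer_bytes(width, height, channels, batch, subdivisions, bytes_per_value=4):
    """Approximate size of one layer's output buffer, assuming 32-bit floats."""
    return width * height * channels * images_per_pass(batch, subdivisions) * bytes_per_value


print(images_per_pass(128, 8))  # 16 (the extraction.cfg defaults quoted earlier in the thread)
print(images_per_pass(1, 1))    # 1

# Hypothetical 13x13x1024 layer: training settings need 16x the buffer of test settings.
ratio = output_buffer_bytes(13, 13, 1024, 128, 8) // output_buffer_bytes(13, 13, 1024, 1, 1)
print(ratio)  # 16
```

This only illustrates why the buffers scale with the cfg settings; it does not explain why the cuDNN build gives wrong results when they are left at training values.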

@lintangsutawika

I had this same issue.
A quick fix would be to copy your current cfg file and comment out the training batch and subdivisions lines while uncommenting the testing ones. That way, you have two cfg files that differ only there.

# Testing
batch=1
subdivisions=1
# Training
#batch=256
#subdivisions=64
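
If you would rather not edit by hand, the copy can be scripted. The sketch below is a variant of the approach above: instead of toggling comments, it rewrites the active batch and subdivisions values to 1 in the first section and leaves commented lines alone. `make_test_cfg` is a hypothetical helper, not part of darknet.

```python
import re

# An uncommented "batch=..." or "subdivisions=..." line (not "batch_normalize=...").
_OPTION = re.compile(r"^(\s*)(batch|subdivisions)\s*=.*$")


def make_test_cfg(cfg_text):
    """Return a copy of cfg_text with batch=1 and subdivisions=1 in the first section."""
    out = []
    sections_seen = 0
    for line in cfg_text.splitlines():
        if line.strip().startswith("["):
            sections_seen += 1
        m = _OPTION.match(line)
        if m and sections_seen == 1:
            line = f"{m.group(1)}{m.group(2)}=1"
        out.append(line)
    return "\n".join(out) + "\n"


training_cfg = "[net]\n# Training\nbatch=256\nsubdivisions=64\nwidth=416\n"
print(make_test_cfg(training_cfg))
```

To produce the second file, write the result next to the original, for example `open("my-test.cfg", "w").write(make_test_cfg(open("my-train.cfg").read()))`, where both file names are placeholders.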

@braddockcg

Same issue here. Setting batch=1 and subdivisions=1 worked for doing detections. Thanks!
Does this mean I can't train with CuDNN?

@kuriel07

I had the same issue; switching to an older cuDNN version and rebuilding the project solved it.

@zkailinzhang

zkailinzhang commented Dec 28, 2018

opencv=1
cuda=1
cudnn=1

Same error here. Please modify cfg/yolov3.cfg:

# Testing
batch=1
subdivisions=1
# Training
#batch=256
#subdivisions=64

Then the detector results display OK.
