
Results between OpenCV and Darknet CLI differ #5435

Open

@JonathanSamelson commented Apr 30, 2020

Hi,

I have an issue with the results I obtain. On one hand, I run YOLOv3 through OpenCV 4.2 (with CUDA 10.2 and cuDNN 7.6.5). On the other hand, I recently compiled Darknet following the release of YOLOv4; it was compiled with the same CUDA and cuDNN versions, but the OpenCV_DIR used was the one from vcpkg, whose version is 4.1.1 (I don't know if it matters, but I prefer to highlight it just in case).

Both use the same weights and the same config files starting with:

[net]
# Testing
batch=1
subdivisions=1
# Training
# batch=64
# subdivisions=16
width=608
height=608
channels=3
momentum=0.9
decay=0.0005
angle=0
saturation = 1.5
exposure = 1.5
hue=.1

With the same image, I get different results:

Darknet CLI

PS> darknet.exe detector test cfg\coco.data cfg\yolov3.cfg path\to\yolov3\model.weights -thresh 0.25 -letter_box


E:\darknet-master\data\horses.jpg: Predicted in 31.315000 milli-seconds.
horse: 96%
horse: 100%
horse: 95%
horse: 25%
horse: 100%

OpenCV implementation

image

horse 99.67%
horse 99.52%
horse 96.78%
horse 89.76%
horse 33.61%

Here is the code that I use:

import os

import cv2
import numpy as np


def prepare_model(model_folder):
    weightsPath = os.path.sep.join([model_folder, "model.weights"])
    configPath = os.path.sep.join([model_folder, "model.cfg"])

    net = cv2.dnn.readNetFromDarknet(configPath, weightsPath)

    net.setPreferableBackend(cv2.dnn.DNN_BACKEND_CUDA)
    net.setPreferableTarget(cv2.dnn.DNN_TARGET_CUDA)

    ln = net.getLayerNames()
    ln = [ln[i[0] - 1] for i in net.getUnconnectedOutLayers()]

    return net, ln


if __name__ == "__main__":
    model_dir = "path\\to\\alexey-yolo-v3-coco"
    net, ln = prepare_model(model_dir)

    labelsPath = os.path.sep.join([model_dir, "coco.names"])
    LABELS = open(labelsPath).read().strip().split("\n")

    np.random.seed(30)
    COLORS = np.random.randint(0, 255, size=(len(LABELS), 3), dtype="uint8")

    min_confidence=0.25
    nms_threshold=0.45

    img = cv2.imread("E:\\darknet-master\\data\\horses.jpg")
    (H, W) = img.shape[:2]

    # Omitted if not using -letter_box
    black = (0, 0, 0)
    border_size_right = max(0, int(H - W))
    border_size_bottom = max(0, int(W - H))
    img = cv2.copyMakeBorder(img, 0, border_size_bottom, 0, border_size_right,
                             cv2.BORDER_CONSTANT, value=black)

    img = cv2.resize(img, (608, 608))
    (H, W) = img.shape[:2]
    blob = cv2.dnn.blobFromImage(img, scalefactor=1/255.0, size=(608, 608),
                                 swapRB=True, crop=False)
    
    net.setInput(blob)
    layerOutputs = net.forward(ln)

    boxes = []
    confidences = []
    classIDs = []

    for output in layerOutputs:
        for detection in output:
            scores = detection[5:]
            classID = np.argmax(scores)
            confidence = scores[classID]

            if confidence > min_confidence:
                box = detection[0:4] * np.array([W, H, W, H])
                (centerX, centerY, width, height) = box.astype("int")

                x = int(centerX - (width / 2))
                y = int(centerY - (height / 2))

                boxes.append([x, y, int(width), int(height)])
                confidences.append(float(confidence))
                classIDs.append(classID)

    idxs = cv2.dnn.NMSBoxes(boxes, confidences, min_confidence, nms_threshold)

    if len(idxs) > 0:
        for i in idxs.flatten():
            (x, y) = (boxes[i][0], boxes[i][1])
            (w, h) = (boxes[i][2], boxes[i][3])

            color = [int(c) for c in COLORS[classIDs[i]]]
            cv2.rectangle(img, (x, y), (x + w, y + h), color, 2)
            text = "{}: {:.4f}".format(LABELS[classIDs[i]], confidences[i])
            cv2.putText(img, text, (x, y - 5), cv2.FONT_HERSHEY_SIMPLEX, 0.5, color, 2)

            print(LABELS[classIDs[i]], confidences[i])

    cv2.imshow("Image", img)
    cv2.waitKey(0)
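
As an aside, my border code above only pads the bottom and right with black, whereas, if I read Darknet's letterbox_image correctly (an assumption on my part, not verified), -letter_box resizes keeping the aspect ratio and centers the image on a gray canvas. A minimal sketch of that preprocessing:

import cv2
import numpy as np

def letterbox(img, net_w=608, net_h=608):
    # Resize keeping the aspect ratio, then center the result on a gray
    # canvas; the centered placement and ~0.5-gray fill follow my reading
    # of Darknet's letterbox_image and are not verified here.
    h, w = img.shape[:2]
    scale = min(net_w / w, net_h / h)
    new_w, new_h = int(w * scale), int(h * scale)
    resized = cv2.resize(img, (new_w, new_h))
    canvas = np.full((net_h, net_w, 3), 127, dtype=np.uint8)  # ~0.5 * 255
    top, left = (net_h - new_h) // 2, (net_w - new_w) // 2
    canvas[top:top + new_h, left:left + new_w] = resized
    return canvas

If this is what Darknet does, the detected boxes would also have to be mapped back from the canvas to the original image before drawing.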

I also tried without the black border (and without -letter_box), but I get different results as well:
Darknet CLI:

horse: 88%
horse: 100%
horse: 91%
horse: 100%

OpenCV implementation:

horse 99.83%
horse 99.68%
horse 90.06%
horse 54.31%

Is something handled differently in OpenCV, or is there a problem in my code?

Also, the original image size is 773x512, the image that I display with the OpenCV implementation is 608x608 (with or without borders), and the one displayed by the Darknet CLI is always 790x608. Is there a reason for that size?

Thank you very much for your help,

Jonathan Samelson

@AlexeyAB (Owner)

There are 3 different approaches to resizing: #232 (comment)

For a fair comparison you must use the same .png image (not JPEG), already at 416x416, and use the same weights/cfg file with width=416 height=416 in the cfg file.

OpenCV has very strict tests for the equivalence of network results, so the network outputs themselves should not differ.
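
A minimal sketch of that comparison setup on the OpenCV side (the file names are placeholders, and the .png is assumed to be already sized to 416x416 so that neither side has to resize):

import cv2

# Same weights/cfg as the Darknet run, with width=416 height=416 in the cfg.
net = cv2.dnn.readNetFromDarknet("model.cfg", "model.weights")

img = cv2.imread("test_416x416.png")  # already 416x416, lossless PNG
blob = cv2.dnn.blobFromImage(img, scalefactor=1/255.0, size=(416, 416),
                             swapRB=True, crop=False)
net.setInput(blob)
outs = net.forward(net.getUnconnectedOutLayersNames())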

@Hiwyl commented May 1, 2020

nms wrong

@YashasSamaga

The region layer in OpenCV performs NMS classwise. You can disable it by setting nms_threshold=0 in all [yolo] blocks and perform NMS on your own after inference.

This has the side-effect of improving the performance by avoiding the switch to CPU for NMS during inference (this happens three times in total).
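
For illustration, a minimal sketch of that post-inference NMS, assuming nms_threshold=0 has been set in every [yolo] block and that boxes, confidences and classIDs were gathered from the raw outputs as in the loop earlier in this thread:

import numpy as np
import cv2

# One NMS pass over all candidate boxes: NMSBoxes takes [x, y, w, h] boxes,
# confidence scores, a score threshold and an IoU threshold, and returns
# the indices of the boxes to keep.
idxs = cv2.dnn.NMSBoxes(boxes, confidences, 0.25, 0.45)
for i in np.array(idxs).flatten():
    print(classIDs[i], confidences[i], boxes[i])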

@arocketman commented Jun 24, 2020

Hi @AlexeyAB, I am having the same issue: the Darknet version (darknet.py) yields different results from the OpenCV DNN implementation, mostly in the confidences.

At first, I thought it was related to image resizing and the different methods being used (keeping/not keeping the aspect ratio, etc.), but even after trying different methods I couldn't align the confidences. Eventually, I resized the image to 416x416 using an external tool and used the resized image as input, so no resizing should be done. However, I am still seeing a difference in confidence.

Here's my code for the DNN:

import cv2 as cv

inpWidth = 416
inpHeight = 416

custom_image_bgr = cv.imread(input_img, 1)  # also tried removing this "1"
custom_image = cv.cvtColor(custom_image_bgr, cv.COLOR_BGR2RGB)  # also tried commenting this out and setting swapRB to True
# custom_image = cv.resize(custom_image, (inpWidth, inpHeight), interpolation=cv.INTER_LINEAR)  # previous tests

blob = cv.dnn.blobFromImage(custom_image, scalefactor=(1/255), size=(inpWidth, inpHeight),
                            mean=[0, 0, 0], swapRB=False, crop=False,
                            ddepth=cv.CV_32F)  # also tried with no ddepth, different scale factors, with/without mean
# Run the model
net.setInput(blob)
outs = net.forward(outNames)

Of course, the weights, classes, and config file are exactly the same.

Any ideas? Thank you!

@matt-sharp

@arocketman Did you find any solution?

@YashasSamaga

@matt-sharp Are you facing the same problem? If yes, try the following:

  1. Try another backend and check whether you face the same issue (a sketch follows this list)
  2. If yes, then it is most likely a problem with NMS; please check #5435 (comment)
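
A sketch of step 1, forcing the default OpenCV backend on the CPU (if the outputs then match Darknet, the difference is specific to the CUDA backend; the file names are placeholders):

import cv2

net = cv2.dnn.readNetFromDarknet("model.cfg", "model.weights")
# Plain OpenCV backend on the CPU instead of DNN_BACKEND_CUDA / DNN_TARGET_CUDA
net.setPreferableBackend(cv2.dnn.DNN_BACKEND_OPENCV)
net.setPreferableTarget(cv2.dnn.DNN_TARGET_CPU)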

@matt-sharp

> The region layer in OpenCV performs NMS classwise. You can disable it by setting nms_threshold=0 in all [yolo] blocks and perform NMS on your own after inference.
>
> This has the side-effect of improving the performance by avoiding the switch to CPU for NMS during inference (this happens three times in total).

@YashasSamaga I've tried setting the NMS threshold to zero in my cfg file, but this doesn't seem to change either the FPS or the accuracy.
Here's my cfg file:
exp4_yolov4.txt

I'm getting approx. 45 FPS with 1 x Tesla V100 using:

# width of network's input image
inpWidth = 608
# height of network's input image
inpHeight = 608
# scale factor for image normalization (1 / 255)
scale = 0.00392
# confidence threshold
confThreshold = 0.005
# non-max suppression threshold
nmsThreshold = 0.4
# use high level API for DNN module to do pre and post-processing
model = cv2.dnn_DetectionModel(net)
model.setInputParams(size=(inpWidth, inpHeight), scale=1/255, swapRB=True, crop=False)

To measure speed:

start = time.time()
classIDs, confidences, boxes = model.detect(image, confThreshold, nmsThreshold)
end = time.time()

totalTime += (end - start)

My F1-score is 0.88 when using Darknet detector test but only 0.75 with OpenCV 4.5.1.

Can you please help me understand whether there is anything else I can do to improve inference speed and bring the accuracy closer to Darknet?

@YashasSamaga commented May 5, 2021

@matt-sharp The NMS issue has been fixed since OpenCV 4.4, I think; the nms_threshold fix is no longer required. What backend did you use? I think I have seen people get 100 FPS on a V100 with 608x608 images. What version of cuDNN are you using? cuDNN 8 caused some slowdowns; you might surpass 100 FPS on the FP16 target with cuDNN 7.

> My F1-score is 0.88 when using Darknet detector test but only 0.75 with OpenCV 4.5.1.

That's surprising. I had found the accuracy tests to be practically identical to Darknet. This was the script I used. mAP results from my calculations are here: opencv/opencv#17621.
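
For reference, selecting the FP16 CUDA target is a one-line change (a sketch; the file names are placeholders):

import cv2

net = cv2.dnn.readNetFromDarknet("model.cfg", "model.weights")
net.setPreferableBackend(cv2.dnn.DNN_BACKEND_CUDA)
net.setPreferableTarget(cv2.dnn.DNN_TARGET_CUDA_FP16)  # instead of DNN_TARGET_CUDA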

@matt-sharp

> @matt-sharp The NMS issue has been fixed since OpenCV 4.4, I think; the nms_threshold fix is no longer required. What backend did you use? I think I have seen people get 100 FPS on a V100 with 608x608 images. What version of cuDNN are you using? cuDNN 8 caused some slowdowns; you might surpass 100 FPS on the FP16 target with cuDNN 7.
>
> > My F1-score is 0.88 when using Darknet detector test but only 0.75 with OpenCV 4.5.1.
>
> That's surprising. I had found the accuracy tests to be practically identical to Darknet. This was the script I used. mAP results from my calculations are here: opencv/opencv#17621.

@YashasSamaga I'm using cuDNN 8 (libcudnn8-8.0.5.39-1.cuda11.0).
The accuracy is more of a concern for me. I've checked other guidance and it does seem that Darknet uses confThreshold = 0.005, so I'm not sure what else I can change to try to match the results.
