How to build a yolo pipeline? #1492

Open
ritwikdubey opened this issue Aug 28, 2018 · 14 comments

@ritwikdubey

ritwikdubey commented Aug 28, 2018

Hi @AlexeyAB ,
Is it possible to build a detection pipeline within darknet? If so, how can I do that?
Say in stage 1 I want to detect all faces in an image and then in stage 2 I want to perform face recognition of a known or unknown person.

Right now I perform face recognition on the whole image, but I wonder whether multi-stage detection would improve the accuracy.

@AlexeyAB
Owner

@ritwikdubey Hi,

There is no built-in multi-stage prediction. You can either:

  • implement multi-stage prediction yourself inside the Darknet code, or
  • implement it in your own C/C++ code by using the darknet.so/dll library

Yes, multi-stage inference can improve accuracy if you crop the faces from the original, not-resized image and recognize each face in a separate inference, using a neural network that is well suited for face recognition.
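
For a sense of why this helps: with a typical 416x416 detector input, a face that is 200 pixels wide in a 1920x1080 frame shrinks to roughly 200 * 416/1920 ≈ 43 pixels after the whole frame is resized, whereas cropping the face out of the original frame keeps all 200 pixels for the recognition network (the numbers here are only an illustrative assumption).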

@Muhammad057

Muhammad057 commented Aug 29, 2018

Hello @ritwikdubey, I am also looking into the same problem these days. I have two separate trained YOLO models: one for detecting the license plate on an incoming vehicle, and a second for recognizing the plate contents from the detected region. I need to build a pipeline where the output of the detection model is fed to the recognition model. If you have any further insight, kindly share it with me. Thank you.

@Muhammad057

Hello @AlexeyAB, regarding your comment 'implement multi-stage prediction yourself inside the Darknet code', can you please point me to the file where I have to make changes for multi-stage prediction? Thank you.

@AlexeyAB
Owner

AlexeyAB commented Aug 30, 2018

@Muhammad057
To implement it for detection on images, you should modify this function:

void test_detector(char *datacfg, char *cfgfile, char *weightfile, char *filename, float thresh,


  1. For example, add this code here, where resnet152.cfg/weights is your model for classification (face recognition, ...):
    network net_classifier = parse_network_cfg_custom("cfg/resnet152.cfg", 1);
    load_weights(&net_classifier, "resnet152.weights");

  2. And add between these lines:

    darknet/src/detector.c

    Lines 1142 to 1143 in 18d5e4f

    if (nms) do_nms_sort(dets, nboxes, l.classes, nms);
    draw_detections_v3(im, dets, nboxes, thresh, names, alphabet, l.classes, ext_output);

    the code for the second-stage recognition, something like this (maybe I forgot something):
        for (i = 0; i < nboxes; ++i) {
            int class_id = -1;
            float prob = 0;
            // pick the best detector class for this box
            for (j = 0; j < l.classes; ++j) {
                if (dets[i].prob[j] > thresh && dets[i].prob[j] > prob) {
                    prob = dets[i].prob[j];
                    class_id = j;
                }
            }
            if (class_id >= 0) {
                // dets[i].bbox is in relative (center x, center y, w, h) coordinates,
                // so convert it to pixel coordinates on the original image before cropping
                box b = dets[i].bbox;
                int box_left = (b.x - b.w / 2.) * im.w;
                int box_top = (b.y - b.h / 2.) * im.h;
                int box_w = b.w * im.w;
                int box_h = b.h * im.h;
                image im_classify = crop_image(im, box_left, box_top, box_w, box_h);
                image r = letterbox_image(im_classify, net_classifier.w, net_classifier.h);
                float *predictions_classify = network_predict(net_classifier, r.data);

                int top = 1;
                int *indexes = calloc(top, sizeof(int)); // allocate before top_k() uses it
                top_k(predictions_classify, net_classifier.outputs, top, indexes);

                // reuse j for this loop - i still indexes the outer box loop
                for (j = 0; j < top; ++j) {
                    int index = indexes[j];
                    printf("%s: %f\n", names[index], predictions_classify[index]);
                }
                free(indexes);
                free_image(r);
                free_image(im_classify);
            }
        }
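
One detail worth noting about the snippet above: names[] there is the detector's label list that test_detector() already loaded, so the printed labels only make sense if the classifier shares the detector's classes. If the second-stage model has its own classes, you would load its own label file next to the network (data/recognition.names below is just a placeholder name):

    char **names_classifier = get_labels("data/recognition.names"); // load once, next to net_classifier

and print names_classifier[index] instead of names[index] inside the loop.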

@ritwikdubey
Author

ritwikdubey commented Aug 30, 2018

I believe you meant net_classifier in the load_weights function.

   network net_classifier = parse_network_cfg_custom("cfg/resnet152.cfg", 1);
    load_weights(&net, "resnet152.weights");

BTW, why letterbox_image the image? I can't open the code files right now, but I guess this function draws the colored box around the prediction and adds a label to it, correct? Why would I want to add that before sending the cropped image for the second prediction?

@AlexeyAB
Owner

I believe you meant net_classifier in the load_weights function.

Yes, I fixed it.

BTW, why letterbox_image the image? I can't open the code files right now, but I guess this function draws the colored box around the prediction and adds a label to it, correct? Why would I want to add that before sending the cropped image for the second prediction?

No, it doesn't draw boxes. letterbox_image() resizes the image. More about letterbox_image() and resize_image(): #232 (comment)
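
In other words, resize_image() stretches the crop to the network input size, while letterbox_image() scales it keeping the aspect ratio and pads the remaining border. A minimal sketch of the difference (crop.jpg is just a placeholder file name; both functions are in src/image.c):

    image crop = load_image_color("crop.jpg", 0, 0);                            // load at native size
    image stretched = resize_image(crop, net_classifier.w, net_classifier.h);   // distorts aspect ratio
    image boxed = letterbox_image(crop, net_classifier.w, net_classifier.h);    // keeps aspect ratio, pads
    free_image(stretched);
    free_image(boxed);
    free_image(crop);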

@Muhammad057

Muhammad057 commented Aug 31, 2018

@AlexeyAB
Thanks for the reply. OK, I have added the above code and recompiled the darknet.sln file. When I run darknet.exe, it takes the .cfg and weights files from the code below to detect the license plate on a vehicle.

net_classifier = parse_network_cfg_custom("cfg/yolo-obj.cfg", 1); // yolo-obj.cfg is the test cfg file
load_weights(&net_classifier, "yolo-obj_4600.weights"); // trained weights file

But what I want to do is a little different. I run the trained YOLO model on a live stream (an RTSP link) with this command:
'darknet.exe detector demo data/obj.data cfg/yolo-obj.cfg backup/yolo-obj_4600.weights rtsp link',

and I save the frames of the live stream in the 'result_img' folder.
Now, I want to run each frame through another trained model (the recognition model). Can you please point out the relevant changes I need to make in the src/detector.c file? Where can I add the recognition model's cfg file, weights file, obj.data and obj.names files?
Regards.

@ritwikdubey
Author

ritwikdubey commented Aug 31, 2018

@Muhammad057
Good to know.

I have two separate trained YOLO models: one for detecting the license plate on an incoming vehicle, and a second for recognizing the plate contents from the detected region. I need to build a pipeline where the output of the detection model is fed to the recognition model. If you have any further insight, kindly share it with me. Thank you.

I sort of worked on a similar problem. I found that YOLOv3 was quite good at detecting the numbers on a license plate without a character segmentation stage. I guess the way you're thinking is sufficient: 1) localize and segment the license plate region; 2) run prediction on the scaled license plate to detect the numbers.
[screenshot: predictions]
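
For anyone who wants to try the two stages separately before wiring them together in detector.c, they can be prototyped as two ordinary detector test runs; the data/cfg/weights/image names below are placeholders, and the cropped plate image is whatever stage 1 produced (in code or by hand):

    rem stage 1: detect the license plate region in the full frame
    darknet.exe detector test data/lp-detect.data cfg/lp-detect.cfg backup/lp-detect.weights frame.jpg
    rem stage 2: run the recognition model on the cropped, scaled plate image
    darknet.exe detector test data/lp-chars.data cfg/lp-chars.cfg backup/lp-chars.weights plate_crop.jpg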

@Muhammad057

@ritwikdubey
Yes, I have already detected the numbers on the license plate (which I call my recognition model) using YOLOv3. But before detecting the license plate (LP) numbers, I first segment out the license plate from the vehicle with my detection model and then feed it to the recognition model (as stated in my earlier comments). I think YOLO doesn't have any pipeline for multi-stage prediction, i.e., first detect the LP and then recognize the characters, from the same src/detector.c code. Below is a screenshot of the numbers detected on an LP:
[screenshot: recognized_image431]

@kmsravindra

kmsravindra commented Mar 27, 2019

    network net_classifier = parse_network_cfg_custom("cfg/resnet152.cfg", 1);
    load_weights(&net_classifier, "resnet152.weights");

@AlexeyAB, Hi

I have a custom classifier (InceptionResNetV2) implemented in Python/Keras that takes cropped bounding boxes as image inputs. I want to plug that classifier on top of the darknet detection. Could you please guide me on where I should define that model, compile it, and load the trained weights in this darknet detector code, and how to do it? Thanks for your help!

@Favi0

Favi0 commented Apr 8, 2019

@Muhammad057
Good to know.

I have two separate trained YOLO models: one for detecting the license plate on an incoming vehicle, and a second for recognizing the plate contents from the detected region. I need to build a pipeline where the output of the detection model is fed to the recognition model. If you have any further insight, kindly share it with me. Thank you.

I sort of worked on a similar problem. I found that YOLOv3 was quite good at detecting the numbers on a license plate without a character segmentation stage. I guess the way you're thinking is sufficient: 1) localize and segment the license plate region; 2) run prediction on the scaled license plate to detect the numbers.
[screenshot: predictions]

What do you mean by scaled?

@barzan-hayati

@Muhammad057 Were you able to use a pipeline to combine the two stages of plate detection and recognition? I found an article that does license plate recognition with one model, not two.

Real-Time Brazilian License Plate Detection and Recognition Using Deep Convolutional Neural Networks

@Muhammad057

@barzan-hayati Yes, I created a pipeline to combine both of my models. Anyway, thanks for sharing.

@barzan-hayati

@Muhammad057 Could you please explain your method for the pipeline?
