How to compute mAP of tiny yolo on VOC2007-test #350
@szm2015 Hi,
1- I will look into it and report back, thank you. Now I tried what you said about computing the AP for every class and then averaging over them to get the mAP. Here are the results: AP of class "aeroplane": 44.0196. Now it's even lower!
@szm2015 Can you show your C-code for mAP?
Hello again, I attached the code. It's a Qt project (I use the UI for plotting). Here's an overall explanation:

In lines 57 to 112 there is a for loop over 11 thresholds (from 0 to 1). Inside it (lines 63 to 108) is a loop over the txt prediction files (which I also attached). In this loop, the detections whose score is above the threshold are stored in cv::Rect objects (along with their scores and labels); then the function "FillEvaluationsMatrix" evaluates the predictions against the ground truth and fills a confusion matrix, which is initialized at the beginning of the threshold loop.

Outside the prediction loop (but still inside the threshold loop), the TP and FP values are computed from the confusion matrix in the "finalEval" function (I count the total objects of every class in the ground-truth labels and use that as the TP+FN value in the recall denominator). This function computes precision and recall and saves them in a matrix (named PRpairs) that has 20 rows (number of classes) and 11 columns (number of thresholds); this way, each class has a PR pair for every threshold at the end of the loop.

Finally, the "ComputeAPs" function computes the AP of every class from the PRpairs calculated before and averages them to get the mAP.
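For concreteness, the threshold-loop pipeline described above can be sketched in Python. All names here (fill_pr_pairs, the detection tuples) are hypothetical stand-ins for the Qt/C++ code, and the matching of each detection against the ground truth is assumed to have been done by a helper already:

```python
import numpy as np

N_CLASSES, N_THRESH = 20, 11
thresholds = np.linspace(0.0, 1.0, N_THRESH)   # 0.0, 0.1, ..., 1.0

def fill_pr_pairs(detections, gt_count_per_class):
    """detections: (class_id, score, is_true_positive) tuples, already
    matched against the ground truth. Returns a (20, 11, 2) array holding
    a (precision, recall) pair per class per threshold, like PRpairs."""
    pr_pairs = np.zeros((N_CLASSES, N_THRESH, 2))
    for ti, thr in enumerate(thresholds):
        kept = [d for d in detections if d[1] >= thr]   # drop low-score dets
        for c in range(N_CLASSES):
            tp = sum(1 for cid, _, ok in kept if cid == c and ok)
            fp = sum(1 for cid, _, ok in kept if cid == c and not ok)
            prec = tp / (tp + fp) if tp + fp else 0.0
            rec = tp / gt_count_per_class[c] if gt_count_per_class[c] else 0.0
            pr_pairs[c, ti] = (prec, rec)
    return pr_pairs
```

Each class then ends up with 11 PR pairs, from which an AP can be computed.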
@szm2015 How about the issues now? I also have a problem with the PR curve. I wonder why the recall (x-axis) only reaches 60 rather than 100?
Hello everyone, I haven't had the time to work on this matter for a while. Just now I was checking voc_eval.py and came across these lines:
It seems that detections are only counted if their ground truth is not "difficult" and also if not R['det'][jmax]. I haven't considered either of these in my code, though I have no idea what the second one is! I would appreciate any clarification!
This code is taken from the repository of the author of the Faster-RCNN detector: https://github.com/rbgirshick/py-faster-rcnn/blob/781a917b378dbfdedb45b6a56189a31982da1b43/lib/datasets/voc_eval.py#L177-L189

```python
overlaps = inters / uni
ovmax = np.max(overlaps)
jmax = np.argmax(overlaps)

if ovmax > ovthresh:
    if not R['difficult'][jmax]:
        if not R['det'][jmax]:
            tp[d] = 1.
            R['det'][jmax] = 1
        else:
            fp[d] = 1.
else:
    fp[d] = 1.
```

Where:
So if ground-truth is
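To make the snippet's flags concrete, here is a self-contained sketch of the same greedy matching (the names are mine, not voc_eval.py's): the det_used list plays the role of R['det'], so once a ground-truth box has been claimed by one detection, any further detection overlapping it becomes a false positive, and matches to difficult boxes are ignored entirely.

```python
import numpy as np

def iou(a, b):
    """Intersection-over-union of two (x1, y1, x2, y2) boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(ix2 - ix1, 0.0) * max(iy2 - iy1, 0.0)
    uni = ((a[2] - a[0]) * (a[3] - a[1])
           + (b[2] - b[0]) * (b[3] - b[1]) - inter)
    return inter / uni if uni > 0 else 0.0

def evaluate_image(dets, gt_boxes, gt_difficult, ovthresh=0.5):
    """dets must be sorted by confidence, descending.
    Returns parallel tp/fp flag lists, one entry per detection."""
    det_used = [False] * len(gt_boxes)          # plays the role of R['det']
    tp, fp = [], []
    for box in dets:
        overlaps = [iou(box, g) for g in gt_boxes]
        jmax = int(np.argmax(overlaps)) if overlaps else -1
        ovmax = overlaps[jmax] if overlaps else 0.0
        if ovmax > ovthresh:
            if gt_difficult[jmax]:
                tp.append(0); fp.append(0)      # difficult: neither TP nor FP
            elif not det_used[jmax]:
                tp.append(1); fp.append(0)      # first match: TP
                det_used[jmax] = True
            else:
                tp.append(0); fp.append(1)      # duplicate detection: FP
        else:
            tp.append(0); fp.append(1)          # no sufficient overlap: FP
    return tp, fp
```

So a second, lower-confidence detection of the same object is penalized rather than ignored; that is exactly what R['det'][jmax] encodes.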
Thank you @AlexeyAB for your complete explanation. I do something exactly like checking R['det'][jmax] in my code. I added the "difficult" checking part, but for some reason the AP got even worse; I should look into it more. Meanwhile, can you point me to the exact procedure for evaluating with voc_eval.py? Most important is the command-line invocation to get the detection results, as there are several validation functions in detector.c as far as I understand.
Hi @szm2015 @AlexeyAB, I also plot the PR curve following the link https://github.com/D-X-Y/caffe-faster-rcnn/blob/dev/examples/FRCNN/calculate_voc_ap.py and print the P-R values before the mAP is computed. I would much appreciate it if any of you could help me solve the problems above, and I hope for further discussion of the P-R curve issues.
Hi @MiZhangWhuer, as I myself am still struggling with this issue I can't be of much help to you! But hopefully, if I solve it, I will share my results. Now @AlexeyAB, I still don't know how to run voc_eval.py. My Python knowledge is really rusty! I first created the detection results using the following command (note that I'm using the pjreddie version of darknet): Then I had my detection results in a folder named voc in the results directory, with the following format: Now I added these lines at the end of the voc_eval.py code to be able to run it (told you my Python is rusty!!!):
But detpath and the others seem to be something other than simple paths, because running the code gives me the following error:
Can you please tell me how I should pass these arguments to voc_eval.py?
Hi everyone,
Now I have a more fundamental question. In this code, we just hand the previously generated detection files to the evaluation function, which (as far as I have understood) calculates only one precision-recall pair for every class and then calculates the AP. Shouldn't there be some kind of loop over different score thresholds (applied to the confidence) to give us the precision-recall curve to use for the AP calculation?
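To answer the question in spirit: voc_eval.py does not loop over score thresholds. It sorts all detections by confidence and computes a precision/recall point at every rank, so each distinct confidence effectively acts as its own threshold. A minimal sketch (the names are my own, and the per-detection TP flags are assumed to have been computed already by matching against the ground truth):

```python
import numpy as np

def pr_from_ranked(confidences, tp_flags, n_gt):
    """Precision/recall at every rank of the confidence-sorted detections.
    n_gt is the total number of ground-truth objects (TP + FN)."""
    order = np.argsort(-np.asarray(confidences, dtype=float))
    tp = np.asarray(tp_flags, dtype=float)[order]
    fp = 1.0 - tp
    tp_cum = np.cumsum(tp)                 # TPs among the top-k detections
    fp_cum = np.cumsum(fp)                 # FPs among the top-k detections
    recall = tp_cum / n_gt
    precision = tp_cum / (tp_cum + fp_cum)
    return recall, precision
```

The resulting (recall, precision) points trace the full PR curve without ever picking explicit thresholds.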
@szm2015 In your code
Also:
mAP calculation - 11 point method for PascalVOC: Lines 31 to 45 in 9c84764
@szm2015 @MiZhangWhuer I just added a cmd-file for Windows to calculate mAP. I got 56.6% for Tiny-Yolo 416x416 on PascalVOC 2007 test, which is a little less than the 57.1% stated on the site: https://pjreddie.com/darknet/yolo/ If you use Windows and Python >= 3.5:
Thank you @AlexeyAB, after digging a little more into the code I found what I had been missing: the predicted bounding boxes are ranked according to their confidence scores, and recall/precision is computed at every one of these ranks. What I had been doing instead was to consider a number of thresholds (say 20), compute a PR pair for each one (by omitting the predictions with confidence scores below the threshold in each turn), and then calculate the AP from these PR pairs. I still don't know why this way of computing AP gives such a drastically wrong result, but for now I will stick to your code. Thanks again.
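For reference, the 11-point interpolated AP used by the PascalVOC 2007 protocol can be sketched like this (a simplified reading of the voc_eval.py approach, not the exact repository code): for each recall level t in {0, 0.1, ..., 1.0}, take the maximum precision over all PR points whose recall is at least t, then average the 11 values.

```python
import numpy as np

def voc_ap_11point(recall, precision):
    """11-point interpolated average precision (PascalVOC 2007 style).
    recall/precision are parallel arrays of PR points, e.g. one per rank."""
    recall = np.asarray(recall, dtype=float)
    precision = np.asarray(precision, dtype=float)
    ap = 0.0
    for t in np.arange(0.0, 1.1, 0.1):          # 0.0, 0.1, ..., 1.0
        mask = recall >= t
        p = precision[mask].max() if mask.any() else 0.0
        ap += p / 11.0
    return ap
```

Because the max is taken over all points with recall >= t, the interpolation smooths out the sawtooth shape of the raw ranked PR curve.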
@szm2015 @MiZhangWhuer I added C-code for calculating mAP (mean average precision) using Darknet, for the VOC dataset and any custom dataset. Just use the command: But my implementation shows a lower value than reval_voc.py + voc_eval.py. If you find an error in my code and can fix it, let me know about it: Line 498 in a1af57d
I don't check the difficult flag of the ground truth as voc_eval.py does, but as far as I can see, voc_label.py already removes difficult objects at the labeling stage: Lines 37 to 38 in a1af57d
So both the site and the article state 78.6% (page 4, table 3): https://arxiv.org/pdf/1612.08242v1.pdf
I think you should still consider difficult objects, because there may be cases where the model detects a difficult object, and since that object is not listed as ground truth by voc_label.py, the code will count the detection as a false positive when it should not, which decreases precision. I think this explains the small difference between the mAP of the Python code and the C code (the Python code gets the ground truth directly from the xml files).
…On Thu, Feb 15, 2018 at 4:28 PM, Alexey ***@***.***> wrote:
@szm2015 @MiZhangWhuer
I added C-code for calculating mAP (mean average precision) using Darknet, for the VOC dataset and any custom dataset. Just use the command: darknet.exe detector map data/voc.data tiny-yolo-voc.cfg tiny-yolo-voc.weights, where the voc.data file should state the validation dataset: valid=2007_test.txt
*But my implementation shows a lower value than reval_voc.py + voc_eval.py. If you find an error in my code and can fix it, let me know about it:* https://github.com/AlexeyAB/darknet/blob/a1af57d8d60b50e8188f36b7f74752c8cc124177/src/detector.c#L498
I don't check the difficult flag of the ground truth as voc_eval.py does, but as I see, voc_label.py already removes difficult objects at the labeling stage: https://github.com/AlexeyAB/darknet/blob/a1af57d8d60b50e8188f36b7f74752c8cc124177/scripts/voc_label.py#L37-L38
- get mAP using Darknet C-code: https://github.com/AlexeyAB/darknet/blob/master/build/darknet/x64/calc_mAP.cmd
- get mAP using Python code: https://github.com/AlexeyAB/darknet/blob/master/build/darknet/x64/calc_mAP_voc_py.cmd
------------------------------
- For darknet.exe detector map data/voc.data tiny-yolo-voc.cfg tiny-yolo-voc.weights (width=416 height=416), got *mAP = 55.61%*
- but using reval_voc.py and voc_eval.py we can get *56.6%*: #350 (comment)
- but on the site: *57.1%*: https://pjreddie.com/darknet/yolo/
class = 0, name = aeroplane, ap = 61.01 %
class = 1, name = bicycle, ap = 71.18 %
class = 2, name = bird, ap = 47.84 %
class = 3, name = boat, ap = 40.23 %
class = 4, name = bottle, ap = 20.88 %
class = 5, name = bus, ap = 67.68 %
class = 6, name = car, ap = 66.21 %
class = 7, name = cat, ap = 70.46 %
class = 8, name = chair, ap = 33.77 %
class = 9, name = cow, ap = 54.15 %
class = 10, name = diningtable, ap = 55.45 %
class = 11, name = dog, ap = 62.47 %
class = 12, name = horse, ap = 71.24 %
class = 13, name = motorbike, ap = 68.72 %
class = 14, name = person, ap = 59.28 %
class = 15, name = pottedplant, ap = 27.54 %
class = 16, name = sheep, ap = 54.45 %
class = 17, name = sofa, ap = 50.07 %
class = 18, name = train, ap = 70.83 %
class = 19, name = tvmonitor, ap = 58.63 %
mean average precision (mAP) = 0.556050, or 55.61 %
Total Detection Time: 56.000000 Seconds
------------------------------
- For darknet.exe detector map data/voc.data yolo-voc.cfg yolo-voc.weights (width=544 height=544), got *mAP = 75.77%*
- but using reval_voc.py and voc_eval.py we can get *77.1%*: #350 (comment)
- but on the site: *78.6%*: https://pjreddie.com/darknet/yolo/
So both the site and the article state *78.6%* (page 4, table 3): https://arxiv.org/pdf/1612.08242v1.pdf
Lower value because:
* My implementation shows a somewhat lower value than reval_voc.py + voc_eval.py. If you find an error in my code, let me know about it.
* yolo-voc.weights was trained with the aspect ratio kept (#232 (comment)) and with some other modifications in the original repo, so you should test it on the original repo: https://github.com/pjreddie/darknet
class = 0, name = aeroplane, ap = 80.84 %
class = 1, name = bicycle, ap = 84.10 %
class = 2, name = bird, ap = 75.03 %
class = 3, name = boat, ap = 65.30 %
class = 4, name = bottle, ap = 55.22 %
class = 5, name = bus, ap = 83.66 %
class = 6, name = car, ap = 84.53 %
class = 7, name = cat, ap = 88.20 %
class = 8, name = chair, ap = 58.35 %
class = 9, name = cow, ap = 80.53 %
class = 10, name = diningtable, ap = 69.81 %
class = 11, name = dog, ap = 84.07 %
class = 12, name = horse, ap = 86.17 %
class = 13, name = motorbike, ap = 83.33 %
class = 14, name = person, ap = 78.44 %
class = 15, name = pottedplant, ap = 50.86 %
class = 16, name = sheep, ap = 77.36 %
class = 17, name = sofa, ap = 71.74 %
class = 18, name = train, ap = 82.96 %
class = 19, name = tvmonitor, ap = 74.95 %
mean average precision (mAP) = 0.757728, or 75.77 %
Total Detection Time: 214.000000 Seconds
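The difficult-object point made in the reply above can be reduced to a tiny sketch (a hypothetical helper; the matching of a detection to a ground-truth box is assumed already done). If difficult boxes are stripped from the labels before evaluation, a detection of a difficult object has nothing to match and falls into the false-positive case; voc_eval.py instead keeps such boxes flagged and simply ignores matches to them.

```python
def tally(matched_gt, gt_difficult):
    """(tp, fp) contribution of one detection under VOC rules.
    matched_gt: index of the matched ground-truth box, or None."""
    if matched_gt is None:
        return (0, 1)                  # unmatched detection: false positive
    if gt_difficult[matched_gt]:
        return (0, 0)                  # matched a difficult box: ignored
    return (1, 0)                      # true positive
```

With difficult boxes removed at labeling time, the middle case can never fire, so those detections count as FPs and precision drops slightly, consistent with the small Python-vs-C mAP gap discussed here.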
@szm2015 Yes, I think it can have an influence. Maybe I'll add a separate Python script.
Now we lose somewhere between 0.16% and 0.39% of mAP :)
Hi everyone,
The title says everything: I want to compute the mAP of tiny yolo on VOC2007-test. I have written a cpp code for this and get 39.78% mAP, whereas pjreddie reports 57.1% mAP on VOC2007-test.
I first downloaded the weights using:
wget https://pjreddie.com/media/files/tiny-yolo-voc.weights
Then performed detection with:
./darknet -i 0 detector valid cfg/voc.data cfg/tiny-yolo-voc.cfg models/tiny-yolo-voc.weights
I just changed detector.c code to save the results in a different format that was easier for me to read in my code.
I then count all the TPs and FPs (in all classes), compute precision-recall for 11 thresholds (from 0 to 1), and then the AP (with the formula mentioned in the Pascal VOC paper). Here is the PR curve I get:
My true purpose is to write a code to compute the AP for the model trained on my own custom data, but in order to verify it I am testing on a pretrained tiny yolo.
Thanks in advance for your help.