
SOTA claims vs leaderboards misalignment #40

Open · LifeIsStrange opened this issue Jul 7, 2022 · 7 comments

Comments

@LifeIsStrange

@WongKinYiu @AlexeyAB
Hi, friendly ping.

YOLOv7 surpasses all known object detectors in both speed and accuracy in the range from 5 FPS to 160 FPS and has the highest accuracy 56.8% AP among all known real-time object detectors with 30 FPS or higher on GPU V100.

This is a weird claim when you actually rank #20 on COCO.
If we exclude all models with extra training data, you still rank #11.
The #1 without extra data is Dual-Swin-L (HTC, multi-scale) with 60.1 box AP;
with extra data it is DINO (Swin-L, multi-scale) with 63.3 box AP.

@AlexeyAB (Collaborator) commented Jul 7, 2022

They are much slower than 5 FPS on a Tesla V100 GPU, and they are not real-time.

  • Dual-Swin-L (HTC) 1600x1600 - 59.1% AP - 1.5 FPS on V100 - isn't real-time - ~2000% slower (FPS) than YOLOv7-e6e
  • Dual-Swin-L (HTC, multi-scale) - 60.1% AP - 0.3 FPS on V100 - isn't real-time - ~12000% slower (FPS) than YOLOv7-e6e
  • DINO-5scale-R50 (10 FPS, 51.0% AP) is less accurate and ~1500% slower (FPS) than YOLOv7 (161 FPS, 51.2% AP)
  • DINO (Swin-L, multi-scale) with 63.3 box AP - additional training datasets are used (so no fair comparison), no publicly available code or models, and it runs slower than 1 FPS - isn't real-time; ~10000% slower than YOLOv7-e6e

Dual-Swin-L (HTC) and DINO-5scale (R50) are both in Table 9: https://arxiv.org/abs/2207.02696
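
As a side note, here is a minimal sketch of how those "% slower" figures fall out of the raw FPS numbers. The 36 FPS figure for YOLOv7-e6e is the one quoted later in this thread; the small differences from the quoted percentages come from rounding.

```python
# Hypothetical helper, not from the YOLOv7 codebase: derives the
# "% slower" figures quoted above from raw throughput numbers.

def percent_slower(fps_ref: float, fps_other: float) -> float:
    """How much slower fps_other is than fps_ref, in percent of throughput."""
    return (fps_ref / fps_other - 1.0) * 100.0

print(percent_slower(36, 1.5))  # Dual-Swin-L (HTC) vs YOLOv7-e6e  -> ~2300 (quoted ~2000%)
print(percent_slower(36, 0.3))  # Dual-Swin-L multi-scale          -> ~11900 (quoted ~12000%)
print(percent_slower(161, 10))  # DINO-5scale-R50 vs YOLOv7        -> ~1510 (quoted ~1500%)
```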

@LifeIsStrange (Author) commented Jul 8, 2022

@AlexeyAB Great answer! I can see the significant value proposition of this implementation now :)
So how about you update the abstract from

YOLOv7 surpasses all known object detectors

to

YOLOv7 surpasses all known real-time object detectors

Bonus question:
how does it compare to the recently announced YOLOv6? https://github.com/meituan/YOLOv6

@AlexeyAB (Collaborator) commented Jul 10, 2022

YOLOv7 surpasses all known object detectors

to

YOLOv7 surpasses all known real-time object detectors

Real-time is 30 FPS or higher.

YOLOv7 surpasses not only real-time detectors from 30 to 160 FPS, but also non-real-time detectors in the range from 4 to 30 FPS.

how does it compare to the recently announced YOLOv6? https://github.com/meituan/YOLOv6

Page 11: https://arxiv.org/pdf/2207.02696.pdf

[image: comparison with YOLOv6 from page 11 of the paper]

@LifeIsStrange (Author) commented Jul 10, 2022

@AlexeyAB
Fair enough, I wish every paper would defend its value as well as you did, in an evidence-based way :).
However, it seems to me that YOLOR-D6 beats YOLOv7 (in some FPS range at least).
YOLOR-D6 is not YOLOv6; it achieves 57.3% AP, which is 0.5% more than YOLOv7, at 34 FPS while YOLOv7 runs at 36 FPS, if I understand correctly.
Still, YOLOR-D6 does use extra training data. But at the end of the day, end users want a fast model with the best accuracy and will generally accept extra training data for pragmatism's sake.
Hence the following questions:
Do you plan on making a YOLOv7 version with improved accuracy by leveraging extra training data?
Secondly, I believe you could improve the state of the art without significantly altering performance by being the first to use the following very-simple-to-adopt innovations for object detection:
https://github.com/lessw2020/Ranger21
or
https://github.com/lessw2020/Ranger-Deep-Learning-Optimizer
https://arxiv.org/abs/2106.13731

It includes generally applicable innovations that improve accuracy, such as:
https://github.com/digantamisra98/Mish
The Mish activation function is in most cases the best activation function, often yielding a 0.5-1% accuracy increase for free.
Ranger can additionally use gradient centralization,
https://github.com/Yonghongwei/Gradient-Centralization
which also generally gives free gains.
It can then use a synergetic combination of optimizers,
such as RAdam in place of Adam
https://github.com/LiyuanLucasLiu/RAdam
+
the complementary Lookahead
https://github.com/michaelrzhang/lookahead
and others.

This library makes the integration and selection of optimization passes easy (see the sketch after this comment for how some of these pieces fit together). It is a tragedy that these innovations are generally ignored despite their huge potential to improve SOTA for free on key tasks.
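
For illustration only, here is a minimal PyTorch sketch of that kind of combination. `torch.optim.RAdam` and `nn.Mish` ship with recent PyTorch releases; the `Lookahead` class below is a simplified re-implementation of the idea behind the linked repo, not that repo's actual code, and Ranger21's other passes (gradient centralization, warmup schedules, etc.) are omitted.

```python
# Sketch only: RAdam as the inner ("fast") optimizer, a minimal Lookahead
# wrapper around it, and Mish as the activation. torch.optim.RAdam and
# nn.Mish exist in recent PyTorch; this Lookahead is a simplified
# illustration of the idea, not the linked repo's implementation.
import torch
import torch.nn as nn


class Lookahead:
    """Run the inner optimizer for k steps, then interpolate a set of
    'slow' weights toward the current 'fast' weights and restart the
    fast weights from the slow ones."""

    def __init__(self, inner, k=5, alpha=0.5):
        self.inner, self.k, self.alpha = inner, k, alpha
        self.steps = 0
        # one detached slow copy per parameter
        self.slow = [[p.detach().clone() for p in g["params"]]
                     for g in inner.param_groups]

    def zero_grad(self):
        self.inner.zero_grad()

    @torch.no_grad()
    def step(self):
        self.inner.step()
        self.steps += 1
        if self.steps % self.k == 0:
            for group, slow_params in zip(self.inner.param_groups, self.slow):
                for fast, slow in zip(group["params"], slow_params):
                    slow += self.alpha * (fast - slow)  # slow <- slow + a*(fast - slow)
                    fast.copy_(slow)                    # fast restarts from slow


# Toy model using Mish instead of ReLU/SiLU
model = nn.Sequential(nn.Linear(8, 16), nn.Mish(), nn.Linear(16, 1))
opt = Lookahead(torch.optim.RAdam(model.parameters(), lr=1e-3))

x, y = torch.randn(32, 8), torch.randn(32, 1)
for _ in range(20):
    opt.zero_grad()
    loss = nn.functional.mse_loss(model(x), y)
    loss.backward()
    opt.step()
```

Whether any of this translates into AP gains for YOLOv7 specifically would of course have to be measured on COCO rather than assumed.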

@AlexeyAB (Collaborator)

Still, YOLOR-D6 does use extra training data. But at the end of the day, end users want a fast model with the best accuracy and will generally accept extra training data for pragmatism's sake.

If you train your own model on your custom dataset, you will get higher accuracy with YOLOv7 than with YOLOR. And YOLOv7 is faster.

@silvada95

What definition do you use to decide whether a detector is real-time or not? I've seen a lot of authors mention it in their work, but with no definition at all...

@SteTala97

What definition do you use to decide whether a detector is real-time or not?

AlexeyAB commented on Jul 10, 2022:

Real-time is 30 FPS or higher.

So, real-time is 30 FPS or higher.
It commonly refers to the fact that if your input comes from a 30 FPS camera, or you are processing a video captured by a 30 FPS camera (the most common video frame rate), there is no delay between one frame and the next. Of course, this also means that if the input rate of your system is e.g. 10 FPS, a model that runs at 10 FPS can be considered "real-time" for your application (see the sketch below).
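
A minimal sketch of that frame-budget reasoning (the function name and numbers are illustrative, not from any codebase):

```python
# Illustrative only: "real-time" is relative to the input frame rate.

def is_realtime(model_fps: float, source_fps: float = 30.0) -> bool:
    """A detector keeps up with a video source when its per-frame
    latency fits inside the source's frame period."""
    frame_budget_ms = 1000.0 / source_fps   # ~33.3 ms at 30 FPS
    model_latency_ms = 1000.0 / model_fps
    return model_latency_ms <= frame_budget_ms

print(is_realtime(36))       # YOLOv7-e6e vs a 30 FPS camera -> True
print(is_realtime(10))       # a 10 FPS model vs a 30 FPS camera -> False
print(is_realtime(10, 10))   # ...but real-time for a 10 FPS input -> True
```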
