🌟 PP-PicoDet has been released; try it out and join the discussion #4420

Closed
yghstill opened this issue Nov 2, 2021 · 130 comments

Comments

@yghstill
Collaborator

yghstill commented Nov 2, 2021

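PP-PicoDet is a family of lightweight, real-time object detection models for mobile devices. We provide a range of model sizes from small to large (S, M, L, and so on), which outperform existing SOTA models.

Model highlights:

  • 🌟 High accuracy: mAP(0.5:0.95) of 30.6 with fewer than 1M parameters, and mAP(0.5:0.95) of 40.9 with 3.3M parameters.
  • 🚀 Fast: 150 FPS on SD865.
  • 😊 Deployment-friendly: PaddleInference, PaddleLite, MNN, NCNN, and OpenVINO are supported, and C++, Python, and Android demos are provided.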

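Links:

Feel free to try it out. Questions and discussion are welcome in this thread.

Comparison with other models:
picodet_map2

FAQ summary (continuously updated):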

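  • Version requirements: use the same Paddle version for training and for exporting the model, with PaddlePaddle >= 2.1.2.
  • Relationship between learning rate, number of GPUs, and batch size: follow the linear scaling rule. Almost all released config files were trained on 4 GPUs. For example, when switching to a single GPU, divide the learning rate by 4; if the batch size also changes from 80 to 40, divide the learning rate by 2 again.
  • Config priority: settings in picodet_x_coco.yml generally take priority over those in __base__. Every setting in picodet_x_coco.yml overrides the corresponding setting in __base__, so editing picodet_x_coco.yml is enough.
  • Training on your own dataset: both the COCO and VOC data formats are supported. Transfer learning is recommended to speed up convergence. Steps: copy the link to the COCO-trained pretrain weights from the PicoDet README, then set the pretrain_weights parameter in the config file to those COCO-trained weights.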
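
The linear scaling rule in the FAQ can be sketched as a small helper. This is only an illustration: the function name, the base learning rate of 0.4, and the reading of "batch size" as a per-GPU value are assumptions, not values taken from the released configs.

```python
def scale_learning_rate(base_lr, base_gpus, base_batch_size, gpus, batch_size):
    """Linear scaling rule: keep the learning rate proportional to the total batch size.

    Batch sizes are assumed to be per-GPU values.
    """
    base_total = base_gpus * base_batch_size
    new_total = gpus * batch_size
    return base_lr * new_total / base_total


# Hypothetical reference setup: 4 GPUs, batch size 80 per GPU, learning rate 0.4.
BASE_LR = 0.4

# Single GPU, same per-GPU batch size: the learning rate is divided by 4.
lr_single_gpu = scale_learning_rate(BASE_LR, 4, 80, 1, 80)

# Single GPU and batch size halved to 40: divided by 2 again (8 in total).
lr_single_gpu_half_batch = scale_learning_rate(BASE_LR, 4, 80, 1, 40)

print(lr_single_gpu, lr_single_gpu_half_batch)
```

Whatever the actual base learning rate in a given config is, the same ratio applies.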

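To make communication easier, you are welcome to scan the QR code and join the WeChat group to keep discussing how to use PP-PicoDet and to share suggestions.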

@yghstill yghstill pinned this issue Nov 2, 2021
@ghost

ghost commented Nov 2, 2021

PaddlePaddle=2.2.0 will not be released until tomorrow, right? I searched for a long time and could not find anywhere to install it.

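@yghstill
Collaborator Author

yghstill commented Nov 2, 2021

PaddlePaddle=2.2.0 will not be released until tomorrow, right? I searched for a long time and could not find anywhere to install it.

@Xwmiss Yes, the tag was only created today. I will update this thread once the installation packages are published in the next couple of days. Until then you can use the dev version of Paddle: https://www.paddlepaddle.org.cn/install/quick?docurl=/documentation/docs/zh/develop/install/pip/linux-pip.html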

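@ghost

ghost commented Nov 2, 2021

PaddlePaddle=2.2.0 will not be released until tomorrow, right? I searched for a long time and could not find anywhere to install it.

@Xwmiss Yes, the tag was only created today. I will update this thread once the installation packages are published in the next couple of days. Until then you can use the dev version of Paddle: https://www.paddlepaddle.org.cn/install/quick?docurl=/documentation/docs/zh/develop/install/pip/linux-pip.html

OK, great 👍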

@PaddlePaddle PaddlePaddle deleted a comment Nov 2, 2021
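@yghstill
Collaborator Author

yghstill commented Nov 2, 2021

@Xwmiss

  1. pretrain_weights sets the pretrained weights. Check that the path is correct and whether what gets downloaded is the model weights or the dataset. If it is the dataset, change the path in config/datasets/coco_detection.
  2. weights points to the path where the final trained weights are saved.
  3. If snapshot_epoch is set here, it takes priority over the value in runtime.yml.
  4. cycle_epoch is the number of epochs between resets in Cycle-EMA; the EMA is reset every fixed number of epochs.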
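
The priority rule in item 3 (a value set in the model config wins over the same key in runtime.yml) behaves like a recursive dictionary merge. The sketch below only illustrates that idea; merge_config and the sample values are hypothetical and are not PaddleDetection's actual config loader.

```python
import copy


def merge_config(base, override):
    """Return a copy of `base` with `override` applied; nested dicts are merged recursively."""
    merged = copy.deepcopy(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = merge_config(merged[key], value)
        else:
            merged[key] = copy.deepcopy(value)
    return merged


# Hypothetical contents of a base file and of a model config that includes it.
runtime_cfg = {"snapshot_epoch": 1, "use_gpu": True}
model_cfg = {"snapshot_epoch": 10}

final_cfg = merge_config(runtime_cfg, model_cfg)
print(final_cfg)
```

Keys that the model config does not mention keep their base values, which is why editing only the model config is usually enough.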

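@ghost

ghost commented Nov 2, 2021

image
May I ask: when I train with the command above and also pass --slim_config, the model does not seem to be training. Is that because training and quantization have to be done as two separate steps?
Thank you for your reply!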

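@yghstill
Collaborator Author

yghstill commented Nov 2, 2021

@Xwmiss You should train the model first, then set pretrain_weights in slim_config to the trained FP32 model, and then run quantization-aware training. The default command currently downloads the model pretrained on COCO and runs quantization-aware training on it.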

@ghost

ghost commented Nov 2, 2021

Thank you very much for your reply! 👍

@yu937861

yu937861 commented Nov 3, 2021

Training error
Config file
python tools/train.py -c configs/picodet/picodet_l_416_coco.yml

Failing line
File "PaddleDetection/ppdet/modeling/heads/simota_head.py", line 351, in
featmap.shape[-2] * featmap.shape[-1] for featmap in cls_scores
AttributeError: 'list' object has no attribute 'shape'

@yghstill
Collaborator Author

yghstill commented Nov 3, 2021

@yu937861 Fixed in #4438. Please pull the latest code and try again.

@ChinaRush

Does picodet have a config file for the VOC format?

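@lyuwenyu
Collaborator

lyuwenyu commented Nov 3, 2021

@ChinaRush ppdet supports it. This has little to do with the model: you can configure the datasets yourself, or convert the data to COCO format and run with that. There is a tool for this in tools.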

@yu937861

yu937861 commented Nov 3, 2021

@yu937861 Fixed in #4438. Please pull the latest code and try again.

Thanks

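@thenighthunter0

thenighthunter0 commented Nov 3, 2021

Hello, what might be causing the problem below? I followed the third_engine/demo_openvino tutorial, and OpenVINO on my machine runs other models fine.
Problem: picodet third_engine/demo_openvino produces empty model output; the visualization shows no detection boxes, only the original image.
Description:
picodet_m_416.bin
start init model
success
picodet min = 14.95 max = 15.65 avg = 15.47
Found the cause:
resize_uniform(image, resized_img, cv::Size(320, 320), effect_roi);
auto detector = PicoDet("../weight/picodet_m_416.xml");
I changed the size to 416. I am not sure whether it was a model accuracy problem or a post-processing mapping problem.
image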
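
The mismatch above (a 320x320 resize feeding a 416 model) can be made visible by computing the resize geometry from a single input-size constant. This is a minimal sketch of an aspect-preserving resize with centered padding; the function name, the centering, and the 1280x720 sample frame are assumptions for illustration, and the demo's own resize_uniform may behave differently.

```python
def uniform_resize_geometry(src_w, src_h, dst_w, dst_h):
    """Scaled size and padding offsets for an aspect-preserving resize into dst_w x dst_h."""
    scale = min(dst_w / src_w, dst_h / src_h)
    new_w = round(src_w * scale)
    new_h = round(src_h * scale)
    # Center the scaled image inside the destination canvas.
    pad_x = (dst_w - new_w) // 2
    pad_y = (dst_h - new_h) // 2
    return new_w, new_h, pad_x, pad_y


# Keep one constant that matches the exported model (picodet_m_416 expects 416).
MODEL_INPUT_SIZE = 416

geometry = uniform_resize_geometry(1280, 720, MODEL_INPUT_SIZE, MODEL_INPUT_SIZE)
print(geometry)
```

Deriving both the resize target and the post-processing mapping from the same constant avoids the two getting out of sync.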

@gaorui999

Error reported after training:
AttributeError: 'list' object has no attribute 'shape'

@yghstill
Collaborator Author

yghstill commented Nov 3, 2021

@gaorui999 This issue has been fixed; just update to the latest code.

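@yu937861

yu937861 commented Nov 3, 2021

Training
python tools/train.py -c configs/picodet/picodet_l_416_coco.yml

On my own dataset, the AP50 at eval after training is only 0.224. The only thing I changed in the config file is num_classes. Is there anything else that needs to be changed?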

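@yghstill
Collaborator Author

yghstill commented Nov 4, 2021

@yu937861 How large is your dataset? By default only the backbone is pretrained, on ImageNet. Mobile models converge slowly, so if your dataset is small, we recommend setting pretrain_weights to the COCO-trained weights directly; transfer learning will then speed up convergence.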

@Xiaoyw1998

Why does picodet_l use epoch=250 while CosineDecay uses max_epochs=300? Shouldn't the two be the same?

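@yu937861

yu937861 commented Nov 4, 2021

@yu937861 How large is your dataset? By default only the backbone is pretrained, on ImageNet. Mobile models converge slowly, so if your dataset is small, we recommend setting pretrain_weights to the COCO-trained weights directly; transfer learning will then speed up convergence.

OK, I will try again. Thanks.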

@yghstill
Collaborator Author

yghstill commented Nov 4, 2021

Why does picodet_l use epoch=250 while CosineDecay uses max_epochs=300? Shouldn't the two be the same?

@Xiaoyw1998 PicoDet-l is a larger model and converges to its best mAP earlier, so the total number of epochs is reduced to 250.
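
One consequence of epoch=250 with max_epochs=300 is that training stops before the cosine schedule reaches its minimum. The sketch below uses the textbook cosine-annealing formula to show this; the base learning rate of 0.4 is hypothetical, warmup is ignored, and this is not claimed to be the exact CosineDecay implementation in PaddleDetection.

```python
import math


def cosine_lr(base_lr, epoch, max_epochs, min_lr=0.0):
    """Textbook cosine annealing from base_lr (epoch 0) down to min_lr (epoch max_epochs)."""
    progress = min(epoch, max_epochs) / max_epochs
    return min_lr + 0.5 * (base_lr - min_lr) * (1.0 + math.cos(math.pi * progress))


BASE_LR = 0.4  # hypothetical value

lr_at_epoch_250 = cosine_lr(BASE_LR, 250, 300)  # where picodet_l stops
lr_at_epoch_300 = cosine_lr(BASE_LR, 300, 300)  # end of the full schedule

print(lr_at_epoch_250, lr_at_epoch_300)
```

With these settings the learning rate at epoch 250 is small but still non-zero, so the two values do not have to match for training to finish sensibly.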

@yghstill
Collaborator Author

yghstill commented Nov 4, 2021

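@zwhua006

zwhua006 commented Nov 4, 2021

I would like to ask: cycle-epochs is presumably an empirical value. If the number of training epochs is no longer 250 but 150 or fewer, what value would be suitable? Does this setting have a noticeable effect on the results?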

@thenighthunter0

Hello, picodet_s_coco was converted to OpenVINO successfully, but calling it with third_engine/demo_openvino fails with the following error:
start init model
success
begin inference
terminate called after throwing an instance of 'InferenceEngine::GeneralError'
what(): Cannot find blob with name: save_infer_model/scale_4.tmp_1
Aborted (core dumped)

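@yghstill
Collaborator Author

yghstill commented Nov 4, 2021

I would like to ask: cycle-epochs is presumably an empirical value. If the number of training epochs is no longer 250 but 150 or fewer, what value would be suitable? Does this setting have a noticeable effect on the results?

@zwhua006 The cycle_epoch parameter controls the interval, in epochs, between resets in Cycle-EMA. Our experiments showed that 40 is a good setting. The number of training epochs depends on how the model converges; on COCO we use 300 or 250 epochs by default.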
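
To make the reset behaviour concrete, here is a minimal sketch of an exponential moving average that restarts every cycle_epoch epochs. The class name, the bias correction, the choice to zero the accumulators on reset, and the scalar weights are all assumptions made for illustration; the real Cycle-EMA in PaddleDetection may differ in its details.

```python
class CycleEMA:
    """Bias-corrected EMA over named scalar weights, restarted every `cycle_epoch` epochs."""

    def __init__(self, decay=0.9998, cycle_epoch=40):
        self.decay = decay
        self.cycle_epoch = cycle_epoch
        self.reset()

    def reset(self):
        # Forget all accumulated history.
        self.step = 0
        self.state = {}

    def update(self, weights):
        # Called once per training step with the current weights.
        self.step += 1
        for name, value in weights.items():
            previous = self.state.get(name, 0.0)
            self.state[name] = self.decay * previous + (1.0 - self.decay) * value

    def averaged(self):
        # Bias correction compensates for the zero initialisation.
        correction = 1.0 - self.decay ** self.step
        return {name: value / correction for name, value in self.state.items()}

    def end_epoch(self, epoch):
        # `epoch` is 1-based; restart the average at every multiple of cycle_epoch.
        if self.cycle_epoch > 0 and epoch % self.cycle_epoch == 0:
            self.reset()


# Toy run: one update per epoch, reset after every 2nd epoch.
ema = CycleEMA(decay=0.9, cycle_epoch=2)
for epoch in range(1, 4):
    ema.update({"w": float(epoch)})
    snapshot = ema.averaged()
    ema.end_epoch(epoch)
print(snapshot)
```

In the toy run, the average at epoch 3 depends only on the epoch-3 weight because the history was cleared after epoch 2.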

@yghstill
Collaborator Author

yghstill commented Nov 4, 2021

@thenighthunter0 We will investigate the OpenVINO issue carefully.

@17076372880

image

May I ask why this problem occurs? The log above shows that the COCO data has been loaded, but further down it says there is no COCO data.

@yghstill
Collaborator Author

yghstill commented Nov 4, 2021

@17076372880 The COCO validation set in EvalReader is loaded first, and then the training set. Please check whether dataset/coco/annotations/train_anno.json exists.
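
A quick way to run the suggested check is a small script that verifies the file exists and contains the usual top-level COCO keys. This is only a sketch: check_coco_annotation is a hypothetical helper, not part of PaddleDetection, and it does not validate the annotation contents in depth.

```python
import json
import os


def check_coco_annotation(path):
    """Return (ok, message) describing whether `path` looks like a usable COCO annotation file."""
    if not os.path.isfile(path):
        return False, f"missing file: {path}"
    with open(path, "r", encoding="utf-8") as f:
        data = json.load(f)
    missing = [key for key in ("images", "annotations", "categories") if key not in data]
    if missing:
        return False, f"missing keys: {missing}"
    if not data["annotations"]:
        return False, "file contains no annotations"
    return True, f"{len(data['images'])} images, {len(data['annotations'])} annotations"


# Path taken from the reply above; adjust it to your dataset config.
ok, message = check_coco_annotation("dataset/coco/annotations/train_anno.json")
print(ok, message)
```

An existing file with an empty annotations list is one case where a file check alone would pass while the reader still finds no training samples.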

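@17076372880

@17076372880 The COCO validation set in EvalReader is loaded first, and then the training set. Please check whether dataset/coco/annotations/train_anno.json exists.

It does exist. The dataset and the config file (coco_detection.yml) are the ones I used earlier to run YOLOv3 on version 2.1, and there was no problem then.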

@lguowang

lguowang commented Mar 8, 2022

Hi everyone, I trained PicoDet with pp and then exported the model for serving deployment, but prediction fails. The main error is: {'err_no': 8, 'err_msg': "(data_id=0 log_id=0) [ppyolo|0] Failed to postprocess: 'transpose_17.tmp_0.lod'", 'key': [], 'value': [], 'tensors': []}
What should fetch_list be set to?

serving_server_conf.prototxt in serving_server:
feed_var {
name: "image"
alias_name: "image"
is_lod_tensor: false
feed_type: 1
shape: 1
shape: 3
shape: 416
shape: 416
}
fetch_var {
name: "transpose_10.tmp_0"
alias_name: "transpose_10.tmp_0"
is_lod_tensor: false
fetch_type: 1
shape: 1
shape: 2704
shape: 4
}
fetch_var {
name: "transpose_11.tmp_0"
alias_name: "transpose_11.tmp_0"
is_lod_tensor: false
fetch_type: 1
shape: 1
shape: 2704
shape: 32
}
fetch_var {
name: "transpose_12.tmp_0"
alias_name: "transpose_12.tmp_0"
is_lod_tensor: false
fetch_type: 1
shape: 1
shape: 676
shape: 4
}
fetch_var {
name: "transpose_13.tmp_0"
alias_name: "transpose_13.tmp_0"
is_lod_tensor: false
fetch_type: 1
shape: 1
shape: 676
shape: 32
}
fetch_var {
name: "transpose_14.tmp_0"
alias_name: "transpose_14.tmp_0"
is_lod_tensor: false
fetch_type: 1
shape: 1
shape: 169
shape: 4
}
fetch_var {
name: "transpose_15.tmp_0"
alias_name: "transpose_15.tmp_0"
is_lod_tensor: false
fetch_type: 1
shape: 1
shape: 169
shape: 32
}
fetch_var {
name: "transpose_16.tmp_0"
alias_name: "transpose_16.tmp_0"
is_lod_tensor: false
fetch_type: 1
shape: 1
shape: 49
shape: 4
}
fetch_var {
name: "transpose_17.tmp_0"
alias_name: "transpose_17.tmp_0"
is_lod_tensor: false
fetch_type: 1
shape: 1
shape: 49
shape: 32
}

My config.yml:
dag:
  is_thread_op: false
  tracer:
    interval_s: 30
http_port: 18888
op:
  ppyolo:
    concurrency: 1
    local_service_conf:
      client_type: local_predictor
      device_type: 1
      devices: '0'
      fetch_list:
      - transpose_17.tmp_0
      model_config: serving_server/
rpc_port: 9998
worker_num: 2

@sdreamforchen

python tools/export_model.py -c configs/picodet/picodet_L_640_coco.yml -o weights=https://paddledet.bj.bcebos.com/models/pretrained/ESNet_x1_25_pretrained.pdparams --output_dir=inference_model TestReader.inputs_def.image_shape=[3,640,640]
1. I changed the activation functions, and 2. ran the command above. After debugging step by step, I now get the error below and really do not know how to proceed.
[Hint: 'cudaErrorInitializationError'. The API call failed because the CUDA driver and runtime could not be initialized. ] (at /paddle/paddle/fluid/platform/gpu_info.cc:108)
3. Installed versions: paddlepaddle-gpu 2.2.2, paddledet 2.3.0, paddle2onnx 0.9.1

@sdreamforchen

Also, a question: looking at the cfg file, is this the only data augmentation used? - RandomFlip: {prob: 0.5}

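@yghstill
Collaborator Author

@sdreamforchen Data augmentation includes crop, flip, and RandomDistort.

Does the exported unmodified model give correct predictions? We would need to see where you made changes and which activation functions you changed.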

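@sdreamforchen

Hello, I changed hard sigmoid and hard swish (implemented on top of relu6), because my downstream embedded platform also has some problems supporting these two functions. After the change I tested with eval.py and everything is correct, but latency is high: only 17 FPS on a Titan RTX with the l-640 model.
1. After converting to ONNX, a new problem appeared: conv and bn differ between the official ONNX model I downloaded and the one I converted. In the official model they are fused, while in mine they are separate. I do not know why; I checked with Netron.
2. Also, after installing nccl and so on, I am currently training on a single GPU. Even though I set export CUDA_VISIBLE_DEVICES=4, train still uses GPU:0, while eval has no such problem. I can study Paddle and read the code on my own over time, but if it is convenient, I would appreciate an answer. Thanks.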

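@sdreamforchen

@sdreamforchen Data augmentation includes crop, flip, and RandomDistort.

Does the exported unmodified model give correct predictions? We would need to see where you made changes and which activation functions you changed.

The reason I changed the activation functions is the hardsigmoid function in the converted ONNX model: the downstream embedded platform reports an error on it. Today I confirmed the problem with Netron: the converted alpha is 0.166667, not 0.2.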
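
The two alpha values are not interchangeable. A hard sigmoid has the form clip(alpha * x + beta, 0, 1); 0.2 is the default alpha of the ONNX HardSigmoid operator, while 0.166667 is 1/6, which is exactly what a relu6-based formulation relu6(x + 3) / 6 produces. The scalar sketch below compares them; the sample points are arbitrary and the helper names are chosen for illustration.

```python
def hard_sigmoid(x, alpha, beta=0.5):
    """Generic hard sigmoid: clip(alpha * x + beta, 0, 1)."""
    return max(0.0, min(1.0, alpha * x + beta))


def relu6(x):
    return max(0.0, min(6.0, x))


def hard_sigmoid_relu6(x):
    """Hard sigmoid written with relu6, equivalent to alpha = 1/6 and beta = 0.5."""
    return relu6(x + 3.0) / 6.0


samples = [-4.0, -3.0, -1.0, 0.0, 1.0, 2.5, 3.0, 4.0]

# relu6 form versus alpha = 1/6: should agree up to floating-point rounding.
gap_relu6_vs_sixth = max(abs(hard_sigmoid(x, 1 / 6) - hard_sigmoid_relu6(x)) for x in samples)

# alpha = 0.2 versus alpha = 1/6: a real numerical difference.
gap_fifth_vs_sixth = max(abs(hard_sigmoid(x, 0.2) - hard_sigmoid(x, 1 / 6)) for x in samples)

print(gap_relu6_vs_sixth, gap_fifth_vs_sixth)
```

So a runtime that silently assumes alpha = 0.2 would change the outputs of a model exported with alpha = 1/6, whereas rewriting the activation with relu6 preserves them.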

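@yghstill
Collaborator Author

@sdreamforchen

  1. Inference latency has to be measured on the actual target hardware. The FPS printed by the eval.py script is only the forward pass of the training framework, which is not the same as running the inference library; the inference library applies inference acceleration and is faster.
  2. After converting to ONNX, you need to run the onnx-simplifier tool to simplify and fuse the model; you will then see that conv and bn have been fused.
  3. For the training GPU, setting the CUDA_VISIBLE_DEVICES variable is normally enough. Could you check whether the environment variable failed to take effect?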
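
For item 3, one way to rule out a shell export that did not reach the training process is to set the variable from Python before the framework is imported. The helper below is a hypothetical sketch. Note also that with CUDA_VISIBLE_DEVICES=4 the process sees physical GPU 4 as logical device 0, so a log line mentioning GPU 0 may still refer to the card you selected.

```python
import os


def select_gpus(device_ids):
    """Restrict this process to the given physical GPU ids.

    Must run before the deep learning framework initialises CUDA.
    """
    value = ",".join(str(device_id) for device_id in device_ids)
    os.environ["CUDA_VISIBLE_DEVICES"] = value
    return value


visible = select_gpus([4])
print("CUDA_VISIBLE_DEVICES =", visible)

# Import the framework only after the variable is set, for example:
# import paddle
```

Checking nvidia-smi while training runs shows which physical card is actually busy.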

@tb5874

tb5874 commented Mar 14, 2022

Hello, yghstill
PaddleDetection is a wonderful platform. Thank you.

May I ask a question about PaddleDetection PP-PicoDet and converting from PyTorch to TensorRT?

I found some information about PyTorch to TensorRT.
I think TensorRT improves computation performance when PP-PicoDet does parallel processing.
If the PP-PicoDet team has an idea or plan regarding PyTorch to TensorRT, could you tell me?

  • While writing this question, I found your 'Convert to ONNX' guide.
    Is that currently the best way to go from PyTorch to TensorRT? ( PyTorch -> ONNX -> TensorRT )
    Does your team have another plan?

Thank you.

  • When I woke up in the morning, I realized that the question was wrong. ;) Sorry.
    Please ignore all of the questions above.
    The question now is: does your team have a plan for converting ONNX to TensorRT?
    X_D Thank you for your understanding.

@yghstill
Collaborator Author

@tb5874 PP-PicoDet is a mobile model, and there is no plan to support TensorRT for it. The upcoming PP-YOLOE model will support TensorRT, so stay tuned.

@sdreamforchen

So good! A question: the nccl-related message tells us to install nccl2. Is that different from conda install -c anaconda nccl? Or must nccl be installed by following https://docs.nvidia.com/deeplearning/nccl/install-guide/index.html?

@tb5874

tb5874 commented Mar 17, 2022

@yghstill Thank you for your kind explanation.
So, if I use low computing power (CPU with GPU, like the Nvidia Jetson Xavier NX, though not necessarily limited to Nvidia platforms),
(I know that Nvidia GPUs are currently the easiest to use, so I do not mind if you limit the answer to Nvidia.)
is the upcoming PP-YOLOE the model you recommend most?

I have an AMD Ryzen 5 5600G (6 cores, 3.9GHz) with an RTX3060. (I also have another GPU, an RTX3090, so I will test on that too.)
I also have an Nvidia Jetson Xavier NX (6 cores, 1.4GHz) with the Volta architecture (384 CUDA cores, 48 Tensor cores).
I just want to test this wonderful object detector :)

The target is low computing power, like the environments above.
( Nvidia Jetson Xavier NX or a normal desktop )

Could you recommend the best PaddleDetection object detector for environments like these?

Thank you.

@yghstill
Collaborator Author

@tb5874 We also recommend the PP-YOLOE model for that case. We tested it on Jetson Xavier NX and it performs well.

@tb5874

tb5874 commented Mar 17, 2022

@yghstill Thank you!
So, to understand the upcoming PP-YOLOE, is PP-YOLO the background, or is PP-YOLOE a separate paper?
When I read the PP-YOLO paper ( https://arxiv.org/pdf/2007.12099.pdf ), I found PP-YOLO with the method 'E + Grid Sensitive'.
Does that not mean PP-YOLOE? X_D

If it does not mean PP-YOLOE, should I pre-test PP-YOLOE?

Thank you.

@yghstill
Collaborator Author

@tb5874 PP-YOLOE is another paper, and it is coming soon.

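@yuexiayiren159aaa

I would like to know what the outputs of the picodet network represent. There are 8 outputs here, and I need to write the post-processing myself.

Something like the YOLO outputs (13,13,255), (26,26,255), (52,52,255): a 13x13 feature map, where 255 = 3 x 85, 85 = 80 + 5, and 5 = x, y, w, h, c

image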
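
This question was not answered in the thread, but the serving config posted earlier gives a hint about the layout: for a 416x416 input, its eight outputs come in pairs whose middle dimension is 2704, 676, 169, and 49. Those numbers match the cell counts of feature maps at strides 8, 16, 32, and 64, as the sketch below checks. The stride values are an assumption inferred from that arithmetic; the last dimension in each pair (4 and 32 in that config) would then be per-cell class scores and box regression values, which is also an assumption rather than something confirmed here.

```python
import math


def grid_points(input_size, strides=(8, 16, 32, 64)):
    """Number of feature-map cells per stride for a square input of side `input_size`."""
    return [math.ceil(input_size / stride) ** 2 for stride in strides]


# Compare with the shapes in the serving_server_conf.prototxt posted earlier in the thread.
points_416 = grid_points(416)
print(points_416)
```

Post-processing code written along these lines would iterate over the four scales, pair each score tensor with its box tensor, and map cell indices back to image coordinates using the stride.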

@niancheng

niancheng commented Apr 8, 2022 via email

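@cv-nlp

cv-nlp commented Apr 12, 2022

When picodet is converted to MNN with the latest version, the outputs are transpose nodes, whereas earlier versions output names like save_infer_model/scale_4.tmp_1. The MNN demo inference now has problems. Could you take a look?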

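@yghstill
Collaborator Author

When picodet is converted to MNN with the latest version, the outputs are transpose nodes, whereas earlier versions output names like save_infer_model/scale_4.tmp_1. The MNN demo inference now has problems. Could you take a look?

@cv-nlp OK, we will fix this issue.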

@yghstill yghstill unpinned this issue Apr 16, 2022
@Costwen

Costwen commented Jul 1, 2022

After training a model with picodet_640_l, prediction with the dynamic-graph files gives correct results, but after exporting the model, top and bottom are always inf. Why is that?

@niancheng

niancheng commented Jul 1, 2022 via email

@niancheng

niancheng commented Oct 4, 2022 via email

@DarkWings520

image
Why does the PicoDet-s model with 320 input that I exported with paddlex have as many as 3.75M parameters instead of 0.99M? What should I do to get 0.99M? I need to deploy on Windows 10.

@niancheng

niancheng commented Jan 5, 2023 via email

@DarkWings520

DarkWings520 commented Jan 5, 2023 via email

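@GoAlers

GoAlers commented Nov 8, 2023

After training with picodet, testing runs fine, but exporting the model shows the message below. What should I do? Is it a version problem, and if so, which version should I update to?

1699423178071

My versions

1699423319341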

@niancheng

niancheng commented Nov 8, 2023 via email

@DarkWings520

DarkWings520 commented Nov 8, 2023 via email
