[Fix] Bug in post dark udp #1079

X00123 · 2021-12-13T08:44:14Z

Motivation

Met a bug when training with udp, look like this

error: (-215:Assertion failed) top >= 0 && bottom >= 0 && left >= 0 && right >= 0 && _src.dims() <= 2 in function 'cv::copyMakeBorder'

Review the code and find that cv2.copMakeBorder will report an error when input dimension exceeds 512

In the raw code, the author directly use cv2.copyMakeBorder to operate batch of heatmaps and it crashed

The tested opencv-python version is 4.5.4.60, 4.4.0.40, 4.1.0.25, all have this problem

Complete environmental information is like this

Environment info:
------------------------------------------------------------
sys.platform: linux
Python: 3.7.11 (default, Jul 27 2021, 14:32:16) [GCC 7.5.0]
CUDA available: True
GPU 0,1,2,3: Tesla P40
CUDA_HOME: /usr/local/cuda
NVCC: Cuda compilation tools, release 10.0, V10.0.130
GCC: gcc (GCC) 8.3.0
PyTorch: 1.7.1+cu101
PyTorch compiling details: PyTorch built with:
  - GCC 7.3
  - C++ Version: 201402
  - Intel(R) Math Kernel Library Version 2020.0.0 Product Build 20191122 for Intel(R) 64 architecture applications
  - Intel(R) MKL-DNN v1.6.0 (Git Hash 5ef631a030a6f73131c77892041042805a06064f)
  - OpenMP 201511 (a.k.a. OpenMP 4.5)
  - NNPACK is enabled
  - CPU capability usage: AVX2
  - CUDA Runtime 10.1
  - NVCC architecture flags: -gencode;arch=compute_37,code=sm_37;-gencode;arch=compute_50,code=sm_50;-gencode;arch=compute_60,code=sm_60;-gencode;arch=compute_70,code=sm_70;-gencode;arch=compute_75,code=sm_75
  - CuDNN 7.6.3
  - Magma 2.5.2
  - Build settings: BLAS=MKL, BUILD_TYPE=Release, CXX_FLAGS= -Wno-deprecated -fvisibility-inlines-hidden -DUSE_PTHREADPOOL -fopenmp -DNDEBUG -DUSE_FBGEMM -DUSE_QNNPACK -DUSE_PYTORCH_QNNPACK -DUSE_XNNPACK -DUSE_VULKAN_WRAPPER -O2 -fPIC -Wno-narrowing -Wall -Wextra -Werror$
return-type -Wno-missing-field-initializers -Wno-type-limits -Wno-array-bounds -Wno-unknown-pragmas -Wno-sign-compare -Wno-unused-parameter -Wno-unused-variable -Wno-unused-function -Wno-unused-result -Wno-unused-local-typedefs -Wno-strict-overflow -Wno-strict-aliasing -$
no-error=deprecated-declarations -Wno-stringop-overflow -Wno-psabi -Wno-error=pedantic -Wno-error=redundant-decls -Wno-error=old-style-cast -fdiagnostics-color=always -faligned-new -Wno-unused-but-set-variable -Wno-maybe-uninitialized -fno-math-errno -fno-trapping-math -$
error=format -Wno-stringop-overflow, PERF_WITH_AVX=1, PERF_WITH_AVX2=1, PERF_WITH_AVX512=1, USE_CUDA=ON, USE_EXCEPTION_PTR=1, USE_GFLAGS=OFF, USE_GLOG=OFF, USE_MKL=ON, USE_MKLDNN=ON, USE_MPI=OFF, USE_NCCL=ON, USE_NNPACK=ON, USE_OPENMP=ON,

TorchVision: 0.8.2+cu101
OpenCV: 4.5.4
MMCV: 1.3.13
MMCV Compiler: GCC 7.3
MMCV CUDA Compiler: 11.0
MMPose: 0.21.0+ac48c74
------------------------------------------------------------

Modification

When the input heatmaps' dimension exceeds 512, split it to operation, and then concat the splitted heatmaps

Use cases (Optional)

Just test this bug with official command line

./tools/dist_train.sh configs/body/2d_kpt_sview_rgb_img/topdown_heatmap/coco/hrnet_w32_coco_256x192_udp.py 4

It doesn't work on the original code, and work well on the new code

Checklist

Before PR:

I have read and followed the workflow indicated in the CONTRIBUTING.md to create this PR.
Pre-commit or linting tools indicated in CONTRIBUTING.md are used to fix the potential lint issues.
Bug fixes are covered by unit tests, the case that causes the bug should be added in the unit tests.
New functionalities are covered by complete unit tests. If not, please add more unit tests to ensure correctness.
The documentation has been modified accordingly, including docstring or example tutorials.

After PR:

CLA has been signed and all committers have signed the CLA in this PR.

CLAassistant · 2021-12-13T08:44:24Z

All committers have signed the CLA.

ly015 · 2021-12-13T09:16:39Z

Thank you very much for your contribution. We will review the PR asap.
Meanwhile, could you please help also add a unit test to cover the reported case? The unit test of UDP postprocessing is here and you can follow it to add a new snippet below. Please initiate a discussion in this PR if you need any help.

codecov · 2021-12-13T09:22:14Z

Codecov Report

❗ No coverage uploaded for pull request base (dev-0.22@6c9ef32). Click here to learn what that means.
The diff coverage is n/a.

@@             Coverage Diff             @@
##             dev-0.22    #1079   +/-   ##
===========================================
  Coverage            ?   83.37%           
===========================================
  Files               ?      196           
  Lines               ?    15165           
  Branches            ?     2741           
===========================================
  Hits                ?    12644           
  Misses              ?     1827           
  Partials            ?      694

Flag	Coverage Δ
unittests	`83.31% <0.00%> (?)`

Flags with carried forward coverage won't be shown. Click here to find out more.

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 6c9ef32...aa64e70. Read the comment docs.

innerlee · 2021-12-14T07:55:10Z

mmpose/core/evaluation/top_down_eval.py

@@ -3,6 +3,7 @@

 import cv2
 import numpy as np
+import math


use np for math ops

already done

X00123 · 2021-12-15T09:59:44Z

Thank you very much for your contribution. We will review the PR asap. Meanwhile, could you please help also add a unit test to cover the reported case? The unit test of UDP postprocessing is here and you can follow it to add a new snippet below. Please initiate a discussion in this PR if you need any help.

already done.
by the way, there is a duplicate test in the original unit test file, I also fix it

* update fap (open-mmlab#1070) * fix a bug in post_dark_udp * fix a bug in post_dark_udp * Modified a bad comment * add unit test for udp * add unit test for udp * use numpy instead math for math ops * fix lint Co-authored-by: Jas <jinsheng@sensetime.com> Co-authored-by: quyang <quyang@qiyi.com> Co-authored-by: ly015 <liyining0712@gmail.com>

qtboi · 2021-12-23T11:55:35Z

You should take into account that it is invalid use of copyMakeBorder: opencv/opencv#21274 (comment)
Probably it is way better to use np.pad(x, ((1, 1), (1, 1), (0, 0)), mode="edge")

jin-s13 · 2021-12-23T12:58:47Z

Thanks @qtboi
np.pad looks better.

* update fap (#1070) * fix a bug in post_dark_udp * fix a bug in post_dark_udp * Modified a bad comment * add unit test for udp * add unit test for udp * use numpy instead math for math ops * fix lint Co-authored-by: Jas <jinsheng@sensetime.com> Co-authored-by: quyang <quyang@qiyi.com> Co-authored-by: ly015 <liyining0712@gmail.com>

OpenMMLab-Assistant003 · 2023-04-04T13:16:00Z

Hi @X00123！First of all, we want to express our gratitude for your significant PR in the MMPose project. Your contribution is highly appreciated, and we are grateful for your efforts in helping improve this open-source project during your personal time. We believe that many developers will benefit from your PR.

We would also like to invite you to join our Special Interest Group (SIG) private channel on Discord, where you can share your experiences, ideas, and build connections with like-minded peers. To join the SIG channel, simply message moderator— OpenMMLab on Discord or briefly share your open-source contributions in the #introductions channel and we will assist you. Look forward to seeing you there! Join us ：https://discord.gg/UjgXkPWNqA

If you have WeChat account，welcome to join our community on WeChat. You can add our assistant ：openmmlabwx. Please add "mmsig + Github ID" as a remark when adding friends：）
Thank you again for your contribution❤ @X00123

* update fap (open-mmlab#1070) * fix a bug in post_dark_udp * fix a bug in post_dark_udp * Modified a bad comment * add unit test for udp * add unit test for udp * use numpy instead math for math ops * fix lint Co-authored-by: Jas <jinsheng@sensetime.com> Co-authored-by: quyang <quyang@qiyi.com> Co-authored-by: ly015 <liyining0712@gmail.com>

ly015 requested a review from jin-s13 December 13, 2021 09:05

innerlee reviewed Dec 14, 2021

View reviewed changes

jin-s13 approved these changes Dec 14, 2021

View reviewed changes

ly015 changed the title ~~Bug in post dark udp~~ [Fix] Bug in post dark udp Dec 15, 2021

ly015 changed the base branch from master to dev-0.22 December 15, 2021 13:33

jin-s13 and others added 8 commits December 16, 2021 10:55

update fap (open-mmlab#1070)

0b54682

fix a bug in post_dark_udp

c3320ad

fix a bug in post_dark_udp

e133b40

Modified a bad comment

63d319a

add unit test for udp

062d46d

add unit test for udp

2cec1bf

use numpy instead math for math ops

44d3b0a

fix lint

aa64e70

ly015 force-pushed the bug_in_post_dark_udp branch from a7c3794 to aa64e70 Compare December 16, 2021 02:59

ly015 merged commit c918dbe into open-mmlab:dev-0.22 Dec 16, 2021

jin-s13 mentioned this pull request Dec 23, 2021

Invalid usage of cv2.copyMakeBorder in post_dark_udp #1100

Closed

jin-s13 mentioned this pull request Dec 28, 2021

运行topdown_heatmap/coco/hrnet_w32_coco_256x192_udp.py的时候出现下面的错误 #1113

Closed

HAOCHENYE pushed a commit to HAOCHENYE/mmpose that referenced this pull request Jun 27, 2023

[Feature] Add torch_npu optimizer (open-mmlab#1079)

5b9a154

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[Fix] Bug in post dark udp #1079

[Fix] Bug in post dark udp #1079

X00123 commented Dec 13, 2021

CLAassistant commented Dec 13, 2021 •

edited

Loading

ly015 commented Dec 13, 2021

codecov bot commented Dec 13, 2021 •

edited

Loading

innerlee Dec 14, 2021

X00123 Dec 15, 2021

X00123 commented Dec 15, 2021

qtboi commented Dec 23, 2021

jin-s13 commented Dec 23, 2021

OpenMMLab-Assistant003 commented Apr 4, 2023

[Fix] Bug in post dark udp #1079

[Fix] Bug in post dark udp #1079

Conversation

X00123 commented Dec 13, 2021

Motivation

Modification

Use cases (Optional)

Checklist

CLAassistant commented Dec 13, 2021 • edited Loading

ly015 commented Dec 13, 2021

codecov bot commented Dec 13, 2021 • edited Loading

Codecov Report

innerlee Dec 14, 2021

Choose a reason for hiding this comment

X00123 Dec 15, 2021

Choose a reason for hiding this comment

X00123 commented Dec 15, 2021

qtboi commented Dec 23, 2021

jin-s13 commented Dec 23, 2021

OpenMMLab-Assistant003 commented Apr 4, 2023

CLAassistant commented Dec 13, 2021 •

edited

Loading

codecov bot commented Dec 13, 2021 •

edited

Loading