[proto] Small optims for elastic op on bboxes #6897

Merged: vfdev-5 merged 9 commits into pytorch:main from proto-update-elastic-bboxes on Nov 4, 2022

Conversation

vfdev-5
Collaborator

@vfdev-5 vfdev-5 commented Nov 3, 2022

[--------- elastic_bounding_box cpu BoundingBoxFormat.XYXY ----------]
            |  elastic_bounding_box_old v2  |  elastic_bounding_box v2
1 threads: -----------------------------------------------------------
      (4,)  |              230              |            190          
6 threads: -----------------------------------------------------------
      (4,)  |              300              |            200          

Times are in microseconds (us).

[--------- elastic_bounding_box cpu BoundingBoxFormat.XYWH ----------]
            |  elastic_bounding_box_old v2  |  elastic_bounding_box v2
1 threads: -----------------------------------------------------------
      (4,)  |              310              |            260          
6 threads: -----------------------------------------------------------
      (4,)  |              330              |            280          

Times are in microseconds (us).

[-------- elastic_bounding_box cpu BoundingBoxFormat.CXCYWH ---------]
            |  elastic_bounding_box_old v2  |  elastic_bounding_box v2
1 threads: -----------------------------------------------------------
      (4,)  |              351              |            300          
6 threads: -----------------------------------------------------------
      (4,)  |              379              |            330          

Times are in microseconds (us).

[--------- elastic_bounding_box cuda BoundingBoxFormat.XYXY ---------]
            |  elastic_bounding_box_old v2  |  elastic_bounding_box v2
1 threads: -----------------------------------------------------------
      (4,)  |              440              |            380          
6 threads: -----------------------------------------------------------
      (4,)  |              440              |            380          

Times are in microseconds (us).

[--------- elastic_bounding_box cuda BoundingBoxFormat.XYWH ---------]
            |  elastic_bounding_box_old v2  |  elastic_bounding_box v2
1 threads: -----------------------------------------------------------
      (4,)  |              560              |            490          
6 threads: -----------------------------------------------------------
      (4,)  |              560              |            490          

Times are in microseconds (us).

[-------- elastic_bounding_box cuda BoundingBoxFormat.CXCYWH --------]
            |  elastic_bounding_box_old v2  |  elastic_bounding_box v2
1 threads: -----------------------------------------------------------
      (4,)  |              640              |            570          
6 threads: -----------------------------------------------------------
      (4,)  |              640              |            570          

Times are in microseconds (us).

cc @datumbox @bjuncek @pmeier

Contributor

@datumbox datumbox left a comment

I benchmarked your intermediate commits and what you have now (along with the proposed patch below) seems the fastest:

[------------------------------- elastic_bounding_box cpu BoundingBoxFormat.XYXY --------------------------------]
            |  elastic_bounding_box main  |  elastic_bounding_box_bbeca bbeca  |  elastic_bounding_box_b32f7 b32f7
1 threads: -------------------------------------------------------------------------------------------------------
      (4,)  |             613             |                575                 |                490               

Times are in microseconds (us).

[------------------------------- elastic_bounding_box cpu BoundingBoxFormat.XYWH --------------------------------]
            |  elastic_bounding_box main  |  elastic_bounding_box_bbeca bbeca  |  elastic_bounding_box_b32f7 b32f7
1 threads: -------------------------------------------------------------------------------------------------------
      (4,)  |             674             |                636                 |                551               

Times are in microseconds (us).

[------------------------------ elastic_bounding_box cpu BoundingBoxFormat.CXCYWH -------------------------------]
            |  elastic_bounding_box main  |  elastic_bounding_box_bbeca bbeca  |  elastic_bounding_box_b32f7 b32f7
1 threads: -------------------------------------------------------------------------------------------------------
      (4,)  |             710             |                676                 |                580               

Times are in microseconds (us).

[------------------------------- elastic_bounding_box cuda BoundingBoxFormat.XYXY -------------------------------]
            |  elastic_bounding_box main  |  elastic_bounding_box_bbeca bbeca  |  elastic_bounding_box_b32f7 b32f7
1 threads: -------------------------------------------------------------------------------------------------------
      (4,)  |             1370            |                1310                |                708               

Times are in microseconds (us).

[------------------------------- elastic_bounding_box cuda BoundingBoxFormat.XYWH -------------------------------]
            |  elastic_bounding_box main  |  elastic_bounding_box_bbeca bbeca  |  elastic_bounding_box_b32f7 b32f7
1 threads: -------------------------------------------------------------------------------------------------------
      (4,)  |             1470            |                1410                |                789               

Times are in microseconds (us).

[------------------------------ elastic_bounding_box cuda BoundingBoxFormat.CXCYWH ------------------------------]
            |  elastic_bounding_box main  |  elastic_bounding_box_bbeca bbeca  |  elastic_bounding_box_b32f7 b32f7
1 threads: -------------------------------------------------------------------------------------------------------
      (4,)  |             1540            |                1480                |                858               

Times are in microseconds (us).
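
To make the comparison above easier to read, the relative gain of b32f7 over main can be derived from the reported timings. The snippet below is only an illustrative sketch: the numbers are copied from the single-thread tables in this comment, and the names `TIMINGS_US`, `speedups`, and `SPEEDUPS` are hypothetical helpers, not part of torchvision or of the benchmark script.

```python
# Timings in microseconds, copied from the single-thread tables above.
# Keys are (device, bounding box format); values are (main, bbeca, b32f7).
TIMINGS_US = {
    ("cpu", "XYXY"): (613, 575, 490),
    ("cpu", "XYWH"): (674, 636, 551),
    ("cpu", "CXCYWH"): (710, 676, 580),
    ("cuda", "XYXY"): (1370, 1310, 708),
    ("cuda", "XYWH"): (1470, 1410, 789),
    ("cuda", "CXCYWH"): (1540, 1480, 858),
}


def speedups(timings):
    """Return {key: main_time / b32f7_time} for every benchmarked case."""
    return {key: main / latest for key, (main, _bbeca, latest) in timings.items()}


# Hypothetical module-level result, kept so it can be inspected or printed.
SPEEDUPS = speedups(TIMINGS_US)

if __name__ == "__main__":
    for (device, fmt), ratio in SPEEDUPS.items():
        print(f"{device:4s} {fmt:7s} main / b32f7 = {ratio:.2f}x")
```

With these numbers, the CPU cases come out a little over 1.2x faster than main, and the CUDA cases roughly 1.8x to 1.9x faster.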

Review thread on torchvision/prototype/transforms/functional/_geometry.py (outdated, resolved)
Contributor

@datumbox datumbox left a comment

LGTM, thanks!

@vfdev-5 vfdev-5 merged commit 7320648 into pytorch:main Nov 4, 2022
@vfdev-5 vfdev-5 deleted the proto-update-elastic-bboxes branch November 4, 2022 10:22

github-actions bot commented Nov 4, 2022

Hey @vfdev-5!

You merged this PR, but no labels were added. The list of valid labels is available at https://github.com/pytorch/vision/blob/main/.github/process_commit.py

@vfdev-5 vfdev-5 added the labels module: transforms, Perf (for performance improvements), and prototype on Nov 4, 2022
facebook-github-bot pushed a commit that referenced this pull request Nov 14, 2022
Summary:
* [proto] Small optims for elastic op on bboxes

* More inplace ops according to the review

* Create grid on device directly. This should be faster

* PR Review update. Apply ceil on float input

Reviewed By: NicolasHug

Differential Revision: D41265181

fbshipit-source-id: c9e203d5d039f9969b28951ed2a60299b116e1a1
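
Two items in the summary above, "More inplace ops" and "Create grid on device directly", name general PyTorch patterns. The sketch below illustrates them in isolation and is not the code from this PR: the function names `grid_via_host_copy` and `grid_on_device` are hypothetical, and the 0.5 pixel-center offset is an assumption chosen only to give the in-place op something to do.

```python
import torch


def grid_via_host_copy(height: int, width: int, device: torch.device) -> torch.Tensor:
    # Build the (x, y) index grid on the CPU, then copy it to the target device.
    # On a CUDA device this costs an extra host-to-device transfer.
    ys, xs = torch.meshgrid(torch.arange(height), torch.arange(width), indexing="ij")
    grid = torch.stack([xs, ys], dim=-1).to(torch.float32) + 0.5  # out-of-place add
    return grid.to(device)


def grid_on_device(height: int, width: int, device: torch.device) -> torch.Tensor:
    # Allocate the index ranges on the target device from the start,
    # so no host-to-device copy is needed afterwards.
    ys, xs = torch.meshgrid(
        torch.arange(height, device=device),
        torch.arange(width, device=device),
        indexing="ij",
    )
    grid = torch.stack([xs, ys], dim=-1).to(torch.float32)
    # In-place add: mutates `grid` rather than allocating a second tensor.
    return grid.add_(0.5)
```

Both helpers return the same values; the second one avoids the device copy and one intermediate allocation, which is the kind of saving the summary items describe.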