
Speedup the Video Inference by Accelerating data-loading Stage #7832

Merged
merged 13 commits into open-mmlab:dev on May 8, 2022

Conversation

chenxinfeng4
Contributor

Motivation

The video inference was inefficient because the "resize", "padding", "to rgb" and "normalize" operations all run on the CPU. In addition, `img_metas` is recomputed for every frame, which is wasteful. As a result, the CPU workload is very high while the inference speed stays low.

Modification

I moved the video "crop", "resize", "padding" and "to rgb" steps into an ffmpeg-based video reader, which is lightweight and CPU friendly. The reader can also use NVIDIA video decoding when available, to save even more CPU. The "normalize" step is done on the GPU. Altogether, this reduces both the data-loading time and the CPU workload.
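A minimal sketch of the idea (the helper name and the assumption that the reader already outputs resized, padded RGB frames are illustrative, not the PR's exact code):

```python
import numpy as np
import torch

# Sketch only: the ffmpeg-based reader is assumed to have already done
# crop/resize/pad/BGR->RGB on the decoder side, so the remaining per-frame
# CPU work is just the host-to-device copy; normalization runs on the GPU.
def frame_to_gpu(frame_rgb: np.ndarray, mean, std, device='cuda:0'):
    img = torch.from_numpy(frame_rgb).to(device)        # HWC uint8 on GPU
    img = img.permute(2, 0, 1).float()                  # CHW float32
    mean = torch.as_tensor(mean, dtype=torch.float32, device=device).view(-1, 1, 1)
    std = torch.as_tensor(std, dtype=torch.float32, device=device).view(-1, 1, 1)
    return ((img - mean) / std).unsqueeze(0)            # NCHW batch of one
```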

Result

After this modification, the CPU workload drops from >1000% to about 180%, and the inference frame rate improves slightly. The gain is most significant when data loading, rather than the network forward pass, is the bottleneck, as with YOLACT.

@CLAassistant

CLAassistant commented Apr 26, 2022

CLA assistant check
All committers have signed the CLA.

@chenxinfeng4
Contributor Author

You can compare the original video_demo.py with my video_gpuaccel_demo.py; they take similar argument inputs.
To see how much is gained, please delete the post-processing in both scripts, because the post-processing is also CPU heavy:

```python
# post-processing in `video_demo.py` and in my `video_gpuaccel_demo.py`
model.show_result(xxx)
```
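For example, a rough way to time the baseline `video_demo.py` path with the post-processing stripped (the config, checkpoint and video paths below are placeholders, not taken from the PR):

```python
import time

import mmcv
from mmdet.apis import inference_detector, init_detector

# Rough benchmark sketch: run the detector over a video with
# model.show_result() removed, so the measured FPS covers only data loading
# plus the network forward pass.
model = init_detector('configs/yolact/yolact_r50_1x8_coco.py',
                      'checkpoints/yolact_r50.pth', device='cuda:0')
video = mmcv.VideoReader('demo/demo.mp4')

start = time.time()
for frame in video:
    inference_detector(model, frame)   # no model.show_result(...) afterwards
print(f'{len(video) / (time.time() - start):.1f} FPS without post-processing')
```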

@hhaAndroid
Collaborator

@chenxinfeng4 Thank you very much for your contribution.

@chenxinfeng4
Contributor Author

I tested the speed of data loading alone: about 230 FPS at 800% CPU.

```python
with torch.cuda.device(args.device), torch.no_grad():
    for frame_resize, frame_origin in zip(tqdm.tqdm(video_resize), video_origin):
        data = process_img(frame_resize, img_metas)
        continue  # data loading only, no network forward
```

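For context, the full demo would replace the `continue` above with the network forward. A rough sketch of that step, assuming `data` follows MMDetection 2.x's test-time convention of `{'img': [tensor], 'img_metas': [[meta]]}` as produced by the demo's `process_img` helper:

```python
# Sketch only: `model` and `data` come from the loop above; the call follows
# the standard MMDetection 2.x test-time forward (one result per image).
result = model(return_loss=False, rescale=True, **data)[0]
```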

@ZwwWayne
Collaborator

lint failed.

@ZwwWayne ZwwWayne changed the base branch from master to dev April 27, 2022 08:21

@RangiLyu
Member

Please use the pre-commit hook to fix the lint.

@hhaAndroid
Collaborator

@chenxinfeng4 Thanks for your very fast response.

@jbwang1997
Collaborator

LGTM

@chenxinfeng4
Contributor Author

I'm not an expert with git. Why is the PR blocked? What should I do next?

@jbwang1997
Collaborator

> I'm not an expert with git. Why is the PR blocked? What should I do next?

Hello @chenxinfeng4.

Merging is controlled by the maintainers. We will merge this PR as soon as possible. It seems the lint check fails again; please remember to fix it.

Thanks for your quick response.

@ZwwWayne ZwwWayne merged commit b1f40ef into open-mmlab:dev May 8, 2022
ZwwWayne pushed a commit that referenced this pull request Jul 18, 2022
* add a faster inference for video

* Fix typos

* modify typo

* modify the numpy array to torch gpu

* fix lint

* add description

* add documents

* fix typro

* fix lint

* fix lint

* fix lint again

* fix a mistake
ZwwWayne pushed a commit to ZwwWayne/mmdetection that referenced this pull request Jul 19, 2022
SakiRinn pushed a commit to SakiRinn/mmdetection-locount that referenced this pull request Mar 17, 2023