
optim performance and cls/rec batchsize for inference #480

Merged: 1 commit merged into mindspore-lab:main from optim_perf on Jul 6, 2023

Conversation

@liangxhao (Contributor) commented on Jul 4, 2023

Thank you for your contribution to the MindOCR repo.
Before submitting this PR, please make sure:

Motivation

  1. Optimize the model pipeline queue startup time, reducing it by about 30%; e.g. for crnn_resnet34vd, 3.2 s -> 2.2 s.
  2. Optimize the data transfer time between the data-queue processes, reducing it by about 1.5 ms.
  3. Support batch size > 1 for the cls and rec models: the batch size is set when converting the model to Lite MindIR and is read from the model file at inference time.
     Implementation: each model-inference subprocess reads the parameter value and syncs it to the main process; before sending data, the main process splits and packs it into batches, so the subprocesses do not need to cache data (see the sketch after this list).

     Throughput for crnn_resnet34vd on ic15-2077:

                    batch_size=1    batch_size=6
     FPS            420             1250
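
The batch-size handling in item 3 can be illustrated with a small standalone sketch (DummyModel, infer_worker, and split_into_batches are hypothetical names, not the actual MindOCR pipeline code): the inference subprocess reads the batch size from its model and syncs it to the main process over a pipe, and the main process then splits the samples into batches of that size before putting them on the queue.

    import multiprocessing as mp

    class DummyModel:
        # Stand-in for a Lite MindIR model whose converted input shape fixes the batch size.
        def __init__(self, batch_size=6):
            self.batch_size = batch_size

        def infer(self, batch):
            return len(batch)  # placeholder "result"

    def infer_worker(bs_conn, task_queue, result_queue):
        # Inference subprocess: load the model, read its batch size, and report the
        # value to the main process before entering the inference loop.
        model = DummyModel()
        bs_conn.send(model.batch_size)
        while True:
            batch = task_queue.get()
            if batch is None:                  # sentinel: stop the worker
                break
            result_queue.put(model.infer(batch))

    def split_into_batches(samples, batch_size):
        # Main process splits/packs the data before sending it, so the subprocess
        # never needs to cache partial batches itself.
        for i in range(0, len(samples), batch_size):
            yield samples[i:i + batch_size]

    if __name__ == "__main__":
        parent_conn, child_conn = mp.Pipe()
        tasks, results = mp.Queue(), mp.Queue()
        worker = mp.Process(target=infer_worker, args=(child_conn, tasks, results))
        worker.start()
        batch_size = parent_conn.recv()        # batch size synced from the subprocess
        chunks = list(split_into_batches(list(range(20)), batch_size))
        for chunk in chunks:
            tasks.put(chunk)
        tasks.put(None)
        print([results.get() for _ in chunks])  # e.g. [6, 6, 6, 2]
        worker.join()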

Test Plan

(How should this PR be tested? Do you require special setup to run the test or repro the fixed bug?)

Related Issues and PRs

(Is this PR part of a group of changes? Link the other relevant PRs and Issues here. Use https://help.github.com/en/articles/closing-issues-using-keywords for help on GitHub syntax)

@liangxhao liangxhao changed the title optim performance and cls/rec batchsize for inference optim performance and cls/rec batchsize for inference[WIP] Jul 4, 2023
@liangxhao liangxhao changed the title optim performance and cls/rec batchsize for inference[WIP] optim performance and cls/rec batchsize for inference Jul 4, 2023

if image_total > 0:
    self.profiling(profiling_data, image_total)
    print(

Member:

use log

Contributor Author:

This used to be written only to the log; after feedback it was changed to also print.
The other logs are generally unimportant and disabled by default, but the performance information is important, so it is printed separately to make sure it is visible even when logging is off.

@@ -39,7 +78,8 @@ def free_model(self):
                del model
        else:
            del self.model
        self.model = None

        self.model = None

    def __del__(self):
        self.free_model()

Collaborator:

Move the body of self.free_model() directly into __del__.

Contributor Author:

free_model() sometimes needs to be called on its own, independent of __del__.

@@ -39,7 +78,8 @@ def free_model(self):
                del model

Collaborator:

Remove the loop and use del self.model[:] or del self.model directly; see: https://stackoverflow.com/questions/12417498/how-to-release-used-memory-immediately-in-python-list

Contributor Author:

I wrote this incorrectly; self.model is a dict. Fixed to:
self.model.clear()
del self.model
gc.collect()
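
For context, a minimal sketch of the resulting cleanup path, assuming self.model is a dict of loaded models keyed by batch size (the ModelGroup class and its attributes are illustrative, not the actual MindOCR class):

    import gc

    class ModelGroup:
        def __init__(self):
            self.model = {}                  # e.g. {batch_size: loaded model}

        def free_model(self):
            # Callable on its own, not only from __del__.
            if isinstance(self.model, dict):
                self.model.clear()           # drop the references held by the dict
            del self.model
            gc.collect()                     # encourage immediate release
            self.model = None

        def __del__(self):
            if getattr(self, "model", None) is not None:
                self.free_model()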


self._hw_list = []
self._bs_list = []

self.model = defaultdict()

Collaborator:

self.model = defaultdict(Model)

Contributor Author:

Model has required constructor parameters, so with defaultdict(Model) a missing-key access would fail because the factory is called with no initialization arguments.
Fixed to:
self.model: Dict[int, Model] = {}
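
A small illustration of this point, with a stand-in Model class that has a required constructor argument (the real model wrapper is not shown here):

    from collections import defaultdict
    from typing import Dict

    class Model:
        def __init__(self, model_path: str):  # required argument, like a real model wrapper
            self.model_path = model_path

    models = defaultdict(Model)
    # models[1]  # would raise TypeError: defaultdict calls Model() with no arguments on a miss

    models_fixed: Dict[int, Model] = {}        # the fix: a plain typed dict
    models_fixed[1] = Model("rec.mindir")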

Collaborator:

change input = data["image"] to input_data = data["image"] in function model_infer

Contributor Author:

Done, changed.

f"total cost {cost_time:.2f}s, FPS: "
f"{safe_div(image_total, cost_time):.2f}"
)
print(perf_info)

@VictorHe-1 (Collaborator) commented on Jul 5, 2023:

For this perf_info, line 108 also has a log.info; does that one also need to be printed even when logging is disabled?

Contributor Author:

Yes: it must go to the log file when logging is enabled, and it must also be printed to the screen when logging is disabled (otherwise there is no way to measure performance).
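
A minimal sketch of that behavior (the logger name and the report_perf helper are illustrative, not the exact MindOCR code): the summary is written via log.info when logging is configured and is always printed to stdout so it stays visible with logging disabled.

    import logging

    log = logging.getLogger("mindocr_infer")

    def report_perf(image_total: int, cost_time: float) -> None:
        fps = image_total / cost_time if cost_time > 0 else 0.0
        perf_info = f"total cost {cost_time:.2f}s, FPS: {fps:.2f}"
        log.info(perf_info)   # reaches the log file when logging is enabled
        print(perf_info)      # always reaches the console, even with logging disabled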

        for i in range(batch):
            label, score = output[i]
            if "180" == label and score > self.cls_thresh:
                sub_images[i] = cv2.rotate(sub_images[i], cv2.ROTATE_180)
        input_data.sub_image_list = sub_images
    else:
        # TODO: only support batch=1
        input_data.infer_result = output[0]
        input_data.infer_result = output

    input_data.data = None

Collaborator:

Delete input_data.data first and then set it to None, to avoid a memory leak.

Contributor Author:

Adding del here could occasionally trigger garbage collection earlier, which lowers performance.
The memory is tracked by Python's reference counting and is normally released automatically when the CPU is relatively idle (for example, while the data queue is waiting to send).
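
The reference-counting point can be seen with a tiny standalone example (InputData here is a stand-in, not the pipeline's actual data class):

    import sys

    class InputData:
        def __init__(self, data):
            self.data = data

    item = InputData(bytearray(10**6))   # large buffer referenced via item.data
    buf = item.data
    print(sys.getrefcount(buf))          # 3: buf, item.data, and getrefcount's argument

    item.data = None                     # rebinding alone drops the attribute's reference
    print(sys.getrefcount(buf))          # 2: only buf and getrefcount's argument remain
    # When the last reference disappears, CPython's reference counting frees the buffer
    # immediately; an explicit del is not required for that.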

@liangxhao liangxhao merged commit 8e0b7ba into mindspore-lab:main Jul 6, 2023
@liangxhao liangxhao deleted the optim_perf branch August 17, 2023 01:13