Running speed problem #43

Open
lqqpenny opened this issue Sep 2, 2019 · 13 comments

Comments

@lqqpenny

lqqpenny commented Sep 2, 2019

Hello, I have set O2 and enabled OpenMP, but matching on a 1-megapixel image still takes more than 100 ms. Where might the problem be? Thanks.
construct response map:
elasped time:0.217464s
templ match:
elasped time:0.131106s
total time:
elasped time:0.351647s
matches size: 83

@meiqua
Owner

meiqua commented Sep 2, 2019

What about the SIMD option?

@lqqpenny
Author

lqqpenny commented Sep 2, 2019

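I selected SSE2, but there is a warning:
cl : Command line warning D9002 : ignoring unknown option '/arch:SSE2'
From what I found, this seems to be because every 64-bit processor already has SSE2, so it does not need to be enabled manually.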
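The warning is consistent with how x64 MSVC targets behave: SSE2 is the baseline there, so `/arch:SSE2` is not an accepted value. One hedged way to see which SIMD level the compiler was actually asked to target is to inspect the predefined macros. The helper name `compiled_simd_level` below is hypothetical (it is not part of MIPP or this project); the macros are the usual GCC/Clang and MSVC ones.

```cpp
#include <string>

// Reports the widest SIMD level the *compiler* was told to target.
// This is a compile-time property of the build; it says nothing about
// whether the CPU that later runs the binary supports those instructions.
inline std::string compiled_simd_level() {
#if defined(__AVX2__)
    return "AVX2";   // /arch:AVX2 on MSVC, -mavx2 on GCC/Clang
#elif defined(__AVX__)
    return "AVX";    // /arch:AVX on MSVC, -mavx on GCC/Clang
#elif defined(__SSE2__) || defined(_M_X64) || (defined(_M_IX86_FP) && _M_IX86_FP >= 2)
    return "SSE2";   // baseline on x64; needs /arch:SSE2 only on 32-bit MSVC
#else
    return "none";
#endif
}
```

Printing the returned string at start-up gives a quick cross-check against what the MIPP test reports.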

@meiqua
Owner

meiqua commented Sep 2, 2019

There is a MIPP test that shows whether it is enabled. What happens if you select AVX?

@lqqpenny
Author

lqqpenny commented Sep 2, 2019

The MIPP test output looks like this:
MIPP tests

Instr. type: NO
Instr. full type: NO_INTRINSICS
Instr. version: 1
Instr. size: 0 bits
Instr. lanes: 1
64-bit support: yes
Byte/word support: yes
in this SIMD, int8 max is not inplemented by MIPP
in this SIMD, int8 shuff is not inplemented by MIPP

With AVX selected the build succeeds, but the program fails at run time:
[screenshot: 20190902105902]

@lqqpenny
Author

lqqpenny commented Sep 2, 2019

I tried AVX2. The build succeeds, and the output looks like this:

MIPP tests

Instr. type: AVX
Instr. full type: AVX2
Instr. version: 2
Instr. size: 256 bits
Instr. lanes: 2
64-bit support: yes
Byte/word support: yes

test img size: 1679616

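It fails at run time at this point:
[screenshot: 20190902111600]

According to the call stack, the failure is here:
[screenshot: 20190902112634]
[screenshot: 20190902112659]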

@meiqua
Owner

meiqua commented Sep 2, 2019

Maybe older versions of OpenCV do not support this? Try changing step1() to cols()*channels().
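A note on that substitution: as far as I know, a row stride measured in elements equals columns times channels only when the rows are stored back to back, with no padding and no sub-region view. The sketch below illustrates the difference with a hypothetical `ImageView` struct; it is a stand-in for illustration and is not OpenCV's `cv::Mat`.

```cpp
#include <cstddef>

// Hypothetical image header used only to illustrate stride vs. width.
struct ImageView {
    std::size_t rows;
    std::size_t cols;
    std::size_t channels;
    std::size_t stride_elems;  // elements from the start of one row to the next

    // True when rows are packed back to back, so stride == cols * channels.
    bool is_continuous() const { return stride_elems == cols * channels; }

    // Element offset of (row, col, channel); always uses the stride,
    // which stays correct for padded rows and sub-region views.
    std::size_t index(std::size_t r, std::size_t c, std::size_t ch) const {
        return r * stride_elems + c * channels + ch;
    }
};
```

For a full 10-column, 3-channel image the stride is 30 and both expressions agree; for a 6-column view into that same buffer the stride is still 30 while columns times channels is 18, so the replacement is only safe for continuous data.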

@lqqpenny
Author

lqqpenny commented Sep 2, 2019

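After that change a strange problem appears; it now fails here:
[screenshot: 20190902152908]
Without AVX2 enabled it does not fail at this point. I have searched for a long time without finding the cause.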

@meiqua
Owner

meiqua commented Sep 2, 2019

These all look OpenCV-related to me. Maybe OpenCV also needs to be compiled with AVX?

@lqqpenny
Author

lqqpenny commented Sep 4, 2019

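I recompiled OpenCV many times with no effect, and finally found the cause by accident: my CPU does not support AVX2! After switching to another machine the problem was solved.
Separately, constructing the response map still takes fairly long. Is there a good way to address that?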
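Since the root cause reported here was an AVX2 build running on a CPU without AVX2, a small start-up check can turn that kind of crash into a clear message. The following is a minimal sketch, not something the project ships: the `CpuSimd` struct and `detect_cpu_simd` function are hypothetical names, and the code assumes an x86 target built with MSVC, GCC, or Clang (on other targets it simply reports no support).

```cpp
#if defined(_MSC_VER) && (defined(_M_X64) || defined(_M_IX86))
#include <intrin.h>
#endif

struct CpuSimd {
    bool avx  = false;
    bool avx2 = false;
};

// Queries the CPU (and OS) at run time for AVX / AVX2 support.
inline CpuSimd detect_cpu_simd() {
    CpuSimd f;
#if defined(_MSC_VER) && (defined(_M_X64) || defined(_M_IX86))
    int regs[4] = {0, 0, 0, 0};
    __cpuid(regs, 0);
    const int max_leaf = regs[0];
    __cpuid(regs, 1);
    const bool osxsave = (regs[2] & (1 << 27)) != 0;  // ECX bit 27
    const bool avx_bit = (regs[2] & (1 << 28)) != 0;  // ECX bit 28
    // XGETBV reports whether the OS saves XMM and YMM state (bits 1 and 2).
    const bool ymm_ok = osxsave && ((_xgetbv(0) & 0x6) == 0x6);
    f.avx = avx_bit && ymm_ok;
    if (f.avx && max_leaf >= 7) {
        __cpuidex(regs, 7, 0);
        f.avx2 = (regs[1] & (1 << 5)) != 0;           // leaf 7 EBX bit 5
    }
#elif (defined(__GNUC__) || defined(__clang__)) && \
      (defined(__x86_64__) || defined(__i386__))
    __builtin_cpu_init();
    f.avx  = __builtin_cpu_supports("avx") != 0;
    f.avx2 = __builtin_cpu_supports("avx2") != 0;
#endif
    return f;
}
```

Calling `detect_cpu_simd()` at the top of `main` and exiting with an error when `avx2` is false would be one way to use it in a binary built with `/arch:AVX2`. The check itself should live in code that is not compiled with AVX2, otherwise the compiler may emit AVX2 instructions before the check runs.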

@meiqua
Owner

meiqua commented Sep 4, 2019

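Haha, so that was the cause. The response map can be sped up further in this way. I estimate it could eventually reach the level of Halcon's 20 ms, because the best case is reading from and writing to main memory only once.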
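The linked approach is not reproduced on this page, so the following is only a generic illustration of the "touch main memory once" idea, not the owner's implementation. It compares a two-pass pipeline, which writes a full-size intermediate buffer, with a tiled version whose intermediate data fits in a small scratch buffer that can stay in cache. The two stages (`stage1`, `stage2`) and the tile size are hypothetical placeholders.

```cpp
#include <algorithm>
#include <array>
#include <cstddef>
#include <cstdint>
#include <vector>

using Lut = std::array<std::uint8_t, 64>;

// Placeholder stages: quantize to 6 bits, then look up a table entry.
inline std::uint8_t stage1(std::uint8_t v) { return static_cast<std::uint8_t>(v >> 2); }
inline std::uint8_t stage2(std::uint8_t v, const Lut& lut) { return lut[v]; }

// Two full passes: the intermediate image goes out to main memory and back.
inline std::vector<std::uint8_t> two_pass(const std::vector<std::uint8_t>& in, const Lut& lut) {
    std::vector<std::uint8_t> tmp(in.size()), out(in.size());
    for (std::size_t i = 0; i < in.size(); ++i) tmp[i] = stage1(in[i]);
    for (std::size_t i = 0; i < in.size(); ++i) out[i] = stage2(tmp[i], lut);
    return out;
}

// Tiled: both stages run on one small tile before moving on, so the
// input is read once and the output written once. `tile` must be > 0.
inline std::vector<std::uint8_t> tiled(const std::vector<std::uint8_t>& in, const Lut& lut,
                                       std::size_t tile = 4096) {
    std::vector<std::uint8_t> out(in.size());
    std::vector<std::uint8_t> scratch(tile);
    for (std::size_t begin = 0; begin < in.size(); begin += tile) {
        const std::size_t n = std::min(tile, in.size() - begin);
        for (std::size_t i = 0; i < n; ++i) scratch[i] = stage1(in[begin + i]);
        for (std::size_t i = 0; i < n; ++i) out[begin + i] = stage2(scratch[i], lut);
    }
    return out;
}
```

Both functions return identical results; whether the tiled form is faster in practice depends on the image size, the cache sizes, and how expensive each stage is, so it should be measured rather than assumed.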

@xinsuinizhuan

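> I selected SSE2, but there is a warning:
> cl : Command line warning D9002 : ignoring unknown option '/arch:SSE2'
> From what I found, this seems to be because every 64-bit processor already has SSE2, so it does not need to be enabled manually.

Hello, where do I enable SSE2 in VS2019?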

@xinsuinizhuan

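> Haha, so that was the cause. The response map can be sped up further in this way. I estimate it could eventually reach the level of Halcon's 20 ms, because the best case is reading from and writing to main memory only once.

Hi, have you finished the speed-up? How long does it take now?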

@xinsuinizhuan

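> Hello, I have set O2 and enabled OpenMP, but matching on a 1-megapixel image still takes more than 100 ms. Where might the problem be? Thanks.
> construct response map:
> elasped time:0.217464s
> templ match:
> elasped time:0.131106s
> total time:
> elasped time:0.351647s
> matches size: 83

What environment did you compile with, and how much speed-up did you get in the end? I have run into the same problem: a 1-megapixel image takes 136 ms, which is too slow and needs to be faster. I am using VS2019.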
