How can oneDNN improve DeepSpeech's speech-processing performance? #31838

Closed
DemoMoon opened this issue Mar 24, 2021 · 5 comments

@DemoMoon

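Environment details:
1. Python 3.7.6
2. paddlepaddle 2.0.0
3. oneDNN-2.1.2
4. CPU: 32 Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz
5. DeepSpeech branch release/v1.8.5

Chinese speech recognition on aishell takes 3 minutes. At the moment only a single CPU core is doing the work.

What do I need to do? Or will paddle automatically use multiple CPU cores once oneDNN is installed?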

@paddle-bot-old

Hi! We've received your issue; please be patient while waiting for a response. We will arrange for technicians to answer your question as soon as possible. Please double-check that you have provided a clear problem description, reproduction code, environment and version details, and error messages. You can also look for an answer in the official API documentation, the FAQ, past GitHub issues, and the AI community. Have a nice day!

@DemoMoon
Author

(venv) [root@ip build]# ctest
Test project /home/oneDNN-2.1.2/build
Start 1: cpu-bnorm-u8-via-binary-postops-cpp
1/107 Test #1: cpu-bnorm-u8-via-binary-postops-cpp ............... Passed 0.02 sec
Start 2: cpu-cnn-inference-f32-cpp
2/107 Test #2: cpu-cnn-inference-f32-cpp ......................... Passed 0.49 sec
Start 3: cpu-cnn-inference-int8-cpp
3/107 Test #3: cpu-cnn-inference-int8-cpp ........................ Passed 0.02 sec
Start 4: cpu-cnn-training-bf16-cpp
4/107 Test #4: cpu-cnn-training-bf16-cpp ......................... Passed 0.28 sec
Start 5: cpu-cnn-training-f32-cpp
5/107 Test #5: cpu-cnn-training-f32-cpp .......................... Passed 0.26 sec
Start 6: cpu-rnn-inference-f32-cpp
6/107 Test #6: cpu-rnn-inference-f32-cpp ......................... Passed 1.69 sec
Start 7: cpu-rnn-inference-int8-cpp
7/107 Test #7: cpu-rnn-inference-int8-cpp ........................ Passed 0.62 sec
Start 8: cpu-getting-started-cpp
8/107 Test #8: cpu-getting-started-cpp ........................... Passed 0.01 sec
Start 9: cpu-memory-format-propagation-cpp
9/107 Test #9: cpu-memory-format-propagation-cpp ................. Passed 0.01 sec
Start 10: cpu-performance-profiling-cpp
10/107 Test #10: cpu-performance-profiling-cpp ..................... Passed 0.27 sec
Start 11: cpu-primitives-batch-normalization-cpp
11/107 Test #11: cpu-primitives-batch-normalization-cpp ............ Passed 0.02 sec
Start 12: cpu-primitives-binary-cpp
12/107 Test #12: cpu-primitives-binary-cpp ......................... Passed 0.02 sec
Start 13: cpu-primitives-concat-cpp
13/107 Test #13: cpu-primitives-concat-cpp ......................... Passed 0.02 sec
Start 14: cpu-primitives-convolution-cpp
14/107 Test #14: cpu-primitives-convolution-cpp .................... Passed 0.01 sec
Start 15: cpu-primitives-eltwise-cpp
15/107 Test #15: cpu-primitives-eltwise-cpp ........................ Passed 0.02 sec
Start 16: cpu-primitives-inner-product-cpp
16/107 Test #16: cpu-primitives-inner-product-cpp .................. Passed 0.72 sec
Start 17: cpu-primitives-layer-normalization-cpp
17/107 Test #17: cpu-primitives-layer-normalization-cpp ............ Passed 0.01 sec
Start 18: cpu-primitives-logsoftmax-cpp
18/107 Test #18: cpu-primitives-logsoftmax-cpp ..................... Passed 0.01 sec
Start 19: cpu-primitives-lrn-cpp
19/107 Test #19: cpu-primitives-lrn-cpp ............................ Passed 0.02 sec
Start 20: cpu-primitives-lstm-cpp
20/107 Test #20: cpu-primitives-lstm-cpp ........................... Passed 0.06 sec
Start 21: cpu-primitives-matmul-cpp
21/107 Test #21: cpu-primitives-matmul-cpp ......................... Passed 0.08 sec
Start 22: cpu-primitives-pooling-cpp
22/107 Test #22: cpu-primitives-pooling-cpp ........................ Passed 0.01 sec
Start 23: cpu-primitives-prelu-cpp
23/107 Test #23: cpu-primitives-prelu-cpp .......................... Passed 0.03 sec
Start 24: cpu-primitives-reduction-cpp
24/107 Test #24: cpu-primitives-reduction-cpp ...................... Passed 0.03 sec
Start 25: cpu-primitives-reorder-cpp
25/107 Test #25: cpu-primitives-reorder-cpp ........................ Passed 0.02 sec
Start 26: cpu-primitives-resampling-cpp
26/107 Test #26: cpu-primitives-resampling-cpp ..................... Passed 0.03 sec
Start 27: cpu-primitives-shuffle-cpp
27/107 Test #27: cpu-primitives-shuffle-cpp ........................ Passed 0.40 sec
Start 28: cpu-primitives-softmax-cpp
28/107 Test #28: cpu-primitives-softmax-cpp ........................ Passed 0.01 sec
Start 29: cpu-primitives-sum-cpp
29/107 Test #29: cpu-primitives-sum-cpp ............................ Passed 0.04 sec
Start 30: cpu-rnn-training-f32-cpp
30/107 Test #30: cpu-rnn-training-f32-cpp .......................... Passed 0.17 sec
Start 31: cpu-tutorials-matmul-matmul-quantization-cpp
31/107 Test #31: cpu-tutorials-matmul-matmul-quantization-cpp ...... Passed 0.07 sec
Start 32: cpu-tutorials-matmul-sgemm-and-matmul-cpp
32/107 Test #32: cpu-tutorials-matmul-sgemm-and-matmul-cpp ......... Passed 0.01 sec
Start 33: cpu-tutorials-matmul-inference-int8-matmul-cpp
33/107 Test #33: cpu-tutorials-matmul-inference-int8-matmul-cpp .... Passed 0.02 sec
Start 34: cpu-cnn-inference-f32-c
34/107 Test #34: cpu-cnn-inference-f32-c ........................... Passed 0.03 sec
Start 35: cpu-cnn-training-f32-c
35/107 Test #35: cpu-cnn-training-f32-c ............................ Passed 0.07 sec
Start 36: api-c
36/107 Test #36: api-c ............................................. Passed 0.01 sec
Start 37: test_c_symbols-c
37/107 Test #37: test_c_symbols-c .................................. Passed 0.00 sec
Start 38: test_primitive_cache_mt
38/107 Test #38: test_primitive_cache_mt ........................... Passed 0.01 sec
Start 39: test_iface_primitive_cache
39/107 Test #39: test_iface_primitive_cache ........................ Passed 0.01 sec
Start 40: test_iface_pd
40/107 Test #40: test_iface_pd ..................................... Passed 0.01 sec
Start 41: test_iface_pd_iter
41/107 Test #41: test_iface_pd_iter ................................ Passed 0.01 sec
Start 42: test_iface_attr
42/107 Test #42: test_iface_attr ................................... Passed 0.01 sec
Start 43: test_iface_handle
43/107 Test #43: test_iface_handle ................................. Passed 0.01 sec
Start 44: test_iface_runtime_dims
44/107 Test #44: test_iface_runtime_dims ........................... Passed 0.01 sec
Start 45: test_iface_runtime_attr
45/107 Test #45: test_iface_runtime_attr ........................... Passed 0.02 sec
Start 46: test_iface_wino_convolution
46/107 Test #46: test_iface_wino_convolution ....................... Passed 0.01 sec
Start 47: test_dnnl_threading
47/107 Test #47: test_dnnl_threading ............................... Passed 0.01 sec
Start 48: test_sum
48/107 Test #48: test_sum .......................................... Passed 0.47 sec
Start 49: test_reorder
49/107 Test #49: test_reorder ...................................... Passed 1.42 sec
Start 50: test_cross_engine_reorder
50/107 Test #50: test_cross_engine_reorder ......................... Passed 0.01 sec
Start 51: test_concat
51/107 Test #51: test_concat ....................................... Passed 0.08 sec
Start 52: test_softmax
52/107 Test #52: test_softmax ...................................... Passed 0.08 sec
Start 53: test_eltwise
53/107 Test #53: test_eltwise ...................................... Passed 40.18 sec
Start 54: test_lrn_forward
54/107 Test #54: test_lrn_forward .................................. Passed 1.39 sec
Start 55: test_lrn_backward
55/107 Test #55: test_lrn_backward ................................. Passed 40.05 sec
Start 56: test_pooling_forward
56/107 Test #56: test_pooling_forward .............................. Passed 9.36 sec
Start 57: test_pooling_backward
57/107 Test #57: test_pooling_backward ............................. Passed 21.74 sec
Start 58: test_batch_normalization_f32
58/107 Test #58: test_batch_normalization_f32 ...................... Passed 3.55 sec
Start 59: test_batch_normalization_s8
59/107 Test #59: test_batch_normalization_s8 ....................... Passed 0.45 sec
Start 60: test_inner_product_forward
60/107 Test #60: test_inner_product_forward ........................ Passed 0.24 sec
Start 61: test_inner_product_backward_data
61/107 Test #61: test_inner_product_backward_data .................. Passed 0.11 sec
Start 62: test_inner_product_backward_weights
62/107 Test #62: test_inner_product_backward_weights ............... Passed 0.35 sec
Start 63: test_shuffle
63/107 Test #63: test_shuffle ...................................... Passed 0.04 sec
Start 64: test_rnn_forward
64/107 Test #64: test_rnn_forward .................................. Passed 0.10 sec
Start 65: test_convolution_format_any
65/107 Test #65: test_convolution_format_any ....................... Passed 0.01 sec
Start 66: test_convolution_forward_f32
66/107 Test #66: test_convolution_forward_f32 ...................... Passed 3.38 sec
Start 67: test_convolution_forward_u8s8s32
67/107 Test #67: test_convolution_forward_u8s8s32 .................. Passed 0.08 sec
Start 68: test_convolution_forward_u8s8fp
68/107 Test #68: test_convolution_forward_u8s8fp ................... Passed 0.08 sec
Start 69: test_convolution_eltwise_forward_f32
69/107 Test #69: test_convolution_eltwise_forward_f32 .............. Passed 4.85 sec
Start 70: test_convolution_eltwise_forward_x8s8f32s32
70/107 Test #70: test_convolution_eltwise_forward_x8s8f32s32 ....... Passed 2.89 sec
Start 71: test_convolution_backward_data_f32
71/107 Test #71: test_convolution_backward_data_f32 ................ Passed 5.16 sec
Start 72: test_convolution_backward_weights_f32
72/107 Test #72: test_convolution_backward_weights_f32 ............. Passed 5.98 sec
Start 73: test_deconvolution
73/107 Test #73: test_deconvolution ................................ Passed 0.40 sec
Start 74: test_gemm_f16
74/107 Test #74: test_gemm_f16 ..................................... Passed 0.01 sec
Start 75: test_gemm_f32
75/107 Test #75: test_gemm_f32 ..................................... Passed 1.74 sec
Start 76: test_gemm_f16f16f32
76/107 Test #76: test_gemm_f16f16f32 ............................... Passed 0.01 sec
Start 77: test_gemm_bf16bf16f32
77/107 Test #77: test_gemm_bf16bf16f32 ............................. Passed 3.02 sec
Start 78: test_gemm_bf16bf16bf16
78/107 Test #78: test_gemm_bf16bf16bf16 ............................ Passed 0.01 sec
Start 79: test_gemm_u8s8s32
79/107 Test #79: test_gemm_u8s8s32 ................................. Passed 0.93 sec
Start 80: test_gemm_s8s8s32
80/107 Test #80: test_gemm_s8s8s32 ................................. Passed 0.92 sec
Start 81: test_gemm_s8u8s32
81/107 Test #81: test_gemm_s8u8s32 ................................. Passed 0.01 sec
Start 82: test_gemm_u8u8s32
82/107 Test #82: test_gemm_u8u8s32 ................................. Passed 0.01 sec
Start 83: test_layer_normalization
83/107 Test #83: test_layer_normalization .......................... Passed 0.07 sec
Start 84: test_binary
84/107 Test #84: test_binary ....................................... Passed 0.20 sec
Start 85: test_logsoftmax
85/107 Test #85: test_logsoftmax ................................... Passed 0.08 sec
Start 86: test_matmul
86/107 Test #86: test_matmul ....................................... Passed 0.09 sec
Start 87: test_resampling
87/107 Test #87: test_resampling ................................... Passed 0.08 sec
Start 88: test_global_scratchpad
88/107 Test #88: test_global_scratchpad ............................ Passed 0.01 sec
Start 89: test_reduction
89/107 Test #89: test_reduction .................................... Passed 0.01 sec
Start 90: test_isa_mask
90/107 Test #90: test_isa_mask ..................................... Passed 0.01 sec
Start 91: test_isa_hints
91/107 Test #91: test_isa_hints .................................... Passed 0.00 sec
Start 92: test_isa_iface
92/107 Test #92: test_isa_iface .................................... Passed 0.00 sec
Start 93: test_api
93/107 Test #93: test_api .......................................... Passed 0.02 sec
Start 94: test_internals
94/107 Test #94: test_internals .................................... Passed 0.01 sec
Start 95: mkldnn-compat-cpu-cnn-inference-f32-cpp
95/107 Test #95: mkldnn-compat-cpu-cnn-inference-f32-cpp ........... Passed 0.19 sec
Start 96: mkldnn-compat-cpu-cnn-inference-int8-cpp
96/107 Test #96: mkldnn-compat-cpu-cnn-inference-int8-cpp .......... Passed 0.01 sec
Start 97: mkldnn-compat-cpu-cnn-training-f32-cpp
97/107 Test #97: mkldnn-compat-cpu-cnn-training-f32-cpp ............ Passed 0.02 sec
Start 98: mkldnn-compat-cpu-cnn-training-bf16-cpp
98/107 Test #98: mkldnn-compat-cpu-cnn-training-bf16-cpp ........... Passed 0.02 sec
Start 99: mkldnn-compat-cpu-memory-format-propagation-cpp
99/107 Test #99: mkldnn-compat-cpu-memory-format-propagation-cpp ... Passed 0.01 sec
Start 100: mkldnn-compat-cpu-rnn-inference-f32-cpp
100/107 Test #100: mkldnn-compat-cpu-rnn-inference-f32-cpp ........... Passed 0.14 sec
Start 101: mkldnn-compat-cpu-rnn-inference-int8-cpp
101/107 Test #101: mkldnn-compat-cpu-rnn-inference-int8-cpp .......... Passed 0.02 sec
Start 102: mkldnn-compat-cpu-getting-started-cpp
102/107 Test #102: mkldnn-compat-cpu-getting-started-cpp ............. Passed 0.01 sec
Start 103: mkldnn-compat-cpu-performance-profiling-cpp
103/107 Test #103: mkldnn-compat-cpu-performance-profiling-cpp ....... Passed 0.08 sec
Start 104: mkldnn-compat-cpu-rnn-training-f32-cpp
104/107 Test #104: mkldnn-compat-cpu-rnn-training-f32-cpp ............ Passed 0.07 sec
Start 105: mkldnn-compat-cpu-cnn-inference-f32-c
105/107 Test #105: mkldnn-compat-cpu-cnn-inference-f32-c ............. Passed 0.01 sec
Start 106: mkldnn-compat-cpu-cnn-training-f32-c
106/107 Test #106: mkldnn-compat-cpu-cnn-training-f32-c .............. Passed 0.02 sec
Start 107: noexcept-cpp
107/107 Test #107: noexcept-cpp ...................................... Passed 0.01 sec

100% tests passed, 0 tests failed out of 107

Total Test time (real) = 155.98 sec

@vslyu
Contributor

vslyu commented Mar 25, 2021

Are you running inference or training?

@zh794390558
Contributor

zh794390558 commented Mar 25, 2021

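The DeepSpeech branch release/v1.8.5 is no longer maintained. For paddle 2.x we recommend using the code on the develop branch; a release/2.0 branch will follow later.
For MKL acceleration, you need a paddle build compiled with MKL support. You also need to enable MKL in paddle and tune the thread count and other MKL-related settings.
Running DeepSpeech on CPU with paddle 2.0 currently has a known accuracy problem, which is being fixed. The GPU side has already been fixed: the fix is merged into develop and is about to be merged into the 2.0.2 branch.
#31846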
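
As a rough illustration of the "enable MKL and tune the thread count" step, the sketch below sets the relevant environment variables before paddle is imported. `OMP_NUM_THREADS` and `MKL_NUM_THREADS` are the standard OpenMP and Intel MKL thread-count variables. `FLAGS_use_mkldnn` is assumed here to be honored by the installed paddle build, and it can only have an effect if that build was compiled with MKL support. The helper names `cpu_thread_env` and `apply_env` are hypothetical and are not part of paddle or DeepSpeech.

```python
import os


def cpu_thread_env(num_threads, use_mkldnn=True):
    """Build environment settings for multi-threaded CPU inference.

    OMP_NUM_THREADS / MKL_NUM_THREADS are standard OpenMP / Intel MKL variables.
    FLAGS_use_mkldnn is an assumption: it is expected to be read by a paddle
    build that was compiled with MKL support.
    """
    if num_threads < 1:
        raise ValueError("num_threads must be at least 1")
    return {
        "OMP_NUM_THREADS": str(num_threads),
        "MKL_NUM_THREADS": str(num_threads),
        "FLAGS_use_mkldnn": "1" if use_mkldnn else "0",
    }


def apply_env(settings):
    """Export the settings into this process's environment.

    Call this before `import paddle` so the values are visible at start-up.
    """
    os.environ.update(settings)


if __name__ == "__main__":
    # Hypothetical choice: use every core the OS reports.
    apply_env(cpu_thread_env(os.cpu_count() or 1))
    # import paddle  # import only after the variables above are set
```

Using every core is not necessarily the fastest setting; the thread count is worth measuring on the actual inference workload.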

@DemoMoon
Author

Are you running inference or training?

Inference.

@paddle-bot added the status/close 已关闭 (closed) label on Jan 11, 2023