[RISCV64][SHL] Added FC FP32 executor #23964
@EgorDuplensky Could you please review the PR?
Any plans regarding the tests?
Is there any RISCV emulator or something?
I used the current common As for emulators,
This PR will be closed in a week because of 2 weeks of no activity.
@EgorDuplensky I rebased onto the latest master and also added the following changes in the latest commit 12b9a9f:
Just wondering, is there any weight packing actually happening underneath? Or is this just the SHL FC not supporting transposed weights?
wei.setData(memory.at(ARG_WEI)->getData());
dst.setData(memory.at(ARG_DST)->getData());

OPENVINO_ASSERT(csinn_fullyconnected(src.get(), dst.get(), wei.get(), bias.get(), params.get()) == CSINN_TRUE,
Why is the bias data handle not updated inside execute?
Because bias is constant data and can be handled once in the executor constructor.
openvino/src/plugins/intel_cpu/src/nodes/executors/shl/shl_fullyconnected.cpp
Lines 72 to 74 in e1377b3
bias = ShlTensor(sess, memory.at(ARG_BIAS)->getDescPtr()->getShape().getStaticDims(),
                 precisionToShlDataType(biasDesc->getPrecision()),
                 getShlDataLayoutByMemoryDesc(biasDesc), memory.at(ARG_BIAS)->getData());
Please correct me if I missed something.
Discussed offline: aligned the behavior of the wei and bias tensors. Both tensors now update their data pointers in execute and set their static shapes once, in the constructor.
282e5b5
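For readers following the thread, the agreed pattern (static shapes set once in the constructor, data pointers refreshed on every `execute`) can be sketched roughly as below. This is only an illustration: `TensorView` and `FcExecutorSketch` are hypothetical stand-ins, not the plugin's `ShlTensor` or the SHL API, and the `[N, K]` weight layout is an assumption made for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical stand-in for a tensor wrapper: it owns the shape metadata but
// only borrows the data buffer, so the pointer can be rebound cheaply.
struct TensorView {
    std::vector<std::size_t> dims;  // static shape, fixed at construction
    const float* data = nullptr;    // borrowed pointer, rebound per inference

    explicit TensorView(std::vector<std::size_t> d) : dims(std::move(d)) {}
    void setData(const float* p) { data = p; }
};

// Hypothetical executor computing dst[m][n] = bias[n] + sum_k src[m][k] * wei[n][k].
// The [N, K] weight layout is an assumption of this sketch.
class FcExecutorSketch {
public:
    // Shapes are known up front, so they are set exactly once here.
    FcExecutorSketch(std::size_t M, std::size_t K, std::size_t N)
        : src({M, K}), wei({N, K}), bias({N}), M_(M), K_(K), N_(N) {}

    // Buffers may change between calls, so every pointer is refreshed here.
    void execute(const float* srcPtr, const float* weiPtr, const float* biasPtr, float* dstPtr) {
        assert(srcPtr && weiPtr && biasPtr && dstPtr);
        src.setData(srcPtr);
        wei.setData(weiPtr);
        bias.setData(biasPtr);

        for (std::size_t m = 0; m < M_; ++m) {
            for (std::size_t n = 0; n < N_; ++n) {
                float acc = bias.data[n];
                for (std::size_t k = 0; k < K_; ++k) {
                    acc += src.data[m * K_ + k] * wei.data[n * K_ + k];
                }
                dstPtr[m * N_ + n] = acc;
            }
        }
    }

    TensorView src, wei, bias;

private:
    std::size_t M_, K_, N_;
};
```

Treating weights and bias the same way keeps the executor correct even if the framework hands over a different buffer between calls, at the cost of a few pointer assignments per inference.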
### Details:
- *Reused FC RVV from SHL*
- *The PR to SHL dev branch with accuracy fix for FC f32: openvinotoolkit/shl#3*

### Tickets:
- *N/A*

### TODO:
- [x] Fix `execType: gemm_f32`
- [x] Added wrapper for `csinn_tensor` and `csinn_session` to allocate these structures and deallocate them

### Prerequisites:
- [x] openvinotoolkit#23901
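The TODO item about wrapping `csinn_tensor` and `csinn_session` describes the usual RAII approach to C-allocated structures. A minimal sketch of that idea follows, under loud assumptions: `fake_session`, `fake_alloc_session`, `fake_free_session`, and `SessionWrapper` are hypothetical stand-ins invented for this example and are not the SHL functions or the wrapper added in this PR.

```cpp
#include <cassert>
#include <cstdlib>
#include <memory>

// Hypothetical C-style API standing in for a library that hands out raw structs.
struct fake_session {
    int id;
};

// Counter used only to make the allocate/free pairing observable in this sketch.
int g_live_sessions = 0;

fake_session* fake_alloc_session() {
    auto* s = static_cast<fake_session*>(std::malloc(sizeof(fake_session)));
    if (s) {
        s->id = 0;
        ++g_live_sessions;
    }
    return s;
}

void fake_free_session(fake_session* s) {
    if (s) {
        --g_live_sessions;
        std::free(s);
    }
}

// RAII wrapper: allocation happens in the constructor, and the custom deleter
// guarantees the matching free when the wrapper goes out of scope.
class SessionWrapper {
public:
    SessionWrapper() : ptr_(fake_alloc_session(), &fake_free_session) {
        assert(ptr_ && "allocation failed");
    }

    // Raw pointer for passing to the C API; ownership stays with the wrapper.
    fake_session* get() const { return ptr_.get(); }

private:
    std::unique_ptr<fake_session, void (*)(fake_session*)> ptr_;
};
```

Using `std::unique_ptr` with a custom deleter makes the wrapper move-only by default, which avoids accidental double frees without any hand-written destructor.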
### Details:
- *Added parallelism support for FC*
- *Enabled OpenMP on rv64 by default*
- *PR to oneDNN: openvinotoolkit/oneDNN#260*

### Tickets:
- *N/A*

### Prerequisites:
- [x] #23901
- [x] #23964
- [x] #26175