[microTVM][RISCV] Tensorization for conv_2d_nchw_int8 with RVV extension #14836
base: main
Conversation
Thanks for contributing to TVM! Please refer to the contributing guidelines https://tvm.apache.org/docs/contribute/ for useful information and tips. Please request code reviews from Reviewers by @-ing them in a comment. Generated by tvm-bot
# 2) Check target.
current_target = target.Target.current(allow_none=False)
has_attr = "+v" in current_target.mattr
Could you please check whether your schedules would also be supported by the sub-extensions zve32x, zve32f, zve64x, zve64f, and zve64d? If this is the case, please extend this check to be aware of these mattr flags.
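One possible shape for such a check (a sketch only — the helper name `has_rvv_support` and the exact flag spellings are assumptions following LLVM's mattr naming, not code from this PR):

```python
# Sub-extensions that also provide the integer vector instructions an
# int8 dot-product schedule relies on, in LLVM's "+<ext>" mattr spelling.
RVV_VECTOR_FLAGS = {"+v", "+zve32x", "+zve32f", "+zve64x", "+zve64f", "+zve64d"}


def has_rvv_support(mattr):
    """Return True if any RVV (sub-)extension flag is present in mattr."""
    return any(flag in RVV_VECTOR_FLAGS for flag in mattr)
```

The existing `has_attr = "+v" in current_target.mattr` line could then become `has_attr = has_rvv_support(current_target.mattr)`.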
#ifdef __cplusplus
extern "C"
#endif
int32_t dot_uint8_int8_int32_body(uint8_t* data, int8_t* kernel, int32_t* output) {{
Do you by any chance have benchmark results on how this performs compared to the RVV auto-vectorizer provided by RISC-V LLVM/GCC?
tests/python/relay/aot/riscv.mk
Outdated
@@ -0,0 +1,85 @@
# Licensed to the Apache Software Foundation (ASF) under one |
I tried to upstream a patch like this before, and it was rejected because the AOTTestRunner is being removed at some point in favor of the MicroTVM Project API interface. (See #12534 (comment))
I have actually had a MicroTVM Project API template for Spike for a long time now (and use it regularly). I still need to clean it up before upstreaming: https://github.com/PhilippvK/microtvm-spike-template
I added a Project API template for RISC-V with Spike, based on CRT.
Thank you for coming up with that change. Would it be possible to split this up into a different PR with separate tests etc.?
That would make reviewing the individual changes much easier.
Yes, sure. I'll create a new PR for the Project API.
tests/python/relay/aot/riscv.mk
Outdated
TARGET_CFLAGS = --target=riscv64-unknown-linux-gnu -march=rv64gcv -static
RUNNER = spike
RUNNER_OPT = --isa=rv64gcv $(shell which pk)
I would like to have support for rv32gc here as well.
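One way the makefile variables above could be parameterized over both bases is sketched below in Python; the helper name `riscv_flags` and the idea of deriving the clang triple and Spike ISA string from a single march value are illustrative assumptions, not code from this PR:

```python
def riscv_flags(march):
    """Derive toolchain/runner settings from a RISC-V march string.

    Picks the 32- or 64-bit target triple from the rv32/rv64 prefix, so
    the same logic covers rv32gc and rv64gcv.
    """
    xlen = 32 if march.startswith("rv32") else 64
    triple = f"riscv{xlen}-unknown-linux-gnu"
    return {
        "TARGET_CFLAGS": f"--target={triple} -march={march} -static",
        "RUNNER_OPT": f"--isa={march}",
    }
```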
tests/python/relay/aot/riscv.mk
Outdated
RUNNER = spike
RUNNER_OPT = --isa=rv64gcv $(shell which pk)

PKG_CFLAGS = ${PKG_COMPILE_OPTS} ${TARGET_CFLAGS} -O2 \
Any reason for using -O2 instead of -O3 or -Os?
No, there was no particular reason.
tests/python/relay/aot/riscv.mk
Outdated
DMLC_CORE=$(TVM_ROOT)/3rdparty/dmlc-core
TOOLCHAIN_PATH=$(shell dirname $(shell which riscv64-unknown-linux-gnu-gcc))/..

CC = clang-16
Could both Clang-16 and RISC-V GCC be supported here?
has_attr = "+v" in current_target.mattr
is_arch_support = "v" in current_target.arch[2:]
if not is_arch_support and "march" in current_target.attrs:
    is_arch_support = "v" in current_target.attrs["march"]
I was not aware that -march= is actually exposed to the TVM target. This is great, because it would also work with non-LLVM TVM targets.
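A plain substring test on the march string can misfire (the "v" in the "rv32"/"rv64" prefix is why the snippet above slices with `arch[2:]`). A stricter parse could look like the following sketch; `march_has_vector` is a hypothetical helper, not code from this PR:

```python
def march_has_vector(march):
    """Detect the V extension (or a Zve* sub-extension) in a march string.

    Single-letter extensions sit between the rv32/rv64 prefix and the
    first "_"-separated multi-letter extension, so "v" is only matched
    there; Zve* sub-extensions are matched in the suffix list.
    """
    base, _, rest = march.partition("_")
    single_letter_exts = base[4:]  # strip the "rv32"/"rv64" prefix
    if "v" in single_letter_exts:
        return True
    return any(ext.startswith("zve") for ext in rest.split("_") if ext)
```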
s[kernel_vec].reorder(oc_chunk, oh, ic_chunk, ow, ic_block, oc_block)
oc_bn = cfg["tile_oc"].size[-1]
if oc_bn > 1:
    s[kernel_vec].vectorize(oc_block)
You are adding LLVM-exclusive schedule primitives but only testing them on TVM's C backend (with tir.disable_vectorize=1). We eventually also need to test whether your vector intrinsics interfere with the auto-vectorization going on in LLVM. Testing this should be possible without compiling and running for Spike, since errors would already show up during tvm.build().
Force-pushed from b6db5aa to f7fb425
target_include_directories(tvm_model PRIVATE ${CMAKE_SOURCE_DIR}/include crt_config crt/include)
target_compile_options(tvm_model PRIVATE -Wno-error=unused-variable -Wno-error=missing-braces -Wno-error=unused-const-variable -Wno-unused-variable)
set_target_properties(tvm_model PROPERTIES LINKER_LANGUAGE C)
target_link_libraries(main PRIVATE tvm_model)
I think you would need to add m (aka. libm) here to support Softmax layers.
I'll add it.
spike_args = ["spike", f"--isa={march}", os.path.join(toolchain_path, target, "bin/pk")]
self._proc = subprocess.Popen(
    spike_args + [self.BUILD_TARGET],
    stdin=subprocess.PIPE,
I was wondering if you also ran into the issue with broken STDIN support in recent versions of Spike mentioned here: riscv-software-src/riscv-isa-sim#1400. Are you using a version of Spike that does not have that problem, or did you find another workaround?
On the latest version of Spike, when using instruction counting, I got the error "An illegal instruction was executed!". I am now using Spike version 1.1.0, and this error does not occur there.
set(CMAKE_CXX_STANDARD 11)
set(CMAKE_C_COMPILER "clang")
set(CMAKE_CXX_COMPILER "clang++")
set(FLAGS "--target=${TARGET} --sysroot=${TOOLCHAIN_PATH}/sysroot --gcc-toolchain=${TOOLCHAIN_PATH} -march=${MARCH} -static")
I think we should also expose -mabi to the user.
I'll add it.
toolchain_path = options.get("toolchain_path")
target = options.get("target")
march = options.get("march")
spike_args = ["spike", f"--isa={march}", os.path.join(toolchain_path, target, "bin/pk")]
The --isa argument of Spike will not always be the same as the -march used with clang. I would prefer having isa exposed as a project option (defaulting to the same value as march) so it can be overridden if required.
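The suggested fallback could be as small as the following sketch; `resolve_spike_isa` is a hypothetical helper, and the option names simply mirror the ones already used in the server:

```python
def resolve_spike_isa(options):
    """Return the spike --isa value: the "isa" option if set, else "march"."""
    return options.get("isa") or options.get("march")
```

For example, `{"march": "rv64gcv"}` resolves to "rv64gcv", while adding `"isa": "rv64gcv_zvl512b"` overrides it.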
I'll add isa as a project option.
default="rv64gc",
help="Sets the value of target architecture.",
),
],
To compare how this performs for different vector lengths, we should make it configurable, either by exposing vlen and elen as project options or by adding a generic spike_extra_args option to pass any user-defined arguments on the command line. See here: https://github.com/PhilippvK/microtvm-spike-template/blob/main/template_project/microtvm_api_server.py#L136-L149
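A sketch of how both suggestions could compose into the Spike command line; the helper name `build_spike_cmd`, the option names, and the `--varch` spelling (supported by older Spike versions, newer ones encode vlen in the ISA string) are assumptions, not code from this PR:

```python
def build_spike_cmd(options, binary):
    """Assemble the spike command line from project options.

    Adds --varch when both vlen and elen are given, then appends any
    user-supplied spike_extra_args before the binary.
    """
    cmd = ["spike", f"--isa={options['march']}"]
    vlen, elen = options.get("vlen"), options.get("elen")
    if vlen and elen:
        cmd.append(f"--varch=vlen:{vlen},elen:{elen}")
    cmd.extend(options.get("spike_extra_args", []))
    cmd.append(binary)
    return cmd
```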
Force-pushed from f15d664 to fe48a1b
Dot product tensorization for conv_2d_nchw_int8 with the RISC-V V extension, using LLVM intrinsics.
Tested with the AOT test runner on Spike.