Segmentation fault and Assembler messages while compiling OpenCV with Branch rvv-intrinsic #701

joy2myself opened this issue Aug 27, 2020 · 8 comments



I used some RVV intrinsics while developing in OpenCV and tried to compile OpenCV with the rvv-intrinsic branch. I have encountered two errors so far.


In file included from /home/git/opencv/modules/core/src/mathfuncs_core.dispatch.cpp:7:
/home/git/opencv/modules/core/src/mathfuncs_core.simd.hpp:76:8: internal compiler error: Segmentation fault
   76 | struct v_atan_f32
      |        ^~~~~~~~~~
0xfb1dbf crash_signal
0x127ec43 selt
0x127ec43 wi::lts_p_large(long const*, unsigned int, unsigned int, long const*, unsigned int)
0x98a486 bool wi::lts_p<generic_wide_int<wi::extended_tree<192> >, generic_wide_int<wi::extended_tree<192> > >(generic_wide_int<wi::extended_tree<192> > const&, generic_wide_int<wi::extended_tree<192> > const&)
0x98a486 wi::binary_traits<generic_wide_int<wi::extended_tree<192> >, generic_wide_int<wi::extended_tree<192> >, wi::int_traits<generic_wide_int<wi::extended_tree<192> > >::precision_type, wi::int_traits<generic_wide_int<wi::extended_tree<192> > >::precision_type>::signed_predicate_result operator< <generic_wide_int<wi::extended_tree<192> >, generic_wide_int<wi::extended_tree<192> > >(generic_wide_int<wi::extended_tree<192> > const&, generic_wide_int<wi::extended_tree<192> > const&)
0x98a486 tree_int_cst_lt(tree_node const*, tree_node const*)
0x98a486 walk_subobject_offsets
0x994d36 layout_class_type
0x994d36 finish_struct_1(tree_node*)
0x996db4 finish_struct(tree_node*, tree_node*)
0xa523c1 cp_parser_class_specifier_1
0xa54039 cp_parser_class_specifier
0xa54039 cp_parser_type_specifier
0xa54bd1 cp_parser_decl_specifier_seq
0xa556e4 cp_parser_simple_declaration
0xa799b9 cp_parser_declaration
0xa7a59c cp_parser_declaration_seq_opt
0xa7a59c cp_parser_namespace_body
0xa7a59c cp_parser_namespace_definition
0xa79a8f cp_parser_declaration
Please submit a full bug report,
with preprocessed source if appropriate.
Please include the complete backtrace with any bug report.
See <> for instructions.
modules/core/CMakeFiles/opencv_core.dir/build.make:577: recipe for target 'modules/core/CMakeFiles/opencv_core.dir/src/mathfuncs_core.dispatch.cpp.o' failed


/tmp/ccyu6Srz.s: Assembler messages:
/tmp/ccyu6Srz.s:32785: Error: unrecognized opcode `vqmaccu.vv v4,v2,v1'
/tmp/ccyu6Srz.s:33036: Error: unrecognized opcode `vqmacc.vv v4,v2,v1'
/tmp/ccyu6Srz.s:33285: Error: unrecognized opcode `vqmaccu.vv v4,v2,v1'
/tmp/ccyu6Srz.s:33523: Error: unrecognized opcode `vqmacc.vv v4,v2,v1'
modules/core/CMakeFiles/opencv_core.dir/build.make:590: recipe for target 'modules/core/CMakeFiles/opencv_core.dir/src/matmul.dispatch.cpp.o' failed

These seem to be problems with the toolchain itself, right?

Steps to compile OpenCV:

git clone
cd opencv
git checkout rvv
mkdir build && cd build
cmake -DCMAKE_TOOLCHAIN_FILE=../platforms/linux/riscv64-gcc.toolchain.cmake ../
make -k

This assumes the toolchain built from the rvv-intrinsic branch is installed in /opt/RISCV. If it is not, you may need to edit opencv/platforms/linux/riscv64-gcc.toolchain.cmake.


The vq* instructions require the zvqmac extension, which is not enabled by default. You need to enable it via -march, e.g. -march=rv64gcv_zvqmac.

As for the compiler issue, I am investigating it now. Thanks for your report :)


The internal compiler error occurs because you declare a class/struct with an RVV vector type member, which is an unsupported use of those types. We'll soon emit an error message to forbid it.

The reason we can't support that is that the size of an RVV vector type is unknown at compile time, so the compiler can't lay out the class at compile time.
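
To make the layout constraint concrete, here is a minimal portable sketch in plain C++ with no RVV headers. The names FixedLanes and runtime_vector_bytes are hypothetical and only serve the illustration. A class with fixed-size members has a size and member offsets the compiler can compute while parsing, whereas a register whose width depends on the hardware's VLEN has a size known only at run time.

```cpp
#include <cstddef>
#include <cstdint>

// Fixed-size storage: the compiler knows every member offset while it
// lays out the class, so this arrangement is valid.
struct FixedLanes {
  std::int32_t lanes[4];  // 4 x 32 bits = 128 bits, known at compile time
  std::int32_t tail;      // placed right after lanes
};

// Both facts are checked during compilation, which is exactly what class
// layout needs.
static_assert(sizeof(FixedLanes) == 5 * sizeof(std::int32_t),
              "class size is a compile-time constant");
static_assert(offsetof(FixedLanes, tail) == 4 * sizeof(std::int32_t),
              "member offset is a compile-time constant");

// Hypothetical stand-in for a scalable register: its byte size depends on
// a hardware parameter (VLEN in bits) that is only available at run time,
// so no static_assert like the ones above could be written for it.
inline std::size_t runtime_vector_bytes(std::size_t vlen_bits) {
  return vlen_bits / 8;
}
```

If a member's size behaved like runtime_vector_bytes, the offset of every member after it would be unknown while the class is being laid out, which matches the layout_class_type frames in the backtrace above.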


@kito-cheng Hi Kito. Thanks a lot for your reply.
In OpenCV universal intrinsics, vector types always have a fixed size (usually 128 bits, for example v_int8x16, v_int16x8 and so on). If I can't use an RVV vector type in a class/struct, how can I implement those universal intrinsic vector types?


I have a rough idea for solving that, but it might produce very bad code gen with the current GCC implementation:

#include <riscv_vector.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

// Fixed-size (4 x int32) wrapper: the data lives in a plain array, and RVV
// registers are used only as temporaries inside each member function.
class rvv_vector_type_wrapper32x4 {
public:
    // Zero-initialize all four lanes.
    rvv_vector_type_wrapper32x4 () {
      for (int i=0; i<4; i++){
        data[i] = 0;
      }
    }

    // Spill an RVV register into the fixed-size array.
    rvv_vector_type_wrapper32x4 (vint32m1_t val) {
      vsetvl_e32m1 (4);
      vse32_v_i32m1 (&data[0], val);
    }

    void load (int32_t *ptr) {
      vsetvl_e32m1 (4);
      vint32m1_t val = vle32_v_i32m1 (ptr);
      vse32_v_i32m1 (&this->data[0], val);
    }

    void store (int32_t *ptr) const {
      vsetvl_e32m1 (4);
      vint32m1_t val = vle32_v_i32m1 (&this->data[0]);
      vse32_v_i32m1 (ptr, val);
    }

    // Reload both operands into registers, add, and wrap the result.
    rvv_vector_type_wrapper32x4 operator+(
      const rvv_vector_type_wrapper32x4 &rhs) const {
      vsetvl_e32m1 (4);
      vint32m1_t vrhs = vle32_v_i32m1 (&rhs.data[0]);
      vint32m1_t vlhs = vle32_v_i32m1 (&this->data[0]);

      vint32m1_t val = vadd_vv_i32m1(vrhs, vlhs);
      return rvv_vector_type_wrapper32x4(val);
    }

    int32_t operator[](size_t idx) const {
      return data[idx];
    }

    int32_t data[4];
};

int x[4] = {1, 2, 3, 4};
int y[4] = {5, 6, 7, 8};
int z[4] = {0};

int main()
{
  rvv_vector_type_wrapper32x4 a, b;
  a.load (&x[0]);
  b.load (&y[0]);
  rvv_vector_type_wrapper32x4 c = a + b;
  c.store (&z[0]);
  printf ("result :");
  for (int i=0; i<4; i++){
    printf ("%d ", c[i]);
  }
  printf ("\n");
  return 0;
}


@kito-cheng Thanks a lot for the idea. I will use it temporarily. We are also discussing changes to the OpenCV universal intrinsic framework to fit scalable vector architectures.


@kito-cheng , what are your thoughts on this approach for avoiding vector-member-in-class?
Basic example
Dot product

It's a thin C++ wrapper that can use e.g. vfloat32m1_t, hidden from user code via auto. An empty Simd<T, N> "descriptor" is used to select overloads of functions.

More info (slides). Would be happy to discuss.


@jan-wassenberg I spent more time than I expected investigating this, since my C++ is a little rusty (GCC itself still stays at C++98 :P). Highway seems awesome to me, and RVV is basically very similar to SVE, so if it works with SVE, it should work on RVV too. I am curious about the empty Simd<T, N> descriptor, though: I saw that most backends in Highway have some kind of Raw data member in Vec128 and Mask128, and I am not sure how to handle this part in SVE or RVV?

I also did not see SVE support on GitHub. Is there any branch or repo for that?

Thanks again for sharing your awesome work!


@kito-cheng thanks!

I saw that most backends in Highway have some kind of Raw data member in Vec128 and Mask128

Right, this is fine on x86 and NEON but will not work on SVE/RVV. That is why the Highway API is designed to work with builtin/non-class vector types such as vfloat32m1_t. Highway uses nonmember functions.

The RVV backend would return vfloat32m1_t from functions instead of Vec128. The empty Simd<T, N> helps select the correct overload, e.g.

template <size_t N>
vint8m1_t Undefined(Simd<int8_t, N>) { return vundefined_i8m1(); }
template <size_t N>
vint32m1_t Undefined(Simd<int32_t, N>) { return vundefined_i32m1(); }

I believe the N (some large constant) will be able to encode the mf8..m8.
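
As a rough, host-portable sketch of this descriptor idea (not Highway's actual code), the block below uses hypothetical stand-in types FakeVecI8 and FakeVecI32 in place of builtin types such as vint8m1_t and vint32m1_t, so it compiles without an RVV toolchain. The function names Zero, Set, Add and SumLanes are likewise assumptions chosen for illustration. The point is that the empty Simd<T, N> tag selects the overload, while the values passed around are raw vector types rather than wrapper classes.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Empty descriptor: carries the lane type and lane count in its type only.
template <typename T, std::size_t N>
struct Simd {};

// Hypothetical stand-ins for builtin vector types (assumption: on RVV these
// would be sizeless types such as vint8m1_t and vint32m1_t).
using FakeVecI8 = std::array<std::int8_t, 16>;
using FakeVecI32 = std::array<std::int32_t, 4>;

// The descriptor argument is unused at run time; it only picks the overload.
template <std::size_t N>
FakeVecI8 Zero(Simd<std::int8_t, N>) { return FakeVecI8{}; }

template <std::size_t N>
FakeVecI32 Zero(Simd<std::int32_t, N>) { return FakeVecI32{}; }

// Broadcast a scalar to all lanes of the vector type selected by the tag.
template <std::size_t N>
FakeVecI32 Set(Simd<std::int32_t, N>, std::int32_t x) {
  FakeVecI32 v;
  v.fill(x);
  return v;
}

// Non-member operation on the raw vector type; no class wraps the vector.
inline FakeVecI32 Add(const FakeVecI32& a, const FakeVecI32& b) {
  FakeVecI32 r{};
  for (std::size_t i = 0; i < r.size(); ++i) r[i] = a[i] + b[i];
  return r;
}

// User code names only the descriptor; vector types stay behind `auto`.
inline std::int32_t SumLanes(std::int32_t lhs, std::int32_t rhs) {
  const Simd<std::int32_t, 4> d{};
  auto a = Set(d, lhs);
  auto b = Set(d, rhs);
  auto c = Add(a, b);
  std::int32_t sum = 0;
  for (std::int32_t lane : c) sum += lane;
  return sum;
}
```

Because user code never spells out the vector type, a backend could swap the stand-ins for builtin sizeless types without requiring a vector member in any class.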

I also did not see SVE support on GitHub

It is not implemented yet. We are currently removing things that will not work in SVE, e.g. MaxLanes. After that I will look into a development environment for RVV and/or SVE.

Is there any branch or repo for that?

Our work is automatically mirrored to so you can watch that repo :)
