-
Notifications
You must be signed in to change notification settings - Fork 0
Xsimd #12
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
d0066dc to
5c95f8c
Compare
roversch
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great! All in all, the code became a lot smaller.
I don't have any blocking comments, just minor suggestions and questions.
| sudo apt-get update | ||
| sudo apt install libboost-exception-dev libbenchmark-dev -y | ||
| - name: Install LLVM and Clang 18 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Any way to make this version indepedent?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It should be not a problem. But the CI system has clang-14, and on my system the oldest clang I can install with apt is clang-15. Compiling with older clang requires adding the typename keyword here and there. Also the code builds with gcc now, but I am not sure about older versions.
I suggest that we keep it in mind and come back to the compiler support issue later.
| cmake -B ${{github.workspace}}/build \ | ||
| -DCMAKE_BUILD_TYPE=${{env.BUILD_TYPE}} \ | ||
| -DCMAKE_CXX_COMPILER=clang++ \ | ||
| -DCMAKE_CXX_COMPILER=clang++-18 \ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also here
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Answered above
| && cmake -DCMAKE_BUILD_TYPE=RelWithDebInfo \ | ||
| -DCMAKE_CXX_COMPILER="clang++-15" \ | ||
| -DCMAKE_CXX_FLAGS="-march=native -mfma -mavx -mavx2 -msse4 -fno-math-errno" \ | ||
| -DCMAKE_CXX_FLAGS="-march=native -mfma -mavx -mavx2 -msse4 -fno-math-errno -DXSIMD_DEFAULT_ARCH='fma3<avx2>'" \ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would be nice to detect this automatically (but that is maybe out of scope of this pull request)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe xsimd does automatically deduce the best possible architecture if you don't specify one. For now I made it explicit because currently we only support avx2.
|
|
||
| template <typename Arch> | ||
| requires std::is_base_of_v<xsimd::avx2, Arch> | ||
| inline std::tuple<xsimd::batch<float, Arch>, xsimd::batch<std::int32_t, Arch>> imax(xsimd::batch<float, Arch> const& v1, xsimd::batch<std::int32_t, Arch> const& idx) noexcept |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some documentation would be nice here. If it's a standard approach, at least a link or so.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added in 2f757bb
| using MaskType = typename Simd<std::remove_cv_t<T>>::MaskType; | ||
| using SimdVecType = SimdVec<std::remove_cv_t<T>>; | ||
| using IntrinsicType = SimdVecType::IntrinsicType; | ||
| using MaskType = SimdMask<std::remove_cv_t<T>>; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just a question: I guess you remove const and volatile qualifiers here, because otherwise you run into problems with const correctness?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, without it I used to run into compiler errors. We can try removing it and see what happens.
| v[i] = mask[i] ? ptr_[spacing_ * i] : T {}; | ||
|
|
||
| return SimdVecType {v}; | ||
| return SimdVecType {v, false}; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This was surprising to me. What is the false doing here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It specifies whether v is aligned or not.
| return ReturnType(*pm); | ||
| } | ||
| } | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How did this even compile before if you're missing braces :D
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this class was not used. But it will come back to life in the next PR.
|
|
||
| using MaskType = typename Simd<ET2>::MaskType; | ||
| using IntType = typename Simd<ET2>::IntType; | ||
| using MaskType = SimdMask<ET2, xsimd::default_arch>; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The xsimd::default_arch is implied here I guess? Not really necessary?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
True. I think I wrote this line before I added a default value for Arch in SimdMask.
| * | ||
| * @return Number of SIMD registers for AVX2 | ||
| */ | ||
| std::size_t constexpr registerCapacity(xsimd::avx2) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe add a todo for other architectures?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The compiler will force us to add it when we try to compile for a different arch.
| size_t constexpr SimdSize_v = SimdSize<T>::value; | ||
| } | ||
| template <typename T, typename Arch = xsimd::default_arch> | ||
| std::size_t constexpr SimdSize_v = xsimd::batch<std::remove_cv_t<T>, Arch>::size; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Isn't this what SS was doing before? Do we have some duplication perhaps?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
SS is just a shorthand to avoid long expressions like SimdSize_v<double> or xsimd::batch<double, Arch>::size. SS is a private implementation detail. SimdSize_v is supposed to be a shorter version of xsimd::batch<double, Arch>::size, it also hides the dependency on xsimd. We can replace it with something more modern and convenient, if we want, e.g.
template <typename T, typename Arch>
constexpr std::size_t simdSize(Arch arch)
{
return xsimd::batch<T, Arch>::size;
}
blast/math/simd/arch.