Owner

@mkatliar mkatliar commented Jul 23, 2024

  • SIMD operations reimplemented using xsimd library.
  • Architecture-specific code moved to blast/math/simd/arch.

@mkatliar mkatliar linked an issue Jul 23, 2024 that may be closed by this pull request
@mkatliar mkatliar requested a review from roversch August 1, 2024 07:49
@mkatliar mkatliar force-pushed the xsimd branch 5 times, most recently from d0066dc to 5c95f8c Compare August 2, 2024 16:19
@mkatliar mkatliar marked this pull request as ready for review August 3, 2024 03:34
@mkatliar mkatliar self-assigned this Aug 3, 2024
Collaborator

@roversch roversch left a comment

Great! All in all, the code became a lot smaller.

I don't have any blocking comments, just minor suggestions and questions.

sudo apt-get update
sudo apt install libboost-exception-dev libbenchmark-dev -y
- name: Install LLVM and Clang 18
Collaborator

Any way to make this version-independent?

Owner Author

It should not be a problem. But the CI system has clang-14, and on my system the oldest clang I can install with apt is clang-15. Compiling with older clang requires adding the typename keyword here and there. Also, the code builds with gcc now, but I am not sure about older versions.

I suggest that we keep it in mind and come back to the compiler support issue later.
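
For context on the typename remark, here is a minimal sketch of the kind of declaration involved. C++20 relaxed the rule that dependent type names must be prefixed with typename (proposal P0634), which, as far as I know, older Clang releases did not yet implement. SimdVecSketch and PointerSketch are hypothetical stand-ins, not the classes from this pull request.

```cpp
#include <type_traits>

// Hypothetical stand-in for a SIMD vector wrapper (not the library's class).
template <typename T>
struct SimdVecSketch
{
    using IntrinsicType = T;
};

// Hypothetical stand-in for a pointer-like class parameterized by element type.
template <typename T>
struct PointerSketch
{
    using SimdVecType = SimdVecSketch<std::remove_cv_t<T>>;

    // SimdVecType depends on T, so IntrinsicType is a dependent name.
    // The `typename` prefix is accepted by every compiler; compilers that
    // implement the C++20 relaxation also accept the shorter form
    //     using IntrinsicType = SimdVecType::IntrinsicType;
    using IntrinsicType = typename SimdVecType::IntrinsicType;
};

static_assert(std::is_same_v<PointerSketch<double const>::IntrinsicType, double>);
```

Keeping the explicit typename costs a few characters and is one way to stay buildable on older toolchains.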

cmake -B ${{github.workspace}}/build \
-DCMAKE_BUILD_TYPE=${{env.BUILD_TYPE}} \
-DCMAKE_CXX_COMPILER=clang++ \
-DCMAKE_CXX_COMPILER=clang++-18 \
Collaborator

Also here

Owner Author

Answered above

&& cmake -DCMAKE_BUILD_TYPE=RelWithDebInfo \
-DCMAKE_CXX_COMPILER="clang++-15" \
-DCMAKE_CXX_FLAGS="-march=native -mfma -mavx -mavx2 -msse4 -fno-math-errno" \
-DCMAKE_CXX_FLAGS="-march=native -mfma -mavx -mavx2 -msse4 -fno-math-errno -DXSIMD_DEFAULT_ARCH='fma3<avx2>'" \
Collaborator

Would be nice to detect this automatically (but that is maybe out of scope of this pull request)

Owner Author

I believe xsimd does automatically deduce the best possible architecture if you don't specify one. For now I made it explicit because currently we only support avx2.
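
As an illustration of how a compile-time default architecture can be deduced, here is a minimal sketch based on the macros that GCC and Clang predefine when an instruction set is enabled. It only shows the general idea and is not taken from xsimd's implementation; the tag types and DefaultArchSketch are hypothetical names.

```cpp
#include <string_view>

// Hypothetical architecture tags; xsimd defines its own architecture types.
struct GenericTag { static constexpr std::string_view name = "generic"; };
struct Sse2Tag    { static constexpr std::string_view name = "sse2"; };
struct Avx2Tag    { static constexpr std::string_view name = "avx2"; };

// Pick the widest instruction set the compiler was allowed to use.
// __AVX2__ and __SSE2__ are predefined by GCC and Clang when the
// corresponding instruction sets are enabled (e.g. via -mavx2 or -march).
#if defined(__AVX2__)
using DefaultArchSketch = Avx2Tag;
#elif defined(__SSE2__)
using DefaultArchSketch = Sse2Tag;
#else
using DefaultArchSketch = GenericTag;
#endif

// Callers may still override the deduced default explicitly.
template <typename Arch = DefaultArchSketch>
constexpr std::string_view archName() noexcept
{
    return Arch::name;
}
```

With a scheme like this, an explicit definition on the command line is only needed when the deduced default has to be overridden.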


template <typename Arch>
requires std::is_base_of_v<xsimd::avx2, Arch>
inline std::tuple<xsimd::batch<float, Arch>, xsimd::batch<std::int32_t, Arch>> imax(xsimd::batch<float, Arch> const& v1, xsimd::batch<std::int32_t, Arch> const& idx) noexcept
Collaborator

Some documentation would be nice here. If it's a standard approach, at least a link or so.

Owner Author

Added in 2f757bb

using MaskType = typename Simd<std::remove_cv_t<T>>::MaskType;
using SimdVecType = SimdVec<std::remove_cv_t<T>>;
using IntrinsicType = SimdVecType::IntrinsicType;
using MaskType = SimdMask<std::remove_cv_t<T>>;
Collaborator

Just a question: I guess you remove const and volatile qualifiers here, because otherwise you run into problems with const correctness?

Owner Author

@mkatliar mkatliar Aug 5, 2024

Yes, without it I used to run into compiler errors. We can try removing it and see what happens.
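
One plausible way such errors arise (the exact errors are not stated in this thread) is a traits class that is specialized only for unqualified element types. The sketch below uses hypothetical names, and the sizes assume 256-bit registers.

```cpp
#include <cstddef>
#include <type_traits>

// Hypothetical traits class; the primary template is intentionally undefined,
// so only the explicit specializations below are usable.
template <typename T>
struct SimdTraitsSketch;

template <>
struct SimdTraitsSketch<double> { static constexpr std::size_t size = 4; };

template <>
struct SimdTraitsSketch<float> { static constexpr std::size_t size = 8; };

// A pointer-like wrapper may be instantiated with T = double const.
template <typename T>
struct ReadOnlyViewSketch
{
    // SimdTraitsSketch<double const> has no specialization and would be an
    // incomplete type; stripping cv-qualifiers maps it onto the <double> one.
    using Traits = SimdTraitsSketch<std::remove_cv_t<T>>;
    static constexpr std::size_t size = Traits::size;
};

static_assert(ReadOnlyViewSketch<double const>::size == 4);
static_assert(ReadOnlyViewSketch<float>::size == 8);
```

Whether remove_cv_t is still required after the move to xsimd would have to be checked by removing it and compiling, as suggested above.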

v[i] = mask[i] ? ptr_[spacing_ * i] : T {};

return SimdVecType {v};
return SimdVecType {v, false};
Collaborator

This was surprising to me. What is the false doing here?

Owner Author

It specifies whether v is aligned or not.
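
To make the alignment question concrete, here is a small sketch of what "aligned" means for a 256-bit load, assuming a 32-byte requirement. isAligned and AlignedBufferSketch are hypothetical helpers, not part of this pull request.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Assumed alignment requirement for aligned 256-bit loads.
inline constexpr std::size_t kVectorAlignment = 32;

// True if the address is a multiple of the requested alignment.
inline bool isAligned(void const * p, std::size_t alignment = kVectorAlignment) noexcept
{
    return reinterpret_cast<std::uintptr_t>(p) % alignment == 0;
}

// Over-aligned buffer: its address is guaranteed to be a multiple of 32.
struct alignas(kVectorAlignment) AlignedBufferSketch
{
    std::array<double, 4> data {};
};

inline bool alignmentDemo()
{
    AlignedBufferSketch aligned;
    // The first element sits at the start of the buffer and is aligned;
    // the next element is 8 bytes further and cannot be 32-byte aligned.
    return isAligned(aligned.data.data()) && !isAligned(aligned.data.data() + 1);
}
```

A plain local array of elements typically carries only the alignment of its element type, which would explain passing false for a temporary like v.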

return ReturnType(*pm);
}
}
}
Collaborator

How did this even compile before if you're missing braces :D

Owner Author

I think this class was not used. But it will come back to life in the next PR.


using MaskType = typename Simd<ET2>::MaskType;
using IntType = typename Simd<ET2>::IntType;
using MaskType = SimdMask<ET2, xsimd::default_arch>;
Collaborator

The xsimd::default_arch is implied here I guess? Not really necessary?

Owner Author

True. I think I wrote this line before I added a default value for Arch in SimdMask.

*
* @return Number of SIMD registers for AVX2
*/
std::size_t constexpr registerCapacity(xsimd::avx2)
Collaborator

Maybe add a todo for other architectures?

Owner Author

The compiler will force us to add it when we try to compile for a different arch.
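
The mechanism behind that statement can be sketched with tag dispatch: a function overloaded on an architecture tag simply has no match for a tag without an overload. The tags and registerCapacitySketch below are hypothetical, and the value 16 is an assumption based on the number of YMM registers on x86-64, since the real return value is not shown in this thread.

```cpp
#include <cstddef>
#include <type_traits>
#include <utility>

// Hypothetical architecture tags (stand-ins for the xsimd architecture types).
struct Avx2Tag {};
struct Avx512Tag {};   // deliberately has no overload below

// Assumed value: x86-64 exposes 16 YMM registers.
constexpr std::size_t registerCapacitySketch(Avx2Tag) noexcept
{
    return 16;
}

// Detection idiom: true only if a matching overload exists for Arch.
template <typename Arch, typename = void>
struct HasRegisterCapacity : std::false_type {};

template <typename Arch>
struct HasRegisterCapacity<
    Arch,
    std::void_t<decltype(registerCapacitySketch(std::declval<Arch>()))>
> : std::true_type {};

static_assert(HasRegisterCapacity<Avx2Tag>::value);
static_assert(!HasRegisterCapacity<Avx512Tag>::value);
```

A direct call such as registerCapacitySketch(Avx512Tag {}) fails to compile, which acts as the reminder that a todo comment would otherwise provide.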

size_t constexpr SimdSize_v = SimdSize<T>::value;
}
template <typename T, typename Arch = xsimd::default_arch>
std::size_t constexpr SimdSize_v = xsimd::batch<std::remove_cv_t<T>, Arch>::size;
Collaborator

Isn't this what SS was doing before? Do we have some duplication perhaps?

Owner Author

SS is just a shorthand to avoid long expressions like SimdSize_v<double> or xsimd::batch<double, Arch>::size. SS is a private implementation detail. SimdSize_v is supposed to be a shorter version of xsimd::batch<double, Arch>::size; it also hides the dependency on xsimd. We can replace it with something more modern and convenient if we want, e.g.

template <typename T, typename Arch>
constexpr std::size_t simdSize(Arch arch)
{
    return xsimd::batch<T, Arch>::size;
}
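
A self-contained sketch of how such a function could be called, with the element type given explicitly and the architecture deduced from the argument. BatchSketch and Avx2Tag are hypothetical stand-ins for xsimd::batch and xsimd::avx2, and the 32-byte register width is an assumption for AVX2.

```cpp
#include <cstddef>

// Hypothetical architecture tag carrying an assumed register width in bytes.
struct Avx2Tag
{
    static constexpr std::size_t registerBytes = 32;
};

// Hypothetical stand-in for a SIMD batch type exposing its element count.
template <typename T, typename Arch>
struct BatchSketch
{
    static constexpr std::size_t size = Arch::registerBytes / sizeof(T);
};

// T is specified by the caller; Arch is deduced from the (unused) tag argument.
template <typename T, typename Arch>
constexpr std::size_t simdSizeSketch(Arch) noexcept
{
    return BatchSketch<T, Arch>::size;
}

static_assert(simdSizeSketch<double>(Avx2Tag {}) == 4);
static_assert(simdSizeSketch<float>(Avx2Tag {}) == 8);
```

Compared with a variable template, the function form lets the architecture be passed as a value, so call sites that already hold an architecture object need not repeat its type.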

@mkatliar mkatliar merged commit 999b6c7 into master Aug 5, 2024
@mkatliar mkatliar deleted the xsimd branch August 5, 2024 12:18
roversch pushed a commit that referenced this pull request Aug 13, 2024
roversch pushed a commit that referenced this pull request Sep 18, 2024

Development

Successfully merging this pull request may close these issues.

Switch to xsimd

3 participants