Merged
27 commits
0b9550a
Removed load() and store() from StaticPanelMatrix
mkatliar Jul 23, 2024
b6eedca
Removed load() functions from Simd.hpp
mkatliar Jul 23, 2024
231d175
SimdVec implementation using xsimd
mkatliar Jul 25, 2024
be5b0fd
Moved operator[] to SimdVecBase
mkatliar Jul 25, 2024
e27037d
Generic member functions in SimdVec
mkatliar Jul 26, 2024
39856a3
Install xsimd step
mkatliar Jul 26, 2024
7aa0563
Add missing "typename " keyword
mkatliar Jul 26, 2024
d663683
Update "Build xsimd" step
mkatliar Jul 26, 2024
8b9124b
Simd<> specializations working for both avx2 and fma3<avx2>
mkatliar Aug 1, 2024
17521ea
Remove XSIMD_DEFAULT_ARCH setting from CMakeLists.txt
mkatliar Aug 1, 2024
c3adc2a
Code compiles for both avx2 and fma3<avx2>
mkatliar Aug 1, 2024
c21776f
SimdVec decoupled from SimdVecBase
mkatliar Aug 1, 2024
7ff99d0
SimdMask<> is an alias for xsimd::batch_bool<>
mkatliar Aug 1, 2024
e7ed800
Added SimdIndex, removed some messy types.
mkatliar Aug 2, 2024
25538a0
Got rid of Simd and SimdTraits classes
mkatliar Aug 2, 2024
87d7565
Got rid of the old Simd.hpp header
mkatliar Aug 2, 2024
f91233a
Default architecture changed to fma3<avx2> in Dockerfile
mkatliar Aug 2, 2024
4b50684
Got rid of SimdVecBase
mkatliar Aug 2, 2024
2c56305
Made the code compile with gcc (except internal compiler errors)
mkatliar Aug 2, 2024
7efe21b
Build with clang-16
mkatliar Aug 2, 2024
84695b9
Build with clang-18
mkatliar Aug 2, 2024
c3d5882
Removed RegisterMatrix specializations
mkatliar Aug 3, 2024
d789021
Removed avx2 SimdVec specializations
mkatliar Aug 3, 2024
01754af
Removed avx2 hsum()
mkatliar Aug 3, 2024
1b8c62c
Arch-specific code moved to blast/math/simd/arch
mkatliar Aug 3, 2024
9ebf1ed
Builds with gcc
mkatliar Aug 3, 2024
2f757bb
Documented iamax()
mkatliar Aug 5, 2024
22 changes: 19 additions & 3 deletions .github/workflows/cmake.yml
@@ -21,7 +21,17 @@ jobs:
- uses: actions/checkout@v3

- name: Install APT packages
run: sudo apt install clang libboost-exception-dev libbenchmark-dev -y
run: |
sudo apt-get update
sudo apt install libboost-exception-dev libbenchmark-dev -y

- name: Install LLVM and Clang 18
Collaborator

Any way to make this version independent?

Owner Author

It should not be a problem. However, the CI system has clang-14, and on my system the oldest clang I can install with apt is clang-15. Compiling with older clang requires adding the typename keyword here and there. The code also builds with gcc now, but I am not sure about older versions.

I suggest that we keep it in mind and come back to the compiler support issue later.
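
For context on the `typename` remark: the diff declares aliases such as `using IntrinsicType = SimdVecType::IntrinsicType;`, where the right-hand side is a dependent type name. C++20 allows omitting `typename` in that position, but only compilers that implement the relaxed rule accept it. Below is a minimal sketch of the portable spelling; `FakeVec` and `Pointer` are hypothetical stand-ins, not blast types.

```cpp
#include <cstddef>
#include <type_traits>

// Hypothetical stand-in for a SIMD vector wrapper; not blast's SimdVec.
template <typename T>
struct FakeVec
{
    using IntrinsicType = T[4];
    static constexpr std::size_t size() noexcept { return 4; }
};

// Hypothetical pointer-like class mirroring the alias pattern in the diff.
template <typename T>
struct Pointer
{
    using SimdVecType = FakeVec<std::remove_cv_t<T>>;

    // Portable spelling: explicit `typename` before the dependent type name.
    // C++20 permits dropping it here, but older compilers reject that form.
    using IntrinsicType = typename SimdVecType::IntrinsicType;

    static constexpr std::size_t SS = SimdVecType::size();
};

static_assert(std::is_same_v<Pointer<double const>::IntrinsicType, double[4]>);
static_assert(Pointer<float>::SS == 4);
```

Keeping the keyword costs nothing on newer compilers, so it may be one way to widen compiler support later.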

run: |
sudo apt-get update
sudo apt-get install -y wget gnupg lsb-release
wget https://apt.llvm.org/llvm.sh
chmod +x llvm.sh
sudo ./llvm.sh 18

- name: Install Blaze
run: |
@@ -36,6 +46,13 @@ jobs:
-DCMAKE_INSTALL_PREFIX=/usr/local/ . \
&& sudo make install

- name: Install xsimd
run: |
git clone https://github.com/xtensor-stack/xsimd.git
cd xsimd
cmake -B build -DCMAKE_INSTALL_PREFIX=/usr/local/ .
sudo cmake --build build --target install

- name: Install GTest
run: |
git clone https://github.com/google/googletest.git
@@ -47,7 +64,7 @@ jobs:
run: |
cmake -B ${{github.workspace}}/build \
-DCMAKE_BUILD_TYPE=${{env.BUILD_TYPE}} \
-DCMAKE_CXX_COMPILER=clang++ \
-DCMAKE_CXX_COMPILER=clang++-18 \
Collaborator

Also here

Owner Author

Answered above

-DBLAST_WITH_BENCHMARK=ON \
-DBLAST_WITH_TEST=ON

@@ -60,4 +77,3 @@ jobs:
# Execute tests defined by the CMake configuration.
# See https://cmake.org/cmake/help/latest/manual/ctest.1.html for more detail
run: ctest -C ${{env.BUILD_TYPE}}

2 changes: 2 additions & 0 deletions CMakeLists.txt
@@ -25,6 +25,7 @@ set(CMAKE_DEBUG_POSTFIX d)

find_package(Boost REQUIRED COMPONENTS exception)
find_package(blaze REQUIRED)
find_package(xsimd REQUIRED)

add_library(blast INTERFACE)

@@ -35,6 +36,7 @@ target_include_directories(blast INTERFACE

target_link_libraries(blast
INTERFACE blaze::blaze
INTERFACE xsimd
)

target_compile_options(blast
3 changes: 1 addition & 2 deletions Dockerfile
@@ -56,7 +56,7 @@ ENV PKG_CONFIG_PATH=/usr/local/lib
RUN mkdir -p blast/build && cd blast/build \
&& cmake -DCMAKE_BUILD_TYPE=RelWithDebInfo \
-DCMAKE_CXX_COMPILER="clang++-15" \
-DCMAKE_CXX_FLAGS="-march=native -mfma -mavx -mavx2 -msse4 -fno-math-errno" \
-DCMAKE_CXX_FLAGS="-march=native -mfma -mavx -mavx2 -msse4 -fno-math-errno -DXSIMD_DEFAULT_ARCH='fma3<avx2>'" \
Collaborator

It would be nice to detect this automatically (though that may be out of scope for this pull request).

Owner Author

I believe xsimd does automatically deduce the best possible architecture if you don't specify one. For now I made it explicit because currently we only support avx2.
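
As a rough illustration of what compile-time detection could look like, the sketch below reports which of the two configurations named in this PR the current compiler flags enable. It relies only on the predefined GCC/Clang macros `__AVX2__` and `__FMA__`. It does not use xsimd and is not meant to describe xsimd's own mechanism; `detectedArch` is a hypothetical helper.

```cpp
#include <string_view>

// Sketch only: map the compiler's predefined feature macros to the
// architecture names used in this PR. `detectedArch` is hypothetical.
constexpr std::string_view detectedArch() noexcept
{
#if defined(__AVX2__) && defined(__FMA__)
    return "fma3<avx2>";   // e.g. built with -mavx2 -mfma
#elif defined(__AVX2__)
    return "avx2";         // AVX2 without FMA
#else
    return "unsupported";  // neither configuration is enabled
#endif
}
```

The result depends entirely on the flags passed to the compiler, so with `-march=native` it reflects the build machine rather than the machine the binary later runs on.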

-DCMAKE_CXX_FLAGS_RELEASE="-O3 -g -DNDEBUG -ffast-math" .. \
-DBLAST_WITH_TEST=ON \
-DBLAST_WITH_BENCHMARK=ON \
@@ -79,4 +79,3 @@ CMD mkdir -p blast/bench_result/data \
&& mkdir -p blast/bench_result/image \
&& cd blast \
&& make -j 1 bench_result/image/dgemm_performance.png bench_result/image/dgemm_performance_ratio.png

3 changes: 0 additions & 3 deletions include/blast/math/RegisterMatrix.hpp
@@ -7,6 +7,3 @@

#include <blast/math/register_matrix/RegisterMatrix.hpp>
#include <blast/math/register_matrix/DynamicRegisterMatrix.hpp>
#include <blast/math/register_matrix/double_4_4_4.hpp>
#include <blast/math/register_matrix/double_8_4_4.hpp>
#include <blast/math/register_matrix/double_12_4_4.hpp>
10 changes: 5 additions & 5 deletions include/blast/math/RowColumnVectorPointer.hpp
@@ -15,7 +15,7 @@
#pragma once

#include <blast/math/typetraits/MatrixPointer.hpp>
#include <blast/math/simd/Simd.hpp>
#include <blast/math/Simd.hpp>
#include <blast/math/TransposeFlag.hpp>
#include <blast/util/Assert.hpp>

@@ -28,9 +28,9 @@ namespace blast
{
public:
using ElementType = typename MP::ElementType;
using IntrinsicType = typename Simd<std::remove_cv_t<ElementType>>::IntrinsicType;
using MaskType = typename Simd<std::remove_cv_t<ElementType>>::MaskType;
using SimdVecType = SimdVec<std::remove_cv_t<ElementType>>;
using IntrinsicType = SimdVecType::IntrinsicType;
using MaskType = SimdMask<std::remove_cv_t<ElementType>>;

static TransposeFlag constexpr transposeFlag = TF;
static bool constexpr aligned = MP::aligned;
@@ -176,7 +176,7 @@ namespace blast


private:
static size_t constexpr SS = Simd<std::remove_cv_t<ElementType>>::size;
static size_t constexpr SS = SimdVecType::size();

MP ptr_;
};
@@ -244,4 +244,4 @@ namespace blast
{
return RowColumnVectorPointer<MP, rowVector> {std::move(p)};
}
}
}
12 changes: 12 additions & 0 deletions include/blast/math/Simd.hpp
@@ -0,0 +1,12 @@
// Copyright (c) 2019-2020 Mikhail Katliar All rights reserved.
// Use of this source code is governed by a BSD-style
// license that can be found in the LICENSE file.

#pragma once

#include <blast/math/simd/SimdVec.hpp>
#include <blast/math/simd/SimdMask.hpp>
#include <blast/math/simd/SimdIndex.hpp>
#include <blast/math/simd/SimdSize.hpp>
#include <blast/math/simd/IsSimdAligned.hpp>
#include <blast/math/simd/RegisterCapacity.hpp>
17 changes: 7 additions & 10 deletions include/blast/math/dense/DynamicMatrixPointer.hpp
@@ -8,24 +8,21 @@
#include <blast/math/TransposeFlag.hpp>
#include <blast/math/StorageOrder.hpp>
#include <blast/math/TypeTraits.hpp>
#include <blast/math/simd/Simd.hpp>
#include <blast/math/simd/IsSimdAligned.hpp>
#include <blast/math/Simd.hpp>
#include <blast/util/Assert.hpp>

#include <blaze/util/Exception.h>



namespace blast
{
template <typename T, bool SO, bool AF, bool PF>
class DynamicMatrixPointer
{
public:
using ElementType = T;
using IntrinsicType = typename Simd<std::remove_cv_t<T>>::IntrinsicType;
using MaskType = typename Simd<std::remove_cv_t<T>>::MaskType;
using SimdVecType = SimdVec<std::remove_cv_t<T>>;
using IntrinsicType = SimdVecType::IntrinsicType;
using MaskType = SimdMask<std::remove_cv_t<T>>;
Collaborator

Just a question: I guess you remove const and volatile qualifiers here, because otherwise you run into problems with const correctness?

Owner Author

@mkatliar, Aug 5, 2024

Yes, without it I used to run into compiler errors. We can try removing it and see what happens.
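
One plausible source of such errors, sketched below under assumptions: a trait that is specialized only for unqualified element types cannot be instantiated with `double const`. The thread does not record the actual errors, and `LaneCount` and `ConstAwarePointer` are hypothetical names, not part of blast or xsimd.

```cpp
#include <cstddef>
#include <type_traits>

// Hypothetical trait: the primary template is declared but never defined,
// so only the explicit specializations below are usable.
template <typename T>
struct LaneCount;

template <>
struct LaneCount<double> : std::integral_constant<std::size_t, 4> {};

template <>
struct LaneCount<float> : std::integral_constant<std::size_t, 8> {};

// Pointer-like class that may be instantiated with cv-qualified element types.
template <typename T>
struct ConstAwarePointer
{
    // Without remove_cv_t, LaneCount<double const> would name an incomplete
    // type and this line would fail to compile.
    static constexpr std::size_t SS = LaneCount<std::remove_cv_t<T>>::value;
};

static_assert(ConstAwarePointer<double const>::SS == 4);
static_assert(ConstAwarePointer<float volatile>::SS == 8);
```

Stripping the qualifiers at the point of use keeps the trait itself small, at the cost of repeating `std::remove_cv_t` in every pointer class.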


static bool constexpr storageOrder = SO;
static bool constexpr aligned = AF;
@@ -83,9 +80,9 @@ namespace blast
}


IntrinsicType broadcast() const noexcept
SimdVecType broadcast() const noexcept
{
return blast::broadcast<SS>(ptr_);
return *ptr_;
}


@@ -187,7 +184,7 @@ namespace blast


private:
static size_t constexpr SS = Simd<std::remove_cv_t<T>>::size;
static size_t constexpr SS = SimdVecType::size();
Collaborator

I would consider making this a function instead.

Owner Author

What exactly? Making SS a function?

static TransposeFlag constexpr majorOrientation = SO == columnMajor ? columnVector : rowVector;


@@ -253,4 +250,4 @@ namespace blast
{
return {p, spacing};
}
}
}
26 changes: 13 additions & 13 deletions include/blast/math/dense/DynamicVectorPointer.hpp
@@ -4,8 +4,7 @@

#pragma once


#include <blast/math/simd/Simd.hpp>
#include <blast/math/Simd.hpp>
#include <blast/util/Assert.hpp>

#include <type_traits>
@@ -18,9 +17,9 @@ namespace blast
{
public:
using ElementType = T;
using IntrinsicType = typename Simd<std::remove_cv_t<T>>::IntrinsicType;
using MaskType = typename Simd<std::remove_cv_t<T>>::MaskType;
using SimdVecType = SimdVec<std::remove_cv_t<T>>;
using IntrinsicType = SimdVecType::IntrinsicType;
using MaskType = SimdMask<std::remove_cv_t<T>>;

static bool constexpr transposeFlag = TF;
static bool constexpr aligned = AF;
@@ -40,7 +39,7 @@ namespace blast
, spacing_ {spacing}
{
BLAST_USER_ASSERT(spacing > 0, "Vector element spacing must be positive.");
BLAST_USER_ASSERT(!AF || reinterpret_cast<ptrdiff_t>(ptr) % (SS * sizeof(T)) == 0, "Pointer is not aligned");
BLAST_USER_ASSERT(!AF || isSimdAligned(ptr), "Pointer is not aligned");
Collaborator

Nice! self-documenting code
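
To illustrate the point, the removed expression `reinterpret_cast<ptrdiff_t>(ptr) % (SS * sizeof(T)) == 0` can be wrapped in a named helper. The sketch below is a guess at the general shape, not the actual `isSimdAligned` from blast; `isAlignedTo` and its explicit `alignment` parameter are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper, not blast's isSimdAligned: returns true when the
// address held by `ptr` is a multiple of `alignment` bytes.
template <typename T>
bool isAlignedTo(T const * ptr, std::size_t alignment) noexcept
{
    // uintptr_t is an unsigned integer type intended for holding addresses.
    return reinterpret_cast<std::uintptr_t>(ptr) % alignment == 0;
}
```

For a buffer declared as `alignas(32) double buf[8]`, `isAlignedTo(buf, 32)` holds, while `isAlignedTo(buf + 1, 32)` does not, since the second address is offset by 8 bytes.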

}


@@ -51,6 +50,7 @@ namespace blast
SimdVecType load() const noexcept
{
// Non-optimized
// TODO: use gather()
IntrinsicType v;
for (size_t i = 0; i < SS; ++i)
v[i] = ptr_[spacing_ * i];
@@ -62,18 +62,18 @@ namespace blast
SimdVecType load(MaskType mask) const noexcept
{
// Non-optimized
IntrinsicType v = blast::setzero<std::remove_cv_t<ElementType>, SS>();
// TODO: use gather()
T v[SS];
for (size_t i = 0; i < SS; ++i)
if (mask[i])
v[i] = ptr_[spacing_ * i];
v[i] = mask[i] ? ptr_[spacing_ * i] : T {};

return SimdVecType {v};
return SimdVecType {v, false};
Collaborator

This was surprising to me. What is the false doing here?

Owner Author

It specifies whether v is aligned or not.
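
The new `load(MaskType)` first collects the elements into a plain temporary array `T v[SS]`, which carries no SIMD alignment guarantee. Per the reply above, that is why the buffer is passed with `false`. A scalar sketch of the gathering step follows, using `std::array` in place of the SIMD types; `maskedStridedLoad` is a hypothetical name.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical scalar version of the masked strided load from the diff:
// lanes whose mask entry is set are read from ptr[spacing * i],
// all other lanes are filled with a value-initialized T.
template <typename T, std::size_t SS>
std::array<T, SS> maskedStridedLoad(
    T const * ptr, std::size_t spacing, std::array<bool, SS> const& mask) noexcept
{
    std::array<T, SS> v {};
    for (std::size_t i = 0; i < SS; ++i)
        v[i] = mask[i] ? ptr[spacing * i] : T {};

    return v;
}
```

Masked-off lanes are never dereferenced, so the mask can also guard against reading past the end of the underlying storage.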

}


IntrinsicType broadcast() const noexcept
SimdVecType broadcast() const noexcept
{
return blast::broadcast<SS>(ptr_);
return *ptr_;
}


@@ -172,7 +172,7 @@ namespace blast


private:
static size_t constexpr SS = Simd<std::remove_cv_t<T>>::size;
static size_t constexpr SS = SimdVecType::size();


T * ptrOffset(ptrdiff_t i) const noexcept
@@ -191,4 +191,4 @@ namespace blast
{
return p.trans();
}
}
}
3 changes: 1 addition & 2 deletions include/blast/math/dense/Iamax.hpp
@@ -17,7 +17,6 @@

#include <blast/util/Exception.hpp>
#include <blast/math/dense/VectorPointer.hpp>
#include <blast/math/simd/Avx256.hpp>

#include <cmath>
#include <tuple>
@@ -177,4 +176,4 @@ namespace blast

return iamax(N, ptr(*x));
}
}
}
10 changes: 5 additions & 5 deletions include/blast/math/dense/StaticMatrixPointer.hpp
@@ -6,8 +6,8 @@

#include <blast/math/StorageOrder.hpp>
#include <blast/math/TransposeFlag.hpp>
#include <blast/math/simd/Simd.hpp>
#include <blast/math/simd/SimdVec.hpp>
#include <blast/math/simd/SimdMask.hpp>
#include <blast/math/simd/IsSimdAligned.hpp>
#include <blast/math/TypeTraits.hpp>
#include <blast/util/Assert.hpp>
@@ -24,9 +24,9 @@ namespace blast
{
public:
using ElementType = T;
using IntrinsicType = typename Simd<std::remove_cv_t<T>>::IntrinsicType;
using MaskType = typename Simd<std::remove_cv_t<T>>::MaskType;
using SimdVecType = SimdVec<std::remove_cv_t<T>>;
using IntrinsicType = SimdVecType::IntrinsicType;
using MaskType = SimdMask<std::remove_cv_t<T>>;

static bool constexpr storageOrder = SO;
static bool constexpr aligned = AF;
@@ -188,7 +188,7 @@ namespace blast


private:
static size_t constexpr SS = Simd<std::remove_cv_t<T>>::size;
static size_t constexpr SS = SimdVecType::size();
static TransposeFlag constexpr majorOrientation = SO == columnMajor ? columnVector : rowVector;


@@ -225,4 +225,4 @@ namespace blast
{
return trans(ptr<AF>(m.operand(), j, i));
}
}
}