gdcmSerieHelper.h not found on Ubuntu 18.04 despite ITKGDCM module enabled #1493

Closed
KrisThielemans opened this issue Dec 15, 2019 · 8 comments
Labels
status:Backlog (Postponed without a fixed deadline)
type:Bug (Inconsistencies or issues which will cause an incorrect result under some or all circumstances)
type:Compiler (Compiler support or related warnings)

Comments

@KrisThielemans
Contributor

Description

Installing ITK 5.0.1 via conda leads to a compilation problem when using gdcm IO despite a system gdcm being present. See conda-forge/libitk-feedstock#42

This seems to say that in whatever way ITK was built for conda, it doesn't find the system GDCM. Nevertheless, /home/sirfuser/miniconda3/lib/cmake/ITK-5.0/ITKConfig.cmake sets ITK_MODULES_ENABLED to include ITKGDCM.
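One way to confirm what the installed config advertises is to read the ITK_MODULES_ENABLED line directly. The sketch below assumes ITKConfig.cmake stores the modules as a semicolon-separated list on a single set(ITK_MODULES_ENABLED "...") line; the helper name itk_module_enabled is hypothetical and not part of ITK.

```shell
# Succeed (exit status 0) if <module> is listed in ITK_MODULES_ENABLED
# inside the given ITKConfig.cmake file.
# Assumption: the list sits on one line, separated by semicolons.
# Usage: itk_module_enabled <path-to-ITKConfig.cmake> <module>
itk_module_enabled() {
    config="$1"
    module="$2"
    # Split the matching line on ';', '"', ' ' and ')' so that every module
    # name ends up on its own line, then look for an exact whole-line match.
    grep 'ITK_MODULES_ENABLED' "$config" \
        | tr ';" )' '\n\n\n\n' \
        | grep -qxF -- "$module"
}
```

For the setup described here, a call could look like `itk_module_enabled /home/sirfuser/miniconda3/lib/cmake/ITK-5.0/ITKConfig.cmake ITKGDCM && echo enabled`. A positive result only says the module is advertised, not that its headers were installed.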

Steps to Reproduce

conda install libitk-dev
sudo apt install libgdcm2-dev
git clone https://github.com/UCL/STIR.git
mkdir build; cd build; cmake ../STIR; make

Actual behavior

output

In file included from /home/sirfuser/devel/STIR/src/IO/ITKImageInputFileFormat.cxx:41:0:
/home/sirfuser/miniconda3/include/ITK-5.0/itkGDCMSeriesFileNames.h:25:10: fatal error: gdcmSerieHelper.h: No such file or directory
 #include "gdcmSerieHelper.h"
          ^~~~~~~~~~~~~~~~~~~
compilation terminated.

This is despite /usr/include/gdcm-2.8/gdcmSerieHelper.h existing.
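To see which include roots actually contain the header, something like the following can help. It is only a sketch: find_header is a hypothetical helper, and which roots to pass (the system include directory, the conda include directory) is up to the caller.

```shell
# Print the full path of every copy of <header-name> found under the
# given include roots.
# Usage: find_header <header-name> <root> [<root> ...]
find_header() {
    header="$1"
    shift
    for root in "$@"; do
        # Skip roots that do not exist (e.g. no conda installation there).
        [ -d "$root" ] || continue
        find "$root" -type f -name "$header" 2>/dev/null
    done
}
```

Example: `find_header gdcmSerieHelper.h /usr/include /home/sirfuser/miniconda3/include`. If the header shows up only under /usr/include, the conda ITK headers are referring to a file that the conda package itself does not ship.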

Reproducibility

always

Versions

5.0.1

Environment

Ubuntu 18.04, cmake version 3.13.1

Additional Information

@KrisThielemans added the type:Bug label Dec 15, 2019
@stale

stale bot commented Apr 13, 2020

This issue has been automatically marked as stale because it has not had recent activity. Thank you for your contributions.

stale bot added the status:Backlog label Apr 13, 2020
@thewtex
Member

thewtex commented Apr 14, 2020

ITK provides GDCM by default, and a quick glance at the conda libitk recipe indicates that this GDCM is used in the conda recipe.

Where is the relevant find_package(ITK [...] of the dependent project?

stale bot removed the status:Backlog label Apr 14, 2020
@thewtex added the type:Compiler label Apr 14, 2020
@blowekamp
Member

There are some known issues with Conda not fixing up the paths in the CMake files correctly, but that is only known to happen on Windows:
conda-forge/libitk-feedstock#43

It may be worth grepping around the installation directory, as done in the above issue, to see if something similar is occurring on your platform.
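A minimal sketch of that kind of search is below. The helper name grep_cmake_files is hypothetical; it simply lists every line in the installed *.cmake files that mentions a given pattern, so that stale or build-time paths stand out.

```shell
# Print file:line:text for every line matching <pattern> in the *.cmake
# files below <cmake-dir>. Other file types are ignored.
# Usage: grep_cmake_files <cmake-dir> <pattern>
grep_cmake_files() {
    cmake_dir="$1"
    pattern="$2"
    # /dev/null forces grep to print file names even for a single match file.
    find "$cmake_dir" -type f -name '*.cmake' \
        -exec grep -n -- "$pattern" /dev/null {} +
}
```

For example, `grep_cmake_files /home/sirfuser/miniconda3/lib/cmake/ITK-5.0 gdcm` would show where the installed ITK configuration refers to GDCM, and any include directory listed there can then be checked for existence.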

@KrisThielemans
Contributor Author

thanks for picking this up. I'll try to reproduce this with current conda on Ubuntu 18.04 using our VM https://doi.org/10.5281/zenodo.3552234

Our find_package is quite simple :-)

   find_package(ITK QUIET)
   if (ITK_FOUND)
      include(${ITK_USE_FILE})
   endif()

We use target_link_libraries here but this is a compilation error, so I guess that line is irrelevant.

@blowekamp
Member

It looks like you are including "itkGDCMSeriesFileNames.h" in the project, which in turn includes "gdcmSerieHelper.h". This does not follow best practices of encapsulation; PR #1768 addressed that issue.

Conda-forge has two packages, libitk and itk-devel. I believe the GDCM "Headers" component used by GDCM is missing from them. This PR should address it:
conda-forge/libitk-feedstock#44

@KrisThielemans
Contributor Author

@blowekamp both PRs look good to me. thanks a lot!

@rijobro, maybe we can try the new conda itk-devel package once that PR is merged.

@stale

stale bot commented Aug 13, 2020

This issue has been automatically marked as stale because it has not had recent activity. Thank you for your contributions.

stale bot added the status:Backlog label Aug 13, 2020
@dzenanz
Member

dzenanz commented Aug 18, 2020

That PR is merged. Reopen if it didn't fix the issue.

@dzenanz closed this as completed Aug 18, 2020
phcerdan pushed a commit to phcerdan/ITK that referenced this issue Jan 10, 2022
Run the UpdateFromUpstream.sh script to extract upstream Eigen3
using the following shell commands.

$ git archive --prefix=upstream-eigen3/ 332b838c -- 
Eigen/Cholesky
Eigen/CholmodSupport
Eigen/Core
Eigen/Dense
Eigen/Eigen
Eigen/Eigenvalues
Eigen/Geometry
Eigen/Householder
Eigen/IterativeLinearSolvers
Eigen/Jacobi
Eigen/LU
Eigen/MetisSupport
Eigen/OrderingMethods
Eigen/PardisoSupport
Eigen/PaStiXSupport
Eigen/QR
Eigen/QtAlignedMalloc
Eigen/Sparse
Eigen/SparseCholesky
Eigen/SparseCore
Eigen/SparseLU
Eigen/SparseQR
Eigen/SPQRSupport
Eigen/StdDeque
Eigen/StdList
Eigen/StdVector
Eigen/SuperLUSupport
Eigen/SVD
Eigen/UmfPackSupport
Eigen/src
COPYING.BSD
COPYING.MINPACK
COPYING.MPL2
COPYING.README
README.md
README.kitware.md
CMakeLists.txt
cmake/FindStandardMathLibrary.cmake
cmake/Eigen3Config.cmake.in
.gitattributes
 | tar x
$ git shortlog --perl-regexp --author='^((?!Kitware Robot).*)$' --no-merges --abbrev=8 --format='%h %s' b51eab5c..332b838c

Aaron Franke (1):
      5c22c7a7 Make file formatting comply with POSIX and Unix standards

Abhijit Kundu (3):
      9bc0a357 Fixed nested angle barckets >> issue when compiling with cuda 8
      4343db84 updated warning number for nvcc relase 8 (V8.0.61) for the stupid warning message  'calling a __host__ function from a __host__ __device__ function is not allowed'.
      6d991a95 bug #1464 : Fixes construction of EulerAngles from 3D vector expression.

Adam Kallai (1):
      277d3690 win: include intrin header in Windows on ARM

Adam Shapiro (1):
      2ac0b787 Fixed sparse conservativeResize() when both num cols and rows decreased.

Akshay Naresh Modi (1):
      bcc0e9e1 Add numeric_limits min and max for bool

Alberto Luaces (1):
      c694be12 Fixed Tensor documentation formatting.

Alessio M (1):
      96cd1ff7 Fixed: - access violation when initializing 0x0 matrices - exception can be thrown during stack unwind while comma-initializing a matrix if eigen_assert if configured to throw

Alex Druinsky (1):
      b0fe1421 Fix vectorized reductions for Eigen::half

Alexander Grund (3):
      a967fadb Make relative path variables of type STRING
      cf0b5b03 Remove code checking for CMake < 3.5
      929bc0e1 Fix alias violation in BFloat16

Alexander Karatarakis (2):
      517294d6 Make DenseStorage<> trivially_copyable
      c334eece _DerType -> DerivativeType as underscore-followed-by-caps is a reserved identifier

Alexander Neumann (2):
      dd58462e fixed inlining issue with clang-cl on visual studio (grafted from 7962ac1a5855e8b7a60d5d90e61365b71f5501a5 )
      52721068 remove semi triggering -Wextra-semi-stmt

Alexander Turkin (1):
      60faa9f8 user-defined copy operations removed in favor of compiler-generated ones

Alexey Frunze (6):
      3875fb05 Add support for MIPS SIMD (MSA)
      1f523e73 Add MIPS changes missing from previous merge.
      7b91c112 bug #1578: Improve prefetching in matrix multiplication on MIPS.
      ec38f07b bug #1595: Don't use C++11's std::isnan() in MIPS/MSA packet math.
      050bcf61 bug #1584: Improve random (avoid undefined behavior).
      edeee16a Fix build failures in matrix_power and matrix_exponential tests.

Allan Leal (1):
      37ccb869 Update NullaryFunctors.h

Andrea Bocci (1):
      f7124b3e Extend CUDA support to matrix inversion and selfadjointeigensolver

Andreas Krebbel (2):
      1e74f93d Fix some packet-functions in the IBM ZVector packet-math.
      23469c3c ZVector: Move alignas qualifier to come first

Androbin42 (2):
      3f7fb5a6 Make eigen_monitor_perf.sh more robust
      95ecb2b5 Make buildtests.in more robust

Andy May (1):
      ae33e866 Fix compilation with PGI version 19

Angelos Mantzaflaris (5):
      aeba0d86 fix two warnings(unused typedef, unused variable) and a typo (grafted from a9aa3bcf50d55b63c8adb493a06c903ec34251c6 )
      8c24723a typo UIntPtr (grafted from b6f04a2dd4d68fe1858524709813a5df5b9a085b )
      e8a6aa51 1. Add explicit template to abs2 (resolves deduction for some arithmetic types) 2. Avoid signed-unsigned conversion in comparison (warning in case Scalar is unsigned) (grafted from 4086187e49760d4bde72750dfa20ae9451263417 )
      18de9232 use numext::abs (grafted from 0a08d4c60b652d1f24b2fa062c818c4b93890c59 )
      76946849 Remove superfluous const's (can cause warnings on some Intel compilers) (grafted from e236d3443c79f38aa721d95e64c275abbb5df10f )

Anshul Jaiswal (6):
      fab51d13 Updated Eigen_Colamd.h, namespacing macros ALIVE & DEAD as COLAMD_ALIVE & COLAMD_DEAD to prevent conflicts with other libraries / code.
      0a6b553e Eigen_Colamd.h edited online with Bitbucket replacing constant #defines with const definitions
      39f30923 Eigen_Colamd.h edited replacing macros with constexprs and functions.
      283558fa Ordering.h edited to fix dependencies on Eigen_Colamd.h
      a4d1a6cd Eigen_Colamd.h updated to replace constexpr with consts and enums.
      c1a67cb5 Update ConfigureVectorization.h to not optimize fp16 routines when compiling with cuda.

Antonio Sanchez (168):
      9dda5eb7 Missing struct definition in NumTraits
      8e875719 Replace norm() with squaredNorm() to address integer overflows
      c854e189 Fixed commainitializer test.
      a7d2552a Remove HasCast and fix packetmath cast tests.
      03ebdf6a Added missing NEON pcasts, update packetmath tests.
      ff4e7a08 Add missing Packet2l/Packet2ul ops for NEON.
      7222f0b6 Fix packetmath_1 float tests for arm/aarch64.
      145e5151 Fix denormal check pre c++11.
      9cb8771e Fix tensor casts for large packets and casts to/from std::complex
      d5a0d894 Fix alignedbox 32-bit precision test failure.
      d9f0d9eb Fix missing `pfirst<Packet16b>` for MSVC.
      69614689  Address issues with `openglsupport` test.
      852513e7 Disable testing of OpenGL by default.
      bb69a8db Explicit casts of S -> std::complex<T>
      8e9cc5b1 Eliminate double-promotion warnings.
      117a4c06 Fix missing `EIGEN_CONSTEXPR` pop_macro in `Half`.
      60218829 EOF newline added to InverseSize4.
      3669498f Fix rule-of-3 for the Tensor module.
      41d5d533 Initialize primitives to fix -Wuninitialized-const-reference.
      17268b15 Add bit_cast for half/bfloat to/from uint16_t, fix TensorRandom
      a8fdcae5 Fix sparse_extra_3, disable counting temporaries for testing DynamicSparseMatrix.
      fd1dcb6b Fixes duplicate symbol when building blas
      4cf01d2c Update AVX half packets, disable test.
      38abf2be Fix Half NaN definition and test.
      a3b300f1 Implement missing AVX half ops.
      22f67b59 Fix boolean float conversion and product warnings.
      89f90b58 AVX512 missing ops.
      1992af3d Fix #2077, `EIGEN_CONSTEXPR` in `Half`.
      ddd48b24 Implement CUDA __shfl* for Eigen::half
      2627e2f2 Fix neon cmp* functions for bf16.
      70fbcf82 Fix typo in `F32MaskToBf16Mask`.
      eb4d4ae0 Include chrono in main for c++11.
      9ee9ac81 Fix shfl* macros for CUDA/HIP
      e2f21465 Special function implementations for half/bfloat16 packets.
      2dbac2f9 Fix bad NEON fp16 check
      5ec49074 Clean up `#if`s in GPU PacketPath.
      655c3a40 Add specialization for compile-time zero-sized dense assignment.
      634bd79b Fix unused warning on new `dense_assignment_loop` impl.
      8cfe0db1 Fix host/device calls for __half.
      82c0c18a Remove private access of std::deque::_M_impl.
      e82722a4 Fix MSVC SSE casts.
      c6efc4e0 Replace M_LOG2E and M_LN2 with custom macros.
      8c9976d7 Fix more SSE/AVX packet conversions for peven.
      839aa505 Fix typo in AVX512 packet math.
      55967f87 Fix NEON pmax<PropagateNumbers,Packet4bf>.
      5dc2fbab Fix implicit cast to double.
      070d303d Add CUDA complex sqrt.
      bb1de9db Fix Ref Stride checks.
      166fcdec Allow CwiseUnaryView to be used on device.
      52d1dd97 Fix Ref initialization.
      8d9cfba7 Fix rand test for MSVC.
      f149e0eb Fix MSVC complex sqrt and packetmath test.
      587fd6ab Only specialize complex `sqrt_impl` for CUDA if not MSVC.
      3daf92c7 Transform::computeScalingRotation flush determinant to +/- 1.
      20440849 Remove TODO from Transform::computeScaleRotation()
      352f1422 Remove `inf` local variable.
      bde67416 Improved std::complex sqrt and rsqrt.
      d5b79811 Fix signed-unsigned comparison.
      25d8498f Fix stable_norm_1 test.
      b2126fd6 Fix pfrexp/pldexp for half.
      f19bcffe Specialize std::complex operators for use on GPU device.
      f0e46ed5 Fix pow and other cwise ops for half/bfloat16.
      e0d13ead Replace std::isnan with numext::isnan for c++03
      4c42d5ee Eliminate implicit conversion warning in test/array_cwise.cpp
      3f4684f8 Include `<cstdint>` in one place, remove custom typedefs
      1615a279 Fix altivec packetmath.
      fb4548e2 Implement bit_* for device.
      56c8b14d Eliminate implicit conversions from float to double.
      f85038b7 Fix excessive GEBP register spilling for 32-bit NEON.
      abcde69a Disable vectorized pow for half/bfloat16.
      66841ea0 Enable bdcsvd on host.
      4cb563a0 Fix ldexp implementations.
      9fde9cce Adjust bounds for pexp_float/double
      90ee821c Use vrsqrts for rsqrt Newton iterations.
      7ff0b7a9 Updated pfrexp implementation.
      0845df7f Fix uninitialized warning on AVX.
      5f9cfb25 Add missing adolc isinf/isnan.
      db5691ff Fix some CUDA warnings.
      aba39982 Fix check if GPU compile phase for std::hash
      6cf0ab5e Disable fast psqrt for NEON.
      119763cf Eliminate CMake FindPackageHandleStandardArgs warnings.
      5908aeea Fix CUDA device new and delete, and add test.
      a31effc3 Add `invoke_result` and eliminate `result_of` warnings for C++17+.
      ecb7b19d Disable new/delete test for HIP
      5529db75 Fix SSE/NEON pfloor/pceil for saturated values.
      e19829c3 Fix floor/ceil for NEON fp16.
      29ebd84c Fix NEON sqrt for 32-bit, add prsqrt.
      c65c2b31 Make half/bfloat16 constructor take inputs by value, fix powerpc test.
      1e0c7d4f Add print for SSE/NEON, use NEON rounding intrinsics if available.
      e72dfeb8 Fix rint for SSE/NEON.
      82d61af3 Fix rint SSE/NEON again, using optimization barrier.
      2468253c Define EIGEN_CPLUSPLUS and replace most __cplusplus checks.
      60452431 Revert stack allocation limit change that crept in.
      1296abdf Fix non-trivial Half constructor for CUDA.
      94327dbf Fix typo: DEVICE -> GPU
      853a5c4b Fix ambiguous call to CUDA __half constructor.
      543e34ab Re-implement move assignments.
      d098c4d6 Disable EIGEN_OPTIMIZATION_BARRIER for PPC clang.
      b2711107 Bump up rand histogram threshold.
      14487ed1 Add increment/decrement operators to Eigen::half.
      d24f9f9b Fix NVCC+ICC issues.
      14b7ebea Fix numext::round pre c++11 for large inputs.
      f612df27 Add fmod(half, half).
      75ce9cd2 Augment NumTraits with min/max_exponent().
      8dfe1029 Augment NumTraits with min/max_exponent() again.
      c3fbc6ce Use singleton pattern for static registered tests.
      5521c65a Eliminate mixingtypes_7 warning.
      87729ea3 Eliminate `round_impl` double-promotion warnings for c++03.
      af1247fb Use Index type in loop over coefficients.
      78ee3d62 Fix CUDA constexpr issues for numeric_limits.
      90187a33 Fix SelfAdjoingEigenSolver (#2191)
      ace7f132 Fix clang tidy warnings in AnnoyingScalar.
      fcb5106c Scaled epsilon the wrong way.
      69adf26a Modify googlehash use to account for namespace issues.
      ab7fe215 Fix ldexp for AVX512 (#2215)
      8830d66c DenseStorage safely copy/swap.
      587a6915 Check existence of BSD random before use.
      a33855f6 Add missing pcmp_lt_or_nan for NEON Packet4bf.
      fc2cc108 Better CUDA complex division.
      da19f7a9 Simplify TensorRandom and remove time-dependence.
      42acbd57 Fix numext::arg return type.
      25424f4c Clean up gpu device properties.
      2947c0cc Restore ABI compatibility for conj with 3.3, fix conflict with boost.
      ee2a8f71 Modify Unary/Binary/TernaryOp evaluators to work for non-class types.
      98cf1e07 Add missing NEON ptranspose implementations.
      4b683b65 Allow custom TENSOR_CONTRACTION_DISPATCH macro.
      b5fc69bd Add ability to permanently enable HIP/CUDA gpu* defines.
      5e75331b Fix checking of version number for mingw.
      2d6eaaf6 Fix placement of permanent GPU defines.
      1374f49f Add missing ppc pcmp_lt_or_nan<Packet8bf>
      ee4e099a Remove pset, replace with ploadu.
      c2c0f6f6 Fix fix<> for gcc-4.9.3.
      a2040ef7 Rewrite balancer to avoid overflows.
      d82d9150 Modify tensor argmin/argmax to always return first occurence.
      b6db0134 Fix inverse nullptr/asan errors for LU.
      8190739f Fix compile issues for gcc 4.8.
      84955d10 Fix Tensor documentation page.
      7571704a Fix CMake directory issues.
      5d37114f Fix explicit default cache size typo.
      c0c7b695 Fix assignment operator issue for latest MSVC+NVCC.
      3dc42eea Enable equality comparisons on GPU.
      237c59a2 Modify scalar pzero, ptrue, pselect, and p<binary> operations to avoid memset.
      bb33880e Fix TriSycl CMake files.
      9a1691a1 Fix cmake warnings, FindPASTIX/FindPTSCOTCH.
      46ecdcd7 Fix MPReal detection and support.
      5b83d3c4 Make inverse 3x3 faster and avoid gcc bug.
      0d890127 Update code snippet for tridiagonalize_inplace.
      f1032255 Add missing PPC packet comparisons.
      aef926ab Renamed shift_left/shift_right to shiftLeft/shiftRight.
      fd100138 Remove unaligned assert tests.
      115591b9 Workaround VS 2017 arg bug.
      7aee90b8 Fix fix<N> when variable templates are not supported.
      c2b6df6e Disable cuda Eigen::half vectorization on host.
      4ef67cbf GCC 4.8 arm EIGEN_OPTIMIZATION_BARRIER fix (#2315).
      07cc3622 Fix EIGEN_OPTIMIZATION_BARRIER for arm-clang.
      f03d3e70 Missing EIGEN_DEVICE_FUNCs to get `gpu_basic` passing with  CUDA 9.
      3395f4e6 Fix tridiagonalization_inplace_selector.
      f046e326 Fix strict aliasing bug causing product_small failure.
      ebd5c6d4 Add -mfma for AVX512DQ tests.
      71498b32 Disable more NVCC warnings.
      7ea4adb5 Disable another device warning
      943ef50a Disable testing of complex compound assignment operators for MSVC.
      05c9d7ce Disable MSVC constant condition warning.
      6b6ba412 Fix min/max nan-propagation for scalar "other".
      f9b2e920 Remove bad "take" impl that causes g++-11 crash.
      18824d10 Fix ZVector build.
      0ab1f8ec Fix broadcasting oob error.
      7e3bc417 Fix tensor broadcast off-by-one error.

Antonio Sánchez (3):
      8719b9c5 Disable test for 32-bit systems (e.g. ARM, i386)
      128eebf0 Revert "add EIGEN_DEVICE_FUNC to EIGEN_MAKE_ALIGNED_OPERATOR_NEW_IF macros (only if not HIPCC)."
      9a663973 Revert "Fix rint for SSE/NEON."

Anuj Rawat (3):
      8c7a6feb Adding lowlevel APIs for optimized RHS packet load in TensorFlow  SpatialConvolution
      ad372084 Removing unused API to fix compile error in TensorFlow due to  AVX512VL, AVX512BW usage
      452371ce Fix for gcc build error when using Eigen headers with AVX512

Artem Belevich (2):
      25230d18 Improve performance of contraction kernels
      8056a05b Undo the block size change.

Ashutosh Sharma (2):
      7eb07da5 loop less ptranspose
      f702792a missing method in packetmath.h void ptranspose(PacketBlock<Packet16uc, 4>& kernel)

Basil Fierz (1):
      624df509 Adds missing EIGEN_STRONG_INLINE to support MSVC properly inlining small vector calculations

Ben Boeckel (1):
      de051677 git: remove executable permissions from header files

Ben Niu (1):
      b8d1857f [MSVC-specific] Define EIGEN_ARCH_x86_64 for native x64 (_M_X64 is defined and _M_ARM64EC is not), and define EIGEN_ARCH_ARM64 for both the native ARM64 (_M_ARM64 is defined) or ARM64EC (_M_ARM64EC is defined). _M_ARM64EC is defined when the code is compiled by MSVC for ARM64EC, a new ARM64 ABI designed to be compatible with x64 application emulation on ARM64. If _M_ARM64EC is defined, _M_X64 and _M_AMD64 are also defined, so x64-specific code (especially intrinsics) is also compiled to ARM64 instructions (compliant with the ARM64EC ABI) for maximum x64 compatibility. Although a majority of x64-specific intrinsics can emulated by ARM64 instructions, it is still a good to simply recompile the native ARM64 code paths to ARM64EC for pure computation tasks, for performance reasons.

Benoit Jacob (5):
      751e097c Use 32 registers on ARM64
      61160a21 ARM prefetch fixes: Implement prefetch on ARM64. Do not clobber cc on ARM32.
      7b1cb8a4 fix the build on 64-bit ARM when NEON is disabled
      a4159dba do not read buffers out of bounds -- load only the 4 bytes we know exist here.  Could also have done a vld1_lane_f32 but doing so here, without the overhead of initializing the unused lane, would have triggered used-of-uninitialized-value errors in tools such as ASan.  Note that this code is sub-optimal before or after this change: we should be reading either 2 or 4 float32 values per load-instruction  (2 for ARM in-order cores with an affinity for 8-byte loads;  4 for ARM out-of-order cores able to dual-issue 16-byte load instructions with arithmetic instructions).  Before or after this patch, we are only loading 4 bytes of useful data here (even if before this patch, we were technically loading 8, only to use only the 4 first).
      cc0c38ac Remove old Clang compiler bug work-arounds. The two LLVM bugs referenced in the comments here have long been fixed. The workarounds were now detrimental because (1) they prevented using fused mul-add on Clang/ARM32 and (2) the unnecessary 'volatile' in 'asm volatile' prevented legitimate reordering by the compiler.

Benoit Steiner (143):
      75c080b1 Added a test to validate memory transfers between host and sycl device
      dff9a049 Optimized the computation of exp, sqrt, ceil anf floor for fp16 on Pascal GPUs
      f2e8b732 Enable the use of AVX512 instruction by default
      004344cf Avoid calling log(0) or 1/0
      a6a3fd07 Made TensorDeviceCuda.h compile on windows
      4349fc64 Created a test to check that the sycl runtime can successfully report errors (like ivision by 0). Small cleanup
      72a45d32 Cleanup
      553f50b2 Added a way to detect errors generated by the opencl device from the host
      7335c492 Fixed the cxx11_tensor_device_sycl test
      37c2c516 Cleaned up the sycl device code
      b5e3285e Test broadcasting on OpenCL devices with 64 bit indexing
      110b7f8d Deleted unnecessary semicolons
      8649e16c Enable EIGEN_HAS_C99_MATH when building with the latest version of Visual Studio
      dc601d79 Added the ability to run test exclusively OpenCL devices that are listed by sycl::device::get_devices().
      ca754caa Only runs the cxx11_tensor_reduction_sycl on devices that are available.
      1c6eafb4 Updated cxx11_tensor_device_sycl to run only on the OpenCL devices available on the host
      a357fe1f Code cleanup
      2d1aec15 Added missing include
      9265ca70 Made it possible to check the state of a sycl device without synchronization
      81151bd4 Fixed merge conflicts
      79a07b89 Fixed a typo
      ed839c58 Enable the use of constant expressions with clang >= 3.6
      f11da1d8 Made the QueueInterface thread safe
      3be1afca Disabled the "remove the call to 'std::abs' since unsigned values cannot be negative" warning introduced in clang 3.5
      7ad37606 Fixed the documentation of Scalar Tensors
      7fe70459 Added missing array_get method for numeric_list
      67b2c41f Avoided unnecessary type conversion
      9fd081cd Fixed compilation warnings
      3011dc94 Call internal::array_prod to compute the total size of the tensor.
      df3da078 Updated customIndices2Array to handle various index sizes.
      e37c2c52 Added an implementation of numeric_list that works with sycl
      f5107010 Udated the Sizes class to work on AMD gpus without requiring a separate implementation
      7cd33df4 Improved formatting
      e633a837 Simplified includes
      fca27350 Added the deallocate_all() method back
      e073de96 Moved the MemCopyFunctor back to TensorSyclDevice since it's the only caller and it makes TensorFlow compile again
      a70393fd Cleaned up forward declarations
      7bfff853 Added support for thread cancellation on Linux
      69ef267a Added the new threadpool cancel method to the threadpool interface based class.
      28ee8f42 Added a Flush method to the RunQueue
      3d59a477 Added a message to ease the detection of platforms on which thread cancellation isn't supported.
      2f5b7a19 Reworked the threadpool cancellation mechanism to not depend on pthread_cancel since it turns out that pthread_cancel doesn't work properly on numerous platforms.
      aafa97f4 Fixed build error with MSVC
      4deafd35 Introduce a portable EIGEN_SLEEP macro.
      76fca221 Use a more accurate timer to sleep on Linux systems.
      8ae68924 Made ThreadPoolInterface::Cancel() an optional functionality
      a432fc10 Moved the choice of ThreadPool to unsupported/Eigen/CXX11/ThreadPool
      3beb180e Don't call EnvThread::OnCancel by default since it doesn't do anything.
      2c2e2184 Avoid using #define since they can conflict with user code
      1324ffef Reenabled the use of constexpr on OpenCL devices
      8910442e Fixed memcpy, memcpyHostToDevice and memcpyDeviceToHost for Sycl.
      9e03dfb4 Made sure EIGEN_HAS_C99_MATH is defined when compiling OpenCL code
      fb1d0138 Include SSE packet instructions when compiling with avx512 enabled.
      923acadf Fixed compilation errors with gcc6 when compiling the AVX512 intrinsics
      27ceb43b Fixed race condition in the tensor_shuffling_sycl test
      548ed30a Added an OpenCL regression test
      c19fe5e9 Added support for libxsmm in the eigen makefiles
      f9eff17e Leverage libxsmm kernels within signle threaded contractions
      b91be602 Automatically include and link libxsmm when present.
      06572285 Simplified the way we link libxsmm
      519d63d3 Added support for libxsmm kernel in multithreaded contractions
      4236aebe Simplified the contraction code`
      d7825b67 Use native AVX512 types instead of Eigen Packets whenever possible.
      354baa0f Avoid using horizontal adds since they're not very efficient.
      3eda02d7 Fixed the sycl benchmarking code
      924600a0 Made sure that enabling avx2 instructions enables avx and sse instructions as well.
      fcd25703 Replaced EIGEN_DEVICE_FUNC template<foo> with template<foo> EIGEN_DEVICE_FUNC to make the code compile with nvcc8.
      2db75c07 fixed the ordering of the template and EIGEN_DEVICE_FUNC keywords in a few more places to get more of the Eigen codebase to compile with nvcc again.
      442e9cbb Silenced several compilation warnings
      8b3cc54c Added a new EIGEN_HAS_INDEXED_VIEW define that set to 0 for older compilers that are known to fail to compile the indexed views (I used the define from the indexed_views.cpp test). Only include the indexed view methods when the compiler supports the code. This makes it possible to use Eigen again in complex code bases such as TensorFlow and older compilers such as gcc 4.8
      1ef30b80 Fixed bug introduced in previous commit
      cfa0568e Size indices are signed.
      34d9fce9 Avoid unecessary float to double conversions.
      554116be Added EIGEN_DEVICE_FUNC to make the prototype of the EigenBase override match that of DenseBase
      b1fc7c9a Added missing EIGEN_DEVICE_FUNC qualifiers.
      ed4dc9d0 Declared the plset, ploadt_ro, and ploaddup packet primitives as usable within a gpu kernel
      193939d6 Added missing EIGEN_DEVICE_FUNC qualifiers to several nullary op methods.
      889c606f Added missing EIGEN_DEVICE_FUNC to the SelfCwise binary ops
      f3e9c428 Added missing EIGEN_DEVICE_FUNC qualifiers
      33443ec2 Added missing EIGEN_DEVICE_FUNC qualifiers
      e993c94f Added missing EIGEN_DEVICE_FUNC qualifiers
      765f4cc4 Deleted extra: EIGEN_DEVICE_FUNC: the QR and Cholesky code isn't ready to run on GPU yet.
      de7b0fde Made the TensorStorage class compile with clang 3.9
      4a7df114 Added missing EIGEN_DEVICE_FUNC
      c36bc2d4 Added missing EIGEN_DEVICE_FUNC qualifiers
      857adbbd Added missing EIGEN_DEVICE_FUNC qualifiers
      c92406d6 Silenced clang compilation warning.
      7b619446 Made most of the packet math primitives usable within CUDA kernel when compiling with clang
      3a3f040b Added missing EIGEN_DEVICE_FUNC qualifiers
      c1d87ec1 Added missing EIGEN_DEVICE_FUNC qualifiers
      1e2d0466 Silenced a couple of compilation warnings
      09ae0e65 Adjusted the EIGEN_DEVICE_FUNC qualifiers to make sure that:   * they're used consistently between the declaration and the definition of a function   * we avoid calling host only methods from host device methods.
      a71943b9 Made the Tensor code compile with clang 3.9
      f0f35911 Made the reduction code compile with cuda-clang
      fd7db52f Silenced compilation warning
      73fcaa31 Gate the sycl specific code under #ifdef sycl
      e2d5d4e7 Restore the old constructors to retain compatibility with non c++11 compilers.
      c1b3d5ec Restored code compatibility with compilers that dont support c++11 Gated more sycl code under #ifdef sycl
      bc050ea9 Fixed compilation error when sycl is enabled.
      63840d46 iGate the sycl specific code under a EIGEN_USE_SYCL define
      e3e34339 Guard the sycl specific code with a #ifdef EIGEN_USE_SYCL
      66c63826 Guard the sycl specific code with EIGEN_USE_SYCL
      a1304b95 Code cleanup
      a5a0c8fa Guard sycl specific code under a EIGEN_USE_SYCL ifdef
      c302ea7b Deleted empty line of code
      068cc097 Preserve file naming conventions
      44993682 Added missing __device__ qualifier
      53725c10 Merged in mehdi_goli/opencl/DataDependancy (pull request PR-10)
      c92faf9d Merged in mehdi_goli/upstr_benoit/HiperbolicOP (pull request PR-13)
      62b4634e Merged in mehdi_goli/upstr_benoit/TensorSYCLImageVolumePatchFixed (pull request PR-14)
      dc524ac7 Fixed compilation warning
      6795512e Improved the randomness of the tensor random generator
      9daed679 Merged in tntnatbry/eigen (pull request PR-319)
      5ac27d5b Avoid relying on cxx11 features when possible.
      575cda76 Fixed syntax errors generated by xcode
      f0b154a4 Code cleanup
      84d7be10 Fixing Argmax that was breaking upstream TensorFlow.
      a4089991 Added support for CUDA 9.0.
      ea4e65bf Fixed compilation with cuda_clang.
      a6d875ba Removed unecesasry #include
      6118c6ff Enable RawAccess to tensor slices whenever possinle. Avoid 32-bit integer overflow in TensorSlicingOp
      522d3ca5 Don't use std::equal_to inside cuda kernels since it's not supported.
      d011d05f Fixed compilation errors.
      10d286f5 Silenced a couple of compilation warnings.
      4be42862 Made the code compile with gcc 5.4.
      c8ea3986 Avoided language features that are only available in cxx11 mode.
      e6d5be81 Fixed syntax of nested templates chevrons to make it compatible with c++97 mode.
      3810ec22 Don't use the auto keyword since it's not always supported properly.
      26239ee5 Use NULL instead of nullptr to avoid adding a cxx11 requirement.
      3d3711f2 Fixed compilation errors.
      501be70b Code cleanup
      59bba77e Fixed compilation errors with gcc 4.7 and 4.8
      ab3f4811 Cleaned up the code and make it compile with more compilers
      43ec0082 Made the kronecker_product test compile again
      6bb3f1b4 Made the tensor_block_access test compile again
      fbb83414 Fixed more compilation errors
      b6f96cf7 Removed dependencies on cxx11 language features from the tensor_block_access test
      41815569 Fixed the tensor contraction code.
      e23c8c29 Use actual types instead of the auto keyword to make the code more portable
      ede580cc Avoid using the auto keyword to make the tensor block access test more portable
      f641cf12 Adding missing at method in Eigen::array
      43d9dd9b Removed more dependencies on cxx11.
      ff8e0ecc Updated one more line of code to avoid making the test dependent on cxx11 features.

Bernardo Bahia Monteiro (1):
      54a0a9c9 Bugfix: conjugate_gradient did not compile with lazy-evaluated RealScalar

Bernhard M. Wiedemann (1):
      b071672e Do not keep latex logs

Bowie Owens (1):
      9842366b Make inclusion of doc sub-directory optional by adjusting options.

Brad King (1):
      880fa43b Add support for CastXML on ARM aarch64

Brian Zhao (1):
      3afb640b Fixing incorrect size in Tensor documentation.

Changming Sun (1):
      b1aa07a8 Fix a bug in TensorIndexList.h

Chip Kerchner (10):
      e5886457 Change Packet8s and Packet8us to use vector commands on Power for pmadd, pmul and psub.
      0784d9f8 Fix sqrt, ldexp and frexp compilation errors.
      1414e221 Fix clang compilation for AltiVec from previous check-in
      9b51dc79 Fixed performance issues for VSX and P10 MMA in general_matrix_matrix_product
      c9d4367f Fix pround and add print
      d59ef212 Fixed performance issues for complex VSX and P10 MMA in gebp_kernel (level 3).
      c24bee61 Fix address of temporary object errors in clang11.
      eebde572 Create the ability to disable the specialized gemm_pack_rhs in Eigen (only PPC) for TensorFlow
      44cc96e1 Get rid of used uninitialized warnings for EIGEN_UNUSED_VARIABLE in gcc11+
      fbdaff81 Invert rows and depth in non-vectorized portion of packing (PowerPC).

Chip-Kerchner (8):
      10c77b0f Fix compilation errors with later versions of GCC and use of MMA.
      8523d447 Fixes to support old and new versions of the compilers for built-ins.  Cast to non-const when using vector_pair with certain built-ins.
      c31ead8a Having forward template function declarations in a P10 file causes bad code in certain situations.
      6eebe97b Fix clang compile when no MMA flags are set. Simplify MMA compiler detection.
      28564957 Fix taking address of rvalue compiler issue with TensorFlow (plus other warnings).
      9fc93ce3 EIGEN_STRONG_INLINE was NOT inlining in some critical needed areas (6.6X slowdown) when used with Tensorflow.  Changing to EIGEN_ALWAYS_INLINE where appropriate.
      0b56b62f Reverse compare logic in F32ToBf16 since vec_cmpne is not available in Power8 - now compiles for clang10 default (P8).
      f57dec64 Fix unaligned loads in ploadLhs & ploadRhs for P8.

ChipKerchner (1):
      13d7658c Fix errors on older compilers (gcc 7.5 - lack of vec_neg, clang10 - can not use const pointers with vec_xl).

Christian von Schultz (1):
      4a40b378 Collapsed revision (based on pull request PR-325) * Support compiling without IO streams

Christoph Grüninger (1):
      dc0b81fb Pass CMAKE_MAKE_PROGRAM to Fortran language support test

Christoph Hertzberg (156):
      22f7d398 bug #1355: Fixed wrong line-endings on two files
      642dddcc Fix nonnull-compare warning
      4247d35d Fixed bug which (extremely rarely) could end in an infinite loop
      10c6bcdc Add support for long indexes and for (real-valued) row-major matrices to CholmodSupport module
      1c024e55 Added some possible temporaries to .hgignore
      e0181426 Make sure CholmodSupport works when included in multiple compilation units (issue was reported on stackoverflow.com)
      157040d4 Make sure CMAKE_Fortran_COMPILER is set before checking for Fortran functions
      0c9ad2f5 std::integral_constant is not C++03 compatible
      23f8b00b clang provides __has_feature(is_enum) (but not <type_traits>) in C++03 mode
      11ddac57 Merged in guillaume_michel/eigen (pull request PR-334)
      072e111e SelfAdjointView<...,Mode> causes a static assert since commit d820ab9edc0b38af4cdb3d545714a0c9083e5a78
      4d392d93 Make hypot_impl compile again for types with expression-templates (e.g., boost::multiprecision)
      84dcd998 Recent Adolc versions require C++11
      2cbb00b1 No need to make noise, if KLU is found
      c8b19702 Limit test size for sparse Cholesky solvers to EIGEN_TEST_MAX_SIZE
      c9ecfff2 Add links where to make PRs and report bugs into README.md
      42715533 bug #1493: Make representation of HouseholderSequence consistent and working for complex numbers. Made corresponding unit test actually test that. Also simplify implementation of QR decompositions
      775766d1 Add parenthesis to fix compiler warnings
      50633d1a Renamed .trans() et al. to .reverseFlag() et al. Adapted documentation of .setReverseFlag()
      34e499ad Disable -Wshadow when compiling with g++
      0272f245 Fix "suggest parentheses around comparison" warning
      d6559009 bug #1544: Generate correct Q matrix in complex case. Original patch was by Jeff Trull in PR-386.
      d06a753d Make qr_fullpivoting unit test run for fixed-sized matrices
      750af063 Add an option to test with external BLAS library
      e5f9f476 Avoid unnecessary C++11 dependency
      7d7bb915 Missing line during manual rebase of PR-374
      636126ef Allow to filter out build-error messages
      fd4fe7cb Fixed issue which made documentation not getting built anymore
      44ee2013 Rename variable which shadows class name
      5f79b7f9 Removed several shadowing types and use global Index typedef everywhere
      5e79402b fix warnings for doc-eigen-prerequisites
      397b0547 Disable static assertions only when necessary and disable double-promotion warnings in that case as well
      edfb7962 Use `static const int` instead of `enum` to avoid numerous `local-type-template-args` warnings in C++03 mode
      a80a2900 Fix 'template argument uses local type'-warnings (when compiled in C++03 mode)
      dbdeceab Silence double-promotion warning (when converting double to complex<long double>)
      c9b25fbe Silence unused parameter warning
      595cae9b Silence logical-op-parentheses warning
      4713465e Silence double-promotion warning
      41f1cc67 Assertion depended on a not yet initialized value
      39335cf5 Make MaxSizeVector leak-safe
      a709c8ef Replace pointers by values or unique_ptr for better leak-safety
      ad4a08fb Use Intel cast intrinsics, since MSVC does not allow direct casting. Reported by David Winkler.
      f7675b82 Fix several integer conversion and sign-compare warnings
      8295f02b Hide "maybe uninitialized" warning on gcc
      5aaedbec Fixed more sign-compare and type-limits warnings
      495f6c3c Fix missing-braces warnings
      209b4972 Fix conversion warning
      f155e97a Previous fix broke compilation for clang
      117bc5d5 Fix some shadow warnings
      4b1ad086 Fix shadow warnings in doc-snippets
      42123ff3 Make unit test C++03 compatible
      b1653d15 Fix some trivial C++11 vs C++03 compatibility warnings
      42f3ee4f Old gcc versions have problems with recursive #pragma GCC diagnostic push/pop Workaround: Don't include "DisableStupidWarnings.h" before including other main-headers
      ef4d79fe Disable/ReenableStupidWarnings did not work properly, when included recursively
      73ca600b Fix numerous shadow-warnings for GCC<=4.8
      20ba2eee gcc thinks this may not be initialized
      ddbc5643 Fixed a few more shadowing warnings when compiling with g++ (and c++03)
      c2f4e8c0 Fix integer conversion warning
      023ed6b9 Product of empty array must be 1 and not 0.
      ff4e835d "sparse_product.cpp" must be included before "sparse_basic.cpp", otherwise EIGEN_SPARSE_CREATE_TEMPORARY_PLUGIN has no effect
      ba2c8efd EIGEN_UNUSED is not supported by g++4.7 (and not portable)
      7e9c9fbb Disable type-limits warnings for g++ < 4.8
      3adece48 Fix misleading indentation of errorCode and make it loop-local
      d7378aae Provide EIGEN_ALIGNOF macro, and give handmade_aligned_malloc the possibility for alignments larger than the standard alignment.
      007f165c bug #1598: Let MaxSizeVector respect alignment of objects and add a unit test Also revert 8b3d9ed081fc5d4870290649853b19cb5179546e
      42705ba5 Fix weird error for building with g++-4.7 in C++03 mode.
      c50250cb Avoid warning "suggest braces around initialization of subobject". This test is not run in C++03 mode, so no compatibility is lost.
      a0166ab6 Workaround for spurious "array subscript is above array bounds" warnings with g++4.x
      e3c82890 Replace unused PREDICATE by corresponding STATIC_ASSERT
      2c083ace Provide EIGEN_OVERRIDE and EIGEN_FINAL macros to mark virtual function overrides
      0a3356f4 Don't deactivate BVH test for clang (probably, this was failing for very old versions of clang)
      86ba50be Fix integer conversion warnings
      b786ce8c Fix conversion warning ... again
      051f9c1a Make code compile in C++03 mode again
      b92c7123 Move struct outside of method for C++03 compatibility.
      c5f1d0a7 Fix shadow warning
      f6359ad7 Small Doxygen fixes
      3f2c8b7f Fix a lot of Doxygen warnings in Tensor module
      f3130ee1 Avoid empty macro arguments
      24dc0765 Explicitly convert 0 to Scalar for custom types
      40fa6f98 bug #1606: Explicitly set the standard before find_package(StandardMathLibrary). Also replace EIGEN_COMPILER_SUPPORT_CXX11 in favor of EIGEN_COMPILER_SUPPORT_CPP11. Grafted manually from a4afa90d161faab385a77f0e2764fb13ff3b9484
      449ff746 Fix most Doxygen warnings. Also add links to stable documentation from unsupported modules (by using the corresponding Doxytags file). Manually grafted from d107a371c61b764c73fd1570b1f3ed1c6400dd7e
      b5f077d2 Document EIGEN_NO_IO preprocessor directive
      66b28e29 bug #1618: Use different power-of-2 check to avoid MSVC warning
      0ec8afde Fixed most conversion warnings in MatrixFunctions module
      806352d8 Small typo found by Patrick Huber (pull request PR-547)
      ea60a172 Add default constructor to Bar to make test compile again with clang-3.8
      919414b9 bug #785: Make Cholesky decomposition work for empty matrices
      c1d356e8 bug #1635: Use infinity from Numtraits instead of creating it manually.
      6dd93f7e Make code compile again for older compilers. See https://stackoverflow.com/questions/7411515/
      0522460a bug #1656: Enable failtests only if BUILD_TESTING is enabled
      d575505d After fixing bug #1557, boostmultiprec_7 failed with NumericalIssue instead of NoConvergence (all that matters here is no Success)
      da0a41b9 Mask unused-parameter warnings, when building with NDEBUG
      e16913a4 Fix name of tutorial snippet.
      bd6dadcd Tell doxygen that cxx11 math is available
      934b8a13 Avoid `I` as an identifier, since it may clash with the C-header complex.h
      5a52e35f Renaming some more `I` identifiers
      c9825b96 Renaming even more `I` identifiers
      a7779a9b Hide some annoying unused variable warnings in g++8.1
      ec032ac0 Guard C++11-style default constructor. Also, this is only needed for MSVC
      a1646fc9 Commas at the end of enumerator lists are not allowed in C++03
      4270c628 Split the implementation of i?amax/min into two. Based on PR-627 by Sameer Agarwal. Like the Netlib reference implementation, I*AMAX now uses the L1-norm instead of the L2-norm for each element. Changed I*MIN accordingly.
      cca76c27 Restore C++03 compatibility
      e54dc24d Restore C++03 compatibility
      e6667a70 Fix stupid shadow-warnings (with old clang versions)
      4ccd1ece bug #1707: Fix deprecation warnings, or disable warnings when testing deprecated functions
      5f32b79e Collapsed revision from PR-641 * SparseLU.h - corrected example, it didn't compile * Changed encoding back to UTF8
      ac21a08c Cast Index to RealScalar This fixes compilation issues with RealScalar types that are not implicitly castable from Index (e.g. ceres Jet types). Reported by Peter Anderson-Sprecher via eMail
      56144005 digits10() needs to return an integer Problem reported on https://stackoverflow.com/questions/56395899
      e0be7f30 bug #1724: Mask buggy warnings with g++-7 (grafted from 427f2f66d69ae9b124c2f8bcd927fb6e19e07e91 )
      adec097c Remove extra comma (causes warnings in C++03)
      c2671e53 Build deprecated snippets with -DEIGEN_NO_DEPRECATED_WARNING Also, document LinSpaced only where it is implemented
      9237883f Escape \# inside doxygen docu
      ea6d7eb3 Move variadic constructors outside `#ifndef EIGEN_PARSED_BY_DOXYGEN` block, to make it actually appear in the generated documentation.
      e0f5a2a4 Remove {} accidentally added in previous commit
      ba0736fa Fix (or mask away) conversion warnings introduced in 553caeb6a3bb545aef895f8fc9f219be44679017 .
      e4c1b3c1 Fix implicit conversion warnings and use pnegate to negate packets
      efd9867f bug #1746: Removed implementation of standard copy-constructor and standard copy-assign-operator from PermutationMatrix and Transpositions to allow malloc-less std::move. Added unit-test to rvalue_types
      9b7a2b43 Renamed .hgignore to .gitignore (removing hg-specific "syntax" line)
      8e5da714 Resolve double-promotion warnings when compiling with clang. `sin` was calling `sin(double)` instead of `std::sin(float)`
      5a3eaf88 Workaround class-memaccess warnings on newer GCC versions
      72166d0e Fix some maybe-uninitialized warnings
      6965f6de Fix unit-test which I broke in previous fix
      870e53c0 Bug #1788: Fix rule-of-three violations inside the stable modules. This fixes deprecated-copy warnings when compiling with GCC>=9. Also protect some additional Base-constructors from getting called by user code (#1587)
      a3273aef Fix trivial shadow warning
      c21771ac Use double-braces initialization (as everywhere else in the test-suite).
      dde279f5 Hide recursive meta templates from Doxygen
      d86544d6 Reduce code duplication and avoid confusing Doxygen
      1e9664b1 Bug #1796: Make matrix squareroot usable for Map and Ref types
      bcbaad6d Bug #1800: Guard against misleading indentation
      9623c0c4 Fix formatting
      8333e035 Use data.data() instead of &data (since it is not obvious that Array is trivially copyable)
      35219cea Bug #1790: Make `areApprox` check `numext::isnan` instead of bitwise equality (NaNs don't have to be bitwise equal).
      1d0c4512 Removing executable bit from file mode
      d46d726e CommaInitializer wrongfully asserted for 0-sized blocks. The commainitializer unit-test never actually called `test_block_recursion`, which also was not correctly implemented and would have caused too deep template recursion.
      6b0c0b58 Provide a more efficient Packet2l->Packet2d cast method
      ecb7bc95 Bug #2036 make sure find_standard_math_library_test_program actually compiles (and is guaranteed to call math functions)
      90f6d9d2 Suppress ignored-attributes warning (same as in vectorization_logic). Remove redundant include and using namespace.
      12dda34b Eliminate boolean product warnings by factoring out a `combine_scalar_factors` helper function.
      a7749c09 Bug #1910: Make SparseCholesky work for RowMajor matrices
      ce4af0b3 Missing change regarding #1910
      73922b01 Fixes Bug #1925. Packets should be passed by const reference, even to inline functions.
      4fb3459a Fix double-promotion warnings
      81b5fe2f ReturnByValue is already non-copyable
      ca528593 Fixed/masked more implicit copy constructor warnings
      a3521d74 Fix some enum-enum conversion warnings
      2660d01f Inherit from `no_assignment_operator` to avoid implicit copy constructor warnings
      8f686ac4 clang 10 aggressively warns about precision loss when converting int to float (or long to double)
      39a590df Remove unused include
      199c5f2b geo_alignedbox_5 was failing with AVX enabled, due to storing `Vector4d` in a `std::vector` without using an aligned allocator. Got rid of using `std::vector` and simplified the code. Avoid leading `_`
      69a4f709 Revert "Uses _mm512_abs_pd for Packet8d pabs"
      6197ce1a Replace `-2147483648` by `-0.0f` or `-0.0` constants (this should fix #2189). Also, remove unnecessary `pgather` operations.
      d5867806 Make iterators default constructible and assignable, by making...
      1e1c8a73 Use EIGEN_HAS_CXX11 and EIGEN_COMP_CXXVER macros to detect C++ version for `std::result_of` and `std::invoke_result`. Fixes #2209
      9357feed Avoid using uninitialized inputs and if available, use slightly more efficient `movsd` instruction for `pset1<Packet2cf>`.
      9e0dc8f0 Revert addition of unused `paddsub<Packet2cf>`. This fixes #2242

Christoph Junghans (1):
      95177362 .gitlab-ci.yml: initial commit

Christopher Moore (2):
      a187ffea Resolve "IndexedView of a vector should allow linear access"
      fa8fd4b4 Indexed view should have RowMajorBit when there is statically a single row

Chun Wang (1):
      0d0948c3 Workaround for error in VS2012 with /clr

Clément Grégoire (1):
      82f54ad1 Fix perf monitoring merge function

Cyril Kaiser (1):
      573570b6 Remove EIGEN_DEVICE_FUNC from CwiseBinaryOp's default copy constructor.

Cédric Hubert (1):
      98bfc5aa Update MarketIO.h

Dan Miller (1):
      1f6b1c1a Fix duplicate definitions on Mac

Daniel N. Miller (APD) (1):
      1e9f623f Do not build shared libs if not supported

Daniel Trebbien (1):
      0c57be40 Move up the specialization of std::numeric_limits

Daniele E. Domenichelli (1):
      a12b8a8c FindEigen3: Set Eigen3_FOUND variable

David Hyde (1):
      d908afe3 bug #1558: fix a corner case in MINRES when both v_new and w_new vanish.

David Tellenbach (78):
      db152b9e PR 572: Add initializer list constructors to Matrix and Array (include unit tests and doc) - {1,2,3,4,5,...} for fixed-size vectors only - {{1,2,3},{4,5,6}} for the general cases - {{1,2,3,4,5,....}} is allowed for both row and column-vector
      237b03b3 PR 574: use variadic template instead of initializer_list to implement fixed-size vector ctor from coefficients.
      97f9a46c PR 593: Add variadic ctor for DiagonalMatrix with unit tests
      b013176e Remove undefined std::complex<int>
      bd9c2ae3 Fix include guard comments
      3031d572 PR 621: Fix documentation of EIGEN_COMP_EMSCRIPTEN
      5c4e19fb Possibility to specify user-defined default cache sizes for GEBP kernel
      5328cd62 Guard usage of decltype since it's a C++11 feature
      3ce18d3c Revert ".gitlab-ci.yml: initial commit"
      c6c84ed9 Fix unused variable warning on Arm
      13d25f5e Add initial CI configuration file.
      f3b8d441 Remove CI tags to enable shared runners
      689b5707 Report custom C++ flags in CMake testing summary
      cb631531 Make test packetmath C++98 compliant
      ee4715ff Fix test basic stuff
      38b91f25 Fix cast of bfloat16 to std::complex<T>
      c1ffe452 Fix bfloat16 casts
      b8ca9384 Improve CI configuration
      e48d8e47 Don't allow failure for CI build stage anymore
      99da2e1a Fix clang-tidy warnings in generic bfloat16 implementation
      5e484fa1 Fix StlDeque for GCC 10
      23b7f057 Disable CI buildstage again
      8ba1b0f4 bfloat16 packetmath for Arm Neon backend
      c6820a63 Replace the call to int64_t in the blasutil test by explicit types
      d2bb6cf3 Fix compilation error in blasutil test
      d4a727d0 Disable min/max NaN propagation in test cxx11_tensor_expr
      fe8c3ef3 Add possibility to split test suite build targets and improved CI configuration
      c060114a Fix nightly CI configuration
      adc861ca New CI infrastructure, including AArch64 runners
      c4aa8e0d Rename variable to avoid shadowing of a previously declared one
      493a7c77 Remove EIGEN_CONSTEXPR from NumTraits<boost::multiprecision::number<...>>
      b8a13f13 Add CI configuration for ppc64le
      30960d48 Fix failure in GEBP kernel when compiling with OpenMP and FMA
      f66f3393 Use reinterpret_cast instead of C-style cast in Inverse_NEON.h
      8f8d77b5 Add EIGEN prefix for HAS_LGAMMA_R
      4091f6b2 Drop EIGEN_USING_STD_MATH in favour of EIGEN_USING_STD
      9022f5aa Mention problems when using potentially throwing scalars and OpenMP
      7a8d3d5b Disable test exceptions when using OpenMP.
      e3e2cf9d Add MatrixBase::cwiseArg()
      e265f7ed Add support for Armv8.2-a __fp16
      09f01585 Replace numext::as_uint with numext::bit_cast<numext::uint32_t>
      f895755c Remove unused functions in Half.h.
      e9b55c4d Avoid promotion of Arm __fp16 to float in Neon PacketMath
      11e4056f Re-enable Arm Neon Eigen::half packets of size 8
      6c9c3f9a Remove explicit casts from Eigen::half and Eigen::bfloat16 to bool
      550e8f8f Include CMakeDependentOption to be able to use cmake_dependent_option
      305b8bd2 Remove duplicate #if clause
      2e8f850c Fix a typo in SparseMatrix documentation.
      8eb461a4 Remove comma at end of enumerator list in NEON PacketMath
      00be0a7f Fix vectorization of complex sqrt on NEON
      c7eb3a74 Don't guard psqrt for std::complex<float> with EIGEN_ARCH_ARM64
      536c8a79 Remove unused macro in Half.h
      751f18f2 Remove comma at the end of enumeration list to silence C++03 warnings
      0bdc0dba Add missing #endif directive in Macros.h
      660c6b85 Remove std::cerr in iterative solver since we don't have iostream.
      65e2169c Add support for Arm SVE
      598e1b6e Add the following functions:
      170a504c Add the following functions
      54589635 Replace nullptr by NULL in SparseLU.h to be C++03 compliant.
      36200b78 Remove vim specific comments to recognize correct file-type.
      622c5989 Don't allow all test jobs to fail but only the currently failing ones.
      9ad4096c Document possible inconsistencies when using `Matrix<bool, ...>`
      5336ad85 Define internal::make_unsigned for [unsigned]long long on macOS.
      aa8b22e7 Bump to 3.4.99
      976ae0ca Document that using raw function pointers doesn't work with unaryExpr.
      5bfc67f9 Deactivate CI for Power due to problems with GitLab runner
      5f0b4a40 Revert "Adds EIGEN_CONSTEXPR and EIGEN_NOEXCEPT to rows(), cols(), innerStride(), outerStride(), and size()"
      9fb70624 Silence warning on comma at end of enumerator list
      df4bc273 Revert "Augment NumTraits with min/max_exponent()."
      eb71e5db Fix another warning on missing commas
      0cc9b5eb Split test commainitializer into two subtests
      4811e819 Remove yet another comma at end of enum
      824272cd Re-enable CI for Power
      ae95b74a Add CMake infrastructure for smoke testing
      e4233b6e Add CI infrastructure for pre-merge smoke tests.
      3e819d83 Before 3.4 branch
      1f4c0311 Bump to 3.3.91 (3.4-rc1)
      3af8c262 Include immintrin.h if F16C is available and vectorization is disabled

Deven Desai (39):
      f124f079 applying EIGEN_DECLARE_TEST to *gpu* tests
      8fbd4705 Adding support for using Eigen in HIP kernels.
      ba972fb6 moving Half headers from CUDA dir to GPU dir, removing the HIP versions
      b6cc0961 updates based on PR feedback
      7e41c8f1 renaming *Cuda files to *Gpu in the unsupported/Eigen/CXX11/src/Tensor and unsupported/test directories
      cfdabbcc removing the *Hip files from the unsupported/Eigen/CXX11/src/Tensor and unsupported/test directories
      1bb6fa99 merging the CUDA and HIP implementation for the Tensor directory and the unit tests
      471cfe5f renaming CUDA* to GPU* for some header files
      dec47a64 renaming CUDA* to GPU* for some header files
      1fe0b749 deleting hip specific files that are no longer required
      876f392c Updates corresponding to the latest round of PR feedback
      946c3e25 adding EIGEN_DEVICE_FUNC attribute to fix some GPU unit tests that are broken in HIP mode
      c64fe9ea Updates to fix HIP-clang specific compile errors.
      94898488 This commit contains the following (HIP specific) updates:
      e7e6809e ROCm/HIP specific fixes + updates
      51e399fc updates requested in the PR feedback. Also dropping code within #ifdef EIGEN_HAS_OLD_HIP_FP16
      66a885b6 adding EIGEN_DEVICE_FUNC to the recently added TensorContractionKernel constructor. Not having the EIGEN_DEVICE_FUNC attribute on it was leading to compiler errors when compiling Eigen in the ROCm/HIP path
      2c389301 fix for HIP build errors that were introduced by a commit earlier this week
      ba506d5b fix for a ROCm/HIP specific compile error introduced by a recent commit.
      7eb2e0a9 adding the EIGEN_DEVICE_FUNC attribute to the constCast routine.
      cdb377d0 Fix for the HIP build+test errors introduced by the ndtri support.
      e02d4296 Fix for the HIP build+test errors.
      5e186b19 Fix for the HIP build+test errors.
      102cf2a7 Fix for the HIP build+test errors.
      312c8e77 Fix for the HIP build+test errors.
      c49f0d85 Fix for HIP breakage detected on 191210
      636e2bb3 Fix for HIP breakage - 191220
      6d284bb1 Fix for HIP breakage - 200115. Adding a missing EIGEN_DEVICE_FUNC attr
      7158ed4e Fixing HIP breakage caused by the recent commit that introduces Packet4h2 as the Eigen::Half packet type
      46f8a185 Adding an explicit launch_bounds(1024) attribute for GPU kernels.
      603e213d Fixing a CUDA / P100 regression introduced by PR 181
      ce5c5972 Fix for ROCm/HIP breakage - 200921
      011e0db3 Fix for ROCm/HIP breakage - 201013
      39a038f2 Fix for ROCm (and CUDA?) breakage - 201029
      9d11e2c0 CMakefile update for ROCm 4.0
      f3d2ea48 Fix for broken ROCm/HIP Support
      2a6addb4 Fix for breakage in ROCm support - 210108
      1a96d49a Changing the Eigen::half implementation for HIP
      748489ef Un-defining EIGEN_HAS_CONSTEXPR on the HIP platform

Dmitriy Korchemkin (1):
      02d2f1cb Cast zeros to Scalar in RealSchur

Duncan McBain (1):
      0cb3c7c7 Update FindComputeCpp.cmake with new changes from SDK

Essex Edwards (1):
      e741b436 Make Transform::computeRotationScaling(0,&S) continuous

Eugene Chereshnev (1):
      f558ad29 Fix incorrect ldvt in LAPACKE call from JacobiSVD

Eugene Zhulenev (166):
      01fd4096 Fuse computations into the Tensor contractions using output kernel
      b324ed55 Call OutputKernel in evalGemv
      e204ecda Remove SimpleThreadPool and always use {NonBlocking}ThreadPool
      43206ac4 Call OutputKernel in evalGemv
      c95aacab Fix TensorContractionOp evaluators for GPU and SYCL
      79d4129c Specify default output kernel for TensorContractionOp
      086ded5c Disable type traits for GCC < 5.1.0
      e3c2d617 Assert that no output kernel is defined for GPU contraction
      6e654f33 Reduce number of allocations in TensorContractionThreadPool.
      c58b8747 PR430: Convert count to the reducer type in MeanReducer
      2bf864f1 Disable type traits for stdlibc++ <= 4.9.3
      34a75c3c Initial support of TensorBlock
      d55efa6f TensorBlockIO
      6913221c Add tiled evaluation support to TensorExecutor
      966c2a7b Rename Index to StorageIndex + use Eigen::Array and Eigen::Map when possible
      83c0a16b Add block evaluation support to TensorOps
      64abdf1d Fix typo + get rid of redundant member variables for block sizes
      1b0373ae Replace all using declarations with typedefs in Tensor ops
      cfaedb38 Fix bug in a test + compilation errors
      f2209d06 Add block evaluation to CwiseUnaryOp and add PreferBlockAccess enum to all evaluators
      35d90e89 Fix BlockAccess enum in CwiseUnaryOp evaluator
      81b38a15 Fix compilation of tiled evaluation code with c++03
      d138fe34 Fix static_assert in test to conform to c++11 standard
      01197e44 Fix warnings
      1b8d70a2 Support reshaping with static shapes and dimensions conversion in tensor broadcasting
      48633757 Explicitly construct tensor block dimensions from evaluator dimensions
      71070a1e Const cast scalar pointer in TensorSlicingOp evaluator
      f7d0053c Fix DSizes IndexList constructor
      f313126d Fix warnings in IndexList array_prod
      66f05677 Add DSizes index type promotion
      a5cd4e9a Replace deprecated Eigen::DenseIndex with Eigen::Index in TensorIndexList
      218a7b98 Enable DSizes type promotion with c++03 compilers
      c4627039 Support static dimensions (aka IndexList) in Tensor::resize(...)
      719e438a Collapsed revision * Split cxx11_tensor_executor test * Register test parts with EIGEN_SUFFIXES * Fix EIGEN_SUFFIXES in cxx11_tensor_executor test
      71cd3fbd Support multiple contraction kernel types in TensorContractionThreadPool
      22ed98a3 Conditionally add mkldnn test
      b314376f Test mkldnn pack for doubles
      9f498895 Remove explicit mkldnn support and redundant TensorContractionKernelBlocking
      9f33e71e Revert code lost in merge
      e95696ac Optimize TensorBlockCopyOp
      524c81f3 Add tests for evalShardedByInnerDim contraction + fix bugs
      bb13d5d9 Fix bug in copy optimization in Tensor slicing.
      c0ca8a9f Compile time detection for unimplemented stl-style iterators
      befcac88 Hide stl-container detection test under #if
      2bf1a31d Use void type if stl-style iterators are not supported
      8e6dc2c8 Fix bug in partial reduction of expressions requiring evaluation
      118520f0 Workaround nbcc+msvc compiler bug
      d9392f9e Fix code format
      900c7c61 Check if it's allowed to squeeze inner dimensions in TensorBlockIO
      217d8398 Reduce thread scheduling overhead in parallelFor
      9e96e919 Move from rvalue arguments in ThreadPool enqueue* methods
      8a977c1f Fix cxx11_tensor_{block_access, reduction} tests
      80f1651f Use explicit packet type in SSE/PacketMath pldexp
      fd0fbfa9 Do not disable alignment with EIGEN_GPUCC
      0bb15bb6 Update checks in ConfigureVectorization.h
      190d053e Explicitly set fill character when printing aligned data to ostream
      e70ffef9 Optimize evalShardedByInnerDim
      0abe0376 Fix shorten-64-to-32 warning in TensorContractionThreadPool
      1e6d15b5 Fix shorten-64-to-32 warning in TensorContractionThreadPool
      690b2c45 Fix GeneralBlockPanelKernel Android compilation
      6d0f6265 Remove duplicated comment line
      eb21bab7 Parallelize tensor contraction only by sharding dimension and use 'thread-local' memory for packing
      84911270 Do not reduce parallelism too much in contractions with small number of threads
      59998117 Don't do parallel_pack if we can use thread_local memory in tensor contractions
      1e36166e Optimize TensorConversion evaluator: do not convert same type
      21eb97d3 Add PacketConv implementation for non-vectorizable src expressions
      8c2f30c7 Speedup Tensor ThreadPool RunQueue::Empty()
      106ba7bb Do not generate no-op cast() and conjugate() expressions
      f0d42d22 Fix signed-unsigned comparison warning in RunQueue
      7b837559 Fix signed-unsigned return in RunQueue
      694084ec Use fast divisors in TensorGeneratorOp
      b95941e5 Add tiled evaluation for TensorForcedEvalOp
      efb5080d Do not initialize invalid fast_strides in TensorGeneratorOp
      b1a86274 Do not create Tensor<const T> in cxx11_tensor_forced_eval test
      56c6373f Add an extra check for the RunQueue size estimate
      a407e022 Tune tensor contraction threadpool heuristics
      5d9a6686 Block evaluation for TensorGeneratorOp
      25abaa2e Check that inner block dimension is continuous
      4e4dcd90 Remove redundant steal loop
      1bc2a0a5 Add missing return to NonBlockingThreadPool::LocalSteal
      0f8bfff2 Fix a data race in NonBlockingThreadPool
      899c16fa Fix a bug in TensorGenerator for 1d tensors
      001f10e3 Fix segfaults with cuda compilation
      4e2f6de1 Add support for custom packed Lhs/Rhs blocks in tensor contractions
      629ddebd Add missing semicolon
      68a2a8c4 Use packet ops instead of AVX2 intrinsics
      a7b7f3ca Add missing EIGEN_DEPRECATED annotations to deprecated functions and fix few other doxygen warnings
      07355d47 Get rid of SequentialLinSpacedReturnType deprecation warnings in DenseBase.h
      8ead5bb3 Fix doxygen warnings to enable statis code analysis
      01d7e6ee Check if gpu_assert was overridden in TensorGpuHipCudaDefines
      b4010f02 Add masked pstoreu to AVX and AVX512 PacketMath
      96e30e93 Add masked pstoreu for Packet16h
      e9f0eb8a Add masked_store_available to unpacket_traits
      45b40d91 Fix AVX512 & GCC 6.3 compilation
      96a27680 Always evaluate Tensor expressions with broadcasting via tiled evaluation code path
      01654d97 Prevent potential division by zero in TensorExecutor
      07131182 Remove XSMM support from Tensor module
      69017880 Asynchronous parallelFor in Eigen ThreadPoolDevice
      6e77f9be Remove shadow warnings in TensorDeviceThreadPool
      bc40d452 Const correctness in TensorMap<const Tensor<T, ...>> expressions
      66665e7e Asynchronous expression evaluation with TensorAsyncDevice
      619cea94 Revert accidentally removed <memory> header from ThreadPool
      f0b36fb9 evalSubExprsIfNeededAsync + async TensorContractionThreadPool
      edf2ec28 Fix block mapper type name in TensorExecutor
      79c402e4 Fix shadow warnings in TensorContractionThreadPool
      229db815 Optimize evaluation strategy for TensorSlicingOp and TensorChippingOp
      878845cb Add block access to TensorReverseOp and make sure that TensorForcedEval uses block access when preferred
      81a03bec Fix TensorReverse on GPU with m_stride[i]==0
      4ac93f8e Allocate non-const scalar buffer for block evaluation with DefaultDevice
      60830145 Add outer/inner chipping optimization for chipping dimension specified at runtime
      3cd148f9 Fix expression evaluation heuristic for TensorSliceOp
      47fefa23 Allow move-only done callback in TensorAsyncDevice
      f59bed7a Change typedefs from private to protected to fix MSVC compilation
      f68f2bba TensorMap constness should not change underlying storage constness
      a8d264fa Add test for const TensorMap underlying data mutation
      e3dec4dc ThreadLocal container that does not rely on thread local storage
      d918bd9a Update ThreadLocal to use separate Initialize/Release callables
      553caeb6 Use ThreadLocal container in TensorContractionThreadPool
      bf8866b4 Fix maybe-unitialized warnings in TensorContractionThreadPool
      7c732968 Revert accidental change to GCC diagnostics
      ef9dfee7 Tensor block evaluation V2 support for unary/binary/broadcsting
      c97b2084 Add new TensorBlock api implementation + tests
      c64396b4 Choose TensorBlock StridedLinearCopy type statically
      f35b9ab5 Fix a bug in a packed block type in TensorContractionThreadPool
      71d5bedf Fix compilation warnings and errors with clang in TensorBlockV2
      0c845e28 Fix erf in c++03
      7c8bc0d9 Fix cxx11_tensor_block_io test
      6e40454a Add beta to TensorContractionKernel and make memset optional
      60ae24ee Add block evaluation to TensorReshaping/TensorCasting/TensorPadding/TensorSelect
      98bdd725 Fix compilation warnings and errors with clang in TensorBlockV2 code and tests
      f74ab8cb Add block evaluation to TensorEvalTo and fix few small bugs
      33e17461 Block evaluation for TensorChipping + fixed bugs in TensorPadding and TensorSlicing
      a411e9f3 Block evaluation for TensorGenerator + TensorReverse + fixed bug in tensor reverse op
      d380c23b Block evaluation for TensorGenerator/TensorReverse/TensorShuffling
      02431cbe TensorBroadcasting support for random/uniform blocks
      0d2a14ce Cleanup Tensor block destination and materialized block storage allocation
      df0e8b81 Propagate block evaluation preference through rvalue tensor expressions
      bd864ab4 Prevent potential ODR in TensorExecutor
      fbc0a9a3 Fix CXX11Meta compilation with MSVC
      e7ed4bd3 Remove internal::smart_copy and replace with std::copy
      73ecb2c5 Cleanup includes in Tensor module after switch to C++11 and above
      c952b8df Break loop dependence in TensorGenerator block access
      13c3327f Remove legacy block evaluation support
      bc66c882 Add async evaluation support to TensorPadding/TensorImagePatch/TensorShuffling
      5496d0da Add async evaluation support to TensorReverse
      82a47338 Fix shadow warnings in AlignedBox and SparseBlock
      8f4536e8 Capture TensorMap by value inside tensor expression AST
      bb7ccac3 Add recursive work splitting to EvalShardedByInnerDimContext
      dbb703d4 Add async evaluation support to TensorSelectOp
      2918f85b Do not use std::vector in getResourceRequirements
      dbca11e8 Remove TensorBlock.h and old TensorBlock/BlockMapper
      1c879eb0 Remove V2 suffix from TensorBlock
      c9220c03 Remove block memory allocation required by removed block evaluation API
      963ba101 Add back accidentally deleted default constructor to TensorExecutorTilingContext.
      64272c7f Squeeze reads from two inner dimensions in TensorPadding
      381f8f31 Initialize non-trivially constructible types when allocating a temp buffer.
      788bef6a Reduce block evaluation overhead for small tensor expressions
      ae07801d Tensor block evaluation cost model
      73e55525 Return const data pointer from TensorRef evaluator.data()
      7a65219a Fix TensorPadding bug in squeezed reads from inner dimension
      b9362fb8 Convert StridedLinearBufferCopy::Kind to enum class
      3fda850c Remove dead code from TensorReduction.h
      f584bd9b Fail at compile time if default executor tries to use non-default device
      3c02fefe Add async evaluation support to TensorSlicingOp.
      2279f2c6 Use lgamma_r if it is available (update check for glibc 2.19+)
      a6601070 Ad…