.NET 7 brought significant performance improvements to several LINQ queries by vectorising the processing of various data structures. Whilst the improvements described in this article are considerable, I wondered whether further gains could be obtained by using .NET hardware intrinsics directly, particularly for large datasets.
This project provides an example of how hardware intrinsics could be used to implement the Max, Min, Average and Sum methods. The benchmark tests were run on a machine that supports all versions of AVX and SSE, so no support checks or fallback methods were required.
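To make the idea concrete, here is a minimal sketch of what a Max over an `int` span could look like with AVX2 intrinsics. It is not taken from the project: the class name `IntrinsicsSketch` and the method signature are hypothetical, and unlike the benchmarked code it assumes the target machine is unknown, so it includes an `Avx2.IsSupported` check and a scalar fallback.

```csharp
using System;
using System.Runtime.InteropServices;
using System.Runtime.Intrinsics;
using System.Runtime.Intrinsics.X86;

public static class IntrinsicsSketch
{
    // Hypothetical helper (not from the project): maximum of a non-empty int span.
    public static int Max(ReadOnlySpan<int> values)
    {
        if (values.IsEmpty)
            throw new ArgumentException("Sequence contains no elements.", nameof(values));

        int max = values[0];
        int i = 0;

        // Assumption: the hardware is not known in advance, so check for AVX2 first.
        if (Avx2.IsSupported && values.Length >= Vector256<int>.Count)
        {
            ref int start = ref MemoryMarshal.GetReference(values);
            Vector256<int> acc = Vector256.Create(max); // broadcast to all 8 lanes
            int lastBlock = values.Length - Vector256<int>.Count;

            // Compare eight 32-bit lanes per iteration.
            for (; i <= lastBlock; i += Vector256<int>.Count)
            {
                Vector256<int> block = Vector256.LoadUnsafe(ref start, (nuint)i);
                acc = Avx2.Max(acc, block);
            }

            // Reduce the eight lane maxima to a single scalar.
            for (int lane = 0; lane < Vector256<int>.Count; lane++)
                max = Math.Max(max, acc.GetElement(lane));
        }

        // Scalar path: handles the tail, or the whole span when AVX2 is unavailable.
        for (; i < values.Length; i++)
            max = Math.Max(max, values[i]);

        return max;
    }
}
```

Calling `IntrinsicsSketch.Max(array)` on an `int[]` works through the implicit conversion to `ReadOnlySpan<int>`. Min follows the same shape with `Avx2.Min`; Sum and Average additionally need a wider accumulator to avoid overflowing the 32-bit lanes on very large inputs.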
Tests were performed on large arrays of 1 billion elements, and the performance gains were substantial. It is therefore clear that, in business cases where the fast processing of large datasets is critical, hardware intrinsics could have an important role to play.