Open
Labels
- P2: Priority of the issue for triage purpose: Needs to be fixed at some point.
- enhancement: New feature or request
- up-for-grabs: A good issue to fix if you are trying to contribute to the project
Description
Style changes needed to solve part of #823
Details
- Add a "preamble" to the SSE/AVX intrinsics implementations in src\Microsoft.ML.CpuMath\SseIntrinsics.cs and src\Microsoft.ML.CpuMath\AvxIntrinsics.cs:
  - while (!aligned) { do scalar operation; } // preamble
  - Do vectorized operation using ReadAligned
  - while (!end) { do scalar operation; } // tail
For large arrays, unaligned vector loads can straddle cache line or page boundaries. Aligning the pointer first and then using aligned loads should save a measurable amount of time.
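To make the three steps concrete, here is a minimal sketch in C with SSE intrinsics. It is not the C# code in SseIntrinsics.cs: `add_scalar_aligned` is a hypothetical name, `_mm_load_ps` / `_mm_store_ps` stand in for the aligned read and write, and the 16-byte boundary assumes 128-bit SSE vectors (AVX would use 32 bytes and 8 floats per step).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#if defined(__SSE__) || defined(_M_X64)
#include <xmmintrin.h>
#define HAVE_SSE 1
#endif

/* Hypothetical helper: adds `scalar` to every element of dst[0..count). */
void add_scalar_aligned(float scalar, float *dst, size_t count)
{
    assert(dst != NULL || count == 0);
    size_t i = 0;

    /* 1. Preamble: scalar steps until dst + i sits on a 16-byte boundary
          (or the data runs out). */
    while (i < count && ((uintptr_t)(dst + i) & 15u) != 0) {
        dst[i] += scalar;
        i++;
    }

#ifdef HAVE_SSE
    /* 2. Main loop: four floats per iteration, aligned load and store. */
    __m128 s = _mm_set1_ps(scalar);
    for (; i + 4 <= count; i += 4) {
        __m128 v = _mm_load_ps(dst + i);
        _mm_store_ps(dst + i, _mm_add_ps(v, s));
    }
#endif

    /* 3. Tail: whatever is left after the last full vector. */
    for (; i < count; i++) {
        dst[i] += scalar;
    }
}
```

The preamble runs at most three scalar iterations for a float array that is at least 4-byte aligned, so its cost is small next to the main loop on large inputs. Functions that read two arrays (for example a source and a destination) can only align one of them this way unless both happen to share the same offset; the other would still need unaligned loads.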
Reference: https://github.com/dotnet/machinelearning/pull/562/files/f0f81a5019a3c8cbd795a970e40d633e9e1770c1#r204061074
#1143
Currently these functions use only unaligned loads. We can make them faster by aligning the data and doing aligned loads.
- AddScalarU
- ScaleSrcU
- AddScaleU
- ScaleAddU
- AddU
- AddScaleCopyU
- AddSU
- MulElementWiseU
- SumU
- SumSqU
- SumSqDiffU
- SumAbsU
- SumAbsDiffU
- MaxAbsU
- MaxAbsDiffU
- DotU
- DotSU
- Dist2
- SdcaL1UpdateU
- SdcaL1UpdateSU
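Several entries in this list are reductions (SumU, SumSqU, DotU and so on), where the same preamble / aligned loop / tail structure applies but the vector accumulator has to be folded into a scalar at the end. The following is a rough sketch in C with SSE intrinsics, under the same assumptions as the pattern described in Details: `sum_aligned` is a hypothetical name, not a function from the repository, and the 16-byte boundary assumes 128-bit vectors.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#if defined(__SSE__) || defined(_M_X64)
#include <xmmintrin.h>
#define HAVE_SSE 1
#endif

/* Hypothetical helper: returns the sum of src[0..count). */
float sum_aligned(const float *src, size_t count)
{
    assert(src != NULL || count == 0);
    size_t i = 0;
    float total = 0.0f;

    /* Preamble: scalar adds until src + i is 16-byte aligned. */
    while (i < count && ((uintptr_t)(src + i) & 15u) != 0) {
        total += src[i];
        i++;
    }

#ifdef HAVE_SSE
    /* Main loop: accumulate four partial sums using aligned loads. */
    __m128 acc = _mm_setzero_ps();
    for (; i + 4 <= count; i += 4) {
        acc = _mm_add_ps(acc, _mm_load_ps(src + i));
    }

    /* Fold the four lanes of the accumulator into the scalar total. */
    float lanes[4];
    _mm_storeu_ps(lanes, acc);
    total += (lanes[0] + lanes[1]) + (lanes[2] + lanes[3]);
#endif

    /* Tail: remaining elements after the last full vector. */
    for (; i < count; i++) {
        total += src[i];
    }
    return total;
}
```

One thing to keep in mind for the reductions: the length of the preamble depends on where the buffer happens to sit in memory, so the order in which floats are added can change with the buffer's address. Because floating-point addition is not associative, results may differ in the last bits between runs, which could matter for tests that compare against exact baselines.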