adding SIMD/TensorPrimitives to dotnet-diag/analyzing-dotnet-performance #330
jeffschwMSFT wants to merge 9 commits into dotnet:main
…p tests to come to this skill, moved from skill to reference in the performance skill
Pull request overview
Adds SIMD/TensorPrimitives guidance and evaluation coverage to the dotnet-diag/analyzing-dotnet-performance skill, expanding it to recognize when scalar loops are good candidates for vectorization (or when SIMD is not applicable).
Changes:
- Adds new SIMD-focused fixtures and corresponding eval scenarios (TensorPrimitives reductions, SIMD-friendly loops, and a “no SIMD opportunity” case).
- Introduces a new `simd-vectorization.md` reference with decision gating (TensorPrimitives-first vs manual intrinsics).
- Updates the skill description and detection signals to include SIMD/vectorization.
Reviewed changes
Copilot reviewed 8 out of 8 changed files in this pull request and generated 5 comments.
| File | Description |
|---|---|
| tests/dotnet-diag/analyzing-dotnet-performance/fixtures/simd-tensor-primitives-product.cs | New fixture for product reduction intended for TensorPrimitives optimization. |
| tests/dotnet-diag/analyzing-dotnet-performance/fixtures/simd-tensor-primitives-minmax.cs | New fixture for min/max reduction intended for TensorPrimitives optimization. |
| tests/dotnet-diag/analyzing-dotnet-performance/fixtures/simd-no-opportunity-catalog.cs | New fixture representing a case where SIMD is not a meaningful optimization. |
| tests/dotnet-diag/analyzing-dotnet-performance/fixtures/simd-conditional-increment.cs | New SIMD-friendly loop fixture (conditional increment). |
| tests/dotnet-diag/analyzing-dotnet-performance/fixtures/simd-bit-reverser.cs | New SIMD-friendly byte processing fixture (bit reversal). |
| tests/dotnet-diag/analyzing-dotnet-performance/eval.yaml | Adds five new scenarios covering TensorPrimitives, SIMD intrinsics, and “no opportunity” detection. |
| plugins/dotnet-diag/skills/analyzing-dotnet-performance/references/simd-vectorization.md | New reference doc defining the SIMD/TensorPrimitives decision gate and implementation patterns. |
| plugins/dotnet-diag/skills/analyzing-dotnet-performance/SKILL.md | Updates description and signal detection to include SIMD vectorization and reference loading. |
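For readers unfamiliar with the reductions that the first two fixtures target, the following is a minimal sketch of what a scalar-to-TensorPrimitives rewrite might look like. The class and method names are hypothetical and are not taken from the fixture files, and the sketch assumes the project references the System.Numerics.Tensors NuGet package.

```csharp
using System;
using System.Numerics.Tensors; // assumption: the System.Numerics.Tensors package is referenced

static class ReductionSketch // hypothetical name, not from the PR's fixtures
{
    // Scalar baseline: multiplies the elements one at a time.
    public static float ProductScalar(ReadOnlySpan<float> values)
    {
        float product = 1f;
        foreach (float v in values)
        {
            product *= v;
        }
        return product;
    }

    // The same reduction delegated to the library, which can vectorize internally.
    public static float ProductVectorized(ReadOnlySpan<float> values)
        => TensorPrimitives.Product(values);

    // Min and max reductions, each as a single library call.
    public static (float Min, float Max) MinMax(ReadOnlySpan<float> values)
        => (TensorPrimitives.Min(values), TensorPrimitives.Max(values));
}
```

Because a vectorized reduction may combine elements in a different order than the scalar loop, floating-point results can differ in the last bits for general inputs.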
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
…nces/simd-vectorization.md Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
…nces/simd-vectorization.md Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Copilot reviewed 8 out of 8 changed files in this pull request and generated 2 comments.
…nces/simd-vectorization.md Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Copilot reviewed 8 out of 8 changed files in this pull request and generated 1 comment.
Few nits/feedback, but this looks overall good!
…nces/simd-vectorization.md Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Copilot reviewed 8 out of 8 changed files in this pull request and generated 7 comments.
- Check Span<T>/MemoryExtensions before TensorPrimitives (no extra dependency)
- Add all supported types (sbyte, ushort, uint, ulong, nint, nuint, char via ushort)
- TensorPrimitives: add constraint and applicable types columns (not just float/double)
- Add FusedMultiplyAdd, clarify AddMultiply vs MultiplyAdd distinction
- Prefer portable APIs over platform-specific intrinsics (allow when perf justifies)
- Fix dispatch pattern to use if/else if to avoid pessimizing small inputs
- Remove LINQ references

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
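The `if`/`else if` dispatch mentioned in this commit message is not shown in the conversation, so the following is only a sketch of one way such a pattern could look, using the portable `Vector128`/`Vector256` APIs available in .NET 7 and later. The `SimdSum` class and its `Sum` method are hypothetical names chosen for illustration; the actual pattern in `simd-vectorization.md` may differ.

```csharp
using System;
using System.Runtime.Intrinsics;

static class SimdSum // hypothetical helper, not taken from the PR
{
    // Sums a span of ints; overflow wraps, matching the default unchecked scalar addition.
    public static int Sum(ReadOnlySpan<int> values)
    {
        int i = 0;
        int total = 0;

        // Exactly one vector width is chosen, and only when the input fills at least one vector.
        if (Vector256.IsHardwareAccelerated && values.Length >= Vector256<int>.Count)
        {
            Vector256<int> acc = Vector256<int>.Zero;
            for (; i <= values.Length - Vector256<int>.Count; i += Vector256<int>.Count)
            {
                acc += Vector256.Create(values.Slice(i, Vector256<int>.Count));
            }
            total = Vector256.Sum(acc);
        }
        else if (Vector128.IsHardwareAccelerated && values.Length >= Vector128<int>.Count)
        {
            Vector128<int> acc = Vector128<int>.Zero;
            for (; i <= values.Length - Vector128<int>.Count; i += Vector128<int>.Count)
            {
                acc += Vector128.Create(values.Slice(i, Vector128<int>.Count));
            }
            total = Vector128.Sum(acc);
        }

        // Scalar loop: handles the tail, and the whole input when it is too short for any vector.
        for (; i < values.Length; i++)
        {
            total += values[i];
        }
        return total;
    }
}
```

With this shape, an input shorter than one `Vector128<int>` skips both vector branches and runs only the scalar loop, so small inputs pay no setup cost for vector accumulators they cannot use.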
/evaluate
3. **Scalar loop over contiguous array/span** of `byte`, `sbyte`, `short`, `ushort`, `int`, `uint`, `long`, `ulong`, `nint`, `nuint`, `float`, `double` (and `char` via reinterpretation as `ushort`)? → Implement with explicit `Vector128<T>` / `Vector256<T>` / `Vector512<T>` intrinsics using the patterns below.
4. **No contiguous numeric array processing** (dictionary lookups, tree traversals, linked lists, state machines, string formatting, small collections, enum comparisons, recursive algorithms, decimal arithmetic)? → Report `[NO SIMD OPPORTUNITY]` and write a **full paragraph** explaining WHY, referencing the specific code characteristics that prevent vectorization (e.g., "State machines require sequential branching on enum values — there are no contiguous numeric arrays to process in parallel, and each transition depends on the previous state"). This explanation is graded.
Should this be "Non contiguous" instead of "No contiguous"?
I debated this with the model; it felt that this language was clear to it. I asked it for a compromise, and we made a slight change.
Gotcha. I personally found them to be polar opposite statements.
One is stating that it "should not perform SIMD on numeric arrays which are contiguous"; the other is stating that it "should not perform SIMD on numeric arrays which are not contiguous".
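To make the distinction in this thread concrete, here is a small sketch contrasting the two situations, assuming the intended reading of item 4 is "the code does not process a contiguous numeric array at all", as its listed examples (linked lists, tree traversals, dictionary lookups) suggest. The class and method names are hypothetical.

```csharp
using System;
using System.Collections.Generic;

static class ContiguitySketch // hypothetical name, for illustration only
{
    // Contiguous numeric data: neighbouring elements sit side by side in memory,
    // so one vector load could read several at once. This is an item 3 candidate.
    public static int CountAboveSpan(ReadOnlySpan<int> values, int threshold)
    {
        int count = 0;
        for (int i = 0; i < values.Length; i++)
        {
            if (values[i] > threshold) count++;
        }
        return count;
    }

    // No contiguous array: each element is reached by following the previous node's link,
    // so there is no block of adjacent values to load into a vector. This is an item 4 case.
    public static int CountAboveList(LinkedList<int> values, int threshold)
    {
        int count = 0;
        for (var node = values.First; node != null; node = node.Next)
        {
            if (node.Value > threshold) count++;
        }
        return count;
    }
}
```

Both methods compute the same result; only the memory layout of the input decides whether vectorization is worth considering.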
tannergooding left a comment
Couple more nits, but LGTM
We are adding a new experimental branch to this repo, and this may be one of the first skills to give it a try. @artl93 and I discussed it, and although this seems useful, we are not sure how many people it will be most helpful for, though we are willing to discuss. (FWIW, I am on the list of people who have found it helpful for numeric libraries.)
Skill Validation Results
[1] (Plugin) Quality improved but weighted score is -8.8% due to: judgment, quality
Model: claude-opus-4.6 | Judge: claude-opus-4.6
Copilot reviewed 8 out of 8 changed files in this pull request and generated 8 comments.
Incorporated feedback, trained on 100+ scalar loop examples, moved to analyzing-dotnet-performance reference