Adding SIMD/TensorPrimitives to dotnet-diag/analyzing-dotnet-performance by jeffschwMSFT · Pull Request #330 · dotnet/skills

jeffschwMSFT · 2026-03-11T17:35:44Z

Incorporated feedback, trained on 100+ scalar loop examples, moved to analyzing-dotnet-performance reference

…p tests to come to this skill, moved from skill to reference in the performance skill

Copilot

Pull request overview

Adds SIMD/TensorPrimitives guidance and evaluation coverage to the dotnet-diag/analyzing-dotnet-performance skill, expanding it to recognize when scalar loops are good candidates for vectorization (or when SIMD is not applicable).

Changes:

Adds new SIMD-focused fixtures and corresponding eval scenarios (TensorPrimitives reductions, SIMD-friendly loops, and a “no SIMD opportunity” case).
Introduces a new simd-vectorization.md reference with decision gating (TensorPrimitives-first vs manual intrinsics).
Updates the skill description and detection signals to include SIMD/vectorization.

Reviewed changes

Copilot reviewed 8 out of 8 changed files in this pull request and generated 5 comments.

File	Description
tests/dotnet-diag/analyzing-dotnet-performance/fixtures/simd-tensor-primitives-product.cs	New fixture for product reduction intended for TensorPrimitives optimization.
tests/dotnet-diag/analyzing-dotnet-performance/fixtures/simd-tensor-primitives-minmax.cs	New fixture for min/max reduction intended for TensorPrimitives optimization.
tests/dotnet-diag/analyzing-dotnet-performance/fixtures/simd-no-opportunity-catalog.cs	New fixture representing a case where SIMD is not a meaningful optimization.
tests/dotnet-diag/analyzing-dotnet-performance/fixtures/simd-conditional-increment.cs	New SIMD-friendly loop fixture (conditional increment).
tests/dotnet-diag/analyzing-dotnet-performance/fixtures/simd-bit-reverser.cs	New SIMD-friendly byte processing fixture (bit reversal).
tests/dotnet-diag/analyzing-dotnet-performance/eval.yaml	Adds five new scenarios covering TensorPrimitives, SIMD intrinsics, and “no opportunity” detection.
plugins/dotnet-diag/skills/analyzing-dotnet-performance/references/simd-vectorization.md	New reference doc defining the SIMD/TensorPrimitives decision gate and implementation patterns.
plugins/dotnet-diag/skills/analyzing-dotnet-performance/SKILL.md	Updates description and signal detection to include SIMD vectorization and reference loading.

tests/dotnet-diag/analyzing-dotnet-performance/eval.yaml

tests/dotnet-diag/analyzing-dotnet-performance/fixtures/simd-tensor-primitives-product.cs

plugins/dotnet-diag/skills/analyzing-dotnet-performance/references/simd-vectorization.md

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

…nces/simd-vectorization.md Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

Copilot

Pull request overview

Copilot reviewed 8 out of 8 changed files in this pull request and generated 2 comments.

plugins/dotnet-diag/skills/analyzing-dotnet-performance/references/simd-vectorization.md

plugins/dotnet-diag/skills/analyzing-dotnet-performance/SKILL.md

…nces/simd-vectorization.md Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

Copilot

Pull request overview

Copilot reviewed 8 out of 8 changed files in this pull request and generated 1 comment.

plugins/dotnet-diag/skills/analyzing-dotnet-performance/references/simd-vectorization.md

tannergooding · 2026-03-16T16:15:49Z

Few nits/feedback, but this looks overall good!

…nces/simd-vectorization.md Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

Copilot

Pull request overview

Copilot reviewed 8 out of 8 changed files in this pull request and generated 7 comments.

tests/dotnet-diag/analyzing-dotnet-performance/eval.yaml

plugins/dotnet-diag/skills/analyzing-dotnet-performance/references/simd-vectorization.md

plugins/dotnet-diag/skills/analyzing-dotnet-performance/SKILL.md

tests/dotnet-diag/analyzing-dotnet-performance/eval.yaml

- Check Span<T>/MemoryExtensions before TensorPrimitives (no extra dependency)
- Add all supported types (sbyte, ushort, uint, ulong, nint, nuint, char via ushort)
- TensorPrimitives: add constraint and applicable types columns (not just float/double)
- Add FusedMultiplyAdd, clarify AddMultiply vs MultiplyAdd distinction
- Prefer portable APIs over platform-specific intrinsics (allow when perf justifies)
- Fix dispatch pattern to use if/else if to avoid pessimizing small inputs
- Remove LINQ references

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
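
For readers without the reference file at hand, the "if/else if" dispatch item in the commit message above could look roughly like the sketch below. It is not the pattern from simd-vectorization.md itself; the class and method names are hypothetical, and it assumes .NET 7 or later for the portable `Vector128`/`Vector256` helpers. The point it illustrates is that exactly one vector width is chosen per call, so short inputs fall straight through to the scalar loop instead of paying for several width checks and partial vector passes.

```csharp
using System;
using System.Runtime.Intrinsics;

// Hypothetical illustration only; not taken from the PR's reference document.
static class SimdDispatchSketch
{
    // Sums a span of ints, picking at most one vector width per call.
    public static int Sum(ReadOnlySpan<int> values)
    {
        int i = 0;
        int sum = 0;

        if (Vector256.IsHardwareAccelerated && values.Length >= Vector256<int>.Count)
        {
            Vector256<int> acc = Vector256<int>.Zero;
            for (; i <= values.Length - Vector256<int>.Count; i += Vector256<int>.Count)
            {
                // Load one full vector from the current position and accumulate lane-wise.
                acc += Vector256.Create(values.Slice(i, Vector256<int>.Count));
            }
            sum = Vector256.Sum(acc);
        }
        else if (Vector128.IsHardwareAccelerated && values.Length >= Vector128<int>.Count)
        {
            Vector128<int> acc = Vector128<int>.Zero;
            for (; i <= values.Length - Vector128<int>.Count; i += Vector128<int>.Count)
            {
                acc += Vector128.Create(values.Slice(i, Vector128<int>.Count));
            }
            sum = Vector128.Sum(acc);
        }

        // Scalar loop: handles the remaining tail, or the whole input when it is
        // too short for the selected vector width or no acceleration is available.
        for (; i < values.Length; i++)
        {
            sum += values[i];
        }

        return sum;
    }
}
```

Because only the portable `Vector128`/`Vector256` APIs are used, the same source runs on x64 and Arm64, which matches the "prefer portable APIs" item in the same commit.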

jeffschwMSFT · 2026-03-16T23:36:14Z

/evaluate

tannergooding · 2026-03-16T23:39:16Z

plugins/dotnet-diag/skills/analyzing-dotnet-performance/references/simd-vectorization.md

+
+3. **Scalar loop over contiguous array/span** of `byte`, `sbyte`, `short`, `ushort`, `int`, `uint`, `long`, `ulong`, `nint`, `nuint`, `float`, `double` (and `char` via reinterpretation as `ushort`)? → Implement with explicit `Vector128<T>` / `Vector256<T>` / `Vector512<T>` intrinsics using the patterns below.
+
+4. **No contiguous numeric array processing** (dictionary lookups, tree traversals, linked lists, state machines, string formatting, small collections, enum comparisons, recursive algorithms, decimal arithmetic)? → Report `[NO SIMD OPPORTUNITY]` and write a **full paragraph** explaining WHY, referencing the specific code characteristics that prevent vectorization (e.g., "State machines require sequential branching on enum values — there are no contiguous numeric arrays to process in parallel, and each transition depends on the previous state"). This explanation is graded.


Should this be "Non contiguous" instead of "No contiguous"?

I debated this with the model; it felt that this language was clear to it. I asked it for a compromise and we made a slight change.

Gotcha. I personally found them to be polar opposite statements.

One states that it "should not perform SIMD on numeric arrays which are contiguous"; the other states that it "should not perform SIMD on numeric arrays which are not contiguous".
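
To make step 3 of the quoted decision gate concrete, here is a minimal sketch of a scalar loop over a contiguous `int` span rewritten with `Vector128<T>`. It is loosely modeled on the "conditional increment" scenario named in this PR, but the fixture's actual code is not shown here, so the method name, the threshold parameter, and the exact condition are assumptions made for illustration. It assumes .NET 7 or later.

```csharp
using System;
using System.Runtime.Intrinsics;

// Hypothetical illustration; the real fixture in the PR may differ.
static class ConditionalIncrementSketch
{
    // Adds 1 to every element that is strictly greater than threshold, in place.
    public static void IncrementAbove(Span<int> values, int threshold)
    {
        int i = 0;

        if (Vector128.IsHardwareAccelerated && values.Length >= Vector128<int>.Count)
        {
            Vector128<int> limit = Vector128.Create(threshold);
            Vector128<int> one = Vector128<int>.One;

            for (; i <= values.Length - Vector128<int>.Count; i += Vector128<int>.Count)
            {
                Span<int> block = values.Slice(i, Vector128<int>.Count);
                ReadOnlySpan<int> source = block;
                Vector128<int> v = Vector128.Create(source);

                // Lanes where v > limit become all-ones; the others become zero.
                Vector128<int> mask = Vector128.GreaterThan(v, limit);

                // Take v + 1 in the selected lanes and keep v everywhere else.
                Vector128.ConditionalSelect(mask, v + one, v).CopyTo(block);
            }
        }

        // Scalar tail, and full fallback when vectors are not usable.
        for (; i < values.Length; i++)
        {
            if (values[i] > threshold)
            {
                values[i]++;
            }
        }
    }
}
```

The branch inside the scalar loop turns into a lane mask, which works because each element is independent of its neighbors. The cases listed in step 4 (dictionary lookups, state machines, and so on) have no such independent, contiguous lanes to operate on.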

plugins/dotnet-diag/skills/analyzing-dotnet-performance/references/simd-vectorization.md

tannergooding

Couple more nits, but LGTM

jeffschwMSFT · 2026-03-16T23:45:58Z

We are adding a new experimental branch to this repo, and this may be one of the first skills to give it a try. @artl93 and I discussed this, and although it seems useful, we are not sure how many people it will help, though we are willing to discuss. (FWIW, I am on the list of people who have found it helpful for numeric libraries.)

github-actions · 2026-03-16T23:48:43Z

Skill Validation Results

Skill	Scenario	Quality (Isolated)	Quality (Plugin)	Skills Loaded	Overfit	Verdict
directory-build-organization	Organize build infrastructure for a multi-project repo	3.0/5 → 5.0/5 🟢	3.0/5 → 4.7/5 🟢	✅ directory-build-organization; tools: skill, create, edit, bash, task / ✅ msbuild-antipatterns; directory-build-organization; tools: task, bash, skill	✅ 0.15	✅
dotnet-trace-collect	High CPU in Kubernetes on Linux (.NET 8)	3.7/5 → 4.7/5 🟢	3.7/5 → 4.3/5 🟢	✅ dotnet-trace-collect; tools: report_intent, skill, view, glob, bash / ✅ dotnet-trace-collect; tools: skill, report_intent, view	✅ 0.16	✅
dotnet-trace-collect	.NET Framework on Windows without admin privileges	2.0/5 → 5.0/5 🟢	2.0/5 → 5.0/5 🟢	✅ dotnet-trace-collect; tools: skill / ✅ dotnet-trace-collect; tools: skill	✅ 0.16	✅
dotnet-trace-collect	.NET 10 on Linux with root access and native call stacks	1.7/5 → 4.0/5 🟢	1.7/5 → 4.0/5 🟢	✅ dotnet-trace-collect; tools: skill / ✅ dotnet-trace-collect; tools: skill	✅ 0.16	✅
dotnet-trace-collect	Memory leak on Linux (.NET 8)	2.3/5 → 3.0/5 🟢	2.3/5 → 3.0/5 🟢	✅ dotnet-trace-collect; tools: skill, report_intent, view, bash / ✅ dotnet-trace-collect; tools: skill, report_intent, view	✅ 0.16	✅
dotnet-trace-collect	Slow requests on Windows with PerfView	3.7/5 → 5.0/5 🟢	3.7/5 → 5.0/5 🟢	✅ dotnet-trace-collect; tools: skill, report_intent, view, glob, bash / ✅ dotnet-trace-collect; tools: skill, report_intent, view	✅ 0.16	✅
dotnet-trace-collect	Excessive GC on Linux (.NET 8)	3.3/5 → 5.0/5 🟢	3.3/5 → 4.7/5 🟢	✅ dotnet-trace-collect; tools: skill, glob / ✅ dotnet-trace-collect; tools: skill	✅ 0.16	✅
dotnet-trace-collect	Hang or deadlock diagnosis on Linux	2.7/5 → 3.7/5 🟢	2.7/5 → 3.0/5 🟢	✅ dotnet-trace-collect; tools: skill / ✅ dotnet-trace-collect; dump-collect; tools: skill, report_intent, view	✅ 0.16	❌ [1]
dotnet-trace-collect	Windows container high CPU with PerfView	1.7/5 → 4.3/5 🟢	1.7/5 → 5.0/5 🟢	✅ dotnet-trace-collect; tools: skill, glob / ✅ dotnet-trace-collect; tools: skill	✅ 0.16	✅
dotnet-trace-collect	Long-running intermittent issue with PerfView triggers	2.3/5 → 5.0/5 🟢	2.3/5 → 4.3/5 🟢	✅ dotnet-trace-collect; tools: skill, report_intent, view, bash, glob / ✅ dotnet-trace-collect; tools: skill, report_intent, view	✅ 0.16	✅
dotnet-trace-collect	Linux pre-.NET 10 needing native call stacks	2.7/5 → 4.7/5 🟢	2.7/5 → 4.3/5 🟢	✅ dotnet-trace-collect; tools: skill, report_intent, view, bash / ✅ dotnet-trace-collect; tools: skill, report_intent, view	✅ 0.16	✅
dotnet-trace-collect	Windows modern .NET with admin high CPU	2.0/5 → 4.7/5 🟢	2.0/5 → 5.0/5 🟢	✅ dotnet-trace-collect; tools: skill, report_intent, view, bash, glob / ✅ dotnet-trace-collect; tools: skill, report_intent, view	✅ 0.16	✅
dotnet-trace-collect	Memory leak on .NET Framework Windows	3.3/5 → 5.0/5 🟢	3.3/5 → 5.0/5 🟢	✅ dotnet-trace-collect; tools: report_intent, skill, view, glob, bash / ✅ dotnet-trace-collect; tools: report_intent, skill, view	✅ 0.16	✅
dotnet-trace-collect	Kubernetes with console access prefers console tools	4.3/5 → 4.7/5 🟢	4.3/5 → 5.0/5 🟢	✅ dotnet-trace-collect; tools: skill, report_intent, view, bash / ✅ dotnet-trace-collect; tools: skill, report_intent, view	✅ 0.16	❌ [2]
dotnet-trace-collect	Container installation without .NET SDK	3.0/5 → 3.3/5 🟢	3.0/5 → 4.7/5 🟢	✅ dotnet-trace-collect; tools: skill / ✅ dotnet-trace-collect; tools: skill	✅ 0.16	❌ [3]
dotnet-trace-collect	HTTP 500s from downstream service on Linux (.NET 8)	4.3/5 → 5.0/5 🟢	4.3/5 → 5.0/5 🟢	✅ dotnet-trace-collect; tools: skill, report_intent, view, bash, glob / ✅ dotnet-trace-collect; tools: report_intent, skill, view	✅ 0.16	✅
dotnet-trace-collect	Networking timeouts on Windows with admin (.NET 8)	2.0/5 → 5.0/5 🟢	2.0/5 → 4.7/5 🟢	✅ dotnet-trace-collect; tools: report_intent, skill, view, bash / ✅ dotnet-trace-collect; tools: skill, report_intent, view	✅ 0.16	✅
analyzing-dotnet-performance	Detects compiled regex startup budget and regex chain allocations	1.0/5 → 1.0/5	1.0/5 → 1.0/5	⚠️ NOT ACTIVATED / ✅ analyzing-dotnet-performance; tools: skill	✅ 0.14	❌ [4]
analyzing-dotnet-performance	Detects CurrentCulture comparer and compiled regex budget in inflection rules	1.0/5 → 1.0/5	1.0/5 → 1.0/5	⚠️ NOT ACTIVATED / ⚠️ NOT ACTIVATED	✅ 0.14	❌ [5]
analyzing-dotnet-performance	Finds per-call Dictionary allocation not hoisted to static	1.0/5 → 1.0/5	1.0/5 → 1.0/5	⚠️ NOT ACTIVATED / ⚠️ NOT ACTIVATED	✅ 0.14	❌ [6]
analyzing-dotnet-performance	Catches compound allocations in recursive number converter with ToLower	1.0/5 → 1.3/5 🟢	1.0/5 → 1.0/5	⚠️ NOT ACTIVATED / ⚠️ NOT ACTIVATED	✅ 0.14	❌ [7]
analyzing-dotnet-performance	Finds StringComparison.Ordinal missing and FrozenDictionary opportunities	1.0/5 → 1.0/5	1.0/5 → 1.0/5	⚠️ NOT ACTIVATED / ⚠️ NOT ACTIVATED	✅ 0.14	❌ [8]
analyzing-dotnet-performance	Detects Aggregate+Replace chain and struct missing IEquatable	1.0/5 → 1.0/5	1.0/5 → 1.0/5	⚠️ NOT ACTIVATED / ⚠️ NOT ACTIVATED	✅ 0.14	❌ [9]
analyzing-dotnet-performance	Finds branched Replace chain in format string manipulation	1.0/5 → 1.0/5	1.0/5 → 1.0/5	⚠️ NOT ACTIVATED / ⚠️ NOT ACTIVATED	✅ 0.14	❌ [10]
analyzing-dotnet-performance	Catches LINQ on hot-path string processing and All(char.IsUpper)	1.0/5 → 1.0/5	1.0/5 → 1.0/5	✅ analyzing-dotnet-performance; tools: glob, skill / ⚠️ NOT ACTIVATED	✅ 0.14	❌ [11]
analyzing-dotnet-performance	Detects LINQ pipeline in TimeSpan formatting and collection processing	1.0/5 → 1.0/5	1.0/5 → 1.0/5	⚠️ NOT ACTIVATED / ⚠️ NOT ACTIVATED	✅ 0.14	❌ [12]
analyzing-dotnet-performance	Flags Span inconsistencies and compound method chains in truncation library	1.3/5 → 1.0/5 🔴	1.3/5 → 1.0/5 🔴	⚠️ NOT ACTIVATED / ⚠️ NOT ACTIVATED	✅ 0.14	❌
analyzing-dotnet-performance	Identifies unsealed leaf classes and locale hierarchy patterns	1.0/5 → 1.0/5	1.0/5 → 1.3/5 🟢	⚠️ NOT ACTIVATED / ⚠️ NOT ACTIVATED	✅ 0.14	❌ [13]
analyzing-dotnet-performance	Optimize manual min/max with TensorPrimitives	1.0/5 ⏰ → 1.0/5	1.0/5 ⏰ → 1.0/5 ⏰	⚠️ NOT ACTIVATED / ⚠️ NOT ACTIVATED	✅ 0.14	❌ [14]
analyzing-dotnet-performance	Optimize manual product with TensorPrimitives	1.0/5 ⏰ → 1.0/5 ⏰	1.0/5 ⏰ → 2.0/5 ⏰ 🟢	✅ analyzing-dotnet-performance; tools: skill / ✅ analyzing-dotnet-performance; tools: skill, edit	✅ 0.14	✅
analyzing-dotnet-performance	No optimization opportunity — dictionary-based lookup service	1.0/5 → 1.0/5 ⏰	1.0/5 → 1.0/5 ⏰	✅ analyzing-dotnet-performance; tools: edit, create, skill / ✅ analyzing-dotnet-performance; tools: create, skill	✅ 0.14	❌ [15]
analyzing-dotnet-performance	Optimize int array conditional increment with SIMD	4.7/5 ⏰ → 3.3/5 ⏰ 🔴	4.7/5 ⏰ → 3.7/5 ⏰ 🔴	⚠️ NOT ACTIVATED / ⚠️ NOT ACTIVATED	✅ 0.14	❌
analyzing-dotnet-performance	Optimize byte buffer bit reversal with SIMD	3.7/5 ⏰ → 1.0/5 ⏰ 🔴	3.7/5 ⏰ → 1.0/5 ⏰ 🔴	⚠️ NOT ACTIVATED / ⚠️ NOT ACTIVATED	✅ 0.14	❌

[1] (Plugin) Quality improved but weighted score is -8.8% due to: judgment, quality
[2] (Isolated) Quality improved but weighted score is -10.0% due to: tokens (11654 → 105134), tool calls (0 → 6), time (8.8s → 39.4s)
[3] (Isolated) Quality improved but weighted score is -46.7% due to: judgment, quality
[4] (Isolated) Quality unchanged but weighted score is -13.2% due to: judgment, tokens (35194 → 40879)
[5] (Plugin) Quality unchanged but weighted score is -1.3% due to: tokens (34971 → 43307)
[6] (Plugin) Quality unchanged but weighted score is -0.8% due to: tokens (34933 → 38931)
[7] (Plugin) Quality unchanged but weighted score is -5.7% due to: tokens (23092 → 39001), tool calls (2 → 3), time (12.2s → 17.1s)
[8] (Isolated) Quality unchanged but weighted score is -0.8% due to: tokens (34917 → 40485)
[9] (Plugin) Quality unchanged but weighted score is -0.1% due to: tokens (34957 → 38958)
[10] (Isolated) Quality unchanged but weighted score is -3.3% due to: tokens (38898 → 52988), tool calls (3 → 4), time (13.4s → 16.7s)
[11] (Isolated) Quality unchanged but weighted score is -7.7% due to: tokens (34909 → 60848), tool calls (3 → 6), time (11.9s → 19.0s)
[12] (Isolated) Quality unchanged but weighted score is -1.4% due to: tokens (42920 → 48986), time (17.5s → 22.0s)
[13] (Isolated) Quality unchanged but weighted score is -0.3% due to: efficiency metrics
[14] (Plugin) Quality unchanged but weighted score is -14.1% due to: errors (0 → 1), tokens (109364 → 204707), time (71.7s → 180.1s), tool calls (9 → 17)
[15] (Isolated) Quality unchanged but weighted score is -15.0% due to: tokens (47249 → 196950), errors (0 → 1), tool calls (4 → 16), time (20.2s → 127.0s)

⏰ timeout — run hit the scenario timeout limit; scoring may be impacted by aborting model execution before it could produce its full output

Model: claude-opus-4.6 | Judge: claude-opus-4.6

Copilot

Pull request overview

Copilot reviewed 8 out of 8 changed files in this pull request and generated 8 comments.

tests/dotnet-diag/analyzing-dotnet-performance/fixtures/simd-tensor-primitives-product.cs

tests/dotnet-diag/analyzing-dotnet-performance/fixtures/simd-tensor-primitives-minmax.cs

tests/dotnet-diag/analyzing-dotnet-performance/eval.yaml

plugins/dotnet-diag/skills/analyzing-dotnet-performance/SKILL.md

incorporated feedback from previous pr, iterated with 100+ scalar loo…

a19d353

…p tests to come to this skill, moved from skill to reference in the performance skill

jeffschwMSFT requested a review from a team as a code owner March 11, 2026 17:35

jeffschwMSFT requested review from artl93, Copilot and tannergooding March 11, 2026 17:35

Copilot started reviewing on behalf of jeffschwMSFT March 11, 2026 17:36

Copilot AI reviewed Mar 11, 2026

Update tests/dotnet-diag/analyzing-dotnet-performance/eval.yaml

fd3fcb3

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

Copilot AI review requested due to automatic review settings March 11, 2026 17:41

Update tests/dotnet-diag/analyzing-dotnet-performance/eval.yaml

385c1f1

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

Copilot started reviewing on behalf of jeffschwMSFT March 11, 2026 17:41

jeffschwMSFT and others added 2 commits March 11, 2026 10:43

Update plugins/dotnet-diag/skills/analyzing-dotnet-performance/refere…

71b89f6

…nces/simd-vectorization.md Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

Update plugins/dotnet-diag/skills/analyzing-dotnet-performance/refere…

95bdaf7

…nces/simd-vectorization.md Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

Copilot AI reviewed Mar 11, 2026

plugins/dotnet-diag/skills/analyzing-dotnet-performance/references/simd-vectorization.md (outdated)

plugins/dotnet-diag/skills/analyzing-dotnet-performance/SKILL.md

Update plugins/dotnet-diag/skills/analyzing-dotnet-performance/refere…

a165429

…nces/simd-vectorization.md Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

Copilot AI review requested due to automatic review settings March 11, 2026 23:52

Copilot started reviewing on behalf of jeffschwMSFT March 11, 2026 23:52

Copilot AI reviewed Mar 11, 2026
