
[Efficiency Improver] Fix O(n²) hash collisions in DuplicateDataRowAnalyzer #7796

@github-actions

Description


🤖 This is an automated PR from Daily Efficiency Improver, an AI assistant focused on reducing the energy consumption and computational footprint of this repository.


Goal and Rationale

DuplicateDataRowAnalyzer uses a Dictionary<ImmutableArray<TypedConstant>, int> to detect duplicate [DataRow] attributes on a test method. The custom TypedConstantArrayComparer.GetHashCode only hashed Kind and Type — but not the actual Value.

This meant that for any test method with multiple [DataRow(int)] attributes (e.g. [DataRow(1)], [DataRow(2)], [DataRow(3)]...), every argument array produced the same hash, turning every dictionary operation into an O(n) linear scan and making the overall duplicate-detection algorithm O(n²).
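The failure mode can be reproduced outside Roslyn. The following Python sketch is illustrative only (the `Key` class and counters are hypothetical stand-ins, not the analyzer's code): a hash that ignores the value forces the hash table to fall back on equality checks against every previously inserted entry.

```python
eq_calls = 0

class Key:
    """Stand-in for a [DataRow] argument array: equality depends on the
    value, but the hash may or may not (that is the bug being modeled)."""
    def __init__(self, value, hash_value):
        self.value = value
        self._hash = hash_value

    def __hash__(self):
        return self._hash

    def __eq__(self, other):
        global eq_calls
        eq_calls += 1  # models one call to the comparer's Equals
        return self.value == other.value

def count_equality_checks(keys):
    global eq_calls
    eq_calls = 0
    seen = set()
    for key in keys:
        seen.add(key)  # colliding keys must be compared against all prior ones
    return eq_calls

n = 100
# Value-blind hash (like hashing only Kind/Type): every key shares one bucket.
collisions = count_equality_checks([Key(i, hash_value=0) for i in range(n)])
# Value-aware hash: distinct values land in distinct buckets.
no_collisions = count_equality_checks([Key(i, hash_value=hash(i)) for i in range(n)])
print(collisions, no_collisions)
```

With the constant hash, the k-th insert compares against all k-1 earlier keys, so the total grows quadratically; with the value-aware hash, no equality checks are needed at all.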

Focus Area

Code-Level Efficiency — Algorithmic complexity: O(n²) where O(n) suffices.

Approach

Include the TypedConstant.Value in the hash function, matching the recursive comparison logic already present in Equals. For array-typed arguments, recursively hash the Values.

Before

public int GetHashCode(ImmutableArray<TypedConstant> obj)
{
    var hashCode = default(RoslynHashCode);
    foreach (TypedConstant typedConstant in obj)
    {
        hashCode.Add(typedConstant.Kind);
        hashCode.Add(SymbolEqualityComparer.Default.GetHashCode(typedConstant.Type));
    }
    return hashCode.ToHashCode();
}

After

public int GetHashCode(ImmutableArray<TypedConstant> obj)
{
    var hashCode = default(RoslynHashCode);
    foreach (TypedConstant typedConstant in obj)
    {
        hashCode.Add(typedConstant.Kind);
        hashCode.Add(SymbolEqualityComparer.Default.GetHashCode(typedConstant.Type));

        if (!typedConstant.IsNull)
        {
            if (typedConstant.Kind == TypedConstantKind.Array)
            {
                hashCode.Add(GetHashCode(typedConstant.Values));
            }
            else
            {
                hashCode.Add(typedConstant.Value);
            }
        }
    }
    return hashCode.ToHashCode();
}

Energy Efficiency Evidence

Proxy metric: CPU cycles / instruction count — fewer dictionary comparisons means fewer instructions executed by the Roslyn analyzer host on every build.

Scenario                                     | Before                  | After
n DataRow attributes, same parameter type    | O(n²) hash comparisons  | O(n) — one per entry
[DataRow(1)]..[DataRow(100)] on one method   | ~5050 equality checks   | ~100 equality checks
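The rough counts above follow from the triangular-number sum: when every key collides, the k-th insert scans all earlier entries. A quick arithmetic check in Python:

```python
n = 100
# Worst case under total collision: the k-th insert scans k earlier entries.
quadratic_checks = n * (n + 1) // 2  # 1 + 2 + ... + n
linear_checks = n                    # one hash lookup per entry
print(quadratic_checks, linear_checks)  # 5050 100
```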

MSTest projects routinely have data-driven test methods with dozens of [DataRow] attributes. The analyzer runs on every build — the fix directly reduces the number of CPU operations per compilation.

GSF principle: Hardware Efficiency — reducing unnecessary computation lowers energy draw per build on CI and developer machines.

Edge Cases

  • Null values: skipped (no change — IsNull is already checked before accessing Value).
  • Array arguments: recursively hashed to match the recursive Equals path.
  • 0.0f vs -0.0f: Equals treats these as different (it compares the raw bits via BitConverter). Their hash codes may coincide (in .NET, float.GetHashCode() returns 0 for both), but that is only a collision, which the hash contract allows: equal values must hash equally; unequal values are permitted to share a hash.
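The negative-zero point can be sanity-checked in any language with IEEE 754 floats. This Python sketch is analogous to, not identical with, .NET's BitConverter comparison and GetHashCode behavior:

```python
import struct

# 0.0 and -0.0 compare equal numerically but differ at the bit level,
# which is what a BitConverter-style byte comparison observes.
bits_pos = struct.pack('<f', 0.0)
bits_neg = struct.pack('<f', -0.0)
assert 0.0 == -0.0            # numeric equality
assert bits_pos != bits_neg   # bitwise inequality (the sign bit differs)

# Coinciding hashes are fine: the contract only requires that *equal*
# values share a hash; unequal values sharing one is an ordinary collision.
print(bits_pos.hex(), bits_neg.hex(), hash(0.0) == hash(-0.0))
```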

Trade-offs

  • Minimal added complexity: one if/else block.
  • Slightly more work per element in GetHashCode, but vastly fewer dictionary collisions compensate.

Reproducibility

To observe the improvement, benchmark a method with 100+ DataRow attributes of the same primitive type and compare Roslyn analysis time before/after.

Test Status

All 11 existing DuplicateDataRowAnalyzerTests pass, including the float/double negative-zero edge cases.

Note

🔒 Integrity filter blocked 6 items

The following items were blocked because they don't meet the GitHub integrity level.

  • #2340 search_issues: has lower integrity than agent requires. The agent cannot read data with integrity below "approved".
  • #757 search_issues: has lower integrity than agent requires. The agent cannot read data with integrity below "approved".
  • #6335 search_issues: has lower integrity than agent requires. The agent cannot read data with integrity below "approved".
  • #3759 search_issues: has lower integrity than agent requires. The agent cannot read data with integrity below "approved".
  • #7787 list_pull_requests: has lower integrity than agent requires. The agent cannot read data with integrity below "approved".
  • #6611 list_pull_requests: has lower integrity than agent requires. The agent cannot read data with integrity below "approved".

To allow these resources, lower min-integrity in your GitHub frontmatter:

tools:
  github:
    min-integrity: approved  # merged | approved | unapproved | none

Generated by Daily Efficiency Improver


Note

This was originally intended as a pull request, but GitHub Actions is not permitted to create or approve pull requests in this repository.
The changes have been pushed to branch efficiency/duplicate-datarow-hashcode-fix-4a6e49ff096d33a2.


To fix the permissions issue, go to Settings > Actions > General and enable "Allow GitHub Actions to create and approve pull requests". See also: gh-aw FAQ

Patch preview (53 lines):
From 80902e574a6df2008c412dbd20ff62caafda5568 Mon Sep 17 00:00:00 2001
From: "github-actions[bot]" <github-actions[bot]@users.noreply.github.com>
Date: Fri, 24 Apr 2026 04:35:19 +0000
Subject: [PATCH] perf(analyzers): fix O(n²) hash collisions in DuplicateDataRowAnalyzer
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

The TypedConstantArrayComparer.GetHashCode only hashed Kind and Type,
ignoring the actual value. This caused all DataRow attributes of the
same type (e.g., [DataRow(1)], [DataRow(2)], ...) to hash to the same
bucket, turning dictionary lookups into O(n) scans and the overall
duplicate-detection into O(n²).

Fix: include the TypedConstant Value (and recursive Values for arrays)
in the hash, so distinct DataRow arguments produce distinct buckets.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
---
 .../MSTest.Analyzers/DuplicateDataRowAnalyzer.cs  | 15 +++++++++++++++
 1 file changed, 15 insertions(+)

diff --git a/src/Analyzers/MSTest.Analyzers/DuplicateDataRowAnalyzer.cs b/src/Analyzers/MSTest.Analyzers/DuplicateDataRowAnalyzer.cs
index 6d560c7..99154d5 100644
--- a/src/Analyzers/MSTest.Analyzers/DuplicateDataRowAnalyzer.cs
+++ b/src/Analyzers/MSTest.Analyzers/DuplicateDataRowAnalyzer.cs
@@ -153,6 +153,21 @@ public int GetHashCode(ImmutableArray<TypedConstant> obj)
             {
                 hashCode.Add(typedConstant.Kind);
                 hashCode.Add(SymbolEqualityComparer.Default.GetHashCode(typedConstant.Type));
+
+                if (!typedConstant.IsNull)
+                {
+                    if (typedConstant.Kind == TypedConstantKind.Array)
+                    {
+                        // Recursively hash array elements to match recursive equality check.
+                        hashCode.Add(GetHashCode(typedConstant.Values));
+                    }
+                    else
+                    {
+                        // In
... (truncated)
