Skip to content

Remove PopCountOperator<T> guards in TrailingZeroCountOperator<T>.Invoke #116721

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged

Conversation

alexcovington
Copy link
Contributor

This is a small change that removes the redundant PopCountOperator.Vectorizable guards from the TrailingZeroCount.Invoke paths for Vector256 and Vector512. The Invoke functions inlines better by removing these guards and results in a performance gain.

| Type                                      | Method            | Job        | Toolchain                   | BufferLength | Mean       | Error     | StdDev    | Median     | Min        | Max        | Ratio | Allocated | Alloc Ratio |
|------------------------------------------ |------------------ |----------- |---------------------------- |------------- |-----------:|----------:|----------:|-----------:|-----------:|-----------:|------:|----------:|------------:|
| Perf_BinaryIntegerTensorPrimitives<Byte>  | TrailingZeroCount | Job-OWIWOI | \base\Core_Root\corerun.exe | 128          |  39.416 ns | 0.3226 ns | 0.3017 ns |  39.400 ns |  38.873 ns |  39.953 ns |  1.00 |         - |          NA |
| Perf_BinaryIntegerTensorPrimitives<Byte>  | TrailingZeroCount | Job-VUQMWI | \diff\Core_Root\corerun.exe | 128          |   5.588 ns | 0.0560 ns | 0.0467 ns |   5.603 ns |   5.494 ns |   5.647 ns |  0.14 |         - |          NA |
|                                           |                   |            |                             |              |            |           |           |            |            |            |       |           |             |
| Perf_BinaryIntegerTensorPrimitives<Int32> | TrailingZeroCount | Job-OWIWOI | \base\Core_Root\corerun.exe | 128          | 110.643 ns | 0.2528 ns | 0.2365 ns | 110.560 ns | 110.404 ns | 111.067 ns |  1.00 |         - |          NA |
| Perf_BinaryIntegerTensorPrimitives<Int32> | TrailingZeroCount | Job-VUQMWI | \diff\Core_Root\corerun.exe | 128          |  14.518 ns | 0.0879 ns | 0.0822 ns |  14.492 ns |  14.430 ns |  14.677 ns |  0.13 |         - |          NA |
|                                           |                   |            |                             |              |            |           |           |            |            |            |       |           |             |
| Perf_BinaryIntegerTensorPrimitives<Byte>  | TrailingZeroCount | Job-OWIWOI | \base\Core_Root\corerun.exe | 3079         | 204.174 ns | 0.5751 ns | 0.5098 ns | 204.422 ns | 202.991 ns | 204.618 ns |  1.00 |         - |          NA |
| Perf_BinaryIntegerTensorPrimitives<Byte>  | TrailingZeroCount | Job-VUQMWI | \diff\Core_Root\corerun.exe | 3079         |  62.966 ns | 0.2158 ns | 0.1802 ns |  62.942 ns |  62.719 ns |  63.306 ns |  0.31 |         - |          NA |
|                                           |                   |            |                             |              |            |           |           |            |            |            |       |           |             |
| Perf_BinaryIntegerTensorPrimitives<Int32> | TrailingZeroCount | Job-OWIWOI | \base\Core_Root\corerun.exe | 3079         | 800.225 ns | 7.7147 ns | 7.2163 ns | 800.139 ns | 790.602 ns | 814.042 ns |  1.00 |         - |          NA |
| Perf_BinaryIntegerTensorPrimitives<Int32> | TrailingZeroCount | Job-VUQMWI | \diff\Core_Root\corerun.exe | 3079         | 301.138 ns | 1.0371 ns | 0.8097 ns | 300.876 ns | 300.098 ns | 302.453 ns |  0.38 |         - |          NA |

@github-actions github-actions bot added the needs-area-label An area label is needed to ensure this gets routed to the appropriate area owners label Jun 16, 2025
@dotnet-policy-service dotnet-policy-service bot added the community-contribution Indicates that the PR has been added by a community member label Jun 16, 2025
@MihaZupan MihaZupan added area-System.Numerics.Tensors and removed needs-area-label An area label is needed to ensure this gets routed to the appropriate area owners labels Jun 16, 2025
Copy link
Contributor

Tagging subscribers to this area: @dotnet/area-system-numerics-tensors
See info in area-owners.md if you want to be subscribed.

@tannergooding
Copy link
Member

/ba-g unrelated infra issues

@tannergooding tannergooding merged commit 3ae4edb into dotnet:main Jun 18, 2025
86 of 90 checks passed
@tannergooding
Copy link
Member

Thanks for the improvement!

@MihaZupan
Copy link
Member

Should we have a JIT tracking issue for this? Seems like quite the difference for something that should have been stripped out already.

@tannergooding
Copy link
Member

I don't think there's much the JIT can do here. It's namely due to it requiring 3 total inlining steps to see that it ultimately becomes a constant. It's a bit of a niche case and the removed check here wasn't actually needed anyways

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area-System.Numerics.Tensors community-contribution Indicates that the PR has been added by a community member
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants