Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Vector{<T>|64|128|256|512}.Narrow with saturation #75724

Open
gfoidl opened this issue Sep 15, 2022 · 9 comments
Open

Vector{<T>|64|128|256|512}.Narrow with saturation #75724

gfoidl opened this issue Sep 15, 2022 · 9 comments
Labels
api-approved API was approved in API review, it can be implemented area-System.Runtime.Intrinsics avx512 Related to the AVX-512 architecture
Milestone

Comments

@gfoidl
Copy link
Member

gfoidl commented Sep 15, 2022

Summary

For SSE2 there's Sse2.PackSignedSaturate which packs the two vectors with saturation.
Such functionality is missing for the xplat-vectors -- i.e. Vector128.Narrow packs the two vectors, but cuts / masks off the values outside of the target range, e.g. for short -> sbyte bits above 0xFF is cut off resulting in (for me) unexpected values.

Maybe it's a bit astonishing that Narrow doesn't pack with saturation. This just bit me, and as it seems #73064 (comment) was similar.

There should be something like NarrowWithSaturation in order to don't need to fallback to Sse2 / AdvSimd.

API Proposal

namespace System.Runtime.Intrinsics;

public static class Vector64
{
    // Existing methods left out for brevity

    [CLSCompliant(false)]
    public static Vector64<sbyte> NarrowWithSaturation(Vector64<short> lower, Vector64<short> upper);

    public static Vector64<short> NarrowWithSaturation(Vector64<int> lower, Vector64<int> upper);

    public static Vector64<int> NarrowWithSaturation(Vector64<long> lower, Vector64<long> upper);

    [CLSCompliant(false)]
    public static Vector64<byte> NarrowWithSaturation(Vector64<ushort> lower, Vector64<ushort> upper);

    [CLSCompliant(false)]
    public static Vector64<ushort> NarrowWithSaturation(Vector64<uint> lower, Vector64<uint> upper);

    public static Vector64<uint> NarrowWithSaturation(Vector64<ulong> lower, Vector64<ulong> upper);

    public static Vector64<float> NarrowWithSaturation(Vector64<double> lower, Vector64<double> upper);
}

public static class Vector128
{
    // Existing methods left out for brevity

    [CLSCompliant(false)]
    public static Vector128<sbyte> NarrowWithSaturation(Vector128<short> lower, Vector128<short> upper);

    public static Vector128<short> NarrowWithSaturation(Vector128<int> lower, Vector128<int> upper);

    public static Vector128<int> NarrowWithSaturation(Vector128<long> lower, Vector128<long> upper);

    [CLSCompliant(false)]
    public static Vector128<byte> NarrowWithSaturation(Vector128<ushort> lower, Vector128<ushort> upper);

    [CLSCompliant(false)]
    public static Vector128<ushort> NarrowWithSaturation(Vector128<uint> lower, Vector128<uint> upper);

    public static Vector128<uint> NarrowWithSaturation(Vector128<ulong> lower, Vector128<ulong> upper);

    public static Vector128<float> NarrowWithSaturation(Vector128<double> lower, Vector128<double> upper);
}

public static class Vector256
{
    // Existing methods left out for brevity

    [CLSCompliant(false)]
    public static Vector256<sbyte> NarrowWithSaturation(Vector256<short> lower, Vector256<short> upper);

    public static Vector256<short> NarrowWithSaturation(Vector256<int> lower, Vector256<int> upper);

    public static Vector256<int> NarrowWithSaturation(Vector256<long> lower, Vector256<long> upper);

    [CLSCompliant(false)]
    public static Vector256<byte> NarrowWithSaturation(Vector256<ushort> lower, Vector256<ushort> upper);

    [CLSCompliant(false)]
    public static Vector256<ushort> NarrowWithSaturation(Vector256<uint> lower, Vector256<uint> upper);

    public static Vector256<uint> NarrowWithSaturation(Vector256<ulong> lower, Vector256<ulong> upper);

    public static Vector256<float> NarrowWithSaturation(Vector256<double> lower, Vector256<double> upper);
}

public static class Vector512
{
    // Existing methods left out for brevity

    [CLSCompliant(false)]
    public static Vector512<sbyte> NarrowWithSaturation(Vector512<short> lower, Vector512<short> upper);

    public static Vector512<short> NarrowWithSaturation(Vector512<int> lower, Vector512<int> upper);

    public static Vector512<int> NarrowWithSaturation(Vector512<long> lower, Vector512<long> upper);

    [CLSCompliant(false)]
    public static Vector512<byte> NarrowWithSaturation(Vector512<ushort> lower, Vector512<ushort> upper);

    [CLSCompliant(false)]
    public static Vector512<ushort> NarrowWithSaturation(Vector512<uint> lower, Vector512<uint> upper);

    public static Vector512<uint> NarrowWithSaturation(Vector512<ulong> lower, Vector512<ulong> upper);

    public static Vector512<float> NarrowWithSaturation(Vector512<double> lower, Vector512<double> upper);
}
@dotnet-issue-labeler
Copy link

I couldn't figure out the best area label to add to this issue. If you have write-permissions please help me learn by adding exactly one area label.

@ghost ghost added the untriaged New issue has not been triaged by the area owner label Sep 15, 2022
@ghost
Copy link

ghost commented Sep 16, 2022

Tagging subscribers to this area: @dotnet/area-system-runtime-intrinsics
See info in area-owners.md if you want to be subscribed.

Issue Details

For SSE2 there's Sse2.PackSignedSaturate which packs the two vectors with saturation.
Such functionality is missing (?) for the xplat-vectors -- i.e. Vector128.Narrow packs the two vectors, but cuts / masks off the values outside of the target range, e.g. for short -> sbyte bits above 0xFF is cut off resulting in (for me) unexpected values.

Maybe it's a bit astonishing that Narrow doesn't pack with saturation. This just bit me, and as it seems #73064 (comment) was similar.

Should there be something like NarrowWithSaturation in order to don't need to fallback to Sse2 / AdvSimd?

Author: gfoidl
Assignees: -
Labels:

area-System.Runtime.Intrinsics, untriaged

Milestone: -

@tannergooding
Copy link
Member

tannergooding commented Sep 16, 2022

I'd be generally supportive of such new APIs, but this needs to be updated to be a "proper" API proposal for it to goto API review.

Basically, the proposal should minimally look like this: #73604

Provide a brief summary (which you already have) and then a section that lists the actual APIs being proposed (in this case just take all the Narrow APIs and expose NarrowSaturating or NarrowWithSaturation equivalents, not super picky on the name we can decide in API review).

@tannergooding tannergooding added api-suggestion Early API idea and discussion, it is NOT ready for implementation needs-author-action An issue or pull request that requires more info or actions from the author. and removed untriaged New issue has not been triaged by the area owner labels Sep 16, 2022
@ghost
Copy link

ghost commented Sep 16, 2022

This issue has been marked needs-author-action and may be missing some important information.

@gfoidl
Copy link
Member Author

gfoidl commented Sep 16, 2022

Updated to be a proposal -- just wanted to double-check with you before that (maybe there were good reasons why these APIs aren't available so far, etc.). Thanks.

@ghost ghost added needs-further-triage Issue has been initially triaged, but needs deeper consideration or reconsideration and removed needs-author-action An issue or pull request that requires more info or actions from the author. labels Sep 16, 2022
@tannergooding
Copy link
Member

Could you also update to cover Vector64 and Vector<T>? We want to try and maintain parity between all 4 types where possible.

Otherwise, LGTM and I can mark as api-ready-for-review.

@tannergooding
Copy link
Member

Ah, and Vector512<T> as well since that is approved but not yet implemented.

@gfoidl
Copy link
Member Author

gfoidl commented Sep 16, 2022

Updated.

@tannergooding tannergooding added api-ready-for-review API is ready for review, it is NOT ready for implementation and removed api-suggestion Early API idea and discussion, it is NOT ready for implementation needs-further-triage Issue has been initially triaged, but needs deeper consideration or reconsideration labels Sep 16, 2022
@jeffhandley jeffhandley added this to the Future milestone Sep 16, 2022
@terrajobst
Copy link
Member

terrajobst commented Oct 25, 2022

Video

  • We should also add them to Vector<T>
    • part of the proposal
  • We should differentiate between signed and unsigned narrowing
namespace System.Numerics;

public partial static class Vector
{
    [CLSCompliant(false)]
    public static Vector<sbyte> NarrowWithSaturation(Vector<short> lower, Vector<short> upper);
    public static Vector<short> NarrowWithSaturation(Vector<int> lower, Vector<int> upper);
    public static Vector<int> NarrowWithSaturation(Vector<long> lower, Vector<long> upper);
    [CLSCompliant(false)]
    public static Vector<byte> NarrowWithSaturation(Vector<ushort> lower, Vector<ushort> upper);
    [CLSCompliant(false)]
    public static Vector<ushort> NarrowWithSaturation(Vector<uint> lower, Vector<uint> upper);
    public static Vector<uint> NarrowWithSaturation(Vector<ulong> lower, Vector<ulong> upper);
    public static Vector<float> NarrowWithSaturation(Vector<double> lower, Vector<double> upper);
}
namespace System.Runtime.Intrinsics;

public partial static class Vector64
{
    [CLSCompliant(false)]
    public static Vector64<sbyte> NarrowWithSaturation(Vector64<short> lower, Vector64<short> upper);
    public static Vector64<short> NarrowWithSaturation(Vector64<int> lower, Vector64<int> upper);
    public static Vector64<int> NarrowWithSaturation(Vector64<long> lower, Vector64<long> upper);
    [CLSCompliant(false)]
    public static Vector64<byte> NarrowWithSaturation(Vector64<ushort> lower, Vector64<ushort> upper);
    [CLSCompliant(false)]
    public static Vector64<ushort> NarrowWithSaturation(Vector64<uint> lower, Vector64<uint> upper);
    public static Vector64<uint> NarrowWithSaturation(Vector64<ulong> lower, Vector64<ulong> upper);
    public static Vector64<float> NarrowWithSaturation(Vector64<double> lower, Vector64<double> upper);
}

public partial static class Vector128
{
    [CLSCompliant(false)]
    public static Vector128<sbyte> NarrowWithSaturation(Vector128<short> lower, Vector128<short> upper);
    public static Vector128<short> NarrowWithSaturation(Vector128<int> lower, Vector128<int> upper);
    public static Vector128<int> NarrowWithSaturation(Vector128<long> lower, Vector128<long> upper);
    [CLSCompliant(false)]
    public static Vector128<byte> NarrowWithSaturation(Vector128<ushort> lower, Vector128<ushort> upper);
    [CLSCompliant(false)]
    public static Vector128<ushort> NarrowWithSaturation(Vector128<uint> lower, Vector128<uint> upper);
    public static Vector128<uint> NarrowWithSaturation(Vector128<ulong> lower, Vector128<ulong> upper);
    public static Vector128<float> NarrowWithSaturation(Vector128<double> lower, Vector128<double> upper);
}

public partial static class Vector256
{
    [CLSCompliant(false)]
    public static Vector256<sbyte> NarrowWithSaturation(Vector256<short> lower, Vector256<short> upper);
    public static Vector256<short> NarrowWithSaturation(Vector256<int> lower, Vector256<int> upper);
    public static Vector256<int> NarrowWithSaturation(Vector256<long> lower, Vector256<long> upper);
    [CLSCompliant(false)]
    public static Vector256<byte> NarrowWithSaturation(Vector256<ushort> lower, Vector256<ushort> upper);
    [CLSCompliant(false)]
    public static Vector256<ushort> NarrowWithSaturation(Vector256<uint> lower, Vector256<uint> upper);
    public static Vector256<uint> NarrowWithSaturation(Vector256<ulong> lower, Vector256<ulong> upper);
    public static Vector256<float> NarrowWithSaturation(Vector256<double> lower, Vector256<double> upper);
}

public partial static class Vector512
{
    [CLSCompliant(false)]
    public static Vector512<sbyte> NarrowWithSaturation(Vector512<short> lower, Vector512<short> upper);
    public static Vector512<short> NarrowWithSaturation(Vector512<int> lower, Vector512<int> upper);
    public static Vector512<int> NarrowWithSaturation(Vector512<long> lower, Vector512<long> upper);
    [CLSCompliant(false)]
    public static Vector512<byte> NarrowWithSaturation(Vector512<ushort> lower, Vector512<ushort> upper);
    [CLSCompliant(false)]
    public static Vector512<ushort> NarrowWithSaturation(Vector512<uint> lower, Vector512<uint> upper);
    public static Vector512<uint> NarrowWithSaturation(Vector512<ulong> lower, Vector512<ulong> upper);
    public static Vector512<float> NarrowWithSaturation(Vector512<double> lower, Vector512<double> upper);
}

@terrajobst terrajobst added api-approved API was approved in API review, it can be implemented and removed api-ready-for-review API is ready for review, it is NOT ready for implementation labels Oct 25, 2022
@BruceForstall BruceForstall added the avx512 Related to the AVX-512 architecture label Oct 26, 2022
@BruceForstall BruceForstall changed the title Vector{128|256}.Narrow with saturation Vector{<T>|64|128|256|512}.Narrow with saturation Oct 26, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
api-approved API was approved in API review, it can be implemented area-System.Runtime.Intrinsics avx512 Related to the AVX-512 architecture
Projects
None yet
Development

No branches or pull requests

6 participants