
[API Proposal]: Clearer additional method overloads for narrowing/widening a vector #113307

Closed
@gerhard17


Background and motivation

Coming from the need to narrow a Vector256<double> to a Vector128<float>, I was confronted with three different coding possibilities, which produced different codegen (with identical results).

using System.Runtime.Intrinsics;

public static class TestClass
{ 	
	public static Vector128<float> Narrow1(Vector256<double> value) {
		return Vector128.Narrow(value.GetLower(), value.GetUpper());
	}

	public static Vector128<float> Narrow2(Vector256<double> value) {
		return Vector256.Narrow(value, Vector256<double>.Zero).GetLower();
	}
   	
	public static Vector128<float> Narrow3(Vector256<double> value) {
		return Vector256.Narrow(value, value).GetLower();
	}
}

These translate to the following code on my workstation (Windows x64, .NET 9.0, AVX2 support):

TestClass.Narrow1(System.Runtime.Intrinsics.Vector256`1<Double>)
    L0000: vmovups ymm0, [rdx]
    L0004: vmovaps ymm1, ymm0
    L0008: vcvtpd2ps xmm1, xmm1
    L000c: vextractf128 xmm0, ymm0, 1
    L0012: vcvtpd2ps xmm0, xmm0
    L0016: vmovlhps xmm0, xmm1, xmm0
    L001a: vmovups [rcx], xmm0
    L001e: mov rax, rcx
    L0021: vzeroupper
    L0024: ret

TestClass.Narrow2(System.Runtime.Intrinsics.Vector256`1<Double>)
    L0000: vcvtpd2ps xmm0, ymmword ptr [rdx]
    L0004: vxorps ymm1, ymm1, ymm1
    L0008: vcvtpd2ps xmm1, ymm1
    L000c: vinsertf128 ymm0, ymm0, xmm1, 1
    L0012: vmovups [rcx], xmm0
    L0016: mov rax, rcx
    L0019: vzeroupper
    L001c: ret

TestClass.Narrow3(System.Runtime.Intrinsics.Vector256`1<Double>)
    L0000: vcvtpd2ps xmm0, ymmword ptr [rdx]
    L0004: vmovaps ymm1, ymm0
    L0008: vinsertf128 ymm0, ymm1, xmm0, 1
    L000e: vmovups [rcx], xmm0
    L0012: mov rax, rcx
    L0015: vzeroupper
    L0018: ret

where Narrow3() seems to be the optimal one on my workstation.
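The "identical results" claim above can be checked with a small sketch like the following. The class name NarrowComparison and the method AllAgree are hypothetical names introduced only for this illustration; the three variants are repeated from the issue so the sample is self-contained.

```csharp
using System.Runtime.Intrinsics;

// Hypothetical helper, only for illustrating the comparison.
public static class NarrowComparison
{
    // Variant 1: narrow the two 128-bit halves into one 128-bit result.
    static Vector128<float> Narrow1(Vector256<double> v)
        => Vector128.Narrow(v.GetLower(), v.GetUpper());

    // Variant 2: narrow with a zero vector as the (discarded) upper input.
    static Vector128<float> Narrow2(Vector256<double> v)
        => Vector256.Narrow(v, Vector256<double>.Zero).GetLower();

    // Variant 3: narrow with the value itself as the (discarded) upper input.
    static Vector128<float> Narrow3(Vector256<double> v)
        => Vector256.Narrow(v, v).GetLower();

    // Returns true when all three variants yield the same four floats.
    // Note: operator == compares element-wise, so an input containing NaN
    // reports false even when the variants agree bit for bit.
    public static bool AllAgree(Vector256<double> v)
    {
        Vector128<float> expected = Narrow1(v);
        return expected == Narrow2(v) && expected == Narrow3(v);
    }
}
```

For example, NarrowComparison.AllAgree(Vector256.Create(1.1, 2.2, 3.3, 4.4)) should return true. This only confirms that the results match; it says nothing about which variant is faster.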

API Proposal

I suggest additional, clearer methods for narrowing a Vector256<TFrom> to a Vector128<TTo>, and for widening in the opposite direction.
The TFrom/TTo conversion pairs are: double/float, long/int, ulong/uint, int/short, uint/ushort, short/sbyte, ushort/byte.

namespace System.Runtime.Intrinsics;

public static class Vector256
{
    public static Vector128<float> Narrow(Vector256<double> value) => ...
    public static Vector128<int> Narrow(Vector256<long> value) => ...
    public static Vector128<uint> Narrow(Vector256<ulong> value) => ...
    public static Vector128<short> Narrow(Vector256<int> value) => ...
    public static Vector128<ushort> Narrow(Vector256<uint> value) => ...
    public static Vector128<sbyte> Narrow(Vector256<short> value) => ...
    public static Vector128<byte> Narrow(Vector256<ushort> value) => ...

    public static Vector256<double> Widen(Vector128<float> value) => ...
    public static Vector256<long> Widen(Vector128<int> value) => ...
    public static Vector256<ulong> Widen(Vector128<uint> value) => ...
    public static Vector256<int> Widen(Vector128<short> value) => ...
    public static Vector256<uint> Widen(Vector128<ushort> value) => ...
    public static Vector256<short> Widen(Vector128<sbyte> value) => ...
    public static Vector256<ushort> Widen(Vector128<byte> value) => ...
}

and analogous methods on Vector128 and Vector512...

Remark: I'm aware that, as opposed to the current Narrow/Widen methods, these new methods cannot be implemented as generic methods.
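Until such overloads exist, roughly the same shape can be sketched as user-side helpers on top of the current API. This is only an illustration under assumptions: VectorConvert is a hypothetical type, not part of .NET, and the narrowing helpers simply reuse the Narrow3 pattern from above, which was the best variant on one machine only.

```csharp
using System.Runtime.Intrinsics;

// Hypothetical helper type; not part of the .NET API.
public static class VectorConvert
{
    // double -> float: narrow (value, value) and keep the lower half,
    // which holds the four converted elements of 'value'.
    public static Vector128<float> Narrow(Vector256<double> value)
        => Vector256.Narrow(value, value).GetLower();

    // long -> int: same pattern; each element is truncated to 32 bits.
    public static Vector128<int> Narrow(Vector256<long> value)
        => Vector256.Narrow(value, value).GetLower();

    // float -> double: widen into two 128-bit halves and recombine them.
    public static Vector256<double> Widen(Vector128<float> value)
    {
        (Vector128<double> lower, Vector128<double> upper) = Vector128.Widen(value);
        return Vector256.Create(lower, upper);
    }
}
```

As the remark notes, each element type needs its own overload, so the remaining pairs (ulong/uint, int/short, and so on) would follow the same pattern.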

API Usage

Vector256<double> v0 = Vector256.Create(1.1, 2.2, 3.3, 4.4);

Vector128<float> v1 = Vector256.Narrow(v0);
Vector256<double> v2 = Vector256.Widen(v1);

Alternative Designs

Clarify the optimal usage in the current API documentation.

Risks

Because these are new overloads, I see no risk.

Metadata

Labels

api-suggestion: Early API idea and discussion, it is NOT ready for implementation
area-System.Runtime.Intrinsics
needs-author-action: An issue or pull request that requires more info or actions from the author.
