Skip to content
This repository has been archived by the owner on Jan 23, 2023. It is now read-only.
/ corefx Public archive

Add Span/ReadOnlySpan TryCast #26662

Closed
wants to merge 1 commit into from
Closed

Add Span/ReadOnlySpan TryCast #26662

wants to merge 1 commit into from

Conversation

ianhays
Copy link
Contributor

@ianhays ianhays commented Jan 29, 2018

  • Exposes Span.TryCast and ReadOnlySpan.TryCast as extension methods in MemoryExtensions.
  • Implemented for both the fast and portable implementations in equivalent ways, though eventually the fast implementation should be moved to corelib with the other MemoryExtensions.Fast methods and implemented using compiler directives (e.g. #IF intelProcessorCompiled) or maybe an intrinsic for the else case.
  • I decided that requiring a fixed memory location for the source span would be too restrictive, so I implemented the functions in a way that works even if the memory pointed to by the span gets moved mid-operation or post-operation. The downside to this is that we can't handle the "accidentally aligned" case as well, but this is a worthy tradeoff in my opinion. More details in the doc comments for TryCast.
  • Added a bunch of tests for TryCast that assert the behavior for param types of various sizes and alignment.
  • This code is based around two large assumptions:
    • The alignment rules for TFrom will be followed if sourceSpan is moved.
    • The maximum alignment size of any data item is 8 bytes. For C# this is true for all primitives, including decimal.

resolves https://github.com/dotnet/corefx/issues/26465

cc: @stephentoub @ahsonkhan @KrzysztofCwalina @GrabYourPitchforks

…in MemoryExtensions.

- Implemented for both the fast and portable implementations in equivalent ways, though eventually [the fast implementation should be moved to corelib with the other MemoryExtensions.Fast methods](https://github.com/dotnet/corefx/issues/25182) and implemented using compiler directives (e.g. #IF intelProcessorCompiled) or maybe an intrinsic for the else case.
- I decided that requiring a fixed memory location for the source span would be too restrictive, so I implemented the functions in a way that works even if the memory pointed to by the span gets moved mid-operation or post-operation. The downside to this is that we can't handle the "accidentally aligned" case as well, but this is a worthy tradeoff in my opinion. More details in the doc comments for TryCast.
- Added a bunch of tests for TryCast that assert the behavior for param types of various sizes and alignment.
- This code is based around two large assumptions:
  - The alignment rules for TFrom will be followed if sourceSpan<TFrom> is moved.
  - The maximum alignment size of any data item is 8 bytes. For C# this is true for all primitives, including decimal.
@ianhays ianhays self-assigned this Jan 29, 2018
@ianhays ianhays added this to the 2.1.0 milestone Jan 29, 2018
@@ -75,6 +75,8 @@ public static partial class MemoryExtensions
public static System.ReadOnlySpan<char> TrimStart(this System.ReadOnlySpan<char> span, char trimChar) { throw null; }
public static System.ReadOnlySpan<char> TrimStart(this System.ReadOnlySpan<char> span, System.ReadOnlySpan<char> trimChars) { throw null; }
public static bool TryGetString(this System.ReadOnlyMemory<char> readOnlyMemory, out string text, out int start, out int length) { throw null; }
public static bool TryCast<TFrom, TTo>(this ReadOnlySpan<TFrom> source, out ReadOnlySpan<TTo> output) where TFrom : struct where TTo : struct { throw null; }
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As we talked in the other thread, it would be better to make these instance methods. I am not sure if you'd like to merge this and then separately DCR, or make the API change now.

// This check also returns false if it's coincidentally aligned to TTo in the case where sizeof(TFrom)<sizeof(TTo). This is
// because the source location can move such that it no longer adheres to the alignment requirement of TTo while still
// following the alignment rules for it's declared/constructed type, TFrom.
void* pointer = Unsafe.AsPointer(ref MemoryMarshal.GetReference(source));
Copy link
Member

@jkotas jkotas Jan 30, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The GC can move the objects around. For example - on ARM, the byref that was 8-byte aligned now can be 8-byte misaligned a moment later.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Interesting. I thought that when GC moves objects, it maintains word alignment.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Depends on the GC implementation...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This code is based around two large assumptions:
The alignment rules for TFrom will be followed if sourceSpan is moved.

This is one of the assumptions made in this implementation.

for example - on ARM, the byref that was 8-byte aligned now can be 8-byte misaligned a moment later.

So you're saying that on ARM an array of 8-byte aligned TFrom's can be moved by the GC to a location such that it is no longer 8-byte aligned? Wouldn't that cause any subsequent valid TFrom accesses to fail, since unaligned access is unsupported? How does anything work when on a unaligned-unsupported architecture if the GC is breaking the alignment rules when it moves blocks?

Copy link
Member

@jkotas jkotas Jan 30, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ARM an array of 8-byte aligned TFrom's can be moved by the GC to a location such that it is no longer 8-byte aligned

Right.

How does anything work when on a unaligned-unsupported architecture if the GC is breaking the alignment rules when it moves blocks?

The GC is not breaking any rules. The GC knows what the array type is. It will only do it when it is safe (e.g. when the array type is byte).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It will only do it when it is safe (e.g. when the array type is byte).

Are you talking about the case in which the array type that the GC is aware of has a different alignment than the TFrom that is given to TryCast?

I am operating on the assumption that the alignment of TFrom is equivalent to the alignment that the GC is aware of. If they're different then the source Span is already subject to failed unaligned accesses and so our TryCast isn't making anything worse by allowing a cast from an invalid type to another (potentially) invalid type.

So under that assumption, the GC will either:

  • maintain the alignment rules of TFrom
  • fail whether we allow the cast or not, in which case we might as well allow it.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Consider this:

// Note this type is guaranteed to be 4-byte aligned. There is nothing forcing it to be 8-byte aligned.
struct Bundle {
    int x,y;
}

Bundle[] a = new Bundle[10]; // Let's assume that the GC will happen to allocate the array on 8-byte boundary
Span<Bundle> s = new Span<Bundle>(a);
bool success = s.TryCast<Bundle,double>(out Span<double> sd); // This will succeed
//GC moves `a` to a different place that is not 8-byte aligned anymore ...
double d = sd[5]; // Misaligned access fault

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see, thanks for the example. We will return the wrong result in the case where TFrom is a struct in which all of these conditions are met:

  1. Unsafe.SizeOf<TFrom> != actualTFromAlignment
  2. actualTFromAlignment < Unsafe.SizeOf<TTo>
  3. addressof(source) % Unsafe.SizeOf<TTo> == 0

This is essentially the reverse of the issue I commented about above:

 // This check will return false in the case where TTo is a struct that is 8 bytes long but only 1/2/4-byte aligned and the source pointer
// is 1/2/4 byte-aligned but not also 8-byte aligned. Unfortunately, to catch that case we would have to walk the 
// struct element sizes to determine the actual struct alignment.

Except in this case we return true incorrectly while in the other case we return false incorrectly, which is of course a more severe error.

Both are the result of the difference between SizeOf and SizeOfLargestElement. I'm not aware of a way to get the latter without reflection.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How do you feel about:

int sizeOfTFrom = typeof(TFrom).IsPrimitive ? Math.Min(8, Unsafe.SizeOf<TFrom>()) : 1;

Without being able to look into the struct to determine the real alignment, we can't really assume it follows any alignment but byte alignment, which means TryCast will pretty much always return false when TFrom is a struct.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It will make the API slower for the common use case of casting span of bytes to span of chars; or span of ints.

I think it is a good thing for the portable version of the API to be more conservative. It will tech folks to write proper fallbacks for this.

if (RuntimeHelpers.IsReferenceOrContainsReferences<TTo>())
ThrowHelper.ThrowArgumentException_InvalidTypeWithPointersNotSupported(typeof(TTo));

if (RuntimeInformation.OSArchitecture == Architecture.X64 ||
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should to check ProcessArchitecture, not OSArchitecture.

(You pretty much never want to use Runtime.OSArchitecture property. It is only useful if you want to deal with Windows WoW quirks that is very rare.)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And the implementation of either of these API is dog slow (each invocation takes a lock)...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And the implementation of either of these API is dog slow (each invocation takes a lock)...

Is there a better way to check the processor architecture in a fully portable implementation?

@@ -3,6 +3,8 @@
// See the LICENSE file in the project root for more information.

using System.Runtime.CompilerServices;
using System.Runtime.InteropServices;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we get rid of the AggressiveInlining, do we need using System.Runtime.CompilerServices; anymore now that we use Internal.Runtime.... for Fast span?

{
public char a1; // 2 bytes
// 2 bytes padding
public int a2; // 4 bytes
Copy link
Member

@ahsonkhan ahsonkhan Jan 30, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this struct guaranteed to be laid out like this or do we need to set explicit offsets, for example:
https://github.com/dotnet/corefx/blob/master/src/System.Memory/tests/TestHelpers.cs#L241?

}

[Fact]
public static unsafe void TryCastSpan_Overflow()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is this unsafe?

}

[Fact]
public static unsafe void TryCastSpan_ToEmptyStruct()
Copy link
Member

@ahsonkhan ahsonkhan Jan 30, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same here, these don't need to be unsafe.

here and elsewhere

/// <summary>
/// Attempts to cast a ReadOnlySpan of one primitive type <typeparamref name="TFrom"/> to another primitive type <typeparamref name="TTo"/>.
/// </summary>
/// <param name="source">The source slice, of type <typeparamref name="TFrom"/>.</param>
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: remove "slice," or rename to "span"

/// </summary>
/// <param name="source">The source slice, of type <typeparamref name="TFrom"/>.</param>
/// <param name="output">The destination of type <typeparamref name="TTo"/>.</param>
/// <remarks>If <typeparamref name="TTo"/> is 8-byte aligned and <paramref name="source"/> points to a valid aligned <typeparamref name="TTo"/> address,
Copy link
Member

@ahsonkhan ahsonkhan Jan 30, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For x86, it could be 4-byte aligned too, right? Update to word size/pointer size.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Interesting point. Is the maximum alignment of a C# type on an x86 cpu 4 bytes? Seems like the word size wouldn't affect the alignment, but I'm not certain.

For some context, the shortcut here is that when a TFrom is aligned by 8-bytes then it is also aligned to 4-bytes, 2-bytes, 1-bytes. And since TTo can be at most aligned by 8-bytes, any TTo will also be aligned.


// First we have to make sure the given pointer follows the alignment rules of its TFrom type. Since the TFrom alignment will
// be preserved even if the memory is moved, we can safely make assumptions about the alignment of TTo without the need for pinning.
int sizeOfTFrom = Math.Min(8, Unsafe.SizeOf<TFrom>());
Copy link
Member

@ahsonkhan ahsonkhan Jan 30, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need to have different implementations for IntPtr.Size == 4 and IntPtr.Size == 8?

/// this will always return false</remarks>
/// <remarks>Does not require the memory being spanned over to be fixed.</remarks>
/// <returns>True if successful; else False</returns>
public static bool TryCast<TFrom, TTo>(this Span<TFrom> source, out Span<TTo> output) where TFrom : struct where TTo : struct
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it possible for this method to call the ReadOnlySpan one or will the out parameter cause issues?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe the out parameter would cause issues.

/// this will always return false</remarks>
/// <remarks>Does not require the memory being spanned over to be fixed.</remarks>
/// <returns>True if successful; else False</returns>
public static bool TryCast<TFrom, TTo>(this Span<TFrom> source, out Span<TTo> output) where TFrom : struct where TTo : struct
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The Fast and Portable implementations of this API are essentially identical. Should we just have one copy in MemoryExtensions.cs? MemoryExtensions.Fast.cs only contains extension methods with different implementation and calls the Span class (within CoreLib) directly.

Copy link
Member

@jkotas jkotas Jan 30, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I expect that we will move MemoryExtensions to CoreLib. Once that happens we can have much better, faster and accurate, platform-specific implementation of this API for fast Span. I do not think we need to bother with sharing it.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

They're duplicated so we can replace the Fast implementation once it gets moved to corelib.
From the commit comments:

Implemented for both the fast and portable implementations in equivalent ways, though eventually the fast implementation should be moved to corelib with the other MemoryExtensions.Fast methods and implemented using compiler directives (e.g. #IF intelProcessorCompiled) or maybe an intrinsic for the else case.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right, this is hard to get right in portable way. I would suggest that the portable implementation just checks that sizeof(TTo) <= sizeof(IntPtr) and returns false otherwise.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe even do it for x86/x64 and avoid dependency on the RuntimeInformation package.

@jkotas
Copy link
Member

jkotas commented Feb 4, 2018

We should validate the usefulness of this API on real world examples. It has proven to be very useful in other cases, e.g. resulting to the API not being added (#26598 (comment)) or significant changes in design/implementation (dotnet/coreclr#15608 (comment)).

@ianhays
Copy link
Contributor Author

ianhays commented Feb 5, 2018

The existence of https://github.com/dotnet/corefx/issues/26644 seems to suggest that it's former version, NonPortableCast, was being used at least enough to spark an (accepted) proposal to promote it from extension method to member.

I'm not necessarily against closing this and re-evaluating #26465, but to do so would likely require that we postpone it until the 2.2 timeframe as the 2.1 API cutoff is fast approaching. That would leave us without a portable casting solution since Cast was made fast-span only.

@jkotas
Copy link
Member

jkotas commented Feb 5, 2018

portable casting solution since Cast was made fast-span only

MemoryMarshal.Cast is in both slow Span and fast Span.

I'm not necessarily against closing

I would also feel better if we do this with longer runway. I am not convinced that this is the right API to have. The design will likely require iterations to get right.

@ianhays
Copy link
Contributor Author

ianhays commented Feb 6, 2018

MemoryMarshal.Cast is in both slow Span and fast Span.

So it is. I was confusing it with Create which was moved to fast-only.

I would also feel better if we do this with longer runway. I am not convinced that this is the right API to have. The design will likely require iterations to get right.

Since we don't have a good answer for portable span anyways, I'm fine with that.

@ianhays ianhays closed this Feb 6, 2018
@ianhays ianhays deleted the trycast branch June 11, 2018 17:59
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Add a Span.TryCast extension method
4 participants