Provide a generic API to read from and write to a pointer #16143
I assume the discussion should continue here. Since I can't wait to be able to use this I created an implementation of this and published it as a NuGet package called I have constrained the type to be a value type, as compared to @jkotas code and the proposal above, so the definition is: public static class Unsafe
{
[MethodImpl(MethodImplOptions.AggressiveInlining)]
public static unsafe T Read<T>(void* p) where T : struct
[MethodImpl(MethodImplOptions.AggressiveInlining)]
public static unsafe void Write<T>(void* p, T value) where T : struct
} I assume this shouldn't work for reference types. I also have not changed the order of parameters for the The implementation in IL, which I did by hand from a normal class library, can be found at https://github.com/DotNetCross/Memory.Unsafe/blob/master/il/DotNetCross.Memory.Unsafe.il which contains the following definition of the
With regard to alignment, I agree with @CarolEidt that unaligned read/write has only a small extra cost on Intel Sandy Bridge or later. However, it is still an extra cost and also means higher cache usage if I remember correctly (I am not an assembly/micro-architecture expert). More importantly, what about other architectures such as ARM? Some ARM architectures don't even have unaligned read/write. Additionally, what about dynamically allocated memory on the stack? Does this also assume aligned access? That is, a contrived example could be: public static unsafe void StackAllocAlignmentDoNotDoThis()
{
var ptr = stackalloc byte[128];
ptr += 3;
var value = Unsafe.Read<double>(ptr);
} Now, you definitely shouldn't do this, but I am just trying to understand how alignment will work. Perhaps this should be discussed separately, as I have nothing against the proposed API. I would just prefer an API which also had explicit AlignedRead/AlignedWrite, UnalignedRead/UnalignedWrite, with fallback to whatever makes sense for a given platform. There should still be Read/Write with sensible defaults, which as @CarolEidt has mentioned are:
|
Constraining T to value type does not prevent one from using it for reference types. One just needs to go through the extra hassle of wrapping the reference type in a struct:
There are valid situations for
The abstraction level that this is coded against is IL. In IL, all access is assumed to be sufficiently aligned by default. Unaligned access has to be requested explicitly with the unaligned. prefix on the IL instruction. So the simple natural way to implement UnalignedRead would be:
It is the JIT's problem to figure out the most efficient way to translate this to machine instructions on a given platform in a given context. E.g. on platforms without direct instructions for unaligned access, the JIT has to expand this into a series of byte loads unless it can prove that the pointer is aligned. |
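As a hedged illustration of the aligned-by-default versus explicitly unaligned distinction described above, the sketch below assumes the `System.Runtime.CompilerServices.Unsafe` class as it later shipped, whose `ReadUnaligned`/`WriteUnaligned` methods correspond to an `ldobj`/`stobj` carrying the `unaligned.` prefix. The class name `UnalignedDemo` and the offset of 3 are illustrative only, and the code must be compiled with unsafe code enabled.

```csharp
using System.Runtime.CompilerServices;

public static class UnalignedDemo
{
    // Writes a double at a deliberately misaligned address and reads it back.
    // The unaligned variants tell the JIT not to assume natural alignment.
    public static unsafe double RoundTrip(double value)
    {
        byte* buffer = stackalloc byte[16];
        byte* p = buffer + 3; // not a multiple of sizeof(double)

        Unsafe.WriteUnaligned<double>(p, value);
        return Unsafe.ReadUnaligned<double>(p);
    }
}
```

With plain `Unsafe.Read<double>(p)` the same pointer would be treated as sufficiently aligned, which is exactly the case the contrived stackalloc example earlier in the thread worries about.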
@jkotas of course, I wasn't trying to imply this would make it "safe" as such, just that this could indicate the expected use case, guess I wasn't imaginative enough :). Could you elaborate on the valid situations for using |
E.g. Low level code that does reflection: |
I have updated DotNetCross.Memory.Unsafe to no longer have the generic value type constraint. I have also added @jkotas not sure I understand how the code you point to would benefit from fixed (IntPtr* pObj = &obj.m_pEEType)
{
IntPtr pData = (IntPtr)pObj;
IntPtr pField = pData + fieldOffset;
return LoadReferenceTypeField(pField);
} However, I do understand that it is beneficial to be able to cast a pointer (i.e. address) to be seen as a reference to a reference type. That makes sense. Despite all the issues that might follow with the GC etc.
From this I would assume that all
Which means that no SIMD non-stack locations will be accessed unaligned. Which would be terrible if it had to do byte loads as mentioned:
And this is exactly my concern: we have no way, through IL, of forcing the JIT to use aligned access. As you say, we express in IL that the access should be aligned, but the JIT does what it thinks best. So I am not saying we should change the API, I'm just trying to express what we would ideally want. Full control ;) Finally, there is one scenario that isn't covered: casting a pointer to a value type pointer, i.e. without loading the value type to the stack/register. This could be an issue for big value types e.g. public struct BigValueType
{
public double M00;
public double M01;
public double M02;
public double M10;
public double M11;
public double M12;
public double M20;
public double M21;
public double M22;
} Really, we do not want to load all of this if we only want to change part of it, e.g. public unsafe void AddToM02(void* ptr, double value)
{
BigValueType* big = Unsafe.Cast<BigValueType>(ptr);
big->M02 += value;
} Something we would like to do in a generic way. So would it even be possible to cast a pointer too? Like this: T* Cast<T>(void* p) Pretty sure C# won't accept this or am I wrong? We would to have the |
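One way to get the "modify in place without loading the whole struct" behaviour asked about above is to reinterpret the pointer as a managed reference rather than as a `T*`. The sketch below is an assumption-laden illustration, not the API proposed in this thread: it relies on `Unsafe.AsRef<T>(void*)` from the later-shipped `System.Runtime.CompilerServices.Unsafe` class and on C# 7 ref locals, and `BigValueDemo` is a hypothetical name. The struct is repeated with public fields so the block is self-contained.

```csharp
using System.Runtime.CompilerServices;

public struct BigValueType
{
    public double M00, M01, M02;
    public double M10, M11, M12;
    public double M20, M21, M22;
}

public static class BigValueDemo
{
    // Reinterprets the raw pointer as a ref to BigValueType and updates
    // a single field in place; the struct is never copied as a whole.
    public static unsafe void AddToM02(void* ptr, double value)
    {
        ref BigValueType big = ref Unsafe.AsRef<BigValueType>(ptr);
        big.M02 += value;
    }

    public static unsafe double Demo()
    {
        BigValueType local = default(BigValueType);
        local.M02 = 1.0;
        AddToM02(&local, 2.5); // locals are fixed, so & needs no fixed block
        return local.M02;
    }
}
```

The caller remains responsible for making sure the pointer targets stack memory, pinned memory, or native memory for as long as the reference is used.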
The thing is that IL knows nothing about SIMD. As far as IL is concerned
It would be better to investigate if aligned SIMD access is truly needed. AFAIK there's no perf penalty if you use unaligned loads/stores (
Fixing this requires compiler support. Either
I don't think that's needed in this case. The input pointer is already unmanaged. |
This sounds like optimizer hints, like what C/C++ compilers have. A potential way to express this in IL would be via
I would track the optimizer hints separately because it is a non-trivial feature on its own, and it is unlikely to help the |
Yeah ok I thought the Interlocked was there for a reason. So the problem was in
It would definitely be much better to test this instead of my perhaps unfounded theoretical concerns. Although it is not unheard of that unaligned reads/writes can have significant consequences, see http://www.agner.org/optimize/blog/read.php?i=285 where:
But that is probably an outlier. Agner has more info at http://www.agner.org/optimize/#manuals |
Incidentally Agner's manual is one source for my "AFAIK":
That's a bit more complicated and not solely caused by unaligned data. And again, there's a distinction between using unaligned load instructions and unaligned loads. |
@jkotas so @mikedn I agree and for x64 this discussion is probably moot, but given that .NET Core is targeting cross-platform, we should at least consider the consequences for ARM etc. However, this should probably be discussed separately too, I guess.
Not sure I understand this, isn't it the output pointer type that is the problem for By the way I have noticed many other projects that have the With regards to API I am happy with its elegant simplicity, so perhaps to move forward placement of API is a more pertinent question. We would prefer it to be as central as possible, as this would also make it useable for most projects without extra dependencies. |
Yep, that's a long story since the SIMD support for ARM is simply non-existent at the moment. And before discussing Unsafe.Read/Write there are a lot of other things to settle, including how SIMD vectors are loaded from normal arrays.
The devil is in the details. What does Either way it works fine as long as the location pointed by the pointer is in stack or in a pinned GC heap object or in native memory. And that's expected since the input to these functions is a I don't think we'll see |
I have created one. |
@mikedn I thought ARM NEON was pretty widespread in high-end smartphones and server ARM. I have been playing around a bit with [MethodImpl(MethodImplOptions.AggressiveInlining)]
public static unsafe int SizeOf<T>() Or in IL:
Not sure how this differs from Not sure if |
A completely different issue I have with However if we have say 31 bytes, AVX2 is as such not possible since we have less than 32 bytes, therefore, we often revert to using 128-bit (e.g. SSE2) registers instead, for the remaining, so at least 16 of the 32 bytes can be handled fast. As far as I can tell there is no way to use SIMD registers less than the largest available for the given architecture one is running on. I.e. if 256-bit registers are available, there is no way to access/use 128-bit registers, and with Unless one would have specific types for vectors with static sizes e.g. |
Yes, it is. But .NET's SIMD support for NEON does not exist at the moment.
|
Something like that is discussed in #16094 |
Since we had some internal discussions regarding this API, and since we are still trying to finalize the API itself, I'm pushing this back to RTM as there's no reason to try to rush this to completion in the next few days. |
Makes sense to me. |
Let us know if there is anything we could do to help with implementing it. I have added One thing that surprised me a bit was how explicit layout was not handled as I expected see https://github.com/DotNetCross/Memory.Unsafe/blob/master/src/DotNetCross.Memory.Unsafe.UnitTests/Unsafe.SizeOf.Test.cs#L29 basically |
Does |
@JonHanna I added Regarding public static Array1D<T> CreateCopy<T>(T[] source)
where T : struct
{
fixed (void* sourcePtr = &source) // Cannot fix or pin generic managed array
{ Is there any way this can be done using |
For now, it is not very straightforward. One way to do it - this pattern is actually used in several places in BCL:
Once byref locals and returns are added to C#, it will become simpler:
|
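The snippets referred to in the answer above are not preserved here. As a rough sketch of the general idea only (pin a generic array by reinterpreting a reference to its first element), the following assumes the later-shipped `System.Runtime.CompilerServices.Unsafe` class (`Unsafe.As<TFrom, TTo>` and `Unsafe.SizeOf<T>`) and a C# compiler with ref returns. `ArrayCopyDemo` is a hypothetical name, the method returns a plain `T[]` instead of the `Array1D<T>` from the question, and the raw byte copy is only appropriate when `T` contains no object references.

```csharp
using System;
using System.Runtime.CompilerServices;

public static class ArrayCopyDemo
{
    // Copies a generic array by pinning both arrays through a byte reference
    // to their first element. Assumes T holds no managed references.
    public static unsafe T[] CreateCopy<T>(T[] source) where T : struct
    {
        var copy = new T[source.Length];
        if (source.Length == 0)
        {
            return copy; // nothing to pin; source[0] would throw
        }

        long byteCount = (long)source.Length * Unsafe.SizeOf<T>();

        // Reinterpret "ref T" as "ref byte"; fixed then pins the owning array.
        fixed (byte* src = &Unsafe.As<T, byte>(ref source[0]))
        fixed (byte* dst = &Unsafe.As<T, byte>(ref copy[0]))
        {
            Buffer.MemoryCopy(src, dst, byteCount, byteCount);
        }

        return copy;
    }
}
```

The cast does not change what is pinned: the GC still sees an interior pointer into the array, so the array stays in place for the duration of the `fixed` block.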
@jkotas thanks! So we need to reinterpret the reference type as a public T Cast<T>(object o)
{
ldarg.0
ret
} which is basically what the |
I agree that generic unsafe casting operations are interesting in general. However, the above pinning pattern is special and so it may deserve a dedicated API - even though it is still just an unsafe cast underneath. There are a number of different interesting unsafe casting operations:
|
Yes in C++ terms I would have said we have a Additionally, I do not like the object obj = "test";
var weirdObject = Unsafe.Cast(obj).To<SomeWeirdObject>();
int value = 42;
var valueAsDouble = Unsafe.Cast(value).To<double>();
var array = new double[32];
fixed (void* pinPtr = &Unsafe.Cast(array).ToPinnable().Pin)
{
void* firstPtr = Unsafe.AddressOf(ref array[0]);
} with public class Pinnable
{
public byte Pin;
} I am not a big fan of the term public static ValueCaster<T> Cast<T>(T value)
where T : struct
{
return new ValueCaster<T>(value);
}
public static ReferenceCaster Cast(object obj)
{
return new ReferenceCaster(obj);
} with these types doing the actual cast (that is we have to add the hand-written IL to them): public struct ValueCaster<T>
where T : struct
{
public readonly T Value;
public ValueCaster(T value) {
Value = value;
}
public TTo To<TTo>() where TTo : struct {
// Add IL
}
}
public struct ReferenceCaster
{
public readonly object Object;
public ReferenceCaster(object obj) {
Object = obj;
}
public TTo To<TTo>() where TTo : class {
// Add IL
}
public Pinnable ToPinnable()
{ return To<Pinnable>(); }
} Not sure it makes sense to actually split these into two types but it allows us to only have I would assume the JIT can actually compile this to machine code without extra runtime costs despite the intermediary types. |
I just realized the |
I like the
Right, it does not work too well. The closest thing you can do is to treat it as unsafe unboxing ... but unsafe unboxing would probably deserve a separate method. |
I agree completely that the design goal for Since we are already talking about a whole family of public static class Unsafe
{
public static unsafe T Read<T>(void* p)
public static unsafe void Write<T>(void* p, T value)
public static unsafe int SizeOf<T>()
//
// All the castings or "As" family of methods
//
public static T As<T>(object obj)
// Perhaps leave out since very close to As<Pinnable>
public static Pinnable AsPinnable(object obj)
// Instead of AddressOf
public static unsafe void* AsPointer<T>(ref T value)
// Instead of AsByteRef a generic version? When possible in C#
public static unsafe ref TTo AsRef<TFrom, TTo>(ref TFrom value)
// If at all possible...
public static unsafe TTo AsValue<TFrom, TTo>(TFrom value)
}
public class Pinnable
{
public byte Pin;
} Could you explain how unbox IL might look for Note that I put |
This should constrain T to be a class. It won't work well for valuetypes.
Agree.
I cannot think of what a reasonable implementation of this would be. And all options that I can think of are just trivial combinations of the other Unsafe APIs (e.g.
This may be sealed. |
Thanks for the precision. It is indeed the case for me. Also, I think that getting rid of the public void WithFixed()
{
fixed (SomeStruct* self = &this)
{
// ...
}
} versus public void WithUnsafeAsPointer()
{
var self = (SomeStruct*) Unsafe.AsPointer(ref this);
// ...
} |
I've done some tests with BenchmarkDotNet to compare the speed of the three methods (static with pointer, instance + fixed, instance + Unsafe.AsPointer). The speed difference between fixed and Unsafe.AsPointer is almost nonexistent, while the static version is a lot faster. Result of BenchmarkDotNet (x64, RyuJIT)
Looking at the assembly code, I see that the version with fixed is not inlined (causing a call), which may be expected. But when comparing the static version and the This gist contains the test code I used. tl;dr is:
RyuJIT x64 on .NET 4.6.1 Static version: ; return VarKey.GetKeyStatic(m_ptr);
mov rax, qword ptr [rcx+10h] ; access 'm_ptr'
cmp dword ptr [rax], eax ; attempt to trigger a nullref if null?
lea rcx, [rax+6] ; load address of this.Byte0
movzx eax,word ptr [rax+4] ; read this.Size
mov qword ptr [rdx],rcx ; inlined USlice ctor
mov dword ptr [rdx+8],eax ; (cont.)
ret AsPointer version: ; return m_ptr->GetKeyUnsafeAsPointer();
sub rsp,28h
mov rsi,rdx
mov rdi, qword ptr [rcx+10h] ; access 'm_ptr'
cmp dword ptr [rdi], edi ; attempt to trigger a nullref if null?
lea rcx, [rdi+6] ; load address of this.Byte0
call 00007FF7CAE001E0 ; what is this call??
movzx eax,word ptr [rax+4] ; read this.Size
mov qword ptr [rdx],rcx ; inlined USlice ctor
mov dword ptr [rdx+8],eax ; (cont.)
ret I'm not sure what the additional call does, maybe Unsafe.AsPointer is not inlined correctly? |
It isn't inlined, the IL of AsPointer appears to be
The argument is a managed pointer but the declared return type is an unmanaged pointer. The JIT doesn't inline the method because the types do not match. I think there should be a |
@mikedn I did not know that was an issue, but I haven't perf tested @jkotas does this make sense? Should public static void* AsPointer<T>(ref T value)
{
ldarg.0
conv.u
ret
} I can easily change this in https://github.com/DotNetCross/Memory.Unsafe |
It is a general rule in the current JIT, there's some code in the JIT's importer that looks like: if (returnType != originalCallType)
{
impAbortInline(true, false, "Return types are not matching.");
return false;
} I suspect that the condition could be relaxed but that may require the JIT to automatically insert the |
@AndyAyersMS may want to comment, as he is working on refactoring (and ultimately tuning) the inlining code in the JIT. The JIT could potentially try to do something to adjust for a mismatch, but I think it could be potentially error-prone or unsafe, and of limited value. |
Yes, it makes sense - it is the right way to address this problem. |
Thanks to all for the valuable info. @KrzysFR I have updated https://github.com/DotNetCross/Memory.Unsafe (and the nuget package new version 0.2.1.0) so @mikedn @AndyAyersMS without context I would assume the type check here is whether or not the "type" is a reference type or a value type. And if the call changes from reference type to value type or vice versa then this would be a problem. Or are you in fact tracking whether it is a "managed" or "native" type? Because, when |
@nietras Thanks I saw that, reran the benchmark, and now I'm getting the same performance as the static version. Though, curiously when I'm extracting the assembly code generated, I'm getting the exact same code as before (with the call) but performance points to it being inlined...? Probably an issue on my part, I'm not sure what is the most efficient way to extract the assembler from a test method with VS on .NET Desktop, without having to CTRL-F5, hookup the debugger, and then attempt to navigate to the test method.... I was using the LoH in my small contrived test case just to speed things up. In the actual code the memory is allocated on a native heap or comes from a memory mapped file. |
I think there's some sort of confusion between what the code is supposed to be doing and how it does it. It's not about what pointers point to (value types, references types etc.), it's solely about the type of the pointers. The |
Except when you are writing interop code with native memory, or using memory mapped files, where this becomes critical (if you want to remove copying). But I understand that it was not an intended use case when .NET/C# was first developed. So yes it is unusual in the sense that only low-level/high performance code (such as the Kestrel http server, or game engines, or databases, or JSON parsers) would need to deal with such things. It would be nice if such code would be easier to write in C# though :) |
I don't understand how that justifies having mismatched types in IL. Is there something that prevents the use of |
@mikedn Sorry, I did not understand that you were talking about @nietras I somehow was able to get VS to show me the proper assembly code for the AsPointer version, and it looks identical to the static one, except that it uses a different register. Thanks for the fix! |
No, just trying to understand. And trying to make sure the
Yeah I was way off :) Thanks for the explanation. I am asking, because this is interesting in the context of Looking at a possible future method My question could then be why is I guess I do not understand how the distinction of managed pointer or unmanaged pointer is defined. That is, which pointers are considered managed. Completely outside the scope of this and on a grander scope, have adding native integer types e.g.
It would have been great to have |
Yes,
Yes,
It wouldn't be legal (though "legal" is too strong, the IL spec doesn't seem to state anywhere that such code is not legal, just unverifiable).
Managed pointers are reported to the GC, unmanaged pointers are not. You, the author of the code, declare which pointers are managed (type The GC needs to know about managed pointers so it can update them when objects move. Managed pointers may point to unmanaged memory in which case they still need to be reported to the GC but the GC will ignore them. Unmanaged pointers do not carry any such requirements. See III.1.1.5.1 and III.1.1.5.2 in ECMA 335 spec.
As far as the JIT is concerned there's no such thing as
|
@mikedn Thank you very much that is very clear. I understand the GC requirements, but had been wondering why there was no concept of a "native pointer" in the jit, but this is just an integer type then, so that is how the distinction is made. |
@mellinoe @terrajobst since the timeline for I have been thinking about the statements (below) made by @mikedn and how this could perhaps allow an efficient definition of
Currently public partial struct Span<T> : IEnumerable<T>, IEquatable<Span<T>>
{
/// <summary>A managed array/string; or null for native ptrs.</summary>
internal readonly object Object;
/// <summary>A byte-offset into the array/string; or a native ptr.</summary>
internal readonly UIntPtr Offset;
/// <summary>Fetches the number of elements this Span contains.</summary>
public readonly int Length;
} This uses the I also noticed on dotnet/corefxlab#572 (comment) that an addition to [MethodImpl(MethodImplOptions.AggressiveInlining)]
public static T ReadUnaligned<T>(ref T p)
{
ldarg.0
unaligned. 1 ldobj !!T
ret
} I am considering adding this to By the way is |
Nope. You can't store a managed pointer in a reference variable as such a pointer may point to the interior of a managed object. When the GC encounters a reference it expects that the reference points to the beginning of the object because it needs to access the method table and the object header. I've never been curious enough to check how GC tracks managed pointers but they're likely problematic exactly due to the fact that they may point to the interior of a managed object. And probably this is why such pointers can only exist on the stack, you cannot have a struct/class field of managed pointer type.
|
Exactly, so I couldn't understand how this could be possible either but the sentence:
Kept nagging me and I thought perhaps there was some way to define public partial struct Span<T> : IEnumerable<T>, IEquatable<Span<T>>
{
/// <summary>A managed array/string; OR native pointer</summary>
internal readonly object Object;
/// <summary>A byte-offset into the array/string; or a native ptr.</summary>
internal readonly int Offset;
/// <summary>Fetches the number of elements this Span contains.</summary>
public readonly int Length;
} That is, one could misuse the |
Without knowing the entire context of the reference, unaligned writes may induce more than a "small extra cost". Alois Kraus has examined one such case in detail in "Is This A CPU Bug?". A few quotes:
Unaligned cross-page operations might become expensive too, especially on architectures other than x86. One example is at "How do modern cpus handle crosspage unaligned access?"
I just thought I'd chip in to the discussion and hope this brings a new perspective (also, a shameless plug: I think I will soon add to a networking issue a few points on how one could make Orleans faster in general and networking more stable in particular, so if you're interested... <edit: at dotnet/orleans#307 (comment)). |
We've added support for an initial set of "Unsafe" operations in RTM in dotnet/corefx#7966 . If anyone has some extra operations they think should be added, I encourage them to open a new issue with more specific details. |
I would like to see a CallIndirect and As. I have implemented these as follows:
|
I don't understand what it would mean to request that the JIT "force" an alignment. If you want "full control" of alignment as you say, you have to arrange for it in the user code:
For example, if this were the preparation for a |
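The code that accompanied this comment is not preserved here. As a minimal sketch of what "arranging for alignment in user code" can look like, the hypothetical helper below rounds a pointer up to the next multiple of a power-of-two alignment after over-allocating the buffer. The names `AlignmentDemo` and `AlignUp` and the choice of 32 bytes are assumptions made for illustration.

```csharp
public static class AlignmentDemo
{
    // Rounds p up to the next multiple of alignment (must be a power of two).
    public static unsafe byte* AlignUp(byte* p, int alignment)
    {
        ulong mask = (ulong)alignment - 1;
        return (byte*)(((ulong)p + mask) & ~mask);
    }

    public static unsafe bool Demo()
    {
        const int alignment = 32;

        // Over-allocate so that an aligned 64-byte window always fits.
        byte* raw = stackalloc byte[64 + alignment - 1];
        byte* aligned = AlignUp(raw, alignment);

        return ((ulong)aligned % alignment) == 0
            && aligned >= raw
            && aligned - raw < alignment;
    }
}
```

Once the pointer is known to be aligned, an ordinary (non-`unaligned.`) read or write through it matches the default assumption that IL makes.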
C# doesn't support pointer operations on generic types. We can, however, express them in IL. A prototype could look like this:
This would also solve issues like #16026 without any additional work.
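The prototype IL itself is not preserved here. To show how a caller would use such an API, the sketch below assumes the shape that later shipped as `System.Runtime.CompilerServices.Unsafe` (`Read<T>(void*)`, `Write<T>(void*, T)`, `SizeOf<T>()`). `ReadWriteDemo` is a hypothetical name, and the helper is meant for value types that contain no object references.

```csharp
using System.Runtime.CompilerServices;

public static class ReadWriteDemo
{
    // Writes a value of any struct type into raw stack memory and reads it
    // back, which plain C# pointer syntax cannot express for an arbitrary T.
    public static unsafe T RoundTrip<T>(T value) where T : struct
    {
        byte* buffer = stackalloc byte[Unsafe.SizeOf<T>()];
        Unsafe.Write(buffer, value);
        return Unsafe.Read<T>(buffer);
    }
}
```

For example, `ReadWriteDemo.RoundTrip(42)` and `ReadWriteDemo.RoundTrip(3.25)` go through the same generic code path without boxing or per-type overloads.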