Feature Request: Variable-size "long" for simplified p/invoke #27530
Comments
Thinking some more about implementation, it might make the most sense to just go with something like [MarshalAs(UnmanagedType.CLong)] applied to a regular Int32 (or UInt32), like so: [StructLayout(LayoutKind.Sequential)]
struct OggPacket
{
IntPtr Packet;
[MarshalAs(UnmanagedType.CLong)] int Bytes;
[MarshalAs(UnmanagedType.CLong)] int BeginningOfStream;
[MarshalAs(UnmanagedType.CLong)] int EndOfStream;
long GranulePosition;
long PacketNumber;
} |
Looks like what you need is proposed here https://github.com/dotnet/corefxlab/blob/master/docs/specs/nativesized.md |
@EgorBo That refers to types that vary in size by word length (32-bit, 64-bit etc.). Different request but easy to confuse (I originally used the term "native long" but changed it to try and avoid this confusion). |
....so why isn't the native code considered "buggy", in that it's using a type with a variable size definition? Especially since the next set of fields has an explicit size? (Frankly, this is one of my least favorite C/C++ "features", for mostly this reason - that, and in almost all cases the far more interesting information is the range of the domain type, not the size of the physical type) |
The merits of using long aren’t really the point. My example is a pre-existing API already present on most Linux distributions. And it isn’t particularly obscure - I can provide other examples.
|
This isn't sufficient. This would not allow the upper 32-bits to be used on 64-bit Linux. If this were to be handled, it would likely need to be done by an explicit type that the runtime will dynamically size as needed (much like |
I suppose that’s potentially true, although in many (most?) cases the upper 32 bits are not particularly useful (considering that the code is assumed to work on platforms with 4 byte longs). I’m curious if there is any new code being written with longs. My assumption is that their occurrence in code is a legacy holdover from when 64-bit architectures were a pipe dream. Totally not against a “CLong” type, just playing this through. Any solution would be super helpful. |
I would like to hope that most "modern/new" C/C++ code is being written to be cross-platform and cross-architecture aware and that they will use explicitly sized types (int32_t, int64_t, intptr_t, etc), rather than relying on the C/C++ language keywords (int, long, long long, etc). And looking around at various "newer" codebases that are explicitly meant to be cross-platform/architecture, they tend to do this (clang, llvm, vulkan, among others). In an ideal world, many of these codebases (such as Ogg) would have been updated to use |
I think
I do not think a special casing like this would be worth it in this case. |
This keeps coming up (the most notable likely being from @migueldeicaza: https://github.com/dotnet/coreclr/issues/963), and the only "correct" solution, today, is to cross-compile.
Which requires you to actually marshal the data, rather than allowing pinning or blitting of the value on 64-bit. If you expect to support 64-bit, you have to use long (or We could potentially just handle this at the framework level, with |
dotnet/coreclr#963 is a different issue from this one. I think that paying the price to marshal the data is perfectly fine for CLong. |
What kind of performance overhead are we talking about? Just the overhead of copying arrays of these things vs pinning, or would even my example be affected somewhat? I’d be reluctant to implement anything that slowed my solution down noticeably. Would probably stick with cross compiling in performance critical code if that is the case.
|
Your example would get some extra instructions as well, in some cases at least.
|
I think the best argument for not introducing a new type is that this issue relates to legacy / awkward code interop only. First-class support seems like overkill. On the other hand, if adding the type isn’t too hard, the performance advantages might be nice. And it’s probably cleaner. I will leave it up to you experts. Both solutions are acceptable and I’d just like to see something as soon as possible! Just want to make sure the attribute would deal with sign extension. |
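The two concerns raised so far about mapping a C long onto an Int32 field (losing the upper 32 bits on 64-bit Unix, and sign extension when widening) can be illustrated with a minimal sketch. The class and method names here are hypothetical and only model the integer conversions involved; they are not part of any proposed API.

```csharp
using System;

public static class CLongConversions
{
    // Hypothetical helper: copy a 64-bit native long into an Int32 field.
    // unchecked keeps only the low 32 bits; checked throws OverflowException
    // when the value does not fit in an Int32.
    public static int Narrow(long nativeValue, bool throwOnOverflow) =>
        throwOnOverflow ? checked((int)nativeValue) : unchecked((int)nativeValue);

    // Widening a signed Int32 to 64 bits sign-extends.
    public static long Widen(int managedValue) => managedValue;

    // Widening an unsigned UInt32 to 64 bits zero-extends.
    public static ulong Widen(uint managedValue) => managedValue;
}
```

With this sketch, `CLongConversions.Narrow(0x1_0000_0001L, false)` yields 1 because the upper bits are silently dropped, while `CLongConversions.Widen(-1)` yields -1 because the sign bit is propagated into the upper 32 bits.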
+1
Adding a first class type for this is complex. Many different parts of the system would have to know about it, depending on how much first class this type would be. I do not think it is cleaner from the architecture point of view at least. The interop specific constructs should stay within the interop subsystem. |
Let's assume a second-class type then, living in It's only when casting to a managed type that the developer needs to worry about range, which is why it feels cleaner to me. |
I'm asking myself whether a correct port to 64-bit would simply convert those fields to the type int (or better int32_t); it seems that on a 64-bit OS the struct is now bigger for no reason. I don't know if: would be more clear in this case. |
We also have some native library which we needed to interface. It's written in C++, and exposes both a C and a C++ interface, and has a history of 20+ years. When porting to 64 bit, the developers of the library said that they could not retroactively change the So we had to bite the bullet and duplicate about 80% of all p/invoke calls and structs we use for interop, guarding all this with So, types or a |
I am working on cross platform interop for an older library as well and am running into this same issue. My preference would also be for the introduction of @markusschaber are you able to point me to an example of how you are handling this at runtime with your |
@jkotas, would it be feasible for us to provide the following library only types? namespace System.Runtime.InteropServices
{
public struct CLong
{
#if Windows_NT
private int _value;
#else
private nint _value;
#endif
// Similar members to System.IntPtr
}
public struct CULong
{
#if Windows_NT
private uint _value;
#else
private nuint _value;
#endif
// Similar members to System.UIntPtr
}
} There would be no marshalling or runtime support just simple framework types (and utilizing the existing The goal here would be to solve a few of issues:
|
I do not think you can get away without marshalling or runtime support if you want this to represent C long/ulong for interop faithfully. A struct wrapping an integer is not the same as an unwrapped integer in all ABIs. |
Ah right, there are cases like COM where it isn't equivalent. I think it might be something worth some effort here as I don't think the difference will go away as even modern APIs use the type rather than something like For example, I've hit issues with trying to manually manage the difference in things like I think not being able to define a struct wrapper is particularly problematic as it means you can't define some general way to handle this. It forces you to redefine the same logic and forces you to incur the cost every time the native library uses |
That sounds reasonable to me. Your suggestion says "Similar members to System.IntPtr". I do not think it is desirable - it would inherit the problems with IntPtr not really being a first-class integer type. I would make these types as simple as possible, with only methods to convert to / from primitive integers and the few other "mandatory" struct methods, something like: readonly struct CLong : IEquatable<CLong>
{
public CLong(nint value);
public nint Value { get { return _value; } }
public override string ToString();
public override int GetHashCode();
public override bool Equals(object o);
public bool Equals(CLong other);
} I am wondering whether this should be done together with #13788. The places you need to check for |
That sounds reasonable to me.
That's actually what I was going to suggest. The feature work should be roughly the same, that is needing to interpret a user-defined struct as "transparent", but the latter will be more reusable and will unblock any similar scenarios in the future. |
I did a rough prototype over The prototype basically treats The new value and a corresponding namespace System.Runtime.InteropServices
{
public readonly struct CLong : IEquatable<CLong>
{
public CLong(nint value);
public nint Value { get; }
public override bool Equals(object o);
public bool Equals(CLong other);
public override int GetHashCode();
public override string ToString();
}
public readonly struct CULong : IEquatable<CULong>
{
public CULong(nuint value);
public nuint Value { get; }
public override bool Equals(object o);
public bool Equals(CULong other);
public override int GetHashCode();
public override string ToString();
}
} |
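As a minimal sketch of how a type with this shape could be used, the following rewrites the OggPacket struct from earlier in the thread. It assumes .NET 6 or later, where `CLong` and `CULong` are available in `System.Runtime.InteropServices` with an `int` and an `nint` constructor and a `Value` property; on earlier runtimes this will not compile.

```csharp
using System;
using System.Runtime.InteropServices;

[StructLayout(LayoutKind.Sequential)]
public struct OggPacket
{
    public IntPtr Packet;
    public CLong Bytes;             // C 'long': 4 bytes on Windows, pointer-sized elsewhere
    public CLong BeginningOfStream;
    public CLong EndOfStream;
    public long GranulePosition;    // ogg_int64_t is always 64 bits
    public long PacketNumber;
}

public static class OggPacketExample
{
    // Builds a packet header with the given byte count; other fields stay zeroed.
    public static OggPacket WithBytes(int bytes) =>
        new OggPacket { Bytes = new CLong(bytes) };
}
```

The range question discussed above only surfaces at the conversion boundary: `packet.Bytes.Value` returns an `nint`, so the caller decides whether to narrow it to `int`.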
@tannergooding forgive me if I am not understanding but isn’t |
Can this be done directly without involving marshalers - make the JIT use the right convention for these? If there are special runtime-provided marshalers involved, it does not work for the interop marshaling generated by source generators. |
@jkoritzinsky or @AaronRobinsonMSFT may know better, but AFAIK the decision for For |
I am not sure putting this special knowledge into the JIT is much better than embedding it in the marshaler. Source generators here seem like the win since users would be able to define their own semantics in any way they desire. There is also the possibility that in a source generator world types could declare their own marshaling semantics. Consider the Example: [MarshalInAttribute(CLong.MarshalIn)]
[MarshalOutAttribute(CLong.MarshalOut)]
public readonly struct CLong : IEquatable<CLong>
{
...
public static IntPtr MarshalIn(ref CLong l) { ... }
public static CLong MarshalOut(IntPtr i) { ... }
} @tannergooding Yes that is the location for returning, but there is also in and byref. Is the return location of special interest here? I just discovered this issue and quickly read it so I could be missing something. |
I just realized this is for the C# function pointer proposal. I believe that decision is in the JIT. |
With types like However, for types like C/C++ This means you need to define two native signatures (one taking The Having a standard interchange type in the runtime at least solves the problem of shipping the type to users since we already build and ship platform/architecture specific libraries. The remaining problem ends up being that for instance methods on Windows, returning |
@tannergooding My point here is no changes are needed in the runtime at all to support this with the source generator proposal. The Example of source generator approach: // Defined in Source generator assembly
namespace SrcGenPInvokeLibrary
{
[MarshalInAttribute(CLong.MarshalIn)]
[MarshalOutAttribute(CLong.MarshalOut)]
public readonly struct CLong : IEquatable<CLong>
{
...
public static IntPtr MarshalIn(ref CLong l) { ... }
public static CLong MarshalOut(IntPtr i) { ... }
}
}
// Using Source generator in user application
namespace User
{
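// Hypothetical containing class: methods cannot be declared directly in a namespace
internal static partial class NativeMethods
{
[GeneratedDllImportAttribute("NativeLib")]
public static partial SrcGenPInvokeLibrary.CLong DoubleMe(SrcGenPInvokeLibrary.CLong value);
}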
} In this scenario the esoteric types can be defined and owned by the tooling that will be dealing with the interop and kept out of the runtime. Of course there is the shared types issue so having them defined officially has a real compelling argument. However I would argue that P/Invokes having these argument types should be kept outside of public APIs which means they could be embedded into the assembly that is using the source generator. |
What is the source generator generated method going to look like for this? Is it going to work correctly on both Windows and Linux? |
I don't see why it couldn't handle that kind of checking in the implementation of |
I'm not against the new types and honestly the attribute isn't really all that bad. My perspective here comes down to some of the reasons for the ComWrappers API. That was designed because there was literally no other way to accomplish the performance characteristics for object identity and run managed code during the GC. Therefore we added a basic building block that would let users stay in managed code and accomplish what they wanted without adding a lot of new interop surface area. This is also similar to In this case it appears possible to address this issue with the source generators feature and that seems to be our desired path anyways. I just don't see the need and why source generators aren't sufficient at the moment - that is really why I am not embracing the proposal. |
Yes, you could. @tannergooding 's point is that it is not pretty - even for the simple example. It would need to look like this:
Maybe it is ok since you will rarely look at this code. Making sure that |
That entirely depends on what the aim of the library is. There are some interop libraries that merely provide what are essentially raw bindings with a few helper methods to handle array/string types. They are designed around exposing the raw API surface so that other libraries can use it to create their own managed wrapper or to provide barebones/no overhead access to the API.
I wouldn't expect My expectation with source generators is that, especially with interop bindings and the sheer number of attributes you may need to correctly annotate various bits of data (like whether a This would mean rather than taking a dependency on some
The problem with this checking is namely that |
Yes, and it gets worse when you have structs or arrays of structs containing struct CXUnsavedFile_Windows
{
public sbyte* Filename;
public sbyte* Contents;
public uint Length;
}
struct CXUnsavedFile_Unix
{
public sbyte* Filename;
public sbyte* Contents;
public nuint Length;
}
[DllImport("libClang", EntryPoint="clang_createTranslationUnitFromSourceFile")]
public static extern CXTranslationUnitImpl* clang_createTranslationUnitFromSourceFile_Windows(void* CIdx, sbyte* source_filename, int num_clang_command_line_args, sbyte** clang_command_line_args, uint num_unsaved_files, CXUnsavedFile_Windows* unsaved_files);
[DllImport("libClang", EntryPoint="clang_createTranslationUnitFromSourceFile")]
public static extern CXTranslationUnitImpl* clang_createTranslationUnitFromSourceFile_Unix(void* CIdx, sbyte* source_filename, int num_clang_command_line_args, sbyte** clang_command_line_args, uint num_unsaved_files, CXUnsavedFile_Unix* unsaved_files);
CXTranslationUnitImpl* clang_createTranslationUnitFromSourceFile(void* CIdx, sbyte* source_filename, int num_clang_command_line_args, sbyte** clang_command_line_args, uint num_unsaved_files, ????* unsaved_files)
{
// ....
} I don't have a common type I can use in the above. I have to either define a |
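One workaround for the missing common type is a managed-only struct that is copied into the platform-specific layout before each call. The sketch below shows that copy step only. All type and member names are hypothetical, and `IntPtr` stands in for the `sbyte*` fields so that no unsafe context is needed.

```csharp
using System;
using System.Runtime.InteropServices;

[StructLayout(LayoutKind.Sequential)]
public struct UnsavedFileWindows
{
    public IntPtr Filename;
    public IntPtr Contents;
    public uint Length;      // unsigned long is 32 bits on Windows
}

[StructLayout(LayoutKind.Sequential)]
public struct UnsavedFileUnix
{
    public IntPtr Filename;
    public IntPtr Contents;
    public UIntPtr Length;   // unsigned long is pointer-sized on Unix
}

// Hypothetical common managed shape exposed to callers.
public struct UnsavedFile
{
    public IntPtr Filename;
    public IntPtr Contents;
    public ulong Length;

    // Throws OverflowException if Length does not fit in 32 bits.
    public UnsavedFileWindows ToWindows() => new UnsavedFileWindows
    {
        Filename = Filename,
        Contents = Contents,
        Length = checked((uint)Length),
    };

    // new UIntPtr(ulong) throws OverflowException in a 32-bit process
    // when the value does not fit.
    public UnsavedFileUnix ToUnix() => new UnsavedFileUnix
    {
        Filename = Filename,
        Contents = Contents,
        Length = new UIntPtr(Length),
    };
}
```

Every call that passes an array of these has to allocate and fill a second array in the platform layout, which is the per-call copying cost the thread is trying to avoid.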
Okay. Having reread this thread and #13788 I see where you are coming from. To be honest the compelling argument for me is the fields in types. That is hard to say the least and I can't come up with a solution that doesn't involve swathes of code and that feels like it defeats the desire to make interop fast. I am not a fan of the attribute concept since it appears to have a single use case in practice even though in theory I guess it could be employed for other scenarios. At this point, I would prefer special casing the |
We can start small and just introduce the CLong types, without the general attribute. It sounds reasonable to me. |
Once #12375 is fixed, there would be no changes needed in the IL stubs to support a "treat a struct as if it was a primitive for ABI reasons" type flag or attribute. |
I think that is also reasonable. But, I think the attribute is a bit more generally usable and it is likely to be more widely used than just The more general use-case is to provide a general mechanism for the same in the future so that we don't need to go modifying the IL again. For example, the iOS/macOS environment also has Likewise, it would be usable outside the strictly "required" scenario for things like It's also worth noting that we still generate significantly different codegen for returning
|
I think it would be fine to add nfloat along the same lines as CLong. It needs to be bitness-specific anyway. We have a number of Windows-specific core interop types, so having one Apple-specific core interop type would be reasonable. The "not strictly required" scenarios can be handled by source generators. I would wait to see where the source generators bring us and then check again whether the general attribute understood by the runtime is needed.
I am not sure which part you mean. The only int vs. HRESULT inefficiency that I can see in your snippet is unnecessary RBP frame for HRESULT in CreateDXGIFactory3.
This is required for PInvoke transition (digressing from what this issue is about). |
Unfortunately it wasn't just WPF. This had external uses as well. I don't think we can ever fix that decision. However, if we could fix that decision by adopting the attribute I would be, after you, its biggest champion. |
Do you have a concern with implementing |
I would like the special handling for these to not require a marshaling stub, i.e. tell the JIT directly what to do when it matters; do not create a new type of marshaler for this. I do not think that the internal |
I personally just don't like the idea of a public attribute modifying a struct in this way. It seems like a mechanism that can be abused and then we wind up owning the semantics - similar to the struct HRESULT issue above. However, starting out with a private attribute as an implementation detail of this feature with the option to make it public later is something I could get behind. |
I am reading @tannergooding's suggestion as wanting the trigger to be the |
Right, my suggestion wasn't to add a new marshaler but was instead to keep |
@tannergooding I personally am okay with it being internal. I think @jkotas's point is that: else if (x || y || z) { ... } isn't much different than: else if (a) { ... } Regardless of how it is done, this conversation has convinced me it has value. I look forward to the PR 😉 |
The (internal) attributes are not free to look up. It is cheaper to just check the name if the number of affected types is small. |
I think @tannergooding captured this conversation very well and has folded all the feedback into #13788. Closing as duplicate. |
To greatly simplify p/invoke code (particularly struct definitions, but also method signatures), it would be amazing if there was a "long" type that matched the OS's compiler definition of long. This would mean no more preprocessor definitions, and a single assembly that works with native libraries compiled for multiple OSes.
Here's an example of what you have to do in .net right now to p/invoke ogg_packet, defined here:
C code:
C# code:
Ugly, and I need to ship separate binaries for Windows and OSX/Linux.
To be clear, this is partly about word length and partly about what "long" means in compiled C code on Windows (vs basically every other OS):
https://software.intel.com/en-us/articles/size-of-long-integer-type-on-different-architecture-and-os/
I get that this is technically up to the compiler, not the OS, but I believe the dust has settled on these sizes since the shift from 16->32->64 bits.
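The size rule described above can be sketched in a few lines. The class name is hypothetical, and the rule rests on the stated assumption that Windows toolchains use a 4-byte long while other supported platforms use a pointer-sized long.

```csharp
using System;
using System.Runtime.InteropServices;

public static class CLongSize
{
    // Assumption: Windows compilers follow LLP64 (long is 4 bytes on both
    // 32-bit and 64-bit); other platforms follow ILP32 or LP64, where long
    // has the same size as a pointer.
    public static int InBytes =>
        RuntimeInformation.IsOSPlatform(OSPlatform.Windows) ? sizeof(int) : IntPtr.Size;
}
```

A single assembly could consult `CLongSize.InBytes` at run time to choose between struct layouts, which is the branching a built-in "long" type would make unnecessary.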