Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Reference assemblies need to include private struct fields #16402

Closed
jaredpar opened this issue Feb 18, 2016 · 14 comments
Closed

Reference assemblies need to include private struct fields #16402

jaredpar opened this issue Feb 18, 2016 · 14 comments
Assignees
Milestone

Comments

@jaredpar
Copy link
Member

Today the reference assemblies for .NET assemblies strip private fields from structs. This has an observable, and potentially dangerous, impact on project that consume them:

  • Allows pointers to be created for structs with fields of a reference type.
  • Prevents the compiler from catching cyclic struct layout problems.
  • Breaks C# definite assignment checking.
  • Allows structs using [FieldOffset] to verify when they should not.
  • Prevents developers from correctly planning for interop scenarios.

More details are available here and an example of the problems this can produce is here.

I understand part of the motivation for removing these fields is to keep the reference assemblies small. Keeping the fields necessitates keeping the type definitions for the types of the fields and this can cascade into many more types / members being included thus increasing size.

Cutting private fields for classes is fine as it's unobservable. Unfortunately for structs fields are observable irrespective of their accessibility and must be maintained in reference assemblies. The only action I think that can be done to curtail the number of types a struct brings in is the following:

A reference assembly can represent an inaccessible struct field which is a reference type as object.

Essentially class, interface, delegate and generic type parameters constrained to class can be transformed to object.

// Implementation assembly
private interface IBigInterface { ... } 
public struct S 
{
  private IBigInterface _field;
}

// Reference assembly
public struct S 
{
  private object _field;
}

This is unobservable to the compiler and can help limit the number of types brought it.

@nguerrera
Copy link
Contributor

This has versioning implications. The best definition for a valid reference assembly I have is that it's indistinguishable to the compiler from what could just have been an earlier version of the implementation assembly. As such, for any transformation deemed invalid from implementation to reference assembly, the inverse should be deemed to be a breaking change. We should update https://github.com/dotnet/corefx/blob/master/Documentation/coding-guidelines/breaking-change-rules.md accordingly.

(That document could probably use a general review as I've just noted that the second bullet is wrong misleading. :( EDIT: Submitted dotnet/corefx#6225 to fix that at least.)

@jaredpar
Copy link
Member Author

I agree there are versioning implications here. I looked at that document for a bit but I'm having trouble processing it because it's unclear what it is targeting for breaking changes. Is the document concerned only about compilation back compat or is it concerned with binary compat? This has a big distinction for structs contained in a reference assembly. Here are my quick thoughts on it.

Binary Compat

At this level structs need to be broken down into two categories: those which have a field which is a reference type and those which do not.

Structs without a reference type field cannot change size between versions. The developer has to assume the worst which is the struct is involved in a PInvoke / COM interop scenario. Changing the size of the struct in such a scenario is an observable change which could break runtime compat.

There is probably some flexibility if such structs could change shape so long as they don't change size or add a reference type field. At a glance that seems okay but I may be missing something.

A struct with a reference type field could never be used in interop though. Hence it can fall back to the compile compat rules.

Compile Compat

For structs the hardest language to deal with compat on is C# as it has both definite assignment rules (F# does as well) and unsafe code (unique to C#). The compat rules for C# would be the following:

  • A struct with no inaccessible fields is frozen. Developer must assume such a struct was via direct field initialization in C# (no overall value assignment). Any change in this type by adding an accessible or inaccessible field would break such code.
  • A struct with at least one inacessible field can add / remove fields of any accessibility. Developer can safely assume struct instances were assigned with either default(T) or a ctor call.
    • Exception here is the new field cannot be a reference type if the struct previously did not have any reference type fields. Developer much assume the struct was previously used as a pointer type and adding a reference field would break that.
    • Remove of a public field could break a FieldOffset trick. I don't know if this falls into the compat rules considered here though.
  • A struct can change the type of inaccessible reference type fields to other reference types.

@gafter
Copy link
Member

gafter commented Feb 18, 2016

For compile-time compat, the following rules should work for producing a reference assembly. We drop all private fields, but add back certain synthesized private fields for a value type (struct) as follows:

  • If there are any private fields that are or contain any value type members, add a single private field of type int
  • If there are any private fields that are or contain any reference type members, add a single private field of type object.
  • If the type is generic, then for every type parameter of the type, if there are any private fields that are or contain any members whose type is that type parameter, we add a direct private field of that type.

@jaredpar
Copy link
Member Author

@gafter

If there are any private fields that are or contain any value type members, add a single private field of type int

This breaks the ability for developers to write correct PInvoke and unsafe code. The size of the struct is important in both scenarios to write correct code.

@gafter
Copy link
Member

gafter commented Feb 18, 2016

This breaks the ability for developers to write correct PInvoke and unsafe code. The size of the struct is important in both scenarios to write correct code.

@jaredpar Can you unpack that assertion? The size of a struct isn't considered to be a compile-time constant except for primitive types and enums. What kind of code would be broken by this?

@nguerrera
Copy link
Contributor

I would say that interop is the issue more than unsafe code. You can certainly write correct unsafe code without having size or layout guarantees. e.g. There is nothing wrong with &foo.x even if x moves around.

If we put further restrictions on struct changes in the name of interop, then structs marked with auto layout in v1 would be exempt. They are rejected by the interop layer, but that (justifiably) does not prevent them from being used by unsafe managed code.

I wish C# made auto the default and that sequential was opt-in for interop so that we didn't have to assume that any struct might be used for interop, but rather only those that were explicitly marked sequential or explicit.

@jaredpar
Copy link
Member Author

The size of a struct isn't considered to be a compile-time constant except for primitive types and enums

Agree but there is plenty of interop that is built around exactly such types. Or more accurately structs that transitively wrap those types.

Take for example Guid. This is a struct that is often used in interop and contains only primitive type fields. Adding / removing a primitive field to this struct would be essentially unobservable from the stand point of C#. But it would wreak havoc on interop code that used Guid

@jaredpar
Copy link
Member Author

I wish C# made auto the default and that sequential was opt-in for interop so that we didn't have to assume that any struct might be used for interop, but rather only those that were explicitly marked sequential or explicit.

Agree. Auto is also advantageous from the stand point of performance because the JIT can pack structs tighter. Stupid back compat 😦

@scottdorman
Copy link

Stupid back compat

Well said, @jaredpar! :)

I wish C# made auto the default and that sequential was opt-in for interop...

That would have been advantageous, especially since intereop usually requires a good understanding of what's going on at runtime. Another choice that would have been good was requiring constructor call chaining rather than implicitly adding it. :) I normally always follow that rule, except for the one time in the code that "discovered" this issue, specifically because it takes away the possibility of ambiguity and lead to "strange" results/compiler errors. 20-20 hindsight and all.

@weshaggard weshaggard removed their assignment Apr 7, 2017
@weshaggard weshaggard self-assigned this Jan 3, 2018
@weshaggard
Copy link
Member

Update CCIExtensions to understand private struct fields
Update GenAPI's to emit these private modifiers into the reference assembly code.

Cost assumes its @weshaggard

Will be done along with dotnet/corefx#6185.

@jkotas
Copy link
Member

jkotas commented Jan 3, 2018

How are you going to deal with the fact that the private fields are different between different implementations in number of cases?

@weshaggard
Copy link
Member

@jkotas is that true for any public structs?

@jkotas
Copy link
Member

jkotas commented Jan 3, 2018

A few examples:

// Similar differences exist for other async-related structs. These structs have been tweaked
// for performance a lot in .NET Core, and I think we would love to have the option to keep tweaking
// them without setting their layout in stone.
// .NET Framework
public struct AsyncVoidMethodBuilder
{
    private SynchronizationContext m_synchronizationContext;
    private AsyncMethodBuilderCore m_coreState;
    private Task m_task;
}
// .NET Core
public struct AsyncVoidMethodBuilder
{
    private SynchronizationContext _synchronizationContext;
    private AsyncTaskMethodBuilder _builder;
}
// .NET Native
public struct AsyncVoidMethodBuilder
{
    private Action m_moveNextAction;
    private SynchronizationContext m_synchronizationContext;
    private Task m_task;
}

// .NET Framework
public struct AsyncFlowControl: IDisposable
{
    private bool useEC;
    private ExecutionContext _ec;
    private SecurityContext _sc;
    private Thread _thread;
}
// .NET Core
public struct AsyncFlowControl : IDisposable
{
    private Internal.Runtime.Augments.RuntimeThread _thread;
}

// .NET Framework
struct OpCode
{
    private String m_stringname; // not used - computed lazily
    private StackBehaviour m_pop;
    private StackBehaviour m_push;
    private OperandType m_operand;
    private OpCodeType m_type;
    private int m_size;
    private byte m_s1;
    private byte m_s2;
    private FlowControl m_ctrl;
    private bool m_endsUncondJmpBlk;
    private int m_stackChange;
}
// .NET Core
public struct OpCode
{
    private OpCodeValues m_value;
    private int m_flags;
}

// Similar differences exist for other XXXHandles
// NET Framework and .NET Core
struct RuntimeTypeHandle
{
    private RuntimeType m_type;
}
// .NET Native:
struct RuntimeTypeHandle
{
    private IntPtr _value;
}

// Similar differences exist for other types related to Span
// Fast span
ref struct Span<T>
{
    internal readonly ByReference<T> _pointer;
    private readonly int _length;
}
// Slow span
ref struct Span<T>
{
    private readonly Pinnable<T> _pinnable;
    private readonly IntPtr _byteOffset;
    private readonly int _length;
}

weshaggard referenced this issue in weshaggard/buildtools Jan 11, 2018
For compile-time compat, the following rules should work for producing a reference assembly. We drop all private fields, but add back certain synthesized private fields for a value type (struct) as follows:
- If there are any private fields that are or contain any value type members, add a single private field of type int
- If there are any private fields that are or contain any reference type members, add a single private field of type object.
- If the type is generic, then for every type parameter of the type, if there are any private fields that are or contain any members whose type is that type parameter, we add a direct private field of that type.

For more details see issue https://github.com/dotnet/corefx/issues/6185
this blog is helpful as well http://blog.paranoidcoding.com/2016/02/15/are-private-members-api-surface.html
weshaggard referenced this issue in weshaggard/corefx Jan 11, 2018
This also includes an update to the genapi tool that includes the
necessary private fields in structs to enable the compiler to
do correctness checks.

See https://github.com/dotnet/corefx/issues/6185.
weshaggard referenced this issue in weshaggard/buildtools Jan 11, 2018
For compile-time compat, the following rules should work for producing a reference assembly. We drop all private fields, but add back certain synthesized private fields for a value type (struct) as follows:
- If there are any private fields that are or contain any value type members, add a single private field of type int
- If there are any private fields that are or contain any reference type members, add a single private field of type object.
- If the type is generic, then for every type parameter of the type, if there are any private fields that are or contain any members whose type is that type parameter, we add a direct private field of that type.

For more details see issue https://github.com/dotnet/corefx/issues/6185
this blog is helpful as well http://blog.paranoidcoding.com/2016/02/15/are-private-members-api-surface.html
@weshaggard
Copy link
Member

weshaggard commented Jan 12, 2018

All the refs in corefx that have structs have had their refs updated to have private fields that follow the rules defined in our tools
dotnet/buildtools#1859 https://github.com/dotnet/buildtools/blob/master/src/Microsoft.Cci.Extensions/Writers/CSharp/CSharpWriter.cs#L141.

terrajobst referenced this issue in terrajobst/standard Dec 12, 2018
This fixes dotnet#678. In a nutshell, the compiler needs to know whether a struct
has any fields in order to apply definitive assignment rules. While stripping
all private fields from types is generally OK, we can't do this for structs.
Fortunately, for private fields the compiler doesn't really care what they
are, but that their characteristics are. For example:

1. Does the struct have any fields?

2. Does the struct contain any reference types (to validate generic
   instantiations that have the unmanaged constraint)?

3. Does the struct use the generic parameter in a field declaration (to
   validate cyclic layout problems)?

This adds dummy fields to structs to conform to these rules. These aren't
computed separately but are instead taken from .NET Core.

For more details, see this issue in CoreFX:

https://github.com/dotnet/corefx/issues/6185
terrajobst referenced this issue in terrajobst/standard Dec 12, 2018
This fixes dotnet#678. In a nutshell, the compiler needs to know whether a struct
has any fields in order to apply definitive assignment rules. While stripping
all private fields from types is generally OK, we can't do this for structs.
Fortunately, for private fields the compiler doesn't really care what they
are, but what their characteristics are. For example:

1. Does the struct have any fields?

2. Does the struct contain any reference types (to validate generic
   instantiations that have the unmanaged constraint)?

3. Does the struct use the generic parameter in a field declaration (to
   validate cyclic layout problems)?

This adds dummy fields to structs to conform to these rules. These aren't
computed separately but are instead taken from .NET Core.

For more details, see this issue in CoreFX:

https://github.com/dotnet/corefx/issues/6185
terrajobst referenced this issue in terrajobst/standard Dec 12, 2018
This fixes dotnet#678. In a nutshell, the compiler needs to know whether a struct
has any fields in order to apply definitive assignment rules. While stripping
all private fields from types is generally OK, we can't do this for structs.
Fortunately, for private fields the compiler doesn't really care what they
are, but what their characteristics are. For example:

1. Does the struct have any fields?

2. Does the struct contain any reference types (to validate generic
   instantiations that have the unmanaged constraint)?

3. Does the struct use the generic parameter in a field declaration (to
   validate cyclic layout problems)?

This adds dummy fields to structs to conform to these rules. These aren't
computed separately but are instead taken from .NET Core.

For more details, see this issue in CoreFX:

https://github.com/dotnet/corefx/issues/6185
@msftgits msftgits transferred this issue from dotnet/corefx Jan 31, 2020
@msftgits msftgits added this to the 2.1.0 milestone Jan 31, 2020
@ghost ghost locked as resolved and limited conversation to collaborators Jan 2, 2021
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

No branches or pull requests

7 participants