Skip to content

Latest commit

 

History

History
376 lines (267 loc) · 17 KB

LDM-2019-05-15.md

File metadata and controls

376 lines (267 loc) · 17 KB

C# Language Design Meeting for May 15th, 2019

Agenda

  1. Refining nullability signatures in APIs

Proposal: Refining nullability signatures in APIs

This proposes a set of language-level mechanisms (mainly in the form of attributes) that would provide for accurate annotation of the large majority of CoreLib members, and also eliminate a good number of warnings in the CoreLib implementations.

The names of attributes need to undergo API review. The attributes count as “nullable annotations” (with the exception of the Flow attributes that aren’t directly related to nullability) and yield a warning if applied where the annotation context is disabled. They are part of the contract of the member, and as such they need to be respected “covariantly” by overrides and implementations that are also in an enabled annotation context. That is: an override can loosen a restriction or tighten a promise, not the other way around.

Simple pre- and postconditions

These are applied to an input and output, and specify an “override” to the type, in that they may limit to not-null or extend to maybe-null regardless of what the type says. These alone can take care of the bulk of CoreLib TODOs regarding imprecise signatures.

It is important that both preconditions and postconditions are separately expressible, with individual attributes, since some annotatable entities are both input and output (e.g. ref parameters, fields and properties), and it is common in CoreLib for a restriction to apply only to the input or to the output.

Simple preconditions

These apply to anything that takes input: value, in and ref parameters, fields, properties, indexers, etc:

  • [AllowNull]: Allows null as the input, even if the type disallows it
  • [DisallowNull]: Disallows null as the input, even if the type allows it

On invocation, preconditions are applied after any generic substitution, and are affected by the null-state of the input. [AllowNull] prevents any top-level nullability warning on the input, whereas [DisallowNull] emits a warning if the input state is “may be null”.

Internally to the member, the preconditions may affect the initial null-state of the input (parameter, value variable, etc).

Simple postconditions

These apply to anything that yields output: out parameters, return values, fields, properties, indexers, etc:

  • [MaybeNull]: The output may be null, even if the type disallows it
  • [NotNull]: For outputs (ref/out parameters, return values), the output will not be null, even if the type allows it. For inputs (by-value/in parameters) the value passed is known not to be null when we return.

On invocation, postconditions are applied after generic substitution, and affect the null state of the output. [MaybeNull] changes the null state of the output to “may be null”, whereas [NotNull] changes it to “not null”.

Internally to the member, returning or assigning a value to an output element thus annotated may get or avoid warnings based on the null-state of the value.

Examples of simple pre- and post-conditions

public class Array 
{ 
    // [AllowNull] on a non-nullable ref to express that nulls may be passed in, 
    // but that nulls will not come out. 
    public static void Resize<T>([AllowNull] ref T[] array, int newSize); 
 
    // Alternatively, [NotNull] on a nullable ref to express that nulls 
    // may be passed in but will not be passed out. 
    public static void Resize<T>([NotNull] ref T[]? array, int newSize); 
} 
 
public class Array 
{ 
    // Result can be default(T) if no match is found 
    [return: MaybeNull] 
    public static T Find<T>(T[] array, Predicate<T> match); 
} 

public class TextWriter 
{ 
    // [AllowNull] on the setter of a non-nullable property to say that 
    // the setter accepts nulls even though the getter will never return null. 
    public virtual string NewLine 
    { 
        get =>; 
        [AllowNull] 
        set =>; 
    } 
 
    // Alternatively, [NotNull] on the getter of a nullable property to say that 
    // the getter returns non-null even though the setter accepts null. 
    public virtual string? NewLine 
    { 
        [NotNull] 
        get =>; 
        set =>; 
    } 
} 
 
public class StrongBox<T> 
{ 
    // Public field that defaults to default(T) 
    public [MaybeNull] T Value; 
 
    public StrongBox() {} 
} 
 
public class ThreadLocal<T> 
{ 
    // Even when T is non-nullable, the getter will return default(T) 
    // when accessed on a thread where a value hasn’t been set. 

    [MaybeNull] 
    get =>set =>} 
 
public interface IEqualityComparer<in T> 
{ 
    // Regardless of the nullability of T, GetHashCode doesn’t 
    // allow nulls, but Equals does (whether [AllowNull] is needed 
    // on the equals parameters is debatable). 
    bool Equals([AllowNull] T x, [AllowNull] T y); 
    int GetHashCode([DisallowNull] T obj); 
} 
 
internal static void SafeHandleHelper 
{ 
    // Null shouldn’t come in, but null can (and will) come out. 
    public static void DisposeAndClear([DisallowNull] ref SafeHandle? handle); 
} 

Value nulls

There are places in CoreLib where only reference nulls need be prevented, but the large majority will throw on both value and reference nulls. It is not worth the complexity to distinguish these two cases; instead [DisallowNull] always prevents both value and reference nulls. The negative impact on the few places that can handle value nulls (such as IEqualityComparer.GetHashCode) is minimal.

Nonnullable constraint

For generic dictionary-like types in CoreLib, input elements of the key type should never be null. It is possible to use [DisallowNull] in most places where the key type is given as input, but it doesn’t work everywhere (e.g. doesn’t transfer to composed-in types, such as the key collection), and it diminishes the user value, because warnings aren’t yielded on generic instantiation with a null-unsafe key type, but only when members are called that take it as an input.

For this reason, there should be a generic constraint, nonnull, that prevents the type argument from being nullable. Like the [DisallowNull] it applies to both nullable value types and nullable reference types, and yields a nullability warning when violated. While technically “correct”, we think object would be confusing as the name of the constraint, because it puts readers in mind of reference types. Additionally, since the nullable reference types feature is purely a manifestation of the language, the constraint should not result in an IL constraint that the runtime enforces, and since object is a type and types used as constraints are emitted to the IL as a constraint the runtime is aware of, it’s strange to use object but then not have it emitted. It should instead be nonnull and not be emitted as a base type constraint.

Examples of non-nullable constraint:

public static class Marshal 
{ 
    // The method fills in the provided instance from the pointer, and thus 
    // T shouldn’t be null. 
    public static void PtrToStructure<T>(IntPtr ptr, T structure) 
        where T : nonnull 
 
    // Alternatively, [DisallowNull] could be used. 
    public static void PtrToStructure<T>(IntPtr ptr, [DisallowNull] T structure) 
} 
 
// TKey on Dictionary, IDictionary, ConcurrentDictionary, etc. should not be null 
public class Dictionary<TKey, TValue> : 
    IDictionary<TKey, TValue>, 
    IDictionary, IReadOnlyDictionary<TKey, TValue>, 
    ISerializable, 
    IDeserializationCallback 
    where TKey : nonnull 

Unconstrained “T?”

Because nullability can be expressed on both input and output (with [AllowNull] and [MaybeNull]) in the few cases in CoreLib where it is needed in both directions it can be expressed using both attributes together. There is therefore no real need for the ability to say T? on an unconstrained T. Conversely, having unconstrained T? doesn’t obviate the need for [AllowNull] and [MaybeNull], as they are needed for single-directional scenarios.

Having T? would allow for the elimination of some warnings in generic code, notably around propagation and storage of default(T) values. However we consider that a separate potential improvement that could be added to the language later.

Interdependent postconditions

There is a long and varied list of APIs where the nullability of an input or output depends on other input or output of a method in various ways. It is not reasonable to express all of these patterns with attributes. Some are common patterns and should be expressed as attributes. Some occur on a short list of very commonly used members, and can be hardcoded into the compiler. A remaining long and thin tail will have to be left inexpressible.

Conditional postconditions

These are applied to parameters of bool-returning methods, and express that a given nullability applies to the parameter only when the method returns a given value.

  • [MaybeNullWhen(bool)]: If the method returns the bool value, the parameter may be null, even if the type disallows it
  • [NotNullWhen(bool)]: If the method returns the bool value, the parameter will not be null, even if the type allows it

These can be applied to both input and output (i.e. to all kinds of parameters) because they may represent either a test on input, or an assignment to output.

On the outside these are realized by affecting the conditional null-state (not null when true/false) after the invocation of the member. On the inside the conditional nature cannot be faithfully enforced. Instead, the most lenient of the two states is enforced unconditionally.

The core libraries only have examples of [MaybeNullWhen(false)] and [NotNullWhen(true)] but we should offer all four combinations for generality and symmetry.

Examples of conditional postconditions:

public class Version 
{ 
    // If it parses successfully, the Version will not be null. 
    public static bool TryParse( 
        string? input, 
        [NotNullWhen(true)] out Version? Version); 
} 
 
public static class Semaphore 
{ 
    // Not just for parsing… 
    public static bool TryOpenExisting( 
        string name, 
        [NotNullWhen(true)] out Semaphore? result); 
} 
 
public class Queue<T> 
{ 
    // With unconstrained generics, we use the inverse 
    public bool TryDequeue([MaybeNullWhen(false)] out T result) 
} 
 
 
public static class MemoryMarshal<T> 
{ 
    // It applies as well with constrained generics. 
    public static bool TryGetMemoryManager<T, TManager>( 
        ReadOnlyMemory<T> memory, [NotNullWhen(true)] out TManager? manager) 
        where TManager : MemoryManager<T> 
} 
 
public class String 
{ 
    // Not just for outs… also relevant to inferrig nullability of 
    // input arguments based on the return value. 
    public static bool IsNullOrEmpty([NotNullWhen(true)] string? value) 
} 

Nullness dependencies between inputs and outputs

Another common pattern is an output being non-null if a particular input is non-null. We can codify that with another attribute:

  • [NotNullIfNotNull(string)]: Applied to a return, ref, or out to indicate that its result will be non-null if the parameter specified by name is non-null

On the outside, these attributes are realized by giving the annotated output a not-null state if the input specified by name is not-null. On the inside this does not realistically seem enforceable.

Examples of dependencies between inputs and outputs:

class Path 
{ 
    [return: NotNullIfNotNull(nameof(path))] 
    public static string? GetFileName(string? path); 
} 
 
class RegistryKey 
{ 
    [return: NotNullIfNotNull(nameof(defaultValue))] 
    public object? GetValue(string name, object? defaultValue) 
} 
 
class Interlocked 
{ 
    public static T Exchange<T>([NotNullIfNotNull(nameof(value))] ref T location1, T value) where T : class? 
} 
 
class Volatile 
{ 
    public static void Write<T>([NotNullIfNotNull(nameof(value))] ref T location, T value) where T : class? 
 
    [return: NotNullIfNotNull(nameof(location))] 
    public static T Read(ref T location); 
} 
 
class Delegate 
{ 
    [return: NotNullIfNotNull(nameof(a))] 
    [return: NotNullIfNotNull(nameof(b))] 
    public static Delegate? Combine(Delegate? a, Delegate? b) 
} 

Equality postconditions

Certain members are equality tests on their inputs, and if the result is true (or, in the case of !=, false) the nullability of the two parameters is equal.

Between the language and CoreLib, there is a small fixed set of equality methods and operators, and people are unlikely to add new ones. Therefore it is not worth creating attributes for these, and they should instead be hardcoded into the compiler.

Examples of equality:

  • Object.ReferenceEquals
  • Object.Equals
  • IEqualityComparer<T>.Equals
  • EqualityComparer<T>.Equals
  • IEquatable<T>.Equals

CompareExchange

Of the remaining members with special null behavior, the one that sees by far the most use is Interlocked.CompareExchange. It’s nullability contract is way too complex to reasonably express through attributes, but should be hardcoded into the compiler due to the amount of usage that the method sees.

There are two relevant overloads:

public static extern object? CompareExchange( 
    ref object? location1, object? value, object? comparand); 
 
public static T CompareExchange<T>( 
    ref T location1, T value, T comparand) where T : class? 

The most important case to infer here: If comparand is a constant null and value is non-null, location will be non-null on return.

That covers the very common case of lazy initialization:

if (_lazyValue == null) Interlocked.CompareExchange(ref _lazyValue, new Value(), null); 
return _lazyValue; 

Additionally as a bonus:

If location1 and value are both non-null on enter, location1 and the result value will be non-null on return.

Flow attributes

Since nullability analysis is affected by control flow, it is valuable to signal when a method affects control flow, even if it is not directly related to nullability of the method’s inputs and outputs. Specifically, if a method does not return, unconditionally or conditionally, the compiler can benefit from this information to reduce unnecessary warnings.

  • [DoesNotReturn]: Placed on the method. Code after the call is unreachable
  • [DoesNotReturnIf(bool)]: Placed on a bool parameter. Code after the call is unreachable if the parameter has the specified bool value

Examples of flow attributes:

public static class Environment 
{ 
    [DoesNotReturn] 
    public static void FailFast(string message); 
} 
 
public class ExceptionDispatchInfo 
{ 
    [DoesNotReturn] 
    public void Throw(); 
} 
 
internal static class ThrowHelper 
{ 
    [DoesNotReturn] 
    public static void ThrowArgumentNullException(ExceptionArgument arg); 
} 
 
public static class Debug 
{ 
    public static void Assert([DoesNotReturnIf(false)] bool condition); 
} 

Discussion

The guiding priorities for this proposal are how the attributes:

  1. affect the caller,
  2. affect OHI,
  3. affect the implementer.

There is a desire for the attributes to also affect the implementation side (for example, [MaybeNull] on a method return would mean fewer warnings on return statements), but we did not focus on that as much.

Simple pre- and postconditions

There is some redundancy for non-generic cases. For example, you could use [AllowNull] string or string?. We'll need to provide some guidance.

Open question: should those attributes be applied to properties or to accessors?

nonnull constraint

The nonnull constraint cannot be combined with a struct constraint, because struct constraint already implies non-null.

T? is only allowed if T is constrained to class, a class type, or struct. T? is disallowed if T is constrained to class? or nonnull.

Conditional postconditions

Note that [MaybeNullWhen(false)] and [NotNullWhen(true)] are different in generic case (see TryDequeue<string> in examples).

Nullness dependencies between inputs and outputs

We cannot use nameof in [NotNullIfNotNull(nameof(parameter))], because the parameter is only in scope inside the method body. In overrides, implementers will have to use the parameter name of the implementation ([NotNullIfNotNull("overriddenParameter")]), not of the overridden method.

Open question: what if the name passed to this attribute doesn't match a parameter name? Is that an error or warning?

Equality postconditions

If an equality returns true and one of the parameters has non-null state, then we can set the state of the other parameter to non-null. We'll need to revisit the rules for recognizing an equality method. We may be able to use a more general pattern, instead of a list of well-known members. We should also consider whether nullability analysis of an equality should account for a null literal

Flow attributes

Open question: would we also need a [DoesNotReturnWhenNull(nameof(parameter))]?