[Proposal] Introduce "var?" for nullable type inference #3591

Closed
LWChris opened this issue Jun 21, 2020 · 73 comments

Comments


LWChris commented Jun 21, 2020

Problem

Consider this code:

#nullable enable
public void Foo()
{
    var val1 = p.Count; // int
    var val2 = p.Max; // int?
    var ref1 = p.GetOrCreate(1); // TheType
    var ref2 = p.TryGet(2); // TheType?

    // val1 = null;
    val2 = null;
    ref1 = null;
    ref2 = null;
}

With #nullable enable, the line ref1 = null causes a warning, because the type of ref1 was inferred from GetOrCreate whose return type isn't nullable. The line val1 = null would obviously not even compile.

In order to have all variables nullable, I'd either have to change the return types of GetOrCreate and Count to nullable types (which is bad, because both will never return null, so why declare otherwise), or I have to use explicit types, which can range from
int? val1 = p.Count;
to
TheTypeThatRequiresAnImport<WithGenerics, WhichRequire, EvenMoreImports>? ref1 = p.GetOrCreate(1);

Proposal

If var ref1 can be resolved to TheType ref1, then var? ref1 could be resolved to TheType? ref1.

Considerations

This syntax could theoretically be allowed for both reference types and value types. In both cases, var? will ensure that the declared variable is nullable, regardless of the right-hand side's nullability (for value types) or nullable annotation (for reference types).

For value types, the behavior will be:

public void Foo()
{
    var? val1 = p.Count; // int? val1 = new Nullable<int>(p.Count);
    var? val2 = p.Max; // int? val2 = p.Max;
}

which is equivalent to

public void Foo()
{
    var val1 = MakeNullable(p.Count);
    var val2 = MakeNullable(p.Max);
}

private static T? MakeNullable<T>(T value) where T : struct => new T?(value);
private static T? MakeNullable<T>(T? value) where T : struct => value;

For reference types, the question mark behaves the same as the one in TheType? obj:

var? ref1 = p.GetOrCreate(1); // TheType? ref1 = p.GetOrCreate(1);
var? ref2 = p.TryGet(2); // TheType? ref2 = p.TryGet(2);
#nullable disable
var? ref3 = p.GetOrCreate(3); // Warning: The annotation for nullable reference types should only be used in code within a '#nullable' annotations context.

HaloFour commented Jun 21, 2020

This will no longer produce a warning in C# 9.0, as the language team has decided to always treat var as nullable and to rely on flow analysis to determine when to warn. Assigning null will no longer warn, but trying to dereference after that assignment will.

Example


LWChris commented Jun 21, 2020

  1. Your example shows a warning for me at ref1 = null. Do I have to set the language to something like "C# 9.0 Preview" somewhere on the page?
  2. Apart from the fact that I thought determining whether something can ever be null is equivalent to solving the halting problem, I think "always treat var as nullable" sounds dangerously close to "always treat references as nullable". To me, this looks like a step in the opposite direction of where the whole concept of "non-nullable references" was aiming. In my opinion, you should treat var as what it is - that is, "the type of the right-hand side", including the nullability.

@HaloFour

@LWChris

Your example shows a warning for me at ref1 = null. Do I have to set the language to something like "C# 9.0 Preview" somewhere on the page?

If the selected branch is "master" then it should be using the C# 9.0 bits. I do not see the warnings there, but I do see the warning if I change the branch back to "Default".

I think "always treat var as nullable" sounds dangerously close to "always treat references as nullable". To me, this looks like a step in the opposite direction of where the whole concept of "non-nullable-references" was aiming at.

The flow analysis of NRTs is pretty good at determining when it's safe to dereference the variable and it's already a core aspect of the experience. The behavior of the variable will remain the same. If the method returns a non-nullable, the compiler will consider the variable to not be null and will not warn on dereferencing. If you assign null to the variable the compiler will not warn at the point of assignment, but it will warn if you attempt to dereference the variable again without first checking for null:

string SomeMethod() => "";

var s = SomeMethod();
var length = s.Length; // does not warn
s = null; // does not warn
length = s.Length; // warns
if (s != null) {
    length = s.Length; // does not warn
}

The nullability annotations aren't that important on locals; the compiler will infer their state from usage. It's at the boundary where the annotations matter much more, because the compiler can't know the expected state of a value crossing that boundary.
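
A minimal sketch of that boundary point (illustrative code, not from the thread): inside a method body the compiler tracks the null state of locals on its own, but a signature's annotation is the only thing callers ever see, so it has to carry the contract.

#nullable enable

public class Boundary
{
    // The '?' in the signature is the contract; callers (possibly in another
    // assembly) only see this annotation, never the flow analysis inside.
    public static string? FindName(int id) => id == 0 ? null : "name";

    public static void Caller()
    {
        var name = FindName(1);
        var upper1 = name.ToUpper();     // warns: the annotation says 'name' may be null here

        if (name != null)
        {
            var upper2 = name.ToUpper(); // no warning: flow analysis proves it's not null
        }
    }
}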

In my opinion, you should treat var as what it is - that is "the type of the right-hand side", including the nullability.

That is what the language team thought, too. But the experience and feedback from that decision is that the behavior is suboptimal, especially around existing code, so the team decided to reverse that decision. The warning on assignment isn't particularly helpful if that variable isn't dereferenced again without checking for nullability.

Anyway, here are the meeting notes from when the language team considered the addition of var? and decided on this behavior for C# 9.0 instead:

https://github.com/dotnet/csharplang/blob/master/meetings/2019/LDM-2019-12-18.md#var

@sharwell

@HaloFour the behavior you link to is enabled for C# 8 as well, provided the compiler version is new enough.

dgrunwald added a commit to icsharpcode/NullabilityInference that referenced this issue Jun 21, 2020
Contrary to what the nullable-reference-types proposal documents, these variables do not infer their nullability from their initializer.
Instead, they are always declared as nullable, and the initializer's nullability only matters for the flow-state.

See also: dotnet/csharplang#3591 (comment)

LWChris commented Jun 24, 2020

s = null; // does not warn

I was really hoping this would change once and for all when I saw non-nullable references found their way into C#. IMO, it should. It's a pity, but oh well. C# is a language, not my language, so it's not my choice to make.

I'd rather opt into being pedantic about forcing you to express what you wanted by using var or var?, than relying on inherently weaker local context analysis. It's one thing to not mess with old code just for convenience (protected virtual event NotifyCollectionChangedEventHandler? CollectionChanged;), but another to mess with it to prevent future errors, and when the "fix" is merely adding the ? to the var keyword a few lines earlier. Even more so, if adding that one character doesn't get rid of all warnings immediately, you apparently just identified a risk you never thought about. Sounds promising, not annoying to me.

Thanks a lot for the link though.

LWChris closed this as completed Jun 24, 2020
@CyrusNajmabadi

I was really hoping this would change once and for all when I saw non-nullable references found their way into C#. IMO, it should. It's a pity, but oh well. C# is a language, not my language, so it's not my choice to make.

You can always write an analyzer yourself to find and report this if you want that extra restriction :)

I'd rather opt into being pedantic about forcing you to express what you wanted

We looked into this, and it was effectively awful. So much code that was null safe and was written in idiomatic C# was then flagged as needing additional work.

This violates my own goal for the feature which is that good, existing, coding practices should not feel like something that is now punishing you.


LWChris commented Jul 3, 2020

You can always write an analyzer yourself to find and report thsi if you want that extra restriction :)

Not every programmer has the skill or time to write custom analyzers to mitigate shortcomings of the native compiler. ;)

So much code that was null safe and was written in idiomatic C# was then flagged as needing additional work.

I understand your point and see where you're coming from. But quite honestly - with this behavior for #nullable enable, the C# language team might as well consider removing the whole feature again. After all, the point of non-nullable references is to introduce references that cannot be null. Now they can be null, and we still have to rely on flow analysis. But flow analysis can be done with or without #nullable enable, right? So what will be the point of having #nullable enable scopes in the first place?

If x = null is an error for int x, then it should be an error for SomeReferenceType x when #nullable enable is defined.

If it bothers me that I must write var? x = Foo(); x = null; instead of var x = Foo(); x = null;, because I know I have always checked for null when I use x later on, then why did I even enable the feature in the first place? I might as well disable it and rely on flow analysis anyway, since apparently flow analysis is "good enough" to catch all issues.

And maybe, after killing #nullable, C# may introduce #nonnullable, this time not with SomeReferenceType? vs. SomeReferenceType, but with SomeReferenceType vs. SomeReferenceType! or so.


HaloFour commented Jul 3, 2020

@LWChris

The goal of NRTs is to avoid NREs, not nulls. A null assigned to a local that is never dereferenced is innocuous. You can argue that this makes annotating locals with their nullability pointless if flow analysis can make this determination, and I agree with you. The annotation makes sense at the boundary where flow analysis can't track what happens. But I disagree that this decision somehow weakens NRTs or makes them irrelevant. NRTs were never intended to be a change to the type system; they're meant to be guardrails, and they need to be a very pragmatic solution so that they can be practically applied to tons and tons of existing code.


333fred commented Jul 3, 2020

But flow analysis can be done with or without #nullable enable, right?

No, that is not how the design works. You must, at minimum, turn on the flow analysis with #nullable enable warnings. Nullable does not run in disabled locations.
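
For reference, the directive has separate switches for annotations and warnings; a quick sketch of the standard forms:

// The nullable context can also be set project-wide via <Nullable> in the
// project file; warnings and annotations can be toggled independently, and
// the flow analysis only runs where warnings are enabled.

#nullable enable              // annotations + warnings
#nullable enable warnings     // flow analysis warnings only
#nullable enable annotations  // annotations only, no warnings
#nullable disable             // back to the oblivious, pre-C# 8 behavior
#nullable restore             // revert to the project-level setting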


LWChris commented Jul 3, 2020

#nullable enable

public class C {
    readonly string S = "s";
    
    public void A() {
        string s = S;
        bool flag = s == null; // 1
        int length = s.Length; // 2
    }
}

Question 1: What is the point of showing an incorrect warning "Dereference of a possibly null reference." in line 2 instead of a correct warning "Expression is always false." in line 1 with #nullable enable?
Question 2: If we establish that showing the warnings as described above was somehow useful, then what is the point of not showing them without #nullable enable?

In other words: if NRT are meant to be "guardrails", then why are those rails installed in a way that it is easy to make them ineffective (by adding a useless null-check), and if we deemed them sensible nevertheless, why would I have to explicitly ask for the guardrails to be installed?


HaloFour commented Jul 3, 2020

@LWChris

The guardrails exist to steer you away from NREs. That may warn spuriously, but it won't NRE. More interesting would be non-pathological cases where the compiler doesn't warn and can result in an NRE.


333fred commented Jul 3, 2020

Consider instead a public API: It's annotated as not accepting null, but of course someone could be calling you from a nullable disabled context or they could put a ! on it because they're pretty sure they're not passing null. However, you still want to check the input for null, and if you do so, we should make sure you actually handle it. Or perhaps the return type of a function is not annotated, or the person who annotated it was wrong. Nullable is not perfect: it's designed to take your input into account. We assume that you know what you're doing, and that if you check for null then there's a real reason to do so.
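
A small sketch of the scenario described above (hypothetical API, illustrative only):

#nullable enable
using System;

public class Greeter
{
    // The parameter is annotated as non-nullable, but a caller compiled in a
    // nullable-disabled context (or one using the ! operator) can still pass
    // null, so a runtime check at the public boundary is still worthwhile.
    public static string Greet(string name)
    {
        if (name == null)
        {
            // Because we checked, the compiler assumes we had a reason to,
            // and treats 'name' as possibly null from here on.
            throw new ArgumentNullException(nameof(name));
        }

        return "Hello, " + name;
    }
}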

@CyrusNajmabadi

Not every programmer has the skill or time to write custom analyzers to mitigate shortcomings of the native compiler

I disagree. The cost of doing something like this is extremely small. So either this is a significant issue for you, in which case writing the analyzer is a cheap and easy way to solve the problem. Or it isn't a significant issue, in which case it doesn't really matter if it's solved :)

After all the point of non-nullable-references is to introduce references that cannot be null.

This was never the point. The point was to diminish the quantity of null reference exceptions, and to introduce a system that had a low amount of false positives and false negatives.

in line 2 instead of a correct warning "Expression is always false." in line 1 with #nullable enable?

It would be fine to introduce an analyzer to give such a warning. No one is stopping that.
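
For readers wondering what such an analyzer could look like, below is a rough illustrative sketch (not an existing or official analyzer). It uses the public Roslyn analyzer APIs; the diagnostic id and names are made up, it deliberately ignores flow state, and declarations with initializers such as string s = null; would need a similar check on variable declarators.

using System.Collections.Immutable;
using Microsoft.CodeAnalysis;
using Microsoft.CodeAnalysis.Diagnostics;
using Microsoft.CodeAnalysis.Operations;

[DiagnosticAnalyzer(LanguageNames.CSharp)]
public sealed class StrictNullAssignmentAnalyzer : DiagnosticAnalyzer
{
    // Made-up diagnostic id and messages, for illustration only.
    private static readonly DiagnosticDescriptor Rule = new DiagnosticDescriptor(
        id: "STRICT001",
        title: "Null assigned to non-nullable reference",
        messageFormat: "'{0}' is declared non-nullable and should not be assigned null",
        category: "Nullability",
        defaultSeverity: DiagnosticSeverity.Warning,
        isEnabledByDefault: true);

    public override ImmutableArray<DiagnosticDescriptor> SupportedDiagnostics
        => ImmutableArray.Create(Rule);

    public override void Initialize(AnalysisContext context)
    {
        context.ConfigureGeneratedCodeAnalysis(GeneratedCodeAnalysisFlags.None);
        context.EnableConcurrentExecution();
        context.RegisterOperationAction(AnalyzeAssignment, OperationKind.SimpleAssignment);
    }

    private static void AnalyzeAssignment(OperationAnalysisContext context)
    {
        var assignment = (ISimpleAssignmentOperation)context.Operation;

        // Only flag assignments whose value is the constant null.
        if (!assignment.Value.ConstantValue.HasValue || assignment.Value.ConstantValue.Value != null)
            return;

        // Use the declared type of the target symbol, not the flow-adjusted type.
        ITypeSymbol? declaredType = assignment.Target switch
        {
            ILocalReferenceOperation local => local.Local.Type,
            IParameterReferenceOperation parameter => parameter.Parameter.Type,
            IFieldReferenceOperation field => field.Field.Type,
            IPropertyReferenceOperation property => property.Property.Type,
            _ => null
        };

        if (declaredType is { IsReferenceType: true, NullableAnnotation: NullableAnnotation.NotAnnotated })
        {
            context.ReportDiagnostic(Diagnostic.Create(
                Rule,
                assignment.Value.Syntax.GetLocation(),
                assignment.Target.Syntax.ToString()));
        }
    }
}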


LWChris commented Jul 3, 2020

This was never the point.

If that is so, then this is the root cause of my problems with that behavior, and apparently of some others' too. Because everything that has been named, said, documented or marketed about this feature promises exactly what you say "was never the point".

Just read the introduction of the official documentation:

C# 8.0 introduces nullable reference types and non-nullable reference types that enable you to make important statements about the properties for reference type variables:

  • A reference isn't supposed to be null. When variables aren't supposed to be null, the compiler enforces rules that ensure it's safe to dereference these variables without first checking that it isn't null:
    • The variable must be initialized to a non-null value.
    • The variable can never be assigned the value null.
  • A reference may be null. When variables may be null, the compiler enforces different rules to ensure that you've correctly checked for a null reference:
    • The variable may only be dereferenced when the compiler can guarantee that the value isn't null.
    • These variables may be initialized with the default null value and may be assigned the value null in other code.

[...]
With the addition of nullable reference types, you can declare your intent more clearly.

  • The term "non-nullable reference types" is misleading, because they are still considered nullable enough to allow assignment without warning, even if you explicitly declared otherwise at every possible location.
  • "that enable you to make important statements about the properties for reference type variables", except that these statements are going to be ignored at the next best opportunity.
  • "you can declare your intent more clearly", but the compiler is very eager to ignore your declaration and replace it with its own assumptions.
  • "the compiler enforces rules that ensure it's safe to dereference these variables without first checking that it isn't null", which turns out to be "you can shatter that enforcement by writing x == null somewhere", and then we will urge you to introduce more checks instead of allowing you to get rid of unnecessary checks.
  • "The variable can never be assigned the value null.", except you wrote x == null somewhere, or you used the var keyword like advised. And it's not just "not an error", it's not even a warning anymore.

What's even worse, the documentation continues to compare it to the "earlier" behavior:

This new feature provides significant benefits over the handling of reference variables in earlier versions of C# where the design intent can't be determined from the variable declaration. The compiler didn't provide safety against null reference exceptions for reference types:

  • A reference can be null. The compiler doesn't issue warnings when a reference type is initialized to null, or later assigned to null. The compiler issues warnings when these variables are dereferenced without null checks.
  • A reference is assumed to be not null. The compiler doesn't issue any warnings when reference types are dereferenced. The compiler issues warnings if a variable is set to an expression that may be null.

Comparing this to the current behavior:

  • "earlier versions of C# where the design intent can't be determined from the variable declaration" - now I can seemingly declare design intent at the declaration, but somewhere in the code I may accidentally change the believed intent, so we're back at "you have to read all the code to know whether a variable is considered 'not null' or 'nullable'".

Statements about when a reference "can be null":

  • "The compiler doesn't issue warnings when a reference type is [...] later assigned to null." - yup, got that behavior back.
  • "The compiler issues warnings when these variables are dereferenced without null checks." - yup, got that behavior back as well.

Statements about when a reference "is assumed to be not null" (in fact, the feature was marketed to get rid of "assume" and introduce "declare", which is stronger):

  • "The compiler doesn't issue any warnings when reference types are dereferenced." - you may now see warnings even if you declare "not null", so this got even worse.
  • "The compiler issues warnings if a variable is set to an expression that may be null." - not if you used the var keyword, or checked for null anywhere above. Got worse as well.

I hope you can see what a gigantic letdown this feature appears to have become, compared to what was promised. I want you to take a step back and not look at why a decision was made, but look at the bigger picture of what the feature now brings to the table compared to what it is said to bring to the table.

Optimizing the behavior under the premise of "show as few warnings as possible" and "signal 'everything is fine, you don't have to change a thing' as long as possible" is creating a meaningless tool. NRT/NNRT promised and is marketed to let me get rid of vague assumptions and replace them with rigid declarations. But everything that has been done since then has softened those "declarations" back into "assumptions". These new assumptions may be smarter than before, which is a fine thing in itself, but why introduce "better algorithms" as a new feature rather than as an improvement of the already existing feature?

I'm genuinely disappointed. It was announced as "Declare if a reference can be null." and has devolved into "We try to hopefully figure out correctly, most of the time, whether a reference may be [allowed to be set to] null, and will show you some warnings at some places where we assume they are appropriate; and if you think long enough about them you may even realize that the problem is not actually where we mark your code as problematic, but at an utterly different place in your code that has nuked our fragile heuristics, which silently overrode your declared intent."


LWChris commented Jul 3, 2020

In short:

  • The goal was to create a feature that allows developers to escape the "might be null" situation.
  • This would have enabled developers to code in a context where they don't have to worry about NREs, even without tedious null checks at every place.
  • Using the premise "null checks are implicit declarations of nullability" instead of "a declaration of non-null is a declaration of non-null" creates a confusing behavior and inhibits the goal of creating a null-safe context, because now, checking for null on a non-null variable is not considered unnecessary, but instead destroys the whole null-safe context, re-introducing the necessity to check for nulls everywhere else.


LWChris commented Jul 3, 2020

I can really not stress enough how much of a logical break in compiler behavior this is. Consider this code:

var x = 4;
var y = 2 << x;
x = x * 1.0;

In the first line, the compiler infers the type of x to be int, so the second line compiles fine. The third line, although harmless from a mathematical point of view, will yield an error, because this line of code is considered wrong when x is an int. The compiler doesn't bother to implicitly assume "everything will be fine", even if it could technically check that.

Now watch what happens when I declare the type of x to be double, like so:

double x = 4; // The type is declared "double"
var y = 2 << x; // Writing "int << double" is wrong, let's make this an error
x = x * 1.0; // Assigning a "double" value to a "double" variable is fine

As you can see, the compiler will treat my explicit type declaration as given, and therefore say "line 2 is wrong".

Now look at what the compiler does with my declared non-nullability:

#nullable enable
string s = "s"; // The type is declared "non-nullable"
bool flag = s == null; // This only makes sense if "s" is nullable; let's assume "nullable" instead
int length = s.Length; // You should not dereference a nullable variable

If in my example from above the compiler behaved in the same way like NRTs are implemented, then this would be the result:

double x = 4; // The type is declared "double"
var y = 2 << x; // This only makes sense if "x" is an int; let's assume "int" instead
x = x * 1.0; // You cannot assign a double to x without casting

The idea of some analysis checking "What you do with that variable only makes sense if the variable is treated differently from what was declared", which then does not lead to the conclusion "you should not do this with that variable" but instead to "I assume a different type than what is stated at the declaration"... that's insane!


HaloFour commented Jul 3, 2020

@LWChris

You're comparing inference of type to inference of nullability, which aren't the same. NRTs aren't a part of the type system, they are only a linter.


LWChris commented Jul 3, 2020

@HaloFour

I am aware that NNRTs vs. their respective NRTs are not two different types, the way VTs vs. their corresponding NVTs are. I know they are realized with annotations in order to maintain backwards compatibility, which makes the behavior a question of individual annotation analysis as opposed to existing type logic. But that doesn't mean that the newly introduced analysis may not strive to produce a behavior for NNRTs that is consistent with the compiler behavior for VTs.

The decision to treat a null-check as more declarative of a variable's nullability than the actual declaration is just that - a decision. It's not a necessity introduced by the way nullability had to be implemented under the hood.

It may be clear to you why nullability for reference types is not implemented the same way as for value types, but there is no reason why it may not be treated the same way. The written code provides all the necessary information to accurately infer nullability and then treat the variable identically to how it is done for value types. The decision to treat var differently for RTs than for VTs, and to treat Type vs Type? differently from int vs int?, was a (sensible) decision made after establishing the assumption that "a null check implies nullability". But maybe establishing that assumption was wrong in the first place, and it should be exchanged for "a variable declared non-null is non-null". This achieves the same result for NPEs, but gives more accurate recommendations for the user and consistent behavior, which is what people likely expected and what was described/promised/marketed.


HaloFour commented Jul 3, 2020

@LWChris

The decisions made around NRTs were made giving priority to making them easy to adopt with minimal code changes. Generating a ton of warnings around common existing code patterns that are not at risk for producing NREs is not a good experience. The patterns for dealing with nullability in reference types have been established over 18 years, whereas nullability in value types wasn't at all possible until NVTs were added to the runtime and language. NRTs are a giant pile of compromises optimized around pragmatism and a low barrier to entry to adoption and ecosystem penetration.

I view NRTs like I do generics in Java, which were designed with similar considerations to the existing language and ecosystem. Both can be pretty trivially intentionally defeated by the developer, but for non-pathological code they effectively guard against incorrect usage of the values the vast majority of the time.


LWChris commented Jul 3, 2020

Thanos travelled the whole universe to collect all the Infinity Stones. He created the most powerful tool one can imagine. And then he used it to cure the symptoms, not the cause.

They created a tool that had the power to abolish the cause (reference type variables might be null), and they chose to cure the symptom (NPEs).

The result is a giant pile, but that pile is a mess, with behavior that is unexpected, unnecessary, and inconsistent with everything you know about nullability and C#.

int num = 1;
bool flag = num == null;
string text = num.ToString();
        
string text2 = "s";
bool flag2 = text2 == null;
int length = text2.Length;

There is no reason why num == null has a warning, but text2 == null doesn't, why text2.Length has a warning, but num.ToString() doesn't.

string text = "s";
int length = text.Length;
bool flag = text == null;
int length2 = text.Length;

There is no reason why text.Length is okay in the second line but not in the fourth.

#nullable enable
string? text = "s";
bool flag = text == null;
int length = text.Length;
#nullable disable
int length2 = text.Length;

There is no reason why the fourth line should issue a helpful warning, but not the sixth.

Feel free to implement this bunch of mildly-useful helpers, but name them "better reference type nullability annotations" which "hint" something and don't waste/abuse the Type? syntax when you actually mean to implement [MaybeNotNull] Type.

What is implemented and planned is certainly not introducing anything close to the concept of "non-nullable-reference-types" which "enforce" something.

If it must be so convenient and frictionless to adopt the feature, then why even give users a choice? If the ruleset and logic has to be softened so much that basically 99% of the code stays warning-free when enabled, then why make it an opt-in feature at all?


HaloFour commented Jul 3, 2020

@LWChris

And then he used it to cure the symptoms, not the cause.

They created a tool that had the power to abolish the cause (reference type variables might be null), and they chose to cure the symptom (NPEs).

The cause of an NRE is dereferencing null, not assigning null. Nulls are perfectly valid values, and almost completely unavoidable in the runtime. Would NRTs be designed differently if we were designing .NET and C# 1.0 right now? Almost certainly yes. But we're not designing C# 1.0. We're designing C# 8.0+, and we don't throw out 18 years of existing patterns and practices for the sake of idealism.

The result is giant pile, but that pile is a mess, with behavior that is unexpected, unnecessary, and inconsistent with everything you know about nullability and C#.

Everything you knew about nullability was that every reference variable could be null. Now you get some hints to help you identify where they might be null where you thought it couldn't be. The reality is that this feature has found and identified potential NREs in many existing codebases, including Roslyn itself. If it generated a ton of false positives and I had to question the compiler on every warning I'd be less inclined to trust its judgment. If I had to rewrite swaths of code to enable it my adoption rate would plummet and I'd likely introduce a whole slew of new bugs.

There is no reason why text.Length is okay in the second line but not in the fourth.

Your pathological code is giving the compiler reason to doubt its flow analysis. It assumes you know better. I'd suggest not writing pathological code.

You know how easy it is to get a String into a List<Integer> in Java? Trivial. Know how often that's actually a problem? Negligibly. The perfect is the enemy of the good.

There is no reason why the fourth line should issue a helpful warning, but not the sixth.

There absolutely is. Issuing a new warning is a breaking change.

I'm sorry that you don't like NRTs. I don't like everything about them either. But they are the culmination of more than six years of public discourse on the subject and comparison of a very wide range of options.

@CyrusNajmabadi

Optimizing the behavior under the premise of "show as few warnings as possible" and "signal 'everything is fine, you don't have to change a thing' as long as possible" is creating a meaningless tool.

We've found literally dozens of cases in our own product where unintentional null refs could occur. We've been able to find and fix them because of NRT. Furthermore, the design has allowed us to gradually adopt NRT (generally on a per-file basis, but sometimes more granularly than that). That adoption has been reasonably painless because the number of false negatives and false positives is quite low.

You say this is a meaningless tool. However, based on the results alone in our codebase I would say that is absolutely not the case.


LWChris commented Jul 3, 2020

Let me be clear. What has been done was good and fruitful and is certainly a good thing. I do not question that at all. But I don't see why this is treated as the whole solution for NRT. As I said, if the design is chosen as is, under the premise of avoiding any warning that is not most likely valid and definitely showing a warning where there is a mistake, then this analysis should be its own separate opt-in feature, or not be opt-in at all (that depends on the C# definition of "breaking change"; to me a compile-time or runtime error is a breaking change, but a design-time warning is not).

But the other opt-in feature, the one that actually introduces what is called "non-nullable reference types" and the Type? syntax for "nullable reference types", should be what it says. And even if enabling that requires me to change the code at the appropriate places by either adding ? or deleting an unnecessary null check - if this comes in exchange for guaranteed runtime safety without cluttering my code with lots of unnecessary null checks, that is still a pretty strong incentive to me.

@CyrusNajmabadi

But I don't see why this is treated as the whole solution for NRT.

It has not been treated as the whole solution for NRT. We've already done more work in the NRT space, and I expect that we will continue to do so for many more versions of the language.

@CyrusNajmabadi

As I said, if the design is chosen as is, under the premise of avoiding any warning that is not most likely valid and definitely showing a warning where there is a mistake,

Your premise is incorrect. There has never been a design that would "definitely show a warning where there is a mistake". The designs have all been about striking a balance: getting lots of true positives/negatives while minimizing false positives/negatives. There has never been a design where a warning is always shown when there is a mistake.

@CyrusNajmabadi

that should be what it says

This is fine to have as a personal opinion on your part. However, you need to recognize that it is not shared by other people (including myself). The goal is to take pragmatic, not dogmatic, approaches here to maximize the utility of this for the vast majority of the user base. Strict adherence to rules that would just cause users more pain, while providing no additional benefit in terms of correctness, is seen as a net negative.

Look, I totally get your views and why you want it to be this way :) Indeed, your views on this are how we initially implemented this feature, thinking that would be the right way to go. However, we could immediately see across the board how bad an experience this was causing for users. Simply ignoring that and sticking with the position of "that should be what it says" would have done nothing good for this feature. It would have made it much harder for people to adopt, and it wouldn't have produced any fewer null refs for the people that did adopt it. It was, objectively, a strictly worse outcome for the entire ecosystem.

@CyrusNajmabadi

this is still a pretty strong incentive to me.

And there's the rub. That's your assessment of the pros/cons. But they apply to you. We have to look holistically at the entire ecosystem out there. Thousands of companies, millions of devs, and tons of sources of feedback indicating a wide variety of views on this topic. Your pros/cons tell you that you would prefer the stricter view of things. And trust me, you're not alone. However, you're not a majority or plurality.

In language design, this is the norm. Every decision we make leaves some group out or feeling like their own personal weighting has gone negative with our decision. That's simply part of the process, and it can feel sucky when you're in that group. Trust me, I'm on the LDM and I've been in that group likely at least once per language revision. However, I recognize that even though the decision made might not be the best one for my personal needs, it's still likely the best one for the ecosystem as a whole.


LWChris commented Feb 11, 2021

Using explicit types also has another benefit - it makes nullable types stand out - for me these days all reference types should be non-nullable unless there is a good reason and seeing these exceptional cases clearly is helpful.

But it has the disadvantage of requiring more usings, plus if the type is more complex, e.g. a return type from a complex LINQ statement, it gets really wordy. I really hope that at some point in the future "var" will be "dumb" again, i.e. simply take the right-hand side type, without any assumptions or pseudo-clever guesswork.

If the return type of my method is int?, var should take int?. If the return type is IEnumerable<Task<int?[]>>, var should take IEnumerable<Task<int?[]>>, not IEnumerable<Task<int?[]>>?.


LWChris commented Feb 11, 2021

I just hope the "stricter" ruleset will become available

This is a prime case for an analyzer. Indeed, any time the topic is around "the language should be more restrictive and produce errors in this code that was previously legal", all that really means is "write an analyzer to find and mark the patterns you don't want to allow".

Now you don't need to wait on anything. You could write this today and have the stricter semantics you desire.

Can you link me to such an analyzer, i.e. one where declaring string s strictly forbids s = null with #nullable enable? I tried looking into this, but having never done anything remotely akin to writing an analyzer, this task is way above the amount of time I can spend on learning how to do it right now.


markm77 commented Feb 11, 2021

I really hope at some point in the future "var" will be "dumb" again, i. e. simply take the right hand side, without any assumptions or pseudo-clever guesswork.

Chris, obviously I was also disappointed about var being always nullable given my strong preference for "non-nullable by default and where possible".

However in terms of the future I wonder if a more realistic and in some ways more useful improvement might be a new non-nullable var with a new keyword (say varn) which allows the programmer to express that a new variable should be non-nullable with inferred type. varn would allow accidental nullable assignments (including at point of declaration) to be flagged without losing the convenience of inferred types (var/varn). This would also allow the programmer to see nullability at a glance (no hover required 😀) due to variables being var or varn. Just a thought....


markm77 commented Feb 11, 2021

Here's another way use of var in nullable contexts could be improved today without language change.

What if the IDE used different syntax highlighting for the var keyword to reveal whether a variable is non-nullable convertible or not?

That way any unwanted nullable assignments would change the colour of the variable's var keyword and the "truly nullable variables" in a file would be clearly flagged to the programmer by the colour of the var keyword. (I suspect this is very possible because in Rider when using explicit types the "non-nullable convertible" variable types have a greyed-out ?.)

Something changing colour is not as good as a warning but this would not involve language change and seems feasible.

Edit: Or, dare I suggest it, instead of colour change maybe the IDE could suffix var with a ? annotation when the variable is not "non-nullable convertible".


markm77 commented Feb 11, 2021

Here's another option for improving var with NRT in future which would not break existing code.

Why not make the var inferred type nullable or not based on both the RHS of the declaration and subsequent assignments? I.e. set it based on "flow engine nullability"?

Then the IDE could indicate when the var inferred type is nullable with a suffixed ? annotation?

This would remove the completely unnecessary confusion of "flow engine nullability" differing from "static nullability" in the case of var.

(Optionally, we could also consider e.g. nonnullable var and nullable var to force the nullability to fix detection issues or express a requirement. Explicit types also do the job but are not great for re-factoring and sometimes quietly convert to casts.)

@HaloFour

It's important to note that with NRTs it's not the type of the variable that is nullable or not, it's the state of the variable at that point in the flow. A string? will be treated as not null after it is assigned a non-null value, or after a null check. A string can be forced into a maybe-null state, as you've demonstrated above. The goal is avoiding NREs, not enforcing any kind of type correctness.
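
A short illustration of that distinction between declared type and flow state (a sketch based on the behavior already demonstrated in this thread):

#nullable enable
using System;

class FlowStateDemo
{
    static void Demo()
    {
        string? maybe = Environment.GetEnvironmentVariable("HOME");
        int a = maybe.Length;       // warns: declared string?, flow state is maybe-null

        maybe = "fallback";
        int b = maybe.Length;       // no warning: still declared string?, flow state is not-null

        string forced = "value";
        bool test = forced == null; // the null test makes the compiler doubt the declaration
        int c = forced.Length;      // warns: declared string, flow state is now maybe-null
    }
}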

If you want to make the language stricter and warn/error in cases that would not otherwise throw an NRE, that sounds like a good place to use an analyzer.

As for visualization of the nullability flow, I'd suggest that's possible, at least in Visual Studio where I believe you can still write extensions that affect the view. I imagine one would have to be written for every IDE. I can certainly see value in better visualizations here, although that's not really a C#/Roslyn concern.


LWChris commented Feb 16, 2021

If you want to make the language stricter and warn/error in cases that would not otherwise throw an NRE, that sounds like a good place to use an analyzer.

@HaloFour Could you kindly point me to a resource where I can get such an analyzer? As I said, I have never done anything with compilers/syntax trees/nodes etc., and after reading about it for an hour, I figured getting into this stuff would take far more time than I can spend right now learning how to write analyzers, let alone the one that is needed to introduce "true NRTs". Yet I'd really like to have those strict checks sooner rather than later. Has anybody ever done this?

When I try to Google for it, all I can find is people either proposing Roslyn should do it, or people complaining that Roslyn's doing what it does now and not what was promised.

@CyrusNajmabadi

or people complaining that Roslyn's doing what it does now and not what was promised.

I'm sorry, we did not promise this. If there are docs saying such a promise exists, we will need to change them as they are not accurate.

@CyrusNajmabadi

@HaloFour Could you kindly point me to a resource where I can get such an analyzer?

The idea is: if there is enough interest by someone, they could write this themselves. It's a non-goal of the NRT initiative to address every request and need of all developers. Where there are gaps, analyzers can be written by those interested in filling them.

Has anybody ever done this?

Not that I know of. It seems like something that isn't really that compelling at the end of the day for most people.


markm77 commented Feb 16, 2021

For the interest of those on this thread, JetBrains state here that they are planning, in the near future, to support a bulk quick fix which removes unnecessary ? from types as identified by the flow analysis. For me this will mean that code clean-up converts most types to explicit types and then removes the unwanted ?, tightening nullability greatly. I'm sure other IDEs will, or may already, provide similar bulk clean-up.

I do think a useful next step for NRT could be the introduction of nonnull var (inferred type with non-null nullability) so that tooling could provide a similar clean-up when using var (i.e. tooling would convert var to nonnull var where possible so nullables stand out).


LWChris commented Feb 16, 2021

I'm sorry, we did not promise this.

Yes, yes you did. Read my comment from earlier. Unfortunately, it's impossible to highlight text with font or background-color, so I had to edit a screenshot.

[screenshot: the documentation excerpt quoted earlier, with passages highlighted in yellow, red, and green]

  • Yellow: note how it says "the compiler", not "an analyzer that you have to code on your own".
  • Red: it says "enforces", "must", "can never", and "is a non-nullable reference type". Not "warns", "should", "shouldn't", or "may be a non-nullable reference type".
  • Green: This paragraph says "didn't" as if this was now a thing of the past. It then continues to exactly describe what the current NRTs are.

So I think "it wasn't supposed to be that" may be true because the developers implemented something utterly different than what the announcement said. The announcement said "we're going to build a seven story plaza" and the developers said "We aimed for a small bungalow, it was never supposed to be a plaza".

  • A reference isn't supposed to be null. [...] the compiler enforces rules [...]:
    • The variable can never be assigned the value null.

[...] Any variable where the ? isn't appended to the type name is a non-nullable reference type.

These statements alone in conjunction clearly rule out that string name = null is valid code in #nullable enable in native C# 8.0, because the compiler (not my own analyzer) enforces (not hints about) the rules that this variable is a non-nullable reference type (not assumed to be nullable because I assigned null anyway) that can never (not shouldn't) be assigned null.

To be clear, what was done is good, but the only difference is that the warnings that were already there are now hopefully more accurate. That is miles off from phrases like "enforces" or "can never".

@HaloFour

@LWChris

As noted, the behavior with var was changed after this feature was released, specifically because the user experience was awful. Even so, I don't see where the referenced documentation contradicts the behavior.

I think you're reading the behavior that you want out of that document. The compiler enforcing the rules does not mean that this code is no longer legal. It's long been a goal of this feature to be as non-disruptive to the billions of existing lines of perfectly fine C# code as possible as trying to make it stricter would guarantee that it would not be adopted.

@CyrusNajmabadi

@LWChris Can you clarify what in that document makes you think we promised this?


333fred commented Feb 16, 2021

Any variable where the ? isn't appended to the type name is a non-nullable reference type.

var is not a type name. var is a signal to the compiler of "please figure out the type for me". Initially, the nullability of this was also inferred from the right-hand side. However, after significant user experience pain, this was changed.

Green: This paragraph says "didn't" as if this was now a thing of the past. It then continues to exactly describe what the current NRTs are.

No, it describes exactly how reference types used to be treated, not how they are currently treated. When nullable is enabled, and you've declared that a variable should not be null, you get a warning if you attempt to assign null to it. When you attempt to dereference something that we know could be null, you get a warning that it could be null. Neither of these behaviors appear when nullable is turned off, and that's what that green paragraph describes.

These statements alone in conjunction clearly rule out that string name = null is valid code in #nullable enable in native C# 8.0, because the compiler (not my own analyzer) enforces (not hints about) the rules that this variable is a non-nullable reference type (not assumed to be nullable because I assigned null anyway) that can never (not shouldn't) be assigned null.

Yes, and that code is indeed not valid. I don't really know what the point of this sentence was.

I think a key point here is that you are inferring a different meaning for var than what LDM thinks of var. You are inferring "This should be directly replaced by the type of the thing on the right". We infer "This is an indication by the user that they don't care what the type of the thing on the right is." The key difference here is that the latter definition allows for the nullability of that parameter to be relaxed to improve the user experience, while the former definition does not. If you care about ensuring that null is never assigned to a location, you actually care about the type of the local, because nullability is part of the type.


markm77 commented Feb 16, 2021

I think a key point here is that you are inferring a different meaning for var than what LDM thinks of var.

I also think this is a key point. I've accepted that var means "automatically determine the type including nullability from context", and the current definition, while technically different, has no difference in effect from "auto-determine var nullability from flow analysis", which takes more contextual information into account than "auto-determine var nullability from the RHS", which ignores information. (*)

So I no longer think var nullability behaviour should change although I would much prefer the static type nullability to match the flow engine nullability to reduce user confusion and provide clearer information.

* Another way to say the same thing is that var should infer the tightest nullability that doesn't cause nullability warnings. This is surely the best contextual, automatic choice. The current behaviour of var has the same effect as this.


LWChris commented Feb 16, 2021

@333fred I am not talking about the var issue anymore, but about #nullable enable as a whole concept.

This whole thread in a nutshell:

It was my initial understanding that var would mean "right-hand side type" (like for value types), and in that case var? would have been a useful addition. However, I learned that (for reference types)

  1. T x doesn't mean the variable is non-nullable.
  2. string x = null; is compilable without errors, just a warning.
  3. var x = y; always means x will be nullable, no matter if y is nullable or not.
  4. There is no way to make any variable non-nullable by declaration and have the compiler reject assignments of null so that it won't compile, unless I write my own analyzer.
  5. #nullable enable does not do what its documentation says.

I argued that 1, 2, and 3 are inconsistent with the behavior for value types, and that it'd be much less confusing if the compiler tried to match the value type behavior for reference types at compile time.

People keep explaining
a) why it isn't that way (technical reasons, as in "actual types for VTs" vs. "annotations for RTs").
b) why it shouldn't be that way (practical reasons, as in "people won't enable it when that means they must add thousands of ?s before the project compiles again").
c) that the documentation isn't suggesting any of my points 1-4 and therefore 5 isn't true.

I knew about a).

I understand b). But I'm one of the not-actually-so-few persons who don't mind if our projects don't compile if the benefit is having compile-time safety against NREs. Therefore, I made an argument that the current implementation of #nullable enable should be the default and not some opt-in feature. Sure, this means your project could be littered with thousands of warnings, but everyone agrees that they are carefully designed to be meaningful in almost all cases (so there's no reason to not show them to anyone), and you are still free to ignore them (so making this the unalterable default behavior doesn't actually force any immediate actions upon anybody). This in turn would leave the actual opt-in feature to be the thing where opting in means you really want bullet-proof compile-time safety, even if that means you'll need to spend time checking and adjusting possibly thousands of lines of code, file by file, project by project.

I don't see c). How could anyone believe that the announcement and documentation matches the current behavior? The statements I quoted earlier do very clearly state that

  • A reference isn't supposed to be null. When variables aren't supposed to be null, the compiler enforces rules that ensure it's safe to dereference these variables without first checking that it isn't null:
    • The variable must be initialized to a non-null value.
    • The variable can never be assigned the value null.

[...]

Any variable where the ? isn't appended to the type name is a non-nullable reference type.

  • In string name = null;, the ? isn't appended to the type name "string".
  • Therefore, the variable name is a non-nullable reference type.
  • Therefore name "isn't supposed to be null".
  • Therefore the "variable must be initialized to a non-null value".
  • Therefore a compiler that "enforces" this rule would not allow me to successfully compile this code.

The current compiler, however, compiles string name = null; just fine; it only shows a warning. Therefore the documentation does not match the feature.

The compiler enforcing the rules does not mean that this code is no longer legal.

"enforcing" - You keep using that word. I don't think it means what you think it means. 😅
If a rule is violated, "enforcing the rules" and "successfully compiling code that ignores the rules" are mutually exclusive. That's the definition of "enforcing".

[screenshot: code sample illustrating the three cases listed below]

[YES] The compiler enforces that a class implementing an interface is either abstract or implements all interface members with suitable accessible members.
[NO] The compiler enforces that the variable s must be initialized to a non-null value.
[YES] The compiler enforces that the variable i must be initialized to a non-null value.

@HaloFour

@LWChris

"enforcing" - You keep using that word. I don't think it means what you think it means. 😅

I dunno, when the police are "enforcing" speeding rules more often than not I end up with a warning. 😁


LWChris commented Feb 17, 2021

@HaloFour To me, that means they are apparently not enforcing them 😁

@CyrusNajmabadi

and have the compiler reject assignments of null so that it won't compile, unless I write my own analyzer.

Or you can make those warnings into errors.

@CyrusNajmabadi

The current compiler however compiles string name = null; just fine, it only shows a warning. Therefore the documention does not match the feature.

The compiler enforces this by telling you about the issue. You can choose to have that be blocking or not. But we leave that choice to you; we do not want to enforce that on an astronomically large ecosystem with thousands and thousands of groups and projects, all with varying needs in terms of how they can adopt this concept.
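
For reference, one way to make these diagnostics blocking (an assumption about the project setup, not necessarily what was used in this thread) is to promote the nullable warnings to errors in the project file; recent .NET SDKs accept the "nullable" shorthand, and individual ids such as CS8600 or CS8602 can be listed instead:

<PropertyGroup>
  <Nullable>enable</Nullable>
  <WarningsAsErrors>nullable</WarningsAsErrors>
</PropertyGroup>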


CyrusNajmabadi commented Feb 17, 2021

Build started...
1>------ Build started: Project: ConsoleApp1, Configuration: Debug Any CPU ------
1>You are using a preview version of .NET. See: https://aka.ms/dotnet-core-preview
1>C:\Users\cyrusn\source\repos\ConsoleApp1\ConsoleApp1\Program.cs(23,20,23,24): error CS8600: Converting null literal or possible null value to non-nullable type.

[screenshot: the same assignment flagged as an error in the IDE]

@CyrusNajmabadi

I don't think it means what you think it means.

I think you're confusing "blocking" with "enforcing". Our goal with NRT was to be non-blocking for users. Any reasonably sized project will take a fair amount of effort to move to NRT. In Roslyn, we've been continuing to NRTify our whole codebase, and it's been taking months and months and months. It's imperative that people be able to make this sort of progress in an enforced, but non-blocking, fashion. That's what we provided, and it was very intentional and necessary to be viable.

Our defaults seem to not be what you want. However, you can change the defaults here to better suit you if you want.


LWChris commented Feb 17, 2021

I am satisfied with this result. There are some quirks, and it still sucks that var doesn't infer the nullability from the RHS, but you can mostly work around that.

However, I really honestly do recommend changing the documentation to not use the phrase "enforce a rule". I'm convinced the majority of all readers will completely misunderstand it.

Ask 100 C# programmers who aren't involved in compiler development:

The compiler enforces that you cannot assign null to the variable name.
What will happen if you write name = null; anyway?

A) The code doesn't compile
B) The code compiles with a warning
C) The code compiles without a warning

I bet at least 90 will answer "A". 🙂
(Edit: I asked my colleagues, and they either said "A" or "I'd say A, but the fact that you ask me an obvious question makes me suspicious." 😄)


jez9999 commented Aug 22, 2022

I have to agree with @LWChris completely on this one. I didn't have a particularly strong opinion on the introduction of non-nullable types, but if you're going to do it, why make something as fundamental as the var keyword opt-out by default? The nomenclature around it really makes you as a programmer think that "turning on non-nullable types for the project" will make reference types non-nullable by default. The fact that var x = new Foo(); creates a nullable Foo in this context is unexpected, whether or not flow analysis still warns about null assignment. If flow analysis is so good, why bother with non-nullable types at all? If existing code gets warnings, why turn this feature on for existing code? It's opt-in.

@HaloFour

@jez9999

why make something as fundamental as the var keyword opt-out by default?

NRT is opt-in specifically because it involves a migration for any existing codebase. The goal is to eventually get to 100% adoption, where NRT is the default behavior, but that will take time. Given the vast impact NRTs have on existing codebases, compromises had to be made so that they would be easier to adopt. When someone is evaluating opting into NRTs for their codebase, if they see thousands of warnings they will be less likely to consider the effort necessary to carry out the migration, so it's desirable to ensure that those warnings only appear in cases where NREs are more likely to occur.

The feedback from the first release is that inferring the nullability based on the RHS resulted in way too many extraneous warnings. While warning on assignment may help you identify where a null accidentally flowed into the local, the assignment itself is benign until that local is used. At that point NRTs would provide a warning, so the language is still providing that guard against NREs.

The perfect is the enemy of the good.

If flow analysis is so good, why bother with non-nullable types at all?

Flow analysis only really works well within the body of a method. It becomes infinitely more complicated across the boundaries between methods. Sure, the compiler could have inferred the metadata on the boundary based on use, but at that point the nullability becomes a part of the contract of that method, and the team is pretty consistent on the belief that said contract should be explicitly defined.

As for locals, I agree, having to explicitly notate which are nullable has a lot less value and could have been driven entirely by flow analysis.


markm77 commented Aug 22, 2022

I hope one day we get a v2 of non-nullable reference types, with real non-nullable types at the runtime level. That would solve a host of confusions and complexities that arise because flow-analysis nullability is different from underlying nullability.

It would also remove the need for run-time null checks of non-nullables and mean var could infer non-nullable or nullable type based on the right hand side and stay consistent with the underlying type.


jez9999 commented Aug 22, 2022

@HaloFour Of course turning on nullable reference types for existing codebases is likely to create a bunch of warnings. But it's opt-in. I guess I misunderstood the purpose of this, because I assumed it would be for new code. Why not turn non-nullables on for new code, and use flow analysis on existing code?

@HaloFour

@jez9999

I guess I misunderstood the purpose of this, because I assumed it would be for new code.

The goal is to make it work across all codebases and to make it relatively easy to adopt for maintainers of existing projects. Avoiding spurious warnings is intentional, especially if they arise in common cases.

jnm2 converted this issue into a discussion Aug 22, 2022