
Converting from @NullnessUnspecified to @NullnessUnspecified is unsound #69

Closed
cpovirk opened this issue Sep 9, 2019 · 30 comments
Labels: design (An issue that is resolved by making a decision about whether and how something should work.), nullness (For issues specific to nullness analysis.)

cpovirk commented Sep 9, 2019

Probably many of you have realized this from the beginning, but I'm thinking about it for the first time now. And I see that our draft design lists only 2 "permitted [but] possibly unsafe" conversions...

  • conversion from @Nullable to @NullnessUnknown
  • conversion from @NullnessUnknown to @NotNull

...which suggests that "@NullnessUnknown to @NullnessUnknown" might not be at the top of others' minds, either.

As a concrete example, if ConcurrentHashMap were a standalone type with unspecified nullness for all type parameters and type usages, then a caller could write:

ConcurrentHashMap<@NullnessUnspecified String, @NullnessUnspecified String> map =
    new ConcurrentHashMap<>();
map.put("foo", map.get("foo"));

If @NullnessUnspecified converts to @NullnessUnspecified without a warning, then that code would pass our type checking but blow up at runtime when it tries to insert null into the map.

Maybe that's the best we can do -- I've certainly said a couple times that unannotated code could be rough to deal with in the absence of stub annotation files -- but it would be nice if we could do better.

Maybe we end up with rules that issue a warning whenever an expression uses @NullnessUnspecified. (This "just give up" attitude fits in spirit with my #33 (comment) :)) It would be nice if we could avoid that in the cases that are known to be safe, like converting @NullnessUnspecified to @Nullable, but perhaps that's hard to specify or implement?
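
To make that distinction concrete, here is a minimal sketch (plain Java; the draft annotations appear only in comments, and the specific calls are just an illustration):

import java.util.concurrent.ConcurrentHashMap;

class UnspecifiedConversions {
  static void example(ConcurrentHashMap<String, String> map, String key) {
    // The map type is unannotated, so get() returns @NullnessUnspecified String.
    String value = map.get(key);

    // Safe direction: viewing the unspecified value as @Nullable can't
    // introduce an NPE by itself, so ideally no warning here.
    /* @Nullable */ String maybe = value;

    // Risky direction: passing the unspecified value where null may be
    // rejected -- the unspec-to-unspec conversion this issue is about.
    // ConcurrentHashMap.put throws NullPointerException on a null value.
    map.put(key, value);
  }
}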

Or maybe we treat @NullnessUnspecified like an extreme version of wildcard capture: Wildcard capture creates a unique type for every occurrence in an expression, but @NullnessUnspecified would create a unique type for every occurrence everywhere (or close to it). Then those types would behave like the parametric nullness type T extends @Nullable Something, differing only in that unsafe operations would generate warnings instead of errors. But that might be too severe:

Foo<@NullnessUnspecified Bar> a = ...;
Foo<@NullnessUnspecified Bar> b = a; // warning? but "why?"

[edit: Note that all of this, for better or for worse, arises organically out of my current mental model for @NullnessUnspecified.]

Probably there's a smarter way to do it, but I wonder if we can realistically expect tools to all find it unless we provide guidelines. If we're lucky, maybe this is a rare case in which we really can say that our rules apply to implementation code only?

kevinb9n commented Oct 4, 2019

I think we will not want to do anything that declares assigning unspec to unspec to be problematic, because a legacy codebase that is just starting to get with the program is going to be full of that everywhere.

I think that for unannotated code, continuing to blow up at runtime the way it always has probably IS "doing better", because the user hasn't yet opted into having anything different happen. If we say "but turning on the checker was the opt-in", it can make turning on the checker a very difficult task.

cpovirk commented Oct 14, 2019

Warning for unspec to unspec feels in line with the rest of our proposal: If I remove the nullness information from a class, that might create some warnings where they didn't exist before, and it might demote some errors into warnings.

Granted, unspec to unspec isn't identical to nullable to unspec. But I think it's a difference of degree and not of kind. In handwavy math:

  • Unspec to unspec might "really" be any of {not null to not null, nullable to nullable, not null to nullable, nullable to not null}. Only 1/4 is unsafe.
  • Contrast with nullable to unspec, which is "really" one of {nullable to nullable, nullable to not null}. 1/2 is unsafe.

I also fear that making unspec to unspec a warning could be difficult to spec and implement, as noted above.

But I take your response to be more "People won't want warnings when they adopt the tool." As @wmdietlGC recently put it:

Wasn't our approach to handling sound tools vs bugfinders that we specify certain things as hard errors and other things as various kinds of warnings. Certain warnings, like these warnings for conversions to/from @NullUnspec could be ignored by default by certain tools. Other tools could elect to make them hard errors.

This may be another case in which we have different ideas of how pleasant it will be to interact with unannotated code. In this issue, you seem more in favor of making it easy (and less sound); in the other issue, you seemed more in favor of preserving soundness. Maybe the common thread is that you have a goal of keeping the spec simple? But I feel that I am putting words in your mouth.

cpovirk commented Oct 14, 2019

I was also going to say:

If unspec to unspec isn't even a warning, then newly annotating a library could produce compile errors in its callers, not just warnings:

@DefaultNullnessUnspecified
interface Lib {
  void foo(String s);
  String bar();
}

class User {
  void use(Lib lib) {
    lib.foo(lib.bar());
  }
}

Now the library can break the caller:

@DefaultNotNull
interface Lib {
  void foo([@NotNull] String s);
  @Nullable String bar();
}

However, this may well end up being the case, anyway, thanks to, say, a return type of List<@NullUnspecified String>. Once that's refined to either specified nullness, callers are likely to have an error -- unless we can make those conversions possible with a warning, as discussed in #33 -- and maybe this "newly annotating a library" case raises the stakes for #33?
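
To sketch that (hypothetical names; the draft annotations again only in comments):

import java.util.List;

// Before annotation (under @DefaultNullnessUnspecified), callers see
// List<@NullnessUnspecified String> as the return type of names().
interface Repo {
  List<String> names();
}

class RepoUser {
  void use(Repo repo, List<String> sink) {
    // Fine today: List<@NullnessUnspecified String> into
    // List<@NullnessUnspecified String>.
    sink.addAll(repo.names());
    // If Repo is later annotated so that names() returns
    // List<@Nullable String> while sink is "really" a List<@NotNull String>,
    // this line becomes a type-argument mismatch -- an error, not a warning --
    // unless the conversions discussed in #33 are permitted with a warning.
  }
}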

cpovirk commented Oct 17, 2019

Converting between, say, one List<@NullnessUnspecified String> and another List<@NullnessUnspecified String> is dicier: using my handwavy math from earlier, 2/4 possibilities are unsafe -- well, at least for the unusual case of mutable lists. (Wildcards help but not in all cases.)
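
A sketch of why mutation is what makes it dicey (plain Java, annotations only in comments, names invented):

import java.util.ArrayList;
import java.util.List;

class ListAliasing {
  static void example() {
    // Suppose this list is "really" a List<@NotNull String>.
    List<String> a = new ArrayList<>();
    a.add("hello");

    // Unspec-to-unspec: both sides look like List<@NullnessUnspecified String>,
    // so no warning today.
    List<String> b = a;

    // But if b is "really" a List<@Nullable String>, writing through it
    // breaks the invariant that readers of a rely on.
    b.add(null);
    for (String s : a) {
      System.out.println(s.length()); // NPE for the null element
    }
  }
}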

@stephan-herrmann

See #80, in particular the part about type argument containment.

cpovirk commented Nov 8, 2019

@kevin1e100 noted that the question of whether @NullnessUnspecified -> @NullnessUnspecified is a warning can be conceptualized as follows, assuming that we treat @NullnessUnspecified as an existential type: Is each occurrence of @NullnessUnspecified a new type, or are they all the same type?

cpovirk mentioned this issue Nov 8, 2019

@stephan-herrmann

assuming that we treat @NullnessUnspecified as an existential type: Is each occurrence of @NullnessUnspecified a new type, or are they all the same type?

Regarding any type arguments <@NullnessUnspecified Bar>, new type or same type shouldn't be the question. You should ask: is one type argument contained in the other (see #80)? To this question the analogy to existential types would answer: yes (but only one level deep :) ).
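
A plain-Java wildcard analogy for the containment question (loosely treating @NullnessUnspecified like a wildcard; just an illustration of "one level deep"):

import java.util.List;

class ContainmentAnalogy {
  // One level deep: the type argument "? extends CharSequence" is *contained*
  // in "?", so this assignment is allowed even though the two arguments are
  // not the same type.
  static void oneLevel(List<? extends CharSequence> a) {
    List<?> b = a;
  }

  // Containment does not recurse: one level further down, the outer type
  // arguments must match exactly, so the analogous assignment is rejected.
  static void twoLevels(List<List<? extends CharSequence>> a) {
    // List<List<?>> b = a;  // does not compile: only one level deep
  }
}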

cpovirk commented Nov 26, 2019

(Thanks. I am still trying to get my head around all that :))

@kevinb9n

In generics, each occurrence of ? is treated as a unique captured type. It makes sense because there's quite a variety of different actual types it could take on. Here there are only two.

I am having a hard time understanding why we would be concerned about this. It seems to me that nothing can really go wrong until and unless someone finally tries to impute a known nullness onto the thing. Then that can be a problem completely regardless of how many unspec-to-unspec assignments might have happened. Am I missing something?

@kevinb9n

Also, if we establish that there's no concern about converting from unspec to unspec, checkers would still be free to decide to complain for their own reasons, right? I'd optimistically like to think we can close this as not worth worrying about.

cpovirk commented Dec 11, 2019

I'm not sure I understand the distinction that you're making.

In my original example, we have an unannotated Map type, and we call:

map.put("foo", map.get("foo"));

map.get returns @NullnessUnspecified; map.put accepts @NullnessUnspecified[*]. But this operation is unsafe if map.get returns null and map.put rejects null.

My take on this is that the "conversion" from @NullnessUnspecified to @NullnessUnspecified is unsafe.

[*] Maybe it's a distraction to pull in the parameterized Map type here. The same logic applies even if we write, say, Ascii.toLowerCase(Strings.emptyToNull(string)): In both cases, we're passing the result of a method that might return null to a method that might reject it. The types involved are all @NullnessUnspecified.

cpovirk commented Dec 11, 2019

Possibly you're implicitly assuming that code that uses @NullnessUnspecified is necessarily unsafe? But I think we've generally worked to avoid this, by permitting converting @NullnessUnspecified to @Nullable but warning when converting from @NullnessUnspecified to @NonNull. Again, I am putting words in your mouth :)

kevinb9n commented Dec 11, 2019

I think that where there is specified nullness, or a boundary between specified and unspecified, then we have some cause for action. When we're not even at a boundary with specified nullness, there is no basis for complaint. A user who wants to be protected from the kind of bug you're talking about should start annotating.

cpovirk commented Dec 11, 2019

That's certainly a place where we could draw the line. I'm curious what you feel are the advantages to drawing it there -- beyond simplicity, of course, which is a big one!

It's true that, in some sense, we don't need to "do" anything here: We can note that passing @NullnessUnspecified where @NullnessUnspecified is required is unsound -- possibly we agree that that's the case and disagree only on how much to worry about it? -- and checkers can do what they want with that information (#73).

Probably part of what I'm trying to do here is uphold the idea of what a "warning" means to me: It means, "When the code I'm using is annotated for nullness, the checker might find a type error here." That's definitely the case with @NullnessUnspecified => @NullnessUnspecified. It sounds like you're envisioning something like that but with the additional caveat that at least one type's nullness must be known. The result is that we miss some unsound calls, but we also avoid my "warning? but 'why?'" example above.

Is that fair?

@kevin1e100

Worth keeping in mind that the map's implementation code also theoretically could be checked, and errors raised there.  My point being that no NPEs can happen at the call site for map.put(k, map.get(k)) (assuming map is non-null), but map.put's implementation can throw.

Sure one approach a conservative tool could take is to warn at the call site. The rationale for that is something like, "since unspecified nullness code is probably not checked, instead let's alert users of such code." But another approach is to be extremely conservative when checking nullness unspecified implementation code, and e.g. warn if put's implementation just dereferences given values. The advantage of that is that warnings appear where NPEs can actually happen, not at call sites.

Note that by the same argument even conservative tools may reasonably not warn on map.put(k, null).

Conversely, map.get(k).foo() will presumably draw warnings in conservative tools, since the possible NPE is right there when dereferencing a nullness unspecified value. But that's a conversion from @NullnessUnspecified to @NonNull.
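
To make the second approach concrete, a sketch (an invented, simplified stand-in for an unannotated map implementation; not what any particular tool does today):

import java.util.HashMap;
import java.util.Map;

// Simplified stand-in for an unannotated, null-hostile map implementation.
class NullHostileMap<K, V> {
  private final Map<K, V> delegate = new HashMap<>();

  V put(K key, V value) {
    // An extremely conservative checker for nullness-unspecified code could
    // warn right here -- where the NPE can actually happen -- rather than at
    // the call site map.put(k, map.get(k)).
    if (value == null) {
      throw new NullPointerException();
    }
    int hash = key.hashCode(); // dereference of a nullness-unspecified value
    return delegate.put(key, value);
  }

  V get(K key) {
    return delegate.get(key);
  }
}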

@kevinb9n

Okay, here's an attempt at explaining where my philosophy is coming from:

Consider a file in a legacy context that has no nullness annotations at all, and doesn't interact with any annotated APIs either. I think we'd agree that analyzing it should not cause any errors (well, at least if there aren't obvious howlers like a = null; a.toString()). But should it trigger any of our "unchecked" warnings?

I claim it should not; it should be entirely clean. Later, as the file or its dependencies are gradually annotated, that is what I think should introduce issues of either degree.

I just can't seem to imagine any way in which such warnings would be useful to anyone, when they haven't even provided any nullness information yet.

cpovirk commented Jan 3, 2020

The perspective I'm looking at this from is more like this: I have an app that has historically had a lot of NullPointerException bugs. I want to stop having those. The process by which I stop having those is that I turn on nullness checking (including warnings) for a file or set of files, and then I iteratively add annotations to make the errors and warnings go away.

From this perspective, if I pass a result from one unannotated library to another unannotated library, I want a warning, since that's one of the places that might produce a NullPointerException. Then I can go and annotate those libraries.

Now, in an ideal world, I might just annotate all the libraries I use. But in practice, this can be a lot of work. So I want to rely on warnings to tell me what I need to annotate.


Coming back to your example: I have less of a feel for exactly what I'd want there. Certainly I don't want the warning I envisioned in my original post:

@DefaultNullnessUnspecified
...
List<Bar> a = ...;
List<Bar> b = a; // warning? but "why?"

There could be a lot of reasons not to want that particular warning, but the primary one in my mind is that -- as far as I can tell -- that particular assignment doesn't create any new problems: You had a List<@NullnessUnspecified Bar> before, and you have a List<@NullnessUnspecified Bar> afterward.

This is in contrast to code like the following:

@DefaultNullnessUnspecified
...
List<Bar> a = ...;
a.get(0).foo();

This is sort of like your "obvious howlers," but it's not an outright error. I'm not sure what your position is on it: warning or not? It's roughly a conversion of @NullnessUnspecified to @NonNull, which suggests that it's a warning. (That would be my position.) But it's happening in unannotated code, which suggests that it's not. [edit: Sorry, I forgot that your previous post was talking specifically about unannotated code that uses only other unannotated code. So I think you're in favor of a warning here, too?]

Then there's an intermediate case:

@DefaultNullnessUnspecified
...
List<Bar> a = ...;
SomeOtherUnannotatedApi.doSomethingWithList(a);

This is a conversion of @NullnessUnspecified to @NullnessUnspecified. It does potentially introduce new problems. But it's also still happening in unannotated code. I would advocate for a warning here; I gather that you would advocate against.


From my perspective, the hard part is specifying what counts as "creating a new problem." It might have something to do with the idea of handing an object from code that was analyzed with the nullness checker (not "code that was annotated for nullness"; that's neither necessary nor sufficient) to code that was not. I think this is part of what @kevin1e100 is talking about, but I need to take more time to think about that. And/or it might have something to do with handing an object from "this code" (meaning "this file" or "this set of files that I always recompile as a unit, so I know I'll notice if an edit to one file breaks another" or something) to "other code."

(out of time; maybe more on Monday)

cpovirk commented Jan 6, 2020

A couple other points:

  • It would be a little sad if users avoided annotating new code because they found that to be easier (because that way produced fewer warnings). Of course the goal is that users want to annotate code because it saves them trouble in the long run, not because we bludgeon them into doing so. But it's still a little sad for us to just slightly discourage annotating code by deciding that annotating code is the step that produces nullness warnings.

  • Another downside of deciding to warn only if we have nullness information: That decision might make it harder to measure progress in gradually annotating a codebase: It increases the chance that, each time you fix a warning, you are creating new ones.

  • Maybe this is an argument for checkers to provide greater configurability than just "strict mode" and "lenient mode." Maybe they'd have one mode in which all unsound conversions are errors, another in which @NullnessUnspecified <-> @NullnessUnspecified is only a warning, and a third in which all unsound conversions involving @NullnessUnspecified are only warnings. [edit: ...and @kevin1e100 has pointed out yet another way for checkers to approach this :)]

Of course this all continues to fall under the umbrella of "checker behavior," which we don't need to specify. Plus, at least some users are just going to disable all warnings. So maybe I am continuing to spend too much time worrying about it and encouraging you to do the same :)

kevinb9n commented Jan 9, 2020

The perspective I'm looking at this from is more like this: I have an app that has historically had a lot of NullPointerException bugs. I want to stop having those. The process by which I stop having those is that I turn on nullness checking (including warnings) for a file or set of files, and then I iteratively add annotations to make the errors and warnings go away.

That makes sense. But I'm considering that it might work better to say that as I annotate more code, I expose more problems. That seems to control warning-overload/warning-blindness better to me. It means that the first step can be turning on analysis for the whole codebase and having only a small number of "obvious howlers" to deal with before I get back to a clean state. This appeals to me, and it's still just as easy for users to track their burndown.

I do acknowledge that some users may miss that and get a false sense of security from a clean build, though.

@DefaultNullnessUnspecified
...
List<Bar> a = ...;
a.get(0).foo();

This is sort of like your "obvious howlers," but it's not an outright error. I'm not sure what your position is on it: warning or not? It's roughly a conversion of @NullnessUnspecified to @NonNull, which suggests that it's a warning. (That would be my position.) But it's happening in unannotated code, which suggests that it's not. [edit: Sorry, I forgot that your previous post was talking specifically about unannotated code that uses only other unannotated code. So I think you're in favor of a warning here, too?]

My hope for everything is to keep it simple so I've been hoping we can just always treat a dereference as identical to a type conversion to @NonNull. So yeah, a warning of the "unchecked" sort but not a hard error.

@DefaultNullnessUnspecified
...
List<Bar> a = ...;
SomeOtherUnannotatedApi.doSomethingWithList(a);

This is a conversion of @NullnessUnspecified to @NullnessUnspecified. It does potentially introduce new problems. But it's also still happening in unannotated code. I would advocate for a warning here; I gather that you would advocate against.

Right, I think that's what I'm saying.

  • It would be a little sad if users avoided annotating new code because they found that to be easier (because that way produced fewer warnings). Of course the goal is that users want to annotate code because it saves them trouble in the long run, not because we bludgeon them into doing so. But it's still a little sad for us to just slightly discourage annotating code by deciding that annotating code is the step that produces nullness warnings.

But wait: isn't the alternative that we hugely discourage turning on the checker in the first place, because you get the whole entire spew up front?

  • Another downside of deciding to warn only if we have nullness information: That decision might make it harder to measure progress in gradually annotating a codebase: It increases the chance that, each time you fix a warning, you are creating new ones.

This could be an issue.

  • Maybe this is an argument for checkers to provide greater configurability than just "strict mode" and "lenient mode." Maybe they'd have one mode in which all unsound conversions are errors, another in which @NullnessUnspecified <-> @NullnessUnspecified is only a warning, and a third in which all unsound conversions involving @NullnessUnspecified are only warnings.

I just imagine these checkers having any number of checkboxes for any number of different scenarios a user might want to be warned about. And I'm fine with Checker X deciding to add "converting from unspec to unspec" as another checkbox. It doesn't bother me because it's clearly an "extra" which doesn't have to pollute how users think about the core type system.

cpovirk commented Jan 9, 2020

But wait: isn't the alternative that we hugely discourage turning on the checker in the first place, because you get the whole entire spew up front?

On the one hand, I would expect this to happen even if the checker doesn't warn on @NullnessUnspecified <-> @NullnessUnspecified conversions. Notably, every time I dereference a value returned by a method (well, an unannotated method), I get a warning. On the other hand, @NullnessUnspecified <-> @NullnessUnspecified warnings will be an order of magnitude(?) more frequent.

I just imagine these checkers having any number of checkboxes for any number of different scenarios a user might want to be warned about. And I'm fine with Checker X deciding to add "converting from unspec to unspec" as another checkbox. It doesn't bother me because it's clearly an "extra" which doesn't have to pollute how users think about the core type system.

A checkbox sounds good to me.

That said... :)

I still hope that this checkbox will look less like an "extra" once we have a unified model of what @NullnessUnspecified means. Under such a model, passing @NullnessUnspecified to an API that accepts @NullnessUnspecified will naturally appear to be a possible NPE, not least because it is a possible NPE. (But today, the giant problem with that thinking is that I don't have a principle to distinguish that case, which can introduce the possibility of an exception, from the b = a case, which can't.)

(Still: Even if the checkbox isn't "logically" an extra, I can imagine having it off by default.)

kevinb9n commented Jan 10, 2020

As a "for now" working decision I think we should be able to agree not to worry about this. This keeps things simpler. We want to be able to rely on users' understanding of types as their guide to the vast majority of our behavior, and in an identity type conversion there should be nothing to check. Finally, there is the fact that checkers are always free to complain more.

If we really want to keep pushing on this, I feel like it would make more sense to come back to it late this year than early.

Acceptable?

cpovirk commented Jan 10, 2020

Ah, thanks, I think that gets at the heart of it: You want users to be able to view @NullnessUnspecified Foo as a specific type that is subject to some special unchecked conversions. (Of course you've said this before, but I hadn't connected it deeply to this issue.) In contrast, I want to view it as "either @Nullable Foo or Foo, but we don't know which," with the unchecked conversions arising when one of those two specific types would produce an error.

I'm not sure what that means for this issue. If we end up deciding that your model is the way we want users to think, then warnings for an "identity conversion" definitely sound like an extra. If we end up deciding that the other model is the way we want users to think, then they aren't conceptually an extra.

Maybe ultimately there's nothing for us to "do" here beyond noting in the design proposal that this conversion is unsound (which @kevin1e100 did a while back). We'll also need to pick the model that we want users to use, but that's not really specific to this issue. Then again, it's not really specific to any issue, so maybe this issue is exactly the kind of "specific answerable question" that we are seeking :)

cpovirk commented Jan 15, 2020

Sorry, I'm finally replying to this from over a month ago:

Worth keeping in mind that the map's implementation code also theoretically could be checked, and errors raised there.  My point being that no NPEs can happen at the call site for map.put(k, map.get(k)) (assuming map is non-null), but map.put's implementation can throw.
...
Note that by the same argument even conservative tools may reasonably not warn on map.put(k, null).

Would the following be an accurate summary?

  • Your proposal guarantees no NPEs[*] if all code (including nullness-unspecified code) passes the checker without errors or warnings.
  • Your proposal is to warn when converting from @NullnessUnspecified to @NonNull but not to warn when converting from @Nullable to @NullnessUnspecified.

[*] OK, barring throw new NullPointerException() and other unusual cases :)

cpovirk commented Jan 15, 2020

We may be developing 4 levels of badness / conservative-ness -- possibly even mapping to 4 levels of suppression strings?

(In the following, I'm going to use the term "type errors." That might not be perfect, so feel free to substitute "spec violations" or your preferred term, and hopefully the levels still make sense?)

  1. type errors not involving @NullnessUnspecified
  2. type errors from converting @NullnessUnspecified to @NonNull
  3. type errors from converting @Nullable to @NullnessUnspecified
  4. type errors from converting @NullnessUnspecified to @NullnessUnspecified
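
For concreteness, one conversion at each level, sketched with invented signatures (the annotated API's annotations are shown as comments, since the sketch itself is plain Java):

import java.util.Map;

class Levels {
  // Hypothetical annotated API (draft annotations written as comments).
  static /* @Nullable */ String findNickname(String user) { return null; }
  static int length(/* @NonNull */ String s) { return s.length(); }

  // The Map here stands in for an unannotated API: every type in its
  // signatures is @NullnessUnspecified.
  static void example(Map<String, String> unannotatedMap, String user) {
    String nickname = findNickname(user);

    int n1 = length(nickname);                          // (1) @Nullable -> @NonNull
    int n2 = unannotatedMap.get(user).length();         // (2) unspec -> @NonNull (dereference)
    unannotatedMap.put(user, nickname);                 // (3) @Nullable -> unspec
    unannotatedMap.put(user, unannotatedMap.get(user)); // (4) unspec -> unspec
  }
}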

Based on that, we can talk about the guarantees that each level provides:

  • A tool that complains only about (1) is "lenient mode." It guarantees no NPEs as long as you stick to annotated, checked code.
  • A tool that complains only about (1)+(2) is @kevin1e100's, which guarantees no NPEs as long as all code (including nullness-unspecified code) is run through the checker.
  • A tool that complains only about (1)+(2)+(3) is @kevinb9n's. I'm not sure if it offers any particular new guarantees(?), but it seems like a good, pragmatic tradeoff between signal and noise.
  • A tool that complains about (1)+(2)+(3)+(4) is one that guarantees no NPEs† as long as at least all annotated code is run through the checker.

† What it really guarantees is something more like "no NPEs that are the fault of code that was run through the checker." After all, the unannotated, unchecked code it uses could have all sorts of nullness bugs :) The best way to view this guarantee might be in terms of "logical contradictions" or other specification / type-theory terms.

cpovirk commented Feb 6, 2020

@stephan-herrmann shared a post he wrote that shows it's not as simple as in my previous post.

The problem with my post is that it suggests that there are only 4 kinds of conversions[*]. But in fact there are more, thanks to parameterized types. For example, consider passing a List<@NonNull String> to a method that accepts a List<@NullnessUnspecified String>. I think this might end up in my Level (2), since it permits the method to write into the list, effectively putting a @NullnessUnspecified String into a @NonNull String-shaped hole.
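
A sketch of that write-through (invented names, plain Java, annotations only in comments):

import java.util.ArrayList;
import java.util.List;

class ParameterizedLeak {
  // Nullness-unspecified API: the parameter is effectively
  // List<@NullnessUnspecified String>, and the method writes such a value in.
  static void fillDefaults(List<String> list, String maybeNull) {
    list.add(maybeNull); // may be null
  }

  static void caller() {
    // The caller's list is "really" a List<@NonNull String>.
    List<String> names = new ArrayList<>();
    fillDefaults(names, System.getenv("MISSING_VAR")); // getenv can return null

    // A @NullnessUnspecified String has landed in a @NonNull String-shaped
    // hole, so this dereference can NPE.
    for (String name : names) {
      System.out.println(name.length());
    }
  }
}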

Maybe all conversions of parameterized types can still be fit into one of these 4 levels. It wouldn't surprise me. But we may end up with rules that are more complex than we'd hoped.

[*] OK, that was an oversimplification to begin with, given the existence of "parametric nullness" (and perhaps "assumed parametric nullness") in addition to our "ternary nullness."

kevin1e100 commented Feb 7, 2020

(thanks for sharing the post, interesting read!)

Thanks for writing up the levels. I haven't thought about this enough to have an opinion on whether it's an "exhaustive" list but it is a pretty interesting conceptualization I think. Thanks!

I'm not sure I'm making a specific proposal here (I was slotted in at level 2 above): I suspect different tools will want to do different things, and I'm not sure how many tools will even attempt to make any kind of guarantee. But, as alluded to in the post you mentioned, one useful "guarantee", I think, could indeed be "no NPEs from object dereferences in checked code". (Something like that is coming to Eclipse if I read the shared post correctly?)

Note that while that's a nice guarantee, I think, it's still pretty unsatisfying in a way: checked code can still "cause" NPEs in unchecked code by passing unexpected null values to the unchecked code. ("Causing" may be a contentious description here, but at least to me it seems interesting that NPEs in unchecked code can not only come from "coding errors" within that code but also from assumed or even informally documented invariants/expectations that checked code might inadvertently violate.) Which is where I think the next level of guarantee that was mentioned comes in.

Conversely, Kotlin's type system I think is basically on the level below, where NPEs can also happen in "checked" code, namely when it interacts with unchecked code. I think that's also a very useful way of going about it, not least b/c it seems more "symmetric". Note though that Kotlin, having its own compiler, can insert runtime checks to eagerly detect unexpected null values, which I suspect is quite helpful to users. I'm mentioning this b/c many static analysis tools for Java would have little or no ability to insert such runtime checks, and by going to the level above, the tools would essentially ask the programmer to hand-write comparable checks (for better or worse).
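
For illustration, roughly the kind of check Kotlin's compiler inserts automatically at such boundaries, and that a Java checker at the stricter level would instead ask the programmer to write by hand (a sketch, not any tool's actual output):

import java.util.Map;
import java.util.Objects;

class BoundaryCheck {
  static String fromUnspecifiedApi(Map<String, String> unannotatedMap, String key) {
    String value = unannotatedMap.get(key); // unspecified nullness
    // Hand-written equivalent of an eager runtime null check at the boundary.
    return Objects.requireNonNull(value, "no value for " + key);
  }
}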

Just to point this out, in addition to choosing one of these levels, tools could poke other "holes" into their "guarantees": for instance, they may not make any guarantees with respect to concurrency, or apply heuristics in places. A lot of that is probably orthogonal to checked vs. unchecked though. And it's maybe worth keeping in mind that tools like Infer don't really have the checked/unchecked distinction, as they can "step through" unannotated methods in many cases. Those tools' challenges I believe are different, e.g., "stepping through" method calls is complicated by dynamic dispatch/subclassing.

msridhar commented Feb 7, 2020

FWIW, NullAway aims for the condition of "no false negatives in practice, excluding interactions with unchecked code." We say "no false negatives in practice" since NullAway makes several deliberate unsound assumptions even for checked code (e.g., assuming that methods are side-effect-free) that empirically did not lead to real-world crashes, at least on Uber's mobile code base. The only common cause of NPEs that remains with NullAway is interactions with unchecked code, as we did not discover a sound solution with a reasonable annotation burden. Further details are in our paper.

Just sharing as another data point of what current tools do. /cc @lazaroclapp

cpovirk commented Feb 20, 2020

(Thanks, I have finally started going through the NullAway paper. I've left small comments on a couple issues so far, and I compared the annotation/suppression burden you saw to what I saw with Guava and our proposed annotations.)

On the main topic here: See also a followup post by @wmdietlGC. Some thoughts on it:

  • I would identify the "problem" at b.elem = null;. This is a conversion from @Nullable to @NonNull, so it's "Level 1" in my earlier post. (We could tweak the example to replace null with a @NullUnspecified value that happens to be null, which would shift this into Level 2.)

As for the tools I outlined, their guarantees aren't affected:

  • A tool that complains only about (1) still doesn't catch this problem, just as it doesn't catch any problems outside annotated, checked code.
  • A tool that complains only about (1)+(2) still catches the error under the usual condition that all code (including unannotated) is run through the checker. (It would also catch the problem if we replaced null with a @NullnessUnspecified value that happened to be null.)
  • A tool that complains only about (1)+(2)+(3) still guarantees the same as (1)+(2).
  • A tool that complains about (1)+(2)+(3)+(4) still guarantees no NPEs that are "the fault of the annotated code" as long as all annotated code is run through the checker.

But I wouldn't be surprised if there were still ways to break those guarantees :)

@Maaartinus

@cpovirk

Another downside of deciding to warn only if we have nullness information: That decision might make it harder to measure progress in gradually annotating a codebase: It increases the chance that, each time you fix a warning, you are creating new ones.

I can imagine that there's no way for the user to stop warnings in a reasonable time. Once you start to annotate, the number of detectable problems may keep growing for a long and unpredictable time. So for -Werror people, it's pretty unusable. Moreover, even if they finish in their lifetime (which is not certain), they'll go from zero warnings to zero warnings, seeing no progress at all. ;)

I guess there should be a different progress measure. Something analogous to test coverage... something that increases with the invested effort -- every added annotation improves the score unless it exposes a major problem... just an idea.

cpovirk commented Sep 17, 2021

I think there's probably no action for "the JSpecify group" to take on this:

  • Unspec->unspec is definitely unsound. After all, it's essentially the state of Java code today, and Java code has been known to throw NPE :)
  • Many tools won't end up reporting errors for any check that involves unspec at all, much less one that involves 2 unspecified types. They can do as they want.
  • The spec has reasons for defining relations that tools could use to be stricter if they wanted. We may play with such a tool internally to identify, e.g., unannotated JDK APIs used from Guava, but who knows whether we'll do anything beyond that?
  • Our sample inputs include unspec->unspec conversions, labeled with jspecify_nullness_not_enough_information. If some tool author does want to issue warnings for unspec->spec conversions but not unspec->unspec, then that would be something for us to look into changing. We can do that if the need arises.

cpovirk closed this as completed Sep 17, 2021
kevinb9n added the design label Mar 13, 2024
kevinb9n added this to the 0.3 milestone Mar 27, 2024