# Converting from @NullnessUnspecified to @NullnessUnspecified is unsound #69
I think we will not want to do anything that declares assigning unspec to unspec to be problematic, because a legacy codebase that is just starting to get with the program is going to be full of that everywhere. I think that for unannotated code, continuing to blow up at runtime the way it always has probably IS "doing better", because the user hasn't yet opted into having anything different happen. If we say "but turning on the checker was the opt-in", it can make turning on the checker a very difficult task.
Warning for unspec to unspec feels in line with the rest of our proposal: If I remove the nullness information from a class, that might create some warnings where they didn't exist before, and it might demote some errors into warnings. Granted, unspec to unspec isn't identical to null to unspec. But I think it's a difference of degree and not of kind. In handwavy math:
I also fear that making unspec to unspec a warning could be difficult to spec and implement, as noted above. But I take your response to be more "People won't want warnings when they adopt the tool." As @wmdietlGC recently put it:
This may be another case in which we have different ideas of how pleasant it will be to interact with unannotated code. In this issue, you seem more in favor of making it easy (and less sound); in the other issue, you seemed more in favor of preserving soundness. Maybe the common thread is that you have a goal of keeping the spec simple? But I feel that I am putting words in your mouth.
I was also going to say: If unspec to unspec isn't even a warning, then newly annotating a library could produce compile errors in its callers, not just warnings:

```java
@DefaultNullnessUnspecified
interface Lib {
  void foo(String s);
  String bar();
}

class User {
  void use(Lib lib) {
    lib.foo(lib.bar());
  }
}
```

Now the library can break the caller:

```java
@DefaultNotNull
interface Lib {
  void foo([@NotNull] String s);
  @Nullable String bar();
}
```

However, this may well end up being the case, anyway, thanks to, say, a return type of
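To make the breakage concrete, here is a sketch of the checker's view of the caller after the library is annotated (the diagnostic wording is hypothetical):

```java
class User {
  void use(Lib lib) {
    // bar() now returns @Nullable String, while foo() requires a non-null
    // String under @DefaultNotNull, so the previously-clean call becomes
    // a type error rather than a mere warning:
    lib.foo(lib.bar());
    // error: passing @Nullable String where non-null String is required
  }
}
```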
Converting between, say, one

See #80, in particular the part about type argument containment.

@kevin1e100 noted that the question of whether

Regarding any type arguments
(Thanks. I am still trying to get my head around all that :))
In generics, each occurrence of ? is treated as a unique captured type. It makes sense because there's quite a variety of different actual types it could take on. Here there's only two. I am having a hard time understanding why we would be concerned about this. It seems to me that nothing can really go wrong until and unless someone finally tries to impute a known nullness onto the thing. Then that can be a problem completely regardless of how many unspec-to-unspec assignments might have happened. Am I missing something?
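For reference, the plain-Java capture behavior being alluded to (a minimal, JSpecify-free sketch):

```java
import java.util.ArrayList;
import java.util.List;

class CaptureDemo {
  void demo() {
    List<?> a = new ArrayList<String>();
    List<?> b = a;
    // Each occurrence of `?` is captured as a fresh type, so even this
    // "obviously safe" round-trip is rejected by javac:
    // b.add(a.get(0)); // error: CAP#1 is not compatible with CAP#2
    Object o = a.get(0); // reading is fine; elements are at least Object
  }
}
```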
Also, if we establish that there's no concern about converting from unspec to unspec, checkers would still be free to decide to complain for their own reasons, right? I'd optimistically like to think we can close this as not worth worrying about.
I'm not sure I understand the distinction that you're making. In my original example, we have an unannotated map:

```java
map.put("foo", map.get("foo"));
```

My take on this is that the "conversion" from

[*] Maybe it's a distraction to pull in the parameterized
Possibly you're implicitly assuming that code that uses
I think that where there is specified nullness, or a boundary between specified and unspecified, then we have some cause for action. When we're not even at a boundary with specified nullness, there is no basis for complaint. A user who wants to be protected from the kind of bug you're talking about should start annotating.
That's certainly a place where we could draw the line. I'm curious what you feel are the advantages to drawing it there -- beyond simplicity, of course, which is a big one! It's true that, in some sense, we don't need to "do" anything here: We can note that passing

Probably part of what I'm trying to do here is uphold the idea of what a "warning" means to me: It means, "When the code I'm using is annotated for nullness, the checker might find a type error here." That's definitely the case with

Is that fair?
Worth keeping in mind that the map's implementation code also theoretically could be checked, and errors raised there. My point being that no NPEs can happen at the call site for

Sure, one approach a conservative tool could take is to warn at the call site. The rationale for that is something like, "since unspecified-nullness code is probably not checked, let's instead alert users of such code." But another approach is to be extremely conservative when checking nullness-unspecified implementation code, and e.g. warn if

Note that by the same argument even conservative tools may reasonably not warn on

Conversely,
Okay, here's an attempt at explaining where my philosophy is coming from: Consider a file in a legacy context that has no nullness annotations at all, and doesn't interact with any annotated APIs either. I think we'd agree that analyzing it should not cause any errors (well, at least if there aren't obvious howlers like

I claim it should not; it should be entirely clean. Later, as the file or its dependencies are gradually annotated, that is what I think should introduce issues of either degree.

I just can't seem to imagine any way in which such warnings would be useful to anyone, when they haven't even provided any nullness information yet.
The perspective I'm looking at this from is more like this: I have an app that has historically had a lot of NullPointerExceptions. From this perspective, if I pass a result from one unannotated library to another unannotated library, I want a warning, since that's one of the places that might produce a NullPointerException.

Now, in an ideal world, I might just annotate all the libraries I use. But in practice, this can be a lot of work. So I want to rely on warnings to tell me what I need to annotate.

Coming back to your example: I have less of a feel for exactly what I'd want there. Certainly I don't want the warning I envisioned in my original post:

```java
@DefaultNullnessUnspecified
...
List<Bar> a = ...;
List<Bar> b = a; // warning? but "why?"
```

There could be a lot of reasons not to want that particular warning, but the primary one in my mind is that -- as far as I can tell -- that particular assignment doesn't create any new problems: You had a

This is in contrast to code like the following:

```java
@DefaultNullnessUnspecified
...
List<Bar> a = ...;
a.get(0).foo();
```

This is sort of like your "obvious howlers," but it's not an outright error. I'm not sure what your position is on it: warning or not? It's roughly a conversion of

Then there's an intermediate case (a sketch of the callee follows this comment):

```java
@DefaultNullnessUnspecified
...
List<Bar> a = ...;
SomeOtherUnannotatedApi.doSomethingWithList(a);
```

This is a conversion of

From my perspective, the hard part in my mind is specifying what counts as "creating a new problem." It might have something to do with the idea of handing an object from code that was analyzed with the nullness checker (not "code that was annotated for nullness"; that's neither necessary nor sufficient) to code that was not. I think this is part of what @kevin1e100 is talking about, but I need to take more time to think about that. And/or it might have something to do with handing an object from "this code" (meaning "this file" or "this set of files that I always recompile as a unit, so I know I'll notice if an edit to one file breaks another" or something) to "other code."

(out of time; maybe more on Monday)
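A sketch of what the unannotated callee in that intermediate case might do (the body of SomeOtherUnannotatedApi is entirely hypothetical; it just shows where a handed-off null would surface):

```java
import java.util.List;

// Unannotated, unchecked code: all nullness below is unspecified.
class SomeOtherUnannotatedApi {
  static void doSomethingWithList(List<Bar> list) {
    for (Bar bar : list) {
      bar.foo(); // NPE at runtime if the caller's list contained a null element
    }
  }
}
```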
A couple other points:
Of course this all continues to fall under the umbrella of "checker behavior," which we don't need to specify. Plus, at least some users are just going to disable all warnings. So maybe I am continuing to spend too much time worrying about it and encouraging you to do the same :)
That makes sense. But I'm considering that it might work better to say that as I annotate more code, I expose more problems. That seems to control warning-overload/warning-blindness better to me. It means that the first step can be turning on analysis for the whole codebase and having only a small number of "obvious howlers" to deal with before I get back to a clean state. This appeals to me, and it's still just as easy for users to track their burndown. I do acknowledge that some users may miss that and get a false sense of security from a clean build, though.
My hope for everything is to keep it simple, so I've been hoping we can just always treat a dereference as identical to a type conversion to
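One plausible reading of that "dereference = conversion" model, sketched with the proposal's annotations (the checker behavior described in the comments is an assumption, not settled spec):

```java
@DefaultNullnessUnspecified
interface Lib {
  String bar(); // return type has unspecified nullness
}

class Client {
  void use(Lib lib) {
    // Under this model, the dereference is treated roughly as if the
    // receiver were first converted to a non-null String:
    lib.bar().length();
    // i.e., approximately: @NotNull String tmp = lib.bar(); tmp.length();
    // So dereferences automatically inherit whatever severity the tool
    // assigns to unspec-to-non-null conversions: error, warning, or silence.
  }
}
```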
Right, I think that's what I'm saying.
But wait: isn't the alternative that we hugely discourage turning on the checker in the first place, because you get the whole entire spew up front?
This could be an issue.
I just imagine these checkers having any number of checkboxes for any number of different scenarios a user might want to be warned about. And I'm fine with Checker X deciding to add "converting from unspec to unspec" as another checkbox. It doesn't bother me because it's clearly an "extra" which doesn't have to pollute how users think about the core type system.
On the one hand, I would expect this to happen even if the checker doesn't warn on
A checkbox sounds good to me. That said... :) I still hope that this checkbox will look less like an "extra" once we have a unified model of what

(Still: Even if the checkbox isn't "logically" an extra, I can imagine having it off by default.)
As a "for now" working decision I think we should be able to agree not to worry about this. This keeps things simpler. We want to be able to rely on users' understanding of types as their guide to the vast majority of our behavior, and in an identity type conversion there should be nothing to check. Finally, there is the fact that checkers are always free to complain more. If we really want to keep pushing on this, I feel like it would make more sense to come back it to late this year than early. Acceptable? |
Ah, thanks, I think that gets at the heart of it: You want users to be able to view

I'm not sure what that means for this issue. If we end up deciding that your model is the way we want users to think, then warnings for an "identity conversion" definitely sound like an extra. If we end up deciding that the other model is the way we want users to think, then they aren't conceptually an extra.

Maybe ultimately there's nothing for us to "do" here beyond noting in the design proposal that this conversion is unsound (which @kevin1e100 did a while back). We'll also need to pick the model that we want users to use, but that's not really specific to this issue. Then again, it's not really specific to any issue, so maybe this issue is exactly the kind of "specific answerable question" that we are seeking :)
Sorry, I'm finally replying to this from over a month ago:
Would the following be an accurate summary?
[*] OK, barring
We may be developing 4 levels of badness / conservativeness -- possibly even mapping to 4 levels of suppression strings? (In the following, I'm going to use the term "type errors." That might not be perfect, so feel free to substitute "spec violations" or your preferred term, and hopefully the levels still make sense?)
Based on that, we can talk about the guarantees that each level provides:
† What it really guarantees is something more like "no NPEs that are the fault of code that was run through the checker." After all, the unannotated, unchecked code it uses could have all sorts of nullness bugs :) The best way to view this guarantee might be in terms of "logical contradictions" or other specification / type-theory terms.
@stephan-herrmann shared a post he wrote that shows it's not as simple as in my previous post. The problem with my post is that it suggests that there are only 4 kinds of conversions[*]. But in fact there are more, thanks to parameterized types. For example, consider passing a

Maybe all conversions of parameterized types can still be fit into one of these 4 levels. It wouldn't surprise me. But we may end up with rules that are more complex than we'd hoped.

[*] OK, that was an oversimplification to begin with, given the existence of "parametric nullness" (and perhaps "assumed parametric nullness") in addition to our "ternary nullness."
(thanks for sharing the post, interesting read!)

Thanks for writing up the levels. I haven't thought about this enough to have an opinion on whether it's an "exhaustive" list but it is a pretty interesting conceptualization I think. Thanks!

I'm not sure I'm making a specific proposal here (I was slotted in at level 2 above): I suspect different tools will want to do different things, and I'm not sure how many tools will even attempt to make any kind of guarantee. But, as alluded to in the post you mentioned, one useful "guarantee", I think, could indeed be "no NPEs from object dereferences in checked code". (Something like that is coming to Eclipse if I read the shared post correctly?) Note that while that's a nice guarantee, I think, it's still pretty unsatisfying in a way: checked code can still "cause" NPEs in unchecked code by passing unexpected null values.

Conversely, Kotlin's type system I think is basically on the level below, where NPEs can also happen in "checked" code, namely when it interacts with unchecked code. I think that that's also a very useful way of going about it, not least b/c it seems more "symmetric". Note though that Kotlin, having its own compiler, can insert runtime checks to eagerly detect unexpected null values, which I suspect is quite helpful to users. I'm mentioning this b/c many static analysis tools for Java would have less or no ability to insert such runtime checks, and by going to the level above, the tools would essentially ask the programmer to hand-write comparable checks (for better or worse). A sketch of such a hand-written check follows this comment.

Just to point this out: in addition to choosing one of these levels, tools could poke other "holes" into their "guarantees": for instance, they may not make any guarantees with respect to concurrency, or apply heuristics in places. A lot of that is probably orthogonal to checked vs. unchecked though. And it's maybe worth keeping in mind that tools like Infer don't really have the checked/unchecked distinction, as they can "step through" unannotated methods in many cases. Those tools' challenges I believe are different, e.g., "stepping through" method calls is complicated by dynamic dispatch/subclassing.
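For illustration, the kind of hand-written boundary check being described, comparable to what the Kotlin compiler inserts automatically (a minimal sketch; legacyLookup stands in for a hypothetical unannotated API):

```java
import java.util.Objects;

class Boundary {
  String checkedCaller() {
    // Fail fast at the checked/unchecked boundary instead of letting a stray
    // null propagate and blow up somewhere far from its source:
    String value = Objects.requireNonNull(
        legacyLookup("key"), "legacyLookup returned null for \"key\"");
    return value.trim(); // safe: value is known non-null past the check
  }

  // Stands in for an unannotated, unchecked legacy API.
  static String legacyLookup(String key) {
    return null; // unspecified nullness: callers get no warning about this
  }
}
```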
FWIW, NullAway aims for the condition of "no false negatives in practice, excluding interactions with unchecked code." We say "no false negatives in practice" since NullAway makes several deliberate unsound assumptions even for checked code (e.g., assuming that methods are side-effect-free) that empirically did not lead to real-world crashes, at least on Uber's mobile code base. The only common cause of NPEs that remains with NullAway is interactions with unchecked code, as we did not discover a sound solution with a reasonable annotation burden. Further details are in our paper. Just sharing as another data point of what current tools do. /cc @lazaroclapp
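A hypothetical illustration of that side-effect-freedom assumption (not an example taken from the paper):

```java
class Example {
  private String field; // conceptually nullable; unannotated for brevity

  void demo() {
    if (field != null) {
      clearField();   // a checker that assumes methods are side-effect-free
      field.length(); // still treats `field` as non-null here, so this
    }                 // potential NPE goes unreported
  }

  void clearField() {
    field = null;
  }
}
```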
(Thanks, I have finally started going through the NullAway paper. I've left small comments on a couple issues so far, and I compared the annotation/suppression burden you saw to what I saw with Guava and our proposed annotations.)

On the main topic here: See also a follow-up post by @wmdietlGC. Some thoughts on it:
As for the tools I outlined: Their guarantees aren't affected:
But I wouldn't be surprised if there were still ways to break those guarantees :)
I can imagine that there's no way for the user to stop warnings in a reasonable time. Once you start to annotate, the number of detectable problems may grow for a long and unpredictable time. So, I guess, there should be a different progress measure. Something analogous to test coverage... something increasing with the invested effort, where every added annotation improves the score unless it exposes a major problem... just an idea...
I think there's probably no action for "the JSpecify group" to take on this:
Probably many of you have realized this from the beginning, but I'm thinking about it for the first time now. And I see that our draft design lists only 2 "permitted [but] possibly unsafe" conversions...

...which suggests that "@NullnessUnknown to @NullnessUnknown" might not be at the top of others' minds, too.

As a concrete example, if ConcurrentHashMap were a standalone type with unspecified nullness for all type parameters and type usages, then a caller could write:

```java
map.put("foo", map.get("foo"));
```

If @NullnessUnspecified converts to @NullnessUnspecified without a warning, then that code would pass our type checking but blow up at runtime when it tries to insert null into the map. (A fuller sketch follows at the end of this post.)

Maybe that's the best we can do -- I've certainly said a couple times that unannotated code could be rough to deal with in the absence of stub annotation files -- but it would be nice if we could do better.

Maybe we end up with rules that issue a warning whenever an expression uses @NullnessUnspecified. (This "just give up" attitude fits in spirit with my #33 (comment) :)) It would be nice if we could avoid that in the cases that are known to be safe, like converting @NullnessUnspecified to @Nullable, but perhaps that's hard to specify or implement?

Or maybe we treat @NullnessUnspecified like an extreme version of wildcard capture: Wildcard capture creates a unique type for every occurrence in an expression, but @NullnessUnspecified would create a unique type for every occurrence everywhere (or close to it). Then those types would behave like the parametric nullness type T extends @Nullable Something, differing only in that unsafe operations would generate warnings instead of errors. But that might be too severe:

[edit: Note that all of this, for better or for worse, arises organically out of my current mental model for @NullnessUnspecified.]

Probably there's a smarter way to do it, but I wonder if we can realistically expect tools to all find it unless we provide guidelines. If we're lucky, maybe this is a rare case in which we really can say that our rules apply to implementation code only?
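A fuller sketch of that scenario (the surrounding setup is hypothetical, but ConcurrentHashMap really does reject null keys and values):

```java
import java.util.concurrent.ConcurrentHashMap;

class Demo {
  public static void main(String[] args) {
    // Everything here has unspecified nullness, so unspec-to-unspec
    // conversions let the code through without any warning:
    ConcurrentHashMap<String, String> map = new ConcurrentHashMap<>();
    String value = map.get("foo");  // null: "foo" was never inserted
    map.put("foo", value);          // throws NullPointerException at runtime
  }
}
```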