
nullness unspecified

Kevin Bourrillion edited this page Apr 26, 2023 · 5 revisions

Nullness annotations enable nullness analysis by letting you communicate, for each type usage, whether it is nullable or non-null (with some handwaving for type variables).

A core element of the JSpecify design, however, is that nullness is not binary but ternary. We have:

  1. nullable types
  2. non-null types
  3. and types of unspecified nullness

"Unspecified nullness" is what all the types in your Java code have now, if you are not yet using any nullness annotations at all.

(Footnote: there is a fourth kind; type variables can also have "parametric nullness", because they are "standing in for" a type with one of the three basic nullnesses above. But we can set that aside for this discussion.)

But in Java all types are nullable... right?

Commonly heard: "Java already has nullable types, and now just needs non-nullable types to go with them."

And it's true that in vanilla Java, null is a member of every reference type. So why do we need this third category? Isn't the default just... nullable?

This sounds fair up to a point, but turns out to be misleading and unhelpful on deeper examination. What really matters is the developer’s intention. Did they mean for null to be included, or not? Since Java never offered them a choice, we know nothing about the true intended nullness of any unannotated type.

There are important differences between such a type (“unspecified nullness”) and one we know intentionally includes null (“nullable”).

To dig into that, we’ll need to think separately about “from-types” and “to-types”. The basic idea is that every type conversion (including what we may think of as just “type checks”) is converting from some from-type to some to-type.

Examples of from-types include:

  • a method return type... from the caller’s perspective
  • a final parameter type... from the implementation’s perspective

Examples of to-types include:

  • a method return type... from the implementation’s perspective
  • a parameter type... from the caller’s perspective
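To make the two roles concrete, here is a minimal sketch in plain Java. The class `FromToTypes` and the method `lookup` are hypothetical names invented for illustration; the comments mark where each type acts as a from-type or a to-type.

```java
public class FromToTypes {
    // Hypothetical unannotated method: every type here has unspecified nullness.
    static String lookup(final String key) {
        // Inside the body, the final parameter `key` is only read: a from-type.
        // The return type receives whatever the body produces: a to-type.
        return key.isEmpty() ? null : "value:" + key;
    }

    public static void main(String[] args) {
        String id = "id";
        // At the call site the roles flip: the argument is converted TO the
        // parameter type, and the result is converted FROM the return type.
        String result = lookup(id);
        System.out.println(result);
    }
}
```

The same declaration therefore plays both roles, depending on which side of the method boundary you are looking from.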

Assuming your goal is a sound or conservative analysis, which catches any possible problem at the risk of false positives, then indeed you must treat unannotated from-types as if they are nullable. We don’t know that a null won’t come out, so we must account for that possibility. So far so good. But for to-types it’s the reverse! We don’t know that a null will be accepted, so to be conservative we must treat the type as non-null.

But there’s more. We believe most users will prefer lenient analysis most of the time. Lenient analysis only reports findings that are definitely wrong, avoiding false positives. And in this case the roles of from-types and to-types are swapped! For a from-type, you want to be able to assign it to anything, so you need to treat it as non-null. For a to-type, anything might be assigned to it, so it behaves as though nullable.
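The two paragraphs above can be summarized in a small model. This is only a sketch of the reasoning, not how any particular checker is implemented; the names `NullnessModes`, `Mode`, `resolveFrom`, `resolveTo`, and `reportsFinding` are all invented for illustration, and "strict" stands for the conservative analysis described above.

```java
public class NullnessModes {
    enum Nullness { NULLABLE, NON_NULL, UNSPECIFIED }
    enum Mode { STRICT, LENIENT }

    // How an unspecified from-type is treated in each mode.
    static Nullness resolveFrom(Nullness n, Mode mode) {
        if (n != Nullness.UNSPECIFIED) return n;
        // Strict: a null might come out. Lenient: assume it can go anywhere.
        return mode == Mode.STRICT ? Nullness.NULLABLE : Nullness.NON_NULL;
    }

    // How an unspecified to-type is treated in each mode (the reverse).
    static Nullness resolveTo(Nullness n, Mode mode) {
        if (n != Nullness.UNSPECIFIED) return n;
        // Strict: null might not be accepted. Lenient: assume it accepts anything.
        return mode == Mode.STRICT ? Nullness.NON_NULL : Nullness.NULLABLE;
    }

    // The only problematic conversion is nullable -> non-null.
    static boolean reportsFinding(Nullness from, Nullness to, Mode mode) {
        return resolveFrom(from, mode) == Nullness.NULLABLE
            && resolveTo(to, mode) == Nullness.NON_NULL;
    }

    public static void main(String[] args) {
        for (Mode mode : Mode.values()) {
            System.out.println(mode + ": "
                + reportsFinding(Nullness.UNSPECIFIED, Nullness.NON_NULL, mode));
        }
    }
}
```

Under these assumptions, converting an unspecified from-type to a non-null to-type is flagged in `STRICT` mode but not in `LENIENT` mode. Neither "always nullable" nor "always non-null" reproduces both behaviors.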

What is "unspecified nullness" in JSpecify

Because unannotated types behave like nullable types in some positions and like non-null types in others, they must constitute a third category, which we call "unspecified nullness".

Importantly, we see this category as existing only to support legacy, unmigrated code. That is, every type usage can be neatly classified into one of the other nullness categories. We simply don't know yet which category this particular type usage belongs to.

Most code that's full of unspecified-nullness types is probably not being run through nullness analysis anyway. But often enough it gets used as a dependency of other code that is. Your code might be fully annotated, but when it calls a library that isn't, the analyzer doesn't know which of that library's parameter types or return types are meant to include null.
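A sketch of that boundary follows. To keep it compilable without dependencies, it declares a local stand-in `@Nullable` annotation; real code would use `org.jspecify.annotations.Nullable` instead. `LegacyRegistry`, `find`, and `lengthOf` are hypothetical names.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Target;

public class LibraryBoundary {
    // Local stand-in so this sketch compiles on its own; real code would
    // import org.jspecify.annotations.Nullable.
    @Target(ElementType.TYPE_USE)
    @interface Nullable {}

    // Hypothetical unmigrated library: no nullness annotations anywhere, so
    // its parameter and return types have unspecified nullness.
    static class LegacyRegistry {
        static String find(String name) {
            return name.startsWith("known") ? "value-of-" + name : null;
        }
    }

    // Annotated client code: `name` is intentionally nullable.
    static int lengthOf(@Nullable String name) {
        if (name == null) {
            return 0;
        }
        // The analyzer cannot tell from the signature whether `find` may
        // return null, so it cannot know whether a dereference is safe.
        String found = LegacyRegistry.find(name);
        // Checked here only so that the sketch runs safely either way.
        return found == null ? 0 : found.length();
    }

    public static void main(String[] args) {
        System.out.println(lengthOf("known"));
        System.out.println(lengthOf("other"));
    }
}
```

Here the library happens to return null for some inputs, but nothing in its signature says so. That missing intent is exactly what "unspecified" means.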

Behavior may vary

Wherever unspecified nullness is involved, tools' ability to recognize problems accurately is limited, but not nonexistent, and they have a lot of freedom to do as they see fit.

  • In some situations, the analyzer would have reported the same finding no matter how the unspecified usages had been annotated. In that case it will likely report the finding, since it never depended on the missing information at all.
  • A tool might also try to infer the nullness of certain type usages where it was unspecified, and report more likely issues that way.
  • In the extreme, it might even report any finding it would recognize under any possible way of annotating the unspecified usages! (This approach is extremely vulnerable to false positives.)

How to think about unspecified nullness

(TODO: restates much of previous section)

Sometimes the situations that arise from the boundaries between specified and unspecified code can get hairy and confusing. But it all boils down to a very simple principle.

Remember that unspecified-nullness type usages (which I'll call "unspec type usages" for now) are viewed as possessing a "true" nullness that we simply don't know. So, a nullness analyzer should account for any possible way that code might get annotated in the future.

In short, it should certainly report a finding whenever it can conclude that it would report that same finding under any possible way of annotating these unspec type usages.

And it should certainly not report a finding if there is no possible way of annotating the unspec type usages that produces the finding.

In the middle, there is some room for checker-specific interpretation. What if some ways the code might get annotated will lead to a finding and some won't? JSpecify doesn't mandate what a checker should do in this case.
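The three cases can be sketched by enumerating every way the unspec type usages might be annotated. This is a simplified model for a single conversion from one type to another, with hypothetical names (`UnspecVerdict`, `possibilities`, `classify`); a real checker would have to consider all the unspec usages in a program together.

```java
import java.util.EnumSet;
import java.util.Set;

public class UnspecVerdict {
    enum Nullness { NULLABLE, NON_NULL, UNSPECIFIED }
    enum Verdict { MUST_REPORT, MUST_NOT_REPORT, CHECKER_CHOICE }

    // An unspec usage might eventually be annotated either way;
    // a specified usage has exactly one possibility.
    static Set<Nullness> possibilities(Nullness n) {
        return n == Nullness.UNSPECIFIED
            ? EnumSet.of(Nullness.NULLABLE, Nullness.NON_NULL)
            : EnumSet.of(n);
    }

    // Classify a conversion by trying every possible future annotation.
    static Verdict classify(Nullness from, Nullness to) {
        boolean always = true;  // finding under every possible annotation
        boolean ever = false;   // finding under at least one
        for (Nullness f : possibilities(from)) {
            for (Nullness t : possibilities(to)) {
                boolean finding = f == Nullness.NULLABLE && t == Nullness.NON_NULL;
                always &= finding;
                ever |= finding;
            }
        }
        if (always) return Verdict.MUST_REPORT;
        if (!ever) return Verdict.MUST_NOT_REPORT;
        return Verdict.CHECKER_CHOICE;
    }

    public static void main(String[] args) {
        System.out.println(classify(Nullness.NULLABLE, Nullness.NON_NULL));
        System.out.println(classify(Nullness.UNSPECIFIED, Nullness.NULLABLE));
        System.out.println(classify(Nullness.UNSPECIFIED, Nullness.NON_NULL));
    }
}
```

In this model, nullable to non-null is `MUST_REPORT`, unspecified to nullable is `MUST_NOT_REPORT`, and unspecified to non-null falls in the middle as `CHECKER_CHOICE`.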