Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

refactor LibraryModelsHandler.onOverrideMayBeNullExpr #754

Merged
merged 2 commits into from
Apr 5, 2023

Conversation

XN137
Copy link
Contributor

@XN137 XN137 commented Mar 31, 2023

this does a similar refactoring as in #747

i held back on putting this up for review because i was assuming the removal of analysis.nullnessFromDataflow(state, expr) could be controversial

we can use this review to figure our what kind of tests are missing that would make clear why nullnessFromDataflow was being called inside a handler

boolean isMethodUnannotated =
getCodeAnnotationInfo(state.context).isSymbolUnannotated(methodSymbol, this.config);
if (exprMayBeNull) {
if (optLibraryModels.hasNonNullReturn(methodSymbol, state.getTypes(), isMethodUnannotated)) {
return false;
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is the only handler check supporting a true -> false transition

|| !optLibraryModels.nullImpliesNullParameters(methodSymbol).isEmpty()) {
// These mean the method might be null, depending on dataflow and arguments. We force
// dataflow to run.
return analysis.nullnessFromDataflow(state, expr) || exprMayBeNull;
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

see the odd nullnessFromDataflow being run, even without consideration the current value of exprMayBeNull.

the comment above suggests this is intentional and important, however no test is failing without this dataflow call.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So, from the point of view of code calling this Handler extension point in NullAway.java, with this as the only relevant handler, the change is:

Before:

  1. We ran dataflow here, and return true only if dataflow said true or if exprMayBeNull == true
  2. If we get false out of the full handlers pipeline, we are done, the expression is non-null
  3. Otherwise we call dataflow again.

After:

  1. We return true from the handler unconditionally in this case (meaning if we have either of the above library models)
  2. We call dataflow after getting true from the full handlers pipeline.

Either way, the expression is only consider @Nullable if both the if condition logic in this handler matches and dataflow returns @Nullable and non-null otherwise. So they seem equivalent from the point of view of NullAway.java.

There is a small subtlety here, though, which is that other handlers in the chain will observe the result of LibraryModelsHandler before any call to dataflow on NullAway.java. They will observe different values before/after this change.

I don't think we are relying on dataflow being run here for the correctness of other handlers implementing onOverrideMayBeNullExpr, so this is probably fine? But it's worth pointing that small change in semantics.

One thing I do see possibly happening is that after this change, this handler will set exprMayBeNull == true a lot more often for handlers further down the chain, which could cause some handlers that are checking for reasons to transition nullable -> non-null to do extra work. At the same time, the double call to dataflow is not particularly expensive, because we cache the results of the dataflow analysis.

That said, taking a look at the handlers we currently have, the above concern is theoretical for now. All other handlers implementing this method (RestrictiveAnnotationHandler, InferredJARModelsHandler, and OptionalEmptinessHandler) do strictly less work when exprMayBeNull == true.

Long digression, but basically convinced that this change is a good idea. Worth internal/performance testing, but I'd expect it to be mostly the same, just clearer code! (for which: thank you!)

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this analysis is correct. Since the new code structure is significantly clearer and easier to understand, I think we would probably want benchmarks showing some measurable performance difference before (re-)introducing an early call to dataflow in a handler. As it stands, I don't expect this change to have a measurable impact

@XN137 XN137 marked this pull request as ready for review March 31, 2023 17:19
@XN137 XN137 force-pushed the refactor-LibraryModelsHandler branch from 08c88ee to 51d2a7b Compare April 1, 2023 07:26
Copy link
Collaborator

@msridhar msridhar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This change also LGTM, but is it dependent on #753, or independent? Maybe it's safer to land after #753? Not that any tests are failing now, but I'd rather be confident in the overall logical structure and then get this in

@XN137
Copy link
Contributor Author

XN137 commented Apr 4, 2023

@msridhar i think it is independent (assuming there is no good reason why it was running dataflow early before...) but of course we should rebase and re-run CI after one of them got merged, to confirm all tests are still passing with the combination of the two

@XN137 XN137 force-pushed the refactor-LibraryModelsHandler branch from 51d2a7b to e59cef7 Compare April 4, 2023 17:35
Copy link
Collaborator

@lazaroclapp lazaroclapp left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor suggestion and a long digression, but overall this refactor sounds great to me!

boolean isMethodUnannotated =
getCodeAnnotationInfo(state.context).isSymbolUnannotated(methodSymbol, this.config);
if (exprMayBeNull) {
if (optLibraryModels.hasNonNullReturn(methodSymbol, state.getTypes(), isMethodUnannotated)) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not:

if (exprMayBeNull) {
    // This is the only case in which we may switch the result from @Nullable to @NonNull: 
    return !optLibraryModels.hasNonNullReturn(methodSymbol, state.getTypes(), isMethodUnannotated);
}

Guess there is some consistency in all returns being boolean literals, but not sure the extra if nesting is worth it.

Copy link
Contributor Author

@XN137 XN137 Apr 4, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i am aware of this simplification, i have mentioned it here before:

#747 (comment)

when there is only a single reason, we could of course use the condition in a single return statement if preferred?

guess back then there was no preference for the single return statement, but happy to adjust it now here

the suggested comment is true now, but will it stay true in the future as well?
or is this comment really just "local" to the LibraryModelsHandler logic?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

let me know your preference and whether we should also adjust the other onOverrideMayBeNullExpr implementations that also only have a single if (for consistency?)

Copy link
Collaborator

@lazaroclapp lazaroclapp Apr 5, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just think in this case it's clearer to have the return with the call just to avoid extra if nesting and thus the issue of matching branch conditions to returns, but that's a personal preference. Don't feel super strongly either way. In the linked example, I agree that having each condition in its own if check is more readable than some complicated boolean expression with ||/&&, but this case here is a single call and everything else is nested just one level deep.

Either way: happy to keep the code as is, just a suggestion (cc: @msridhar , any thoughts/preference here?)

Edit: Actually, I see it was changed to the suggested case here. So the current code seems good to me.

|| !optLibraryModels.nullImpliesNullParameters(methodSymbol).isEmpty()) {
// These mean the method might be null, depending on dataflow and arguments. We force
// dataflow to run.
return analysis.nullnessFromDataflow(state, expr) || exprMayBeNull;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So, from the point of view of code calling this Handler extension point in NullAway.java, with this as the only relevant handler, the change is:

Before:

  1. We ran dataflow here, and return true only if dataflow said true or if exprMayBeNull == true
  2. If we get false out of the full handlers pipeline, we are done, the expression is non-null
  3. Otherwise we call dataflow again.

After:

  1. We return true from the handler unconditionally in this case (meaning if we have either of the above library models)
  2. We call dataflow after getting true from the full handlers pipeline.

Either way, the expression is only consider @Nullable if both the if condition logic in this handler matches and dataflow returns @Nullable and non-null otherwise. So they seem equivalent from the point of view of NullAway.java.

There is a small subtlety here, though, which is that other handlers in the chain will observe the result of LibraryModelsHandler before any call to dataflow on NullAway.java. They will observe different values before/after this change.

I don't think we are relying on dataflow being run here for the correctness of other handlers implementing onOverrideMayBeNullExpr, so this is probably fine? But it's worth pointing that small change in semantics.

One thing I do see possibly happening is that after this change, this handler will set exprMayBeNull == true a lot more often for handlers further down the chain, which could cause some handlers that are checking for reasons to transition nullable -> non-null to do extra work. At the same time, the double call to dataflow is not particularly expensive, because we cache the results of the dataflow analysis.

That said, taking a look at the handlers we currently have, the above concern is theoretical for now. All other handlers implementing this method (RestrictiveAnnotationHandler, InferredJARModelsHandler, and OptionalEmptinessHandler) do strictly less work when exprMayBeNull == true.

Long digression, but basically convinced that this change is a good idea. Worth internal/performance testing, but I'd expect it to be mostly the same, just clearer code! (for which: thank you!)

Copy link
Collaborator

@lazaroclapp lazaroclapp left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM.

@lazaroclapp lazaroclapp merged commit 8821567 into uber:master Apr 5, 2023
@XN137 XN137 deleted the refactor-LibraryModelsHandler branch April 5, 2023 16:25
@XN137
Copy link
Contributor Author

XN137 commented Apr 5, 2023

thanks for the review, please let me know the results of internal testing and performance impact of all these changes if possible

@msridhar
Copy link
Collaborator

msridhar commented Apr 5, 2023

thanks for the review, please let me know the results of internal testing and performance impact of all these changes if possible

Thanks again for the contribution! For performance impact, the best tests we have for that are the jmh tests in open source. Usually it's hard to see a statistically significant performance win there, as there is a good amount of noise in the runs. If you're going to try it, you should try to run on a system with as little load / interference as possible

@XN137
Copy link
Contributor Author

XN137 commented Apr 6, 2023

thanks for the info

Long digression, but basically convinced that this change is a good idea. Worth internal/performance testing, but I'd expect it to be mostly the same, just clearer code! (for which: thank you!)

due to this comment i thought there might be some internal testing going on outside of the JMH benchmarks...

@msridhar
Copy link
Collaborator

msridhar commented Apr 6, 2023

thanks for the info

Long digression, but basically convinced that this change is a good idea. Worth internal/performance testing, but I'd expect it to be mostly the same, just clearer code! (for which: thank you!)

due to this comment i thought there might be some internal testing going on outside of the JMH benchmarks...

You're right that there is some internal performance testing that @lazaroclapp can do. But those numbers are even more noisy than the JMH benchmarks :-) So, outside of some unusual coding pattern or combination of flags that doesn't show up in the benchmarks, it probably won't surface any further perf insights (in my opinion).

@lazaroclapp
Copy link
Collaborator

thanks for the info

Long digression, but basically convinced that this change is a good idea. Worth internal/performance testing, but I'd expect it to be mostly the same, just clearer code! (for which: thank you!)

due to this comment i thought there might be some internal testing going on outside of the JMH benchmarks...

You're right that there is some internal performance testing that @lazaroclapp can do. But those numbers are even more noisy than the JMH benchmarks :-) So, outside of some unusual coding pattern or combination of flags that doesn't show up in the benchmarks, it probably won't surface any further perf insights (in my opinion).

To be fair, I was mostly thinking of internal conformance testing (i.e. any changes to the behavior of NullAway on our full codebase) + JMH for performance benchmarking. I am not sure anything that doesn't show up in JMH benchmarks will show up in internal builds, since those are a lot more noisy, but definitely will report if something does :)

@msridhar
Copy link
Collaborator

@XN137 FYI I sent an email to you a while back at the email address associated with your GH commits. Maybe you saw it, I'm just not sure. If you didn't see it, you can email me at the address on my web site from whatever address works best for you

msridhar added a commit to msridhar/NullAway that referenced this pull request Jul 18, 2023
msridhar added a commit to msridhar/NullAway that referenced this pull request Jul 18, 2023
msridhar added a commit to msridhar/NullAway that referenced this pull request Jul 19, 2023
msridhar added a commit to msridhar/NullAway that referenced this pull request Jul 19, 2023
msridhar added a commit to msridhar/NullAway that referenced this pull request Jul 19, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants