Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Issue #4082: Take "break" into consideration in FinalLocalVariable #4115

Merged
merged 1 commit into from
Apr 18, 2017

Conversation

Luolc
Copy link
Contributor

@Luolc Luolc commented Mar 28, 2017

@codecov-io
Copy link

codecov-io commented Mar 28, 2017

Codecov Report

Merging #4115 into master will not change coverage.
The diff coverage is 100%.

Impacted file tree graph

@@          Coverage Diff           @@
##           master   #4115   +/-   ##
======================================
  Coverage     100%    100%           
======================================
  Files         283     283           
  Lines       14885   14896   +11     
  Branches     3402    3398    -4     
======================================
+ Hits        14885   14896   +11
Impacted Files Coverage Δ
...ckstyle/checks/coding/FinalLocalVariableCheck.java 100% <100%> (ø) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 888f258...107650e. Read the comment docs.

@@ -667,6 +697,9 @@ private static boolean isLoopAst(int ast) {
/** Contains definitions of uninitialized variables. */
private final Deque<DetailAST> uninitializedVariables = new ArrayDeque<>();

/** Whether there is a {@code break} or {@code return} in the scope. */
private boolean containsBreakOrReturn;
Copy link
Member

@rnveach rnveach Mar 28, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems weird we treat break and return with the same variable as there is a difference between them and how they are handled.
break will exit switch/loop/etc, but continue on logic outside that scope.
return will exit the method completely (besides any finally blocks) and ignore any logic outside scope.
I am also going to assume that we don't have any handling of continue either since we didn't have break. If so, that can be another issue.

@Luolc Can you create a some more examples with just break and return by itself, and not intermingled like issue's example, that can probably prove my concern. Don't limit break to just switch statement. Please show some while/for/etc.
I will review regression shortly.

Copy link
Contributor Author

@Luolc Luolc Mar 29, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am also going to assume that we don't have any handling of continue

You are correct. continue should be taken into consideration as well.

@rnveach Regrading the break in loops, please correct me if I am wrong...

  1. In most cases, break or return does not have a direct loop parent, they are always wrapped by a if scope (or else scope). In fact, a break or return with a direct loop parent (smth like below) is really weird.
while (...) {
  // statements...
  break;
}

In that case we will be sure to break at the first time. If so, why should we use a loop at all? There is no regression problem due to the "break in loop" in diff report, I guess it is because no one wrote a break or return like this in the projects.
2. Actually we don't care about break or return exit what scope.

int a;
if (...) {
  a = 1; // (0)
  break; // or return, we don't care that
}
a = 2; // (1)

The break or return issue is that we ignore them before and regard them as common statements. And the check will consider that a is possible to be assigned twice at (0) and (1). And what we should do is to let the check know if there is a break or return in the if scope, then it is not possible to reach both (0) and (1).
In the example I don't write the enclosing scope at all, it could be a switch, and also for, while etc. It doesn't matter.
3. One thing we need to confirm: should we concern about unreachable statement?

return will exit the method completely and ignore any logic outside scope.

It is true but return can only be at the end of a if, else scope (or end of a case group, or the end of a method). If there are anything else after the return, like smth below:

void foo() {
  return;
  final int a = 1;
}

We would get a compile error.

---
To sum up, IMO, we don't need to handle break and return in a different way. Please provide a counterexample if I am wrong.
The only case I could find it different now is break with a label:

int a;
// ...
label: {
  a = 1;
}
// ...
break label;

I haven't try to handle with that yet. It seems really hard to handle with a label. It is smth like a evil goto.

Can you create a some more examples with just break and return by itself

Sure. I will make new comment when I complete it sooner.

Copy link
Contributor Author

@Luolc Luolc Mar 29, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some more information.
We cannot directly assign a final variable in a loop at all.

final int a;
while (...) {
  a = 1; // compile error
}

We would get a variable 'a' might be assigned in loop error.
We can only assign a final variable (declared outside the loop) in the loop like this:

final int a;
while (...) {
  if (...) {
    a = 1;
    break; // necessary, or there would still be a compile error
  }
}

or this, weird but without compile error:

final int a;
while (...) {
  a = 1;
  break; // necessary
}

But if we try to access a outside the loop, we would get a variable 'a' might not have been initialized error.
So a must be assigned outside the loop if we want to access it. Then a should not be declared as final.
I assume the only case we should raise a violation is smth like this:

int a; // violation
while (...) {
  if (...) {
    a = 1;
    System.out.print(a);
    break;
  }
}
// never assign or access 'a' again

And the check won't raise that violation by now. In fact we have a judgement here https://github.com/checkstyle/checkstyle/blob/master/src/main/java/com/puppycrawl/tools/checkstyle/checks/coding/FinalLocalVariableCheck.java#L511-L513

if the variable is declared outside the loop and initialized inside
the loop, then it cannot be declared final, as it can be initialized
more than once in this case

Therefore, whether I handle break in loop or not, no violation would be raised due to that judgement.

Is it really necessary to handle such weird cases like that? I believe no one would write code in that way.

Copy link
Member

@rnveach rnveach Mar 29, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You are correct. continue should be taken into consideration as well.

I say let's split this into new issue. This check is already complex enough and let's just get this PR done.

One thing we need to confirm: should we concern about unreachable statement?

Unreachable statements won't change the fact that a variable can be final or not, right?
Our main concern is can the variable be declared final if it is compilable.
If user has other problems or makes changes to the code after that, it is not our concern.

The only case I could find it different now is break with a label:

I forgot about this type of break. There will also be continue with a label too.
Unless @romani specifies otherwise, lets ignore this for now. Labels are bad practice anyways.

I assume the only case we should raise a violation is smth like this ... And the check won't raise that violation

It is definately weird code.
I'll this up to @romani if we should fix or make a new issue or ignore.

Is it really necessary to handle such weird cases like that?

From our side, it is probably too much work to try and handle every single case imaginable.
From users' side, it would make this check all the more amazing if it could find everything. :)

To sum up, IMO, we don't need to handle break and return in a different way. Please provide a counterexample if I am wrong.

Thanks for all to think about. I will do my own testing and get back to you. I have just one idea I want to test out.
I didn't see anything stand out in regression report but false negatives wouldn't show there.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@rnveach

I say let's split this into new issue. This check is already complex enough and let's just get this PR done.

Sure. When the new issue is created, I am willing to continue to help with that. :)

Unreachable statements won't change the fact that a variable can be final or not, right?

Yep.

Our main concern is can the variable be declared final if it is compilable.

But the point is the unreachable statement would lead to a compile error. smth like this:

void foo() {
  int a;
  a = 1;
  return;
  a = 2; // compile error
}

We would get an Unreachable statement error. If I understood correctly, we don't need to concern about the cases like that since it is not compilable.
In fact, I haven't handle the unreachable cases in this commit by now. I assume it could be not easy if we need to concern about the unreachable statement with return and break.

... but false negatives wouldn't show there.

Maybe I don't understand clearly... Is there anything unexpected in the diff report? If the answer is yes, I will try to fix them, or split it into new issue?

Copy link
Member

@rnveach rnveach Apr 5, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I understood correctly, we don't need to concern about the cases like that since it is not compilable.

Yes.

This seems weird we treat break and return with the same variable as there is a difference between them and how they are handled.

@Luolc Here is two examples I have created that isn't requested to be final with your code. I got no compile errors making them final.

    void foo1() throws Exception {
        Exception e; // violation
        final int a = (int) Math.random();
        final int b = (int) Math.random();

        while (b == 0) {
            if (a == 0) {
                e = new Exception();
                return;
            }
        }

        e = new Exception();
        throw e;
    }

    void foo2() throws Exception {
        Exception e; // violation
        final int a = (int) Math.random();
        final int b = (int) Math.random();

        switch (a) {
        case 0:
            e = new Exception();
            return;
        default:
            break;
        }

        e = new Exception();

        throw e;
    }

No violation is correct if return was treated like break, but return and break mean different things.
Regression doesn't show, this as I mentioned, because it only reports violations (added or removed) and no violations are being reported.

Copy link
Contributor Author

@Luolc Luolc Apr 6, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@rnveach
The first example is handled by the check just like the one I wrote at #4115 (comment):

int a; // violation
while (...) {
  if (...) {
    a = 1;
    System.out.print(a);
    break;
  }
}
// never assign or access 'a' again

And the check won't raise that violation by now. In fact we have a judgement here https://github.com/checkstyle/checkstyle/blob/master/src/main/java/com/puppycrawl/tools/checkstyle/checks/coding/FinalLocalVariableCheck.java#L511-L513

if the variable is declared outside the loop and initialized inside
the loop, then it cannot be declared final, as it can be initialized
more than once in this case

Therefore, whether I handle break in loop or not, no violation would be raised due to that judgement.

This is a complicated problem, due to the code about handling loop I mentioned. I will get back to this later and I am willing to focus on the second example now.

For the second one, it is indeed a false negative. You are correct, return and break should be handled differently for case like that.
I am trying to fix this and find another token we ignored before: throw.
Example:

void foo3() {
  int a; // violation
  int b; // no violation
  final int random = (int) Math.random();
  switch (random) {
    case 0:
      b = 0;
    case 1:
      b = 1;
      a = 1;
      throw new RuntimeException();
    default:
      a = 2;
      break;
  } 
}

throw is smth like return if it is not surrounded with try.
Also, we didn't handle the fall-through of case before. There could be a false positive of b.

I am afraid we would find many many bugs of this check during this PR. 😅 It is no longer just a "return" and "break" problem now.

Copy link
Member

@rnveach rnveach Apr 6, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The first example is handled by the check just like the one I wrote at #4115 (comment):

Your example had a break in a while loop. My example has a return instead. It will not produce a compile issue like you were stating for the breaks.

It is no longer just a "return" and "break" problem now.

Let's focus on these specific tokens for now. We can create new issue for others.
If you find supporting break and return too hard, and if @romani agrees, we can split this into new issues.
I have only seen false negatives so far with current code, not any false positives.

Copy link
Contributor Author

@Luolc Luolc Apr 6, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@rnveach

I have only seen false negatives so far with current code, not any false positives.

It is a false positive when there is fall-through of case. The bug maybe existed for a long time.

void foo3() {
        int a; // expect no violation
        final int random = (int) Math.random();
        switch (random) {
            case 0:
                a = 1;
            default:
                a = 2;
                break;
        }
    }

Your example had a break in a while loop. My example has a return instead. It will not produce a compile issue like you were stating for the breaks.

Ya. I know that. I mean it is a more complicated problem than the return in catch, since we have special treatment about loop in the current code. I have no problem to handle that, but it might take a bit more time.

Let's focus on these specific tokens for now.

we can split this into new issues.

The problem is when I am fixing bugs about break and return in catch, the regression appears. And the regression is not because of the latest fix, but the bugs (i.e. fall-through case, or throw) in the original code. If I want no regression, then I need to fix the others bugs in the meantime. If we are going to do so, then we would mix too many problems into one PR. We are in a dilemma now. 😞

Copy link
Member

@romani romani Apr 9, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The only case I could find it different now is break with a label:

brake with label should be moved to separate issue to avoid overcomplication of this PR.
Provided example is it compilable ? I found http://stackoverflow.com/a/14960484/1015848 , we should consider only compilable code.
Lets move discussion of this weird cases in separate issue.

I have only seen false negatives so far with current code, not any false positives.

It very good, as false negatives does not make CI fails. So user are ok to tolerate them. False-positives usually result in suppression or disablement of check.

It is definately weird code.
I'll this up to @romani if we should fix or make a new issue or ignore.

Lets skip it for now, lets wait for reporting of it by users.

break;
}

if (b == 0) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

these 4 lines aren't in issue's example. Is there a reason?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Emm... I added smth when I was debugging and forgot to change it back. Sorry for my careless

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please remove them so we can keep it as issue's example.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

@rnveach rnveach requested a review from romani March 29, 2017 14:16
@Luolc Luolc force-pushed the issue4082 branch 2 times, most recently from 3e27c12 to d51ace8 Compare March 30, 2017 04:59
@romani
Copy link
Member

romani commented Apr 9, 2017

@rnveach , PR become a huge .... I would like to avoid detailed review by me of all your comments.
I provided answers when you referenced me.

Please finish your review and I will make second after all you.

@rnveach
Copy link
Member

rnveach commented Apr 9, 2017

@romani I am pretty much done, except for the name of the variable because of how we handle return.

@Luolc Are you able to fix return for this PR?
If not, correct me if I am wrong, but your current implementation looks like it handles break correctly, right? (ignore break with label) All my issues I pointed out were where return wasn't causing violations.
Why don't we remove return token support, rename this variable and any others to only mention break, and move return problems to new issue.
breaks with labels and fall through cases should be separate issues too.
It was probably too much to include both tokens in one issue. We should probably stick to single tokens in issues for this check, if possible.

@Luolc @romani Does this sound like a good idea?

@Luolc
Copy link
Contributor Author

Luolc commented Apr 9, 2017

@rnveach I cannot fix return if the fall-through and throw bugs still exists. In other words, in the original code, we treated that every case has a break, return or throw and wrote the check with this wrong assuming. Only if I fix fall-through and throw, then I could focus on return. I didn't get your approval before so I didn't start the working of them yet.

but your current implementation looks like it handles break correctly

Why don't we remove return token support, rename this variable and any others to only mention break

I am not 100% sure that whether it would still look good after removing return. But I assume you are correct. I could follow your suggestion and generate a new diff report to see the result. If everything looks good for only introducing break, I am OK with it and will keep focusing on the related issues to help. Handle the other bugs in separate issues could be more reasonable, otherwise this PR would be too massive.

@romani
Copy link
Member

romani commented Apr 9, 2017

Does this sound like a good idea?

I am fie to split issues if no regression is happening.

@rnveach
Copy link
Member

rnveach commented Apr 9, 2017

I didn't get your approval before so I didn't start the working of them yet.

@Luolc I am sorry, I didn't realize you wanted something from me.
I am fine with whatever you need to get the check working correctly, just we shouldn't try to do so much in one PR if we can.

@Luolc
Copy link
Contributor Author

Luolc commented Apr 9, 2017

@rnveach np :) I also didn't write my comments clearly before.

I will update once new diff report is ready.

@Luolc
Copy link
Contributor Author

Luolc commented Apr 9, 2017

@rnveach @romani
Now the fix only contains break.

New diff report:
http://www.luolc.com/checkstyle-diff-report/issue4082/
Previous one:
http://www.luolc.com/checkstyle-diff-report/issue4082-backup/

The report looks good. Current one is a sub-set of previous one.

Please have a review of current code. :)

BTW, should I change the issue name to Take "break" into consideration in FinalLocalVariable? Or any other suggestions?

@rnveach
Copy link
Member

rnveach commented Apr 9, 2017

BTW, should I change the issue name

If @romani is happy with current regression and he agrees we are only doing break then return should be removed from Issue and PR title, and from commit message.

Current one is a sub-set of previous one.

Only openjdk8 has the differences between the 2. We have 3 less new violations than previously.
I don't see anything wrong with reports either.

@Luolc
Copy link
Contributor Author

Luolc commented Apr 12, 2017

@romani What's your opinion? :)
Limited by the rules for GSoC students, I cannot create new PRs now (Users may not create more than 3 Pull Requests (PRs) if changes are still being requested of them.) I have tried to read and learn the code base in detail these days. I would be very happy if I could help to fix more problems. :)

@romani
Copy link
Member

romani commented Apr 13, 2017

I cannot create new PRs now

Only for you ....... you can make more than 3 PRs , as your issues are hard and demand time to be validated.

@romani
Copy link
Member

romani commented Apr 13, 2017

@Luolc ,

should I change the issue name to Take "break" into consideration in FinalLocalVariable?

yes, please do. We should be exact. We use issue title in release notes generation.
Please update commit message.

New diff report:

Please prove that http://www.luolc.com/checkstyle-diff-report/issue4082/openjdk8/index.html#A2 is expected and could be final.
I think there is a way to skip assignment , see res.referrals = extractURLs(res.errorMessage); branch.

Please provide explanation for all 4 new violations. We can not accept false-positives. False-negatives are less problematic for users.

@Luolc Luolc changed the title Issue #4082: Take "return" and "break" into consideration in FinalLoc… Issue #4082: Take "break" into consideration in FinalLocalVariable Apr 13, 2017
@Luolc Luolc force-pushed the issue4082 branch 2 times, most recently from e42313f to 71f3716 Compare April 13, 2017 08:11
@Luolc
Copy link
Contributor Author

Luolc commented Apr 13, 2017

@romani

I think there is a way to skip assignment

Yes. It is possible that e would be never assigned.
In fact, in this check, we only concern about whether a variable is possible to be assigned more than once. We don't mind a variable that never be assigned. Actually, smth like this would be raised as violation:

public class Foo {
  void foo() {
    int a; // violation
  }
}

And if we add a final to the declaration, we could compile without error. I assume that this check is strict that violations would be raised so long as a variable could have been declared as final without compile-error but not. "Declare but not assign" (or possible to produce NPE) is also a bad practice, and it should be caught be other checks.

Please prove that http://www.luolc.com/checkstyle-diff-report/issue4082/openjdk8/index.html#A2 is expected and could be final.

Please provide explanation for all 4 new violations.

I write 4 samples which has similar structure with the 4 new violations' snippets. All of them could compile without error when adding a final modifier.

https://gist.github.com/Luolc/3dda1f63184ec51f791ffdd1f3902eae
https://gist.github.com/Luolc/f26046e1afb54adc68c6c2b439d9f51a
https://gist.github.com/Luolc/f2d0ac75474f3a9be06452d92bfe34fc
https://gist.github.com/Luolc/6bee95ebac9dcdd79401cd44ee827f31

Copy link
Member

@romani romani left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Luolc , thanks a lot for explanation and code snippets.
I reviewed all cases in diff report - all of them are valid.
Code is changes is ok.

@rnveach , please do final review.

@rnveach
Copy link
Member

rnveach commented Apr 15, 2017

@Luolc We mentioned splitting different tokens to new issues. Please create new issues for each one separately.
I believe we talked about return, case fall-through, throw, catch, and continue. Include any others I might have missed.

Copy link
Member

@rnveach rnveach left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

minor things, then I will merge

@@ -667,6 +693,9 @@ private static boolean isLoopAst(int ast) {
/** Contains definitions of uninitialized variables. */
private final Deque<DetailAST> uninitializedVariables = new ArrayDeque<>();

/** Whether there is a {@code break} or {@code return} in the scope. */
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

remove return from javadoc.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

for (ScopeData scopeData : scopeStack) {
final FinalVariableCandidate candidate =
scopeData.scope.get(variable.getText());
scopeData.scope.get(variable.getText());
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

undo superfluous change.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done. BTW, why the verification pass when changing the indentation? My Intellij reformat the indentation automatically. There are also many 8 spaces indentation in other lines with assignment like this.

@Luolc Luolc force-pushed the issue4082 branch 2 times, most recently from 30214de to 2f9bfb5 Compare April 16, 2017 13:11
@Luolc
Copy link
Contributor Author

Luolc commented Apr 18, 2017

@rnveach Sorry for delay. I was busy attending QCon Software Development Conference these days.

New issues taken from our discussions:
#4224
#4225
#4226
#4228

I haven't think out a bug with continue yet. Please kindly add it if you find one. :)

@rnveach
Copy link
Member

rnveach commented Apr 18, 2017

I haven't think out a bug with continue yet.

We will probably run into them with other fixes.

@rnveach rnveach merged commit 5e0ab75 into checkstyle:master Apr 18, 2017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants