Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[java] Local Variables At Top #2850

Open
numeralnathan opened this issue Oct 22, 2020 · 14 comments
Open

[java] Local Variables At Top #2850

numeralnathan opened this issue Oct 22, 2020 · 14 comments
Labels
a:new-rule Proposal to add a new built-in rule

Comments

@numeralnathan
Copy link

Proposed Rule Name: LocalVariablesAtTop

Proposed Category: Code Style

Description:

In C++, one could declare the local variables at the top of the method. However, the compiler has to call the constructor at the point of declaration and then throw away this work when the local variable is initialized. This drove C++ programmers to declare the local variable at the point of initialization.

In Java, this does not apply. The bytecode does nothing with the local variable until its first initialization. There is no need to declare local variables at the point of initialization. Yet, some Java programmers do this as a carry over from C++.

Please create a rule that encourages programmers to declare local variables at the top of the method. This will improve code readability. Instead of having to visually hunt for the variable's type, one can simply glance at the top of the method.

Some local variables cannot be declared at the top of the method. Let's say the local variable is assigned in a loop body and then used in a lambda. The local variable must be effectively final and hence must be declared in the loop body. In such a case, please make the rule require programmers to declare the local variable at the top of the loop body.

Code Sample:

Here is some code that is harder to read because the variable declarations happen at the point of initialization. If I am looking at the line second.eat(), I have to look up through many lines of code to find out the type of second.

   ... many lines of code ...
   Banana first = new Banana(5.5, 6.2);
   Apple second = new Apple(3.2, 9.5, 1.2);
   ... many lines of code ...
   second.eat();

Here is the "correct" code. If I am looking at the line second.eat(), I simply glance up at the first few lines of code to find out the type of second. My code reading speed is greatly improved.

   Banana first;
   Apple second;

   ... many lines of code ...
   first = new Banana(5.5, 6.2);
   second = new Apple(3.2, 9.5, 1.2);
   ... many lines of code ...
   second.eat();

Here is an example of a local variable that must be declared in a loop body. index must be declared in the loop body and not at the top of the method because it must be effectively final in order to be used in the lambda.

   for (i = 0; i < 10; i++)
   {
      int index;

      index = i;

      operate(() -> operation(index));
   }

Possible Properties:

I cannot think of any. Can you?

@numeralnathan numeralnathan added the a:new-rule Proposal to add a new built-in rule label Oct 22, 2020
@linusjf
Copy link

linusjf commented Oct 25, 2020

This rule will also end up enforcing Only one return and possibly violating Premature declaration. What do you think?

@numeralnathan
Copy link
Author

numeralnathan commented Oct 26, 2020

I do not like Only One Return since it encourages deep nesting of if-else blocks. I would rather have multiple returns and a very flat logic than one return and 3 layers of if-else. I think the idea of 1 return is deprecated and flat logic replaces it. But, maybe I am wrong on this. Is there a rule about keeping the logic flat?

Putting that aside, how would Local Variables At Top enforce Only One Return?

As for Premature Declaration, perhaps this should be Premature Initialization. Declaring a variable at the top but not initializing it does not do anything different in the bytecode. The bytecode only shows first use to last use. So, JIT does not know any difference. On the other hand, prematurely initializing a variable may have a negative performance impact. Unless JIT can move the line of code later, then JIT will have to assign a register for the variable or even worse push one more thing on the stack.

@linusjf
Copy link

linusjf commented Oct 26, 2020

And then there's this Checkstyle rule:
VariableDeclarationUsageDistance
which will be violated as well if the distance from declaration to initialization exceeds its specified value.

Personally, I think this new rule is far too fastidious. Let's see what the others have to say, too.

@numeralnathan
Copy link
Author

I disagree with Premature Initialization and Variable Declaration Usage Distance. At a minimum, I would like the Local Variables At Top rule and have it default to disabled. I can then enable the rule and disable the rules I disagree with in the PMD configuration file... like I do today.

@linusjf
Copy link

linusjf commented Oct 26, 2020

Putting that aside, how would Local Variables At Top enforce Only One Return?

This new rule + Premature Declaration enforces OnlyOneReturn. At least, I believe that is what will happen. Because then you can't have the function exit normally or otherwise until the final line of the method. If this is restricted only to Premature initialisation, then it works. I haven't checked that but this is one recurring coding error.

@linusjf
Copy link

linusjf commented Oct 26, 2020

I do not like Only One Return since it encourages deep nesting of if-else blocks. I would rather have multiple returns and a very flat logic than one return and 3 layers of if-else. I think the idea of 1 return is deprecated and flat logic replaces..

I don't like the rule either. I don't like using else when I'm returning in the if.

@linusjf
Copy link

linusjf commented Oct 26, 2020

I disagree with Premature Initialization and Variable Declaration Usage Distance. At a minimum, I would like the Local Variables At Top rule and have it default to disabled. I can then enable the rule and disable the rules I disagree with in the PMD configuration file... like I do today.

I'm not uncomfortable with the VariableDeclarationUsageDistance rule. However, violation is sometimes unavoidable when used in conjunction with the rule OneDeclarationPerLine.
I think you can allow multiple variables of same type per line , with either all initialized or none initialized. If not, then they can and should be declared on separate lines.

@numeralnathan
Copy link
Author

I still do not see how Local Variables At Top and Premature Declaration enforces Only One Return. I do not see how both rules can be enabled at the same time. Can you give some example code?

Consider this piece of code. result is declared at the top of the method yet there are 3 returns in the method. Local Variables At Top would require int result; to be at the top of the method. Premature Declaration would flag this as a problem and insist removing the line and putting int in front of result = value1....

   public static int compareString(@Nullable String value1, @Nullable String value2)
   {
      int result;

      if ((value1 == null) || (value2 == null))
         return compareNulls(value1, value2);

      result = value1.compareToIgnoreCase(value2);

      if (result != 0)
         return result;

      return value1.compareTo(value2);
   }

The same code would have to look like this if Only One Return were applied. I personally do not like this. It makes it much harder to follow each path. With returns like above, I can easily follow each path because the branches terminate quickly (i.e. return).

   public static int compareString(@Nullable String value1, @Nullable String value2)
   {
      int result;

      if ((value1 == null) || (value2 == null))
         result = compareNulls(value1, value2);
      else
      {
         result = value1.compareToIgnoreCase(value2);

         if (result == 0)
            result = value1.compareTo(value2);
      }

      return result;
   }

@numeralnathan
Copy link
Author

numeralnathan commented Oct 26, 2020

I remember why Only One Return was invented. This was for assembly code and languages where you had to clean up the memory and pop the stack at the end of the method. By having Only One Return, the programmer can ensure the clean up and stack pop happen consistently. In Java, we do not have to worry about cleaning up the memory nor popping the stack. This is all done for us by the JVM. Hence, I do not see a need for Only One Return.

@numeralnathan
Copy link
Author

Hmm... I disable the One Declaration Per Line rule and I do not initialize the local variables until the latest moment possible. This leaves the local variables at the top of the method very easy to read and easy to find the type.

@linusjf
Copy link

linusjf commented Oct 26, 2020

I still do not see how Local Variables At Top and Premature Declaration enforces Only One Return. I do not see how both rules can be enabled at the same time.

Then you have to specify which rules are exclusive of LocalVariablesAtTop and must be disabled.
Once you add a rule to the PMD source code, it becomes public property and its proper use and pitfalls must be documented.

@linusjf
Copy link

linusjf commented Oct 26, 2020

   public static int compareString(@Nullable String value1, @Nullable String value2)
   {
      int result;

      if ((value1 == null) || (value2 == null))
         result = compareNulls(value1, value2);
      else
      {
         result = value1.compareToIgnoreCase(value2);

         if (result == 0)
            result = value1.compareTo(value2);
      }

      return result;
   }

The second code sample proves my point. To not violate your new rule and PrematureDeclaration, you have to code to OnlyOneReturn.

@linusjf
Copy link

linusjf commented Oct 26, 2020

As for Premature Declaration, perhaps this should be Premature Initialization. Declaring a variable at the top but not initializing it does not do anything different in the bytecode. The bytecode only shows first use to last use. So, JIT does not know any difference. On the other hand, prematurely initializing a variable may have a negative performance impact. Unless JIT can move the line of code later, then JIT will have to assign a register for the variable or even worse push one more thing on the stack.

I'm not debating the byte code. I agree prematurely initializing a variable can have a -ve performance impact.

@linusjf
Copy link

linusjf commented Oct 26, 2020

I remember why Only One Return was invented. This was for assembly code and languages where you had to clean up the memory and pop the stack at the end of the method. By having Only One Return, the programmer can ensure the clean up and stack pop happen consistently. In Java, we do not have to worry about cleaning up the memory nor popping the stack. This is all done for us by the JVM. Hence, I do not see a need for Only One Return.

https://stackoverflow.com/questions/36707/should-a-function-have-only-one-return-statement

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
a:new-rule Proposal to add a new built-in rule
Projects
None yet
Development

No branches or pull requests

2 participants