New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
HDDS-3583. Loosen some rule restrictions of checkstyle #921
Conversation
I am in favour of this change, but for a change like this, I feel we need a few more +1's and also wait to see if there are any objections. We will try to discuss it on the next community sync call, which is on Monday. |
@sodonnel Yeah it is necessary |
@sodonnel How about the result of the +1's int the community sync call? |
My fault, but it was not discussed during the call. We were out of time. Let me ping the mailing list. |
BTW, I am +1, but let's wait a little for more feedback... |
+1 |
+1. Thanks @maobaolong for taking action. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@maobaolong Thanks for working on this! The changes look good to me. +1.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1
@maobaolong Thanks for working on this! +1 LGTM |
Change Line length limit from 80 -> 120 is overall positive change for me, only once in a while I hit it with modern IDEs. A few things I can think of are multiple-window side-by-side will need to be adjusted
|
Should we wait for more approvals or is it good to be merged? |
There is a related discussion on the mailing lists. @fapifta had some concerns, but he said As any code change can be vetoed by any committers, we need clarification from @anuengineer. @anuengineer can you accept this patch ( |
I think it should be cancelled. Thanks for checking. |
There are still lots of contributor around me complain with the 80 character per line limitation. So a month past, a vehemently discussion has taken place in the mail-list, i think everyone have many arguments, but everyone try to persuade other, but nobody can persuade other. But we still need a decision, and a reason why, both do nothing and do this change. can be a choice. I think we don't make decision means choose keep 80, it isn't a reasonable result for people who want to change. Now, i found there are 3 choices from the discussion.
It seems we have no better idea than vote, and everyone is equal, i don't think it's a good idea of giving more weight of people who did special contribution for Ozone. For example, founder, member, committer, contributor. In fact, it is too complex, the simple way is everyone is equal, at least voter should be a contributor, majority can be accepted. Finally, my choice is change to 120, or at least 100, so, i want to vote to 2, and 3. I don't know why if we have to keep consistent with the first great people who write the first line of one repository, in fact, i saw a repository was changed from one language into another language, i also saw a repository was changed from gradle to maven, that is what i want to say, keep change, change for great. I write this reply here because i'm not sure all contributors in the ozone-dev mail list, but i believe all the contributor can see this page, we can send this link to mail-list, slack channel or any others to let people vote for this decision. If someone have a better idea than vote, i'm very glad to see any other idea about how to finish this proposal other than vote. |
This is the problem with a change like this. Its a debate where there is no "correct" answer. Some like it one way and some like it another. However that is not a reason to not change - as you said, software is always changing and in my opinion the style can change too. We can see that as the Java language has evolved over the versions, software gets refactored etc.
I believe a vote is the best way to get an idea of what the community wants. Let the people who are actively involved in the project give their opinion. Its probably best to do this like a normal release or merge vote. Post it to the mailing list, and Slack and give it 7 days for responses. I'd be inclined to with just two options 80 or 120, but add in the 100 option too if you like. I'd be happy with 100 or 120. Perhaps going for the middle ground (100) would be more sympathetic to the existing code style as the lines would not be too much longer. |
@maobaolong @sodonnel I agree with you both. I like the approach of starting a vote thread similar to a normal release or merge vote and give it 7 days for responses. It will be easy to keep track of votes if we start a vote thread in the mailing list and share the link here and in slack channel. |
Thanks all for the discussion, this change can be arguable from both sides. If I understand correctly, @anuengineer indicated a valid binding veto here. According to the project bylaw, a vote will not be helpful. The best shot is to convince him to withdraw, otherwise let us cancel the PR and move on with other important issues. Bylaw of the project https://hadoop.apache.org/bylaws.html If you disagree with a valid veto, you must lobby the person casting the veto to withdraw their veto. If a veto is not withdrawn, any action that has been vetoed must be reversed in a timely manner." |
My reason is simple, https://accessibility.digital.gov/visual-design/typography/ argues very clearly too keep them under 75 chars per line. I want to make sure that code base usable for people with visual disability. If you have research or authority which proves that this is not the case, I am willing to lift the veto. In the absence of information -- that an argument like this is probably for text and not for code -- is absence of information, I don't have any data to modify current stance. So currently this veto stands. I am more than willing to listen to your arguments and respectfully think about it. In fact, my veto is based on reading and researching on this topic. Not based on the fact that I am the first person to write code for Ozone. |
According to this study 95 character is better (based on the text sis more convincing as the mentioned article, as it's a reproducible study). http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.503.2885&rep=rep1&type=pdf |
Actually I think the question is slightly different (based on the conversation from the email list) Let's say that we have an ideal line length (75 or 95 or 120, I don't know what). Do we need to enforce it in the source representation? Or we can enable for anybody to choose it based on own preference. Using higher (or unlimited) number makes it possible to use both 120 or 80 or any other number based on the preference as the soft-wrap function of IDEs and an editors are very powerful today (tested IntelliJ) Originally I had no preference but I convinced that using a high number is more inclusive. |
|
Exactly. These recommendations are not about reading source code, which is read in a very different way to text. The above link provides no evidence about the ease of reading source code. With what @elek has tried related to "soft-wrap function of IDEs" I feel code with longer line lengths can still be inclusive. I am even open to removing the line length restriction entirely and handling it at review time, but I worry about the reviews getting bogged down in comments about reducing line lengths on excessively long lines. The reason I suggested having a vote on the preference, is so we can fairly access what the community desires. If the majority of people want to extend the line length, then for the sake of the community, I feel we would be better to accept the consensuses rather than veto the decision, and I would hope Anu would take onboard the direction the community desires over his own preference. |
We are currently the part of the Hadoop project, which has the 80 characters as the limit, we I think should change the rule - if we change it - after we become TLP as that is in the making. Also this seems to be a good time to write our guidelines (yes, I loved to read the Flink guidelines Marton posted on the mailing list), that talks about coding style, wrapping, line length, possible exceptions and all the things that we think is important regarding code. First it should not be large, but evolve over time as we have discussions and learning points on the go, but it would be useful to contain all the prior learning points we already have on the project. Even though I don't want to block the community from the change, I still think we should keep 80-ish as a limit. I wrote about consistency, I wrote about typography, and I wrote about a couple other things in the mail-thread. Also I questioned why not go unlimited then based on the first arguments about inclusiveness and naming problems cited by 120 limit supporters. Now I just would like to add some colour from a few guys seem way smarter then me... Dan North at 20:42 in this video (https://youtu.be/4Y0tOi7QWqM?t=1242) talks about "code that fits into one's head", and mentions why contextual consistency matters, and even though he talks about the way larger picture, it is true for source code as well, also he talks about teams have to be opinionated about how they do things, and how they organize things, and that makes me think that, we need general acceptable guidelines on many visual/architectural/algorithmical things that we might not agree on now, but we would like to have consistency on. |
I have stated a logical reason. None of you seem to discuss that. So let me ask this again. Do you know how a visually challenged person operates? Have you used the tools? I have not, and I have rely on someone like Accessibility.gov. Your argument is that it is about text, which part of the source code does not seem to be text? why is that recommendation not applicable? None of you are willing to produce information about why your preference is better, other than it is "my preference". I have repeatedly said in this conversation that if you are willing to produce data about accessible software use, I am willing to listen. If anything, I think this is a group of people trying to force the new preference without anything other than "ad hominem" attacks and innuendo. Let us get the fact straight. The https://accessibility.digital.gov/ talks about recommendations to accommodate certain people who have handicaps; you have a wide monitor, you find that it is better to put the burden on them; since you would love to enjoy your wide monitor. I get that; but to accuse me to forcing my preference, when I have pointed out a valid logical objection and point out that we must be inclusive, where do you get the idea that I am forcing the community to be inline with my own preference. So yes, my VETO absolutely stands.
I agree, but let me ask you this; how did you come to the conclusion that this is NOT applicable to the source code at all? What is the information used by you to come to that conclusion? That text does not have the word source code, and therefore source is some sort of magical thing? or are you under the impression that visually challenged people does not read source code and we don't need to care about them? |
I am far more than willing to listen. I am basically saying that it is NOT a preference of mine at all. I studied this issue; and I am asking all people part of this issue to research this. I am NOT saying this is the way it should be. I am asking for a technical study which justifies any of the other positions. Yes, "You" as Individual must have debated that 15 years ago, I am saying that there are significant studies on this topic, so that we can leverage those leanings. Honestly, It feels like I am not able to debate this on the merits of the case -- NO ONE seems to be interested in researching this -- but the refrain is this is what the majority wants. My veto is NOT against the Majority -- I love democracy, was born in the largest one in the world, and now I live in the oldest democracy. This is NOT a democratic question -- This is simply me asking -- what is the rationale for this decision ? Have we considered all this information that is available. I would love to have a deep discussion on each of these points, and I have a set of questions, which I had promised I will share with @fapifta (it is taking time, but I promise I will get to that). So once more, let us have a discussion about this issue -- and as I pointed out, it is NOT based on my preferences or yours -- but based on extensive amount of studies that are already in place about this. Just saying something is Nonsense without producing data to support your claim -- is not an effective way to have a discussion. **Let me repeat: ** Let us have a sincere discussion about this issue. My veto was cast based on my research on this topic. I am more than willing to lift my veto, if I am able to see information that justifies other positions. This is NOT a dogmatic issue for me. I am just doing the right thing for all of us. |
I am an active committer. I am not sure where you go the idea that I am not. |
@arp7 This is a classical example of an attack against the person. We just agreed to discuss the issue. What is @szetszwo point here. Let us shoot the messenger, instead of debating the issue. I think this is a bad slippery slope that will kill this community. I really think we should all be vigilant against NOT having healthy debates, but use arguments like You are not a part of US, because you have a different opinion. If I count the contributions to ozone, I know I have made far more contributions to ozone than @szetszwo. I am humbly requesting us to maintain a discourse that is respectful to each other. |
Please also note the history of discourse so far in this round:
What is missing ?NO DISCUSSION about the merits of the technical argument, and the long document that I wrote based on my research. I am flagging this pattern with the hope that it will allow us to debate this issue with research and references instead of just trying to destroy free speech and expression. |
Hadoop stayed on 80 for a long time not so much for punched-card development (that's where it came from!) but because it was easier to do side-by-side reviews with the .diff viewer plugin for chrome. we moved to 100 chars a few years back and I have no regrets. It helps us stay with readable code when using builder APIs: public interface FutureDataInputStreamBuilder
extends FSBuilder<CompletableFuture<FSDataInputStream>, FutureDataInputStreamBuilder> { Here the types of arguments/subclasses are over 80 chars wide, splitting this stuff makes it harder to parse. And, just like the limits on method length and #of arguments to a method or constructor, it is only a hint. private <BuilderT extends S3BaseClientBuilder<BuilderT, ClientT>, ClientT> void configureEndpointAndRegion(
BuilderT builder, S3ClientCreationParameters parameters, Configuration conf) { Although wider lines may appear to go back on the "side by side" reviewing problem, it is considered moot because
Adopting a wider column count should not be an excuse to overboard with functional programming code to show off how you learned Haskell, Standard ML or OCaml getOptionalKeys().stream().filter(!key -> key.startsWith(headerPrefix) && key.length() > prefixLen).forEach(key -> headers.put(key.substring(prefixLen), options.get(key))); why so? makes debugging and tracking down where an NPE came from wrong. A single clause per line is much better as you can set breakpoints, isolate stack traces etc. getOptionalKeys().stream()
.filter(key -> key.startsWith(headerPrefix) && key.length() > prefixLen)
.forEach(key -> headers.put(key.substring(prefixLen), options.get(key))); Accordingly, I'd worry less about line length and more about modern java code issues
So here's my proposal
if you can't get consensus, move this out of github issues into mailing lists. And focus on what is best for the readability and maintainability of modern code, not stuff we wrote for java 5. (BTW: all those code snippets are from hadoop trunk, not all my work) |
oh, and as hadoop went to 100 chars in 2021, if you copy and paste in anything from beyond that time, you either have to reformat or expect complaints. |
Thank you @steveloughran. You have always brought a fresh breath to arguments, and you are taking this argument in the right direction. This is what I have been asking..
I have heard the similar story for a long time, and I have always believed this to be true; until I started reading the
This is actually a great input. The question I would ask is that is there more density of information now per line, and what is the way that people read and understand.
This is a major concern. When I am on a laptop, I find hard to review. The language of choice for me these days is Rust, and they do use 100 chars. I am familiar with this. What I do on a regular basis is that is I use 2 monitors when I do code reviews. To avoid these kind of wrapping.
This. 100 times this. Java is rapidly getting these features, and I love that. Don't get me wrong. The information Why is that -- in my mind -- really valuable and a far better solution NOT being discussed here ? Another Logical Corollary to this point.This argument always seems to start from a point, ** I have larger monitor**, followed by the ** I don't believe human beings reading studies are relevant in understanding code**. So let us make this 100 chars as the limit. What surprises me is that they are NOT able to see the inconsistency of the ask, let me illustrate that here. Let us assume that every one in the world has a 1080p screen and 21 inches is the minimum size of a monitor (I am fully aware that many of you might be coding on Macs with 4K), but humor me here. The first question we want to ask is: How many chars can we effectively put on a given line ? Pixels Per Inch (or PPI) is defined as number of pixels per inch on your monitor. Characters per Line= Screen Width in Inches × (PPI / Character Width in Pixels) So the first question in my mind is always, why are you folks NOT asking for 462 characters per line. The argument that I have a large screen implies I need more characters seems to fail, due to this inconsistency in the ask. So what they are really asking me is this --
But my humble point is that you are just responding to your anger, and not realizing the greater good this rule was designed for. That is to prevent information overload. We have studied extensively the **"Saccades" : One of the important metric used in understanding how people read code and understand code -- is measured by a notion called "Saccades" or "Saccades Per Line" or how we read to understand. The paper indicates that experts make larger jumps (saccades) when reading code, focusing on important elements, while Novices tend to read code more linearly. This implies that fewer, but more targeted, saccades per line might be characteristic of expert reading behavior, and Novices read very differently, and as the information density increases, only people deeply familiar with the domain and code will be able to read and understand it. As a language gains more capability to express ideas in far fewer characters, we have to be cognizant of that. I know that @arp7 used to love Perl. He used to be able to write it well. Often, I struggled to read Perl due to the information density shenanigans involved by the masters of Perl (for all the new kids, this is a comment that applies to few older people in this discourse, it is a waste of your time to look up the Magic of Perl.) This leads me to conclude, that real argument that my friends are trying to make is NOT about the monitor size -- but we are irritated that checkstyle fails when we are about to commit. Perhaps, that is also the reason that Nicholas wants to argue that I have not been checking in and I don't suffer the same pain (which is NOT true, I have been more irritated by these tools that anything. Ask @awk -- I have had many fights with him. ) So let us bring this conversation back into a normal -- let us ask ourselves which pain path is better ? Moving the checkstyle to 100 chars -- which I believe is a path to reduce the pain for a little while, but suffer the consequences of the fear that you are voicing "go overboard with functional programming code to show off how you learned Haskell, Standard ML or OCaml". I guarantee you that any Java programmer who starts coding from Java 17 has far more information density that many of us who started far earlier. In that case, is the solution truly to take the pain for a little while and get Ozone working with Java 21 LTS or Java 17 LTS ?
I strongly feel that this complicated. Especially due to the reactive crowd of Java, where they taught everyone it is so cool to return this in each function call, and then chain the calls. Most people I know, would have broken this line into multiple lines, and perhaps (I think, a personal opinion) that it is better to read this as different lines.
Completely agree. For debugging a single long line is better. But we both know that the real issue
100% -- This is my opinion too. However the discussion for the last 3 years have always been on "My monitor is big", I don't believe that human being read code. If you address these comments about -- The var, optional etc. you have effectively solved this issue. But here is the issue. Are all over contributors willing to move up this chain; perhaps tweak the way they write code ?
+1. I am fine with it. I believe that committers should do the right thing. The tools are to help them, NOT to dictate.
I would, but as you see, I am very engaged in defending MY ONLY veto in Apache. {Just kidding, I am not listening to Hadoop-common, I have always been an Ozone man.}
Yes. This is what I am saying too.
@steveloughran It has been a pleasure to talk to you after a long time. I am thankful for your thoughtful comments, and more over for taking time to write a thoughtful response. |
@steveloughran Unless you are really passionate about this topic, You might not get anything from reading these papers. These two are general studies on WHY in-spite of having large monitors, we as human beings have not
Once more, thank you for improving the noise to signal ratio of this discussion; appreciate your kindness and time. |
Code style issues like this are subject to majority opinion of the project. There is no veto. Literally any opinion can be stretched into a research discussion ad nauseam. It is still just an opinion. A technical reason for which a veto is justified would be if the configuration failed to match a group decision, or if the change causes a failure in deployment that would impact existing users. Saying the group has to extensively research the subject matter of your opinion just to override a veto that is based on opinion is a nonsensical abuse of the Apache process, even if that opinion is correct. A project must be free to make its own conclusions even if the entire world disagrees. You have an opinion and you are welcome to share that and have one vote on the process. No veto is allowed. |
@royfielding , thanks a lot for commenting on this! @maobaolong , could you update this change? It currently has some conflicts. |
@royfielding Thank you for comment. Appreciate it. I will comply with the Apache process. |
I intentionally reopened this last time. Are you open for discussion? Discussion on a closed pull request does not sound right. |
@szetszwo sure! Will be pleasure and glad to continue to finish this PR! |
0d3b99b
to
9b16de3
Compare
@szetszwo Thanks for reopen this PR for me, I've merge the master branch and resolved conflicts, please take a look. Thank you! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @maobaolong for updating the patch.
The pull request is now merged. Thanks, everyone! |
Thank you everyone for the discussion and review! |
@maobaolong, do we need also change the value of |
What changes were proposed in this pull request?
Change config file of checkstyle only, loosen some rule restrictions of checkstyle.
What is the link to the Apache JIRA
https://issues.apache.org/jira/browse/HDDS-3583
How was this patch tested?
No need