Added tests for validating commit message. Issue #937 #1518
Conversation
Sorry for the delay, I will review your update as soon as I return from vacation. |
Please name objects with nouns.
Please get the last 10 commits from history, as some PRs contain several commits.
Please use the BeforeClass annotation to cache the revisions list between all tests.
New format: ^Issue #\d*. .* $ , ^Pull #\d*. .* $
On test failure we should print all the rules for message formats so that the user can fix the message easily.
We need an option to exclude certain users from validation. Their user names should be hardcoded in the UTs, as a list of users who can commit with any message. This is required because admins sometimes have to resolve a problem quickly without wasting time on issue creation, and the version bump (during release) is also done without following the format described above. |
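To make the last two requests concrete, here is a minimal sketch (not the code from this PR) of skipping excluded authors and printing the expected format on failure. The `Commit` class, the user names, and the `findFailures` method are hypothetical and only for illustration; the real test would work on the repository's commit objects.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

final class ExcludedAuthorsSketch {
    // Hypothetical user names; the real list would be chosen by the maintainers.
    private static final List<String> USERS_EXCLUDED_FROM_VALIDATION =
        Arrays.asList("release-admin", "build-bot");

    /** Hypothetical stand-in for a commit: only author and message matter here. */
    static final class Commit {
        final String author;
        final String message;

        Commit(String author, String message) {
            this.author = author;
            this.message = message;
        }
    }

    /** Returns one readable failure line per invalid message, skipping excluded authors. */
    static List<String> findFailures(List<Commit> commits, Pattern accepted) {
        List<String> failures = new ArrayList<>();
        for (Commit commit : commits) {
            if (USERS_EXCLUDED_FROM_VALIDATION.contains(commit.author)) {
                // Excluded users may commit with any message.
                continue;
            }
            if (!accepted.matcher(commit.message).matches()) {
                // Include the expected format so the author can fix the message.
                failures.add("Invalid commit message: '" + commit.message
                    + "'. Expected format: " + accepted.pattern());
            }
        }
        return failures;
    }
}
```

Keeping the list as a constant in the test class matches the request to hardcode the names in the UTs.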
a842cd3 to 95ed391
In https://github.com/checkstyle/checkstyle/wiki/Release-notes-automation before end of string ($) In my dev environment, git always collapses trailing whitespace into a single newline character. What is more, even when I run 'git commit -m "msg"', it replaces "msg" with "msg\n". I searched the web for this behaviour but could not find anything, so I assume it is standard, but I'm not 100% sure.
But... I think the best solution to this problem is to check only the commits that the following command would produce: $ git log someRemote/master..currentBranch where someRemote is the name of the remote whose URL is the main checkstyle repository, and currentBranch is the name of the branch the user is currently on. I prepared a POC with such a solution here: The only requirement for this solution is that the user must have a remote that points at the main checkstyle git URL, and must have fetched the master branch of this remote. Developers should already satisfy this requirement (I really have no idea how anyone would want to contribute without a remote pointing at the checkstyle URL, or without having fetched master), but CI does not. For CI to work, the CI scripts would have to add the checkstyle remote and fetch master before invoking any test. That would certainly slow those tests down a bit. Please let me know what you think about this solution. |
Please update the commit message format to use ":" (instead of ".") as the delimiter between the issue number and the message. Example - https://github.com/checkstyle/checkstyle/commits/master
It is OK, I will enforce that format manually for 10 commits and then your UTs will pass :).
I do not like that. We should not demand that the user has a remote; your UT should work in all environments (Travis (by commit and by PR), local, ......). Let's keep it simple for the first release. |
95ed391 to 9b116e8
I'm just curious: why should we hardcode the excluded user names in such a way?
Why don't we use a separate "ignore list" file for this? |
Because a config file is useful when you do not want to change code, but even with a config file you would still have to commit a change to update the list of users. What is the difference between changing a config file and changing a test file? |
Guys, why do you want to check the 10 latest commits? What if a PR has 11 of them? I think a more versatile solution is to check the author of the last commit and scan previous commits in a while loop until someone else's commit is reached. We very rarely have PRs with commits from multiple authors. What do you think? |
@mkordas 👍 |
It depends. There were times when I had many sequential commits, which is why I tried to validate by limit. A PR with more than 10 commits will certainly get a lot of attention from me during code review. OK, if that is fun to implement, let's make it a mode in the test, configured by a field value, and by default work as mkordas suggested. Time will show who was right. |
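The two strategies discussed here can be sketched as a single selection method switched by a mode. This is only an illustration under assumptions: the `Commit` class, the `select` method, and the newest-first ordering of the input are hypothetical; the mode names follow the ones used later in this PR's Javadoc.

```java
import java.util.ArrayList;
import java.util.List;

final class CommitSelectionSketch {
    enum Mode { BY_LAST_COMMIT_AUTHOR, BY_COUNTER }

    /** Hypothetical stand-in for a commit: only author and message matter here. */
    static final class Commit {
        final String author;
        final String message;

        Commit(String author, String message) {
            this.author = author;
            this.message = message;
        }
    }

    /**
     * Picks the commits to validate from a history ordered newest first.
     * BY_LAST_COMMIT_AUTHOR keeps commits until another author's commit is reached;
     * BY_COUNTER keeps at most {@code limit} commits.
     */
    static List<Commit> select(List<Commit> newestFirst, Mode mode, int limit) {
        List<Commit> selected = new ArrayList<>();
        if (newestFirst.isEmpty()) {
            return selected;
        }
        String lastAuthor = newestFirst.get(0).author;
        for (Commit commit : newestFirst) {
            boolean keep = mode == Mode.BY_COUNTER
                ? selected.size() < limit
                : commit.author.equals(lastAuthor);
            if (!keep) {
                break;
            }
            selected.add(commit);
        }
        return selected;
    }
}
```

With a history of two commits by one author followed by another author's commit, the author mode selects the first two, while the counter mode ignores authorship and stops only at the limit.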
One more thing - we should probably give some hints on whether the letter after the colon should be uppercase or lowercase, and whether a dot at the end is suggested:
|
Let's start with a simple implementation and enhance it later on; there is a lot of room for improvement. |
6b00dfb to 045015c
We should not see any failure, as this PR has only one commit and it is correct. As you have the correct message format in your PR, your PR should pass on Travis and AppVeyor. |
private static String getRulesForCommitMessageFormatting() {
    return "Proper commit message should adhere to the following rules:\n"
        + "\t1) Must match one of the following patterns:\n"
I don't think we should use tabs anywhere. Let's use spaces instead :)
yes
eb297ba to 86f9627
"that is due to technical merge commit of git, see what is real commit is. We need to support that behavior."
I added a new commit to the PR mostly to discuss the idea of skipping merge commits. I will remove it if it is rejected, or squash it into the previous commit if it is accepted. |
074cac7 to 7612088
My next idea is to distinguish between two situations:
which will return all commits in secondParent that were not in firstParent. If the first parent is master, we get all newly merged commits. However, this solution has the limitation that it works properly only if the HEAD commit is a merge commit of master and another branch (so the merge was done while on the master branch). It will not work when, for example, a contributor sends a PR whose latest commit is a merge commit between his local branches. I added the next commit to the PR as a POC, let me know what you think. |
Good, but what if we just validate both branches of the merge commit? The master branch always has to be correct; the merged branch may or may not be correct. So we will just do extra validation over commits that are already in master, but there MUST be no violations there. In this case the logic will be simple (performance is not an issue there). |
@romani, so we should generate two iterators of commits - one for (git log firstParent) and a second for (git log secondParent) - and then filter the commits by one of the two previously discussed strategies (by counter or by last commit author). Is this what you meant? |
No filtering, just validate the two collections of commits. If either has a problem, report it.
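A minimal sketch of "validate both parents of a merge commit", assuming a hypothetical in-memory `Commit` class instead of the repository API used by the actual test. It walks the full ancestry of each parent, which is a simplification; a real test would bound how far back it looks. Each violation is reported with the commit hash so identical messages can be told apart.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Deque;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.function.Predicate;

final class MergeParentsSketch {
    /** Hypothetical in-memory commit with links to its parents. */
    static final class Commit {
        final String hash;
        final String message;
        final List<Commit> parents;

        Commit(String hash, String message, Commit... parents) {
            this.hash = hash;
            this.message = message;
            this.parents = Arrays.asList(parents);
        }
    }

    /** Collects the start commit and all of its ancestors, each one once. */
    static Set<Commit> reachableFrom(Commit start) {
        Set<Commit> seen = new LinkedHashSet<>();
        Deque<Commit> pending = new ArrayDeque<>();
        pending.push(start);
        while (!pending.isEmpty()) {
            Commit commit = pending.pop();
            if (seen.add(commit)) {
                for (Commit parent : commit.parents) {
                    pending.push(parent);
                }
            }
        }
        return seen;
    }

    /** For a merge HEAD, validates both parent histories; otherwise validates from HEAD. */
    static List<String> violations(Commit head, Predicate<String> isValid) {
        Set<Commit> toCheck = new LinkedHashSet<>();
        if (head.parents.size() > 1) {
            // The technical merge commit itself is skipped; only its parents are walked.
            for (Commit parent : head.parents) {
                toCheck.addAll(reachableFrom(parent));
            }
        } else {
            toCheck.addAll(reachableFrom(head));
        }
        List<String> report = new ArrayList<>();
        for (Commit commit : toCheck) {
            if (!isValid.test(commit.message)) {
                report.add(commit.hash + " " + commit.message);
            }
        }
        return report;
    }
}
```

Because no distinction is made between the master side and the branch side, the sketch does not need to know which parent is which.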
2dd7292 to 1c1ecf6
@romani now commits from both branches are checked. There is one build failure in orekit which seems to be independent of this PR:
[ERROR] Failed to execute goal org.apache.maven.plugins:maven-checkstyle-plugin:2.15:check (default-cli) on project orekit: Execution default-cli of goal org.apache.maven.plugins:maven-checkstyle-plugin:2.15:check failed: An API incompatibility was encountered while executing org.apache.maven.plugins:maven-checkstyle-plugin:2.15:check: java.lang.NoSuchMethodError: com.puppycrawl.tools.checkstyle.PropertiesExpander.<init>(Ljava/util/Properties;)V |
Please rebase on the latest code, I fixed that problem. Ref #2065. |
1c1ecf6 to 9a365ea
@romani after the rebase it successfully passed the tests, ready for the next review. |
1) > RepoCommitsIteratorFactory
Please remove the class and replace it with a simple method.
2) > previousCommitsResolutionType
Please name it xxxxxMode.
3) > masterParent, mergeParent
As you do not know which is master and which is the branch, please give the variables general names. |
9a365ea to 1968da8
@romani, what exactly should I do? I think we should just agree on an exact naming convention. |
Let's vote, I am in favor of both of these: To tell the truth, I do not really think we need to enforce any rules there. |
Please add to the Javadoc a description of the implementation logic that currently exists only in the PR discussion (the nuance with two branches to validate, and other nuances).
Please move all the commit filtering logic to the test methods; getCommitsMessagesToCheck should return a list of RevCommit objects to the test method. The number of tests will grow in the future, so extra logic should appear only in test methods. The algorithm for getting the list of commits should not change.
Please move the declaration below its usage. A method declaration should appear in the class right after its first usage. The distance between the first usage and the declaration should be reasonably minimal.
I do not like it when a single object is named as a collection ("......s"). Please rename it or justify your intention.
Please move the setUp and testXXXXXX methods above all other methods in that class, as these are the "main" methods of the class and reading the code should start from them.
Please put a comment there to describe the reason for this.
I prefer to report the commit hash together with the message to be exact, as commit messages could be the same. |
d72fd3a to 0029a9a
/**
 * Validate that commit messages have a proper structure.
 *
 * Commits to check are resolved from different places according
 * to the type of commit at the current HEAD. If the current HEAD
 * commit is a non-merge commit, previous commits are resolved
 * starting from the current HEAD commit. Otherwise, if it is a
 * merge commit, previous commits are resolved starting from the
 * commits that were merged.
 *
 * After calculating the commits to start with, it resolves previous
 * commits according to the COMMITS_RESOLUTION_MODE variable.
 * By default (BY_LAST_COMMIT_AUTHOR) it checks the author of the
 * first commit and returns all consecutive commits with the same
 * author. The second mode (BY_COUNTER) returns the first
 * PREVIOUS_COMMITS_TO_CHECK_COUNT commits after the starter commit.
 *
 * Resolved commits are filtered by author. If the commit author
 * belongs to the list USERS_EXCLUDED_FROM_VALIDATION, the commit
 * is not validated.
 *
 * The messages of the filtered commits are then checked for proper structure.
 *
 * @author <a href="mailto:piotr.listkiewicz@gmail.com">liscju</a>
 */
/**
 * Git by default puts a newline character at the end of a commit message. To check
 * that a commit message has a single line, we have to make sure that the only
 * newline character found is at the end of the commit message.
 */
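The check described in this comment can be sketched as follows. The class and method names are hypothetical, and the sketch treats a message with no newline at all as a single line too, which is an assumption beyond what the comment states.

```java
final class SingleLineMessageSketch {
    /**
     * Returns true when the message has no newline, or when its only
     * newline is the very last character (the one git appends).
     */
    static boolean isSingleLine(String commitMessage) {
        int firstNewline = commitMessage.indexOf('\n');
        return firstNewline == -1 || firstNewline == commitMessage.length() - 1;
    }
}
```

Looking only at the first newline is enough: if it is the last character, there cannot be another one after it.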
The build failed because of the way merge commits are handled (it checks commits from both branches) - the first commit on master has a bad structure:
|
Two questions:
|
@liscju, please remove "." from the message format. It is not really useful, we do not do that now, and we should not fight with ourselves here. I will answer Michal's questions later. I would merge you right now and later on enhance your test with extra validations in the next PR. |
@liscju - please fix all IDEA violations for this PR: https://teamcity.jetbrains.com/viewLog.html?buildId=563114&tab=Inspection&buildTypeId=Checkstyle_IdeaInspectionsPullRequest |
0029a9a to aad86c9
@romani maybe I don't understand what you mean, but currently the dot in the message patterns matches any character, not a literal dot character:
private static final String ISSUE_COMMIT_MESSAGE_REGEX_PATTERN = "^Issue #\\d*: .*$";
private static final String PR_COMMIT_MESSAGE_REGEX_PATTERN = "^Pull #\\d*: .*$";
private static final String OTHER_COMMIT_MESSAGE_REGEXPATTERN =
    "^(minor|config|infra|doc|spelling): .*$";
private static final String ACCEPTED_COMMIT_MESSAGE_REGEX_PATTERN =
    "(" + ISSUE_COMMIT_MESSAGE_REGEX_PATTERN + ")|"
    + "(" + PR_COMMIT_MESSAGE_REGEX_PATTERN + ")|"
    + "(" + OTHER_COMMIT_MESSAGE_REGEXPATTERN + ")";
@mkordas all issues are fixed; the build failed because of a violation in PropertiesExpander, which is not touched by this PR. |
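For readers who want to try these patterns, here is a small self-contained sketch. The class name, the shortened constant names, and `isAccepted` are hypothetical; the pattern strings are copied from the comment above.

```java
import java.util.regex.Pattern;

final class CommitMessagePatternSketch {
    // Pattern strings copied from the discussion; constant names shortened for the sketch.
    private static final String ISSUE = "^Issue #\\d*: .*$";
    private static final String PULL = "^Pull #\\d*: .*$";
    private static final String OTHER = "^(minor|config|infra|doc|spelling): .*$";

    private static final Pattern ACCEPTED =
        Pattern.compile("(" + ISSUE + ")|(" + PULL + ")|(" + OTHER + ")");

    /** True when the whole message matches one of the accepted formats. */
    static boolean isAccepted(String message) {
        return ACCEPTED.matcher(message).matches();
    }
}
```

Two details are worth noting. Because `\d*` also matches zero digits, a message such as "Issue #: text" is accepted by these patterns. And because `matches()` must consume the whole input while `.` does not match a line terminator, a message that still carries git's trailing newline is rejected unless the newline is stripped first or allowed for in the pattern.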
To the specification of patterns for commit messages to match (from:
https://github.com/checkstyle/checkstyle/wiki/Release-notes-automation#validation-of-last-commit-message-to-reference-issue-number )
I added a newline character before the end of the pattern, because in my environment git always adds it to the end of the message. I can't find information on the web about this behaviour, so I don't know whether it is common for git or specific to my environment. A matter to discuss is extracting a class like CommitMessageValidation where the matching could be done, instead of doing it in the test ( lastCommitMessage.matches(ACCEPTED_COMMIT_MESSAGE_PATTERN) ).
If we extract it, it would be easier to test, but the matching functionality is so small that it could probably stay in the test.