Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Issue #3: Git component #17

Merged
merged 1 commit into from Jun 11, 2017
Merged

Issue #3: Git component #17

merged 1 commit into from Jun 11, 2017

Conversation

Luolc
Copy link
Collaborator

@Luolc Luolc commented Jun 8, 2017

#3

Copy link
Member

@rnveach rnveach left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Luolc
Please look over the issue for this PR. You didn't make any comments on what I said, so I am not sure if you are planning this out like I suggested or you think we should go another way.

I said:

We most ignore deleted files.
We must only grab files from src/main

Are you planning to do this in another component or in the git component in another method?
Is there any reason we should do this outside of git?

Finally, if any testing in git gives you trouble or we want to skip them for now, I am fine with that, but we need to make issue(s) for them to come back to later.
Most of my talk of tests are more just ideas and are not required right now.

*/
public class DiffParser {
/** The prefix of branch head commit ref. */
private static final String COMMIT_REF_PREFIX = "refs/heads/";
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we just use Constants.R_HEADS from jgit?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

* @throws IOException JGit library exception
* @throws GitAPIException JGit library exception
*/
public List<GitChange> parse(String branchName) throws IOException, GitAPIException {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are we every going to instantiate this more than once? Since it takes the repository in the constructor, that means we would be working with more than one repository.
I think this and the other methods would be better as static and take repositoryPath as a parameter.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just designed it by my intuition previously. But actually my point is on the opposite, I think if we need work with more than one repository, then static method is better.
I was thinking that we may work with only one repository path and more than one branch, for there is only one checkstyle project but many feature branches. Then we could just set the repo path once in the constructor and run parse method with only the branch parameter.
But I guess in most case we would work with only one repo path and one branch. So if you think static method is better, I am ok with that.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IMO, I think the 2 branches we will have to work with is PR branch and master branch. Since we can't install the Checkstyle versions, we will need to execute them to get the information we need (what are modules, what are properties, and what has changed).

But in terms of GitDifferencing, we only need the files that have changed (for now).
Let's make this static and remove the field and make the constructor private. I don't forsee any components having a constructor, right now.

private GitUtils() {
}

public static Repository newRepository() throws IOException {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

start methods with verbs when possible.
Lets go with createNewRepository.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

import org.eclipse.jgit.treewalk.CanonicalTreeParser;

/**
* Git diff parser.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Repeating the component and class name is not ok for a description.
Just write a brief sentence that states the purpose of the utility.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated.

}
}

public static void addAllAndCommit(Repository repository, String message)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why all? that usually implies more than 1 of something.
I only see one file being added and it is named ..

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here it is the same with command git add ., and that means add all files.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was unaware, as I don't use the git command line.
This can be ignored for now.

throws GitAPIException {
try (Git git = new Git(repository)) {
if (git.branchList().call().stream()
.anyMatch(ref -> ref.getName().equals("refs/heads/" + branchName))) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use the constants here too.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

}

@Test
public void testParse() throws Exception {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should include all types of differences in git.
New file, removed file, renamed file, moved file, chmod change (linux only).

}

@Test
public void testParse() throws Exception {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should also add a test where PR branch is behind master. This happens quite a bit when people don't stay up to date with changes in master.
We should still correctly identify PR branch commits in this scenario and only get it's differences.

}

@Test
public void testParse() throws Exception {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We also need a test where PR branch has multiple commits on master.

@Luolc
Copy link
Collaborator Author

Luolc commented Jun 9, 2017

@rnveach
Sorry for the missing of my comments. It was pretty late when I sent this PR yesterday (about 2AM) and I was sort of sleepy then. I originally intended to edit this PR to add my opinions today.

I had thought about the issue comments before this PR and decided to hang on "deleted files" and "src/main" for the following reasons:

Regrading "src/main", I have discussed the possibility of grabing information from changes of UTs in my proposal with Roman. I summarized and wrote the result of the discussion in the proposal as:

Moreover, from changes of UTs we could take values of properties that are required for regression. As certain UT is changed, it means that behavior for that property values are changed and to do regression on values that UT specify is required. Therefore, we could find the changed lines of corresponding UTs, and then find the config setting code, which could be used to help us generate the configurations. It might be not easy, but I think it is possible and we could have a try.

https://docs.google.com/document/d/1xu6SE4qeKTRQ45R9FSLOQB-t5ExzBGyvU9FLGifvxY0/edit?usp=sharing, Project -> Outline -> 2

This idea is much far from now, and even may not be achieved during the period of GSoC. But there are chances to realize the idea in the future. So I think we should not drop "it/test" and other files here in git component, but in specific strategy class.

In my original design of Generator, there is a whetherToSkipChange method for the changed files filter. And I prefer smth like that to do the filtering outside git component. Or we could create an filter interface in git component.
https://github.com/Luolc/regression-tool/blob/74c6c83094599488bb9960d5d7eb162858a3753d/src/main/java/com/github/checkstyle/regression/generator/AbstractConfigGenerator.java#L183-L190

In addition, from the point of Single responsibility principle, I think there should be only git related operation in git component. DiffParser, then only parse the diff, no other restriction. If we add "it/test" filter directly in git component, then it surely dependents on the file structure of checkstyle project. At least it would increase the difficulty of testing.

The situation of "deleted files" is similar.

@Luolc Luolc force-pushed the issue3 branch 2 times, most recently from 58573f9 to 18d37db Compare June 9, 2017 07:32
@Luolc
Copy link
Collaborator Author

Luolc commented Jun 9, 2017

@rnveach
Regrading the test, I suggest to split the remaining testing problems to separate issues, for it is a little hard for me to complete all of them in this one PR. :) And let me fix them one by one in the future.
BTW, I don't know why the "static method" review is folded. I replied my points in the review block.

@rnveach
Copy link
Member

rnveach commented Jun 9, 2017

then it surely dependents on the file structure of checkstyle project

Whole regression idea depends on checkstyle project.
If you wish to separate this out more, than filtered files would be a parameter to this class.

"src/main"

Ok, we will hold off on this for now.
Changes in tests can take a priority, but since we know the module that is changing, we can deduce the class' test and pull all properties from the file. If we rely only on test changes and for whatever reason there is no test changes found, we still need to come up with some type of testing plan.

This will be done in the module component then?

suggest to split the remaining testing problems to separate issues

Make the issue. They can all be in one issue and worked on slowly since they are all very closely related. We have no requirement to fix whole issue in this repo. Just keep issue updated with what is left.

Copy link
Member

@rnveach rnveach left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Make the static change and I will merge this.
Please join discussion in #2 and #3 so I get a clearer picture of what you intend.


package com.github.checkstyle.regression.git;

import org.eclipse.jgit.diff.DiffEntry;
Copy link
Member

@rnveach rnveach Jun 9, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This may be another issue we can do once I see how this class will be used.


If we are passing this class to other components, than org.eclipse.jgit can't be in this class and this class must go to com.github.checkstyle.regression.data as I defined in #2 .

One component shouldn't be invoking data of another component. data package is so all components have a common area to share data with. There shouldn't be any other cross component communication.
I don't want jgit showing up in import controls of other components as it makes it look like they are doing things outside of their purview.
I think we should expand the fields to what we need, the file path and the lines that have changed.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I remove the DiffEntry from the constructor params. The path and changed lines (in the future) would be passed directly now.

* @throws IOException JGit library exception
* @throws GitAPIException JGit library exception
*/
public List<GitChange> parse(String branchName) throws IOException, GitAPIException {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Based on discussion at #17 (comment) ,

Let's make this static and remove the field and make the constructor private. I don't forsee any components having a constructor, right now.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

@Luolc Luolc mentioned this pull request Jun 11, 2017
@rnveach rnveach merged commit 4edeee4 into checkstyle:master Jun 11, 2017
@Luolc Luolc deleted the issue3 branch June 11, 2017 15:07
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants