Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add tool for importing webrev patches #32

Closed
wants to merge 8 commits into from

Conversation

@JornVernee
Copy link
Member

JornVernee commented Jul 5, 2019

As discussed on the mailing list: https://mail.openjdk.java.net/pipermail/skara-dev/2019-July/000176.html

This PR adds a tool for importing webrev .patch files, either from a webrev url or local file path into a repository.

I've named the tool wimport as short for webrev-import (I thought the full name sounded a bit too much like a sub-command of webrev). This will first checkout a new branch, then apply the patch, and then create a commit for the patch with the message Imported patch 'NAME', similar to what is done by mq. Afterwards a user can just delete the new branch when they're done with e.g. reviewing the webrev. Although, there is also an option to directly apply the webrev without creating a branch or commit.

Currently the following options are supported:

usage: git wimport [options] <webrev url|file path>
    -n, --name NAME  Use NAME as a name for the patch (default is patch file name)
    -f, --file       Input is a file path to a local patch file
    -k, --keep       Keep downloaded patch file in current directory
    -d, --direct     Directly apply patch, without creating a branch or commit

Progress

  • Change must not contain extraneous whitespace
  • Change must be properly reviewed

Approvers

@openjdk openjdk bot added cli rfr labels Jul 5, 2019
@JornVernee JornVernee changed the title Wimport prototype Add tool for importing webrev patches Jul 5, 2019
@bridgekeeper
Copy link

bridgekeeper bot commented Jul 5, 2019

👋 Welcome back jvernee! A progress list of the required criteria for merging this PR into master will be added to the body of your pull request (refresh this page to view it).

@mlbridge
Copy link

mlbridge bot commented Jul 5, 2019

@JornVernee JornVernee force-pushed the JornVernee:wimport branch from 7b20ce7 to 3e4f668 Jul 5, 2019
@edvbld
Copy link
Member

edvbld commented Jul 5, 2019

Thanks Jorn for you contribution!

I just had time to take quick look at the code, but wanted to share some high-level thoughts before diving in deeper. I would suggest splitting this tool into two:

  1. git-webrev-apply
  2. git-webrev-fetch

The motivation would be to two piggyback on already known git commands apply and fetch and try
to match the semantics as good as possible. The first tool, git-webrev-apply is very similar to what you have done in this PR except that it would not do a commit. The second tool, git-webrev-fetch, would be more similar to what you have done here, but here I have few more considerations to make the tool as safe as possible:

  • we want git-webrev-fetch to apply directly to index (with git apply --cached) so we avoid potentially changing files in the user's working tree.
  • if the user already has a dirty index (git diff --cached is not empty) then I would consider two options:
    • abort and notify the user that the index is dirty (this differs from how git fetch behaves)
    • temporarily store the user's dirty index with git write-tree, apply the patch with git apply --cached, commit (more on how to commit later), restore the user's index with git read-tree
  • I think we can find out a bit more about the webrev. For author and committer we can definitely use the username, https://cr.openjdk.java.net/~(.*)/. You could potentially use the census module if you want to get the real name for the user (to be able to use "Real Name username@openjdk.org" form).
  • For the commit message I would either go with just the URL of the webrev or try to apply some heuristic to find out if the webrev has corresponding JBS issue.
  • For committing the dirty index I'm not fully sure yet of the best way. Ideally we want origin/master to be the parent since master might have advanced due to a user's local commits. One way to achieve this could be to first commit and then rebase the resulting commit onto origin/master. Another, perhaps more correct, but also slower, would be to use git worktree to create a new temporary worktree from origin/master in /tmp, apply the patch, and then commit.
  • I think we can automatically deduce the name of the ref, for a webrev from a URL https://cr.openjdk.java.net/~ehelin/JDK-01234567/00 I think the ref webrevs/ehelin/JDK-01234567/00 is fairly natural. This is somewhat similar to how pulls/ is used for refs related to pull requests.

Regarding the verbose naming of the tools, I actually intended for them to look like webrev subcommands 😃 The reason for this is if these two tools turn out well then I think we can repurpose git-webrev into having subcommands (similar to git-pr):

  • git webrev generate
  • git webrev apply
  • git webrev fetch

generate would be the default command so that git webrev stays backwards compatible (just typing git webrev would mean git webrev generate). But before embarking on this refactoring I wanted us to start with implementing git webrev-apply and git webrev-fetch.

I would suggest slightly re-purposing this PR into becoming the PR for git-webrev-apply and then tackle git-webrev-fetch in another PR.

What do you think?

@JornVernee
Copy link
Member Author

JornVernee commented Jul 5, 2019

Response inline...

I just had time to take quick look at the code, but wanted to share some high-level thoughts before diving in deeper. I would suggest splitting this tool into two:

1. git-webrev-apply

2. git-webrev-fetch

The motivation would be to two piggyback on already known git commands apply and fetch and try
to match the semantics as good as possible. The first tool, git-webrev-apply is very similar to what you have done in this PR except that it would not do a commit.

Okay, so it would just be like git apply, but with a webrev url as an argument? I suppose it's reasonable to drop support for local .patch files in that case, since that behaviour can be achieved by just using git apply.

The second tool, git-webrev-fetch, would be more similar to what you have done here, but here I have few more considerations to make the tool as safe as possible:

* we want git-webrev-fetch to apply directly to index (with `git apply --cached`) so we avoid potentially changing files in the user's working tree.

Oh yeah, good point!

* if the user already has a dirty index (`git diff --cached` is not empty) then I would consider two options:
  
  * abort and notify the user that the index is dirty (this differs from how `git fetch` behaves)
  * temporarily store the user's dirty index with `git write-tree`, apply the patch with `git apply --cached`, commit (more on how to commit later), restore the user's index with `git read-tree`

Can't the git read-tree step create conflicts? I'm a little more in favour of the former option of aborting, then the user can decided either to reset the tree or do a write-tree (or something else).

* I think we can find out a bit more about the webrev. For author and committer we can definitely use the username, `https://cr.openjdk.java.net/~(.*)/`. You could potentially use the `census` module if you want to get the real name for the user (to be able to use "Real Name [username@openjdk.org](mailto:username@openjdk.org)" form).

If were going with deriving more information from the webrev, I'd rather try to find it in the webrev body, we can possibly get:

  • the author
  • the revision the webrev was generated on
  • the branch the webrev is targeting
  • the associated JBS bug link
  • the patch file link (already doing this for this PR)
* For the commit message I would either go with just the URL of the webrev or try to apply some heuristic to find out if the webrev has corresponding [JBS](https://bugs.openjdk.java.net/) issue.

I think some heuristic is good, but we should have a good fallback as well, since we're not guaranteed to find the wanted information. Using the webrev URL as a commit message seems like a good one!

* For committing the dirty index I'm not fully sure yet of the best way. Ideally we want `origin/master` to be the parent since `master` might have advanced due to a user's local commits. One way to achieve this could be to first commit and then rebase the resulting commit onto `origin/master`. Another, perhaps more correct, but also slower, would be to use `git worktree` to create a new temporary worktree from `origin/master` in /tmp, apply the patch, and then commit.

Well, not all webrevs target the master branch, but I get your point. An easy way to sidestep this problem is to use the current HEAD as the parent (and whichever branch this might be).

* I think we can automatically deduce the name of the ref, for a webrev from a URL https://cr.openjdk.java.net/~ehelin/JDK-01234567/00 I think the ref `webrevs/ehelin/JDK-01234567/00` is fairly natural. This is somewhat similar to how `pulls/` is used for refs related to pull requests.

Sounds good! Overall I think we can use several heuristics to create a WebrevMetaData object that we can then generate this kind of information from that.

Regarding the verbose naming of the tools, I actually intended for them to look like webrev subcommands 😃 The reason for this is if these two tools turn out well then I think we can repurpose git-webrev into having subcommands (similar to git-pr):

* git webrev generate

* git webrev apply

* git webrev fetch

generate would be the default command so that git webrev stays backwards compatible (just typing git webrev would mean git webrev generate). But before embarking on this refactoring I wanted us to start with implementing git webrev-apply and git webrev-fetch.

Okay, sounds great to me! (just having the usual case of new-contributor-syndrome, not trying to step on anyone's toes 😄)

I would suggest slightly re-purposing this PR into becoming the PR for git-webrev-apply and then tackle git-webrev-fetch in another PR.

What do you think?

Sounds good.

Also, I wanted to ask what you want to do for tests? I noticed there are no other test in the cli module yet, so I've tested this PR manually, but maybe now is a good time to start adding them?

FWIW, not all the tests pass on Windows currently, particularly some tests that rely on unix tools/path-separators, and quite a bit of the vcs tests file (I haven't looked into it yet). In the meantime it might be nice to set up some CI on GitHub, e.g. Travis?

@edvbld
Copy link
Member

edvbld commented Jul 8, 2019

The first tool, git-webrev-apply is very similar to what you have done in this PR except that it would
not do a commit.

Okay, so it would just be like git apply, but with a webrev url as an argument? I suppose it's reasonable to drop support for local .patch files in that case, since that behaviour can be achieved by just using git apply.

Yes, git-webrev-apply would behave exactly like git apply but with a webrev URL as argument. Yes, I would drop support for local .patch files for the exact reason you mention.

* if the user already has a dirty index (`git diff --cached` is not empty) then I would consider two options:
  
  * abort and notify the user that the index is dirty (this differs from how `git fetch` behaves)
  * temporarily store the user's dirty index with `git write-tree`, apply the patch with `git apply --cached`, commit (more on how to commit later), restore the user's index with `git read-tree`

Can't the git read-tree step create conflicts? I'm a little more in favour of the former option of aborting, then the user can decided either to reset the tree or do a write-tree (or something else).

I don't think so? But I haven't tried this idea 😄 The first git read-tree will empty the index, we are essentially temporarily storing the index as a tree. The git apply followed by git commit should also result in an empty index. One could experiment with using git write-tree and git commit-tree instead of git commit, but the result should be very similar. So when we do the final git read-tree the index should be clean, HEAD should be intact, the worktree should be intact, so I don't think we should get any conflicts?

However, the above will be a little fiddly, so it is probably best to just start with a warning if the index is dirty.

* I think we can find out a bit more about the webrev. For author and committer we can definitely use the username, `https://cr.openjdk.java.net/~(.*)/`. You could potentially use the `census` module if you want to get the real name for the user (to be able to use "Real Name [username@openjdk.org](mailto:username@openjdk.org)" form).

If were going with deriving more information from the webrev, I'd rather try to find it in the webrev body, we can possibly get:

  • the author
  • the revision the webrev was generated on
  • the branch the webrev is targeting
  • the associated JBS bug link
  • the patch file link (already doing this for this PR)

Good point! I would probably recommend a mix, since not everyone creates webrevs with -c, but many contributors use the pattern ~username/ISSUE/ in the URLs.

* For the commit message I would either go with just the URL of the webrev or try to apply some heuristic to find out if the webrev has corresponding [JBS](https://bugs.openjdk.java.net/) issue.

I think some heuristic is good, but we should have a good fallback as well, since we're not guaranteed to find the wanted information. Using the webrev URL as a commit message seems like a good one!

👍

* For committing the dirty index I'm not fully sure yet of the best way. Ideally we want `origin/master` to be the parent since `master` might have advanced due to a user's local commits. One way to achieve this could be to first commit and then rebase the resulting commit onto `origin/master`. Another, perhaps more correct, but also slower, would be to use `git worktree` to create a new temporary worktree from `origin/master` in /tmp, apply the patch, and then commit.

Well, not all webrevs target the master branch, but I get your point. An easy way to sidestep this problem is to use the current HEAD as the parent (and whichever branch this might be).

* I think we can automatically deduce the name of the ref, for a webrev from a URL https://cr.openjdk.java.net/~ehelin/JDK-01234567/00 I think the ref `webrevs/ehelin/JDK-01234567/00` is fairly natural. This is somewhat similar to how `pulls/` is used for refs related to pull requests.

Sounds good! Overall I think we can use several heuristics to create a WebrevMetaData object that we can then generate this kind of information from that.

That sounds like a good start, please consider adding it to the webrev module.

Regarding the verbose naming of the tools, I actually intended for them to look like webrev subcommands smiley The reason for this is if these two tools turn out well then I think we can repurpose git-webrev into having subcommands (similar to git-pr):

* git webrev generate

* git webrev apply

* git webrev fetch

generate would be the default command so that git webrev stays backwards compatible (just typing git webrev would mean git webrev generate). But before embarking on this refactoring I wanted us to start with implementing git webrev-apply and git webrev-fetch.

Okay, sounds great to me! (just having the usual case of new-contributor-syndrome, not trying to step on anyone's toes )

Haha, the only toes you can step on are mine and Robin's, and we step on each other's toes all the time, so they are fairly hardened by now 😆 Feel free to suggest as dramatic changes as you like!

I would suggest slightly re-purposing this PR into becoming the PR for git-webrev-apply and then tackle git-webrev-fetch in another PR.
What do you think?

Sounds good.

Also, I wanted to ask what you want to do for tests? I noticed there are no other test in the cli module yet, so I've tested this PR manually, but maybe now is a good time to start adding them?

I general we try to put as much logic as possible into the various libraries and then test the libraries (since they are easier to unit test than the CLI tools). The CLI tools are primarily meant to be small programs "gluing" together a few libraries and therefore I have resorted to manual testing. Some of the CLI tools (the ones not requiring a network connection) should be fairly easy to test though, so feel free to add tests to the cli module.

FWIW, not all the tests pass on Windows currently, particularly some tests that rely on unix tools/path-separators, and quite a bit of the vcs tests file (I haven't looked into it yet). In the meantime it might be nice to set up some CI on GitHub, e.g. Travis?

Sorry to hear that the tests aren't passing on Windows, they should pass (we are running them on Windows from time to time). We have something in the works regarding CI, but we need a little more time to work on it, so please stay tuned for that.

@JornVernee
Copy link
Member Author

JornVernee commented Aug 2, 2019

Updated this to be a sub command of webrev.

I made a utility called SubCommandSwitch for switching over the different sub commands, as well as making a basic WebrevMetaData with a factory that derives the information from a webrev URL (currently only the patch URI).

using git webrev help you get a message like:

usage: git webrev <help|apply|generate> <input>
  help            print a help message
  apply           apply a webrev from a webrev url
  generate        generate a webrev

Pretty much all the other functionality is the same, but I've removed the options from the apply command. It now only takes a url as input.

@openjdk openjdk bot added the outdated label Aug 27, 2019
@JornVernee JornVernee force-pushed the JornVernee:wimport branch from 10fb3d6 to db1e37a Aug 30, 2019
@openjdk openjdk bot removed the outdated label Aug 30, 2019
JornVernee added 2 commits Sep 2, 2019
@edvbld
edvbld approved these changes Sep 5, 2019
Copy link
Member

edvbld left a comment

Great work @JornVernee! Just a few tiny comments that you can fix before integrating 👍

@openjdk openjdk bot removed the rfr label Sep 5, 2019
@openjdk
Copy link

openjdk bot commented Sep 5, 2019

@JornVernee This change can now be integrated. The commit message will be:

Add tool for importing webrev patches

Reviewed-by: ehelin
  • If you would like to add a summary, use the /summary command.
  • To list additional contributors, use the /contributor command.

Since the source branch of this PR was last updated there have been 9 commits pushed to the master branch:

  • 4cf872c: Fix configuration parse error
  • 7cf1b08: 76: Remove the "ready" label from a PR once it is integrated
  • 418efc9: Allow more compact bot configuration
  • 4c6b8ae: Coalesce hunks that have overlapping context
  • ac12500: Allow reverse mapping of census namespace users
  • 2aa4482: Extend the repository configuration syntax
  • a0e7eda: Topological bot tests needs to read org.junit.jupiter.params
  • a2d8107: 8: Use a file based approach for GitToHgConverter.java
  • bb34b1d: Add merge and topological bots to cli image

Since there are no conflicts, your changes will automatically be rebased on top of the above commits when integrating. If you prefer to do this manually, please merge master into your branch first.

  • To integrate this PR with the above commit message, type /integrate in a new comment.
@openjdk openjdk bot added the ready label Sep 5, 2019
@JornVernee
Copy link
Member Author

JornVernee commented Sep 6, 2019

/integrate

@openjdk openjdk bot closed this Sep 6, 2019
@openjdk openjdk bot added the integrated label Sep 6, 2019
@openjdk
Copy link

openjdk bot commented Sep 6, 2019

@JornVernee The following commits have been pushed to master since your change was applied:

Your commit was automatically rebased without conflicts.
Pushed as commit 4a88959.

@openjdk openjdk bot removed the ready label Sep 6, 2019
@JornVernee JornVernee deleted the JornVernee:wimport branch Sep 6, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Linked issues

Successfully merging this pull request may close these issues.

None yet

2 participants
You can’t perform that action at this time.