STORM-1885. python script for squashing and merging prs. #1468
I'm a little on the fence in terms of squashing the commits of others vs. asking the contributor to do so. There are a lot of situations where spreading out a big patch over multiple commits makes sense and makes the history more consumable. A couple of questions:
Commits can be preserved in the contributor's branch as they see fit, but it doesn't help to have them in the main repo. The contributor knows what those commits mean, but everyone else has no knowledge of the individual commits or why they were made. "How does this preserve authorship in a pull request that has commits from multiple authors?" "How would this work with our current branch model? Specifically, applying a pull request to multiple branches"
That means it removes authorship information. If we tag a squashed commit as coming from multiple authors, we still wouldn't be able to differentiate what code was contributed by the individual authors. So if I merged a pull request with multiple authors, the result would be a single commit from me with a message listing the contributing authors, is that correct?
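To make the question concrete, here is a rough sketch of the kind of squashed commit message being discussed, assuming the tool records contributors as one line each. The function name `squash_message` and the `Author:` / `Closes` line formats are hypothetical illustrations, not the script's actual output.

```python
def squash_message(title, body, authors, pr_number):
    """Build a commit message for a squashed pull request.

    All contributing authors are listed, but nothing in the message says
    which lines of the change each of them wrote.
    """
    lines = [title, ""]
    if body:
        lines += [body.strip(), ""]
    # One line per contributor, duplicates dropped, first-seen order kept.
    for author in dict.fromkeys(authors):
        lines.append("Author: " + author)
    lines.append("Closes #%d" % pr_number)
    return "\n".join(lines)
```

With two contributors the message carries two `Author:` lines, which shows the limitation raised above: attribution survives at the pull-request level only, not per change.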
"That means it removes authorship information. If we tag a squashed commit as coming from multiple authors, we still wouldn't be able to differentiate what code was contributed by the individual authors." "So if I merged a pull request with multiple authors, the result would be a single commit from me with a message listing the contributing authors, is that correct?" |
# TODO Introduce a convention as this is too brittle
RELEASE_BRANCH_PREFIX = "0."

DEV_BRANCH_NAME = "trunk"
trunk is not used for the Storm project.
@HeartSaVioR I'm not aware of the Spark script. I am OK with using either one, or with making this one work as we need.
Actually, I was the one who asked for separate credit in another project (OpenTSDB/asynchbase#122). But there was a strong reason to do so, and I don't think it's a case we will see often. As I said on the mailing list, many big Apache projects already use this approach. If there are cases where squashing really hurts, we can treat them as exceptions.
@harshach Yeah, I don't know what Kafka improved over the Spark script, so I wanted to hear the benefit if you know of it. As I commented earlier, just adopting the script doesn't work since we use a different branch model (master, minor, bugfix), so it should be fully tested (including JIRA integration) before we adopt it.
@HeartSaVioR I already ran some simple tests. I like it because it allows us to tag the reviewers and additional committers in the message, and the commit can be picked onto other branches as well. It works with JIRA as well.
@harshach FYI: the Spark script has the same issue.
From a legal perspective it's very important that we be able to track the provenance of all code that lands in an ASF repository and could potentially be released. For example: Bob is a committer. Alice and Charles are not. Alice and Charles collaborate on a patch, both making commits. In the process Charles commits some code that he doesn't have the legal rights to (it's proprietary, etc.). Later Bob uses this script to merge the pull request and squash all the commits. Alice and Charles are listed as authors of the patch, but there is no history showing how the code that the ASF doesn't have rights to got there. Was it Charles or Alice? That may seem like an edge case, but it is one that we should absolutely consider.
@harshach The source of the file is referenced here: I'd like to get clearance that what this script does or enables is okay before proceeding.
@ptgoetz Makes sense. We can make it an explicit rule not to merge PRs with this tool if the original PR has commits from multiple authors, and the check can also be built into the tool so that it refuses to proceed in that case.
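A minimal sketch of such a guard, assuming the tool can already obtain the author identity of each commit in the pull request (for instance from `git log --format='%an <%ae>'`). The function names `distinct_authors` and `can_squash` are hypothetical and are not part of the script under review.

```python
def distinct_authors(commit_authors):
    """Return the unique author identities, in first-seen order.

    `commit_authors` is a list of "Name <email>" strings, one per commit.
    Comparison is case-insensitive on the whole string; this is an
    assumption, and a real tool might compare e-mail addresses only.
    """
    seen = {}
    for author in commit_authors:
        key = author.strip().lower()
        if key not in seen:
            seen[key] = author.strip()
    return list(seen.values())


def can_squash(commit_authors):
    """Allow squashing only when every commit has the same single author."""
    return len(distinct_authors(commit_authors)) == 1
```

A merge tool could call `can_squash` before squashing and fall back to a regular merge (or stop and ask the committer) when it returns `False`, so per-commit authorship is never collapsed silently.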
I'd really like to go forward with automated tools for developers / committers. As I stated on the dev@ mailing list, many projects already use specific tools for merging, and the merge script that originated in Spark is widely used by Spark, Kafka, and Zeppelin (now a TLP).
I'll take a deep look and describe what this script actually does.
Here's my understanding regarding this script.
We may have to modify many parts of the script since...
So without arranging our branch policy and merging step, it will be hard to get merge script fit for us.
@HeartSaVioR Thanks for documenting the script. "Commit message will contain body of pull request which is free format for now and tends to be meaningless for commit message." "So without arranging our branch policy and merging step, it will be hard to get merge script fit for us."
-1 I'm not sure I see the benefit in adding another script, which we will have to maintain, in order to do something we should rarely be doing. Also, I worry that having this script will lead to a sharp rise in totally-squashed PR merges, even when there's not really any benefit (and in fact, loss of authorship info) since some people are likely just going to use the script whenever they're doing a merge.
I'll second @knusbaum's -1, based on points I made earlier. This has the potential to automatically destroy code provenance, especially if more than one contributor is involved in a pull request. From a legal perspective, the ASF needs to be able to determine the origin of all code changes. I would recommend that any project that uses a variant of this script double-check with ASF legal that it is okay. I could be totally wrong, but I'd rather play it safe. I'd also argue that a well-thought-out series of commits can make reviews easier, for example by separating Maven build changes from code changes. I'd rather we encourage devs to think out the partitioning of commits themselves and squash appropriately if necessary.
I'm totally +1 on this approach, though I think the script should be modified to fit Storm's project style.

As I said on the dev@ mailing list, I have been reviewing and merging pending pull requests for weeks and months, and it was painful enough to merge and port back to each branch even though I skipped cleaning up commits. (The pain is amplified when a tiny commit has to be merged to all branches.) Cleaning up commits before merging would be even more painful. The CHANGELOG tends to get out of sync among branches, but we have to write it manually because it's hard to filter merge commits to see the change list. (We could rely on JIRA issues as an alternative.)

Regarding commits, I don't want to keep commits like 'kicking travis', 'address review comments', etc., which are not helpful in any case. In my last 2 years of Storm development, I haven't made use of individual commits. If something is wrong with a recent merge, we roll back the merge, not an individual commit.

Squashing commits is a widely used strategy with a track record in many big projects. Even GitHub provides a squash merge mode (and recently a rebase mode too) in its GUI.

If we want to merge a squashed commit, the squashing should be done in the merging process, not the reviewing process. For me, the ideal review process should be contributor-friendly. Just as we can't devote all our effort to maintaining the Storm project (by reviewing pull requests, etc.), neither can contributors. Once we have a script that also squashes the commits, we don't need to bother contributors with rebasing and squashing. Otherwise, every contributor, including us, has to do it right after the merger says 'please rebase and squash in order to merge', which is a bother for mergers too. Moreover, the contributor may be busy at that moment, and the pull request goes stale. Pull requests that need upmerging are one such case.

I understand and agree with the authorship issue, but we can treat it as an exceptional case. Many pull requests are authored by one person. Let's make the merging phase painless, or at least less painful.
I'll also point out that the "if other Apache projects do it, it is okay" stance is particularly dangerous. PMC members must understand ASF policy and not rely on what other projects do. If what another project does is wrong, then doing the same thing in our project just introduces liability.
I'm okay with automating the merge process, just not the way it is implemented here. Perhaps we should move the discussion to the dev list.
@knusbaum Having to maintain the script shouldn't keep us from adopting a better approach. It's another piece of code that we need to maintain, just like the entire repo that we are maintaining right now.
We are closing stale Pull Requests to make the list more manageable. Please re-open any Pull Request that has been closed in error. Closes apache#608 Closes apache#639 Closes apache#640 Closes apache#648 Closes apache#662 Closes apache#668 Closes apache#692 Closes apache#705 Closes apache#724 Closes apache#728 Closes apache#730 Closes apache#753 Closes apache#803 Closes apache#854 Closes apache#922 Closes apache#986 Closes apache#992 Closes apache#1019 Closes apache#1040 Closes apache#1041 Closes apache#1043 Closes apache#1046 Closes apache#1051 Closes apache#1078 Closes apache#1146 Closes apache#1164 Closes apache#1165 Closes apache#1178 Closes apache#1213 Closes apache#1225 Closes apache#1258 Closes apache#1259 Closes apache#1268 Closes apache#1272 Closes apache#1277 Closes apache#1278 Closes apache#1288 Closes apache#1296 Closes apache#1328 Closes apache#1342 Closes apache#1353 Closes apache#1370 Closes apache#1376 Closes apache#1391 Closes apache#1395 Closes apache#1399 Closes apache#1406 Closes apache#1410 Closes apache#1422 Closes apache#1427 Closes apache#1443 Closes apache#1462 Closes apache#1468 Closes apache#1483 Closes apache#1506 Closes apache#1509 Closes apache#1515 Closes apache#1520 Closes apache#1521 Closes apache#1525 Closes apache#1527 Closes apache#1544 Closes apache#1550 Closes apache#1566 Closes apache#1569 Closes apache#1570 Closes apache#1575 Closes apache#1580 Closes apache#1584 Closes apache#1591 Closes apache#1600 Closes apache#1611 Closes apache#1613 Closes apache#1639 Closes apache#1703 Closes apache#1711 Closes apache#1719 Closes apache#1737 Closes apache#1760 Closes apache#1767 Closes apache#1768 Closes apache#1785 Closes apache#1799 Closes apache#1822 Closes apache#1824 Closes apache#1844 Closes apache#1874 Closes apache#1918 Closes apache#1928 Closes apache#1937 Closes apache#1942 Closes apache#1951 Closes apache#1957 Closes apache#1963 Closes apache#1964 Closes apache#1965 Closes apache#1967 Closes apache#1968 Closes apache#1971 
Closes apache#1985 Closes apache#1986 Closes apache#1998 Closes apache#2031 Closes apache#2032 Closes apache#2071 Closes apache#2076 Closes apache#2108 Closes apache#2119 Closes apache#2128 Closes apache#2142 Closes apache#2174 Closes apache#2206 Closes apache#2297 Closes apache#2322 Closes apache#2332 Closes apache#2341 Closes apache#2377 Closes apache#2414 Closes apache#2469