Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Feature: better selection of base ref branch #532

Merged
merged 9 commits into from
Sep 8, 2017

Conversation

asfaltboy
Copy link
Member

@asfaltboy asfaltboy commented Oct 28, 2016

Before, we used a base ref that was specified in settings, otherwise defaulting to hardcoded "master".

Now, we find the nearest common ancestor branch and use that as our default base, still allowing override in settings.

Fixes #499


The maintainers of this repo require that all pull request submitters agree and adhere to the following:

  • I have read and agree to the contribution guidelines.
    (required)
  • All related documentation has been updated to reflect the changes made. (required)
  • My commit messages are cleaned up and ready to merge. (required)

The maintainers of this repository require you to select the semantic version type that
the changes in this pull request represent. Please select one of the following:

  • major
  • minor
  • patch
  • documentation only

@randy3k
Copy link
Collaborator

randy3k commented Oct 28, 2016

Shall we somehow combine github integrated branch name with this?

It seems a bit inconsistant that we are using git config to set github remote branch while using project settings to set the rebase base.

Edit

To clarify, by combining, I meant to use the same mechanism to store the branch information.

@asfaltboy
Copy link
Member Author

We're not setting the base in project settings, the base is calculated on every render of the view.

The project settings are for getting a default override, which prompted me to suggest #531 , since I believe this should be in (global) default/user settings too.

As for github integrated branch name, I'm not sure there's a better way to set settings persistently in sublime.

@randy3k
Copy link
Collaborator

randy3k commented Oct 28, 2016

The github integrate remote is also computed every time. git config is used to store the default.

My question is actually how we should store the default settings per project, using git config or sublime-settings or sublime-project?

@randy3k
Copy link
Collaborator

randy3k commented Oct 28, 2016

Anyway, this PR looks good to me.

@randy3k
Copy link
Collaborator

randy3k commented Oct 29, 2016

Ah....

The git show-branch trick does not work if that a branch has merged into the current branch.
Your code will pick the branch subfeature, but subfeature has been merge into the current branch (feature).

* [feature] Merge branch 'subfeature' into feature
 ! [master] master new commit
  ! [subfeature] subfeature commit
---
 +  [master] master new commit
-   [feature] Merge branch 'subfeature' into feature
* + [subfeature] subfeature commit
*++ [master^] before creating feature branch

@asfaltboy
Copy link
Member Author

I'm not sure there's a way to intelligently decide for the user which of the common branches he should rebase on. As far as I know Git doesn't store the original "branching-out" node. In other words, the user could have easily merged "master" into "feature" and it would detect that. If they do merge a sub-feature branch, deleting the merged branch will allow them to rebase on the next detected base.

On the up side, it's still better than what we had (hardcoded "master"), and it can still be overridden in settings if they prefer to keep "base-ref" static.

@asfaltboy
Copy link
Member Author

asfaltboy commented Oct 29, 2016

Another option, would be to find the oldest common ancestor, rather than the nearest, it would look something like:

git branch --contains `diff -u <(git rev-list --first-parent subfeature) \
    <(git rev-list --first-parent master) | sed -ne 's/^ //p' | head -1` --no-merged

Update: Originally taken from here. But, it feels to me, that there's still no "right" answer for all scenarios. It's possible to ignore any merges and this may bring the user closer to true oldest ancestor node.

@asfaltboy
Copy link
Member Author

@randy3k can you try the command in the above comment on your test case. If it would indicate it's better for you, maybe we can allow both options with a toggle?

@randy3k
Copy link
Collaborator

randy3k commented Oct 29, 2016

The above returns master for me, so it still needs to hard code master? I have no idea how the above works....

@asfaltboy
Copy link
Member Author

Sorry I copy-pasted hastily, I probably needed to update the branches like so:

git branch --contains `diff -u <(git rev-list --first-parent feature) \
<(git rev-list --first-parent subfeature) | sed -ne 's/^ //p' | head -1` --no-merged

# or even
git branch --contains `git merge-base feature subfeature` --no-merged

Here is how I understand this works, hope it helps:

So, the current base detection (in this PR) works by getting all relative branches, then simply picks the first of those branches, and if it's different than self, returns it.

However, with the above suggested solution, we would not stop at "nearest", rather we would then continue and compare current branch to the potential base branch, looking for the node these trees met (oldest common commit in rev-list). Then check if there is a branch that has this commit and is unmerged, and use that branch as base ref.

@randy3k
Copy link
Collaborator

randy3k commented Oct 29, 2016

The former one returns master as expected and the latter one returns nothing...

How about this?

for s in $(git rev-list --first-parent feature); do 
    branches=`git branch --contains $s --no-merged`
    [ -n "$branches" ] && echo $branches && break
done

Note: The above may return more than one branch.

@asfaltboy
Copy link
Member Author

The loops works nicely on my feature branches, but not at all on my master branch, for those it returns a single feature branch (unmerged!), which would be wrong as a base ref.

Correct me if I'm wrong but, the only issue we currently have with the show-branch solution above is if we have many local feature branches that share ancestry with current branch, then we may choose those instead of their parents (since children appear before their parents in show-branch).

I think I'll add the filter (earliest common no-merge branches) suggested earlier, and see how that works for us.

I still don't think there is a "one size fits all" solution, so we may also want to research a bit more of how others do it.

@randy3k
Copy link
Collaborator

randy3k commented Oct 30, 2016

Sounds a good idea. It would be also handy to preselect the corresponding branch in GsRebaseOnTopOfCommand.

@asfaltboy asfaltboy changed the title Feature: better selection of base ref branch [WIP] Feature: better selection of base ref branch Oct 30, 2016
@asfaltboy asfaltboy self-assigned this Oct 30, 2016
@asfaltboy asfaltboy force-pushed the feature/better-base-ref-detection branch 2 times, most recently from ff90635 to c5e13d4 Compare October 30, 2016 10:06
@asfaltboy
Copy link
Member Author

Ok ready for review guys! This was a bit troublesome and I want to avoid adding more functionality in this PR. And to be clear, this is by no means a perfect solution, there may still be cases where the detected base is clearly not what the users wants.

Here's some stuff we need to address:

  1. Refreshing: We currently refresh the base only when building the view from scratch again. We could call base_ref(reset_ref=True) in pre_render, only when a ref is already set, to make sure we have the right ref for the selected branch. This however, will overwrite any custom base ref, meaning GsRebaseDefineBaseRefCommand will not work. To address this, we need to change the view setting this command uses, e.g to user_defined_base_ref and have it take precedence over the auto-detected / defined in settings. This will make refreshing base in pre_render "safe"
  2. The base should be used as the pre-selected branch in GsRebaseOnTopOfCommand, and other commands of similar nature.
  3. A setting can be added to toggle auto-detection of base, though this can already be done by the rebase_default_base_ref project setting.

max_iterations = 100
for relative in relatives:
util.debug.add_to_log('nearest_branch: Getting common commits with %s' % relative)
relative_commits = self.git("rev-list", "--first-parent", relative).splitlines()
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You may set the max iterations by self.git("rev-list", "-{}".format(max_iterations), "--first-parent", relative)


if not base_ref:
# use remote tracking branch as a sane default
remote_branch = self._get_branch_status_components()[3]
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

self.get_active_remote_branch().name_with_remote may be better

@@ -284,6 +296,95 @@ def base_ref(self):

return base_ref

def nearest_branch(self, branch, default="master"):
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's probably worth breaking this out into a helper method, since there's some complexity here and it isn't strictly view-related.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you mean into it's own helper class? Initially I had this in BranchesMixin since it's basically a branch detection mechanism, then I moved it here since felt it's very much related to rebase interface.

But, perhaps it's better if we move this into it's own mixin module: git_mixins.rebase to potentially be used by other (rebase) commands. How do you feel about this?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep, moving to git_mixins.rebase sounds exactly right.

except GitSavvyError:
return default

util.debug.add_to_log('nearest_branch: found %s show-branch results' %
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: Prefer "... {}".format(val) over "... %s" % valfor consistency.

(len(relatives), relatives))

if not relatives:
util.debug.add_to_log('No relatives found. Possibly on a root branch!')
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like this pattern of verbose debug logging!

util.debug.add_to_log('nearest_branch: Getting common commits with %s' % relative)
relative_commits = self.git("rev-list", "--first-parent", relative).splitlines()
common = None
for i, l in enumerate(diff.compare(branch_commits, relative_commits)):
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Prefer descriptive variable names here.

if branch == nearest:
return default
return nearest

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a huge function. Would be great if it could be decomposed somehow, but its okay if that would add extra complexity or damage readability.

@divmain
Copy link
Collaborator

divmain commented Oct 31, 2016

Oh, and last comment. Would like to make sure this doesn't adversely effect performance too much. GitSavvy can already be slow, particularly on Windows machines without SSD. I'd like to avoid making it worse if possible. I have an older machine that I pull out for things like this, if you would like me to check this.

@asfaltboy
Copy link
Member Author

@divmain It does add a tiny bit of overhead when opening rebase dashboard, but it's still surprisingly fast on my machine. I would appreciate if we could run this on an old machine! We'd also need a code base with branches that have diverged significantly, to test further.

Other than that, I'd have to use this a bit more to test out more edge cases.

@randy3k
Copy link
Collaborator

randy3k commented Nov 3, 2016

Also related to #369

#369 suggests using remote tracking branch as the base. Perhaps we can use it as a fallback instead of master.

@asfaltboy
Copy link
Member Author

It's already the default passed by the interface to the helper method. I.e remote_branch or "master"

@divmain
Copy link
Collaborator

divmain commented Dec 16, 2016

@asfaltboy, let me know when all review comments have been addressed and I'll pull this down onto an old laptop :)

@stoivo
Copy link
Member

stoivo commented Aug 31, 2017

@randy3k still looking at this?

@randy3k
Copy link
Collaborator

randy3k commented Aug 31, 2017

Hmmm, I could work on this if @asfaltboy agrees.

@stoivo
Copy link
Member

stoivo commented Aug 31, 2017

Ops, i though I saw you opening this.
Would be cool if we could close some of the long lasting PR

@asfaltboy
Copy link
Member Author

I keep postponing my duties, too much work during the week and weekends are busy. I'll try to address the concerns when able

@randy3k
Copy link
Collaborator

randy3k commented Sep 1, 2017

No problem. Everyone has a real life.

@asfaltboy asfaltboy force-pushed the feature/better-base-ref-detection branch from c5e13d4 to 2fcec66 Compare September 2, 2017 11:06
@asfaltboy asfaltboy changed the title [WIP] Feature: better selection of base ref branch Feature: better selection of base ref branch Sep 2, 2017
@asfaltboy asfaltboy changed the base branch from master to dev September 2, 2017 11:16

if not base_ref:
# use remote tracking branch as a sane default
remote_branch = self.get_active_remote_branch().name_with_remote
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

self.get_active_remote_branch() could be None if there is no remote tracking, you probably only need get_upstream_for_active_branch().

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have fixed it.

@randy3k
Copy link
Collaborator

randy3k commented Sep 3, 2017

It would be more convenient if GsRebaseDefineBaseRefCommand and GsRebaseOnTopOfCommand will preselect the chosen base_ref.

project_data = sublime.active_window().project_data() or {}
project_settings = project_data.get('settings', {})
base_ref = project_settings.get("rebase_default_base_ref", "master")
base_ref = project_settings.get("rebase_default_base_ref")
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is irrelevant to this PR, but I would also like to discuss about this setting rebase_default_base_ref. It seems it is the only place we use a project setting while we are using git config to set GitHub integration repo and branch.

I just want to make sure there will be a consistent way for storing user settings. So which one shall we use if there is such a need? A: git config or B: sublime text project settings.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I personally don't think "sublime-specific settings" belong in git config ...

My project-settings files are synced to dropbox, and I'd like to avoid defining the override ref again for every new project when I switch machines. Then again, this feature kind of makes this a non-issue.

However, now that you mentioned the Github settings, maybe it could be good to store them in project settings too, unless it makes sense to use these outside of GitSavvy?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The sublime-project file might be shared within a team (though it is no likely), the git config is not. It might be one of the reasons why historically we were using git config rather than project settings.

Though, I agree that those information should not be stored at git config neither.

@asfaltboy
Copy link
Member Author

@randy3k thanks a lot for the review, and the fixes that follow!

Unless @divmain, or anyone with a slow/old/windows machine, wants to profile the rebase interface with this PR, we can go ahead and merge it into develop.

asfaltboy and others added 9 commits September 6, 2017 22:02
Before, we used a base ref that was specified in settings, otherwise defaulting to hardcoded "master".

Now, we find the nearest common ancestor branch (after some filtering) and use that as our default base, still allowing users to override it in settings.
- use .get_active_remote_branch().name_with_remote for clarity
- be consistant with string formatting in log calls
- limit to max revision count in `git rev-list` call
- rename max_iterations to max_revisions for clarity
This avoids wrongly using commit message parts that include strings
in square brackets as branches.

For example before this fix, the following commit would return WIP
as branch name: `[develo] [WIP] Fix: some bug`
- As suggested by @divmain the method was moved into a separate
  mixin for reuse, as well as, splitted into smaller parts

- Use `git branch --merged` instead of `--no-merged. This fixes a
  bug where the most obvious branches are not returned since no-merged
  means that only branches that are "unreachable" from given node
  are returned, and we actually want the opposite.
@asfaltboy asfaltboy force-pushed the feature/better-base-ref-detection branch from ba3553c to ff8bcb1 Compare September 6, 2017 20:04
@asfaltboy asfaltboy merged commit 23e9cb7 into dev Sep 8, 2017
@asfaltboy asfaltboy deleted the feature/better-base-ref-detection branch November 13, 2017 07:46
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants