Is it possible to checkout the whole history since the branch creation? #266

Closed
leoheck opened this issue May 30, 2020 · 5 comments

Comments

@leoheck

leoheck commented May 30, 2020

Hello, is it possible to checkout the whole history since the branch creation?
If yes, how can I do it?

@leoheck leoheck changed the title from "Question - Is it possible to checkout the whole history since the branch creation?" to "Is it possible to checkout the whole history since the branch creation?" May 30, 2020
@ericsciple
Contributor

If you set the input fetch-depth: 0, all history for all branches and tags will be fetched.

You can then use git log origin/master..my-branch to find all commits in my-branch that aren't in origin/master.
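A minimal sketch of that range query, run against a throwaway repository rather than a real workflow. The repository layout, the branch name my-branch, and the demo commit identity are all made up for illustration; the full local clone only stands in for what fetch-depth: 0 would give you.

```shell
#!/bin/sh
set -eu
# Helper that supplies a throwaway identity so commits work on any machine.
g() { git -c user.name=demo -c user.email=demo@example.com "$@"; }

tmp=$(mktemp -d)
git init -q "$tmp/upstream"
cd "$tmp/upstream"
git symbolic-ref HEAD refs/heads/master   # name the initial branch "master"
g commit -q --allow-empty -m "base 1"
g commit -q --allow-empty -m "base 2"
git checkout -q -b my-branch
g commit -q --allow-empty -m "feature 1"
g commit -q --allow-empty -m "feature 2"

# A full clone has complete history, like a checkout with fetch-depth: 0.
git clone -q "$tmp/upstream" "$tmp/work"
cd "$tmp/work"

# Commits reachable from my-branch but not from origin/master.
git log --oneline origin/master..my-branch
branch_commits=$(git rev-list --count origin/master..my-branch)
echo "commits only on my-branch: $branch_commits"
```

With a shallow checkout the same range can come out wrong or fail, because the commits needed to find where the two branches meet may be missing.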

Does that help? If not, can you provide more information about the scenario?

@leoheck
Author

leoheck commented Jun 1, 2020

I was able to do it yesterday sometime after my question.
Yes, it was useful. Thank you.

Just describing my scenario a little bit better.

I wanted to clone the whole branch history because I want to use it to generate visual diffs between the point where the branch was created and the latest commit of the branch.

So reviewers can have this info easily when they have to review printed circuit boards.

@leoheck leoheck closed this as completed Jun 1, 2020
@ryan-williams

It would be nice to fetch just the commits being proposed in a given pull request. Fetching the entire repo history can be quite slow/expensive.

Git lets you clone just the commits between a branch named branch and upstream master like:

git clone --shallow-exclude master --single-branch --branch branch git@github.com:<org>/<repo>.git
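To try that command without a GitHub remote, here is a rough sketch against a local throwaway repository. The paths, the branch name my-branch, and the demo identity are assumptions for illustration. The file:// URL is used because Git ignores shallow options for plain local-path clones.

```shell
#!/bin/sh
set -eu
# Helper that supplies a throwaway identity so commits work on any machine.
g() { git -c user.name=demo -c user.email=demo@example.com "$@"; }

tmp=$(mktemp -d)
git init -q "$tmp/upstream"
cd "$tmp/upstream"
git symbolic-ref HEAD refs/heads/master   # name the initial branch "master"
g commit -q --allow-empty -m "base 1"
g commit -q --allow-empty -m "base 2"
git checkout -q -b my-branch
g commit -q --allow-empty -m "feature 1"
g commit -q --allow-empty -m "feature 2"

# Clone only what is reachable from my-branch but not from master.
git clone -q --shallow-exclude=master --single-branch --branch my-branch \
    "file://$tmp/upstream" "$tmp/shallow"
cd "$tmp/shallow"

# Every commit in this clone belongs to the branch, so counting HEAD's
# history counts the branch's own commits.
shallow_count=$(git rev-list --count HEAD)
echo "commits fetched: $shallow_count"
```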

Might make sense to reopen this issue to track exposing that functionality in this action.

A related combination of steps I've used to have the PR base and head available (but none of the interceding history) is:

- name: Check out repository
  uses: actions/checkout@v2
  with:
    ref: ${{ github.head_ref }}
- name: Add PR base ref
  run: |
    git fetch --depth=1 origin +refs/heads/${{ github.base_ref }}:refs/remotes/origin/${{ github.base_ref }}

That's enough to do e.g. git diff --name-only <base>..<head> to see which files have changed as part of a PR, but it doesn't know how many or which commits lie between the two endpoints.
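The same two-endpoint idea can be sketched outside of Actions with a local throwaway repository. The branch name feature, the file names, and the demo identity here are hypothetical; in a workflow the branch names would come from github.head_ref and github.base_ref.

```shell
#!/bin/sh
set -eu
# Helper that supplies a throwaway identity so commits work on any machine.
g() { git -c user.name=demo -c user.email=demo@example.com "$@"; }

tmp=$(mktemp -d)
git init -q "$tmp/upstream"
cd "$tmp/upstream"
git symbolic-ref HEAD refs/heads/master   # name the initial branch "master"
echo "hello" > README
git add README
g commit -q -m "base"
git checkout -q -b feature
echo "new" > feature.txt
git add feature.txt
g commit -q -m "add feature.txt"

# Depth-1 clone of the head branch only (--depth implies --single-branch).
git clone -q --depth=1 --branch feature "file://$tmp/upstream" "$tmp/work"
cd "$tmp/work"

# Add just the tip of the base branch as a remote-tracking ref.
git fetch -q --depth=1 origin "+refs/heads/master:refs/remotes/origin/master"

# Both endpoint trees are now present, so a tree-to-tree diff works,
# even though no history between them was fetched.
changed=$(git diff --name-only origin/master..HEAD)
echo "$changed"
```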

It would be nice if actions/checkout streamlined this multi-ref use-case (perhaps in a refs list?), fetching refs in addition to the one that is ultimately checked out.

Some of these common patterns could even merit their own top-level flags (just the PR history, just the PR endpoints, just the PR head without the merge, etc.).

@leoheck
Author

leoheck commented Jun 3, 2020

@ryan-williams Can I use something like your command to initialize submodules behind a private repo? I was not able to achieve this so far.

AbeJellinek added a commit to zotero/translators that referenced this issue Jul 8, 2021
@polarathene

polarathene commented Jun 25, 2022

@leoheck

This will fetch the PR branch commits plus one more commit: the commit on the base branch (eg: master / main) that the PR branched from:

- name: 'PR commits + 1'
  run: echo "PR_FETCH_DEPTH=$(( ${{ github.event.pull_request.commits }} + 1 ))" >> "${GITHUB_ENV}"

- name: 'Checkout PR branch and all PR commits'
  uses: actions/checkout@v3
  with:
    ref: ${{ github.event.pull_request.head.ref }}
    fetch-depth: ${{ env.PR_FETCH_DEPTH }}

NOTE: If the PR has merge commits from the base branch into the PR branch, those are included, and the fetch will follow both parents of each merge commit (the base branch at that point and the PR branch) for the remaining depth.

The ref was set to the PR head ref, which points to the last commit listed for that PR. The default is a test merge-commit that GitHub generates of the PR branch merging into the base branch, which is probably not what you'd want/expect if trying to fetch only the commit history for the PR branch.

Likewise, if you do not need that extra commit, skip the prior step that sets an ENV for fetch-depth and use the github context directly (fetch-depth: ${{ github.event.pull_request.commits }}). The extra commit is useful if you also want to fetch the base branch up to that point, so that there is enough commit history for a merge-base commit; otherwise the fetch may retrieve the entire history of the base branch.
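As a rough local check of the "PR commits + 1" depth, assuming a linear PR branch with no merge commits: the sketch below hard-codes pr_commits=2 where a workflow would read github.event.pull_request.commits, and the repository, branch name, and demo identity are made up.

```shell
#!/bin/sh
set -eu
# Helper that supplies a throwaway identity so commits work on any machine.
g() { git -c user.name=demo -c user.email=demo@example.com "$@"; }

tmp=$(mktemp -d)
git init -q "$tmp/upstream"
cd "$tmp/upstream"
git symbolic-ref HEAD refs/heads/master   # name the initial branch "master"
g commit -q --allow-empty -m "base 1"
g commit -q --allow-empty -m "base 2"
git checkout -q -b my-branch
g commit -q --allow-empty -m "feature 1"
g commit -q --allow-empty -m "feature 2"

pr_commits=2                        # stand-in for the PR commit count
fetch_depth=$((pr_commits + 1))     # one extra for the branch point

git clone -q --depth="$fetch_depth" --branch my-branch \
    "file://$tmp/upstream" "$tmp/work"
cd "$tmp/work"

fetched=$(git rev-list --count HEAD)
oldest=$(git log --format=%s | tail -n 1)
echo "fetched $fetched commits, oldest is: $oldest"
```

In this linear case the oldest fetched commit is the one the branch started from on master.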


@ryan-williams

> Git lets you clone just the commits between a branch branch and upstream master like:
>
> git clone --shallow-exclude master --single-branch --branch branch git@github.com:<org>/<repo>.git

This is quite neat, but I noticed that when a merge commit from the base branch into the PR branch exists in the history, the clone stops at that point. So if that was the last commit in the PR history (eg, a maintainer did this via the GitHub PR web UI button), then you have a single commit. Might not be as useful.

It is neat for finding how many commits the branch has if it was branched off the excluded branch (or that branch was merged into it): git rev-list --count HEAD.


> but it doesn't know how many or which commits lie between the two endpoints.

You can use the github context to get how many commits belong to the PR and fetch that depth + 1, as shown above. A git fetch of the other branch (base) will then identify a common commit between the two and fetch roughly only what is needed (for a merge-base commit), instead of the full history.

When a merge commit from the base branch into the PR branch occurs, the git fetch of the base branch may only pull history to that last merge point, which would also serve as the merge-base point/commit AFAIK. That should be fine. Otherwise, you could tell git fetch to retrieve commits from the base branch since the date of the commit you branched off from using --shallow-since; I explain that in a similar comment I wrote.
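A sketch of the simple case (no merge commits in the PR branch), using a local throwaway repository. It assumes a PR of two commits fetched at depth 3, a base branch that has moved on by one commit since the branch point, and made-up names throughout.

```shell
#!/bin/sh
set -eu
# Helper that supplies a throwaway identity so commits work on any machine.
g() { git -c user.name=demo -c user.email=demo@example.com "$@"; }

tmp=$(mktemp -d)
git init -q "$tmp/upstream"
cd "$tmp/upstream"
git symbolic-ref HEAD refs/heads/master   # name the initial branch "master"
g commit -q --allow-empty -m "base 1"
g commit -q --allow-empty -m "base 2"
branch_point=$(git rev-parse HEAD)        # commit the PR branches from
git checkout -q -b my-branch
g commit -q --allow-empty -m "feature 1"
g commit -q --allow-empty -m "feature 2"
git checkout -q master
g commit -q --allow-empty -m "base 3"     # base branch moves on

# PR commits (2) + 1, so the branch point is part of the shallow clone.
git clone -q --depth=3 --branch my-branch "file://$tmp/upstream" "$tmp/work"
cd "$tmp/work"

# Fetch the base branch without --depth; negotiation stops at commits the
# shallow clone already has.
git fetch -q origin "+refs/heads/master:refs/remotes/origin/master"

merge_base=$(git merge-base origin/master HEAD)
echo "merge base: $(git log -1 --format=%s "$merge_base")"
```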


> A related, combination of steps I've used to have the PR base and head available (but none of the interceding history) is

Just to clarify, the base ref is not the first commit of the PR (nor the commit on the base branch that the PR branched from), it is the ref to the base branch. So the commit you fetch may be newer than the one you branched off from.

There is also the ref ${{ github.event.pull_request.base.sha }}, which is the latest base branch commit at the time the PR was opened (not when it was branched). The exception is when a merge commit from the base branch into the PR branch happened at some later point: base.sha then points to the latest base branch commit at the time of that merge commit (and if that merge commit itself is the latest commit in the PR branch history, it represents the head.sha).

You could instead take the default ref of the generated test merge commit of PR branch into base branch, since the outcome would be the same in your example with the base ref. With fetch-depth: 2, you'll get the merge commit and the two parent commits (1 from each branch involved in the merge).

There is no need to specify the refs then. The merge commit itself is HEAD and the associated commit from the base branch is HEAD^1, so there is no need to fetch it separately (the only real difference is associating a branch ref to it?). Since the commit is checked out, HEAD is shorter to write than the default ref; AFAIK github.ref isn't exactly the correct value for that ref, but github.sha should be the same commit hash as HEAD. Keep in mind that HEAD^1 is the latest commit from the base branch, so it will not necessarily match base.sha (which the PR web UI compares to the commit associated with base.ref to know if the base branch is outdated for the PR).

Example for comparing branches (eg: PR to master, files that have added/modified changes):

- name: 'Checkout PR branch (with test merge-commit)'
  uses: actions/checkout@v3
  with:
    fetch-depth: 2
- name: 'Get a list of changed files to process'
  run: git diff-tree --name-only --diff-filter 'AM' -r HEAD^1 HEAD

As we're comparing two commits with direct before/after diff, this works well.

NOTE: If you attempted to do so with the other commit that was fetched (the HEAD sha/ref commit from the PR branch), you'd not be able to derive a merge-base, and without that you'd potentially get different diff output than the example above.

One scenario of that problem occurs if the base branch has since deleted a file that the PR branch still has in its history (and which the PR branch left alone). A diff between those two commits would suggest that the file has been added by the PR commit. This won't happen with the merge commit diff (which was done with a merge base by GitHub).
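A small reproduction of that scenario, as a sketch. The branch pr-merge is a hypothetical stand-in for the test merge commit GitHub generates, and the file names, branch names, and demo identity are made up.

```shell
#!/bin/sh
set -eu
# Helper that supplies a throwaway identity so commits and merges work.
g() { git -c user.name=demo -c user.email=demo@example.com "$@"; }

tmp=$(mktemp -d)
git init -q "$tmp/upstream"
cd "$tmp/upstream"
git symbolic-ref HEAD refs/heads/master   # name the initial branch "master"
echo keep > keep.txt
echo old > old.txt
git add keep.txt old.txt
g commit -q -m "base"
git checkout -q -b my-branch
echo new > new.txt
git add new.txt
g commit -q -m "add new.txt"
git checkout -q master
git rm -q old.txt
g commit -q -m "delete old.txt on base"

# Stand-in for the generated test merge of the PR branch into the base.
git checkout -q -b pr-merge master
g merge -q --no-ff -m "test merge" my-branch

# Depth 2 gives the merge commit plus one commit from each parent.
git clone -q --depth=2 --branch pr-merge "file://$tmp/upstream" "$tmp/work"
cd "$tmp/work"

# Base tip vs. merge commit: only what the PR really adds or modifies.
merge_diff=$(git diff-tree --name-only --diff-filter=AM -r HEAD^1 HEAD)
# Base tip vs. PR head: old.txt wrongly shows up as added.
endpoint_diff=$(git diff-tree --name-only --diff-filter=AM -r HEAD^1 HEAD^2)

echo "merge commit diff:"; echo "$merge_diff"
echo "two-endpoint diff:"; echo "$endpoint_diff"
```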


> That's enough to do e.g. git diff --name-only <base>..<head> to see which files have changed as part of a PR

This would run into the issue I mentioned comparing the two commits, instead of the base commit with the default test merge-commit. I don't think the range notation .. matters here?

As for -r: it is needed in my example because git diff-tree does not recurse into sub-directories by default. git diff does recurse, so your command would not miss files in sub-directories.
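A quick way to check how -r affects the output, sketched against a throwaway repository with a made-up file src/new.txt:

```shell
#!/bin/sh
set -eu
# Helper that supplies a throwaway identity so commits work on any machine.
g() { git -c user.name=demo -c user.email=demo@example.com "$@"; }

tmp=$(mktemp -d)
git init -q "$tmp/repo"
cd "$tmp/repo"
echo "hello" > README
git add README
g commit -q -m "base"
mkdir src
echo "new" > src/new.txt
git add src/new.txt
g commit -q -m "add src/new.txt"

# Plumbing command without -r: stops at the changed top-level tree entry.
without_r=$(git diff-tree --name-only HEAD~1 HEAD)
# Plumbing command with -r: recurses down to the changed file.
with_r=$(git diff-tree --name-only -r HEAD~1 HEAD)
# Porcelain git diff recurses without any extra flag.
porcelain=$(git diff --name-only HEAD~1 HEAD)

echo "diff-tree:    $without_r"
echo "diff-tree -r: $with_r"
echo "diff:         $porcelain"
```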

The --diff-filter 'AM', if anyone is curious, additionally filters the changed file paths to only those with "Added" (A) or "Modified" (M) statuses (you can see the status via --name-status instead of --name-only). This is useful if you want to parse only files from the PR branch with actual changes, such as for checking that URL links are valid/reachable. You may also be interested in other change statuses, such as files being renamed/moved.


4 participants