- Author: Ben Du
- Date: 2020-11-29 21:13:47
- Title: Hands on GitPython
- Slug: hands-on-GitPython
- Category: Computer Science
- Tags: Computer Science, programming, Python, Git, GitPython, version control
- Modified: 2021-05-29 21:13:47


## Tips and Traps

1. GitPython is a wrapper around the `git` command.
    It requires the `git` command to be on the search path in order to work. 
    Also, 
    sometimes it is easier to call the `git` command 
    via `subprocess.run` directly instead of using GitPython.

2. The `git` command (and thus GitPython) accepts URLs both with and without the trailing `.git`. 
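
As a rough illustration of the first tip, calling `git` through `subprocess.run` might look like the sketch below. The helper names `git_available` and `run_git` are made up for this example and are not part of GitPython.

```python
import shutil
import subprocess
from typing import Optional


def git_available() -> bool:
    """Return True if a `git` executable is on the search path."""
    return shutil.which("git") is not None


def run_git(*args: str, cwd: Optional[str] = None) -> str:
    """Run `git <args>` in `cwd` and return its standard output.

    Raises subprocess.CalledProcessError if git exits with a non-zero code.
    """
    proc = subprocess.run(
        ["git", *args],
        cwd=cwd,
        check=True,
        capture_output=True,
        text=True,
    )
    return proc.stdout.strip()
```

For example, `run_git("branch", cwd="/tmp/test_gitpython")` would list the branches of the repository cloned later in this post, assuming `git` is installed.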

In [1]:
!pip3 install GitPython



In [4]:
import git
from git import Repo

In [5]:
url = "https://github.com/dclong/docker-ubuntu_b.git"
dir_local = "/tmp/test_gitpython"

In [6]:
url = "https://github.com/dclong/docker-ubuntu_b"
dir_local = "/tmp/test_gitpython"
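
Since both URL forms are accepted, a script that derives a local directory name from the URL may want to drop the optional `.git` suffix first. A small sketch (the function name `repo_name_from_url` is hypothetical):

```python
def repo_name_from_url(url: str) -> str:
    """Return the last path component of a Git URL without a trailing `.git`."""
    name = url.rstrip("/").rsplit("/", 1)[-1]
    if name.endswith(".git"):
        name = name[: -len(".git")]
    return name


url_with_suffix = "https://github.com/dclong/docker-ubuntu_b.git"
url_without_suffix = "https://github.com/dclong/docker-ubuntu_b"

# Both forms map to the same repository name.
print(repo_name_from_url(url_with_suffix))
print(repo_name_from_url(url_without_suffix))
```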

## Clone a Repository

In [7]:
!rm -rf {dir_local}

In [9]:
repo = git.Repo.clone_from(url, dir_local, branch="main")
repo

<git.repo.base.Repo '/tmp/test_gitpython/.git'>

In [7]:
ls /tmp/test_gitpython/

[0m[01;32mbuild.sh[0m*  Dockerfile  LICENSE  readme.md  [01;34mscripts[0m/


Verify that the GitHub repository has been cloned to the local directory.

In [5]:
!ls {dir_local}

build.sh  readme.md


Clone the local repository to another location 
(which is not very useful as you can directly copy the directory to the new location).

In [13]:
repo2 = Repo(dir_local).clone(f"/tmp/{dir_local}")
repo2

<git.repo.base.Repo '/tmp/test_gitpython/.git'>

In [14]:
!ls /tmp/{dir_local}

build.sh  readme.md


## Information of the Local Repository

In [15]:
heads = repo.heads
heads

[<git.Head "refs/heads/main">]

In [16]:
main = heads.main
main

<git.Head "refs/heads/main">

Get the commit pointed to by the head named `main`.

In [17]:
main.commit

<git.Commit "95ed236bd715a06320ee85d519fb79a0adffe072">

In [18]:
main.rename("main2")

<git.Head "refs/heads/main2">

Verify that the `main` branch has been renamed to `main2`.

In [19]:
!cd {dir_local} && git branch

* main2


### Get the Active Branch

In [5]:
repo.active_branch.name

'main'

### Get All Branches

In [10]:
repo.branches

[<git.Head "refs/heads/main">]

### Get the Remote Name

In [6]:
repo.remote().name

'origin'

### Get All Remotes

In [7]:
repo.remotes

[<git.Remote "origin">]
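
The attributes queried above can be gathered in one place. The following is only a sketch: `describe_repo` is a hypothetical helper that takes an already opened `Repo` object and relies on nothing beyond `active_branch`, `branches` and `remotes`.

```python
def describe_repo(repo) -> dict:
    """Collect branch and remote names from a GitPython `Repo`-like object."""
    try:
        active = repo.active_branch.name
    except TypeError:
        # GitPython raises TypeError here when HEAD is detached.
        active = None
    return {
        "active_branch": active,
        "branches": [head.name for head in repo.branches],
        "remotes": [remote.name for remote in repo.remotes],
    }
```

With the repository from this post, `describe_repo(Repo(dir_local))` would report the active branch together with the branch and remote names shown in the cells above.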

## Commits

Get the latest commit in a branch.

In [23]:
repo.commit("main")

<git.Commit "53d99955a9762427f2f68dc04765471089055dc1">

In [28]:
repo.commit("main").diff(repo.commit("origin/dev"))

[]

In [27]:
repo.commit("origin/dev")

<git.Commit "8f9f426f13d70b21f573f7c50bbe01e8ce38f158">

In [26]:
repo.refs

[<git.Head "refs/heads/main">,
 <git.RemoteReference "refs/remotes/origin/HEAD">,
 <git.RemoteReference "refs/remotes/origin/debian">,
 <git.RemoteReference "refs/remotes/origin/dev">,
 <git.RemoteReference "refs/remotes/origin/main">]

## Changed Files

Update a file.

In [23]:
!echo "# add a line of comment" >> {dir_local}/build.sh

In [24]:
repo = Repo(dir_local)
files_changed = [item.a_path for item in repo.index.diff(None)]
files_changed

['build.sh']

## Staged Files

In [25]:
repo = Repo(dir_local)
index = repo.index

In [26]:
index.add("build.sh")

[(100644, f1cb16a21febd1f69a7a638402dddeb7f1dc9771, 0, build.sh)]

The file `build.sh` is now staged.

In [27]:
files_stage = [item.a_path for item in repo.index.diff('HEAD')]
files_stage

['build.sh']

In [28]:
files_changed = [item.a_path for item in repo.index.diff(None)]
files_changed

[]

Commit the change.

In [29]:
index.commit("update build.sh")

<git.Commit "bfea304786b7b77f7fe247c74040c0e23576fc41">

In [30]:
files_stage = [item.a_path for item in repo.index.diff('HEAD')]
files_stage

[]
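
The checks for changed and staged files can be wrapped into a single helper. This is a sketch only: `working_tree_summary` is a hypothetical name, and the `untracked_files` attribute of `Repo` is an addition that the cells above do not use.

```python
def working_tree_summary(repo) -> dict:
    """Group paths by state for a GitPython `Repo`-like object."""
    return {
        # Index vs. working tree: modified but not yet staged.
        "unstaged": [item.a_path for item in repo.index.diff(None)],
        # Index vs. HEAD: staged but not yet committed.
        "staged": [item.a_path for item in repo.index.diff("HEAD")],
        # Files Git does not track at all.
        "untracked": list(repo.untracked_files),
    }
```

Calling `working_tree_summary(Repo(dir_local))` before staging, after staging and after committing should move `build.sh` from `unstaged` to `staged` and then out of both lists, matching the outputs above.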

In [8]:
remote = repo.remote()
remote

<git.Remote "origin">

## Push the Commits

Push the local `main2` branch to the remote `main2` branch.

In [32]:
remote.push("main2")

[<git.remote.PushInfo at 0x11fd596d0>]

The above is equivalent to the following more detailed specification.

In [62]:
remote.push("refs/heads/main2:refs/heads/main2")

[<git.remote.PushInfo at 0x119903540>]

Push the local `main2` branch to the remote `main` branch.

In [63]:
remote.push("refs/heads/main2:refs/heads/main")

[<git.remote.PushInfo at 0x11992d9a0>]
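
The refspec strings passed to `remote.push` follow the pattern `refs/heads/<local>:refs/heads/<remote>`. A tiny sketch of a builder for them (the name `branch_refspec` is hypothetical):

```python
from typing import Optional


def branch_refspec(local_branch: str, remote_branch: Optional[str] = None) -> str:
    """Build a `<src>:<dst>` refspec for pushing one branch to another.

    If `remote_branch` is omitted, the local branch name is reused.
    """
    if remote_branch is None:
        remote_branch = local_branch
    return f"refs/heads/{local_branch}:refs/heads/{remote_branch}"


print(branch_refspec("main2"))          # refs/heads/main2:refs/heads/main2
print(branch_refspec("main2", "main"))  # refs/heads/main2:refs/heads/main
```

The result can be passed straight to the call shown above, e.g. `remote.push(branch_refspec("main2", "main"))`.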

## Pull a Branch

In [6]:
repo.active_branch

<git.Head "refs/heads/main">

In [11]:
remote.pull(repo.active_branch)

[]

In [12]:
!ls {dir_local}

abc       build.sh  readme.md


## git checkout

In [42]:
help(repo.refs[4].checkout)

Help on method checkout in module git.refs.head:

checkout(force=False, **kwargs) method of git.refs.remote.RemoteReference instance
    Checkout this head by setting the HEAD to this reference, by updating the index
    to reflect the tree we point to and by updating the working tree to reflect
    the latest index.
    
    The command will fail if changed working tree files would be overwritten.
    
    :param force:
        If True, changes to the index and the working tree will be discarded.
        If False, GitCommandError will be raised in that situation.
    
    :param kwargs:
        Additional keyword arguments to be passed to git checkout, i.e.
        b='new_branch' to create a new branch at the given spot.
    
    :return:
        The active branch after the checkout operation, usually self unless
        a new branch has been created.
        If there is no active branch, as the HEAD is now detached, the HEAD
        reference will be returned instead.
    
    :note:

In [5]:
repo.git.checkout?

Signature: repo.git.checkout(*args, **kwargs)
Docstring: <no docstring>
File:      /usr/local/lib/python3.8/site-packages/git/cmd.py
Type:      function


In [6]:
repo.active_branch

<git.Head "refs/heads/dev">

The `force=True` option discards any local changes, whether or not they would block switching branches.

In [12]:
repo.git.checkout("dev", force=True)

'Your branch is ahead of \'origin/dev\' by 1 commit.\n  (use "git push" to publish your local commits)'

In [5]:
repo.git.checkout("main", force=True)

"Your branch is up to date with 'origin/main'."

In [11]:
repo.active_branch

<git.Head "refs/heads/dev">
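
Switching to a branch that may not exist yet can be sketched as below. `checkout_or_create` is a hypothetical helper. It only looks at local heads, so a branch that exists only on the remote would be created from the current `HEAD` rather than from the remote branch.

```python
def checkout_or_create(repo, name: str, force: bool = False) -> str:
    """Check out the local branch `name`, creating it from HEAD if missing.

    Returns the name of the active branch after the checkout.
    """
    existing = {head.name for head in repo.heads}
    if name in existing:
        # Equivalent to `git checkout <name>` (plus `--force` if requested).
        repo.git.checkout(name, force=force)
    else:
        # Equivalent to `git checkout -b <name>`.
        repo.git.checkout(b=name, force=force)
    return repo.active_branch.name
```

Keeping `force=False` as the default is deliberate here, since `force=True` discards local changes.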

## git tag

List all tags.

In [13]:
repo.tags

[]

Add a tag.

In [15]:
repo.create_tag("v1.0.0")

<git.TagReference "refs/tags/v1.0.0">

In [17]:
repo.tags

[<git.TagReference "refs/tags/v1.0.0">]

In [20]:
repo.tag("refs/tags/v1.0.0")

<git.TagReference "refs/tags/v1.0.0">

In [23]:
tag2 = repo.tag("refs/tags/v2.0.0")
tag2

<git.TagReference "refs/tags/v2.0.0">

In [24]:
repo.tags

[<git.TagReference "refs/tags/v1.0.0">]

A `GitCommandError` is raised when the tag already exists.

In [25]:
repo.create_tag("v1.0.0")

GitCommandError: Cmd('git') failed due to: exit code(128)
  cmdline: git tag v1.0.0 HEAD
  stderr: 'fatal: tag 'v1.0.0' already exists'

In [26]:
repo.remote().push("v1.0.0")

[<git.remote.PushInfo at 0x12034acc0>]
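
As the cells above show, `repo.tag("refs/tags/v2.0.0")` returns a `TagReference` object even though `repo.tags` does not list that tag, and `repo.create_tag` fails for a tag that already exists. One way to avoid both surprises is to look the tag up in `repo.tags` first. A minimal sketch, with `ensure_tag` being a hypothetical helper name:

```python
def ensure_tag(repo, name: str):
    """Return the tag called `name`, creating it at HEAD if it is missing."""
    for tag in repo.tags:
        # `tag.name` is the short name, e.g. "v1.0.0".
        if tag.name == name:
            return tag
    return repo.create_tag(name)
```

With this helper, `ensure_tag(repo, "v1.0.0")` can be called repeatedly without triggering the `GitCommandError` shown above.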

## git diff

In [16]:
help(repo.refs[4].commit.diff)

Help on method diff in module git.diff:

diff(other: Union[Type[git.diff.Diffable.Index], Type[ForwardRef('Tree')], object, NoneType, str] = <class 'git.diff.Diffable.Index'>, paths: Union[str, List[str], Tuple[str, ...], NoneType] = None, create_patch: bool = False, **kwargs: Any) -> 'DiffIndex' method of git.objects.commit.Commit instance
    Creates diffs between two items being trees, trees and index or an
    index and the working tree. It will detect renames automatically.
    
    :param other:
        Is the item to compare us with.
        If None, we will be compared to the working tree.
        If Treeish, it will be compared against the respective tree
        If Index ( type ), it will be compared against the index.
        If git.NULL_TREE, it will compare against the empty tree.
        It defaults to Index to assure the method will not by-default fail
        on bare repositories.
    
    :param paths:
        is a list of paths or a single path to limit the diff to.
 

In [3]:
url = "https://github.com/dclong/docker-ubuntu_b.git"
dir_local = "/tmp/" + url[(url.rindex("/") + 1):]
!rm -rf {dir_local}

In [4]:
repo = git.Repo.clone_from(url, dir_local, branch="main")
repo

<git.repo.base.Repo '/tmp/docker-ubuntu_b.git/.git'>

In [25]:
repo.refs

[<git.Head "refs/heads/debian">,
 <git.Head "refs/heads/dev">,
 <git.Head "refs/heads/main">,
 <git.RemoteReference "refs/remotes/origin/HEAD">,
 <git.RemoteReference "refs/remotes/origin/debian">,
 <git.RemoteReference "refs/remotes/origin/dev">,
 <git.RemoteReference "refs/remotes/origin/main">]

In [6]:
diffs = repo.refs[4].commit.diff(repo.refs[3].commit)
diffs

[]

In [21]:
diffs = repo.refs[4].commit.diff(repo.refs[2].commit)
diffs

[<git.diff.Diff at 0x7f0eb05d1a60>, <git.diff.Diff at 0x7f0eb05d1af0>]

In [13]:
str(diffs[0])



In [12]:
repo.refs[5].name

'origin/main'

In [6]:
print(repo.git.status())

On branch main
Your branch is up to date with 'origin/main'.

nothing to commit, working tree clean


In [6]:
repo.git.checkout("debian", force=True)

"Your branch is up to date with 'origin/debian'."

In [8]:
repo.git.checkout(b="a_new_branch", force=True)

''

In [None]:
nima = repo.refs[4].checkout(force=True, b="nima")
nima

In [50]:
diffs = nima.commit.diff(repo.refs[-1].commit)
diffs[0].diff

''

Diff the `dev` and the `main` branches,
which is equivalent to the Git command 
`git diff dev..main`.

In [30]:
repo.refs[2].commit.diff(repo.refs[1].commit)

[]

In [32]:
diffs = repo.refs[2].commit.diff(repo.refs[0].commit)
diffs

[<git.diff.Diff at 0x128ca3a60>]

In [33]:
diffs[0]

<git.diff.Diff at 0x128ca3a60>

In [24]:
diffs = repo.refs[6].commit.diff(repo.refs[7].commit)
diffs

[]

In [25]:
diffs = repo.refs[4].commit.diff(repo.refs[7].commit)
diffs

[<git.diff.Diff at 0x128ca3790>]

In [26]:
diffs[0].diff

''

In [27]:
diffs = repo.refs[7].commit.diff(repo.refs[4].commit)
diffs

[<git.diff.Diff at 0x128ca3820>]

In [28]:
diffs[0].diff

''

In [19]:
any(ele for ele in [''])

False

In [23]:
repo.branches[0].name

'dev'

In [None]:
# Print the name of each local branch.
for branch in repo.branches:
    print(branch.name)

In [12]:
commit = repo.head.commit
commit

<git.Commit "6716bb0d016bd63ba543f3d9c67a65dadecd152e">

In [15]:
type(repo.branches[0])

git.refs.head.Head

In [17]:
repo.refs[4].commit.diff(repo.refs[2].commit)

[]

In [9]:
repo.refs[4].commit.diff(repo.refs[3].commit)

[<git.diff.Diff at 0x127370310>]

In [20]:
help(repo.git.branch)

Help on function <lambda> in module git.cmd:

<lambda> lambda *args, **kwargs



In [28]:
repo.heads

[<git.Head "refs/heads/dev">, <git.Head "refs/heads/main">]

Diff the `debian` and the `main` branches but limit diff to specified paths 
(via the `paths` parameter).

In [24]:
diffs = repo.refs[4].commit.diff(repo.refs[2].commit, paths=["build.sh", "scripts"])
diffs

[]
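
The `diff` attribute printed in several cells above is empty, which appears to be because `create_patch` was left at its default of `False`. To see which files differ, the paths on each `Diff` object are usually enough. A sketch, with `changed_paths` being a hypothetical helper name:

```python
def changed_paths(diffs) -> list:
    """Return the sorted, de-duplicated paths touched by `Diff`-like objects."""
    paths = set()
    for diff in diffs:
        # `a_path` and `b_path` differ for renames; either may be unset.
        for path in (diff.a_path, diff.b_path):
            if path:
                paths.add(path)
    return sorted(paths)
```

For example, `changed_paths(repo.commit("main").diff(repo.commit("origin/dev")))` lists the files that differ between the two commits. Passing `create_patch=True` to `diff` should populate the patch text if it is needed.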

## References

https://github.com/gitpython-developers/GitPython

https://stackoverflow.com/questions/33733453/get-changed-files-using-gitpython

https://stackoverflow.com/questions/31959425/how-to-get-staged-files-using-gitpython

https://gitpython.readthedocs.io/en/stable/tutorial.html#tutorial-label

