Mini GitHub Tutorial

Mihai Surdeanu edited this page Feb 5, 2024 · 3 revisions

How to Use GitHub for Processors

  1. Read a Git tutorial. There are many good ones on the web. The trickiest thing with Git is understanding branches, and this tutorial might be helpful: http://git-scm.com/book/en/Git-Branching-Branching-Workflows. For a more detailed tutorial, if you really want to understand how things work (although the instructions below might be sufficient for most things), see: http://ftp.newartisans.com/pub/git.from.bottom.up.pdf.

    If you want to push any changes you are going to make back to the origin (GitHub), you will need a GitHub account. The account will come with a username and password which used to be sufficient credentials for a push. As of 13 Aug. 2021, simple password authentication is no longer supported by GitHub. A personal access token is used instead so that you don't have to type your real password into all sorts of git programs. Generate a token to use from your GitHub account under Settings / Developer settings / Personal access tokens. Use the token as the password when git requests one.

  2. To get a copy of processors and start working with git, please follow these steps:

git clone https://github.com/clulab/processors.git

Similar instructions exist for all other projects hosted in GitHub.

  3. By default you will be in the main (or master in older projects) branch. To see what local branches you have, and which one you are in, type (the branch marked with "*" is the branch you are in):
git branch
  4. Please work from your own branch for every major change (see next section for best practices). To create a new local branch, type:
git checkout -b my-new-branch
  5. After you make some changes, you can check the status of your files with git status. To add a new file and then commit it, use the instructions below. Note that, if your commit messages are very verbose (and they should be!), it is best to use plain git commit without the -m option: this will open a separate editor where you can type your commit message.
git add my-new-file
git commit -m "Description of my first commit"

You might find these tips on good commit messages helpful.

  6. You have to publish the changes in your branch to a remote branch, so that other people can see your contributions. You do this by:
git push origin my-new-branch
  7. When your changes are ready to be merged into the main branch, check with your advisor about which protocol to use. For some projects that are more sensitive (e.g., processors or reach), merges into the main branch are handled using pull requests. Follow step 8 if you are in this situation. For other projects, merging into main may be directly allowed. In this situation, skip step 8 and follow step 9.

  8. If your supervisor indicated that your contributions must be merged into the main branch using a pull request, please follow the instructions on this page to initiate one: https://help.github.com/articles/using-pull-requests/

  9. If your supervisor allows direct merging into the main branch, do:

    git checkout main (which switches to your local main branch)

    git pull origin main (which updates the local main with changes published in the remote main)

    git merge my-new-branch (merge your local branch into local main. Resolve any conflicts reported by git here, if necessary.)

    git commit -a (commit the merged main locally. Summarize the changes made in your branch in the commit comments. Note: this step is not necessary if the merge above completed successfully without any manual editing.)

    git push origin main (push the updated local main to remote main)
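The steps above can be rehearsed safely before touching a real project. The following is a minimal sketch, not part of the processors workflow itself: it assumes a throwaway directory and a local bare repository standing in for GitHub as origin, and the file contents and identity are placeholders.

```shell
#!/usr/bin/env bash
# Sandbox walk-through of steps 2-9. Nothing here touches a real remote:
# "origin" is a local bare repository (an assumption made for practice only).
set -euo pipefail

sandbox="$(mktemp -d)"
cd "$sandbox"

# Build the stand-in remote with a single commit on main.
git init -q seed
cd seed
git symbolic-ref HEAD refs/heads/main      # name the first branch "main" on any Git version
git config user.email "you@example.com"    # placeholder identity
git config user.name "Your Name"
echo "hello" > README.txt
git add README.txt
git commit -q -m "Initial commit"
cd ..
git clone -q --bare seed origin.git

# Step 2: clone the project.
git clone -q origin.git work
cd work
git config user.email "you@example.com"
git config user.name "Your Name"

# Step 3: list local branches.
git branch

# Step 4: create and switch to a new branch.
git checkout -q -b my-new-branch

# Step 5: add a file, inspect the status, and commit.
echo "new idea" > my-new-file
git status --short
git add my-new-file
git commit -q -m "Description of my first commit"

# Step 6: publish the branch.
git push -q origin my-new-branch

# Step 9: merge directly into main and publish main.
git checkout -q main
git pull -q origin main
git merge -q my-new-branch                 # fast-forward here, so no extra commit is needed
git push -q origin main

git log --oneline main
```

Because main did not change while the branch was being developed, the merge is a fast-forward and the git commit -a step is not needed, as noted in step 9.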

General Best-practice Rules for GitHub

  • It is important to keep the remote main (origin main) clean. That means, do not merge your branch back into main if it's not in a stable state. Unit tests must always pass in the main branch!

  • It is important to create a separate branch for every major change (i.e., research idea) you make. Merge branches back into main as soon as possible, but only when: (a) the idea was proven to work, and (b) the branch is in a stable state (see previous rule).

  • Do not create too many levels of branching, that is, branches spun off from another branch. Keep your branch structure as flat as possible, ideally a one-level tree where every branch is a child of "main".

  • Use your remote branch (step 6 in the previous section) as a place to back up your changes before merging back into main.
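One way to follow the first rule is to gate merges on the test suite. The helper below is a hypothetical sketch: merge_if_green is not a Git command, and the test command is whatever your project uses (for an sbt project that might be sbt test, which is an assumption here). The demo runs in a throwaway repository and uses true and false as stand-in test commands.

```shell
#!/usr/bin/env bash
set -euo pipefail

# merge_if_green BRANCH TEST_COMMAND...
# Hypothetical helper: runs TEST_COMMAND on BRANCH and merges BRANCH into
# the local main only if the command succeeds.
merge_if_green() {
  local branch="$1"
  shift
  git checkout -q "$branch" || return 1
  if ! "$@"; then
    echo "Tests failed on $branch; main left untouched." >&2
    return 1
  fi
  git checkout -q main && git merge -q --no-edit "$branch"
}

# Demo in a throwaway repository (placeholder identity and file names).
sandbox="$(mktemp -d)"
cd "$sandbox"
git init -q .
git symbolic-ref HEAD refs/heads/main
git config user.email "you@example.com"
git config user.name "Your Name"
echo "base" > base.txt
git add base.txt
git commit -q -m "Base"

# One branch per idea, each a direct child of main.
git checkout -q -b working-idea
echo "ok" > idea.txt
git add idea.txt
git commit -q -m "Working idea"

git checkout -q main
git checkout -q -b broken-idea
echo "bad" > broken.txt
git add broken.txt
git commit -q -m "Broken idea"
git checkout -q main

merge_if_green working-idea true                      # stand-in for a passing test suite
merge_if_green broken-idea false || echo "broken-idea was not merged"

git checkout -q main
git log --oneline main
```

The sketch tests the branch before merging. If main has moved on since the branch was created, merging main into the branch first and re-running the tests gives a closer check of what main will look like afterwards.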

Useful Tips

  • To save some typing, edit your ~/.gitconfig file and add aliases, e.g.:
[user]
  email = your-email@yourdomain.com
  name  = Your Name
[alias]
  st = status
  ci = commit
  br = branch
  co = checkout
  ls = ls-files
  discard = checkout -- .
[push]
  default = upstream
  • It is very useful to display your current working branch in your Bash prompt and to have tab completion for Git commands and branch names (for *nix and Mac folk). To load the completion script in your current shell:
# first, copy git-completion.bash (https://github.com/git/git/blob/master/contrib/completion/git-completion.bash) to ~/.git-completion.bash
source ~/.git-completion.bash

To ensure the completions are available each time you log in, add the following to your ~/.bashrc:

# bash-completion
if [ -f ~/.git-completion.bash ]; then
  source ~/.git-completion.bash
fi

Similar completions are available for other shells (e.g., zsh) as well: https://github.com/git/git/blob/master/contrib/completion/
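The completion script does not by itself change the prompt. A minimal sketch of showing the current branch in the Bash prompt, using only plain git commands, could look like this. The function name current_git_branch and the prompt layout are illustrative choices, not something taken from the completion script.

```shell
# Sketch for ~/.bashrc: show the current Git branch in the prompt.
# current_git_branch is a hypothetical helper name chosen for this example.
current_git_branch() {
  local branch
  # symbolic-ref fails outside a repository or on a detached HEAD;
  # in both cases print nothing and leave the prompt unchanged.
  branch="$(git symbolic-ref --quiet --short HEAD 2>/dev/null)" || return 0
  printf ' (%s)' "$branch"
}

# Single quotes defer the call so the branch is re-read every time the prompt is drawn.
PS1='\u@\h:\w$(current_git_branch)\$ '
```

Git also ships a fuller prompt helper, git-prompt.sh with its __git_ps1 function, in the same contrib/completion directory, which may be preferable if you want more than the branch name.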

  • If you messed up the files in your current branch, have not staged or committed the changes, and just want to revert to the last check-in, use:

    git checkout -- . or (if you added the alias above)

    git discard
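To see exactly what this command discards, it can be tried in a throwaway repository first. The sketch below assumes a temporary directory and placeholder file names; it shows that edits to tracked files are reverted while untracked files are left alone. Note that git checkout -- . restores files from the index, so changes already staged with git add are not reverted by it.

```shell
#!/usr/bin/env bash
# Sandbox demo of discarding uncommitted edits (placeholder names throughout).
set -euo pipefail

sandbox="$(mktemp -d)"
cd "$sandbox"
git init -q .
git config user.email "you@example.com"    # placeholder identity
git config user.name "Your Name"

echo "committed text" > notes.txt
git add notes.txt
git commit -q -m "Last good check-in"

echo "half-finished edit" >> notes.txt     # unwanted change to a tracked file
echo "scratch" > scratch.txt               # new file that was never added

git checkout -- .                          # same as the "git discard" alias above

cat notes.txt                              # only the committed line remains
git status --short                         # scratch.txt is still listed as untracked
```

On Git 2.23 or later, git restore . has the same effect on the working tree.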