# Coding Outreach Group Summer Workshop
# Git + GitHub
06/10/2021

__**Content creator:**__ [Elizabeth (Liz) Beard](https://github.com/elizabethbeard)

__**Content reviewers:**__ John Erardi, Haroon Popal

## Set Up & Prerequisites
Be sure to check out the README for the workshop to make sure you've completed the following steps:
1. Familiarized yourself with the unix shell and bash (see prerequisites)
2. Installed git on your local computer
3. Made a [GitHub](https://github.com/) account

## Description
This workshop introduces attendees to the basics of git and GitHub and why they are useful tools for researchers. This course is largely based on materials from [Elizabeth DuPre](https://emdupre.github.io/git-course/) and [Software Carpentry](https://swcarpentry.github.io/git-novice/).

## Outline
| Topic | Description | Time |
| --- | --- | --- |
| Intro | Why do I need version control? What is git? What is GitHub? | 5 min |
| Tutorial 1 | set up git, create a local repo, commit changes, add files to GitHub repo | 20 min |
| Tutorial 2 | forking and cloning a repo, updating a local repo with changes from the remote repo, **BONUS**: pull requests | 25 min |
| Examples | collaborating, open science, resource sharing, exposure | 5 min |

# Intro

In [2]:
%%HTML
<iframe width="560" height="315" src="https://www.youtube.com/embed/QPxcARTtoZQ" frameborder="0" allowfullscreen></iframe>

# 1. Getting Started with git
Version control in git is centered around a *repository*, which holds your directories and files. For this workshop, we'll be using the command line interface (the terminal) to work with git. While there are also GUIs for using git, there are a number of benefits to using the command line:
- you develop better understanding of how git commands work
- you develop the ability to use git on any computer with bash or linux
- learning various GUIs will be easier since you know the terminology 

## Setting up git on our local machine
### Tell git who you are
Git records information not only about the changes to files you make, but also *who* made those changes -- this makes it even more helpful for collaborating! For git to know who is making changes on this machine, we need to tell git about who we are.
- *note*: be sure to use the same email address you used to create your GitHub account.

``` bash
git config --global user.name "Your Name"        # put quotation marks around your name
git config --global user.email yourname@yourplace.edu
```
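
If you'd like to confirm that the values were stored, each key can be read back individually. The following is a minimal sketch, not part of the original workshop: it points `HOME` at a throwaway directory (an assumption made only so that experimenting does not touch your real `~/.gitconfig`), and the name and email are the same placeholders as above.

```bash
# Assumption for this sketch: use a throwaway HOME so the real ~/.gitconfig
# is left alone. Skip these two lines to work with your actual settings.
export HOME="$(mktemp -d)"
unset XDG_CONFIG_HOME

# Same commands as above, with placeholder values
git config --global user.name "Your Name"
git config --global user.email yourname@yourplace.edu

# Passing a key without a value reads it back
git config --global user.name     # prints: Your Name
git config --global user.email    # prints: yourname@yourplace.edu
```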

### Set a default editor
When working with Git we will often need to provide some short but useful information. In order to enter this information, we need an editor. We’ll now tell Git which editor we want to be the default (i.e. the one that Git will open whenever it wants us to provide some information).

You can choose any editor available on your system. For this tutorial we'll be using nano (the default in mac and linux systems). There are lots of other text editors available, if you want to [check some out here](https://swcarpentry.github.io/git-novice/02-setup/index.html).

``` bash
git config --global core.editor "nano -w"
```

Great! Now let's check whether our settings were saved.
- *note*: filenames that start with a '.' are generally invisible when exploring your folder structures -- even when you use `ls` in the command line! To look for these files in the command line, use `ls -a`. If you're using the Finder on a mac, press `COMMAND + SHIFT + .`.

In [4]:
cat ~/.gitconfig

[filter "lfs"]
	required = true
	clean = git-lfs clean -- %f
	smudge = git-lfs smudge -- %f
	process = git-lfs filter-process
[user]
	name = Liz Beard
	email = ecooperbeard@gmail.com
[color]
	ui = auto
[core]
	editor = nano


## Creating a local repository with git
Let's create a local repository so you can track changes on your awesome new project that you want to see develop over time and maybe even turn into a library for others to use. But we're going to start simple, and create a README to help outline the project.

First, let's create a directory for the project in our home directory:

In [5]:
cd
pwd
mkdir new-project
cd new-project

/Users/lizbeard


Now we need to tell git that this is a repository that we'd like to track. This is called "initializing" the repository.

In [6]:
git init

Initialized empty Git repository in /Users/lizbeard/new-project/.git/


If we look in this directory, we'll find a new invisible git directory. 

In [8]:
ls -a

.	..	.git


The `.git` directory contains git's configuration files. DO NOT delete this directory -- you'll delete all of the tracked information in your repository!
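
To see this without risking a real project, the same steps can be tried in a scratch location. This is a small sketch under the assumption that a temporary directory (created with `mktemp -d`) stands in for your home directory; `git rev-parse` is used here only as a way to ask git whether it recognizes the folder as a repository.

```bash
set -e
# Assumption: a throwaway directory stands in for your home directory
project="$(mktemp -d)/new-project"
mkdir "$project"
cd "$project"

git init -q                            # creates the hidden .git directory
ls -a                                  # lists . .. .git

git rev-parse --git-dir                # prints: .git
git rev-parse --is-inside-work-tree    # prints: true
```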

## Tracking files within your repo
Now that we've initialized our repository, let's create our README and add our authors' names and the project title.
- *note*: You don't *have* to create/edit your text files using the command line, but we're going to for this workshop just to familiarize ourselves with it.

``` bash
nano README.md
# add project title and authors
```

In [9]:
git status

On branch master

No commits yet

Untracked files:
  (use "git add <file>..." to include in what will be committed)

	README.md

nothing added to commit but untracked files present (use "git add" to track)


Information about what git knows about the directory is displayed after running `git status`. We are on the `master` branch, which is the default branch for a repository. We're not going to cover branches in this workshop, but the resources listed on the workshop README discuss branches in depth.

For now, what's important to realize is that our file is listed as *untracked*, which means it is in our working directory but git is not tracking it yet -- so no changes made to the file will be recorded by git!

### Adding files to a git repo
To tell git about the file, we'll use the `git add` command. This is used for two purposes: (1) to tell git that a file should be tracked and (2) to put the file into the git staging area. The staging area serves as a cache to store changes *before* you commit them to the repository. So you can add more files/changes before submitting them all under a single `commit`. For more about the 'staging area', see [this presentation by Stephanie DeCross](https://zenodo.org/record/3369466#.YL4cVDZKh24). 
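
The difference between untracked and staged files is easy to see with the compact form of `git status`. The sketch below is an illustration in a scratch repository; `notes.txt` is a hypothetical second file added only to show that staging is per file.

```bash
set -e
cd "$(mktemp -d)"        # scratch directory, not your real project
git init -q

echo "# New Project" > README.md
echo "scratch notes" > notes.txt   # hypothetical extra file

git status --short       # both files are marked ?? (untracked)

git add README.md        # start tracking README.md and stage it

git status --short       # README.md is now "A " (staged); notes.txt is still ??
```

Only `README.md` would be included in the next commit; `notes.txt` stays out until it is added too.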

In [10]:
git add README.md
git status

On branch master

No commits yet

Changes to be committed:
  (use "git rm --cached <file>..." to unstage)

	new file:   README.md



### Commit changes
In order to tell git to record our changes (adding our new files) into the repository, we need to `commit` them.

``` bash
git commit
# type a commit message: "Add project title and authors"
# save the commit message and close your text editor
```

After we save our commit message and exit the editor, git will commit our file to the local repo. It will report the number of files changed and the number of lines inserted or deleted across all those files.

Now if we look at the status of the repo:

In [11]:
git status

On branch master
nothing to commit, working tree clean


Our file is now in the local repository! There are other ways to commit changes as well. Here are a few:
- `git commit -m "Commit notes here"` commits your changes with the message given on the command line, without opening the text editor. This tends to be faster than opening the editor every time, and works well when your commit notes are short.
- `git commit -am "Commit notes here"` adds and commits all files that have been tracked and modified.

For other commit options, check out the [git documentation](https://git-scm.com/docs/git-commit/en#_options).
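
Here is a short sketch of the two shortcuts in a scratch repository. The identity is set with a repo-local `git config` only so the example runs anywhere, and `todo.txt` is a hypothetical file included to show that `-a` picks up modified *tracked* files but leaves untracked ones alone.

```bash
set -e
cd "$(mktemp -d)"        # scratch directory
git init -q
git config user.name "Your Name"                  # repo-local placeholders
git config user.email yourname@yourplace.edu

echo "# New Project" > README.md
git add README.md
git commit -q -m "Add project title"              # message given inline

echo "Authors: Your Name" >> README.md            # modify a tracked file
echo "things to do" > todo.txt                    # hypothetical untracked file

git commit -q -am "Add authors"                   # stages + commits README.md only

git log --oneline        # two commits
git status --short       # prints: ?? todo.txt
```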

Additionally, you can add entire directories to be tracked by adding the folder instead of just a file (e.g., `git add tools` to track changes in a tools folder).

## Remote repositories with GitHub
We've started tracking changes on our own computer! That's great! But if you love iced coffee as much as I do you know that it's a huge risk to just keep things stored on a local machine. 

Let's set up a remote repository so that we can access our project from multiple locations. This way, we can share the repository with other collaborators easily!

At this point, you should already have your GitHub account created.

### Create a new repository
Let's create an empty repository on GitHub.
- Log in to [GitHub](https://github.com/)
- Click on the create icon (`+`) on the top right
- Enter your repository name: "new-project"
- For this exercise, let's keep the repository public
- Since we'll be importing a local repository, make sure that the **Initialize this repository with a README** is ***UNselected***
- Click `Create Repository`

You'll be directed to a page with new information about your repository. We already have our local repository and will be *pushing* it to GitHub, so we can do the following.

``` bash
git remote add origin https://github.com/<USERNAME>/new-project.git

```

Now we can execute the following:

In [12]:
git push -u origin master

Counting objects: 3, done.
Delta compression using up to 8 threads.
Compressing objects: 100% (2/2), done.
Writing objects: 100% (3/3), 283 bytes | 283.00 KiB/s, done.
Total 3 (delta 0), reused 0 (delta 0)
To https://github.com/elizabethbeard/new-project
 * [new branch]      master -> master
Branch 'master' set up to track remote branch 'master' from 'origin'.


This pushes our `master` branch to the remote repository (named via the alias `origin`) and creates a new `master` branch in the remote repository. If we look on the GitHub repo, we should see our code. If we click the `Commits` tab, we should see a complete history of commits.

Syncing to a remote repository adds a third step to our git procedure for tracking changes. Generally, adding changes and pushing them to the remote repo will follow these steps:
1. `git add` to add tracked files and changes to the staging area
2. `git commit` to record those changes from the staging area in the local repo
3. `git push` to push those changes from the local repo to the remote repository

Systems like git allow us to move work between any two repositories. In practice, though, it’s easiest to use one copy as a central hub, and to keep it on the web (like on GitHub) rather than on someone’s laptop.
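
The full add, commit, push cycle can also be rehearsed without a network connection. In this sketch a *bare* repository on disk (`hub.git`, a hypothetical name) plays the role that GitHub plays above; everything else mirrors the steps in this section. `git push -u origin HEAD` pushes whatever the current branch is called, so the sketch works whether your default branch is `master` or `main`.

```bash
set -e
tmp="$(mktemp -d)"
cd "$tmp"

# Assumption: a local bare repo stands in for the empty GitHub repository
git init -q --bare hub.git

git init -q new-project
cd new-project
git config user.name "Your Name"                  # repo-local placeholders
git config user.email yourname@yourplace.edu

echo "# New Project" > README.md
git add README.md                                 # 1. stage
git commit -q -m "Add project title and authors"  # 2. record locally
git remote add origin "$tmp/hub.git"
git push -q -u origin HEAD                        # 3. send to the remote

git remote -v                    # origin points at hub.git for fetch and push
git ls-remote --heads origin     # the branch now exists on the remote
```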

And there you have it! These are the basics of working with git for your own independent purposes. In the next section, let's learn a bit more about how to access and utilize *other* repositories on GitHub and collaborate with them.

# 2. Utilizing Remote Repositories
Version control really comes into its own when we begin to collaborate with other people. We already have most of the machinery we need to do this; the only thing missing is to copy changes from one repository to another.

Oftentimes, you'll want to access someone else's remote repository to use for your own analyses or benefit. You won't necessarily have write access to the repository, but by forking the repo you'll be able to access the code on your own GitHub and local machine.

In the bonus section, we'll review how to submit changes to a repository where you don't have write access. GitHub provides a functionality called Pull Requests. Essentially, it’s “requesting the owner of the repository to pull in your contributions”. As the owner of a repository, you may or may not accept a pull request. But as a contributor, pull requests provide a path to engage with the community and contribute new tools and functions to different resources.

Utilizing remote repos usually follows a similar process (see below for a visualization of the process):
1. Find a repository on GitHub that belongs to someone else
2. Fork it on GitHub's servers into your own GitHub account
3. `git clone` the repo to your local machine
4. Make changes, and push them to your repository on GitHub
5. **BONUS** Request that the owner of the repository you forked pulls in your changes


<img src="images/github-fork-diagram.png" width=500 height=500 />

## Is it worth it? Lemme Fork It
In today's tutorial, we're going to fork the summer workshop repo so that each week, you'll be able to pull the new material from the COG GitHub to your GitHub and onto your local machine. Ideally, this means no more clicking and downloading files -- we'll do this all through the command line and git.

The first thing we need to do is fork the COG Summer Workshop Series repository to our own GitHub account.
1. Go to the [COG Summer Workshop Series Repo](https://github.com/TU-Coding-Outreach-Group/cog_summer_workshops_2021).
2. Click the **fork** button in the upper right hand corner.

You should now be redirected to the cog_summer_workshops_2021 repo on your own GitHub page.

## Attack of the clone
To add the remote cog_summer_workshops_2021 repo from our GitHub onto our local machine, we'll use `git clone`.

Start by making sure you're in the parent directory where you want the repo to be stored. I want to store this repo in my documents folder.

In [13]:
pwd

/Users/lizbeard/new-project


In [14]:
cd ~/Documents
pwd

/Users/lizbeard/Documents


Now, we'll use `git clone` to copy the repository to our local machine. Be sure to copy the GitHub repo link the same way we did in Tutorial 1.

Cloning creates an exact copy of the online repository. By default it creates a directory with the same name as the GitHub repository.
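
The naming behavior can be tried offline first. This sketch builds a tiny repository, exposes it as a bare repo called `cog_demo.git` (a hypothetical stand-in for a GitHub URL), and clones it twice: once with the default directory name and once with an explicit one.

```bash
set -e
tmp="$(mktemp -d)"
cd "$tmp"

# Build a small repository to clone from
git init -q source-repo
cd source-repo
git config user.name "Your Name"                  # repo-local placeholders
git config user.email yourname@yourplace.edu
echo "# Demo" > README.md
git add README.md
git commit -q -m "Add README"
cd ..

# Assumption: cog_demo.git plays the part of the repository on GitHub
git clone -q --bare source-repo cog_demo.git

git clone -q cog_demo.git            # directory name defaults to cog_demo
git clone -q cog_demo.git my-copy    # or choose the directory name yourself

git -C cog_demo remote -v            # the clone already has a remote named origin
```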

In [15]:
git clone https://github.com/elizabethbeard/cog_summer_workshops_2021.git

Cloning into 'cog_summer_workshops_2021'...
remote: Enumerating objects: 251, done.
remote: Counting objects: 100% (251/251), done.
remote: Compressing objects: 100% (193/193), done.
remote: Total 251 (delta 133), reused 145 (delta 51), pack-reused 0
Receiving objects: 100% (251/251), 11.23 MiB | 12.63 MiB/s, done.
Resolving deltas: 100% (133/133), done.


In [19]:
cd cog_summer_workshops_2021
pwd
ls

/Users/lizbeard/Documents/cog_summer_workshops_2021
LICENSE			desktop.ini		linux-owlsnest
README.md		excel-basics		neuroimaging-in-python
bids-heudiconv-fmriprep	git-github		psychopy
data-visualization-in-r	jupyter-notebook	rsa


## Push It
Now we can use our cloned repository just as if it were the original, local repository! Let's make some changes to our files and push these. Let's say you wanted to add some notes from your git/GitHub workshop to your remote repository.

```bash
cd git-github
nano git-notes.txt
# add some notes or some projects you think you could post on github
git add git-notes.txt
git commit -m "added workshop notes"
```

Great! Now our changes are stored to our *local* repository. Let's send our changes back to the *remote* repository by `push`ing our changes.
- *note*: here we've moved back up to the top level of the repo before pushing, but `git push` works from any folder inside the repository, including subfolders.

In [26]:
pwd
git push origin main

/Users/lizbeard/Documents/cog_summer_workshops_2021
Counting objects: 4, done.
Delta compression using up to 8 threads.
Compressing objects: 100% (4/4), done.
Writing objects: 100% (4/4), 404 bytes | 404.00 KiB/s, done.
Total 4 (delta 1), reused 0 (delta 0)
remote: Resolving deltas: 100% (1/1), completed with 1 local object.
To https://github.com/elizabethbeard/cog_summer_workshops_2021.git
   725023f..eb6b3da  main -> main


You may notice that here, we're pushing to `main` instead of `master`. This is a [recent change](https://www.techrepublic.com/article/github-to-replace-master-with-main-starting-in-october-what-developers-need-to-know/) in git and GitHub in an effort to reduce unnecessary holdover references to slavery and replace them with more inclusive terms. Be sure to note the change whenever you're referencing tutorials written before October 2020.
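
If you're not sure which name a repository uses, you can ask git, and an existing branch can be renamed locally. The following is a sketch in a scratch repository; whether the first command prints `master` or `main` depends on your git version and settings, so no particular output is assumed for it.

```bash
set -e
cd "$(mktemp -d)"        # scratch directory
git init -q
git config user.name "Your Name"                  # repo-local placeholders
git config user.email yourname@yourplace.edu

echo "# Demo" > README.md
git add README.md
git commit -q -m "Add README"

git rev-parse --abbrev-ref HEAD    # current branch name: master or main

git branch -M main                 # rename the current branch to main

git rev-parse --abbrev-ref HEAD    # prints: main
```

Renaming only affects the local repository; a branch that already exists on a remote keeps its name there.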

## Pulling changes from a remote repository
We've already updated our `origin` repo (the repo on our personal GitHub). But let's say a week goes by, and we want to download the materials for Susan's PsychoPy workshop before next Thursday. To do this, we'll pull from the COG or `upstream` repo.

Before we can pull from the COG repo into our local repo, we need to make sure that the two are connected.

In [27]:
git remote -v

origin	https://github.com/elizabethbeard/cog_summer_workshops_2021.git (fetch)
origin	https://github.com/elizabethbeard/cog_summer_workshops_2021.git (push)


Here we can see that the remote repos linked to our local machine are both from my personal GitHub. But we want to tell git that this repo needs to pull information from the COG GitHub. Let's tell git to make the [COG repo](https://github.com/TU-Coding-Outreach-Group/cog_summer_workshops_2021) our `upstream`.

In [29]:
git remote add upstream https://github.com/TU-Coding-Outreach-Group/cog_summer_workshops_2021
git remote -v

fatal: remote upstream already exists.
origin	https://github.com/elizabethbeard/cog_summer_workshops_2021.git (fetch)
origin	https://github.com/elizabethbeard/cog_summer_workshops_2021.git (push)
upstream	https://github.com/TU-Coding-Outreach-Group/cog_summer_workshops_2021 (fetch)
upstream	https://github.com/TU-Coding-Outreach-Group/cog_summer_workshops_2021 (push)


Now we see that git has *two* remote locations: our `origin` repository (the forked repo) and the `upstream` repository (the COG repo).

To pull changes from the `upstream` repository to our `main` local repository, we can use the following:

```bash
git pull upstream main
```

But this doesn't update your `origin` repo on GitHub. To do that, you'll need to push the changes you've just pulled to the `main` branch of `origin`.

```bash
git push origin main
```
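
The whole fork, pull from `upstream`, push to `origin` round trip can be rehearsed on one machine. This is a sketch under stated assumptions: `upstream.git` and `fork.git` are hypothetical local bare repositories standing in for the COG repo and your fork on GitHub, and `seed` plays the organizers' working copy.

```bash
set -e
tmp="$(mktemp -d)"
cd "$tmp"

# 1. The organizers' working copy with the first week's material
git init -q seed
cd seed
git config user.name "Organizer"                  # repo-local placeholders
git config user.email organizer@example.edu
echo "week 1" > README.md
git add README.md
git commit -q -m "Add week 1 materials"
git branch -M main
cd ..

# 2. Assumption: bare repos stand in for the COG repo and your fork
git clone -q --bare seed upstream.git
git clone -q --bare upstream.git fork.git         # "forking"

# 3. Clone your fork and connect it to upstream
git clone -q fork.git local
git -C local remote add upstream "$tmp/upstream.git"
git -C local remote -v                            # origin and upstream

# 4. A week later, new material lands upstream
cd seed
echo "week 2" >> README.md
git commit -q -am "Add week 2 materials"
git push -q "$tmp/upstream.git" main
cd ..

# 5. Bring it into the local repo, then update the fork
cd local
git pull -q upstream main
git push -q origin main
```

After step 5, `local`, `fork.git`, and `upstream.git` all point at the same commit on `main`.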

It's good to get in the habit of doing this *before* working on any code or projects, to avoid any potential change conflicts. For more info on conflicts in git, check out this [software carpentry page](https://swcarpentry.github.io/git-novice/09-conflict/index.html).

## **BONUS**: Pull Requests ##
Pull Requests are a great solution for contributing to repositories to which you don’t have write access. Remember, Pull Requests are essentially “requesting the owner of the repository to pull in your contributions”.

For example, say you've found some typos in this repo and want to let Liz and Haroon know you've fixed them and they should too! To submit a Pull Request, follow these steps:
1. Make any changes you want to contribute to the `upstream` repo, then `commit` and `push` them to your GitHub repository (`origin`). 
2. Go to your GitHub account and in the forked repository find the green button for creating Pull Requests. Click it and follow the instructions.
<img src="images/github-pull_Request-screenshot.png" width=700 height=400 />
3. The owner(s) of the original repository will get a notification that someone created a pull request - the request can be reviewed, commented, and merged in (or not) via GitHub.

Here is some advice from [Elizabeth DuPre for submitting Pull Requests](https://emdupre.github.io/git-course/06-pull-requests/):
>- Keep your Pull Request small and focused (makes it easier to process!)
>     - Submit one PR per issue
>     - Create a separate branch for each issue you work on (you can submit a PR from any branch)
>- Take advantage of available resources!
>     - If the repository has contributing guidelines, read them!
>     - Some repositories pre-populate the body of the PR or issue message with a template.
>          - Follow the instructions and provide the information requested.
>- Consider creating a new issue first to discuss your ideas before submitting a PR. Some repositories ask for this in their contributing guidelines, but this can be a good approach even if it isn't required, so that you know whether the owner agrees with your suggestion. They might also bring up ideas and/or challenges you haven't considered.
>
>Your PR may get merged just as it is. It’s very normal, too, for there to be some discussion (on GitHub) and a request for further edits to be made. Given that your pull request hasn’t been merged yet, you can make changes by adding further commits to your branch and pushing them. In either case, your PR will update automatically once you have pushed your commits.
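
Since branches aren't covered elsewhere in this workshop, here is a minimal sketch of the "one branch per issue" advice. It assumes a hypothetical local bare repo (`fork.git`) in place of your fork on GitHub, and a made-up branch name, `fix-typos`. Opening the pull request itself still happens on GitHub.

```bash
set -e
tmp="$(mktemp -d)"
cd "$tmp"

# Assumption: fork.git is a local stand-in for your fork on GitHub (origin)
git init -q --bare fork.git
git clone -q fork.git local 2>/dev/null   # hides the "empty repository" warning
cd local
git config user.name "Your Name"                  # repo-local placeholders
git config user.email yourname@yourplace.edu

# Publish a first commit on the default branch
echo "Git + GitHub workshp notes" > notes.md
git add notes.md
git commit -q -m "Add workshop notes"
git push -q -u origin HEAD

# One branch per issue: fix the typo on its own branch
git checkout -q -b fix-typos
echo "Git + GitHub workshop notes" > notes.md
git commit -q -am "Fix typo in workshop notes"
git push -q -u origin fix-typos

# The fork now has two branches; the PR would be opened from fix-typos
git ls-remote --heads origin
```

Any further commits pushed to `fix-typos` would show up in the same pull request, which matches the advice quoted above.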

# Implementations

In [1]:
%%HTML
<iframe width="560" height="315" src="https://www.youtube.com/embed/w_iOjU53gbk" frameborder="0" allowfullscreen></iframe>