## Using Git
**Nicholas Kern**
<br>
**Astro 9: Python Programming in Astronomy**
<br>
**UC Berkeley**

---
1. [Introduction](#Introduction)
2. [Configuration](#Configuration)
3. [Tracking Your First Repo](#Tracking-Your-First-Repo)
4. [Push, Pull, Fork, Clone](#Push-&-Pull,-Fork-&-Clone-with-an-Online-Repo)
5. [Branching and Merging](#Branching-&-Merging)
6. [Breakout](#Breakout)

### Introduction
Git is a powerful version control system, and there are lots of good resources online for an intro to Git. Here we will look over the basic capabilities of Git and explore how we will be using it for this course.

**Branching and Merging**
<img src="imgs/branch_illustration.png" width=500px>
<center> IC: GitHub. An illustration of branching and merging.
In the above figure we start with a set of documents on a "master" branch denoted in red. We can copy these files into a sub-branch, which gets tracked separately from the master branch. On each branch, ***different kinds of edits*** are made to the documents (+ for additions, - for deletions). Let's say we like the edits on both branches and want to "merge" the two branches together to keep all of the changes. To do so, we merge the sub-branch into the master branch, which inherits all of the changes from the sub-branch while keeping its own. Next we add files to the sub-branch, and then merge them into the master branch.
</center>

***External Resources***
1. [Installing Git](https://git-scm.com/book/en/v2/Getting-Started-Installing-Git)
2. [Software Carpentry](http://swcarpentry.github.io/git-novice/)
3. [Atlassian Tutorial](https://www.atlassian.com/git/tutorials)

### Configuration
---
Once you've installed Git, we need to set some global preferences, including your name and email:
```bash
git config --global user.name "First Last"
git config --global user.email "email@site.com"
```
By default, Git will probably use Vi as its text editor. If you'd like to switch that to nano, you can do so with
```bash
git config --global core.editor "nano -w"
```
You can check these settings with
```bash
git config --list
```
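To read back a single setting rather than the whole list, `git config --get` can be used. The sketch below is only an illustration: it works in a throwaway repository (the `mktemp -d` directory is an assumption of this example) and leaves out `--global`, so the values apply to that repository only and your real settings are not touched.

```bash
# Work in a throwaway repo so global settings are left alone
demo=$(mktemp -d)
cd "$demo"
git init -q

# Without --global, a setting applies to this repository only
git config user.name "First Last"
git config user.email "email@site.com"

# Read back individual settings instead of the full list
name=$(git config --get user.name)
email=$(git config --get user.email)
echo "$name <$email>"
```

Repository-level settings take precedence over global ones, which is handy if you want a different email for one project.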

### Tracking Your First Repo
---

**Initializing Git**

Let's make our first directory (aka repository, aka repo), which we will track with Git.
```bash
mkdir my_dir
cd my_dir
```

Now let's make some files!
```bash
for i in $(seq 1 10)
do
echo hello, this is file$i > file"$i".txt
done
```

Let's make some sub-directories!
```bash
mkdir subdir1
echo a new file in a sub-dir > subdir1/file1.txt
```
To track these files and directories, we can initialize git within this directory
```bash
git init
```
In doing so, we have created a hidden `.git` directory next to our files
```bash
ls -aF
# the listing now includes .git/ alongside file1.txt ... file10.txt and subdir1/
```

**Staging and Committing**

So far we have only created an empty git repository. Now we need to ***stage*** the files we want to track, and ***commit*** them to the repository. This is typically a two-step process, illustrated by the figure below.
<img src="imgs/staging.png" width=400px>
<center> IC: GitHub. An illustration of staging a file edit, and committing it to the repository. </center>

So let's add our files to the staging area
```bash
git add file*.txt
```
What did this do? This added all files ***in the current directory*** starting with "file" and ending in ".txt" to the staging area, which we can then commit to the repository if we like.

Let's see the status of things:
```bash
git status
```
You can see that the newly added files are staged for commit, while subdir1 remains untracked. Let's say we didn't like the changes we made to the files and wanted to **unstage** them. We can do this on a file-by-file basis
```bash
git reset file1.txt
```
or we can unstage everything
```bash
git reset
```
Be careful with `git reset`, because it can be permanent depending on how you use it! If you want to erase the local edits you just un-staged, so that your working directory reflects the most recent commit, you can use
```bash
git checkout -- <filename>
```
to do it on a file-by-file basis, or
```bash
git checkout -- .
```
to do all files at once. Careful: if you don't have another copy of your un-committed edits, this will permanently erase them. Another way to do both the un-staging and the erasing of local edits all at once is to use
```bash
git reset --hard
```
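The stage, un-stage, and discard steps above can be sketched end to end. This is a minimal illustration in a throwaway repository; the temporary directory, the file contents, and the `-q` (quiet) flags are assumptions added for the example.

```bash
# Throwaway repo with one committed file
demo=$(mktemp -d)
cd "$demo"
git init -q
git config user.name "First Last"
git config user.email "email@site.com"
echo "hello, this is file1" > file1.txt
git add file1.txt
git commit -q -m "add file1"

# Edit the file and stage the edit
echo "an edit we will regret" >> file1.txt
git add file1.txt
git status --short      # "M" in the first column: staged

# Un-stage: the edit is still present in the working directory
git reset -q -- file1.txt
git status --short      # "M" in the second column: modified, not staged

# Discard the edit: the file matches the last commit again
git checkout -- file1.txt
after=$(git status --short)   # empty when nothing is left to commit
cat file1.txt
```

The last two commands together have the same effect on this file as a single `git reset --hard`.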

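If we forgot what changes we made and want to see what we would be undoing with the reset, we can run
```bash
git diff
```
to see the difference between the working directory and the staging area. Use `git diff --staged` to compare the staging area against the most recent commit, or `git diff HEAD` to compare the working directory against it.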
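Which changes `git diff` reports depends on what has been staged. The sketch below, assuming a throwaway repository and a placeholder file called `notes.txt`, stages one edit and leaves another un-staged to show the three common comparisons.

```bash
demo=$(mktemp -d)
cd "$demo"
git init -q
git config user.name "First Last"
git config user.email "email@site.com"
echo "line one" > notes.txt
git add notes.txt
git commit -q -m "add notes"

echo "line two" >> notes.txt
git add notes.txt                # stage the second line
echo "line three" >> notes.txt   # leave the third line un-staged

unstaged=$(git diff)             # working directory vs. staging area
staged=$(git diff --staged)      # staging area vs. last commit
both=$(git diff HEAD)            # working directory vs. last commit
echo "$both"
```

Here `$unstaged` only adds "line three", `$staged` only adds "line two", and `$both` adds both lines.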

Assuming we actually wanted those changes, we can restage them and commit them. Running `git commit` without any flags opens your text editor so you can write a commit message
```bash
git add file*.txt
git commit
```
In the file that opens, lines starting with the pound symbol ```#``` are comments: Git ignores them, and they do not become part of the commit message. They list the files that are about to be committed. To actually make the commit, type a commit message on the first line of the file, then save and exit. If you exit with an empty message, Git aborts the commit.

Use
```bash
git status
```
to check the status of your files. Try editing one of the files and checking its status!

Now let's check the log:
```bash
git log
```
You should be able to see your first commit with its commit message. 

Here are some other ways to stage and commit:
```bash
git commit <filename> # stages and commits a single tracked file in one go
git commit -a # stages and commits all modified tracked files
git commit -m "text" # commits all currently staged files w/ commit message of 'text'
```
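As a rough illustration of these shortcuts, the sketch below makes two commits in a throwaway repository, one with `-m` after an explicit `git add` and one with `-a -m`. The directory, file contents, and commit messages are placeholders chosen for the example.

```bash
demo=$(mktemp -d)
cd "$demo"
git init -q
git config user.name "First Last"
git config user.email "email@site.com"
echo "first draft" > file1.txt
echo "first draft" > file2.txt

# Stage explicitly, then commit with an inline message
git add file1.txt file2.txt
git commit -q -m "add two files"

# Edit both tracked files, then stage and commit them in one go
echo "second draft" >> file1.txt
echo "second draft" >> file2.txt
git commit -q -a -m "update both files"

# One line per commit, most recent first
git log --oneline
count=$(git rev-list --count HEAD)   # number of commits so far
```

Note that `-a` only picks up files Git already tracks; brand-new files still need a `git add` first.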

**Checking Things Out**

In order to navigate your commit history and see what's happened, it is very useful to be able to check things out, i.e., make your files in your working directory reflect the state of a previous commit (or different branch, as we will see later). This is accomplished by the command
```bash
git checkout <commit-tag>
```
where your entire working directory should now reflect the state of the working directory **at the time of that commit**. If you only want to bring back a certain file as it was at the time of that commit (note that this overwrites your current copy of that file), you can use
```bash
git checkout <commit-tag> <filename>
```
Once you have finished looking at that commit, you can get back to your home state (also called the branch tip) by checking out master:
```bash
git checkout master
```

Note that if you have uncommitted changes in your working directory, `git checkout` may fail. It is better to have your working directory clean of edits before you check out a previous commit. If you don't want to get rid of your edits but would still like to check out a previous commit, you can use the `git stash` command, which we won't go into now, but would be good to read up on yourself.
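A small sketch of checking out an earlier commit and returning to the branch tip, under the assumption of a throwaway repository with two commits. The `git symbolic-ref` line is only there to make sure the branch is called "master" as in this tutorial, whatever your Git defaults are.

```bash
demo=$(mktemp -d)
cd "$demo"
git init -q
git symbolic-ref HEAD refs/heads/master   # name the branch "master" to match the text
git config user.name "First Last"
git config user.email "email@site.com"

echo "version 1" > file1.txt
git add file1.txt
git commit -q -m "first commit"
first=$(git rev-parse HEAD)               # the commit tag (hash) of the first commit

echo "version 2" > file1.txt
git commit -q -a -m "second commit"

# Look at the working directory as it was at the first commit
git checkout -q "$first"
old=$(cat file1.txt)

# Return to the branch tip
git checkout -q master
new=$(cat file1.txt)
echo "then: $old, now: $new"
```

In practice you would copy the commit tag from `git log` rather than capture it with `git rev-parse`.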

**Undoing Changes**

Let's say you made some changes and commits to your files since your first commit and you realize that these new additions were faulty: perhaps they added unwanted bugs to your code. In this case, we would like to either 1) undo some of the commits that we think introduced the bug, or 2) undo all commits all the way back to some previous commit. This is another one of the useful functionalities of Git. 

Let's say you have identified a single commit as the faulty commit, and would like to undo the changes introduced by that commit (not necessarily the most recent commit), but keep all other commits between that commit and your current state. We can accomplish this with
```bash
git revert <commit-tag>
```
This moves the commit history forward by generating a new commit which actually undoes the commit of `<commit-tag>`, to the best of git's capability.
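As an illustration, the sketch below reverts the middle of three commits in a throwaway repository. The file names are placeholders, and `--no-edit` is used so that Git accepts the default revert message instead of opening the editor.

```bash
demo=$(mktemp -d)
cd "$demo"
git init -q
git config user.name "First Last"
git config user.email "email@site.com"

echo "good code" > good.txt
git add good.txt
git commit -q -m "add good code"

echo "buggy code" > buggy.txt
git add buggy.txt
git commit -q -m "add buggy code"
faulty=$(git rev-parse HEAD)      # remember the faulty commit

echo "more good code" > more.txt
git add more.txt
git commit -q -m "add more code"

# Undo only the faulty commit; later commits are kept
git revert --no-edit "$faulty"
git log --oneline                 # four commits: the revert is a new one on top
ls                                # buggy.txt is gone, more.txt is still there
```

Because each commit here touches a different file, the revert applies cleanly; when later commits edit the same lines, Git may ask you to resolve a conflict.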

Now let's say we actually want to undo all commits all the way back to some previous commit. One way of doing this is to perform a `git revert` on every single commit all the way down the line. If you check your `git log`, you can see that while you are reverting the files to reflect the old commits, you are keeping a linear order to your commit history. This is useful in case you want to reinstate these commits at some future point.

Now let's say you are really sure the edits you have made are bad, and you want to revert back to an old state and not add the revert to your commit history. To do this, you can use `git reset`, but be careful, because this is a **permanent action**. If you want to change the commit history and the files in your working directory use
```bash
git reset --hard <commit-tag>
```
If you look at your `git log` you will see that you have erased the commit history that occurred after `<commit-tag>`.
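A minimal sketch of this, assuming a throwaway repository with three commits to a placeholder file:

```bash
demo=$(mktemp -d)
cd "$demo"
git init -q
git config user.name "First Last"
git config user.email "email@site.com"

echo "version 1" > file1.txt
git add file1.txt
git commit -q -m "first commit"
keep=$(git rev-parse HEAD)        # the commit we want to go back to

echo "version 2" > file1.txt
git commit -q -a -m "second commit"
echo "version 3" > file1.txt
git commit -q -a -m "third commit"

# Move the branch back to the first commit and overwrite the working directory
git reset -q --hard "$keep"
git log --oneline                 # only the first commit is listed
cat file1.txt
```

Compare this with `git revert`, which would have left all three commits in the log and added new ones on top.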


**Ignoring Files**

You can create a `.gitignore` file to tell git to ignore certain files or types of files, even if you try to stage and commit them. 
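As a hedged example, the sketch below ignores one file type and one directory. The file names and the two patterns (`*.tmp` and `cache/`) are made up for illustration; a `.gitignore` file simply lists one pattern per line.

```bash
demo=$(mktemp -d)
cd "$demo"
git init -q
echo "keep me" > results.txt
echo "scratch" > notes.tmp
mkdir cache
echo "big file" > cache/data.bin

# One pattern per line: a file type and a whole directory
printf '%s\n' '*.tmp' 'cache/' > .gitignore

git status --short                        # lists .gitignore and results.txt only
git add .
staged=$(git diff --staged --name-only)   # names of the files that were staged
echo "$staged"
```

It is common to commit the `.gitignore` file itself so that everyone working on the repo ignores the same files.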

### Git Summary
---

```bash
git <command> --help # get help on a certain git command

git init             # initialize new git repository
git add              # stages files to commit
git commit           # commits files to repository
git log              # see commit log
git status           # see status of each file
git diff             # see difference between commits or between HEAD and working directory
git checkout         # move working directory to mimic that of a previous commit
git reset            # move staging area (and possibly working dir) to previous commit
```

### Push & Pull, Fork & Clone with an Online Repo
---

**Push Pull**

Another extremely useful tool that works with git is [GitHub](https://github.com/): an online service that allows us to store copies of our files (and their git logs) on an online server. This means we can easily share our files with others, and also access our files from anywhere and any machine! To use it you'll need to make an account. Once you have an account, press the "+" button near the top-right to start a new repository. Name the repo the same name as your directory, keep it a public repo, and **do not** initialize it with a README file. Copy & paste the code corresponding to pushing an already existing repository, and enter that into your command shell located in your local repository to set up the remote (aka which online repo this local repo connects to), and to make your first push. In the future, you can push to this repo from anywhere within it using the syntax
```bash
git push <GitHub_Repo> <branch_name>
```
The `<GitHub_Repo>` is also called a ***remote***, which you can check via ```git remote -v```. The default remote is called "origin" and it will be connected to the repo you just synced with. By default, the branch name is "master" (repositories created on GitHub since late 2020 default to "main" instead, in which case substitute that name), which we will learn more about later. Therefore, for a simple one-remote, one-branch connection (which is what we have by default), the syntax will be
```bash
git push origin master
```
where the "origin master" is optional once the branch has an upstream set (the `-u` flag in GitHub's setup commands does this), b/c with only 1 remote and 1 branch git then knows where to push to by default. However, in the case where you have multiple remotes / branches you will want to specify where you want to push to.
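The remote setup can be tried without a GitHub account. In the sketch below a bare repository on disk stands in for the online repo; that stand-in, the temporary paths, and the file contents are assumptions of this example, while the `git remote add` and `git push -u` steps mirror what GitHub's setup instructions do with a URL.

```bash
work=$(mktemp -d)
git init -q --bare "$work/online.git"     # stand-in for the GitHub repo

git init -q "$work/my_dir"
cd "$work/my_dir"
git symbolic-ref HEAD refs/heads/master   # name the branch "master" to match the text
git config user.name "First Last"
git config user.email "email@site.com"
echo "hello" > file1.txt
git add file1.txt
git commit -q -m "first commit"

# Connect the local repo to the remote and call it "origin"
git remote add origin "$work/online.git"
git remote -v

# -u records origin/master as the default target for this branch
git push -q -u origin master

# Later commits can then be pushed without naming the remote and branch
echo "more" >> file1.txt
git commit -q -a -m "second commit"
git push -q
```

With a real GitHub repo, the path after `git remote add origin` would be the repo's URL instead.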

Now you can refresh your web browser and you should see your tracked files! This is what's called "pushing" to an online repository. Every time you make a commit on your local machine, you should try and push to the online repo. Other people can see these files (b/c it is a public repo), so don't push any files you don't want freely on the internet!

On your online repo, press "create new file" and label it README.md, where the .md stands for "markdown". This is a file that allows you to give a description of what this repository holds, both as a reminder to yourself and also for others who may land on this page. Once you've created the readme file, put some text into it (any text). Doing this online is an automatic stage+commit. Now your local git repo is "behind" and you'd like to update it. You can "pull" down the changes to your local repo via
```bash
git pull
```
Alternatively, you can do a ```git fetch``` and a ```git merge```, which is essentially what ```git pull``` does anyway. The fetch brings down new commits and stores them in a remote-tracking branch (such as `origin/master`), which you can list via ```git branch -a``` and then inspect with ```git checkout <remote-branch-name>``` to see what the new changes are. If you like them you can go back to your master branch with ```git checkout master``` and merge them in via ```git merge```.
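The fetch-then-merge route can be sketched locally. As an assumption of this example, a bare repository on disk plays the role of GitHub, and a second clone plays the role of the edit made in the web browser; all paths and file contents are placeholders.

```bash
work=$(mktemp -d)
git init -q --bare "$work/online.git"     # stand-in for the GitHub repo

# Our local repo, pushed once so the remote has a master branch
git init -q "$work/local"
cd "$work/local"
git symbolic-ref HEAD refs/heads/master
git config user.name "First Last"
git config user.email "email@site.com"
echo "hello" > file1.txt
git add file1.txt
git commit -q -m "first commit"
git remote add origin "$work/online.git"
git push -q -u origin master

# Simulate the README edit made online, using a second clone
git clone -q -b master "$work/online.git" "$work/other"
cd "$work/other"
git config user.name "First Last"
git config user.email "email@site.com"
echo "# My repo" > README.md
git add README.md
git commit -q -m "add README"
git push -q origin master

# Back in our local repo: fetch, inspect, then merge
cd "$work/local"
git fetch -q origin
git log --oneline master..origin/master   # commits we do not have yet
git merge -q origin/master
ls                                        # README.md has arrived
```

Replacing the last three Git commands with a single `git pull` would leave the local repo in the same state.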

**Clone Fork**

Now try going to a different directory in your machine, say your Desktop if you were in your home directory, or vice versa. Now pretend that we somehow erased our copy of our original repository that we just pushed to GitHub. If we wanted it back, we could clone it:
```bash
git clone https://github.com/username/repo_name
```
This will make a copy of the online repository ***as it stands online***, which means if we had extra files / edits in our local directory that we hadn't committed and pushed, they would be lost. Note that it comes with the .git directory, so you get access to the entire commit log history. If this repository came from your GitHub account, you should be able to edit files and push normally as you would have before.

Let's say you wanted to copy your friend's code to your computer, and additionally wanted to start working on it, make your own edits and track those edits with git. In this case, you would want to ***fork*** the repository. Forking makes a copy of the online repository in your own GitHub account; when you then clone that copy, your local repo is linked to your fork instead of the original repo from your friend. Try forking the Spoon-Knife repository from octocat by navigating to https://github.com/octocat/Spoon-Knife and pressing the "Fork" button on the upper-right. You've now got a copy of Spoon-Knife of your own which you can now clone:
```bash
git clone https://github.com/username/Spoon-Knife
```
Try making a change, or adding a file and pushing to your Spoon-Knife!

**Clone the Astro_9 Directory**

You will want to have class materials on your local machine. To do that, clone the [Astro_9 repo](https://github.com/nkern/Astro_9). Before we start class, you can do a `git pull` to get the most up-to-date material.

### Branching & Merging
---
So far we have worked on initializing a git repo, tracking files and directories and syncing them with an online GitHub repo. Another powerful feature of git is the ability to create and track multiple versions of your repositories. This is called branching. A good example of this is collaborative work on code: when two or more people work on different parts of the same code, each can have their own **branch**. When the edits are completed, they can **merge** their commits back onto the master branch.

Let's start by examining which branches currently exist:
```bash
git branch
```
which will list the **local** branches of your repo. You should see only the master branch, which is the main reference all other branches will point back to. To see the local and **remote tracking** branches, you can append the ```-a``` flag. The remote tracking branch is hidden, and tracks your synced GitHub repo and is what gets updated when you do a ```git fetch```. Let's create a new branch, called new_idea:
```bash
git branch new_idea
```
If we now list the branches, we will find master and new_idea, but the HEAD (or the current working directory we can see) still points to master, as indicated by the star next to it. To see new_idea, let's check it out!
```bash
git checkout new_idea
```

Now let's edit some files: add text to one file, remove text from another and add a new file! Then commit the changes and push to ```origin new_idea```. If you check your online GitHub repo, you will find that a new branch has appeared, and that it has the changes you just made!

Now let's say we would like to bring in these changes to the master branch. We can do this with the ```git merge``` command. The way we do this is to checkout the branch we want to merge into, and feed `git merge` the branch we want to take commits from. To see what changes we will be bringing in, we can use diff syntax like
```bash
git checkout new_idea
git diff -U0 master
```
which should list the differences between new_idea and master. If we like the changes and want to continue with the merge, we can use the following syntax
```bash
git checkout master
git merge new_idea
```
which should return with no errors, and should give you a summary of which files were changed by the merge.
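The whole branch-and-merge cycle can be tried in a throwaway repository. The sketch below is one possible run, with placeholder file names and contents; `git diff --stat` is used here for a compact preview in place of the full diff.

```bash
demo=$(mktemp -d)
cd "$demo"
git init -q
git symbolic-ref HEAD refs/heads/master   # name the branch "master" to match the text
git config user.name "First Last"
git config user.email "email@site.com"
echo "shared text" > file1.txt
git add file1.txt
git commit -q -m "first commit"

# Create the branch, switch to it, and commit a new file there
git branch new_idea
git checkout -q new_idea
echo "a new idea" > idea.txt
git add idea.txt
git commit -q -m "add idea"

# Preview what the merge would bring in, then merge into master
git diff --stat master new_idea
git checkout -q master
git merge -q new_idea
git branch                                # the star marks the branch you are on
ls                                        # idea.txt is now on master too
```

Since master had no new commits of its own, Git simply moves master forward to the tip of new_idea (a "fast-forward" merge).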

### Breakout
---

Try to accomplish the following tasks:

1. Create a new directory (call it what you want) and initialize a git repo

2. Create some files and sub-directories w/ their own files, and put some text in them 

3. Create a .gitignore and ignore one of your files

4. Stage your files and commit them with a commit message

5. Create an online GitHub repo of the same name and sync them

6. Now add a README.md file in your repo, add some text to it and re-sync the online repo (see the [MarkDown CheatSheet](https://guides.github.com/pdfs/markdown-cheatsheet-online.pdf) for MarkDown syntax)

7. Now make some edits to your files, stage them for commit, and commit them

8. Now undo this commit with `git revert`

9. Now make some edits, stage them, then un-stage them and undo the local edits

10. Make a new branch (call it what you want)

11. Edit some files in whatever fashion you want (edit existing files, erase files, add new files, etc.)

12. Sync your online repo to reflect the new branch

13. Merge the changes of the new branch into the master branch

14. Undo the merge by reverting back to a commit before the merge