## Version Control with git

If you remember one thing and one thing only about version control it should be this: **always be committing**.

Each commit takes a snapshot of your work so far, which lets you go
back in time to older versions of your program. You will almost certainly find
yourself in a situation where you had some working code, modified it to add a new
feature or work out some kink, found that you'd hopelessly ruined everything, and
would give your left index finger just to get back to what you had before. Enter
git.

Here is an interactive tutorial to help you get things set up on your computer: [http://jlord.us/git-it/](http://jlord.us/git-it/)

### Key concepts

* Repository (a folder managed by git)
* Workspace (current state)
* Index (staged for commit)
* Commit (take a snapshot)
* Branch (a series of commits)
* Remote (a remote repository that you can push to or pull from)

Any folder can be turned into a git repository with `git init`. Your
**workspace** is the current state of all your files. Some of them will be
different from what was last committed. You can see what's different by running
`git status`. From your workspace, you can use the `git add` command to add
files to the index, which is a sort of staging area for commits. When you run
`git commit`, the files in your index are included in the commit snapshot. You
can use `git reset` to roll back to prior commits and you can use `git log` to
see the history of commits.
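The paragraph above can be tried end to end in a throwaway folder. This is a minimal sketch, not a required setup: the file name `hello.py`, the "Demo Student" identity, and the temporary directory are placeholders chosen for illustration.

```bash
# Work in a throwaway folder so nothing real is touched.
demo="$(mktemp -d)"
cd "$demo"

git -c init.defaultBranch=master init -q .   # turn the folder into a repository
git config user.name "Demo Student"          # placeholder identity, local to this repo
git config user.email "student@example.com"

echo "print('hello')" > hello.py             # a new file in the workspace
git status --short                           # "?? hello.py": untracked, not in the index

git add hello.py                             # copy the file into the index
git status --short                           # "A  hello.py": staged for the next commit

git commit -q -m "Add hello script"          # snapshot everything in the index
git log --oneline                            # one line per commit in the history
```

After the commit, `git status` reports a clean workspace because the workspace, the index, and the latest commit all match.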

Here's a [visual cheatsheet][git-cheat] that covers all this and more.

[git-cheat]: http://ndpsoftware.com/git-cheatsheet.html#loc=workspace 

<img src="http://rabellamy.github.io/git-ready/images/fork-diagram.png" width="75%"/>

### Key commands

* `git status`: see the status of the workspace, index, and what branch you're
  on
* `git add`: add files to the index (commit staging area)
* `git commit`: take a snapshot of the project, committing the files in the
  index
* `git checkout`: switch to a different branch (use the `-b` option to create a
  new branch and switch to it)
* `git branch`: list the branches
* `git reset`: roll back to a previous commit
* `git push`: push up the changes in a local repository to a remote repository
* `git pull`: pull down the changes from a remote repository to the local
  repository
* `git clone`: copy a remote repository to the local machine
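As a rough illustration of the branch and reset commands in this list, the sketch below creates a branch, makes a commit that turns out to be a mistake, and rolls it back. The branch name `experiment`, the file `notes.txt`, and the identity are placeholders, and `--hard` is used on purpose here because it discards the unwanted change.

```bash
demo="$(mktemp -d)"
cd "$demo"
git -c init.defaultBranch=master init -q .
git config user.name "Demo Student"        # placeholder identity, local to this repo
git config user.email "student@example.com"

echo "version 1" > notes.txt
git add notes.txt
git commit -q -m "Add notes"

git checkout -q -b experiment              # create a new branch and switch to it
echo "a change I will regret" > notes.txt
git add notes.txt
git commit -q -m "Try an experiment"
git branch                                 # lists both branches; * marks the current one

git reset -q --hard HEAD~1                 # roll the branch back one commit
cat notes.txt                              # prints: version 1

git checkout -q master                     # switch back to the original branch
```

Because `git reset --hard` throws away work, it is worth running `git status` and `git log` first to be sure which commit you are returning to.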

### git Workflow

1. Choose a feature/segment/thing to work on next
1. Write some code
1. Play with the code
1. Rewrite, play some more, etc.
1. `git add .`: add all your changes to the index
1. `git commit -m "Describe the work you just did"`
1. Repeat

__DO NOT commit large files to a GitHub repo (anything larger than ~20 MB). If you have accidentally committed a large file (or dataset), use this [tutorial](http://blog.jessitron.com/2013/08/finding-and-removing-large-files-in-git.html) or this [command-line tool](http://rtyley.github.io/bfg-repo-cleaner/) to clean up your repo.__
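One way to avoid the problem is to look for large files before running `git add .`. The sketch below is only a suggestion: the 21 MB file `data/big.bin` is a placeholder standing in for a dataset, and listing it in `.gitignore` is one possible way to keep it out of the index.

```bash
demo="$(mktemp -d)"
cd "$demo"
git -c init.defaultBranch=master init -q .

mkdir data
# Create a 21 MB placeholder file to stand in for an accidentally saved dataset.
dd if=/dev/zero of=data/big.bin bs=1048576 count=21 2>/dev/null
echo "small" > notes.txt

# List files over 20 MB, skipping git's own .git folder.
large="$(find . -path ./.git -prune -o -type f -size +20M -print)"
echo "$large"                          # prints: ./data/big.bin

# Tell git to ignore the file so `git add .` never picks it up.
echo "data/big.bin" >> .gitignore
git add .
git ls-files                           # lists .gitignore and notes.txt, not data/big.bin
```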



## Troubleshooting

These are common git issues that students run into. Don't forget to change `username` to your username and `repository` to the repo you're working with!

### Stuck in vi?

You might get stuck in a window that looks like this:

```
Merge https://github.com/username/repository

# Please enter a commit message to explain why this merge is necessary,
# especially if it merges an updated upstream into a topic branch.
#
# Lines starting with '#' will be ignored, and an empty message aborts
# the commit.
~
~
```

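This is an editor called `vi`. Git has opened it so you can edit the message for a merge commit it is creating for you. The default message is fine, so all you need to do is exit: press `Esc`, type `:wq`, and press `Enter`. (`:q` also works if you have not changed anything.)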
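If you would rather not see the editor at all, `git merge` and `git pull` both accept `--no-edit`, which keeps the default merge message. The sketch below shows this with a local merge; the branch name `feature`, the file names, and the identity are placeholders.

```bash
demo="$(mktemp -d)"
cd "$demo"
git -c init.defaultBranch=master init -q .
git config user.name "Demo Student"        # placeholder identity, local to this repo
git config user.email "student@example.com"

echo "shared" > base.txt
git add base.txt
git commit -q -m "Base commit"

git checkout -q -b feature
echo "feature work" > feature.txt
git add feature.txt
git commit -q -m "Add feature file"

git checkout -q master
echo "master work" > master.txt
git add master.txt
git commit -q -m "Add master file"

# The branches have diverged, so this creates a merge commit.
# --no-edit accepts the default message instead of opening vi.
git merge -q --no-edit feature
git log --oneline -1                       # the newest commit is the merge
```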

### Repository not found? (Pushing to your repo if you cloned from Galvanize)

If you accidentally cloned the Galvanize version rather than your fork, pushing will fail with the error `Repository not found`. This is a poor error message. What it really means is that you don't have permission to push to that repo. Instead, push directly to your own fork:

```
git push https://github.com/username1/repository master
```
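If you expect to keep pushing from this clone, another option is to repoint `origin` at your fork once, so a plain `git push origin master` goes to the right place. This is a sketch under assumptions: `course-org` is a made-up stand-in for the account you cloned from, and `username` is your own account as elsewhere in this section.

```bash
demo="$(mktemp -d)"
cd "$demo"
git init -q .

# Simulate a clone whose origin points at the course repository (hypothetical URL).
git remote add origin https://github.com/course-org/repository
git remote -v                      # both fetch and push point at course-org

# Point origin at your own fork instead.
git remote set-url origin https://github.com/username/repository
git remote get-url origin          # prints: https://github.com/username/repository
```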

### Pushing to your partner's GitHub

Here, Partner 1 refers to the partner whose fork you cloned. Partner 2 is the other partner.

1. Make sure you push to partner 1's GitHub fork.

2. Make sure partner 2 forked the repository.

3. You can use this command to push to their GitHub:

    ```
    git push https://github.com/username2/repository master
    ```

4. The push might be rejected because each of you has committed a different version of your individual work, so partner 2's fork contains commits you don't have. You need to pull partner 2's work first, and because you both edited the same files, that pull produces a *merge conflict*. We can resolve it by always choosing partner 2's work when there's a conflict. With `-X ours`, git keeps the version that's on the computer (partner 1's); with `-X theirs`, it keeps the version being pulled from GitHub. Since we're pulling from partner 2's fork, `theirs` is what we want.

    ```
    git pull -X theirs https://github.com/username2/repository master
    git push https://github.com/username2/repository master
    ```

5. If you pulled the normal way, without giving a merge strategy, and are told you have a merge conflict, do this first to undo the pull. Be aware that it also throws away any uncommitted changes:

    ```
    git reset --hard
    ```

6. If your individual work includes files with different names, you will end up with both of them. This is not a huge deal, but if you'd like to not have your partner's file for the individual exercise, do this:

    ```
    git rm partner1exercise.txt
    git commit -m "Remove partner 1's exercise file"
    git push https://github.com/username2/repository master
    ```
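The steps above can be rehearsed without GitHub. In this sketch a local bare repository (`fork.git`) stands in for partner 2's fork, and `exercise.txt`, the folder names, and the "Demo Student" identity are all placeholders. It also passes `--no-rebase` and `--no-edit` to `git pull`, which are not in the commands above: newer git versions ask how to reconcile diverged branches, and `--no-edit` skips the editor.

```bash
# Placeholder identity, set per repository so your real git config is untouched.
set_identity() {
  git -C "$1" config user.name "Demo Student"
  git -C "$1" config user.email "student@example.com"
}

work="$(mktemp -d)"
cd "$work"

# A local bare repository stands in for partner 2's GitHub fork.
git -c init.defaultBranch=master init -q --bare fork.git

# Partner 1's computer: commit the starter file and publish it.
git -c init.defaultBranch=master init -q partner1
set_identity partner1
cd partner1
echo "starter" > exercise.txt
git add exercise.txt
git commit -q -m "Add starter exercise"
git push -q "$work/fork.git" master
cd ..

# Partner 2 clones the fork, edits the same line, and pushes.
git clone -q fork.git partner2
set_identity partner2
cd partner2
echo "partner 2 answer" > exercise.txt
git commit -q -am "Partner 2 individual work"
git push -q origin master
cd ..

# Partner 1 edits the same line locally, so the two histories now disagree.
cd partner1
echo "partner 1 answer" > exercise.txt
git commit -q -am "Partner 1 individual work"

# Pull partner 2's work, letting their version win every conflict, then push.
git pull -q --no-rebase --no-edit -X theirs "$work/fork.git" master
git push -q "$work/fork.git" master
cat exercise.txt                   # prints: partner 2 answer
```

Note that `-X theirs` silently discards partner 1's side of every conflicting change, which is fine for individual exercises but rarely what you want on shared project code.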

### Stop writing the long `https://github.com/username/repository`

You can make a shortcut to avoid writing this long thing if you're going to be doing it repeatedly. Here's how you can do it:

```
git remote add username https://github.com/username/repository
```

Now you can do:

```
git push username master
```

and

```
git pull username master
```

You can give it any name you want.

You can see all these shortcuts with this command:

```
git remote -v
```

### Updating your Fork from `upstream`

In this class we will be pushing solutions and new assignments to our `zipfian/DSCI6003-student` fork of the repository. To pull in these updates, you need to set up a second remote, which you will then merge changes from.

<img style="width: 800px" src="http://jlord.us/git-it/assets/imgs/clone.png" >

First things first, find out what remotes are already set up: `git remote -v`

    origin	https://github.com/Jay-Oh-eN/DSCI6003-student (fetch)
    origin	https://github.com/Jay-Oh-eN/DSCI6003-student (push)
    
What does this mean? The `(fetch)` means that anything you `git fetch` or `git pull` comes from that URL. In theory you could pull from and push to two different sources, but that is rare.

The `(push)` means that when you `git push`, it goes to that URL.

#### Adding the upstream remote

```bash
$ git remote add upstream https://github.com/zipfian/DSCI6003-student
$ git remote -v
```
<span style="color: red">origin</span>	  `https://github.com/`<span style="color: red">Jay-Oh-eN</span>`/DSCI6003-student (fetch)`  
<span style="color: red">origin</span>	  `https://github.com/`<span style="color: red">Jay-Oh-eN</span>`/DSCI6003-student (push)`  
<span style="color: teal">upstream</span>	`https://github.com/`<span style="color: teal">zipfian</span>`/DSCI6003-student (fetch)`  
<span style="color: teal">upstream</span>	`https://github.com/`<span style="color: teal">zipfian</span>`/DSCI6003-student (push)`  

#### Pulling in Changes from `upstream`

To incorporate changes we have made to the `upstream` repository you simply need to `git pull` from the upstream remote: 

```bash
git pull upstream master

remote: Counting objects: 3, done.
remote: Compressing objects: 100% (2/2), done.
remote: Total 3 (delta 0), reused 3 (delta 0), pack-reused 0
Unpacking objects: 100% (3/3), done.
From github.com:zipfian/DSCI6003-student
 * branch            master     -> FETCH_HEAD
 * [new branch]      master     -> upstream/master
Updating dbaa565..af4394a
Fast-forward
 new.md | 1 +
 1 file changed, 1 insertion(+)
 create mode 100644 new.md
```
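By default, `git pull upstream master` is a `git fetch` followed by a merge. The sketch below runs those two steps separately on local stand-ins so you can see what each one does. Everything here is hypothetical scaffolding: `course` plays the upstream class repository, `fork.git` plays your fork on GitHub, `student` is your clone, and the identity is a placeholder.

```bash
# Placeholder identity, set per repository so your real git config is untouched.
set_identity() {
  git -C "$1" config user.name "Demo Student"
  git -C "$1" config user.email "student@example.com"
}

work="$(mktemp -d)"
cd "$work"

# "course" stands in for the upstream class repository.
git -c init.defaultBranch=master init -q course
set_identity course
cd course
echo "week 1" > README.md
git add README.md
git commit -q -m "Week 1 material"
cd ..

# "fork.git" stands in for your fork on GitHub; "student" is your clone of it.
git clone -q --bare course fork.git
git clone -q fork.git student
set_identity student

# The instructors publish a new file upstream.
cd course
echo "week 2" > new.md
git add new.md
git commit -q -m "Week 2 material"
cd ..

# In your clone: add the upstream remote, fetch, and merge.
cd student
git remote add upstream "$work/course"
git fetch -q upstream              # download upstream's commits without changing your files
git merge -q upstream/master       # fast-forwards, because you have no new local commits
ls                                 # new.md is now present
git push -q origin master          # publish the update to your fork
```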

### Ready to Submit an Assignment?

We want to see the amazing work you've done, so push your code to GitHub. If you're unclear how to do this, follow these steps:

1. If you haven't done this, you'll need to create a fork and clone of this repository.
2. Add (`git add .`) and commit your changes (`git commit -m "Include solution for week1 exercise"`).
3. Push your changes to your personal fork: `git push origin master`
4. Submit a [pull request](https://help.github.com/articles/creating-a-pull-request/). GitHub will send us a notification of your request once it's submitted.

Here is a very thorough walkthrough of submitting a pull request: [https://github.com/asmeurer/git-workflow](https://github.com/asmeurer/git-workflow)

## Recap

1. Keep a tight feedback loop when writing code.
    * Write code in Sublime Text
    * Import the file into IPython
    * Write, run, repeat
1. Use git. Always be committing (ABC).

![abc](http://cdn.memegenerator.net/instances/400x/24938700.jpg)