# Git Basics

# Table of Contents

1. [Introduction](#Introduction)<br>
    a. [Learning objectives](#Learning-objectives)

- [Git quickstart](#Git-quickstart)
- [Git CheatSheet](#Git-CheatSheet)<br>
    a. [Git Concepts](#Git-Concepts)<br>
    b. [Set up a Git Configuration](#Set-up-a-Git-Configuration)<br>
    c. [Create a Git Repo](#Create-a-Git-Repo)<br>
    d. [Stashing - Moving changes to the side](#Stashing---Moving-changes-to-the-side)<br>
    e. [Other Useful Commands](#Other-Useful-Commands)<br>
    f. [GitHub flow](#GitHub-Flow)

- [Git Overview](#Git-Overview)

# Introduction

- back to [Table of Contents](#Table-of-Contents)

This notebook introduces you to using git for version control. The quickstart outlines the basic workflow for committing and syncing changes with a git remote repository. The CheatSheet provides additional details on basic use of git. The Overview goes into more detail about what git is and how it works.

## Learning objectives
- Know the minimum set of git commands needed to use git for version control.
- Provide some additional commands and context so you better understand how to use git.
- Provide an overview of what git is and how it works.

# Git quickstart

- back to [Table of Contents](#Table-of-Contents)

The git command is always used by first calling `git`, then telling it what action you want to perform, then passing it additional arguments to tell it how you want it to carry out the requested action:

    git <action> <parameters>
    
Example with action "add", adding "`my_file.txt`":

    git add my_file.txt

To check in using git at the command line:

- First, go into the directory of the git repository in which you are working.
- Run `git status` to see what changes have been made:

        git status
        
- Add any files or directories that are new or have been changed:

        git add <file_name>
        
        git add <directory_name>

        git add README.md
        
        git add *.py   # you can use wild cards
        
- Once you've added all the files, commit.

        git commit
        
- As part of the commit, git will ask you to enter a commit message. On Unix and Mac, this will open your default terminal text editor.

- After the commit, sync with the GitHub remote repository.

        # first pull, to receive changes that are on the server but not on your computer
        git pull
        
        # then push your changes to GitHub
        git push
        
When you are collaborating with a team of developers, pulls sometimes force you to manually reconcile changes made to the same bits of code. If it is just you working alone in a repository, however, chances are your pull won't result in any changes or merges. It will just tell you there aren't any changes.

When you push, depending on how you cloned your repository, you will likely have to log in to GitHub.
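The add and commit steps above can be tried safely in a throwaway repository. The following is a minimal sketch, not part of the original workflow: the scratch directory and the file name `analysis.py` are hypothetical, no remote is involved (so `git pull` and `git push` are left out), and `-m` is used to supply the commit message inline instead of opening an editor.

```shell
# Sketch: add and commit in a scratch repository (no remote configured).
set -e
cd "$(mktemp -d)"                                   # hypothetical scratch location
git init -q
git config user.name "Clark Kent"                   # identity for this repo only
git config user.email "clark.kent@dailyplanet.com"

echo "print('hello')" > analysis.py                 # hypothetical new file
git status --short                                  # lists analysis.py as untracked
git add analysis.py                                 # stage the new file
git commit -q -m "Add analysis script"              # -m gives the message inline

commit_count="$(git rev-list --count HEAD)"         # number of commits on this branch
echo "commits so far: $commit_count"
```

After the commit, `git status` reports a clean working directory, which is the state you want before pulling and pushing.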

# Git CheatSheet

- back to [Table of Contents](#Table-of-Contents)

## Git Concepts

**remote** - a remote is an external repository that the local repository syncs with.  A given repository can have more than one remote.  Standard remotes:

- origin -- default remote repository (i.e., the GitHub repo if you clone a repository from GitHub)

**branch** - a branch is a set of code changes that are kept separate from the main code base (or trunk) in a git repository. A branch can be worked on in isolation until one wants to merge the changes back into the trunk. Git makes it easy to create branches both in your local repository and in a remote. Standard branch names and references:

- master -- default development branch
- HEAD   -- the commit currently checked out (normally the tip of the current branch)
- HEAD^  -- parent of HEAD
- HEAD~4 -- the great-great-grandparent of HEAD (four commits back)
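To see how these references resolve, here is a small sketch in a scratch repository. The file name `counter.txt` and the commit messages are made up for illustration. Five commits are created so that `HEAD` has four ancestors to walk back through.

```shell
# Sketch: resolving HEAD, HEAD^ and HEAD~4 in a scratch repository.
set -e
cd "$(mktemp -d)"
git init -q
git config user.name "Clark Kent"
git config user.email "clark.kent@dailyplanet.com"

# Make five commits so that HEAD has four ancestors.
for i in 1 2 3 4 5; do
    echo "$i" > counter.txt
    git add counter.txt
    git commit -q -m "Commit $i"
done

head_subject="$(git log -1 --format=%s HEAD)"        # the latest commit
parent_subject="$(git log -1 --format=%s "HEAD^")"   # one commit back
fourth_subject="$(git log -1 --format=%s "HEAD~4")"  # four commits back
echo "HEAD=$head_subject  HEAD^=$parent_subject  HEAD~4=$fourth_subject"
```

Here `HEAD` resolves to "Commit 5", `HEAD^` to "Commit 4", and `HEAD~4` to "Commit 1".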


## Set up a Git Configuration
```
# Adding some customization
git config --global user.name "Clark Kent"
git config --global user.email "clark.kent@dailyplanet.com"
git config --global color.ui "auto"
git config --global core.editor 'nano' # or vim, emacs, sublime
git config --global push.default current
```

## Create a Git Repo

### From an existing repo
```
git clone https://host.org/myproject.git # an external repo (e.g., on GitHub) through HTTPS
git clone ssh://you@somehost.org/project.git # through SSH
git clone ~/some/repo.git ~/new/repo.git # through the filesystem
```

### From a new project
```
cd ~/myproject
git init # initialize the repo
git add . # stage everything in the folder
```

## Stashing - Moving changes to the side

The `git stash` command lets you put aside a set of changes so that you can pull updated code from a remote.  You can then either re-apply your stash of changes to the updated code files or discard them.

```
git stash # save modified and staged changes, then remove them from the working directory
git stash list # list the stashed changes in stack order
git stash pop # re-apply the changes from the top of the stash stack and remove them from the stack
git stash drop # discard the changes at the top of the stash stack
```
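A short sketch of a stash round trip, run in a scratch repository so nothing real is at risk. The file `notes.txt` and its contents are hypothetical. The change is set aside with `git stash`, the working directory becomes clean, and `git stash pop` brings the change back.

```shell
# Sketch: set a change aside with git stash, then re-apply it.
set -e
cd "$(mktemp -d)"
git init -q
git config user.name "Clark Kent"
git config user.email "clark.kent@dailyplanet.com"

echo "version 1" > notes.txt
git add notes.txt
git commit -q -m "Add notes"

echo "work in progress" >> notes.txt                 # an uncommitted change
git stash                                            # set the change aside
clean_state="$(git status --porcelain)"              # empty when nothing is modified
stash_entries="$(git stash list | wc -l | tr -d ' ')"

git stash pop                                        # re-apply and drop the stash entry
restored_last_line="$(tail -n 1 notes.txt)"
echo "stashed entries: $stash_entries, restored: $restored_last_line"
```

Between `git stash` and `git stash pop` is where you would normally run `git pull`.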

## Other Useful Commands

```
git <command> --help # pull up documentation for <command>

git status # check which files have been changed in the working directory

git log # get a history of changes

git checkout HEAD -- <somefile> # revert a file to its state at the last commit

git reset --hard # revert the working directory and staging area to the last commit. WARNING: THIS CANNOT BE UNDONE

git rm <somefile> # remove the file both from your git repository and from the file system, adding the removal to the next commit

git mv <old_file_path> <new_file_path> # move a file that is already versioned in git from one location to another, adding the change to the next commit
```
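As an illustration of reverting a single file, the sketch below overwrites a committed file and then restores it from the last commit. The file name `data.txt` and its contents are hypothetical, and everything happens in a scratch repository.

```shell
# Sketch: discard an unwanted edit by restoring the file from the last commit.
set -e
cd "$(mktemp -d)"
git init -q
git config user.name "Clark Kent"
git config user.email "clark.kent@dailyplanet.com"

echo "original" > data.txt
git add data.txt
git commit -q -m "Add data file"

echo "accidental edit" > data.txt       # overwrite the committed content
git checkout HEAD -- data.txt           # restore the file as of the last commit
restored="$(cat data.txt)"
echo "file now contains: $restored"
```

Unlike `git reset --hard`, this touches only the named file and leaves other uncommitted changes alone.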

## GitHub Flow

So far we have been doing the "solo" workflow, which looks something
like the following:
```
 > mkdir my_working_directory
 > cd my_working_directory
 > git init
 > touch some_file.py
  # hack, do some work, hack
  # hack
 > git add some_file.py
 > git commit -m "Working with some awesome idea"
 > git push origin master
  # hack
  # more hacking
```

As you might have guessed, this workflow is just fine when you are
working by yourself. When you're working in a team, it's useful to
have a more structured workflow. Here we'll talk about the GitHub flow.

In the GitHub flow, *we never code anything unless there is a need to.*
When something needs to be done, we create an **issue** on the GitHub repository
for it. *Good* issues:
- Are clear
- Have a defined output
- Are actionable (written in the imperative voice)
- Can be completed in a few days (at most)

Here are some examples:
- *Good*: /Fix the bug in .../
- *Good*: /Add a method that does .../
- *Bad*:  /Solve the project/
- *Bad*:  /Some error happen/

[Here is how to create a GitHub issue.](https://guides.github.com/features/issues/)

Once an issue exists, we'll pull from the repo and create a *branch*.
A *branch* is a copy of the code base separate from the main master branch
where we can work on our issue (e.g., fixing a bug, adding a feature) without
affecting the master branch during our work and then ultimately merge our
change into the master branch.

The flow goes something like this:
```
##Pull from the repo
> git pull
##Decide what you want to do and create an issue
> git checkout -b a-meaningful-name
```
The command `git checkout -b` creates a new branch (in this case
called "a-meaningful-name") and switches to that branch. We can see what
branch we are on by using the command `git branch`, which displays all
the branches in the local repository with a `*` next to the branch we are
currently on.
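A minimal sketch of creating and inspecting a branch, using a scratch repository and a hypothetical `README.md`. An initial commit is made first, because a repository with no commits has no branch to branch from.

```shell
# Sketch: create a branch, switch to it, and list the local branches.
set -e
cd "$(mktemp -d)"
git init -q
git config user.name "Clark Kent"
git config user.email "clark.kent@dailyplanet.com"

echo "# Scratch project" > README.md
git add README.md
git commit -q -m "Add README"

start_branch="$(git rev-parse --abbrev-ref HEAD)"    # name of the default branch
git checkout -q -b a-meaningful-name                 # create the branch and switch to it
current_branch="$(git rev-parse --abbrev-ref HEAD)"

git branch                                           # the current branch is marked with *
echo "started on $start_branch, now on $current_branch"
```

The name of the starting branch depends on your git configuration, which is why the sketch reads it from the repository instead of assuming `master`.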
```
##
##hack, hack, hack, make some changes, add/rm files, commit
##
##Push to the repo and create a remote branch
> git push
##Create a pull request and describe your work (Suggest/add a reviewer)
##Someone then reviews your code
##The pull-request is closed and the remote branch is destroyed
##Switch to master locally
> git checkout master
##Pull the most recent changes (including yours)
> git pull
##Delete your local branch
> git branch -d a-meaningful-name
```
[Here is how to create a GitHub pull request.](https://help.github.com/articles/about-pull-requests/)
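The push step of this flow can be rehearsed without GitHub. The sketch below is an illustration under stated assumptions: a local bare repository stands in for the GitHub remote, the file names are hypothetical, and the pull request and review steps are skipped because they happen on GitHub itself.

```shell
# Sketch: push a feature branch to a stand-in remote (a local bare repository).
set -e
work="$(mktemp -d)"
git init -q --bare "$work/remote.git"                # stand-in for the GitHub repo
git clone -q "$work/remote.git" "$work/local"        # git warns that the clone is empty
cd "$work/local"
git config user.name "Clark Kent"
git config user.email "clark.kent@dailyplanet.com"

echo "# Scratch project" > README.md
git add README.md
git commit -q -m "Add README"
main_branch="$(git rev-parse --abbrev-ref HEAD)"     # default branch name varies
git push -q origin "$main_branch"

git checkout -q -b a-meaningful-name                 # branch for the issue
echo "fix" > fix.txt
git add fix.txt
git commit -q -m "Fix the bug"
git push -q origin a-meaningful-name                 # creates the remote branch

git ls-remote --heads origin                         # lists both branches on the remote
```

Naming the remote and branch explicitly in `git push` avoids relying on the `push.default` setting.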


# Git Overview

- back to [Table of Contents](#Table-of-Contents)

The tutorial will give a quick overview of what git is, how to use it, and how to work
as a team using the GitHub workflow.

For this tutorial you will need to have access to [Git](https://git-scm.com/book/en/v2/Getting-Started-Installing-Git), a [Terminal](http://swcarpentry.github.io/shell-novice/), and a [text editor](http://swcarpentry.github.io/git-novice/02-setup/) - you can use a terminal-based text editor like [vim](http://www.vim.org/) or [emacs](https://www.gnu.org/software/emacs/), or you can use one with a more complex interface like [Sublime](https://www.sublimetext.com/).

> For this tutorial, all commands that you should type in your terminal will be
> prefaced with `$`.

## Git in a nutshell

![git](git.png)

[Image source](https://xkcd.com/1597/)

Git is a *version control system* which helps you keep track of changes you make to files during the development
of your project. You can think of it as an undo button, a lab notebook, and a tool to safely and efficiently collaborate with others on a shared project, all rolled into one.
All serious software projects use version control.

While git is mostly used in software development, it can be used for anything you like
([writing books](https://www.gitbook.com/), for example), as long as your files are plain text
(e.g., source code, $\LaTeX$  files).

Simply speaking, git saves snapshots of your work called `commits`; after a `commit` is created, you can go back
and forth through different commits in your project -- maybe you were experimenting with some new function and
realized the old function was better, no problem, you can bring back anything! The collections of commits together with their associated
metadata (like who made the changes, and when) form the `repository` of your project.

![git 2](git_commit.png)

[Image source](https://xkcd.com/1296/)

The entire development of your project, the `repository`, is stored on your computer. Keeping only one copy is
risky, so you can also host a remote copy on a hosting service like GitHub, Bitbucket, or GitLab.
Hosting a project's `repository` on GitHub also allows for the distribution of your work and collaboration
with others. This prevents endless emailing of source code and the following situation:



![final](phd101212s.gif)


## git sounds awesome! How do I get it?

Chances are, git is already installed on your computer. To check, open up a terminal and type `git --version`.
If not, you can get it [here](https://git-scm.com/).

OS X users can use `homebrew` to install git.

## Can I get buttons and stuff?

`git` is a command line tool, which means it doesn't have a native graphical user interface. Using the
`git` CLI is the most flexible way of working with `git`, and if you are working on a remote server you are
unlikely to be able to use a GUI.

However, if you *really* want a point-and-click interface on your computer, here are some options:

*   [Options for Mac](https://git-scm.com/download/gui/mac)
*   [GitKraken](https://www.gitkraken.com/) (Windows and Mac)

*Keep in mind that if you are logging into a remote machine, such as AWS, a GUI may not be an option.*

## Configure your Git Profile

First things first: you need to configure your `git` client, so that your commits are correctly attributed to you,
and so you get pretty output. Do the following:

Open up a terminal: Go to your desktop on ADRF, left-click and
select Open Terminal.

```
# How my git configuration currently looks
$ git config --list
```

Replace the name and email below with your own information:

```
# Adding some customization
$ git config --global user.name "Clark Kent"
$ git config --global user.email "clark.kent@dailyplanet.com"
$ git config --global color.ui "auto"
$ git config --global core.editor 'nano' # or vim, emacs, sublime
```
For a list of text editors, see Software Carpentry's [list](http://swcarpentry.github.io/git-novice/02-setup/).

Also do the following (important):
```
$ git config --global push.default current
```
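The `--global` flag applies a setting to every repository for your user account. Leaving it off stores the setting in the current repository only. The sketch below illustrates this in a scratch repository so that your real global configuration is not touched. It is written as a plain script, without the prompt prefix.

```shell
# Sketch: a repository-level setting (no --global) is stored in .git/config.
set -e
cd "$(mktemp -d)"
git init -q

git config user.name "Clark Kent"                    # applies to this repository only
git config user.email "clark.kent@dailyplanet.com"

local_name="$(git config --local --get user.name)"   # read back the repo-level value
echo "repo-level user.name: $local_name"
git config --local --list                            # all settings stored in this repo
```

A repository-level value takes precedence over the global one, which is handy if you use a different email for one project.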
You now have your `git` client configured. Next we will create
our first repository.


# Create a Repository

Let's work on creating our first `git` repository.

In this part of the tutorial, all shell commands that you should type in the terminal will be prefixed with `>`,
not the `$` we used in Introduction to CLI.

## Create a git repository
Let's say our project is to analyze 311 data from New York City.

> If you already have a `nyc_311` directory, skip the `mkdir` command below and just `cd` into it.

First we'll *make a directory* for our project called `nyc_311` and then
*change directories* to our new `nyc_311` directory:
```
> mkdir -v nyc_311
> cd nyc_311
```
The `-v` flag after `mkdir` produces *verbose* output, which means that you'll see the results of the command displayed.
You should see `mkdir: created directory 'nyc_311'`, so you know that the command did what you intended.



Now let's *initialize* the git repository using the command `git init`. (You'll notice that
all `git` commands have the format `git <verb>`.)
```
> git init
```
One of the things `git init` does is create a `.git` directory. To see this, we can
*list* all the files in this directory using the `ls` command. We include the `-a`
flag to tell the command to display *hidden* files, or those that begin with a `.`,
in the list.
```
> ls -a
.  ..  .git/
```

We see that there is now a `.git` directory in the `nyc_311` directory.
*Unless you really know what you are doing* **DO NOT EVER** *modify anything in this
directory. If you delete this directory, the entire history of your project will be gone.*
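If you want to confirm that a directory is a git repository without looking inside `.git`, you can ask git directly. The sketch below uses a scratch parent directory as an assumption, so it does not interfere with an existing `nyc_311`. It is written as a plain script, without the prompt prefix.

```shell
# Sketch: confirm that git init created a repository.
set -e
cd "$(mktemp -d)"                        # scratch parent directory
mkdir nyc_311
cd nyc_311
git init -q

inside="$(git rev-parse --is-inside-work-tree)"      # "true" inside a repository
git_dir="$(git rev-parse --git-dir)"                 # location of the .git directory
echo "inside a repository: $inside (git directory: $git_dir)"
```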

## Make our first commit!!

All good projects should have a "README" describing the purpose and organization of the
project, so let's start with that. Fire up your favorite text editor and create
a file named `README.md`. The `.md` means this file will be read as `markdown`, which
allows us to format the text programmatically. Here's a basic example of a `README.md`:
```
# Exploring 311 Calls in NYC

## Description

This repo is for an analysis of 311 calls in NYC using Python 2.7.
```
When you're first starting a project, there's a good chance you won't have much more to add,
and that's OK. Notice that we *do* have a title, a short description of the project, and a
note of the software the project requires, also known as the *dependencies*. It's a good idea
to keep updating both the description and dependencies as your project grows and evolves.

The `#` here don't signify hashtags or comments - they are part of
[`markdown` syntax](https://github.com/adam-p/markdown-here/wiki/Markdown-Cheatsheet), and
they designate headings.

Now let's look at the *status* of our repo using `git status`:
```
> git status
On branch master

Initial commit

Untracked files:
  (use "git add <file>..." to include in what will be committed)

          README.md

nothing added to commit but untracked files present (use "git add" to track)
```
`git status` tells us which *branch* we are on, which files are being tracked, and which
files are present that *aren't* being tracked. We see our new file `README.md` listed
under `Untracked files` - `git` sees that we've added something, but doesn't know whether
it should be logged as part of our project. When we create a new file, we need to *tell*
`git` to start tracking it. Like the command suggests, let's use `git add` to track
`README.md`.
```
> git add README.md
```
Now let's invoke the command `git status` again.

```
> git status
On branch master

Initial commit

Changes to be committed:
  (use "git rm --cached <file>..." to unstage)

          new file:   README.md
```
Now that we have added the file, it has been "staged" to be committed. We can now make
our first commit!
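The change from untracked to staged is easy to see in the compact form of the status output. This sketch runs in a scratch repository, as a plain script without the prompt prefix. In `git status --short`, `??` marks an untracked file and `A` marks a newly added file that is staged.

```shell
# Sketch: compare the short status before and after git add.
set -e
cd "$(mktemp -d)"
git init -q

printf '# Exploring 311 Calls in NYC\n' > README.md
before="$(git status --short)"           # untracked: "?? README.md"
git add README.md
after="$(git status --short)"            # staged:    "A  README.md"

echo "before add: $before"
echo "after add:  $after"
```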

```
> git commit
```
When you invoke the `git commit` command, an editor should pop up. This is for you to
write your **commit message**, a message that provides the context of
*what you did* and *why you did it*. Anyone can look at a commit and examine
what was changed; you might look at a commit message to find where you changed a
certain file, or your collaborator might read it before incorporating your changes
into the shared version of the project.

Give some thought to what you write here - good commit messages lead to a usable `git` log,
and separate novice `git` users from competent practitioners. Generally, you should follow
these guidelines in a commit message:

1. **First line** is a one-line summary (fewer than 80 characters) of the commit. It should
be in [title case](https://en.wikipedia.org/wiki/Letter_case#Title_case) and written in
the [imperative voice](https://en.wikipedia.org/wiki/Imperative_mood). A good rule of thumb:
*If applied, this commit will* `<title of commit message>`.

2. **Second line** should be blank.

3. **Third (and subsequent) lines** should include more details of the commit.

My commit message is the following:
```
Check in README file

* Added short description of the project
* Added python2.7 as a dependency
```
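The same summary/blank line/details structure can be produced without an editor. As a sketch in a scratch repository: passing `-m` twice makes the first value the summary line and the second a separate paragraph, with git inserting the blank line between them.

```shell
# Sketch: write a summary line and a body without opening an editor.
set -e
cd "$(mktemp -d)"
git init -q
git config user.name "Clark Kent"
git config user.email "clark.kent@dailyplanet.com"

printf '# Exploring 311 Calls in NYC\n' > README.md
git add README.md
git commit -q -m "Check in README file" \
              -m "* Added short description of the project"

subject="$(git log -1 --format=%s)"      # first line of the message
body="$(git log -1 --format=%b)"         # everything after the blank line
echo "subject: $subject"
echo "body:    $body"
```

For longer messages the editor is usually more comfortable. This form is mainly useful for short commits and scripts.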
Now that we have made our first commit we can examine our log using `git log`!
```
> git log

* commit aaf89fd77e9b43d99fe32823843a7519b2108c90
  Author: Clark Kent <clark.kent@dailyplanet.com>
  Date:   Sat Nov 05 13:45:11 2016 -0600

          Check in README file

          * Added short description of the project
          * Added python2.7 as a dependency
```
The long string of characters in the first line is a unique identifier for your commit.
You can use this identifier to switch back to this specific version of your project in
the future. The second line is who made the commit. The third line gives the date
and time. Everything that follows is the commit message.
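As a sketch of what the identifier is good for, the example below records the identifier of a first commit, changes the file in a second commit, and then reads the old version back with `git show`. The file contents are hypothetical and everything happens in a scratch repository.

```shell
# Sketch: use a commit identifier to view an earlier version of a file.
set -e
cd "$(mktemp -d)"
git init -q
git config user.name "Clark Kent"
git config user.email "clark.kent@dailyplanet.com"

echo "first draft" > README.md
git add README.md
git commit -q -m "Check in README file"
first_commit="$(git rev-parse HEAD)"     # full identifier of the first commit

echo "second draft" > README.md
git commit -q -a -m "Revise README"      # -a stages changes to tracked files

old_text="$(git show "$first_commit:README.md")"     # file as of the first commit
echo "README.md at $first_commit: $old_text"
```

`git show` only prints the old content. The working copy of `README.md` still holds the second draft.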

To see only the *titles* of commit messages, you can use `git log` with the
`--oneline` option:
```
> git log --oneline
```
There are many other options to the `git log` command; explore the documentation to find
which ones suit your needs. Here's one combination we recommend (check it out!):
```
> git log --oneline --graph --all --decorate
```

## Moving and removing files
Now you know how to *add* a file to your project using `git`. What if you want to get
rid of something? You can **remove** files from your repo using `git rm`:
```
> git rm FILENAME
> git rm -r DIRECTORY
```
Note that when you remove a file using `git`, you can still go back to a *previous* commit
before you removed the file and **recover** that file. This is different from simply using
the command `rm`, which permanently erases the file.

You'll also have to tell `git` if you want to *rename* a file using the `git mv` command,
because *renaming* a file is basically *moving* it to a new location. For
example, to change the name of a file from `OLD` to `NEW`:
```
> git mv OLD NEW
```
`git rm` and `git mv` will stage the changes, so after invoking one of these commands, you can
just `commit` your changes without using `git add` again.
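To make the recovery claim concrete, here is a sketch in a scratch repository: a file is renamed with `git mv`, removed with `git rm`, and then brought back from the commit before the removal. The file names `OLD` and `NEW` follow the example above, and the file content is made up.

```shell
# Sketch: rename a file, remove it, then recover it from an earlier commit.
set -e
cd "$(mktemp -d)"
git init -q
git config user.name "Clark Kent"
git config user.email "clark.kent@dailyplanet.com"

echo "keep me" > OLD
git add OLD
git commit -q -m "Add OLD"

git mv OLD NEW                           # the rename is staged automatically
git status --short                       # shows the staged rename
git commit -q -m "Rename OLD to NEW"

git rm -q NEW                            # the removal is staged automatically
git commit -q -m "Remove NEW"

git checkout "HEAD~1" -- NEW             # recover the file from the previous commit
recovered="$(cat NEW)"
echo "recovered content: $recovered"
```

The recovered file comes back staged, so a new commit is all that is needed to restore it to the project.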

We have our first commit and repo! Now we're ready to write some code!
