<img src="http://imgur.com/1ZcRyrc.png" style="float: left; margin: 20px; height: 55px">

# Session 3 - Managing Data

# Agenda

#### Housekeeping
#### Part 1: Version Control with Git
#### BREAK
#### Part 2: Databases

# Housekeeping

- Office hours have been posted.
- You should know who your TA is.

# Part 1: Version Control with Git

## Learning Objectives
- learn about Git and GitHub, and make your own repository
- understand and practise the basic Git commands you need to version your work over the remainder of the course

# Why "Version Control"?

![](assets/usb.png)

### What are desirable properties of a solution?

- keep a copy of our work somewhere other than our own computers
    - either there is a "master" copy (**centralised**)    
    - or everyone can have a full copy (**distributed**)

- ability to collaborate with others...
    - ...and gracefully handle "conflicts"

- ability to see changes over time
    - and go back to old versions if needed

**Nice to haves:**

- open source

- fast

- ability to work on same code for two different purposes
    - e.g. develop two new features of a piece of software that are separate

# One Solution: Git

![](assets/git-xkcd.png)

from [https://xkcd.com/1597](https://xkcd.com/1597)

## What's the difference between Git and GitHub?

### Git

- the underlying source control system
- allows repositories, commits, branches
- open source

### GitHub

- a company that hosts Git repositories for you
- free for public repositories (and, since 2019, for private ones too)
- alternatives include **BitBucket** (which allows free private repositories)
- GitHub Enterprise is a paid version to have a separate, private GitHub
- additional features e.g. wikis and issue tracker

# Step 1: Create a repository

A **repository** (or "repo") is a self-contained "folder" of files. Think of it as a **project**.

Let's create a repository for you to store your own work.

*Note: if you've forked `dat24` you can use that fork for your work, but remember it is **public***

![](assets/git/new_repo.png)

![](assets/git/create_repo_menu.png)

![](assets/git/after_create_repo.png)

### What's that `.gitignore` file?

It is a hidden file that tells Git which files **not to version**.

Why?

- generated files that you don't need
    - e.g. Jupyter creates "checkpoints" that you may not care about
- code artefacts (e.g. Python's `__pycache__` folders and `.pyc` files)
- it's nice to have the option!
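
As a rough illustration of the points above, here is a minimal sketch that writes a small `.gitignore` in a throwaway repository and checks that Git skips a Jupyter checkpoint. The temporary folder, the file names and the three ignore rules are assumptions chosen for this demo; the `.gitignore` that GitHub generates for you will contain many more rules.

```shell
set -eu

# Throwaway repository in a temporary folder (demo only).
demo=$(mktemp -d)
cd "$demo"
git init -q .

# Three example ignore rules: Jupyter checkpoints and Python bytecode.
printf '%s\n' '.ipynb_checkpoints/' '__pycache__/' '*.pyc' > .gitignore

# A notebook we care about, and a checkpoint we do not (hypothetical names).
mkdir .ipynb_checkpoints
touch analysis.ipynb .ipynb_checkpoints/analysis-checkpoint.ipynb

# The checkpoint folder should not appear among the untracked files.
untracked=$(git status --short)
echo "$untracked"

# `git check-ignore` prints a path only if an ignore rule matches it.
ignored=$(git check-ignore .ipynb_checkpoints/analysis-checkpoint.ipynb)
echo "ignored: $ignored"
```

`git status` lists `.gitignore` and `analysis.ipynb` as untracked, while the checkpoint is left out.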

### Give the teaching team access

Go to Settings -> Collaborators

![](assets/git/repo_access.png)

# Step 2: Get a local copy

So far, your repository exists only on GitHub Enterprise. It is a **remote** repository.

To work on it, you need a copy on your machine.

![](assets/git/get_clone_url.png)

To get a local copy, you **clone**.

In a terminal/command prompt navigate to a folder where you want a copy of your repository and type:

`git clone <your_url_here>`
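
If you want to try cloning without touching your real repository, the sketch below may help. It assumes a "bare" repository on your own disk as a stand-in for the one on GitHub, so the "URL" is just a local path; the folder names `remote.git` and `my-work` are made up for the demo.

```shell
set -eu

workdir=$(mktemp -d)
cd "$workdir"

# Stand-in for the repository hosted on GitHub (an assumption for this demo).
git init -q --bare remote.git

# General form: git clone <url> [folder]. Here the "URL" is a local path.
# Git warns that the repository is empty; that is expected in this demo.
git clone -q remote.git my-work

cd my-work
# The clone remembers where it came from under the name "origin".
git remote -v
remote_url=$(git remote get-url origin)
```

With your real repository you would paste the clone URL copied from GitHub in place of `remote.git`.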

# Step 3: Make a change

Open your automatically created `README.md` (which GitHub displays by default on the repository's front page) and add some text.

Maybe write who you are and what the purpose of this repository is!

# Step 4: "Saving" your changes

In Git, there are **three** stages to saving your work.

1 - "Stage" your work = prepare it for a commit. It will look as though nothing has happened.

2 -  "Commit" your work. Your changes are now part of your **local** repository.

3 - "Push" your work. Your commit will be sent to the **remote** repository.

2.5 (between "commit" and "push") - If you are collaborating with someone, or have copies of your work on multiple machines, it's good practice to "pull" before you "push".

This brings down to your **local** repository any changes made to the **remote** one since you **cloned** or last pulled.

### The Three Steps

#### 1 - "Stage"

To see any pending changes, type:

`git status`

To see what those changes actually are:

`git diff`

Now type

`git add README.md`

to "stage" the changes you've made to that file.

If you start typing the name of `README.md` you can press `TAB` to autocomplete it (great for long Jupyter notebook filenames).

*Note: if you **remove** a file, you still need to `add` its removal as a change!*

Check

`git status`

again. Notice your file is green and it says "Changes to be committed"
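
The staging step can be rehearsed in a throwaway repository. This is a minimal sketch under some assumptions: the repository lives in a temporary folder, the name and email are placeholders, and `--short` is used only to make the status output compact.

```shell
set -eu

repo=$(mktemp -d)
cd "$repo"
git init -q .
git config user.name "Demo Student"        # placeholder identity
git config user.email "demo@example.com"   # placeholder identity

# Start from a committed README, as in a freshly cloned repository.
echo "# My work" > README.md
git add README.md
git commit -q -m "Add README"

# Make a change, then inspect it.
echo "Notes and exercises for the course." >> README.md
git status --short        # " M README.md": modified but not yet staged
git --no-pager diff       # shows the line that was added

# Stage the change and look again.
git add README.md
staged=$(git status --short)
echo "$staged"            # "M  README.md": staged, ready to commit
```

In the short format, an `M` in the first column means the change is staged, while an `M` in the second column means it is not.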

#### 2 - "Commit"

You're ready to commit!

`git commit -m "Add a commit message"`
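
To see what a commit leaves behind, here is a small sketch in a throwaway repository. The identity and the commit message are placeholders for the demo.

```shell
set -eu

repo=$(mktemp -d)
cd "$repo"
git init -q .
git config user.name "Demo Student"        # placeholder identity
git config user.email "demo@example.com"   # placeholder identity

echo "# My work" > README.md
git add README.md

# The message should say what changed, so the history stays readable.
git commit -q -m "Add README describing this repository"

# One line per commit: short hash followed by the message.
git --no-pager log --oneline

commit_count=$(git rev-list --count HEAD)
pending=$(git status --short)   # empty: nothing left to stage or commit
```

After the commit, `git status` reports nothing pending, and the change is recorded in the **local** history only.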

![](assets/git-commit-xkcd.png)

from [https://xkcd.com/1296](https://xkcd.com/1296)

#### 3 - "Push"

It's good practice to pull before you push, so let's do that first.

`git pull`

***Note: this is also how you should get the latest course materials!***

Now:

`git push`

Technically you should specify:

- which remote repository to push to (you can have multiple at once)
- which **branch** to push to

Then the command becomes:

`git push <remote> <branch>`

e.g. `git push origin master`
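
The whole stage, commit, pull, push cycle can be tried safely with a local stand-in for the remote. This sketch assumes a bare repository on disk plays the role of GitHub, uses placeholder names and messages, and looks up the current branch name rather than assuming `master`, because the default depends on your Git version and settings.

```shell
set -eu

workdir=$(mktemp -d)
cd "$workdir"
git init -q --bare remote.git       # stand-in for the remote on GitHub
git clone -q remote.git my-work     # warns that the repository is empty
cd my-work
git config user.name "Demo Student"        # placeholder identity
git config user.email "demo@example.com"   # placeholder identity

# First commit and first push.
echo "# My work" > README.md
git add README.md
git commit -q -m "Add README"
branch=$(git symbolic-ref --short HEAD)    # e.g. master or main
git push -q -u origin "$branch"            # -u remembers the remote branch

# A second change: stage, commit, pull, then push.
echo "Notes and exercises for the course." >> README.md
git add README.md
git commit -q -m "Describe the purpose of this repository"
git pull -q origin "$branch"               # nothing new to bring down here
git push -q origin "$branch"

# The remote branch now points at the same commit as the local one.
local_head=$(git rev-parse HEAD)
remote_head=$(git --git-dir="$workdir/remote.git" rev-parse "refs/heads/$branch")
```

Once `-u` has been used, a plain `git pull` and `git push` work from that branch without naming the remote and branch each time.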

### Branching

![](assets/git/branching.svg)
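
A branch lets you work on an idea without disturbing the main line of work. The sketch below is one possible illustration in a throwaway repository; the branch name `experiment` and the file `idea.txt` are hypothetical, and the identity is a placeholder.

```shell
set -eu

repo=$(mktemp -d)
cd "$repo"
git init -q .
git config user.name "Demo Student"        # placeholder identity
git config user.email "demo@example.com"   # placeholder identity

echo "# My work" > README.md
git add README.md
git commit -q -m "Add README"
main_branch=$(git symbolic-ref --short HEAD)   # e.g. master or main

# Create a branch and switch to it, then commit there.
git checkout -q -b experiment
echo "Trying out an idea." > idea.txt
git add idea.txt
git commit -q -m "Sketch a new idea"

# Back on the main branch, the new file is not there yet.
git checkout -q "$main_branch"
files_before=$(ls)

# Merging brings the branch's commits into the main branch.
git merge -q experiment
git branch    # lists both branches; * marks the current one
```

Because the main branch had no new commits of its own, this merge is a simple "fast-forward" with no conflicts to resolve.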

# Summary of commands

How do you get a local copy of a remote repository?

`clone`

How do you see what has changed locally?

`status`

How do you stage a local change?

`add`

How do your changes get added to the **local** repository?

`commit`

How do your changes get added to the **remote** repository?

`push`

# Summary of commands

`clone`: gets a local copy of a **remote** repository<br>
`status`: show pending local changes<br>
`add`: stage a local change<br>
`commit`: commit a set of staged changes<br>
`push`: push your commits up to the remote repository<br>
`pull`: pull updates from the remote repository<br>

# Useful links

## Git GUI

If you don't want to use the command line, you can use a nice desktop application to manage your work:

- [GitHub Desktop](https://desktop.github.com/)
- [SourceTree](https://www.sourcetreeapp.com/)
- [Git Kraken](https://www.gitkraken.com/)

## Git cheat sheet

We all need one of these.

[https://www.git-tower.com/blog/git-cheat-sheet](https://www.git-tower.com/blog/git-cheat-sheet)

## When things go wrong...

[Oh, s***, git!](http://ohshitgit.com/)

## More about workflows, branching etc.

[Git workflows](https://www.atlassian.com/git/tutorials/comparing-workflows)

[Git branching](https://www.atlassian.com/git/tutorials/using-branches)

## Miscellaneous

A way to practise Git in the browser: [https://learngitbranching.js.org](https://learngitbranching.js.org)