# Version Control with Git

**Version control is a way of tracking the change history of a project.**

Git is a tool that automates and enhances a lot of the tasks that arise when dealing with larger, longer-living, and collaborative projects. It's also become the common underpinning to many popular online code repositories, GitHub being the most popular.

A good resource for learning Git is [The Git Book](https://git-scm.com/book/en/v2).

## Why should you use version control?


- **You can undo anything:** Git provides a *complete history* of every change that has ever been made to your project, timestamped, commented, and attributed. If something breaks, you always have the choice of going back to a previous state.
- **You won't *need* to keep undoing things:** One of the advantages of using git properly is that, by keeping new changes separate from a stable base, you tend to avoid the massive rollbacks associated with constantly tinkering with a single copy of the code.
- **You can identify exactly when and where changes were made:** Git allows you to pinpoint when a particular piece of code was changed, so it is easy to find what other pieces of code a bug might affect or to figure out why a certain expression was added.
- **Git forces teams to face conflicts directly:** On a team-based project, many people are often working with the same code. With a tool that understands when and where files were changed, it's easy to see when changes might conflict with each other. While it might sometimes seem troublesome to have to deal with conflicts, the alternative, *not* knowing there's a conflict, is much more insidious.

## Git Basics
The first thing to understand about git is that the contents of your project are stored in several different states and forms at any given time. If you think about what version control is, this might not be surprising: in order to remember every change that's ever been made, you need to store a record of those changes *somewhere*, and to be able to handle multiple people changing the same code, you need to have different copies of the project and a way to combine them.

You can think about git operating on four different areas:

![Git Commands](git_layout.png)

 - The **working directory** is what you're currently looking at. When you use an editor to modify a file, the changes are made to the working directory.
 
 
 - The **staging area** is a place to collect a set of changes made to your project. If you have changed three files to fix a bug, you will add all three to the staging area so that you can remember the changes as one historical entity. It is also called the **index**. You move files from the working directory to the index using the command `git add`.
 
 
 - The **local repository** is the place where git stores everything you've ever committed to your project. Even when you delete a file, the copies you committed earlier stay in the repo (this is necessary for always being able to undo any change). It's important to note that a local repository doesn't look much at all like your project files or directories. Git has its own way of storing all the information, and if you're curious what it looks like, look in the `.git` directory in the working directory of your project. Files are moved from the index to the local repository via the command `git commit`.
 
 
 - When working in a team, every member will be working on their own local repository. An **upstream repository** allows everyone to agree on a single version of history. If two people have made changes on their local repositories, they will combine those changes in the upstream repository. In our case this upstream repository is hosted by GitHub. This need not be the case; SEAS provides git hosting, as do companies like Atlassian (Bitbucket). This upstream repository is also called a **remote** in git parlance. The standard GitHub remote is called the **origin**: it is the repository which is given a web page on GitHub. One usually moves code from local to remote repositories using `git push`, and in the other direction using `git fetch`.
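
The path of a change through the four areas can be sketched with a throwaway setup. This is only an illustration under stated assumptions: the `scratch` directory, the file `notes.txt`, the placeholder identity, and the local bare repository standing in for a hosted upstream are all hypothetical and are not part of the course repository.

```shell
# Hypothetical scratch setup: a local bare repository plays the role of the upstream remote.
scratch=$(mktemp -d)
git init -q --bare "$scratch/upstream.git"
git init -q "$scratch/project"
cd "$scratch/project"
git config user.name "Example User"        # placeholder identity, needed to commit
git config user.email "user@example.com"
git remote add origin "$scratch/upstream.git"

# Working directory: edit a file as usual.
echo "hello" > notes.txt

# Staging area (index): collect the change.
git add notes.txt

# Local repository: record the staged change as one commit.
git commit -q -m "Add notes.txt"

# Upstream repository: publish the current branch to the remote named origin.
branch=$(git symbolic-ref --short HEAD)
git push -q origin "$branch"

# Other direction: download what the remote has, without touching the working directory.
git fetch -q origin
```

After the push, the local repository and the stand-in upstream hold the same commit, and the working directory has nothing left to stage.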

## Common Tasks in the Version Control of Files
### Forking a Repository
Forking brings a repository into your own namespace. It's really a *cloning* process (see below), but it's done between two "remotes" on the server. In other words, it creates a second upstream repository on the server, called the **origin**.

The forking process on GitHub will ask you *where* you want to fork the repository. Choose your own GitHub ID.

![forking](github-forking2.png)

In my case I will choose `@rahuldave`, as in the screenshot above. In this tutorial, wherever you see `rahuldave`, substitute your own GitHub ID.

This leaves me with my own repository, `rahuldave/Testing`, as seen in this image:

![forking](github-forking3.png)

You will get a similar page.

### Cloning a repository
Now that we have a **fork** of the `cs109/Testing` repository, let's **clone** it down to our local machines.

----
`git clone`

![clone](git_clone.png)

Cloning a repository does two things: it takes a repository from somewhere (usually an **upstream repository**) and makes a local copy (your new **local repository**), and it creates the most recent copy of all of the files in the project (your new **working directory**). This is generally how you will start working on a project for the first time.
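
A rough sketch of what a clone produces, runnable without network access. The stand-in upstream `Testing.git`, the `seed` repository, and its `README.md` are hypothetical and built locally for illustration; with a real fork you would pass the URL shown on your fork's GitHub page to `git clone` instead.

```shell
scratch=$(mktemp -d)

# Stand-in "upstream": build a tiny repository and publish it as a bare repo.
# (Assumption for illustration; a real fork lives on GitHub.)
git init -q "$scratch/seed"
echo "# Testing" > "$scratch/seed/README.md"
git -C "$scratch/seed" add README.md
git -C "$scratch/seed" -c user.name="Example User" -c user.email="user@example.com" \
    commit -q -m "Initial commit"
git clone -q --bare "$scratch/seed" "$scratch/Testing.git"

# The clone itself: copy the repository and check out its most recent files.
cd "$scratch"
git clone -q "$scratch/Testing.git" Testing
cd Testing

ls -a           # working directory files plus the hidden .git directory
git remote -v   # the source of the clone is registered as the remote "origin"
```

The new `Testing` folder is the working directory, and the `.git` directory inside it is the local repository.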

### Poking Around 
`git log`

Log tells you about all the changes that have occurred in this project so far.
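
To see what the log looks like without touching the course repository, here is a minimal sketch in a throwaway repository. The directory `log-demo`, the file `file.txt`, the commit messages, and the placeholder identity are all made up for the example.

```shell
scratch=$(mktemp -d)
git init -q "$scratch/log-demo"
cd "$scratch/log-demo"
git config user.name "Example User"        # placeholder identity, needed to commit
git config user.email "user@example.com"

# Two hypothetical commits so there is some history to look at.
echo "first" > file.txt
git add file.txt
git commit -q -m "Add file.txt"
echo "second" >> file.txt
git add file.txt
git commit -q -m "Extend file.txt"

# Full history, newest first: hash, author, date, and message for each commit.
git log

# Compact view: one line per commit, abbreviated hash plus message.
git log --oneline
```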

Each of these "commits" is identified by a SHA hash. It uniquely identifies all actions that have happened to this repository previously. We shall soon see how to add our own actions in. In the meantime, let's see the "status" of our working directory.

`git status`

![status](git_status.png)

Status is your window into the current state of your project. It can tell you which files you have changed and which files you currently have in your staging area.
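
As a small illustration, the sketch below puts one change in the working directory and another in the staging area, then asks for the status. The repository `status-demo` and the files `a.txt` and `b.txt` are hypothetical.

```shell
scratch=$(mktemp -d)
git init -q "$scratch/status-demo"
cd "$scratch/status-demo"
git config user.name "Example User"        # placeholder identity, needed to commit
git config user.email "user@example.com"

echo "one" > a.txt
git add a.txt
git commit -q -m "Add a.txt"

echo "two" >> a.txt      # changed in the working directory, not staged
echo "new" > b.txt
git add b.txt            # new file, staged in the index

# Long form: groups files into "to be committed" and "not staged for commit".
git status

# Short form: first column is the index, second column is the working directory.
git status --short
```

In the short form, `a.txt` is marked as modified in the working-directory column and `b.txt` as added in the index column.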

Finally, I set us up with a `.gitignore` file, hidden in the repository folder. It tells git which files to ignore when adding files to the index and then committing them to the local repository. We use this file to ignore temporary data files and the like when working in our repository. Folders are indicated with a `/` at the end, in which case all files in that folder are ignored.
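
A minimal sketch of how ignore rules behave. The two patterns and the file names below are made up for illustration and are not the contents of the `.gitignore` in the course repository.

```shell
scratch=$(mktemp -d)
git init -q "$scratch/ignore-demo"
cd "$scratch/ignore-demo"

# Two hypothetical rules: ignore every *.tmp file and everything under data/.
printf '%s\n' '*.tmp' 'data/' > .gitignore

mkdir data
echo "keep me" > keep.txt
echo "scratch" > scratch.tmp
echo "1,2,3" > data/raw.csv

# Stage everything; ignored files are silently left out of the index.
git add .
git status --short

# Ask git which rule causes each path to be ignored.
git check-ignore -v scratch.tmp data/raw.csv
```

Only `.gitignore` and `keep.txt` end up staged; the `.tmp` file and the `data/` folder never reach the index.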

You are always working on a given branch in a repository. Typically this is `master`. More on this later. You can find out which branch you are on by typing `git branch`. The starred one is the one you are on.
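
A quick sketch of the branch listing in a throwaway repository. The second branch, `experiment`, is hypothetical and exists only so that there is more than one entry to list; the name of the initial branch depends on your git configuration.

```shell
scratch=$(mktemp -d)
git init -q "$scratch/branch-demo"
cd "$scratch/branch-demo"
git config user.name "Example User"        # placeholder identity, needed to commit
git config user.email "user@example.com"

# A branch only shows up in the listing once it has at least one commit.
git commit -q --allow-empty -m "Initial commit"

# Create a second branch without switching to it.
git branch experiment

# List local branches; the current one is prefixed with "*".
git branch
```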
