<img src="./assets/1ZcRyrc.png" style="float: left; margin: 20px; height: 55px">

## Your Development Environment


<a id="home"></a>

## Lesson Guide

---

### [Part 3: Git and GitHub](#git_github) 

- [a) Git Basics](#git_basics)
- [b) Git CodeAlong](#git_codealong)
- [c) GitHub Basics](#github_basics)
- [d) GitHub & GitHub Enterprise Accounts](#gh_ghe_accounts)
- [e) Creating and Cloning Repos CodeAlong](#making_cloning) 
- [f) Pulling](#pulling)  
- [g) Secure Shell (SSH)](#ssh) 
- [h) Independent Practice](#independent_practice3)


- [j) Cloning your first repository](#clone_repo)
- [k) How we will use GitHub in this course](#use_github)
- [m) Practice outside of class: Creating a Pull Request](#create_pull)
- [n) Practice outside of class](#independent_practice4) 
- [o) Git tips](#git_tips)
- [p) Conclusion](#conclusion2)

<a id="git_github"></a>
# <font style = 'color:blue'>Part 3: Version Control (Git and GitHub)</font>




First things first: Git is not GitHub. This is a common mistake that people make.



Git and GitHub may well seem confusing at first, as they use unique terms to describe how things are done.  Don't worry, you will get used to them as you use them during the course.

<a id="git_basics"></a>
## <font style = 'color:blue'>a) Git Basics</font>

### <font style = 'color:red'>Version Control</font>


#### Why "Version Control"?

![](assets/usb.png)


Version control is something we often have to do with files that we work with (just think of all the files you've made called `<filename>_v1`, `<filename>_v2` etc).





There are 4 steps we go through from creating a file to making changes to it and creating a new version:
1. Create a file
2. Save the file
3. Edit the file
4. Save the file again


We iterate through the steps 3 and 4 until we have a final version:

<img src="assets/git_tracking.png" style="width: 700px;"><br />
Source: https://git-scm.com/video/what-is-version-control


For software developers, version control is crucial as you need to be able to revert back to a previous version (if you don't like the code you've added, or if you can't get it to work).

This is where Git comes in.



### <font style = 'color:red'>What is Git?</font>



![](assets/git-xkcd.png)

from [https://xkcd.com/1597](https://xkcd.com/1597)

[Git](https://git-scm.com/) is:



- Software



- A distributed version control system



- A program you run from the command line


Programmers use Git so that they can keep a history of all changes made to their code. This means that they can roll back changes (or switch to older versions) as far back as when they started using Git in their project.

A code base in Git is referred to as a repository, or repo, for short.



Fun fact: Git was created in 2005 by Linus Torvalds, the creator of the Linux kernel.

### <font style = 'color:red'>How does Git work?</font>

Git enables you to take snapshots of your files at any given moment in time.  It does this by having three main states that your files can reside in:

1. Modified
2. Staged
3. Committed

<img src="assets/git_process.png" style="width: 700px;">

Your local repository consists of three "trees" that are maintained by Git:

- **Working Directory**: holds the actual files.  Once you start changing files, they are modified.  They can be saved locally, but won't be under version control.


- **Index**: acts as a staging area.  You add your changes to the staging area; these are then staged.


- **HEAD**: points to the last commit you've made.  Once you're happy to save the changes in version control, Git takes a snapshot of all the changes you've made, and stores that snapshot.  As you add more and more versions, Git essentially builds up a library of snapshots, and you can revert your code to a previous snapshot at any time.
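
To make the three states concrete, here is a minimal sketch that follows one file through them. It runs in a throwaway directory; the file name `notes.txt` and the demo name and email are placeholders invented for this example, not part of the course setup.

```bash
# Work in a throwaway directory so nothing on your machine is touched.
demo=$(mktemp -d)
cd "$demo"
git init --quiet
git config user.name "Demo User"            # placeholder identity, this repo only
git config user.email "demo@example.com"

echo "first draft" > notes.txt
untracked=$(git status --short)   # file exists only in the working directory

git add notes.txt
staged=$(git status --short)      # file is now staged in the index

git commit --quiet -m "Add notes"
committed=$(git status --short)   # empty: the working directory matches HEAD

echo "second draft" >> notes.txt
modified=$(git status --short)    # file has been modified since the last commit

printf '%s\n' "untracked: $untracked" "staged: $staged" \
              "committed: $committed" "modified: $modified"
```

In the short status format, `??` marks an untracked file, `A` a newly staged file, and `M` a modified one.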


#### How can you use Git?  What are the commands?



You access Git via the Command Line (though you can now use [GitHub Desktop](https://desktop.github.com/) too).





There are a variety of commands you can use in Git (<font style = 'color:red'>tip: every Git command starts with `git`</font>). You can take a look at a list of the available commands by running:

```bash
$ git help -a
```

Even though there are lots of commands, in the course, we will really only need about 10.

`q` is usually the key to press to exit a full-screen view in Git (such as a help page or the log) and get back to the terminal.




Plenty of cheat-sheets are available with a summary of all the commands.  For example:

https://training.github.com/downloads/github-git-cheat-sheet/
<br />
https://dzone.com/articles/top-20-git-commands-with-examples


<a id="git_codealong"></a>
## <font style = 'color:green'>b) Let's Git into it! CodeAlong</font>





First, create a directory on your Desktop.

```bash
$ cd ~/Desktop
$ mkdir hello-world
```


You can place this directory under Git revision control using the following command:

```bash
$ cd hello-world      # don't forget to `cd` into the folder
$ git init
```



Git will reply:

```bash
Initialized empty Git repository in <location>
```

You've now initialized the working directory.

#### The .git folder



We can look at the contents of this empty folder using this command:

```bash
ls -A
```



We should see that there is now a hidden folder called `.git`. This is where all of the information about your repository is stored. There is no need for you to make any changes to this folder. You can control all of the Git flow using `git` commands.
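
If you are curious, the sketch below peeks inside `.git` in a fresh throwaway repository. It only reads files; the exact list of entries can vary a little between Git versions.

```bash
# Create an empty repository in a throwaway directory.
demo=$(mktemp -d)
cd "$demo"
git init --quiet

# List what `git init` created inside the hidden folder.
ls -A .git

# HEAD is a small text file naming the branch that is currently checked out.
cat .git/HEAD
```

You should spot entries such as `HEAD`, `config`, `objects` and `refs`. Looking is harmless; editing them by hand is what to avoid.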


#### Add a file



Let's create a new file.




```bash
$ touch a.txt
```


If we run `git status`, we should get:



```bash
On branch master

Initial commit

Untracked files:
  (use "git add <file>..." to include in what will be committed)

	a.txt

nothing added to commit but untracked files present (use "git add" to track)
```



This means that there is a new, **untracked** file. 



Next, tell Git to take a snapshot of the contents of `a.txt`.

```bash
$ git add a.txt
```

This snapshot is now stored in a temporary staging area, which Git calls the "index." To confirm the files are staged and ready to be committed, again run `git status`.



You can alternatively add _all_ new and modified files at once using the command below (note the `.`). This is not recommended, because you can accidentally add extra files if you are not careful! However, sometimes it is useful if you are adding many files at once and carefully use `git status` for verification.

```bash
$ git add .
```
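
One way to be careful with `git add .` is to preview it first. This is a small sketch in a throwaway directory; the file names are made up for illustration.

```bash
demo=$(mktemp -d)
cd "$demo"
git init --quiet

touch a.txt b.txt secret-notes.txt

# Preview only: print what `git add .` would stage, without staging anything.
preview=$(git add --dry-run .)
echo "$preview"

# Nothing has been staged yet, so every file still shows up as untracked.
git status --short
```

If the preview lists a file you did not expect, add the files you do want by name instead.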

#### Commit



To permanently store the contents of the index in the repository, (i.e. commit these changes to the "HEAD"), you need to run the following command:



```bash
$ git commit -m "Please remember this file at this time"
```



You should now get:

```bash
[master (root-commit) b4faebd] Please remember this file at this time
 1 file changed, 0 insertions(+), 0 deletions(-)
 create mode 100644 a.txt
```

#### Checking the log



If we want to view the commit history, we can run:



```bash
git log
```



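As a result, you should see something similar to this (the condensed one-line form shown here comes from `git log --oneline --decorate --graph`; plain `git log` also lists the full hash, author and date):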



```bash
* b4faebd (HEAD, master) Please remember this file at this time
```
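
The log becomes more interesting with more than one commit. The sketch below makes two commits in a throwaway repository and prints the history in both forms; the identity and commit messages are placeholders.

```bash
demo=$(mktemp -d)
cd "$demo"
git init --quiet
git config user.name "Demo User"            # placeholder identity, this repo only
git config user.email "demo@example.com"

echo "one" > a.txt
git add a.txt
git commit --quiet -m "Add a.txt"

echo "two" >> a.txt
git add a.txt
git commit --quiet -m "Update a.txt"

# One line per commit, newest first: abbreviated hash, then the message.
git log --oneline

# The same history with the author and date of each commit.
# (In an interactive terminal this may open a pager; press q to leave it.)
git log
```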


<a id="github_basics"></a>
## <font style = 'color:blue'>c) GitHub Basics</font>

Another reason that version control is so important for software developers is that they need to be able to work with others on the same code, with everyone able to make changes at different times...


<img src="assets/git_multiple_tracking.png" style="width: 700px;">

<img src="assets/git_collaborative_tracking.png" style="width: 700px;">
<br />

Source: https://git-scm.com/video/what-is-version-control

This is where GitHub comes in.

### <font style = 'color:red'>What is GitHub?</font>



[GitHub](https://github.com/) is:




- A hosting service for Git repositories
- A web interface to explore Git repositories
- A social network of programmers


Some other points to note: 



- It's good to have an individual GitHub account to store code that you work on
- You can follow users and star your favorite projects
- Developers frequently use GitHub to share and collaborate on open-source code
- GitHub uses Git

#### Can you use Git without GitHub?


Think about this quote: “Git is software. GitHub is a company that happens to use Git software.” So yes, you can certainly use Git without GitHub!

#### How do Git and GitHub interact?



<img src="assets/github_process.png" style="width: 700px;">

### <font style = 'color:red'>What is GitHub Enterprise (GHE)?</font>

[GitHub Enterprise](https://enterprise.github.com/home):

- A professional application of GitHub
- All repository data is stored on private and/or local machines and networks



Where GitHub is the _public_ 'social network' for programming and programmers, GitHub Enterprise is the _private_, professional application of GitHub.  Because GitHub and GitHub Enterprise have a similar structure and are both built on Git, interacting with the two is almost identical.

<font style = 'color:green'>**Question:** In your own words, what is the difference between Git, GitHub, and GitHub Enterprise?</font>

**More information on [GitHub vs GitHub Enterprise](https://enterprise.github.com/downloads/en/comvsenterprise-082415.pdf)**


During the course we will be interacting with both GitHub and GitHub Enterprise (GHE).  You have set up a GHE account to gather course materials and as a private repo for you to store your own works-in-progress.  

You should also set up a [GitHub](http://github.com) account as a location for you to host projects and work that you want to exhibit or share. 


<a id="gh_ghe_accounts"></a>
## <font style = 'color:blue'>d) Git on GitHub and GitHub Enterprise</font>




For those of you who have not already set up your GitHub and GitHub Enterprise accounts, let's take a minute to do that now.


Account creation for both is simple.  Create a _Username_, and provide an _email_ & _password_.



_Keep in mind while you **can** use the same email, username and password for both accounts, they are **wholly separate**._

- **[GitHub](https://github.com/)**
 _This is yours_
 
- **[GitHub Enterprise for General Assembly](https://git.generalassemb.ly/join?source=header)**
_This is ours_



You can use a GitHub account to create a GitHub Enterprise account, but that will be **your** enterprise; you will not be able to use it to access the General Assembly GitHub Enterprise.

If in the future you join another GitHub Enterprise you will also need to create another account for said enterprise. 


<a id="making_cloning"></a>
## <font style = 'color:green'>e) Creating and Cloning Remote Repositories: CodeAlong</font>





When using Git, GitHub and GHE it is common to have your repositories in several locations.  Typically when we use GitHub and GHE we will have two repository locations, **Remote** and **Local**.


- **Remote:** Repositories that are not stored in our current location/machine. Usually where we store the repo.
- **Local:** Repositories that are stored on our current machine. Usually where we work on the repo.


**Let's do this together:**

1. Go to your GitHub Enterprise (GHE) account.
2. On the left hand side, hit the `New` button in the `Repositories` section.
3. Name your repository `hello-world`.
    - **DO NOT initialize the repository with a `README`, `.gitignore`, or license.**

4. Click the big, green `Create Repository` button.



We now need to connect our local Git repository with our newly created remote repository on GitHub. We have to add a "remote" repository, an address where we can send our local files to be stored.



On the right hand side of your GitHub repo in the `Quick setup — if you’ve done this kind of thing before` section, click on the copy button (looks like a clipboard) to the right of the http address.  This copies the URL of the repo (which will be our remote repo). 




_Make sure you changed directories into `hello-world` prior to running this_
```bash
git remote add origin https://github.com/<GITHUB-NAME>/hello-world.git
```



_You **may** be prompted for a username and password the first time Git contacts the remote (for example, when you push), especially when using GHE_

#### Pushing to GitHub



In order to send files from our local machine to our remote repository on GitHub, we need to use the command `git push`. 


However, you also need to add the name of the remote repo — in this case, we called it `origin` — and the name of the branch, in this case `master`.


```bash
git push origin master
```




Refresh your GitHub web page, and your files should appear.
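
If you want to rehearse the remote-and-push steps without touching GitHub, the sketch below uses a local "bare" repository as a stand-in for the empty repo on the server. The paths, the placeholder identity and the `git symbolic-ref` line (which simply makes sure the first branch is called `master`, whatever your Git defaults to) are assumptions made for this example.

```bash
demo=$(mktemp -d)
cd "$demo"

# A bare repository stands in for the empty repo you created on GitHub.
git init --quiet --bare remote.git

# The local repository, with one commit on `master`.
git init --quiet hello-world
cd hello-world
git symbolic-ref HEAD refs/heads/master     # name the first branch `master`
git config user.name "Demo User"            # placeholder identity, this repo only
git config user.email "demo@example.com"
touch a.txt
git add a.txt
git commit --quiet -m "Please remember this file at this time"

# Register the remote under the name `origin`, then push `master` to it.
git remote add origin "$demo/remote.git"
git push --quiet origin master

# List the branches the remote now holds, with the commit each points to.
git ls-remote origin
```

With GitHub the only difference is the address: `origin` points at an `https://` or SSH URL instead of a folder on your machine.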

#### Creating a README.md file



Let's create a `README.md` file and push it to GitHub! 




Any file ending with `.md` is a Markdown file -- a text file with optional [Markdown formatting](https://daringfireball.net/projects/markdown/syntax). On GitHub, the contents of a directory's `README.md` are automatically displayed when you view that directory.


Create a new `README.md` text file and add some text. (We'll try the command-line text editor `nano` this time!)

```bash
nano README.md
```



<font style='color:green'>Now, try using the same procedure as before to:</font>





* Add the new file to the staging area.  ```git add README.md```


* Verify the file is in the staging area, ready to be committed.  ```git status```


* Commit the file.  ```git commit -m'put a message here'```


* Push the commits to GitHub.  ```git push origin master```



Refresh your GitHub web page, and the new `README.md` file should appear. Take a look underneath the directory tree, and its contents will be automatically displayed.

<a name="pulling"></a>
## <font style = 'color:blue'>f) Pulling from GitHub</font>




Sometimes, other people will make commits and push them to GitHub. Locally, we will need to `fetch` these changes and `merge` them with our local files. To do both of these steps at once, run the `pull` command:

```bash
git pull origin master
```


This pulls the `master` branch from our remote `origin` into our local `master` branch.

Since our remote repo contains the same commits as we do locally, it will tell us everything is up-to-date.
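
To see a pull that actually brings something in, the sketch below plays both roles locally: a "teammate" clone pushes a commit, and our copy pulls it. The bare repository stands in for GitHub, and all names, paths and identities are placeholders for this example.

```bash
demo=$(mktemp -d)
cd "$demo"

# Stand-in for the shared repo on GitHub.
git init --quiet --bare remote.git
git --git-dir=remote.git symbolic-ref HEAD refs/heads/master

# Our local repo: one commit, pushed to the remote.
git init --quiet mine
cd mine
git symbolic-ref HEAD refs/heads/master
git config user.name "Demo User"
git config user.email "demo@example.com"
echo "hello" > README.md
git add README.md
git commit --quiet -m "Add README"
git remote add origin "$demo/remote.git"
git push --quiet origin master

# A teammate clones the repo, commits a change, and pushes it.
git clone --quiet "$demo/remote.git" "$demo/teammate"
cd "$demo/teammate"
git config user.name "Teammate"
git config user.email "teammate@example.com"
echo "a line from a teammate" >> README.md
git commit --quiet -am "Extend README"
git push --quiet origin master

# Back in our copy: fetch the teammate's commit and merge it into `master`.
cd "$demo/mine"
git pull --quiet origin master
git log --oneline
```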

By now, you may have noticed that you are being prompted quite a bit for your password when interacting with GitHub or GitHub Enterprise via Git.  There are two ways to work around this. 



**[Caching your password in Git](https://help.github.com/articles/caching-your-github-password-in-git/)**

or

**Using SSH and SSH Agent** (_recommended_)


You can use [this guide](../01-welcome-to-data-science/02a_SSH-setup.ipynb) to get it sorted, or those available on GitHub:

- [Working with SSH key passphrases](https://help.github.com/articles/working-with-ssh-key-passphrases/)
- [Generating a new SSH key and adding it to the ssh-agent](https://help.github.com/articles/generating-a-new-ssh-key-and-adding-it-to-the-ssh-agent/)
- [Adding a new SSH key to your GitHub account](https://help.github.com/articles/adding-a-new-ssh-key-to-your-github-account/)

<a id="ssh"></a>
## <font style = 'color:blue'>g) Secure Shell (SSH)</font>




SSH, or Secure SHell, is a common means of adding an additional layer of security.  Simply put, SSH is used to establish authenticity between a client and a server so that a secure connection can be formed.  This can be useful for secure file sharing or remote application access.  



#### How SSH works



The SSH process at a high level is relatively simple.  
1. Client makes a request to the server.
2. Server responds asking for authentication.
3. Client provides authentication.
4. If authentication is correct, a connection is established.



**Note:** An SSH Agent can be used to avoid being prompted for a password every time you push to or pull from GitHub.
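
The "authentication" the client provides is usually a key pair. As a rough sketch of what generating one looks like, the example below writes a demo pair to a throwaway directory rather than `~/.ssh`. The file name, the email label and the empty passphrase are assumptions for the demo only; follow the linked GitHub guides for your real key.

```bash
# Write the demo key pair to a throwaway directory rather than ~/.ssh.
keydir=$(mktemp -d)

# -t key type, -C a label (usually your email), -f output file, -N passphrase.
# The empty passphrase is for this demo only; choose a real one for everyday use.
ssh-keygen -q -t ed25519 -C "you@example.com" -f "$keydir/id_ed25519_demo" -N ""

# The private key stays on your machine; the .pub file is the part you share.
cat "$keydir/id_ed25519_demo.pub"
```

Once a public key has been added to your GitHub account, `ssh -T git@github.com` is a quick way to check that the connection works.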


<a id="independent_practice3"></a>
## <font style='color:green'>h) Independent Practice</font>

Repeat the steps above, starting from the creation of the remote repo, to create a _local_ and _remote_ repository within your GitHub Enterprise account.  


If you have extra time and you haven't done it already, feel free to try to set up SSH.  If you have any issues there are [SSH troubleshooting guides](#ssh_trouble) at the end of the lesson.

<a id="clone_repo"></a>
## <font style='color:green'>j) Cloning your first repository</font>




If you already have a `repo` on GitHub and want a local copy of it on your machine, you can `clone` that repo from its GitHub link to have it automatically configured in a new directory (instead of going through the set up procedure in e) above).

```
git clone https://github.com/<github-username>/hello-world.git
```



Now that everyone has a repository on GitHub, let's clone one!

Cloning allows you to make a local copy of a remote repository.



Navigate back to your Desktop, and **delete your `hello-world` repository**.

```bash
cd ~/Desktop
rm -rf hello-world
```

<font style='color:green'>Now, everyone please post their **GitHub usernames in Slack** under the thread that we'll start.
    


Navigate to someone else's repository on GitHub:

```bash
https://www.github.com/<github-username>/hello-world
```

On the right-hand side, you will see the green button 'Clone or download'.</font>

#### Clone that repo!



To retrieve the contents of the repo, all you need to do is:

```bash
$ git clone git@github.com:<partners-github-username>/hello-world.git
```



Git should reply:

```bash
Cloning into 'hello-world'...
remote: Counting objects: 3, done.
remote: Total 3 (delta 0), reused 3 (delta 0), pack-reused 0
Receiving objects: 100% (3/3), done.
Checking connectivity... done.
```



You've now cloned your first repository!   The process will be the same with GitHub Enterprise.  

Cloning is most useful when there is a repo that exists remotely that we want on our local machines. 
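
Here is an offline sketch of what `git clone` sets up for you. A small local repository plays the part of the remote; its path, contents and the placeholder identity are invented for this example.

```bash
demo=$(mktemp -d)
cd "$demo"

# Build a small "remote" repo to clone from (on GitHub this already exists).
git init --quiet --bare remote.git
git --git-dir=remote.git symbolic-ref HEAD refs/heads/master
git init --quiet seed
cd seed
git symbolic-ref HEAD refs/heads/master
git config user.name "Demo User"
git config user.email "demo@example.com"
echo "# hello-world" > README.md
git add README.md
git commit --quiet -m "Add README"
git push --quiet "$demo/remote.git" master

# Clone it: one command creates the folder, copies the history
# and registers the source as a remote called `origin`.
cd "$demo"
git clone --quiet "$demo/remote.git" hello-world
cd hello-world
git remote -v
git log --oneline
```

Note that there was no `git init` and no `git remote add` in the cloned folder: `clone` did both.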

<a id="use_github"></a>
## <font style = 'color:green'>k) How we will use GitHub in this course</font>




For the most part, our use of GitHub in this class will be much more linear.  

We will have a central class repository where all of the lesson materials are housed:


https://git.generalassemb.ly/GADS-BOH/ds50_repo


You will create a `Forked` version of the master repo which will be yours to do what you want with.  
![fork](assets/fork.png)



- Forked repos are essentially personal copies of others' repositories
- You can't change someone else's repo without their authorisation, but if you `fork` their repo you can make any changes you want to that
- Once you've forked a repo, if the original is updated you can pull those changes to your copy via ```git pull upstream master```



Once we have a forked copy, we will clone the fork to our machines to make a local copy.

Most of your git commands will be pulling updates from the main class repo to your local and pushing changes made on your local to your remote version.


### <font style='color:green'>Setting up our class repo on your machine</font>

1) Go to the class GitHub repo: https://git.generalassemb.ly/GADS-BOH/ds50_repo <br />
2) Click on `Fork` <br />
3) Choose a location for the forked repo to go in (in your GitHub Enterprise account) <br />
4) Copy the url (`Code` then `Code with HTTPS` then copy the URL there) <br />
5) Create a folder for the course somewhere on your machine <br />
6) Navigate to that folder in the command line <br />
7) In the command line, type `git clone` and then paste the URL <br />

You now have the class repo on your local machine, and you can open the notebook files through Anaconda!

### But... what happens when updates are made to the original repo that we forked?  (And we want to get those updates)


 
We can add a link to the original repo that we forked, and `pull` any changes and merge them with our personal local and remote repos.

In the command line type `git remote add upstream` and then the URL of the course repo:

https://git.generalassemb.ly/GADS-BOH/ds50_repo

This enables you to connect directly to the original class repo to get updates before each session (see below).
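
The whole fork-and-upstream arrangement can be rehearsed locally. In the sketch below, `upstream.git` plays the class repo, `fork.git` plays your fork, and `my-course` is your local clone; every path, file name and identity is a stand-in invented for this example.

```bash
demo=$(mktemp -d)
cd "$demo"

# `upstream.git` plays the class repo, seeded by a "teacher" with one lesson.
git init --quiet --bare upstream.git
git --git-dir=upstream.git symbolic-ref HEAD refs/heads/master
git init --quiet teacher
cd teacher
git symbolic-ref HEAD refs/heads/master
git config user.name "Teacher"
git config user.email "teacher@example.com"
echo "Lesson 1" > lesson1.md
git add lesson1.md
git commit --quiet -m "Add lesson 1"
git remote add origin "$demo/upstream.git"
git push --quiet origin master

# `fork.git` plays your fork: a server-side copy of the class repo.
git clone --quiet --bare "$demo/upstream.git" "$demo/fork.git"

# Clone the fork (it becomes `origin`), then add the class repo as `upstream`.
git clone --quiet "$demo/fork.git" "$demo/my-course"
cd "$demo/my-course"
git remote add upstream "$demo/upstream.git"
git remote -v

# The class repo gains a new lesson...
cd "$demo/teacher"
echo "Lesson 2" > lesson2.md
git add lesson2.md
git commit --quiet -m "Add lesson 2"
git push --quiet origin master

# ...which we pull from `upstream` and then push on to our fork (`origin`).
cd "$demo/my-course"
git pull --quiet upstream master
git push --quiet origin master
```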




#### <font style='color:red'>What you need to do regularly</font>






At the end of session when you've worked on files in the repo, track your file changes:

- Verify which files have changed: 
    
    `git status`
    <br />

- Add all changes

    `git add <file name>` (or `git add .` if you're _sure_ there aren't any files you don't want to be uploaded)
    <br />

- Commit your changes with a meaningful message (no novels, but make different stages recognizable):

    `git commit -m '<your meaningful message>'`
    <br />
    
- Push your changes to your repo:

    `git push origin master`
    <br />

In order to get each lesson's files before each session, you will need to (from the lessons repository):


- Navigate to your lessons folder in the command line
- Then pull the current version of the course repo:

    `git pull upstream master`
    <br />
    


- Then push your updated local repo (which should have your version of the lessons' files, merged with any updated and new lessons' files) into your own individual repo (where you forked the files to originally):

    `git push origin master`
    <br />
    
The keyword `master` refers to a particular branch (the default one). With Git, different users can have different versions of a file, and you can also keep different versions of your own files on different branches (more on that at a later stage).    
    


<font style='color:green'>**Question:** Why does one command refer to `upstream master` and the other one to `origin master`? In other words, what's the difference between `upstream` and `origin`?</font>

<img src="assets/upstream_updates.png" style="width: 700px;"><br />

We will manage the projects through a different repo - we'll cover this in a future class.

<a name="create_pull"></a>
## <font style = 'color:green'>m) Practice outside of class: Create a Pull Request on GitHub</font>


#### What's a "Pull Request"?

Often, when we are working on a project with other collaborators, we can't all work on the same copy of the code at the same time, so we use _branches_.  


With branches we can create a branch off of the main repo or "Master Branch", perform our work whether that be additions or alterations and then merge the changes we made on _our_ branch back into the master.  



You can probably think of a few ways that multiple people trying to merge their work can get messy.  Fortunately, we have Pull Requests as a means of queuing and validating merges.  


Rather than having your branch merge automatically, your request will go through an administrator who can review your changes and additions and approve or deny your request.  



<img src="assets/branches.jpg" style="width: 500px;"><br />

_Even though we are trying to **push** our information into the master, it is called a pull request because we are making a request to the administrator to **pull** our branch into the master._

#### <font style='color:green'>Creating a Pull Request</font>


Before you can open a `Pull Request`, you must create a branch in your local repository, commit to it, and push that branch to a repository or fork on GitHub.
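
From the command line, those three steps look roughly like this. The sketch runs against a local bare repository standing in for GitHub; the branch name `branch-edits` matches the one used in this exercise, while the paths, file names and identity are placeholders.

```bash
demo=$(mktemp -d)
cd "$demo"

# Stand-in remote plus a local repo with one commit on `master`.
git init --quiet --bare remote.git
git init --quiet hello-world
cd hello-world
git symbolic-ref HEAD refs/heads/master
git config user.name "Demo User"
git config user.email "demo@example.com"
echo "# hello-world" > README.md
git add README.md
git commit --quiet -m "Add README"
git remote add origin "$demo/remote.git"
git push --quiet origin master

# 1. Create a branch and switch to it; `master` is left untouched.
git checkout --quiet -b branch-edits

# 2. Commit to the branch.
echo "A proposed change" > notes.md
git add notes.md
git commit --quiet -m "Add notes"

# 3. Push the branch to the remote.
git push --quiet origin branch-edits
git branch --all
```

Opening the pull request itself happens on the GitHub website once the branch has been pushed.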



* Go to the GitHub page for a repo of yours


* Click the button labeled 'Master' on the left towards the top and create a branch called 'branch-edits' by typing in the box and hitting enter.  This should put you on your new branch's page


* Make sure you're on your new branch and in GitHub create a new '.md' or '.txt' file:
    - Click on the 'Add file' button on the right towards the top
    - Select 'Create new file'
    - Call it what you like and place some text in it


* Click the 'Compare & pull request' button in the repository (if you can't see one, look for a link called '+- Compare' just below the green 'Code' button)
![pr](assets/pr.jpg)



* You'll land right onto the Compare page. Next, above the filename you should see two boxes labeled 'base:master' and 'compare: branch-edits'.  You can change these but they are correct by default: your new 'branch-edits' branch is being compared with the 'master' branch.


* Select the target branch that your branch should be merged to using the "Base Branch" dropdown menu


* Review your proposed change


* Hit 'Click to create a pull request' for this comparison


As you are working on your own repo in your own GitHub you will be able to merge the branch at your leisure by clicking the 'Merge pull request' button.

If you were working on a shared repo, there may be security in place to stop you merging until someone else has reviewed it and approved it.

#### How it works in practice: a GitHub Example Project



A company is building onto their website and adding a few new features.  They are assigning a team of engineers to complete the task.  

- Engineer 1 is responsible for re-doing the home page.
- Engineer 2 is responsible for re-working content pages 1 & 2
- Engineer 3 is responsible for re-working content pages 3 & 4

In this situation the `master` branch would probably be a copy of the files and scripts for the original website.  As this is the master copy it is best not to make any edits to it until those changes have been vetted.

The engineer team starts by making a branch off of the master called '`project_1`'.

From '`project_1`' the team of engineers would probably create their own branches to perform their tasks in.

- Engineer 1 : '`project_1_homepage`'
- Engineer 2 : '`project_1_content1_2`'
- Engineer 3 : '`project_1_content3_4`'


Once each engineer completed their task they would merge the changes from their branch into '`project_1`'.  

Now that all of the individual work has been compiled they can test the website and make sure that there are no bugs or issues then merge '`project_1`' back into the master copy.
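
The example project can be sketched with local branches only. This is one possible way to play it out, not a prescribed workflow: the file names and contents are invented, only two of the three engineers are shown to keep it short, and the merges are done directly rather than through pull requests.

```bash
demo=$(mktemp -d)
cd "$demo"
git init --quiet website
cd website
git symbolic-ref HEAD refs/heads/master
git config user.name "Demo User"            # placeholder identity, this repo only
git config user.email "demo@example.com"
echo "old home page" > home.html
echo "old content" > content1.html
git add .
git commit --quiet -m "Original website"

# The team branch, created from master.
git checkout --quiet -b project_1

# Engineer 1 works on a branch off project_1.
git checkout --quiet -b project_1_homepage project_1
echo "new home page" > home.html
git commit --quiet -am "Redo home page"

# Engineer 2 works on a separate branch off project_1.
git checkout --quiet -b project_1_content1_2 project_1
echo "new content" > content1.html
git commit --quiet -am "Rework content page 1"

# Each engineer's branch is merged into project_1...
git checkout --quiet project_1
git merge --quiet --no-edit project_1_homepage
git merge --quiet --no-edit project_1_content1_2

# ...and, once tested, project_1 is merged back into master.
git checkout --quiet master
git merge --quiet --no-edit project_1
git log --oneline --graph
```

Because the two engineers changed different files, Git can combine their work without any conflicts.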


<a id="independent_practice4"></a>
## <font style = 'color:green'>n) Practice outside of class</font>
- Explain the following commands in your own words.  Imagine that the audience are unfamiliar with Git and this is their first exposure to it.
- `init`, `add`, `commit`, `push`, `pull`, `clone`, and `fork`.



<a id="git_tips"></a>
## <font style='color: blue'>o) Useful Git commands</font>

<font style='color: SlateBlue'>*Initiate a git repo locally for the folder you're in*
```
git init
```



*Clone a remote repo*
```
git clone <url taken from the remote repo>
```

<br />



*Upload your changes to a remote repo*<br />
**This is the most frequently used git sequence**
```
git add <folder or filename>
git commit -m'your brief message about what's changed'
git push origin master
```

<br />



*Pull updates from a remote repo*<br />
**Watch out: this merges the remote changes into your local branch, so it can cause merge conflicts if you have edited the same files (if you're pulling `origin`)**
```
git pull origin master
```



*Get the status relative to the remote version*

```
git remote update
git status
```



*Setup the upstream remote*

```
git remote add upstream <URL of the original repo you forked>
```



*Configure your name and email address*

```
git config --global user.name "John Doe"
git config --global user.email johndoe@example.com
```



*Remove link between local and remote repos (you can just delete the folder; this is if you want to keep the folder but disconnect it from the remote repo)*
```
git remote rm origin
```
<br />



**Other commands to look up**:
- How to move between git branches
- How to rename a git branch (locally and remotely)



<a id="conclusion2"></a>
## <font style="color:blue">p) Lesson Review : Git and GitHub</font>

Do you feel comfortable with Git, GitHub and GitHub Enterprise? As we'll be using them in a lot of our coursework, let's make sure you are! 

We understand that Git and GitHub can be difficult to grasp initially, so if you have any questions about the following or other Git aspects please ask now.

- Basic Git commands, `init`, `add`, `commit`, `push`, `pull`, and `clone`
- Local and Remote repositories
- Cloning, Branching, Merging and Pull Requests
- Installing Git