# PDA data science - Git
<div class="alert alert-block alert-info"> 
    Notebook 9: by michael.ferrie@edinburghcollege.ac.uk <br> Edinburgh College, April 2022
</div>

## Introduction

Git is version control software: it records the history of changes to your code and lets you store and back up that code. As part of this course we need to know how to use Git to store a project. Git was originally authored by Linus Torvalds in 2005 for development of the Linux kernel, with other kernel developers contributing to its initial development. We can use Git together with the website GitHub, which hosts Git repositories online, to back up code.

If you already have a GitHub profile you can skip this step and move on to the next section. Otherwise, go to [this page](https://github.com/join) and sign up for a free GitHub account. You can either use your real name or some kind of internet handle. I use my [real name](https://github.com/michaelferrie).

You may want to add your profile to your CV in future for a potential employer to look at, so it is a good idea to keep it professional. You can think of GitHub as cloud storage for code. I would advise backing up your main code folder here in a private repo.

## Create a repo

A repository, or repo for short, is a storage area for code and other files. It is common to keep code files, data files and documentation all in one directory.

Once you have signed into GitHub, go to [github.com/new](https://github.com/new) and create a new repository. Call it `demo` and give it the following settings.

* Repository template - no template
* Description - leave blank
* Public
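* Add a README file - leave unchecked, so the repo starts empty; GitHub then shows the setup instructions and accepts the first push from your machine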
* Licence - leave as it is
* .gitignore - leave as it is

Then click on _Create repository_.

## Install Git on your machine

Now that we have a location online to back up our files, we need to install Git locally. Instructions differ for each system:

* On Apple computers, go to [https://git-scm.com/downloads](https://git-scm.com/downloads), download Git for your system and follow the instructions.

* Windows users go to [https://git-scm.com/downloads](https://git-scm.com/downloads). You will need to download the __Standalone 64-bit__ installer. Once the installer has downloaded, run it. There are a lot of options during setup; do not change any of them, just click Next and leave everything as default. At the end of the installation click 'Finish'.

* On Debian or Ubuntu Linux install with `sudo apt install git`. Other distributions use their own package manager.

## GitHub token

GitHub used to let you log into a repo from Git with a username and password. For security reasons it now requires a personal access token instead. Sign into GitHub and go to [https://github.com/settings/tokens](https://github.com/settings/tokens).

On this page click the button to __Generate new token__. In the Note box write `my_token` and set the expiration to 90 days. The expiry limits the damage if you lose the token or it is compromised, because it stops working after that date. You can also revoke it manually and make a new one at any time. This gives more control than a password, where a reset can be time consuming.

## Scopes

There are [many options](https://docs.github.com/en/developers/apps/building-oauth-apps/scopes-for-oauth-apps) for the scope, which control what the token is allowed to do. This would be useful if you were working with a large team and you wanted to assign different roles. For this project, select only the top-level `repo` checkbox. This checks all of the options beneath it, so all of these should be turned on:

* `repo:status` - Access commit status
* `repo_deployment` - Access deployment status
* `public_repo` - Access public repositories
* `repo:invite` - Access repository invitations
* `security_events` - Read and write security events


Once you have done this, scroll to the bottom of the page and click __Generate token__.

This next step is important: the token will __only appear once__, so take a copy of it and paste it into a text editor.

## Connect to repo

On your local machine make a directory called `demo` to connect to the repo. I would suggest doing this on the desktop or in your home folder. This is just for testing purposes, so you can delete it later.

Place a file in the demo folder, call it `testfile.py`, and add the following to line 1 of the file: `print('hello world')`
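The folder and file can also be created from Python rather than by hand. The sketch below is one way to do it, assuming a hypothetical helper named `create_demo` (not part of the course material) that takes the parent folder as an argument. The example call uses a temporary directory so nothing on your machine is changed.

```python
from pathlib import Path
import tempfile


def create_demo(parent: Path) -> Path:
    """Create parent/demo/testfile.py and return the path to the new file."""
    demo = parent / "demo"
    demo.mkdir(parents=True, exist_ok=True)  # no error if the folder already exists
    test_file = demo / "testfile.py"
    test_file.write_text("print('hello world')\n", encoding="utf-8")
    return test_file


# Example run in a throwaway directory; pass Path.home() instead to
# create the folder in your home directory.
with tempfile.TemporaryDirectory() as tmp:
    created = create_demo(Path(tmp))
    print(created.name, "contains:", created.read_text(encoding="utf-8").strip())
```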

On Mac or Linux open the terminal application; on Windows open the cmd shell or PowerShell. Now use the `cd` command to change directory into the demo folder.

On Linux or Mac check your current working directory with `pwd`. On Windows, `pwd` also works in PowerShell, and in the cmd shell `cd` on its own prints the current directory (`dir` lists the files in it). Once we are inside the directory, we can run the commands in the following sections to connect it to the repo.

![testfile.gif](https://michaelferrie.com/labs/testfile.gif)



## Git Global Configuration

Now we have everything set up, go to `https://github.com/YOUR-USER-NAME/demo`, swapping YOUR-USER-NAME for your actual GitHub username. You should see some setup instructions there; those are the commands we need to type into the command line in the `demo` folder. Carefully enter the following commands.

First we need to tell Git who we are.

Run this command, entering the email address you used to sign up for GitHub:

`git config --global user.email "you@example.com"`

Then set the name that will appear on your commits. Use your GitHub username:

`git config --global user.name "username"`
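To confirm the two settings were saved, you can read them back. The sketch below does this from Python by calling `git config --global --get`, which prints a stored value. `read_git_identity` is a hypothetical helper name, and the function returns `None` for a setting when Git is not installed or the value has not been set.

```python
import shutil
import subprocess
from typing import Dict, Optional


def read_git_identity() -> Dict[str, Optional[str]]:
    """Return the global user.name and user.email, or None where unavailable."""
    identity: Dict[str, Optional[str]] = {"user.name": None, "user.email": None}
    if shutil.which("git") is None:  # Git is not on the PATH
        return identity
    for key in identity:
        result = subprocess.run(
            ["git", "config", "--global", "--get", key],
            capture_output=True,
            text=True,
        )
        # A non-zero exit code means the value could not be read (for example, not set).
        if result.returncode == 0:
            identity[key] = result.stdout.strip()
    return identity


print(read_git_identity())
```

Running `git config --global --get user.email` directly in the terminal gives the same information for a single setting.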

## Git initialisation

Initialise Git in the directory. This creates a hidden `.git` folder where Git keeps the history.

`git init`

Stage the files for the next commit. The `.` means everything in the current folder.

`git add .` 

Commit the staged files. This records a snapshot of them with the message `first commit`.

`git commit -m "first commit"`

Rename the current branch to `main`.

`git branch -M main`

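Now here is the fun bit, tell Git the address of the GitHub repo. Swap YOUR-USER-NAME for your actual GitHub username, with no spaces in the URL.

`git remote add origin https://github.com/YOUR-USER-NAME/demo.git`

Push the commit to GitHub. Git is going to ask for your login details now.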

`git push -u origin main`

![git_push.gif](https://michaelferrie.com/labs/git_push.gif)
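To rehearse the local part of this sequence without touching GitHub, the steps can be run in a throwaway folder. The following is a minimal sketch, not part of the course material: `git` and `rehearse_setup` are hypothetical helper functions, `YOUR-USER-NAME` is the same placeholder as above, the identity is passed with `-c` options so your global settings are not used, and `git push` is left out because it needs the network and your token.

```python
import shutil
import subprocess
import tempfile
from pathlib import Path
from typing import Dict, List


def git(folder: Path, *args: str) -> str:
    """Run one git command inside folder and return its output."""
    command: List[str] = [
        "git",
        "-c", "user.name=Demo User",        # placeholder identity
        "-c", "user.email=you@example.com",
        "-c", "commit.gpgsign=false",       # avoid signing in the rehearsal
        *args,
    ]
    result = subprocess.run(
        command, cwd=folder, capture_output=True, text=True, check=True
    )
    return result.stdout.strip()


def rehearse_setup(folder: Path, username: str) -> Dict[str, str]:
    """Repeat the init, add, commit, branch and remote steps in folder."""
    (folder / "testfile.py").write_text("print('hello world')\n", encoding="utf-8")
    git(folder, "init")
    git(folder, "add", ".")
    git(folder, "commit", "-m", "first commit")
    git(folder, "branch", "-M", "main")
    git(folder, "remote", "add", "origin",
        f"https://github.com/{username}/demo.git")
    # Read back the state so we can see what each step produced.
    return {
        "branch": git(folder, "rev-parse", "--abbrev-ref", "HEAD"),
        "commits": git(folder, "rev-list", "--count", "HEAD"),
        "remote": git(folder, "config", "--get", "remote.origin.url"),
    }


if shutil.which("git"):  # only run when Git is installed
    with tempfile.TemporaryDirectory() as tmp:
        print(rehearse_setup(Path(tmp), "YOUR-USER-NAME"))
```

The returned dictionary reports the branch name, the number of commits and the remote address, which are the three things the commands above set up before the push.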

## Enter token

Depending on the operating system you are using, Git will either ask for your details directly on the command line or a box will pop up. On the command line, enter your GitHub username and then paste the personal access token when asked for a password. If a box pops up, click on Token and paste in the token we created earlier. Git will then push the changes to the repo.

Now if you return to the repo online and click on the Code tab, you should see that the test file has been saved in the GitHub repo.

# Questions

Complete the following tasks with Git.


1. In the `demo` folder edit `testfile.py`, add the following to line 2 of the file, then save: `print('hello again world')`

2. Add another file to your `demo` folder called `testfile2.py` and add this print statement to line 1: `print('hello testfile 2')`

3. From the command line add, commit and push your changes to the repo. Remember to run these commands in the `demo` directory.

* `git add .`
* `git commit -m "Added testfile2.py"`
* `git push -u origin main`

4. Browse to [https://github.com/michaelferrie/practice-lab/fork](https://github.com/michaelferrie/practice-lab/fork) and create a fork of my practice-lab repo. A fork is your own brand new copy of my repo, so you can make changes to it.

5. Make a folder on your local machine called `practice-lab` and `cd` into that folder using your terminal, cmd or PowerShell.

6. Clone your fork of my repo into your folder with `git clone https://github.com/YOUR-USER-NAME/practice-lab.git`, swapping YOUR-USER-NAME for your actual GitHub username in the URL. This creates a second `practice-lab` folder, containing the files, inside the one you made.

7. You are going to edit the files, but first `cd` into the cloned `practice-lab` folder and create a new branch to work on. The `-b` option creates the branch and switches to it in one step. Replace the word __NAME__ with your first name.

`git checkout -b NAME-updates`

8. Once the contents of the repo have downloaded to your local machine, open the cloned `practice-lab` folder with your file manager (Windows Explorer or Finder). You should see 4 files inside.

9. Using a text editor, edit README.md and add your first name on the first empty line. On Windows you can right click on the file, choose "Open with" and pick Notepad.

10. Return to the terminal, make sure you are still in the cloned `practice-lab` folder, and run `git diff` to check your changes. Git will show you the lines you have changed but not yet staged.

11. Now add your changes and commit them locally with a message describing the commit.

* `git add .`
* `git commit -m "Added my name to the file"`

12. We are ready to push to your branch. Replace __NAME__ with your first name, as you did in question 7.

* `git push -u origin NAME-updates`

13. Have a look at your forked GitHub repo and see if your changes have been uploaded. A new green button should have appeared that says __Compare & pull request__. Click it, add a message to the text box saying `Added my name to README` and click on __Create pull request__.

14. What is happening here is that you are asking me to _pull_ your changes into the master branch. When I see your pull request, I will merge your changes into the master branch of the repo... if I like them 👍 If you get a bit lost here, [watch this video](https://www.youtube.com/watch?v=rgbCcBNZcdQ), which explains the process well.
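The output of `git diff` in the steps above can look cryptic at first. As a rough illustration, Python's standard `difflib` module produces the same unified diff format that Git prints: lines starting with `-` were removed, lines starting with `+` were added, and unmarked lines are unchanged context. The README text and the name `Alex` below are made up for the example; they are not the contents of the practice-lab repo.

```python
import difflib

# Hypothetical README contents before and after adding a name.
before = ["# practice-lab\n", "\n", "Names:\n"]
after = ["# practice-lab\n", "\n", "Names:\n", "Alex\n"]

# unified_diff yields the header lines, a hunk marker starting with @@,
# then the context, removed (-) and added (+) lines.
diff_lines = list(
    difflib.unified_diff(before, after, fromfile="a/README.md", tofile="b/README.md")
)
print("".join(diff_lines))
```

Git adds a few header lines of its own (such as `diff --git` and `index`), but the `-` and `+` body is read the same way.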

# Next steps

You can do a lot with GitHub beyond pushing and pulling repos: you can create and host websites and keep backups of files, and GitHub will render Jupyter notebooks so you can host these as well. You can star pages you like or follow them. There are some good tutorials, guides and projects out there. If anyone finds any good pages, send me a link and I'll add it here; maybe we can create a GitHub page with links to good GitHub pages!

* https://github.com/datasciencescoop/Data-Science-Tutorials

* https://github.com/R-tutorials

* https://pages.github.com/

* https://pinegrow.com/tutorials/how-to-host-your-html-website-on-github-pages-for-free/

* https://github.com/collections/static-site-generators