# Git and GitHub

## What is Git?
Git is version control software: it records the history of changes to your files so that you can review or return to earlier versions.

## Initialising a repository - git init
A repository is a place where things are stored. In Git, a repository is set up inside a directory and tracks the files in that directory that you add to it.

__git init__

In [1]:
mkdir mytrackeddir
cd mytrackeddir

git init # initialise a repository here

echo "welcome to the repo" > readme.txt
ls

Initialised empty Git repository in /home/harry/temp/studentsForAI/gith/mytrackeddir/.git/
readme.txt
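
As a quick check of what __git init__ did, here is a minimal sketch. It uses a throwaway directory created with `mktemp` (an assumption for the demo, so that _mytrackeddir_ is left untouched) and confirms that Git created a hidden `.git` directory and now treats the folder as a working tree.

```shell
# Assumption: a throwaway directory is used so the tutorial repo is untouched
demo_dir=$(mktemp -d)
cd "$demo_dir"

git init -q    # -q suppresses the "Initialised empty Git repository" message

# git init stores all of the repository data in a hidden .git directory
ls -a

# Ask Git whether the current directory is inside a working tree
inside=$(git rev-parse --is-inside-work-tree)
echo "inside work tree: $inside"
```

Deleting the `.git` directory would turn the folder back into an ordinary, untracked directory.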


## Checking which files are being tracked - git status

In [2]:
git status

On branch master

Initial commit

Untracked files:
  (use "git add <file>..." to include in what will be committed)

	readme.txt

nothing added to commit but untracked files present (use "git add" to track)


## Start tracking files (staging) - git add

In [3]:
git add readme.txt # stage readme.txt: add it to the staging area, ready to be committed
                # the file is staged but not yet committed
git status

On branch master

Initial commit

Changes to be committed:
  (use "git rm --cached <file>..." to unstage)

	new file:   readme.txt



The readme has now been staged: Git is tracking it, and its current contents will be included in the next commit.
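
The difference between staged and untracked files can also be listed directly rather than read off __git status__. The sketch below is one way to do it, assuming a throwaway repo and a hypothetical second file called `notes.txt` that is deliberately left unstaged.

```shell
# Assumption: throwaway repo; notes.txt is a hypothetical extra file
demo_dir=$(mktemp -d)
cd "$demo_dir"
git init -q

echo "welcome to the repo" > readme.txt
touch notes.txt

git add readme.txt    # stage only the readme

# Files in the staging area that would go into the next commit
staged=$(git diff --cached --name-only)
# Files that Git is not tracking yet
untracked=$(git ls-files --others --exclude-standard)

echo "staged:    $staged"
echo "untracked: $untracked"
```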

In [4]:
git rm --cached readme.txt # this is how to remove a file from the staging area
#git status
git add readme.txt # let's add it back to the staging area to work with
#git status

rm 'readme.txt'


Now let's make a change to the readme.txt file and add some more files.

In [5]:
echo "newline" >> readme.txt # edit the readme.txt file
touch file1.txt # make a new file
touch file2.txt # make another new file

git status # check the status

On branch master

Initial commit

Changes to be committed:
  (use "git rm --cached <file>..." to unstage)

	new file:   readme.txt

Changes not staged for commit:
  (use "git add <file>..." to update what will be committed)
  (use "git checkout -- <file>..." to discard changes in working directory)

	modified:   readme.txt

Untracked files:
  (use "git add <file>..." to include in what will be committed)

	file1.txt
	file2.txt



In [6]:
git add -A # stage all new, modified and deleted files
    # we could also have used a glob pattern (e.g. "*.txt") to stage only a subset of them
git status

On branch master

Initial commit

Changes to be committed:
  (use "git rm --cached <file>..." to unstage)

	new file:   file1.txt
	new file:   file2.txt
	new file:   readme.txt



## Making checkpoints - git commit

If we are happy with the changes so far and want to _commit_ to them, we can use the __git commit__ command to save them as a checkpoint.

Commits require a message explaining the changes made since the last commit. We follow the __-m__ flag with that message.

Committing updates the version of the file that is stored in the repository's history.

In [7]:
git commit readme.txt -m "First commit" # commit the readme
git status

[master (root-commit) af1b931] First commit
 1 file changed, 2 insertions(+)
 create mode 100644 readme.txt
On branch master
Changes to be committed:
  (use "git reset HEAD <file>..." to unstage)

	new file:   file1.txt
	new file:   file2.txt



Now the readme file has been committed. Let's commit the other two files by not specifying a file as an argument.

In [8]:
git commit -m "added the other 2 files"
git status

[master 3437e45] added the other 2 files
 2 files changed, 0 insertions(+), 0 deletions(-)
 create mode 100644 file1.txt
 create mode 100644 file2.txt
On branch master
nothing to commit, working directory clean
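
The two commits above can be replayed in a throwaway repo as a rough sketch. The temporary directory and the "Demo User" identity are assumptions made only so the example is self-contained.

```shell
# Assumption: throwaway repo with a placeholder identity
demo_dir=$(mktemp -d)
cd "$demo_dir"
git init -q
git config user.name "Demo User"
git config user.email "demo@example.com"

echo "welcome to the repo" > readme.txt
touch file1.txt file2.txt
git add -A

# Commit only readme.txt; file1.txt and file2.txt stay staged
git commit -q -m "First commit" -- readme.txt
still_staged=$(git diff --cached --name-only)

# With no file argument, everything that is still staged is committed
git commit -q -m "added the other 2 files"

commit_count=$(git rev-list --count HEAD)
git log --oneline
echo "commits so far: $commit_count"
```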


## Resetting the staging area to some specified state - git reset

If we make a mistake and then accidentally commit it, we can undo the last commit using __git reset --soft HEAD~__. This moves the branch back by one commit but leaves the changes from that commit staged.
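
To see that a soft reset removes the commit but keeps its changes staged, here is a small sketch in a throwaway repo. The file names `a.txt` and `b.txt` and the demo identity are made up for the illustration.

```shell
# Assumption: throwaway repo, hypothetical files a.txt and b.txt
demo_dir=$(mktemp -d)
cd "$demo_dir"
git init -q
git config user.name "Demo User"
git config user.email "demo@example.com"

echo "one" > a.txt
git add a.txt
git commit -q -m "first"

echo "two" > b.txt
git add b.txt
git commit -q -m "second (a mistake)"

before=$(git rev-list --count HEAD)

git reset --soft HEAD~    # move the branch back by one commit

after=$(git rev-list --count HEAD)
staged=$(git diff --cached --name-only)    # b.txt is still staged

echo "commits before: $before, after: $after, still staged: $staged"
```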

In [9]:
git reset --soft HEAD~ # undo the last commit where we committed the two files
git status

On branch master
Changes to be committed:
  (use "git reset HEAD <file>..." to unstage)

	new file:   file1.txt
	new file:   file2.txt



## Getting help - git help

In [10]:
git help status # get the man pages for the status command

GIT-STATUS(1)                     Git Manual                     GIT-STATUS(1)

NAME
       git-status - Show the working tree status

SYNOPSIS
       git status [<options>...] [--] [<pathspec>...]

DESCRIPTION
       Displays paths that have differences between the index file and the
       current HEAD commit, paths that have differences between the working
       tree and the index file, and paths in the working tree that are not
       tracked by Git (and are not ignored by gitignore(5)). The first are
       what you would commit by running git commit; the second and third are
       what you could commit by running git add before running git commit.

OPTIONS
       -s, --short
           Give the output in the short-format.

       -b, --branch
           Show the branch and tracking info even in short-format.

       --porcelain
           Give the output in an easy-to-parse format for scripts. This is
           similar to the short output, but will remain stable across Git
   

       In that format, the status field is the same, but some other things
       change. First, the -> is omitted from rename entries and the field
       order is reversed (e.g from -> to becomes to from). Second, a NUL
       (ASCII 0) follows each filename, replacing space as a field separator
       and the terminating newline (but a space still separates the status
       field from the first filename). Third, filenames containing special
       characters are not specially formatted; no quoting or
       backslash-escaping is performed.

CONFIGURATION
       The command honors color.status (or status.color — they mean the same
       thing and the latter is kept for backward compatibility) and
       color.status.<slot> configuration variables to colorize its output.

       If the config variable status.relativePaths is set to false, then all
       paths shown are relative to the repository root, not to the current
       directory.

       If status.submoduleSummary is set to

## List your logs (history of commits) - git log

Each entry in the log shows the author, date and time, commit message and the SHA-1 hash that identifies the commit.

In [11]:
git log

commit af1b931ef89d38c9d9e2cdcf4adb93a044484170
Author: haaaarryb <harryaberg@gmail.com>
Date:   Wed Aug 15 08:17:23 2018 +0100

    First commit
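
The log can also be printed in more compact forms. The sketch below assumes a throwaway repo with two made-up commits; it shows the one-line view and pulls out the full hash and the message of the latest commit.

```shell
# Assumption: throwaway repo with two placeholder commits
demo_dir=$(mktemp -d)
cd "$demo_dir"
git init -q
git config user.name "Demo User"
git config user.email "demo@example.com"

echo "welcome to the repo" > readme.txt
git add readme.txt
git commit -q -m "first"

echo "newline" >> readme.txt
git commit -q -am "second"

git log --oneline    # abbreviated hash and message, one commit per line

full_hash=$(git log -1 --format=%H)    # full hash of the latest commit
subject=$(git log -1 --format=%s)      # its commit message
echo "$full_hash  $subject"
```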


## Checking differences between files - git diff

__git diff__ shows the difference between a file in the working directory and its staged version, which is the same as the last commit if nothing has been staged since. Pluses show lines that were added. Minuses show lines that were deleted.
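
Which two versions are compared depends on how the command is called. As a rough sketch in a throwaway repo (the directory and identity are assumptions), one edit is staged and a second is left unstaged, and the added lines are counted for each form of the command.

```shell
# Assumption: throwaway repo with a placeholder identity
demo_dir=$(mktemp -d)
cd "$demo_dir"
git init -q
git config user.name "Demo User"
git config user.email "demo@example.com"

echo "welcome to the repo" > readme.txt
git add readme.txt
git commit -q -m "First commit"

echo "newline" >> readme.txt
git add readme.txt                       # first edit: staged
echo "another new line" >> readme.txt    # second edit: not staged

# Count added lines ("+..." but not the "+++" header) in each comparison
unstaged=$(git diff --no-color | grep -c '^+[^+]')           # working directory vs staging area
staged=$(git diff --no-color --cached | grep -c '^+[^+]')    # staging area vs last commit
both=$(git diff --no-color HEAD | grep -c '^+[^+]')          # working directory vs last commit

echo "unstaged: $unstaged, staged: $staged, both: $both"
```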

In [12]:
echo "another new line" >> readme.txt
git diff readme.txt

diff --git a/readme.txt b/readme.txt
index 3354a0e..a1a3722 100644
--- a/readme.txt
+++ b/readme.txt
@@ -1,2 +1,3 @@
 welcome to the repo
 newline
+another new line


## Restoring files to their last commit

We might really mess up a file and not know how, but we can always discard the uncommitted changes with __git checkout__ followed by the file name.

In [13]:
cat readme.txt # show the file contents
git checkout readme.txt # discard the uncommitted changes to readme.txt
git diff # prints nothing: the file matches the last commit again
cat readme.txt # 'another new line' has been removed

welcome to the repo
newline
another new line
welcome to the repo
newline
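
The same restore can be checked without comparing the output by eye. In this sketch (throwaway repo and identity are assumptions) `git diff --quiet` is used because it prints nothing and only reports, through its exit status, whether there are unstaged changes.

```shell
# Assumption: throwaway repo with a placeholder identity
demo_dir=$(mktemp -d)
cd "$demo_dir"
git init -q
git config user.name "Demo User"
git config user.email "demo@example.com"

echo "welcome to the repo" > readme.txt
git add readme.txt
git commit -q -m "First commit"

echo "another new line" >> readme.txt
git diff --quiet || echo "readme.txt has uncommitted changes"

git checkout -- readme.txt    # discard the changes in the working directory

git diff --quiet && echo "readme.txt is back to its committed state"
content=$(cat readme.txt)
```

The `--` before the file name tells Git that what follows is a path, which avoids confusion if a branch happens to have the same name as the file.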


## Making sure to not track certain files

What about files that you don't want to track? For example images or outputs of programs.

You can create a __.gitignore__ file that contains names of files and directories that should not be tracked. Even when you do _git add -A_, anything matching a pattern in the .gitignore file will not be staged. Files that are already tracked are not affected.

The .gitignore file can contain glob patterns (shell-style wildcards, not regular expressions). E.g. echo "*.jpg" >> .gitignore would make all JPEG images not be tracked.

Each line in the .gitignore file is one file name, directory name or glob pattern specifying things not to be tracked.
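
Here is a small sketch of a .gitignore with several kinds of entry, in a throwaway repo. The names `photo.jpg`, `build/` and `notes.txt` are hypothetical and only there to give the patterns something to match.

```shell
# Assumption: throwaway repo; all file names below are made up for the demo
demo_dir=$(mktemp -d)
cd "$demo_dir"
git init -q

cat > .gitignore <<'EOF'
# one entry per line
*.jpg
build/
outputfile.txt
EOF

touch photo.jpg outputfile.txt notes.txt
mkdir build && touch build/result.log

git add -A    # ignored files are skipped
staged=$(git diff --cached --name-only)
echo "$staged"

# check-ignore -v reports which line of .gitignore matched each path
git check-ignore -v photo.jpg build/result.log
```

Only `.gitignore` itself and `notes.txt` end up staged; everything matching a pattern is left out.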

In [14]:
touch outputfile.txt # create a file
git status # see that it is untracked and ready to be staged
echo "outputfile.txt" >> .gitignore
git status

On branch master
Changes to be committed:
  (use "git reset HEAD <file>..." to unstage)

	new file:   file1.txt
	new file:   file2.txt

Untracked files:
  (use "git add <file>..." to include in what will be committed)

	outputfile.txt

On branch master
Changes to be committed:
  (use "git reset HEAD <file>..." to unstage)

	new file:   file1.txt
	new file:   file2.txt

Untracked files:
  (use "git add <file>..." to include in what will be committed)

	.gitignore



In [15]:
git add .gitignore # start tracking the .gitignore file
git commit -m "added .gitignore file" # commit the gitignore file
git status

[master c840f64] added .gitignore file
 3 files changed, 1 insertion(+)
 create mode 100644 .gitignore
 create mode 100644 file1.txt
 create mode 100644 file2.txt
On branch master
nothing to commit, working directory clean


## Branching

A branch is a separate line of development within a repository: commits made on one _branch_ do not affect other branches.

Branching allows multiple people to work on new code at the same time without the updates interfering with each other.

__git branch__ lists all available branches. The asterisk (\*) indicates the branch that you are currently working on.

After different branches have been worked on, they can be merged into the master branch so that the changes are included there.

In [16]:
git branch

* master


Make a new branch by following git branch with the name of your new branch. This does not switch you onto the new branch.

In [17]:
git branch mynewbranch
git branch # list all available branches

* master
  mynewbranch


Move to working on another branch by using __git checkout__ followed by the name of the branch you want to move onto.

In [18]:
git checkout mynewbranch
git branch

Switched to branch 'mynewbranch'
  master
* mynewbranch


We can delete a branch (that is not master) by using __git branch -d__ followed by the branch you want to delete.

In [19]:
git checkout master # move onto the master branch
git branch -d mynewbranch # delete mynewbranch
git branch # check available branches

Switched to branch 'master'
Deleted branch mynewbranch (was c840f64).
* master


We can create a branch __and__ switch onto it at the same time by using __git checkout -b__ followed by the branch name.

In [20]:
git checkout -b newestbranch # create this branch AND move onto it
git branch # check which branch we are on

Switched to a new branch 'newestbranch'
  master
* newestbranch


In [21]:
echo "the fresh new line" > file1.txt
git status
git add -A
git commit -m "line added"
git status

On branch newestbranch
Changes not staged for commit:
  (use "git add <file>..." to update what will be committed)
  (use "git checkout -- <file>..." to discard changes in working directory)

	modified:   file1.txt

no changes added to commit (use "git add" and/or "git commit -a")
[newestbranch 0b3555d] line added
 1 file changed, 1 insertion(+)
On branch newestbranch
nothing to commit, working directory clean


Now let's move to the master branch and check the contents of that file we changed.

In [22]:
git checkout master

cat file1.txt

Switched to branch 'master'


There's nothing in file1.txt! But it hasn't been lost - it's just on the other branch.

In [23]:
git branch
git checkout newestbranch
git branch

cat file1.txt # see that the line is still there

* master
  newestbranch
Switched to branch 'newestbranch'
  master
* newestbranch
the fresh new line


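Now we can combine branches by using __git merge__ followed by the name of the branch that we want to merge into the branch we are currently on. Below we are still on _newestbranch_, so merging it into itself changes nothing and Git reports 'Already up-to-date.'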

In [24]:
git merge newestbranch
git branch
cat file1.txt

Already up-to-date.
  master
* newestbranch
the fresh new line
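
To bring the new line into master, the merge has to be run while master is checked out. Here is a minimal sketch of that, assuming a throwaway repo, a placeholder identity, and an initial branch that is explicitly named master (newer Git installs may default to another name).

```shell
# Assumption: throwaway repo; the initial branch is forced to be called master
demo_dir=$(mktemp -d)
cd "$demo_dir"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Demo User"
git config user.email "demo@example.com"

touch file1.txt
git add file1.txt
git commit -q -m "added file1"

git checkout -q -b newestbranch
echo "the fresh new line" > file1.txt
git commit -q -am "line added"

git checkout -q master       # switch to the branch that should RECEIVE the changes
git merge -q newestbranch    # master has no new commits of its own, so this is a fast-forward

merged=$(cat file1.txt)
echo "file1.txt on master: $merged"
```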


In [25]:
git checkout master # switch to master branch
echo "the funky new line" > file1.txt # both branches now change the same line, so merging will cause a conflict
    # the conflict will have to be resolved by hand before the merge can complete
git status
git add -A
git status
git commit -m "fresh changed to funky"
git merge newestbranch

Switched to branch 'master'
On branch master
Changes not staged for commit:
  (use "git add <file>..." to update what will be committed)
  (use "git checkout -- <file>..." to discard changes in working directory)

	modified:   file1.txt

no changes added to commit (use "git add" and/or "git commit -a")
On branch master
Changes to be committed:
  (use "git reset HEAD <file>..." to unstage)

	modified:   file1.txt

[master 0824f23] fresh changed to funky
 1 file changed, 1 insertion(+)
Auto-merging file1.txt
CONFLICT (content): Merge conflict in file1.txt
Automatic merge failed; fix conflicts and then commit the result.



In the penultimate line above, we can see that there is a conflict between the branches in file1.txt.

Let's take a look at that conflict from the command line using the __cat__ command. Git has written both versions of the conflicting line into the file, separated by conflict markers.

__HEAD__ refers to the most recent commit on the currently checked out branch.

In [26]:
cat file1.txt

<<<<<<< HEAD
the funky new line
=======
the fresh new line
>>>>>>> newestbranch


In [27]:
# we are still on the master branch
echo "the funky new line" > file1.txt # resolve the conflict by overwriting the file with the line we want to keep
cat file1.txt # the conflict markers are gone and only the chosen line remains
git status
git add file1.txt
git commit -m 'resolved conflict'

the funky new line
On branch master
You have unmerged paths.
  (fix conflicts and run "git commit")

Unmerged paths:
  (use "git add <file>..." to mark resolution)

	both modified:   file1.txt

no changes added to commit (use "git add" and/or "git commit -a")
[master 3d174ab] resolved conflict
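
The whole conflict cycle can be replayed in a throwaway repo, as sketched below. The directory, the identity and the "original line" starting content are assumptions for the demo. The sketch also shows one way to list the files that are still in conflict.

```shell
# Assumption: throwaway repo; the initial branch is forced to be called master
demo_dir=$(mktemp -d)
cd "$demo_dir"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Demo User"
git config user.email "demo@example.com"

echo "original line" > file1.txt
git add file1.txt
git commit -q -m "base"

git checkout -q -b newestbranch
echo "the fresh new line" > file1.txt
git commit -q -am "fresh"

git checkout -q master
echo "the funky new line" > file1.txt
git commit -q -am "funky"

# The merge stops with a conflict, so a non-zero exit status is expected here
git merge newestbranch > /dev/null 2>&1 || echo "merge stopped with a conflict"

conflicted=$(git diff --name-only --diff-filter=U)    # files with unresolved conflicts
markers=$(grep -c -E '^(<<<<<<<|=======|>>>>>>>)' file1.txt)

# Resolve by writing the line we want to keep, then stage and commit
echo "the funky new line" > file1.txt
git add file1.txt
git commit -q -m "resolved conflict"

parents=$(git log -1 --format=%P | wc -w | tr -d ' ')
echo "conflicted: $conflicted, marker lines: $markers, parents of merge commit: $parents"
```

The final commit has two parents, one from each branch, which is what makes it a merge commit.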


Now there are no conflicts, so we can merge master back into newestbranch to bring both branches to the same state.

In [31]:
git checkout newestbranch
git merge master # merge the branches

Already on 'newestbranch'
Updating 0b3555d..3d174ab
Fast-forward
 file1.txt | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)


# GitHub

GitHub is an online platform for sharing code and collaborating with others.

Once logged on you can click the plus icon to add a repository. 

__Create a repository called '_myrepo_' now__

# add some images

GitHub provides a __remote__ repository, as opposed to your __local__ repo which has been created on your machine.

__git remote__ lists the remote repositories our local repo is connected to. Currently it will print nothing because our local repo is not connected to any remote repositories.

In [33]:
git remote # prints nothing
git remote add origin https://github.com/haaaarryb/myrepo.git # add the remote repo you just created
    # make sure to replace my username with your own github username (just copy the prompted URL after creating the repo)

Above we assign the name origin to a remote repo. Below we call __git remote__ to confirm that origin was added as a remote repository.

So __origin__ is the name given to the remote repo, and __master__ is the main branch of the local repository.

In [34]:
git remote # now origin will be shown as a remote repo

origin


## Updating remote repositories - git push

The command __git push__ updates a remote repo with the latest commits from a local branch.

Use __git push -u__ followed by the remote name and the branch name to also set that remote branch as the default target for future pushes and pulls from the current branch.

Let's push our commits to a _newestbranch_ branch on GitHub using the __git push__ command. Not just anyone can push to your repo, so you will have to execute the next command from your terminal, where you will be prompted for your login details.

We have to specify which remote we are pushing to and which branch we are pushing. Only the commits on that branch are sent to the remote repository. This allows us to keep other branches local, where they cannot be seen or accessed from the remote repo.

In [35]:
git push origin newestbranch # go to your terminal and run this command - you will be prompted to enter your login details

Username for 'https://github.com': 
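
Pushing can be tried out without a GitHub account by letting a local bare repository play the part of the remote. This is only a sketch: the bare repo `myrepo.git`, the temporary directory and the identity are assumptions standing in for the real GitHub repo.

```shell
# Assumption: a local bare repo stands in for the GitHub remote
work=$(mktemp -d)
git init -q --bare "$work/myrepo.git"

git init -q "$work/local"
cd "$work/local"
git config user.name "Demo User"
git config user.email "demo@example.com"

echo "welcome to the repo" > readme.txt
git add readme.txt
git commit -q -m "First commit"
git checkout -q -b newestbranch

git remote add origin "$work/myrepo.git"
git remote -v    # shows the URL used for fetching and pushing

# -u also records origin/newestbranch as the upstream of the current branch
git push -q -u origin newestbranch

upstream=$(git rev-parse --abbrev-ref '@{upstream}')
remote_heads=$(git ls-remote --heads origin | wc -l | tr -d ' ')
echo "upstream: $upstream, branches on the remote: $remote_heads"
```

Only _newestbranch_ exists on the remote afterwards; the branch the first commit was made on stays local because it was never pushed.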


## add image of branch appearing on github

## Pull requests

A pull request essentially allows someone to ask someone else if they're willing to incorporate changes from one branch into another. It is like a git merge facilitated by GitHub.

Let's initiate a pull request on ourselves just as a demo to show how they work.

# Add image of pull request being made on github

The _base_ branch is the one that changes will be merged into. The _compare_ branch is the one that includes the changes that you want to merge into the base.
You can also add comments, which are written in Markdown.

# show some markdown syntax

Commits made to the compare branch after the pull request has been opened are added to the pull request automatically.

Other people can change files within the project that you are working on. To make sure that your local repo is up to date with these changes, you should __pull__ these changes from GitHub onto your local version of that branch (this requires login details, so you'll have to do this from your terminal rather than here in the notebook):

In [None]:
git pull # run this command in your terminal and enter your login details.

Git now finds our branch in the remote repo and updates our local repo with the new commits that exist there.
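
Pulling can be rehearsed locally too. In the sketch below a bare repository stands in for GitHub, and two working copies, given the made-up names "alice" and "bob", play the two collaborators. All paths and identities are assumptions for the demo.

```shell
# Assumption: a local bare repo stands in for GitHub; alice and bob are hypothetical users
work=$(mktemp -d)
git init -q --bare "$work/shared.git"
git --git-dir="$work/shared.git" symbolic-ref HEAD refs/heads/master

# alice creates the first commit and pushes it
git init -q "$work/alice"
cd "$work/alice"
git symbolic-ref HEAD refs/heads/master
git config user.name "Alice Demo"
git config user.email "alice@example.com"
echo "welcome to the repo" > readme.txt
git add readme.txt
git commit -q -m "First commit"
git remote add origin "$work/shared.git"
git push -q -u origin master

# bob clones the shared repo, so he starts with one commit
git clone -q "$work/shared.git" "$work/bob"
bob_before=$(git -C "$work/bob" rev-list --count HEAD)

# alice pushes a second commit
echo "newline" >> readme.txt
git commit -q -am "added a line"
git push -q origin master

# bob pulls to catch up; --ff-only refuses anything other than a simple fast-forward
git -C "$work/bob" pull -q --ff-only
bob_after=$(git -C "$work/bob" rev-list --count HEAD)

echo "bob had $bob_before commit(s), now has $bob_after"
```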

## Copying someone's code to edit it - Forking

Forking is copying someone else's code, by copying their GitHub repo into your own account where you can edit it how you like. You can then even submit a pull request to ask to incorporate your changes into the original repo which you forked from.

Fork a repository by finding that repo on GitHub and clicking the 'Fork' button. To work on your fork locally, open it on GitHub, click the 'clone or download' button, copy the URL, then go to your terminal and run the __git clone__ command with that URL as its argument:

In [None]:
git clone <your repo URL>

# Cleanup - run the cells below to delete the directory and return to the original state before this tutorial

In [2]:
cd ../mytrackeddir && rm * # move into the mytrackeddir directory and if that is successful, empty it

bash: cd: ../mytrackeddir: No such file or directory
rm: cannot remove 'images': Is a directory
rm: cannot remove 'journal': Is a directory
rm: cannot remove 'manyfiles': Is a directory
rm: cannot remove 'mytrackeddir': Is a directory
rm: cannot remove 't': Is a directory
images	journal  manyfiles  mytrackeddir  t


In [3]:
cd .. && rm -rf mytrackeddir # remove the directory used for this tutorial