<font color="white">.</font> | <font color="white">.</font> | <font color="white">.</font>
-- | -- | --
![NASA](http://www.nasa.gov/sites/all/themes/custom/nasatwo/images/nasa-logo.svg) | <h1><font size="+3">ASTG Python Courses</font></h1> | ![NASA](https://www.nccs.nasa.gov/sites/default/files/NCCS_Logo_0.png)

---

<center>
    <h1><font color="red">Introduction to Version Control with Git</font></h1>
</center>

## A Few Pointers
- <a href="https://www.atlassian.com/git/tutorials">Become a git guru</a>
- <a href="https://github.github.com/training-kit/downloads/github-git-cheat-sheet/">Git Cheat Sheet</a>
- <a href="https://www.vogella.com/tutorials/Git/article.html">Git - Tutorial</a> (by Lars Vogel)
- <a href="https://berkeley-stat159-f17.github.io/stat159-f17/lectures/01-git/Git-Tutorial..html">An interactive Git Tutorial: the tool you didn’t know you needed</a>
- <a href="https://www.freecodecamp.org/news/learn-the-basics-of-git-in-under-10-minutes-da548267cc91/">Learn the Basics of Git in Under 10 Minutes</a>

# <font color="red">What is Version Control?</font>
- Version control systems are a category of software tools that help a software team manage changes to source code over time. 
- Version control software keeps track of every modification to the code in a special kind of database. 
- If a mistake is made, developers can turn back the clock and compare earlier versions of the code to help fix the mistake while minimizing disruption to all team members.
- Version control software is an essential part of the everyday professional practices of a modern software team.

There are two types of version control systems:
- **Centralized** version control systems are based on the idea that there is a single “central” copy of your project somewhere (probably on a server), and programmers will “commit” their changes to this central copy. The best known examples of centralized systems are CVS and Subversion.
- **Distributed** version control systems do not necessarily rely on a central server to store all the versions of a project’s files. Instead, every developer “clones” a copy of a repository and has the full history of the project on their own hard drive. This copy (or “clone”) has all of the metadata of the original. Examples of distributed version control systems are Git and Mercurial.

![fig_vcs](https://s26500.pcdn.co/wp-content/uploads/2019/09/VCS_Diff-768x314.png)
Image Source: [datacore.com](https://www.datacore.com/blog/how-to-move-source-control-from-perforce-to-github/)


**Benefits**
- Developing software without using version control is risky, like not having backups. 
- Version control can also enable developers to move faster and it allows software teams to preserve efficiency and agility as the team scales to include more developers.
- The primary benefits you should expect from version control are:
     + A complete long-term change history of every file. 
     + Branching and merging. 
     + Traceability.

# <font color="red">What is Git?</font>
- The most widely used modern distributed version control system.
- A mature, actively maintained open source project.
- Git has been designed with the following features:
    + **Performance**: Committing new changes, branching, merging and comparing past versions are all optimized for performance. 
    + **Security**: The content of the files, as well as the true relationships between files and directories, versions, tags and commits, are all secured in the Git repository with a cryptographic hashing algorithm called SHA-1.
    + **Flexibility**: Git is flexible (1) in its support for various kinds of nonlinear development workflows, (2) in its efficiency in both small and large projects, and (3) in its compatibility with many existing systems and protocols.

### What is a Git Repository?
- A Git repository is a virtual storage of your project. 
- It allows you to save versions of your code, which you can access when needed.
- A Git repository contains the history of a collection of files starting from a certain directory. 
- The process of copying an existing Git repository via the Git tooling is called **cloning**. 
- After cloning a repository, the user has the complete repository with its history on their local machine.

### Git Hashes

- Hashes, file-based key-value storage and a tree data structure are the key ideas behind Git.
- Each tree node, commit and file has its own unique 40-character SHA-1 representation.
- Commits are a particular type of checkpoint called a **revision**.
- The name of a revision is a random-looking hash of numbers and letters such as `e093542` (shown here in abbreviated form).
- This hash can then be used in various other commands to extract a specific revision of the code.
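To make the idea of a content hash more concrete, here is a minimal Python sketch of how Git derives the identifier of a file (a "blob"), assuming the default SHA-1 object format. The helper name `git_blob_hash` is hypothetical and only used for illustration; inside a repository, the `git hash-object <file>` command reports the same kind of value.

```python
import hashlib


def git_blob_hash(content: bytes) -> str:
    """Return the SHA-1 identifier Git would assign to a file with this content."""
    # Git hashes a short header ("blob <size in bytes>\0") followed by the raw bytes.
    header = b"blob " + str(len(content)).encode("ascii") + b"\0"
    return hashlib.sha1(header + content).hexdigest()


full_hash = git_blob_hash(b"print('Hello Class!')\n")
print(len(full_hash))       # 40 hexadecimal characters
print(full_hash[:7])        # abbreviated form, in the style of `e093542`
print(git_blob_hash(b""))   # identifier of an empty file
```

Because the identifier depends only on the content, two files with identical content always receive the same hash, and any change to the content produces a different one.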

### Git Workflow

The general workflow of a Git cycle is:
- Clone a Git repository as a working copy.
- Modify the working copy by adding/editing files.
- If necessary, update the working copy by taking other developers' changes.
- Review the changes before commit.
- Commit changes in your local repository.
- If everything is fine, push the changes to the remote repository.
- If you realize that something is wrong, you can correct any of the previous commits and still push your changes to the remote repository.

The image below shows the interaction between the remote repository, the local repository (in this case the Master) and the Working Directory.

![fig_git](https://dev.vividbreeze.com/wp-content/uploads/2018/03/gitBasicsRemote.jpg)
Image Source: https://dev.vividbreeze.com/git-tutorial-remote-repositories/

A file in the working tree of a Git repository can have different states:

- **untracked**: the file is not tracked by the Git repository. This means that the file has never been staged or committed.
- **tracked**: committed and not staged.
- **staged**: staged to be included in the next commit.
- **dirty** / **modified**: the file has changed but the change is not staged.

After doing changes in the working tree, the user can add these changes to the Git repository or revert these changes.
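The states listed above can be read off the short output of `git status --porcelain`, where each line starts with two status characters (the first for the staging area, the second for the working tree) followed by the file name. The sketch below classifies a few hand-written sample lines; the function name `classify` and the file `notes.txt` are assumptions made for illustration, and the sample is not output captured from a real repository.

```python
def classify(line):
    """Map one `git status --porcelain` line to the file states it implies."""
    index_state, worktree_state = line[0], line[1]
    if line[:2] == "??":
        return ["untracked"]
    states = []
    if index_state != " ":
        states.append("staged")      # a change is recorded in the staging area
    if worktree_state != " ":
        states.append("modified")    # a change exists that is not staged yet
    return states


# Hand-written sample lines (hypothetical repository content).
sample = [
    "?? notes.txt",
    "A  hello_class.py",
    " M simple_sine.py",
    "MM README.md",
]
for line in sample:
    print(line[3:], classify(line))
```

A file can be staged and modified at the same time: this happens when it is edited again after `git add`.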

# <font color="red">Using Git: My First Git Repository</font>

## <font color="blue">Step 1: Install and Setup Git</font>

### Step 1-1: Install Git on your Machine (if it is not there already)

- **For this course, you are not required to have `git` on your local machine.**
- To install Git on your local machine, follow the installation instructions: [Getting Started - Installing Git](https://git-scm.com/book/en/v2/Getting-Started-Installing-Git). 


Verify the installation:

In [None]:
%%bash

which git

In [None]:
%%bash

git --version

In [None]:
%%bash

git

### Step 1-2: Create Account on `github.com` 

- Go to the website [https://github.com/](https://github.com/) to create an account if you do not have one.
- You might also need to [generate an SSH key and authenticate it](https://docs.github.com/en/github/authenticating-to-github/connecting-to-github-with-ssh) if you are working on your local machine.

### Step 1-3: Configure `git` to Use your Account

- The `git config` command is used to set Git configuration values on a global or local project level. 
- Global settings are stored in the `~/.gitconfig` text file, and project-level settings in the repository's `.git/config` file. Executing `git config` will modify the corresponding configuration text file. 
- A `git config --global` setting is only done once. Git will always use that information for anything you do on that system. If you want to override this with a different name or email address for specific projects, you can run the command without the `--global` option when you’re in that project.
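The override rule described above can be pictured as two dictionaries, where project-level entries win over global ones. This is only a sketch of the precedence rule, not of how Git reads its files; the names `global_config`, `local_config` and `effective`, as well as the values, are placeholders chosen for this example.

```python
# Placeholder values: a global identity and a project-specific e-mail address.
global_config = {
    "user.name": "Jules Julian",
    "user.email": "Jules.Julian@mymail.com",
}
local_config = {
    "user.email": "jules@project.example",
}

# Later entries replace earlier ones, so project-level settings take precedence.
effective = {**global_config, **local_config}

print(effective["user.name"])    # inherited from the global settings
print(effective["user.email"])   # overridden at the project level
```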

If you have never done it, you need to set your user name and email address. Uncomment the lines below and provide your information.

In [None]:
%%bash

#git config --global user.name "<username>"
#git config --global user.email "<email address>"
#git config --list

Here is a sample `.gitconfig` file:

    [user]
            name = Jules Julian
            email = Jules.Julian@mymail.com
    [alias]
            co = checkout
            ci = commit
            cm = commit -m
            cam = commit -am
            st = status
            br = branch
            slog = log --oneline --topo-order --graph
            hist = log --pretty=format:"%h %ad | %s%d [%an]" --graph --date=short

We also want to create a few aliases from the command line:

In [None]:
%%bash

git config --global alias.co "checkout"
git config --global alias.ci "commit"
git config --global alias.cm "commit -m"
git config --global alias.cam "commit -am"
git config --global alias.st "status"
git config --global alias.slog "log --oneline --topo-order --graph"
git config --global alias.hist 'log --pretty=format:"%h %ad | %s%d [%an]" --graph --date=short'

## <font color="blue">Step 2: Create a New Repository</font>

![first](https://media.geeksforgeeks.org/wp-content/uploads/20191122182103/staging_process.jpg)
Image Source: [https://www.geeksforgeeks.org/what-is-a-git-repository/](https://www.geeksforgeeks.org/what-is-a-git-repository/)

- You will first create a local working directory, named `src_code`.
- All the cells below are meant to be run by you in a terminal where you change **once** to the `src_code` directory and continue working there.
- For the purposes of the tutorial, the cells will all be prepended with the first two lines:

```python
   %%bash
   cd src_code
```

In [None]:
%%bash

rm -fr src_code
mkdir src_code

In [None]:
%%bash
cd src_code

pwd

**Create an empty repository**
- To create a new repo, you'll use the `git init` command. 
- `git init` is a one-time command you use during the initial setup of a new repo. 
- Executing this command will create a new **.git** subdirectory in your current working directory. 
- This will also create an initial branch, named `master` by default (newer Git installations may be configured to name it `main` instead). 

In [None]:
%%bash
cd src_code

git init

You have initialized the working directory and you created a new directory, named **.git**.

In [None]:
%%bash
cd src_code

ls

In [None]:
%%bash
cd src_code

ls -a

In [None]:
%%bash
cd src_code

ls -l .git

In [None]:
%%bash
cd src_code

cat .git/config

Verify the directory tree:

In [None]:
%%bash
cd src_code

find . -print | sed -e 's;[^/]*/;|____;g;s;____|; |;g'

**Create a simple Python file** (`hello_class.py`)

In [None]:
%%bash
cd src_code

echo "print('Hello Class!')" >> hello_class.py

In [None]:
%%bash
cd src_code

ls

In [None]:
%%bash
cd src_code

cat hello_class.py

In [None]:
%%bash
cd src_code

python hello_class.py

**Get a brief summary of the git repository**
- Use `git status` to find out which files have been modified and which files are in the staging area.

In [None]:
%%bash
cd src_code

git status

**Add the new file to the repository staging area**
- Before the file is added to the local repository, it has to be in the staging area.
- The staging area is there to keep track of all the files which are to be committed.
- Here, you can incrementally modify files, stage them, modify and stage them again until you are satisfied with your changes.

In [None]:
%%bash
cd src_code

git add hello_class.py

In [None]:
%%bash
cd src_code

git status

**Create a new commit with a message describing what work was done in the commit**
- After adding the selected files to the staging area, you can commit these files to add them permanently to the Git repository. 
- Committing creates a new persistent snapshot (called commit or commit object) of the staging area in the Git repository. 

In [None]:
%%bash
cd src_code

git commit -m "My first Python script"

In [None]:
%%bash
cd src_code

git status

**View the history (print out of what has been committed so far) of your changes using**: `git log`

Here you create a commit object that contains:
- **Commit ID**: SHA-1 Hash
- **Author**: Name of the user who committed files.
- **Date**: Date and time the commit was done.
- **Commit Message**: The message the user wrote while committing.

In [None]:
%%bash
cd src_code

git log

In [None]:
%%bash
cd src_code

git slog

In [None]:
%%bash
cd src_code

git hist

**Get the current `hash`**

The shell commands below obtain the current commit hash from a Git repository.

In [None]:
%%bash
cd src_code

hash_00=`git rev-parse HEAD`
echo "First hash:  ${hash_00}"

## <font color="blue">Step 3: Make Changes and Track Results</font>

Edit the file.

In [None]:
%%bash
cd src_code

echo "print('\t Welcome to the Git Tutorial.')" >> hello_class.py

In [None]:
%%bash
cd src_code

cat hello_class.py

In [None]:
%%bash
cd src_code

python hello_class.py

In [None]:
%%bash
cd src_code

git st

**What has changed so far**: `git diff`

In [None]:
%%bash
cd src_code

git diff

In [None]:
%%bash
cd src_code

git add hello_class.py

In [None]:
%%bash
cd src_code

git ci -m "Added a welcome message"

In [None]:
%%bash
cd src_code

git st

In [None]:
%%bash
cd src_code

git log

In [None]:
%%bash
cd src_code

hash_01=`git rev-parse HEAD`
echo "Second hash: ${hash_01}"

If you also want to see complete diffs at each step, use: `git log -p`

In [None]:
%%bash
cd src_code

git log -p

Often an overview of the changes is useful to get a feel for each step:

`git log --stat --summary`

In [None]:
%%bash
cd src_code

git log --stat --summary

# <font color="red">Using Git: Recovering Old Versions of Files</font>

![revert](https://www.satz24.com/post/temp_admin12345/02-reset-concept.png)
Image Source: [satz24.com](https://www.satz24.com/post/temp_admin12345/02-reset-concept.png)

In [None]:
%%bash
cd src_code

echo "n = 5" >> hello_class.py

In [None]:
%%bash
cd src_code

echo "for i in range(n)" >> hello_class.py

Note that the resulting code will generate a syntax error: the `for` statement above is missing its colon, and the `print` statement below is not indented.

In [None]:
%%bash
cd src_code

echo "print('{} of {}'.format(i+1, n))" >> hello_class.py

In [None]:
%%bash
cd src_code

cat hello_class.py

In [None]:
%%bash
cd src_code

git add hello_class.py

In [None]:
%%bash
cd src_code

git cm "Added a loop"

In [None]:
%%bash
cd src_code

python hello_class.py

Oops, made a mistake! Need to go back one commit.

In [None]:
%%bash
cd src_code

git log

In [None]:
%%bash
cd src_code

git hist

In [None]:
%%bash
cd src_code

hash_02=`git rev-parse HEAD`
echo "Third  hash: ${hash_02}"

In [None]:
%%bash
cd src_code

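# Each %%bash cell starts a fresh shell, so recompute the hash of the previous commit.
hash_01=`git rev-parse HEAD~1`
git diff $hash_01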

In [None]:
%%bash
cd src_code

git diff HEAD~1

In [None]:
%%bash
cd src_code

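# Shell variables do not persist across cells, so recompute the previous commit's hash.
hash_01=`git rev-parse HEAD~1`
git checkout $hash_01 hello_class.py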

In [None]:
%%bash
cd src_code

cat hello_class.py

In [None]:
%%bash
cd src_code

echo "n = 5" >> hello_class.py
echo "for i in range(n):" >> hello_class.py
echo "    print('{} of {}'.format(i+1, n))" >> hello_class.py

In [None]:
%%bash
cd src_code

cat hello_class.py

In [None]:
%%bash
cd src_code

python hello_class.py

In [None]:
%%bash
cd src_code

git add hello_class.py

In [None]:
%%bash
cd src_code

git cm "Added loop (with proper indentation)"

In [None]:
%%bash
cd src_code

git hist

In [None]:
%%bash
cd src_code

hash_03=`git rev-parse HEAD`
echo "Fourth hash: ${hash_03}"

# <font color="red">Using Git: Moving and Removing Files</font>

- While `git add` is used to add files to the list Git tracks, we can also use Git to rename files or to stop tracking them. 
- Git provides the `git mv` and `git rm` commands, which mirror the Unix `mv` and `rm` commands, for this purpose.

**Moving a file:**

In [None]:
%%bash
cd src_code

git mv hello_class.py hello_world.py
git st

These changes must be committed too, to become permanent.

In [None]:
%%bash
cd src_code

git cam "Rename the file hello_class.py as hello_world.py"
git log

**Removing a file:**

In [None]:
%%bash
cd src_code

echo "import math" >> simple_sine.py
echo "x = math.pi / 4.0" >> simple_sine.py
echo "print('Sine of {} is: {}'.format(x, math.sin(x)))" >> simple_sine.py

In [None]:
%%bash
cd src_code

cat simple_sine.py
echo ""
python simple_sine.py

In [None]:
%%bash
cd src_code

git add simple_sine.py
git cm "Created the file simple_sine.py"
git log

Remove the file that was just created:

In [None]:
%%bash
cd src_code

git rm simple_sine.py
git status

In [None]:
%%bash
cd src_code

git cam "Remove the file simple_sine.py"
git log --stat --summary

Print the directory structure:

In [None]:
%%bash
cd src_code

find . -print | sed -e 's;[^/]*/;|____;g;s;____|; |;g'

# <font color="red">Using Git: Branching</font>

- A branch represents an independent line of development. 
- Branches serve as an abstraction for the edit/stage/commit process. 
- A branch can be seen as a way to request a brand new working directory, staging area, and project history. 
- New commits are recorded in the history for the current branch, which results in a fork in the history of the project.
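One way to picture the points above is a toy model in which every commit remembers its parent and a branch is simply a name that points at a commit. The Python sketch below is a simplified illustration, not Git's actual implementation; the commit names `c1` to `c4` and the helpers `commit` and `history` are hypothetical.

```python
commits = {}                    # commit name -> parent commit (None for the first one)
branches = {"master": None}     # branch name -> commit the branch currently points at


def commit(branch, name):
    """Record a new commit on the given branch and move the branch forward."""
    commits[name] = branches[branch]
    branches[branch] = name


def history(branch):
    """Follow parent links from the tip of the branch back to the first commit."""
    result, node = [], branches[branch]
    while node is not None:
        result.append(node)
        node = commits[node]
    return result


commit("master", "c1")
commit("master", "c2")
branches["my_own_branch"] = branches["master"]   # similar to `git branch my_own_branch`
commit("my_own_branch", "c3")                    # recorded only on my_own_branch
commit("master", "c4")                           # master continues independently

print(history("master"))          # ['c4', 'c2', 'c1']
print(history("my_own_branch"))   # ['c3', 'c2', 'c1']
```

Both histories share `c2` and `c1` and differ only after the point where the branch was created, which is the fork in the project history.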


![branch](https://www.nobledesktop.com/image/gitresources/git-branches-merge.png)
Image Source: [nobledesktop](https://www.nobledesktop.com/learn/git/git-branches)

In [None]:
%%bash
cd src_code

git st
ls

In [None]:
%%bash
cd src_code

git branch

We now try two different routes of development: 

- On the `master` branch we add one file, and 
- On the `my_own_branch` branch, which we create, we add a different file.

We will then merge the `my_own_branch` branch into `master`.

In [None]:
%%bash
cd src_code

git branch my_own_branch
git co my_own_branch

In [None]:
%%bash
cd src_code

git branch

In [None]:
%%bash
cd src_code

echo "import math" >> simple_sine.py
echo "x = math.pi / 4.0" >> simple_sine.py
echo "print('Sine of {} is: {}'.format(x, math.sin(x)))" >> simple_sine.py

git add simple_sine.py
git cam "Added the new file simple_sine.py"
git slog

In [None]:
%%bash
cd src_code

git co master
git branch
git slog

In [None]:
%%bash
cd src_code

echo "import math" >> simple_cosine.py
echo "x = math.pi / 4.0" >> simple_cosine.py
echo "print('Cosine of {} is: {}'.format(-x, math.cos(-x)))" >> simple_cosine.py

git add simple_cosine.py
git cam "Added the new file simple_cosine.py"
git slog

- By default, all variations of the `git log` commands only show the currently active branch. 
- If we want to see all branches, we can ask for them with the `--all` flag.

In [None]:
%%bash
cd src_code

git log --all

In [None]:
%%bash
cd src_code

ls

In [None]:
%%bash
cd src_code

git merge my_own_branch
git slog

# <font color="red">Using Git: Accessing Remote Repository</font>

- We want to introduce the concept of a remote repository: a pointer to another copy of the repository that lives in a different location. 
- A remote repository can be a different path on the filesystem or a server on the internet.
- We will use a repository hosted on the [github.com](https://github.com) service.
   - GitHub is a code hosting platform for version control and collaboration. 
   - It lets you and others work together on projects from anywhere.
   - To use GitHub, you need some knowledge of version control and Git.

![fig_github](https://drstearns.github.io/tutorials/git/img/github.png)
Image Source: Dave Stearns (drstearns.github.io)

In [None]:
%%bash
cd src_code

ls
echo "Check if the local repository is associated with any remote repositories."
git remote -v

- Since the `git remote -v` command did not produce any output, no remote repository is configured.
- We now need to go to GitHub and create a new repository called `src_code`.
- **Do not check** the box that says _Initialize this repository with a README_, since we already have an existing repository here. That option is useful when you start on GitHub and do not already have a repository on a local computer.

In [None]:
%%bash
cd src_code

git remote add origin https://github.com/your_USERID/src_code.git
git branch -M master
git push -u origin master

In [None]:
%%bash
cd src_code

git remote -v

**Cloning the remote repository**

In [None]:
%%bash

git clone https://github.com/your_USERID/src_code.git new_src_code

In [None]:
%%bash
cd new_src_code

pwd
git remote -v

Create a new file, add and commit it to the local repository.

In [None]:
%%bash
cd new_src_code

echo "# My first repository" >> README.md
git add README.md
git commit -m "Created a README.md file"

Push the new file to the remote repository:

In [None]:
%%bash
cd new_src_code

git push

Let us update the original local repository:

In [None]:
%%bash
cd src_code

git pull

![stages](http://www.tsbakker.nl/images/gitstages.jpg)
Image Source: [http://www.tsbakker.nl/git.html](http://www.tsbakker.nl/git.html)

# <font color="red">Using Git: Resolving Conflicts</font>

- If two different branches modify the same file in the same location, git cannot decide which change should prevail. 
- Human intervention is needed to make the decision. 
- Git will help you by marking the location in the file that has a problem, but it’s up to you to resolve the conflict. 
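Git marks a conflicting region with `<<<<<<<`, `=======` and `>>>>>>>` lines. The Python sketch below separates the two competing versions in a text that resembles what Git typically writes for this kind of conflict; the sample text is hand-written for illustration, and `split_conflict` is a hypothetical helper, not a Git feature.

```python
# Hand-written example of a file containing one conflict region.
conflicted = """\
# My first repository
<<<<<<< HEAD
Changes on the master branch
=======
There will be a conflict here!
>>>>>>> my_trouble_branch
"""


def split_conflict(text):
    """Return (ours, theirs): the lines of each side of the first conflict region."""
    ours, theirs, section = [], [], None
    for line in text.splitlines():
        if line.startswith("<<<<<<< "):
            section = ours                 # lines from the current branch follow
        elif line.startswith("=======") and section is ours:
            section = theirs               # lines from the branch being merged follow
        elif line.startswith(">>>>>>> "):
            break                          # end of the conflict region
        elif section is not None:
            section.append(line)
    return ours, theirs


ours, theirs = split_conflict(conflicted)
print("current branch:", ours)
print("merged branch: ", theirs)
```

Resolving the conflict means editing the file so that the marker lines are gone and only the desired content remains, then staging and committing the result.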

In [None]:
%%bash
cd src_code

git branch my_trouble_branch
git checkout my_trouble_branch
echo "There will be a conflict here!" >> README.md
git commit -a -m "Made changes on the my_trouble_branch branch"

We now go to the master branch and change the same file:

In [None]:
%%bash
cd src_code

git checkout master
echo "Changes on the master branch" >> README.md
git commit -a -m "Modified the master branch"

Let us try to merge the `my_trouble_branch` branch into the `master`:

In [None]:
%%bash
cd src_code

git merge my_trouble_branch

We can see what git has included in the `README.md` file.

In [None]:
%%bash
cd src_code

cat README.md

We need to:
- Use a text editor
- Decide which changes to keep, and
- Make a new commit that records our decision. 

In [None]:
%%bash
cd src_code

rm -f README.md

echo "# My first repository" >> README.md
echo "There will be a conflict here!" >> README.md
echo "Changes on the master branch" >> README.md

git add README.md
git commit -m "Resolved the merge conflict in README.md"
git slog

### Exercise

- In this exercise, two code developers interact with the remote repository and modify the same file. 
- They both push their changes to the remote repository until a conflict arises.

**Developer 1**

- Clone the remote repository.
- Create a new file, commit it to the local repository and push it.

In [None]:
%%bash

git clone https://github.com/your_USERID/src_code.git code_dev_1

cd code_dev_1

echo "a = 100" > "add_numbers.py"
echo "b = 200" >> "add_numbers.py"
echo "c = a + b" >> "add_numbers.py"

git add add_numbers.py
git commit -m "Added the new file add_numbers.py"
git push

The file `add_numbers.py` is now part of the remote repository.

**Developer 2**

- Clone the remote repository that contains the file created by Developer 1.
- Update the file `add_numbers.py`, commit it to the local repository and push it.

In [None]:
%%bash

git clone https://github.com/your_USERID/src_code.git code_dev_2

cd code_dev_2

echo "d = 300" >> "add_numbers.py"

git add add_numbers.py
git commit -m "Set the variable d to 300"
git push

**Developer 1**

- Make changes to the file `add_numbers.py`, commit it to the local repository.
- Attempt to push the changes to the remote repository and realize that there is a conflict.
- Resolve the conflict.

In [None]:
%%bash

cd code_dev_1

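echo "d = 400" >> "add_numbers.py"

git commit -am "Set the variable d to 400"
git push            # rejected: the remote already has Developer 2's commit
git pull --rebase   # replay the local commit on top of it; this reports the conflict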

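The push is rejected because the remote repository contains a commit that Developer 1 does not have locally. When Developer 1 then pulls those changes with `git pull --rebase`, Git reports a conflict that looks like: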

     Using index info to reconstruct a base tree...
     M	add_numbers.py
     Falling back to patching base and 3-way merge...
     Auto-merging add_numbers.py
     CONFLICT (content): Merge conflict in add_numbers.py
     error: Failed to merge in the changes.
     Patch failed at 0001 Set the variable d to 400
     hint: Use 'git am --show-current-patch' to see the failed patch
     Resolve all conflicts manually, mark them as resolved with
     "git add/rm <conflicted_files>", then run "git rebase --continue".
     You can instead skip this commit: run "git rebase --skip".
     To abort and get back to the state before "git rebase", run "git rebase --abort".

**Developer 1** needs to first resolve the conflict by editing the file `add_numbers.py` (by choosing one value for the variable `d`) and then execute the following git commands:

In [None]:
%%bash

cd code_dev_1

echo "a = 100" > "add_numbers.py"
echo "b = 200" >> "add_numbers.py"
echo "c = a + b" >> "add_numbers.py"
echo "d = 300" >> "add_numbers.py"

git add add_numbers.py
git rebase --continue
git status
git push

# <font color="red">Summary of a Few Git Commands</font>

| Command | Description |
| --- | --- |
| `git init` | Initialize a repository. |
| `git clone` | Clone a remote repository. |
| `git add` | Add a file that is in the working directory to the staging area. |
| `git commit` | Add all files that are staged to the local repository. |
| `git log` | Show the commit logs. |
| `git push` | Send the commits in the local repository to the remote repository, where they become visible to anyone with access to it. |
| `git fetch` | Get files from the remote repository to the local repository but not into the working directory. |
| `git merge` | Combine the commits of another branch (for example, one that was just fetched) into the current branch and update the working directory. |
| `git pull` | Get files from the remote repository directly into the working directory. It is equivalent to a `git fetch` followed by a `git merge`. |

## <font color="blue">First Activity: Simple Repository</font>

Follow the steps described in <a href="https://guides.github.com/activities/hello-world/">https://guides.github.com/activities/hello-world/</a> to create a **Hello World** project.

## <font color="blue">Second Activity: Collaboration on GitHub</font>

Form teams of two (Coder "A" and Coder "B").  Here you will learn how to create a repository on GitHub, "clone" it to your local machine, and share it with a collaborator.

### Part 1 (Coder A):
- Create a new repository on GitHub called "my_git_repos".  
- Clone this to your laptop: `git clone <repository>`  

### Part 2 (Coder A):
- Add Coder B as a collaborator to the repository on GitHub.

### Part 3 (Coder B):
- Clone the repository to your local machine.  
- Create a new file in your repository called "hello.py" that contains `print("Hello, world!")`.  
- Add the file to the repository, commit the change, and "push" (`git push`) the change back to GitHub.

### Part 4 (Coder A):
- `Pull` the (updated) repository and identify the new file present.  
- Make a change to the file "hello.py".  Add, commit, and push to the server.  

### Part 5 (Coder B):
- (After Part 4 is completed) Make a change to the file "hello.py".  
- Add, commit, and try to push to server.  What happened?