# Tutorial: Git/GitHub

`Git` is a **distributed version control system (VCS)** created to track changes in source code throughout the software development process. 

**It enables multiple developers to collaborate on a project, manage various file versions, and easily revert to earlier versions when necessary.** 

`Git` is widely adopted and is regarded as the standard for version control in software development.

## Why Use Git?

Here are the key reasons:

**Collaboration:**  
Git facilitates collaboration among multiple developers working on the same project, allowing them to work concurrently and merge their changes.

**Version History:**  
Git provides a complete history of your project, making it easy to revert to previous versions if necessary.

**Backup and Recovery:**  
Since each user has a full copy of the repository, it serves as a backup. The project can be restored from any user's copy if the main server goes down. 

Important note: it only serves as a backup if the repository is cloned on multiple independent devices.

**Flexibility:**  
Git's branching and merging capabilities allow for flexible workflows, making it easy to manage complex projects and experiment with new ideas.

## Features

**Version Control:**  
`Git` tracks changes to files over time, allowing you to see who made changes, when they were made, and what those changes were.

**Distributed Architecture:**  
Unlike centralized systems, each user has a complete copy of the repository, including its entire history. This feature enables offline work and simplifies collaboration.

**Branching and Merging:**  
`Git` allows the creation of separate branches for developing new features or fixing bugs without affecting the main codebase. These branches can be merged back together later.

**Staging Area:**  
The staging area (also known as the index) lets you select specific changes to include in your next commit.

**Commit:**  
A commit is a snapshot of your changes, creating a new version of your project.

**Remote Repositories:**  
`Git` enables you to connect to remote servers (such as GitHub, GitLab, or Bitbucket) for sharing your work and collaborating with others.

## Important commands to know about

* `git help`
* `git status`
* `git init`
* `git commit`
* `git diff`
* `git log`
* `git branch`
* `git clone`
* `git push`
* `git pull`
* `git revert`

In the following, we will discuss (and use!) most of these commands.

## Installing git

It's likely that git is already pre-installed on your UNIX system. If not, here are ways to install it:

```shell
brew install git # on macOS (using Homebrew)
sudo apt-get install git # on Ubuntu Linux
```

To check that `git` is installed properly, run

```shell
git --version
```

Before we can get started, we need to tell `git` who we are:
```shell
git config --global user.name '<insert your name>'
git config --global user.email '<insert your email address>'
```

Please make sure to change the two commands to reflect your name and email address.
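
To check what `git` has recorded, a value can be read back with `git config --get`. The following is a minimal sketch, assuming a throwaway repository and a placeholder identity (`Ada Example` is made up for illustration). It omits `--global`, so the settings apply only to that scratch repository and your real configuration stays untouched.

```shell
# Scratch repository so the global configuration is not modified
repo=$(mktemp -d)
cd "$repo"
git init -q

# Without --global, the setting is stored in this repository only
git config user.name "Ada Example"        # placeholder name
git config user.email "ada@example.com"   # placeholder address

# Read the values back
name=$(git config --get user.name)
email=$(git config --get user.email)
echo "$name <$email>"
```

Repository-local settings take precedence over global ones, which can be handy if you use different identities for different projects.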

## Creating and cloning a repository
To create an empty repository called `demo`, we use the command

```shell
git init demo
```

Note that `demo` is used here as a dummy name for the repository. Feel free to use a descriptive name for your projects.

To clone (i.e., copy) an existing repository, we use

```shell
git clone https://github.com/cdrischler/compphysics-summer-tutorials
```

This clones the specified repository. `clone` also takes another argument, which specifies the desired name of the local copy. Otherwise, the name of the repository as specified remotely will be used.

Note: it is preferable to clone via a secure protocol such as SSH.

Once the repository is created or cloned, we can `cd` into it.
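
The optional second argument of `clone` can be tried without network access, because `git` can also clone from a local path. This is only a sketch under that assumption: the directory names `demo` and `demo-copy` are placeholders, and everything happens in a scratch directory.

```shell
# Work inside a scratch directory
work=$(mktemp -d)
cd "$work"

# Create an empty repository in the new directory "demo"
git init -q demo

# Clone it; the second argument names the local copy
# (git warns on stderr that the cloned repository is empty)
git clone -q demo demo-copy 2>/dev/null

# Both directories now contain a .git folder
ls -d demo/.git demo-copy/.git

# The clone remembers where it came from
origin_url=$(git -C demo-copy remote get-url origin)
echo "$origin_url"
```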

## Status
Probably the most frequently used `git` command is 
```shell
git status
```

Using this command, we get a summary of the current status of the working directory. It shows if we have modified, added, or removed files, and more.

## Adding files and committing changes

To add a new file `<filename>` to the repository, we first create the file and then use 

```shell
git add <filename>
```

We can create such a file conveniently from this Jupyter notebook, or we could use
```shell
vim README
```

In [None]:
%%file README

Lorem ipsum dolor sit amet, consectetur adipiscing elit. Duis ullamcorper velit ac risus sagittis congue. 

Writing README


What do you expect when running the following command?

```shell
git status
```

After we have created the file `README`, the command `git status` lists it as an *untracked* file. 

To make `git` track the file, we need to add it (to the staging index):

```shell
git add README
```

Now, it is listed as a *new file* that has not yet been committed to the repository, as `git status` tells us:

```shell
git status
```

Next, we have to `commit` the changes to our repository:

```shell
git commit -m "Added a README file"
```

The `-m` argument allows us to specify a commit message, which should always be descriptive of what was changed. If you don't add the `-m` argument, your default editor will open, allowing you to compose longer and more descriptive commit messages.

Now, `git status` should tell us that there are no local changes in the working directory:

```shell
git status
```

In other words, our working directory is clean.
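
The three states we just walked through (untracked, staged, clean) can be reproduced in one go. The sketch below assumes a scratch repository with a placeholder identity and uses `git status --short`, which prints a compact two-letter code per file instead of the full report.

```shell
# Scratch repository with a placeholder identity
repo=$(mktemp -d) && cd "$repo" && git init -q
git config user.name "Demo User"
git config user.email "demo@example.com"

echo "Lorem ipsum" > README
untracked=$(git status --short)   # "?? README": file is untracked
echo "$untracked"

git add README
staged=$(git status --short)      # "A  README": new file in the staging area
echo "$staged"

git commit -q -m "Added a README file"
clean=$(git status --short)       # empty output: working directory is clean
echo "clean: '$clean'"
```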

## Committing changes

`git status` lists tracked files that have been changed as *modified*.
Let's change the file we just created by adding another paragraph.

In [None]:
%%file README

Lorem ipsum dolor sit amet, consectetur adipiscing elit. Duis ullamcorper velit ac risus sagittis congue. 

Fusce ut arcu sed massa scelerisque ultrices. Phase vitae tellus porttitor, elementum nulla ac, dignissim eros.

Overwriting README


What do you expect the output of this line to be?

```shell
git status
```

Next, we stage the modified file and commit the change:

```shell
git add README
git commit -m "Added a paragraph to the README file"
```

What do you expect the output of this line to be?

```shell
git status
```
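
As a shortcut, `git commit -a` stages every modified file that is already tracked before committing, so the separate `git add` step can be skipped for such files (new, untracked files still need `git add`). A minimal sketch, assuming a scratch repository and a placeholder identity:

```shell
# Scratch repository with one committed file
repo=$(mktemp -d) && cd "$repo" && git init -q
git config user.name "Demo User"
git config user.email "demo@example.com"
echo "first paragraph" > README
git add README
git commit -q -m "Added a README file"

# Modify the tracked file
echo "second paragraph" >> README
modified=$(git status --short)    # " M README": modified, not yet staged
echo "$modified"

# -a stages all modified tracked files, then commits
git commit -q -a -m "Added a paragraph to the README file"
after=$(git status --short)       # empty again
count=$(git rev-list --count HEAD)
echo "commits so far: $count"
```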

## Removing and moving files

To remove the file `<filename>` in the repository, we use 

```shell 
git rm <filename>
```

In [None]:
%%file tmpfile

This is a temporary file that was not meant to be added to the repository.

Writing tmpfile


Let's pretend we add this file unintentionally to the repository:

```shell
git add tmpfile
git commit -m "adding file tmpfile"
git status
```

Ah, we didn't intend to do that, so let's remove it again.

```shell
git rm tmpfile
git commit -m "remove temporary file that was unintentionally added" 
git status
```

Note that we corrected the "mistake". But `git` keeps track of all changes, intended and unintended ones. This is a huge upside!
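
To see that the removed file really is still part of the history, we can ask `git show` for its content in an earlier commit. The sketch below assumes a scratch repository and a placeholder identity; `HEAD~1` refers to the commit just before the latest one.

```shell
# Scratch repository with a placeholder identity
repo=$(mktemp -d) && cd "$repo" && git init -q
git config user.name "Demo User"
git config user.email "demo@example.com"

# Add the file by "mistake", then remove it again
echo "temporary" > tmpfile
git add tmpfile
git commit -q -m "adding file tmpfile"
git rm -q tmpfile
git commit -q -m "remove temporary file that was unintentionally added"

# The file is gone from the working directory ...
ls -A

# ... but its content is still stored in the previous commit
recovered=$(git show HEAD~1:tmpfile)
echo "$recovered"
```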

Likewise, to move or rename a file `<filename>` in the repository, we use 

```shell 
git mv <filename> <new-filename>
```

## Commit logs

The messages that are added to the commit command are supposed to give a brief yet informative description of the content of the commit. 

We inspect the repository's history using the command:

```shell
git log
```

Note that each commit is assigned a unique identifier, a SHA, which looks like `9b704e12f58b8275ff7950e97f25d6fd889ffc07`.

Each commit also specifies the author who created it (with contact information) and the time at which it was created.
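
For longer histories, a more compact view is often easier to read. The following sketch (scratch repository, placeholder identity, made-up commit messages) shows `--oneline` and a custom `--format` string, where `%h` is the abbreviated SHA, `%an` the author name, and `%s` the first line of the commit message.

```shell
# Scratch repository with two commits
repo=$(mktemp -d) && cd "$repo" && git init -q
git config user.name "Demo User"
git config user.email "demo@example.com"
echo "one" > README
git add README
git commit -q -m "Added a README file"
echo "two" >> README
git commit -q -am "Added a paragraph to the README file"

# One line per commit: abbreviated SHA and message
git log --oneline

# Custom format: short SHA, author name, message
git log --format='%h  %an  %s'

# The full identifier of the latest commit
full_sha=$(git rev-parse HEAD)
count=$(git rev-list --count HEAD)
echo "$full_sha ($count commits)"
```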

## Diff

To see the changes in the working directory that have not yet been staged (i.e., changes with respect to the staging area), we use 
```shell 
git diff
```

Let's give it a try:

In [None]:
%%file README

Lorem ipsum dolor sit amet, consectetur adipiscing elit. Duis est nunc ullamcorper velit ac risus sagittis congue. 

Fusce ut arcu sed massa scelerisque ultrices. Phase vitae tellus porttitor, elementum nulla ac, dignissim eros.

Overwriting README


Now, let's see what the changes are:

```shell
git diff README
```

What has changed?

If `README` is omitted, we will get a `diff` of all modified files in the working directory.
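
Because `git diff` compares against the staging area, a change disappears from its output once it has been staged. To inspect staged changes, `git diff --staged` can be used instead. A small sketch, assuming a scratch repository and a placeholder identity; `--name-only` prints just the file names to keep the output short.

```shell
# Scratch repository with one committed file
repo=$(mktemp -d) && cd "$repo" && git init -q
git config user.name "Demo User"
git config user.email "demo@example.com"
echo "line one" > README
git add README
git commit -q -m "Added a README file"

# Modify the file without staging it
echo "line two" >> README
unstaged=$(git diff --name-only)          # README
echo "unstaged: $unstaged"

# After staging, plain git diff shows nothing ...
git add README
after_add=$(git diff --name-only)         # empty
# ... while --staged compares the staging area with the last commit
staged=$(git diff --staged --name-only)   # README
echo "staged: $staged"
```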

## Discard changes in the working directory

To discard changes in the working directory and revert to the version of a file stored in the repository, we use:

```shell
git checkout -- README
git status
```
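
The effect can be checked in a scratch repository (placeholder identity assumed). Keep in mind that the discarded edits are not stored anywhere, so they cannot be recovered afterwards.

```shell
# Scratch repository with one committed file
repo=$(mktemp -d) && cd "$repo" && git init -q
git config user.name "Demo User"
git config user.email "demo@example.com"
echo "original" > README
git add README
git commit -q -m "Added a README file"

# Make an edit we do not want to keep
echo "unwanted edit" >> README
git status --short                # " M README"

# Discard it (in Git 2.23 and later, `git restore README` does the same)
git checkout -- README
content=$(cat README)
status_after=$(git status --short)
echo "$content"
```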

## Examine previous revisions

If we want to get the code for a specific revision, we use the command `git checkout` and pass the SHA of the commit we want to check out:

```shell
git log
git checkout <insert SHA: looks like c79d7cfa77d831f4bfbb15ea76721949118c10ec>
```

Now, explore the repository at the previous snapshot you just checked out. What do you find?

We return to the latest version of our repository using (replace `main` with `master` if that is the name of your default branch):

```shell
git checkout main 
```

You can verify that we are back in the latest version, e.g.,

```shell
less README
git status
```
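
While an old commit is checked out, we are not on any branch; `git` calls this state a *detached HEAD*. The sketch below walks through the round trip in a scratch repository with a placeholder identity. It looks up the branch name instead of assuming `main`, because the default name depends on the local Git configuration.

```shell
# Scratch repository with two versions of README
repo=$(mktemp -d) && cd "$repo" && git init -q
git config user.name "Demo User"
git config user.email "demo@example.com"
printf 'first version\n' > README
git add README
git commit -q -m "first version"
printf 'second version\n' > README
git commit -q -am "second version"

branch=$(git symbolic-ref --short HEAD)      # current branch, e.g. main
first=$(git rev-list --max-parents=0 HEAD)   # SHA of the very first commit

# Check out the old snapshot
git checkout -q "$first"
old=$(cat README)                            # content of the first version
state=$(git rev-parse --abbrev-ref HEAD)     # prints "HEAD" when detached

# Return to the latest version
git checkout -q "$branch"
new=$(cat README)
echo "$old -> $new"
```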

## Revert

To revert the modifications made in a commit, we use 

```shell
git revert <SHA>
```

The default editor will open, allowing the user to compose a commit message detailing why this commit is reverted.
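
Note that `git revert` does not delete the original commit; it adds a new commit that undoes its changes. A minimal sketch in a scratch repository (placeholder identity and messages), using `--no-edit` to accept the default commit message instead of opening the editor:

```shell
# Scratch repository: one good commit, one bad commit
repo=$(mktemp -d) && cd "$repo" && git init -q
git config user.name "Demo User"
git config user.email "demo@example.com"
echo "good line" > README
git add README
git commit -q -m "Added a README file"
echo "bad line" >> README
git commit -q -am "Added a line we will regret"

# Undo the latest commit by creating a new, inverse commit
git revert --no-edit HEAD > /dev/null

content=$(cat README)                # only the good line is left
count=$(git rev-list --count HEAD)   # history grew to three commits
git log --oneline
```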

## Branching

Branches allow us to **create different versions** of the working directory within the same repository. 

They are particularly useful for **experimental development** or new features that involve significant changes, which could potentially disrupt the functionality of the main branch (e.g., where the release version is located). 

Once the development of a branch reaches a stable state, it can be **merged** into the main branch. 

Using the strategy of branching, development, and merging is effective when multiple people are **collaborating** on the same shared repository. 

However, even in repositories managed by a single author, it can be beneficial to maintain the main branch in a working state (e.g., containing the release version). 

**It is advisable to create a branch before implementing a new feature and later merge it back into the main branch.**

We create a new branch called `feature` and switch to it using

```shell
git checkout -b feature
```

To list all branches, we use

```shell
git branch -a
```

We switch back to the `main` branch (or, in the same way, to any other existing branch):

```shell
git checkout main
```

Now, let's go back to `feature` and make some changes:

```shell
git checkout feature
```

Note that we didn't specify `-b` this time because that option creates a new branch.

In [None]:
%%file README

Lorem ipsum dolor sit amet, consectetur adipiscing elit. Cras eleifend erat nec augue aliquet, non sollicitudin augue maximus. Sed laoreet laoreet enim vitae pretium. Fusce non nisi massa. Nulla egestas massa in congue imperdiet. Phase at dignissim leo. Donec at dictum mauris. Suspendisse quam velit, luctus vitae blandit rhoncus, suscipit at sem. Nullam facilisis feugiat pharetra. Curabitur eu iaculis urna.

Nulla ullamcorper dignissim ex, id pulvinar est consectetur eu. Lorem ipsum dolor sit amet, consectetur adipiscing elit. Sed a odio vestibulum, porttitor ipsum sed, tempor tellus. In consectetur condimentum lorem non interdum. Fusce nec efficitur urna, sed tempor sapien. Nunc nunc eros, rutrum a iaculis in, luctus eu dui. Fusce convallis enim odio, sit amet luctus velit pulvinar eget. Phase eu ipsum molestie, efficitur nibh ac, sodales leo. Aliquam malesuada odio non accumsan aliquet. Nulla ex metus, aliquam at lectus non, sollicitudin semper turpis. Suspendisse nec porttitor orci. 

What do you expect from the lines below?

```shell
git add README
git commit -m "edited README in feature branch"
git branch
git checkout main
git branch
```

We can merge an existing branch and all its edits into another branch (for example, the `main` branch).

To do so, we first check out the target branch, here `main`, and then merge `feature` into it:
```shell
git checkout main
git merge feature
git branch 
```

After the merge is complete, we can delete the branch `feature` using:

```shell
git branch -d feature
git branch
less README
```
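
The whole cycle of branching, committing, merging, and deleting can be replayed in a scratch repository. This is a sketch with a placeholder identity; it looks up the name of the initial branch rather than assuming `main`, since that name depends on the local Git configuration.

```shell
# Scratch repository with one commit on the initial branch
repo=$(mktemp -d) && cd "$repo" && git init -q
git config user.name "Demo User"
git config user.email "demo@example.com"
echo "Lorem ipsum" > README
git add README
git commit -q -m "Added a README file"
main_branch=$(git symbolic-ref --short HEAD)

# Create the branch, make a change, and commit it there
git checkout -q -b feature
echo "feature work" >> README
git add README
git commit -q -m "edited README in feature branch"

# Switch back and merge the branch into the initial branch
git checkout -q "$main_branch"
git merge -q feature

# The branch is fully merged, so it can be deleted safely
git branch -d feature > /dev/null
remaining=$(git branch --list feature)   # empty: branch is gone
cat README
```

Here `-d` refuses to delete a branch that has not been merged yet, which protects against losing work by accident.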

**What happens if you merge two branches in which the same file has been edited?**

If two different lines in the file were changed, `git` can automatically merge the two files. The merged file will contain both edited lines. 

This feature makes Git, especially in combination with remote repositories, an **excellent tool for collaboration**!

However, if the same line was changed, then Git has no way to know which one is the "correct" one. 

The result is a so-called **merge conflict**. That means, the Git user has to decide manually which one (or neither) to use. 

If you are in this situation, you can abort the merge (without losing your data) using

```shell
git merge --abort
```

To identify which file caused the merge conflict, use `git status`, locate the file(s) that were changed in both branches, and edit them as needed. 

Git marks the conflicting edits in the file(s) that caused the merge conflict. Convenient, right?
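
To see what a conflict looks like, we can provoke one on purpose. The sketch below (scratch repository, placeholder identity, made-up file contents) changes the same line on two branches, attempts the merge, shows the conflict markers, and then aborts.

```shell
# Scratch repository with one commit
repo=$(mktemp -d) && cd "$repo" && git init -q
git config user.name "Demo User"
git config user.email "demo@example.com"
printf 'shared line\n' > README
git add README
git commit -q -m "Added a README file"
main_branch=$(git symbolic-ref --short HEAD)

# Change the same line on two branches
git checkout -q -b feature
printf 'feature version\n' > README
git commit -q -am "edit on feature"
git checkout -q "$main_branch"
printf 'main version\n' > README
git commit -q -am "edit on main"

# The merge stops with a conflict (non-zero exit status)
git merge feature > /dev/null 2>&1 || true
conflicted=$(git diff --name-only --diff-filter=U)   # files with conflicts
cat README                                           # shows the conflict markers
markers=$(grep -c '^<<<<<<<' README)

# Give up on the merge and restore the state before it
git merge --abort
restored=$(cat README)
echo "$restored"
```

Instead of aborting, we could also edit `README` to the desired content, remove the markers, and then run `git add README` followed by `git commit` to conclude the merge.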

## Interfacing with remote repositories

Cloned repositories, e.g., those from `GitHub`, know the address of the remote repository they originated from (i.e., the `origin` repository).

```shell
git remote
git remote show origin
```

We 
* retrieve changes from the `origin` repository by "pulling" and 
* update the `origin` repository with our changes by "pushing".

```shell
git pull origin
```

The argument `origin` is optional here because, for a cloned repository, `origin` is the default remote.

Naming the remote explicitly becomes useful when we work with multiple remote repositories.

The usual workflow is as follows:

1. Before we start working, we `git pull` to make sure we have the most recent version of the repository.
2. We make changes in our working directory.
3. We `git add` and then `git commit` changes to our local repository. 
4. We `git push` those commits to the remote repository.

For example:

```shell
git status
git pull
git add README
git commit -m "added README" 
git push
```

Note that `git push` requires write access to the remote repository, which you may not have.
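
If you have no remote repository with write access at hand, the workflow can still be practiced locally. The sketch below is an illustration under that assumption: a *bare* repository (`remote.git`) stands in for the hosted one, and `alice` and `bob` are two hypothetical collaborators, each with their own clone.

```shell
# A bare repository plays the role of the remote server
base=$(mktemp -d) && cd "$base"
git init -q --bare remote.git

# Alice clones it (git warns that it is empty), commits, and pushes
git clone -q remote.git alice 2>/dev/null
cd alice
git config user.name "Alice"
git config user.email "alice@example.com"
echo "hello" > README
git add README
git commit -q -m "added README"
git push -q origin HEAD       # publish the current branch to origin
cd ..

# Bob clones the repository and receives Alice's commit
git clone -q remote.git bob

# Alice pushes another change ...
echo "more text" >> alice/README
git -C alice commit -q -am "extended README"
git -C alice push -q origin HEAD

# ... and Bob retrieves it by pulling
git -C bob pull -q
pulled=$(cat bob/README)
echo "$pulled"
```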

## GitHub and similar service providers

`GitHub` is a popular hosting service for `git` repositories, widely used for both open- and closed-source projects.

Using a hosted repository makes it easy to collaborate with your colleagues on the same repository, which may contain source code or LaTeX documents, or both at the same time. 

`GitHub` offers a graphical user interface, allowing you to browse the code, view commit logs, and track issues, among other features.

**Popular hosting services for git:**
- GitHub (free subscription with your Ohio U email address)
- GitLab
- Bitbucket