# Git Workshop (approx. 2 hours)

In this workshop we will learn the basic functionalities of **Git** and how these can improve both our solo and team workflow.
Most of this workshop will be conducted using a **bash shell** and **Jupyter**, and, as such, some basic command line usage will also be shown and explained.

**Jupyter** is an open-source web-based interactive development environment that allows the creation and sharing of documents that contain live code, equations, visualizations and explanatory text (*like this one*).
One of the great advantages of this approach for a workshop is that all the setup happens on the organizer's side, whereas the participant only needs to focus on learning!

**Protip:** To run **Jupyter cells**, you can either click the run button or (after selecting a cell with your mouse, like this one) simply press:

- `SHIFT+ENTER` : Run the current cell and move the selection to the next one.
- `CTRL+ENTER` : Run the current cell (**recommended**, so that you do not run cells you did not mean to).

![run](res/run.png)

### Setup (10min *or* 0min)

From here, you have the option of installing all of the required software and downloading the [repository files for this workshop](https://github.com/acmfeup/git-workshop). However, the [**recommended option**](#1---Creating-a-Repository) is to follow the workshop on this platform, and, after completion, configure all the required software for local usage.

#### Local Software Requirements

These instructions are intended to be just a guide on what to install **after** completing this workshop.

For Linux, a simple terminal (Bash Shell) should be enough, and `git` should already be installed!

For Windows, Git needs to be installed from [here](https://git-scm.com/downloads)!

***Windows Remarks***
- Git provides a usable Bash shell after installation;
- Microsoft recently started providing an integrated Linux subsystem (usually referred to as WSL, you can learn more [here](https://docs.microsoft.com/en-us/windows/wsl/faq#what-is-windows-subsystem-for-linux-wsl)) through its [store](https://www.microsoft.com/store/productId/9NBLGGH4MSV6), which is highly recommended, as Windows can be an obstacle throughout your developer life.

***Terminal Remarks***
- Most terminal commands follow the format: `<command name> [sub commands] [-options] [...parameters]`;
- Whenever in doubt about how exactly a command works, check its manual page, which details its options and parameter patterns. You can access the manual page by running `man <command name>`, e.g., `man ls`. Some names are shared by several manual pages, and a section number before the name selects the correct one, e.g., `man 1 printf`. However, **using Google is also an option**;
- Bash provides redirection and pipe operators by default: `>` writes a command's output to a file, `>>` appends it to a file, and `|` passes it as input to the next command;
- Bash is a great tool for scripting: you can combine the operators above to chain several commands from input to output, and run a series of commands one after another on a single line using the `;` operator.
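
To make the operators above concrete, here is a minimal sketch. The file name `notes.txt` and the temporary directory are placeholders chosen for this illustration; they are not part of the workshop files.

```shell
# create a throwaway directory so nothing in your home folder is touched
demo_dir="$(mktemp -d)"

# '>' writes the output to a file, replacing any previous content
echo "first line" > "$demo_dir/notes.txt"

# '>>' appends the output to the end of the file
echo "second line" >> "$demo_dir/notes.txt"

# '|' feeds the output of one command into the next one:
# cat prints the file, wc -l counts its lines, tr strips padding spaces
line_count="$(cat "$demo_dir/notes.txt" | wc -l | tr -d ' ')"

# ';' runs two commands in a row on a single line
echo "lines: $line_count"; ls "$demo_dir"
```

Swapping the `>>` for a second `>` would leave only one line in the file, which is the usual way to tell the two operators apart.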

### 1 - Creating a Repository 

The first step in using **Git** is creating a repository. A repository is just like a file system, very similar to the one you have on your own computer. The special thing about Git repositories is that they contain some extra information in a folder named `.git` that will allow us to do a lot more, and a lot more easily, than we could with a regular file system. To experience this difference, let's create a new folder and initialize an empty repository inside it.

In [None]:
# if you ever need to RESET your progress run this cell
rm -rf ~/my-first-repo
cd ~

In [None]:
# create a directory called my-first-repo
mkdir my-first-repo

# enter that directory
cd my-first-repo

# initialize a git repository
git init

You just created your first repository. Let's see what's inside. 

**Disclaimer:** it's empty.

***WARNING:*** Many of the commands you will find in this exercise cannot be run twice and will output errors if you do.

In [None]:
# list the files in the current directory
ls -la

You should see 3 entries:
- `.` - a single dot. Represents the current directory;
- `..` - two dots. Represents the parent directory;
- `.git` - The special directory we mentioned above. This is how Git knows the current directory is also a Git repository.

__Tip__: the `-la` option shows the output as a list and includes hidden files, i.e., files starting with a dot `.`.

#### 1.1 Command breakdown:
**`mkdir`** - **M**a**k**e **Dir**ectory, a command used to create a new directory, or folder, as it is most commonly referred to. The single parameter, with no options, is the name of the new empty directory.

**`cd`** - **C**hange **D**irectory, used to change the current working directory, translates to opening a folder in a GUI (Graphical User Interface). Again, the singular parameter will be the name of the directory to change into.

**`git init`** - This time we have two important words, although the same pattern of command (`git`) and parameter (`init`) follows. Since git provides a lot of functionality, it is broken down into many "subcommands" of **git**, which themselves also support arguments and options. In this case, **git init** creates an empty repository.
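
As a quick check of what `git init` leaves behind, the sketch below creates a repository in a temporary directory. The name `sandbox-repo` is made up for this illustration, and the `git init <directory>` form and the `-q` and `-C` options are not used elsewhere in this workshop: they respectively create the target directory, silence the output, and run Git as if it had been started in the given directory.

```shell
# create the repository inside a temporary directory (created by mktemp)
demo_dir="$(mktemp -d)"
git init -q "$demo_dir/sandbox-repo"

# list everything, including hidden entries: besides "." and "..", only ".git" is there
ls -a "$demo_dir/sandbox-repo"

# ask Git whether this directory is inside a repository's working tree
inside="$(git -C "$demo_dir/sandbox-repo" rev-parse --is-inside-work-tree)"
echo "inside a work tree: $inside"
```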

### 2 - Your First Commit

Now that we have a repository, you should add files to it, typically from a project you're working on.

Let's copy some files from a sample project - `messy-files` - to our repository.

In [None]:
# copy "messy files" to the current directory, i.e., '.'
cp -R ../messy-files/* .
ls -la

You should see the following files:
- check_bullet.png
- index.html
- js1.js
- js2.js
- lib1.lib
- lib2.lib
- style1.css
- style2.css

Ok, so we just added a bunch of files to our directory. 

We have not yet told Git to track them - we can check this through the `git status` command.

In [None]:
git status

Let's tell Git to track one of the files we just added - `index.html`.

In [None]:
# add index.html to the staging area
git add index.html

# check status again
git status

Finally, we can _commit_ our changes. It is good practice to use the `-m` option to succinctly describe the changes we're making.
But **first** we need to tell `git` who we are! Edit the next cell to match your name and email.

In [None]:
git config --global user.email "you@example.com"
git config --global user.name "Your Name"

In [None]:
git commit -m "my first commit: add index.html"

That's it! Our first commit. 
Once a change is committed, it becomes part of the repository's history and is registered in the logs - unless it is explicitly deleted (out of scope of this workshop). 

We still need to track the remaining files - let's make a second commit to add them. Again, don't forget to add a clear message of what is changing.

In [None]:
# add all changes (including untracked files) to the staging area
git add -A
# commit the changes
git commit -m "add remaining files from 'messy-files' project"

We can now check the repository's history with `git log`. We should see both commits and their respective descriptive messages.

In [None]:
# see the repository's entire history
git log

# Tip: try these alternatives 
#
# show one line per commit
#git log --oneline
#
# show the last commit, including changes
#git log -p -1

#### 2.1 Command Breakdown:
**`git status`** - Displays generic information as well as the current status of the working tree and staging area of the repository.

**`git add [file/directory]`** - Used to select changes in our _working tree_ to be added to the _staging area_. The working tree is the part of the local file system that's included in our git repository, and the staging area is a Git abstraction that holds the changes that will be recorded by the next `commit` command. The `-A` option stands for _all_ and means all changes in the working tree are added to the staging area. Although useful, don't make a habit of using the `-A` option, as it makes it easy to commit files you did not intend to.
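
The difference between the working tree and the staging area can be seen in a small sandbox. This is only a sketch: the files `a.txt` and `b.txt` and the temporary repository are invented for the illustration, and `git diff --cached --name-only` is one of several ways to list what is currently staged.

```shell
# throwaway repository, separate from my-first-repo
repo="$(mktemp -d)"
git init -q "$repo"

# two new files in the working tree
echo "one" > "$repo/a.txt"
echo "two" > "$repo/b.txt"

# stage only a.txt
git -C "$repo" add a.txt

# names of the files currently in the staging area
staged="$(git -C "$repo" diff --cached --name-only)"
echo "staged: $staged"

# short status: "A" marks a staged new file, "??" an untracked one
git -C "$repo" status --short
```

Only `a.txt` would be part of the next commit; `b.txt` stays in the working tree, untracked, until it is added too.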

**`git commit`** - As the name implies, we are committing to a change we made to the repository. Once you *commit*, the state of the repository is registered and you can always restore it or compare it with the current state. Commits can be deleted, but it's not something you do on a regular basis, as it's usually easier to fix something than to redo everything else. The `-m` option lets you add a message to your commit. Messages should usually follow a convention that lets you quickly understand what's been added, removed or modified with that commit.

**`git log`** - The log command is a very powerful one, as it allows you to review the full history of your repository from the moment the init command was executed. It features plenty of options, including:

- `--oneline` view each commit as a single line;
- `-p` (patch) displays each commit's difference in the format of additions and deletions; 
- `-<number>` show only the latest _n_ commits. Can be combined with other options, e.g. `git log -2 -p`.

***WARNING***: if you ever commit a password or other secret information to a repository, it will remain registered in the log even if you delete it in a later commit, for as long as the original commit is not deleted. This is a security risk in case anyone gets access to the `.git` directory. Secret information should be kept separate from the rest of the code, and online platforms often offer support for sharing secrets safely between developers.
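
The warning above can be reproduced safely in a sandbox. In this sketch the file `secret.txt`, its content and the temporary repository are all invented, and the identity is configured for that repository only (no `--global`), so your real settings are left alone.

```shell
# throwaway repository with a local identity
repo="$(mktemp -d)"
git init -q "$repo"
git -C "$repo" config user.email "you@example.com"
git -C "$repo" config user.name "Your Name"

# commit a "secret" by mistake
echo "password=hunter2" > "$repo/secret.txt"
git -C "$repo" add secret.txt
git -C "$repo" commit -q -m "add secret by mistake"

# delete it in a later commit
git -C "$repo" rm -q secret.txt
git -C "$repo" commit -q -m "remove secret"

# the file is gone from the working tree...
ls -a "$repo"

# ...but the previous commit (HEAD~1) still holds its content
leaked="$(git -C "$repo" show HEAD~1:secret.txt)"
echo "still in history: $leaked"
```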

### 3 - Making Changes

Having all files in the root directory is very messy and hard to work with. To sort our files, we now create 3 folders: `css`, `js`, and `lib`. Then, let's move each file to its respective folder, according to its extension.

In [None]:
# create the three directories
mkdir css
mkdir js
mkdir lib

# move all JavaScript files to the js directory
mv *.js js
# move all CSS files to the css directory
mv *.css css
# move all .lib files to the lib directory
mv *.lib lib

# list current directory
ls -la

In [None]:
git status

Because we moved the files into folders, Git has lost track of them - in fact, it thinks we deleted them. Again, this happens because we have made the changes in our filesystem, but we have not yet added them to the staging area.

In [None]:
# add all changes to the staging area
git add -A

# check status
git status

Git now correctly identifies the changes we made. However, we won't commit these changes right away. 

As you might have noticed, we have added `.lib` files to our repository. In general, it is good practice to use Git to track code files *only* - that means other files commonly present in software projects, such as binaries or cache files, should **not** be added. In this case, `.lib` files are Windows static libraries - why do we even have these here (?) - so we should remove them from the repository. To do so, let's remove the `lib` directory from the staging area, using the `git reset` command.

In [None]:
# remove lib from the staging area (or "unstage")
git reset lib
git status

Looks good - the `lib` directory is now back in the "Untracked files" section. We can now commit our changes.

In [None]:
git commit -m "add js and css folders, remove .lib files"

#### 3.1 - Command Breakdown

**`mkdir`** - Already covered [here](#1.1-Command-Breakdown:);

**`mv`** - **M**o**v**e, following the same command pattern presented [here](#Command-Breakdown:), but with some extra functionality. It may look like we are moving a file named `*.js` - and there's no such file. Instead, all files that end in `.js` are moved. In shell patterns (known as *globs*, not to be confused with regular expressions), the `*` symbol is a wildcard that represents "any number of characters". In this context, it means any file will be selected to move, regardless of its name, as long as it ends in `.js`. The same applies to the other extensions. 
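
A harmless way to see what a pattern such as `*.js` selects is to hand it to `echo` instead of `mv`. The file names below are made up for this sketch and live in a temporary directory.

```shell
# temporary directory with three empty files
demo_dir="$(mktemp -d)"
touch "$demo_dir/app.js" "$demo_dir/util.js" "$demo_dir/style.css"

# the shell replaces *.js with the matching file names before echo runs;
# the cd happens inside $( ), so the current directory of this cell is unchanged
matches="$(cd "$demo_dir" && echo *.js)"
echo "$matches"
```

The same expansion happens for `mv *.js js`: by the time `mv` starts, it has already received the list of matching files.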

**`git reset [file/directory]`** - this command resets the staging area, without changing the working tree. In other words, what we have added with `git add` will be removed from the staging area, but our files will remain the same. **Tip:** using the command without arguments resets everything you added to the staging area.

**WARNING:** the reset command provides a lot more functionality, which at times may even delete changes you have made to the working tree, and with them all work that has not been committed. Make sure you know exactly what you are doing before copying random commands from StackOverflow.
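
The safe form of `git reset` described above can be tried in a sandbox. This is a sketch under assumptions: `readme.txt`, `notes.txt` and the temporary repository are invented, and the `--` separator is added only to make explicit that `notes.txt` is a path.

```shell
# throwaway repository with a local identity and one commit
repo="$(mktemp -d)"
git init -q "$repo"
git -C "$repo" config user.email "you@example.com"
git -C "$repo" config user.name "Your Name"
echo "hello" > "$repo/readme.txt"
git -C "$repo" add readme.txt
git -C "$repo" commit -q -m "add readme"

# stage a new file
echo "draft" > "$repo/notes.txt"
git -C "$repo" add notes.txt
staged_before="$(git -C "$repo" diff --cached --name-only)"

# unstage it again: the staging area changes, the file does not
git -C "$repo" reset -q -- notes.txt
staged_after="$(git -C "$repo" diff --cached --name-only)"
content="$(cat "$repo/notes.txt")"

echo "staged before reset: '$staged_before'"
echo "staged after reset:  '$staged_after'"
echo "file content:        '$content'"
```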

### 4 - Ignoring Files

> _"Friends don't let friends upload `.DS_Store` files"_ - A good friend.

Now, we know we don't ever want to add the `lib` directory to our repository. We're in luck - we can explicitly tell Git to ignore certain files or directories through a special file named **`.gitignore`**. Files and directories listed in this file will be ignored by Git, as long as they are not already tracked.

In this case, let's create a `.gitignore` file and add a `lib` entry to it. 

In [None]:
# create a .gitignore file with 'lib' in the first line  
echo "lib" >> .gitignore

# commit the .gitignore file
git add .gitignore
git commit -m "add .gitignore file with 'lib' entry"

# check status 
git status

#### 4.1 Command Breakdown

**`echo`** - This command is used to print information to the shell. When combined with the `>>` redirection operator, it is very useful for writing information into files. So, the complete command prints `lib`, and `>>` appends that output to the `.gitignore` file. If the file doesn't exist, it is created.
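
To confirm that an entry in `.gitignore` has the intended effect, the sketch below repeats the step in a temporary repository. The file `lib/lib1.lib` is recreated with dummy content, and `git check-ignore`, which is not covered in this workshop, is used to ask Git whether a path is ignored.

```shell
# throwaway repository containing a lib directory
repo="$(mktemp -d)"
git init -q "$repo"
mkdir "$repo/lib"
echo "binary blob" > "$repo/lib/lib1.lib"

# same command as in the cell above, pointed at the sandbox
echo "lib" >> "$repo/.gitignore"

# only .gitignore shows up as untracked; lib is no longer listed
untracked="$(git -C "$repo" status --short)"
echo "$untracked"

# check-ignore exits successfully when the path is ignored
if git -C "$repo" check-ignore -q lib/lib1.lib; then
  echo "lib/lib1.lib is ignored"
fi
```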

### 5 - Parallel Work - Branching and Merging 

> _"No one touches `main.c`, I'm adding a feature!"_ - Non Git users.

Branching is a great feature of Git that allows us to work in parallel on different or even overlapping sections of code with relative independence. It is especially useful in team-based projects, where people need to work simultaneously without interfering with others' work. 

#### Simple Workflow
Every repository starts with a default branch called `master` (recent Git setups and hosting services may name it `main` instead) - typically considered the main line of development. When you create and work in a different branch, think of it as a divergence from the main line of development - here, you can continue to do work without messing with that main line.

You typically create a branch when working on a new feature or bug fix. When you decide the code is finished (and tested), you _merge_ it back to the source branch.

Let's assume the current state of code is in the `master` branch. Every time you want to create a new awesome feature for the project, a *simplified workflow* is as follows:

1. Create a branch for the feature - `git branch feature-name`
2. Do some work in that branch - _write code_ 
3. After it's tested, _merge_ your changes into the original branch - in this case, `master`.

![https://davidjcastner.github.io/git-tutorial/Lab3](res/git-branches.png)

Figure 1: Illustration of a simple branch usage. Each node represents a commit.

We'll follow this workflow with an example in our ongoing project. 

**Problem to fix:** Something is off with the styling - it's a bug in the `css/style2.css` file.

We can check which branch we are currently in with the `git branch` command.

In [None]:
# check the current branch
git branch

As expected, we're in the `master` branch. Let's create a branch and use it to edit the `css/style2.css` file.

**Tip:** Branches are typically named after what they are created for. Just like code variables, give your branches understandable names instead of "abcd".

In [None]:
# create a branch called fix-style2
git branch fix-style2

# checkout the created branch
git checkout fix-style2

# confirm that we are now working in the fix-style2 branch
git branch

As the `*` indicates, we are now working in the new branch. 

Let's edit the `css/style2.css` file to set the `text-align` property to "center" instead of "unset", then commit the changes.

**Your turn.** Try this one on your own. Remember:
- Edit the `css/style2.css` file;
- Add the file to the staging area;
- Commit your changes - don't forget to set a clear message.

Edit the cell below with the necessary commands. If you're stuck, you can find the answer at the [end of the notebook](#Appendix).

In [None]:
# Edit the css/style2.css file
# (use the browser editor)

# Add the file to the staging area
## YOUR COMMAND HERE ##

# commit your changes
## YOUR COMMAND HERE ##

Our fix is now complete. 

We can now go back to the `master` branch and merge the `fix-style2` branch into it.

In [None]:
# go to the master branch
git checkout master

# merge fix-style2 into master
git merge fix-style2

#### 5.1 Command Breakdown:
**`git branch [branch-name]`**
   - without arguments - displays the list of local branches and signals which one is currently selected; 
   - with argument - creates a new branch with the argument name.

**`git checkout`** - Checkout combined with branching is one of the features that makes Git a lot more powerful than just any file system. With the `checkout` command we can immediately change our working tree to a different branch or specific commit. Because you can always go back to any point in "history", you don't have to worry about losing code or having to keep track of changes to see how things were working before. 

**Tip:** Be careful when using the `checkout` command with changes you have not yet committed, as they can be lost for good. Git [helps avoid this](https://git-scm.com/docs/git-stash), but it is not covered in this workshop.

**``git merge <branch-name>``** - Merges a branch **into the current branch**. It will try to automatically combine the changes from the parameter branch with the current branch and resolve their differences. There are several options you can explore, and even specific strategies to use when merging, but it's never perfect and conflicts often arise. After the command is successfully executed, the current branch will be up to date with all the changes made in the parameter branch.
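
The three commands above can be watched working together in a sandbox. In this sketch, `page.txt`, the branch name `update-page` and the temporary repository are invented; the name of the starting branch is read from Git instead of being assumed, since it depends on your Git setup.

```shell
# throwaway repository with a local identity
repo="$(mktemp -d)"
git init -q "$repo"
git -C "$repo" config user.email "you@example.com"
git -C "$repo" config user.name "Your Name"

# first commit on the default branch
echo "version 1" > "$repo/page.txt"
git -C "$repo" add page.txt
git -C "$repo" commit -q -m "add page"
base="$(git -C "$repo" rev-parse --abbrev-ref HEAD)"

# create a branch, switch to it and commit a change there
git -C "$repo" branch update-page
git -C "$repo" checkout -q update-page
echo "version 2" > "$repo/page.txt"
git -C "$repo" commit -q -am "update page"

# back on the starting branch, the working tree shows the old content again
git -C "$repo" checkout -q "$base"
before_merge="$(cat "$repo/page.txt")"

# merge the branch into the current one
git -C "$repo" merge -q update-page
after_merge="$(cat "$repo/page.txt")"

echo "before merge: $before_merge"
echo "after merge:  $after_merge"
```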

#### Typical Workflow

You now have the basics of branching and merging down. However, this simple workflow does not scale well when working with multiple people or complex projects.

In fact, branching workflows vary considerably between teams or companies. Nevertheless, a good starting point to follow in most projects is:
- **`master`** branch - entirely stable code. Everything here must be ready for production use, and has been thoroughly reviewed and tested;
- **`develop`** or **`next`** branch - parallel branch used for testing stability. Whenever a feature or bug fix is complete, it is merged into this branch and further tested. When ready, it is merged into `master`;
- **`feature`** or **`topic`** branches - short lived branches for isolated development. Typically used for new features or experiments, to make sure they don't introduce bugs to the most stable branches. 

![https://git-scm.com/book/en/v2/Git-Branching-Branching-Workflows](res/branch-workflow.png)

Figure 2: View of a project's typical branch workflow.

Make sure to use something along these lines when creating your next project.
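
One possible way to walk through that workflow from the command line is sketched below. The branch name `feature-login`, the files and the temporary repository are invented, and `git checkout -b`, which creates a branch and switches to it in a single step, is a shortcut not used elsewhere in this workshop.

```shell
# throwaway repository with a local identity and a first stable commit
repo="$(mktemp -d)"
git init -q "$repo"
git -C "$repo" config user.email "you@example.com"
git -C "$repo" config user.name "Your Name"
echo "stable" > "$repo/app.txt"
git -C "$repo" add app.txt
git -C "$repo" commit -q -m "initial stable version"
stable="$(git -C "$repo" rev-parse --abbrev-ref HEAD)"

# develop starts from the stable branch; the feature branch starts from develop
git -C "$repo" checkout -q -b develop
git -C "$repo" checkout -q -b feature-login

# isolated work on the feature branch
echo "login form" > "$repo/login.txt"
git -C "$repo" add login.txt
git -C "$repo" commit -q -m "add login form"

# feature complete: merge it into develop for further testing
git -C "$repo" checkout -q develop
git -C "$repo" merge -q feature-login

# develop is ready: merge it into the stable branch
git -C "$repo" checkout -q "$stable"
git -C "$repo" merge -q develop

# the short-lived feature branch is no longer needed
git -C "$repo" branch -q -d feature-login

git -C "$repo" log --oneline
git -C "$repo" branch
```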

***Note:*** Sometimes, when multiple people have to work on the same branch due to various constraints, **merge conflicts** can occur, *i.e.*, `git` is not able to automatically merge the changes of two (or more) commits. 
This commonly happens when the same lines of code were edited in different commits by two developers working at the same time.
It is not something to be afraid of, but it ***is*** something that has to be fixed very carefully, preferably in a code editor (like VSCode), to ensure that everything that worked stays working!
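
A conflict can be provoked on purpose in a sandbox to see what it looks like. In the sketch below, `style.css`, the branch `center-text` and the temporary repository are invented for the illustration; both branches change the same line, so the merge cannot be completed automatically.

```shell
# throwaway repository with a local identity
repo="$(mktemp -d)"
git init -q "$repo"
git -C "$repo" config user.email "you@example.com"
git -C "$repo" config user.name "Your Name"

echo "text-align: unset;" > "$repo/style.css"
git -C "$repo" add style.css
git -C "$repo" commit -q -m "add style"
base="$(git -C "$repo" rev-parse --abbrev-ref HEAD)"

# one developer centers the text on a branch
git -C "$repo" checkout -q -b center-text
echo "text-align: center;" > "$repo/style.css"
git -C "$repo" commit -q -am "center text"

# another one aligns it left on the starting branch
git -C "$repo" checkout -q "$base"
echo "text-align: left;" > "$repo/style.css"
git -C "$repo" commit -q -am "align text left"

# the merge stops with a conflict; '|| true' lets the rest of the cell run
git -C "$repo" merge center-text || true

# Git marks both versions inside the file, between <<<<<<< and >>>>>>>
cat "$repo/style.css"

# list the files that still have unresolved conflicts
conflicted="$(git -C "$repo" diff --name-only --diff-filter=U)"
echo "conflicted: $conflicted"
```

Resolving the conflict means editing the file until only the intended content remains, then adding and committing it as usual.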

### 6 - Remote Repositories

Git works well at a local level but it wouldn't mean much if we couldn't work with other people. To effectively use and collaborate on any Git project you need to know how to manage remote repositories. 


#### What is a remote repository?
> "Remote repositories are versions of your project that are hosted on the Internet or network somewhere." - [Git Book](https://git-scm.com/book/en/v2/Git-Basics-Working-with-Remotes)

Working with remote repositories involves knowing how to pull and push data from them when you need to share or update work. A _remote_ is typically hosted in a Git hosting service, such as [Github](https://github.com), [Gitlab](https://gitlab.com), and [Bitbucket](https://bitbucket.org/). 

![http://jlord.us/git-it/challenges/remote_control.html](res/remotes.png)

Figure 3: Simple illustration of interactions between remote and local repositories, with Github as sample Git host.

**We'll be using Github to illustrate how to use remote repositories.** However, the approach applies to most Git hosts.

[Github](https://github.com) is the largest host for Git repositories and is a central point of collaboration for millions of developers. Some of the biggest open-source projects in the world are hosted there - you might have heard of [Python](https://github.com/python/cpython), [Linux](https://github.com/torvalds/linux), and [VSCode](https://github.com/microsoft/vscode).

#### 6.0 - Creating a Github account

If you don't have one yet, go ahead and create a [Github](https://github.com/join) account.

**Tip:** if you sign up with your institutional email, e.g., `up20*******@fe.up.pt`, you can apply for a free *PRO* account [here](https://education.github.com/pack).

#### 6.1 - Create a repository

Once logged in, hit "Create a new repository" on the upper right corner.

Make sure to:
- Give it a name, e.g., `my-first-repo`;
- Set the privacy to either 'Public' or 'Private';
- **DO NOT** check the "Initialize this repository with a README" option;
- Set 'None' to both "Add .gitignore" and "Add a license" options.

![create-repo](res/create-repo.png)

Figure 4: Github's 'Create a repository' screen.

#### 6.2 - Add a Remote Repository

Once created, you'll be redirected to your new repository's page. Because we chose to create an empty repository, we are now prompted to add something to it. 

We can either:
- "create a new repository on the command line" - useful for quickly setting up new projects;
- push an **existing** repository;
- import code from another repository - useful when using other Version Control System software, such as SVN;

Because we already have a repository - the whole point of using Github was to make it available online - we're going with the "push an existing repository" option.

![remote-repo-start](res/remote-repo-start.png)

Figure 5: Github's 'new empty repository' screen.

To do so, we need to add our newly created repository as a _remote_ of our local repository - enter the `git remote` command.

We can add remotes in two ways:
- using `https:` - Communication is via `https` (duh), and authentication is done with your username and password;
- using `ssh` - Communication is via `ssh`, and authentication is done with SSH keys. **We'll be connecting via `ssh`.**

To use an SSH remote, we first need to configure a public key. Let's confirm that **we cannot** connect to the remote without proper authentication.

**Note:** edit the following cell with your repository's url.

In [None]:
# add Github repository as remote
git remote add origin git@github.com:[yourusername]/[repositoryname].git

# push changes to Github
# !! this should fail !!
git push origin master

As expected, you get an authentication error because the Github remote we just set up needs SSH keys to work.

#### 6.3 - Create SSH keys

The pair of public/private keys that is generated allows Github to verify that it's you who is accessing the repository and hence guarantee authentication without username/password. One way to generate SSH keys is through the `ssh-keygen` command.

**Note:** Don't forget to edit the following cell.

In [None]:
# GitHub recommends the following key generation method
# However, when doing it on your own computer, stick to plain "ssh-keygen" and follow its prompts
ssh-keygen -t rsa -b 4096 -C "your_email@example.com" -f ~/.ssh/id_rsa -q -N ""

# add the key to the ssh agent
eval "$(ssh-agent -s)"
ssh-add ~/.ssh/id_rsa

#### 6.4 - Add public key to Github

You just created an SSH key pair. To finish up, you need to configure the public key in Github, so it knows it's you trying to access the repository.

**Tip:** If the last step didn't make any sense to you, read up on [public key authentication for SSH](https://www.ssh.com/ssh/public-key-authentication) - you'll use SSH keys a lot. 

First, get the key.

In [None]:
# print the newly created public key
cat ~/.ssh/id_rsa.pub

Next, **copy** the output above - your public SSH key - and add it to your Github account:
1. Go to `Profile Settings` -> `SSH and GPG keys` -> `New Key`;
2. Give it a familiar title, e.g., "neACM Git Workshop";
3. Paste the key into the "Key" field;

![ssh-keys](res/ssh-keys.png)

Figure 6: Github's SSH key addition screen.

You're good to go! (hopefully). 

Let's try to push our repository to Github again.

In [None]:
# push changes to Github
# this should work now !
git push origin master

If it worked, you just pushed the code to your remote repository. Head over to your Github page and check the results.

#### 6.5 Command Breakdown
**``git remote``** - Provides access to network accessible repositories and all configurations related to these. The `add` subcommand adds a remote repository. In this case, we set our GitHub repository as a remote of the local repository we have been working on. **Tip:** because you can have more than one remote, each one has a name. It is typical to name the first and main one `origin`. 

**``git push``** - Used to update remote repositories with our local changes. This time we are pushing the changes in our local `master` branch to the `master` branch of the remote repository we named `origin`. **Tip:** Pushing to a remote repository can fail in case someone already pushed changes to the same branch. In that case, you need to update your local repository - `git pull` - and manually resolve any merge conflicts.
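
`git remote` and `git push` can also be practised without a network connection or an account. The sketch below makes one assumption worth stating loudly: a *bare* repository in a temporary directory stands in for Github, which is only a local approximation of a hosted remote. All paths and file contents are invented.

```shell
tmp="$(mktemp -d)"

# a bare repository (no working tree) plays the role of the remote host
git init -q --bare "$tmp/remote.git"

# a local repository with a local identity and one commit
git init -q "$tmp/local"
git -C "$tmp/local" config user.email "you@example.com"
git -C "$tmp/local" config user.name "Your Name"
echo "<h1>hello</h1>" > "$tmp/local/index.html"
git -C "$tmp/local" add index.html
git -C "$tmp/local" commit -q -m "add index.html"

# name of the current branch (master or main, depending on your Git setup)
branch="$(git -C "$tmp/local" rev-parse --abbrev-ref HEAD)"

# register the remote under the name "origin" and push the branch to it
git -C "$tmp/local" remote add origin "$tmp/remote.git"
git -C "$tmp/local" push -q origin "$branch"

# list the configured remotes and compare the latest commit on both sides
git -C "$tmp/local" remote -v
local_head="$(git -C "$tmp/local" rev-parse HEAD)"
remote_head="$(git -C "$tmp/remote.git" rev-parse "refs/heads/$branch")"
echo "local:  $local_head"
echo "remote: $remote_head"
```

Both identifiers match after the push, which is what "the remote is up to date with our local branch" means in practice.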

**``ssh-keygen``** - A command that comes pre-installed with most Linux distributions and provides a user-friendly way of generating several types of encryption keys. In this example we generate an RSA key pair. On your own machine, we recommend using the default save path, and you may choose to protect the key with a passphrase. Encryption keys are a complex subject that veers too far from the topic of this workshop, so we suggest you look them up by yourself.

**``cat``** - Con**cat**enate, a very flexible command that we are using to display the contents of the file that contains the public part of the key pair we just generated. It's very important to remember that we should only ever share the public part of any key, as anyone holding the private part could impersonate us. Public key files usually end in `.pub` to signal they are the public part of the key.
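
To look at the two halves of a key pair without touching `~/.ssh`, a key can be generated into a temporary directory. This is a sketch only: the file name `demo_key` is invented, the key is not meant to be added to any account, and generating a 4096-bit RSA key may take a few seconds.

```shell
key_dir="$(mktemp -d)"

# same options as in the workshop cell, but writing to the temporary directory
ssh-keygen -t rsa -b 4096 -C "your_email@example.com" -f "$key_dir/demo_key" -q -N ""

# two files were created: the private key and the public key (.pub)
ls "$key_dir"

# the public key is a single line: key type, key data and the comment
key_type="$(cut -d ' ' -f 1 "$key_dir/demo_key.pub")"
echo "key type: $key_type"

# print the key's fingerprint, a short identifier derived from the public key
ssh-keygen -l -f "$key_dir/demo_key.pub"
```

Only the contents of `demo_key.pub` would ever be pasted into a website; `demo_key` itself stays on the machine.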

**``ssh-add``** - As the name implies, **add**s an **ssh** key to the ssh-agent.

### 7 - Updating the Local Repository

We'll now simulate a typical scenario - someone else made a change to our (remote) repository and we want to get those changes in our own computer. 

#### 7.1 - Create a README.md file in the remote repository

For simplicity, and to simulate what would otherwise be another person, go to your Github repository and create a `README.md` file **directly in the browser**. Write something in the file, leave a commit message, and commit the changes.

**Important:** make sure to select the "Commit directly to the `master` branch" option.

![create-readme](res/create-readme.png)

Figure 7: Creating a README.md in Github.

#### 7.2 Update the Local Repository

"Someone" just made changes in the remote repository - let's update our local repository.

This takes two steps:
- Downloading the changes - `git fetch`;
- Merging the changes to our branch - `git merge`;

Let's see this in action:

In [None]:
# fetch the changes from the remote
# Note: it uses the 'origin' remote by default
git fetch

# merge the changes into our local branch
git merge origin/master

**Tip:** Alternatively, the process of fetching and merging can be reduced to the `git pull` command. Pulling will perform both of these commands and try to merge changes automatically.

#### 7.3 Command Breakdown

**``git fetch <remote-name>``** - Gets all the changes from the remote (`origin` by default) without changing our local working tree. In other words, it retrieves all the changes that have been made to the remote without changing any local files. **Tip:** use `git fetch --all` to fetch changes from _all_ remotes.
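
The claim that fetching leaves local files untouched can be checked offline. As an assumption of this sketch, a bare repository in a temporary directory stands in for Github and a second clone, `theirs`, plays the other person; all names and file contents are invented.

```shell
tmp="$(mktemp -d)"
git init -q --bare "$tmp/remote.git"

# "ours": a local repository with one commit, pushed to the remote
git init -q "$tmp/ours"
git -C "$tmp/ours" config user.email "you@example.com"
git -C "$tmp/ours" config user.name "Your Name"
echo "<h1>hello</h1>" > "$tmp/ours/index.html"
git -C "$tmp/ours" add index.html
git -C "$tmp/ours" commit -q -m "add index.html"
branch="$(git -C "$tmp/ours" rev-parse --abbrev-ref HEAD)"
git -C "$tmp/ours" remote add origin "$tmp/remote.git"
git -C "$tmp/ours" push -q origin "$branch"

# "theirs": a second clone that adds a README and pushes it
git clone -q "$tmp/remote.git" "$tmp/theirs"
git -C "$tmp/theirs" config user.email "them@example.com"
git -C "$tmp/theirs" config user.name "Someone Else"
echo "# My first repo" > "$tmp/theirs/README.md"
git -C "$tmp/theirs" add README.md
git -C "$tmp/theirs" commit -q -m "add README"
git -C "$tmp/theirs" push -q origin "$branch"

# fetch downloads the new commit, but our files stay untouched
git -C "$tmp/ours" fetch -q origin
if [ -e "$tmp/ours/README.md" ]; then after_fetch="present"; else after_fetch="missing"; fi

# merge brings the downloaded commit into our branch and working tree
git -C "$tmp/ours" merge -q "origin/$branch"
if [ -e "$tmp/ours/README.md" ]; then after_merge="present"; else after_merge="missing"; fi

echo "README.md after fetch: $after_fetch"
echo "README.md after merge: $after_merge"
```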

### Bonus Challenge - See your code live!

As you might have noticed, this little sample project has everything a simple website needs - HTML, JavaScript, and CSS files. We can deploy it right now with a little help from Github - maybe it's your _first_ website?

This is the expected final result:

![gh-pages](res/gh-pages.png)

Figure 8: The challenge's expected result (notice the browser's url).

#### How?

Github Pages is a Github feature that allows us to easily deploy static websites. To activate it, you need to add a valid website to the `gh-pages` branch on the remote repository - in this case, your newly created repository. You can then access your website with the link `https://<your-username>.github.io/<your-repo-name>`. Here's the [example website](https://acmfeup.github.io/my-first-repo/).

With your freshly acquired knowledge, try to deploy it yourself! Here are some tips:
- Create a `gh-pages` branch locally;
- [OPTIONAL] Make some changes to the files;
- Push your changes to the `gh-pages` branch in the remote repository;

Good luck!

**Note:** you can also activate Github Pages in the repository's Settings page, but it defeats the purpose of the challenge.

### That's all, folks!

The goal of this workshop was to help you learn Git fundamentals through a simple and interactive experience. We hope you enjoyed it.

#### Some remarks, if you're following the workshop through Jupyter:
- Obviously, this virtual environment is *not* your development environment. We advise you to follow the workshop again, this time through your own machine.
- You should delete the SSH keys created here from your Github account. Head over to `Account Settings` -> `SSH and GPG Keys` again to delete them.

This workshop is available on [Github](https://github.com/acmfeup/git-workshop) - check it out (you can also leave a star ⭐). 

#### We're recruiting!

If you enjoyed this workshop and would like to **join us** in creating more projects, workshops, conferences and other activities, please do not hesitate to contact us through one of our various social media platforms, or by emailing us @ [neacm@fe.up.pt](mailto:neacm@fe.up.pt).

### Appendix

#### Answer to the cell challenge
```shell
# Add the file to the staging area
git add css/style2.css

# commit your changes
git commit -m "set align to center"
```