# Day 7: Git and GitHub

Welcome to our final day of Computing for Research 2022! Now that you're equipped with many tools for doing some awesome research, we want to instill some best practices to make your research road smoother. Today, we're going to be talking about Git and GitHub.

## Logistics

Today, this Jupyter notebook is just a reference document; we will not be running any code in it. In this tutorial, we'll be live coding together, along with a few exercises and check-ins like we've done before. You don't need to have the notebook up.

## Overview
1. What is Git and why use it?
2. Git Basics
3. Branches
4. GitHub

## What is Git?

Git is a free, open-source version control system, also called source control management (SCM). With Git, you can manage changes to files over time, and even go back and see what those changes were. Basically, Git gives you a way to flexibly manage changes: compare versions, revert to previous versions, and see who made each change.

## Why use it?

TL;DR: 1. things change, 2. group work.

![image-2.png](attachment:image-2.png)

![image-3.png](attachment:image-3.png)

## Git Basics

You interact with Git through the terminal.

Windows: When you install Git, it comes with a terminal called Git Bash. Use this, or another terminal you've been using (PowerShell, etc.).

Mac: You can just use your built-in terminal.

Let's open up a new terminal window. Before using Git, we need to configure a few things. If you've already done this, skip ahead.

The first things we need to configure are our username and email address. This is important because every commit records them, so you can see who made each change.

`git config --global user.name "YOUR NAME"`\
`git config --global user.email "you@example.com"`\
`git config --global init.defaultBranch main`

Don't worry about that last one for now; we'll get to branches later on.

So far, we've been using a command called `git config`. As we go through this, if you encounter a command that confuses you, or you want to explore the options that come with a command, you can ask for help. In this case, if we type:

`git config -h`

we'll see help information, and all the possible options for the `git config` command. If you want more detailed information, you can try:

`git help config`

This brings up the detailed Git manual, which you can access anytime because it's saved on your computer (i.e., you don't need internet). To exit the manual, press `q`.
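For the check-in below, here is a minimal sketch of how you might confirm that Git is installed and read your settings back. To avoid touching your real configuration, it works in a throwaway repository and sets values without `--global`, so they apply to that repository only. The name and email are placeholders, not anything from this class.

```shell
# Confirm Git is installed and on your PATH
git --version

# Work in a throwaway repository so your real settings are untouched
scratch=$(mktemp -d)
cd "$scratch"
git init -q

# Without --global, a setting applies to this repository only
git config user.name "Ada Example"         # placeholder name
git config user.email "ada@example.com"    # placeholder email

# Read individual values back
git config user.name
git config user.email

# Show every setting Git can see, and which file each one comes from
git config --list --show-origin
```

On your own machine, `git config --global user.name` and `git config --global user.email` should print the values you set above.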

## Check in 1: Is Git working and are your configurations done?

### Initializing a repository

To learn about the main features of Git (and there's a **lot** of terminology), we're going to make a repository. A **repository** (or *repo* for the cool kids) is basically just a directory (folder) that contains the contents of a project you're working on. There's a more technical definition, but this is the working woman's definition.

**Make a new directory called `CFR_Day7` on your desktop and navigate into it**
* GUI, or navigate to desktop and type `mkdir CFR_Day7`

In order to use Git to control the changes that happen to this repository, we have to *initialize* it. Basically: hey Git, track changes here. Type:

`git init`

You made your first (maybe) repo! Yay! Note that the directory doesn't look any different when you run `ls`, but if you type `ls -a` you can now see hidden files, and there's a little thing called `.git` in there that wasn't there before.

This directory is obviously empty, but you can perform `git init` in any directory you've already been using to make it a git repository. 
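A minimal sketch of what initializing looks like from the terminal. It uses a temporary scratch directory as a stand-in for `CFR_Day7`, so nothing on your desktop is changed.

```shell
# Create a scratch directory (a stand-in for CFR_Day7) and move into it
repo=$(mktemp -d)
cd "$repo"

ls -a        # only . and .. so far

git init     # ask Git to start tracking this directory

ls -a        # a hidden .git directory has appeared
ls .git      # Git's own bookkeeping lives here (HEAD, config, objects, ...)
```

Everything Git knows about the repo is stored inside `.git`, which is why deleting that folder turns the directory back into an ordinary one.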

### `git status`

`git status` is a command you'll be using extremely frequently. It's how you check in with the repository: it shows you which branch you're on and which files have changes that are staged, unstaged, or untracked. (We'll get to these terms later, I promise!) But for now, because the repo is empty, there's not much going on. So let's get some things rolling.

### Exercise 1: Make a .txt file called file1.txt in your repo, and add some text. Then call `git status` again and see what happens.

Option 1: GUI way\
Option 2: bash way

`touch file1.txt`\
`nano file1.txt`\
(type some text, then save with Ctrl+O and Enter, and exit with Ctrl+X)

Now, when you type `git status`, you should see that Git has noticed file1.txt. In fact, it says that file1.txt is *untracked*. That means if you make any changes to file1.txt, Git won't record them. So in order to ask Git to track a file, we need to use the command `git add`.

### `git add`

`git add` is how we tell Git that we want it to track the changes associated with a file. Let's use `git add` to set up tracking for file1.txt.

`git add file1.txt`

To check that it worked, we can again use the command `git status`. file1.txt is now being tracked.

Note 1: if we want to untrack the file, we can run the command `git rm --cached file1.txt`.

Note 2: there is a way to ask Git to totally ignore any files that you want to keep out of the repository. If you want to learn how to do that, look up the documentation for `.gitignore` (it's a file you create in the repo, not a command).

Note 3: we can stage everything in the current directory by using `git add .`
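Here is a small sketch of untracked versus staged files, plus the ignore mechanism from Note 2. It runs in a scratch repository; `scratch.log` is a made-up file name used only for illustration.

```shell
repo=$(mktemp -d) && cd "$repo"
git init -q

echo "some text" > file1.txt
echo "temporary output" > scratch.log   # hypothetical file we never want tracked

git status --short    # "??" marks untracked files

git add file1.txt
git status --short    # "A " marks a newly staged file

# Patterns listed in a .gitignore file are skipped by git status and git add
echo "*.log" > .gitignore
git add .
git status --short    # scratch.log is no longer listed at all
```

`git status --short` is just a compact version of the `git status` output; the information is the same.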

### `git commit`

After using `git add` to track the changes in file1.txt, there's one final step we need to perform, `git commit`.

A `commit` is basically a snapshot of the repository in time. You're writing an entry in the log book of what each file looks like when you commit it. When you make a `commit`, you are taking a snapshot that you can go back to later if you want to revert any changes you make afterwards.

If we look back at our `git status` output, it tells us that there are changes to be committed associated with file1.txt. These files are in **staging**, which means we've tracked their changes but have yet to commit those changes. Staging is basically a holding pen, where you keep files until you're ready to commit them.

When you're ready to commit, there are a few ways to do so. The main, best-practice way is using the flag `-m`, which allows you to attach a message to your commit. Use this message to describe the commit: the state of the repo and any changes you've made. It's tempting to just use the message 'updates' all the time (and I do it myself), but the more detailed you are, the more your future self will thank you.

`git commit -m "message"`

You made your first commit! Hooray! Make sure it went through with `git status`.

![image-2.png](attachment:image-2.png)

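Note: If you want to skip staging altogether, you can add the `-a` flag to `git commit`. This only picks up changes to files Git is already tracking; brand-new files still need `git add` first.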

`git commit -a -m "skipping staging"`
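The stage-then-commit cycle, and the `-a` shortcut, can be sketched as follows. This runs in a scratch repository with a placeholder identity set locally, so it does not depend on your global configuration.

```shell
repo=$(mktemp -d) && cd "$repo"
git init -q
git config user.name "Ada Example"        # placeholder identity, local to this repo
git config user.email "ada@example.com"

echo "first line" > file1.txt
git add file1.txt                                  # stage
git commit -m "Add file1.txt with a first line"    # snapshot

git status               # "nothing to commit, working tree clean"

# -a stages changes to files Git already tracks, then commits
echo "second line" >> file1.txt
echo "brand new" > file2.txt     # untracked, so -a will not pick it up
git commit -a -m "Add a second line to file1.txt"

git status --short       # file2.txt still shows as "??"
git log --oneline        # two commits so far
```

Note how `file2.txt` is left behind by `-a`: it has never been added, so Git is not tracking it yet.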

## Exercise 2: the git flow

In summary, here's the Git flow.
1. make edits
2. git add
3. git commit

For this exercise:
1. edit the text within file1.txt
2. make a new file called file2.txt and give it some contents
3. add and commit both files with a message

`nano file1.txt`\
`touch file2.txt`\
`nano file2.txt`

You need to add both file1.txt and file2.txt to the staging area before committing. You can add them separately, or use `git add .`

Then use `git commit -m "your message"` (an empty message will abort the commit).

### `git diff`

There's a handy feature that allows us to see what changes have been made to files, and that's `git diff`. To use it, let's make some quick changes to file2.txt.

`nano file2.txt`\
`git diff`

You'll see the 'old version' in red, and the 'new version' in green.
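A minimal sketch of what `git diff` compares, using a scratch repository. The file contents are made up for illustration. In the plain-text output, removed lines start with `-` and added lines start with `+`; those are the lines your terminal colors red and green.

```shell
repo=$(mktemp -d) && cd "$repo"
git init -q
git config user.name "Ada Example"        # placeholder identity, local to this repo
git config user.email "ada@example.com"

printf 'apples\nbananas\n' > file2.txt
git add file2.txt
git commit -q -m "Add file2.txt"

printf 'apples\ncherries\n' > file2.txt   # replace the second line

git diff             # working directory vs. staging area: -bananas / +cherries

git add file2.txt
git diff             # prints nothing now: the change has been staged
git diff --staged    # staging area vs. last commit: shows the change again
```

So plain `git diff` only shows edits you have not staged yet; use `git diff --staged` to review what is about to be committed.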

### `git restore`

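Let's say we make some changes, but decide that we don't want to actually commit them. We can use `git restore` either before or after the file has entered the staging area.

If the file is not in staging: `git restore filename` (this discards the uncommitted edits to that file)

If the file is in staging: `git restore --staged filename` (this takes the file out of staging but keeps your edits; run `git restore filename` afterwards to discard them too)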
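Both cases can be sketched in a scratch repository like this (the file contents are placeholders):

```shell
repo=$(mktemp -d) && cd "$repo"
git init -q
git config user.name "Ada Example"        # placeholder identity, local to this repo
git config user.email "ada@example.com"

echo "original text" > file1.txt
git add file1.txt
git commit -q -m "Add file1.txt"

# Case 1: an unstaged edit. Restoring throws the edit away.
echo "an edit I regret" >> file1.txt
git restore file1.txt
cat file1.txt                    # back to the committed contents

# Case 2: a staged edit. Unstage first, then discard.
echo "another edit" >> file1.txt
git add file1.txt
git restore --staged file1.txt   # unstaged, but the edit is still in the file
git status --short               # " M file1.txt"
git restore file1.txt            # now the edit is gone
```

Be careful with plain `git restore filename`: the discarded edits were never committed, so Git cannot bring them back.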

## Exercise 3: back from the trash

Delete one of your files, and use `git restore` to bring it back.

`rm file2.txt`\
`git status`\
`git restore file2.txt`

## `git log`

If we want to review all of the changes we've made, we can use the command `git log`.

Every commit has a unique identifier. The log also tells you who made the changes, which is why you attached a username and email to your Git configuration. There's also the time and date of each commit, and the message.

To get an abbreviated version, type `git log --oneline`

If you want to amend the message of your most recent commit, you can do that!

`git commit -m "created and populated file1.txt woohoo" --amend`

`git log -p` gives you a more detailed view (`q` to exit)

`git log` can also search for specific text, show changes made since a certain date, and more. There are tons of capabilities!
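A few of those capabilities, sketched in a scratch repository with two small commits. The file contents, commit messages, and author name are placeholders.

```shell
repo=$(mktemp -d) && cd "$repo"
git init -q
git config user.name "Ada Example"        # placeholder identity, local to this repo
git config user.email "ada@example.com"

echo "apples" > file1.txt
git add file1.txt && git commit -q -m "Create file1.txt"
echo "bananas" > file2.txt
git add file2.txt && git commit -q -m "Create file2.txt"

git log --oneline                 # one line per commit
git log -p -- file1.txt           # full changes, limited to one file
git log --grep="file2"            # commits whose message mentions "file2"
git log --since="2 weeks ago"     # commits from the last two weeks
git log --author="Ada"            # commits by a particular author
git log -S"bananas" --oneline     # commits that added or removed the text "bananas"
```

These filters can be combined, for example `git log --oneline --since="2 weeks ago" -- file1.txt`.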

## `git reset`

If I want to go back to a previous version of the repo, I can use the `git reset` command together with a commit ID, which is found in the `git log`.

For example (your commit ID will be different):

`git log --oneline` (to see the IDs)\
`git reset c407b25`

This moves the branch back to that commit but leaves your files as they are: the later edits show up as unstaged changes. To keep them, you then have to re-add and re-commit.

Note: You can use `git rebase -i --root` to modify the commits. This is beyond the scope of this class but go nuts on your own time.
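The `git reset` steps above can be sketched in a scratch repository as follows. Instead of copying an ID by hand, this sketch looks up the older commit's ID with `git rev-parse`; that lookup is a convenience for the example, not something you need in class.

```shell
repo=$(mktemp -d) && cd "$repo"
git init -q
git config user.name "Ada Example"        # placeholder identity, local to this repo
git config user.email "ada@example.com"

echo "version 1" > file1.txt
git add file1.txt && git commit -q -m "Version 1"
echo "version 2" > file1.txt
git commit -q -a -m "Version 2"

git log --oneline                        # two commits; IDs will differ on your machine
first=$(git rev-parse --short HEAD~1)    # look up the ID of the older commit

git reset "$first"                       # move the branch back to that commit

git log --oneline                        # only "Version 1" remains in the history
git status --short                       # " M file1.txt": the newer edit survives, unstaged
cat file1.txt                            # still says "version 2"
```

From here you can either re-add and re-commit the edit, or discard it with `git restore file1.txt` to get the old contents back.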

## Branches

So far, we've been making all of these changes in the main branch of the repository. But you can also set up additional branches. A branch is basically a copy of the original repo where you can troubleshoot, try things out, etc., without impacting the main branch until you are ready to **merge** the two.

You can imagine how useful this is in research or industry-- if you have some code that works, but you want to experiment with modifications, you can make edits in the other branch, get it working first, then merge it with the main branch to update the whole system.

### `git branch BranchExample`

To create a new branch, use the command `git branch NAME`, where NAME is a helpful description of the branch that you're making (i.e. what you're going to be working on)

We can list our branches by using `git branch`. Now, we have two branches, main and BranchExample.

You can tell which branch you're in by looking at which one is green and marked with an asterisk (`*`).

### `git switch BRANCH`

To switch between branches, use the command `git switch BRANCHNAME`

Let's switch to the other branch.\
`git switch BranchExample`\
`git branch`

Now that we're in the new branch, let's edit file1.txt.

`nano file1.txt`

`git status`

You'll see that it identifies the branch that you are on. Let's go ahead and commit the change to the branch.

`git commit -a -m "updated file1.txt"`\
remember, -a allows us to skip the staging area

Now, let's switch back into main.\
`git switch main` (or `master`, if that's what your default branch is called)

If you open file1.txt now, you should **not** see the edits made in the branch, even though they were committed.

### `git merge`

In order for our changes to be taken into main, we have to **merge** the branches. To do this, type:

`git merge -m "message" BRANCHNAME`

If we peek at file1.txt again, we should now see the changes.

We probably don't need that branch anymore, so let's delete it.

`git branch -d BranchExample`
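The whole branch, edit, merge, delete cycle from this section can be sketched in a scratch repository. `echo ... >>` stands in for editing the file in nano, and the `git symbolic-ref` line is only there so the first branch is called `main` regardless of Git version or settings.

```shell
repo=$(mktemp -d) && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main     # name the first branch "main"
git config user.name "Ada Example"        # placeholder identity, local to this repo
git config user.email "ada@example.com"

echo "line one" > file1.txt
git add file1.txt && git commit -q -m "Add file1.txt"

git branch BranchExample
git branch                    # the asterisk marks the current branch (main)

git switch BranchExample
echo "line two, from the branch" >> file1.txt
git commit -q -a -m "Add a second line on BranchExample"

git switch main
cat file1.txt                 # only "line one": the branch's commit is not here yet

git merge -m "Merge BranchExample into main" BranchExample
cat file1.txt                 # both lines now

git branch -d BranchExample   # safe to delete: its work is now in main
```

Because main had no new commits of its own, Git reports this merge as a "fast-forward": it simply moves main up to the branch's latest commit.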

## Exercise 4: branch and merge

Make a new branch, edit file1.txt in that branch, and merge it with main (without notes). Two hints: arrow up to see previous commands, use google.

`git branch AnotherBranch`\
`git switch AnotherBranch`\
`nano file1.txt`\
`git commit -a -m "updated"`\
`git switch main`\
`git merge -m "message" AnotherBranch`

## Merge conflicts

Our merge went smoothly because there were no changes to main after we created our branch. What happens when the same part of a file changes on both branches? That creates what's called a **merge conflict**.

First, let's make some changes to file1.txt in AnotherBranch and commit them.\
`git switch AnotherBranch`\
`nano file1.txt`\
`git commit -a -m "added a number"`

Then let's switch to main and make some edits to the same file without merging.\
`git switch main`\
`nano file1.txt`\
`git commit -a -m "added two numbers"`

Now let's try the merge again.\
`git merge AnotherBranch`

Uh oh, a conflict! When we have a conflict, we have to resolve it before we can successfully merge. Luckily, when we try to merge and there is a conflict, Git makes it relatively straightforward to sort out.

Let's go into the file that's the issue, in our case file1.txt.

`nano file1.txt`

Here you can see the two versions that are trying to merge, separated by marker lines (`<<<<<<<`, `=======`, `>>>>>>>`). **HEAD** refers to the version in main, and **AnotherBranch** refers to the version in the branch. We can now decide which one we want to keep. Let's keep the version in the branch. To do that, delete the version under HEAD, and delete the three marker lines as well.

Once we make the edit, commit the change to finish the merge.

`git commit -a -m "update file1.txt from main"`

And now we have solved the merge conflict.
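Here is a sketch of the whole conflict in a scratch repository, with `echo` standing in for the nano edits. The file contents are made up; the conflict markers shown in the comments are what Git writes with its default settings.

```shell
repo=$(mktemp -d) && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main     # name the first branch "main"
git config user.name "Ada Example"        # placeholder identity, local to this repo
git config user.email "ada@example.com"

echo "numbers:" > file1.txt
git add file1.txt && git commit -q -m "Add file1.txt"
git branch AnotherBranch

git switch -q AnotherBranch
echo "numbers: 1" > file1.txt
git commit -q -a -m "added a number"

git switch -q main
echo "numbers: 2 3" > file1.txt
git commit -q -a -m "added two numbers"

git merge AnotherBranch || true   # the merge stops and reports a conflict in file1.txt
cat file1.txt
# <<<<<<< HEAD
# numbers: 2 3
# =======
# numbers: 1
# >>>>>>> AnotherBranch

# Resolve by keeping the branch's version (same as editing in nano
# and deleting the HEAD part plus the three marker lines)
echo "numbers: 1" > file1.txt
git add file1.txt
git commit -q -m "Resolve conflict in file1.txt, keeping AnotherBranch's version"

git log --oneline --graph         # the history now shows the two lines of work joining
```

If you change your mind partway through, `git merge --abort` puts everything back to how it was before the merge started.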

## Conclusion

![image.png](attachment:image.png)

Hopefully now you have a working understanding of how Git is used in working projects.

## GitHub

Now that we know the fundamentals of Git, we can talk about GitHub. One of the great applications of Git is working with others on the same repositories. So far, we've been working on a repository on your local machine only. But you can also host your Git repo in the cloud. GitHub is an online platform that allows us to do exactly that (and a lot more).

But even if you aren't working with anyone else, it's great practice to host your repos somewhere like GitHub for your own purposes.

### Log in

You should all have made a GitHub account. Go ahead and log in! If you didn't make an account yet, we'll give everyone a few minutes to do that now.

## Check in: are you in your own GitHub account online?

### Orientation to the website


### Creating a new repository

Click the new repo button. 

Give it a name, and a description.

Public vs private? Private: assign individuals. Public: anyone can see it.

License: ?

Push existing repository from the command line (from within the directory)

`git remote add origin address` *links remote connection (calling it origin), own url*\
`git branch -M main` *set target branch to main*\
`git push -u origin main`
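To see what these three commands do without needing a network connection or a login, here is a sketch in which a local "bare" repository plays the role of the empty repo on GitHub. That substitution, and all paths and names below, are assumptions made for the example; with GitHub, `address` would be your repo's URL instead of a local path.

```shell
work=$(mktemp -d)

# Stand-in for the empty repository you just created on GitHub
git init -q --bare "$work/remote.git"

# Your existing local repository
mkdir "$work/CFR_Day7" && cd "$work/CFR_Day7"
git init -q
git config user.name "Ada Example"        # placeholder identity, local to this repo
git config user.email "ada@example.com"
echo "hello" > file1.txt
git add file1.txt && git commit -q -m "Add file1.txt"

git remote add origin "$work/remote.git"  # on GitHub this would be your repo's URL
git branch -M main                         # rename the current branch to main
git push -u origin main                    # upload, and remember origin/main as the upstream

git remote -v                              # lists where "origin" points
git status -sb                             # "## main...origin/main": local and remote are linked
```

Thanks to `-u`, later pushes and pulls from this branch can be written as plain `git push` and `git pull`.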

### Personal Access Token

Yikes! The commands above didn't work. In 2021, GitHub ended our ability to use an account password to authenticate pushes and pulls. Instead, we have two options: SSH or a personal access token (PAT). You can google to read more about which you prefer to use, but I use a personal access token, so that's what I'm going to walk us through now.

https://mgimond.github.io/Colby-summer-git-workshop-2021/authenticating-with-github.html

A PAT is a secret key that you use in place of your password. We first generate a PAT on GitHub's website, then use it on our local machine to connect the two.

Now, let's try the above commands again. Then go online and refresh the page. You should see your content on the website.

### Yay, now your stuff is in the cloud!

### poke around

* files
* most recent commits
* click on a commit to view the git log
* can drag/drop/add files
* can edit files (then `git pull` locally)

## Issues

Feature requests, or maybe someone finds a bug. You can add details.

New Issue: title, comment, assign, label, project, etc.

## Pull Requests

A pull request is a way to alert a repo's owners that you want to make some changes to their code. It allows them to review the code and make sure it looks good before putting your changes on the primary branch.

So after you've forked, cloned, edited, added, committed, and pushed, you want to submit a pull request.

So a pull request to another repo is similar to a 'push'. But it allows for a few things:
1. it allows you to contribute to another repo without needing administrative privileges to make changes to the repo
2. it allows others to review your changes and suggest corrections, additions, edits, etc.
3. it allows repo admins control over what gets added to their projects

The ability to suggest changes to any repo is a powerful feature of GitHub. You don't have permissions to edit the Earth Lab repo, but you (or anyone) can make as many changes as you want in your fork, then suggest that Earth Lab incorporate those changes in their repo using a pull request.

This, in essence, is the heart of the collaborative nature of GitHub.

Pull requests show the differences of the content between your repo and the repo that you are submitting changes to. 
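The command-line half of that workflow (everything up to the push) can be sketched offline. In this sketch two local bare repositories stand in for the original project and for your fork on GitHub, and the names `upstream.git`, `fork.git`, `my-copy`, and `fix-readme` are all made up for illustration. The pull request itself is still opened on the website.

```shell
work=$(mktemp -d) && cd "$work"

# --- Setup: local stand-ins so the sketch runs offline ---
git init -q seed && cd seed
git symbolic-ref HEAD refs/heads/main
git config user.name "Project Owner"       # placeholder identity
git config user.email "owner@example.com"
echo "# project" > README.md
git add README.md && git commit -q -m "Initial commit"
cd "$work"
git clone -q --bare seed upstream.git       # plays the original project
git clone -q --bare upstream.git fork.git   # plays your fork (the "Fork" button)

# --- The workflow ---
git clone -q "$work/fork.git" my-copy && cd my-copy   # clone your fork
git config user.name "Ada Example"                    # placeholder identity
git config user.email "ada@example.com"
git remote add upstream "$work/upstream.git"          # optional: remember the original project

git switch -q -c fix-readme                           # do the work on its own branch
echo "Added by a contributor" >> README.md
git commit -q -a -m "Add a line to the README"
git push -q -u origin fix-readme                      # push the branch to your fork

git remote -v    # origin = your fork, upstream = the original project
```

At this point the change exists only in your fork. Opening a pull request asks the owners of the original project to review it and merge it into their repo.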

### Start pull request (do it on our own)

To begin a pull request, click the pull request button on the main repo page.

### Which repo?

Select which you want to update (the base repo) and which contains the content you wish to use to update the base (the head repo).

base: will be updated\
head: repo from which changes come

way to remember: head is 'ahead' of the base, so we add changes from the head to the base

### verify changes

When you compare the two repos in a pull request page, GitHub provides an overview of the differences.

### the admin of the repo can accept and merge any pulls

### you can also just create a pull request from the edits page

### finally, git pull on your local machine


So far we've been working on 'personal' repos, aka by yourself. But GitHub is a social thing!

## Fork

A GitHub fork is a copy of a repository that sits in your account rather than the account you forked it from. Once you have forked a repo, you own your forked copy. This means that you can edit the contents of your forked repository without impacting the parent repo.

Forked = independent, clone = linked.

![image.png](attachment:image.png)

![image-2.png](attachment:image-2.png)

Let's fork a repository. Many times you'll want to use software or code from a particular program or project that is hosted on GitHub. To get your own copy, you fork!

Let's fork the earthpy repo from Earth Lab, using the Fork button. Now, when I look at my repos, I can see earthpy in there. If we open it, we can see that it's an exact copy of the parent repo.

You can apply all of the same things we saw earlier on any public repo: issues, fork, pull request... the possibilities are endless!
