# GitHub Basics



## Why GitHub?

_**Note: I use Windows and will be of less help for Mac users. The TA, however, uses Mac and will be more helpful when issues stem from OS differences. GitHub, in particular, might be finicky if you have outdated versions of Safari or any other browser.**_

We will be using [GitHub](https://github.com) a lot in this course:

- All of your course-related work will go on GitHub.
- Discussion / help / announcements will happen on GitHub. (Yes, announcements!)
- This entire website is on GitHub!
- Assignments are posted on GitHub.

But why GitHub? Because it's tremendously effective for developing a project. It is used by [Apple](https://github.com/apple), [Uber](https://github.com/uber), [Netflix](https://github.com/Netflix), Google, Microsoft, Bitcoin, CERN, Chinese censors (wait, what?), and many more large, sophisticated, multi-billion dollar entities.

It's useful for (1) cloud storage, (2) collaboration, and (3) version control.

Let's get started!

## GitHub as cloud storage

At the very least, GitHub allows for cloud storage, like Google Drive and Dropbox do. There's a bit more structure than just storing files under your account:

- __Repositories (aka "repo")__: All files must be organized into _repositories_. Think of these as folders with **self-contained** projects. These can either be _public_ or _private_.
- __User Accounts__ vs. __Organization Accounts (aka "Org")__: All repositories belong to an account:
    - A _user account_ is the account you just made, and typically holds repositories related to your own work. 
    - An _Organization_ account can be owned by multiple people, and typically holds repositories relevant to a group (like STAT 545).

Examples: 

- The [`awesome-python`](https://github.com/vinta/awesome-python) repo is a "curated list of awesome Python frameworks, libraries, software and resources" 
- The [`ledatascifi-2021`](https://github.com/LeDataSciFi/ledatascifi-2021) repo, within its corresponding `LeDataSciFi` Org, contains these lectures. 


### Practice

Before the next class, you should be able to do, and have completed, the following:

1. Make a participation repo called "Class Notes" (with a blank README<span>.</span>md file). The copy of the repo on GitHub.com is called the "remote" repo. Git names it "origin" by default, and this page also calls it the "master" repo. 
2. "Clone" that to your computer. The folder and files on your computer are called the "local" repo. 
3. Modify the repo **in the cloud** and then "fetch" those changes to your computer. Think of "fetch" as syncing your computer to catch up with any changes in the master.
4. Modify the repo **on your computer** and then "push" those changes to the cloud. Think of "push" as syncing the master to catch up with any changes from your computer.
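
The syncing in steps 2 to 4 can be pictured with a small toy model. This is only a sketch in Python, not how Git works internally: each copy of the repo is just a list of commit messages, and the functions `clone`, `fetch`, and `push` are hypothetical stand-ins named after the steps above. Here "fetch" follows this page's meaning of "catch up"; in Git itself, fetching only downloads the changes, and a separate pull/merge step applies them.

```python
# Toy model only: each "repo" is an ordered list of commit messages.
# Real Git stores much more (file contents, hashes, branches).

def clone(github_copy):
    """Return a new, independent copy of the repo for your computer."""
    return list(github_copy)

def fetch(computer_copy, github_copy):
    """Bring the computer copy up to date with commits made on GitHub."""
    for commit in github_copy:
        if commit not in computer_copy:
            computer_copy.append(commit)

def push(computer_copy, github_copy):
    """Bring the GitHub copy up to date with commits made on your computer."""
    for commit in computer_copy:
        if commit not in github_copy:
            github_copy.append(commit)

github_copy = ["Initial commit: add README.md"]    # step 1: repo created on GitHub
computer_copy = clone(github_copy)                 # step 2: clone to your computer

github_copy.append("Edit README on GitHub.com")    # step 3: change in the cloud...
fetch(computer_copy, github_copy)                  # ...then catch up on your computer

computer_copy.append("Add notes from lecture 1")   # step 4: change on your computer...
push(computer_copy, github_copy)                   # ...then catch up in the cloud

print(computer_copy == github_copy)
```

After the last step both lists hold the same three commits, which is the point of the exercise: the two copies only match again once you sync in the right direction.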

With those exercises done, you're ready to work with GitHub. We will also explore:

- What a good `README` file is
- What the `.gitignore` file is and how/why to use it 
- What a merge conflict is and how to resolve it


## GitHub for collaboration 

The "traditional" way to collaborate involves sending files over email. But emails get buried, and, also... who has the most recent version, and what is it?

<img src="http://phdcomics.com/comics/archive/phd101212s.gif" alt="You don't want this" width="50%" >

Git(Hub) solves this! 

Git (just "Git") is a distributed version control system. Basically: "Imagine if Dropbox and the 'Track changes' feature in MS Word had a baby. Git would be that baby." It's great for us because it's optimized for code. 

GitHub (not just "Git") is built on top of the Git system. Among the many added features that make collaboration easier, two are worth highlighting:

- The GitHub repository is treated as the "master version".
- **You can (and probably should!) use [GitHub Issues](https://guides.github.com/features/issues/) instead of email to track open tasks.** 
    - _Issues_ are a discussion board corresponding to a particular repository. 
    - One "thread" is called an Issue. Some features:
        - You can tag other GitHub users using `@username`. 
        - You get email notifications if you are tagged, or are `Watch`ing a repository.

As an example, check out the Issues in the [`ggplot2`](https://github.com/tidyverse/ggplot2/issues) repository. People raise issues of all kinds and, once an issue is solved, "Close" it.

We will talk about collaboration later. Suffice it to say, managing group tasks is of paramount importance in virtually all jobs you might have after college.

### Collaboration practice

**"Exercise 1": VERY IMPORTANT** 
1. Find the `classmates` team within the LeDataSciFi org. (You have to click on the gradebook link in coursesite (classroom.github.com/...), and then I'll invite you.) 
2. Start `Watch`ing the `classmates` team. **THIS IS WHERE CLASS ANNOUNCEMENTS WILL BE POSTED.**

You should now get an email notification whenever an Issue is posted by me or the TA, or whenever a classmate asks a question.

**"Exercise 2": Use the discussion board** 
- Introduce yourself with 2 truths and a lie, and we can go from there. 


## GitHub for version control with Git 

Why version control? In addition to avoiding the awful "file naming conundrum" in the comic above, version control

- Lets you remove stuff without fretting
- Leaves a breadcrumb trail for troubleshooting
- Lets you "undo" and navigate to a previous state
- Helps you define your work

The way you work on a project with GitHub is by following what I will call 
[the GitHub Workflow](assignments/howto_do.html#working-on-assignments-projects-taking-notes):

```{tip} 
**Fetch early, commit frequently, push often!**

This habit will help you avoid disasters, so that you get the positive features of GitHub without the headaches.

```

````{dropdown} 1. Make your coffee, open GitHub Desktop, and **FETCH** the project you'll work on. 
1. Change the "current repository" to the assignment you want to work on (or project, or your notes repo, etc.)
1. Click "Fetch origin" to download any changes from the master repo on the GitHub servers. This is important, because if someone else changed the files while you were sleeping, you'll get the most updated files to work on. 
1. Start your work on your computer. 

![](https://media.giphy.com/media/xUOrwpPFzqDh48XEek/giphy.gif)

```{warning}
If you don't "fetch" before you start, it becomes easier to change a file that someone else has changed differently, creating a conflict. When this happens, you have to resolve the conflicting files before moving on. 
```

````

````{dropdown} 2. **"COMMIT" FREQUENTLY** (say every 30 minutes or so, but depends on the team/task): 
- Save the files you're working on. (Just like you would while working on a PowerPoint or Word document.) 
- When you save the file, GitHub Desktop (GHD) will notice it has been changed. 
- Go to GHD. Notice that your file is listed as a "changed" file. 
- **Describe those changes in the "Summary" and (optionally) "Description" boxes, and click the blue "Commit" button**. 
- Try to do this every time you save your files! It will make rolling back changes easier. 
- **Do this early and often**

```{panels}
![](https://media.giphy.com/media/l4KhSYN6hQ7Y0FZS0/giphy.gif)
---
![](https://media.giphy.com/media/dxITs87fTAxTncZ6WL/giphy.gif)
```
````

```{dropdown} 3. **"PUSH" OFTEN, but probably less than you commit** (say every 60-90 minutes or so, but depends on the team/task): 
- Push your changes to the cloud by clicking the blue "push" button in GHD. 
- Now, you've got an up-to-date backup and teammates can see the changes and work with the latest files.
- GHD will warn you if someone else made a change in the meantime. If this happens, click "fetch" to download what they did. If there is a conflict between your work and your teammate's, you'll have to resolve it. 

![](https://i.imgflip.com/4m0jf6.jpg)

```

Being careful about these steps might seem pointless during solo projects, but I encourage you to practice these good habits now, so that when you do collaborative work, you're protected from mistakes.
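
To see why fetching first matters, here is a rough sketch in Python of when a conflict arises. It is an illustration under simplifying assumptions, not Git's actual algorithm: the function `find_conflicts` and the file names are hypothetical, and it compares whole files, whereas Git compares line by line (so two people can often edit different parts of the same file without a conflict).

```python
def find_conflicts(base, mine, theirs):
    """Return the files that both people changed, in different ways, since `base`.

    Each argument is a dict mapping a file name to that file's text:
    `base` is the last version both people had, `mine` is my current
    version, and `theirs` is my teammate's current version.
    """
    conflicts = []
    for name in sorted(set(base) | set(mine) | set(theirs)):
        original = base.get(name)
        my_version = mine.get(name)
        their_version = theirs.get(name)
        i_changed = my_version != original
        they_changed = their_version != original
        # Only a problem if we BOTH changed the file and ended up with different text.
        if i_changed and they_changed and my_version != their_version:
            conflicts.append(name)
    return conflicts

base = {"notes.md": "Lecture 1", "hw1.py": "x = 1"}
mine = {"notes.md": "Lecture 1 and my notes", "hw1.py": "x = 2"}
theirs = {"notes.md": "Lecture 1", "hw1.py": "x = 3"}

conflicts = find_conflicts(base, mine, theirs)
print(conflicts)
```

Only `hw1.py` is flagged, because both people changed it differently. `notes.md` is fine, because only one person touched it. Fetching before you start shrinks the window in which the first situation can happen.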

### Practice

- Fact: Git shows each commit as a set of _changes_ (called a _diff_) relative to the previous commit. Behind the scenes, it stores and transfers files compactly, so it doesn't need to make a full new copy of all your files each time. 
- View commit history of the [LeDataSciFi.github.io](https://github.com/LeDataSciFi/LeDataSciFi.github.io) repository by clicking on the "commits" button on the repo home page. (You'll end up [here](https://github.com/LeDataSciFi/LeDataSciFi.github.io/commits/master).)
- View a recent "diff" by clicking on the description of the commit or the button with the _SHA_ or _hash_ code (something like `990cf9a`).
    - **This is also useful for collaborators to see exactly what you changed.**
- View the repository from a while back with the `<>` button.
    - Before the `990cf9a` commit, this folder looked VERY different!
- View the history of a file by clicking on any file, then clicking "History".
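
If you want a feel for what a "diff" contains before clicking around, the sketch below builds one with Python's standard `difflib` module. The file name and contents are made up for illustration, and the layout is similar to, but not exactly the same as, what GitHub shows on a commit page.

```python
import difflib

# Two hypothetical versions of a README, one line per list entry.
before = ["# Class Notes", "", "Lecture 1: GitHub basics"]
after = ["# Class Notes", "", "Lecture 1: GitHub basics", "Lecture 2: Python basics"]

# unified_diff yields header lines, then the changes:
# added lines begin with "+", removed lines begin with "-".
diff_lines = list(
    difflib.unified_diff(
        before,
        after,
        fromfile="README.md (before)",
        tofile="README.md (after)",
        lineterm="",
    )
)
print("\n".join(diff_lines))
```

The unchanged lines appear only as context. The one new line is the only thing marked as a change, which is what makes a diff quick for collaborators to read.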



## Credits

- I have drawn heavily from [STAT545](https://stat545.stat.ubc.ca/)
- [QuantEcon.org](https://quantecon.org/QuantEcon.org)
- [EC607](https://raw.githack.com/uo-ec607/lectures/master/01-intro/01-Intro.html#1)
