# What is GitHub?

> GitHub is a popular online platform that allows you to store your git-managed code repositories remotely. It enhances the collaborative aspect of coding, making it easy to share your projects, invite feedback, track issues, and manage changes with other developers. 

## Motivation
GitHub's comprehensive suite of features takes the version control that git offers and extends it to a collaborative, cloud-based environment, creating a platform for developers to share, improve, and showcase their work. 

- **Free Hosting and Storage:** GitHub provides free hosting and storage for your code repositories, making it an affordable option for individuals and small teams.

- **Collaboration:** It's designed for team collaboration, allowing multiple developers to work on the same project simultaneously and seamlessly merge their changes.

- **Distributed Architecture:** Every clone of a git repository holds the full project history, and GitHub adds a remote copy in the cloud, so an issue like a local hard drive failure does not mean losing your work.

- **Portfolio Showcase:** Many developers and data professionals use GitHub as a "shop window" to showcase their coding projects, demonstrating their skills to peers and potential employers.

- **Additional Features:** Beyond git's core functionality, GitHub offers a host of extra features, including issue tracking, project management tools, and automated testing workflows, to name a few.

## Get Started with GitHub

> Before we continue with the lesson, set yourself up with a GitHub account via this [link](https://github.com), if you don't already have one.

### Configuring git Locally

Before you make any commits, git needs to know who you are, because every commit records its author's name and email address. You can set these using the `git config` command, as follows:

  - `git config --global user.name 'Your Name'`
  - `git config --global user.email 'your@email.com'`

Ensure that you use the same email address you used when creating your GitHub account.
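
To see what these settings are used for, here is a minimal sketch that you can run in a terminal. It assumes a throwaway folder created with `mktemp`, and it sets the name and email for that one repository only (by leaving out `--global`), so your real configuration is left untouched:

```shell
# Create a throwaway repository so nothing in your real projects is touched
scratch=$(mktemp -d)
cd "$scratch"
git init -q

# Repo-local identity (no --global): applies only inside this repository
git config user.name 'Your Name'
git config user.email 'your@email.com'

# Make an empty commit, then print the author recorded on it
git commit -q --allow-empty -m "Check identity"
author=$(git log -1 --format='%an <%ae>')
echo "$author"
```

The last command prints `Your Name <your@email.com>`, which is the author information git attaches to each commit you make.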

## Creating your First Repository

When you log into GitHub, you will see a screen like this:

<br><p align=center><img src=images/Dashboard.png width=700></p><br>

This is the dashboard screen. It contains a lot of different information, but for now the important thing is the green button at the top of the left panel, marked `New`. Press this button to create a new GitHub repo.

This will then open the following screen:

<br><p align=center><img src=images/demo_create_1.png width=700></p><br>

First, we need to choose a name for our repo. The name should:

- **Be short, memorable and descriptive:** no more than 3 or 4 relevant words
- **Not contain spaces:** use hyphens (`-`) or underscores (`_`) to separate words
- **Be unique:** within the scope of your user account at least. It does not need to be globally unique.
- **Avoid special characters:** these can affect the operation of various command-line tools
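
The rules about spaces and special characters can be turned into a quick check. This is only a sketch: `is_tidy_repo_name` is a hypothetical helper written for this lesson, not a git or GitHub command, and it assumes a deliberately strict rule (letters, digits, hyphens, underscores and dots only):

```shell
# Hypothetical helper: succeeds only if the name is non-empty and contains
# nothing but letters, digits, dots, underscores and hyphens.
is_tidy_repo_name() {
  case "$1" in
    "" | *[!A-Za-z0-9._-]*) return 1 ;;
    *) return 0 ;;
  esac
}

# A name that follows the guidelines passes the check
is_tidy_repo_name "my_test_repo" && echo "my_test_repo: ok"

# A name with spaces and punctuation fails it
is_tidy_repo_name "my test repo!" || echo "'my test repo!': choose another name"
```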

For now, let's just call it `my_test_repo`. You then have the option to give the project a short, one-line description in the field below. This is optional, but it's good practice to do so, even for personal projects, as it is easier to see what the repo does as your library of projects grows.

Below the repo name and description, there are several more options:

<br><p align=center><img src=images/demo_create_2.png width=700></p><br>



### Public vs. Private

This is fairly self-explanatory. Public repositories and their contents can be viewed by anyone, and anyone can make a local instance of the repo (known as a *clone*) or copy a new version of the repo itself to their own GitHub account (known as a *fork*). We will learn more about these concepts later. Private repositories can only be viewed by the creator, or by collaborators that are added by the creator.


### Readme File

You can, and should, add a `README.md` file to any public repository. This is a markdown file which is displayed on the repository's homepage, and acts as a guide to the repository's contents. We will learn a lot more about how to structure these in another lesson, but for now you can just tick the box to create one.

### `.gitignore` File

By default, git will track any file inside the repository folder. However, there may be some files that you do not want to accidentally stage and commit, and which you would like to tell git to avoid tracking under any circumstances. Situations where this might apply include:

1. **Secrets and credentials:** any file that contains sensitive data, such as API keys, passwords, or cryptographic keys, should not be tracked in git to avoid these secrets becoming public. It's often better to use environment variables or other secure means to handle this type of information, and there are secure methods for storing secrets in GitHub itself, which we will discuss in a later lesson.

2. **Compiled source code:** files that are produced as the result of a build or compile process, such as `.o`, `.pyc` or `.exe` files, should typically be ignored because they should be generated locally from the source code.

3. **Operating system and editor-specific files:** files that are specific to your operating system or text editor, like `.DS_Store` files on macOS, don't usually belong in version control and should be added to `.gitignore`.

4. **Databases or large data folders:** these are typically not required for a project's source code, and so should be added to `.gitignore`. A repository should generally be kept under 1 GB in size to avoid performance issues and possible warnings from GitHub. There is also an upper size limit of 100 MB for individual files. For this reason, it is better to keep your repository to just your source code, and host any datasets elsewhere.


#### Avoid Revealing Sensitive Information

> Note: from time to time in the course, you will be asked to make use of API keys, which are a way of authenticating your permission to use a software service. It is very important that you remember to add any file containing these keys to your `.gitignore` file. GitHub may even block your pushes if you expose API keys, or it may inform the API key issuer of the breach and arrange for the key to be revoked. We will learn more about APIs (Application Programming Interfaces) later in the course.

If you have your repository set to public, anyone on the internet can view its contents. For this reason it's incredibly important to avoid uploading sensitive information such as usernames and passwords, or API keys or tokens. If information like this is needed to make your code run, then it should be stored in a file in your local repository, and this file should be added to your `.gitignore`. Never hard-code passwords or keys into your scripts. Even if your repository is set to private, it can be changed to public at a later date, and it is better not to have to remember to handle these issues.
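
As a rough illustration of keeping a secret out of your scripts, the sketch below stores a key in a separate file, ignores that file, and loads it at run time. The file name `.env`, the variable name `API_KEY` and the placeholder value are all assumptions made for this example; your course projects may use different names:

```shell
# Throwaway project folder for the demonstration
project=$(mktemp -d)
cd "$project"
git init -q

# Keep the secret in its own file, and tell git to ignore that file
echo 'API_KEY=replace-with-your-real-key' > .env
echo '.env' >> .gitignore

# A script reads the key when it runs, instead of containing it
. ./.env
echo "Key loaded: ${#API_KEY} characters"

# Only .gitignore is reported as untracked; git does not see .env at all
untracked=$(git status --porcelain)
echo "$untracked"
```

Because `.env` never appears in `git status`, it cannot be staged by accident with a command like `git add .`.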

#### `.gitignore` Defaults

You can select a default `.gitignore` file to start your project with from a list of templates. For example, if you are working on a Python project, you can select `Python` from the dropdown menu:


<br><p align=center><img src=images/select_gitignore.gif width=700></p><br>


### License

GitHub is primarily a platform for *open source* software. The open-source software movement is a global effort to promote and advocate for the production, distribution, and collaborative improvement of software with publicly available source code. When you choose to make your repository public, you should also select a license to go with it; without one, default copyright applies and others have no permission to use, modify or share your code. You should consider your intentions for the project's use and how open you want it to be for collaboration. Here are some of the popular license types you can use:

- **MIT License:** A permissive license that allows users to do whatever they want with the code (including commercial use) as long as they provide attribution and include the original license in any copies or substantial uses of the work.

- **GNU General Public License (GPL) v3.0:** A "copyleft" license that allows users to use, modify, and distribute the code, but any modified versions of the software must also be open source under the same GPL license.

- **Apache License 2.0:** A permissive license similar to the MIT license but also provides an express grant of patent rights from contributors to users.

- **GNU Affero General Public License v3.0 (AGPL-3.0):** Similar to the GPL, but also extends the copyleft requirements to cover software running as a service over a network.

- **Unlicense:** This license is used for dedicating works to the public domain, allowing users complete freedom to use, modify, and distribute the code without any restrictions.

- **Creative Commons Zero v1.0 Universal (CC0-1.0):** Similar to the Unlicense, it's a public domain dedication that allows copyright holders to release their works with no restrictions.

### Finish Creating your Repo

Finally, press the green button marked `Create repository` in the bottom right to create your repo.

<br><p align=center><img src=images/repo_create_button.png width=200></p><br>



## Cloning and Forking Repositories

*Cloning* and *forking* are two different ways of creating copies of a repository on GitHub. 

When you clone a repository, you're making a direct copy of the repository on your local machine, but you don't have permissions to add your changes back into the original repository unless you're the owner, or are given access. Typically you might clone another user's repo to make use of it, or clone your own repo to work on it.

Forking, on the other hand, creates a copy of the repository on your GitHub account. This allows you to make changes to your forked version independently of the original repo. The two can thereby diverge in function, so that you can take the project in a different direction to its progenitor.

### Cloning a Repository to your local machine

To clone your repo to your local machine, navigate to the main page of the repository on GitHub and click the green `Code` button, and copy the URL provided. Open your terminal or VSCode, navigate to the directory where you want the repository to be located, and type `git clone` followed by the URL you just copied, then hit enter. This will create a local copy of the repository on your machine.

<br><p align=center><img src=images/git_clone.gif width=800></p><br>

Make a copy of your `my_test_repo` repository on your local machine.

### Adding Items to `.gitignore`

Earlier in the lesson we discussed the reasons why you might want to add files or directories to your `.gitignore`, but how do we go about doing so? Let's look at an example:

1. Navigate to your local `my_test_repo` repository folder
2. Create a directory called `do_not_track`. You can use the command `mkdir do_not_track`
3. Create a text file called `password.txt` using the command `touch password.txt`
4. Open `.gitignore`, either from the Explorer sidebar in VSCode, or by typing `nano .gitignore` in the terminal

You can add any files you want to include in your `.gitignore`, each on a new line in the file. Directories are added the same way, but should be followed by a `/`, which tells git that the entry refers to a directory. Comments can also be added by prefixing the line with a `#`, just like in Python. If you selected a `.gitignore` template when creating the repo, your file might already contain some standard entries describing files that should typically be ignored for the programming language you are using. If so, just add your changes to the bottom of the file.

<br><p align=center><img src=images/after_ignore.png width=700></p><br>

Once you have saved the changes to the file, the files you added to `.gitignore` should be greyed out in the Explorer sidebar on VSCode:

<br><p align=center><img src=images/gitignore.png width=700></p><br>
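
If you are not using VSCode, you can confirm the same thing from the terminal with `git check-ignore`, which prints the paths that match an ignore rule. The sketch below assumes a throwaway repository, and the files `notes.txt` and `kept.txt` are hypothetical names added only so there is something to compare against:

```shell
# Throwaway repository with the folder and file from the steps above
repo=$(mktemp -d)
cd "$repo"
git init -q
mkdir do_not_track
touch do_not_track/notes.txt password.txt kept.txt

# Append the new entries to .gitignore (a comment, a file, a directory)
cat >> .gitignore <<'EOF'
# Files and folders that must never be committed
password.txt
do_not_track/
EOF

# Ask git which of these paths it will ignore
ignored=$(git check-ignore password.txt do_not_track/notes.txt kept.txt)
echo "$ignored"
```

The output lists `password.txt` and `do_not_track/notes.txt` but not `kept.txt`, which git will continue to track as normal.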

### Pushing Changes to the Remote Repository

Now that we have updated our `.gitignore` file locally, we'd like those changes to be reflected in the remote repository on GitHub. To do so, we can stage it using the command:

`git add .gitignore`

And then commit the changes using `git commit`. Remember to add a commit message, explaining what changes are in this commit!


Now we need to get that commit to our remote repo. This is called *pushing* our commit, and it is accomplished using the command:

`git push`

This will push the commits on your current local branch to the matching branch of the remote repository.

<br><p align=center><img src=images/git_push.gif width=700></p><br>
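
Put together, the whole cycle of cloning, staging, committing and pushing looks like the sketch below. So that it runs without a network connection or a GitHub account, it assumes a bare repository in a temporary folder standing in for GitHub; with a real repo you would pass the URL copied from the `Code` button to `git clone` instead of a folder path. The `-m` option supplies the commit message directly on the command line.

```shell
# Identity for the demo commits (normally taken from your git config)
export GIT_AUTHOR_NAME='Your Name' GIT_AUTHOR_EMAIL='your@email.com'
export GIT_COMMITTER_NAME='Your Name' GIT_COMMITTER_EMAIL='your@email.com'

# Stand-in for GitHub: a bare repository holding one initial commit
cd "$(mktemp -d)"
git init -q seed
cd seed
echo "# my_test_repo" > README.md
git add README.md
git commit -q -m "Initial commit"
git branch -M main
cd ..
git clone -q --bare seed my_test_repo.git

# Clone the repo, change .gitignore, then stage, commit and push
git clone -q my_test_repo.git my_test_repo
cd my_test_repo
echo 'password.txt' >> .gitignore
git add .gitignore
git commit -q -m "Ignore password.txt"
git push -q

# After the push, the local branch and the remote branch are at the same commit
local_head=$(git rev-parse HEAD)
remote_head=$(git rev-parse origin/main)
echo "local:  $local_head"
echo "remote: $remote_head"
```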

### Common Issues with `git push`

There are a few common issues you might encounter when attempting to push your code to the remote repository.

#### Pushing a New Branch

This issue occurs if you create a new branch locally which does not yet exist on the remote repository, and you want to push it there for the first time.

Let's try it out:

1. Create a new branch of `my_test_repo` using the command `git checkout -b my_new_branch`
2. Create a new text file using the command `touch new_textfile.txt`
3. Stage and commit the new file
4. Try and run `git push`

You should see something like the following:

<br><p align=center><img src=images/new_branch_push.png width=700></p><br>

The solution here is actually provided in the feedback you receive from attempting `git push`:

`git push --set-upstream origin my_new_branch`

The `--set-upstream` option links your local branch to its corresponding remote branch on your GitHub repository, allowing git to know where future pushes for the branch should be directed. This command needs to be used only once for each branch, as git will remember the setting for subsequent pushes and pulls. 
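
The sketch below walks through the same situation from start to finish. It assumes a bare repository in a temporary folder standing in for GitHub, so the paths are illustrative rather than what you would see with a real remote:

```shell
# Identity for the demo commits (normally taken from your git config)
export GIT_AUTHOR_NAME='Your Name' GIT_AUTHOR_EMAIL='your@email.com'
export GIT_COMMITTER_NAME='Your Name' GIT_COMMITTER_EMAIL='your@email.com'

# Stand-in for GitHub: a bare repository holding one initial commit
cd "$(mktemp -d)"
git init -q seed
cd seed
echo "# my_test_repo" > README.md
git add README.md
git commit -q -m "Initial commit"
git branch -M main
cd ..
git clone -q --bare seed my_test_repo.git
git clone -q my_test_repo.git my_test_repo
cd my_test_repo

# Create a branch that exists only locally, and commit a file on it
git checkout -q -b my_new_branch
touch new_textfile.txt
git add new_textfile.txt
git commit -q -m "Add new_textfile.txt"

# A plain push is refused because the branch has no upstream yet
git push -q 2>/dev/null || echo "plain 'git push' was refused"

# Push once with --set-upstream to create and link the remote branch
git push -q --set-upstream origin my_new_branch

# The local branch now knows which remote branch it belongs to
upstream=$(git rev-parse --abbrev-ref 'my_new_branch@{upstream}')
echo "$upstream"
```

The final line prints `origin/my_new_branch`, and from this point a plain `git push` on that branch works without any extra arguments.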

> In git, 'origin' is a default name used for the remote repository from which your local repository was cloned. It's essentially an alias or shortcut for the URL of the remote repository.

#### Remote Branch is Ahead of Local Branch

This issue occurs when there are commits on the remote branch that are not reflected in the local repository. When the remote branch is ahead of your local branch, git will reject your push because it could overwrite commits on the remote branch. This is to prevent potential loss of work that exists on the remote repository but hasn't been pulled into the local repository yet. This often happens to new GitHub users when they edit their `README.md` on their remote repository, and subsequently make changes to their local repository and attempt to push them.

To recreate this issue:

1. Go to your remote repository and add some text to the `README.md` and commit the changes to the main branch:

<br><p align=center><img src=images/add_remote_text.gif width=800></p><br>

2. In your local repository, make sure that `main` is checked out, and type `nano test.txt` to create a new file and add some text to it
3. Stage and commit your changes to `test.txt`
4. Run `git push`

You should see the following error:

<br><p align=center><img src=images/ahead_by_commits.png width=800></p><br>

Before you can push your changes, you need to integrate the new commits from the remote branch into your local branch. You typically do this using `git pull`, which fetches the changes from the remote repository and then merges them into your current branch. We will learn a lot more about this in a later lesson, but in this situation you can fix this error using the command:

`git pull --rebase origin main`

`git pull` is essentially a combination of `git fetch` (which fetches the latest commits from the remote repository but doesn't merge them) and `git merge` (which merges those fetched commits into your current branch). 

*Rebasing* is a process that takes the changes made in your branch, temporarily removes them, updates your branch with the latest changes from the remote branch, and then reapplies your changes on top of those updates. We will learn more about all of these concepts in a later lesson.

When you execute `git pull --rebase`, git fetches the changes from the remote repository and then, instead of merging them, it rebases your local branch onto the fetched branch. 
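
To make the sequence concrete, here is a sketch that reproduces the rejected push and the fix entirely on your own machine. It assumes a bare repository in a temporary folder standing in for GitHub, and a second clone called `other` standing in for the edit made through the GitHub website; both are stand-ins invented for this example:

```shell
# Identity for the demo commits (normally taken from your git config)
export GIT_AUTHOR_NAME='Your Name' GIT_AUTHOR_EMAIL='your@email.com'
export GIT_COMMITTER_NAME='Your Name' GIT_COMMITTER_EMAIL='your@email.com'

# Stand-in for GitHub: a bare repository holding one initial commit
cd "$(mktemp -d)"
git init -q seed
cd seed
echo "# my_test_repo" > README.md
git add README.md
git commit -q -m "Initial commit"
git branch -M main
cd ..
git clone -q --bare seed remote.git
git clone -q remote.git local
git clone -q remote.git other

# 1. A commit reaches the remote that the local clone has not seen
cd other
echo "Edited on GitHub" >> README.md
git commit -q -am "Update README"
git push -q
cd ../local

# 2. Meanwhile, a different file is committed locally
echo "some text" > test.txt
git add test.txt
git commit -q -m "Add test.txt"

# 3. The push is rejected because the remote branch is ahead
git push -q 2>/dev/null || echo "push rejected: the remote has new commits"

# 4. Fetch the remote commit, replay the local commit on top, push again
git pull -q --rebase origin main
git push -q

# The history is linear, with the local commit on top
git log --format=%s
```

The log shows `Add test.txt` above `Update README`, with no merge commit in between, which is the effect of using `--rebase` rather than the default merge.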




### Forking a Repository

To fork a repository on GitHub, navigate to the main page of the repository you want to fork. Click on the `Fork` button at the top right of the page. This creates a new copy of the entire repository and its history onto your own GitHub account, allowing you to make changes independently of the original project.


<br><p align=center><img src=images/fork_repo.gif width=700></p><br>

## Key Takeaways


- GitHub is an online hosting platform for git repositories
- To make the best use of GitHub, you need to configure your local git with your name and email address
- When creating a repository on GitHub, you have several options, including setting visibility to public or private, adding a `README` file, and customising your `.gitignore`
- Private repositories can only be viewed by the owner, and people the owner selects as collaborators
- Public repositories can be viewed by anyone
- The `.gitignore` file is used to tell git not to track files so that they don't appear on the remote version of a repository
- The `git clone` command is used to create a local copy of a remote repository
- The `git push` command is used to send commits in your local copy of the repo to the remote version
- **Forking** a repository is the process of creating a separate copy of it on your own GitHub account, which thereafter does not automatically receive changes made to the original repository