# Week 2: Git Basics, Introduction to Pandas & Python Functions

## Part 1: Introduction to Git and Basic Commands on Google Colab

### History of Git
Git was created by Linus Torvalds in 2005 out of a need for a distributed version control system that could handle large projects and was not reliant on a central server.

### Advantages of Git:
1. **Distributed System**: Every user has a complete local copy of the repository, allowing for full functionality and history viewing even when offline.
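2. **Branching and Merging**: Creating and merging branches is fast and cheap, so several people can develop in parallel and resolve any conflicts when their branches are merged.
3. **Data Integrity**: Git identifies every file and commit by a cryptographic hash (SHA-1) of its content, so any change to the content changes the identifier and corruption is easy to detect.
4. **Speed**: Most operations run against your local copy of the repository, so they are fast and need no network access.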
5. **Collaboration**: Git platforms like GitHub and GitLab provide tools for collaboration, code review, and issue tracking.
6. **Open Source**: Git is free and open source.
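
To make the data-integrity point concrete, here is a minimal sketch of how Git derives the identifier of a file's content (a "blob") with SHA-1. The function name `git_blob_hash` is our own, chosen for illustration, and the sketch assumes a repository that uses Git's default SHA-1 object format.

```python
import hashlib


def git_blob_hash(content: bytes) -> str:
    """Return the SHA-1 identifier Git would assign to a file with this content."""
    # Git hashes a short header ("blob <size in bytes>" plus a null byte)
    # followed by the file content itself.
    header = f"blob {len(content)}\0".encode()
    return hashlib.sha1(header + content).hexdigest()


original = git_blob_hash(b"print('hello')\n")
edited = git_blob_hash(b"print('hello!')\n")

print(original)
print(edited)
print("Same content, same hash:", original == git_blob_hash(b"print('hello')\n"))
print("Different content, different hash:", original != edited)
```

Inside a repository, running `!git hash-object <file>` on a file with the same content should print the same identifier.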

### Public vs. Private Repositories:
- **Public Repositories**:
  - Accessible to everyone.
  - Anyone can clone, fork, or view the content, but not everyone can push changes unless given permission.
  - Great for open-source projects where collaboration from the community is encouraged.
- **Private Repositories**:
  - Access is restricted to users who have been granted permission.
  - Ideal for proprietary projects, sensitive information, or academic assignments where you don't want solutions to be publicly accessible.
  - Platforms like GitHub offer private repositories even in their free tier.

### Understanding the Workings of Git and Introduction to GitHub

**Git Workflow:**
- **Local Working Directory:** Where you work directly on files.
- **Staging Area:** An intermediate area where you collect and review the changes that will go into your next commit.
- **Local Repository:** Where commits are stored locally.
- **Remote Repository:** A version of your project hosted on the internet or network.

**Key Commands:**
- `git init`: Initializes a new Git repository.
- `git add`: Adds changes to the staging area.
- `git commit`: Commits changes to the local repository.
- `git push`: Pushes changes to a remote repository.
- `git pull`: Pulls changes from a remote repository.
- `git clone`: Clones a repository from a remote source.
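
The sketch below walks one file through the first three areas (working directory, staging area, local repository) using `git init`, `git add`, and `git commit`. It is an illustration under assumptions, not part of the class setup: the helper names `run_git` and `demo_workflow`, the file `notes.txt`, and the throwaway identity are all made up for this example, and everything happens in a temporary folder that is deleted afterwards. `git push`, `git pull`, and `git clone` are left out because they need a remote repository.

```python
import shutil
import subprocess
import tempfile
from pathlib import Path


def run_git(args, cwd):
    """Run one git command inside `cwd` and return its output as text."""
    result = subprocess.run(
        ["git", *args], cwd=cwd, capture_output=True, text=True, check=True
    )
    return result.stdout.strip()


def demo_workflow():
    """Move one file from the working directory to the local repository."""
    with tempfile.TemporaryDirectory() as folder:
        run_git(["init"], folder)  # create an empty repository
        Path(folder, "notes.txt").write_text("first draft\n")  # edit in the working directory
        run_git(["add", "notes.txt"], folder)  # copy the change to the staging area
        staged = run_git(["diff", "--cached", "--name-only"], folder)
        # Throwaway identity so the commit works without any global git configuration.
        identity = ["-c", "user.name=Demo Student", "-c", "user.email=demo@example.com"]
        run_git([*identity, "commit", "-m", "Add notes"], folder)  # store it in the local repository
        history = run_git(["log", "--oneline"], folder)
        return staged, history


if shutil.which("git"):  # only run the demo when git is installed
    staged_files, commit_log = demo_workflow()
    print("Staged before commit:", staged_files)
    print("History after commit:", commit_log)
```

In a Colab notebook you would normally type the same commands directly with the `!` prefix; the `subprocess` wrapper is used here only so that the example also runs as plain Python.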


### Running Git Commands in Google Colab
In Google Colab, you can run terminal commands from a code cell by prefixing them with `!`. The `!` tells Colab that the line is a shell command (run by `bash`) rather than Python code. This is how you run Git commands in a notebook. For example:

```python
!git clone <repository_url>
```

### Basic Git Bash Commands

1. **Clone a Repository**: This command creates a copy of the repository on your local machine.
```python
!git clone <repository_url>
```

2. **Add Changes**: After making changes to your files, you need to add them to the staging area before committing.
```python
!git add .
```

3. **Commit Changes**: This saves your changes with a message describing what you did.
```python
!git commit -m "Your descriptive commit message here"
```

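4. **Push Changes**: This uploads your committed changes to the branch `master` on the remote repository named `origin`. Repositories created on GitHub since late 2020 usually call their default branch `main`, so replace `master` with your branch name where needed.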
```python
!git push origin master
```

## Exercise 1: Cloning your class repository

IMPORTANT: If you get stuck, read this [page](https://medium.com/analytics-vidhya/how-to-use-google-colab-with-github-via-google-drive-68efb23a42d) and follow the instructions.

From now on I'm not going to share individual notebooks with you but rather the class repository. You will need to clone the repository to your Google Drive and then open the notebooks from there. To do so, follow these steps.

1. Open your Google Drive page and create a new folder called "git_projects" (or whatever you want to call it) in your `My Drive` folder.
2. Go to Google Colab and create a new notebook and call it `project_setup.ipynb`.
3. Mount your Google Drive by running the following code in the first cell of your notebook:
   ```python
   from google.colab import drive
   drive.mount('/content/drive')
   ```
4. Set your working directory to the folder you created in step 1 by running the following code in the second cell of your notebook:
   ```python
   %cd /content/drive/MyDrive/git_projects/
   ```
5. Clone the class repository by running the following code in the third cell of your notebook:
   ```python
   !git clone https://github.com/jancgreyling/AE_772_892.git
   ```
   This step clones the repository and creates a folder called `AE_772_892` in your `git_projects` folder.
6. Open the `AE_772_892` folder in your Google Drive and navigate to the `Lectures` folder and open the notebook for this lecture. You can open it in Google Colab by right-clicking on the notebook and selecting `Open with` and then `Google Colaboratory`.
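
If you want a setup cell that is safe to run more than once, one possible approach is to decide between cloning and pulling based on whether the folder already exists. The sketch below only builds the command; the function name `sync_command` is hypothetical, and the paths assume the folder layout from the steps above.

```python
from pathlib import Path


def sync_command(repo_url: str, parent_dir: str) -> list:
    """Return the git command that brings the local copy of `repo_url` up to date."""
    # Derive the folder name git would use, e.g. ".../AE_772_892.git" -> "AE_772_892".
    name = repo_url.rstrip("/").split("/")[-1]
    if name.endswith(".git"):
        name = name[: -len(".git")]
    target = Path(parent_dir) / name
    if (target / ".git").exists():
        # Already cloned: update the existing copy instead of cloning again.
        return ["git", "-C", str(target), "pull"]
    # Not cloned yet: clone into parent_dir/<repository name>.
    return ["git", "clone", repo_url, str(target)]


command = sync_command(
    "https://github.com/jancgreyling/AE_772_892.git",
    "/content/drive/MyDrive/git_projects",
)
print(" ".join(command))
```

To actually run the result, you could pass it to `subprocess.run(command, check=True)` after `import subprocess`, or simply type the printed command in a cell with the `!` prefix.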

**Note the following:**
- You only need to do this once. From now on you can open the notebooks directly from your Google Drive.
- Once you've cloned the repository, you cannot clone it into the same folder again. To update it with the latest changes, run the following code in a cell in your notebook:
   ```python
   !git pull
   ```
- VERY IMPORTANT:
   - You can only run the above code if you are in the correct directory. If you are not in the correct directory, you will get an error. To change your directory, you can run the following code:
      ```python
      %cd /content/drive/MyDrive/git_projects/AE_772_892/
      ```
   - YOU NEED TO DO THIS AT THE START OF EACH LECTURE TO GET THE LATEST NOTEBOOKS.
   - If you've made changes to the notebooks, you may get an error when you try to pull the latest changes. You will need to commit your changes first. We will discuss this in more detail in the next lecture. For now, you can stash your changes (Git sets them aside and restores the files to their last committed state; `git stash pop` brings them back later) by running the following code:
      ```python
      !git stash
      ```
- Note that you need to refresh your Google Drive tab in Google Colab to see the changes you've made to the repository.
   ![Figure 3](/Images/refresh.png)
- Note that when a repo is cloned or initialised, it contains a hidden `.git` folder. If you see this folder, or other hidden Git files, when you open the repo in Google Drive, DO NOT DELETE THEM. If you do, you will break the link between your Google Drive copy and the GitHub repository.
- For this reason, you can only `stash` and `pull` code if you are in the correct directory. If you are not in the correct directory, you will get an error. To change your directory, you can run the following code:
   ```python
   %cd /content/drive/MyDrive/git_projects/AE_772_892/
   ```
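
If you are unsure whether you are in the correct directory, a small check like the one below can tell you before you pull. It is a sketch under assumptions: the function name `find_repo_root` is our own, and it simply looks for the hidden `.git` entry in the current folder and each folder above it, which is roughly how Git itself locates a repository.

```python
from pathlib import Path
from typing import Optional


def find_repo_root(start: str = ".") -> Optional[Path]:
    """Return the nearest folder at or above `start` that contains `.git`, else None."""
    current = Path(start).resolve()
    for folder in [current, *current.parents]:
        if (folder / ".git").exists():
            return folder
    return None


root = find_repo_root()
if root is None:
    print("Not inside a Git repository. Use %cd to move into the cloned folder first.")
else:
    print(f"Inside the repository at {root}; stash and pull should work from here.")
```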