# Introduction to Git (Hub)

Git is a **version control system** — a tool that helps keep track of changes in files over time.  

When people hear *Git*, they often think of **GitHub**. But Git is not the same as GitHub — it’s the underlying tool. You can also use Git with other platforms like **GitLab** and **Bitbucket**.  

If you’ve ever ended up with files like this:  

- super_crazy_code.py
- super_crazy_code1.py
- super_crazy_codev3.py
- super_crazy_codeV4_FINAL.py
- super_crazy_codeFINAL_FINAL.py

…then Git is something you might want to look into! Instead of saving endless versions of files, Git helps you track changes, roll back if needed, and collaborate with others more easily.  

In this workshop, we won’t go too deep into Git as a full version control system. If you’re interested in that, the official Git tutorial is a great place to start:  

-> [https://git-scm.com/docs/gittutorial](https://git-scm.com/docs/gittutorial) <-

For now, we’ll focus on how scientists often use Git:  

- **Cloning repositories** (downloading code from places like GitHub)  
- **Understanding the structure of a repo**  
- **Running someone else’s code**  

That way, you can make use of the code others have already shared.  



## Installing Git

Before we can start using Git, we need to make sure it’s installed on your computer.  
Follow the instructions below depending on your operating system:

### Windows
1. Go to the [Git for Windows](https://git-scm.com/download/win) download page.  
2. Download the installer and run it.  
3. During installation, you can accept the default options (these are fine for most users).  
4. Once complete, open **Git Bash** (installed with Git) to check everything works.

### macOS
1. The easiest way is to use **Homebrew**. If you don’t have it, install it from [brew.sh](https://brew.sh/).  
2. In a terminal, run:  
```bash
   brew install git
```

Alternatively, you can download the installer from the Git website.

## Verify the installation

To check that Git is installed correctly, open **Git Bash** (Windows) or a terminal (macOS) and run:

```bash
   git --version
```

You should see the installed version number, for example:

```bash
   git version 2.47.1.windows.2
```
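If you would rather run this check from Python (for example inside a notebook), here is a minimal sketch that uses only the standard library. The function name `git_version` is made up for this example, and it simply reports `None` when Git cannot be found.

```python
import shutil
import subprocess


def git_version():
    """Return the output of `git --version`, or None if Git is not available."""
    # shutil.which searches the PATH for the executable, much like the shell does
    if shutil.which("git") is None:
        return None
    result = subprocess.run(
        ["git", "--version"], capture_output=True, text=True, check=False
    )
    # A non-zero exit code means the command did not run successfully
    if result.returncode != 0:
        return None
    return result.stdout.strip()


print(git_version() or "Git was not found on this machine")
```

If the printed text starts with `git version`, the installation worked.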



## What Git Can Do

The main purpose of Git is **version control** — keeping track of changes in your files over time.  
But in this workshop, we’re not going to focus on that. Instead, we’ll see how Git can be used to quickly **pull code from open repositories** and run it locally.

If you’d like to dive deeper into Git (things like logging into your GitHub account, collaborating on projects, or using version control for your own work), I recommend following a full Git tutorial: [Official Git Tutorial](https://git-scm.com/docs/gittutorial).  
It’s especially worth it if you plan to work on larger projects — your future self will thank you!  

For today, though, we’ll keep it simple.  
The Git command we’ll use most is:

```bash
git clone
```

## Example 0: Cloning the Programming Workshop Repo

The first thing we’re going to do is **clone the repository that contains this notebook**!  

1. Navigate to the Programming Workshop GitHub repo.  
   If you get lost, here’s the link: [Programming Workshop Repo](https://github.com/isacasini/ProgrammingWorkshop)

2. Find the HTTPS URL for the repository:  
   - Click the green **Code** button.  
   - Copy the URL shown under “HTTPS.”

   ![Programming Workshop Repository Screenshot](images/program-workshop.png)

3. Open **Git Bash** (or your terminal) and navigate to the folder where you want to save this repo.  
   Personally, I like to keep all repositories in a single folder called `git-code`, but you can choose any location you like.

4. Clone the repository using the URL you just copied:
   ```bash
   git clone [paste-the-URL-here]
   ```

![gitbashscreeny](images/git-bash-clone.png)

5. After cloning, navigate into the new folder in your file explorer or using the terminal:

```bash
cd ProgrammingWorkshop
```

You should now see all the files and folders from the GitHub repository saved on your local machine.
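In case you are wondering where the folder name `ProgrammingWorkshop` comes from: by default, `git clone` names the new folder after the last part of the URL. The sketch below imitates that rule in Python. It is a simplification for ordinary HTTPS URLs, not Git's actual implementation, and `repo_folder_name` is a hypothetical helper.

```python
def repo_folder_name(clone_url):
    """Guess the folder that `git clone <url>` creates by default."""
    # Drop any trailing slash, then keep the last path segment
    last_part = clone_url.rstrip("/").split("/")[-1]
    # GitHub's HTTPS clone URLs normally end in ".git", which is not part of the folder name
    if last_part.endswith(".git"):
        last_part = last_part[: -len(".git")]
    return last_part


print(repo_folder_name("https://github.com/isacasini/ProgrammingWorkshop.git"))
# prints: ProgrammingWorkshop
```

This is why step 5 uses `cd ProgrammingWorkshop` without you ever choosing that name yourself.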


## Example 1: EasyFig

[Easyfig](https://github.com/mjsull/Easyfig) is a Python application for creating linear comparison figures of multiple genomic loci, such as genes or whole genomes.

I chose this example because it is **not in the Python Package Index** (more on that later), making it a good example for cloning and running a standalone repository.

The EasyFig repository looks like this:

![Easyfig Repository Screenshot](images/easyfig-repo.png)

#### Steps to Clone and Run EasyFig

1. Using the same steps we followed for the Programming Workshop repo, **clone the EasyFig repository** to your local machine.

2. Navigate into the cloned folder and run the program:

```bash
python Easyfig.py
```

This should launch the application, and it’s ready to create some neat visualizations.  

> Note: I don’t fully understand all the outputs myself — the goal here is just to give you an example of cloning and running Python code.  
> For more details about EasyFig and its usage, check out the official tutorial: [EasyFig Tutorial](https://mjsull.github.io/Easyfig/files.html)

#### Try It With Workshop Data

I’ve included some example data in the **ProgrammingWorkshop** repository you cloned earlier.  
- Look inside the `WorkshopProgramming` folder.  
- Input this data into EasyFig, and you should see the results in the program.

This gives you a hands-on feel for how you can pull code from GitHub and run it locally — even for projects that aren’t on PyPI.


## Example 2: A Simple Repo

In the previous example, we worked with a Python project that **did not require any external packages**, not even common ones such as `pandas` or `numpy`; it was pure Python. However, real-world scientific projects often rely on external dependencies to function correctly.

We've previously installed packages using: 

`conda install pandas numpy matplotlib`

This works when the package has been published to a conda channel. Not all packages are: the large majority are published to PyPI, the Python Package Index. To install packages from PyPI, we use the `pip` command.

You can browse all the available packages on PyPI here: [PyPI - The Python Package Index](https://pypi.org/)

### Installing Dependencies

1. **Check the `README` file**  
   Take a moment to read it—this file is usually displayed on the repository’s main page.

2. **Create a virtual environment, then install dependencies as instructed in `requirements.txt` or setup files.** Creating and activating the environment looks like this:

```bash
python -m venv env
env\Scripts\activate   # On Windows; on macOS/Linux use: source env/bin/activate
```

> ⚠️ Tip: Using a virtual environment is strongly recommended to avoid package conflicts.

Following these steps helps ensure that your environment matches the one intended by the developers.

### Step-by-Step

We'll go through a quick example of using a repository that has external dependencies. The link to the repo is here: [Example GitHub Repository](https://github.com/HaxbyH/quick-analysis)

![image.png](images/quick-example.png)



You can see that this is a very basic repository. It has three files:

- **README.md** - This file is displayed on the repository’s webpage. It should contain an overview of what the repository is and include steps on how to run the code.
- **requirements.txt** - This file lists the dependencies needed to run the code in the repository.
- **faker_data_analysis.ipynb** - This is a notebook file which contains the code that we are trying to run.
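Once a repository is cloned, you can look for these same landmark files from Python. The following is a rough sketch: `summarise_repo` is a made-up helper, and it only looks at the top level of the folder you give it.

```python
from pathlib import Path


def summarise_repo(folder):
    """Report which of the usual landmark files sit at the top of a repo folder."""
    root = Path(folder)
    # Collect the names of files (not sub-folders) directly inside the folder
    names = sorted(p.name for p in root.iterdir() if p.is_file())
    return {
        "readme": [n for n in names if n.lower().startswith("readme")],
        "requirements": [n for n in names if n.lower() == "requirements.txt"],
        "notebooks": [n for n in names if n.endswith(".ipynb")],
    }


# "." means the current folder; swap in the path of a repo you have cloned
print(summarise_repo("."))
```

An empty list for `requirements` is a hint that the project either has no external dependencies or describes them somewhere else, so the README is the next place to look.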

In the README.md file you can see that this repository wants you to create a new environment using a different command:

```bash
python -m venv env
source env/bin/activate   # On macOS/Linux
env\Scripts\activate      # On Windows
```

This repository is set up using a **venv** virtual environment. In the past, we have used **conda** to create virtual environments. It doesn’t really matter which method you choose—**venv** or **conda**—but a good rule of thumb is to follow the instructions provided by the repository.

Since this repository uses **venv**, we’ll do the same. When you run the command, you’ll notice a folder called **env** will appear. This folder contains the Python environment. You can explore the files inside, but it’s not necessary to understand everything about them just yet.
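If you are ever unsure whether the environment is actually active, Python can tell you. This small sketch relies on the standard `sys.prefix` and `sys.base_prefix` values, and `in_virtual_env` is just an illustrative name. Note that it detects **venv**-style environments; a **conda** environment will generally report `False` here.

```python
import sys


def in_virtual_env():
    """Return True when Python is running inside a venv-style environment."""
    # Inside a venv, sys.prefix points at the env folder, while
    # sys.base_prefix still points at the original Python installation
    return sys.prefix != sys.base_prefix


print("Inside a virtual environment:", in_virtual_env())
print("Python is running from:", sys.prefix)
```

After activating, the second line should show a path that ends in the `env` folder.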

Once you’re in this environment, you can run the next command:

```bash
pip install -r requirements.txt
```

As we know, `pip install` is used to install packages from the PyPI index. The `-r` flag indicates that we are passing in a requirements file, and `requirements.txt` is the requirements file that we found in the repository.
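To get a feel for what a requirements file contains, here is a simplified sketch that pulls the package names out of one. The sample text is hypothetical and is not the actual `requirements.txt` from the repository. The parsing is deliberately basic (`requirement_names` is a made-up helper) and does not cover everything `pip` accepts, such as URLs or environment markers.

```python
import re


def requirement_names(text):
    """Extract package names from the text of a simple requirements file."""
    names = []
    for line in text.splitlines():
        # Remove comments and surrounding whitespace
        line = line.split("#", 1)[0].strip()
        # Skip blank lines and pip options such as "-r other.txt"
        if not line or line.startswith("-"):
            continue
        # The name is everything before the first version specifier
        match = re.match(r"[A-Za-z0-9][A-Za-z0-9._-]*", line)
        if match:
            names.append(match.group(0))
    return names


# Hypothetical file contents, for illustration only
sample = """
# analysis
pandas>=2.0
numpy
matplotlib==3.8.0  # plotting
"""
print(requirement_names(sample))
# prints: ['pandas', 'numpy', 'matplotlib']
```

Each line names one package, optionally pinned to a version, and `pip install -r` installs them all in one go.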

Once that has run, our environment is ready to go! You can open the Jupyter notebook in VS Code, start your virtual environment, and run the code!


## Bonus Example: Pandas

If you've been following the workshops, you might recognise pandas and how we've been installing it:

```bash
pip install pandas
```

But let's pause for a second and think about what is actually happening here.

- `pip` is the Python **package installer**.  
- When you run `pip install pandas`, it **downloads the Pandas package from the Python Package Index (PyPI)**.  
- It then installs the package into your Python environment so you can use it in your scripts.
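You can see the result of that last step without installing anything. The sketch below asks the current Python environment which version of a package it has, using the standard library's `importlib.metadata`. The helper name `installed_version` is our own.

```python
from importlib import metadata


def installed_version(package):
    """Return the installed version of a package, or None if it is missing."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        # pip has not installed this package into the current environment
        return None


for name in ["pandas", "numpy"]:
    print(name, "->", installed_version(name) or "not installed")
```

Run it inside and outside a virtual environment and you may get different answers, because each environment keeps its own set of installed packages.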

> Note: For this workshop, we won’t actually install Pandas on your machines. We just want to understand that this is the code that fetches the library.

#### What gets pulled

If you explore PyPI (https://pypi.org/project/pandas/), you can see:
- The source code for Pandas.
- The metadata about versions and dependencies.
- The wheel or tar.gz file that `pip` downloads.

This is a good example of how GitHub-hosted projects (like EasyFig) and PyPI-hosted packages (like Pandas) are both **distributed for Python users**, but via slightly different systems.

> **Note:** If we really wanted to, we could follow the same process we used for EasyFig or the Programming Workshop repository and clone Pandas directly from GitHub, for example:  
> [https://github.com/pandas-dev/pandas](https://github.com/pandas-dev/pandas)  
>
> However, we won’t be doing that for this tutorial — we’re just illustrating how packages are installed from PyPI using `pip install`.