---


<center>
    <h1>GitHub 101
    </h1>
    <h2>
        Nov 28 2022
    </h2>
</center>


---


# What is Git?

- Version control system designed to track changes in source code over time
- Collaborative system that lets many people work on the same project
- Provision for creation of a single central repository (called "origin" or "remote")
- Created by the great Linus Torvalds to make Linux development faster

# What is GitHub?

- Web platform built on top of Git that makes hosting and sharing repositories easier
- Offers additional features like user management, pull requests, automation
- Alternatives include GitLab and Bitbucket (Sourcetree is a desktop Git client, not a hosting platform)

# GitHub Use Cases

- Version Control
- Collaboration
- *Inspiration*
- **Continuous Integration / Continuous Delivery (CI/CD)**
- **Software Development (Dev) and IT operations (Ops) (DevOps)**

---

# Common Terminology (1/4)

<ul class="list-disc list-outside list-force-indent mb-6">
<li class="mb-2"><p><strong>Repository</strong> - &quot;Database&quot; of all the branches and commits of a single project</p></li>
<li class="mb-2"><p><strong>Branch</strong> - Alternative state or line of development for a repository.</p></li>
<li class="mb-2"><p><strong>Merge</strong> - Combining two (or more) branches into a single branch, a single source of truth.</p></li>
</ul>

# Common Terminology (2/4)

<ul>
<li class="mb-2"><p><strong>Clone</strong> - Creating a local copy of the remote repository.</p></li>
<li class="mb-2"><p><strong>Origin</strong> - Common alias for the remote repository which the local clone was created from</p></li>
<li class="mb-2"><p><strong>Main</strong> / <strong>Master</strong> - Common name for the root branch, which is the central source of truth.</p></li>
</ul>

# Common Terminology (3/4)

<ul>

<li class="mb-2"><p><strong>Stage</strong> - Choosing which files will be part of the new commit</p></li>
<li class="mb-2"><p><strong>Commit</strong> - A saved snapshot of staged changes made to the file(s) in the repository.</p></li>
<li class="mb-2"><p><strong>HEAD</strong> - Shorthand for the commit your local repository is currently on.</p></li>
</ul>

# Common Terminology (4/4)

<ul>
<li class="mb-2"><p><strong>Push</strong> - Pushing means sending your changes to the remote repository for everyone to see</p></li>
<li class="mb-2"><p><strong>Pull</strong> - Pulling means getting everybody else&#x27;s changes to your local repository</p></li>
<li class="mb-2"><p><strong>Pull Request</strong> - Mechanism to review &amp; approve your changes before merging to main/master</p></li>
</ul>

---

<center>
<strong>Basic Commands</strong>
<p><img class="mb-6" src="commands.png" width="1200" height="1000" alt="basic Git commands"/></p>
    <br>
</center>


<center><h2>Terminology in One Picture</h2>
<p><img class="mb-6" src="term.jpeg" width="800" height="800" alt="Git terminology in one picture"/></p>
</center>


---

<center>
    <h1>Rules of thumb for Git</h1>
</center>

---


## Don't push datasets

- Space limitations
- Privacy issues
- `.gitignore`
    ```shell
    # data files in any directory
    *.csv
    *.json
    *.hdf5
    
    # ignore archives
    *.zip
    *.tar
    *.tar.gz
    *.rar

    # ignore dataset folder and subfolders
    datasets/
    data/
    ```
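
A `.gitignore` only helps for patterns you thought of in advance. As a rough complement, the sketch below lists files above a size threshold so they can be reviewed before committing. The function name `find_large_files` and the 50 MB threshold are assumptions made for illustration (GitHub itself rejects individual files above 100 MB).

```python
from pathlib import Path

# Assumed threshold for illustration; pick a value that suits your project.
SIZE_LIMIT_BYTES = 50 * 1024 * 1024


def find_large_files(root, limit=SIZE_LIMIT_BYTES):
    """Return (relative path, size in bytes) for files under root larger than limit."""
    root = Path(root)
    large = []
    for path in root.rglob("*"):
        # Skip Git's own object store and anything that is not a regular file.
        if ".git" in path.relative_to(root).parts or not path.is_file():
            continue
        size = path.stat().st_size
        if size > limit:
            large.append((path.relative_to(root).as_posix(), size))
    return sorted(large)


if __name__ == "__main__":
    for name, size in find_large_files("."):
        print(f"{size / 1e6:8.1f} MB  {name}")
```

Run it from the repository root before `git add`; anything it prints is a candidate for `.gitignore`.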

## Don't push secrets

- **DO NOT HARD CODE ANY PRIVATE INFO INTO CODE**
- Use `.env` and `python-dotenv` 
    - installing `python-dotenv` 
        ```bash
        pip install python-dotenv
        ```
    - `.gitignore`
         ```
         .env
         ```
    - `.env`
        ```shell
        API_TOKEN=98789fsda789a89sdafsa9f87sda98f7sda89f7
        ```
    - `yourpythonfile.py`
        ```python
        import os

        from dotenv import load_dotenv

        load_dotenv()  # load variables from .env into the process environment
        api_token = os.getenv('API_TOKEN')
        ```
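
`os.getenv` returns `None` when a variable is missing, which can surface as a confusing error much later. A minimal sketch of failing early instead is shown here; the helper name `require_env` is hypothetical and not part of `python-dotenv`.

```python
import os


def require_env(name):
    """Return the value of environment variable `name`, or raise if it is unset or empty."""
    value = os.environ.get(name)
    if not value:
        raise RuntimeError(
            f"Missing environment variable {name!r}; add it to your .env file"
        )
    return value
```

After calling `load_dotenv()`, `api_token = require_env("API_TOKEN")` would then stop immediately with a readable message on a machine where the `.env` file was never copied over.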
  

         

## Do small commits with clear descriptions

- part of good version control practice
- helpful in code recovery

<center>
<strong>Don't be this person</strong>
<p><img class="mb-6" src="https://valohai.com/blog/git-for-data-science/bad-repo.png" width="1600" height="800" alt="a bad repository"/></p>
    <br>
<strong>Be this person</strong>
<p><img class="mb-6" src="https://valohai.com/blog/git-for-data-science/good-repo.png" width="1600" height="800" alt="a good repository"/></p>
</center>


## Branching & pull requests

- branch repos to **protect deployed code**, **add new features**, **run locally with customizations**
- the `main`/`master` branch should be OS- and machine-agnostic and **functional**

<center>
<img src="https://valohai.com/blog/git-for-data-science/git-branches.png"  width="1400" height="600">
</center>


## In general...

- Don't use `--force` (e.g. `git push --force`)
- Don't push notebook outputs *if possible*
- Always `git pull` before `git push`
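
Notebook outputs bloat diffs and can leak data. As a rough illustration of what "not pushing outputs" means, the sketch below clears outputs from an `.ipynb` file using only the standard library. It assumes the nbformat 4 JSON layout (a top-level `cells` list), and the function name `strip_outputs` is hypothetical; dedicated tools such as `nbstripout` cover more cases.

```python
import json
from pathlib import Path


def strip_outputs(notebook_path):
    """Clear outputs and execution counts of all code cells, in place."""
    path = Path(notebook_path)
    notebook = json.loads(path.read_text(encoding="utf-8"))
    for cell in notebook.get("cells", []):
        if cell.get("cell_type") == "code":
            cell["outputs"] = []
            cell["execution_count"] = None
    path.write_text(json.dumps(notebook, indent=1) + "\n", encoding="utf-8")
```

Calling `strip_outputs("analysis.ipynb")` (an example filename) before `git add` keeps only the code and markdown in the commit.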

---

<center>
    <h1>Suggested Workflows</h1>
</center>

---


## Starting a new code base:

- Create a `README.md` file
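- Create a `requirements.txt` and/or an `environment.yml` file for Python projects
    - `pip install pipreqs`
    - `pipreqs path/to/project` (writes `requirements.txt`)
    - `conda env export > environment.yml` (writes `environment.yml`)
    - `conda env create -f environment.yml` (recreates the environment from the file)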
- Decide whether code should be private or public
- Add yourself/others as collaborators once repository has been created
- Create packaging instructions/shell script
- Write **Unit Tests** and **Integration Tests** (*Optional*)

## Starting a new code base (*git and hub*):

```shell
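$ git init
$ git add .
$ git commit -m "commit message"
# Option A: create the GitHub repository with hub (this also adds the "origin" remote)
$ hub create --private
# Option B: create it on github.com, then add the remote yourself
$ git remote add origin [copied web address]
# push the current branch and set its upstream
$ git push -u origin HEAD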
```


## Starting a new code base (*packaging*):

1. add `__init__.py` to your package directory
2. create `requirements.txt`  (pipreqs)
3. create `setup.py` 
    ```python
    import setuptools

    setuptools.setup(
        name='',
        version='',
        description='',
        url='#',
        author='',
        author_email='',
        install_requires=[],  # list runtime dependencies here
        packages=setuptools.find_packages(),
        zip_safe=False,
    )
    ```
4. build with `pip install .`, or rebuild from scratch:
     ```shell
     rm -rf build
     rm -rf nameOfPackage.egg-info
     pip install .
     ```

**example:** https://github.com/LFL-Lab/lflPython/tree/shanto


# Unit Tests

* Unit tests are designed to test one specific piece of functionality in the code
* They compare the output of a piece of code with the expected output to determine whether the test should pass or fail
- Example:

```python

def make_integer(float_number):
    """
    Function to turn floats into integers (truncates toward zero).
    """
    return int(float_number)

def test_make_integer():
    """
    Tests whether 'make_integer' outputs integers
    """
    # use a concrete input; test runners such as pytest call tests without arguments
    assert isinstance(make_integer(3.7), int)
```  

- https://github.com/cleder/awesome-python-testing
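
Checking only the return type lets many bugs through, so a unit test usually compares against known values as well. The sketch below extends the `make_integer` example in that direction; the test names are illustrative, and the plain `assert` style runs with or without a test runner.

```python
def make_integer(float_number):
    """Turn a float into an integer by truncating toward zero."""
    return int(float_number)


def test_make_integer_truncates_toward_zero():
    # int() drops the fractional part rather than rounding.
    assert make_integer(2.9) == 2
    assert make_integer(-2.9) == -2


def test_make_integer_rejects_non_numeric_text():
    # Invalid input should fail loudly instead of returning a wrong value.
    try:
        make_integer("not a number")
    except ValueError:
        pass
    else:
        raise AssertionError("expected ValueError")


if __name__ == "__main__":
    test_make_integer_truncates_toward_zero()
    test_make_integer_rejects_non_numeric_text()
    print("all tests passed")
```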


# Integration Tests

* Integration tests ensure that different pieces of code work together correctly as a whole (i.e. that different pieces of code are well integrated)

* Simulators to test experimental systems

* A wide range of test cases to validate simulations

* https://github.com/features/actions
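
To make the simulator idea concrete, here is a minimal sketch of an integration test in which a simulator stands in for the experiment and its output is fed through an analysis step. Both functions (`simulate_decay`, `estimate_t1`) and the noiseless exponential-decay model are hypothetical examples, not code from any particular lab package.

```python
import math


def simulate_decay(t1, times):
    """Hypothetical simulator: ideal exponential decay exp(-t / t1), no noise."""
    return [math.exp(-t / t1) for t in times]


def estimate_t1(times, values):
    """Hypothetical analysis step: least-squares line through log(values) vs. times."""
    logs = [math.log(v) for v in values]
    n = len(times)
    mean_t = sum(times) / n
    mean_y = sum(logs) / n
    slope = sum((t - mean_t) * (y - mean_y) for t, y in zip(times, logs)) / sum(
        (t - mean_t) ** 2 for t in times
    )
    return -1.0 / slope


def test_simulator_and_analysis_agree():
    # The analysis should recover the parameter the simulator was given.
    times = [0.0, 1.0, 2.0, 3.0, 4.0]
    values = simulate_decay(t1=2.5, times=times)
    assert math.isclose(estimate_t1(times, values), 2.5, rel_tol=1e-9)
```

Unlike a unit test, this fails if either piece changes its conventions (units, sign, data layout) without the other following.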

## Adding to an existing code base:

- Different protocols for different needs - *bug fixes*, *documentation*, *feature additions*
- Different approaches depending on which machine the changes are made from

## Adding to an existing code base (*Proper Protocol*):

- Fork the repository
- Clone the forked repository on the machine being used
- Create a new branch
    - naming conventions:
        - continuous development case: `author-dev` or just `author` (e.g. `shanto`)
        - feature based cases: `feature-author` or just `feature` (e.g. `hmm-sas`)
- Commit your changes and push the branch to your fork
- Create a Pull Request (https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/proposing-changes-to-your-work-with-pull-requests/creating-a-pull-request)

- Code Maintainer reviews the request (https://github.com/features/code-review) and decides to merge or not

## Adding to an existing code base (*Realistic*):

- For small bugs/documentation
    - If on lab computer 
        - **fix the bug/document and commit**
    - If on personal computer
        - **fork, create new branch and pull request**
- For features
     - **follow the proper protocol**

## Deployment of Software on Lab computers/HPC:

- **Assuming the correct protocols were followed**
- Clone the repository
- Copy over any secret files securely (i.e. `.env`)
- Create a fresh conda environment, or run `pip install -r requirements.txt` in the current environment
- Build the package to resolve any path issues (`pip install .`)
- Run the code 😇
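
Before running anything on a freshly cloned copy, a quick pre-flight check can catch a forgotten `.env` or a missing packaging file. The sketch below is one possible shape for such a check; the function name `missing_files` and the exact list of expected files are assumptions based on the files mentioned in these slides.

```python
from pathlib import Path

# Files this workflow expects in a deployed clone (adjust to your project).
EXPECTED_FILES = (".env", "requirements.txt", "setup.py")


def missing_files(repo_dir, expected=EXPECTED_FILES):
    """Return the expected files that are absent from repo_dir."""
    repo = Path(repo_dir)
    return [name for name in expected if not (repo / name).is_file()]


if __name__ == "__main__":
    missing = missing_files(".")
    if missing:
        print("Missing before deployment:", ", ".join(missing))
    else:
        print("All expected files are present.")
```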