# CI, CD & GitHub Actions

> GitHub Actions allow us to automate various stages of software development for projects hosted on GitHub

Key characteristics of GitHub Actions are:
- __event driven__ - if something happens inside our GitHub repository (e.g. a pull request is opened) we can run an automated piece of code
- __configuration-oriented__ - we use `.yaml` files to configure the necessary __workflows__
- __customizable__ - we can create our own specific GitHub Action with `docker` or JavaScript
- __allows simple Continuous Integration (CI)__ - each Pull Request/Merge can be automatically tested
- __allows simple Continuous Deployment/Delivery (CD)__ - after our tests pass we can create a pipeline which automatically deploys our code for others to use

Let's start with the last two points:

## Continuous Integration

> __A way of integrating software changes from multiple contributors__

Usually this consists of:
- assessing code quality (via automated code assessment, reviews or both)
- assessing adherence to the project's best practices

Using GitHub Actions we can do all of the above and more in order to streamline code merging process (e.g. automatically label 


## Continuous Delivery

> __Making sure that our code is always ready to be deployed (e.g. bug free)__

This consists of:
- checking changes against integration tests (or a whole suite of them)
- testing every feature that's supposed to be merged
- fast landing of necessary fixes (users always have the best possible version of the product available)

## Continuous Deployment

> __Deploy your tested code automatically__

This is often confused with continuous delivery (and the two terms are sometimes used as synonyms). The usual distinction is that continuous delivery keeps the code releasable and leaves the release itself to a manual approval, while continuous deployment releases every change that passes the pipeline without a manual step.

__This approach may not be suitable for every project you create (and might be hard to set up fully), though it brings great benefits if done correctly__.

Amongst the things one "has to set up" could be:
- automated/easy rollbacks (when our tests fail to catch critical flaws)
- test optimization - run only necessary parts of tests in order to speed up/make the process more cost effective
- automated deployment of the changes into production (e.g. creating images, containers and deploying them on Kubernetes as a new version of software)

## GitHub Actions Concepts

Key concepts in GitHub Actions one should understand are:

- __Events__ - anything that happens inside the repository, like a new issue, a pull request, a merge etc. (__external events are allowed via [webhooks](https://docs.github.com/en/rest/reference/repos#create-a-repository-dispatch-event)__)
- __Workflow__ - an automated procedure added to the repository. Workflows consist of...
- __Jobs__ - sets of steps, where all steps of one job run on the same machine
- __Step__ - an individual task which can run an __action__ or shell commands on the machine
- __Action__ - a standalone, reusable command __defined in a separate GitHub repository or in the same one__
- __Runner__ - a server with the [GitHub Actions runner](https://github.com/actions/runner) installed.

![](./images/github-actions-design.png)

Things to keep in mind:
- [Full list of events](https://docs.github.com/en/actions/reference/events-that-trigger-workflows) triggering workflows
- __Jobs in a workflow run in parallel by default__
- __We can create dependencies between jobs__ (for example deployment relies on testing job passing correctly)
- __We can use any action set up publicly__ (for example: [create LaTeX documents](https://github.com/marketplace/actions/github-action-for-latex) with actions)
- __GitHub-hosted runners are provided for free__ within the limits of each tier (see the limits [here](https://docs.github.com/en/actions/reference/usage-limits-billing-and-administration))
- We can also set up our own runners

## Structure

In order to use GitHub Actions we need the following:
- appropriate folder in our repository containing workflows (`/.github/workflows`)
- workflow `.yaml` files

__Let's dissect an example:__

```yaml
name: learn-github-actions
on: [push]
jobs:
  check-bats-version:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v2
      - uses: actions/setup-node@v1
      - run: npm install -g bats
      - run: bats -v

```

- `name` - specifies name of the workflow
- `on` - event on which it will run
- `jobs` - list of jobs which will run (in parallel unless specified otherwise)
- `check-bats-version` - our first job:
    - `runs-on` - which runs on latest Ubuntu worker...
    - ... and contains the following `steps`:
        - `uses` which checks out our repository
        - `uses` `node` (so we can run `npm` commands)
        - `run`s JS installation of bats package
        - `run`s the command checking `bats` version
        
Visually it would look something like this:

![](images/github-actions-example-workflow-diagram.png)

## Possible fields

> There are a lot of possible field configurations, for details [check here](https://docs.github.com/en/actions/reference/workflow-syntax-for-github-actions)

Let's go over some of the most useful ones:

### on

> One can specify multiple events (and their details) to run the workflow on 

Let's see the following `.yaml` part:

```yaml

on:
  # Trigger the workflow on push or pull request,
  # but only for the main branch
  push:
    branches:
      - main
  pull_request:
    branches:
      - main
  # Also trigger on page_build, as well as on published, created and edited releases
  page_build:
  release:
    types: [published, created, edited]
  schedule:
    # * is a special character in YAML so you have to quote this string
    - cron:  '30 5,17 * * *'


```

A few possibilities to run the workflow on predefined events:

- On `push`/`pull_request` for specific branches and/or tags (see [here](https://docs.github.com/en/actions/reference/workflow-syntax-for-github-actions#onpushpull_requestbranchestags)), selected with glob patterns for inclusion/exclusion
- On `push`/`pull_request` that changes specific files/directories (see [here](https://docs.github.com/en/actions/reference/workflow-syntax-for-github-actions#onpushpull_requestpaths))
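
As a rough sketch of both filters, the fragment below restricts `push` by branch and tag and `pull_request` by changed paths. The branch names, tag pattern and paths are made up for illustration and are not taken from any real project.

```yaml
on:
  push:
    # Run for main and any release/* branch, but skip release/*-alpha branches
    branches:
      - main
      - 'release/**'
      - '!release/**-alpha'
    # Also run for version tags such as v1 or v2.3
    tags:
      - 'v*'
  pull_request:
    # Run only when a pull request changes Python files or the docs directory
    paths:
      - '**.py'
      - 'docs/**'
```

Patterns starting with `!` have to be quoted, since `!` is a special character in YAML.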

### env

> Allows us to set environment variables for the workflow

Availability can be set at different levels:
- all jobs
- single job
- single step

You can see examples below:

```yaml
env:
  SERVER: production
jobs:
  job1:
    env:
      FIRST_NAME: Mona
    steps:
      - name: My first action
        env:
          GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
          FIRST_NAME: Mona
          LAST_NAME: Octocat
          
```
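
To see the three levels side by side, here is a minimal complete workflow, assuming a hypothetical workflow name (`env-scopes`) and job name (`show-env`). The second step is there only to show that a step-level variable is not visible in other steps.

```yaml
name: env-scopes
on: [push]
env:
  # Workflow level: visible to every job and step
  SERVER: production
jobs:
  show-env:
    runs-on: ubuntu-latest
    env:
      # Job level: visible to every step of this job
      FIRST_NAME: Mona
    steps:
      - name: Print variables from all three levels
        env:
          # Step level: visible to this step only
          LAST_NAME: Octocat
        run: echo "$FIRST_NAME $LAST_NAME deploys to $SERVER"
      - name: Step-level variable is not defined here
        run: echo "LAST_NAME is '${LAST_NAME:-unset}'"
```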

## Jobs dependencies

> GitHub Actions allow us to specify which jobs have to finish before the current one runs

This approach allows us to run selected jobs sequentially instead of in parallel. For example:
- Test our application with multiple Python versions (using matrix)
- If all of the tests finish successfully, move to the deployment stage

By default, all required jobs need to finish successfully before the one requiring them runs, e.g.

```yaml
jobs:
  job1:
  job2:
    needs: job1
  job3:
    needs: [job1, job2]
```

If you only need the required jobs to finish (no matter the result), you could do:

```yaml
jobs:
  job1:
  job2:
    needs: job1
  job3:
    if: always()
    needs: [job1, job2]
```
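
The test-then-deploy scenario from the start of this section could look roughly like the sketch below. The job names, Python versions, action versions (`@v4`, `@v5`), the test command and the `./scripts/deploy.sh` script are all assumptions made for illustration.

```yaml
name: test-then-deploy
on: [push]
jobs:
  test:
    runs-on: ubuntu-latest
    strategy:
      matrix:
        # Quoted so that YAML does not read 3.10 as the number 3.1
        python: ['3.9', '3.10', '3.11']
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        with:
          python-version: ${{ matrix.python }}
      # Hypothetical test command; replace with the project's own
      - run: python -m unittest discover
  deploy:
    # Starts only after every job generated by the test matrix has succeeded
    needs: test
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      # Hypothetical deployment script
      - run: ./scripts/deploy.sh
```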

## Strategy & Matrix

> We can specify a __strategy__ which defines a matrix of jobs to run

Using this approach we can easily create many similar jobs using __variable substitution__.

A few traits:
- A maximum of `256` jobs per workflow run can be generated this way
- Ordering matters; the first option you define will be the first job

Example of matrix containing `6` jobs:

```yaml
runs-on: ${{ matrix.os }}
strategy:
  matrix:
    os: [ubuntu-18.04, ubuntu-20.04]
    node: [10, 12, 14]
steps:
  - uses: actions/setup-node@v2
    with:
      node-version: ${{ matrix.node }}
```

We can also extend existing combinations or add new ones (say a specific mix of OS and `node` which is outside of the matrix) using `include`, or remove combinations with `exclude`:

```yaml
name: Node.js CI
on: [push]
jobs:
  build:
    runs-on: ${{ matrix.os }}
    strategy:
      matrix:
        os: [macos-latest, windows-latest, ubuntu-18.04]
        node: [8, 10, 12, 14]
        include:
          # includes a new variable of npm with a value of 6
          # for the matrix leg matching the os and version
          - os: windows-latest
            node: 8
            npm: 6
        exclude:
          # excludes node 8 on macOS
          - os: macos-latest
            node: 8
```

There are also a few important fields other than `matrix` we can set to customize the behaviour:
- `strategy.fail-fast`: either `true` or `false`; if `true`, all in-progress jobs of the matrix are cancelled as soon as one of them fails (default: `true`)
- `strategy.max-parallel`: maximum number of matrix jobs running in parallel. By default, as many as possible (based on runner availability)
- Continuing if one job fails, see below:

```yaml
runs-on: ${{ matrix.os }}
continue-on-error: ${{ matrix.experimental }}
strategy:
  fail-fast: false
  matrix:
    node: [13, 14]
    os: [macos-latest, ubuntu-18.04]
    experimental: [false]
    include:
      - node: 15
        os: ubuntu-18.04
        experimental: true
```
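
The `fail-fast` and `max-parallel` options listed above can be combined in one `strategy`. The following is only a sketch: the job name, OS list, Node versions and the `@v4` action version are assumptions.

```yaml
jobs:
  test:
    runs-on: ${{ matrix.os }}
    strategy:
      # Let the remaining matrix jobs finish even if one of them fails
      fail-fast: false
      # Run at most two matrix jobs at the same time
      max-parallel: 2
      matrix:
        os: [ubuntu-latest, windows-latest]
        node: [18, 20]
    steps:
      - uses: actions/setup-node@v4
        with:
          node-version: ${{ matrix.node }}
      - run: node --version
```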

## Step specific arguments

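> A step is the essential unit defining what the workflow does, hence it offers a few additional fields

Each step:
- Can (but does not have to) run an action (all actions run as steps though)
- Has access to the system defined in the `runs-on` clause
- Runs in a separate process, so changes to environment variables made in one step are not preserved in the next one
- Can be followed by further steps: there is __no limit on the number of steps__ in a job (within the workflow usage limits)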
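
Because every step runs in its own process, a variable exported in one step's shell is not defined in the following step. A small sketch (job name, step names and variable name are hypothetical):

```yaml
jobs:
  separate-processes:
    runs-on: ubuntu-latest
    steps:
      - name: Export a variable in this step's shell only
        run: |
          export GREETING=hello
          echo "Inside the first step: ${GREETING}"
      - name: The variable is not defined in a new process
        run: |
          echo "Inside the second step: ${GREETING:-not set}"
```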

We have seen a few examples already, but another one should not hurt anyone:

```yaml
name: Greeting from Mona

on: [push, pull_request]

jobs:
  my-job:
    name: My Job
    runs-on: ubuntu-latest
    steps:
      - name: Print a greeting
        env:
          MY_VAR: Hi there! My name is
          FIRST_NAME: Mona
          MIDDLE_NAME: The
          LAST_NAME: Octocat
        run: |
          echo $MY_VAR $FIRST_NAME $MIDDLE_NAME $LAST_NAME.
      - name: My first step
        if: ${{ github.event_name == 'pull_request' && github.event.action == 'unassigned' }}
        run: echo "This event is a pull request that had an assignee removed."
```

### uses & with

> Selects an action to run as part of a step in your job (action is a standalone piece of code doing specific work, e.g. sets up Python with specific version)

Tips for usage:
- specify the version explicitly (similar rationale to Docker's tag); a full commit SHA is the most reproducible choice
- specify only the major version if you want to receive fixes automatically

For example:

```yaml
steps:    
  # Reference a specific commit
  - uses: actions/setup-node@c46424eee26de4078d34105d3de3cc4992202b1e
  # Reference the major version of a release
  - uses: actions/setup-node@v1
  # Reference a minor version of a release
  - uses: actions/setup-node@v1.2
  # Reference a branch
  - uses: actions/setup-node@main
```

Check [here](https://docs.github.com/en/actions/reference/workflow-syntax-for-github-actions#example-using-a-public-action) to see more possibilities

Another important part of the `uses` field is that it often comes with a sibling `with` field, which allows us to specify arguments for the action.

The workflow below checks out a private repository. __Please notice the arguments__:

```yaml
jobs:
  my_first_job:
    steps:
      - name: Check out repository
        uses: actions/checkout@v2
        with:
          repository: octocat/my-private-repo
          ref: v1.0
          token: ${{ secrets.PERSONAL_ACCESS_TOKEN }}
          path: ./.github/actions/my-private-repo
      - name: Run my action
        uses: ./.github/actions/my-private-repo/my-action
```

Another simple example could be:

```yaml
jobs:
  my_first_job:
    steps:
      - name: My first step
        uses: actions/hello_world@main
        with:
          first_name: Mona
          middle_name: The
          last_name: Octocat 
```

> __Please refer to each action's documentation to see its mandatory/optional arguments!__

### run

Another option is to use `run` instead of `uses`:

> __The `run` field specifies a shell command to run on the runner's operating system__

Simply pass a shell command into the field (and optionally specify the working directory relative to the repository's root):

```yaml
steps:
  - name: Clean temp directory
    run: rm -rf *
    working-directory: ./temp
```

You could also specify a different shell to run the command (`bash` is the default on Linux and macOS runners, PowerShell on Windows runners), e.g.:

```yaml
steps:
  - name: Display the path
    run: echo $PATH
    shell: bash
```

> __Python is available as an optional shell, hence you can directly run Python commands!__
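
A minimal sketch of a step using the Python shell; the step name is arbitrary, and the body of `run` is treated as a Python script rather than a shell command:

```yaml
steps:
  - name: Run inline Python
    shell: python
    run: |
      import platform
      print(f"Running on Python {platform.python_version()}")
```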

# Reading exercise


Now you will go through an example `test + deployment` workflow line by line:
- __section `name` to `jobs`__
- __`jobs.tests` - `jobs.tests.steps`__
- __`jobs.tests.steps` - `jobs.docker`__
- __`jobs.docker` - `jobs.docker.steps`__
- __`jobs.docker.steps` - `jobs.pip`__
- __`jobs.pip` - until the end__


```yaml
---
name: update
on:
  push:

jobs:
  tests:
    name: ${{ matrix.os }}-py${{ matrix.python }}-torch${{ matrix.pytorch }}
    runs-on: ${{ matrix.os }}
    strategy:
      matrix:
        pytorch:
          - "v1.6.0"
          - "latest"
        python:
          - 3.6
          - 3.7
          - 3.8
        os:
          - ubuntu-latest
          - ubuntu-16.04
    steps:
      - uses: actions/checkout@v1
      - name: Set up Python ${{ matrix.python }}
        uses: actions/setup-python@v1
        with:
          python-version: ${{ matrix.python }}
      - name: Update torchlambda version
        run: ./scripts/release/update_version.sh
      - name: Install dependencies
        run: ./scripts/ci/dependencies.sh
      - name: Build docker image locally
        run: ./scripts/ci/build.sh ${{ matrix.pytorch }}
      - name: Perform tests
        run: ./tests/settings/automated/run.sh ${{ matrix.pytorch }}
      - name: Upload test results
        uses: actions/upload-artifact@v1
        with:
          name: ${{ matrix.os }}-py${{ matrix.python }}-torch${{ matrix.pytorch }}.npz
          path: analysis.npz

  docker:
    needs: tests
    name: Deployment image szymonmaszke/torchlambda:${{ matrix.pytorch}}
    runs-on: ubuntu-latest
    strategy:
      matrix:
        pytorch:
          - "v1.6.0"
          - "latest"
    steps:
      - uses: actions/checkout@v1
      - name: Set up Python
        uses: actions/setup-python@v1
        with:
          python-version: 3.7
      - name: Update torchlambda version
        run: ./scripts/release/update_version.sh
      - name: Install dependencies
        run: ./scripts/ci/dependencies.sh
      - name: Build docker image locally
        run: ./scripts/ci/build.sh ${{ matrix.pytorch }}
      - name: Login to Docker
        run: >
          docker login
          -u ${{ secrets.DOCKER_USERNAME }}
          -p ${{ secrets.DOCKER_PASSWORD }}
      - name: Deploy image szymonmaszke/torchlambda:${{ matrix.pytorch }}
        run: >
          docker push szymonmaszke/torchlambda:${{ matrix.pytorch }}

  pip:
    needs: tests
    name: Create and publish package to PyPI with current timestamp
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@master
      - name: Update torchlambda version
        run: ./scripts/release/update_version.sh
      - name: Set up Python
        uses: actions/setup-python@v1
        with:
          python-version: "3.7"
      - name: Install dependencies
        run: |
          python -m pip install --upgrade pip
          pip install setuptools wheel
      - name: Build package
        run: python setup.py sdist bdist_wheel
      - name: Publish package to PyPI
        uses: pypa/gh-action-pypi-publish@master
        with:
          password: ${{ secrets.PYPI_PASSWORD }}
```

## Contexts and expressions

> Contexts are a set of defined (__or predefined__) variables one can use within the workflow

We can access them using __expression syntax__ and __index__ access, for example:

```yaml
name: CI
on: push
jobs:
  prod-check:
    # Expression syntax, github is context, ref is the key
    # We could also use github.ref instead
    if: ${{ github['ref'] == 'refs/heads/main' }}
    runs-on: ubuntu-latest
    steps:
      - run: echo "Deploying to production server on branch $GITHUB_REF"

```

There are a few contexts one should know, namely:
- `github` - [docs](https://docs.github.com/en/actions/reference/context-and-expression-syntax-for-github-actions#github-context); github related info (and info about current run)
- `env` - [docs](https://docs.github.com/en/actions/reference/context-and-expression-syntax-for-github-actions#env-context); environment variables created within the workflow using `env`
- `job` - [docs](https://docs.github.com/en/actions/reference/context-and-expression-syntax-for-github-actions#job-context); variables about current job
- `steps` - [docs](https://docs.github.com/en/actions/reference/context-and-expression-syntax-for-github-actions#steps-context); variables about step(s)
- `secrets` - [docs](https://docs.github.com/en/actions/reference/encrypted-secrets); __allow us to specify secrets like password inside our repository__

Let's focus on the last one. We can specify secrets inside the repository's `settings`, as shown on the screenshot below:

![](images/github_workflow_secrets_setting_up.png)

- __The ones we specify for `actions` will be available to our workflows through the `secrets` context__
- Repository-wide secrets are usable via `secrets`; they are stored encrypted and masked in logs, but every step we pass them to can read their value

__Example usage:__

```yaml
steps:
  - name: Hello world action
    with: # Set the secret as an input
      super_secret: ${{ secrets.SuperSecret }}
    env: # Or as an environment variable
      super_secret: ${{ secrets.SuperSecret }}
```

> __Most of the data required for custom actions and/or our own scripts are already provided by GitHub (e.g. branch name, name of the event, name of the runner etc.)!__


### Literals

> When using expressions we can compare against a set of literal values

Let's see those values below:


```yaml
env:
  myNull: ${{ null }}
  myBoolean: ${{ false }}
  myIntegerNumber: ${{ 711 }}
  myFloatNumber: ${{ -9.2 }}
  myHexNumber: ${{ 0xff }}
  myExponentialNumber: ${{ -2.99e-2 }}
  myString: ${{ 'Mona the Octocat' }}
  myEscapedString: ${{ 'It''s open source!' }}
```

Operators are also pretty standard:

| Operator  | Description |
| ------------- | ------------- |
| ()| Logical grouping |
| []| Index |
| .| Property dereference |
| !| Not |
| <| Less than |
| <=| Less than or equal |
| >| Greater than |
| >=| Greater than or equal |
| ==| Equal |
| !=| Not equal |
| &&| And |
| \|\| | Or |

__Keep in mind that:__
- GitHub casts values to numbers if their types do not match
- __String comparison is case insensitive!__
- __Objects and arrays are equal only if they are the same instance, not when they hold the same data!__
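
The operators from the table can be combined inside an `if` conditional. The sketch below uses a hypothetical job name and step names; the second condition compares `github.run_attempt` with a number, which relies on the type casting mentioned above.

```yaml
jobs:
  operators:
    runs-on: ubuntu-latest
    steps:
      - name: Runs for pushes to main or for any pull request
        if: ${{ (github.event_name == 'push' && github.ref == 'refs/heads/main') || github.event_name == 'pull_request' }}
        run: echo "Condition matched"
      - name: Runs when this is not the first attempt of the run
        if: ${{ github.run_attempt > 1 }}
        run: echo "This is a re-run"
```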

## Functions

Besides predefined environment variables, settable context and other goodies, GitHub Actions give us a few functions to use within our `.yaml`.

For a full list see the [reference documentation](https://docs.github.com/en/actions/reference/context-and-expression-syntax-for-github-actions#functions); here is a useful chosen set:

- `contains(iterable, item)` - `true` if `iterable` (an array or a string) contains `item`, e.g. `contains(github.event.issue.labels.*.name, 'bug')`
- `{starts, ends}With(string, string)` - e.g. `startsWith('Hello world', 'He')`
- `format(string, r1, r2, ..., rN)` - like in Python, for example: `format('Current reference on branch: {0}', github['ref'])` (string literals inside expressions must use single quotes)


> __Please notice `github.event.issue.labels.*.name`: `*` matches every element of `labels`, so the `name` of each label is returned, for example `[bug, first-issue, fix]` etc.__
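
As an illustration of `contains` and `format` inside a workflow, the sketch below reacts to an issue being labelled. The workflow name, job name and message are hypothetical.

```yaml
name: bug-triage
on:
  issues:
    types: [labeled]
jobs:
  triage:
    runs-on: ubuntu-latest
    # contains() with the * filter: true if any label of the issue is named 'bug'
    if: ${{ contains(github.event.issue.labels.*.name, 'bug') }}
    steps:
      - name: Report the labelled issue
        run: echo "${{ format('Issue {0} carries the bug label', github.event.issue.number) }}"
```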

`{to, from}JSON(value)` - `toJSON` returns a pretty-printed JSON representation of `value` (for example `toJSON(job)` might return `{ "status": "Success" }`), while `fromJSON` parses a JSON string back into an object.

We can use those functions in order to pass data between jobs:

```yaml
name: build
on: push
jobs:
  job1:
    runs-on: ubuntu-latest
    outputs:
      matrix: ${{ steps.set-matrix.outputs.matrix }}
    steps:
      - id: set-matrix
        run: echo "matrix={\"include\":[{\"project\":\"foo\",\"config\":\"Debug\"},{\"project\":\"bar\",\"config\":\"Release\"}]}" >> "$GITHUB_OUTPUT"
  job2:
    needs: job1
    runs-on: ubuntu-latest
    strategy:
      matrix: ${{fromJSON(needs.job1.outputs.matrix)}}
    steps:
      - run: build
```


### Status check functions

> Allow us to check the status of previous steps (or required jobs) and branch on it with the `if` conditional

Let's see the most common status check function `success()`:

```yaml
steps:
  ...
  - name: The job has succeeded
    if: ${{ success() }}
```

Others include:
- `failure()` (`true` when any previous step of the job failed)
- `cancelled()` (`true` when the workflow was cancelled)
- `always()` (is always `true`, no matter previous statuses)
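
A short sketch of these functions at step level; the job name, the `@v4` action version and the `./scripts/build.sh` script are assumptions made for illustration:

```yaml
jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      # Hypothetical build script
      - name: Build
        run: ./scripts/build.sh
      - name: Runs only if an earlier step failed
        if: ${{ failure() }}
        run: echo "Build failed"
      - name: Runs regardless of earlier results
        if: ${{ always() }}
        run: echo "Cleaning up"
```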

## Guides

GitHub provides a lot of guides for common use cases.

You can find them [here](https://docs.github.com/en/actions/guides); they cover, among other things:
- How to build your Python code and test it
- How to deploy to Kubernetes
- How to upload your packages to PyPI
- Managing opened issues
- Managing created pull requests

# Exercise

Create a new GitHub repository (preferably private, let's call it `test` or something similar) and:
- Add all of the people in the breakout room as collaborators
- Create workflow called `prs.yaml` in appropriate directory, where:
    - if `label` of issue is set as `prod` assign yourself
    - if `label` is `research` assign another person in the group
    - Come up with as many `labels` as people in the group (or you can split into groups)
    - __Check out [this action](https://github.com/marketplace/actions/auto-assign-action) which will help you__
- Create workflow called `issues.yaml`, __check out [this action](https://github.com/marketplace/actions/auto-comment)__, which:
    - When an issue is opened add automated comment: "Thanks for submitting this issue, we will investigate the matter ASAP"
    - Use previous action to assign a single person to all of the issues when created
    - When an issue is closed add a comment which states "Your issue should now be resolved. If that is not the case, please re-open it"

__Check the `workflow`s you have set up and verify that they work correctly__

## Challenges

### Assessment

- Check [default environment variables defined by GitHub](https://docs.github.com/en/actions/reference/environment-variables#default-environment-variables)
- How to specify different entrypoints and arguments to `docker` action? Read about it [here](https://docs.github.com/en/actions/reference/workflow-syntax-for-github-actions#jobsjob_idstepswithargs)
- What are `GitHub Actions` and how to create one using `docker` images? Check [here](https://docs.github.com/en/actions/creating-actions/creating-a-docker-container-action)

### Non-assessment

- Check how to set permissions for specific workflows [here](https://docs.github.com/en/actions/reference/workflow-syntax-for-github-actions#permissions)
- Check what the [concurrency field](https://docs.github.com/en/actions/reference/workflow-syntax-for-github-actions#concurrency) does