*Creating a Jupyter Book with The Turing Way*

# Module 6: Continuous Integration and Deployment - An Introduction

**Learning Objectives:**

- Explain what Continuous Integration (CI) and Continuous Deployment (CD) are and how they are useful for reproducible workflows
- Explain how we can use CI and CD when publishing a Jupyter Book
- Introduce GitHub Actions and discuss why we use it in _The Turing Way_
- Guide learners through setting up a GitHub Action on their repository

📹 [VIDEO](TBA)
---

## Continuous Integration (CI)

Continuous Integration (CI) is the process of automating the integration of code changes from multiple contributors into a single software project.
This process often comprises a range of automated tools that verify the new code's correctness before integration.
A version control system is a crucial element of CI processes and is often supplemented with other checks, such as code quality checks and syntax style reviews.

### CI for Jupyter Book

When building a Jupyter Book, we may use CI processes for tasks such as running spellchecks, checking for broken links, and verifying that code cells are bug-free and don't hang.
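
As a rough sketch of the "code cells are bug-free and don't hang" check, the execution settings in a book's `_config.yml` can be tightened so that problems surface during the build. This assumes a Sphinx-based Jupyter Book (versions before 2.0); the timeout value is an arbitrary example, not a recommendation.

```yaml
# _config.yml (execution settings only; the rest of the config is omitted)
execute:
  execute_notebooks: force   # re-run every notebook on each build instead of using stored outputs
  timeout: 120               # maximum time in seconds that each cell may run (example value)
  allow_errors: false        # stop executing a notebook as soon as a cell raises an error
```

Execution failures are reported as build warnings, so a CI job would typically build with `jupyter-book build . --warningiserror` to turn those warnings into a failed run.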

## Continuous Deployment (CD)

Continuous Deployment (CD) is a software release process that uses automated testing to validate whether changes to a code base are correct and stable before immediately deploying them to a production environment.
This is beneficial as bug fixes and new features can often be in the hands of users as soon as they are pushed.

### CD for Jupyter Book

A CD process for Jupyter Book might include a deployment preview so that we can automatically check what the rendered book will look like with the added content before releasing it.

## CI/CD Vendors

There are many different platforms that provide CI/CD services, such as [Travis CI](https://travis-ci.com/), [CircleCI](https://circleci.com/), and [Azure Pipelines](https://docs.microsoft.com/en-us/azure/devops/pipelines/).
These services can be thought of as "someone else's computer" where the testing and deployment phases of your software release process can be executed.
The commands you would usually run on the command line to test and build your code can be put into a script that the CI vendor will automatically run on a given trigger, for example, when new code is pushed.

## GitHub Actions

GitHub Actions is a CI/CD service provided by GitHub.
For the purposes of this tutorial, we focus on GitHub Actions, but everything we do here can also be achieved with other CI vendors.

### GitHub Actions: Example from _The Turing Way_ repository

We use GitHub Actions for CI in _The Turing Way_ for the following reasons:

- Results of CI runs are fully integrated with the GitHub repository where the code is kept
- Actions can be triggered by a wide range of events, such as issue comments or label assignments
- Actions can do much more than test and build your code, such as commenting on issues and assigning labels!

_**Disclaimer:** These are the opinions of the author of this tutorial; this is not an advertisement for GitHub Actions!_
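
To illustrate the event triggers mentioned in the list above, here is a minimal sketch of a workflow that reacts to issue comments and label assignments. The workflow name, job name, and the single `echo` step are hypothetical placeholders; only the event names and the `github.event_name` context come from GitHub Actions itself.

```yaml
# Hypothetical workflow, for illustration only
name: report-trigger

on:
  # A comment is added to an issue or pull request
  issue_comment:
    types: [created]
  # A label is added to an issue
  issues:
    types: [labeled]
  # A label is added to a pull request
  pull_request:
    types: [labeled]

jobs:
  report:
    runs-on: ubuntu-latest
    steps:
      # Print the name of the event that started this run
      - name: Show what triggered the run
        run: echo "Triggered by ${{ github.event_name }}"
```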

### Some GitHub Actions related vocabulary

- **GitHub Actions:**
  This is the name of the product/service developed by GitHub.
- **Workflows:**
  This is the task (or tasks) you want to run automatically when triggered.
  A workflow is defined in a YAML file and contains all the steps that need to be run along with other information, such as the OS to run the job on and any dependencies that need installing.
  Another advantage of GitHub Actions over other CI vendors is that you can define as many workflow files as your project requires.
  Instead of one file that does lots of things, you can have a workflow file per task with its own specific triggers.
- **Action:** An "action" is some code for a particular step that has been packaged up in such a way that it can be imported into your workflow.
  An example of an action is ["Build and Push" from Docker](https://github.com/marketplace/actions/build-and-push-docker-images).
  If you need to build a Docker image and push it to a registry during deployment, you can import this action to manage that process for you instead of installing Docker and executing the build and push commands separately.
  Actions are available on [GitHub Marketplace](https://github.com/marketplace?type=actions) and are provided by official sources and third-party developers.
  Creating your own action is also a possibility if you can't find one to suit your needs.
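
The vocabulary above can be mapped onto a small annotated workflow. This is a sketch under assumptions: the file name, the weekly schedule, the action versions (`@v4`, `@v5`), and the Python version are illustrative choices, and the `--builder linkcheck` option assumes a Sphinx-based Jupyter Book (versions before 2.0).

```yaml
# Hypothetical file: .github/workflows/check-links.yml
# The whole file is one *workflow*: a single task with its own triggers.
name: check-links

on:
  schedule:
    - cron: "0 6 * * 1"    # every Monday at 06:00 UTC
  workflow_dispatch:        # also allow manual runs from the Actions tab

jobs:
  linkcheck:
    runs-on: ubuntu-latest  # the OS the job runs on
    steps:
      # `uses` imports a packaged *action* instead of writing the commands yourself
      - name: Checkout repo
        uses: actions/checkout@v4

      - name: Set up Python
        uses: actions/setup-python@v5
        with:
          python-version: "3.11"

      # `run` executes ordinary shell commands, here installing dependencies
      - name: Install dependencies
        run: pip install -r requirements.txt

      - name: Check links
        run: jupyter-book build . --builder linkcheck
```

Because this task lives in its own file, its schedule does not affect when any other workflow in the repository runs.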

## Hands-on session on using GitHub Actions

For this session, we will work on a GitHub repository.

One option is to create a new repository by pushing all the files you created in this tutorial to GitHub (follow the instructions [described here](https://docs.github.com/en/github/importing-your-projects-to-github/adding-an-existing-project-to-github-using-the-command-line)).

Another option is to fork the GitHub repository of this tutorial by visiting this link: https://github.com/malvikasharan/jupyterbook-with-turing-way (follow the instructions to [fork a repo](https://docs.github.com/en/github/getting-started-with-github/fork-a-repo)).

### Create a GitHub Action: CD

1. Go to your GitHub repository
2. Create a new branch called `add-action`
3. On this new branch, create a new file with the filepath `.github/workflows/deploy-book.yml` (Double check this, the folder names must be exact!)
4. Copy the following content into the new file

    ```yaml
    # Name your workflow so you can tell them apart!
    name: deploy-book

    # The workflow will only run when the master branch changes
    on:
      push:
        branches:
          - master

    jobs:
      # This job installs dependencies, builds the book, and pushes it to `gh-pages`
      deploy-book:
        runs-on: ubuntu-latest  # We request a Linux machine to run the job
        steps:
          # We need to check out the repository. The step uses an Action written
          # by the GitHub Actions team
          - name: Checkout repo
            uses: actions/checkout@v2

          # Install Python using another Action
          - name: Set up Python 3.7
            uses: actions/setup-python@v1
            with:
              python-version: 3.7

          # Install the book dependencies
          - name: Install dependencies
            run: |
              pip install -r requirements.txt

          # Build the book
          - name: Build the book
            run: |
              jupyter-book build .

          # Push the book's HTML to github-pages
          - name: GitHub Pages action
            uses: peaceiris/actions-gh-pages@v3.6.1
            with:
              github_token: ${{ secrets.GITHUB_TOKEN }}  # This token is automatically generated, don't worry about adding it
              publish_dir: ./_build/html
    ```

5. Commit the file to your branch (Remember to write a useful commit message!)
6. Open a pull request to your `master` branch

The pull request should not have any feedback from the GitHub Action because we have not merged to `master` yet and so our workflow hasn't been triggered.
This means that our workflow so far only covers _Continuous Deployment_.

### Create a GitHub Action: CI

As our `master` branch gets updated, so does our book hosted by GitHub Pages.
To implement _Continuous Integration_, let's add another job to the workflow that will test for broken links.

1. Edit the `deploy-book.yml` file on the `add-action` branch
2. Add the following extra lines to your file.

    Edit the `on` section to look like this:

    ```yaml
    on:
      push:
        branches:
          - master
      pull_request:
        branches:
          - master
    ```

    The workflow will now be triggered when `master` is updated or when a Pull Request against `master` is opened or updated.

    Now add the following step **after** the "Build the book" step, and **before** the GitHub Pages step.

    ```yaml
    - name: Run html proofer
      uses: chabad360/htmlproofer@master
      with:
        directory: "./_build/html"
        arguments: --assume-extension --disable-external --only_4xx
    ```

    This new step will run `html-proofer` to check that the links within the book work before publishing.

    Lastly, add this `if` statement to the GitHub Pages step like so:
    ```yaml
    # Push the book's HTML to github-pages
    - name: GitHub Pages action
      if: github.event_name == 'push' && github.ref == 'refs/heads/master'
      uses: peaceiris/actions-gh-pages@v3.6.1
      with:
        github_token: ${{ secrets.GITHUB_TOKEN }}  # This token is automatically generated, don't worry about adding it
        publish_dir: ./_build/html
    ```

    This condition ensures that the book is only published when `master` is updated, and not when a Pull Request is opened or updated.
3. Commit these changes to your branch

Back in your Pull Request, you should now see that the workflow has been triggered.
If everything goes well, your test should pass.
Click on "details" next to the action name to see the breakdown of the steps.
You should also see that the last step was skipped. That's good!

You can now merge this PR to `master` to update the book and ensure that the workflow runs on future PRs.
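
For reference, the snippets from this hands-on session can be assembled into a single `deploy-book.yml` along the following lines. This is a sketch rather than a verified file: the action and Python versions are the ones pinned in this tutorial and may need updating, and the `directory` value assumes the book is built at the repository root so that the HTML lands in `./_build/html`, the same folder as `publish_dir`.

```yaml
# .github/workflows/deploy-book.yml (assembled from the steps above)
name: deploy-book

# Run on pushes to master and on Pull Requests against master
on:
  push:
    branches:
      - master
  pull_request:
    branches:
      - master

jobs:
  deploy-book:
    runs-on: ubuntu-latest
    steps:
      - name: Checkout repo
        uses: actions/checkout@v2

      - name: Set up Python 3.7
        uses: actions/setup-python@v1
        with:
          python-version: 3.7

      - name: Install dependencies
        run: |
          pip install -r requirements.txt

      - name: Build the book
        run: |
          jupyter-book build .

      # CI: check links in the built HTML (path assumed to match publish_dir)
      - name: Run html proofer
        uses: chabad360/htmlproofer@master
        with:
          directory: "./_build/html"
          arguments: --assume-extension --disable-external --only_4xx

      # CD: publish only when master itself is updated
      - name: GitHub Pages action
        if: github.event_name == 'push' && github.ref == 'refs/heads/master'
        uses: peaceiris/actions-gh-pages@v3.6.1
        with:
          github_token: ${{ secrets.GITHUB_TOKEN }}
          publish_dir: ./_build/html
```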

🗝 Takeaway
---

- Continuous Integration (CI) is the process of automating the integration of code changes from multiple contributors into a software project.
- Continuous Deployment (CD) is a software release process that uses automated testing to validate whether changes to a code base are correct before they are deployed to production.
- GitHub Actions is a service provided by GitHub that allows users to define CI/CD workflows for their GitHub repository.
- We use CI in _The Turing Way_ GitHub repository to enable collaboration and quality control of the contributions to its Jupyter Book by multiple users.

**References:**

- "Automatically Host your book with GitHub Actions", Jupyter Book documentation, https://jupyterbook.org/publish/gh-pages.html#automatically-host-your-book-with-github-actions

👉 [Next Module](./7-final-demo.ipynb)
---