# Challenge - Setting up the continuous deployment

In the previous lab, we coded our package named wqp. In this section, we are going to add some unit tests to it, set up Circle CI to run those tests automatically, and deploy our package to a private Python package server.

### Writing unit tests

#### Setup

Check out a new branch:
```bash
git checkout master && git pull
git checkout -b feature/unit_tests
```

At the root of your project, create a file called test-requirements.txt. In this file, add the following lines:

```bash
-e .
pytest
```

Then create a new virtual environment for the tests:

```bash
virtualenv -p your_path_to_python venv_test
source venv_test/bin/activate
pip install -r test-requirements.txt
```

Add this line to your .gitignore file:

```bash
venv_test/*
```
You can commit those changes locally:

```bash
git add test-requirements.txt
git add .gitignore
git commit -m 'setup venv for tests'

```

You're now ready to implement the tests.


We are going to write a single unit test as an example. A production project should of course have many more, but one is enough here to show how tests are implemented.

We are going to write a test for the data access layer.

At the root of your project, create a folder named __tests__ and, inside it, a file called __test_data_access.py__.

Paste the following content into your file:
```python
from unittest import TestCase
import pandas as pd
import string
import random
from wqp.data_access import build_train_test_sets


class DataAccessTests(TestCase):

    def test_build_train_tests_sets(self):
        num_data = list(range(10))
        str_data = list(string.ascii_lowercase)[:10]
        label_data = [0]*5 + [1]*5

        for d in [num_data, str_data, label_data]:
            random.shuffle(d)

        label_col = 'label'
        df = pd.DataFrame.from_dict({
            'num_col': num_data,
            'str_col': str_data,
            label_col: label_data
        })

        train_size = 0.8
        train_test_sets = build_train_test_sets(data=df, label_col=label_col, train_size=train_size)
        
        # assertions to be added
```
Here, we create a dummy pandas DataFrame and test our build_train_test_sets method against it.

### Q1 - Test the build_train_test_sets method

Add code to the test_build_train_tests_sets function to test that:
- The keys of the dictionary train_test_sets are 'train' and 'test'
- The number of rows of the training dataframe is equal to train_size * the number of rows of the input.
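
As a rough illustration of what such assertions can look like, here is a minimal, self-contained sketch. It does not use the real wqp code: `split_rows` is a hypothetical stand-in that works on a plain list of rows, and it assumes (as the question suggests) that the result is a dictionary whose entries support `len()`. Your own assertions should call build_train_test_sets on the DataFrame instead.

```python
import random
import unittest


def split_rows(rows, train_size):
    """Hypothetical stand-in for build_train_test_sets (not the wqp code).

    Shuffles the rows and returns a dict with 'train' and 'test' entries.
    """
    shuffled = list(rows)
    random.shuffle(shuffled)
    n_train = int(train_size * len(shuffled))
    return {'train': shuffled[:n_train], 'test': shuffled[n_train:]}


class SplitTests(unittest.TestCase):

    def test_split_rows(self):
        rows = list(range(10))
        train_size = 0.8
        sets = split_rows(rows, train_size)

        # The result exposes exactly the two expected keys.
        self.assertEqual(set(sets.keys()), {'train', 'test'})
        # The training part holds train_size * number of input rows.
        self.assertEqual(len(sets['train']), int(train_size * len(rows)))
        # Nothing is lost or duplicated by the split.
        self.assertEqual(len(sets['train']) + len(sets['test']), len(rows))
```

The same `assertEqual` calls work on a pandas DataFrame, since `len(df)` returns its number of rows.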

### Q2 - Run the unit tests

From your venv_test virtual environment, use the __pytest__ command to run the unit tests for your project.

Now that we have a test correctly set up, we can work on the continuous delivery part. For this, we will use:
- [Circle CI](https://circleci.com/) to orchestrate the run of the unit tests and the deployment of the package
- [GemFury](https://gemfury.com/) to deploy our package

### Q3 - Setup your accounts

Create your accounts on CircleCI and GemFury (I suggest signing up with your personal GitHub account whenever possible), and spend 10 minutes reading the documentation of both platforms.

Now that you have a CircleCI account, we will set up CircleCI so that actions are triggered when we push to our GitHub repository.
How does Circle CI work? In a nutshell:
- You sign up with your GitHub account, which allows Circle CI to access some information about your repos.
- Circle CI looks for a file named .circleci/config.yml in your project (the .circleci directory sits at the root).
- This file describes the steps that Circle CI needs to run to execute your workflow.
- Under the hood, Circle CI uses Docker containers to execute your steps on a brand new machine. This is why we have to install things ourselves on those machines.
- Finally, Circle CI reports the status of the overall workflow to GitHub.

To set up your project in Circle CI, you should:

- Go to the CircleCI UI
- On the left side, click ADD PROJECTS
- Search for the name of your GitHub repository
- Click Set Up Project
- Click Start Building
- Click Add Manually
- Click (again) Start Building

<p align="center">
<img src="https://drive.google.com/uc?export=view&id=1F7ST01-D2PmF1cFVq-DkH3876Kgde-Si">
</p>

### Q4 - Define a workflow to run the unit tests in Circle CI

Let's look at the structure of the .circleci/config.yml file shown below.

On a new branch called feature/ci_unit_tests, do the following steps:
- create a new directory at the root of your project, called .circleci (do not forget the ".")
- create a new file .circleci/config.yml, with the contents below
- fill in the missing parts of the file (the bash command to install the test dependencies and the bash command to run the unit tests)
- commit your changes locally with a clear commit message
- push your changes to the remote repository

```yaml

version: 2.0
jobs:

  unittests:
    working_directory: ~/repo
    docker:
      - image: python:3.7
    steps:
      - checkout
      - run:
          name: Install test dependencies
          command:  # PUT YOUR BASH COMMAND HERE TO INSTALL THE TEST DEPENDENCIES
      - run:
          name: Run unit tests
          command: # PUT YOUR COMMAND HERE TO RUN THE UNIT TESTS


workflows:
  version: 2
  test-and-deploy:
    jobs:
      - unittests:
          filters:
            branches:
              ignore:
                - master
```

After that, when you push changes on any branch that is not "master" on your GitHub repository, you should see the CircleCI job triggered. You can monitor it in the "workflow" section of the web UI.

### Q5 - Building the python package

In this question, we will manually build our Python package. By "build", I mean creating a package file that can be sent to a server and made available via pip install. Here, we want a gzip'ed tar file.
For this, do some research about "creating a source distribution in python", and try it yourself.

NB: if you run the correct commands, this process should create the file dist/wqp-1.0.0.tar.gz

### Q5 - Building the python package (solution)

From the [doc](https://docs.python.org/3/distutils/sourcedist.html).

At the root of your project, run:

```bash
python setup.py sdist
```

Output: 

```
running sdist
running egg_info
writing wqp.egg-info/PKG-INFO
writing dependency_links to wqp.egg-info/dependency_links.txt
writing entry points to wqp.egg-info/entry_points.txt
writing requirements to wqp.egg-info/requires.txt
writing top-level names to wqp.egg-info/top_level.txt
reading manifest file 'wqp.egg-info/SOURCES.txt'
writing manifest file 'wqp.egg-info/SOURCES.txt'
running check
warning: check: missing required meta-data: url

warning: check: missing meta-data: if 'author' supplied, 'author_email' must be supplied too

creating wqp-1.0.0
creating wqp-1.0.0/tests
creating wqp-1.0.0/wqp
creating wqp-1.0.0/wqp.egg-info
copying files to wqp-1.0.0...
copying README.md -> wqp-1.0.0
copying setup.py -> wqp-1.0.0
copying tests/test_data_access.py -> wqp-1.0.0/tests
copying wqp/__init__.py -> wqp-1.0.0/wqp
copying wqp/cli.py -> wqp-1.0.0/wqp
copying wqp/data_access.py -> wqp-1.0.0/wqp
copying wqp/evaluation.py -> wqp-1.0.0/wqp
copying wqp/ml.py -> wqp-1.0.0/wqp
copying wqp/workflow.py -> wqp-1.0.0/wqp
copying wqp.egg-info/PKG-INFO -> wqp-1.0.0/wqp.egg-info
copying wqp.egg-info/SOURCES.txt -> wqp-1.0.0/wqp.egg-info
copying wqp.egg-info/dependency_links.txt -> wqp-1.0.0/wqp.egg-info
copying wqp.egg-info/entry_points.txt -> wqp-1.0.0/wqp.egg-info
copying wqp.egg-info/requires.txt -> wqp-1.0.0/wqp.egg-info
copying wqp.egg-info/top_level.txt -> wqp-1.0.0/wqp.egg-info
Writing wqp-1.0.0/setup.cfg
Creating tar archive
removing 'wqp-1.0.0' (and everything under it)
```
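
To double-check what ended up in the archive, you can list its contents. The sketch below uses only the standard library `tarfile` module; `list_sdist` is a helper name chosen for this example, not part of wqp or setuptools.

```python
import tarfile


def list_sdist(path):
    """Return the sorted member names of a gzip'ed tar archive."""
    with tarfile.open(path, 'r:gz') as archive:
        return sorted(archive.getnames())
```

After the build, calling `list_sdist('dist/wqp-1.0.0.tar.gz')` from the root of the project should return the files mentioned in the output above, all under the `wqp-1.0.0/` prefix.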

### Q6 - Prepare the deployment on GemFury

We will make the necessary setup on GemFury platform to enable package deployment.

- Go to the GemFury web UI and log in to your account
- You should end up at this URL: https://manage.fury.io/dashboard/YOUR_ACCOUNT/repos/-/intro
- On the left side, click "Upload"
- Click Python PyPI

Here, you can recognize the command line you ran in the previous step.
Notice that you need a __push token__ to enable deployment of your package (otherwise, anyone could deploy packages to your account).
- Follow the instructions to create the push token.
- Run the command indicated in the GemFury instructions to upload your package.

Once this is done, you should see your package in the GemFury UI!

<p align="center">
<img src="https://drive.google.com/uc?export=view&id=1SyeGvHR3icsKp0MXbfIkIPKWP4H2AJQj">
</p>

### Automatic deployment on CircleCI

Now that we've built locally, and manually deployed our package through a command line, we actually have everything we need to have it done automatically via CircleCI.

But there is one catch: we are using a private token in our command to deploy the package, and this token __should not be committed to GitHub__.

So how do we give it to Circle CI? ==> Through environment variables.

### Q7 - Create an environment variable in Circle CI to host the value of your fury push token

- Go to the Circle CI web UI
- On the left side, next to the name of your project (wine-quality-predictor), click the settings wheel
- Go to Environment Variables
- Click Add Variable
- Add a variable named FURY_PUSH_TOKEN, and set its value to your token.

Repeat this step to create a variable FURY_USER_NAME, with the name of your fury account (should be your GitHub username).
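
To make the idea concrete, here is a small sketch of how a script can read these two variables from the environment instead of hard-coding the token. `fury_push_url` is a hypothetical helper written for this example; the URL shape is the one used by the curl command later in this lab.

```python
import os


def fury_push_url(environ=os.environ):
    """Build the Gemfury push URL from environment variables.

    Raises a RuntimeError naming the missing variable, so that a
    misconfigured CI job fails with a clear message.
    """
    for name in ('FURY_PUSH_TOKEN', 'FURY_USER_NAME'):
        if not environ.get(name):
            raise RuntimeError('environment variable %s is not set' % name)
    return 'https://{token}@push.fury.io/{user}/'.format(
        token=environ['FURY_PUSH_TOKEN'], user=environ['FURY_USER_NAME'])
```

Because the token only lives in the environment of the CI job, it never appears in the repository. Avoid printing the returned URL in logs, since it contains the token.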

### Q8 - Wrap it up: automatic deployment on Circle CI

We now have:
- manual command lines to build and deploy our package
- CircleCI aware of our private fury push token

So we have everything we need to tell Circle CI to do the job for us. For that, we just have to modify our .circleci/config.yml.

Fill in the missing parts of the configuration file, then commit your changes and push them.

```yaml
version: 2.0
jobs:

  unittests:
    working_directory: ~/repo
    docker:
      - image: python:3.7
    steps:
      - checkout
      - run:
          name: Install dependencies
          command:  pip install --upgrade pip -r test-requirements.txt
      - run:
          name: Run unit tests
          command: export PYTHONPATH=$PYTHONPATH:$(pwd):$(pwd)/tests && pytest tests/* -p no:warnings

  deploy:
    working_directory: ~/repo
    docker:
      - image: python:3.7
    steps:
      - checkout
      - run:
          name: Install package
          command: pip install .
      - run:
          name: Get version
          command: |
            WQP_VERSION=$(python -c "import wqp; print(wqp.__version__)")
            echo export WQP_VERSION=$WQP_VERSION >> $BASH_ENV
            echo Deploying version $WQP_VERSION...
      - run:
          name: Build package
          command: |
            # FILL HERE THE COMMAND TO BUILD THE PYTHON PACKAGE
            echo export TAR_FILE="dist/wqp-$WQP_VERSION.tar.gz" >> $BASH_ENV
      - run:
          name: Deploy package
          command: |
            echo Pushing file $TAR_FILE to gemfury...
            # FILL HERE THE COMMAND TO UPLOAD THE PACKAGE TO GEMFURY


workflows:
  version: 2
  test-and-deploy:
    jobs:
      - unittests:
          filters:
            branches:
              ignore:
                - master
      - deploy:
          filters:
            branches:
              ignore:
                - master

```

### Q9 - Secure our Master branch

This is the last step of our process. Quick reminder of the git flow workflow, when developing a new feature:

- Create a new branch from the master branch
- Do the developments on this branch
- Commit your changes locally and push the branch to GitHub
- ==> this push should trigger the execution of the unit tests
- If the tests are successful, open a pull request against master
- Do a code review with another person
- Merge the pull request into master
- Deploy the new version of the package from the master branch

Hence, there are 3 things we need to do:

- Set up a branch protection rule on GitHub to make it impossible to merge a pull request into master if the unit tests are failing
- Run the unit tests on all pushed branches except master
- Deploy the package only from the master branch

For the first part: 
- go to your GitHub account, on the wine-quality-predictor repository.
- on the top tabs, click the settings wheel
- click Branches, on the left
- Next to Branch protection rules, click Add Rule
- Under Branch name pattern, add master
- Click:
    - require pull request review before merging
    - require status checks to pass before merging, and select ci/circleci:unittests

Then, update the "workflows" part of the circleci configuration file:

```yaml
version: 2.0
jobs:

  unittests:
    working_directory: ~/repo
    docker:
      - image: python:3.7
    steps:
      - checkout
      - run:
          name: Install dependencies
          command:  pip install --upgrade pip -r test-requirements.txt
      - run:
          name: Run unit tests
          command: export PYTHONPATH=$PYTHONPATH:$(pwd):$(pwd)/tests && pytest tests/* -p no:warnings

  deploy:
    working_directory: ~/repo
    docker:
      - image: python:3.7
    steps:
      - checkout
      - run:
          name: Install package
          command: pip install .
      - run:
          name: Get version
          command: |
            WQP_VERSION=$(python -c "import wqp; print(wqp.__version__)")
            echo export WQP_VERSION=$WQP_VERSION >> $BASH_ENV
            echo Deploying version $WQP_VERSION...
      - run:
          name: Build package
          command: |
            python setup.py sdist
            echo export TAR_FILE="dist/wqp-$WQP_VERSION.tar.gz" >> $BASH_ENV
      - run:
          name: Deploy package
          command: |
            echo Pushing file $TAR_FILE to gemfury...
            curl -F "package=@dist/wqp-$WQP_VERSION.tar.gz" "https://$FURY_PUSH_TOKEN@push.fury.io/$FURY_USER_NAME/"


workflows:
  version: 2
  test-and-deploy:
    jobs:
      - unittests:
          filters:
            branches:
              ignore:
                - master
      - deploy:
          filters:
            branches:
              only:  # changed from "ignore": deploy only from master
                - master

```
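
To see the effect of the two branch filters, here is a tiny Python model of them. It is only a sketch for reasoning about the configuration, not CircleCI's implementation: `job_runs` is a made-up helper, and it only handles exact branch names, whereas CircleCI also accepts regular expressions.

```python
def job_runs(branch, filters):
    """Tell whether a job runs for `branch` under simplified branch filters.

    `filters` mirrors the `branches` mapping of the config: it may hold an
    'only' list, an 'ignore' list, or neither. Exact names only.
    """
    only = filters.get('only')
    ignore = filters.get('ignore', [])
    if only is not None and branch not in only:
        return False
    return branch not in ignore


# Same filters as in the workflow above.
workflow = {
    'unittests': {'ignore': ['master']},
    'deploy': {'only': ['master']},
}

for branch in ('feature/ci_unit_tests', 'master'):
    jobs = [job for job, f in workflow.items() if job_runs(branch, f)]
    print(branch, '->', jobs)
```

With this configuration, a push to a feature branch only runs unittests, and a push (or merge) to master only runs deploy.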

### Q10 - Testing the whole chain.

That's it! Circle CI and our GitHub repo are now correctly set up for continuous delivery.

To test the automatic deployment, now you can:
- commit your changes to the current branch
- push your branch ==> it should trigger the execution of the unit tests via Circle CI
- open a pull request against master
- review the pull request together
- if you are ok with the changes, merge and close the PR ==> it should trigger the deployment job via Circle CI
- go on the Circle CI monitoring UI and follow the job execution

Once it has finished, try to create a virtual environment anywhere on your computer, activate it, install the package deployed on GemFury, and execute the training code via the wqp command line!
