Ensuring reproducibility and transparency is much easier when you use the right tooling from the start. This tutorial showcases [conda](https://conda.io) and [GitHub Actions](https://docs.github.com/en/actions), which we use together to make our example notebook easy to reproduce.

```console
foo@bar:~$ git clone https://github.com/HumanCapitalAnalysis/student-project-template
foo@bar:~$ cd student-project-template
foo@bar:~$ conda env create -f environment.yml
foo@bar:~$ conda activate student_project
foo@bar:~$ jupyter nbconvert --execute student_project.ipynb 
```

## conda - package management

In [1]:
! conda

usage: conda [-h] [-V] command ...

conda is a tool for managing and deploying applications, environments and packages.

Options:

positional arguments:
  command
    clean        Remove unused packages and caches.
    config       Modify configuration values in .condarc. This is modeled
                 after the git config command. Writes to the user .condarc
                 file (/home/peisenha/.condarc) by default.
    create       Create a new conda environment from a list of specified
                 packages.
    help         Displays a list of available conda commands and their help
                 strings.
    info         Display information about current conda install.
    init         Initialize conda for shell interaction. [Experimental]
    install      Installs a list of packages into a specified conda
                 environment.
    list         List linked packages in a conda environment.
    package      Low-level conda package utility. (EXPERIMENTAL)
    remove 

In [3]:
! conda --version

conda 4.6.14


We can create a virtual environment for our student project and install some basic packages right from the beginning.

In [4]:
! conda env remove --name student_project_template
! conda create -y --name student_project_template numpy pandas

Collecting package metadata (repodata.json): done
Solving environment: done

## Package Plan ##

  environment location: /home/peisenha/.local/share/anaconda3/envs/student_project_template

  added / updated specs:
    - numpy
    - pandas


The following NEW packages will be INSTALLED:

  blas               pkgs/main/linux-64::blas-1.0-mkl
  ca-certificates    pkgs/main/linux-64::ca-certificates-2019.5.15-0
  certifi            pkgs/main/linux-64::certifi-2019.6.16-py37_0
  intel-openmp       pkgs/main/linux-64::intel-openmp-2019.4-243
  libedit            pkgs/main/linux-64::libedit-3.1.20181209-hc058e9b_0
  libffi             pkgs/main/linux-64::libffi-3.2.1-hd88cf55_4
  libgcc-ng          pkgs/main/linux-64::libgcc-ng-9.1.0-hdf63c60_0
  libgfortran-ng     pkgs/main/linux-64::libgfortran-ng-7.3.0-hdf63c60_0
  libstdcxx-ng       pkgs/main/linux-64::libstdcxx-ng-9.1.0-hdf63c60_0
  mkl                pkgs/main/linux-64::mkl-2019.4-243
  mkl_fft            pkgs/main/linux-64::mkl_fft-1.

Now we can have a look at all environments.

In [5]:
! conda env list

# conda environments:
#
base                     /home/peisenha/.local/share/anaconda3
altruism_replication     /home/peisenha/.local/share/anaconda3/envs/altruism_replication
copulpy                  /home/peisenha/.local/share/anaconda3/envs/copulpy
dev_norpy                /home/peisenha/.local/share/anaconda3/envs/dev_norpy
dev_respy                /home/peisenha/.local/share/anaconda3/envs/dev_respy
dev_soepy                /home/peisenha/.local/share/anaconda3/envs/dev_soepy
dev_trempy               /home/peisenha/.local/share/anaconda3/envs/dev_trempy
estimagic                /home/peisenha/.local/share/anaconda3/envs/estimagic
norpy                    /home/peisenha/.local/share/anaconda3/envs/norpy
option_value             /home/peisenha/.local/share/anaconda3/envs/option_value
ose_tutorials            /home/peisenha/.local/share/anaconda3/envs/ose_tutorials
ose_utils                /home/peisenha/.local/share/anaconda3/envs/ose_utils
ov_analysis              /home/peisenha/.l
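
The same listing can also be read programmatically. The following is a minimal sketch, assuming that `conda env list --json` prints a JSON object with an `envs` list of environment paths; the helper names `parse_env_paths` and `list_env_paths` are hypothetical and not part of conda.

```python
import json
import subprocess


def parse_env_paths(json_text):
    """Return the environment paths from conda's JSON listing.

    Assumes the JSON object has an "envs" key holding a list of paths.
    """
    return json.loads(json_text).get("envs", [])


def list_env_paths():
    """Ask conda for its environments (requires conda on the PATH)."""
    result = subprocess.run(
        ["conda", "env", "list", "--json"],
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_env_paths(result.stdout)
```

Calling `list_env_paths()` in a notebook cell should then return the same paths that appear in the right-hand column of the listing above.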

We can now switch to the terminal window or the Anaconda prompt to activate the environment.

```console
foo@bar:~$ conda activate student_project_template
foo@bar:~$ which python 
foo@bar:~$ conda list
```

We are free to add and remove packages from the environment.

```console
foo@bar:~$ conda install scipy
foo@bar:~$ conda remove pandas
```

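Only packages that are installed in the environment are available. The first import below succeeds, while the second fails with a `ModuleNotFoundError` because we just removed pandas.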

```console
foo@bar:~$ python -c "import scipy"
foo@bar:~$ python -c "import pandas"
```
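
The same check can be done without triggering an import error. This is a minimal sketch using the standard library's `importlib.util.find_spec`; the helper name `is_importable` is hypothetical, and the result depends on which environment the interpreter belongs to.

```python
import importlib.util


def is_importable(module_name):
    """Return True if the active environment can locate the top-level module."""
    return importlib.util.find_spec(module_name) is not None


# Report the status of the two packages used in the example above.
for module_name in ("scipy", "pandas"):
    status = "available" if is_importable(module_name) else "missing"
    print(f"{module_name}: {status}")
```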

Returning to the notebook, we can automate the creation of the environment using **environment.yml** files.

In [17]:
! cat environment_tutorial.yml

name: student_project_template

dependencies:
- numpy
- pandas
- scipy 
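
To illustrate how little structure such a file needs, here is a minimal sketch that reads the name and the dependency list from a flat environment file like the one above. The function `parse_environment_file` is hypothetical and handles only a `name:` line and a flat `dependencies:` list; real files may contain channels or nested `pip` sections, for which a proper YAML parser is the better choice.

```python
def parse_environment_file(text):
    """Parse a flat conda environment file into (name, dependencies).

    Assumes a single `name:` line and a flat `dependencies:` list;
    nested sections are not supported.
    """
    name = None
    dependencies = []
    in_dependencies = False
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        if line.startswith("name:"):
            name = line.split(":", 1)[1].strip()
            in_dependencies = False
        elif line.startswith("dependencies:"):
            in_dependencies = True
        elif in_dependencies and line.startswith("- "):
            dependencies.append(line[2:].strip())
        else:
            in_dependencies = False  # any other key ends the list
    return name, dependencies


example = """name: student_project_template

dependencies:
- numpy
- pandas
- scipy
"""
print(parse_environment_file(example))
```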


In [16]:
! conda env remove --name student_project_template
! conda env create -f environment_tutorial.yml


Remove all packages in environment /home/peisenha/.local/share/anaconda3/envs/student_project_template:

Collecting package metadata (repodata.json): done
Solving environment: done
Preparing transaction: done
Verifying transaction: done
Executing transaction: done
#
# To activate this environment, use
#
#     $ conda activate student_project_template
#
# To deactivate an active environment, use
#
#     $ conda deactivate



## GitHub Actions - continuous integration

Now that we have automated the installation of all required software, we can move it to the cloud and execute our analysis there to ensure that there are no local dependencies that we are missing.

I have already linked your projects to **GitHub Actions**, so all your commits are now monitored. Since your repo contains a **.github/workflows/ci.yml** file, each push triggers a build based on the instructions in it.

In [1]:
! cat .github/workflows/ci.yml

name: Continuous Integration

on: [push]

jobs:
  build:

    runs-on: ubuntu-latest

    steps:
    - uses: actions/checkout@v2
    - uses: conda-incubator/setup-miniconda@v2
      with:
           activate-environment: student_project
           environment-file: environment.yml
           python-version: 3.6
           auto-activate-base: false
    - name: execute notebooks
      shell: bash -l {0}
      run: |
        export PATH="$PATH:/usr/share/miniconda/bin"
        source .envrc
        jupyter nbconvert --to html --execute --ExecutePreprocessor.timeout=120 *.ipynb


Now we can run the notebooks on the CI server to ensure full reproducibility. A passing build means that all required files are available on **GitHub** and that the software environment is fully specified.
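
Before pushing, it can help to run the same execution step locally. The sketch below assembles the `jupyter nbconvert` call from the workflow file in Python; the helper names `build_nbconvert_command` and `execute_notebooks` are hypothetical, and the script assumes that `jupyter` is available in the active environment.

```python
import subprocess
from pathlib import Path


def build_nbconvert_command(notebook, timeout=120):
    """Build the nbconvert call used in the CI workflow for one notebook."""
    return [
        "jupyter",
        "nbconvert",
        "--to",
        "html",
        "--execute",
        f"--ExecutePreprocessor.timeout={timeout}",
        str(notebook),
    ]


def execute_notebooks(directory="."):
    """Execute every notebook in the directory and return the ones that ran.

    Raises subprocess.CalledProcessError if a notebook fails to execute.
    """
    executed = []
    for notebook in sorted(Path(directory).glob("*.ipynb")):
        subprocess.run(build_nbconvert_command(notebook), check=True)
        executed.append(notebook)
    return executed
```

Calling `execute_notebooks()` from the repository root mirrors the `*.ipynb` pattern of the workflow, so a failure there is likely to show up in the CI build as well.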

Let's trigger a build and see the magic in action.

![title](material/github-actions-ci.png)

When all is working, don't forget to proudly add your [badge](https://docs.github.com/en/actions/guides/about-continuous-integration#status-badges-for-workflow-runs) to the **README.md** file.