# jupyter-nbrequirements

<p style="font: 30px; text-transform: uppercase;">
    Jupyter Notebook dependency resolution and environment setup
</p>

---

<span style="font: 18px"><b>Description</b></span><br>

<p style="text-align: justify; text-justify: inter-word;">
    This is an end-to-end pipeline that takes a single Jupyter notebook to a fully set-up virtual environment ready to run it.
    We will demonstrate the whole workflow: setting notebook requirements, configuring Thoth, resolving dependencies, creating a complete virtual environment, and setting a new Jupyter kernel.
    Hold tight.
</p>

<span style="font: 18px"><b>Goal</b></span><br>

<p style="text-align: justify; text-justify: inter-word;">
    Run all cells in the notebook.
</p>

<h1>Table of Contents<span class="tocSkip"></span></h1>
<div class="toc"><ul class="toc-item"><li><span><a href="#Notebook-Content" data-toc-modified-id="Notebook-Content-1"><span class="toc-item-num">1&nbsp;&nbsp;</span>Notebook Content</a></span></li><li><span><a href="#Set-notebook-requirements" data-toc-modified-id="Set-notebook-requirements-2"><span class="toc-item-num">2&nbsp;&nbsp;</span>Set notebook requirements</a></span></li><li><span><a href="#Get-notebook-requirements" data-toc-modified-id="Get-notebook-requirements-3"><span class="toc-item-num">3&nbsp;&nbsp;</span>Get notebook requirements</a></span></li><li><span><a href="#Generate-Thoth-config" data-toc-modified-id="Generate-Thoth-config-4"><span class="toc-item-num">4&nbsp;&nbsp;</span>Generate Thoth config</a></span></li><li><span><a href="#Generate-Pipfile" data-toc-modified-id="Generate-Pipfile-5"><span class="toc-item-num">5&nbsp;&nbsp;</span>Generate Pipfile</a></span></li><li><span><a href="#Lock-down-dependencies" data-toc-modified-id="Lock-down-dependencies-6"><span class="toc-item-num">6&nbsp;&nbsp;</span>Lock down dependencies</a></span></li><li><span><a href="#Install-dependencies-into-virtual-environment" data-toc-modified-id="Install-dependencies-into-virtual-environment-7"><span class="toc-item-num">7&nbsp;&nbsp;</span>Install dependencies into virtual environment</a></span></li><li><span><a href="#Install-and-set-a-new-Jupyter-kernel" data-toc-modified-id="Install-and-set-a-new-Jupyter-kernel-8"><span class="toc-item-num">8&nbsp;&nbsp;</span>Install and set a new Jupyter kernel</a></span></li><li><span><a href="#About-the-dep-ensure" data-toc-modified-id="About-the-dep-ensure-9"><span class="toc-item-num">9&nbsp;&nbsp;</span>About the <code>dep ensure</code></a></span></li></ul></div>

---

## _ <a class="tocSkip">

In [None]:
%load_ext autoreload
%autoreload 2

---

---

## Notebook Content

For the purposes of this demo, let's pretend that all the notebook source code is actually contained in this section.

In [None]:
import json
import sys

import pandas as pd
import sklearn

In [None]:
df = pd.read_csv("<path>")
df.head()

with open("requirements.json") as f:
    requirements = json.load(f)

try:
    ...
except Exception as exc:
    print(exc, file=sys.stderr)

---

## Set notebook requirements

<p style="text-align: justify; text-justify: inter-word;">
    The sections follow the order of operations that we perform when setting up the environment.
    The first step is to define the notebook requirements.
</p>

<span style="font: 18px"><b>Acceptance Criteria</b></span><br>

<p style="text-align: justify; text-justify: inter-word;">
    The notebook has requirements embedded in its metadata.
</p>

<span style="font: 18px"><b>How to do it</b></span><br>

<p style="text-align: justify; text-justify: inter-word;">
    <b>Option 1:</b> There is a cell magic command <code>%%requirements</code> which takes the content of a cell and turns it into notebook requirements.
</p>

<p style="text-align: justify; text-justify: inter-word;">
    <b>Option 2:</b> Load the requirements from an existing Pipfile <code>%requirements -f Pipfile</code>
</p>
    
<p style="text-align: justify; text-justify: inter-word;">
    <b>Option 3:</b> Add the dependencies one by one. This is perhaps the most convenient way and provides the finest control: <code>%dep add pandas --version ">=0.24.0" </code>
</p>

> `dep` is an alias for `requirements`.

<br>
        
Example 1:

    %%requirements

    [packages]
    ipython = "*"
    ipykernel = "*"

    [dev-packages]
    autopep8 = "*"

    [[source]]
    name = "pypi"
    url = "https://pypi.org/simple"
    verify_ssl = true

    [requires]
    python_version = "3.6"

In [None]:
%%requirements

[packages]
ipython = "*"
ipykernel = "*"

[dev-packages]
autopep8 = "*"

[[source]]
name = "pypi"
url = "https://pypi.org/simple"
verify_ssl = true

[requires]
python_version = "3.6"

In [None]:
%dep add pandas --version ">=0.24.0"

> Should you wonder why there is no `%load_ext` before and yet the commands still work, it is because they are autoloaded (the extension loads itself into the notebook context). This is quite useful, because you can simply start your notebook with initial `%dep add` commands and finish with `%dep ensure` (see the [last section](#About-the-dep-ensure) of this notebook).
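
To make the acceptance criterion concrete, here is a minimal sketch of what "requirements embedded in notebook metadata" could look like at the file level. It works on the plain JSON structure of an `.ipynb` file. The metadata key `requirements` and the helper `embed_requirements` are assumptions for illustration, not necessarily what the extension writes.

```python
import json


def embed_requirements(notebook: dict, requirements: dict) -> dict:
    """Return a copy of the notebook with requirements stored in its metadata."""
    # Deep copy through a JSON round-trip so the input stays untouched.
    updated = json.loads(json.dumps(notebook))
    # NOTE: the "requirements" key name is an assumption made for this sketch.
    updated.setdefault("metadata", {})["requirements"] = requirements
    return updated


# A bare-bones notebook document (nbformat 4 layout).
notebook = {"cells": [], "metadata": {}, "nbformat": 4, "nbformat_minor": 2}
requirements = {
    "packages": {"pandas": ">=0.24.0"},
    "requires": {"python_version": "3.6"},
}

updated = embed_requirements(notebook, requirements)
print(json.dumps(updated["metadata"], indent=2))
```

Because the requirements travel inside the notebook file itself, whoever receives the notebook gets them without any side files.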

---

## Get notebook requirements

<p style="text-align: justify; text-justify: inter-word;">
    Now suppose that we've received the notebook from somebody else. We want to check what requirements the notebook has defined and, eventually, what the <i>real</i> notebook requirements are.
</p>

<span style="font: 18px"><b>Acceptance Criteria</b></span><br>

<p style="text-align: justify; text-justify: inter-word;">
    We can safely check what requirements are defined and which are actually used.
</p>

<span style="font: 18px"><b>How to do it</b></span><br>

<p style="text-align: justify; text-justify: inter-word;">
    There is a line magic command <code>%requirements</code> which displays the content of the notebook requirements metadata. If the metadata doesn't exist, it performs static analysis and checks for library usage in the notebook source code.
</p>

Example:

    %requirements  # notice the single % sign


In [None]:
%requirements

A few remarks about the output that we see above. If you take a look at the [Notebook Content](#Notebook-Content) section, you'll see that the imports actually look like this:

```
import json
import sys

import pandas as pd
import sklearn
```

So why do we *NOT* see all of these requirements in the output?

First of all, `json` and `sys` are somewhat special: `json` is part of the **standard library** and `sys` is a **built-in** module, which means that neither needs to be installed.

As far as `sklearn` is concerned, we don't use it in this notebook. That's right, we track not only **imports**, but also the **usage**.
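
The idea of tracking usage rather than just imports can be sketched with Python's `ast` module. This is not the extension's actual analysis, only an illustration of the technique. The function name `used_third_party_imports` is hypothetical, and `sys.stdlib_module_names` requires Python 3.10 or newer.

```python
import ast
import sys


def used_third_party_imports(source: str) -> set:
    """Return top-level modules that are imported, used, and not part of Python itself."""
    tree = ast.parse(source)

    # Map each local name introduced by an import to its top-level module.
    aliases = {}
    for node in ast.walk(tree):
        if isinstance(node, ast.Import):
            for alias in node.names:
                local = (alias.asname or alias.name).split(".")[0]
                aliases[local] = alias.name.split(".")[0]
        elif isinstance(node, ast.ImportFrom) and node.module and node.level == 0:
            for alias in node.names:
                aliases[alias.asname or alias.name] = node.module.split(".")[0]

    # Collect every name that is read somewhere in the code.
    used = {
        node.id
        for node in ast.walk(tree)
        if isinstance(node, ast.Name) and isinstance(node.ctx, ast.Load)
    }

    # Standard library and built-in modules never need installing.
    bundled = set(getattr(sys, "stdlib_module_names", ())) | set(sys.builtin_module_names)

    return {
        module
        for local, module in aliases.items()
        if local in used and module not in bundled
    }


SOURCE = '''
import json
import sys

import pandas as pd
import sklearn

df = pd.read_csv("data.csv")
print(json.dumps({}), file=sys.stderr)
'''

print(used_third_party_imports(SOURCE))
```

In this sample, `sklearn` is imported but never referenced, so it is dropped, while `json` and `sys` are filtered out because they ship with Python.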

---

## Generate Thoth config

<p style="text-align: justify; text-justify: inter-word;">
    Thoth uses a configuration file which looks something like this:
    
    # A remote Thoth service to talk to:
    host: stage.thoth-station.ninja

    # Configure TLS verification for communication with remote Thoth instance:
    tls_verify: true

    # Format of requirements file, currently supported is only Pipenv:
    requirements_format: pipenv

    runtime_environments:
      - name: 'fedora:30'
        # Operating system for which the recommendations should be created:
        operating_system:
          name: fedora
          version: '30'
        # Hardware information for the recommendation engine:
        hardware:
          # Intel(R) Core(TM) i7-6600U CPU @ 2.60GHz
          cpu_family: 6
          cpu_model: 78
        # Software configuration of runtime environment:
        python_version: '3.6'
        cuda_version: null
        # Recommendation type - one of testing, stable, latest:
        recommendation_type: stable
        # Number of latest versions considered during advises.
        limit_latest_versions: null

    #
    # Configuration of bots:
    #
    managers:
      - name: pipfile-requirements
      - name: info
      - name: version
        configuration:
          # A list of maintainers (GitHub or GitLab accounts) of this repository:
          maintainers: []
          # A list of assignees to which the opened pull requests and issues should
          # be assigned to:
          assignees: []
          # Labels for issues and pull requests:
          labels:
            - bot
          # Automatically maintain a changelog file stating features of new
          # releases:
          changelog_file: true
</p>
<p style="text-align: justify; text-justify: inter-word;">
    In order to be able to fully configure Thoth functionality, we would like to be able to simply generate the file.
</p>

<span style="font: 18px"><b>Acceptance Criteria</b></span><br>

<p style="text-align: justify; text-justify: inter-word;">
    The <code>.thoth.yaml</code> file has been generated
</p>

<span style="font: 18px"><b>How to do it</b></span><br>

<p style="text-align: justify; text-justify: inter-word;">
    There is a subcommand to the <code>%requirements</code> command called <code>config</code>. This generates the default <code>.thoth.yaml</code> config file.
</p>
    
Example:

    %requirements config


In [None]:
%requirements config --to-file

Now let's leave the config as-is except for a simple change... let's disable `tls_verify` and set `fedora:29` as our operating system.

In [None]:
# set tls_verify to false
!perl -i -pe 's/(?<=tls_verify: )(false|true)/false/' .thoth.yaml
# set operating system version to 29
!perl -i -pe "s/(?<=version: )('30')/'29'/" .thoth.yaml
# set name to fedora:29
!perl -i -pe "s/fedora:30/fedora:29/" .thoth.yaml
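
For readers without Perl at hand, the same three substitutions can be sketched in Python with the `re` module. The helper name `patch_thoth_config` is hypothetical, and the sample text is a shortened excerpt of the config shown earlier; on a real file you would read `.thoth.yaml`, patch the text, and write it back.

```python
import re


def patch_thoth_config(text: str) -> str:
    """Apply the same edits as the Perl one-liners to the config text."""
    # Disable TLS verification, whatever the current value is.
    text = re.sub(r"(?<=tls_verify: )(?:true|false)", "false", text)
    # Change the operating system version from '30' to '29'.
    # python_version: '3.6' is not affected because it is not followed by '30'.
    text = re.sub(r"(?<=version: )'30'", "'29'", text)
    # Rename the runtime environment accordingly.
    return text.replace("fedora:30", "fedora:29")


sample = """tls_verify: true
runtime_environments:
  - name: 'fedora:30'
    operating_system:
      name: fedora
      version: '30'
    python_version: '3.6'
"""

patched = patch_thoth_config(sample)
print(patched)
```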

In [None]:
%requirements config

---

## Generate Pipfile

<p style="text-align: justify; text-justify: inter-word;">
    Now we're getting to the part in which we want to work with the requirements -- that is, install them -- and in order to do that, we need a manifest file. In our case, it's going to be the Pipfile.
</p>

<span style="font: 18px"><b>Acceptance Criteria</b></span><br>

<p style="text-align: justify; text-justify: inter-word;">
    The <b>Pipfile</b> has been generated from notebook requirements.
</p>

<span style="font: 18px"><b>How to do it</b></span><br>

<p style="text-align: justify; text-justify: inter-word;">
    The <code>%requirements</code> magic has an option <code>--to-file</code> which outputs the requirements to the Pipfile.
</p>
    
Example:

    %requirements --to-file


For the purpose of this example, first check that there is no Pipfile present:

In [None]:
%cat Pipfile

In [None]:
%requirements --to-file

And now ...

In [None]:
%cat Pipfile
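
To illustrate how a requirements mapping relates to the generated Pipfile, here is a rough sketch that renders such a mapping as Pipfile text. The function `render_pipfile` is hypothetical and only handles plain string and boolean values; it is not how the extension serializes the file.

```python
import json


def render_pipfile(requirements: dict) -> str:
    """Render a simple requirements mapping as Pipfile (TOML) text."""
    lines = []
    # Each package index becomes its own [[source]] table.
    for source in requirements.get("source", []):
        lines.append("[[source]]")
        # json.dumps yields quoted strings and lowercase booleans,
        # which matches TOML syntax for these simple value types.
        lines.extend(f"{key} = {json.dumps(value)}" for key, value in source.items())
        lines.append("")
    for section in ("packages", "dev-packages", "requires"):
        entries = requirements.get(section)
        if not entries:
            continue
        lines.append(f"[{section}]")
        lines.extend(f"{name} = {json.dumps(spec)}" for name, spec in entries.items())
        lines.append("")
    return "\n".join(lines)


requirements = {
    "source": [{"name": "pypi", "url": "https://pypi.org/simple", "verify_ssl": True}],
    "packages": {"ipython": "*", "ipykernel": "*", "pandas": ">=0.24.0"},
    "dev-packages": {"autopep8": "*"},
    "requires": {"python_version": "3.6"},
}

pipfile_text = render_pipfile(requirements)
print(pipfile_text)
```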

---

## Lock down dependencies

<p style="text-align: justify; text-justify: inter-word;">
Here we're getting to the core part. We want to resolve the software stack and lock down dependencies so that the software stack is <i>thoth-optimal</i>. 
</p>

<span style="font: 18px"><b>Acceptance Criteria</b></span><br>

<p style="text-align: justify; text-justify: inter-word;">
    The Pipfile.lock has been generated using Thoth adviser API.
</p>

<span style="font: 18px"><b>How to do it</b></span><br>


<p style="text-align: justify; text-justify: inter-word;">
    The <code>%requirements</code> magic has a subcommand <code>lock</code> which takes an optional parameter <code style="white-space: nowrap;">--engine</code>. It triggers an analysis in the <i>thoth-backend-stage</i> namespace and outputs the resolved software stack with locked-down dependencies to the <b>Pipfile.lock</b>.
    
    NOTE: The engine can be set to <code>pipenv</code> if pipenv should resolve the dependencies instead of the Thoth resolution engine.
</p>
    
Example:

    %requirements lock
    

Once again, check that Pipfile.lock is not present

In [None]:
%cat Pipfile.lock

And also ignore the notebook metadata, just in case the developer also locked requirements (and remember, he had a tough night...)

In [None]:
%dep lock --help

The requirements are cached, so when we want to output them to a file, we don't need to go through the analysis again.

In [None]:
%dep lock --engine pipenv
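
Once a lock file exists, it can be handy to see exactly which versions were pinned. A Pipfile.lock is a JSON document with `default` and `develop` sections, so a short sketch suffices. The helper `pinned_versions` is hypothetical, and the sample lock content below is made-up illustration data, not the output of a real resolution.

```python
import json

# Made-up sample data shaped like a Pipfile.lock; versions are illustrative only.
SAMPLE_LOCK = json.loads("""
{
  "_meta": {"requires": {"python_version": "3.6"}},
  "default": {
    "pandas": {"version": "==0.25.3"},
    "ipykernel": {"version": "==5.1.3"}
  },
  "develop": {
    "autopep8": {"version": "==1.4.4"}
  }
}
""")


def pinned_versions(lock: dict) -> dict:
    """Map each locked package name to its pinned version string."""
    pins = {}
    for section in ("default", "develop"):
        for name, entry in lock.get(section, {}).items():
            # Lock entries store versions as '==X.Y.Z'; strip the operator.
            pins[name] = entry.get("version", "").lstrip("=")
    return pins


print(pinned_versions(SAMPLE_LOCK))
```

For a real file, the same function can be applied to `json.load(open("Pipfile.lock"))`.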

---

## Install dependencies into virtual environment

<p style="text-align: justify; text-justify: inter-word;">
</p>

<span style="font: 18px"><b>Acceptance Criteria</b></span><br>

<p style="text-align: justify; text-justify: inter-word;">
</p>

<span style="font: 18px"><b>How to do it</b></span><br>

<p style="text-align: justify; text-justify: inter-word;">
</p>
    
Example:

    %requirements install


In [None]:
%dep install

---

## Install and set a new Jupyter kernel

<p style="text-align: justify; text-justify: inter-word;">
</p>

<span style="font: 18px"><b>Acceptance Criteria</b></span><br>

<p style="text-align: justify; text-justify: inter-word;">
</p>

<span style="font: 18px"><b>How to do it</b></span><br>

<p style="text-align: justify; text-justify: inter-word;">
    
</p>
    
Example:
    
    %kernel install [name]
    
    # if we want to also set a kernel
    # by default, this assumes a kernel matching the name of your notebook,
    # optionally you can provide custom name of an existing kernelspec
    %requirements kernel set [name]


Let's check our current kernel specification

In [None]:
%kernel

Now install the new kernel from the notebook requirements

In [None]:
%kernel install
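
For context, a Jupyter kernel is described by a `kernel.json` file inside a kernelspec directory. The sketch below builds such a spec for a virtual environment's interpreter. The helper `kernel_spec` and the path `/path/to/venv` are placeholders for illustration; the extension may assemble its spec differently.

```python
import json
import os


def kernel_spec(env_dir: str, name: str) -> dict:
    """Build the contents of a kernel.json for the interpreter in env_dir."""
    # POSIX virtualenv layout is assumed; on Windows the interpreter
    # lives under Scripts\python.exe instead.
    python = os.path.join(env_dir, "bin", "python")
    return {
        # Jupyter substitutes {connection_file} when launching the kernel.
        "argv": [python, "-m", "ipykernel_launcher", "-f", "{connection_file}"],
        "display_name": name,
        "language": "python",
    }


spec = kernel_spec("/path/to/venv", "example")
print(json.dumps(spec, indent=2))
```

This is why `ipykernel` has to be among the installed packages: the new kernel is launched through the environment's own `ipykernel_launcher` module.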

And at the very end of this demo ... set the new Jupyter kernel.

First check out the current kernel spec (again, just to demonstrate that there is no `example` kernel present)

> HINT: The kernels are located at the toolbar menu: *Kernel* -> *Change kernel*

If that is the case and such a kernel already exists, feel free to provide a custom kernel name with `%requirements kernel set <name>`.

> WARNING: After you issue this command, you will have a fresh kernel, so don't expect your variables or imports to be present.

In [None]:
%kernel set

Remember the error that we got about the `pandas` library not being present?

In [None]:
from IPython.display import HTML, display

try:
    import pandas as pd  # <--- This was not possible before
    
    display(HTML("""
        <br>
        <div style="display: grid">
            <img style="margin: 0 auto;" src="/static/base/images/logo.png?v=641991992878ee24c6f3826e81054a0f" alt="Jupyter Notebook">
        </div>
        <hr>
        <center><p style="text-align: center; font-size: 21px"> Thank you for your attention! </p></center>
    """))
except ImportError:  # pandas is still missing from the environment
    
    display(HTML("""
        <br>
        <div style="display: grid">
            <i class="fas fa-ban" style="margin: 0 auto; font-size: 80px; color: red;"></i>
        </div>
        <hr>
        <center><p style="text-align: center; font-size: 21px"> Time to blame the QA... </p></center>
    """))

## About the `dep ensure`

This notebook has demonstrated plenty of commands (and yet not nearly all of them), which was its purpose. However, in a real environment, you probably don't want to execute every command shown here. You want a *single* directive which does all the hard work for you.

That's what the `%dep ensure` command is for. Once you have added your dependencies with the `%dep add` command, just run `%dep ensure` (see `%dep ensure --help` for more info).

In [None]:
%dep ensure --help