# Module 2: Introduction to ML Development

## How to run code

Jupyter notebooks let you evaluate Python code interactively. Type some code into a cell and click `Run` or press `Shift+Enter`.

In [1]:
1 + 1

2

In [2]:
name = "world"
print("Hello " + name)

Hello world


You can access all packages from the standard library and packages installed in your environment.

Let's test this with a quick example. Our goal is to **write a random number generator** using the standard library: a function that receives the interval to draw integers from and the number of values to return.

First, we have to import the `random` module from the standard library.

In [3]:
import random

We can now use autocomplete to find a suitable function for our RNG and fill out its parameters.

In [5]:
random.randint(a=0, b=10)

9

Now we can build our own function around it.

In [6]:
def rng(start, end, size):
    """Return a list of `size` random integers drawn from [start, end], both ends inclusive."""
    res = []
    for _ in range(size):
        res.append(random.randint(start, end))
    return res

In [8]:
rng(0, 10, 10)

[3, 4, 1, 0, 7, 2, 6, 1, 1, 2]

Running this function repeatedly will yield different results, as expected.
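If you need the same numbers on every run (e.g., for a reproducible experiment), you can seed the generator first. The following is a minimal sketch using `random.seed` from the standard library; the seed value `42` is an arbitrary choice, and `rng` is redefined here only to keep the example self-contained.

```python
import random


def rng(start, end, size):
    """Return `size` random integers from [start, end], both ends inclusive."""
    return [random.randint(start, end) for _ in range(size)]


# Seeding resets the generator to a known state.
random.seed(42)
first = rng(0, 10, 10)

# Seeding again with the same value replays the same sequence.
random.seed(42)
second = rng(0, 10, 10)

print(first == second)  # True
```

Without the calls to `random.seed`, the two lists would almost always differ.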

## How to write documentation

As you've already seen, notebooks can also contain documentation. Jupyter renders text cells as Markdown, which supports, e.g., **bold**, *italic*, and ~~strikethrough~~ text.

We can also insert images... 

![Jupyter logo](img/jupyter.png)

...and [hyperlinks](https://jupyter.org).

Jupyter also supports LaTeX-style equations, which can be quite helpful when dealing with statistical modeling.

You can insert inline equations: $a = b + c$

Alternatively, you can create new lines for more complex equations:

$$RMSE = \sqrt{\frac{1}{n} \sum_{i=1}^n (y_i - \hat{y}_i)^2}$$
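To connect the notation to code, here is a small sketch that computes the root mean squared error (the square root of the mean of the squared differences between observed and predicted values) with the standard library only. The function name `rmse` and the sample values are illustrative assumptions, not part of the workshop material.

```python
import math


def rmse(y_true, y_pred):
    """Root mean squared error between two equally long sequences."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    # Squared difference for each (observed, predicted) pair.
    squared_errors = [(y - y_hat) ** 2 for y, y_hat in zip(y_true, y_pred)]
    # Mean of the squared errors, then the square root.
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Squared errors are 1, 0, and 4, so the result is sqrt(5 / 3).
print(rmse([3.0, 5.0, 7.0], [2.0, 5.0, 9.0]))
```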

## How to access the command line

Jupyter offers multiple ways to access the command line:

1. Open a dedicated terminal from the home page
2. Run commands from within Jupyter notebooks

I will demonstrate both ways here.

For a dedicated terminal, go to your home page and hit `New > Terminal` to open a new terminal.

If you want your interaction with the command line to be captured (e.g., for reproducibility of data downloads), run the commands within a Jupyter cell. **Important:** You have to prefix each command with the `!` character.



In [11]:
!echo $SHELL

/bin/zsh


In [12]:
!ls

02_Introduction_ML_Development.ipynb 06_Model_Evaluation.ipynb
03_Exploratory_Data_Analysis.ipynb   data
04_Feature_Engineering.ipynb         img
05_Model_Development.ipynb           tmp


In [13]:
!pwd

/Users/felix/code/ml/ml-workflow-tools/nbs
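The `!` prefix only works inside notebooks and IPython. If you later move such a step into a regular Python script, the standard library's `subprocess` module is one way to get similar behavior. The sketch below is an illustration under that assumption; it launches the Python interpreter itself (via `sys.executable`) only so that the example runs on any operating system.

```python
import subprocess
import sys

# Run a command and capture what it prints as text.
result = subprocess.run(
    [sys.executable, "-c", "print('hello from a subprocess')"],
    capture_output=True,  # collect stdout and stderr instead of printing them
    text=True,            # decode the output to str
    check=True,           # raise CalledProcessError on a non-zero exit code
)

output = result.stdout.strip()
print(output)
```

Passing the command as a list of arguments avoids shell quoting issues.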


## Environment management

Let's find out more about our environment by using some `conda` commands.

First, let's list all our environments.

In [14]:
!conda env list

# conda environments:
#
base                     /Users/felix/miniconda3
aws                      /Users/felix/miniconda3/envs/aws
misq_sim                 /Users/felix/miniconda3/envs/misq_sim
ml_workshop           *  /Users/felix/miniconda3/envs/ml_workshop



Nice, our pre-configured workshop environment is already activated, as denoted by the asterisk next to it.
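You can also check from within Python which interpreter and environment the notebook kernel is actually using. This is a rough sketch: `CONDA_DEFAULT_ENV` is an environment variable that conda sets on activation, so it may be missing if the kernel was started another way, hence the fallback value.

```python
import os
import sys

# Path of the Python interpreter running this kernel.
interpreter = sys.executable

# Root directory of the active environment.
env_prefix = sys.prefix

# Name of the activated conda environment, if conda set it.
env_name = os.environ.get("CONDA_DEFAULT_ENV", "<not set>")

print(f"interpreter: {interpreter}")
print(f"prefix:      {env_prefix}")
print(f"conda env:   {env_name}")
```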

Now let's see what packages are installed in this environment.

In [15]:
!conda list

# packages in environment at /Users/felix/miniconda3/envs/ml_workshop:
#
# Name                    Version                   Build  Channel
appnope                   0.1.0                 py37_1000    conda-forge
arrow-cpp                 0.14.1           py37h43d7656_0    conda-forge
attrs                     19.1.0                     py_0    conda-forge
backcall                  0.1.0                      py_0    conda-forge
bleach                    3.1.0                      py_0    conda-forge
boost-cpp                 1.70.0               h75728bb_2    conda-forge
brotli                    1.0.7             h6de7cb9_1000    conda-forge
bzip2                     1.0.8                h01d97ff_0    conda-forge
c-ares                    1.15.0            h01d97ff_1001    conda-forge
ca-certificates           2019.9.11            hecc5488_0    conda-forge
certifi                   2019.9.11                py37_0    conda-forge
cycler                    0.10.0           

## How to use built-in functions

Jupyter also comes with some pretty handy built-in functions. For example, we can access documentation for all our installed packages from within Jupyter. Using autocomplete and the `?` character can save you a lot of Google searches and make you more productive. Simply load a package, and prefix your desired class or function with `?` to get its documentation, or `??` to see the actual source. We will demonstrate this for NumPy's `mean` function.

In [16]:
import numpy as np

In [17]:
?np.mean

In [18]:
??np.mean
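The `?` and `??` shortcuts are specific to Jupyter and IPython. In plain Python, the standard library's `inspect` module offers roughly comparable access to docstrings and source code. The sketch below uses `random.randint` from earlier in this notebook so that it runs without extra packages; `inspect.getsource` only works for objects implemented in Python, not for compiled functions.

```python
import inspect
import random

# Roughly what `?random.randint` shows: the docstring.
doc = inspect.getdoc(random.randint)
print(doc)

# Roughly what `??random.randint` shows: the source code.
source = inspect.getsource(random.randint)
print(source)
```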

In addition to accessing documentation, Jupyter offers built-in magic commands (prefixed with `%`), e.g., to...

... configure libraries, e.g., plotting libraries

In [23]:
# display all plots inside the Jupyter notebook
%matplotlib inline

... or profile your code.

In [25]:
# test how fast our random number generator is
%timeit rng(0, 100, 100)

125 µs ± 2.76 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)
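`%timeit` is only available in notebooks and IPython. For scripts, the standard library's `timeit` module provides similar measurements. The following is a minimal sketch; the number of runs (`1000`) is an arbitrary choice, and `rng` is repeated so the example is self-contained. Timings depend on your machine, so no expected output is shown.

```python
import random
import timeit


def rng(start, end, size):
    """Return `size` random integers from [start, end], both ends inclusive."""
    res = []
    for _ in range(size):
        res.append(random.randint(start, end))
    return res


runs = 1000
# Total time in seconds for `runs` calls of rng(0, 100, 100).
total = timeit.timeit(lambda: rng(0, 100, 100), number=runs)

per_call_us = total / runs * 1e6
print(f"{per_call_us:.1f} µs per call (mean over {runs} runs)")
```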


## How to extend Jupyter notebooks

Since Jupyter is completely open-source, it can be extended quite easily. Several extensions can be found in the `jupyter_contrib_nbextensions` package. These extensions can be activated and deactivated using the `nbextensions` tab on your home page. For example, these extensions allow:
- folding code and sections within your notebook
- formatting code according to the PEP 8 style guide
- adding a table of contents to your notebook
- changing the look and feel of your notebook environment

More information on how to extend Jupyter's functionality can be found in the official [documentation](https://jupyter-notebook.readthedocs.io/en/stable/extending/).