# Organizing your code!

Pitfalls of Jupyter-based development:
- no separation between the code implementation (i.e. a class) and its use (i.e. an analysis using that class);
- mixups between global and local variables can lead to unintended consequences!
- Jupyter allows cells to be executed in ***any order***, so it is hard to keep track of what the program is doing: everything may appear to work, but once you restart the kernel and execute the cells in a **linear** fashion, it breaks!
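The second and third pitfalls can be sketched in a few lines. This is only an illustration: the names `scale`, `rescale` and `rescale_explicit` are made up for the example, and the "cells" are simulated by consecutive statements.

```python
scale = 2  # imagine this was defined in an early notebook cell


def rescale(values):
    # 'scale' is not a parameter: the function silently reads the global.
    return [v * scale for v in values]


first = rescale([1, 2, 3])   # computed while scale == 2

scale = 10  # a later cell (or a cell run out of order) rebinds the global
second = rescale([1, 2, 3])  # same call, different result


def rescale_explicit(values, factor):
    # Passing the factor explicitly removes the hidden dependency.
    return [v * factor for v in values]


print(first, second, rescale_explicit([1, 2, 3], factor=2))
```

The two calls to `rescale` look identical but return different lists, which is exactly the kind of behaviour that is hard to spot when cells are run out of order.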

## Modules
- a **module** is a plaintext file containing python code;
- a module is named `module.py` where "module" should be something descriptive;
- modules are often organized in **packages**, but packaging is too involved to be covered here;
- a module is invoked with the `import` statement.

## Example: a random number generator

We have learned to use `numpy.random` in a basic way, but we can improve on that. In fact, the current recommended usage does not involve calling `numpy.random` directly but rather instantiating a random number generator object, i.e.:

```python
rng = np.random.default_rng(seed)
```

As an example, we will write a "wrapper class", a class that "wraps around" existing functionality to make it more convenient for us.

In [1]:
import numpy as np

class RNG:
    def __init__(self, seed : int = 0):
        self.rng = np.random.default_rng(seed=seed)

    def generate(self, shape : tuple = None):
        # this is a bit dumb because it only replicates the behaviour of random()
        # but this is just meant as an example
        if shape is not None:
            return self.rng.random(shape)
        else:
            return self.rng.random()

In [2]:
rand = RNG(seed=27)

rand.generate(10)

array([0.69773622, 0.31381427, 0.1211971 , 0.32359152, 0.93121187,
       0.78966731, 0.01001912, 0.19893322, 0.29311369, 0.94341571])

## Creating a module!

- Now move the content of the cell containing the RNG definition to a separate file under `modules/utils.py`!
- Comment out the cell content and import from the file.

In [3]:
from modules.utils import RNG

rand = RNG(seed=55)

In [4]:
rand.generate()

0.8322521834549506
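The steps above can also be reproduced programmatically, which may help to see what Python needs in order to find a module. The following is a minimal sketch under a few assumptions: the directory name `demo_modules` is hypothetical (chosen so that it does not clash with the real `modules/` folder), the file is created in a temporary directory, and the standard-library `random.Random` stands in for `numpy` to keep the example free of dependencies.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Hypothetical stand-in for the content of modules/utils.py.
MODULE_SOURCE = '''
import random

DEFAULT_SEED = 0


class RNG:
    def __init__(self, seed: int = DEFAULT_SEED):
        self.rng = random.Random(seed)

    def generate(self):
        return self.rng.random()
'''

# Create <temporary directory>/demo_modules/utils.py
project_dir = Path(tempfile.mkdtemp())
module_dir = project_dir / "demo_modules"
module_dir.mkdir()
(module_dir / "utils.py").write_text(MODULE_SOURCE)

# Python searches for modules in the directories listed in sys.path.
sys.path.insert(0, str(project_dir))
importlib.invalidate_caches()

from demo_modules.utils import RNG

rand = RNG(seed=55)
value = rand.generate()
print(RNG.__module__, value)
```

In the notebook, none of the `sys.path` handling is needed because the directory the notebook runs from is normally already on the search path.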

## `import` HOWTO

In [5]:
# 0. Imports all symbols from a module, popular in the past but now strongly DISCOURAGED

from modules.utils import *

b = RNG()

**Don't do that.** Why not? Because you have no control over which symbols are imported into your scope (possibly overriding existing ones). It also makes it hard to tell where a given name comes from. Here are two good ways of doing this:

In [6]:
# 1. Import one or more symbols from a module with (optionally) an alias.

from modules.utils import DEFAULT_SEED, RNG as random_number_generator

a = random_number_generator()

In [7]:
# 2. Import a reference to the module (with optional alias) and access its symbols with the '.' operator.
import modules.utils as random_utils
c = random_utils.RNG()
# Similar to `import numpy as np`
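To make the warning about `import *` concrete, here is a small sketch using two standard-library modules, `math` and `cmath`, that happen to export functions with the same names. It is only meant to illustrate the name clash and is not related to the `RNG` class.

```python
from math import *   # brings in sqrt, pi, e, log, ...

real_root = sqrt(4)  # math.sqrt returns a float

from cmath import *  # silently rebinds sqrt, log, ... to the cmath versions

complex_root = sqrt(4)  # same call, but now cmath.sqrt returns a complex number

print(type(real_root).__name__, type(complex_root).__name__)

# The explicit alternative keeps both functions available and unambiguous.
import math
import cmath

print(math.sqrt(4), cmath.sqrt(4))
```

After the second wildcard import there is no error and no warning: `sqrt` simply means something else, which is why importing the module itself (style 2 above) is usually the safer choice.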

## Packages

A **package** like `numpy` is a collection of modules. The package `numpy` provides the `numpy` module, which offers the basic functionality. Other features of `numpy` are accessible through further modules provided by the same package.

In [8]:
import numpy

# This is the main module of the package.
print(type(numpy))
# These are symbols contained in the main module.
print(type(numpy.ndarray))
print(type(numpy.array))
# This is a module accessible through the main module.
print(type(numpy.random))
# This is a symbol contained in the previous.
print(type(numpy.random.random))

<class 'module'>
<class 'type'>
<class 'builtin_function_or_method'>
<class 'module'>
<class 'builtin_function_or_method'>


The way a package makes its functionality accessible through its main module is based on a chain of `import` statements. Be aware that there may be some differences across packages!
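As a rough sketch of such a chain of imports, the snippet below builds a tiny package in a temporary directory and imports it. Everything here is hypothetical (the package name `demo_pkg`, its modules `core` and `extras`, and their functions); real packages such as `numpy` are organized along similar lines but with many more details.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Hypothetical package layout:
#   demo_pkg/__init__.py
#   demo_pkg/core.py
#   demo_pkg/extras.py
root = Path(tempfile.mkdtemp())
pkg = root / "demo_pkg"
pkg.mkdir()

(pkg / "core.py").write_text("def double(x):\n    return 2 * x\n")
(pkg / "extras.py").write_text("def triple(x):\n    return 3 * x\n")

# __init__.py is the "main module" of the package: it runs on `import demo_pkg`.
(pkg / "__init__.py").write_text(
    "from .core import double   # re-export a symbol at the top level\n"
    "from . import extras       # expose a submodule as an attribute\n"
)

sys.path.insert(0, str(root))
importlib.invalidate_caches()

import demo_pkg

print(type(demo_pkg))             # a module object, like `numpy`
print(demo_pkg.double(21))        # symbol re-exported by __init__.py
print(demo_pkg.extras.triple(2))  # symbol reached through a submodule
```

Here `demo_pkg.double` plays the role of `numpy.array`, and `demo_pkg.extras.triple` the role of `numpy.random.random`.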

Packages that are installable through `pip install package_name` are published on [PyPI](https://pypi.org/)! You may also learn how to write your own private package and install it locally...

### Best practices
- never use `import *`;
- if you plan to use only a few items from the module in specific places, use `from module import class as class_alias`;
- if you plan to use many features all the time, import the module with a short alias, e.g. `import numpy as np`;
- you may store "constants" in modules, but try not to store variables!

### Note
`import` executes all the code in the module file! The easter egg module (`this`) that we encountered at the beginning shows exactly what a module should not do: it prints text as a side effect of being imported...

In [9]:
import this 

The Zen of Python, by Tim Peters

Beautiful is better than ugly.
Explicit is better than implicit.
Simple is better than complex.
Complex is better than complicated.
Flat is better than nested.
Sparse is better than dense.
Readability counts.
Special cases aren't special enough to break the rules.
Although practicality beats purity.
Errors should never pass silently.
Unless explicitly silenced.
In the face of ambiguity, refuse the temptation to guess.
There should be one-- and preferably only one --obvious way to do it.
Although that way may not be obvious at first unless you're Dutch.
Now is better than never.
Although never is often better than *right* now.
If the implementation is hard to explain, it's a bad idea.
If the implementation is easy to explain, it may be a good idea.
Namespaces are one honking great idea -- let's do more of those!


# Our first script...

To show the usefulness of modules, we now want to use the class we have written in a python **script** that we can execute outside of Jupyter.

To write a script, create a file such as `script.py` and define a `main()` function that is called in the body of the script, as in the following:
```python
def main():
    pass

if __name__ == "__main__":
    main()
```

The `if __name__ == "__main__":` guard ensures that `main()` is called only when the file is run as a script, not when it is imported. This way the same file can be either imported as a module or run as a script. (Thanks to the student who pointed this out in class!)
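The effect of the guard can be checked with a small experiment. The sketch below writes a hypothetical `demo_script.py` to a temporary directory, runs it once as a script and then imports it as a module, capturing what gets printed in each case.

```python
import contextlib
import io
import subprocess
import sys
import tempfile
from pathlib import Path

# Hypothetical script: reports its __name__ and guards the call to main().
SCRIPT_SOURCE = '''
def main():
    print("main() was called")

print("__name__ is", __name__)

if __name__ == "__main__":
    main()
'''

workdir = Path(tempfile.mkdtemp())
(workdir / "demo_script.py").write_text(SCRIPT_SOURCE)

# 1. Run the file as a script in a separate interpreter.
as_script = subprocess.run(
    [sys.executable, "demo_script.py"],
    cwd=workdir, capture_output=True, text=True, check=True,
).stdout

# 2. Import the same file as a module, capturing anything it prints.
sys.path.insert(0, str(workdir))
buffer = io.StringIO()
with contextlib.redirect_stdout(buffer):
    import demo_script
as_module = buffer.getvalue()

print("run as script:\n" + as_script)
print("imported:\n" + as_module)
```

In both cases the top-level `print` runs, because the whole file is executed either way, but `main()` is called only in the first case, where `__name__` is `"__main__"`.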


Take a look at `simple_script.py` for an example script using the RNG class. You can launch it by running `python simple_script.py` from your console. Alternatively, we can invoke it in Jupyter:

In [10]:
# This command does not use the python interpreter internal to Jupyter but rather "invokes" the `python` interpreter of the underlying system!
!python simple_script.py

[0.95600171 0.20768181 0.82844489 0.14928212 0.51280462 0.1359196
 0.68903648 0.84174772 0.425509   0.956926  ]


We can also test the `main()` function of the script inside Jupyter, effectively using the script file as a module!

In [11]:
from simple_script import main as script_main # better use an alias as main is a very common name!

In [12]:
script_main()

[0.95600171 0.20768181 0.82844489 0.14928212 0.51280462 0.1359196
 0.68903648 0.84174772 0.425509   0.956926  ]
