# 7. Packages and modules

Everything you can use in Python beyond the built-ins, including the classes and functions of the standard library and of third-party projects, is organized in **modules** and **packages**.

## 7.1 Modules

A module is any file that ends with **.py**. It can contain function definitions, constants and class definitions. It can also be run as a script, like you know it from a **.m** file. You can write modules yourself, and you should. One huge advantage is that you avoid spaghetti code. Define your classes and functions in one file and import them in another, and your code gets a structure. This improves readability, supports abstraction (one of the pillars of OOP) and makes parts of your code reusable in other projects. No more copy-pasting code from one file to another. Just import it.

We'll have a look at how that works right now:

**Exercise**

   1. Create a new file **in this directory** using an editor of your choice, e.g. the one built into JupyterLab.
   2. Rename it to **test.py**.
   3. In this file, define two functions:
      1. One that prints a message of your choice.
      2. One that does any mathematical operation with its input, e.g. square, double, subtract 1, etc.
   4. Save the file

Congratulations, you just wrote your first module. You can now import the functions into this notebook. There are multiple ways to do this, but all of them include the `import`-statement.

You can import the whole module. Functions from the module are then called with the `module.function()` notation, just like methods of an object:

```Python
import module_name
```

Leave out the `.py` part: you import a module by its name, not by its file name. Python finds the matching file on its own, and `import module_name.py` would fail.
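As a minimal sketch of the `module.function()` notation, here is the same pattern with `math` from the standard library instead of your own module (the choice of `math` is only for illustration):

```python
import math  # import by module name, without any file ending

# Everything defined in the module is reached through the module name.
print(math.sqrt(16))    # 4.0
print(math.floor(2.7))  # 2
```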

**Exercise**

Go ahead and import your whole module. Then try out both functions you just defined in there:

In [68]:
#your code here


There are advantages and disadvantages to this approach. They will become apparent when we import from more complicated modules and packages. One big advantage is the implicit use of namespaces: because you call `module.function()` and not just `function()`, you can have functions of the same name in different modules without a problem. Recall one line from the **Zen of Python**:

```
Namespaces are one honking great idea -- let's do more of those!
```
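To make the namespace point concrete, here is a small sketch using two standard-library modules that both define a function called `sqrt`. The modules are chosen only as an illustration; any two modules with a shared function name would do:

```python
import cmath  # math functions for complex numbers
import math   # math functions for real numbers

# Same function name, different modules: no conflict,
# because each call names the module it comes from.
print(math.sqrt(4))    # 2.0
print(cmath.sqrt(-4))  # 2j
```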


We can also rename the module while importing it. For a few packages there are conventions for this, e.g.

```Python
import numpy as np
import seaborn as sns
```

You can also be more specific and just import certain functions. The syntax is:

```Python
from module_name import function1, function2, functionN
```

And again, you can rename it while doing that.

```Python
from my_package import long_function_name_that_would_be_annoying_to_use as my_function
```
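A short runnable sketch of both forms, again with standard-library modules standing in for your own (the alias `average` is an arbitrary name picked for this example):

```python
from math import pi, sqrt               # import two specific names
from statistics import mean as average  # import one name and rename it

# The imported names are used directly, without a module prefix.
print(sqrt(9))                    # 3.0
print(average([1.0, 2.0, 3.0]))   # 2.0
print(pi > 3)                     # True
```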

You can also leverage this approach to avoid the need to call `module.function()`, while still importing all functions from the module:

```Python
from module_name import *
```

You'll see this often with `from numpy import *`. It's okay if you use packages with only a few functions and classes or only use one package at a time. Otherwise it's a bad habit: when two modules define the same name, or a module defines a name that also exists as a built-in, the later import silently replaces the earlier one and you get unexpected behavior.
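The following sketch shows the kind of silent replacement meant here. It imports `pow` from `math` by name, which is one of the names that `from math import *` would also bring in, and compares it with the built-in `pow` (reached through the `builtins` module so both stay visible):

```python
import builtins
from math import pow  # `from math import *` would import this name too

# The plain name `pow` now refers to math.pow, not to the built-in.
print(builtins.pow(2, 3))   # 8    (built-in, returns an int)
print(pow(2, 3))            # 8.0  (math.pow, always returns a float)
print(pow is builtins.pow)  # False
```

With an explicit import the replacement is at least visible in the code. With a star import nothing in your file tells you it happened.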

**Exercise**

   1. Restart the kernel. You can't really unimport modules, so this is the cleanest way to achieve this task. Either hit '0' twice, or use the GUI.
   2. Import only one of your functions and rename it.
   3. Make sure everything worked as expected by calling the function.

In [None]:
#your code here


## 7.2 Packages

When you have a lot of different classes and functions, it becomes infeasible (and especially unmaintainable) to put all of them in one module. E.g. `scipy` has functions and classes for things like linear algebra but also filters for signal processing. It would be really confusing to have all of that in one file. The simplest packages are really just collections of modules. E.g. `numpy` has a module named `random` in which all of the random number generators and the like are defined.

You can import only certain modules. You can also import only specific functions from modules. As usual there's more than one way to do it. 

Import one module from a package:
```Python
from package import module
import package.module            # use as package.module.function()
import package.module as module  # use as module.function()
```
E.g. from `matplotlib`, the plotting package, you usually import only `pyplot`, its MATLAB-like, state-based plotting interface. The convention to do this is:
```Python
import matplotlib.pyplot as plt
```
or
```Python
from matplotlib import pyplot as plt
```
which are equivalent.

Import specific functions from one module:
```Python
from package.module import function1, function2
```
As usual, you can rename everything:
```Python
from package.module import function1 as my_fun_1, function2 as my_fun_2
```
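Here is a runnable sketch of these variants. It uses the standard-library package `urllib` and its module `parse` purely as a stand-in for `package` and `module`; the alias names `up` and `url_quote` are made up for this example:

```python
from urllib import parse                                # module from package
import urllib.parse as up                               # same module, renamed
from urllib.parse import urlparse, quote as url_quote   # specific functions

# Both import styles give you the very same module object.
print(parse is up)                                  # True

# Functions imported by name are used without any prefix.
print(urlparse("https://example.org/docs").netloc)  # example.org
print(url_quote("a b"))                             # a%20b
```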

If you only use a few functions from a package, importing them by name like this keeps your code short.

**Exercise**

Import the function `randn` from the module `random` in the package `numpy`. Find out how it works and use it. Welcome to `numpy`.

In [2]:
#your code here


## 7.3 sys.path

Just like MATLAB can only use functions that are on its `path` variable, Python can only import modules and packages that it can find. You can always import from modules that are in
   1. your current working directory, if you're using an interactive shell
   2. the same directory as the script/module that does the import, if you're running a program

You can also import from modules and packages in any directory listed in `sys.path`.

**sys** is a module that you use to get information about the interpreter and interact with it. `sys.path` is a list (both as in an enumeration and a Python object of type `list`) of directories. You can add to and remove from the path as needed.
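Since `sys.path` is an ordinary list, the usual list methods work on it. A minimal sketch, assuming a made-up directory name that does not need to exist for the example to run:

```python
import sys

# Hypothetical directory, used only for illustration.
extra_dir = "/hypothetical/project/scripts"

print(type(sys.path))         # <class 'list'>

sys.path.append(extra_dir)    # Python now also searches this directory
print(extra_dir in sys.path)  # True

sys.path.remove(extra_dir)    # and now it no longer does
print(extra_dir in sys.path)  # False
```

Changes to `sys.path` only last for the current interpreter session.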

**Exercise**

Import `sys` and have a look at the path variable.

In [22]:
#your code here


['D:\\Documents\\Python_Workshop\\notebooks', 'C:\\Users\\Lukas\\Anaconda3\\envs\\workshop\\python37.zip', 'C:\\Users\\Lukas\\Anaconda3\\envs\\workshop\\DLLs', 'C:\\Users\\Lukas\\Anaconda3\\envs\\workshop\\lib', 'C:\\Users\\Lukas\\Anaconda3\\envs\\workshop', '', 'C:\\Users\\Lukas\\Anaconda3\\envs\\workshop\\lib\\site-packages', 'C:\\Users\\Lukas\\Anaconda3\\envs\\workshop\\lib\\site-packages\\IPython\\extensions', 'C:\\Users\\Lukas\\.ipython']


**Exercise**

In the folder `..\scripts`, there is a module called `my_module`  with a function `my_function`. Try to import it.

In [23]:
#your code here


It doesn't work, because the directory is not on `sys.path`.

Super short digression to the `os` module:

`os` is short for "Operating System". You can use the module to get information about the OS you're using, e.g. which file separator to use. You can also use it to interact with the OS, e.g. printing or changing the current working directory, shutting down the computer, etc.

Have a look:


In [24]:
import os
print(os.sep)      # differs depending on whether you are on UNIX or Windows
print(os.getcwd()) # like pwd
os.chdir('..')     # like cd, go up one level in the folder hierarchy

\
D:\Documents\Python_Workshop\notebooks
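
Because the separator differs between operating systems, paths are usually not glued together by hand. A small sketch of some helpers in the submodule `os.path`; the folder and file names are hypothetical and nothing is created on disk:

```python
import os

# "data" and "results.csv" are made-up names for this example.
relative = os.path.join("data", "results.csv")  # uses the right separator
print(relative)

# Turn it into an absolute path, based on the current working directory.
absolute = os.path.abspath(relative)
print(os.path.isabs(absolute))     # True
print(os.path.basename(absolute))  # results.csv
```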


IPython implements `pwd` and `cd`, but if you have a look, they're really just wrappers around the `os` functions:

In [25]:
pwd??

Source:
    @skip_doctest
    @line_magic
    def pwd(self, parameter_s=''):
        """Return the current working directory path.

        Examples
        --------
        ::

          In [9]: pwd
          Out[9]: '/home/tsuser/sprint/ipython'
        """
        try:
            return os.getcwd()
        except FileNotFoundError:
            raise UsageError("CWD no longer exists - please use %cd to change directory.")
File:   c:\users\lukas\anaconda3\envs\workshop\lib\site-packages\ipython\core\magics\osm.py


Now, we are no longer in the `notebooks` folder:

In [26]:
os.getcwd()

**Exercise**

   1. Use `os` to create a string of the absolute path to the "scripts"-folder. You can use string concatenation or look for a function in the module `os.path`
   2. Add it to the `sys.path` variable. Don't forget, it's just a list after all.
   3. Navigate back to the "notebooks"-folder
   4. Try again to import `my_function` from `my_module` and run it. It expects your name as a string as input.

In [None]:
#your code here


## 7.4 Programs

### 7.4.1 Programs vs modules

A program is an executable piece of code. Internally, there is no distinction between a program and a module. A **.py**-file can even be both. The command

```Python
import my_module
```
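does little more than run the file `my_module.py` from top to bottom (the first time it is imported in a session) and bind the result to the name `my_module`. Running a function definition creates the function object in memory, which is all that "importing a function" means. Any other top-level code in the file runs as well. To illustrate the point, do the following: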

**Exercise**

   1. Create a file hello_world.py in "..\scripts".
   2. Define a function hello_world that prints "hello world".
   3. At the top level of the file (outside the function), use the `print()`-function to print something completely different.
   4. Run the following cell (make sure that the directory is on your sys.path)

import hello_world

There are ways to avoid this confusion and we'll cover them in a second.
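
The effect can also be reproduced in a self-contained way. The sketch below writes a throwaway module to a temporary directory and imports it twice; the module name `demo_side_effect` and its contents are invented for this example:

```python
import importlib
import sys
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    # A tiny module with one top-level print and one function definition.
    Path(tmp, "demo_side_effect.py").write_text(
        'print("top-level code runs on import")\n'
        "def greet():\n"
        '    return "hello"\n'
    )
    sys.path.insert(0, tmp)        # make the directory importable
    importlib.invalidate_caches()  # make sure the new file is noticed
    try:
        demo = importlib.import_module("demo_side_effect")   # prints the message
        again = importlib.import_module("demo_side_effect")  # cached, prints nothing
    finally:
        sys.path.remove(tmp)

print(demo.greet())   # hello
print(again is demo)  # True
```

The message appears only once: Python runs a module's code on the first import and reuses the already loaded module afterwards.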

### 7.4.2 Running a program from the command line

The "command line" can be any shell that knows a Python executable that is compatible with the program (Python 2 vs. Python 3). On Windows, this will be the Anaconda prompt. On macOS and Linux you should be able to use the system prompt/Terminal.

In the prompt, type the following to run a Python script:
```
python name_of_script.py
```
which just tells the Python interpreter to run the code in the file. The file has to be in your working directory; otherwise you have to give the path to it, either relative to the working directory or absolute.

**Exercise**

   1. Open a new command prompt (see above).
   2. Run your hello_world-file. 
   3. Observe what happens.

### 7.4.3 if \_\_name__ == "\_\_main__":

You will see this in many scripts and since it can be really confusing, it seems like a good idea to cover that here.

The Python interpreter sets a variable called `__name__` in every source file before executing its code. This is true for notebooks and **.py**-files alike.

**Exercise**

Print the `__name__` of the current notebook:

In [4]:
#your code here


__main__


The name of the current notebook is `"__main__"`. This is because we are actively running this notebook and not importing it from somewhere else. If you import a module instead, its `__name__` is the module name, which by default is the name of the **.py**-file without the file ending.
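A quick sketch of the difference, using two standard-library modules as examples of imported code:

```python
import math
import urllib.parse

# Imported modules carry their own (dotted) module name.
print(math.__name__)          # math
print(urllib.parse.__name__)  # urllib.parse

# The code you are running directly has a different name:
# "__main__" in a notebook or when the file is run as a script.
print(__name__)
```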

**Exercise**

   1. Create a new **.py**-file and add a print statement that prints: `__name__ = {value of __name__}`
   2. Import the module 
   3. Run the module as a program from the command line

As you can see, `__name__` differs between the two cases.

This fact can be leveraged to write modules that can be imported without side effects and can still be run as programs. When writing programs that should be used from the command line, it is good practice to put all code that is supposed to be executed inside an `if __name__ == "__main__":` block, so that it only runs when the file is the main program.
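A minimal sketch of the common layout; the function names `greet` and `main` are arbitrary choices for this example:

```python
def greet(name):
    """Return a greeting. Defining this has no visible side effect."""
    return f"Hello, {name}!"


def main():
    """Everything the program should do when run directly."""
    print(greet("world"))


# True when the file is run as a program, False when it is imported.
if __name__ == "__main__":
    main()
```

Saved as a **.py**-file, this prints the greeting when run with `python`, while importing it only makes `greet` and `main` available.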

**Exercise**

   1. Open a new **.py**-file
   2. Import randint from numpy.random
   3. Define a function that prints 10 random integers between 1 and 100 to the screen.
   4. Run the function, but only if the program is used as main program.
   5. Check the behavior by running the file from the command line and by importing the module in here

### 7.4.4 Command line arguments

You can also pass arguments using the command line approach. E.g. you can set a toggle for verbosity, you can define a subID, whatever you want. For this we will use the standard-library module `argparse`. Let's have a look at the file "argparse_demo.py".
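If you don't have that file at hand, the following sketch shows the basic pattern. It is not the content of "argparse_demo.py"; the argument names `sub_id`, `--verbose` and `--session` are assumptions made up for this example:

```python
import argparse

parser = argparse.ArgumentParser(description="Hypothetical analysis script.")
# Positional argument: required, identified by its position.
parser.add_argument("sub_id", type=int, help="subject ID")
# Optional flag: False unless given on the command line.
parser.add_argument("-v", "--verbose", action="store_true", help="print extra output")
# Optional argument with a default value.
parser.add_argument("--session", type=int, default=1, help="session number")

# In a real script you would call parser.parse_args() without arguments,
# so that it reads the command line. A list is passed here to keep the
# sketch runnable in a notebook.
args = parser.parse_args(["12", "--verbose"])
print(args.sub_id, args.verbose, args.session)  # 12 True 1
```

From the command line, the equivalent call would look like `python my_script.py 12 --verbose`, where `my_script.py` is a placeholder file name.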

**Exercise**

Write a script that computes the nth power of a value. Make the base a positional argument and define a default for the exponent.