# Python for SysAdmins – Modules

**just in case...**

* ... you wonder where you can find this script: it is available on [GitHub](https://github.com/eth-its/Python-for-SysAdmins/tree/main/ws2)
* ... you have a mess with your Python installation: [Python Best Practices – Getting Started](https://gitlab.ethz.ch/vermeul/python-best-practices/-/blob/master/01-Getting_Started.md) might help

## before we begin: a few Jupyter tricks

**1) put a question mark `?` directly after any method or module name** and execute the cell to receive the so-called _docstring_.

In [None]:
import os
os?

It becomes especially handy if you can't remember the parameters that you need to provide:

In [None]:
print?

Even more handy is to hit `shift-TAB` when you are inside the brackets of a function or a method:

In [None]:
print()

**2) use Jupyter’s TAB completion to list all methods**

enter the following cell, move the cursor right after `sys.`, then hit the `TAB` key: the possible methods will appear as a vertical list (the first time, it may take a while)

In [None]:
import sys
sys.

**3) Use your keyboard to navigate (VIM bindings)**

Jupyter has an _insert_ and a _browse_ (or normal) mode, like the famous [vim editor](https://geekflare.com/saving-and-quitting-vim-editor/). Hit the `Esc` key to enter browse mode, then use:

- the `k` and `j` keys to go up and down
- `d d` to delete a cell
- `z` to undo a deletion
- `b` to insert a cell below the current one
- `a` to insert a cell above the current one
- `y`, `m` and `r` to convert a cell into code, markdown or raw text
- Hit `Enter` to enter the insert mode. Hit `Esc` to go back to browse mode.
- Hit `Shift + Enter` to execute a cell

See Help -> Show Keyboard Shortcuts for additional keybindings.

### Exercise 1

- [ ] Move up and down cells, enter and leave them without using your mouse
- [ ] create a new cell below or above the current one, change its type to Markdown and enter a new title or subtitle. Execute the cell to render its content.
- [ ] find out which keys copy a cell and paste it below
- [ ] try stopping and restarting the kernel (the Python interpreter) using the stop/restart symbols

## Everyday modules: The standard library

Python comes with a lot of pre-installed modules (standard Python library) which greatly extend the language.

Visit the [Python Module of the Week](https://pymotw.com/3/) website to get a good overview. [The complete list can be found here](https://docs.python.org/3/library/index.html).

All these modules ship with every Python installation, hence «batteries included». This means no extra installation is necessary to run your code on a different machine.

For dealing with files and directories, we are going to look into a few modules of the standard library:

- [`os`](https://docs.python.org/3/library/os.html) – operating system interactions
- [`sys`](https://docs.python.org/3/library/sys.html) – Python runtime environment manipulation
- [`pathlib`](https://docs.python.org/3/library/pathlib.html) – object-oriented filesystem paths
- [`shutil`](https://docs.python.org/3/library/shutil.html) – high-level file operations
- [`re`](https://docs.python.org/3/library/re.html) – regular expressions
- [`json`](https://docs.python.org/3/library/json.html) – JSON encoder and decoder
- [`csv`](https://docs.python.org/3/library/csv.html) – CSV files, reading and writing
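
As a small taste of how these modules combine, the sketch below writes a JSON file into a temporary directory with `pathlib` and `json` and reads it back. The file name `settings.json` and its contents are made up for illustration; `tempfile` is another standard-library module, used here only so that nothing on your system is touched.

```python
import json
import tempfile
from pathlib import Path

# Work in a throwaway directory that is removed automatically afterwards.
with tempfile.TemporaryDirectory() as tmp:
    config_file = Path(tmp) / "settings.json"  # hypothetical file name

    # json.dumps turns a dict into text, Path.write_text saves it.
    config_file.write_text(json.dumps({"host": "localhost", "port": 8080}))

    # Read the text back and decode it into a dict again.
    settings = json.loads(config_file.read_text())

print(settings["port"])
```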

### Exercise 2

- [ ] choose any of the modules in the standard library, create a new code cell and `import` that module. Execute the cell.
- [ ] Enter the cell again and use the `TAB` completion or `?` to find out what's inside the module

## Additional modules: PyPI and the Python Package Manager `pip`

But there is much more! The [Python Package Index](https://pypi.org) hosts thousands of additional modules which solve almost every everyday problem. Simply use the `pip` command line tool, the Python package manager which ships with Python, to install them.

**Hint**: In Jupyter, simply put an exclamation mark `!` at the beginning of a code cell to execute a command as if you were in a command shell.

To install a new module/library/framework, e.g. `pandas`, a library for data analysis, just issue this command:

In [None]:
!pip install pandas

To upgrade an existing module to the latest version, you do:

In [None]:
!pip install --upgrade pandas

The following command does the same as the first `pip install` above. The `-m` parameter tells the Python interpreter to run the module `pip`, followed by the subcommand `install` and the argument `pandas`:

In [None]:
!python -m pip install pandas

Some bigger frameworks come with optional installable dependencies, like the highly recommended [FastAPI Framework](https://fastapi.tiangolo.com). You can install the framework with the usual

```
pip install fastapi
```

but also with

```
pip install "fastapi[all]"
```

to install a number of additional modules, such as [`pydantic-extra-types`](https://github.com/pydantic/pydantic-extra-types). The quotes keep shells such as zsh from interpreting the square brackets.


### Sneak preview: run your project somewhere else

Before you can run your script on another machine, you need to know which packages you've installed, and which versions you used.
If you distribute your script, you want to put these in some configuration:
* `pyproject.toml` (modern approach)
* `pyproject.toml` + `setup.cfg` (almost modern approach)
* `pyproject.toml` + `setup.py` (conservative approach)
* `requirements.txt` (old school / data science)

More info: https://setuptools.pypa.io/en/latest/userguide/quickstart.html

We will discuss packaging later in WS4.

For now, we'll only learn how to get the list of installed libraries (including their exact versions) and put it into a file.

In [None]:
!pip freeze > requirements.txt

Later, people can install exactly the same modules in their exact versions, like this:

```sh
pip install -r requirements.txt
```
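
`pip freeze` is the usual tool for this job. If you only want to peek at what is installed from inside Python, the standard library's `importlib.metadata` can produce a similar listing. This is a rough sketch, not a replacement for `pip freeze`; the helper name `installed_packages` is made up for this example.

```python
from importlib.metadata import distributions


def installed_packages():
    """Return sorted 'name==version' strings for the installed distributions."""
    lines = set()
    for dist in distributions():
        name = dist.metadata.get("Name")
        if name:  # skip entries without a usable name
            lines.add(f"{name}=={dist.version}")
    return sorted(lines, key=str.lower)


# Show only the first few entries to keep the output short.
for line in installed_packages()[:5]:
    print(line)
```

The output depends on your environment, so it will differ from machine to machine.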

### Exercise 3

* [ ] install a module from the Python Package Index. You can try [this list](https://hackr.io/blog/best-python-libraries) for an inspiring starting point.
* [ ] create a list of the installed modules and look at what's inside it

## How modules are imported

At the beginning of most Python files, you will see a list of `import` statements.

With the `import` statement we tell Python to look for a module and treat that module like a variable.

The **order** in which Python looks for modules is as follows:

1. look in the directory of the script being run (or the current working directory in an interactive session)
2. look in the paths specified by the `PYTHONPATH` environment variable, if this variable exists
3. look in the standard library path (`.../lib/python3.x/`)
4. look in the path where all external modules (e.g. from [pypi.org](https://pypi.org)) are installed (`.../lib/python3.x/site-packages`)

The `sys.path` tells us _exactly_ where the Python interpreter is looking for modules and in which order:

In [None]:
import sys
sys.path
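
To see which location actually wins for a given module, `importlib.util.find_spec` reports where an import would be loaded from. A small sketch using three standard-library modules (the exact paths depend on your installation):

```python
import importlib.util

for name in ("json", "pathlib", "sys"):
    spec = importlib.util.find_spec(name)
    # origin is a file path for regular modules and "built-in" for
    # modules that are compiled into the interpreter itself
    print(f"{name:10} -> {spec.origin}")
```

This is handy when a local file accidentally shadows a module you meant to import from the standard library or from `site-packages`.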

When importing modules, we often don't need the whole module, just some parts from it.

We can achieve this using the `from <module> import <function/submodule>` syntax:

In [None]:
from datetime import date, datetime
print(date.today())
print(datetime.now())

Sometimes, module names are rather long to type, so we give them an **alias**:

In [None]:
from datetime import datetime as dt
dt.now()

### Write your own modules: the infamous `__init__.py` file

In your file browser, inside the folder `my_module/`, you'll find a Python file [my_hello_world.py](my_module/my_hello_world.py).
We can tell Python to refer to that file inside that folder by using the `from <folder> import <module>` syntax, like so:

In [None]:
from my_module import my_hello_world
my_hello_world.say_hello("World!")

If you have many functions in separate `.py` files, importing them one by one can get cumbersome. It is easier to bundle them into a module and present that to the user. That's where the `__init__.py` file becomes important. With this file, you can treat the whole directory `my_module` as if it were a single Python file.
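
The idea can be sketched with a throwaway package built in a temporary folder. All names here (`demo_pkg`, `greetings.py`, `say_hello`) are hypothetical and are not the contents of the workshop's `my_module`; the point is only that `__init__.py` can re-export functions from the files next to it.

```python
import sys
import tempfile
import textwrap
from pathlib import Path

# Build a tiny throwaway package; all names are hypothetical.
tmp = tempfile.mkdtemp()
pkg = Path(tmp) / "demo_pkg"
pkg.mkdir()

# A regular module inside the package.
(pkg / "greetings.py").write_text(textwrap.dedent('''
    def say_hello(name):
        return f"Hello, {name}"
'''))

# __init__.py re-exports the function, so users need not know the file layout.
(pkg / "__init__.py").write_text("from .greetings import say_hello\n")

sys.path.insert(0, tmp)  # make the temporary folder importable
import demo_pkg

print(demo_pkg.say_hello("World!"))  # prints: Hello, World!
```

Without the line in `__init__.py`, the caller would have to write `from demo_pkg.greetings import say_hello` and know which file holds which function.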

### Exercise 4

- [ ] `import my_module` in a cell and use `TAB` to see what's in there
- [ ] edit the file `my_module/__init__.py` and uncomment the lines that are commented out
- [ ] run the `import` again and observe what changed (maybe restart the kernel)
- [ ] directly import the functions of `my_module`, using the `from <module> import <function>` syntax
- [ ] invoke the functions, use `Shift+TAB` to find out which arguments you need to pass

## The `sys` module

This module shows a lot of information about the **Python interpreter** itself and its location:

In [None]:
sys.version

In [None]:
sys.version_info

In [None]:
sys.executable

A typical example of how we can prevent a script from being executed with the **wrong Python interpreter**:

In [None]:
if sys.version_info < (3,9):
    sys.exit('Sorry, Python < 3.9 is no longer maintained')

Add the **locally installed packages** to the module search path (`sys.path`).

Sometimes, we cannot install a package into the standard `.../python3.x/site-packages` folder for lack of privileges. However, the `pip` package manager can install it in user-space:

In [None]:
!pip install pandas --user

The `--user` option will install the package into the `site.USER_SITE` folder:

In [None]:
import site
site.USER_SITE

However, after installing, the following statement might fail:

In [None]:
import pandas  # this might fail if pandas is only installed in user-space

The `sys.path` is an array (list) of paths. We can use the `.append` method on it to add another item, in our case it is the `site.USER_SITE` folder:

In [None]:
import sys
import site
sys.path.append(site.USER_SITE)

Now the import should work (if it hasn't worked before):

In [None]:
import pandas
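
Appending the same path every time the cell runs makes `sys.path` grow with duplicates. A slightly more careful variant is sketched below; the helper name `add_user_site` is made up for this example, while `site.getusersitepackages()` and `site.ENABLE_USER_SITE` are part of the standard `site` module.

```python
import site
import sys
from pathlib import Path


def add_user_site():
    """Append the user site-packages folder to sys.path if it is missing.

    Returns True when the path was added, False otherwise.
    """
    user_site = site.getusersitepackages()
    if user_site in sys.path:
        return False  # already searched, nothing to do
    if not Path(user_site).is_dir():
        return False  # nothing has been installed with --user yet
    sys.path.append(user_site)
    return True


print("user site enabled:", site.ENABLE_USER_SITE)
print("added to sys.path:", add_user_site())
```

If `site.ENABLE_USER_SITE` is already `True` and the folder exists, Python has usually added it on startup and the helper has nothing to do.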

### Exercise 5
- [ ] play around with the `sys.path`. Use the `.pop()` and `.append()` methods
- [ ] make `sys.path` an empty array and look at the list after an `import` statement and `TAB`
- [ ] Restart the kernel of the Jupyter notebook and try again