# Python for SysAdmins – Modules

**just in case...**

* ... you wonder where you can find this script: it is available on [GitHub](https://github.com/eth-its/Python-for-SysAdmins/tree/main/ws2)
* ... you have a mess with your Python installation: [Python Best Practices – Getting Started](https://gitlab.ethz.ch/vermeul/python-best-practices/-/blob/master/01-Getting_Started.md) might help

## Batteries included: The standard library

Python comes with a lot of pre-installed modules ([Python standard library](https://docs.python.org/3/library/index.html)) which greatly extend the language.

Visit the [Python Module of the Week](https://pymotw.com/3/) website to get a useful introduction.

All these modules ship with every Python installation, hence **batteries included**. This means that you often don't need to install any external module to run your code on a different machine.

For dealing with files and directories, we are going to look into a few modules of the [Python standard library](https://docs.python.org/3/library/index.html):

- [`os`](https://docs.python.org/3/library/os.html) – operating system interactions
- [`sys`](https://docs.python.org/3/library/sys.html) – Python runtime
- [`pathlib`](https://docs.python.org/3/library/pathlib.html) – object-oriented filesystem paths
- [`shutil`](https://docs.python.org/3/library/shutil.html) – High-level file operations (shell utilities)
- [`re`](https://docs.python.org/3/library/re.html) – Regular Expressions to extract text
- [`json`](https://docs.python.org/3/library/json.html) – JSON encoder and decoder
- [`csv`](https://docs.python.org/3/library/csv.html) – CSV files, reading and writing

To **use a module** in your code, you need to `import` it. Python searches every path listed in `sys.path` (which includes the entries of the `PYTHONPATH` environment variable, if it is set) until it finds the module:

In [None]:
import sys
sys.path
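If you want to know where a particular module would be loaded from, without actually importing it, the standard library's `importlib.util.find_spec` can tell you. The following is a minimal sketch; the helper name `locate_module` and the module name `no_such_module_xyz` are made up for illustration.

```python
import importlib.util


def locate_module(name):
    """Return the location Python would load `name` from, or None if it is not found."""
    spec = importlib.util.find_spec(name)
    if spec is None:
        return None  # no entry in sys.path provides this module
    # Usually a file path; special modules report 'built-in' or 'frozen' instead.
    return spec.origin


print(locate_module("json"))                # path to the json package of this interpreter
print(locate_module("no_such_module_xyz"))  # None, assuming no such module exists
```

The exact path printed for `json` depends on your installation.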

### Exercise 1

- [ ] choose any of the modules in the standard library, create a new code cell and `import` that module. Execute the cell.
- [ ] Enter the cell again and use the `TAB` completion or read its docstring

## External modules: PyPI and its Python Package Manager `pip`

But there is much more! The [Python Package Index](https://pypi.org) hosts thousands of additional modules that cover most everyday problems. To install them, use the `pip` command-line tool, the Python package manager that ships with Python.

<div class="alert alert-block alert-info">
    
**Note**: In Jupyter, simply put an exclamation mark `!` at the beginning of a code cell to execute a command as if you were in a command shell.

</div>

### Installing and upgrading packages

To **install** a new module, e.g. `pandas`, a library for data analysis, just issue this command:

In [None]:
!pip install pandas

To **upgrade** an existing module to the latest version, you do:

In [None]:
!pip install --upgrade pandas

An **alternative way to invoke** pip: the `-m` flag tells Python to run the module `pip`, followed by the subcommand `install` and the argument `pandas`. This way, the package is installed for exactly the interpreter that `python` refers to:

In [None]:
!python -m pip install pandas
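The same idea can be used from within a Python script: `sys.executable` holds the path of the running interpreter, so `[sys.executable, "-m", "pip", ...]` targets exactly that interpreter. The sketch below only builds the command; the helper name `pip_install_command` is an assumption for illustration, and the actual call is left as a comment so that nothing gets installed by accident.

```python
import sys


def pip_install_command(package, upgrade=False):
    """Build the argument list for `python -m pip install [--upgrade] <package>`."""
    cmd = [sys.executable, "-m", "pip", "install"]
    if upgrade:
        cmd.append("--upgrade")
    cmd.append(package)
    return cmd


print(pip_install_command("pandas", upgrade=True))

# To really run it, pass the list to subprocess:
# import subprocess
# subprocess.run(pip_install_command("pandas"), check=True)
```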

### Installing optional packages

Some bigger frameworks come with optional installable dependencies, like the highly recommended [FastAPI Framework](https://fastapi.tiangolo.com). You can install the framework with the usual

```
pip install fastapi
```

but also with (the quotes keep shells such as zsh from interpreting the square brackets)

```
pip install "fastapi[all]"
```

to install a number of optional modules, such as [`pydantic-extra-types`](https://github.com/pydantic/pydantic-extra-types).
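To find out which optional groups ("extras") an installed package declares, you can read its metadata with the standard library's `importlib.metadata`. This is a minimal sketch; the helper name `declared_extras` is made up, and the result for `fastapi` depends on whether and in which version it is installed on your machine.

```python
from importlib import metadata


def declared_extras(package):
    """Return the extras an installed package declares, or [] if it is not installed."""
    try:
        meta = metadata.metadata(package)
    except metadata.PackageNotFoundError:
        return []
    # Each extra is listed in its own 'Provides-Extra' metadata field.
    return meta.get_all("Provides-Extra") or []


print(declared_extras("fastapi"))
```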


### Show installed packages

To get a readable list of all currently installed packages:

In [None]:
!pip list

The list is nice to read. To create a machine-readable list of modules with their exact versions (i.e. a **lock file**), however, you would typically use this command instead:

In [None]:
!pip freeze > requirements.txt

Later, other users of your code can install exactly the same modules in exactly the same versions, like this:

```sh
pip install -r requirements.txt
```
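A `requirements.txt` created by `pip freeze` consists mostly of `name==version` lines. To illustrate the format, here is a small sketch that reads such lines into a dictionary. The helper name `parse_pins` and the sample content are made up for illustration; lines that are not simple pins (for example editable installs starting with `-e`) are skipped.

```python
def parse_pins(text):
    """Collect the `name==version` lines of a requirements file into a dict."""
    pins = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and whitespace
        if "==" not in line:
            continue  # empty line or not a simple pin
        name, version = line.split("==", 1)
        pins[name.strip()] = version.strip()
    return pins


# Illustrative content, not taken from a real project
sample = """\
# created with: pip freeze > requirements.txt
pandas==2.2.2
numpy==1.26.4
"""

print(parse_pins(sample))  # {'pandas': '2.2.2', 'numpy': '1.26.4'}
```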

In **WS4** we will learn about [setuptools](https://setuptools.pypa.io/en/latest/userguide/quickstart.html), which helps you **package and distribute your own project**.

### Exercise 2

* [ ] install a module from the Python Package Index. You can try [this list](https://hackr.io/blog/best-python-libraries) for an inspiring starting point.
* [ ] create a list of the installed modules and look at what's inside it

## How modules are imported

At the beginning of most Python files, you will see a list of `import` statements.

With the `import` statement, we tell Python to look for a module and bind it to a name, so that we can treat the module like a variable.

Python looks for modules in the following **order**:

1. look in the current path (the directory of the running script, or the current working directory in an interactive session)
2. look in the paths specified by the `PYTHONPATH` environment variable, if this variable exists
3. look in the standard library path (`.../lib/python3.x/`)
4. look in the path where all external modules (e.g. from [pypi.org](https://pypi.org)) are installed (`.../lib/python3.x/site-packages`)


As demonstrated before, `sys.path` tells us _exactly_ where the Python interpreter is looking for modules and in which order:

In [None]:
import sys
sys.path
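The search stops at the first match. The following sketch demonstrates this by writing a tiny module into a temporary directory and putting that directory at the front of `sys.path`. The module name `search_order_demo` is hypothetical and only chosen to be unlikely to clash with anything installed.

```python
import importlib
import sys
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    expected = Path(tmp)
    # Create a throwaway module inside the temporary directory
    (expected / "search_order_demo.py").write_text("WHERE = 'temporary directory'\n")

    sys.path.insert(0, tmp)  # searched before every other entry
    try:
        importlib.invalidate_caches()  # make sure the new file is noticed
        demo = importlib.import_module("search_order_demo")
        found_in = Path(demo.__file__).parent
        print(demo.WHERE)
        print(found_in == expected)  # True: the first sys.path entry won
    finally:
        # Clean up so the demo leaves no traces behind
        sys.path.remove(tmp)
        sys.modules.pop("search_order_demo", None)
```

This is also the reason why a file named, say, `json.py` next to your script can accidentally shadow the standard library module of the same name.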

## Module parts and aliases

When importing modules, we often don't need the whole module, just some parts of it.

We can achieve this using the `from <module> import <function/submodule>` syntax:

In [None]:
from datetime import date, datetime
print(date.today())
print(datetime.now())

Sometimes, module names are rather long to type or **conflict with existing names**, so we give them an **alias**:

In [None]:
import cachecontrol as cctrl  # external package, install it first: pip install cachecontrol
import collections as coll

Likewise, we can assign an alias to parts of a module:

In [None]:
from datetime import datetime as dt
dt.now()
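An alias does not create a copy: it is just another name for the same module or class object. Python keeps every imported module in the dictionary `sys.modules`, and all names refer to that single entry. A short sketch to verify this (the alias `datetime_module` is chosen only for this example):

```python
import sys
import collections
import collections as coll
import datetime as datetime_module
from datetime import datetime as dt

print(coll is collections)                 # True: two names, one module object
print(coll is sys.modules["collections"])  # True: the module is cached in sys.modules
print(dt is datetime_module.datetime)      # True: the alias points to the same class
```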

### How to write your own module

In your file browser, inside the folder `my_module/`, you'll find a Python file [my_hello_world.py](my_module/my_hello_world.py).
We can tell Python to refer to that file inside that folder by using the `from <folder> import <module>` syntax, like so:

In [None]:
from my_module import my_hello_world
my_hello_world.say_hello("World!")

Make sure the folder contains a file named `__init__.py` (it may be empty). It tells Python that this folder should be treated as a **module** (more precisely, a package). It also lets you present only the most important functions to the user while hiding the implementation details.
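To see how `__init__.py` can expose selected functions, the following sketch builds a small package in a temporary directory and imports it. All names here (`demo_pkg`, `greetings.py`, and this `say_hello`) are hypothetical and independent of the `my_module` folder of this workshop.

```python
import importlib
import sys
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    pkg = Path(tmp, "demo_pkg")
    pkg.mkdir()

    # The implementation lives in a submodule ...
    (pkg / "greetings.py").write_text(
        "def say_hello(name):\n"
        "    return f'Hello, {name}'\n"
    )
    # ... and __init__.py re-exports it at package level
    (pkg / "__init__.py").write_text("from .greetings import say_hello\n")

    sys.path.insert(0, tmp)
    try:
        importlib.invalidate_caches()
        demo_pkg = importlib.import_module("demo_pkg")
        greeting = demo_pkg.say_hello("World!")  # no need to mention 'greetings'
        print(greeting)  # Hello, World!
    finally:
        # Remove the demo package from the search path and the module cache
        sys.path.remove(tmp)
        for name in [n for n in sys.modules if n == "demo_pkg" or n.startswith("demo_pkg.")]:
            del sys.modules[name]
```

Users of the package only need to know `demo_pkg.say_hello`; the file layout behind it can change without breaking their code.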

### Exercise 3

- [ ] `import my_module` in a cell and use `TAB` to see what's in there
- [ ] edit the file `my_module/__init__.py` and uncomment the lines that are commented out
- [ ] run the `import` again and observe what changed (maybe restart the kernel)
- [ ] directly import the functions of `my_module`, using the `from <module> import <function>` syntax
- [ ] invoke the functions; use `Shift+TAB` to find out which arguments you need to pass

## The `sys` module

This module provides a lot of information about the **Python interpreter** itself and its location:

In [None]:
sys.version

In [None]:
sys.version_info

In [None]:
sys.executable

A typical example of how we can prevent a script from being executed with the **wrong Python interpreter**:

In [None]:
if sys.version_info < (3,9):
    sys.exit('Sorry, Python < 3.9 is no longer maintained')
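The comparison works because `sys.version_info` behaves like a tuple, and Python compares tuples element by element. If you need the check in several places, it can be wrapped in a small function. This is a sketch; the name `python_is_recent_enough` and its parameters are assumptions for illustration.

```python
import sys


def python_is_recent_enough(minimum=(3, 9), current=None):
    """Return True if `current` (default: the running interpreter) is at least `minimum`."""
    if current is None:
        current = sys.version_info
    # Compare only as many components as `minimum` specifies
    return tuple(current[:len(minimum)]) >= tuple(minimum)


print(python_is_recent_enough((3, 9), current=(3, 12, 0)))  # True
print(python_is_recent_enough((3, 9), current=(3, 8, 10)))  # False

# The individual components are also available by name
print(sys.version_info.major, sys.version_info.minor)
```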

In [None]:
sys.path

## Installing packages in user-space

Sometimes, we cannot install a package into the standard Python package folder (typically named `site-packages`) because we **lack the necessary privileges**. However, we can tell the Python package manager to install the package in user-space instead:

In [None]:
!pip install pandas --user

The `--user` option will install the package into the `site.USER_SITE` folder:

In [None]:
import site
site.USER_SITE

After installing in user-space, the following statement might still fail:

In [None]:
import pandas  # this might fail if pandas is only installed in user-space

To import the package from user-space, we can **manipulate `sys.path`** by appending `site.USER_SITE`:

In [None]:
import sys
import site
sys.path.append(site.USER_SITE)
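Appending the same folder again and again (for example each time a cell is re-run) leaves duplicate entries in `sys.path`. A slightly more careful variant checks first. In this sketch, the helper name `ensure_on_path` is made up; `site.getusersitepackages()` and `site.ENABLE_USER_SITE` are part of the standard `site` module.

```python
import site
import sys


def ensure_on_path(directory):
    """Append `directory` to sys.path unless it is already listed there."""
    if directory in sys.path:
        return False  # nothing to do
    sys.path.append(directory)
    return True


user_site = site.getusersitepackages()  # same folder as site.USER_SITE
print(user_site)
print(site.ENABLE_USER_SITE)   # whether Python adds the user folder by itself
print(user_site in sys.path)   # whether it is on the search path right now

# Only when needed:
# ensure_on_path(user_site)
```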

Now the import should work (if it hasn't worked before):

In [None]:
import pandas

### Exercise 4

- [ ] play around with `sys.path`. Use the `.pop()` and `.append()` methods
- [ ] make `sys.path` an empty list, then type `import ` followed by `TAB` and look at the list of modules that is still offered
- [ ] restart the kernel of the Jupyter notebook and try again