# Packages Setup

## [pip](https://pypi.org/project/pip/) - Python package manager

[Installing](https://pip.pypa.io/en/stable/installation/) pip

> **_Note_**: Ubuntu seems to have a [bug](https://bugs.launchpad.net/ubuntu/+source/python3.4/+bug/1290847) here, so just use apt: `apt install python3-pip`

## [venv](https://docs.python.org/3/library/venv.html) - Virtual environments

A Python module that isolates all your dependencies in a virtual environment.

**Installing venv**

``` bash
$ apt install python3-venv
```

**Creating and using a virtual env:**

> ``` bash
> # Check which python is being used
> $ which python3
> #> /usr/bin/python3
> 
> $ python3 -m venv path/to/virtual/environment/dir
> $ source path/to/virtual/environment/dir/bin/activate
> 
> # Re-check which python is being used
> $ which python3
> #> path/to/virtual/environment/dir/bin/python3
> ```
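
The same check can be made from inside Python. A minimal sketch, assuming Python 3.3 or later (where `sys.base_prefix` exists): inside an environment created by `venv`, `sys.prefix` points at the environment directory while `sys.base_prefix` still points at the base installation. The helper name `in_virtual_env` is made up for this illustration.

```python
import sys


def in_virtual_env() -> bool:
    """Return True when the interpreter runs inside a venv-style environment."""
    # In a venv the two prefixes differ; outside one they are equal.
    return sys.prefix != sys.base_prefix


print(f"prefix      : {sys.prefix}")
print(f"base prefix : {sys.base_prefix}")
print(f"in venv     : {in_virtual_env()}")
```

Run it before and after `source .../bin/activate` to see the last line flip.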

**Installing Packages in a virtual env**

- To install Python packages in the virtual env, all we need to do is run pip.

>    ``` bash
>    $ pip3 install <package>
>    ```

- pip allows the usage of a file to simplify the tracking and installation of all requirements/dependencies.

>    ``` bash
>    $ pip3 install -r path/to/requirements/file
>    ```

> Requirements file structure example:
> ```bash
>   $ cat path/to/requirements/file
>   # <package>==<version>
>   #> ruamel.yaml==0.16.10
> ```
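
To make the `<package>==<version>` structure concrete, here is a rough sketch that reads such pins into a dictionary. It is an illustration only: `parse_pins` is a hypothetical helper, it handles nothing beyond exact `==` pins and `#` comments, and pip itself accepts a much richer syntax.

```python
def parse_pins(text):
    """Map package names to pinned versions for lines shaped like name==version."""
    pins = {}
    for raw_line in text.splitlines():
        # Drop trailing comments and surrounding whitespace.
        line = raw_line.split("#", 1)[0].strip()
        if not line:
            continue
        name, separator, version = line.partition("==")
        if not separator:
            raise ValueError(f"not an exact pin: {line!r}")
        pins[name.strip()] = version.strip()
    return pins


example = """\
# <package>==<version>
ruamel.yaml==0.16.10
"""
print(parse_pins(example))
```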

## [Import](https://docs.python.org/3/reference/import.html)

The `import` statement is what allows us to load and use whatever packages, modules or functions we might require. For a more in-depth understanding, check [Python scopes](https://realpython.com/python-scope-legb-rule/).

In [1]:
# Importing a python module
import os
print(os)
print(os.walk)

<module 'os' from '/usr/lib/python3.8/os.py'>
<function walk at 0x7f84de9451f0>


In [2]:
# Importing a specific function from a module
from os import walk
print(walk)

<function walk at 0x7f84de9451f0>
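
Imports can also bind a module or function to a different local name with `as`. A small sketch (the alias names `osp` and `walk_tree` are arbitrary choices for this example) showing that an alias is just another name for the same object:

```python
import os
import os.path as osp              # alias for a submodule
from os import walk as walk_tree   # alias for a single function

# The alias behaves exactly like the original name.
print(osp.join("a", "b"))

# Both names point at the very same objects.
print(walk_tree is os.walk)
print(osp is os.path)
```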


# The Usual Suspects

The most commonly used Python packages.

## [OS](https://docs.python.org/3/library/os.html) - operating system interfaces

The **os** module provides a simple way to interact with the operating system.

It simplifies daily operations such as fetching environment variables and handling paths and files, among other things.
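
As a quick sketch of the environment-variable and path handling mentioned above: the variable names `PYCADEMY_DEMO_MODE` and `PYCADEMY_DEMO_UNSET` and the `configs/.../settings.ini` path are invented for this example and are not read by any real tool.

```python
import os

# Set one hypothetical variable and make sure the other one is absent.
os.environ["PYCADEMY_DEMO_MODE"] = "dry-run"
os.environ.pop("PYCADEMY_DEMO_UNSET", None)

# .get() returns the fallback instead of raising KeyError when a variable is missing.
mode = os.environ.get("PYCADEMY_DEMO_MODE", "live")
missing = os.environ.get("PYCADEMY_DEMO_UNSET", "fallback")

# Build a path with the right separator for the current OS, then take it apart again.
config_path = os.path.join("configs", mode, "settings.ini")
stem, extension = os.path.splitext(os.path.basename(config_path))

print(f"mode      : {mode}")
print(f"missing   : {missing}")
print(f"path      : {config_path}")
print(f"stem, ext : {stem}, {extension}")
```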

In [3]:
import os
print(dir(os))

['CLD_CONTINUED', 'CLD_DUMPED', 'CLD_EXITED', 'CLD_TRAPPED', 'DirEntry', 'EX_CANTCREAT', 'EX_CONFIG', 'EX_DATAERR', 'EX_IOERR', 'EX_NOHOST', 'EX_NOINPUT', 'EX_NOPERM', 'EX_NOUSER', 'EX_OK', 'EX_OSERR', 'EX_OSFILE', 'EX_PROTOCOL', 'EX_SOFTWARE', 'EX_TEMPFAIL', 'EX_UNAVAILABLE', 'EX_USAGE', 'F_LOCK', 'F_OK', 'F_TEST', 'F_TLOCK', 'F_ULOCK', 'GRND_NONBLOCK', 'GRND_RANDOM', 'MFD_ALLOW_SEALING', 'MFD_CLOEXEC', 'MFD_HUGETLB', 'MFD_HUGE_16GB', 'MFD_HUGE_16MB', 'MFD_HUGE_1GB', 'MFD_HUGE_1MB', 'MFD_HUGE_256MB', 'MFD_HUGE_2GB', 'MFD_HUGE_2MB', 'MFD_HUGE_32MB', 'MFD_HUGE_512KB', 'MFD_HUGE_512MB', 'MFD_HUGE_64KB', 'MFD_HUGE_8MB', 'MFD_HUGE_MASK', 'MFD_HUGE_SHIFT', 'MutableMapping', 'NGROUPS_MAX', 'O_ACCMODE', 'O_APPEND', 'O_ASYNC', 'O_CLOEXEC', 'O_CREAT', 'O_DIRECT', 'O_DIRECTORY', 'O_DSYNC', 'O_EXCL', 'O_LARGEFILE', 'O_NDELAY', 'O_NOATIME', 'O_NOCTTY', 'O_NOFOLLOW', 'O_NONBLOCK', 'O_PATH', 'O_RDONLY', 'O_RDWR', 'O_RSYNC', 'O_SYNC', 'O_TMPFILE', 'O_TRUNC', 'O_WRONLY', 'POSIX_FADV_DONTNEED', 'POSIX_

In [4]:
print(f'OS name: {os.name}')
current_dir_path = os.getcwd()
print(f'OS current dir: {current_dir_path}')
print(f'OS current dir content: {os.listdir(current_dir_path)}')
print(f'OS current dir content: {os.listdir()}')

OS name: posix
OS current dir: /home/ctw01000/WS/PyCademy/day_2
OS current dir content: ['Packages.ipynb', 'scripts', 'README.md', '.ipynb_checkpoints']
OS current dir content: ['Packages.ipynb', 'scripts', 'README.md', '.ipynb_checkpoints']


In [5]:
print(f'path          : {current_dir_path}')
print(f'relative path : {os.path.relpath(current_dir_path)}')
print(f'basename      : {os.path.basename(current_dir_path)}')
print(f'dirname path  : {os.path.dirname(current_dir_path)}')
print('-'*100)

for root, dirs, files in os.walk(current_dir_path):
    print(f'root path   : {root}')
    print(f'directories : {dirs}')
    print(f'files       : {files}')
    for file in files:
        print(f'files path   : {os.path.join(root, file)}')


path          : /home/ctw01000/WS/PyCademy/day_2
relative path : .
basename      : day_2
dirname path  : /home/ctw01000/WS/PyCademy
----------------------------------------------------------------------------------------------------
root path   : /home/ctw01000/WS/PyCademy/day_2
directories : ['scripts', '.ipynb_checkpoints']
files       : ['Packages.ipynb', 'README.md']
files path   : /home/ctw01000/WS/PyCademy/day_2/Packages.ipynb
files path   : /home/ctw01000/WS/PyCademy/day_2/README.md
root path   : /home/ctw01000/WS/PyCademy/day_2/scripts
directories : []
files       : ['02_logging.py', '01_argparse.py']
files path   : /home/ctw01000/WS/PyCademy/day_2/scripts/02_logging.py
files path   : /home/ctw01000/WS/PyCademy/day_2/scripts/01_argparse.py
root path   : /home/ctw01000/WS/PyCademy/day_2/.ipynb_checkpoints
directories : []
files       : []


## [argparse](https://docs.python.org/3/library/argparse.html) - Argument parser

An easy way to handle script arguments.

Some of the most useful features provided are:

**Parser**: 
- Automatically parses the script's input arguments and returns them in a neat object, which makes them easier (and cleaner) to use.

Argument **validation**: 
- Required: Automatically validates if the script received all the required arguments.
- Type Checking: validate if an argument is of the expected type.

**Help** text:
- You don't have to worry about writing a help flag to display what the script does and what arguments it has available.
- Description: Add the script description to be displayed in the help text.

> **_Note_**: Also check: [01_argparse.py](scripts/01_argparse.py)
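
To see the validation at work, the sketch below feeds good and bad argument lists to a parser. When validation fails, argparse prints a usage message to stderr and raises `SystemExit`, which is caught here so the example can keep going. The program name `validation_sketch`, the arguments and the `try_parse` helper are all made up for this illustration.

```python
import argparse


def build_parser():
    parser = argparse.ArgumentParser(prog="validation_sketch")
    parser.add_argument("count", type=int, help="How many times to run.")
    parser.add_argument("-n", "--name", required=True, help="A mandatory name.")
    return parser


def try_parse(argv):
    """Return the parsed namespace, or None when argparse rejects the input."""
    try:
        return build_parser().parse_args(argv)
    except SystemExit:
        # argparse already printed the reason to stderr before exiting.
        return None


ok = try_parse(["3", "--name", "demo"])
bad_type = try_parse(["three", "--name", "demo"])  # "three" is not an int
missing = try_parse(["3"])                         # required --name is absent

print(ok)
print(bad_type, missing)
```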

In [6]:
import argparse

# Most basic use case, just give a name and description to your script
parser = argparse.ArgumentParser(
    prog = "my_script",
    description="My script does something.",
)

# equivalent to running: python3 my_script.py --help
parser.print_help()
print('-'*100)
print(parser)

usage: my_script [-h]

My script does something.

optional arguments:
  -h, --help  show this help message and exit
----------------------------------------------------------------------------------------------------
ArgumentParser(prog='my_script', usage=None, description='My script does something.', formatter_class=<class 'argparse.HelpFormatter'>, conflict_handler='error', add_help=True)


In [7]:
## Adding arguments
parser = argparse.ArgumentParser(prog = "argument_snippet")

# Positional arguments
parser.add_argument("input_1", type=int)
parser.add_argument("input_2", type=str)

# flag arguments
# string
parser.add_argument(
    "-s", "--string",
    type=str,
    help="A string delivered by flag.",
    required=True
)

# list
parser.add_argument(
    "-l", "--list",
    nargs=3,
    dest="list3",
    help="A list argument of 3 elements.",
)

# Boolean
parser.add_argument(
    "-d", "--debug",
    action="store_true",
    help="A simple debug flag."
)

parser.print_help()

args = parser.parse_args('1 -d 2 -l 3 2 1 -s asd'.split())
print('-'*100)
print(args)
print('-'*100)
print(f'Debug : {args.debug}')
print(f'list  : {args.list3}')
print(f'String: {args.string}')

usage: argument_snippet [-h] -s STRING [-l LIST3 LIST3 LIST3] [-d]
                        input_1 input_2

positional arguments:
  input_1
  input_2

optional arguments:
  -h, --help            show this help message and exit
  -s STRING, --string STRING
                        A string delivered by flag.
  -l LIST3 LIST3 LIST3, --list LIST3 LIST3 LIST3
                        A list argument of 3 elements.
  -d, --debug           A simple debug flag.
----------------------------------------------------------------------------------------------------
Namespace(debug=True, input_1=1, input_2='2', list3=['3', '2', '1'], string='asd')
----------------------------------------------------------------------------------------------------
Debug : True
list  : ['3', '2', '1']
String: asd


In [8]:
# Mutually exclusive flags
parser = argparse.ArgumentParser(prog = "exclusive_argument_snippet")

group = parser.add_mutually_exclusive_group()

# Add arguments to the group
group.add_argument('--file', action='store_true', help="Use a file path.")
# store_true keeps the default at False, so only the flag that was passed ends up True
group.add_argument('--dir', action='store_true', help="Use a dir path.")

parser.print_help()
args = parser.parse_args(["--file"])
print(args)

usage: exclusive_argument_snippet [-h] [--file | --dir]

optional arguments:
  -h, --help  show this help message and exit
  --file      Use a file path.
  --dir       Use a dir path.
Namespace(dir=False, file=True)
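
The point of the group is that passing both flags together is rejected. A minimal sketch of that behaviour, assuming both flags use `store_true` (the `exclusive_sketch` program name and the `parse_or_none` helper are hypothetical):

```python
import argparse

parser = argparse.ArgumentParser(prog="exclusive_sketch")
group = parser.add_mutually_exclusive_group()
group.add_argument("--file", action="store_true", help="Use a file path.")
group.add_argument("--dir", action="store_true", help="Use a dir path.")


def parse_or_none(argv):
    """Return the namespace, or None when argparse exits with an error."""
    try:
        return parser.parse_args(argv)
    except SystemExit:
        return None


only_file = parse_or_none(["--file"])
both = parse_or_none(["--file", "--dir"])  # rejected: the flags exclude each other

print(only_file)
print(both)
```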


## [Logging](https://docs.python.org/3/library/logging.html) - Well.. logging stuff

Forget about print.

As the name implies, logging provides a simple and clean way to log what happens in your scripts. Logs are essential when debugging what went wrong with a script, making this a must-have tool.

Some of the most useful features it provides are:

- Customization of the printed messages format.
- Objects can easily "sign" their prints.
- Adds timestamps to printed lines.
- Easily set different logging levels for what's happening at certain points of the script.
- Simultaneously write to the terminal and multiple logfiles, each with their own configuration.

In [9]:
import logging

logging.basicConfig(level=logging.DEBUG)

logging.debug("Something is happening!")
logging.info("Something happened!")
logging.warning("Found something weird!")
logging.error("Script failed!")
logging.critical("Script failed harder!!!")

DEBUG:root:Something is happening!
INFO:root:Something happened!
WARNING:root:Found something weird!
ERROR:root:Script failed!
CRITICAL:root:Script failed harder!!!


In [10]:
# A more modular approach
my_logger = logging.getLogger("my logger")
my_logger.setLevel(logging.WARNING)

my_logger.debug("Something is happening!")
my_logger.info("Something happened!")
my_logger.warning("Found something weird!")
my_logger.error("Script failed!")
my_logger.critical("Script failed harder!!!")

WARNING:my logger:Found something weird!
ERROR:my logger:Script failed!
CRITICAL:my logger:Script failed harder!!!


The logging module provides four kinds of objects:

- **[Loggers](https://docs.python.org/3/howto/logging.html#loggers)**: Expose the interface that application code directly uses.

- **[Handlers](https://docs.python.org/3/howto/logging.html#handlers)**: Send the log records (created by loggers) to the appropriate destination (e.g. console, file, buffer, server, ...).

- **[Filters](https://docs.python.org/3/library/logging.html#filter)**: Provide a finer grained facility for determining which log records to output.

- **[Formatters](https://docs.python.org/3/howto/logging.html#formatters)**: Specify the layout of log records in the final output.

> **Note**: Check [02_logging.py](scripts/02_logging.py) for a more in-depth example.
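
A compact sketch of how the four pieces fit together. It writes to an in-memory buffer instead of a file so it leaves nothing behind; the logger name `sketch.pipeline`, the `NoSecretsFilter` class and the message format are choices made for this example, not something the module prescribes.

```python
import io
import logging


class NoSecretsFilter(logging.Filter):
    """Filter: drop any record whose message mentions the word 'secret'."""

    def filter(self, record):
        return "secret" not in record.getMessage()


# Handler: decides where records go (here an in-memory text buffer).
buffer = io.StringIO()
handler = logging.StreamHandler(buffer)
handler.setLevel(logging.INFO)
# Formatter: decides how each record is laid out.
handler.setFormatter(logging.Formatter("%(levelname)s|%(name)s|%(message)s"))
handler.addFilter(NoSecretsFilter())

# Logger: the object application code calls.
logger = logging.getLogger("sketch.pipeline")
logger.setLevel(logging.DEBUG)
logger.propagate = False  # keep records away from the root logger's handlers
logger.addHandler(handler)

logger.debug("below the handler level, so not written")
logger.info("step finished")
logger.warning("secret token leaked")  # removed by the filter
logger.error("step failed")

print(buffer.getvalue())
```

Swapping `logging.StreamHandler(buffer)` for `logging.FileHandler(...)` would send the same records to a logfile, and a logger can carry several handlers at once.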

## [JSON](https://docs.python.org/3/library/json.html) - JSON handling

[JSON](https://en.wikipedia.org/wiki/JSON) (JavaScript Object Notation) is an open standard file format and data interchange format that uses human-readable text to store and transmit data objects consisting of attribute–value pairs and arrays (or other serializable values). It is a common data format with diverse uses in electronic data interchange, including that of web applications with servers. 

Example:

```json
{
  "firstName": "John",
  "lastName": "Smith",
  "isAlive": true,
  "age": 27,
  "address": {
    "streetAddress": "21 2nd Street",
    "city": "New York",
    "state": "NY",
    "postalCode": "10021-3100"
  },
  "phoneNumbers": [
    {
      "type": "home",
      "number": "212 555-1234"
    },
    {
      "type": "office",
      "number": "646 555-4567"
    }
  ],
  "children": [
      "Catherine",
      "Thomas",
      "Trevor"
  ],
  "spouse": null
}
```

The process of encoding JSON is usually called **serialization**. Naturally, **deserialization** is the reciprocal process of decoding data that has been stored or delivered in the JSON standard.

### Serializing JSON

What happens after a computer processes lots of information? It needs to take a data dump. Accordingly, the `json` library exposes the `dump()` function for writing data to files. There is also a `dumps()` function (pronounced “dump-s”) for writing to a Python string.

Simple Python objects are translated to JSON according to a fairly intuitive conversion.

| Python      | JSON   |
|-------------|--------|
| dict        | object |
| list, tuple | array  |
| str         | string |
| int, float  | number |
| True        | true   |
| False       | false  |
| None        | null   |
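
The table can be checked directly with `dumps()`. A short sketch using a made-up `record` dictionary that contains one value of each kind:

```python
import json

record = {
    "point": (1, 2),      # tuple -> array
    "tags": ["a", "b"],   # list  -> array
    "ratio": 0.5,         # float -> number
    "active": True,       # True  -> true
    "parent": None,       # None  -> null
}

encoded = json.dumps(record)
print(encoded)
```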

### Deserializing JSON

In the `json` library, you’ll find `load()` and `loads()` for turning JSON encoded data into Python objects.

| JSON          | Python |
|---------------|--------|
| object        | dict   |
| array         | list   |
| string        | str    |
| number (int)  | int    |
| number (real) | float  |
| true          | True   |
| false         | False  |
| null          | None   |
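
The reverse direction, sketched with `loads()` on an invented JSON string. Printing the type next to each value shows the mapping from the table; note that arrays always come back as lists, so a tuple that was serialized does not survive a round trip as a tuple.

```python
import json

text = '{"count": 3, "ratio": 0.5, "items": [1, 2], "ok": false, "owner": null}'
decoded = json.loads(text)

# Show each decoded value together with the Python type it was given.
for key, value in decoded.items():
    print(f"{key:<6}: {value!r} ({type(value).__name__})")

# A tuple goes out as an array and comes back as a list.
round_trip = json.loads(json.dumps((1, 2)))
print(round_trip)
```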


In [10]:
import json

print(json.dumps({'4': 5, '6': 7}, sort_keys=True, indent=4))

# Serializing JSON
data = {
    "president": {
        "name": "Zaphod Beeblebrox",
        "species": "Betelgeusian"
    }
}

with open("data_file.json", "w") as write_file:
    json.dump(data, write_file)

json_string = json.dumps(data)
print(json_string)

# Deserializing JSON
with open("data_file.json", "r") as read_file:
    data = json.load(read_file)

# handling JSON dict
data['president']
data['president']['name']


{
    "4": 5,
    "6": 7
}
{"president": {"name": "Zaphod Beeblebrox", "species": "Betelgeusian"}}


'Zaphod Beeblebrox'

## [YAML](https://pypi.org/project/PyYAML/) - YAML handling

YAML is a data serialization format designed for human readability and interaction with scripting languages.
Python has no YAML support in its standard library, so handling YAML requires a third-party library such as PyYAML, which is used below.

In [22]:
import yaml

email_message = """\
message:
  date: 2022-01-16 12:46:17Z
  from: john.doe@domain.com
  to:
    - bobby@domain.com
    - molly@domain.com
"""

# Deserializing YAML
# Calling safe_load() is currently the recommended way of handling content received from untrusted sources, which could contain malicious code.
print(yaml.safe_load(email_message))

data = {"name": "John"}

# Serializing
with open("email_message.yaml", mode="wt", encoding="utf-8") as file:
    yaml.dump(data, file)

# Deserializing
with open("email_message.yaml", "r") as read_file:
    message = yaml.safe_load(read_file)

print(message)
print(message['name'])

{'message': {'date': datetime.datetime(2022, 1, 16, 12, 46, 17, tzinfo=datetime.timezone.utc), 'from': 'john.doe@domain.com', 'to': ['bobby@domain.com', 'molly@domain.com']}}
{'name': 'John'}
John
