# Packages

The most _commonly_ used Python packages.

## [OS](https://docs.python.org/3/library/os.html) - operating system interfaces

The **os** module provides a simple way to interact with the operating system.

It simplifies everyday tasks such as reading environment variables and handling paths and files.
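
As a minimal sketch of the environment-variable and path handling mentioned above (the variable name `PYCADEMY_HOME` and the file names are hypothetical, chosen only for illustration):

```python
import os

# Read an environment variable, falling back to a default if it is unset.
# PYCADEMY_HOME is a made-up name used only for this illustration.
home = os.environ.get("PYCADEMY_HOME", os.getcwd())

# Build a path in a platform-independent way and take it apart again.
config_path = os.path.join(home, "config", "settings.ini")
directory, filename = os.path.split(config_path)
stem, extension = os.path.splitext(filename)

print(f"config path: {config_path}")
print(f"directory  : {directory}")
print(f"file name  : {stem} (extension {extension})")
print(f"exists     : {os.path.exists(config_path)}")
```

Using `os.environ.get()` with a default avoids the `KeyError` that `os.environ["NAME"]` raises when the variable is not set.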

In [None]:
import os
print(dir(os))

In [None]:
print(f'OS name: {os.name}')
current_dir_path = os.getcwd()
print(f'OS current dir: {current_dir_path}')
print(f'OS current dir content: {os.listdir(current_dir_path)}')
print(f'OS current dir content: {os.listdir()}')

OS name: posix
OS current dir: /home/ctw01000/WS/PyCademy/day_2
OS current dir content: ['Packages.ipynb', 'scripts', 'README.md', '.ipynb_checkpoints']
OS current dir content: ['Packages.ipynb', 'scripts', 'README.md', '.ipynb_checkpoints']


In [None]:
print(f'path          : {current_dir_path}')
print(f'relative path : {os.path.relpath(current_dir_path)}')
print(f'basename      : {os.path.basename(current_dir_path)}')
print(f'dirname path  : {os.path.dirname(current_dir_path)}')
print('-'*100)

for root, dirs, files in os.walk(current_dir_path):
    print(f'root path   : {root}')
    print(f'directories : {dirs}')
    print(f'files       : {files}')
    for file in files:
        print(f'files path   : {os.path.join(root, file)}')


path          : /home/ctw01000/WS/PyCademy/day_2
relative path : .
basename      : day_2
dirname path  : /home/ctw01000/WS/PyCademy
----------------------------------------------------------------------------------------------------
root path   : /home/ctw01000/WS/PyCademy/day_2
directories : ['scripts', '.ipynb_checkpoints']
files       : ['Packages.ipynb', 'README.md']
files path   : /home/ctw01000/WS/PyCademy/day_2/Packages.ipynb
files path   : /home/ctw01000/WS/PyCademy/day_2/README.md
root path   : /home/ctw01000/WS/PyCademy/day_2/scripts
directories : []
files       : ['02_logging.py', '01_argparse.py']
files path   : /home/ctw01000/WS/PyCademy/day_2/scripts/02_logging.py
files path   : /home/ctw01000/WS/PyCademy/day_2/scripts/01_argparse.py
root path   : /home/ctw01000/WS/PyCademy/day_2/.ipynb_checkpoints
directories : []
files       : []


## [argparse](https://docs.python.org/3/library/argparse.html) - Argument parser

An easy way to handle script arguments.

Some of the most useful features provided are:

**Parser**: 
- Automatically parses the script's input arguments and returns them in a neat object, which makes them easier (and cleaner) to use.

Argument **validation**: 
- Required: Automatically validates if the script received all the required arguments.
- Type checking: Validates that an argument is of the expected type.

**Help** text:
- You don't have to worry about writing a help flag to display what the script does and what arguments it accepts.
- Description: Add the script description to be displayed in the help text.

> **_Note_**: Also check: [01_argparse.py](scripts/01_argparse.py)

In [3]:
import argparse

# Most basic use case, just give a name and description to your script
parser = argparse.ArgumentParser(
    prog = "my_script",
    description="My script does something.",
)

# equivalent to running: python3 my_script.py --help
parser.print_help()
print('-'*100)
print(parser)

usage: my_script [-h]

My script does something.

optional arguments:
  -h, --help  show this help message and exit
----------------------------------------------------------------------------------------------------
ArgumentParser(prog='my_script', usage=None, description='My script does something.', formatter_class=<class 'argparse.HelpFormatter'>, conflict_handler='error', add_help=True)


In [4]:
## Adding arguments
parser = argparse.ArgumentParser(prog = "argument_snippet")

# Positional arguments
parser.add_argument("input_1", type=int)
parser.add_argument("input_2", type=str)

# flag arguments
# string
parser.add_argument(
    "-s", "--string",
    type=str,
    help="A string delivered by flag.",
    required=True
)

# list
parser.add_argument(
    "-l", "--list",
    nargs=3,
    dest="list3",
    help="A list argument of 3 elements.",
)

# Boolean
parser.add_argument(
    "-d", "--debug",
    action="store_true",
    help="A simple debug flag."
)

parser.print_help()

args = parser.parse_args('1 -d 2 -l 3 2 1 -s asd'.split())
print('-'*100)
print(args)
print('-'*100)
print(f'Debug : {args.debug}')
print(f'list  : {args.list3}')
print(f'String: {args.string}')

usage: argument_snippet [-h] -s STRING [-l LIST3 LIST3 LIST3] [-d]
                        input_1 input_2

positional arguments:
  input_1
  input_2

optional arguments:
  -h, --help            show this help message and exit
  -s STRING, --string STRING
                        A string delivered by flag.
  -l LIST3 LIST3 LIST3, --list LIST3 LIST3 LIST3
                        A list argument of 3 elements.
  -d, --debug           A simple debug flag.
----------------------------------------------------------------------------------------------------
Namespace(debug=True, input_1=1, input_2='2', list3=['3', '2', '1'], string='asd')
----------------------------------------------------------------------------------------------------
Debug : True
list  : ['3', '2', '1']
String: asd


In [None]:
# Mutually exclusive flags
parser = argparse.ArgumentParser(prog = "exclusive_argument_snippet")

group = parser.add_mutually_exclusive_group()

# Add arguments to the group
group.add_argument('--file', action='store_true', help="Use a file path.")
group.add_argument('--dir', action='store_true', help="Use a dir path.")

parser.print_help()
args = parser.parse_args(["--file"])
print(args)

usage: exclusive_argument_snippet [-h] [--file | --dir]

optional arguments:
  -h, --help  show this help message and exit
  --file      Use a file path.
  --dir       Use a dir path.
Namespace(dir=False, file=True)
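
To see the argument validation described earlier in action, here is a small sketch of what happens when the input is rejected. The parser name `validation_snippet` and the helper `try_parse` are made up for this illustration. When validation fails, argparse prints a usage message to stderr and exits the script, which shows up in Python as a `SystemExit` exception.

```python
import argparse

parser = argparse.ArgumentParser(prog="validation_snippet")
parser.add_argument("count", type=int, help="A positional integer.")

group = parser.add_mutually_exclusive_group()
group.add_argument("--file", action="store_true")
group.add_argument("--dir", action="store_true")


def try_parse(argv):
    """Return the parsed namespace, or None if argparse rejected argv."""
    try:
        return parser.parse_args(argv)
    except SystemExit:
        # argparse has already printed the usage and error message to stderr.
        return None


ok = try_parse(["3", "--file"])                 # valid input
bad_type = try_parse(["three"])                 # "three" is not an int
conflict = try_parse(["3", "--file", "--dir"])  # both exclusive flags given

print(ok)
print(bad_type)
print(conflict)
```

In a real script you would normally let the `SystemExit` propagate, so the user sees the error and the script stops.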


## [Logging](https://docs.python.org/3/library/logging.html) - Well.. logging stuff

Forget about print.

As the name implies, logging provides a simple and clean way to record what happens in your scripts. Logs are essential when debugging what went wrong, making this a must-have tool.

Some of the most useful features it provides are:

- Customization of the printed messages format.
- Objects can easily "sign" their prints.
- Adds timestamps to printed lines.
- Easily set different logging levels for what's happening at certain points of the script.
- Simultaneously write to the terminal and multiple logfiles, each with their own configuration.

In [None]:
import logging

logging.basicConfig(level=logging.DEBUG)

logging.debug("Something is happening!")
logging.info("Something happened!")
logging.warning("Found something weird!")
logging.error("Script failed!")
logging.critical("Script failed harder!!!")

DEBUG:root:Something is happening!
INFO:root:Something happened!
WARNING:root:Found something weird!
ERROR:root:Script failed!
CRITICAL:root:Script failed harder!!!


In [None]:
# A more modular approach
my_logger = logging.getLogger("my logger")
my_logger.setLevel(logging.WARNING)

my_logger.debug("Something is happening!")
my_logger.info("Something happened!")
my_logger.warning("Found something weird!")
my_logger.error("Script failed!")
my_logger.critical("Script failed harder!!!")

WARNING:my logger:Found something weird!
ERROR:my logger:Script failed!
CRITICAL:my logger:Script failed harder!!!


The logging module provides four kinds of objects:

- **[Loggers](https://docs.python.org/3/howto/logging.html#loggers)**: Expose the interface that application code directly uses.

- **[Handlers](https://docs.python.org/3/howto/logging.html#handlers)**: Send the log records (created by loggers) to the appropriate destination (e.g. console, file, buffer, server).

- **[Filters](https://docs.python.org/3/library/logging.html#filter)**: Provide a finer grained facility for determining which log records to output.

- **[Formatters](https://docs.python.org/3/howto/logging.html#formatters)**: Specify the layout of log records in the final output.

> **Note**: Check [02_logging.py](scripts/02_logging.py) for a more in-depth example.
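
A minimal sketch of how loggers, handlers and formatters fit together. The logger name `demo` and the format string are arbitrary choices for this illustration, and an in-memory buffer stands in for a real destination such as a log file.

```python
import io
import logging

# Logger: the object application code calls. "demo" is a placeholder name.
logger = logging.getLogger("demo")
logger.setLevel(logging.DEBUG)
logger.propagate = False  # keep records away from the root logger's handlers

# Handler: sends records to a destination (an in-memory buffer here;
# logging.FileHandler("app.log") would write to a file instead).
buffer = io.StringIO()
handler = logging.StreamHandler(buffer)
handler.setLevel(logging.WARNING)  # this handler ignores DEBUG and INFO

# Formatter: controls the layout of each output line.
handler.setFormatter(logging.Formatter("%(name)s | %(levelname)s | %(message)s"))
logger.addHandler(handler)

logger.info("Dropped by the handler level.")
logger.warning("Found something weird!")
logger.error("Script failed!")

print(buffer.getvalue())
```

Adding `%(asctime)s` to the format string would prefix each line with a timestamp. Several handlers, each with its own level and formatter, can be attached to the same logger.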

## [JSON](https://docs.python.org/3/library/json.html) - JSON handling

[JSON](https://en.wikipedia.org/wiki/JSON) (JavaScript Object Notation) is an open standard file format and data interchange format that uses human-readable text to store and transmit data objects consisting of attribute–value pairs and arrays (or other serializable values). It is a common data format with diverse uses in electronic data interchange, including that of web applications with servers. 

Example:

```json
{
  "firstName": "John",
  "lastName": "Smith",
  "isAlive": true,
  "age": 27,
  "address": {
    "streetAddress": "21 2nd Street",
    "city": "New York",
    "state": "NY",
    "postalCode": "10021-3100"
  },
  "phoneNumbers": [
    {
      "type": "home",
      "number": "212 555-1234"
    },
    {
      "type": "office",
      "number": "646 555-4567"
    }
  ],
  "children": [
      "Catherine",
      "Thomas",
      "Trevor"
  ],
  "spouse": null
}
```

The process of encoding JSON is usually called **serialization**. Naturally, **deserialization** is the reciprocal process of decoding data that has been stored or delivered in the JSON standard.

### Serializing JSON

What happens after a computer processes lots of information? It needs to take a data dump. Accordingly, the `json` library exposes the `dump()` method for writing data to files. There is also a `dumps()` method (pronounced as “dump-s”) for writing to a Python string.

Simple Python objects are translated to JSON according to a fairly intuitive conversion.

| Python      | JSON   |
|-------------|--------|
| dict        | object |
| list, tuple | array  |
| str         | string |
| int, float  | number |
| True        | true   |
| False       | false  |
| None        | null   |

### Deserializing JSON

In the `json` library, you’ll find `load()` and `loads()` for turning JSON encoded data into Python objects.

| JSON          | Python |
|---------------|--------|
| object        | dict   |
| array         | list   |
| string        | str    |
| number (int)  | int    |
| number (real) | float  |
| true          | True   |
| false         | False  |
| null          | None   |
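
The two tables are not exact mirrors of each other, so a round trip does not always give back the original object. A short sketch (the dictionary contents are made up for illustration):

```python
import json

original = {
    "point": (1, 2),    # tuple -> JSON array -> list
    "spouse": None,     # None  -> null       -> None
    "is_alive": True,   # True  -> true       -> True
    7: "seven",         # non-string keys are converted to strings
}

encoded = json.dumps(original)
decoded = json.loads(encoded)

print(encoded)
print(decoded)
print(f"same as original: {decoded == original}")
```

The tuple comes back as a list and the integer key comes back as the string `"7"`, so the decoded object is not equal to the original.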


In [5]:
import json

print(json.dumps({'4': 5, '6': 7}, sort_keys=True, indent=4))

# Serializing JSON
data = {
    "president": {
        "name": "Zaphod Beeblebrox",
        "species": "Betelgeusian"
    }
}

with open("data_file.json", "w") as write_file:
    json.dump(data, write_file)

json_string = json.dumps(data)
print(json_string)

# Deserializing JSON
with open("data_file.json", "r") as read_file:
    data = json.load(read_file)

# handling JSON dict
data['president']
data['president']['name']


{
    "4": 5,
    "6": 7
}
{"president": {"name": "Zaphod Beeblebrox", "species": "Betelgeusian"}}


'Zaphod Beeblebrox'

## [YAML](https://pypi.org/project/PyYAML/) - YAML handling

Python's standard library does not include YAML support, so we need a third-party library (here, PyYAML) to handle it.
YAML is a data serialization format designed for human readability and interaction with scripting languages.

In [None]:
import yaml

email_message = """\
message:
  date: 2022-01-16 12:46:17Z
  from: john.doe@domain.com
  to:
    - bobby@domain.com
    - molly@domain.com
"""

# Deserializing YAML
# Calling safe_load() is currently the recommended way of handling content received from untrusted sources, which could contain malicious code.
print(yaml.safe_load(email_message))

data = {"name": "John"}

# Serializing
with open("email_message.yaml", mode="wt", encoding="utf-8") as file:
    yaml.dump(data, file)

# Deserialization
with open("email_message.yaml", "r") as read_file:
    message = yaml.safe_load(read_file)

print(message)
print(message['name'])

{'message': {'date': datetime.datetime(2022, 1, 16, 12, 46, 17, tzinfo=datetime.timezone.utc), 'from': 'john.doe@domain.com', 'to': ['bobby@domain.com', 'molly@domain.com']}}
{'name': 'John'}
John


## [Re](https://docs.python.org/3/library/re.html): Regular expression operations

This module provides [regular expression](https://en.wikipedia.org/wiki/Regular_expression) matching operations similar to those found in Perl.
A regular expression is a sequence of characters that specifies a search pattern in text.

To get a glimpse of how regex works you can also consult the Python3 regex page: https://docs.python.org/3/library/re.html

The search and match operations return a [Match object](https://docs.python.org/3/library/re.html#match-objects) on success, or `None` if the pattern does not match.

To test your regular expressions there are several sites but we recommend using [regex101](https://regex101.com/) or [debuggex](https://www.debuggex.com/).
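
As a short sketch of what a Match object offers, the example below uses named groups. `VERSION_PATTERN` and the input strings are hypothetical and only loosely modelled on a version-like string.

```python
import re

# Hypothetical pattern: a simplified "YYwWW.N" version string with named groups.
VERSION_PATTERN = re.compile(r"(?P<year>\d{2})w(?P<week>\d{2})\.(?P<patch>\d)")

match = VERSION_PATTERN.search("release_21w11.1-2-37")
if match:
    print(match.group(0))       # the whole matched text
    print(match.group("year"))  # a single named group
    print(match.groupdict())    # all named groups as a dict
    print(match.span())         # (start, end) indexes in the input

# A failed search returns None, so always test the result before using it.
no_match = VERSION_PATTERN.search("no version here")
print(no_match)
```

Compiling the pattern once with `re.compile()` is convenient when the same expression is reused in several places.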

In [19]:
import re

# This example looks for a word following a hyphen
m = re.search(r'(?<=-)\w+', 'spam-egg')
print(m.group(0))

# Input git tag string pattern
TAG_STRING_PATTERN = r"([A-Z0-9-]+_)?([A-Z0-9_-]+_)?[0-9]{2}w[0-9]{2}\.[0-9]-[0-9]{1,2}(?:-[0-9]{1,3})?"

# Search
search = re.search(TAG_STRING_PATTERN, 'mgu22_A_21w11.1-2-37')
print(search)

# Match
match = re.fullmatch(TAG_STRING_PATTERN, 'A_B_21w11.1-2-37')
print(match)

# Doesn't match
match = re.fullmatch(TAG_STRING_PATTERN, 'mgu22_A_21w11.1-2-37')
print(match)

# We can also split a string by a regex
re.split(r'\W+', 'Words, words, words.')


egg
<re.Match object; span=(3, 20), match='22_A_21w11.1-2-37'>
<re.Match object; span=(0, 16), match='A_B_21w11.1-2-37'>
None


['Words', 'words', 'words', '']

## [Requests](https://pypi.org/project/requests/): Requests is a simple, yet elegant, HTTP library.

Requests allows you to send HTTP/1.1 requests extremely easily. There’s no need to manually add query strings to your URLs, or to form-encode your `PUT` & `POST` data — but nowadays, just use the `json` method!

**Official documentation:** https://requests.readthedocs.io/en/latest/

In [8]:
import requests

print(requests.__author__)

print(requests.get('https://api.github.com'))

# Status codes
response = requests.get('https://api.github.com')
response.status_code

if response.status_code == 200:
    print('Success!')
elif response.status_code == 404:
    print('Not Found.')
else:
    print(f'New response code: {response.status_code}')

# Get the response headers
print(response.headers)

# Search GitHub's repositories for requests
repos = requests.get(
    'https://api.github.com/search/repositories',
    params={'q': 'requests+language:python'},
)

# Inspect some attributes of the `requests` repository
json_response = repos.json()
#print(json_response)
repository = json_response['items'][10]
print(f'Repository name: {repository["name"]}')
print(f'Repository description: {repository["description"]}')

# Get the content of a request. If we "ping" an API usually we get a JSON.
# If we "ping" a webpage usually we get the HTML.
#print(repos.content)

# Other HTTP Methods
requests.post('https://httpbin.org/post', data={'key':'value'})
requests.put('https://httpbin.org/put', data={'key':'value'})
requests.delete('https://httpbin.org/delete')
requests.head('https://httpbin.org/get')
requests.patch('https://httpbin.org/patch', data={'key':'value'})
requests.options('https://httpbin.org/get')

# Authentication


Kenneth Reitz
<Response [200]>
Success!
{'Server': 'GitHub.com', 'Date': 'Fri, 08 Jul 2022 16:25:02 GMT', 'Cache-Control': 'public, max-age=60, s-maxage=60', 'Vary': 'Accept, Accept-Encoding, Accept, X-Requested-With', 'ETag': '"4f825cc84e1c733059d46e76e6df9db557ae5254f9625dfe8e1b09499c449438"', 'Access-Control-Expose-Headers': 'ETag, Link, Location, Retry-After, X-GitHub-OTP, X-RateLimit-Limit, X-RateLimit-Remaining, X-RateLimit-Used, X-RateLimit-Resource, X-RateLimit-Reset, X-OAuth-Scopes, X-Accepted-OAuth-Scopes, X-Poll-Interval, X-GitHub-Media-Type, X-GitHub-SSO, X-GitHub-Request-Id, Deprecation, Sunset', 'Access-Control-Allow-Origin': '*', 'Strict-Transport-Security': 'max-age=31536000; includeSubdomains; preload', 'X-Frame-Options': 'deny', 'X-Content-Type-Options': 'nosniff', 'X-XSS-Protection': '0', 'Referrer-Policy': 'origin-when-cross-origin, strict-origin-when-cross-origin', 'Content-Security-Policy': "default-src 'none'", 'Content-Type': 'application/json; charset=utf-8', '

<Response [200]>

## [Subprocess](https://docs.python.org/3.8/library/subprocess.html#): Subprocess management

The subprocess module allows you to spawn new processes, connect to their input/output/error pipes, and obtain their return codes.

**Official documentation:** https://docs.python.org/3.8/library/subprocess.html#

In [39]:
import subprocess

# Run a process
subprocess.run(["ls", "-l"]) 

cmd = "git rev-parse HEAD"
print(subprocess.check_output(cmd, shell=True).decode())
# If shell is True, the specified command will be executed through the shell.

# Check if git is available and get the return code.
# A return code of 0 means the command ran successfully.
# Note: if the binary is missing entirely, subprocess.run raises FileNotFoundError.
git = subprocess.run(['git', '--version'], stdout=subprocess.DEVNULL)
git.returncode

total 16
-rw-r--r-- 1 bferreira bferreira 8237 Jul  7 13:26 Packages_2.ipynb
-rw-r--r-- 1 bferreira bferreira 1301 Jul  6 15:18 README.md
bdf99ae3cdd99b615a81ae42a9736ceecc600494



0
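
The module description also mentions connecting to a process's output pipes. Below is a minimal sketch of capturing output and handling failures. It launches a child Python interpreter through `sys.executable` so that it does not rely on any particular external tool being installed, and the commands passed to it are made up for illustration.

```python
import shutil
import subprocess
import sys

# shutil.which returns the executable's full path, or None if it is not on PATH.
print(f"git found at: {shutil.which('git')}")

result = subprocess.run(
    [sys.executable, "-c", "print('Hello, world!')"],
    capture_output=True,  # collect stdout and stderr instead of printing them
    text=True,            # decode the output to str
    check=True,           # raise CalledProcessError on a non-zero exit code
)
print(result.returncode)
print(result.stdout.strip())

# With check=True, a failing command raises instead of returning.
try:
    subprocess.run([sys.executable, "-c", "raise SystemExit(3)"], check=True)
except subprocess.CalledProcessError as error:
    exit_code = error.returncode
    print(f"command failed with exit code {exit_code}")
```

Passing the command as a list, without `shell=True`, avoids shell quoting problems when arguments come from user input.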

# Building a package:

In [None]:
# TODO how to do a python package

## Creating a custom module:

In [None]:
# TODO how to make a module

## Creating a custom package

In [None]:
# TODO how to do a python package