# standard library -- continued
- a large number of packages
- comes with the python installation, no need to install anything separately
- [python docs](https://docs.python.org/3/library/index.html)

In [None]:
# poor man's progress bar...
for _ in range(100):
    print('.', end='')

In [None]:
help(print)

## mathematical things

### math & cmath
- `math` all the mathematical functions defined in C99 for real numbers
- `cmath` all the mathematical functions defined in C99 for complex numbers

- rounding: `trunc`, `floor`, `ceil`, ...
- combinatorics: `comb`, `perm`, ...
- integer arithmetic: `factorial`, `gcd`, `lcm`, ...
- (float) modulo: `fmod` (watch out, different from `%`)
- trigonometry: `sin`, `cos`, `tan`, `asin`, `acos`, `atan`, `sinh`, `cosh`, `tanh`, `asinh`, `acosh`, `atanh`
- angles: `degrees`, `radians` for conversion
- log and exp: `exp`, `exp2`, `log`, `log2`, `log10`, `pow`, ...
- special functions: `erf`, `gamma`, ...
- value tests: `isinf`, `isfinite`, `isnan`
- constants: `pi`, `e`, `tau`, `nan`, `inf`
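
The difference between `fmod` and `%` shows up with negative operands: `%` gives a result with the sign of the divisor, while `math.fmod` follows the C convention and keeps the sign of the dividend. A minimal sketch with arbitrarily chosen values:

```python
import math

# % follows the sign of the divisor, math.fmod the sign of the dividend
python_mod = -7 % 3             # 2
c_style_mod = math.fmod(-7, 3)  # -1.0

print(python_mod, c_style_mod)
```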

In [None]:
import math, cmath

math.inf == float('inf')

In [None]:
math.nan == float('nan')  # False: NaN never compares equal, use math.isnan instead

In [None]:
math.comb(7, 3)   # 7 choose 3 without order

In [None]:
math.perm(7, 3)  # 7 choose 3 with order

In [None]:
math.cos(math.pi)

In [None]:
math.acos(-1)

In [None]:
math.acos(1)

In [None]:
math.acos(0)

In [None]:
cmath.exp(math.pi*1j)
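
Because of floating point rounding, results like the one above are usually only approximately equal to the mathematical value. A small sketch of comparing with a tolerance instead of `==`, using `cmath.isclose` with its default tolerances:

```python
import cmath
import math

z = cmath.exp(1j * math.pi)          # mathematically exactly -1
exactly_equal = (z == -1)            # False: a tiny imaginary part is left over
close_enough = cmath.isclose(z, -1)  # True within the default relative tolerance

print(z, exactly_equal, close_enough)
```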

### random
- generate pseudorandom numbers from various distributions, and related functions
- pseudorandom numbers are **not** suitable for cryptographic applications! take a look at `secrets` or 3rd party libraries
- for matrices (or otherwise large arrays) of random numbers and additional distributions look into `numpy` and `scipy`
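
For security-sensitive values, a minimal sketch using `secrets` could look like this. The token length and the alphabet are arbitrary choices for illustration, not a recommendation:

```python
import secrets
import string

token = secrets.token_hex(16)  # 16 random bytes, rendered as 32 hex characters

# pick characters with a cryptographically strong random source
alphabet = string.ascii_letters + string.digits
password = ''.join(secrets.choice(alphabet) for _ in range(12))

print(token, password)
```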

- `random.seed`: set the seed for pseudorandom generator (same seed = same sequence will be generated)
- `random.randrange` - random element from range
- `random.randint` - uniform random int between min and max (both inclusive!)
- `random.random` - uniform floating point between 0 (inclusive) and 1 (exclusive)
- `random.uniform` - uniform floating point between min and max
- `random.gauss` - normal distribution
- `random.choice` - return random element from sequence
- `random.shuffle` - shuffle sequence in-place
- `random.sample` - return k elements from sequence without replacement
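
The effect of seeding can be sketched as follows: re-seeding with the same value restarts the same sequence, and a separate `random.Random` instance gives the same reproducibility without touching the module-level generator. The seed value 42 is arbitrary.

```python
import random

random.seed(42)
first_run = [random.randint(1, 10) for _ in range(5)]

random.seed(42)  # re-seeding restarts the same sequence
second_run = [random.randint(1, 10) for _ in range(5)]

# an independent generator leaves the module-level state alone
rng = random.Random(42)
third_run = [rng.randint(1, 10) for _ in range(5)]

print(first_run == second_run == third_run)
```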

In [None]:
import random, string

In [None]:
random.seed(42)

In [None]:
random.randint(1, 10)

In [None]:
random.choice(string.ascii_lowercase)

In [None]:
random.choice(['abc', 'def', 'ghi'])

In [None]:
random.sample(string.ascii_lowercase, 5)

In [None]:
from matplotlib import pyplot as plt
%matplotlib inline

In [None]:
uniform_data = [random.uniform(-1, 1) for _ in range(100000)]
normal_data = [random.gauss(0, 1) for _ in range(100000)]  # explicit mu/sigma, required before Python 3.11

plt.hist(normal_data, bins=100, range=(-4, 4), alpha=0.5);
plt.hist(uniform_data, bins=100, range=(-4, 4), alpha=0.5);

### decimal
- decimal numbers with configurable precision
- 'exact' within that precision
- useful particularly for e.g. accounting
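
As an illustration of the accounting use case, here is a small sketch of rounding a price to cents. The value and the rounding mode are chosen for the example; real applications need to follow their own rounding rules.

```python
from decimal import Decimal, ROUND_HALF_UP

price = Decimal('2.675')
cent = Decimal('0.01')

# quantize rounds to the exponent of `cent`, here two decimal places
rounded_decimal = price.quantize(cent, rounding=ROUND_HALF_UP)  # Decimal('2.68')

# the float 2.675 is stored as a value slightly below 2.675
rounded_float = round(2.675, 2)  # 2.67

print(rounded_decimal, rounded_float)
```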

In [None]:
import decimal

In [None]:
0.1 + 0.1 + 0.1 - 0.3

In [None]:
decimal.Decimal(0.1) + decimal.Decimal(0.1) + decimal.Decimal(0.1) - decimal.Decimal(0.3)

In [None]:
decimal.Decimal('0.1') + decimal.Decimal('0.1') + decimal.Decimal('0.1') - decimal.Decimal('0.3')

In [None]:
with decimal.localcontext(traps=[decimal.FloatOperation]):
    decimal.Decimal(0.1)

In [None]:
# significance
decimal.Decimal('1.50') + decimal.Decimal('1.50')

In [None]:
# setting precision
with decimal.localcontext(prec=3):
    print(decimal.Decimal('1.23456'))
    print(decimal.Decimal(1) / decimal.Decimal(7))
    print(decimal.Decimal('1.23456') + decimal.Decimal('1.23456'))

In [None]:
# setting precision
with decimal.localcontext(prec=345):
    print(decimal.Decimal(1) / decimal.Decimal(7))

## data persistence

### csv
- csv: 'comma separated values' -- common data interchange format for tabular data
- there is a standard ([RFC 4180](https://datatracker.ietf.org/doc/html/rfc4180.html)), but it is often ignored or misused, which makes parsing csv files tedious and error-prone
- the `csv` module supports various 'dialects' of csv-files
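
A minimal sketch of handling a non-default dialect: the sample below is a made-up, semicolon-separated snippet held in memory, and the delimiter is passed explicitly to the reader.

```python
import csv
import io

# hypothetical sample: semicolon-separated, as written by some spreadsheet locales
sample = 'name;age\r\nAlice;30\r\nBob;25\r\n'

# io.StringIO stands in for a file opened with newline=''
rows = list(csv.reader(io.StringIO(sample, newline=''), delimiter=';'))
records = list(csv.DictReader(io.StringIO(sample, newline=''), delimiter=';'))

print(rows)
print(records[0])
```

Note that all values come back as strings; converting them to numbers is up to the caller.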

In [None]:
import csv

In [None]:
with open('../data/titanic.csv', 'r') as ifile:
    reader = csv.reader(ifile)
    data = [line for line in reader]

In [None]:
for line in data[:3]:
    print(line)

In [None]:
# and writing it again (newline='' avoids extra blank lines on Windows)
with open('../data/demo_output.csv', 'w', newline='') as ofile:
    writer = csv.writer(ofile)
    writer.writerows(data[:3])

In [None]:
# DictReader assumes the first line to be headers and uses the values as keys for dictionaries, each row becoming a dict.
with open('../data/titanic.csv', 'r') as ifile:
    reader = csv.DictReader(ifile)
    data = [line for line in reader]

In [None]:
data[0]

In [None]:
# Terrible excel interpretation of what 'csv' means...
with open('../data/bad_excel_titanic.csv', 'r') as ifile:
    reader = csv.DictReader(ifile)
    data = [line for line in reader]

In [None]:
data[0]

In [None]:
# csv.Sniffer guesses the dialect (delimiter, quoting, ...) from a sample of the file
with open('../data/bad_excel_titanic.csv', 'r') as ifile:
    dialect = csv.Sniffer().sniff(ifile.read(2048))
    ifile.seek(0)
    reader = csv.DictReader(ifile, dialect=dialect)
    data = [line for line in reader]
data[1]

### json
- JavaScript Object Notation
- lightweight, text-based, language-independent data interchange format
- the syntax maps quite nicely to python as-is, making the content easy to understand
- standardized in [RFC7159](https://datatracker.ietf.org/doc/html/rfc7159.html)

- `load`: parse json data from a file object
- `loads`: parse json data from a string
- `dump`: encode objects as json data and write to file
- `dumps`: encode objects as json strings
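
`dump` and `load` work on file objects rather than strings. A small sketch, where an in-memory `io.StringIO` buffer stands in for a file opened in text mode and the `settings` dict is made up for the example:

```python
import io
import json

settings = {'name': 'demo', 'values': [1, 2, 3]}

buffer = io.StringIO()  # stands in for open('settings.json', 'w')
json.dump(settings, buffer, indent=2, sort_keys=True)  # human-readable output

buffer.seek(0)  # rewind before reading, as if re-opening the file
restored = json.load(buffer)

print(buffer.getvalue())
```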

<div class="alert alert-block alert-warning">
<b>Untrusted data:</b> <br>
<a>
Be cautious when parsing JSON data from untrusted sources. A malicious JSON string may cause the decoder to consume considerable CPU and memory resources. Limiting the size of data to be parsed is recommended.
</a>
</div>

In [None]:
import json

In [None]:
encoded_list = json.dumps([1, '2', 3.1415])
encoded_list

In [None]:
decoded_list = json.loads(encoded_list)
decoded_list

In [None]:
for element in decoded_list:
    print(f"{element} is a {type(element)}")

In [None]:
encoded_dict = json.dumps({'a': 'foo', 'b': 3.1415})
encoded_dict

In [None]:
decoded_dict = json.loads(encoded_dict)
decoded_dict

In [None]:
# arbitrary nesting works just fine
my_data = {
    "some_key": "simple",
    "mixed_list": [1, "2", 3.1415, {"inner_dict": "also_works"}],
    "dict": {
        "my_list": [1, 2, 3]
    }
}
encoded_data = json.dumps(my_data)
encoded_data

In [None]:
restored_data = json.loads(encoded_data)
restored_data

In [None]:
restored_data == my_data

#### possible issues

In [None]:
# json standard only supports strings as keys
encoded_dict = json.dumps({17: "seventeen"})
decoded_dict = json.loads(encoded_dict)
decoded_dict

In [None]:
for k, v in decoded_dict.items():
    print(f"{k} is a {type(k)} and {v} is a {type(v)}")

In [None]:
# and even worse, collisions can occur and are silent
encoded_dict = json.dumps({17: "int_key", '17': 'str_key'})
decoded_dict = json.loads(encoded_dict)
decoded_dict

In [None]:
# order matters...
encoded_dict = json.dumps({'17': "str_key", 17: 'int_key'})
decoded_dict = json.loads(encoded_dict)
decoded_dict

In [None]:
# the reverse is also possible: duplicate keys in the json input are silently collapsed
weird_json = '{"x": 1, "x": 2, "x": 3}'
json.loads(weird_json)
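
If silent collisions are a concern, one possible approach is the `object_pairs_hook` parameter of `json.loads`, which receives every object as a list of key/value pairs before a dict is built. The helper name `reject_duplicate_keys` is made up for this sketch:

```python
import json

def reject_duplicate_keys(pairs):
    """Build a dict from (key, value) pairs, refusing repeated keys."""
    result = {}
    for key, value in pairs:
        if key in result:
            raise ValueError(f'duplicate key in JSON object: {key!r}')
        result[key] = value
    return result

ok = json.loads('{"x": 1, "y": 2}', object_pairs_hook=reject_duplicate_keys)

try:
    json.loads('{"x": 1, "x": 2}', object_pairs_hook=reject_duplicate_keys)
    duplicate_detected = False
except ValueError:
    duplicate_detected = True

print(ok, duplicate_detected)
```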

In [None]:
# constants and weird floats
funky_json = json.dumps([True, False, None, float('nan'), float('inf')])
funky_json

In [None]:
json.loads(funky_json)

In [None]:
# NaNs and Infinity are beyond the standard though
json.dumps([float('nan'), float('inf')], allow_nan=False)

In [None]:
# recursion doesn't work...
recursive_list = [1, 2, 3]
recursive_list.append(recursive_list)
recursive_list

In [None]:
recursive_list[-1]

In [None]:
json.dumps(recursive_list)

### pickle
- serialize and deserialize arbitrary python objects
- serialization: converting Python objects into a byte stream that can be easily stored, transmitted, or shared
- `pickle.dump` -- serialize a python object and write it to a file object
- `pickle.load` -- the reverse, load bytes from file and restore python object
- `pickle.dumps` / `pickle.loads` -- the same but to/from just bytes
- look at `dill` for a third-party extension that can serialize more types of objects

#### important considerations

<div class="alert alert-block alert-warning">
<b>Untrusted data:</b> <br>
<a>
    don't ever unpickle data from untrusted sources, as unpickling can lead to code execution
</a>
</div>

- not all objects can be pickled (file handles, database connections, ...) and attempting to do so will result in errors
- restoring old pickles into updated versions of your code can lead to incompatibilities

#### basic pickling/unpickling

In [None]:
import pickle

In [None]:
list_bytes = bytes([1, 2, 3])
list_bytes

In [None]:
pickled_list = pickle.dumps([1, 2, 3])
pickled_list

In [None]:
pickle.loads(pickled_list)

In [None]:
pickle.loads(list_bytes)

In [None]:
list(list_bytes)

In [None]:
print(list(pickled_list))

In [None]:
with open('stored_list.pickle', 'wb') as ofile:
    pickle.dump([1, 2, 3], ofile)

In [None]:
with open('stored_list.pickle', 'rb') as ifile:
    print(ifile.read())

In [None]:
with open('stored_list.pickle', 'rb') as ifile:
    print(pickle.load(ifile))

#### pickling custom objects/classes

In [None]:
class PickleDemo:
    def say(self):
        print("Hey, I'm a pickle demo")

In [None]:
pd = PickleDemo()
pd.say()

In [None]:
pickled_pd = pickle.dumps(pd)
pickled_pd

In [None]:
restored_pd = pickle.loads(pickled_pd)
restored_pd.say()

In [None]:
# the class still needs to be defined!
del PickleDemo

In [None]:
pickle.loads(pickled_pd)

In [None]:
class PickleDemo:
    def say(self):
        print("hello, I'm new")

In [None]:
# the method code is actually taken from the new class, the pickle just tells you to create an instance
new_pd = pickle.loads(pickled_pd)

In [None]:
new_pd.say()

In [None]:
# you made some changes to your code and updated your pickle class to have an attribute

In [None]:
class PickleDemo:
    def __init__(self, attribute):
        self.some_attribute = attribute

    def say(self):
        print(f"Hey, I'm a pickle demo, and my attribute is \"{self.some_attribute}\"")

In [None]:
restored_pd = pickle.loads(pickled_pd)
restored_pd.say()

In [None]:
# but the value of the attribute is stored in the pickle
pd_with_attribute = PickleDemo("attribute_value")
pd_with_attribute.say()

In [None]:
pickled_pd_with_attribute = pickle.dumps(pd_with_attribute)
restored_with_attribute = pickle.loads(pickled_pd_with_attribute)
restored_with_attribute.say()

#### some un-pickle-able objects

In [None]:
# closures
def my_function(a, b):
    def inner_function():
        print(f"hey there, from function f with arguments {a} and {b}")
    return inner_function

In [None]:
pickled_f = pickle.dumps(my_function(3, 5))
pickled_f

In [None]:
# open file handles
with open('demo_file.txt', 'rb') as fp:
    pickle.dumps(fp)  # raises TypeError: file objects cannot be pickled

In [None]:
# unlike json, recursion works just fine though
recursive_list = [1, 2, 3]
recursive_list.append(recursive_list)
recursive_list

In [None]:
encoded_recursive_list = pickle.dumps(recursive_list)
encoded_recursive_list

In [None]:
restored_recursive_list = pickle.loads(encoded_recursive_list)
restored_recursive_list

In [None]:
restored_recursive_list[-1]

#### customizing pickling behaviour

In [None]:
class MyFileReader:
    def __init__(self, filename):
        self.fp = open(filename)

    def read(self):
        return self.fp.read()

    def close(self):
        self.fp.close()

In [None]:
my_reader = MyFileReader('demo_file.txt')
my_reader.read()

In [None]:
try:
    pickle.dumps(my_reader)
finally:
    my_reader.close()

In [None]:
class MyPickleFileReader:
    def __init__(self, filename):
        self.fp = open(filename)
        self.filename = filename

    def __reduce__(self):
        return (self.__class__, (self.filename, ))

    def read(self):
        return self.fp.read()

    def close(self):
        self.fp.close()

In [None]:
my_pickle_reader = MyPickleFileReader('demo_file.txt')
my_pickle_reader.read()

In [None]:
try:
    pickled_reader = pickle.dumps(my_pickle_reader)
finally:
    my_pickle_reader.close()

In [None]:
pickled_reader

In [None]:
restored_reader = pickle.loads(pickled_reader)
restored_reader.read()

In [None]:
restored_reader.close()

#### BUT: this is dangerous if you get pickles from untrusted sources!

In [None]:
class Exploit:
    def __reduce__(self):
        return (print, ("You've been pwned!", ))

exploit = Exploit()
pickled_exploit = pickle.dumps(exploit)
pickled_exploit

In [None]:
# sending this pickle to your system, which doesn't know about the `Exploit` class
del Exploit

In [None]:
# but still...
pickle.loads(pickled_exploit)
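
One way to reduce (not eliminate) this risk is to subclass `pickle.Unpickler` and override `find_class`, which is called whenever a pickle asks for a global such as a class or function. The sketch below forbids all globals, so only plain built-in containers and values can be loaded. `SafeUnpickler`, `safe_loads` and `Payload` are hypothetical names for this illustration, and avoiding untrusted pickles altogether remains the safer option.

```python
import io
import pickle

class SafeUnpickler(pickle.Unpickler):
    """Hypothetical unpickler that refuses to look up any global."""
    def find_class(self, module, name):
        raise pickle.UnpicklingError(f"global '{module}.{name}' is forbidden")

def safe_loads(data):
    """Like pickle.loads, but using the restricted unpickler."""
    return SafeUnpickler(io.BytesIO(data)).load()

class Payload:
    # same trick as above: ask the unpickler to call print(...)
    def __reduce__(self):
        return (print, ("You've been pwned!",))

harmless = safe_loads(pickle.dumps([1, 2, {'a': 3}]))  # plain data still loads

try:
    safe_loads(pickle.dumps(Payload()))
    blocked = False
except pickle.UnpicklingError:
    blocked = True  # the lookup of builtins.print was refused

print(harmless, blocked)
```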

## generic operating system and runtime services

### os
- platform-independent way of interacting with the operating system
  - read environment variables
  - low-level file access: prefer builtins (`open`) as well as `pathlib` and `shutil`
  - process management (running external commands): prefer e.g. `subprocess`
  - reading and changing user-ids/groups/...
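
Reading environment variables can be sketched like this. `os.environ` behaves like a dictionary of strings; the variable names `MY_APP_LOG_LEVEL` and `MY_APP_MODE` are made up for the example.

```python
import os

# .get() with a default avoids a KeyError when the variable is not set
log_level = os.environ.get('MY_APP_LOG_LEVEL', 'INFO')

# changes are visible to this process and to child processes started afterwards
os.environ['MY_APP_MODE'] = 'demo'
mode = os.environ['MY_APP_MODE']
del os.environ['MY_APP_MODE']  # clean up again

print(log_level, mode)
```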

In [None]:
import os

os.name

In [None]:
os.uname()  # only available on Unix-like systems

In [None]:
os.environ

In [None]:
os.sep  # path separator

In [None]:
os.linesep

In [None]:
os.getcwd()

In [None]:
os.chdir('.')  # change working directory (here: stay where we are)

In [None]:
os.system('ls')
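
For comparison, a minimal sketch of running an external command with `subprocess` instead of `os.system`. To stay platform-independent, the command used here is the current Python interpreter itself, which is an arbitrary choice for the demo.

```python
import subprocess
import sys

# run the current Python interpreter as an external command
result = subprocess.run(
    [sys.executable, '-c', 'print("hello from a child process")'],
    capture_output=True,  # collect stdout/stderr instead of printing them
    text=True,            # decode the output bytes to str
    check=True,           # raise CalledProcessError on a non-zero exit code
)
output = result.stdout.strip()

print(output)
```

Passing the command as a list avoids going through a shell, so arguments do not need to be quoted or escaped.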

In [None]:
# loads and loads of stuff for directory/file manipulation
# CPU time scheduler, starting/stopping/killing other processes
# special devices/mounts/...

### sys
- provides access to the Python interpreter's variables and functions, allowing us to interact with and control various aspects of the Python runtime.
  - search path
  - command line arguments
  - python version and platform information
  - I/O redirection
  - recursion limits and other interpreter-specific settings

In [None]:
import sys

#### Interpreter Info

In [None]:
sys.version

In [None]:
sys.implementation

In [None]:
sys.platform

In [None]:
sys.byteorder

In [None]:
sys.float_info

In [None]:
sys.flags

In [None]:
print(sys.copyright)

In [None]:
# and many many more...

#### controlling interpreter behaviour

In [None]:
print(sys.getrecursionlimit())  # show the current limit before changing it
sys.setrecursionlimit(200)

In [None]:
i = 0
def infinite_recursion():
    global i
    i += 1
    infinite_recursion()
infinite_recursion()

In [None]:
print(i)

In [None]:
sys.setrecursionlimit(3000)

In [None]:
sys.get_int_max_str_digits()

In [None]:
sys.set_int_max_str_digits(640)

In [None]:
value = 2**10000

In [None]:
value

In [None]:
sys.set_int_max_str_digits(4300)

#### command line arguments
- `sys.argv` contains command line arguments passed to the interpreter

In [None]:
# first element: script name
# any other elements: arguments
sys.argv

In [None]:
# demo: utils.py
# also see package `argparse`
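
A small sketch of what `argparse` adds on top of raw `sys.argv`. The argument names are invented for the example, and an explicit list is passed to `parse_args` so the snippet also runs inside a notebook:

```python
import argparse

parser = argparse.ArgumentParser(description='hypothetical demo script')
parser.add_argument('filename', help='input file to process')
parser.add_argument('-n', '--count', type=int, default=1, help='number of repetitions')
parser.add_argument('-v', '--verbose', action='store_true', help='print more output')

# parse_args() reads sys.argv[1:] by default; an explicit list is handy for testing
args = parser.parse_args(['data.txt', '--count', '3'])

print(args.filename, args.count, args.verbose)
```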

#### search path
- `sys.path` is a list of directories that Python searches for modules when you use the import statement.
- this allows you to `import` (e.g. your own) packages/modules from non-standard locations
- look into building real packages and installing those (in development mode) for a better alternative

In [None]:
sys.path

#### in/output stream redirection

In [None]:
with open('stdout.txt', 'w') as ofile:
    original_stdout = sys.stdout
    sys.stdout = ofile
    try:
        print('hello')
    finally:
        sys.stdout = original_stdout  # always restore, even if an error occurs
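
The standard library also provides `contextlib.redirect_stdout`, which restores `sys.stdout` automatically when the block is left. A minimal sketch that captures the output in an in-memory buffer instead of a file:

```python
import contextlib
import io

buffer = io.StringIO()
with contextlib.redirect_stdout(buffer):
    print('hello')  # goes into the buffer, not to the console

captured = buffer.getvalue()
print(repr(captured))
```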

### platform
- hardware- and OS-specific information

In [None]:
import platform

In [None]:
platform.architecture()

In [None]:
platform.processor()

In [None]:
platform.platform()

### time
- provides various time-related functions. see also `datetime`
- not all functions are available on all platforms. See [documentation](https://docs.python.org/3/library/time.html) for details.
- use `datetime` for conversion from/to strings, timestamps and timestamp differences
- use `time` or the `timeit` module for (simple) measuring of the execution time of your code
- use `time.sleep` for (blocking) waits for a specified amount of time

- `time.sleep`: wait for the specified number of seconds
- `time.time` / `time.time_ns`: current time since the epoch (1970-01-01 00:00+00:00), as fractional seconds or as integer nanoseconds
- `time.monotonic` / `time.monotonic_ns`: always increasing time counter with unspecified offset
- `time.perf_counter` / `time.perf_counter_ns`: as `time.monotonic` but using the highest possible resolution clock source
- `time.process_time` / `time.process_time_ns`: sum of user and system time, unspecified offset, excludes idle time
- `time.get_clock_info`: return detailed information on the different time sources above
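
Measuring execution time could be sketched as follows, once by hand with `time.perf_counter` and once with `timeit`, which repeats a statement many times. The workload and the repetition count are arbitrary.

```python
import time
import timeit

start = time.perf_counter()
total = sum(range(100_000))            # the code to be measured
elapsed = time.perf_counter() - start  # elapsed wall-clock seconds as a float

# timeit runs the statement `number` times and returns the total time in seconds
total_time = timeit.timeit('sum(range(1000))', number=100)

print(f'{elapsed:.6f} s, {total_time / 100:.6f} s per run')
```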

In [None]:
import time

In [None]:
time.time()

In [None]:
time.time_ns()

In [None]:
for clock in ['time', 'monotonic', 'perf_counter', 'process_time']:
    print(time.get_clock_info(clock))

In [None]:
t0 = time.time()
time.sleep(1)
t1 = time.time()
print(f'Elapsed time: {t1 - t0}')

In [None]:
t0 = time.monotonic()
time.sleep(1)
t1 = time.monotonic()
print(f'Elapsed monotonic time: {t1 - t0}')

In [None]:
t0 = time.process_time()
time.sleep(1)
t1 = time.process_time()
print(f'Elapsed process time: {t1 - t0}')

In [None]:
t0 = time.perf_counter_ns()
t1 = time.perf_counter_ns()
print(f'Elapsed time (ns): {t1 - t0}')

### logging
- means of tracking events that happen when some software runs.
- Add logging calls to your code to indicate that certain events have occurred. An event is described by a descriptive message which can optionally contain variable data (i.e. data that is potentially different for each occurrence of the event).
- Events also have an importance which you ascribe to the event; the importance can also be called the level or severity.
  - `debug`: very detailed logging of normal events
  - `info`: logging of normal events, less verbose than `debug`
  - `warning`: abnormal, but does not stop execution
  - `error`: replaces an `Exception` in code that should not stop (e.g. long-running server processes)
  - `critical`: like `error`, but you want to make everyone even more worried...
- users of your code can potentially specify the minimum level of severity they are interested in
- log output destination should be configurable

In [None]:
import logging

In [None]:
logging.debug("This is fine...")

In [None]:
logging.info("All normal...")

In [None]:
logging.warning("Hey, watch out!")

In [None]:
logging.error("It's all gone wrong :(")

In [None]:
logging.critical("Really, really terribly wrong!")

In [None]:
logging.basicConfig(filename='demo.log', level=logging.DEBUG, force=True)

In [None]:
logging.basicConfig(format='%(levelname)s:%(message)s', level=logging.DEBUG, force=True)
logging.debug('This message should appear on the console')
logging.info('So should this')
logging.warning('And this, too')

In [None]:
logging.basicConfig(format='%(asctime)s - %(name)s - %(levelname)s - %(message)s', force=True)
logging.warning('You still need to watch out!')

In [None]:
logger = logging.getLogger('my_module_logger')
logger.warning('This is a warning from my module')
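
Beyond `basicConfig`, destination and minimum level can be configured per logger. A minimal sketch, in which an in-memory stream stands in for a file or the console, and the logger name `my_package.my_module` is made up:

```python
import io
import logging

stream = io.StringIO()  # stands in for a file or the console
handler = logging.StreamHandler(stream)
handler.setFormatter(logging.Formatter('%(name)s:%(levelname)s:%(message)s'))

demo_logger = logging.getLogger('my_package.my_module')  # hypothetical name
demo_logger.addHandler(handler)
demo_logger.setLevel(logging.INFO)  # drop everything below INFO
demo_logger.propagate = False       # keep the demo output out of the root logger

demo_logger.debug('not recorded')
demo_logger.info('recorded')

log_output = stream.getvalue()
print(log_output)
```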

## selected other interesting packages:

- `struct`: Interpret bytes as packed binary data

- `calendar`: Calendar-related functions
- `heapq`: Heap queue algorithm (binary tree in which each node is smaller than or equal to its children)
- `bisect`: bisection -- manipulating sorted lists
- `weakref`: create weak references to objects (primarily for caching)
- `types`: dynamic creation of new types / classes
- `enum`: support enumerations
- `graphlib`: topological sorting of graph-like structures

- `abc`: abstract base classes (inheritance -- 'interfaces' in Python)
  - `collections.abc`: abstract base classes for collections
  - `numbers`: abstract base classes for numeric types

- `statistics`: simple statistics (mean/median/std etc). use `numpy`/`scipy` for more advanced capabilities

- `sqlite3`: simple file-based database interface
- `shelve`: python object persistence on top of pickle
- `zlib`/`gzip`/`bz2`/`lzma`/`zipfile`/`tarfile`: support for various data compression formats

- `hashlib`: Secure hashes and message digests
- `secrets`: Generate secure random numbers for managing secrets

- `curses`: build terminal UIs
- `tkinter`: built-in UI framework

- `ctypes`: call C code directly from python

- `threading`: Thread-based parallelism
- `multiprocessing`: Process-based parallelism
  - `multiprocessing.shared_memory`: Shared memory for direct access across processes
- `subprocess`: Subprocess management
- `sched`: Event scheduler
- `queue`: A synchronized (thread-safe) queue class
- `concurrent.futures`: Launching parallel tasks
- `asyncio`: asynchronous i/o for concurrent execution

- `typing`: support for type hints
- `unittest`: unit testing framework (prefer third-party `pytest`)
- `pdb`: debugger
- `timeit`: measure code execution time
- `trace`: trace statement execution
- `warnings`: control user-visible warnings
- `dataclasses`: low-overhead 'passive' classes with just attributes
- `gc`: garbage collector interface
- `ast`: python abstract syntax tree parsers
- `tokenize`: tokenizer for python source code
- `dis`: disassemble python code (to bytecode)

- `msvcrt`/`winreg`/`winsound`: windows-specific functionality
- `pwd`/`grp`/`tty`/...: unix/linux specific functionality
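
As a taste of two of the modules listed above, here is a short sketch of `heapq` and `bisect` on small, made-up lists:

```python
import bisect
import heapq

heap = []
for value in [5, 1, 4, 2]:
    heapq.heappush(heap, value)  # the list is kept in heap order
smallest = heapq.heappop(heap)   # always removes the smallest item

scores = [10, 20, 30]
bisect.insort(scores, 25)                  # insert while keeping the list sorted
position = bisect.bisect_left(scores, 20)  # index of the first item >= 20

print(smallest, scores, position)
```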