# Introduction to Python and Natural Language Technologies

__Lecture 12, Decorators, Packaging, Type hints__

__May 4, 2021__

__Judit Ács__

# Decorators

## Introduction

Let's define a function that
- takes another function as a parameter
- greets the caller before calling the function

In [None]:
def greeter(func):
    print("Hello")
    func()
    
def say_something():
    print("Let's learn some Python.")
    
greeter(say_something)
# greeter(12)  # raises TypeError

### Functions are first class objects

- they can be passed as arguments
- they can be returned from other functions (example later)

Let's create a `count_predicate` function

- takes a predicate (yes-no function) and an iterable
- calls the predicate on each element
- counts how many times it returns True
- similar to `std::count_if` in C++

In [None]:
def count_predicate(predicate, iterable):
    true_count = 0
    for element in iterable:
        if predicate(element) is True:
            true_count += 1
    return true_count

We can write this function in fewer lines:

In [None]:
def count_predicate(predicate, iterable):
    return sum(predicate(e) for e in iterable)

The `predicate` parameter can be anything callable:

a function:

In [None]:
def is_even(number):
    return number % 2 == 0

numbers = [1, 3, 2, -5, 0, 0]

count_predicate(is_even, numbers)

an instance of a class that implements `__call__` (functor):

In [None]:
class IsEven(object):
    def __call__(self, number):
        return number % 2 == 0
    
count_predicate(IsEven(), numbers)

In [None]:
# Other ways of using IsEven:
IsEven()(123)
i = IsEven()
i(11)

or a lambda expression:

In [None]:
count_predicate(lambda x: x % 2 == 0, numbers)

### Functions can be nested

In [None]:
def parent():
    print("I'm the parent function")
    
    def child():
        print("I'm the child function")
        
parent()

the nested function is only accessible from the parent

In [None]:
def parent():
    print("I'm the parent function")
    
    def child():
        print("I'm the child function")
    
    print("Calling the nested function")
    child()
        
parent()
# parent.child  # raises AttributeError

### Functions can be return values

In [None]:
def parent():
    print("I'm the parent function")
    
    def child():
        print("I'm the child function")
        
    return child

child_func = parent()

print("Calling child")
child_func()
child_func()
child_func()
child_func()

print("\nUsing parent's return value right away")
parent()()

## Closure: nested functions have access to the parent's scope

In [None]:
def parent(value):
    
    def child():
        print(f"I'm the nested function. The parent's value is {value}")
        
    return child
        
child_func = parent(42)

print("Calling child_func")
child_func()

Calling the `parent` returns a new function object each time:

In [None]:
f1 = parent("abc")
f2 = parent(123)

f1()
f2()

f1 is f2
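
One way to see where the captured value lives is to inspect the returned function. The sketch below assumes CPython and uses `make_child`, a hypothetical stand-in for `parent` above that returns a string instead of printing:

```python
def make_child(value):
    # hypothetical stand-in for `parent`; returns the nested function
    def child():
        return f"The parent's value is {value}"
    return child

child_func = make_child(42)

# Names of the variables the nested function captured from the parent's scope.
print(child_func.__code__.co_freevars)

# Each captured variable is kept in a "cell" object on the function.
print(child_func.__closure__[0].cell_contents)
```

The cell keeps `value` alive after `make_child` has returned, which is why `child_func` can still use it.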

## Function factory

In [None]:
def make_func(param):
    value = param
    
    def func():
        print(f"I'm the nested function. The parent's value is {value}")
        
    return func

func_11 = make_func(11)
func_abc = make_func("abc")

func_11()
func_abc()

Calling `make_func` with the same arguments results in different functions:

In [None]:
f1 = make_func(1)
f2 = make_func(1)
f1()
f2()

f1 == f2, f1 is f2

## Wrapper function factory

- let's create a function that takes a function and returns an almost identical function
- the returned function adds some logging

In [None]:
def add_noise(func):
    
    def wrapped_with_noise():
        print(f"Calling function {func.__name__}")
        func()
        print(f"{func.__name__} finished.")
        
    return wrapped_with_noise

The function we are going to wrap:

In [None]:
def noiseless_function():
    print("This is not noise")
    
noiseless_function()

Now add some noise:

In [None]:
noisy_function = add_noise(noiseless_function)

noisy_function()

We often want to bind the wrapped function to the original reference:

- i.e. `greeter` should refer to the wrapped function
- we don't need the original function

In [None]:
def greeter():
    print("Hello")
    
print(id(greeter))
   
greeter = add_noise(greeter)
greeter()
print(id(greeter))

This turns out to be a frequent operation:

In [None]:
def friendly_greeter():
    print("Hello friend")
    
def rude_greeter():
    print("Hey you")
    
friendly_greeter = add_noise(friendly_greeter)
rude_greeter = add_noise(rude_greeter)
friendly_greeter()

rude_greeter()

## Decorator syntax

- a decorator is a function
  - that takes a function as an argument
  - returns a wrapped version of the function
- the decorator syntax is just __syntactic sugar__ (shorthand) for:

```python
func = decorator(func)
```

In [None]:
@add_noise
def informal_greeter():
    print("Yo")
    
# informal_greeter = add_noise(informal_greeter)
    
informal_greeter()

__Pie syntax__

- introduced in [PEP318](https://www.python.org/dev/peps/pep-0318/) in Python 2.4
- various syntax proposals were suggested, summarized [here](https://wiki.python.org/moin/PythonDecorators#A1._pie_decorator_syntax)

## Problems

### Function metadata is lost

In [None]:
informal_greeter.__name__

__Solution 1. Copy manually__

In [None]:
def add_noise(func):
    
    def wrapped_with_noise():
        """Useless docstring."""
        print(f"Calling {func.__name__}...")
        func()
        print(f"{func.__name__} finished.")
        
    wrapped_with_noise.__name__ = func.__name__
    return wrapped_with_noise

@add_noise
def greeter():
    """meaningful documentation"""
    print("Hello")
    
print(greeter.__name__)

What about other metadata such as the docstring?

In [None]:
print(greeter.__doc__)

__Solution 2. `functools.wraps`__

In [None]:
from functools import wraps

def add_noise(func):
    
    @wraps(func)
    def wrapped_with_noise():
        print(f"Calling {func.__name__}...")
        func()
        print(f"{func.__name__} finished.")
        
    return wrapped_with_noise

@add_noise
def greeter():
    """function that says hello"""
    print("Hello")
    
print(greeter.__name__)
print(greeter.__doc__)
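
Besides copying the metadata, `functools.wraps` also keeps a reference to the original function in the wrapper's `__wrapped__` attribute. A minimal sketch, repeating the definitions above so it runs on its own:

```python
from functools import wraps

def add_noise(func):
    @wraps(func)
    def wrapped_with_noise():
        print(f"Calling {func.__name__}...")
        func()
        print(f"{func.__name__} finished.")
    return wrapped_with_noise

@add_noise
def greeter():
    """function that says hello"""
    print("Hello")

# functools.wraps stores the undecorated function on the wrapper.
original = greeter.__wrapped__
original()  # runs the original body only, without the extra noise
```

This can be handy in tests, where the undecorated behaviour is sometimes easier to check.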

### Function arguments

- so far we have only decorated functions without parameters
- to wrap arbitrary functions, we need to capture a variable number of arguments
- remember `*args` and `**kwargs`

In [None]:
def function_with_variable_arguments(*args, **kwargs):
    print(args)
    print(kwargs)
    
function_with_variable_arguments(1, "apple", tree="peach")

the same mechanism can be used in decorators

In [None]:
def add_noise(func):
    
    @wraps(func)
    def wrapped_with_noise(*args, **kwargs):
        print(f"Calling {func.__name__}...")
        func(*args, **kwargs)
        print(f"{func.__name__} finished.")
        
    return wrapped_with_noise

- the decorator has only one parameter: `func`, the function to wrap
- the returned function (`wrapped_with_noise`) takes arbitrary parameters: `*args`, `**kwargs`
- it calls `func`, the decorator's argument, and forwards those parameters to it

In [None]:
@add_noise
def personal_greeter(name):
    print(f"Hello {name}")
    
# personal_greeter("John", "Tim")  # raises TypeError because personal_greeter only takes one parameter
personal_greeter("John")

### Return values

Let's not forget about return values:

In [None]:
def add_noise(func):
    
    @wraps(func)
    def wrapped_with_noise(*args, **kwargs):
        print(f"Calling {func.__name__}...")
        ret_value = func(*args, **kwargs)
        print(f"{func.__name__} finished.")
        return ret_value
        
    return wrapped_with_noise
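
A short usage sketch may help here. The `add` function is a hypothetical example, and the decorator is repeated so the cell is self-contained:

```python
from functools import wraps

def add_noise(func):
    @wraps(func)
    def wrapped_with_noise(*args, **kwargs):
        print(f"Calling {func.__name__}...")
        ret_value = func(*args, **kwargs)  # keep the wrapped function's result
        print(f"{func.__name__} finished.")
        return ret_value                   # and hand it back to the caller
    return wrapped_with_noise

@add_noise
def add(a, b):  # hypothetical example function
    return a + b

result = add(2, b=3)
print(result)
```

Without the `return ret_value` line the decorated `add` would return `None`.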

## Decorators can take parameters too

A decorator with parameters has to return a decorator that takes only the function to wrap - the outer function is a __decorator factory__

In [None]:
def decorator_with_param(param1, param2=None):
    print(f"Creating a new decorator: {param1}, {param2}")
    
    def actual_decorator(func):
        
        @wraps(func)
        def wrapper(*args, **kwargs):
            print(f"Wrapper function {func.__name__}")
            print(f"Params: {param1}, {param2}")
            return func(*args, **kwargs)
        
        return wrapper
    
    return actual_decorator

In [None]:
@decorator_with_param(42, "abc")
def personal_greeter(name):
    print(f"Hello {name}")
    
@decorator_with_param(4)
def personal_greeter2(name):
    print(f"Hello {name}")
    
print("\nCalling personal_greeter")
personal_greeter("Mary")

In [None]:
def hello(name):
    print(f"Hello {name}")
    
hello = decorator_with_param(1, 2)(hello)
hello("john")

## Decorators can be implemented as classes

- `__call__` implements the wrapped function

In [None]:
class MyDecorator(object):
    def __init__(self, func):
        self.func_to_wrap = func
        wraps(func)(self)
        
    def __call__(self, *args, **kwargs):
        print(f"before func {self.func_to_wrap.__name__}")
        res = self.func_to_wrap(*args, **kwargs)
        print(f"after func {self.func_to_wrap.__name__}")
        return res
    
@MyDecorator
def foo():
    print("bar")

foo()
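
A class is convenient when the decorator needs to keep state between calls. The sketch below is one possible illustration; `CountCalls` and `square` are hypothetical names, not part of the lecture code:

```python
from functools import update_wrapper

class CountCalls:
    """Hypothetical decorator class that counts calls to the wrapped function."""

    def __init__(self, func):
        self.func_to_wrap = func
        self.call_count = 0
        update_wrapper(self, func)  # copy __name__, __doc__ etc. onto the instance

    def __call__(self, *args, **kwargs):
        self.call_count += 1
        return self.func_to_wrap(*args, **kwargs)

@CountCalls
def square(x):
    return x * x

square(2)
square(3)
print(square.call_count)
```

After decoration `square` is an instance of `CountCalls`, so the counter is an ordinary instance attribute.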

# Modules and imports

- the `import` statement combines two operations
    1. it searches for the named module,
    2. then it binds the results of that search to a name in the local scope -- [official documentation](https://docs.python.org/3/reference/import.html)
- several formats
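
The two operations can be separated by hand with `importlib`. This is only a sketch of the idea, using `math` as an arbitrary example module and `my_math` as a made-up name:

```python
import importlib
import sys

# Step 1: find and load the module (or reuse it from the sys.modules cache).
loaded = importlib.import_module("math")

# Step 2: bind the result to a name in the current scope.
my_math = loaded

import math  # the statement form performs both steps

# Both names refer to the same module object, which is cached in sys.modules.
print(my_math is math)
print(sys.modules["math"] is math)
```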

## importing full modules

In [None]:
import sys

print(", ".join(dir(sys)))
sys.stdout

## importing submodules

In [None]:
from os import path

try:
    os
except NameError:
    print("os does not seem to be defined")
    
try:
    path
    print("path found")
except NameError:
    print("path does not seem to be defined")

the `as` keyword binds the module to a different name:

In [None]:
import os as os_module

try:
    os
except NameError:
    print("os does not seem to be defined")
    
try:
    os_module
    print("os_module found")
except NameError:
    print("os_module does not seem to be defined")

Some widely used conventions are:

In [None]:
import numpy as np
import pandas as pd

## importing more than one module/submodule

In [None]:
# import os, sys
from sys import stderr, stdin, stdout

## importing functions or classes

In [None]:
from argparse import ArgumentParser
import inspect

inspect.isclass(ArgumentParser)

## importing everything from a module

**NOT** recommended because we have no way of knowing where names come from

In [None]:
len(globals())

In [None]:
from os import *

try:
    makedirs
    stat
    print("everything found")
except NameError:
    print("Something not found")

In [None]:
print(len(globals()))
import os
len(globals())

In [None]:
print(len(globals()))
from itertools import *
len(globals())

In [None]:
import os
os.stat

In [None]:
len(globals())
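
A concrete hazard of star imports is silent shadowing. The sketch below runs the import in a separate dictionary (an assumption made here to keep the notebook's own namespace clean) and checks what `open` refers to afterwards:

```python
import builtins
import os

# Run the star import in an isolated namespace.
namespace = {}
exec("from os import *", namespace)

# `open` now refers to os.open, which has a different signature
# than the built-in open().
print(namespace["open"] is os.open)
print(namespace["open"] is builtins.open)
```

With `import os` the two functions stay distinguishable as `open` and `os.open`.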

# Packaging

Python projects can be packaged and distributed.

## Naming convention

- all lowercase
- underscore separated, no hyphens
- unique on PyPI

## Minimal structure

~~~
example_package/
    example_package/
        __init__.py
    setup.py
~~~

- the source code is located in a separate subdirectory with the same name
  - just a convention, not mandatory
- `setup.py` describes how the package should be installed

## Source code

- each directory that has a `__init__.py` file is going to be a subpackage
  - `__init__.py` may be empty
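
To illustrate, the sketch below builds a throwaway package in a temporary directory and imports its subpackage. The names `demo_pkg`, `sub` and `ANSWER` are made up for this example:

```python
import importlib
import sys
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    # Build demo_pkg/ and demo_pkg/sub/, each with an __init__.py file.
    sub_dir = Path(tmp) / "demo_pkg" / "sub"
    sub_dir.mkdir(parents=True)
    (Path(tmp) / "demo_pkg" / "__init__.py").write_text("")  # may be empty
    (sub_dir / "__init__.py").write_text("ANSWER = 42\n")

    sys.path.insert(0, tmp)  # make the temporary directory importable
    importlib.invalidate_caches()
    try:
        sub = importlib.import_module("demo_pkg.sub")
        answer = sub.ANSWER
        package_name = sub.__package__
    finally:
        sys.path.remove(tmp)

print(answer, package_name)
```

Code placed in an `__init__.py` runs when the package is first imported, so names defined there become attributes of the package.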
  
## setup.py

A single call to `setuptools.setup`. Its arguments describe how the package should be installed.

## Nice to have

- licence
- `MANIFEST.in` - list of additional files
- `setup.cfg` - option defaults for `setup.py`
- `README.rst` - `README` using reStructuredText

https://github.com/pypa/sampleproject

## See also

https://packaging.python.org/tutorials/distributing-packages/

# Pip, virtualenv, Anaconda

1. Pip
  - package installer
2. Virtualenv
  - Python environment manager
  - a virtualenv is a Python environment separate from the system Python install
  - advantages
    - different Python version than the system default may be used
    - different package versions may be used
    - updates and package installs do not affect the system install
    - no need for root/Admin access
  - activate and deactivate
  - virtualenvwrapper is a collection of helper scripts (mainly for Linux)
3. Anaconda
  - package installer and environment manager
  - scientific packages included
  - Miniconda is a minimal installer: conda and Python only, without the bundled scientific packages

In [None]:
! which conda
! which python
! which pip
! which ls
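
The same questions can be asked from inside Python. This is a rough sketch: the `sys.prefix` comparison detects environments created by `venv` or `virtualenv`, and it is assumed here that a conda environment will not show up this way because it is a full installation of its own:

```python
import shutil
import sys

# Path of the interpreter running this code.
print(sys.executable)

# shutil.which looks a command up on PATH, similar to the shell's `which`.
print(shutil.which("pip"))  # None if pip is not on PATH

# Inside a venv/virtualenv, sys.prefix points at the environment,
# while sys.base_prefix still points at the base installation.
in_virtualenv = sys.prefix != sys.base_prefix
print(in_virtualenv)
```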

# Global Interpreter Lock (GIL)

- CPython, the reference implementation, has a reference counting garbage collector
- reference counting GC is **not** thread-safe :(
- "In CPython, the global interpreter lock, or GIL, is a mutex that protects access to Python objects, preventing multiple threads from executing Python bytecodes at once"
- IO, image processing and NumPy (numerical computation and matrix library) heavy lifting happens outside the GIL
- other computations cannot fully take advantage of multithreading :(
- Jython and IronPython do not have a GIL
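
The effect on blocking operations can be sketched with threads that only wait. Here `time.sleep` stands in for a blocking I/O call (it releases the GIL while waiting), and `wait_for_io` is a hypothetical helper:

```python
import time
from concurrent.futures import ThreadPoolExecutor

def wait_for_io(seconds):
    # time.sleep releases the GIL, standing in for a blocking I/O call.
    time.sleep(seconds)
    return seconds

start = time.perf_counter()
with ThreadPoolExecutor(max_workers=4) as pool:
    results = list(pool.map(wait_for_io, [0.2] * 4))
elapsed = time.perf_counter() - start

# The four waits overlap, so the total should be well below 4 * 0.2 s.
print(f"{elapsed:.2f} s")
```

Replacing the sleep with a pure-Python computation would be expected to remove most of the speedup in CPython, since only one thread can execute bytecode at a time.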

## See also

[Python wiki page on the GIL](https://wiki.python.org/moin/GlobalInterpreterLock)

[Live GIL removal (advanced)](https://www.youtube.com/watch?v=pLqv11ScGsQ)

# Type hints

[PEP 484](https://www.python.org/dev/peps/pep-0484/) defines the standard definitions, tools and some conventions for providing type information (hints) for static type analysis.

A simple example of a function that takes a `str` and returns a `str` looks like this (from [here](https://www.python.org/dev/peps/pep-0484/)):

In [None]:
def greeting(name: str) -> str:
    return 'Hello ' + name

greeting("John")

In [None]:
def happy_birthday(name: str, age: int) -> str:
    return f"Happy {age}th birthday, {name}"

happy_birthday("John", 25)

In [None]:
happy_birthday("John", "def")
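
Hints are stored on the function as plain metadata and are not checked when the function is called, which is why the call above does not fail. A minimal sketch using `typing.get_type_hints`:

```python
from typing import get_type_hints

def happy_birthday(name: str, age: int) -> str:
    return f"Happy {age}th birthday, {name}"

# The hints are available as a dictionary of parameter name -> type.
hints = get_type_hints(happy_birthday)
print(hints)

# The interpreter ignores them at call time, so a str "age" is accepted.
message = happy_birthday("John", "def")
print(message)
```

A static checker such as mypy is the kind of tool that would report the second call as a type error before the code runs.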

The [typing module](https://docs.python.org/3/library/typing.html) provides definitions for frequent abstract types:

In [None]:
from typing import Sequence

def print_all(elements: Sequence) -> None:
    for e in elements:
        print(e)

print_all("abc")
print_all(range(2))

`Optional[T]` marks a value that may be `None`, which is typical for arguments that default to `None`:

In [None]:
from typing import Optional

def print_all(elements: Sequence, prefix: Optional[str] = None) -> None:
    for e in elements:
        if prefix:
            print(f"{prefix}: {e}")
        else:
            print(e)

print_all("abc", "de")
print_all(range(2))

`Union` lists alternative types: the value may be any one of them. `Optional[T]` is just a shorthand for `Union[T, None]`.

In [None]:
from typing import Union

def happy_birthday(name: str, age: Union[int, str]) -> str:
    return f"Happy {age}th birthday, {name}"

happy_birthday("John", 25)
happy_birthday("John", "25")

# Misc topics we could not include

- coroutines, `async`, `await`, more in [PEP-492](https://www.python.org/dev/peps/pep-0492/)
- the `collections` module, container datatypes [doc](https://docs.python.org/3/library/collections.html)


# See also

Decorator overview with some advanced techniques: https://www.youtube.com/watch?v=9oyr0mocZTg

A very deep dive into decorators: https://www.youtube.com/watch?v=7jGtDGxgwEY