# Type Hints in Python

Paul Staab
2020-08-06

## How does typing in Python work?

- Python is a **dynamically typed** programming language
- Each variable has a fixed type at any given point during program execution
- The type can change during program execution
- Types are determined at runtime, not at compile time

In [None]:
x = 1.0
type(x)

In [None]:
x = 'Hello!'
type(x)

## What are type hints?

Type hints are optional annotations for the type of a variable:

In [None]:
x: int = 1

They are designed as hints for programmers to **document** the intended type of a variable in a **standardized, machine-readable way**.

They are not enforced by Python (CPython to be precise):

In [None]:
x: int = 'Hello!'

You can define type hints for **functions**

In [None]:
def plus(x: float, y: float) -> float:
    return x + y

plus(1.5, 2.3)

In [None]:
plus('ab', 'cd')  # Again, type hints are not enforced at runtime

and for member variables of **classes**

In [None]:
class Pet:
    name: str
    
    def __init__(self, name: str) -> None:
        self.name = name
        
Pet('Hansi')
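
Although the interpreter ignores the hints when running code, it does store them on the annotated object, which is what makes them machine-readable. A minimal sketch using the standard library function `typing.get_type_hints` (the function `plus` repeats the example from above):

```python
from typing import get_type_hints

def plus(x: float, y: float) -> float:
    return x + y

# The hints are stored on the function object and can be read by tools.
hints = get_type_hints(plus)
print(hints)

# They are not checked when the function is called,
# so passing two strings simply concatenates them.
result = plus('ab', 'cd')
print(result)
```

Type checkers, IDEs and libraries build on exactly this information.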

## What are benefits of using type hints?

### 1. Documentation
Type hints make code easier to understand for your colleagues and your future self. In production-grade projects you often have to document your code anyway. Type hints are a standardized way of doing so.

In [None]:
def plus(x, y):
    """
    Adds two numbers
    @param x The first number (float)
    @param y The second number (float)
    @return The sum of both numbers (float)
    """
    return x + y

def plus(x: float, y: float) -> float:
    """Adds two numbers"""
    return x + y

### 2. Better suggestions from your IDE's autocompletion

IDEs and editors like PyCharm and Jupyter try hard to determine the type of an object without running the code, but this is only sometimes possible. For example, there is no reliable way to derive the type of a function argument:

In [None]:
import pandas as pd

def fill_missing(df):
    df.fil
        
def drop_missing(df: pd.DataFrame):
    df.dr

Often, even a few type hints can help the autocompletion engine to derive the types of many other variables. Knowing the type also enables the IDE to make your life easier in other ways, e.g. by showing relevant documentation.

### 3. Detecting bugs with improved linting and static analysis

You can use code analysis tools, e.g. as a git pre-commit hook or as part of a CI pipeline, to automatically detect type problems in your code without executing it:
* General linting tools like pylint and pep8 work better with type hints
* Many specialized Python type checkers are available:
    * [mypy](http://www.mypy-lang.org)
    * [pyre](https://pyre-check.org) (Facebook)
    * [pytype](https://google.github.io/pytype) (Google)
    * [pyright](https://github.com/microsoft/pyright) (Microsoft)

In [None]:
!cat ./type_error_example.py

In [None]:
!./__pypackages__/3.8/lib/bin/mypy ./type_error_example.py
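
The contents of `type_error_example.py` are not reproduced here. As a purely hypothetical illustration (not the file used above), the following sketch shows the kind of mistake a static type checker is designed to catch. The function name `describe` and the values are made up for this example.

```python
def describe(value: int) -> str:
    """Return a short label for an integer value."""
    return "value: " + str(value)

# A static checker such as mypy is expected to flag this call, because a
# str is passed where the signature declares an int. Python itself runs
# the line without complaint, since the hints are not enforced at runtime.
label = describe("seven")
print(label)
```

The point is that the problem is reported before the code runs, rather than surfacing later as a confusing runtime error somewhere else.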

## 4. Third-party libraries can do amazing things with types
Because type hints are machine-readable, many third-party libraries use them to provide convenient functionality.
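
To give a rough idea of how a library can build on type hints, here is a minimal sketch using only the standard library. The decorator `convert_arguments` and the function `repeat` are hypothetical names invented for this illustration. Real libraries are far more thorough, and this sketch only handles plain classes such as `int` or `str`.

```python
import functools
import inspect
from typing import get_type_hints

def convert_arguments(func):
    """Hypothetical decorator: convert each argument to its annotated type."""
    hints = get_type_hints(func)
    signature = inspect.signature(func)

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        bound = signature.bind(*args, **kwargs)
        for name, value in bound.arguments.items():
            target = hints.get(name)
            # Only plain classes (int, float, str, ...) are handled here.
            if isinstance(target, type) and not isinstance(value, target):
                bound.arguments[name] = target(value)
        return func(*bound.args, **bound.kwargs)

    return wrapper

@convert_arguments
def repeat(word: str, times: int) -> str:
    return word * times

repeated = repeat("ab", "3")  # the string "3" is converted to the int 3
print(repeated)
```

This is similar in spirit to how a web framework can turn a path parameter, which always arrives as text, into the type declared in the function signature.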

### Check function inputs
You can use the library [`typeguard`](https://pypi.org/project/typeguard) to check the types of function arguments at runtime:

In [None]:
from typeguard import typechecked

@typechecked
def plus(x: float, y: float) -> float:
    return x + y

plus(1, 2)

In [None]:
plus('ab', 'cd')

### Check the schema of Pandas DataFrames

Unfortunately, pandas does not provide or use type hints at the moment, but we can use [`dataenforce`](https://github.com/CedricFR/dataenforce) to check the schema of a DataFrame:

In [None]:
from dataenforce import Dataset, validate

@validate
def process_data(data: Dataset["id": int, "name": object]):
    pass

process_data(pd.DataFrame(dict(id=[1,2], name=["Alice", "Bob"])))

In [None]:
process_data(pd.DataFrame(dict(id=[1,2])))

In [None]:
process_data(pd.DataFrame(dict(id=[1,'2'], name=["Alice", "Bob"])))

Similarly, the library [`pydantic`](https://pydantic-docs.helpmanual.io) validates input data, e.g. data parsed from JSON.

### Build an API
The library [`fastapi`](https://fastapi.tiangolo.com) relies heavily on type hints. You create API endpoints from functions using function decorators. It uses the type hints to, e.g., check inputs and generate documentation and Swagger (OpenAPI) definitions.

    from typing import Optional
    from fastapi import FastAPI
    app = FastAPI()
    @app.get("/items/{item_id}")
    def read_item(item_id: int, q: Optional[str] = None):
        return {"item_id": item_id, "q": q}

## What are the disadvantages of type hints?

* Writing type hints takes some time and effort
* They work best with modern Python (>= 3.6)
* They can slow down Python startup (a bit)
* Sometimes types can become complex and difficult to read (see next section)
* It can require advanced Python knowledge to find the correct type for certain variables, in particular when using duck typing and inheritance
* Your colleagues may force you to write a lot of type hints

## How can I express more complex types?

The `typing` module from the standard library provides many types:

### Optional
`Optional` is a wrapper around another type. It means that the variable can either have the wrapped type or be `None`.

In [None]:
from typing import Optional

@typechecked
def or_one(x: Optional[int]) -> int:
    return x or 1

or_one(5), or_one(None)
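
One subtlety of the example above: `x or 1` also replaces a legitimate `0`, because `0` is falsy. A minimal sketch with an explicit `None` check follows. The name `or_one_strict` is made up for illustration, and the `typechecked` decorator is left out to keep the block free of third-party imports.

```python
from typing import Optional

def or_one_strict(x: Optional[int]) -> int:
    # Only None is replaced by the default; 0 is returned unchanged.
    if x is None:
        return 1
    return x

results = (or_one_strict(5), or_one_strict(None), or_one_strict(0))
print(results)
```

After the `is None` check, type checkers also understand that `x` must be an `int` in the remaining branch.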

### Union
`Union` lists the different possibilities for what the type can be:

In [None]:
from typing import Union

@typechecked
def to_int(x: Union[str, float]) -> int:
    return int(x)

to_int(1.5), to_int('5')
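
Inside a function, an `isinstance` check tells both the reader and a type checker which member of the union is handled in each branch. The following is a sketch under the assumption that strings such as `'1.5'` should be accepted as well (the `to_int` function above raises a `ValueError` for them). The name `to_int_lenient` is hypothetical.

```python
from typing import Union

def to_int_lenient(x: Union[str, float]) -> int:
    if isinstance(x, str):
        # In this branch x is known to be a str: parse it as a float first.
        return int(float(x))
    # Otherwise x is a float (or an int, which is accepted where float is).
    return int(x)

converted = (to_int_lenient(1.5), to_int_lenient('5'), to_int_lenient('1.5'))
print(converted)
```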

### List, Dict, Set, Tuple

You can specify the types of the items in a container with the capitalized equivalents from `typing`:

In [None]:
from typing import List

@typechecked
def sum_list(x: List[float]) -> float:
    return sum(x)

sum_list([1.1, 2.2, 3.3])

In [None]:
from typing import Dict

@typechecked
def sum_dict_values(x: Dict[str, float]) -> float:
    return sum(x.values())

sum_dict_values({'a': 1.1, 'b': 2.2, 'c': 3.3})
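
`Tuple` and `Set` follow the same pattern. A short sketch without the `typechecked` decorator, so that it needs no third-party package. The function names are made up for illustration.

```python
from typing import Set, Tuple

def describe_pet(pet: Tuple[str, int]) -> str:
    # A tuple with exactly two items: a name (str) and an age (int).
    name, age = pet
    return f"{name} is {age} years old"

def count_unique(names: Set[str]) -> int:
    # A set of strings; duplicates are already removed by the set itself.
    return len(names)

description = describe_pet(("Hansi", 3))
count = count_unique({"Hansi", "Bello", "Hansi"})
print(description, count)
```

Unlike `List[float]`, a `Tuple` hint lists one type per position. A tuple of arbitrary length with a single item type is written as `Tuple[int, ...]`.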

Note: In professional software development, you often try to keep input types as generic as possible. That way, you can reuse a function more often. The `sum_list` function would also work for other containers and other numeric types. To express this, we could write it as:

In [None]:
from typing import Iterable, Union

@typechecked
def sum_iterable(x: Iterable[Union[int, float, complex]]) -> Union[int, float, complex]:
    return sum(x)

sum_iterable([1.1, 2.2, 3.3]), sum_iterable({1.1, 2.2, 3.3})

Finding the correct types can get quite tricky. For example, if we normalize an array to have mean 0, we could be tempted to again use `Iterable` as the type:

In [None]:
from typing import Iterable, List

@typechecked
def normalize(arr: Iterable[float]) -> List[float]:
    arr_sum = arr_len = 0
    for x in arr:
        arr_sum += x
        arr_len += 1
    
    arr_mean = arr_sum / arr_len
    return [x - arr_mean for x in arr]

arr = [1, 2, 3, 4, 5]
normalize(arr)

Say we are interested only in numbers smaller than 4. What will be the output of the following cell?

In [None]:
arr = filter(lambda x: x <= 3, arr)
normalize(arr)
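
The catch is that `filter` returns a one-shot iterator. The loop that computes the mean consumes it, so the second pass in the list comprehension finds nothing left and the result is an empty list. The sketch below reproduces this without the `typechecked` decorator and contrasts it with a variant that asks for a `Sequence`, which can be iterated more than once. Both function names are hypothetical, and `Sequence` is only one possible choice for the stricter hint.

```python
from typing import Iterable, List, Sequence

def normalize_iterable(arr: Iterable[float]) -> List[float]:
    # Same logic as `normalize` above: two passes over `arr`.
    arr_sum = 0.0
    arr_len = 0
    for x in arr:
        arr_sum += x
        arr_len += 1
    arr_mean = arr_sum / arr_len
    return [x - arr_mean for x in arr]  # second pass

def normalize_sequence(arr: Sequence[float]) -> List[float]:
    # A Sequence supports len() and repeated iteration.
    arr_mean = sum(arr) / len(arr)
    return [x - arr_mean for x in arr]

numbers = [1, 2, 3, 4, 5]

# The iterator is exhausted after the first pass, so nothing is returned.
result_lazy = normalize_iterable(filter(lambda x: x <= 3, numbers))

# Materializing the filtered values into a list gives the expected result.
result_list = normalize_sequence([x for x in numbers if x <= 3])
print(result_lazy, result_list)
```

With the `Sequence` hint, a static type checker can point out that a `filter` object does not fit, instead of the function silently returning an empty list.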

## What do you recommend for using type hints?

* Use more type hints the more 'serious' your project is
* Almost always use some type hints (e.g. for function arguments) to get better IDE support
* Do not overdo it
* Keep things simple
* It is okay to omit complex, unhelpful or unknown type hints

## Links
* https://realpython.com/python-type-checking/