# 8. Type Hints in Functions

> It should also be emphasized that **Python will remain a dynamically typed language, and the authors have no desire to ever make type hints mandatory, even by convention.**
>  Guido van Rossum

## Gradual Typing in Practice

In [42]:
def show_count(count, word):
    if count == 1:
        return f'1 {word}'
    count_str = str(count) if count else 'no'
    return f'{count_str} {word}s'

In [43]:
show_count(99, 'bird')

'99 birds'

In [44]:
show_count(1, 'bird')

'1 bird'

In [45]:
show_count(0, 'bird')

'no birds'

In [46]:
from pytest import mark

In [47]:
@mark.parametrize('qty, expected', [
    (1, '1 part'),
    (2, '2 parts'),
])
def test_show_count(qty, expected):
    got = show_count(qty, 'part')
    assert got == expected
    
def test_show_count_zero():
    got = show_count(0, 'part')
    assert got == 'no parts'

In [48]:
def show_count(count: int, word: str) -> str:
    if count == 1:
        return f'1 {word}'
    count_str = str(count) if count else 'no'
    return f'{count_str} {word}s'

In [49]:
@mark.parametrize('qty, expected', [
    (1, '1 part'),
    (2, '2 parts'),
])
def test_show_count(qty, expected):
    got = show_count(qty, 'part')
    assert got == expected
    
def test_show_count_zero():
    got = show_count(0, 'part')
    assert got == 'no parts'

### A Default Parameter Value

In [50]:
def test_irregular() -> None:
    got = show_count(2, 'child', 'children')
    assert got == '2 children'

In [51]:
def show_count(count: int, singular: str, plural: str = '') -> str:
    if count == 1:
        return f'1 {singular}'
    count_str = str(count) if count else 'no'
    if not plural:
        plural = singular + 's'
    return f'{count_str} {plural}'

### Using **None** as a Default

In [52]:
from typing import Optional

def show_count(count: int, singular: str, plural: Optional[str] = None) -> str:
    if count == 1:
        return f'1 {singular}'
    count_str = str(count) if count else 'no'
    if not plural:
        plural = singular + 's'
    return f'{count_str} {plural}'

## Types Are Defined by Supported Operations

In practice, it’s more useful to consider the set of supported operations as the defining characteristic of a type.

In [53]:
class Bird:
    pass

class Duck(Bird):
    def quack(self):
        print('Quack!')
        
def alert(birdie):
    birdie.quack()
    
def alert_duck(birdie: Duck) -> None:
    birdie.quack()

def alert_bird(birdie: Bird) -> None:
    birdie.quack()

In [54]:
daffy = Duck()
alert(daffy)

Quack!


In [55]:
alert_duck(daffy)

Quack!


In [56]:
alert_bird(daffy)

Quack!


In [57]:
try:
    woody = Bird()
    alert(woody)
    alert_duck(woody)
    alert_bird(woody)
except Exception as e:
    print(f"{e=}")

e=AttributeError("'Bird' object has no attribute 'quack'")


## Types Usable in Annotations

Pretty much any Python type can be used in type hints, but there are restrictions and recommendations. In addition, the typing module introduced special constructs with semantics that are sometimes surprising.

### The Any Type

The keystone of any gradual type system is the `Any` type, also known as the _dynamic type_. When a type checker sees an untyped function like this:

```python
def double(x):
    return x * 2
```

it assumes this:

```python
def double(x: Any) -> Any:
    return x * 2
```

However, a type checker will reject this function:

```python
def double(x: object) -> object:
    return x * 2
```
The problem is that `object` does not support the `__mul__` operation.
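The difference can also be observed at runtime. The following is a minimal sketch, not a type checker run: it only shows that `double` works for any operand that implements multiplication, while the `object` class itself defines no `__mul__`.

```python
from typing import Any


def double(x: Any) -> Any:
    # Any is assumed to support every operation, so a checker accepts `*`.
    return x * 2


print(double(3))       # 6
print(double('ab'))    # abab
print(double([1, 2]))  # [1, 2, 1, 2]

# object defines very few operations; multiplication is not one of them.
print(hasattr(object, '__mul__'))  # False
```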

### Simple Types and Classes

Simple types like `int`, `float`, `str`, and `bytes` may be used directly in type hints. Concrete classes from the standard library, external packages, or user-defined code (such as `FrenchDeck`, `Vector2d`, and `Duck`) may also be used in type hints.

### Optional and Union Types

### Generic Collections

Most Python collections are heterogeneous. For example, you can put any mixture of different types in a `list`. However, in practice that’s not very useful: if you put objects in a collection, you are likely to want to operate on them later, and usually this means they must share at least one common method.

```python
def tokenize(text: str) -> list[str]:
    return text.upper().split()
```

In [58]:
list1 = [ 1, 2, 2, 3, 4]
list2 = [ 2, 2, 4, 5]
list( set(list1) & set(list2) )

[2, 4]

### Tuple Types

There are three ways to annotate tuple types:

+ Tuples as records
+ Tuples as records with named fields
+ Tuples as immutable sequences

#### Tuples as records

If you're using a `tuple` as a record, use the `tuple` built-in and declare the types of the fields within `[]`.

For example, the type hint would be `tuple[str, float, str]` to accept a tuple with city name, population, and country: `('Shanghai', 24.28, 'China')`.

```python
from geolib import geohash as gh # type: ignore

PRECISION = 9

def geohash(lat_lon: tuple[float, float]) -> str:
    return gh.encode(*lat_lon, PRECISION)
```

#### Tuples as records with named fields

To annotate a tuple with many fields, or specific types of tuple your code uses in many places, I highly recommend using `typing.NamedTuple`.

```python
from typing import NamedTuple
from geolib import geohash as gh # type: ignore

PRECISION = 9

class Coordinate(NamedTuple):
    lat: float
    lon: float

def geohash(lat_lon: Coordinate) -> str:
    return gh.encode(*lat_lon, PRECISION)

def display_ll(lat_lon: tuple[float, float]) -> str:
    lat, lon = lat_lon
    ns = 'N' if lat>=0 else 'S'
    ew = 'E' if lon>=0 else 'W'
    return f'{abs(lat):0.1f}°{ns}, {abs(lon):0.1f}°{ew}'
```
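Because a `NamedTuple` class is a `tuple` subclass, a `Coordinate` can be passed wherever a `tuple[float, float]` is expected. The sketch below restates the display function under the name `display_ll` and leaves out the third-party `geolib` dependency so that it runs on its own; the sample coordinates are illustrative values only.

```python
from typing import NamedTuple


class Coordinate(NamedTuple):
    lat: float
    lon: float


def display_ll(lat_lon: tuple[float, float]) -> str:
    lat, lon = lat_lon
    ns = 'N' if lat >= 0 else 'S'
    ew = 'E' if lon >= 0 else 'W'
    return f'{abs(lat):0.1f}°{ns}, {abs(lon):0.1f}°{ew}'


shanghai = Coordinate(31.2, 121.5)
print(isinstance(shanghai, tuple))  # True
print(display_ll(shanghai))         # 31.2°N, 121.5°E
print(display_ll((-23.5, -46.6)))   # 23.5°S, 46.6°W
```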

#### Tuples as immutable sequences

To annotate tuples of unspecified length that are used as immutable lists, you must specify a single type, followed by a comma and `...` (that’s Python’s ellipsis token, made of three periods, not Unicode `U+2026—HORIZONTAL ELLIPSIS`).

For example, `tuple[int, ...]` is a tuple with `int` items.

The ellipsis indicates that any number of elements is acceptable, including zero: type checkers treat the empty tuple `()` as compatible with `tuple[int, ...]`. There is no way to specify fields of different types for tuples of arbitrary length.
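As a minimal sketch of this notation, the hypothetical `average` function below accepts tuples of different lengths through a single `tuple[int, ...]` annotation. The name and the sample values are assumptions made for illustration.

```python
def average(values: tuple[int, ...]) -> float:
    # The annotation fixes the item type but not the number of items.
    return sum(values) / len(values)


print(average((4,)))          # 4.0
print(average((1, 2, 3, 6)))  # 3.0
```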

Here is a `columnize` function that transforms a sequence into a table of rows and cells in the form of a list of tuples with unspecified lengths. 

In [59]:
from collections.abc import Sequence

def columnize(
    sequence: Sequence[str], num_columns: int = 0,
) -> list[tuple[str, ...]]:
    if num_columns == 0:
        num_columns = round(len(sequence) ** 0.5)
    num_rows, remainder = divmod(len(sequence), num_columns)
    num_rows += bool(remainder)
    return [tuple(sequence[i::num_rows]) for i in range(num_rows)]

In [60]:
animals = 'drake fawn heron ibex koala lynx tahr xerus yak zapus'.split()
table = columnize(animals)
table

[('drake', 'koala', 'yak'),
 ('fawn', 'lynx', 'zapus'),
 ('heron', 'tahr'),
 ('ibex', 'xerus')]

In [61]:
for row in table:
    print(''.join(f'{word:<10}' for word in row))

drake     koala     yak       
fawn      lynx      zapus     
heron     tahr      
ibex      xerus     


### Generic Mappings

Generic mapping types are annotated as `MappingType[KeyType, ValueType]`. The built-in `dict` and the mapping types in `collections` and `collections.abc` accept that notation in Python ≥ 3.9. For earlier versions, you must use `typing.Dict` and other mapping types from the `typing` module.

In [62]:
import sys
import re
import unicodedata
from collections.abc import Iterator

RE_WORD = re.compile(r'\w+')
STOP_CODE = sys.maxunicode + 1

def tokenize(text: str) -> Iterator[str]:
    """return iterable of uppercased words"""
    for match in RE_WORD.finditer(text):
        yield match.group().upper()
        
def name_index(start: int = 32, end: int = STOP_CODE) -> dict[str, set[str]]:
    index: dict[str, set[str]] = {}
    for char in (chr(i) for i in range(start, end)):
        if name := unicodedata.name(char, ''):
            for word in tokenize(name):
                index.setdefault(word, set()).add(char)
    return index

In [63]:
index = name_index(32, 65)

In [64]:
index['SIGN']

{'#', '$', '%', '+', '<', '=', '>'}

In [65]:
index['DIGIT']

{'0', '1', '2', '3', '4', '5', '6', '7', '8', '9'}

In [66]:
index['DIGIT'] & index['EIGHT']

{'8'}

### Abstract Base Classes

> Be conservative in what you send, be liberal in what you accept.
> Postel's law, a.k.a. the Robustness Principle

Consider this function signature:

```python
from collections.abc import Mapping

def name2hex(name: str, color_map: Mapping[str, int]) -> str:
    ...  # body omitted; only the signature matters here
```

Using `abc.Mapping` allows the caller to provide an instance of `dict`, `defaultdict`, `ChainMap`, a `UserDict` subclass, or any other type that is a subtype-of `Mapping`.

In contrast, consider this signature:

```python
def name2hex(name: str, color_map: dict[str, int]) -> str:
    ...  # body omitted; only the signature matters here
```

Now `color_map` must be a `dict` or one of its subtypes, such as `defaultdict` or `OrderedDict`. In particular, a subclass of `collections.UserDict` would not pass the type check for `color_map`, despite being the recommended way to create user-defined mappings.
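The relationship between these types can be checked at runtime with `isinstance`. In the sketch below, the body of `name2hex` and the `ColorDict` class are assumptions made for illustration; only the signature comes from the text above.

```python
from collections import ChainMap, UserDict
from collections.abc import Mapping


def name2hex(name: str, color_map: Mapping[str, int]) -> str:
    # Hypothetical body: format the color code as a six-digit hex string.
    return f'#{color_map[name]:06x}'


class ColorDict(UserDict):
    """A user-defined mapping built on UserDict (illustrative only)."""


colors = ColorDict({'red': 0xFF0000})
print(name2hex('red', colors))                        # #ff0000
print(name2hex('blue', ChainMap({'blue': 0x0000FF})))  # #0000ff

print(isinstance(colors, Mapping))  # True: accepted by Mapping[str, int]
print(isinstance(colors, dict))     # False: rejected by dict[str, int]
```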

### Iterable 

The `typing.List` documentation recommends abstract collection types such as `Sequence` and `Iterable` for function parameter type hints.

```python
from collections.abc import Iterable

FromTo = tuple[str, str]

def zip_replace(text: str, changes: Iterable[FromTo]) -> str:
    for from_, to in changes:
        text = text.replace(from_, to)
    return text
```
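Because `changes` is only iterated, any iterable of string pairs is acceptable. The sketch below repeats the function so that it runs on its own and passes it a list, a `dict.items()` view, and a generator; the sample data is illustrative.

```python
from collections.abc import Iterable

FromTo = tuple[str, str]


def zip_replace(text: str, changes: Iterable[FromTo]) -> str:
    for from_, to in changes:
        text = text.replace(from_, to)
    return text


l33t = [('a', '4'), ('e', '3'), ('i', '1'), ('o', '0')]
text = 'mad skilled noob powned leet'

print(zip_replace(text, l33t))                # a list of pairs
print(zip_replace(text, dict(l33t).items()))  # a dict items view
print(zip_replace(text, (pair for pair in l33t)))  # a generator
# each call prints: m4d sk1ll3d n00b p0wn3d l33t
```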

### Parameterized Generics and TypeVar

A parameterized generic is a generic type, written as `list[T]`, where `T` is a type variable that will be bound to a specific type with each usage. This allows a parameter type to be reflected on the result type.

In [67]:
from collections.abc import Sequence
from random import shuffle
from typing import TypeVar

T = TypeVar('T')

def sample(population: Sequence[T], size: int) -> list[T]:
    if size < 1:
        raise ValueError('size must be >= 1')
    result = list(population)
    shuffle(result)
    return result[:size]

In [68]:
from collections import Counter
from collections.abc import Iterable, Hashable
from typing import TypeVar

HashableT = TypeVar('HashableT', bound=Hashable)

def mode(data: Iterable[HashableT]) -> HashableT:
    pairs = Counter(data).most_common(1)
    if len(pairs) == 0:
        raise ValueError('no mode for empty data')
    return pairs[0][0]
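A short usage sketch for the bounded type variable: the return type follows the item type of the argument, so a type checker can infer `int` for the first call and `str` for the second. The function is repeated here so that the block runs on its own, and the sample data is illustrative.

```python
from collections import Counter
from collections.abc import Hashable, Iterable
from typing import TypeVar

HashableT = TypeVar('HashableT', bound=Hashable)


def mode(data: Iterable[HashableT]) -> HashableT:
    pairs = Counter(data).most_common(1)
    if len(pairs) == 0:
        raise ValueError('no mode for empty data')
    return pairs[0][0]


print(mode([1, 1, 2, 3, 3, 3, 3, 4]))  # 3
print(mode('red blue red green red'.split()))  # red

try:
    mode([])
except ValueError as exc:
    print(exc)  # no mode for empty data
```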

#### The AnyStr predefined type variable

The typing module includes a predefined `TypeVar` named `AnyStr`. It’s defined like this:

```python
AnyStr = TypeVar('AnyStr', bytes, str)
```
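A minimal sketch of how `AnyStr` is typically used: both arguments and the result must be the same type, either all `str` or all `bytes`. The `concat` function is a hypothetical example. Mixing `str` and `bytes` would be reported by a type checker, and here it also fails at runtime.

```python
from typing import AnyStr


def concat(first: AnyStr, second: AnyStr) -> AnyStr:
    # Each call binds AnyStr to either str or bytes, never a mix of both.
    return first + second


print(concat('spam', 'eggs'))    # spameggs
print(concat(b'spam', b'eggs'))  # b'spameggs'

try:
    concat('spam', b'eggs')  # a type checker flags this call
except TypeError:
    print('cannot mix str and bytes')
```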

### Static Protocols

In [73]:
try:
    l = [object() for _ in range(4)]
    print(f"{l=}")
    sorted(l)
except Exception as e:
    print(f"{e=}")

l=[<object object at 0x7f4f7eaa2040>, <object object at 0x7f4f7eaa2260>, <object object at 0x7f4f7eaa23d0>, <object object at 0x7f4f7eaa20c0>]
e=TypeError("'<' not supported between instances of 'object' and 'object'")


In [74]:
class Spam:
    def __init__(self, n): self.n = n
    def __lt__(self, other): return self.n < other.n
    def __repr__(self): return f"Spam({self.n})"

In [75]:
l = [ Spam(n) for n in range(5, 0, -1) ]
l

[Spam(5), Spam(4), Spam(3), Spam(2), Spam(1)]

In [76]:
sorted(l)

[Spam(1), Spam(2), Spam(3), Spam(4), Spam(5)]

In [77]:
from typing import Protocol, Any

class SupportsLessThan(Protocol):
    # A protocol declares only the required method; the body is just `...`
    def __lt__(self, other: Any) -> bool: ...

In [78]:
from collections.abc import Iterable
from typing import TypeVar

LT = TypeVar('LT', bound=SupportsLessThan)

In [79]:
def top(series: Iterable[LT], length: int) -> list[LT]:
    ordered = sorted(series, reverse=True)
    return ordered[:length]

In [80]:
from collections.abc import Iterator
from typing import TYPE_CHECKING

def test_top_tuples() -> None:
    fruit = 'mango pear apple kiwi banana'.split()
    series: Iterator[tuple[int, str]] = (
        (len(s), s) for s in fruit
    )
    length = 3
    expected = [(6, 'banana'), (5, 'mango'), (5, 'apple')]
    result = top(series, length)
    if TYPE_CHECKING:
        reveal_type(series)
        reveal_type(expected)
        reveal_type(result)
    assert result == expected

In [81]:
from typing import TYPE_CHECKING

import pytest

def test_top_objects_error() -> None:
    series = [object() for _ in range(4)]
    if TYPE_CHECKING:
        reveal_type(series)
    with pytest.raises(TypeError) as excinfo:
        top(series, 3)
    assert "'<' not supported" in str(excinfo.value)

### NoReturn

This is a special type used only to annotate the return type of functions that never return. Usually, they exist to raise exceptions. There are dozens of such functions in the standard library.
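As a sketch, a function annotated with `NoReturn` might look like the hypothetical `fatal` helper below, which always ends by calling `sys.exit()`. The helper's name and behavior are assumptions made for illustration.

```python
import sys
from typing import NoReturn


def fatal(message: str) -> NoReturn:
    # Hypothetical helper: report the problem, then exit. sys.exit() raises
    # SystemExit, so control never returns to the caller.
    print(message, file=sys.stderr)
    sys.exit(1)


try:
    fatal('configuration file not found')
except SystemExit as exc:
    print(exc.code)  # 1
```

A type checker can use the annotation to treat code placed after a call to `fatal` as unreachable.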

## Annotating Positional Only and Variadic Parameters

```python
from typing import Optional

def tag(
    name: str,
    /,
    *content: str,
    class_: Optional[str] = None,
    **attrs: str,
) -> str:
    ...  # body omitted; only the signature matters here
```

The `/` notation for positional-only parameters is only available in Python ≥ 3.8. In Python 3.7 or earlier, that’s a syntax error. 
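To see how each kind of parameter is bound, here is a sketch of the signature with one possible body that renders simple HTML. The body is an assumption made for illustration. Note that the hint on `*content` describes each positional argument (inside the function, `content` is a `tuple[str, ...]`), and the hint on `**attrs` describes each value (so `attrs` is a `dict[str, str]`).

```python
from typing import Optional


def tag(
    name: str,
    /,
    *content: str,
    class_: Optional[str] = None,
    **attrs: str,
) -> str:
    # Illustrative body: render one element per content item.
    if class_ is not None:
        attrs['class'] = class_
    attr_str = ''.join(
        f' {attr}="{value}"' for attr, value in sorted(attrs.items())
    )
    if content:
        return '\n'.join(f'<{name}{attr_str}>{c}</{name}>' for c in content)
    return f'<{name}{attr_str} />'


print(tag('br'))                    # <br />
print(tag('p', 'hello', id='33'))   # <p id="33">hello</p>
print(tag('p', 'a', 'b', class_='x'))

try:
    tag(name='p')  # name is positional-only, so this lands in **attrs
except TypeError:
    print('name must be passed positionally')
```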

## Imperfect Typing and Strong Testing

Maintainers of large corporate codebases report that many bugs are found by static type checkers and fixed more cheaply than if the bugs were discovered only after the code is running in production. However, it’s essential to note that automated testing was standard practice and widely adopted long before static typing was introduced in the companies that I know about.