# Nb Mypy

## Mypy Type Checking in Jupyter Notebooks

&copy; 2021 - Lars van den Haak,  Tom Verhoeff (Eindhoven University of Technology)

---

_Nb Mypy_ is a facility to automatically run [`mypy`](http://mypy-lang.org/) on Jupyter notebook cells as they are executed, whilst retaining information about the execution history.

## Table of Contents

* Installation
* Usage
* Type Hint Checking Examples

## Installation

* _Nb Mypy_ relies on the packages `mypy` and `astor`, which you can install via `python3 -m pip install mypy astor`.
* _Nb Mypy_ can be installed with `python3 -m pip install nb_mypy`.

## Usage

To enable automatic type checking, load the extension as follows:

In [1]:
%load_ext nb_mypy

Version 1.0.3




With the line magic `%nb_mypy` you can modify the behaviour of _Nb Mypy_:
* `%nb_mypy -v`: show version
* `%nb_mypy`: show the current state
* `%nb_mypy On`: enable automatic type checking
* `%nb_mypy Off`: disable automatic type checking
* `%nb_mypy DebugOn`: enable debug mode
* `%nb_mypy DebugOff`: disable debug mode

An unknown argument will result in an error message and a list of known commands:

In [2]:
%nb_mypy unknown

Unknown argument
 Valid arguments: ['', '-v', 'On', 'Off', 'DebugOn', 'DebugOff']


## Type Hint Checking Examples

Type hints were introduced in Python 3.5:

* [Full specification (PEP 484)](https://www.python.org/dev/peps/pep-0484)
* [Simplified introduction (PEP 483)](https://www.python.org/dev/peps/pep-0483)

* [`typing` - Support for type hints](https://docs.python.org/3/library/typing.html)
* [Type hints cheat sheet](https://mypy.readthedocs.io/en/stable/cheat_sheet_py3.html)

Type hints can be provided for
* Variables, with or without assignment
* Function parameters
* Function return values

The general syntax is `name: type`.

Built-in types can be used:
  - `str`, `int`, `float`, `bool`, `bytes`
  - `tuple`, `list`, `dict`, `set` (but these are not recommended; see below)
  
Consider the following declarations,
which are consistently typed.

In [3]:
answer: int = 42
name: str

def replicate(s: str, f: float) -> str:
    """Return f copies of s.
    """
    assert f >= 0, "f must be >= 0"
    n = int(f)  # integer part of f
    return n * s + s[:round((f - n) * len(s))]

def print_plural(word: str, n: int) -> None:
    """Print pluralized word.
    """
    print(f"{word}{'' if n == 1 else 's'}")

**Type hints** in Python:
* Are *voluntary*, not mandatory
* Are _not_ checked automatically by the Python interpreter
* Serve as **documentation**
* Can help **prevent mistakes**
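
A minimal sketch of the second point, using a hypothetical function `half` that is not part of the notebook: the hints are stored as metadata on the function object, but the interpreter accepts an argument of the wrong type without complaint.

```python
def half(n: int) -> int:
    """Return n divided by two, rounded down."""
    return n // 2

# The hints are kept as plain metadata on the function object ...
print(half.__annotations__)

# ... but they are not enforced at run time: a float is accepted here,
# even though the parameter is annotated as int.
result = half(7.0)
print(result)
```

Only a separate type checker such as `mypy` would flag the call `half(7.0)`.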

The following usages execute without exceptions,
but `mypy` type checking reveals mistakes.

In [4]:
answer = '42'
answer

<cell>1: error: Incompatible types in assignment (expression has type "str", variable has type "int")


'42'

In [5]:
answer: str = '42'  # can give new type

In [6]:
replicate([1, 2, 3], 2.5)

<cell>1: error: Argument 1 to "replicate" has incompatible type "List[int]"; expected "str"


[1, 2, 3, 1, 2, 3, 1, 2]

In [7]:
print_plural('word', 2.5)

<cell>1: error: Argument 2 to "print_plural" has incompatible type "float"; expected "int"


words


Code need not run for type checking to be useful.
The following mistake is caught even without calling the function.

In [8]:
def f(n: int) -> str:
    """Convert number to string.
    """
    return n

<cell>4: error: Incompatible return value type (got "int", expected "str")


### Type hints for collections

For collections, prefer capitalized type names,
with arguments to specify the types of the items.

These need to be imported from `typing`:

In [9]:
from typing import Any, Tuple, List, Dict, Set

In [10]:
t: Tuple[str, Any] = ('a', 1)
c: Tuple[bool, ...] = ()
names: List[str] = []
d: Dict[str, float] = {}
v: Set[int] = set()

In the assignments above,
the intended type cannot be inferred from the right-hand expression alone;
an empty list, for example, says nothing about the type of its items.

The following assignments execute without exception,
but the types are not correct.

In [11]:
t = ('b', 2, False)
c = (False, True, 3)
names = ['a', 1]
d = {'a': 1, 'b': 'c'}
v = {'a'}

<cell>1: error: Incompatible types in assignment (expression has type "Tuple[str, int, bool]", variable has type "Tuple[str, Any]")
<cell>2: error: Incompatible types in assignment (expression has type "Tuple[bool, bool, int]", variable has type "Tuple[bool, ...]")
<cell>3: error: List item 1 has incompatible type "int"; expected "str"
<cell>4: error: Dict entry 1 has incompatible type "str": "str"; expected "str": "float"
<cell>5: error: Argument 1 to <set> has incompatible type "str"; expected "int"


You can also use more _generic_ type names:
* `Sequence`: generalizes `List` and `Tuple`
* `Iterable`: anything usable in `for`-loop
* `Mapping`, `MutableMapping`: generalizes `Dict`, `DefaultDict`
* `Callable`: for anything that can be called
* `Generator`: for generators

In [12]:
from typing import Sequence, Mapping, Iterable
from typing import Callable, Generator
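
The sketch below shows how these generic names might appear in function signatures. All function names (`total`, `first`, `lookup`, `apply_twice`, `countdown`) are hypothetical and chosen only for illustration.

```python
from typing import Callable, Generator, Iterable, Mapping, Sequence

def total(values: Iterable[float]) -> float:
    """Sum the items of anything usable in a for-loop."""
    return sum(values)

def first(items: Sequence[str]) -> str:
    """Return the first item of a list or tuple of strings."""
    return items[0]

def lookup(table: Mapping[str, int], key: str) -> int:
    """Read from a mapping, returning 0 when the key is absent."""
    return table.get(key, 0)

def apply_twice(func: Callable[[int], int], x: int) -> int:
    """Call func on x, then on the result."""
    return func(func(x))

def countdown(n: int) -> Generator[int, None, None]:
    """Yield n, n - 1, ..., 1."""
    while n > 0:
        yield n
        n -= 1

print(total([1.0, 2.5]))                 # a list is an Iterable
print(first(('a', 'b')))                 # a tuple is a Sequence
print(lookup({'a': 1}, 'b'))             # a dict is a Mapping
print(apply_twice(lambda x: x + 3, 1))   # a lambda is a Callable
print(list(countdown(3)))                # a generator is also an Iterable
```

Annotating parameters with the most general type that suffices lets callers pass lists, tuples, or generators interchangeably.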

### Extra type hint features

* **Type aliases**: different name for same type
* `NewType`: treat existing type as different type
* `TypeVar`: to express type constraints
* `reveal_type`: to find out about inferred types

In [13]:
from typing import TypeVar, NewType

In [14]:
# Type alias
Distribution = Sequence[float]
# Assumptions for distr: Distribution:
# * all(0 <= p <= 1 for p in distr)
# * sum(distr) == 1

def condition(distr: Distribution, item: int) -> Distribution:
    """Return distribution under the condition that given item was not selected.
    
    Assumptions:
    * item in range(len(distr))
    * distr[item] < 1
    """
    result = distr[:]
    result[item] = 0
    q = sum(result)  # probability that item not selected
    return list(map(lambda x: x / q, result))

condition([0.1, 0.4, 0.5], 2)

<cell>15: error: Unsupported target for indexed assignment ("Sequence[float]")


[0.2, 0.8, 0.0]
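
The error arises because `Sequence` describes a read-only interface: it supports indexing and slicing, but not item assignment, and `distr[:]` is again typed as a `Sequence[float]`. One possible way to address this, shown here as a sketch and not necessarily the authors' preferred fix, is to copy the argument into an explicitly typed `List[float]` before modifying it.

```python
from typing import List, Sequence

# Type alias, as in the cell above
Distribution = Sequence[float]

def condition(distr: Distribution, item: int) -> Distribution:
    """Return distribution under the condition that given item was not selected."""
    result: List[float] = list(distr)  # mutable copy; item assignment is allowed
    result[item] = 0
    q = sum(result)  # probability that item not selected
    return [p / q for p in result]

print(condition([0.1, 0.4, 0.5], 2))
```

The parameter keeps the general `Sequence` type, so callers can still pass a tuple; only the local copy is a list.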

In [15]:
# New type name (not just an alias!)
Distance = NewType('Distance', float)
Area = NewType('Area', float)

def scale(factor: float, dist: Distance) -> Distance:
    return factor * dist

<cell>6: error: Incompatible return value type (got "float", expected "Distance")


In [16]:
def scale(factor: float, dist: Distance) -> Distance:
    return Distance(factor * dist)

In [17]:
a = Area(100)

scale(10, a)

<cell>3: error: Argument 2 to "scale" has incompatible type "Area"; expected "Distance"


1000
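
The call still runs and returns `1000` because `NewType` only exists for the type checker. At run time, calling `Area(...)` simply returns its argument unchanged, as this small sketch illustrates (it redefines `Area` so that it is self-contained):

```python
from typing import NewType

Area = NewType('Area', float)

a = Area(100)

# No wrapper object is created: a is the plain value that was passed in.
print(type(a))
print(a == 100)
print(a * 10)
```

The distinction between `Area` and `Distance` is therefore visible to `mypy` only, at no run-time cost.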

In [18]:
# Type variable
T = TypeVar('T')

def mid(seq: Sequence[T]) -> T:
    """Return item from seq near the middle.
    
    Assumption: seq is not empty
    """
    return seq[len(seq) // 2]

This is more informative than the following signature,
because the return type is tied to the item type of the argument:
```python
def mid(seq: Sequence[Any]) -> Any
```

In [19]:
# reveal_type is not defined, but interpreted by mypy.
# This extension also removes reveal_type calls, so we don't get errors.
reveal_type(mid([1, 2]))
reveal_type(mid(['a', 'b']))

<cell>3: note: Revealed type is 'builtins.int*'
<cell>4: note: Revealed type is 'builtins.str*'


### Advanced type hints

* `Optional`: if value can also be `None`
* `Union`: if value can have multiple types

In [20]:
from typing import Optional, Union

In [21]:
result: Optional[int] = None
    
answer: Union[str, int, float, bool] = "Don't know yet"
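
Values of type `Optional` or `Union` usually need to be narrowed before use. The sketch below uses two hypothetical functions, `find` and `describe`, to show the common patterns: an `is not None` test for `Optional`, and `isinstance` for `Union`.

```python
from typing import List, Optional, Union

def find(names: List[str], target: str) -> Optional[int]:
    """Return the index of target in names, or None if it is absent."""
    for i, name in enumerate(names):
        if name == target:
            return i
    return None

def describe(value: Union[str, int]) -> str:
    """Return an upper-cased string, or the successor of an int as a string."""
    if isinstance(value, str):
        return value.upper()   # in this branch, value is known to be a str
    return str(value + 1)      # here only int remains

index = find(['a', 'b'], 'b')
if index is not None:          # rule out None before doing arithmetic
    print(index + 1)

print(describe('ab'))
print(describe(41))
```

Without the `is not None` test, a type checker would be expected to reject `index + 1`, since `None` does not support addition.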

---

## (End of Notebook)