# A brief introduction to static typing in Python 

Davis Bennett  
Scicomp morning meeting  
December 6, 2021

These slides can be found at https://github.com/d-v-b/presentations/


* motivation
    - what is static typing?
    - why add static typing to a dynamic language?

* usage
    - type checking with `mypy`
    - typed data structures with `pydantic`

* references
    - Python type hints docs: https://docs.python.org/3/library/typing.html
    - pydantic docs: https://pydantic-docs.helpmanual.io/
    - gradual typing (theory): https://en.wikipedia.org/wiki/Gradual_typing

# Static vs dynamic typing

## C is statically typed

In [None]:
%%file add_one.c

int add_one(int arg) {
    return arg + 1;
}

int main() {
    char bad_arg[] = "not an int";
    int foo = add_one(bad_arg); // errors when compiled
    return 1;
}


In [2]:
!gcc add_one.c -o add_one && ./add_one

add_one.c: In function ‘main’:
    8 |     int foo = add_one(bad_arg); // errors when compiled
      |                       ^~~~~~~
      |                       |
      |                       char *
add_one.c:2:17: note: expected ‘int’ but argument is of type ‘char *’
    2 | int add_one(int arg) {
      |             ~~~~^~~


## Python is dynamically typed

In [3]:
%%file add_one.py

def add_one(arg):
    return arg + 1

if __name__ == '__main__':
    bad_arg = "not an int"
    foo = add_one(bad_arg) # errors when run

Overwriting add_one.py


In [4]:
!python add_one.py

Traceback (most recent call last):
  File "add_one.py", line 7, in <module>
    foo = add_one(bad_arg) # errors when run
  File "add_one.py", line 3, in add_one
    return arg + 1
TypeError: can only concatenate str (not "int") to str


Advantages of static typing:
- Catches type errors before the program runs
- Makes complicated code easier to understand
- Lets the compiler optimize, which can improve runtime performance

Disadvantages: 
- Slower to write
- More boilerplate code
- Harder to write generic code

Since Python 3.5, type annotations give Python some of the advantages of static typing 

In [5]:
%%file add_one_typed.py

def add_one(arg: int) -> int:
    return arg + 1

if __name__ == '__main__':
    bad_arg = "not an int"
    foo = add_one(bad_arg) # errors when typechecked

Overwriting add_one_typed.py


In [6]:
!mypy add_one_typed.py

add_one_typed.py:7: error: Argument 1 to "add_one" has incompatible type "str"; expected "int"
Found 1 error in 1 file (checked 1 source file)


The typechecker finds the error *before* runtime, which matters most when runtime failures are costly.
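
Note that the interpreter itself does not enforce annotations; they are stored as metadata and only external tools such as `mypy` check them. A minimal sketch, reusing `add_one` from the cell above:

```python
def add_one(arg: int) -> int:
    return arg + 1

# Annotations are stored on the function object but never checked by the interpreter.
print(add_one.__annotations__)

# A float is accepted at runtime, although a typechecker would reject this call.
print(add_one(1.5))

# A string still fails only when the call actually runs, like the untyped version.
try:
    add_one("not an int")
except TypeError as err:
    runtime_error = err
    print(f"runtime error: {err}")
```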

The type system supports unions:

```python
from typing import Union, List

def list_or_int(arg: int) -> Union[List, int]:
    if arg % 2 == 0:
        return []
    else:
        return 0
```
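
A caller that receives a union typically narrows it with `isinstance` before using it, and typecheckers such as `mypy` follow these checks. A small sketch, where `describe` is a hypothetical helper that is not part of the original slides:

```python
from typing import List, Union


def list_or_int(arg: int) -> Union[List, int]:
    if arg % 2 == 0:
        return []
    else:
        return 0


def describe(value: Union[List, int]) -> str:
    # Inside each branch the checker treats `value` as the narrowed type.
    if isinstance(value, list):
        return f"list of length {len(value)}"
    return f"int with value {value}"


print(describe(list_or_int(2)))
print(describe(list_or_int(3)))
```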

... generics:

```python
from typing import TypeVar

T = TypeVar('T')
def identity(arg: T) -> T:
    return arg
```
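
Type variables become more useful with containers, because they tie the element type of the input to the return type. A sketch with a hypothetical `first` function (not from the original slides):

```python
from typing import Sequence, TypeVar

T = TypeVar('T')


def first(items: Sequence[T]) -> T:
    # The return type follows the element type of whatever sequence is passed in.
    return items[0]


number = first([1, 2, 3])      # a checker infers int
word = first(["a", "b", "c"])  # a checker infers str
print(number + 1, word.upper())
```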


... a catch-all `Any` type:

```python
from typing import Any

def load_data() -> Any:
    # hypothetical stand-in for a loader whose return type is unknown
    return 0

blob: Any = load_data()
blob += 10 # the typechecker is fine with this
```


... and a lot more features (but not runtime performance*)

*but see https://cython.readthedocs.io/en/latest/src/tutorial/pure.html

# How and when to use Python type annotations

## How to use type annotations

* While developing:
    1. Write code
    2. Annotate types
    3. Run `mypy src/my_code.py` to get results from the typechecker

Major IDEs (VS Code, PyCharm) also read type annotations and use them for autocompletion and inline error highlighting

## When to use type annotations

Whenever you can, but especially when your project gets bigger.

## Type-checked data structures with pydantic

- type annotations enable libraries to check types at runtime
- the `pydantic` library uses this for data validation
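
To illustrate how a library can read annotations at runtime, here is a minimal sketch built on the standard library's `typing.get_type_hints`. The `check_fields` helper and the `Point` class are hypothetical, the check handles only plain classes such as `int` and `str`, and this is not how pydantic is implemented:

```python
from typing import get_type_hints


class Point:
    x: int
    y: int


def check_fields(cls: type, **values: object) -> dict:
    """Raise TypeError if a value does not match the class-level annotation."""
    hints = get_type_hints(cls)  # maps field names to annotated types
    for field, expected in hints.items():
        if field not in values:
            raise TypeError(f"missing field {field!r}")
        if not isinstance(values[field], expected):
            raise TypeError(
                f"{field!r} should be {expected.__name__}, "
                f"got {type(values[field]).__name__}"
            )
    return values


print(check_fields(Point, x=1, y=2))
try:
    check_fields(Point, x=1, y="two")
except TypeError as err:
    print(f"rejected: {err}")
```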

In [7]:
from pydantic import BaseModel

class User(BaseModel):
    name: str
    id: int

# name=100 is coerced to the string '100'; id='steve' cannot be coerced to int
steve = User(name=100, id='steve') # raises ValidationError


ValidationError: 1 validation error for User
id
  value is not a valid integer (type=type_error.integer)
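
In application code the error is usually caught rather than left to stop the program. A sketch, assuming `pydantic` is installed; `ValidationError` and its `errors()` method are part of pydantic's public API, while the input values here are made up for illustration:

```python
from pydantic import BaseModel, ValidationError


class User(BaseModel):
    name: str
    id: int


try:
    User(name='steve', id='not a number')
except ValidationError as err:
    # errors() returns one dict per failing field; 'loc' names the field.
    failed_fields = [e['loc'] for e in err.errors()]
    print(failed_fields)
```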

Pydantic supports nested models, and can generate JSON from model instances

In [8]:
from pydantic import BaseModel
from typing import Tuple


class ImageVolume(BaseModel):
    name: str
    size: Tuple[int, ...]
    resolution: Tuple[float, ...]


class Experiment(BaseModel):
    user: str
    images: Tuple[ImageVolume, ...]


experiment = Experiment(
    user='bennettd',
    images=({'name': 'exp1', 'size': (10, 10), 'resolution': (1.0, 1.0)},),
)

print(experiment.json(indent=2))

{
  "user": "bennettd",
  "images": [
    {
      "name": "exp1",
      "size": [
        10,
        10
      ],
      "resolution": [
        1.0,
        1.0
      ]
    }
  ]
}


We can make JSON schemas from data models, which can be used for code generation 

In [9]:
print(Experiment.schema_json(indent=2))

{
  "title": "Experiment",
  "type": "object",
  "properties": {
    "user": {
      "title": "User",
      "type": "string"
    },
    "images": {
      "title": "Images",
      "type": "array",
      "items": {
        "$ref": "#/definitions/ImageVolume"
      }
    }
  },
  "required": [
    "user",
    "images"
  ],
  "definitions": {
    "ImageVolume": {
      "title": "ImageVolume",
      "type": "object",
      "properties": {
        "name": {
          "title": "Name",
          "type": "string"
        },
        "size": {
          "title": "Size",
          "type": "array",
          "items": {
            "type": "integer"
          }
        },
        "resolution": {
          "title": "Resolution",
          "type": "array",
          "items": {
            "type": "number"
          }
        }
      },
      "required": [
        "name",
        "size",
        "resolution"
      ]
    }
  }
}


# Coda

- type annotations are an efficient way to prevent errors and keep large codebases easy to understand
- tools such as `pydantic` can use type annotations to validate data at runtime
- type annotations by themselves do not improve runtime performance