# Types and Type Systems

What are types, and how can we reason about them?

### Motivation

Something that shows up in our code all the time...  

```
if type(x) == float:
    do_something(x)
```

This is almost certainly not what you want to do!<br>

What did you mean? And more importantly, _why_?

## Historical Perspective

What kinds of type systems exist in programming?
What does _Python_ do?

We might think of types as having certain meanings, even as being fundamental to programming, but this hasn't always been the case.

## Untyped Languages

E.g. assembly

```
add $10, %eax /* EAX is set to EAX + 10 */
```

Assembly doesn't know or care what is in memory - it's all bytes. There is integer addition happening in EAX, but no _type information_ about its contents.

## Weakly Typed Languages

Things have types, but they might not be enforced, they might get 'coerced' into other types, or you may be able to subvert the type system altogether. This may or may not be a good thing, depending on how you look at it...

C
```
    int *x; // Pointer to an integer
    float y = 21.1; // Floating point number
    x = (int*)&y; // Treat x as a pointer to an integer that is looking at our float
    *x = 10; // Assign 10 (integer) to the memory that held our float
```

`y` is still a floating point variable, but it has been overwritten with a bunch of bits that only make sense when interpreted as an integer. It's still a 'valid' floating point number, but probably not a meaningful one...
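
To see which bit pattern ends up in `y`, the same reinterpretation can be mimicked from Python with the standard `struct` module. This is only a sketch: it copies bytes rather than aliasing memory the way the C pointer cast does, and it assumes a 32-bit `int` and an IEEE 754 single-precision `float`, which is typical but not guaranteed by C.

```python
import struct

# Pack the integer 10 as 4 little-endian bytes, as a 32-bit C int would store it
raw = struct.pack("<i", 10)

# Read the same 4 bytes back as an IEEE 754 single-precision float
(as_float,) = struct.unpack("<f", raw)

print(raw.hex())   # the bytes are unchanged...
print(as_float)    # ...but as a float they encode a tiny subnormal number, not 10.0
```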

In [None]:
# Contrast with Python - we can certainly convert a float into an integer -
# but we do so by constructing a new value, which doesn't share memory

y = 21.1
x = int(y)

### A brief diversion - What do we mean when we say "Python"?

The Python programming language is really a language specification; the version we run is called CPython. This is an _implementation_ of the language, and is the de-facto standard.

Some things about Python 2.x are not true of Python 3.x - this is not just about new language features (e.g. the kinds of differences between Python 3.5 and Python 3.7); some core concepts have changed.

From here on out - when we say Python, we mean CPython 3.x.

## Strong Typing

Everything has a type; types cannot be changed. They can be reasoned about, and are enforced at runtime (albeit in an inconsistent way).


In [None]:
x = 5.1
type(x)

In [None]:
def thing(x):
    return x

type(thing)

In [None]:
# Runtime type enforcement

len(thing)

In [None]:
# x is still the float from above - floats are not subscriptable, so this raises a TypeError
x[5]

In [None]:
# A different kind of TypeError - this is not an issue with list itself,
# but with the type of the argument

l = [0, 1, 2]
l[0.5]

In [None]:
# Is this changing a type?  Absolutely not - we're just reassigning an identifier
# Addresses are not houses, and the string "My address" is neither my address nor my house...

thing = []
type(thing)

In [None]:
# Is Python really "strongly" typed, or is it "duck typed"?

class Duck:
    def quack(self):
        return "I'm a duck! Quack!"
        
class DishonestPig:
    def quack(self):
        return "I'm also a duck! Quack!"
        
for animal in [Duck, DishonestPig]:
    me = animal()
    print(me.quack())

In [None]:
# It is certainly possible to write something that looks like a list,
# but doesn't enforce types in the same way
# This is very bad code, but there is plenty of this out there in the world!
# Type enforcement is only as good as the library that uses it...

class MyList:
    def __init__(self, contents):
        self._internal_list = {i:c for i,c in enumerate(contents)}
        
    def __getitem__(self, index):
        return self._internal_list[index]
    
l = MyList((0,1,2))
l[0.5]
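
For comparison, a list-like wrapper can opt back in to the built-in behaviour. The sketch below is one possible approach, not the only one: `StrictList` is a hypothetical name, and it leans on `operator.index`, which raises `TypeError` for anything that cannot be losslessly treated as an integer (it does not attempt to support slices).

```python
import operator


class StrictList:
    """Hypothetical list wrapper that rejects non-integer indices."""

    def __init__(self, contents):
        self._items = list(contents)

    def __getitem__(self, index):
        # operator.index accepts int-like objects and rejects floats, strings, etc.
        return self._items[operator.index(index)]


strict = StrictList((0, 1, 2))
print(strict[1])

try:
    strict[0.5]
except TypeError as err:
    print(f"TypeError: {err}")
```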

## Object Oriented Programming
### Taxonomies of ideas

All of the above is increasingly regarded as a bit arbitrary - what we really want to know is how a language (and a programmer) can _reason_ about types.

Enter object oriented programming - everything is an object!<br>Most objects are also other things, probably several.<br>
Objects are not just "things", but the fundamental philosophical building block in the 'universe of ideas'

In [None]:
x = 5.1
type(x)

In [None]:
# Well, we know the answer...
type(x) == float

In [None]:
# We already know what type x is - so this seems like a silly question.
type(x) == object

In [None]:
# And yet, everything is an object
isinstance(x, object)

In [None]:
# x is an _instance_ of the idea (type) 'float'
isinstance(x, float)

In [None]:
# float is an instance of the idea 'type'
isinstance(float, type)

In [None]:
# x is not an instance of type
isinstance(x, type)

In [None]:
# What kind of idea is 'type'?  Well, let's just ask Python!
type(type)
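
The answers above fit together into a small loop at the top of the hierarchy. A sketch that collects them as assertions; the name `relationships_hold` is only there to mark that every check passed.

```python
x = 5.1

# "instance of" relationships
assert isinstance(x, float)        # x is a float
assert isinstance(float, type)     # float is a type
assert isinstance(type, type)      # type is its own type
assert isinstance(type, object)    # ...and, like everything, an object
assert isinstance(object, type)    # object is itself a type

# "subclass of" relationships
assert issubclass(float, object)
assert issubclass(type, object)
assert not issubclass(object, type)

relationships_hold = True
```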

## Type systems as (something like) set theory

In [None]:
from numbers import Number, Real, Integral

In [None]:
isinstance(5, Number), isinstance(5, Real), isinstance(5, Integral)

In [None]:
isinstance(5.1, Number), isinstance(5.1, Real), isinstance(5.1, Integral)

In [None]:
# So, back to our motivating example...

def do_something(x: int):
    if type(x) == int:
        return x + 3
    else:
        raise TypeError("Not an integer!")

In [None]:
import numpy as np

In [None]:
some_array = np.array(range(10000),dtype=np.int16)

In [None]:
x = some_array[10]
x

In [None]:
do_something(x)

In [None]:
def do_something(x: Integral):
    if isinstance(x, Integral):
        return x + 3
    else:
        raise TypeError("Not an integer!")

In [None]:
do_something(x), do_something(10)

In [None]:
do_something(5.1)

In [None]:
# The idea of a floating point number is not in fact a number...
isinstance(float, Number)

In [None]:
# But, in the taxonomy of things, floating point numbers are a subset of numbers...
issubclass(float, Number)
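
The 'set theory' reading can be checked directly: `isinstance` asks whether a value is a member of a set, while `issubclass` asks whether one set sits inside another. A short sketch using only the standard `numbers`, `fractions` and `decimal` modules (`tower_checked` is just a marker that all checks passed):

```python
from decimal import Decimal
from fractions import Fraction
from numbers import Integral, Number, Rational, Real

# Subset relationships between the abstract categories
assert issubclass(Integral, Rational)
assert issubclass(Rational, Real)
assert issubclass(Real, Number)
assert not issubclass(Real, Integral)

# Membership of concrete values
assert isinstance(Fraction(1, 2), Rational)
assert isinstance(True, Integral)        # bool is a subclass of int
assert not isinstance(5.1, Rational)     # float counts as Real, but not Rational

# Decimal is a Number, but is not registered as Real
assert isinstance(Decimal("1.5"), Number)
assert not isinstance(Decimal("1.5"), Real)

tower_checked = True
```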

In [None]:
# Bonus round - we ask the hard philosophical questions
# According to Python - nothing is definitely something

isinstance(None, object)

## A line in the sand - 'type system' reasoning vs actual types

Not all categories are types - but the type system can still use the same kinds of reasoning


In [None]:
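# Old Python (<3.10) style: since 3.9 the built-in list/dict can be subscripted directly,
# and since 3.10 `X | Y` can replace Union/Optional. Included for reference, but mostly redundant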
from typing import Union, Dict, List, Any, Optional

In [None]:
isinstance(Union[float, dict], type)

In [None]:
isinstance(list[str], type)

In [None]:
isinstance(Optional[float], type)

In [None]:
isinstance(Any, type)
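
These annotation objects may not be types, but they can still be taken apart. A minimal sketch using `typing.get_origin` and `typing.get_args` (available since Python 3.8); the helper name `describe` is hypothetical.

```python
from typing import Optional, get_args, get_origin


def describe(annotation):
    """Return (origin, args) for a typing construct, or (annotation, ()) for a plain class."""
    origin = get_origin(annotation)
    if origin is None:
        return annotation, ()
    return origin, get_args(annotation)


print(describe(float))            # a plain class has no origin or arguments
print(describe(list[str]))        # origin is list, with str as its argument
print(describe(Optional[float]))  # a union of float and NoneType
```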

In [None]:
def optional_func(x: Optional[float]) -> type:
    if x is None: # None is a 'singleton' - there's only one in the universe.
        raise Exception("Sorry, I lied when I said this was optional... ")
    else:
        return type(x)

In [None]:
# The inspect module allows us to 'introspect' on code, modules, objects etc

import inspect

In [None]:
inspect.signature(optional_func)

In [None]:
for k, v in inspect.get_annotations(optional_func).items():
    print(k, type(v))
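
Python itself does not act on annotations at runtime, but introspection makes it possible to do so by hand. The decorator below is a rough sketch rather than a recommended tool: `enforce_annotations` and `add_three` are hypothetical names, and it only checks annotations that are real classes (including ABCs such as `Integral`), skipping constructs like `Optional[float]`.

```python
import functools
import inspect
from numbers import Integral


def enforce_annotations(func):
    """Hypothetical decorator: check arguments against plain-class annotations."""
    signature = inspect.signature(func)

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        bound = signature.bind(*args, **kwargs)
        for name, value in bound.arguments.items():
            expected = signature.parameters[name].annotation
            # Skip unannotated parameters and anything that is not a real class
            if expected is inspect.Parameter.empty or not isinstance(expected, type):
                continue
            if not isinstance(value, expected):
                raise TypeError(
                    f"{name} should be {expected.__name__}, got {type(value).__name__}"
                )
        return func(*args, **kwargs)

    return wrapper


@enforce_annotations
def add_three(x: Integral):
    return x + 3


print(add_three(2))
```

Calling `add_three(2.5)` raises a `TypeError` before the function body runs, because a float is not an instance of `Integral`.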

In [None]:
# When you call a function, the arguments are by definition objects (since everything is an object)
# All objects have a type (and a type hierarchy)

In [None]:
# A supposedly meaningful hierarchy

class ColourfulThing:
    def __init__(self, colour: str, **kwargs):
        super().__init__(**kwargs)
        self.colour = colour
        
class InflatableThing:
    def __init__(self, is_inflated: bool, **kwargs):
        super().__init__(**kwargs)
        self.is_inflated = is_inflated
        
    def pop(self):
        if self.is_inflated:
            self.is_inflated = False
            return "Pop!"
        else:
            return "Nothing happens..."
        
class Balloon(ColourfulThing, InflatableThing):
    def __init__(self, colour, is_inflated):
        super().__init__(colour=colour, is_inflated=is_inflated)
        
    def __repr__(self):
        state = "inflated" if self.is_inflated else "deflated"
        return f"A currently {state} {self.colour} balloon"

In [None]:
b = Balloon("red", True)
b

In [None]:
b.pop()

In [None]:
b

In [None]:
# How does inheritance get reconciled?
# Method Resolution Order

Balloon.mro()

In [None]:
x = 5.1
isinstance(x, float), isinstance(x, Number)

In [None]:
# Let's have a look at MRO for a built-in type

float.mro()

In [None]:
# The Abstract Base Class machinery (the abc module) provides a metaclass, ABCMeta.
# Classes built with it can answer isinstance/issubclass questions about types that
# are registered with them, even if they are not part of those types' MRO hierarchy

type(Number)
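
A small sketch of how that registration works, using `ABC.register` from the standard `abc` module. `Quacker` and `Robot` are invented names for illustration; `Number` and `float` behave the same way.

```python
from abc import ABC
from numbers import Number


class Quacker(ABC):
    """Hypothetical abstract category of things that quack."""


class Robot:
    def quack(self):
        return "Beep. Quack."


# Declare Robot a 'virtual subclass' without touching its inheritance chain
Quacker.register(Robot)

print(isinstance(Robot(), Quacker))   # membership is granted by registration
print(Quacker in Robot.mro())         # ...but the MRO is unchanged
print(Number in float.mro())          # same story for float and Number
```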

## A cautionary tale from the 1990s...

Object orientation is new and great and replaces everything! Everything lives in a hierarchy, and the One True Taxonomy of the Universe is obvious!

In [None]:
class ConcreteBalloon(Balloon):
    def __init__(self):
        # You can't really inflate or deflate a concrete balloon, we'll say that is_inflated is True
        # because it looks like it is...
        super().__init__(colour="Grey", is_inflated=True)


In [None]:
cb = ConcreteBalloon()
cb.pop()

In [None]:
cb

In [None]:
# Wow, OK - I guess I can start overwriting those methods, and reading the parent class documentation...
# Actually, maybe this hierarchy is wrong (is that possible?)  I guess I can refactor the code...

## Metaprogramming with types as objects

In [None]:
# Let's try something a bit weird...

# Please never do anything like this, just know that you can
class WordNumber(int):
    
    def __repr__(self):
        return str(self)
    
    def __str__(self):
        if self < 0:
            return "negative number"
        elif self > 3:
            return "higher than I can count!"
        word_map = {
            0: "zero",
            1: "one",
            2: "two",
            3: "three"
        }
        return word_map[self]
    
    def __add__(self, other):
        return WordNumber(int(self)+int(other))

In [None]:
x = WordNumber(2)
x

In [None]:
type(x) == int

In [None]:
isinstance(x, int)

In [None]:
isinstance(x, Integral)

In [None]:
# Inspecting the type hierarchy;
# MRO (Method Resolution Order)

WordNumber.mro()

In [None]:
def wordnumber_factory(base_type: type) -> type:
    class WordNumber(base_type):
    
        def __repr__(self):
            return str(self)

        def __str__(self):
            if self < 0:
                return "negative number"
            elif self > 3:
                return "higher than I can count!"
            word_map = {
                0: "zero",
                1: "one",
                2: "two",
                3: "three"
            }
            return word_map[self]

        def __add__(self, other):
            return WordNumber(base_type(self)+base_type(other))
        
    return WordNumber

In [None]:
WordNumberFloat = wordnumber_factory(float)
WordNumberInt = wordnumber_factory(int)

In [None]:
type(WordNumberFloat)
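
The fact that a class is an instance of `type` hints that classes can also be built by calling `type` directly. A sketch of the three-argument form `type(name, bases, namespace)`; `LoudFloat` and `shout` are made-up names for illustration, and as with the factory above this is something to know about rather than reach for.

```python
def shout(self):
    # float(self) sidesteps any overridden __str__ on the subclass
    return f"{float(self)}!"


# Roughly equivalent to writing `class LoudFloat(float)` with a shout method
LoudFloat = type("LoudFloat", (float,), {"shout": shout})

value = LoudFloat(2.5)
print(type(LoudFloat), value.shout())
```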

In [None]:
x = WordNumberInt(2)
x

In [None]:
y = WordNumberFloat(2.0)
y

In [None]:
y = WordNumberFloat(2.5)
y

In [None]:
def better_wordnumber_factory(base_type):
    class WordNumber(base_type):
    
        def __repr__(self):
            return str(self)

        def __str__(self):
            if self < 0:
                return "negative number"
            elif self > 3:
                return "higher than I can count!"
            word_map = {
                0: "zero",
                1: "one",
                2: "two",
                3: "three"
            }
            if int(self) == self:
                return word_map[self]
            else:
                return f"{word_map[int(self)]} and a bit"

        def __add__(self, other):
            return WordNumber(base_type(self)+base_type(other))
        
    return WordNumber

In [None]:
WordNumberFloat = better_wordnumber_factory(float)

In [None]:
x = WordNumberFloat(1.0)
y = WordNumberFloat(1.5)
x, y

In [None]:
x + y

In [None]:
float(x+y)

In [None]:
f"I have {x+y} fish in my pockets"

In [None]:
isinstance(x, Real)

In [None]:
np.ones(5) * (x+y)

## Is Python really "type safe"?

Sure it is - as long as you're in Python.

In [None]:
x = np.linspace(0,1,20, dtype=np.float64)

In [None]:
x

In [None]:
# The raw memory contents of our numpy array - the thing that will get passed into C/Fortran functions
# inside numpy, scipy etc

x.tobytes()

In [None]:
type(x.tobytes())

In [None]:
# Reconstitute this array from the bytes - note that we need to specify type and shape information
# This got lost in the "untyped" bytes representation

# OK, so we only read 10 values in shape, but they're clearly the _right_ values...

np.ndarray(buffer=x.tobytes(), dtype=float, shape=(10,))

In [None]:
# Let's make a subtle mistake...

x = np.linspace(0,1,20,dtype=np.float32)

In [None]:
y = np.ndarray(buffer=x.tobytes(), dtype=float, shape=(10,))

In [None]:
import pandas as pd

In [None]:
# Our original array

pd.Series(x).plot()

In [None]:
# The reconstituted version
# Something looks... a bit strange

pd.Series(y).plot()

### Floating point representation

We'll cover the full explanation of what just happened in another workshop. The short answer is that floating point numbers use part of their memory to represent the exponent (scale), and another part to represent the precise value (mantissa); `float32` and `float64` divide their bits up differently, so reading the bytes of two `float32` values as one `float64` scrambles both parts. As discussed in the C example at the start, something might be 'valid' (i.e. not crash), but still be nonsense - or even worse, Plausible But Wrong. On the plus side, exploiting this kind of thing can be highly lucrative...

https://en.wikipedia.org/wiki/Fast_inverse_square_root<br>
https://en.wikipedia.org/wiki/Quake_III_Arena
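
As a small preview, the fields of a single-precision float can be pulled apart with the standard `struct` module. This sketch assumes the IEEE 754 binary32 layout (1 sign bit, 8 exponent bits, 23 mantissa bits); `float32_fields` is a hypothetical helper, not part of any library.

```python
import struct


def float32_fields(value):
    """Split a number, stored as a 32-bit float, into (sign, exponent, mantissa) bit fields."""
    # Pack as a float32, then reread the same 4 bytes as an unsigned 32-bit integer
    (bits,) = struct.unpack("<I", struct.pack("<f", value))
    sign = bits >> 31
    exponent = (bits >> 23) & 0xFF   # stored with a bias of 127
    mantissa = bits & 0x7FFFFF       # fractional part; the leading 1 is implicit
    return sign, exponent, mantissa


for number in (1.0, 0.5, -2.0):
    print(number, float32_fields(number))
```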

### Final thoughts

Python's type system has grown gradually over many years, and several "Pythons"; it isn't always internally consistent, but it does provide significant capacity for reasoning about ideas, and improving legibility and safety of code.<br>

...As always, it's only as good as the weakest link in the chain. Know your libraries, and when not to trust them - and try to write code that other people (i.e. you in the future) can trust!

## ...ps

Remember this?
```
if type(x) == float:
    do_something(x)
```

You probably meant
```
if isinstance(x, Real):  # with Real imported from numbers
    do_something(x)
```

...but I'm sure you know that by now!
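
A quick sketch of the difference, using only standard-library values. The helper names `strict_check` and `flexible_check` are made up for illustration.

```python
from fractions import Fraction
from numbers import Real


def strict_check(x):
    # Only exact floats pass; subclasses and other real number types do not
    return type(x) == float


def flexible_check(x):
    # Anything that declares itself a real number passes
    return isinstance(x, Real)


for value in (5.1, 5, Fraction(1, 2), "5.1"):
    print(repr(value), strict_check(value), flexible_check(value))
```

One thing to keep in mind: `bool` is a subclass of `int`, so `True` and `False` also count as `Real`.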
