In [None]:
%reload_ext postcell
%postcell register

# Adding type safety to Python

Python is often called a _dynamic_ language. In the 90s it was placed in a category called _scripting_ languages. Until recently, most production systems were written in _static_, as opposed to _dynamic_, languages.

Here is a simple line of code in Python:

```python
age = 34
```

Here is the same line in a static language (Java/C/C++/C#/etc.):

```java
int age = 34;
```

Notice that the static language forces the programmer to declare the _type_ of the variable. In this case, we must tell the language's compiler that the `age` variable is of type `int`.

The `age` variable in Python can later be assigned a decimal value without error. The static language, however, will not allow `age` to hold a non-integer value (Java, for example, rejects `age = 34.5;` at compile time).
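The Python half of that claim can be checked with a short sketch (the value `34.5` is only an illustration):

```python
age = 34
first_type = type(age).__name__    # the name currently refers to an int

age = 34.5                         # rebinding the same name to a float is allowed
second_type = type(age).__name__

print(first_type, second_type)
```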

Further, a program in a static language cannot simply be executed. It must first be _compiled_, and all types are checked during compilation. Only once the type check passes is the program allowed to run. Once again, if there is a single type error in the whole file, the file will refuse to run!

Hence, languages such as Java and C# do a _static_ check of the whole file for incorrect types, and are called _static_ languages.

Python checks types too, but only at the moment they are needed:

In [None]:
age = 34 # <= This will not result in a type error
# age = 73

name = "Homer"
is_elderly = age >= 70

if is_elderly:
    print("Good afternoon " + age) # <= In static languages, this would cause an error, since it should be str(age)
else:
    print("Hiya " + name)
    

Notice that the above code does actually check types (when `age = 73`); however, the type is checked only when required, not up-front. In other words, types are checked _dynamically_, hence Python is a dynamic language.
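A minimal sketch of that behaviour, wrapping the same concatenation in a hypothetical `greet` helper so the error can be caught and inspected:

```python
def greet(name, age):
    if age >= 70:
        return "Good afternoon " + age   # str + int: fails only if this line runs
    return "Hiya " + name

young_result = greet("Homer", 34)        # the faulty branch is never reached

try:
    greet("Homer", 73)                   # now the faulty branch executes
    error_seen = False
except TypeError:
    error_seen = True

print(young_result, error_seen)
```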

Other popular dynamic languages include Ruby, JavaScript and R.

#### Historical aside
As with many common programming concepts, what seems mechanical and utilitarian often has roots in logic and philosophy. An earlier incarnation of _types_ can be found in the work on set theory by Bertrand Russell, the famous mathematician, philosopher and author.

Naive set theory, one of the foundations of mathematics, contains a paradox:

```
Let R be the set of all sets that are not members of themselves
```
Is R a member of itself?

This paradox devastated the logician Frege, whose formal system it showed to be inconsistent.

An easier-to-digest example might be:
```
In a small town with one barber
1. All men must be clean shaven
2. A barber shaves all men who do not shave themselves (and only men who do not shave themselves)
```
Does this barber shave himself?

If no, then the barber is not clean shaven and rule #1 is broken. If yes, then rule #2 is broken.

sources: 
- https://en.wikipedia.org/wiki/Russell%27s_paradox
- https://www.youtube.com/watch?v=xauCQpnbNAM

Russell proposed a hierarchy of types in which an expression can refer to basic objects (like numbers), sets of those basic objects, sets containing sets of basic objects, and so on. This rules out a set that contains itself. The author of these lecture notes is not a logician and will stop any further attempts to explain something he himself doesn't understand.

### Back to Python
Recent versions of Python have added syntax that allows Python code to be type checked. This lecture assumes Python 3.10 or higher, since the `str|float` union syntax used later requires it.

Python now allows optional type annotation:

```python
variable: type = value
```


In [None]:
age: int = 34

Notice the addition of `int` to the definition of the variable `age`. 

**Important:** Python does not actually check the types. The interpreter simply ignores them. Types exist only for human readers and third-party type checkers.
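One way to see this, as a sketch: the annotations on a function are stored in its `__annotations__` attribute, but nothing consults them when the function is called. The `double` function here is a made-up example.

```python
def double(x: int) -> int:
    return x * 2

# The annotations are recorded as metadata on the function...
print(double.__annotations__)

# ...but the interpreter does not enforce them: a str is accepted silently
result = double("ab")
print(result)
```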

In [None]:
age

In [None]:
age = "this is now a string"

In [None]:
age

Notice that Python didn't complain when we assigned a string to a variable annotated as an integer. What's more, we can violate type annotations more directly:

In [None]:
age:int = "this can't be right"

In [None]:
age

Again, notice no error.

We can have a utility carry out type checks for us. Let's install _nb_mypy_, load it into the notebook and turn on automatic type checking.

In [None]:
# !pip install nb_mypy

In [None]:
%load_ext nb_mypy

In [None]:
%nb_mypy On
#%nb_mypy Off

### All subsequent cells will have type checking turned on!

This should now show an error:

In [None]:
age:int = "this can't be right"

This should be OK (but notice that execution takes slightly longer):

In [None]:
age:int = 34

### Basic types
You can annotate variables with any basic type

In [None]:
name: str        = "Homer"
age: int         = 34
is_elderly: bool = False
gpa: float       = 1.7

Variables can be annotated with collection types

In [None]:
names: list = ['Homer', 'Marge', 'Skinner', 'Bart']

In [None]:
names_and_ages: dict = {'Homer':36, 'Marge':34, 'Skinner':42, 'Bart':12}

Let's make sure the type system is working:

In [None]:
names: dict = ['Homer', 'Marge', 'Skinner', 'Bart']

**Exercise** Please add types to the following code:

In [None]:
%%postcell exercise_025_300_a

student_name = "Jim"
student_age  = 24
student_fee  = 34.45
student_is_passing = True
student_current_enrollment = ['Adv Python', 'Linear Alg', 'Machine Learning']
student_grades = {'Basic Python': 3.4, 'Basic Stats': 3.1, 'Team org': 3.8}

### Types for functions

A function's type is defined by the types of its input arguments _and_ the type of its return value.

In [None]:
def calc_grade(grade):
    
    print(grade.capitalize())
    
    if grade > 3.5: return 'A'
    else: return grade

In [None]:
calc_grade(3.4) # <= Wrong!

Types would have helped us catch some errors earlier!

_mypy_ would have caught the error at the time we defined the function (instead of waiting until we called it), if it had enough information. Here is an example:

In [None]:
def calc_grade(grade: str) -> float:
    
    print(grade.capitalize())
    
    if grade > 3.5: return 'A'
    else: return grade

We found errors _before_ running this function! If you were creating a large library of functions, this would be absolutely invaluable!
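For comparison, here is one possible corrected version. It assumes the intent was to map a numeric grade to a letter; the name `calc_letter_grade` and the `'B'` fallback are assumptions, not part of the original function.

```python
def calc_letter_grade(grade: float) -> str:
    # Every return value is a str, matching the annotation
    if grade > 3.5:
        return 'A'
    return 'B'

print(calc_letter_grade(3.4))
```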

**Exercise** Write a Python function to convert Celsius to Fahrenheit: `F = (9/5)C + 32`. Assume the inputs and outputs are decimals, and please use appropriate types.

In [None]:
%%postcell exercise_025_300_b
# your code here

### Union types
Sometimes functions should be able to accept more than one argument type. For example, an ill-advised function could return the numeric grade if a string grade is passed in or vice versa:

In [None]:
def calc_grade(grade: str|float) -> str|float:
    if isinstance(grade, str):
        if grade.upper() == 'A': return 4.0
        else: return 3.0
    else: # if it is not a string, then it must be float, type checker forbids everything else
        if grade > 3.5: return 'A'
        else: return 'B'

In [None]:
calc_grade('A')

In [None]:
calc_grade(3.4)

Note: an argument type of `float` can be used when a function should accept either `float` or `int`. There is no need to write the union `float|int`.
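A small sketch of this rule, using a hypothetical `half` function. Keep in mind that the rule is a type-checker convention: at runtime an `int` is still not an instance of `float`.

```python
def half(x: float) -> float:
    return x / 2

from_float = half(3.0)
from_int = half(3)      # accepted by a type checker: int is allowed where float is expected

print(from_float, from_int, isinstance(3, float))
```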

In [None]:
calc_grade([2.3, 3.1])

### `Any` type
Python provides the ability to annotate a variable with the `Any` type. This is used when we wish to simply ignore any type constraints. 

In [None]:
from typing import Any

In [None]:
name: Any = "Homer"
age : Any = "Age of darkness"

Why do we need `Any`? What if we simply don't add a type annotation at all?

Some projects may force developers to annotate every variable and function. This sometimes happens when a dev manager has a background in static languages, such as Java. Python is a dynamic language, and forcing the use of type annotations, even if that type is `Any`, is silly.
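The cost of `Any` is that the checker stops helping. In this sketch (the `shout` function is hypothetical), a type checker accepts both calls because `Any` is compatible with everything, yet one of them still fails at runtime:

```python
from typing import Any

def shout(text: Any) -> str:
    # Any operation on an Any value passes type checking
    return text.upper()

ok = shout("homer")

try:
    shout(42)              # no type error is reported, but int has no .upper()
    failed = False
except AttributeError:
    failed = True

print(ok, failed)
```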

### `None` type
Sometimes functions don't return any value. In those instances, the function should be annotated as returning `None`:

In [None]:
def say_hello(name: str) -> None:
    print(f"Hello {name}")

In [None]:
say_hello("Shahbaz")
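At runtime, a function with no `return` statement evaluates to `None`, which is why `None` is the matching annotation. A sketch with a hypothetical `log_message` function:

```python
def log_message(message: str) -> None:
    print(f"[log] {message}")

# A type checker such as mypy may flag using this result; it is kept only to inspect it
result = log_message("saved")
print(result is None)
```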

### Generic types
Python has some advanced types. 
We saw earlier that we can limit variables to a collection type, for example a list:

In [None]:
def say_hellos(names: list) -> None:
    for n in names:
        print(f"Hello {n}")

In [None]:
say_hellos([1,2,3,4])

Let's be more specific with our types!

In [None]:
def say_hellos2(names: list[str]) -> None:
    for n in names:
        print(f"Hello {n}")
        #print(n ** 2)

In [None]:
say_hellos2([1,2,3,4])

Notice that you don't even have to call the function!

In [None]:
def greet_clients(cust_ids: list[int]) -> None:
    print("You are now entering an advanced Python library")
    say_hellos2(cust_ids)

You can further refine types!

```python
variable: collection_type[element_type] = [...]
```

Do the same thing with dictionaries

In [None]:
def say_hellos3(names_ages: dict[str, float]) -> None:
    for name, age in names_ages.items():
        print(f"Hello {name}" if age < 75 else f"Good afternoon {name}")

In [None]:
say_hellos3({'Homer':36, 'Marge':34, 'Mr. Burns':78})

In [None]:
say_hellos3({'Homer':36, 'Marge':34, 78:78})
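These patterns also nest. As a sketch, a hypothetical `average_grades` function could take a dictionary mapping each student name to a list of grades:

```python
def average_grades(grades: dict[str, list[float]]) -> dict[str, float]:
    # Keys are str, values are lists of float; the result maps each name to one float
    return {name: sum(marks) / len(marks) for name, marks in grades.items()}

averages = average_grades({'Bart': [2.0, 3.0], 'Lisa': [4.0, 4.0]})
print(averages)
```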

**Exercise** Fix the type error in this code

In [None]:
def calc_post_tax_price(price: float, tax_rate: int):
    post_tax_px =  price * (1 + tax_rate)
    return post_tax_px

The function above calculates the post-tax price. Given a price and a tax rate, it returns `price * (1 + tax_rate)`.

The function below will accept a list of transactions and the state in which those transactions took place, calculate the tax for each item and return the sum.

**Your job is to find the type error.**

In [None]:
def calc_total_price(prices: list[float], state_of_sale: str, tax_rates: dict[str, float]):
    total = 0
    for price in prices:
        tax_rate = tax_rates[state_of_sale]
        post_tax_total = calc_post_tax_price(price, tax_rate)
        total += post_tax_total
    return total

tax_rates: dict[str, float]   = {'IL': 0.0625, 'IN': 0.07, 'KS': 0.065}
sale_prices: list[float] = [43.44, 78.65, 20.11, 29.3, 2.54, 7.34]

calc_total_price(sale_prices, 'IL', tax_rates)

In [None]:
%%postcell exercise_025_300_c

# paste both functions here (after correcting the error)

### Make type checking part of continuous integration

Type checking belongs in any CI/CD pipeline. The annotations can be checked at the command line with the `mypy` tool, which can in turn be called from GitHub Actions or from pre-commit git hooks.

You may need to install `mypy` first:

In [None]:
# !pip install mypy

In [None]:
%%writefile badly_typed_code.py

def calc_grade(grade:str) -> str:
    
    print(grade.capitalize())
    
    if grade > 3.5: return 'Pass'
    else: return "Fail"

In [None]:
!python badly_typed_code.py

In [None]:
!mypy badly_typed_code.py

In [None]:
!mypy .
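In a CI job or pre-commit hook, the same check can be driven from a small Python script. The following is a sketch only: the helper names are hypothetical, and it assumes `mypy` is installed in the same environment so that `python -m mypy` works.

```python
import subprocess
import sys

def build_mypy_command(paths: list[str]) -> list[str]:
    # Run mypy as a module of the current interpreter
    return [sys.executable, "-m", "mypy", *paths]

def run_type_check(paths: list[str]) -> int:
    # mypy exits with a non-zero status when it reports errors
    completed = subprocess.run(build_mypy_command(paths))
    return completed.returncode

print(build_mypy_command(["badly_typed_code.py"]))
```

A CI step could then call `sys.exit(run_type_check(['.']))` so that any reported type error fails the build.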