# Functions

A **function** is a rule that takes zero or more inputs and returns a corresponding output. Viewed as a mathematical object, it is an arrow $f: a\to b$.

In [None]:
def double(x):
    return x * 2

In [None]:
double(3)

In [None]:
def apply_to_one(f):
    """Calls the function f with 1 as its argument"""
    return f(1)
my_double = double
x = apply_to_one(my_double)
x

In [None]:
y = apply_to_one(lambda x: x + 4)
y

In [None]:
another_double = lambda x: 2 * x    # works, but the book advises against assigning a lambda to a name
another_double(5)

In [None]:
def another_double(x):
    return 2 * x    # the book recommends using def instead
another_double(3)
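
One practical difference between the two styles, sketched here with the hypothetical names `double_lambda` and `double_def`, is that a function created with `def` knows its own name, which makes tracebacks and debugging output easier to read:

```python
double_lambda = lambda x: 2 * x    # anonymous function bound to a name

def double_def(x):
    return 2 * x

# Both compute the same thing...
print(double_lambda(3), double_def(3))    # 6 6

# ...but only the def version carries a meaningful name.
print(double_lambda.__name__)    # <lambda>
print(double_def.__name__)       # double_def
```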

We can assign default arguments to functions.

In [None]:
def my_print(message = "my default message"):
    print(message)    # print returns None, so there is nothing useful to return

my_print("hello")
my_print()    # prints the default message

It is sometimes useful to specify arguments by name:

In [None]:
def full_name(first = "What's-his-name", last = "Something"):
    return first + " " + last

print(full_name("Joel", "Grus"))
print(full_name("Joel"))
print(full_name(last = "Grus"))
full_name(last = "Grus")

# Strings

Python uses backslashes to encode special characters. For example:

In [None]:
tab_string = "\t"    # represents the tab character
len(tab_string)    
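
If the backslash itself is wanted as a character, a raw string (prefix `r`) turns off the escaping. A minimal sketch, where `not_tab_string` is just an illustrative name:

```python
tab_string = "\t"          # one character: a tab
not_tab_string = r"\t"     # two characters: a backslash and the letter t

print(len(tab_string))        # 1
print(len(not_tab_string))    # 2
```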

In [None]:
multi_line_string = """the first line.
the second line.
the third line."""    #Multiline

print(multi_line_string)

**f-string**: a way to substitute values into strings.

In [None]:
first_name = "Joel"
last_name = "Grus"

# There are three ways to combine them into a full name

full_name1 = first_name + " " + last_name
full_name2 = "{0} {1}".format(first_name, last_name)
full_name3 = f"{first_name} {last_name}"    # the f-string way
full_name4 = f"{first_name} blah {last_name}"    # f-strings can mix literal text with values
print(full_name3)
print(full_name4)

# Exceptions

When something goes wrong, for example a division by zero such as $1/0$, Python raises an **exception**. Unhandled exceptions cause a program to crash. **try** and **except** give us a way to handle them.

In [None]:
try:
    print(0/0)
except ZeroDivisionError:
    print("cannot divide by zero")
    
try:
    print(1/3)
except ZeroDivisionError:
    print("cannot divide by zero")
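
The same pattern is often wrapped in a small helper so that the caller gets a fallback value instead of a crash. This is only a sketch; `safe_divide` and its `default` parameter are hypothetical names, not something from the book:

```python
def safe_divide(numerator, denominator, default=None):
    """Return numerator / denominator, or `default` when dividing by zero."""
    try:
        return numerator / denominator
    except ZeroDivisionError:
        return default

print(safe_divide(1, 4))         # 0.25
print(safe_divide(1, 0))         # None
print(safe_divide(1, 0, 0.0))    # 0.0
```

Catching only `ZeroDivisionError` (rather than every exception) keeps unrelated bugs, such as passing a string, visible.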

# Lists

A list is an ordered collection.

In [None]:
integer_list = [1, 2, 3]
heterogeneous_list = ["string", 0.1, True]
list_of_lists = [integer_list, heterogeneous_list, []]

list_length = len(integer_list)
list_sum = sum(integer_list)
print(f"The length of integer_list is {list_length} and the sum of elements in the list is {list_sum}.")

We can get or set the n-th element of a list by using **[n]** (counting starts at 0).

In [None]:
x = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]

zero = x[0]  # the first element in the list
one = x[1]    # the second element in the list
nine = x[-1]    # 'Pythonic' for the last element
eight = x[-2]    # 'Pythonic' for next-to-last element
x[0] = -1    # now x is [-1, 1, 2, 3, 4, 5, 6, 7, 8, 9]

We can slice a list by using **[i:j]** (the element at index i is included, the element at index j is not).

In [None]:
print(x[:3])
print(x[3:])
print(x[1:5])
print(x[-3:])    # last three
print(x[1:-1])    # without first and last
print(x[:])    # copy of x

We can similarly slice strings and other "sequential" types.
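
For instance, the same slice notation works on a string and on a tuple (the values below are made up for illustration):

```python
word = "datascience"
print(word[:4])      # data
print(word[-7:])     # science
print(word[1:-1])    # atascienc

point = (10, 20, 30, 40)
print(point[1:3])    # (20, 30)
```

Slicing a string gives back a string and slicing a tuple gives back a tuple.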

A slice can take a third argument, its stride, which can be negative:

In [None]:
every_third = x[::3]
five_to_three = x[5:2:-1]

print(every_third)
print(five_to_three)

The **in** operator allows us to check for list membership.

In [None]:
print(1 in [1, 2, 3])
print(0 in [1, 2, 3])

We can concatenate lists together.

In [None]:
x = [1, 2, 3]
x.extend([4, 5, 6])    # this changes x
print(x)

In [None]:
x = [1, 2, 3]
y = x + [4, 5, 6]    # this doesn't change x
print(x)
print(y)

In [None]:
x = [1, 2, 3]
x.append(0)   # append 0 to x
y = x[-1]
print(y)
print(x)

When we know how many elements a list contains, we can unpack it. For example,

In [None]:
x, y = [1, 2]    # now x is 1 and y is 2.

We will get a **ValueError** if we don't have the same number of elements on both sides.
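
A small sketch of that failure case, with the flag `unpack_failed` added only so the outcome can be inspected afterwards:

```python
unpack_failed = False

try:
    x, y = [1, 2, 3]    # three values, but only two names
except ValueError as error:
    unpack_failed = True
    print("unpacking failed:", error)

print(unpack_failed)    # True
```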

We can throw away a value we don't care about by using an underscore '_'.

In [None]:
_, y = [1, 2]    # now y = 2, but we didn't care about 1


# Tuples

Tuples are the immutable cousins of lists. A tuple is written with parentheses (or with no brackets at all) instead of square brackets.

In [None]:
my_list = [1, 2]
my_tuple = (1, 2)
other_tuple = 3, 4

my_list[1] = 3    # my_list is now [1, 3]

try:
    my_tuple[1] = 3

except TypeError:
    print("cannot modify a tuple")

We can use tuples to return multiple values from functions:

In [None]:
def sum_and_product(x, y):
    return (x + y), (x * y)

sp = sum_and_product(2, 11)
s, p = sum_and_product(2, 11)
print(sp, s, p)

Tuples can also be used to assign multiple values at once.

In [None]:
x, y = 1, 2    # x is 1 and y is 2
x, y = y, x    # "Pythonic" way to swap variables; now x is 2 and y is 1.


# Dictionaries

A dictionary associates values with keys.

In [None]:
empty_dic = {}
grades = {"Joel": 80, "Tim": 95}

print(grades["Joel"])

In [None]:
try:
    kate_grade = grades["Kate"]
except KeyError:
    print("no grade for Kate!")

In [None]:
joel_has_grade = "Joel" in grades    # True
kate_has_grade = "Kate" in grades    # False

The **get** method returns a default value (instead of raising an exception) when you look up a key that is not in the dictionary:

In [None]:
joels_grade = grades.get("Joel", 0)
kates_grade = grades.get("Kate", 0)
no_ones_grade = grades.get("No One")    # default is None

print(joels_grade)
print(kates_grade)
print(no_ones_grade)

We can change values and add key/value pairs.

In [None]:
print(len(grades))
print(grades)

grades["Tim"] = 99    # change Tim's value to 99
grades["Kate"] = 100    # adds a third entry

num_students = len(grades)
print(num_students)
print(grades)

We can use dictionaries to represent structured data:

In [None]:
tweet = {
    "user": "joelgrus",
    "text": "Data Science is Awesome",
    "retweet_count": 100,
    "hashtags": ["#data", "#science", "#datascience", "#awesome", "#yolo"]
}

Besides looking for specific keys, we can look at all of them:

In [None]:
tweet_keys = tweet.keys()    # iterable for the keys
tweet_values = tweet.values()    # iterable for the values
tweet_items = tweet.items()    # iterable for the (key, value) tuples

print(tweet_keys)
print(tweet_values)
print(tweet_items)
tweet_items

In [None]:
"user" in tweet_keys    # True, but not "Pythonic"
"user" in tweet    # True, and "Pythonic"
"joelgrus" in tweet_values    # True (slow but the only way to check)

Dictionary keys must be "hashable" (~ immutable). For example, you cannot use lists as keys.
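
A minimal sketch of the difference, using a made-up dictionary `point_labels`: a tuple works as a key, while a list raises a `TypeError`.

```python
point_labels = {}
point_labels[(0, 0)] = "origin"    # a tuple of ints is hashable

try:
    point_labels[[0, 0]] = "origin"    # a list is mutable, hence not hashable
except TypeError as error:
    print("cannot use a list as a key:", error)

print(point_labels)    # {(0, 0): 'origin'}
```

When a list-like key is needed, converting it with `tuple(...)` first is the usual workaround.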


# defaultdict

Assume that we want to count the words in a document. To do this, we can make a dictionary whose keys are words and values are counts.

In [None]:
# assumes `document` is an iterable of words (e.g. a list of strings)
word_counts = {}
for word in document:
    if word in word_counts:
        word_counts[word] += 1
    else:
        word_counts[word] = 1

In [None]:
# "forgiveness is better than permission" approach

word_counts = {}
for word in document:
    try:
        word_counts[word] += 1
    except KeyError:
        word_counts[word] = 1

In [None]:
# We can use "get" as well

word_counts = {}
for word in document:
    previous_count = word_counts.get(word, 0)
    word_counts[word] = previous_count + 1

These three approaches are all slightly unwieldy, which is why **defaultdict** is useful.
A defaultdict is like a regular dictionary, except that when you look up a key it doesn't contain, it first adds a value for that key using a zero-argument function you provided when you created it. To use it, you need to **import defaultdict from collections**.

In [None]:
from collections import defaultdict

word_counts = defaultdict(int)    # int() produces 0
for word in document:
    word_counts[word] += 1

In [None]:
from collections import defaultdict

dd_list = defaultdict(list)    # list() produces an empty list []
dd_list[2].append(1)    # now dd_list is {2: [1]}

print(dd_list)

In [None]:
dd_dict = defaultdict(dict)    # dict() produces an empty dictionary
dd_dict["Joel"]["City"] = "Seattle"

dd_dict

In [None]:
dd_pair = defaultdict(lambda: [0, 0])
dd_pair[2][1] = 1
print(dd_pair)

dd_pair[3].append(1)
print(dd_pair)


# Counter

A Counter turns a sequence of values into a defaultdict(int)-like object mapping keys to counts. This gives a simple way to count words. A Counter instance has a **most_common** method that returns the most common items together with their counts.

In [None]:
from collections import Counter

c = Counter([0, 1, 2, 0])
print(c)

In [None]:
document = "1 2 3 4 0 1 2 1 2 0 0 8 9 9 0 0 1 8"
word_counts = Counter(document.split())    # split into words first; Counter(document) would count characters, spaces included

print(word_counts)

In [None]:
for word, count in word_counts.most_common(3):    # print the 3 most common words and their counts
    print(word, count)


# Sets

A set represents a collection of _distinct_ elements. For example,

`primes_below_10 = {2, 3, 5, 7}`.

For the empty set we cannot write `{}`, because `{}` means "empty dict", so we use `set()` instead.

In [None]:
s = set()
print(s)

s.add(2)
print(s)

s.add(1)
print(s)

s.add(1)
print(s)

print(2 in s)
print(3 in s)

**in** is a very fast operation on sets, so for membership tests on a large collection a set is more appropriate than a list. Sets are also useful **for finding the distinct items** in a collection.

In [None]:
hundreds_of_other_words = list(range(1000000))
stopwords_list = ["a", "an", "at"] + hundreds_of_other_words + ["yet", "you"]

print("zip" in stopwords_list)    # have to check every element so slower

In [None]:
stopwords_set = set(stopwords_list)
print("zip" in stopwords_set)    # fast to check
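
To see the difference rather than take it on faith, one option is to time both lookups with the standard `timeit` module. This is a rough sketch on a smaller made-up collection (`haystack_list`, `haystack_set`); the absolute numbers depend on the machine, so none are shown here:

```python
import timeit

haystack_list = list(range(100_000))
haystack_set = set(haystack_list)

# "zip" is absent, so the list lookup has to compare against every element,
# while the set lookup only needs a hash computation.
list_time = timeit.timeit(lambda: "zip" in haystack_list, number=100)
set_time = timeit.timeit(lambda: "zip" in haystack_set, number=100)

print(f"list: {list_time:.4f} s, set: {set_time:.6f} s for 100 lookups each")
```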

In [None]:
item_list = [1, 2, 3, 1, 2, 3]
num_items = len(item_list)
item_set = set(item_list)
num_distinct_items = len(item_set)
distinct_item_list = list(item_set)

print(item_list)
print(num_items)
print(item_set)
print(num_distinct_items)
print(distinct_item_list)

# Control Flow

Performing an action conditionally using **if**:

In [None]:
if 1 > 2:
    message = "if only 1 were greater than two..."
elif 1 > 3:
    message = "elif stands for 'else if'"
else:
    message = "when all else fails use else (if you want to)"
    
print(message)

We can also write a _ternary_ (three-part) **if-then-else** on one line.

In [None]:
x = 9
parity = "even" if x % 2 == 0 else "odd"
parity

We can also use **while** loop:

In [None]:
x = 0

while x < 10:
    print(f"{x} is less than 10")    # the f-string substitutes the current value of x
    x += 1

We will more often use **for** and **in**:

In [None]:
for x in range(10):
    print(f"{x} is less than 10")

If more complex logic is needed, we can use **continue** and **break**:

In [None]:
for x in range(10):
    if x == 3:
        continue    # go immediately to the next iteration
    if x == 5:
        break    # quit the loop entirely
    else:
        print(x)

# Truthiness

**True, False, and None**: the Booleans are _True_ and _False_, and _None_ indicates a nonexistent value. ('None' is similar to 'null' in other languages.)

In [None]:
x = None
assert x == None, "not Pythonic"
assert x is None, "Pythonic"

If either assertion were false, Python would raise an **AssertionError** carrying the given message.
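
A short sketch of a failing assertion, caught here only so that the message can be inspected (`caught_message` is an illustrative name):

```python
x = None
caught_message = None

try:
    assert x is not None, "x should not be None"
except AssertionError as error:
    caught_message = str(error)    # the text after the comma in the assert

print("assertion failed:", caught_message)
```

Note that assertions are skipped entirely when Python runs with the `-O` flag, so they are suited to sanity checks rather than to validating user input.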

Python lets you use any value where it expects a Boolean. For example, _False, None, [], {}, "", set(), 0, 0.0_ are _"falsy"_.
Pretty much anything else is treated as _True_. So we can use **if** statements to test for empty lists, empty strings, empty dictionaries, etc. _This can also cause unexpected tricky bugs._


In [None]:
s = some_function_that_returns_a_string()
if s:
    first_char = s[0]
else:
    first_char = ""
first_char

The following is another way to do this but it can be confusing.

In [None]:
s_1 = []
first_char = s_1 and s_1[0]
first_char

**and** returns its second value when the first one is _truthy_ and its first value when it's _falsy_.
Conversely, **or** returns its second value when the first one is _falsy_ and its first value when it's _truthy_.

In [None]:
safe_x = x or "yes"    # x is falsy
safe_x

In [None]:
safe_x = x if x is not None else 0    # this way is more readable
safe_x
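
The two idioms are not interchangeable, which is one source of the tricky bugs mentioned earlier. In this sketch (with hypothetical helpers `with_or` and `with_none_check`) they disagree as soon as the value is a legitimate but falsy `0`:

```python
def with_or(x, default=10):
    return x or default    # replaces ANY falsy x, including 0 and ""

def with_none_check(x, default=10):
    return x if x is not None else default    # replaces only None

print(with_or(None), with_none_check(None))    # 10 10
print(with_or(0), with_none_check(0))          # 10 0
print(with_or(7), with_none_check(7))          # 7 7
```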

Python has **all** and **any** functions, each of which takes an iterable and returns a Boolean. **all** returns "True" when no element is falsy and **any** returns "True" when at least one element is truthy.

In [None]:
a = all([True, 1, {3}])    #True
b = all([True, 1, {}])    # False
c = any([True, 1, {}])    # True
d = all([])    # True
e = any([])    # False
print(a, b, c, d, e)

# Sorting

Every Python list has a **sort** method that sorts it in place. The **sorted** function leaves the original untouched and returns a new sorted list.

In [None]:
x = [4, 1, 2, 3]
print(x)
y = sorted(x)
print(y)
x.sort()
print(x)

By default, sort and sorted order a list from smallest to largest. To sort from largest to smallest, pass **reverse=True**. Also, instead of comparing the elements themselves, you can compare the results of a function that you specify with **key**:

In [None]:
# sort the list by absolute value from largest to smallest
x = sorted([-4, 1, -2, 3], key = abs, reverse = True)
print(x)

In [None]:
# sort the words and counts from highest count to lowest
wc = sorted(word_counts.items(), key = lambda word_and_count: word_and_count[1], reverse = True)

# List Comprehensions

Transforming one list to another list: you can choose some elements and transform them.

In [None]:
even_numbers = [x for x in range(5) if x % 2 == 0]
squares = [x * x for x in range(5)]
even_squares1 = [x * x for x in even_numbers]
even_squares2 = [x * x for x in range(5) if x % 2 == 0]

print(even_squares1, even_squares2)

In [None]:
square_dict = {x: x * x for x in range(5)}    # lists into dictionaries
square_set = {x * x for x in [1, -1]}    # lists into sets

If you don't need the value from the list, you can use an underscore "_".

In [None]:
zeros = [0 for _ in even_numbers]    # has the same length as even_numbers
zeros

A list comprehension can include multiple fors and later fors can use the results of earlier ones:

In [None]:
pairs = [(x, y)
        for x in range(3)
        for y in range(3)]
pairs

In [None]:
increasing_pairs = [(x, y)
                   for x in range(3)
                   for y in range(x + 1, 3)]
increasing_pairs
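
Reading a multi-`for` comprehension as nested loops can make the order of evaluation clearer. The sketch below rebuilds the same list with explicit loops (`pairs_from_loops` is an illustrative name) and checks that both agree:

```python
increasing_pairs = [(x, y)
                    for x in range(3)
                    for y in range(x + 1, 3)]

# The first "for" in the comprehension is the outer loop, the second the inner one.
pairs_from_loops = []
for x in range(3):
    for y in range(x + 1, 3):
        pairs_from_loops.append((x, y))

print(pairs_from_loops)    # [(0, 1), (0, 2), (1, 2)]
assert pairs_from_loops == increasing_pairs
```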

# Automated testing and assert

Skip this section for now.

# Object-Oriented Programming

We can define **classes** that encapsulate data and **the functions that operate on them**. Let's look at an example which is a class representing a "counting clicker". This counts, for example, how many people have shown up for a certain event.

We would like to maintain a count, click to increment the count, read the count, and reset back to zero.
To define a class, we use the **class** keyword.

A class contains zero or more _member_ functions. By convention, each takes a first parameter, **self**, that refers to the particular class instance. `__init__` is the constructor: it takes whatever parameters you need to construct an instance of your class and does whatever setup you need:

In [None]:
class CountingClicker:
    """This is a docstring. This is CountingClicker."""
    def __init__(self, count = 0):
        self.count = count
        
    def __repr__(self):    # produces the string representation of a class instance
        return f"CountingClicker(count = {self.count})"
    
    def click(self, num_times = 1):
        """Click the clicker some number of times"""    # methods like this one act on the instance's attributes
        self.count += num_times
    
    def read(self):
        return self.count
    
    def reset(self):
        self.count = 0
        
print(CountingClicker(10))
CountingClicker(10).read()

In [None]:
clicker1 = CountingClicker()    # initialized to 0
clicker2 = CountingClicker(100)    # starts with count = 100
clicker3 = CountingClicker(count = 100)    # more explicit way of doing the same

In [None]:
clicker = CountingClicker()
assert clicker.read() == 0, "clicker should start with count 0"
clicker.click()
clicker.click()
assert clicker.read() == 2, "after two clicks, clicker should have count 2"
clicker.reset()
assert clicker.read() == 0, "after reset, clicker should be back to 0"

Tests like the ones above help us check that our code works as we expect.

We can create **subclasses** that _inherit_ some of their functionality from a parent class. For example, we can create a non-resettable clicker by using **CountingClicker** as the base class and overriding the **reset** method to do nothing:

In [None]:
# A subclass inherits all the behavior of its parent class.
class NoResetClicker(CountingClicker):
    # This class has all the same methods as CountingClicker
    
    # Except that it has a reset method that does nothing.
    def reset(self):
        pass

In [None]:
clicker2 = NoResetClicker()
assert clicker2.read() == 0
clicker2.click()
assert clicker2.read() == 1
clicker2.reset()
assert clicker2.read() == 1, "reset shouldn't do anything"
clicker2

# Iterables and Generators

Skip this section for now.

# Randomness

We will frequently need to generate random numbers, which we can do with the **random** module.

In [None]:
import random
random.seed(10)    # this ensures we get the same results every time

four_uniform_randoms = [random.random() for _ in range(4)]    # random.random() produces numbers uniformly between 0 and 1
four_uniform_randoms
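
The effect of the seed can be checked directly: resetting it restarts the same pseudo-random sequence. A minimal sketch (`first_run` and `second_run` are illustrative names):

```python
import random

random.seed(10)
first_run = [random.random() for _ in range(4)]

random.seed(10)    # resetting the seed restarts the same sequence
second_run = [random.random() for _ in range(4)]

print(first_run == second_run)    # True
```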

In [None]:
random.randrange(10)    # choose randomly from range(10)

random.randrange(3, 6)    # choose randomly from range(3, 6) = [3, 4, 5]

We can (randomly) shuffle a list or pick one element from a list:

In [None]:
up_to_five = [1, 2, 3, 4, 5]
random.shuffle(up_to_five)
print(up_to_five)

my_best_friend = random.choice(["Alice", "Bob", "Charlie"])
print(my_best_friend)

In [None]:
lottery_numbers = range(60)
winning_numbers = random.sample(lottery_numbers, 6)    # Sample 6 numbers without duplication
print(winning_numbers)

four_with_replacement = [random.choice(range(10)) for _ in range(4)]
print(four_with_replacement)

# Regular Expressions

Skip this section

# Functional Programming


# zip and Argument Unpacking
Come back when **zip** appears.

# args and kwargs

Skip this section.

# Type Annotations

Python is a _dynamically typed_ language. This means that, in general, it doesn't care about the types of the objects we use, as long as we use them in valid ways:

In [None]:
def d_add(a, b):
    return a + b

assert d_add(10, 5) == 15
assert d_add([1, 2], [3]) == [1, 2, 3]
assert d_add("hi ", "there") == "hi there"

In [None]:
try:
    d_add(10, 'five')
except TypeError:
    print("cannot add an int to a string")

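In a _statically typed_ language, our functions and objects would have specific types. Python's type annotations let us state the intended types, but Python itself does not enforce them at runtime. For example,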

In [None]:
def s_add(a: int, b: int) -> int:
    return a + b

s_add("hi ", "there")   # returns "hi there": the annotations are not checked at runtime

**The point here is: use type hints!**
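
To make the behaviour concrete: the annotations are stored on the function as plain metadata, and nothing consults them when the function is called. The sketch below only demonstrates that; catching the mismatch before running the code is the job of a separate static checker such as mypy, which is assumed here and is not part of Python itself.

```python
def s_add(a: int, b: int) -> int:
    return a + b

# The hints are available for tools to read...
print(s_add.__annotations__)

# ...but the interpreter does not check them when the function runs.
print(s_add("hi ", "there"))    # hi there
```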

# How to Write Type Annotations

Let's look at an example.

In [None]:
# def total(xs: list) -> float:
#     return sum(xs)

This is not wrong but the type is not specific enough. It is clear that we want xs to be a list of floats, not (say) a list of strings.

The **typing module** provides parametrized types that we can use to do just this:

In [4]:
from typing import List    # note capital L

def total(xs: List[float]) -> float:
    return sum(xs)

total([1.1, 1])

2.1

So far we've only specified annotations for function parameters and return types. For variables themselves it's usually obvious what the type is, but there is a way to annotate them too:

In [6]:
x: int = 5    # same as x = 5; here it is already clear that x is an int
print(x)

5


In [7]:
# values = []
# best_so_far = None    # hmmm... what are their types?

In [17]:
from typing import Dict, Iterable, Tuple

# keys are strings, values are ints
counts: Dict[str, int] = {"data": 1, "science": 2}
    
# lists and generators are both iterable
if 1:
    evens: Iterable[int] = (x for x in range(10) if x % 2 == 0)
else:
    evens = [0, 2, 4, 6, 8]

print(evens)

for i in evens:
    print(i)

<generator object <genexpr> at 0x000002688F793948>
0
2
4
6
8


In [19]:
# tuples specify a type for each element
triple: Tuple[int, float, int] = (10, 2.3, 5)

type(triple)

tuple

Since Python has first-class functions, we need a type to represent those as well. Here is a pretty contrived (deliberately artificial) example:

In [33]:
from typing import Callable

# The type hint says that repeater is a function that takes two arguments, a string and an int, and returns a string.

def twice(repeater: Callable[[str, int], str], s: str) -> str:
    return repeater(s, 2)

def comma_repeater(s: str, n: int) -> str:
    n_copies = [s for _ in range(n)]
    return ", ".join(n_copies)
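
A short usage sketch of these two functions, repeated here so that the block runs on its own; the input string is arbitrary:

```python
from typing import Callable

def twice(repeater: Callable[[str, int], str], s: str) -> str:
    return repeater(s, 2)

def comma_repeater(s: str, n: int) -> str:
    n_copies = [s for _ in range(n)]
    return ", ".join(n_copies)

# comma_repeater matches Callable[[str, int], str], so it can be passed to twice
print(twice(comma_repeater, "type hints"))    # type hints, type hints
```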

In [39]:
def add_str(a: int, b: int, shift: int = 0) -> str:
    addr = a + b + shift
    return str(addr)

def printer(fn):
    def _inner(*args, **kwargs):
        print("inputs are ", args)
        res = fn(*args, **kwargs)
        print("result is: ", res)
    return _inner

In [40]:
enhanced_add_str = printer(add_str)

In [41]:
add_str(2, 3, shift=2)

'7'

In [37]:
enhanced_add_str(2, 3)

inputs are  (2, 3)
result is:  5
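
Note that `_inner` above prints the result but does not return it, so `enhanced_add_str(2, 3)` evaluates to `None`. If the wrapped function's value is still needed, one possible variant (a sketch; `returning_printer` is a hypothetical name) returns it and uses `functools.wraps` to keep the original function's name:

```python
import functools

def add_str(a: int, b: int, shift: int = 0) -> str:
    return str(a + b + shift)

def returning_printer(fn):
    """Like printer, but passes the wrapped function's result back to the caller."""
    @functools.wraps(fn)    # copy fn's name and docstring onto the wrapper
    def _inner(*args, **kwargs):
        print("inputs are ", args, kwargs)
        res = fn(*args, **kwargs)
        print("result is: ", res)
        return res
    return _inner

enhanced = returning_printer(add_str)
value = enhanced(2, 3, shift=2)
print(repr(value))          # '7'
print(enhanced.__name__)    # add_str
```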


In [42]:
def add_str_2(a: int = 0, b: int = 0) -> str:
    print(f"inputs are a={a} and b={b}")
    addr = a + b
    return str(addr)

In [45]:
add_str_2(b=3, a=2)

inputs are a=2 and b=3


'5'

In [None]:
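// C++ for comparison (not Python): in a statically typed language the compiler enforces these types
#include <string>

std::string add_int(int a, int b) {
    int addr = a + b;
    return std::to_string(addr);
}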

`add_str: Callable[[int, int], str]`

**Note:** The type hint `Callable` should be read as `Hom` (i.e. $\text{Hom}_\text{Set}(-,-)$).

In [None]:
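from typing import Callable, TypeVar
A = TypeVar("A")
B = TypeVar("B")

def evaluate(fn: Callable[[A], B], x: A) -> B:
    return fn(x)

# The same function in Haskell:
#   evaluate :: (a -> b) -> a -> b
#   evaluate fn x = fn x      -- read the type a -> b as the set Hom(a, b)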

In [51]:
from typing import Callable, List, Union

Real = Union[int, float]
Addable = Union[int, str, float, bool]
VectHom = Union[Callable[[float, float], float], Callable[[List[float]], List[float]]]
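
An alias like `Real` is used in annotations exactly like any other type. A minimal sketch, where the function `scale` is hypothetical and only there to show the alias in use:

```python
from typing import Union

Real = Union[int, float]    # same alias as above, repeated so this block runs on its own

def scale(x: Real, factor: Real) -> Real:
    return x * factor

print(scale(3, 2))      # 6
print(scale(1.5, 2))    # 3.0
```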