# Chapter 3: Sequence in Python

Python provides several useful built-in data structures:
1. `list`
2. `tuple`
3. `str` (string)
4. `dict`
5. `set`

Readers may consult the Python Data Types section of [Programiz](https://www.programiz.com/python-programming), and refer to the Python documentation whenever they meet an unfamiliar method (e.g. `extend`, `slice`, `append`). This chapter focuses on using data structures to solve problems.

# Tuple

## Tuple Assignment

The swapping idiom is merely syntactic sugar for tuple assignment.

In [21]:
x = 10
y = 12
print(x,y)
x, y= y, x # (x,y) = (y, x)
print(x,y)

10 12
12 10


In [1]:
x, y, z = 10, 12, 15
print(x, y, z)
x, y = 12 ,10
print(x,y)

10 12 15
12 10
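
The swap can be read as two steps: the right-hand side is first packed into a tuple, and that tuple is then unpacked into the names on the left. A minimal sketch (the name `pair` is introduced here only for illustration):

```python
x, y = 10, 12

# Step 1: the right-hand side is evaluated and packed into a tuple.
pair = (y, x)
# Step 2: the tuple is unpacked into the targets on the left.
x, y = pair
print(x, y)  # 12 10

# The same idea rotates three values without a temporary variable.
a, b, c = 1, 2, 3
a, b, c = b, c, a
print(a, b, c)  # 2 3 1
```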


## Tuple and Function Argument

In [5]:
def summation(li):
    result = 0
    for i in li:
        result += i
    return result
summation([1,2,3])

6

Instead of passing a list, you can use the asterisk operator (`*`) so that the function accepts any number of arguments:

In [7]:
def summation(*args):
    result = 0
    for i in args:
        result += i
    return result
summation(1,2,3,4)

10

The asterisk operator collects the positional arguments into a tuple:

In [12]:
def print_args(*args):
    print(args, type(args))
print_args(1,2,3)

(1, 2, 3) <class 'tuple'>


In [13]:
def compose(f,g):
    return lambda x: f(g(x))
compose(lambda x: x*x, lambda x: x+1)(2)

9

What if we wish to compose multi-valued functions rather than single-valued ones? Say, the sum of the squares of two inputs.

In [26]:
add = lambda x,y: x+y
def squareboth(x,y):
    return x*x, y*y
add(squareboth(3,4))

TypeError: <lambda>() missing 1 required positional argument: 'y'

To fix it, we apply the asterisk operator again, this time at the call site, to unpack the tuple:

In [21]:
add =  lambda x,y: x+y
add(*(3,4)) # add(3,4)

7

In [23]:
# reapply the asterisk operator
add = lambda x,y: x+y
def squareboth(x,y):
    return x*x, y*y
add(*squareboth(3,4))

25

Hence, `compose` can be generalized to multi-valued functions as follows:

In [25]:
def compose(f,g):
    return lambda *args: f(*g(*args))
compose(lambda x,y: x+y, squareboth)(3,4)

25
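
One caveat of the generalized `compose` above: `f(*g(*args))` assumes that `g` returns a tuple, so the earlier single-valued example would fail with it, because an `int` cannot be unpacked. A possible sketch that accepts both cases, under the assumption that a tuple result always means "several values" (the name `compose_any` is hypothetical):

```python
def squareboth(x, y):
    return x * x, y * y

def compose_any(f, g):
    """Compose f after g, whether g returns one value or a tuple of values."""
    def composed(*args):
        result = g(*args)
        # Assumption: a tuple result is treated as several arguments for f;
        # anything else is passed to f as a single argument.
        if not isinstance(result, tuple):
            result = (result,)
        return f(*result)
    return composed

print(compose_any(lambda x: x * x, lambda x: x + 1)(2))   # 9
print(compose_any(lambda x, y: x + y, squareboth)(3, 4))  # 25
```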

# Map, Filter, Accumulate, Comprehension

In [54]:
li = [i for i in range(5, 11)] # enumeration through list comprehension
li

[5, 6, 7, 8, 9, 10]

In [55]:
list(map(lambda x: x*x, li)) # map

[25, 36, 49, 64, 81, 100]

In [56]:
from functools import reduce
from operator import add
reduce(add, map(lambda x: x*x, li)) # accumulate or reduce

355
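
For intuition, `reduce(add, xs)` folds the sequence from the left: `((x0 + x1) + x2) + ...`. The sketch below spells that out with a loop and compares it with the built-in `sum`; the helper name `my_reduce` is an illustration, not part of the standard library.

```python
from functools import reduce
from operator import add

def my_reduce(f, iterable, initial):
    """Left fold: combine the items one by one, starting from `initial`."""
    result = initial
    for item in iterable:
        result = f(result, item)
    return result

li = list(range(5, 11))
squares = [x * x for x in li]

print(my_reduce(add, squares, 0))  # 355
print(reduce(add, squares, 0))     # 355, the third argument is the start value
print(sum(squares))                # 355
```

Passing a start value also makes `reduce` safe on an empty sequence; without it, `reduce(add, [])` raises a `TypeError`.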

In [40]:
n = 10
li = list(range(1, n+1))
li = filter(lambda x: n%x == 0, li)
li = list(li)
li

[1, 2, 5, 10]

In [27]:
def divisors(n):
    # below is another way of writing it, using a list comprehension
    # use `if` to filter
    # use `for` to iterate something
    return [d for d in range(1,n+1) if n%d == 0]
divisors(10)

[1, 2, 5, 10]

In [45]:
def is_perfect(n):
    return sum(divisors(n)[:-1]) == n
print(is_perfect(6))
print(is_perfect(18))
print(is_perfect(28))
print(is_perfect(8128))

True
False
True
True
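
The same enumerate-then-filter pattern can list perfect numbers instead of testing them one at a time. A small sketch that repeats the two definitions above so it runs on its own; the bound of 1000 is chosen only to keep it fast, and `perfect_below_1000` is a name introduced here.

```python
def divisors(n):
    return [d for d in range(1, n + 1) if n % d == 0]

def is_perfect(n):
    # divisors(n)[:-1] drops n itself, leaving the proper divisors
    return sum(divisors(n)[:-1]) == n

# Enumerate the candidates with `for`, filter them with `if`.
perfect_below_1000 = [n for n in range(1, 1000) if is_perfect(n)]
print(perfect_below_1000)  # [6, 28, 496]
```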


Comprehension, map, filter and accumulate can similarly be applied to any iterable, including `tuple`, `set`, `dict` and `list`.

In [53]:
t = tuple(i for i in range(10))
t

(0, 1, 2, 3, 4, 5, 6, 7, 8, 9)

In [58]:
print(tuple(map(lambda x: x*2, t)))
print(tuple(i*2 for i in t))

(0, 2, 4, 6, 8, 10, 12, 14, 16, 18)
(0, 2, 4, 6, 8, 10, 12, 14, 16, 18)
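
A short sketch of the same operations on a `set`; the variable names are chosen here for illustration. A set has no defined order, so the results are compared as sets rather than printed in a fixed order.

```python
from functools import reduce
from operator import add

s = {x % 5 for x in range(20)}                 # set comprehension: {0, 1, 2, 3, 4}
doubled = set(map(lambda x: x * 2, s))         # map
evens = set(filter(lambda x: x % 2 == 0, s))   # filter
total = reduce(add, s)                         # accumulate

print(doubled == {0, 2, 4, 6, 8})  # True
print(evens == {0, 2, 4})          # True
print(total)                       # 10
```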


In [61]:
from random import choice, randint
npc_first_names = ["Alan", "Khan", "Mei"]
npc_second_names = ["Leo", "Callie", "Max"] 
random_name = lambda: choice(npc_first_names) + " " + choice(npc_second_names)
# random npc with random level
random_npc = {
    random_name(): randint(1,100) for _ in range(6)
 }
random_npc

{'Alan Max': 71, 'Mei Leo': 60, 'Alan Leo': 72, 'Mei Max': 96, 'Khan Leo': 83}
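
The output above lists five NPCs although the comprehension iterates six times. A likely explanation is that two of the random draws produced the same name: a `dict` keeps one entry per key, and a later value overwrites an earlier one. A deterministic sketch with made-up names and levels:

```python
names = ["Alan Max", "Mei Leo", "Alan Max"]   # "Alan Max" is drawn twice
levels = [71, 60, 15]

npc = {name: level for name, level in zip(names, levels)}
print(npc)       # {'Alan Max': 15, 'Mei Leo': 60}
print(len(npc))  # 2, not 3
```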

## Example: Pythagorean triples

1. Enumerate the triples $(i,j,k)$ such that $i < j < k \leq n$ (why is it enough to consider $i < j$?)
2. Keep $(i,j,k)$ only if it is a Pythagorean triple, i.e. $i^2 + j^2 = k^2$

In [43]:
n = 4
triples = [(x,y,z) for z in range(1, n+1) for y in range(1, z) for x in range(1, y)]
triples

[(1, 2, 3), (1, 2, 4), (1, 3, 4), (2, 3, 4)]

In [44]:
def enumerate_triples(n):
    return [(x,y,z) for z in range(1, n+1) for y in range(1, z) for x in range(1, y)]

In [45]:
is_pytha_triple = lambda x,y,z : x*x + y*y == z*z
n = 25
pytha_triples = [triple for triple in enumerate_triples(n) if is_pytha_triple(*triple)]
pytha_triples

[(3, 4, 5),
 (6, 8, 10),
 (5, 12, 13),
 (9, 12, 15),
 (8, 15, 17),
 (12, 16, 20),
 (15, 20, 25),
 (7, 24, 25)]
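
The enumeration step can also be delegated to `itertools.combinations`, which yields every increasing triple $x < y < z$ drawn from `range(1, n + 1)`. This is a sketch of an alternative, not the approach used above; it produces the same triples in a different order. The name `pytha_triples_upto` is hypothetical.

```python
from itertools import combinations

def pytha_triples_upto(n):
    """Pythagorean triples (x, y, z) with x < y < z <= n."""
    return [(x, y, z)
            for x, y, z in combinations(range(1, n + 1), 3)  # enumerate
            if x * x + y * y == z * z]                       # filter

print(pytha_triples_upto(13))  # [(3, 4, 5), (5, 12, 13), (6, 8, 10)]
```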

## Example: Capitalize

In [26]:
"map".capitalize()

'Map'

In [27]:
msg = "How can I capitalize the first letter of each word in a string?"
msg.split()

['How',
 'can',
 'I',
 'capitalize',
 'the',
 'first',
 'letter',
 'of',
 'each',
 'word',
 'in',
 'a',
 'string?']

In [29]:
list(map(lambda s: s.capitalize(), msg.split()))

['How',
 'Can',
 'I',
 'Capitalize',
 'The',
 'First',
 'Letter',
 'Of',
 'Each',
 'Word',
 'In',
 'A',
 'String?']

In [31]:
" ".join(map(lambda s: s.capitalize(), msg.split()))

'How Can I Capitalize The First Letter Of Each Word In A String?'
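
The same result can be written with a generator expression instead of `map`. One detail worth keeping in mind is that `str.capitalize` also lowercases the rest of each word, which the second call illustrates; the sample string `"hello NASA"` is chosen here for illustration.

```python
msg = "How can I capitalize the first letter of each word in a string?"

# Generator expression instead of map + lambda.
capitalized = " ".join(word.capitalize() for word in msg.split())
print(capitalized)
# How Can I Capitalize The First Letter Of Each Word In A String?

# capitalize() lowercases everything after the first character.
lowered = " ".join(word.capitalize() for word in "hello NASA".split())
print(lowered)
# Hello Nasa
```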

## Side Note

In [None]:
msg = "How can I capitalize the first letter of each word in a string?"
from string import capwords
capwords(msg)

The name of the "MapReduce" framework is inspired by the map and reduce functions found in functional programming languages.

# Example: Sieve of Eratosthenes

Reference:
1. [`range`](https://www.w3schools.com/python/ref_func_range.asp)

Consider the primality test. Previously we devised an $O(\sqrt{n})$ algorithm that tests whether a number is prime. Can we do better? It turns out that an ancient algorithm, the [Sieve of Eratosthenes](https://en.wikipedia.org/wiki/Sieve_of_Eratosthenes) (read this link), is more efficient when many numbers have to be tested, but it consumes memory. Fortunately, memory is nowadays cheap compared to computation.

Since `is_prime(n)` is mostly called with non-negative integers, we can establish the following correspondence between the function and a list lookup:
```
def is_prime(n):
    return answer[n]
```
What remains is to fill out the list `answer`. We expect that:
```
answer[0] = False
answer[1] = False
answer[2] = True
answer[3] = True
...
```

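This suggests that we can fill out the list using the Sieve of Eratosthenes: start with every entry marked `True`, mark 0 and 1 as `False`, then cross out the multiples of each number that is still marked `True`.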

In [57]:
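def sieve(n):
    # li[i] ends up True exactly when i is prime (0 <= i < n, assuming n >= 2)
    li = [True for _ in range(n)]
    li[0] = False
    li[1] = False
    for i in range(2, n):
        if li[i]:  # i survived every earlier crossing-out, so it is prime
            # cross out the multiples of i, starting from i*i
            for j in range(i*i, n, i):
                li[j] = False
    return li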

answer = sieve(1000)
def is_prime(n):
    return answer[n]

for i in range(12):
    print(i, is_prime(i))

0 False
1 False
2 True
3 True
4 False
5 True
6 False
7 True
8 False
9 False
10 False
11 True


The process of "remembering" the answers of a function is called **memoizing**. It is a common technique for speeding up recursive algorithms where applicable. In Chapter 1, Evaluation Model, we talked about the inefficiency of lazy programming languages.

**Memoization** can be used to speed up lazily evaluated programming languages. In a memoized version, computing the same expression multiple times only re-accesses the cached result.

The question is: how come the new algorithm is faster than the previous one at testing primality? It is all about trade-offs. According to some sources, given input $n$, the new one takes $O(n \log {n})$ time (a tighter bound for the sieve is $O(n \log \log n)$) and $O(n)$ memory, whereas the old one takes $O(\sqrt{n})$ time and $O(1)$ memory. At first glance, the old one is better.

Consider the problem of listing the primes below $n$.

Using the $O(\sqrt{n})$ version, for each $i$ below $n$ the program has to perform at worst $O(\sqrt{i})$ steps. Thus, the total is asymptotically $O(n\sqrt{n})$.

Using the sieve version, it is $O(n \log {n})$. Indeed, $O(n \log {n})$ grows more slowly than $O(n\sqrt{n})$ for large $n$, at the cost of $O(n)$ memory.
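
To make the comparison concrete, the sketch below counts inner-loop steps rather than measuring time. The step counters and function names are assumptions made for this illustration, and the counts are only a rough proxy for running time.

```python
from math import isqrt

def primes_by_trial_division(n):
    """List the primes below n, counting every divisor that is tried."""
    primes, steps = [], 0
    for i in range(2, n):
        prime = True
        for d in range(2, isqrt(i) + 1):
            steps += 1
            if i % d == 0:
                prime = False
                break
        if prime:
            primes.append(i)
    return primes, steps

def primes_by_sieve(n):
    """List the primes below n (assuming n >= 2), counting every crossing-out."""
    flags = [True] * n
    flags[0] = flags[1] = False
    steps = 0
    for i in range(2, n):
        if flags[i]:
            for j in range(i * i, n, i):
                flags[j] = False
                steps += 1
    return [i for i, flag in enumerate(flags) if flag], steps

n = 10_000
trial_primes, trial_steps = primes_by_trial_division(n)
sieve_primes, sieve_steps = primes_by_sieve(n)
print(trial_primes == sieve_primes)  # True: both list the same primes
print(trial_steps, sieve_steps)      # the sieve needs fewer steps here
```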

## Exercise

In [13]:
'''
partition(n,d) is the number of ways to write n as a sum of positive integers no bigger than d.
'''
def partition(n, d):
    if n == 0:
        return 1
    elif n < 0 or d <= 0:
        return 0
    else:
        return partition(n, d - 1) + partition(n - d, d)

In [None]:
partition(200,200) # it gets slower as n gets bigger; this call effectively never finishes

In [15]:
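def memomise_partition(n):
    # memo[x][d] caches partition(x, d) for 0 <= x, d < n
    memo = [[0 for _ in range(n)] for _ in range(n)]
    for d in range(n):
        memo[0][d] = 1  # partition(0, d) == 1: the empty sum
    for d in range(1, n):
        for x in range(1, n):
            # partition(x - d, d) is 0 when x - d < 0; guard explicitly
            # so that a negative index never wraps around the list
            with_d = memo[x-d][d] if x >= d else 0
            memo[x][d] = with_d + memo[x][d-1]
    return memo
memo = memomise_partition(201)
def faster_partition(n, d):
    return memo[n][d]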

In [16]:
assert(faster_partition(10,10) == partition(10, 10))
assert(faster_partition(7,5) == partition(7, 5))
assert(faster_partition(5,3) == partition(5, 3))

In [17]:
faster_partition(200,200)

3972999029388
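
As an alternative to building the table by hand, the standard library's `functools.lru_cache` can remember the results of the recursive definition directly. This is a sketch of a different technique from the table above; the name `cached_partition` is hypothetical, and smaller inputs are used so that the recursion stays comfortably shallow.

```python
from functools import lru_cache

@lru_cache(maxsize=None)   # remember the result for every (n, d) seen
def cached_partition(n, d):
    if n == 0:
        return 1
    elif n < 0 or d <= 0:
        return 0
    else:
        return cached_partition(n, d - 1) + cached_partition(n - d, d)

print(cached_partition(10, 10))  # 42
print(cached_partition(50, 50))  # 204226
```

Because the cached version still recurses, very large inputs may hit Python's recursion limit, which the table-based version avoids.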

1. (Difficult) The code above shows how to memoize `partition`. By referring to it, try to memoize `product_num`.

In [None]:
def product_num(n):
    pass

# Example: Coin Change Revisited

Reference:
1. [`slice`](https://www.geeksforgeeks.org/python-list-slicing/)

Revisiting the coin change problem from Chapter 2, Recursion, we can represent the group of coin types using a `list`.

In [8]:
MY_COINS = [5,10,20,50]
EU_CURRENCY = [1,2,5,10,20,50,100,200]

In [15]:
def count_change(amount, coins):
    if amount == 0:
        return 1
    elif amount < 0:
        return 0
    elif coins == []:
        return 0
    else:
        return count_change(amount, coins[1:]) + count_change(amount - coins[0], coins)
count_change(100, MY_COINS)

49

In [16]:
count_change(200, EU_CURRENCY) # it takes some time to calculate

73682
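
The slow call above recomputes the same sub-problems many times. One way to avoid that, sketched here under the assumption that the coin list does not change during a call, is to remember each result in a dictionary keyed by the remaining amount and the position of the first coin still allowed. The name `count_change_memo` is hypothetical.

```python
def count_change_memo(amount, coins):
    coins = tuple(coins)  # fixed snapshot of the coin types
    cache = {}            # maps (amount, start) -> number of ways

    def recur(amount, start):
        # `start` plays the role of the slice coins[start:]
        if amount == 0:
            return 1
        if amount < 0 or start == len(coins):
            return 0
        key = (amount, start)
        if key not in cache:
            cache[key] = (recur(amount, start + 1)                # skip this coin type
                          + recur(amount - coins[start], start))  # use it once more
        return cache[key]

    return recur(amount, 0)

MY_COINS = [5, 10, 20, 50]
EU_CURRENCY = [1, 2, 5, 10, 20, 50, 100, 200]
print(count_change_memo(100, MY_COINS))     # 49
print(count_change_memo(200, EU_CURRENCY))  # 73682
```

An index is used instead of the slice `coins[1:]` so that each cache key is a small, hashable pair of integers.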

## Exercise

In [36]:
def sum_partition(n):
    def recur(n, d, accumulated):
        if n == 0:
            return [accumulated]
        elif n < 0 or d <= 0:
            return []
        else:
            return recur(n, d - 1, accumulated) + recur(n - d, d, accumulated + [d])
    return recur(n, n, [])
sum_partition(5)

[[1, 1, 1, 1, 1], [2, 1, 1, 1], [2, 2, 1], [3, 1, 1], [3, 2], [4, 1], [5]]

The code above returns a list of the groups of integers that sum up to `n`. Develop similar code for `product_partition`.

In [2]:
def product_partition(n):
    pass