# Collections

Python has four built-in data types for storing collections of data:
- List
- Tuple
- Set
- Dictionary

Useful reference: https://docs.python.org/3/tutorial/datastructures.html

## List

- A list is a collection of elements in a particular **order**. 
- A list can include the letters of the alphabet, the digits from 0–9, etc. Elements in a list can have **different types**. 
- `[]` indicates a list, and individual elements in the list are separated by commas. 
- A list allows **duplicates**. 

In [37]:
[1,2,3]

[1, 2, 3]

In [38]:
x = ["apple", "banana", "orange"]
x

['apple', 'banana', 'orange']

In [39]:
x = [1, 2.379, "apple"] # you can have multiple data types!!!!!
x

[1, 2.379, 'apple']

In [40]:
# list of lists
[ [1, 2], ["apple", "banana"] ]

[[1, 2], ['apple', 'banana']]

### Create a numerical list using `range()`

In [41]:
# create a list of the integers 0 ... 99 (the stop value 100 is excluded)
s = list(range(0,100))
s

[0,
 1,
 2,
 3,
 4,
 5,
 6,
 7,
 8,
 9,
 10,
 11,
 12,
 13,
 14,
 15,
 16,
 17,
 18,
 19,
 20,
 21,
 22,
 23,
 24,
 25,
 26,
 27,
 28,
 29,
 30,
 31,
 32,
 33,
 34,
 35,
 36,
 37,
 38,
 39,
 40,
 41,
 42,
 43,
 44,
 45,
 46,
 47,
 48,
 49,
 50,
 51,
 52,
 53,
 54,
 55,
 56,
 57,
 58,
 59,
 60,
 61,
 62,
 63,
 64,
 65,
 66,
 67,
 68,
 69,
 70,
 71,
 72,
 73,
 74,
 75,
 76,
 77,
 78,
 79,
 80,
 81,
 82,
 83,
 84,
 85,
 86,
 87,
 88,
 89,
 90,
 91,
 92,
 93,
 94,
 95,
 96,
 97,
 98,
 99]

In [42]:
# length of a list is the number of elements in a list
len(s)

100

In [43]:
y = [ [1, 2], ["apple", "banana"] ]
len(y)

2

### Indexing elements in a list

**Index positions start at 0, not 1.** From left to right, 0, 1, 2, .... From right to left, -1, -2, ...

In [44]:
y = [ [1, 2], ["apple", "banana"] ]
y[0], y[1]

([1, 2], ['apple', 'banana'])
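Negative indices and nested indexing follow the same rule. The sketch below uses hypothetical lists (`letters` and `nested` are not part of the cells above) to show counting from the right and indexing one level at a time.

```python
# Hypothetical example lists, chosen only for illustration
letters = ["a", "b", "c", "d"]

first = letters[0]         # leftmost element
last = letters[-1]         # rightmost element
second_last = letters[-2]  # one step to the left of the last element

# Nested lists are indexed one level at a time
nested = [[1, 2], ["apple", "banana"]]
inner = nested[1][0]       # element 0 of the second inner list

print(first, last, second_last, inner)
```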

### Slicing a list
Specify a start index and a stop index, as in `s[start:stop]`. The slice includes the element at `start`, and Python stops one item before the `stop` index.

In [45]:
s = list(range(1,100))
s[19:60]

[20,
 21,
 22,
 23,
 24,
 25,
 26,
 27,
 28,
 29,
 30,
 31,
 32,
 33,
 34,
 35,
 36,
 37,
 38,
 39,
 40,
 41,
 42,
 43,
 44,
 45,
 46,
 47,
 48,
 49,
 50,
 51,
 52,
 53,
 54,
 55,
 56,
 57,
 58,
 59,
 60]
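Either index can be left out, and an optional third value sets the step. A minimal sketch, assuming a small hypothetical list `nums` so the results are easy to check by hand:

```python
nums = list(range(10))   # the integers 0 through 9

head = nums[:3]          # start omitted: begin at index 0
tail = nums[7:]          # stop omitted: run to the end of the list
evens = nums[::2]        # step of 2: every second element
last_two = nums[-2:]     # negative start: count from the right

print(head, tail, evens, last_two)
```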

### Changing elements in a list

Use the name of the list followed by the index of the element you want to change, and then provide the new value you want that item to have.

In [46]:
x = [1, 2.379, "apple"]
x[2] = "banana"
x

[1, 2.379, 'banana']

### Adding elements to a list
1. Use `append()` to add an element to the end of a list.
2. Use `insert()` to add a new element at a specified position. You need to specify the index and the value of the new element.

In [47]:
x.append("brat")

In [48]:
x.insert(0, 1000)
x

[1000, 1, 2.379, 'banana', 'brat']

In [49]:
x.insert(4, "apple") # insert at index 4, i.e. the fifth position
x

[1000, 1, 2.379, 'banana', 'apple', 'brat']

### Removing elements from a list
 
1. Use `del` to remove an element according to its position. 
2. Use `pop()` to remove the element at a specified position and return it. If no index is provided, `pop()` removes and returns the last element.
3. Use `remove()` to remove an element by value. Note the `remove()` method deletes only the first occurrence of the value you specify.
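A minimal sketch of `del`, using a hypothetical `fruits` list. Unlike `pop()`, `del` is a statement and does not return the removed element; it also accepts a slice.

```python
# Hypothetical list used only for this illustration
fruits = ["apple", "banana", "orange", "peach"]

del fruits[1]              # remove the element at index 1 ("banana")
after_single = list(fruits)  # copy the current state for inspection

del fruits[0:2]            # remove a whole slice (indices 0 and 1)
after_slice = list(fruits)

print(after_single, after_slice)
```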

In [50]:
a = x.pop() # removes last value
a, x

('brat', [1000, 1, 2.379, 'banana', 'apple'])

In [51]:
a = x.pop(1) # the argument is an index: removes the element at index 1
a, x

(1, [1000, 2.379, 'banana', 'apple'])

In [68]:
x = ["apple", "banana", "orange", "peach", "apple"]

In [69]:
a = x.remove("apple") # only removes one
a, x

(None, ['banana', 'orange', 'peach', 'apple'])

In [70]:
x = ["apple", "banana", "orange", "peach", "apple"]
while "apple" in x:
    x.remove("apple")
x

['banana', 'orange', 'peach']

### Joining multiple lists 

In [72]:
a = list(range(5))
b = list(range(10,15))
c = a+b
c

[0, 1, 2, 3, 4, 10, 11, 12, 13, 14]

In [73]:
a.extend(b)
a

[0, 1, 2, 3, 4, 10, 11, 12, 13, 14]

### Ordering a list
1. Use `sort()` to sort a list permanently (in place).
2. Use `sorted()` to get a new sorted list while leaving the original list unchanged.
3. Reverse the list with `reverse()` (in place) or with slicing (returns a reversed copy).

In [74]:
x = [67, 78, 0, 1, 44]
x.sort()
x

[0, 1, 44, 67, 78]

In [75]:
x = [67, 78, 0, 1, 44]
sorted(x)

[0, 1, 44, 67, 78]

In [76]:
x

[67, 78, 0, 1, 44]

In [77]:
sorted(x, reverse=True)

[78, 67, 44, 1, 0]

In [78]:
x = [67, 78, 0, 1, 44]
x.reverse() # reverses in place: x itself is modified
x

[44, 1, 0, 78, 67]

In [79]:
x[::-1] # returns a reversed copy; x itself is unchanged

[67, 78, 0, 1, 44]

In [80]:
x[:3]

[44, 1, 0]

## Tuple

- A tuple is **immutable**. It can be viewed as an immutable list. 
- `()` indicates a tuple, and individual elements in the tuple are separated by commas. 
- Elements in a tuple can be indexed, but a tuple object does not support item assignment.

In [84]:
x = (1, 3.14, "hello")
x

(1, 3.14, 'hello')

In [85]:
x[0]

1

In [86]:
x[0] = 100

TypeError: 'tuple' object does not support item assignment
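A few related tuple behaviours, sketched with hypothetical names (`point`, `single`, and so on): unpacking into variables, the trailing comma needed for a one-element tuple, and building a modified copy by going through a list.

```python
point = (3, 4)

# Unpacking assigns each element to its own variable
x_coord, y_coord = point

# A one-element tuple needs a trailing comma; (5) is just the integer 5
single = (5,)
not_a_tuple = (5)

# To "change" a tuple, build a new one; the original stays untouched
as_list = list(point)
as_list[0] = 100
updated = tuple(as_list)

print(x_coord, y_coord, type(single), type(not_a_tuple), updated, point)
```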

## Set

- A set is an **unordered** collection with **no duplicate** elements. 
- Basic uses include membership testing and eliminating duplicate entries.
- Set objects also support mathematical operations like union, intersection, difference, and symmetric difference.
- Use `set()` or `{}` to create a set. Note: to create an empty set you have to use `set()`, not `{}`; the latter creates an empty dictionary. 

In [87]:
x = set()
for i in range(10):
    x.add(i)
x

{0, 1, 2, 3, 4, 5, 6, 7, 8, 9}

In [88]:
x = set([1,1,2,2,3,3,4]) # may be time consuming if list is long
x

{1, 2, 3, 4}

In [97]:
y = {2,3,8}

In [98]:
# Set operation
x - y # difference: elements of x that are not in y (x is unchanged)

{1, 4}

In [99]:
x & y # returns common contents of x & y

{2, 3}

In [100]:
x | y # union: elements that are in x, in y, or in both

{1, 2, 3, 4, 8}
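The remaining operations can be sketched the same way. The example below uses hypothetical sets `a` and `b` with the same contents as `x` and `y`. Note that the keyword `or` is not a set operator: it returns its first operand when that operand is non-empty, so the union must be written with `|` (or `a.union(b)`).

```python
a = {1, 2, 3, 4}
b = {2, 3, 8}

union = a | b              # elements in a, in b, or in both
sym_diff = a ^ b           # elements in exactly one of the two sets
has_three = 3 in a         # membership test

# `or` is boolean logic, not union: it simply returns a here
or_result = a or b

print(union, sym_diff, has_three, or_result is a)
```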

## Dictionary

- A dictionary is a collection of **key-value** pairs. 
- Use a key to access the value associated with that key. 
- Keys in a dictionary should be unique. 

### Creating, indexing, modifying, adding, and removing

In [None]:
x = {"name": "Pikachu", "Type": "Electric", "height": 40, "weight": 6}
x

In [None]:
# create from keys

In [None]:
x.keys()

In [None]:
x.values()

In [None]:
x["name"]

In [None]:
x["weight"] = 10
x

In [None]:
x["attack"] = ["thunderbolt", "quick attack", "iron tail"]

In [None]:
x = {}
x["name"] = "Pikachu"
x

In [None]:
x["type"] = "electric"
x

In [None]:
del x["type"]
x

In [None]:
# nested dictionary
pokemon_collection = {}
pokemon_collection["Pikachu"] = {"type": "electric", "height": 0.4, "weight": 6}
pokemon_collection

In [None]:
pokemon_collection["Eevee"] = {"type": "normal", "height": 0.3, "weight": 6.5}
pokemon_collection

### Using `get()` to Access Values

In [None]:
pokemon_collection["Bulbasaur"]

In [None]:
pokemon_collection.get("Bulbasaur")

In [None]:
pokemon_collection.get("Pikachu")
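The difference between the two access styles can be sketched as follows, using a small hypothetical dictionary `pokedex`. Square brackets raise a `KeyError` for a missing key, whereas `get()` returns `None`, or a default value supplied as the second argument.

```python
# Hypothetical dictionary used only for this illustration
pokedex = {"Pikachu": {"type": "electric"}}

missing = pokedex.get("Bulbasaur")              # no default: returns None
fallback = pokedex.get("Bulbasaur", "unknown")  # explicit default value

# Square-bracket access raises KeyError for a missing key
try:
    pokedex["Bulbasaur"]
except KeyError as err:
    error_key = err.args[0]

print(missing, fallback, error_key)
```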

# Control Flow

- `if`, `if-else`, `if-elif-else`
- `for` loop, `while` loop
- `break` and `continue`
- `pass`
- More on iterations: list comprehension


Useful reference: https://docs.python.org/3/tutorial/controlflow.html (sections 4.1-4.5)

### if statements

An `if` statement evaluates whether a **conditional test** is `True` or `False`.  
If the conditional test evaluates to `True`, Python executes the indented code following the `if` statement.  
If the test evaluates to `False`, Python skips that code.

In [None]:
if 1+1 == 2:
    print(":)")

In [None]:
x = 1
if x < 0 :
    print("negative value")
else:
    print("positive value or zero")

In [None]:
x = 1
if x < 0 :
    print("negative value")
elif x == 0:
    print("zero")
else:
    print("positive value")

In [None]:
# multiple elif
x = 85
if x >= 90:
    print("A")
elif x >= 80:
    print("B")
elif x >= 70:
    print("C")
else:
    print("F")

In [None]:
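# order matters: with x = 85 the test `x >= 70` matches first, so this prints "C"
# and the `x >= 80` branch can never run
if x >= 90: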
    print("A")
elif x >= 70:
    print("C")
elif x >= 80:
    print("B")
else:
    print("F")

In [None]:
fruits = ["apple", "banana", "orange", "cherry", "blueberry"]

In [None]:
x = "apple"
if x in fruits:
    print("available")
else:
    print("not available")

In [None]:
# multiple conditions
x, y = "apple", "pineapple"
if x in fruits and y in fruits:
    print(":)")
elif (x in fruits and y not in fruits) or (x not in fruits and y in fruits):
    print(":-|")
else:
    print(":( go shopping")

### for loop

A `for` loop is used for iterating over an iterable, such as a list, a tuple, a dictionary, a set, or a string.

In [None]:
fruits = ["apple", "banana", "orange", "cherry", "blueberry"]
for x in fruits:
    print(x)

In [None]:
for i in [0,1,2,3,4]:
    print(i)

In [None]:
list(range(10))

In [None]:
count = 0
for i in range(10):
    count += 1
count

In [None]:
total = 0
for i in range(10):
    total += i
total

In [None]:
count = 0
total = 0
for i in range(10):
    count += 1
    total += i
avg = total/count
avg

In [None]:
for i, j in enumerate(range(10,20)):
    print(i,j)

In [None]:
for i, j in enumerate(range(1,5)):
    print(i, j**2)

In [None]:
for i in range(1, 10):
    if i % 2 == 0:
        print(i**2)
    else:
        print(i)

In [None]:
x = {"name": "Pikachu", "Type": "Electric", "height": 40, "weight": 6}
for k, v in x.items():
    print(k)
    print(v)

### While loop

A `while` loop requires a condition. It executes a set of statements repeatedly for as long as the condition is true.

In [None]:
i = 1
while i < 6:
    print(i)
    i += 1

In [None]:
# be careful to avoid an infinite loop: i is never updated here, so this loop never ends
i = 1
while i < 6:
    print(i)

In [None]:
# Using a flag
end = False
i = 1

while not end:
    i += 1
    end = i > 5
i

In [None]:
# find the largest power of 3 that is <= 1000
x = 1
while 3*x <= 1000:
    x = 3*x
x

In [None]:
x = 1
itr = 0
while 3*x <= 1000:
    x = 3*x
    itr += 1
x, itr

### Break, continue, pass
- Use the `break` statement to exit a `for` loop or `while` loop immediately without running any remaining code in the loop. 
- Use the `continue` statement to stop the current iteration and continue with the next one. 
- The `pass` statement does nothing. It can be used when a statement is required syntactically but the program requires no action.

In [None]:
for i in range(1, 10):
    if i % 2 == 0:
        break
    print(i)

In [None]:
for i in range(1, 10):
    if i % 2 == 0:
        continue
    print(i)

In [None]:
for i in range(1, 10):
    if i % 2 == 0:
        print("pass")
        pass
    print(i)

In [None]:
i = 0
while i < 10:
    i += 1
    if i % 2 == 0:
        break
    print(i)

In [None]:
i = 0
while i < 10:
    i += 1
    if i % 2 == 0:
        continue
    print(i)

### List comprehension

A list comprehension combines the for loop and the creation of new elements into one line, and automatically appends each new element. 
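To make the correspondence concrete, the sketch below builds the same list twice, once with an explicit loop and `append()` and once with a comprehension. The names `squares_loop` and `squares_comp` are chosen only for this comparison.

```python
# Explicit loop: create an empty list, then append each new element
squares_loop = []
for x in range(1, 11):
    squares_loop.append(x**2)

# Equivalent list comprehension: loop and append in one expression
squares_comp = [x**2 for x in range(1, 11)]

print(squares_loop == squares_comp)
```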

In [None]:
squares = [x**2 for x in range(1,11)]
squares

In [None]:
[x**2 for x in range(1,11) if x % 2 != 0 ]

In [None]:
[x**2 if x % 2 != 0 else 1 for x in range(1,11)]

In [None]:
{k:v for k, v in enumerate(range(5,0,-1))}

In [None]:
{k:v for k, v in enumerate("abcd")}

# Exercises
2. Write a program that prints the numbers from 1 to 100. But for multiples of three print “Fizz” instead of the number and for the multiples of five print “Buzz”. For numbers which are multiples of both three and five print “FizzBuzz”.
3. Write a program to print all the numbers between 1000 and 2000 which are divisible by 7 but are not a multiple of 5.
4. Write a program to calculate factorial of a number.
5. Fibonacci Sequence. Write a program that asks the user for a positive integer n and prints the first n numbers of the Fibonacci sequence.
6. List Processing. Write a program that takes a list of numbers and prints the largest and smallest numbers in the list.


# Functions

A function is a block of reusable code that performs a specific task. Functions help organize code into manageable sections, make the code more readable, and allow for code reuse.

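- `def function_name():` is the function definition: the keyword `def` is followed by the name of the function. Use meaningful function names. The parentheses hold the information the function needs to do its job. Finally, the definition ends in a colon.
- Indentation is important: the indented lines after the definition form the function body. 
- A docstring is a string literal placed as the first statement in the body that describes what the function does. 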

In [None]:
def greeting():
    print("hello world!")

In [None]:
greeting()

In [None]:
def f(a, b):
    """
    Do something with a and b.
    """
    
    return 2*a + 3*b

In [None]:
f(1,2)

In [None]:
# match each argument in the function call with a parameter in the function definition

In [None]:
f(a=2, b=1)

In [None]:
f(b=1, a=2)

In [None]:
f()

In [None]:
# default parameter value

In [None]:
def f(a, b, c=10):
    return 2*a + 3*b - c

In [None]:
f(2,3)

In [None]:
f(2,3,20)

When you use default values, any parameter with a default value needs to be listed after all the parameters that don’t have default values. This allows Python to continue interpreting positional arguments correctly.
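A small sketch of this rule, using a hypothetical function `scale_and_shift` with the same shape as `f` above. The default can be overridden positionally or by keyword, and a definition that puts a default parameter before a non-default one is rejected with a `SyntaxError` (shown here through `compile()` so that the example itself still runs).

```python
def scale_and_shift(a, b, c=10):
    """Return 2*a + 3*b - c; c falls back to 10 when omitted."""
    return 2*a + 3*b - c

default_used = scale_and_shift(2, 3)        # c takes its default value
positional = scale_and_shift(2, 3, 20)      # c given positionally
by_keyword = scale_and_shift(2, 3, c=0)     # c given by keyword

# A default parameter listed before a non-default one is a syntax error
rejected = False
try:
    compile("def bad(a, c=10, b): pass", "<example>", "exec")
except SyntaxError:
    rejected = True

print(default_used, positional, by_keyword, rejected)
```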

In [None]:
# match parameter types
f('hello', 'world')

In [None]:
f([1,2,3], ['a', 'b', 'c'])

In [None]:
# variables
x = 3
def f(a,b):
    return a+b-x

f(2,5)

In [None]:
def f(a,b):
    return a+b-y
f(2,5)

In [None]:
# variables defined within a function are not available within the global scope
def f(a,b):
    y = 10
    return a + b - 2*y
f(2,3)

In [None]:
y

In [None]:
# return 

In [None]:
def f(a,b):
    y = 10
    res = a + b - 2*y
    return res

In [None]:
def f(a,b):
    return a, b, 2*a + 3*b
f(2,3)

In [None]:
x, y, z = f(2,3)
print(x, y, z)

In [None]:
# write if-statement, for loop, ... in the function

In [None]:
def is_odd(x):
    if x % 2 == 1:
        return True
    else:
        return False

In [None]:
is_odd(3)

In [None]:
is_odd(4)

In [None]:
# calculate the average score for each session
def avg_score(x):
    """
    Input: 
    x: a list of scores for one session
    
    Output:
    average score of this session
    """
    total_score = 0
    num = 0
    for i in x:
        total_score += i
        num += 1
    
    avg_score = total_score/num
    return avg_score

In [None]:
avg_score([100,90,80, 85, 60])

In [None]:
def largest_power_below(a, max_num=1000):
    num = 1
    while a*num <= max_num:
        num *= a
    return num

In [None]:
largest_power_below(3)

In [None]:
largest_power_below(3, 5000)

A function definition can include the special `*args` parameter. `*args` tells Python to collect any number of positional arguments and pack them into a tuple (an empty tuple if none are passed). 

In [None]:
def better_sum(*args):
    total = 0
    for i in args:
        total += i
    return total

In [None]:
better_sum(1)

In [None]:
better_sum(1,5,7)

Sometimes you’ll want to accept an arbitrary number of arguments, but you won’t know ahead of time what kind of information will be passed to the function. `**kwargs` allows writing functions that accept as many key-value pairs as the calling statement provides. 

In [None]:
def shopping(**kwargs):    
    for key, val in kwargs.items():
        print(key, val)

shopping(produce = "lettuce", fruit = "apple")

We will see more examples later. 

### Store functions in modules

We can store functions in a separate file called a *module*, and then use `import` to bring that module into the main program when we need it. An `import` statement tells Python to make the code in a module available in the currently running program file. 

Advantages: 
1. Focus on higher-level logic in the main program.
2. Reuse functions in many different programs.
3. Share a single file easily without sharing the entire program.
4. Use libraries of functions that other programmers have written.

In [None]:
%%file grades.py

def avg_score(x):
    return sum(x)/len(x)

In [None]:
scores = [100,94.6, 93.2, 85, 78]

In [None]:
import grades
grades.avg_score(scores)

In [None]:
from grades import avg_score
avg_score(scores)

In [None]:
from grades import *
avg_score([100,94.6, 93.2, 85, 78])

In [None]:
import grades as gr
gr.avg_score(scores)

In [None]:
import numpy as np
import pandas as pd
from matplotlib import pyplot as plt

- Recursive functions
- Anonymous functions
- Lazy evaluation
- Higher-order functions
- Decorators
- Partial application
- Using operator
- Using functional
- Using itertools
- Pipelines with toolz

### Recursive functions

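A recursive function is one that calls itself. It is quite useful for some data structures such as trees. But be careful: every recursive function needs a base case that stops the recursion, and the naive `fibonacci` below recomputes the same values many times, so it becomes slow as `n` grows. 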

In [None]:
def fibonacci(n):
    if n == 0:
        return 0
    elif n == 1:
        return 1
    else:
        return fibonacci(n-1) + fibonacci(n-2)

In [None]:
fibonacci(10)
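Two common ways to avoid the repeated work are sketched below. Both function names (`fib_cached`, `fib_iter`) are hypothetical. The first keeps the recursive shape and caches results with the standard-library decorator `functools.lru_cache`; the second replaces recursion with a loop.

```python
from functools import lru_cache

@lru_cache(maxsize=None)   # remember every result already computed
def fib_cached(n):
    if n < 2:              # base cases: fib(0) = 0, fib(1) = 1
        return n
    return fib_cached(n - 1) + fib_cached(n - 2)

def fib_iter(n):
    a, b = 0, 1            # fib(0), fib(1)
    for _ in range(n):
        a, b = b, a + b    # slide the pair one step forward
    return a

print(fib_cached(10), fib_iter(10), fib_cached(50) == fib_iter(50))
```

With caching, each value is computed once, so `fib_cached(50)` returns quickly, whereas the naive version would make billions of calls.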

### Anonymous functions

An anonymous function in Python is a function without a name. A `lambda` function is a small anonymous function, i.e., `lambda arguments : expression`. It can take any number of arguments, but can only have one expression.

In [None]:
x = lambda a: 2**a
x(5)

In [None]:
text = "hello world"

upper = lambda string: string.upper()
print(upper(text))

In [None]:
add = lambda x, y: x+y
add(5, 8)

In [None]:
get_min = lambda x, y: x if x < y else y
get_min(7, 5)

In [None]:
x = [1,2,4,5,3,7,8,3]
sorted(x, key = lambda x: x%2 == 0)

In [None]:
student_tuples = [
    ('john', 'A', 15),
    ('jane', 'B', 12),
    ('dave', 'B', 10),
]
sorted(student_tuples, key=lambda student: student[2])   

In [None]:
x = {"a": [1,2,3, 4], "b": [1,3,4], "c": [1,2,3,4,5]}
x_new = dict(sorted(x.items(), key = lambda item: len(item[1]), reverse=True))
x_new

In [None]:
# map, zip, filter
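A minimal sketch of the three built-ins named in the cell above, with hypothetical inputs `nums` and `names`. Each returns a lazy iterator, so the results are wrapped in `list()` to display them.

```python
nums = [1, 2, 3, 4, 5]
names = ["a", "b", "c"]

# map applies a function to every element
doubled = list(map(lambda n: 2 * n, nums))

# filter keeps the elements for which the function returns True
odds = list(filter(lambda n: n % 2 == 1, nums))

# zip pairs elements position by position and stops at the shorter input
pairs = list(zip(names, nums))

print(doubled, odds, pairs)
```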