# Module 4: Python Data Structures + Functions

## Lists, Tuples, Sets & Dictionaries

### Lists

- As we saw last week, __a list is a collection of data values__, the elements of which may or may not share the same data type.


- __Lists are "mutable"__, meaning the contents of a list object can be modified and reordered as needed.



In [None]:
# how to create an empty list
emptylist = []

# example of a list containing data
list1 = [1,2,3,4,5]
list2 = [6,7,8,9,10]

list3 = list1 + list2

print(list3)

[1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
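To make the "mutable" point concrete, here is a minimal sketch of three common in-place changes to a list. The list name `scores` and its values are hypothetical, chosen only for illustration.

```python
# illustrative list; the name 'scores' is hypothetical
scores = [70, 85, 90]

scores.append(60)   # add an item to the end of the list
scores[0] = 75      # replace the item at index 0
scores.sort()       # reorder the list in place (ascending)

print(scores)
```

Each of these operations changes the existing list object rather than creating a new one.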


In [None]:
# example of a list containing mixed data
mixedlist = [1, 'far', 3.14159, 'pi']

print(type(mixedlist[2]))
print(type(mixedlist[3]))

<class 'float'>
<class 'str'>


### Tuples

- Fixed-length, immutable sequence of Python objects: the tuple itself __cannot__ be modified or reordered, although any item within it that is itself mutable (e.g., a list) __can__ still be changed in place.


- Can be composed of items of any data type or data structure


- As a rule, use parentheses to enclose items when defining a tuple


- Best for use with data that should generally not be alterable (e.g., a database key value).


- Tuples can be slightly faster to create and can use less memory than lists because they are immutable, though the difference is often small.

In [None]:
# define a simple tuple

tup1 = (1, 'leg', 2, 'arm', [47, 99])
type(tup1)

tuple

In [None]:
# access content via 0-based indices

print(tup1[4])

[47, 99]


In [None]:
# like lists, tuples can be sliced

print(tup1[0:2])

(1, 'leg')


In [None]:
# mutable objects inside a tuple CAN be modified in place
tup1[4][0] = 33
print(tup1[4])

[33, 99]
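By contrast, trying to replace one of the tuple's own items fails. The sketch below uses a hypothetical tuple `tup2` to show both behaviors side by side: assignment to a tuple slot raises a `TypeError`, while a list stored inside the tuple can still be changed.

```python
# hypothetical tuple for illustration
tup2 = (1, 'leg', [47, 99])

try:
    tup2[0] = 5                # replacing an item of the tuple is not allowed
except TypeError as err:
    error_message = str(err)
    print('TypeError:', error_message)

tup2[2].append(100)            # the list inside the tuple is still mutable
print(tup2)
```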


### Sets

- A Python __set__ is an unordered collection of distinct elements.


- Sets allow for use of mathematical set operations, e.g., union, intersection, difference, subset test, superset test, etc.


- Contents of a set can be altered, but adding a duplicate of an existing element results in no change to the set.


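- Sets can be created either by passing a list (or any other iterable) to the __set()__ function or via curly brackets '{}'. However, for clarity it's best to use the __set()__ function. Note that an empty pair of curly brackets creates an empty __dict__, not an empty set.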

In [None]:
# 2 ways to define a set
set1 = set(['A', 1, 'K', 3.14])

set2 = {1, 'B', 'pi', 3.14}

# get the union of 2 sets
set1.union(set2)

{1, 3.14, 'A', 'B', 'K', 'pi'}

In [None]:
# get the intersection of 2 sets
set1.intersection(set2)

{1, 3.14}

In [None]:
# test for subset
set3 = {1}

set3.issubset(set2)

True

In [None]:
# test for superset
set1.issuperset(set3)

True
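A few of the remaining points from the list above can be sketched as follows: adding a duplicate leaves a set unchanged, `difference()` removes shared elements, and empty curly brackets produce a dict rather than a set. All variable names here are hypothetical.

```python
set_a = {1, 2, 3}

set_a.add(2)        # 2 is already present, so the set does not change
set_a.add(4)        # 4 is new, so it is added
print(set_a)

# elements of set_a that are not in the other set
diff = set_a.difference({1, 4})
print(diff)

# empty curly brackets create a dict; use set() for an empty set
empty_dict = {}
empty_set = set()
print(type(empty_dict), type(empty_set))
```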

### Dictionaries (aka "dicts")

- dicts are collections of __"key:value" pairs__ where both the 'key' and 'value' are themselves Python objects. Keys must be hashable (e.g., strings, numbers, or tuples of immutable items).


- Use when you have a set of unique keys that map to values


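- Looking up a value by its key in a dict is typically faster than searching a list for the same item, but dicts are accessed by key rather than by position, so they cannot be sliced or indexed the way lists can.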


- Create a dict using curly brackets {}

In [None]:
# create a dict object
dict1 = {'name': 'John Doe', 'address': '749 West 42nd St', 'zip': 11001}

print(dict1)

{'name': 'John Doe', 'address': '749 West 42nd St', 'zip': 11001}


In [None]:
# access values by their key
dict1['name']

'John Doe'

In [None]:
# modify a value in a dict
dict1['name'] = 'Jane Smith'
print(dict1)

{'name': 'Jane Smith', 'address': '749 West 42nd St', 'zip': 11001}
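A few other everyday dict operations, sketched with a hypothetical `person` record (the keys and values are made up for illustration): adding a new key, testing for a key, reading a key safely with `get()`, removing a key, and looping over the pairs.

```python
# hypothetical record for illustration
person = {'name': 'John Doe', 'zip': 11001}

person['email'] = 'jdoe@example.com'      # assigning to a new key adds a pair
has_phone = 'phone' in person             # membership test checks the keys
phone = person.get('phone', 'unknown')    # get() returns a default if key is absent
del person['zip']                         # remove a key:value pair

# iterate over the key:value pairs
for key, value in person.items():
    print(key, '->', value)
```

Using `get()` avoids the `KeyError` that `person['phone']` would raise for a missing key.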


## List, Set, Dict Comprehensions & Generators

What are 'comprehensions' in Python?

From the author of "A Whirlwind Tour of Python":

*'Comprehensions are simply a way to compress a list-building for-loop into a single short, readable line.'*


An example from his book:

In [None]:
# for loop used to build a list
L = []
for n in range(12):
    L.append(n ** 2)
L

[0, 1, 4, 9, 16, 25, 36, 49, 64, 81, 100, 121]

In [None]:
# and the list comprehension equivalent

L = [n ** 2 for n in range(12)]
L

[0, 1, 4, 9, 16, 25, 36, 49, 64, 81, 100, 121]

### Using Conditionals Within a Comprehension

What if you need to include an __if__ statement in your loop?

In [None]:
# for loop with if statement testing the value of the iterator
L = []
for val in range(20):
    if val % 3:
        L.append(val)
L

[1, 2, 4, 5, 7, 8, 10, 11, 13, 14, 16, 17, 19]

In [None]:
# same logic as cell above but using a 1 line comprehension
L = [val for val in range(20) if val % 3]
L

[1, 2, 4, 5, 7, 8, 10, 11, 13, 14, 16, 17, 19]
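The test `if val % 3` relies on the fact that a non-zero remainder counts as true. As a small sketch, the same filter can be written with an explicit comparison, and an if/else expression can be placed at the front of a comprehension to transform each value instead of filtering. The names `explicit` and `labels` are hypothetical.

```python
# same filter as above, with the comparison spelled out
explicit = [val for val in range(20) if val % 3 != 0]
print(explicit)

# an if/else expression before the 'for' chooses a value for every item
labels = ['multiple' if val % 3 == 0 else 'other' for val in range(4)]
print(labels)
```

A trailing `if` drops items, whereas the leading `if/else` keeps every item and only changes what is stored.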

### Set Comprehensions

- Creates a set from iterated values in one line of code


- A set comprehension will automatically eliminate any duplicate values generated by the iterator you specify

In [None]:
# example of a set comprehension: create a set containing the square of
# each item within the range of (0,..,11)
set5 = {n**2 for n in range(12)}
set5

{0, 1, 4, 9, 16, 25, 36, 49, 64, 81, 100, 121}
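The example above happens to produce no duplicates. A minimal sketch that does generate repeated values makes the de-duplication visible; the variable names are hypothetical.

```python
# the list comprehension keeps every generated value, repeats included
remainders_list = [n % 3 for n in range(12)]
print(remainders_list)

# the set comprehension keeps only the distinct values
remainders_set = {n % 3 for n in range(12)}
print(remainders_set)
```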

### Dict Comprehensions

- Creates a dict object from iterated values in one line of code


In [None]:
# example of a dict comprehension: create a dict wherein the keys are
# integers in the range of (0,..,5) and the values are the squares of those
# same integers
dict5 = {n:n**2 for n in range(6)}
dict5

{0: 0, 1: 1, 2: 4, 3: 9, 4: 16, 5: 25}
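Dict comprehensions are not limited to numeric ranges. As a small sketch, the hypothetical list `words` below is turned into a dict that maps each word to its length.

```python
# hypothetical input list for illustration
words = ['list', 'tuple', 'set']

# key is the word itself, value is its length
lengths = {word: len(word) for word in words}
print(lengths)
```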

### Generators

- Generator expressions look like list comprehensions but are written with parentheses, and they create an __"as needed" iterator__. Sometimes referred to as a "recipe for producing iterative values".


- Generators save memory by producing values one at a time rather than requiring you to store every value you want to iterate over.


- When you need the next value in your sequence, request it from the generator, either by looping over it or by calling __next()__.


- The downside of generators is that they can only be iterated through __once__! If you need to iterate through the same set of values again, you need to define the generator all over again.

In [None]:
# define a generator and print its contents
G = (n ** 2 for n in range(12))
list(G)

# after printing the values from the generator G, we have exhausted G 
# as a generator and cannot use it again without re-declaring it.

[0, 1, 4, 9, 16, 25, 36, 49, 64, 81, 100, 121]

In [None]:
# generator 'G' above was used up, so we need to specify a new one
G = (n**2 for n in range(12))
for val in G:
    print(val, end=' ')

0 1 4 9 16 25 36 49 64 81 100 121 
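The "as needed" behavior can also be seen one value at a time with the built-in `next()` function. The sketch below uses hypothetical names (`G2`, `big_list`, `big_gen`) and also compares the size of a list with the size of the equivalent generator object; the exact byte counts depend on the Python version, so only the comparison is shown.

```python
import sys

G2 = (n ** 2 for n in range(3))

first = next(G2)            # values are produced only when requested
second = next(G2)
third = next(G2)
print(first, second, third)

# the generator is now exhausted; a default avoids a StopIteration error
leftover = next(G2, None)
print(leftover)

# a list stores every value, a generator stores only the recipe
big_list = [n ** 2 for n in range(100000)]
big_gen = (n ** 2 for n in range(100000))
print(sys.getsizeof(big_list) > sys.getsizeof(big_gen))
```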

## Functions

- Functions are the fundamental method of code organization and reuse within the realm of software development.


- They provide the developer with a way to encapsulate an algorithm within a reusable, "callable" code block.


- One of their primary uses is avoiding the unnecessary repetition of code within a program. (__DRY!!!__, i.e., "Don't Repeat Yourself")


- Careful use of functions can make your code __much__ more readable + understandable to others.


- Functions can be written to accept user-defined 'arguments'. Arguments are simply data values or data structures that you want the function to use as part of its processing.


Example of how to define a function (from "*A Whirlwind Tour of Python*"):

In [None]:
# defining a function that returns the first N fibonacci numbers
def fibonacci(N):
    L = []
    a, b = 0, 1
    while len(L) < N:
        a, b = b, a + b
        L.append(a)
    return L

Now we have a function named 'fibonacci' that accepts a single argument 'N', where 'N' is the number of Fibonacci numbers you want to find, and returns a list of those N Fibonacci numbers.

Once we've defined a function, we can invoke it within our code:

In [None]:
# get the first 10 fibonacci numbers
fibonacci(10)

[1, 1, 2, 3, 5, 8, 13, 21, 34, 55]

### Returning Multiple Values From a Function

In [None]:
# returning multiple values from a function
def f():
    a = 3
    b = 5
    c = 7
    return a, b, c

x, y, z = f()

print(x, y, z)

3 5 7


In [None]:
# multiple return values are actually returned as a tuple
ftup = f()
print(ftup)

# access the tuple item having an integer index value of 1
print(ftup[1])

(3, 5, 7)
5


### Default Values for Function Arguments

- You can specify a default value for any function argument


- Allows the function to be invoked without explicitly supplying a value for that argument

In [None]:
# Set N=3 as default value if no value is passed in
def fibonacci_wd(N = 3):
    L = []
    a, b = 0, 1
    while len(L) < N:
        a, b = b, a + b
        L.append(a)
    return L

fibonacci_wd()

[1, 1, 2]
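A default is only used when the caller does not supply a value. As a brief sketch, the hypothetical function `power` below is called three ways: relying on the default, overriding it by position, and overriding it by keyword.

```python
# hypothetical function for illustration
def power(base, exponent=2):
    return base ** exponent

squared = power(4)                       # exponent falls back to the default of 2
cubed = power(4, 3)                      # default overridden by position
by_keyword = power(base=2, exponent=5)   # arguments passed by name

print(squared, cubed, by_keyword)
```

Arguments with defaults must come after arguments without them in the function definition.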

### Doc Strings

- A 'doc string' is simply __a string placed as the first statement of a function that explains the purpose + functionality of the function__. Unlike an ordinary comment, it is stored with the function object.


- The doc string for a function is displayed for introspection purposes when requested by a user (see example below)


- Get in the habit of providing a doc string for each function you write!


- An example:

In [None]:
# Set N=3 as default value if no value is passed in
def fibonacci_wd(N = 3):
    '''Find the first N Fibonacci numbers.

    - If no value for N is provided, N defaults to 3.
    - The function generates a list of integers, which is returned to the
      user.
    '''
    
    L = []
    a, b = 0, 1
    while len(L) < N:
        a, b = b, a + b
        L.append(a)
    return L

fibonacci_wd()

[1, 1, 2]

In [None]:
# display the docstring via introspection (the trailing '?' is IPython/Jupyter
# syntax; in plain Python use help(fibonacci_wd) instead)
fibonacci_wd?
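Outside of a notebook, the doc string can be read programmatically. The sketch below uses a hypothetical function `area` and shows two standard ways to retrieve its doc string: the `__doc__` attribute and `inspect.getdoc()`. The built-in `help(area)` would display the same text along with the function signature.

```python
import inspect

# hypothetical function for illustration
def area(width, height=1):
    '''Return the area of a width x height rectangle.'''
    return width * height

# the raw doc string is stored on the function object
print(area.__doc__)

# inspect.getdoc() returns the doc string with indentation cleaned up
doc_text = inspect.getdoc(area)
print(doc_text)
```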

### "Anonymous", aka "lambda" Functions

- Short, single-expression functions defined via Python's __lambda__ keyword.


- Sometimes can make code more readable, sometimes not


- Why use them? Readability, or when you want to pass a simple function as an argument to another function.


- See "*A Whirlwind Tour of Python*" for a good example of using a lambda function to help sort the contents of a __dict__ object.

In [None]:
# simple lambda function
add = lambda x, y: x + y
add(1, 2)

3

In [None]:
# lambda function shown above is equivalent of this:
def add(x, y):
    return x + y

Example from "*A Whirlwind Tour of Python*"

https://nbviewer.jupyter.org/github/jakevdp/WhirlwindTourOfPython/blob/master/08-Defining-Functions.ipynb
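As a rough sketch of the kind of use described above (not the book's own example), a lambda can be passed as the `key` argument of the built-in `sorted()` function to sort a list of dicts by one of their values. The `items` data below is hypothetical.

```python
# hypothetical data for illustration
items = [
    {'name': 'pen', 'price': 1.5},
    {'name': 'book', 'price': 12.0},
    {'name': 'cup', 'price': 4.25},
]

# the lambda tells sorted() which value to compare for each dict
by_price = sorted(items, key=lambda item: item['price'])
by_name = sorted(items, key=lambda item: item['name'])

print([item['name'] for item in by_price])
print([item['name'] for item in by_name])
```

Here the lambda saves defining a separate named function that would only be used once.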

### "Hands On" Exercises

#### Part 1

Write a function that determines whether or not a driver has been speeding. The function should have a single argument: the speed they were driving (we'll assume it's in km per hour, not miles per hour).

- If their speed is 70 km per hr or less, return a value of 'OK'


- If their speed is greater than 70 km per hr, the driver earns 1 demerit point for every 5 km per hr above the speed limit (70). Return the total number of demerit points in the form of a string, e.g., if the driver's speed was 85 km per hr, return "Points: 3"


- If the total number of demerit points exceeds 6, the function should return "License Suspended"

#### Part 2

Write a function that tests whether a number is a prime number. It should accept one argument: the number you are testing. It should return a Boolean value indicating whether or not the number is a prime number.