# Python intro for HS math - part 3

After completing parts 1 and 2 of the Python introduction, learning about markdown language (including math formatting) and exploring how to generate permutations and combinations, you should be ready to start creating your own projects. In this notebook we'll introduce a few more Python features to round out your knowledge and make you a more capable programmer.

## Dictionaries

Dictionaries are collections of key-value pairs. Since Python 3.7 a dictionary remembers the order in which its keys were inserted, but items are still looked up by key rather than by position. The equivalent structure in other languages is sometimes called a hash or an associative array. In my opinion, choosing the name "dictionary" over the alternatives is yet another example of how Python was designed to be more intuitive and human friendly.

Like lists, dictionaries are mutable. While the values can be arbitrary, each key must be a "hashable" data type such as a string or a number (lists, sets and other dictionaries cannot serve as keys). A key can only appear once, and assigning a new value to an existing key overwrites the original value. Values can appear multiple times. Dictionaries are enclosed by curly braces, and the values of a dictionary are accessed using the key name in square brackets.

In [None]:
#Initialize the dictionary with a few key-value pairs
fruit_colors = {'apple':'red', 'banana':'yellow'}

# Print out the original dictionary
print("Original dictionary")
print(fruit_colors)
print()

# Add a few more items
fruit_colors['lime'] = 'green'
fruit_colors['lemon'] = 'yellow'

# Print out the new dictionary contents
print("Dictionary after adding elements")
print(fruit_colors)
print()

# Now delete an entry and print the new contents
del fruit_colors['lime']
print("Dictionary after deleting entry")
print(fruit_colors)
print()

# Use the keys() and values() methods to list just the keys or values
print("List the keys and values")
print("keys:", fruit_colors.keys())
print("values:", fruit_colors.values())
print()

# The items() method is useful for iterating over a dictionary
print('Iterate over the dictionary')
for k,v in fruit_colors.items():
    print(k, v)
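
The cell above adds and deletes entries, but it does not show a key being overwritten. Here is a small sketch of that rule, along with two handy ways to check for a key. The `prices` dictionary and its values are made up for illustration and are not used elsewhere in this notebook.

```python
# Hypothetical example data (not part of the fruit_colors dictionary above)
prices = {'apple': 0.50, 'banana': 0.25}

# Assigning to an existing key overwrites the old value
prices['apple'] = 0.75
print(prices['apple'])            # 0.75

# Use "in" to test whether a key is present
print('lime' in prices)           # False

# get() returns a default value when the key is missing
lime_price = prices.get('lime', 0.0)
print(lime_price)                 # 0.0

# The dictionary still has only two entries
print(len(prices))                # 2
```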

### Still a little too abstract?

If the concept of dictionary still feels a little too abstract or hasn't quite sunk in, think of it as analogous to the dictionary you're already familiar with.

In [None]:
definition = {}

definition['apple'] = 'the usually round, red or yellow, edible fruit of a small tree, \
Malus sylvestris, of the rose family.'

definition['banana'] = 'a tropical plant of the genus Musa, certain species of which \
are cultivated for their nutritious fruit.'

definition['lemon'] = 'the yellowish, acid fruit of a subtropical citrus tree, Citrus limon.'

In [None]:
print('apple:', definition['apple'])
print()
print('banana:', definition['banana'])
print()
print('lemon:', definition['lemon'])

### Hmmm... this reminds me a lot of sets

The use of curly braces for both sets and dictionaries can be a little confusing. Dictionaries came first and earlier versions of the language did not have native support for sets. Since curly braces were already accepted as the standard set notation among mathematicians, the Python developers decided to reuse them for representing sets.

Reusing curly braces sort of makes sense given that sets and dictionaries share some similarities, such as the requirements that both set elements and dictionary keys cannot be repeated and must be hashable types. To a first approximation, a set is a lot like a dictionary without values.
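
One practical consequence of the shared notation is worth seeing in code. The short sketch below (variable names are made up for illustration) shows that empty curly braces give you a dictionary, not a set, and that repeated elements collapse in a set.

```python
# Empty curly braces create a dictionary, not a set
empty_braces = {}
print(type(empty_braces))     # <class 'dict'>

# An empty set has to be created with set()
empty_set = set()
print(type(empty_set))        # <class 'set'>

# Repeated elements are dropped from a set,
# much like a repeated key in a dictionary
letters = {'a', 'b', 'a', 'c', 'b'}
print(len(letters))           # 3
```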

### Remember that dictionary values are not subject to the same limitations as the keys

Although the dictionary keys are limited to simple hashable types, the values are arbitrary and can be repeated. 

In [None]:
favorite_fruits = {}
favorite_fruits['Bob'] = ['orange', 'banana', 'cherry']
favorite_fruits['Kyle'] = ['lemon', 'peach', 'apricot']
favorite_fruits['Trang'] = ['mango', 'banana', 'cherry']
favorite_fruits['Ovie'] = ['orange', 'banana', 'cherry']

In [None]:
favorite_fruits['Trang']

### Exercises

1. Create your own dictionary containing any information you like - states and populations, friends and their pets' names, surfing spots and quality, musical acts and lead singers, restaurants and ratings.

2. After creating the dictionary, convince yourself that you know how to add new entries, delete entries, associate new values with keys, and iterate over the dictionary.

3. Experiment with breaking the rules for a dictionary. What happens if you use a list as a key? Access a key that doesn't exist? Forget the colon between key and value when initializing a dictionary?

## Tuples

A tuple (rhymes with couple) is a Python data structure that is similar to a list except that tuples are immutable. A tuple is enclosed by parentheses and tuple elements are accessed by index in the same way as list elements.

In [None]:
t = ('a', 'b', 'c')

print(t)
print()

for x in t:
    print(x)
print()
    
print(t[0], t[1], t[2])

Sometimes you need a tuple that contains a single element. For example, a function may require a tuple as an argument, regardless of the number of elements in the tuple. In this case, follow the element of the tuple with a comma.

In [None]:
y = (3)
type(y)

In [None]:
y = (3,)
type(y)

### OK, why even bother having tuples?

Since tuples behave just like lists except that they are immutable, why even bother having them? It turns out that programs can be more efficient if the Python interpreter knows that an object is immutable. Don't worry, this is mostly under the hood, nerdy stuff that you don't need to worry about.
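
There is also a more visible payoff. Because a tuple cannot change, a tuple of simple items (numbers, strings) is hashable and can be used as a dictionary key, which a list cannot. The sketch below uses a made-up `points` dictionary keyed by (x, y) coordinates.

```python
# Hypothetical example: label a few grid points by their (x, y) coordinates
points = {}
points[(0, 0)] = 'origin'
points[(1, 0)] = 'unit step along x'
points[(0, 1)] = 'unit step along y'

# Look up a value using a tuple as the key
print(points[(1, 0)])

# A tuple can also be unpacked into separate variables
x, y = (3, 4)
print(x, y)                   # 3 4
```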

### Exercises

Take a few minutes to experiment with tuples. As usual, see if you can break things. For example, what happens if you try to change an element of a tuple?

Next, create a tuple of lists and try changing elements of the list.

`t = (['a', 'b', 'c'], ['d', 'e', 'f'], ['g', 'h', 'i'])`

## Iterating over multiple lists (and an intro to iterators)

The `zip()` built-in function takes one or more lists (or other iterables) and generates an iterator of tuples built from the corresponding elements of the lists. Convert the result to a list if you need to use it in a list context.

In [None]:
list1 = ['a', 'b', 'c', 'd']
list2 = ['A', 'B', 'C', 'D']
list3 = [1, 2, 3, 4]
t = zip(list1, list2, list3)
print(list(t))

Normally you would use the `zip()` function with lists of the same lengths. If lists of different lengths are used, the number of tuples will be determined by the length of the shortest list.

In [None]:
list1 = ['a', 'b', 'c', 'd', 'e', 'f']
list2 = ['A', 'B', 'C', 'D', 'E']
list3 = [1, 2, 3, 4, 5, 6, 7, 8, 9]
t = zip(list1, list2, list3)
print(list(t))

A zipped object can be unzipped by calling `zip()` again with the argument preceded by an asterisk.

In [None]:
t = zip(list1, list2, list3)
x, y, z = zip(*t)
print(x)
print(y)
print(z)
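
A common use of `zip()` is to pair up two lists and turn the pairs into a dictionary. The sketch below uses made-up lists named `fruits` and `colors`. Passing the zipped pairs to `dict()` treats the first item of each pair as the key and the second as the value.

```python
# Hypothetical example lists of equal length
fruits = ['apple', 'banana', 'lime']
colors = ['red', 'yellow', 'green']

# Build a dictionary from the (fruit, color) pairs
color_of = dict(zip(fruits, colors))
print(color_of)

# zip() also works directly in a for loop, with no list conversion needed
for fruit, color in zip(fruits, colors):
    print(fruit, 'is', color)
```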

## Iterators

We used the term iterator in the previous discussion of the `zip()` function and in the notebook on Combinations and Permutations, but without providing a definition. An iterator is an object that returns successive items in a data stream and is exhausted when the last element has been returned. Iterators can improve performance and reduce memory usage since elements in the sequence are only generated as needed.

As we illustrate below, once an item in an iterator is accessed or used, it is no longer available.

In [None]:
t = zip(list1, list2, list3)
print(list(t))
print(list(t))

Although you'll rarely need to do this, the next element in an iterator can be obtained using the `next()` function or the `__next__()` method.

In [None]:
t = zip(list1, list2, list3)
print(next(t))
print(next(t))
print(t.__next__())
print(t.__next__())
print(t.__next__())
print(list(t))
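
What happens when you call `next()` on an iterator that has nothing left? The sketch below builds an iterator from a small made-up list with the built-in `iter()` function and runs it past the end. Python signals exhaustion with a `StopIteration` exception, unless you give `next()` a default value to return instead.

```python
letters = ['a', 'b']

# A list is not an iterator itself, but iter() creates one from it
it = iter(letters)
print(next(it))               # a
print(next(it))               # b

# With a default, next() returns it instead of raising an error
print(next(it, 'done'))       # done

# Without a default, an exhausted iterator raises StopIteration
try:
    next(it)
except StopIteration:
    print('The iterator is exhausted')

# The original list is untouched
print(letters)
```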

## Assignments, copies, deep copies and shallow copies

One of the biggest stumbling blocks in Python is dealing with the quirks of copying lists, dictionaries, sets and other collections of items. Let's start with a simple example.

In [None]:
a = ['apple', 'banana', 'orange']
print('List a:', a)
b = a
b[0] = 'lemon'
print('List a:', a)

### What just happened?

The assignment `b = a` does not create a new list. Both names refer to the same list object in memory, so `b` is simply an alias, or another name, for `a`.

Why would the Python developers implement things this way, isn't that kind of crazy and counterintuitive? Turns out that this behavior gives experienced programmers a lot of control, especially when working with more complex data structures.
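
If you are ever unsure whether two names refer to the same object, the `is` operator can tell you. The sketch below compares an alias with a separately created list that happens to hold the same items. Note the difference between `is` (same object) and `==` (same contents).

```python
a = ['apple', 'banana', 'orange']
b = a                                  # alias: another name for the same list
c = ['apple', 'banana', 'orange']      # a separate list with equal contents

print(b is a)     # True: same object
print(c is a)     # False: different objects
print(c == a)     # True: the contents match

# Changing the list through the alias is visible under both names
b.append('lime')
print(a)
print(c)          # unaffected
```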

To get an independent copy of the list we can use the `copy()` function from the `copy` module.

In [None]:
import copy
a = ['apple', 'banana', 'orange']
print('List a:', a)
b = copy.copy(a)
b[0] = 'lemon'
print('List a:', a)

For nested objects, such as lists of lists, the behavior is more complex. The `copy()` function only resolves the first level of references and we need to use a `deepcopy()` if we want to recursively copy the entire object.

In [None]:
a = ['apple', 'banana', 'orange', ['lime', 'peach']]
print(a)
b = copy.copy(a)
b[0] = 'cherry'
b[3][0] = 'mango'
print(a)

In [None]:
a = ['apple', 'banana', 'orange', ['lime', 'peach']]
print(a)
b = copy.deepcopy(a)
b[0] = 'cherry'
b[3][0] = 'mango'
print(a)
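
The difference between the two kinds of copy can also be checked directly with the `is` operator, which is true only when two names refer to the very same object. This sketch uses a shortened version of the list above and made-up names `shallow` and `deep`.

```python
import copy

a = ['apple', ['lime', 'peach']]
shallow = copy.copy(a)
deep = copy.deepcopy(a)

# Both copies are new outer lists
print(shallow is a)           # False
print(deep is a)              # False

# The shallow copy still shares the inner list with the original
print(shallow[1] is a[1])     # True

# The deep copy has its own inner list
print(deep[1] is a[1])        # False
```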

What if we want to do something really clever, such as creating a copy of an object that is independent only at the first two levels of list nesting? See the example below. Don't worry, this is not on the test ;)

In [None]:
aaa = ['a', ['b','c'], [['d','e'], ['f','g']]]
print(aaa)

bbb = ['']*3
for i, x in enumerate(aaa):
    bbb[i] = copy.copy(x)
    
bbb[0] = 'X'
bbb[1][0] = 'Y'
bbb[2][0][0] = 'Z'

print(aaa)

## List comprehension

Python's list comprehension functionality makes it very easy to create and populate a new list. We'll show an example first using the loop syntax and then demonstrate how much easier it is to do with list comprehension.

While this isn't something that you absolutely need to know, the technique is commonly used and will make your code easier to read. It's also very Pythonic!

In [None]:
# Populate a list of squares using a loop
squares = []
for x in range(10):
    squares.append(x*x)    
squares

In [None]:
# Populate a list of squares using list comprehension (very Pythonic)
squares = [x*x for x in range(10)]
squares

The general form of list comprehension is **[transformation iterator filter]**. In the example below, we use a more complex function and limit the list to values that are divisible by 3.

In [None]:
squares = [x*x + x - 2 for x in range(10) if (x*x + x - 2)%3 == 0]
squares
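
To see what the filtered comprehension is doing, it can help to write it out as an ordinary loop. The sketch below builds the same list both ways and compares them. The names `values` and `comprehension` are made up for this example.

```python
# Loop version: transformation, iterator and filter written out step by step
values = []
for x in range(10):              # iterator
    y = x*x + x - 2              # transformation
    if y % 3 == 0:               # filter
        values.append(y)

# Comprehension version of the same thing
comprehension = [x*x + x - 2 for x in range(10) if (x*x + x - 2) % 3 == 0]

print(values)                    # [0, 18, 54]
print(values == comprehension)   # True
```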

### One more thing about the range function

Like many Python functions, range can be called with different numbers of arguments.

+ range(n) --> integers 0 through n-1
+ range(m,n) --> integers m through n-1
+ range(m,n,p) --> integers m through n-1 with stride p

In [None]:
list(range(2,20,3))
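
Here is a quick side-by-side sketch of the three forms. The last line goes slightly beyond the list above: the stride can also be negative, in which case the sequence counts down and stops before reaching the second argument.

```python
one_arg = list(range(5))              # 0 through 4
two_args = list(range(2, 6))          # 2 through 5
three_args = list(range(2, 20, 3))    # start at 2, step by 3, stop before 20
countdown = list(range(10, 0, -2))    # negative stride counts down

print(one_arg)        # [0, 1, 2, 3, 4]
print(two_args)       # [2, 3, 4, 5]
print(three_args)     # [2, 5, 8, 11, 14, 17]
print(countdown)      # [10, 8, 6, 4, 2]
```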

## Map

The map built-in can be used to apply a function to all elements of an iterable. Note that map returns an iterator and will need to be converted to a list to use in a list context.

In [None]:
x = [0.0, -0.25, 0.5, -0.75, 1.0]

In [None]:
y = abs(x) # This does not do what you would hope and produces an error

In [None]:
y = map(abs, x) # Instead use map() which applies function (abs) to list (x)
print(list(y))
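
If you read the section on list comprehension, you may have noticed that it can do the same job. The sketch below applies `abs` to the list both ways. Which one to use is largely a matter of taste. The names `y_map` and `y_comp` are made up for this comparison.

```python
x = [0.0, -0.25, 0.5, -0.75, 1.0]

# Using map(): returns an iterator, so convert it to a list
y_map = list(map(abs, x))

# Using a list comprehension: produces a list directly
y_comp = [abs(v) for v in x]

print(y_map)
print(y_map == y_comp)        # True
```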

### Lambda functions

Python supports lambda functions. We won't get into the theory, but think of a lambda as a way to declare a nameless function that can be used in certain contexts to simplify your code.

In [None]:
z = [2,3,5,7,11]
y = list(map(lambda a: a**2 + a**3, z))
print(y)

Lambda functions can take multiple arguments.

In [None]:
z = [2,3,5,7,11]
w = [1,2,4,6,10]
y = list(map(lambda a, b: a**2 + a**3 + b, z, w))
print(y)
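
A lambda is just a compact way of writing a small function. The sketch below defines the same calculation with `def` (using a made-up name, `square_plus_cube`) and with `lambda`, and confirms that `map()` gives the same result either way.

```python
# An ordinary named function
def square_plus_cube(a):
    return a**2 + a**3

z = [2, 3, 5, 7, 11]

from_def = list(map(square_plus_cube, z))
from_lambda = list(map(lambda a: a**2 + a**3, z))

print(from_def)                   # [12, 36, 150, 392, 1452]
print(from_def == from_lambda)    # True
```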

## Making printing prettier

We introduced the `print()` function in the very first notebook since we needed some way to generate output beyond the clumsy default method of dumping the contents of the last object listed in a cell. So far, this has worked out pretty well. But consider the following example, which generates some rather ugly output.

In [None]:
names = ['Bob', 'Ovie', 'Trang', 'Tony', 'Timothy', 'Ange']
x = range(1,7)
x7 = [x**7 for x in range(1,7)]

for a, b, c in zip(names, x, x7):
    print(a, b, c)

Fortunately, the `str.format()` method makes the output a little more appealing using the following syntax

`'string containing placeholders'.format(items to be printed)`

The placeholders are of the form `{position:specification}` and the number of placeholders matches the number of items to be printed. The specification can include the field width and the data type. Fortunately, we just need to know a few type codes.

+ integer: d
+ floating point number: f
+ string: s

This is still a bit abstract, so let's consider a concrete example.

In [None]:
for a, b, c in zip(names, x, x7):
    print('{0:8s} {1:3d} {2:10d}'.format(a, b, c))

Let's dissect our print statement

`print('{0:8s} {1:3d} {2:10d}'.format(a, b, c))`

+ `{0:8s}` renders the 1st argument as a string in a field of width 8

+ `{1:3d}` renders the 2nd argument as an integer in a field of width 3

+ `{2:10d}` renders the 3rd argument as an integer in a field of width 10

+ `.format(a, b, c)` provides the three arguments

+ The formatted string is passed as a single argument to `print()`




We'll go one step further and show how to render floating point numbers using the notation `m.nf` where `m` is the minimum total width of the field and `n` is the number of places after the decimal point.

In [None]:
for a, b, c in zip(names, x, x7):
    print('{0:8s} {1:3d} {2:10d} {3:12.8f}'.format(a, b, c, b/c))
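
You will also see the same format specifications used in "f-strings", which put the variable name directly inside the braces. The sketch below assumes Python 3.6 or newer, where f-strings are available, and uses made-up variables `name`, `n` and `power`.

```python
name, n, power = 'Trang', 3, 3**7

# str.format() with positional placeholders
old_style = '{0:8s} {1:3d} {2:10d}'.format(name, n, power)

# f-string: same specifications, variable names inside the braces
new_style = f'{name:8s} {n:3d} {power:10d}'

print(old_style)
print(new_style)
print(old_style == new_style)     # True
```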

## File I/O

In this section we'll cover some of the basics of writing to and reading from a file.

### Writing to a file

Until now, we've been displaying output to the Jupyter notebook - in computer speak, we've been writing to standard output (stdout). While this is great when working with small amounts of text that we want to inspect interactively, there are times when we want to save the data to a file for later use or processing. One simple way to do this is to use

+ `fout = open(filename, 'w')` to create a file handle (`'w'` means write)
+ `fout.write(s)` to write string `s` to the file
+ `fout.close()` to close the file handle

In [None]:
fout = open('x_tothe_7.txt', 'w')
for a, b, c in zip(names, x, x7):
    s = '{0:8s} {1:3d} {2:10d}\n'.format(a, b, c)
    fout.write(s)
fout.close()

### What's with the `\n` in the string?

The `print()` function automatically adds a newline character to the output, but the `write()` method does not. To advance to the next line, we need to explicitly add the newline character `\n`.

### Reading from a file

Once we understand how to work with file handles, reading from a file is straightforward. We just need to keep a few things in mind

+ We open the file handle using `'r'` for reading

+ The Pythonic way to read a file line-by-line is to use the syntax `with open(...) as fh`. This avoids potential problems such as forgetting to close the file handle.

+ The lines of the file are read in as strings. You'll normally need to split the line into fields and convert to the appropriate types.

In [None]:
with open('x_tothe_7.txt', 'r') as fin:
    for line in fin:
        a, b, c = line.split()
        print('{0:8s} {1:3d} {2:10d}'.format(a, int(b), int(c)))
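
The `with` syntax works for writing as well as reading. Here is a minimal sketch that writes a small table and reads it back. To avoid overwriting anything on your computer it works in a temporary directory from the standard `tempfile` module, and the file name `squares.txt` is made up for this example.

```python
import os
import tempfile

# Work in a scratch directory that is removed automatically afterwards
with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, 'squares.txt')    # hypothetical file name

    # The file is closed automatically when the with block ends
    with open(path, 'w') as fout:
        for n in range(1, 4):
            fout.write('{0:3d} {1:5d}\n'.format(n, n*n))

    # Read the lines back and split each one into fields
    with open(path, 'r') as fin:
        rows = [line.split() for line in fin]

print(fout.closed)    # True: no explicit close() was needed
print(rows)           # [['1', '1'], ['2', '4'], ['3', '9']]
```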

### More I/O

We've only gotten a glimpse of Python's I/O capabilities. We can append to a file, read an entire file in one fell swoop, operate on csv (comma separated value) files and work with binary data - the computer's internal representation of data. We'll illustrate a few of these capabilities below.

In [None]:
# Read an entire file in one fell swoop
# Note that we'll often want to split into lines
# and then further split the lines into fields

with open('x_tothe_7.txt', 'r') as fin:
    contents = fin.read().split('\n')
    
print(contents[0])
print(contents[1])
print(contents[2])

In [None]:
fout = open('sample.txt', 'w')
fout.write("This is the first line\n")
fout.write("This is the second line\n")
fout.close()

with open('sample.txt', 'r') as fin:
    contents = fin.read()
print(contents)

fout = open('sample.txt', 'a+') # Append mode, create if does not exist
fout.write("This is the third line\n")
fout.write("This is the fourth line\n")
fout.close()

with open('sample.txt', 'r') as fin:
    contents = fin.read()
print(contents)
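
For csv files, Python's standard library includes a `csv` module that handles the commas for you. The following is a minimal sketch, not a complete treatment. The records, column names and file name are made up for illustration, and the file is written to a temporary directory so that nothing on your computer is overwritten.

```python
import csv
import os
import tempfile

# Hypothetical records: name, x, x to the 7th power
records = [('Bob', 1, 1), ('Ovie', 2, 128), ('Trang', 3, 2187)]

with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, 'x_tothe_7.csv')   # hypothetical file name

    # newline='' is what the csv module documentation recommends when opening files
    with open(path, 'w', newline='') as fout:
        writer = csv.writer(fout)
        writer.writerow(['name', 'x', 'x7'])       # header row
        writer.writerows(records)

    with open(path, 'r', newline='') as fin:
        reader = csv.reader(fin)
        header = next(reader)                      # first row is the header
        # Every field is read as a string, so convert the numbers back
        rows = [(name, int(x), int(x7)) for name, x, x7 in reader]

print(header)
print(rows)
```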

### Exercises

Create a list of strings, a list of integers and (optionally) a list of floats that all contain the same number of elements. Be sure to use a variety of string lengths and integers that span multiple orders of magnitude.

Write the contents of the lists to a file using Python's formatting features so that the fields line up neatly. Read the file contents line-by-line and in one fell swoop. Then open the file in append mode, add a few more lines and confirm that things worked as expected.

Finally, try breaking your code and see what happens. For example, introduce a mismatch between the number of placeholders and arguments when using the `format()` method, or use format specifiers that are too small (e.g. `5d` for integers greater than 99,999).