# Python Programming: Data Structures and Classes

In the previous lecture notebook we quickly covered all of the important
control structures available in Python, and how to define and use
functions in Python.

In this notebook we begin to look at some of the power of the high-level
language constructs Python provides by looking at some of the built-in
basic data types.  

We will also cover the basics of defining and using classes and objects and
doing object-oriented programming in the Python language.

# Strings

Strings are not like numerical values.  In Python, strings (the `str` type)
are a first-class data type of the language, unlike in some other languages.
Strings are a fundamental type.

A string is a type of **sequence**, which means it is an ordered collection
of values.  Specifically, a string in Python is an ordered collection of characters.

## A string is a sequence

You can access individual characters in the string sequence using an indexing
operator (all sequences in Python provide indexing and slicing operations, 
as we will discuss).  Indexing is 0-based in Python, the first element is found
at index 0.

In [1]:
fruit = 'banana'


In [2]:
# first character is at index 0
fruit[0]

'b'

In [3]:
# character at index 2 is the 3rd character, an n
letter = fruit[2]
letter

'n'

You can use expressions that evaluate to an index to access characters of
a sequence like a string.

In [4]:
length = 6 # banana has 6 characters
last_char = fruit[length - 1]
last_char

'a'

In [5]:
mid = length // 2
mid_char = fruit[mid]
mid_char

'a'

## `len`

`len()` is a built-in function that returns the number of characters in
a string (or in general, the number of items in any sequence container).

In [6]:
len(fruit)

6

To get the last letter of the string you might try this:

In [7]:
length = len(fruit) # get the length programmatically instead of hard coding it

try:
    last_char = fruit[length]
except IndexError:
    print('IndexError because valid indexes are from 0 to 5, but length is 6')
    print('   which is an invalid index.')


IndexError because valid indexes are from 0 to 5, but length is 6
   which is an invalid index.


So to get the last character you actually need the character at `length - 1`
index.

In [8]:
last_char = fruit[length - 1]
last_char

'a'

## Traversal with a `for` loop

Another way to iterate in Python is with `for` loops.  But unlike some languages,
`for` loops are specifically designed to iterate over the items in a sequence.
Since the items in a string are characters, if you iterate over a string you will
get the individual characters of the string.

In [9]:
for letter in fruit:
    print(letter)

b
a
n
a
n
a


You can compare this to manipulating an explicit index variable `index` to access
the characters in a string.

In [10]:
index = 0
while index < len(fruit):
    letter = fruit[index]
    print(index, letter)
    index = index + 1

0 b
1 a
2 n
3 a
4 n
5 a


It is considered bad style to use an index-controlled loop like this to access
the elements of a sequence.  If you need both the index number of the
item in the sequence and the item itself, use the built-in `enumerate()` function,
which enumerates over the items of a sequence, returning the index and the item
as pairs, like this.

In [11]:
for index, letter in enumerate(fruit):
    print(index, letter)

0 b
1 a
2 n
3 a
4 n
5 a
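
As a small variation on the loop above, `enumerate()` also accepts an optional
`start` argument, which can be handy when you want numbering that begins at 1
instead of 0.  This is only an illustrative sketch; the `numbered` list is a name
introduced here for the example.

```python
fruit = 'banana'

# collect (position, letter) pairs, numbering positions from 1 instead of 0
numbered = []
for position, letter in enumerate(fruit, start=1):
    numbered.append((position, letter))

print(numbered[0])   # the first pair
print(numbered[-1])  # the last pair
```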


## String slices

A segment of a string is called a slice.  

**ALERT**: understanding slicing is very important for this class.  We use slicing
of sequences in all kinds of contexts. 

You should learn all of the ways you can slice sequences, because we will make heavy
use of slicing syntax in this class.  Many things we use in this class are
sequences or support the same slicing syntax, like lists, NumPy arrays, pandas DataFrames, etc.

In [12]:
s = 'Monty Python'

In [13]:
# slice from the beginning to the 5th index (up to but not including index 5)
s[0:5]

'Monty'

In [14]:
# if you omit the beginning index of a slice, 0 is assumed
s[:5]

'Monty'

In [15]:
# slice from the middle of a sequence to the end
s[6:12]

'Python'

In [16]:
# if you omit the end index, all characters to the end are assumed
s[6:]

'Python'

In [17]:
# arbitrary slice of characters from index 3 up to (but not including)
# index 8 in the middle of the sequence
s[3:8]

'ty Py'

In [18]:
# given what we just demonstrated, what does the following mean?
s[:]

'Monty Python'

In [19]:
# you can provide a 3rd parameter for a slice, the step or skip
# get the characters at the even indexes (0, 2, 4, ...) of the string
s[::2]

'MnyPto'

In [20]:
# get the characters at the odd indexes, starting at index 1
s[1::2]

'ot yhn'

In [21]:
# negative indexes are relative to the end of the sequence,
# get the last character
s[-1]

'n'

In [22]:
# get the last 2 characters
s[-2:]

'on'

In [23]:
# reverse the string, go from last to first character with a step of -1
s[::-1]

'nohtyP ytnoM'
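
To tie the slicing rules above together, here is a short sketch of a few
properties that follow from them.  The helper `is_palindrome` and the variable
names are made up for this illustration; they are not part of the original notebook.

```python
s = 'Monty Python'

# splitting at any index n and rejoining the two slices gives back the original
n = 5
rejoined = s[:n] + s[n:]

# slices are forgiving: an end index past the end is clipped rather than an error
clipped = s[6:100]

def is_palindrome(word):
    """Return True if word reads the same forwards and backwards."""
    return word == word[::-1]

print(rejoined == s)
print(clipped)
print(is_palindrome('racecar'), is_palindrome('banana'))
```

Note the contrast with plain indexing: `s[100]` would raise an `IndexError`, while
the slice `s[6:100]` simply stops at the end of the string.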

## Strings are immutable

Strings are **immutable**.  This means you can't actually change
the characters of a string in place.   The other immutable data structure
we will use a lot is a **tuple**, which we will describe briefly later.

What does it mean that strings are immutable?  Well, you can't do something 
like this:

In [24]:
greeting = 'Hello, world!'

try:
    greeting[0] = 'J'
except TypeError:
    print('TypeError is generated because it is illegal to try and change (mutate)')
    print('   the individual characters of a string sequence.')

TypeError is generated because it is illegal to try and change (mutate)
   the individual characters of a string sequence.


If you need to modify a string, the best you can do is create a new
string.

In [25]:
new_greeting = 'J' + greeting[1:]
new_greeting

'Jello, world!'
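
Because strings are immutable, string methods that appear to modify a string
actually return a new one.  The sketch below shows two other ways one might
build the modified greeting; it is just one possible approach, and the names
`chars` and `joined` are introduced here for illustration.

```python
greeting = 'Hello, world!'

# replace() builds and returns a new string; the count of 1 limits it to the first match
new_greeting = greeting.replace('H', 'J', 1)

# another approach: convert to a (mutable) list, edit it, then join back into a string
chars = list(greeting)
chars[0] = 'J'
joined = ''.join(chars)

print(greeting)      # the original string is unchanged
print(new_greeting)
print(joined)
```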

## Operations on strings

There are operations that can be done on strings, like operations on numeric
types.  In another language, we might say that operations `+` and `*` 
are "overloaded" for the string type in the Python language.

We just saw the operator `+` for strings.  Plus will concatenate strings.

In [26]:
# string concatenation
s = "Hello" + " " + "World!"
s

'Hello World!'

The `*` operator on strings performs string repetition.  One operand must be
the string and the other an integer repetition count; by convention the count
is usually written on the right hand side of the `*` operator.

In [27]:
# example of string repetition and concatenation together
x = 'Spam ' * 10 + "Spamity spam, spamity spam!"
x

'Spam Spam Spam Spam Spam Spam Spam Spam Spam Spam Spamity spam, spamity spam!'
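
A brief sketch of string repetition in a few more situations.  The variable
names here are only for illustration.

```python
# the integer count may appear on either side of the * operator
left = 3 * 'ab'
right = 'ab' * 3

# a common use: build a separator line for text output
separator = '-' * 20

# a count of zero gives the empty string
empty = 'ab' * 0

print(left, right)
print(separator)
print(repr(empty))
```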

## String methods

Although in many ways strings are fundamental types in Python, it is more
correct to think of strings as instances of objects.

So an instance of a string has many member functions (methods) that can be called to 
manipulate the string.  This is true of all of the rest of the high-level 
data types we will look at as well in the following sections.

We can use the built-in function `dir()` to list all of the methods
available for a string.  The documentation of the
[Python string methods](https://docs.python.org/3/library/stdtypes.html#string-methods)
describes each of them.

In [28]:
s = 'A string'
dir(s)

['__add__',
 '__class__',
 '__contains__',
 '__delattr__',
 '__dir__',
 '__doc__',
 '__eq__',
 '__format__',
 '__ge__',
 '__getattribute__',
 '__getitem__',
 '__getnewargs__',
 '__gt__',
 '__hash__',
 '__init__',
 '__init_subclass__',
 '__iter__',
 '__le__',
 '__len__',
 '__lt__',
 '__mod__',
 '__mul__',
 '__ne__',
 '__new__',
 '__reduce__',
 '__reduce_ex__',
 '__repr__',
 '__rmod__',
 '__rmul__',
 '__setattr__',
 '__sizeof__',
 '__str__',
 '__subclasshook__',
 'capitalize',
 'casefold',
 'center',
 'count',
 'encode',
 'endswith',
 'expandtabs',
 'find',
 'format',
 'format_map',
 'index',
 'isalnum',
 'isalpha',
 'isascii',
 'isdecimal',
 'isdigit',
 'isidentifier',
 'islower',
 'isnumeric',
 'isprintable',
 'isspace',
 'istitle',
 'isupper',
 'join',
 'ljust',
 'lower',
 'lstrip',
 'maketrans',
 'partition',
 'replace',
 'rfind',
 'rindex',
 'rjust',
 'rpartition',
 'rsplit',
 'rstrip',
 'split',
 'splitlines',
 'startswith',
 'strip',
 'swapcase',
 'title',
 'translate',
 'upper',
 'zfill']


Here are just a few examples of string methods.

In [29]:
s.upper()

'A STRING'

In [30]:
s.find('s')

2

In [31]:
s.find('ing')

5

In [32]:
s.capitalize()

'A string'

In [33]:
s.split()

['A', 'string']

In [34]:
s = '  \t   This string has lots of space, before and after it   \t\t\n'
s.strip()

'This string has lots of space, before and after it'
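
String methods are often combined.  As a minimal sketch, `split()` and `join()`
together can normalize the whitespace inside a string; the `messy` example
string is made up for this illustration.

```python
messy = '  \t   This   string has \t lots of   space   \n'

# split() with no arguments splits on any run of whitespace
words = messy.split()

# join() is called on the separator string and glues the pieces back together
normalized = ' '.join(words)

print(words)
print(normalized)
```

Since strings are immutable, `messy` itself is left untouched; both calls return new objects.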

## The `in` operator

The word `in` is a boolean operator that takes a value and a sequence
and returns `True` if the value appears as an item in the sequence.

For string types, `in` returns true if the value is a substring of the string.

In [35]:
'nana' in 'banana'

True

In [36]:
'seed' in 'apple'

False

In [37]:
'seed' in 'birdseed and apple sauce'

True

## String comparison

Besides the `+` and `*` operations, all of the comparison (relational) operators are
defined for strings.  They compare strings character by character, which
matches alphabetical ordering as long as the letters have the same case.

In [38]:
'apple' < 'banana'

True

In [39]:
'grape' <= 'granite'

False

In [40]:
'apple' == 'orange'

False

In [41]:
'banana' == 'banana'

True

In [42]:
# but all uppercase letters come before all lowercase letters, so you can be 
# surprised if you don't know this
'banana' < 'apple'

False

In [43]:
'Banana' < 'apple'

True
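
If the uppercase-before-lowercase ordering is not what you want, one common
workaround is to compare lowercased copies of the strings.  The following is a
sketch of that idea, with an example list invented for the purpose.

```python
fruits = ['banana', 'Cherry', 'apple']

# default ordering: the capitalized word sorts first
default_order = sorted(fruits)

# compare lowercased copies instead, so case is ignored
ignore_case = sorted(fruits, key=str.lower)

# the ordering comes from the character codes: 'B' is 66 and 'a' is 97
codes = (ord('B'), ord('a'))

print(default_order)
print(ignore_case)
print('Banana'.lower() < 'apple'.lower())
```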

### Self-test Exercises

# Lists

## A list is a sequence

A list is also a sequence, like a string.  However, a list differs in two
fundamental ways.

1. A list is a sequence of arbitrary types/values, unlike a string which is a sequence
of characters.
2. A list is **mutable**, which means unlike a string you can modify the elements
of a list in place.

Lists look kind of like arrays if you are familiar with languages like
C or Java.  But they are really a much higher-level abstract data type
than a simple array.  It is best not to think of a list as a simple array.

- Arrays are homogeneous, they only hold values of a single type.  Python 
lists are nonhomogeneous, they can hold values of many different types.
- Arrays are fixed size, they cannot easily grow and shrink.  Python lists
are dynamically resizable, they can grow and shrink as needed.

We create lists using the `[ ]` syntax.

In [44]:
# a list of integers
ivals = [10, 20, 30, 40]

# a list of strings
flavors = ['crunchy frog', 'ram bladder', 'lark vomit']

# but did we mention Python lists are nonhomogeneous
stuff = ['Gouda', 42, True, 4.2e-25, 2+3j, 'end of list']

A value in a list can even be another list (i.e. a nested list), which can
be very useful for building complex data structures at times.

In [45]:
l = ['spam', 2.0, 5, [10, 20, 30, 40]]
l

['spam', 2.0, 5, [10, 20, 30, 40]]

Lists are sequences, so all of the things we talked about you could do with a 
sequence like a string, you can do with a list.

In [46]:
# index and slice list
l[-1]

[10, 20, 30, 40]

In [47]:
l[1:3]

[2.0, 5]

In [48]:
# iterate over a list
for index, element in enumerate(stuff):
    print(index, element)

0 Gouda
1 42
2 True
3 4.2e-25
4 (2+3j)
5 end of list


## Lists are mutable

But lists are mutable, so we can change the items in a list.

In [49]:
cheeses = ['Cheddar', 'Edam', 'Gouda']

In [50]:
cheeses[0] 

'Cheddar'

In [51]:
cheeses[0] = 'Danish Blue'
cheeses[-1] = 'Stilton'
cheeses

['Danish Blue', 'Edam', 'Stilton']

## Traversing a list

You should use the `for` loop construct to iterate over elements
of a Python list.  We showed one example above.  

When you iterate over a nested list, you only get the top level elements, i.e.
any nested list will be extracted as a single element.  An example should 
make this clearer.

In [52]:
# l is  a nested list defined before
l

['spam', 2.0, 5, [10, 20, 30, 40]]

In [53]:
# if we iterate over l, the last element is the whole final list
for element in l:
    print(element)

spam
2.0
5
[10, 20, 30, 40]


In [54]:
# if you want to iterate over the nested elements, well...
for element in l:
    print(element)
    if isinstance(element, list):
        print("now displaying the individual elements of the nested list")
        for nested_element in element:
            print(nested_element)

spam
2.0
5
[10, 20, 30, 40]
now displaying the individual elements of the nested list
10
20
30
40
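
The loop above only handles one level of nesting.  If lists can be nested to
an arbitrary depth, one possible approach is a small recursive helper.  The
function `flatten` below is a hypothetical sketch written for this example.

```python
def flatten(items):
    """Return a flat list of all the non-list elements of items, in order."""
    flat = []
    for element in items:
        if isinstance(element, list):
            # recurse into the nested list and add its flattened contents
            flat.extend(flatten(element))
        else:
            flat.append(element)
    return flat

nested = ['spam', 2.0, 5, [10, 20, [30, 40]]]
flat = flatten(nested)
print(flat)
```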


## List operations

`+` and `*` operators are also defined for list, and they do the same thing
as for strings.  `+` concatenates lists.

In [55]:
a = [1, 2, 3, 4]
b = [5, 6, 6, 8]

c = a + b
c

[1, 2, 3, 4, 5, 6, 6, 8]

And the `*` operator repeats a list.

In [56]:
c = a * 4
c

[1, 2, 3, 4, 1, 2, 3, 4, 1, 2, 3, 4, 1, 2, 3, 4]

In [57]:
# create a bag of 10 red, green and blue balls for an experiment
balls = ['red', 'green', 'blue']
bag_of_balls = balls * 10
bag_of_balls

['red',
 'green',
 'blue',
 'red',
 'green',
 'blue',
 'red',
 'green',
 'blue',
 'red',
 'green',
 'blue',
 'red',
 'green',
 'blue',
 'red',
 'green',
 'blue',
 'red',
 'green',
 'blue',
 'red',
 'green',
 'blue',
 'red',
 'green',
 'blue',
 'red',
 'green',
 'blue']
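
One thing to watch for with list repetition: `*` repeats references to the
same items, which matters when the items are themselves mutable lists.  The
sketch below illustrates this with made-up names (`grid_shared`, `grid_independent`).

```python
row = [0, 0, 0]

# both entries of grid_shared refer to the very same row object
grid_shared = [row] * 2
grid_shared[0][0] = 99

# building each row separately gives independent rows
grid_independent = []
for _ in range(2):
    grid_independent.append([0] * 3)
grid_independent[0][0] = 99

print(grid_shared)       # the change shows up in both rows
print(grid_independent)  # the change shows up in the first row only
```

Repeating a list of immutable values, such as the strings in `balls`, does not have this problem.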

## List slices

As we already mentioned, lists are sequences and all sequences support
slicing syntax in Python.

In [58]:
t = ['a', 'b', 'c', 'd', 'e', 'f']

In [59]:
t[1:3]

['b', 'c']

In [60]:
t[:4]

['a', 'b', 'c', 'd']

In [61]:
t[3:]

['d', 'e', 'f']

In [62]:
t[::-1]

['f', 'e', 'd', 'c', 'b', 'a']

In [63]:
t[1:5:3]

['b', 'e']
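
Because lists are mutable, slices of a list can do two things that string
slices cannot usefully do: make an independent copy, and appear on the left
hand side of an assignment.  A minimal sketch, using a fresh list so the
examples in this section are unaffected:

```python
letters = ['a', 'b', 'c', 'd', 'e', 'f']

# a full slice makes a new (shallow) copy of the list
copy_of_letters = letters[:]
copy_of_letters[0] = 'z'   # does not affect letters

# slice assignment replaces a segment, and the sizes need not match
letters[1:3] = ['X', 'Y', 'Z']

print(copy_of_letters)
print(letters)
```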

## List methods

A Python list is an instance of an object.  The Python documentation describes the
[common sequence operations](https://docs.python.org/3/library/stdtypes.html#common-sequence-operations)
and the
[mutable sequence operations](https://docs.python.org/3/library/stdtypes.html#mutable-sequence-types)
you can perform on a list.

A list is a mutable sequence, so any common sequence operation can be performed
on lists, and any mutable sequence operations as well.

We can get a list of the methods available to a list using the `dir()` built-in
function again.

In [64]:
dir(t)

['__add__',
 '__class__',
 '__contains__',
 '__delattr__',
 '__delitem__',
 '__dir__',
 '__doc__',
 '__eq__',
 '__format__',
 '__ge__',
 '__getattribute__',
 '__getitem__',
 '__gt__',
 '__hash__',
 '__iadd__',
 '__imul__',
 '__init__',
 '__init_subclass__',
 '__iter__',
 '__le__',
 '__len__',
 '__lt__',
 '__mul__',
 '__ne__',
 '__new__',
 '__reduce__',
 '__reduce_ex__',
 '__repr__',
 '__reversed__',
 '__rmul__',
 '__setattr__',
 '__setitem__',
 '__sizeof__',
 '__str__',
 '__subclasshook__',
 'append',
 'clear',
 'copy',
 'count',
 'extend',
 'index',
 'insert',
 'pop',
 'remove',
 'reverse',
 'sort']

Here are a few examples of using some of the methods you can perform on lists.

In [65]:
t.reverse()
t

['f', 'e', 'd', 'c', 'b', 'a']

In [66]:
unsorted = ['grape', 'banana', 'apple', 'orange', 'zuchini', 'strawberry', 'kiwi']
unsorted.sort()
unsorted

['apple', 'banana', 'grape', 'kiwi', 'orange', 'strawberry', 'zuchini']

In [67]:
unsorted.append('blueberry')
unsorted.append('rasberry')
unsorted

['apple',
 'banana',
 'grape',
 'kiwi',
 'orange',
 'strawberry',
 'zuchini',
 'blueberry',
 'rasberry']

In [68]:
unsorted.sort()
unsorted

['apple',
 'banana',
 'blueberry',
 'grape',
 'kiwi',
 'orange',
 'rasberry',
 'strawberry',
 'zuchini']
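
Note that `sort()` modifies the list in place and returns `None`.  If you want
to keep the original list and get a sorted copy, the built-in `sorted()`
function can be used instead.  A short sketch with an example list:

```python
fruits = ['grape', 'banana', 'apple']

# sorted() returns a new list and leaves the original alone
in_order = sorted(fruits)
print(fruits)
print(in_order)

# sort() reorders the list itself and returns None
result = fruits.sort()
print(result)
print(fruits)
```

A common mistake is to write `fruits = fruits.sort()`, which replaces the list with `None`.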

In [69]:
# using a list as a stack
stack = [] # initially empty

stack.append(5) # push
stack.append(3)
stack.append(7)
stack

[5, 3, 7]

In [70]:
top = stack.pop()
top

7

In [71]:
stack

[5, 3]

In [72]:
stack.pop()
top = stack.pop()
top

5

In [73]:
stack

[]

In [74]:
# using a list as a queue
queue = [] # initially empty
queue.append(5)
queue.append(3)
queue.append(7)

queue

[5, 3, 7]

In [75]:
front = queue.pop(0) # remove item 0, e.g. front of queue
front

5

In [76]:
queue

[3, 7]

In [77]:
queue.pop(0)
front = queue.pop(0)
front

7

In [78]:
queue

[]
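
Using `pop(0)` on a list works as a queue, but removing from the front of a
list requires shifting all of the remaining items.  For larger queues, the
standard library's `collections.deque` is usually a better fit.  Here is a
sketch of the same queue example using a `deque`:

```python
from collections import deque

# using a deque as a queue
queue = deque()  # initially empty
queue.append(5)
queue.append(3)
queue.append(7)

# popleft() removes and returns the item at the front of the queue
front = queue.popleft()
print(front)
print(list(queue))
```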

# Dictionaries

Dictionaries are a high-level abstract data type that you may or may not
be familiar with, but which are extremely useful to learn how to use.

Dictionaries are also known as **maps**, **associative arrays** or sometimes
as **hashes**.  They store collections of **key-value pairs**.

## A dictionary is a mapping

A dictionary is like a list but more general.  In a list, the
integer index is a key that maps from the index to the value at the
index for that list.  In a list the indices have to be integers.

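For the more general dictionary (the `dict` type), a **key** can be almost any type, not just an
integer, as long as it is hashable (immutable values such as strings and numbers qualify).
Each key is associated with a single value, just like each integer index
is associated with a single value in a list.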

However one difference, besides that the indexes can be arbitrary values, is
that since the keys don't necessarily have an inherent order, a dictionary is
not considered a sequence like a list.  

In Python, `{` and `}` curly braces are used to identify and define dictionaries.

In [79]:
# dictionary of birthdays
birthday = {'Newton': 1642, 
            'Darwin': 1809, 
            'Curie': 1867, 
            'Einstein': 1879, 
            'Harter': 1978}

In [80]:
# look up some birthdays
birthday['Newton']

1642

In [81]:
birthday['Einstein']

1879

If a key isn't in the dictionary you get an error, just like if you try to
access a list index that is not in the list.

In [82]:
try:
    birthday['Hopper']
except KeyError:
    print('KeyError is generated here which means the key is not in the dictionary.')

KeyError is generated here which means the key is not in the dictionary.


The `len()` function works on dictionaries, so we can find out the number of 
key-value pairs in the dictionary using this.

In [83]:
len(birthday)

5

The `in` operator works on dictionaries too; it tells you whether something is 
a key of the dictionary or not.

In [84]:
'Harter' in birthday

True

In [85]:
'Einstein' in birthday

True

In [86]:
'Hopper' in birthday

False
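
Besides testing with `in` or catching a `KeyError`, a dictionary lookup that
might fail can use the `get()` method, which returns a default value when the
key is missing.  A minimal sketch, using a smaller example dictionary:

```python
birthday = {'Newton': 1642, 'Darwin': 1809}

# get() returns None when the key is missing, instead of raising KeyError
missing = birthday.get('Hopper')

# a second argument supplies the default value to return instead
with_default = birthday.get('Hopper', 'unknown')

# for keys that are present, get() behaves like an ordinary lookup
found = birthday.get('Newton')

print(missing, with_default, found)
```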

## Dictionaries are mutable

Dictionaries are mutable, like lists.  In fact, we can easily add new
key-value pairs.

In [87]:
birthday['Hopper'] = 1912
birthday['Maxwell'] = 1875

birthday

{'Newton': 1642,
 'Darwin': 1809,
 'Curie': 1867,
 'Einstein': 1879,
 'Harter': 1978,
 'Hopper': 1912,
 'Maxwell': 1875}

In [88]:
len(birthday)

7

In [89]:
'Hopper' in birthday

True

## Dictionary methods

As with lists and strings, dictionaries are actually objects with a set of 
member methods that can be called on an instance of a dictionary.

[Dictionary member methods](https://docs.python.org/3/library/stdtypes.html#mapping-types-dict)

In [90]:
dir(birthday)

['__class__',
 '__contains__',
 '__delattr__',
 '__delitem__',
 '__dir__',
 '__doc__',
 '__eq__',
 '__format__',
 '__ge__',
 '__getattribute__',
 '__getitem__',
 '__gt__',
 '__hash__',
 '__init__',
 '__init_subclass__',
 '__iter__',
 '__le__',
 '__len__',
 '__lt__',
 '__ne__',
 '__new__',
 '__reduce__',
 '__reduce_ex__',
 '__repr__',
 '__setattr__',
 '__setitem__',
 '__sizeof__',
 '__str__',
 '__subclasshook__',
 'clear',
 'copy',
 'fromkeys',
 'get',
 'items',
 'keys',
 'pop',
 'popitem',
 'setdefault',
 'update',
 'values']

In [91]:
birthday.keys()

dict_keys(['Newton', 'Darwin', 'Curie', 'Einstein', 'Harter', 'Hopper', 'Maxwell'])

In [92]:
birthday.values()

dict_values([1642, 1809, 1867, 1879, 1978, 1912, 1875])

In [93]:
# iterate over the birthdays by key.
# Since Python 3.7, keys come out in the order in which they were inserted
for key in birthday.keys():
    print(key, birthday[key])

Newton 1642
Darwin 1809
Curie 1867
Einstein 1879
Harter 1978
Hopper 1912
Maxwell 1875


In [94]:
# actually, the default iteration of a dictionary is by the key, so
# if you want to iterate over the keys, you can use more simply
for key in birthday:
    print(key, birthday[key])

Newton 1642
Darwin 1809
Curie 1867
Einstein 1879
Harter 1978
Hopper 1912
Maxwell 1875


In [95]:
# we can iterate over the keys in sorted order, for example we could get the keys and
# sort them, then iterate over them
for key in sorted(birthday.keys()):
    print(key, birthday[key])

Curie 1867
Darwin 1809
Einstein 1879
Harter 1978
Hopper 1912
Maxwell 1875
Newton 1642


In [96]:
# items returns the key,value pairs together
for key, value in birthday.items():
    print(key, value)

Newton 1642
Darwin 1809
Curie 1867
Einstein 1879
Harter 1978
Hopper 1912
Maxwell 1875


## Dictionaries to count values

A common operation in data exploration is to get frequency counts, i.e. how
many times each distinct value of some data item occurs.  Dictionaries are well
suited for this task.

For example, let's say you need to determine the count of the characters in
a string.  We can use a dictionary and the `get()` method to do this easily.

In [97]:
text = """
Just some random text we want to count character frequencies of.

Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor 
incididunt ut labore et dolore magna aliqua. Ipsum consequat nisl vel pretium 
lectus quam. Id leo in vitae turpis massa sed. Hendrerit gravida rutrum quisque 
non tellus orci ac auctor augue. Maecenas volutpat blandit aliquam etiam. Nibh 
cras pulvinar mattis nunc sed blandit libero volutpat sed. At erat pellentesque 
adipiscing commodo elit at imperdiet dui. Urna nec tincidunt praesent semper 
feugiat nibh sed. Sed pulvinar proin gravida hendrerit. Enim neque volutpat ac 
tincidunt vitae semper quis lectus nulla. Aenean pharetra magna ac placerat 
vestibulum lectus mauris. Hendrerit dolor magna eget est. Sapien et ligula 
ullamcorper malesuada proin libero nunc consequat.

Integer malesuada nunc vel risus commodo viverra. Vestibulum morbi blandit 
cursus risus at. Eleifend quam adipiscing vitae proin. Vulputate ut pharetra 
sit amet aliquam id. Neque ornare aenean euismod elementum. Et egestas quis 
ipsum suspendisse ultrices gravida dictum fusce ut. Enim sit amet venenatis 
urna. Morbi quis commodo odio aenean sed adipiscing. Justo nec ultrices dui 
sapien eget mi proin. Euismod in pellentesque massa placerat duis ultricies 
lacus sed turpis. Adipiscing bibendum est ultricies integer quis auctor. Vehicula 
ipsum a arcu cursus vitae congue mauris rhoncus. Facilisis volutpat est velit 
egestas dui id ornare arcu odio. Duis at consectetur lorem donec massa sapien 
faucibus et. Placerat in egestas erat imperdiet sed euismod nisi porta lorem.
"""

In [98]:
def character_frequency(text):
    frequencies = {} # keep frequency counts in a dictionary
    
    # iterate over the given text
    for char in text:
        count = frequencies.get(char, 0) # get current count or 0 if haven't seen yet
        count += 1 # increment count by 1
        frequencies[char] = count # update dictionary
        
    # return the dictionary result
    return frequencies

In [99]:
# get the character count/frequencies
freq = character_frequency(text)

# get the key/value pairs so we can sort by the value frequency counts
table = list(freq.items())

# sort the table.  We provide a custom sort key using a lambda function
# The lambda function essentially says for each item in table, sort by the
# second item at index 1, thus we sort by the frequency value 
table.sort(key=lambda key_value: key_value[1], reverse=True)

table

[(' ', 235),
 ('e', 150),
 ('i', 126),
 ('a', 118),
 ('t', 106),
 ('u', 102),
 ('s', 101),
 ('r', 82),
 ('n', 82),
 ('o', 63),
 ('l', 59),
 ('c', 58),
 ('d', 57),
 ('m', 56),
 ('p', 41),
 ('.', 29),
 ('\n', 24),
 ('g', 22),
 ('v', 20),
 ('q', 18),
 ('b', 15),
 ('h', 8),
 ('f', 6),
 ('E', 5),
 ('I', 3),
 ('A', 3),
 ('V', 3),
 ('J', 2),
 ('w', 2),
 (',', 2),
 ('H', 2),
 ('M', 2),
 ('N', 2),
 ('S', 2),
 ('x', 1),
 ('L', 1),
 ('U', 1),
 ('F', 1),
 ('D', 1),
 ('P', 1)]
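
Counting with a dictionary is common enough that the standard library provides
`collections.Counter`, a dictionary subclass designed for it.  The sketch below
uses a short made-up string rather than the full text above.

```python
from collections import Counter

sample = 'banana bread'

# Counter builds the character -> count dictionary in one step
counts = Counter(sample)

# most_common(n) returns the n most frequent items as (value, count) pairs
top = counts.most_common(1)

print(counts['a'], counts['b'])
print(top)
```

Writing the loop by hand, as in `character_frequency()`, is still worthwhile for understanding how the counting works.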

# Tuples

A **tuple** is a sequence of values, so in many ways a tuple behaves like a list.
However, you cannot modify a tuple once created, unlike a list.

## Tuples are immutable

A tuple is a sequence of values of any type.  You can index the values
by an integer, and slice the tuple like a list.

However, tuples are immutable.  Why do we need an immutable list/sequence type
in Python?  Read on to find out.

Syntactically you can create a tuple as a list of comma separated
values.  Also, although it is usually not necessary syntactically in Python,
many people indicate tuples by enclosing them in `(` and `)` (unlike
a list where the square `[` and `]` are used and are actually necessary
syntactically).

So we can create a tuple like:

In [100]:
t1 = 'hello', 3, 3.14159, False, 42.42 

And although it is not necessary, you will commonly see the same tuple declared
like

In [101]:
t2 = ('hello', 3, 3.14159, False, 42.42)

Most list operators (sequence operators) work on a tuple

In [102]:
t2[0]

'hello'

In [103]:
t2[1:3]

(3, 3.14159)

In [104]:
t1[-1]

42.42

In [105]:
for t in t2:
    print(t)

hello
3
3.14159
False
42.42


But if you try to modify one of the elements of a tuple you get an error

In [106]:
try:
    t2[0] = 'goodbye'
except TypeError:
    print('TypeError generated because tuple object does not support item assignment.')

TypeError generated because tuple object does not support item assignment.


For a similar reason, methods like `append()` and `pop()` are not defined
for a tuple object as they would modify the tuple.

Also, as this may be useful for this class in places, the relational operators
work with tuples (and other sequences as well).  Python starts by comparing the
first element from each sequence.  If they are equal, it goes on to the next pair
of elements, and so on.

So our 2 tuples should evaluate as equal

In [107]:
t1 == t2

True

Or generally, the first value that differs in the tuple will determine 
the boolean relation.

In [108]:
(1, 2, 3) < (1, 2, 4)

True

In [109]:
(1, 2, 3) < (1, 2, 2)

False
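
This element-by-element comparison is what makes sorting a list of tuples work
the way you would expect.  The following sketch uses a few made-up records to
illustrate; the `records` list is not from the original notebook.

```python
records = [('Curie', 1867), ('Newton', 1642), ('Darwin', 1809)]

# tuples are compared by first element, then second, and so on
by_name = sorted(records)

# a key function can pick out a different element to sort on
by_year = sorted(records, key=lambda record: record[1])

# if one tuple is a prefix of the other, the shorter one is smaller
prefix_is_smaller = (1, 2) < (1, 2, 0)

print(by_name)
print(by_year)
print(prefix_is_smaller)
```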

## Tuple assignment

A tuple of variables on the left side of an assignment can receive the
individual elements of a tuple on the right side.  This type of assignment
between variables and items in a tuple is used extensively in Python and in
the libraries we use for this class.  For example:

In [110]:
begin, end = 1, 100
print(begin)
print(end)

1
100


In [111]:
(name, age, salary, attributes) = ('Derek', 42, 10.00, {'height': 67, 'weight': 165})

print(name)
print(attributes['height'])

Derek
67


In [112]:
# kind of a trick, but we can use tuple assignment to perform an in place swap of 
# values
a, b = 42, 13

# do the swap
a, b = b, a

print(a)
print(b)

13
42
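
Tuple assignment requires the number of variables to match the number of
values, unless one of the variables is starred, in which case it collects
whatever is left over into a list.  A short sketch of both behaviors, with
names chosen only for illustration:

```python
# a starred name collects the remaining items into a list
first, *rest = [1, 2, 3, 4]

# the starred name can also sit in the middle; any sequence works on the right
head, *middle, tail = 'abcde'

# without a starred name, a size mismatch raises a ValueError
try:
    x, y = (1, 2, 3)
    mismatch = False
except ValueError:
    mismatch = True

print(first, rest)
print(head, middle, tail)
print(mismatch)
```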


## Tuples as return values

Probably the most common use of this kind of tuple assignment is to handle the
return values from a function.

Strictly speaking, a function can only return one value.  But if you want a 
function to return a list of values in Python, you can instead return a
tuple (or a list), and thus you are returning only 1 "thing", but the container
contains multiple values.  

This is used extensively by library functions to return multiple results back to
the caller.  

As a quick example, the built-in `divmod()` function, that we have looked at before,
actually returns a tuple result.  It returns the integer quotient of the division, followed
by the remainder.  We can assign the whole result to a single variable, which then holds the tuple.

In [113]:
# 7 / 3 = 2 with a remainder of 1
res = divmod(7, 3)
res

(2, 1)

Or instead, we can pick out the individual elements returned and assign them into
more useful variable names as a result of calling the function.

In [114]:
quotient, remainder = divmod(7,3)
print('result of dividing 7 by 3: ', quotient)
print('remainder from the division: ', remainder)

result of dividing 7 by 3:  2
remainder from the division:  1


You can return multiple values from your own function by returning a
tuple.  For example, here is a function that returns the min and max values
from a list of values given to it.

In [115]:
def min_max(l):
    """Given a list of values, return the (min, max) tuple of the minimum and
    maximum values in the list.
    """
    return min(l), max(l)

In [116]:
# a list of integer values
l = [5, 3, 9, 2, 7, 6, 1, 8]

# call the function, get the tuple result
min_val, max_val = min_max(l)

print('minimum value in list:', min_val)
print('maximum value in list:', max_val)

minimum value in list: 1
maximum value in list: 9


## Lists and tuples

`zip()` is a built-in function that takes two or more sequences and
interleaves them. 

For example, `zip()` takes 2 tuples or lists of the same size, and this allows
you to iterate over corresponding pairs of objects.

In [117]:
# a string is a sequence of characters, so it is "list" like
s = 'abcd'

# a tuple of 4 items
t = (1, 2, 3, 4)

for pair in zip(s, t):
    print(pair)

('a', 1)
('b', 2)
('c', 3)
('d', 4)


In [118]:
# once again because of tuple assignment, we could extract the individual
# items of the pair while iterating over them
for char, value in zip(s, t):
    print(char, value)

a 1
b 2
c 3
d 4


You can turn the result of calling `zip` into a true list, which gives you a
list of tuples of corresponding elements.

Also, `zip` will actually work with 3 or more sequences.  The sequences should
normally be of the same length, because `zip` stops as soon as the shortest
one is exhausted.

In [119]:
names = ['alice', 'bob', 'carol']
ages = (42, 38, 29)
ids = (1, 2, 3)  # named ids rather than id, to avoid shadowing the built-in id() function

l = list(zip(ids, names, ages))
l

[(1, 'alice', 42), (2, 'bob', 38), (3, 'carol', 29)]
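
Two related points are worth a quick sketch: what `zip` does when the sequences
differ in length, and how zipped pairs can be turned into a dictionary.  The
data here is made up, and `itertools.zip_longest` is the standard library's
variant that pads the shorter sequence instead of stopping.

```python
from itertools import zip_longest

names = ['alice', 'bob', 'carol']
ages = (42, 38)  # deliberately one item short

# zip stops at the end of the shortest sequence
short = list(zip(names, ages))

# zip_longest keeps going and fills in the missing values
padded = list(zip_longest(names, ages, fillvalue=None))

# a sequence of (key, value) pairs can be passed straight to dict()
lookup = dict(zip(names, ages))

print(short)
print(padded)
print(lookup)
```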

If you combine `zip`, `for` and tuple assignment, you get a useful idiom
for traversing two (or more) sequences at the same time.  For example, `has_match`
takes two sequences, `t1` and `t2`, and returns `True` if there is an index
`i` such that `t1[i] == t2[i]`.

In [120]:
def has_match(t1, t2):
    """Determine if any corresponding element of sequence t1 matches t2.  If
    we find a matching element, we return a True result.  Otherwise the answer
    is false.
    """
    for x, y in zip(t1, t2):
        if x == y:
            return True
    return False

In [121]:
has_match((1, 2, 3), (4, 5, 3))

True

In [122]:
has_match( (1, 2, 3), (4, 5, 6))

False
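
The same test can be written more compactly with the built-in `any()`
function, which returns `True` as soon as one of the values it is given is
true.  This is an alternative sketch, not a replacement for the loop above;
the name `has_match_any` is introduced here for illustration.

```python
def has_match_any(t1, t2):
    """Return True if any corresponding elements of t1 and t2 are equal."""
    return any(x == y for x, y in zip(t1, t2))

print(has_match_any((1, 2, 3), (4, 5, 3)))
print(has_match_any((1, 2, 3), (4, 5, 6)))
```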

Finally, as we mentioned earlier, if you need to traverse the elements
of a sequence, like a list or tuple, and you also need the index of each
element, you should use a `for` loop and the `enumerate()` function.  `enumerate()`
basically zips a list of the integer indexes to the list, so you can
get each element and its index when iterating the list.

In [123]:
for index, element in enumerate('abcdefg'):
    print(index, element)

0 a
1 b
2 c
3 d
4 e
5 f
6 g


## Dictionaries and tuples

Dictionaries have a method called `items()` that returns an iterable view of 
tuples, where each tuple is a key-value pair.

In [124]:
d = {'a': 0, 'b': 1, 'c': 2}
t = d.items()
t

dict_items([('a', 0), ('b', 1), ('c', 2)])

In [125]:
for key, value in d.items():
    print(key, value)

a 0
b 1
c 2


Also it is common to use a tuple as a key for a complex dictionary.  One cannot
use lists as keys because lists are mutable, and dictionary keys must be hashable.

For example, we might need a telephone dictionary that maps `('last', 'first')` 
names to their telephone number.

In [126]:
telephone_book = {
    ('Cleese', 'John'): '218-256-7918',
    ('Chapman', 'Graham'): '903-418-1938',
    ('Gilliam', 'Terry'): '817-763-8293',
    ('Palin', 'Michael'): '428-271-2939'
}

telephone_book['Gilliam', 'Terry']

'817-763-8293'
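
To see why tuples rather than lists are used as keys, the sketch below tries
both.  The entry and its phone number are invented for this example.

```python
telephone_book = {}

# a tuple of strings is hashable, so it works as a key
telephone_book[('Idle', 'Eric')] = '555-0100'

# a list is mutable and therefore unhashable, so using it as a key fails
try:
    telephone_book[['Idle', 'Eric']] = '555-0100'
    message = None
except TypeError as error:
    message = str(error)

print(telephone_book)
print(message)
```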

# Sets

Sets are another built-in type that can be very useful in
particular circumstances.  A set is an implementation of the
mathematical notion of a set.  So each unique value only appears one
time within the set.

We used a dictionary in a set-like way to count the occurrences of
characters in a text.  When a character is encountered, we know whether it has
already been seen in the text depending on whether it is a key in the dictionary.

To define a set do the following.

In [127]:
# an initially empty set
words = set() 

In [128]:
# using our lorem ipsum text from the section on dictionaries, this time
# split by words, and create a set of the unique words
# By default, split() splits by whitespace, so spaces, tabs, newlines
for word in text.split():
    words.add(word)

In [129]:
# the set of unique words
words

{'Adipiscing',
 'Aenean',
 'At',
 'Duis',
 'Eleifend',
 'Enim',
 'Et',
 'Euismod',
 'Facilisis',
 'Hendrerit',
 'Id',
 'Integer',
 'Ipsum',
 'Just',
 'Justo',
 'Lorem',
 'Maecenas',
 'Morbi',
 'Neque',
 'Nibh',
 'Placerat',
 'Sapien',
 'Sed',
 'Urna',
 'Vehicula',
 'Vestibulum',
 'Vulputate',
 'a',
 'ac',
 'adipiscing',
 'adipiscing.',
 'aenean',
 'aliqua.',
 'aliquam',
 'amet',
 'amet,',
 'arcu',
 'at',
 'at.',
 'auctor',
 'auctor.',
 'augue.',
 'bibendum',
 'blandit',
 'character',
 'commodo',
 'congue',
 'consectetur',
 'consequat',
 'consequat.',
 'count',
 'cras',
 'cursus',
 'dictum',
 'do',
 'dolor',
 'dolore',
 'donec',
 'dui',
 'dui.',
 'duis',
 'egestas',
 'eget',
 'eiusmod',
 'elementum.',
 'elit',
 'elit,',
 'erat',
 'est',
 'est.',
 'et',
 'et.',
 'etiam.',
 'euismod',
 'faucibus',
 'feugiat',
 'frequencies',
 'fusce',
 'gravida',
 'hendrerit.',
 'id',
 'id.',
 'imperdiet',
 'in',
 'incididunt',
 'integer',
 'ipsum',
 'labore',
 'lacus',
 'lectus',
 'leo',
 'libero',
 'lig

We can check set membership using the `in` operator, difference using
the `difference()` method or the `-` operator, and set union using 
the `union()` method or the `|` operator (note that `+` is not defined for sets).

In [130]:
# set membership
'vel' in words

True

In [131]:
'velum' in words

False

In [132]:
# difference
s1 = set(['a', 'b', 'c', 'd'])
s2 = set(['b', 'c'])

s1 - s2

{'a', 'd'}

In [133]:
# the empty set
s2 - s1

set()

In [134]:
s2.add('e')
s2 - s1

{'e'}

In [135]:
# union: note that `+` is not defined for sets, use union() or the `|`
# operator instead
s1.union(s2)

{'a', 'b', 'c', 'd', 'e'}
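
The set type provides operator forms for the common set operations, with
union spelled `|` rather than `+`.  A short sketch, using two fresh sets `a`
and `b` (names chosen here only for illustration):

```python
a = {'a', 'b', 'c', 'd'}
b = {'b', 'c', 'e'}

union = a | b          # same as a.union(b)
intersection = a & b   # same as a.intersection(b)
difference = a - b     # same as a.difference(b)
symmetric = a ^ b      # same as a.symmetric_difference(b)

# sets are unordered, so sort before printing to get a stable display
print(sorted(union))
print(sorted(intersection))
print(sorted(difference))
print(sorted(symmetric))
```

All four operators return a new set and leave `a` and `b` unchanged.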

# List comprehensions

List comprehensions are a relatively recent addition to the Python language.
On the one hand they can make your code more concise.  On the other hand,
more concise code is not always more readable.

In any case, you will probably see list comprehensions used more and more
frequently in many places.  Any (simple) loop can be turned into a 
list comprehension, and any list comprehension can be turned back into 
a more traditional for loop.

In [136]:
# function to capitalize all letters of a string s, using a traditional loop
def capitalize_all_loop(s):
    """Capitalize all letters in the string s and return the (new)
    resulting string.
    """
    res = []
    for c in s:
        res.append(c.capitalize())
    return res

In [137]:
capitalize_all_loop("This is a string!  It has some upper, but mostly lower.")

['T',
 'H',
 'I',
 'S',
 ' ',
 'I',
 'S',
 ' ',
 'A',
 ' ',
 'S',
 'T',
 'R',
 'I',
 'N',
 'G',
 '!',
 ' ',
 ' ',
 'I',
 'T',
 ' ',
 'H',
 'A',
 'S',
 ' ',
 'S',
 'O',
 'M',
 'E',
 ' ',
 'U',
 'P',
 'P',
 'E',
 'R',
 ',',
 ' ',
 'B',
 'U',
 'T',
 ' ',
 'M',
 'O',
 'S',
 'T',
 'L',
 'Y',
 ' ',
 'L',
 'O',
 'W',
 'E',
 'R',
 '.']

Notice that the result is now a list instead of a string, which you may not
have been expecting.  How would you convert it back to a string?

To do the same thing using a list comprehension, we do the following.
Notice that `[` and `]` are defining a new list that we return, but inside of
the list comprehension, instead of a comma separated list of elements, we
have an expression and a `for` clause.  The `for` clause selects each `c` in
`s`, then `capitalize()` is called on each `c`, and the result of each of these
becomes an element of the list that is created and returned.

In [138]:
def capitalize_all_list(s):
    return [c.capitalize() for c in s]

In [139]:
capitalize_all_list("This is a string!  It has some upper, but mostly lower.")

['T',
 'H',
 'I',
 'S',
 ' ',
 'I',
 'S',
 ' ',
 'A',
 ' ',
 'S',
 'T',
 'R',
 'I',
 'N',
 'G',
 '!',
 ' ',
 ' ',
 'I',
 'T',
 ' ',
 'H',
 'A',
 'S',
 ' ',
 'S',
 'O',
 'M',
 'E',
 ' ',
 'U',
 'P',
 'P',
 'E',
 'R',
 ',',
 ' ',
 'B',
 'U',
 'T',
 ' ',
 'M',
 'O',
 'S',
 'T',
 'L',
 'Y',
 ' ',
 'L',
 'O',
 'W',
 'E',
 'R',
 '.']
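
If a string is wanted rather than a list of characters, one common approach is
to join the characters back together with the string method `join()`.  The
function name `capitalize_all_str` below is made up for this sketch.

```python
def capitalize_all_str(s):
    """Hypothetical variant that returns a string instead of a list."""
    # join the capitalized characters using the empty string as separator
    return ''.join([c.capitalize() for c in s])

result = capitalize_all_str("It has some upper")
print(result)
```

For plain ASCII text like this, the built-in `s.upper()` method gives the same
result directly.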

List comprehensions can also be used for filtering.  For example this 
function selects only the elements of `s` that are upper case, and
returns a new list:

In [140]:
def only_upper_loop(s):
    """Select only the upper case letters of string s, and put them into a new
    list that we return to the caller.
    """
    res = []
    for c in s:
        if c.isupper():
            res.append(c)
    return res

In [141]:
only_upper_loop("This is a string!  It has some uPPER, but mostly lower.")

['T', 'I', 'P', 'P', 'E', 'R']

The equivalent filter as a list comprehension:

In [142]:
def only_upper_list(s):
    return [c for c in s if c.isupper()]

In [143]:
only_upper_list("This is a string!  It has some uPPER, but mostly lower.")

['T', 'I', 'P', 'P', 'E', 'R']
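
The transforming and filtering forms can be combined in a single comprehension.
As a small sketch (the function name `lower_of_upper` is invented for this
example), the expression before `for` transforms each element that the `if`
clause lets through:

```python
def lower_of_upper(s):
    """Hypothetical helper: keep only the upper case letters of s,
    and convert each one that is kept to lower case.
    """
    return [c.lower() for c in s if c.isupper()]

result = lower_of_upper("This is a string!  It has some uPPER, but mostly lower.")
print(result)
```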

# Python Classes and Object-Oriented Programming

Up to this point we have discussed the built-in data types of the Python
language, and how you can define your own functions to break down
problems into small subproblems and solve them.

Object-oriented programming is another way of organizing code to
solve large problems.  I assume you have some familiarity with what
object-oriented programming is, and some of its principles.
If not, it would be worthwhile reading at least a 
[high-level summary of OO principles](https://www.freecodecamp.org/news/object-oriented-programming-concepts-21bb035f7260/).

We will need to use and maybe define our own classes in several
places for this course, so you should familiarize yourself with the basics
of defining and using classes in Python.

## Classes and objects

One way of thinking of objects and OO programming is it is a way to add your
own user-defined data type to the core Python language.

A classic example is to add a new mathematical `Point` class to the
Python language.  The `Point` class will represent a 2-D point in space.

### Attributes

Classes have member attributes, and member methods you can call to ask the
class to perform some task for you.

The first step of a class is defining the private internal attributes of
the class.  By default, though, Python classes do not enforce strict
privacy of access to private parts of the class, unlike what you can
do in some other languages.  This is useful, but also dangerous, as it
allows any external entity to mess with the internal state of a class
object, which breaks the principle of encapsulation.

But in any case, use the `class` keyword to define a new class to add to the
language.

In [144]:
class Point:
    """Represents a point in 2D space
    """

Since python is a dynamic high-level language, we can create instances
of a Point class object immediately.

In [145]:
p = Point()
p

<__main__.Point at 0x7ffb3c0d1910>

We can dynamically assign member attributes to our `p` instance of the
`Point` class.

In [146]:
p.x = 3.0
p.y = 4.0
print(p.x)
print(p.y)

3.0
4.0


In [147]:
# point p is an instance of a Point class, we can create other instances which
# are separate from p
p2 = Point()
p2.x = -5.0
p2.y = -7.0

print(p2.x)
print(p2.y)
print(p.x)
print(p.y)

-5.0
-7.0
3.0
4.0


Notice that the `.` notation is used for member access.  We use `p.x` to 
create and assign a value to a new attribute of the instance `p`, and we can
use `p.x` to access and read out that attribute value.

### Object Composition

It is useful to be able to compose objects using other objects when designing
a solution to a complex problem using OO programming.  For example, let's say
you now need a `Rectangle` object.  We will define a `Rectangle` in our
system by the point where the lower left corner of the rectangle is
located, together with the rectangle width (x-direction) and height (y-direction).

We could just define our own `x` and `y` member attributes for the rectangle
corner.  But this is really a `Point` so we should reuse our concept of a 
`Point` data type when building up a definition of a rectangle.

In [148]:
class Rectangle:
    """Represents a rectangle.
    
    attributes: corner, width, height
    """

In [149]:
# create an instance of a rectangle
r = Rectangle()

# add member variables for the width and height of the rectangle
r.width = 200.0
r.height = 100.0

# now use object composition to define the corner point
r.corner = Point()
r.corner.x = 30.0
r.corner.y = 60.0

## Classes and functions

You can of course pass instances of classes you define to functions, just like
any of the built-in data types available in python.

In [150]:
def distance(p1, p2):
    """Given 2 instances of a Point object, calculate the Euclidean distance
    from p1 to p2.
    """
    # need sqrt function to calculate distance
    from math import sqrt 
    
    # distance is square root of the sum of the squared difference of each dimension
    return sqrt( (p1.x - p2.x)**2.0 + (p1.y - p2.y)**2.0 )

In [151]:
# reusing the 2 points we instantiated above
distance(p, p2)

13.601470508735444

In [152]:
def upper_right_corner(rect):
    """Given an instance of our Rectangle class, determine the upper-right
    corner of the rectangle.
    """
    # upper right corner will be a new point
    ur_corner = Point()
    
    ur_corner.x = rect.corner.x + rect.width
    ur_corner.y = rect.corner.y + rect.height
    
    return ur_corner

In [153]:
# reuse the rectangle from above
ur = upper_right_corner(r)
print(ur.x)
print(ur.y)

230.0
160.0


## Classes and methods

Although we defined some classes, and passed some instances of them to some
regular functions, the above code using the `Point` and `Rectangle` classes
is not really object-oriented.

It is not object-oriented yet because the classes do not encapsulate the
operations that we defined to operate on `Point` and `Rectangle` instances.

We can define our two regular functions above as member
functions of the `Point` and `Rectangle` classes respectively.

In [154]:
class Point:
    """Represents a point in 2D space
    """
    
    def distance(self, p):
        """Calculate the distance between ourself and
        another point, p, given as a parameter to the
        member function.
        """
        # need sqrt function to calculate distance
        from math import sqrt 

        # distance is square root of the sum of the squared difference of each dimension
        return sqrt( (self.x - p.x)**2.0 + (self.y - p.y)**2.0 )       

In [155]:
# let's redefine our points using our new class definition
p1 = Point()
p1.x = 0
p1.y = 0

# distance from origin to 3,4 is right triangle 3,4 = 5
p2 = Point()
p2.x = 3
p2.y = 4

p1.distance(p2)

5.0

In [156]:
class Rectangle:
    """Represents a rectangle.
    
    attributes: corner, width, height
    """
    
    def upper_right_corner(self):
        """Given an instance of our Rectangle class, determine the upper-right
        corner of the rectangle.
        """
        # upper right corner will be a new point
        ur_corner = Point()

        ur_corner.x = self.corner.x + self.width
        ur_corner.y = self.corner.y + self.height

        return ur_corner    

In [157]:
# let's redefine our rectangle and try the member method
r = Rectangle()
r.width = 200.0
r.height = 100.0
r.corner = Point()
r.corner.x = 30.0
r.corner.y = 60.0


ur = r.upper_right_corner()
print(ur.x)
print(ur.y)

230.0
160.0


A member function is defined inside the scope of a `class` in Python.
Member functions always take `self` as their first parameter.
When you call the member function on an instance of an object, 
`self` will refer to that instance object in the member function.

So as you can see, for our rectangle member function we define it to
accept only `self` as a parameter.  Also notice that when we invoke `upper_right_corner()`
on the rectangle `r` we don't pass in `r`.  The instance of the object is
passed in implicitly because we are invoking the member function of
the `r` instance.

Likewise our `distance()` function takes 2 parameters, `self` and a second
point. But when we invoke the function we only pass in the second parameter,
because the instance of the object we invoke the method for is passed in 
always as the first parameter implicitly.
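
To make the implicit passing of `self` concrete, the following sketch calls
the same method two ways.  The `Point` class here is a reduced stand-in for
the one above, repeated only so the example runs on its own.

```python
from math import sqrt

class Point:
    """Minimal stand-in for the Point class above."""
    def distance(self, p):
        return sqrt((self.x - p.x)**2 + (self.y - p.y)**2)

p1 = Point()
p1.x, p1.y = 0, 0
p2 = Point()
p2.x, p2.y = 3, 4

# the usual call: p1 is passed implicitly as self
d_method = p1.distance(p2)

# the same call written out explicitly through the class
d_explicit = Point.distance(p1, p2)

print(d_method, d_explicit)
```

Both calls run the same function with the same arguments.  The method call
syntax is shorthand that fills in the first parameter for us.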

These classes are getting somewhat useful.  But in order to use the
member function, the user has to know and set up all of the
member variables of each object by hand.  This is both cumbersome 
and error prone.  In an OO language, we can usually define a constructor
for the classes we add to the language.  In Python, this is done with
the `__init__()` method.  There are several methods with special names
you can define for a class.  They all begin and end with two underscores.
These allow you to define the class constructors, and to overload
operators for the class instances.

Let's add class constructors for both of our classes with the `__init__()`
method.  In most cases, you usually pass in values to the constructor so
that you can initialize the member attributes of your class.  Member functions
are like any function in Python, so we can use named and default parameters
to the function, as well as positional parameters.

In [158]:
class Point:
    """Represents a point in 2D space
    """
    
    def __init__(self, x=0, y=0):
        """Class constructor for the Point class.  Initialize member
        attributes x and y.  We default to being a point at the origin
        of the coordinate system if x and y are not given.
        """
        self.x = x
        self.y = y
        
    def distance(self, p):
        """Calculate the distance between ourself and
        another point, p, given as a parameter to the
        member function.
        """
        # need sqrt function to calculate distance
        from math import sqrt 

        # distance is square root of the sum of the squared difference of each dimension
        return sqrt( (self.x - p.x)**2.0 + (self.y - p.y)**2.0 )       

In [159]:
class Rectangle:
    """Represents a rectangle.
    
    attributes: corner, width, height
    """
    
    def __init__(self, x=0, y=0, width=1.0, height=1.0):
        """Class constructor for the Rectangle class.  Initialize member
        attributes.  We are given the lower left corner, and the
        rectangle width and height as attributes.  All parameters have defaults,
        so if none are specified a unit Rectangle (square) located at the origin is
        created with width and height of 1.
        """
        # our lower left corner is a Point, we use object composition
        # in our Rectangle class, reusing the Point class.
        self.corner = Point(x, y)
        
        self.width = width
        self.height = height
        
    def upper_right_corner(self):
        """Given an instance of our Rectangle class, determine the upper-right
        corner of the rectangle.
        """
        # upper right corner will be a new point
        ur_corner = Point()

        ur_corner.x = self.corner.x + self.width
        ur_corner.y = self.corner.y + self.height

        return ur_corner    

With these `__init__` methods added to help us construct the objects, our
code that uses the point and rectangle objects is much nicer and less
error prone.

In [160]:
# let's redefine our points using our new class definition
# defaults to point at the origin
p1 = Point()

# distance from origin to 3,4 is right triangle 3,4 = 5
p2 = Point(3, 4)

p1.distance(p2)

5.0

In [161]:
# defaults to the unit rectangle
r = Rectangle()
c = r.upper_right_corner()
print(c.x)
print(c.y)

1.0
1.0


In [162]:
r = Rectangle(60.0, 30.0, 100.0, 200.0)
c = r.upper_right_corner()
print(c.x)
print(c.y)

160.0
230.0
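
Because `__init__()` is an ordinary function, keyword arguments and defaults
work for constructors exactly as they do elsewhere.  A small sketch, with a
reduced `Point` class repeated so the example is self-contained:

```python
class Point:
    """Same constructor signature as the Point class above."""
    def __init__(self, x=0, y=0):
        self.x = x
        self.y = y

a = Point(3, 4)        # positional arguments
b = Point(y=4, x=3)    # keyword arguments, given in any order
c = Point(y=2)         # x falls back to its default of 0

print(a.x, a.y)
print(b.x, b.y)
print(c.x, c.y)
```

Keyword arguments can make calls with several numeric parameters, such as the
`Rectangle` constructor, easier to read.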


## Operator overloading and special methods

It was annoying to have to get the point returned then individually 
access and display the x and y coordinates.  There are many special methods
defined for Python classes that allow you to add functionality to your
user-defined types.  `__str__` is called anytime someone tries to display
your object or convert it to a string representation.  We can give
`__str__` methods to our classes to make displaying them easier.

Likewise you can overload operators for your user-defined types.
Functions with names like `__add__`, `__sub__`, `__mul__`,
`__lt__`, `__gt__`,
etc. can be defined to implement those corresponding operations 
for your classes.  A fuller list of the special method names you can
define is given
[here](https://docs.python.org/3/reference/datamodel.html#special-method-names).

Let's also overload the add operator for points to define vector
addition, which is simply adding together the x and y dimensions, and
returning a new resulting `Point` which is the result of the vector
addition.

In [163]:
class Point:
    """Represents a point in 2D space
    """
    
    def __init__(self, x=0, y=0):
        """Class constructor for the Point class.  Initialize member
        attributes x and y.  We default to being a point at the origin
        of the coordinate system if x and y are not given.
        """
        self.x = x
        self.y = y
        
    def __str__(self):
        """Provide functionality to provide a string representation of
        this point when needed.
        """
        return 'x:%f, y:%f' % (self.x, self.y)
            
    def __add__(self, p):
        """Define vector addition between ourself and another point
        p as overloaded `+` operation.
        """
        # new resulting point we will return
        newx = self.x + p.x
        newy = self.y + p.y
        
        # new resulting point is returned
        return Point(newx, newy)
        
    def distance(self, p):
        """Calculate the distance between ourself and
        another point, p, given as a parameter to the
        member function.
        """
        # need sqrt function to calculate distance
        from math import sqrt 

        # distance is square root of the sum of the squared difference of each dimension
        return sqrt( (self.x - p.x)**2.0 + (self.y - p.y)**2.0 )
    

In [164]:
class Rectangle:
    """Represents a rectangle.
    
    attributes: corner, width, height
    """
    
    def __init__(self, x=0, y=0, width=1.0, height=1.0):
        """Class constructor for the Rectangle class.  Initialize member
        attributes.  We are given the lower left corner, and the
        rectangle width and height as attributes.  All parameters have defaults,
        so if none are specified a unit Rectangle (square) located at the origin is
        created with width and height of 1.
        """
        # our lower left corner is a Point, we use object composition
        # in our Rectangle class, reusing the Point class.
        self.corner = Point(x, y)
        
        self.width = width
        self.height = height
        
    def __str__(self):
        """Provide functionality to provide a string representation of this
        rectangle when needed.
        """
        # notice we are using the __str__ of the point; a local variable is
        # used so that printing does not add a new attribute to the rectangle
        ur_corner = self.upper_right_corner()
        return 'll corner: %s, ur corner: %s  (width: %f  height: %f)' % \
                  (self.corner, ur_corner, self.width, self.height)
        
    def __mul__(self, scale):
        """Overload multiplication operator to provide rectangle scaling.
        Does not change location of ll corner, simply scales the 
        width and height dimensions by the floating point scale parameter.
        Also does not change self, but a new rectangle appropriately scaled
        is created and returned.
        """
        newwidth = self.width * scale
        newheight = self.height * scale
        newrect = Rectangle(self.corner.x, self.corner.y, newwidth, newheight)
        return newrect
    
    def upper_right_corner(self):
        """Given an instance of our Rectangle class, determine the upper-right
        corner of the rectangle.
        """
        # upper right corner will be a new point
        ur_corner = Point()

        ur_corner.x = self.corner.x + self.width
        ur_corner.y = self.corner.y + self.height

        return ur_corner    

In [165]:
# let's redefine our points using our new class definition
# defaults to point at the origin
p1 = Point()
print(p1) # example of using __str__

# distance from origin to 3,4 is right triangle 3,4 = 5
p2 = Point(3, 4)
print(p2)

p3 = Point(-4, 2)
newp = p2 + p3 # example of __add__
print(newp)

x:0.000000, y:0.000000
x:3.000000, y:4.000000
x:-1.000000, y:6.000000


In [166]:
# defaults to the unit rectangle
r = Rectangle()
print(r)

biggerr = r * 5
print(biggerr)

ll corner: x:0.000000, y:0.000000, ur corner: x:1.000000, y:1.000000  (width: 1.000000  height: 1.000000)
ll corner: x:0.000000, y:0.000000, ur corner: x:5.000000, y:5.000000  (width: 5.000000  height: 5.000000)


In [167]:
r = Rectangle(60.0, 30.0, 100.0, 200.0)
print(r)

biggerr = r * 3.8
print(biggerr)

ll corner: x:60.000000, y:30.000000, ur corner: x:160.000000, y:230.000000  (width: 100.000000  height: 200.000000)
ll corner: x:60.000000, y:30.000000, ur corner: x:440.000000, y:790.000000  (width: 380.000000  height: 760.000000)
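
Other special methods follow the same pattern.  As a sketch only, here is a
reduced `Point` class that adds `__sub__` for the `-` operator and `__eq__`
for the `==` operator.  These two methods are additions made for this example
and are not part of the class defined above.

```python
class Point:
    """Reduced version of the Point class above, for illustration."""
    def __init__(self, x=0, y=0):
        self.x = x
        self.y = y

    def __sub__(self, p):
        """Vector subtraction, invoked by the `-` operator."""
        return Point(self.x - p.x, self.y - p.y)

    def __eq__(self, p):
        """Two points are equal when both coordinates match,
        invoked by the `==` operator.
        """
        return self.x == p.x and self.y == p.y

diff = Point(3, 4) - Point(1, 1)
print(diff.x, diff.y)
print(diff == Point(2, 3))
```

Without `__eq__`, the `==` comparison would fall back to comparing object
identity, so two separately created points with the same coordinates would not
compare as equal.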


## Inheritance

Object inheritance is one of the fundamental concepts of OO programming.
We will make some use of inheritance in places for this class, so you should
understand the fundamental concepts.

The "Think Python" textbook I am using for most of the examples in this
notebook has a good example of object composition and inheritance
using a `Card` class, and then a `Deck` class, which is composed of
`Card` instances.  Then finally inheritance is demonstrated
using a `Hand` class, where a hand of cards is like a small deck
of cards, with some other attributes (like the name of the player entity
that is playing and has the hand).  I encourage you to read over this
example.

I'll just give another quick example of inheritance here.  A very typical first
example of inheritance is to define a hierarchy of shapes.  Since we
have been using a rectangle class, let's try defining a shape hierarchy.
In this example, all we want to do is be able to instantiate shapes with
different numbers of sides, then provide some special methods for
such shapes.

We start with a generic base class for all `Polygon` instances, that all of
our classes will inherit from.  This class basically defines
the constructor and a `__str__` representation that can be used by
all the derived children classes.

In [168]:
class Polygon:
    """Basic class for a hierarchy of Polygon types
    """
    
    def __init__(self, sides):
        """We expect a sequence of sides, where each item in the
        sequence is simply the length of that side.
        """
        self.num_sides = len(sides)
        self.sides = sides
        
    def polygon_name(self):
        """We expect child classes to override this, so they can give
        a more useful name than Polygon for the shape
        """
        return '%d-sided Polygon' % self.num_sides
    
    def area(self):
        """A kind of virtual function.  Child classes need to 
        override this function, or else we just throw an exception if
        they do not.
        """
        raise NotImplementedError('<Polygon> area not implemented by child class as expected')
              
              
    def __str__(self):
        """Represent the polygon as a string
        """
        # create a string with the polygon name and length of sides in it
        s = "%s" % self.polygon_name()
        for idx, side in enumerate(self.sides):
            s += "\n   side %d: %f" % (idx+1, side)
        return s

In [169]:
# basic tests of the base Polygon class
# triangle with 3 sides of length 3,4,5
triangle = Polygon( (3, 4, 5) )
print(triangle)

3-sided Polygon
   side 1: 3.000000
   side 2: 4.000000
   side 3: 5.000000


Given our base class, let's create a child class called `Triangle` which is
a 3 sided polygon.  The lengths of the 3 sides uniquely define a single
triangle, thus we can do things like calculate the area and angles of the 
triangle given its 3 side lengths.

Here we override `polygon_name` so we have a more specific name when 
we display the triangle, and we add an area method to calculate
the triangle area given the length of its sides.

**Note**: notice the syntax for defining inheritance.  By giving a base class
name in parentheses after our new class, we are declaring that this is 
a child class of the `Polygon` base class.

**Note**: we have an example of chaining a method from our
super class here.  We do some error checking, and if the number
of sides is not 3, we throw an exception since it really isn't a triangle
without 3 sides.

In [170]:
class Triangle(Polygon):
    """A 3-sided polygon is a triangle.
    """
    def __init__(self, sides):
        """We chain the constructor so we can test that
        we really are a triangle.
        """
        # refuse if we are not a triangle
        if len(sides) != 3:
            raise Exception("<Triangle> Error, must have 3 sides")
        
        # otherwise safe to construct ourself
        Polygon.__init__(self, sides)
        
    def polygon_name(self):
        """Override base class so our name is more meaningful.
        """
        return "Triangle"

    def area(self):
        """Given the length of 3 sides only 1 unique triangle is possible.
        We can calculate its area.
        """
        # need a square root function to calculate area
        from math import sqrt 
        
        # tuple assignment to give sides names easier to work with
        a, b, c = self.sides
        
        # calculate the semi-perimeter
        s = (a + b + c) / 2.0
        
        # area is then a function of the semi perimeter and the side lengths
        area = sqrt( s * (s - a) * (s - b) * (s - c) )
        
        return area
        

In [171]:
try:
    Triangle( (1, 2) )
except Exception:
    print('Exception thrown because triangle must have exactly 3 sides')

Exception thrown because triangle must have exactly 3 sides


In [172]:
# a 3, 4, 5 right triangle, has area (3*4) / 2
right_triangle = Triangle( (3, 4, 5) )
print(right_triangle)
print(right_triangle.area())

Triangle
   side 1: 3.000000
   side 2: 4.000000
   side 3: 5.000000
6.0


In [173]:
# equilateral triangle
equilateral = Triangle( (10, 10, 10) )
print(equilateral)
print(equilateral.area())

Triangle
   side 1: 10.000000
   side 2: 10.000000
   side 3: 10.000000
43.30127018922193
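
The constructor chaining above calls `Polygon.__init__(self, sides)` by
naming the parent class directly.  A commonly used alternative is the built-in
`super()`, sketched here with reduced versions of the two classes so the
example runs on its own:

```python
class Polygon:
    """Reduced version of the base class above."""
    def __init__(self, sides):
        self.num_sides = len(sides)
        self.sides = sides

class Triangle(Polygon):
    """Reduced version of the Triangle class above."""
    def __init__(self, sides):
        # refuse if we are not a triangle
        if len(sides) != 3:
            raise Exception("<Triangle> Error, must have 3 sides")

        # super() looks up the parent class for us, and self is
        # passed implicitly
        super().__init__(sides)

t = Triangle((3, 4, 5))
print(t.num_sides)
print(t.sides)
```

With `super()` the parent class name appears only once, in the `class`
statement, so renaming or changing the base class requires fewer edits.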


Quadrilaterals have 4 sides.  Squares and rectangles are quadrilaterals.

We can't determine the area of a quadrilateral just from the length of the sides.
But if we are given angles of 2 opposite corners (among the 4), then we
can uniquely determine the quadrilateral area.

In [174]:
class Quadrilateral(Polygon):
    """A 4-sided polygon generically is a Quadrilateral.
    """
    def __init__(self, sides, angle1, angle2):
        """We chain the constructor.  We check that it is a quadrilateral.
        We also add in opposite angle attributes, so we uniquely define
        the quadrilateral
        """
        # refuse if we are not a quadrilateral
        if len(sides) != 4:
            raise Exception("<Quadrilateral> Error, must have 4 sides")
        
        # otherwise safe to construct ourself
        Polygon.__init__(self, sides)
        
        # save the opposite angle attributes of this quadrilateral as well
        self.angle1 = angle1
        self.angle2 = angle2
        
    def all_angles_90(self):
        """Squares and rectangles have 4 angles all of 90 degrees,
        If the two opposite angles are 90 degrees, all of them are, 
        and thus we are either a square or a rectangle.
        """
        if self.angle1 == 90 and self.angle2 == 90:
            return True
        else:
            return False
        
    def is_square(self):
        """Test for squareness.  We are a square when all angles are
        90degree angles, and all sides are of equal length
        """
        a, b, c, d = self.sides
        if self.all_angles_90() and a == b and a == c and a == d:
            return True
        else:
            return False
        
    def is_rectangle(self):
        """Test for rectangleness.  We are a rectangle when 
        all angles are 90degrees, and we are not a square
        (adjacent sides are of different lengths).
        """
        if self.all_angles_90() and not self.is_square():
            return True
        else:
            return False
        
    def polygon_name(self):
        """Override base class so our name is more meaningful.
        Now that I think about this, we could derive Square
        and Rectangle classes from Quadrilateral.  I leave this as
        an exercise for the student.
        """
        # determine shape name
        if self.is_square():
            name = "Square"  
        elif self.is_rectangle():
            name = "Rectangle"
        else:
            name = "Quadrilateral"
            
        return name

    def area(self):
        """Given the 4 side lengths and two opposite angles, 
        determine the area of this quadrilateral.  We
        use Bretschneider's formula
        """
        # need a few math functions and pi
        from math import sqrt, cos, pi
        
        # tuple assignment to give sides names easier to work with
        a, b, c, d = self.sides
        
        # semi-perimeter needed again
        s = (a + b + c + d) / 2.0
        
        # 1/2 sum of the two opposite angles, we assume angles
        # are specified in degrees
        theta = (self.angle1 + self.angle2) / 2.0
        radians = theta * (pi / 180.0)
        
        # area is then a function of sum of angles, length
        area = sqrt( (s - a) * (s - b) * (s - c) * (s - d) -
                      a * b * c * d * cos(radians)**2.0 )
        
        return area
        

In [175]:
# square
q1 = Quadrilateral( (5, 5, 5, 5), 90, 90 )
print(q1)
print(q1.area())

Square
   side 1: 5.000000
   side 2: 5.000000
   side 3: 5.000000
   side 4: 5.000000
25.0


In [176]:
# rectangle
q2 = Quadrilateral( (5, 3, 5, 3), 90, 90)
print(q2)
print(q2.area())

Rectangle
   side 1: 5.000000
   side 2: 3.000000
   side 3: 5.000000
   side 4: 3.000000
15.0


In [177]:
# an irregular polygon, the area of this polygon is about 100
q3 = Quadrilateral( (13, 14, 2.985, 13), 60, 120)
print(q3)
print(q3.area())

Quadrilateral
   side 1: 13.000000
   side 2: 14.000000
   side 3: 2.985000
   side 4: 13.000000
100.00525242157576


# Functional Programming in Python

In addition to supporting Object-Oriented programming,  you
can also program in a
[Functional Programming style](https://docs.python.org/3/howto/functional.html)
with Python.

Functional programming uses pure functions, with no side effects.  In functional
programming we decompose a problem into a set of pure functions.  Ideally
functions only take inputs and produce outputs, they do not have any internal 
state or remember anything or modify anything as a side effect (this is what
makes them a pure function).  Functional programming can be considered as the
opposite of object-oriented programming.  Objects are little capsules containing
some internal state along with a collection of methods that allow you to modify
that state.  Functional programming wants to avoid state changes as much as possible
and works with data flowing between functions that take it as input, transform
it in some way (like map or filter it) and send it along the pipeline as output.

You can combine the two approaches in Python.  OO is good for some things
(like the big picture structure of a library) and functional programming
is better suited for other areas.

Some of the libraries and concepts we use in this class make use of a 
functional programming approach towards implementing solutions.  Thus 
a basic familiarity with the concepts can be helpful for this class.
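
As a small illustration of the difference between a pure function and one with
side effects, consider the following sketch.  The function names `scaled` and
`scale_in_place` are invented for this example.

```python
def scaled(values, factor):
    """Pure function: builds and returns a new list, leaving
    the list passed in as `values` untouched.
    """
    return [v * factor for v in values]

def scale_in_place(values, factor):
    """Impure function: modifies the caller's list as a side effect
    and returns nothing.
    """
    for i in range(len(values)):
        values[i] *= factor

data = [1, 2, 3]

doubled = scaled(data, 2)
print(data, doubled)    # data is unchanged by the pure function

scale_in_place(data, 2)
print(data)             # data itself has now been modified
```

The pure version can be called any number of times with the same inputs and
always gives the same result, which is what makes such functions easy to chain
together into pipelines.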



## Functions as first-class objects

We briefly mentioned the idea of passing functions to other functions
when we first talked about functions.  When functions are first-class objects
of a programming language, they can be used as parameters to other functions.
This is an important aspect of functional-style programming, and is used
in many places in the libraries we will use.

As a quick example, say we have a function that simply tests
whether a value is even or not.

In [178]:
def is_even(x):
    """Test for "evenness" of the given value x
    """
    return (x % 2) == 0

We can then write functions that accept a test like this and use it to perform
filtering actions.  For example, our own implementation of a generic filter might
expect a function that tests a value and returns whether or not the value
passes the test.

In [179]:
def our_filter(test, iterable):
    """Perform a filter of a sequence or some iterable source of data
    """
    res = []
    
    # for each item in the iteration sequence
    for item in iterable:
        # if the item passes the test, it stays in, otherwise filtered out.
        if test(item):
            res.append(item)
            
    return res

In [180]:
l = [3, 5, 2, 6, 9, 7, 4, 8, 2]

# use our test for evenness to filter the sequence
our_filter(is_even, l)

[2, 6, 4, 8, 2]

The point here is that the generic filter function can be used to 
filter by any test, so long as the test function has the appropriate
signature: it takes a single item and returns a true or false result.

In [181]:
def is_odd_or_2(x):
    if x == 2:
        return True
    else:
        return x % 2 == 1

In [182]:
our_filter(is_odd_or_2, l)

[3, 5, 2, 9, 7, 2]
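
Because functions are ordinary objects, the same pattern shows up in many
places.  As a rough sketch, the example below stores test functions in a
dictionary and passes the built-in `len` as the `key` argument of the built-in
`sorted()`.  The names `tests`, `values` and `words` are invented for
illustration.

```python
def is_even(x):
    return x % 2 == 0


def is_negative(x):
    return x < 0


# functions can be stored in containers like any other value
tests = {'even': is_even, 'negative': is_negative}

values = [3, -4, 2, -7, 8]

# apply each stored test to the same data
results = {name: [v for v in values if test(v)] for name, test in tests.items()}
print(results)

# built-ins such as sorted() accept a function through the key parameter
words = ['banana', 'fig', 'cherry', 'kiwi']
by_length = sorted(words, key=len)
print(by_length)
```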

## Composability, and map/filter pipelines

The most obvious example of functional programming and composability that we
will use in this class is `Scikit-learn`'s data transformation 
pipeline.  In general, `sklearn` has a family of classes called
transformers, each of which provides a `transform()` method that takes a
data set as input and returns a new, transformed data set as output.
You can chain together sequences of these object/function transformers
to define a data cleaning and transformation pipeline.
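
To make the idea of chaining concrete, here is a minimal sketch in plain Python.
It is not the `sklearn` API, only an illustration of composing small
transformation steps.  The helper `make_pipeline_fn` and the three step
functions are hypothetical names chosen for this example.

```python
from functools import reduce


def make_pipeline_fn(*steps):
    """Return a function that feeds data through each step in order."""
    def run(data):
        # the output of one step becomes the input of the next
        return reduce(lambda acc, step: step(acc), steps, data)
    return run


# three small steps, each takes a list and returns a new list
def drop_missing(values):
    return [v for v in values if v is not None]


def to_absolute(values):
    return [abs(v) for v in values]


def scale_to_max(values):
    biggest = max(values)
    return [v / biggest for v in values]


clean = make_pipeline_fn(drop_missing, to_absolute, scale_to_max)
cleaned = clean([4, None, -8, 2, None, -1])
print(cleaned)
```

Each step is a pure function, so steps can be reordered, removed or reused in
other pipelines without affecting one another.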

As a general example of this style of programming, let's look at
two built-in Python functions that are oriented towards a functional
programming style: `map()` and `filter()`.

`filter()` does exactly what we just described for our own filter, though
it accepts any iterable and returns an iterator object as a result.
This means you can use it like our own function, but you will get back
an object you have to iterate over to get the items from the sequence.


In [183]:
# use the built-in filter function.  If you call it, you get back an iterator object
filter(is_even, range(100))

<filter at 0x7ffb3c095610>

In [184]:
# you can use a for loop to get the items out of the filter
for val in filter(is_even, range(100)):
    print(val, end=' ')

0 2 4 6 8 10 12 14 16 18 20 22 24 26 28 30 32 34 36 38 40 42 44 46 48 50 52 54 56 58 60 62 64 66 68 70 72 74 76 78 80 82 84 86 88 90 92 94 96 98 

In [185]:
# or you can use the built-in list() function to gather an iteration source into
# a list
list(filter(is_even, range(100)))

[0,
 2,
 4,
 6,
 8,
 10,
 12,
 14,
 16,
 18,
 20,
 22,
 24,
 26,
 28,
 30,
 32,
 34,
 36,
 38,
 40,
 42,
 44,
 46,
 48,
 50,
 52,
 54,
 56,
 58,
 60,
 62,
 64,
 66,
 68,
 70,
 72,
 74,
 76,
 78,
 80,
 82,
 84,
 86,
 88,
 90,
 92,
 94,
 96,
 98]
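
One consequence of `filter()` returning an iterator is that the result can only
be consumed once.  The short sketch below shows this, redefining `is_even` so
that the cell stands on its own.

```python
def is_even(x):
    return x % 2 == 0


evens = filter(is_even, range(10))

first_pass = list(evens)   # consumes the iterator
second_pass = list(evens)  # nothing is left to produce

print(first_pass)
print(second_pass)
```

If you need the filtered items more than once, gather them into a list first
and reuse the list.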

`map(f, iter)` returns an iterator that yields the result of applying the
function `f` to each item in `iter`.  As with `filter()`, you can gather the
results into a list using `list()`.  So for example

In [186]:
# generate some random numbers, in range -10 to 10
import random
# an example of a generator expression, this will lazily produce a sequence of
# 25 random integers from -10 to 10 (inclusive)
gen = (random.randint(-10, 10) for i in range(25))

In [187]:
# map all values to their absolute value, abs() is a built-in function
list(map(abs, gen))

[2, 5, 2, 7, 1, 2, 2, 8, 10, 8, 8, 2, 2, 2, 6, 1, 1, 8, 3, 5, 3, 4, 4, 8, 10]

In [188]:
# as another example, let's say we need a list of booleans that is True for all
# values < 0 and False for all values >= 0
gen = (random.randint(-10, 10) for i in range(25))
list(map(lambda x: x < 0, gen))

[True,
 True,
 True,
 True,
 True,
 True,
 True,
 False,
 False,
 True,
 True,
 False,
 True,
 False,
 False,
 True,
 True,
 True,
 True,
 True,
 True,
 True,
 False,
 False,
 False]
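
The same results can also be written as list comprehensions, which many Python
programmers find easier to read.  The sketch below, using a small fixed list
named `values` invented for the example, checks that the two forms agree.

```python
values = [-3, 7, 0, -1, 5]

# map with a named built-in function, and the equivalent comprehension
mapped = list(map(abs, values))
comprehended = [abs(v) for v in values]
print(mapped == comprehended)

# map with a lambda, and the equivalent comprehension
negative_flags = list(map(lambda x: x < 0, values))
negative_flags_comp = [v < 0 for v in values]
print(negative_flags == negative_flags_comp)
```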

Map is also useful when we need to combine 2 or more sequences, item by item,
into a new sequence of values.  In this case the function passed to `map()`
takes one argument per sequence.  This is common when, for example, we have 
2 or more features of a data set, and we want to create a new feature that is
a combination of the existing features.

So for example, say you want to raise all of the values in your first 
sequence to the power indicated in the second sequence.

In [189]:
bases = [10, 20, 30, 40, 50]
powers = [1, 2, 3, 4, 5]

list(map(pow, bases, powers))

[10, 400, 27000, 2560000, 312500000]
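
Any function that takes two arguments can be used in this way.  As a small
sketch, the `operator` module provides function versions of the arithmetic
operators.  The lists `prices` and `quantities` are made-up sample data.  Note
also that `map()` stops as soon as the shortest input runs out.

```python
import operator

prices = [2.5, 4.0, 1.5]
quantities = [4, 1, 2]

# multiply the two sequences item by item
totals = list(map(operator.mul, prices, quantities))
print(totals)

# the second list only has 2 items, so only 2 results are produced
short = list(map(operator.add, [1, 2, 3], [10, 20]))
print(short)
```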

We have already seen the `enumerate()` built-in function, which is an
example of this idea: it pairs each item of a sequence with its index and
yields the pairs as tuples.

In [190]:
gen = (random.randint(-10, 10) for i in range(25))
list(enumerate(gen))

[(0, 10),
 (1, 9),
 (2, 3),
 (3, 8),
 (4, 6),
 (5, -8),
 (6, 10),
 (7, -3),
 (8, -4),
 (9, -8),
 (10, -1),
 (11, -6),
 (12, 3),
 (13, 7),
 (14, 10),
 (15, 7),
 (16, 4),
 (17, 7),
 (18, -8),
 (19, 2),
 (20, 9),
 (21, -8),
 (22, -10),
 (23, 2),
 (24, -3)]

In [191]:
# basically enumerate is a map that takes a sequence of indexes and a sequence
# of values, and returns tuples
gen = (random.randint(-10, 10) for i in range(25))
list(map(lambda x, y: (x, y), range(25), gen))

[(0, -1),
 (1, 2),
 (2, 10),
 (3, -9),
 (4, 0),
 (5, -4),
 (6, -9),
 (7, 6),
 (8, -9),
 (9, -2),
 (10, 7),
 (11, -8),
 (12, -8),
 (13, -5),
 (14, -5),
 (15, 2),
 (16, 10),
 (17, 3),
 (18, 4),
 (19, -6),
 (20, 6),
 (21, -5),
 (22, -2),
 (23, 10),
 (24, -3)]

In [192]:
# or equivalently, the zip() function can do this as well
gen = (random.randint(-10, 10) for i in range(25))
list(zip(range(25), gen))

[(0, 4),
 (1, 9),
 (2, 4),
 (3, -7),
 (4, -1),
 (5, -1),
 (6, -4),
 (7, 7),
 (8, 5),
 (9, 5),
 (10, -9),
 (11, 2),
 (12, -6),
 (13, -7),
 (14, 0),
 (15, 9),
 (16, 3),
 (17, -3),
 (18, 7),
 (19, 2),
 (20, 2),
 (21, 6),
 (22, 9),
 (23, -4),
 (24, 8)]

There are many more built-in functions and libraries that support
functional programming approaches in Python.  I recommend looking up

- [itertools](https://docs.python.org/3/library/itertools.html#module-itertools)
- [operator](https://docs.python.org/3/library/operator.html#module-operator)
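
As a brief taste of these two modules, the sketch below uses a few of their
functions on small made-up inputs.  It is only meant as a starting point for
exploring the documentation linked above.

```python
from itertools import accumulate, chain, islice
from operator import itemgetter, mul

# chain() joins several iterables into one stream
joined = list(chain([1, 2], (3, 4), range(5, 7)))
print(joined)

# islice() takes a slice of any iterable, including lazy ones
first_three_squares = list(islice((n * n for n in range(1000)), 3))
print(first_three_squares)

# accumulate() with operator.mul gives running products
running_products = list(accumulate([1, 2, 3, 4], mul))
print(running_products)

# itemgetter() builds a function that picks out a field, handy as a sort key
pairs = [('b', 2), ('a', 3), ('c', 1)]
by_count = sorted(pairs, key=itemgetter(1))
print(by_count)
```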