# [AHA! Activity Health Analytics](http://casas.wsu.edu/)
[Center for Advanced Studies of Adaptive Systems (CASAS)](http://casas.wsu.edu/)

[Washington State University](https://wsu.edu)
# Appendix B Data Structures

## Learner Objectives
At the conclusion of this lesson, participants should have an understanding of:
* Working with commonly used built-in Python data structures
    * Strings
    * Lists
    * Tuples
    * Dictionaries
* Object aliasing
* Passing arguments into programs via command line arguments
* Defining classes
* Declaring objects to instantiate classes
* Implementing basic object functionality
* Implementing class methods

## Acknowledgments
Content used in this lesson is based upon information in the following sources:
* None to report

## Review of Strings
Recall a string is a *sequence of characters*. In Python, we can have sequences of items other than characters. For example, we can have sequences of:
* Numbers
    * Integers
    * Floats
* Objects
    * Strings
    * Files
    * Turtles
    * Our own objects we define ourselves (to be learned later, stay tuned!)

## Lists
A list is a *sequence of items*. In a string, the items are characters. In a list, they can be any type. Items in a list are also called *elements*.

We declare a sequence of items as a list with hard brackets: `[<comma separated list items>]`

In [1]:
list_ints = [0, 1, 10, 20]
print(list_ints)

list_floats = [0.2, 0.4, 0.6, 1.0]
print(list_floats)

# types can be mixed in a list
list_numbers = [0, 0.0, 1, 1.0, -2]
print(list_numbers)

list_strings = ["cat", "dog", "bird"]
print(list_strings)

[0, 1, 10, 20]
[0.2, 0.4, 0.6, 1.0]
[0, 0.0, 1, 1.0, -2]
['cat', 'dog', 'bird']


Note: the data types in a list need not all be the same.

### List Indexing
Just like with strings, list indices are 0-based. We can index into a list to access a list item just like how we indexed into a string to get an individual character:

In [4]:
print(list_ints[0])

0


### List Length
We can also use the `len()` function to determine the number of items in a list:

In [5]:
print(len(list_strings))
print(list_strings[len(list_strings) - 1])

3
bird


### The Empty List
Just like how we can have an empty string (`""`), a string with no characters, we can have an empty list (`[]`). An empty list has no items.

In [10]:
empty_list = []
print(len(empty_list))

0


### Nested Lists
We can have lists of lists!

In [4]:
nested_list = [[0, 1], [2], [3], [4, 5], []]
print(nested_list)

[[0, 1], [2], [3], [4, 5], []]


Note: the sub-lists can be of unequal lengths.

Now, consider the following nested list:

`matrix = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]`

Logically, `matrix` looks like the following:

|Column Index:||||
|-|-|-|-|
||**0**|**1**|**2**|
|**Row Index**||||
|**0:**|1|2|3|
|**1:**|4|5|6|
|**2:**|7|8|9|

To access an item in a 2-dimensional nested list, we index into the list twice, by row first and then by column: `<nested list variable>[row_index][column_index]`. For example:

In [6]:
matrix = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
print(matrix)
# the first element in the first list
print(matrix[0][0])
# the last element in the last list
print(matrix[2][2])
# the middle element in the last list (8)
print(matrix[2][1])

[[1, 2, 3], [4, 5, 6], [7, 8, 9]]
1
9
8
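
Because indexing goes by row first and then by column, a pair of nested loops can visit every item in a 2-dimensional nested list. The following is a minimal sketch; the names `row_index`, `column_index`, and `visited` are chosen here for illustration only:

```python
matrix = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]

visited = []
# the outer loop walks the rows, the inner loop walks the columns of each row
for row_index in range(len(matrix)):
    for column_index in range(len(matrix[row_index])):
        visited.append(matrix[row_index][column_index])

print(visited)
```

Using `len(matrix[row_index])` for the inner loop means the same code also works when the sub-lists have unequal lengths.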


### Lists are Mutable
Unlike strings, we can change the items in a list:

In [8]:
buildings = ["Sloan", "EME", "Dana", "ETRL"]
print(buildings)

# modify the list
buildings[2] = "Carpenter"
print(buildings)

['Sloan', 'EME', 'Dana', 'ETRL']
['Sloan', 'EME', 'Carpenter', 'ETRL']


Note: We still cannot change a string. Strings are immutable!
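
To see the difference, the sketch below attempts the same kind of item assignment on a string. The `try`/`except` is used here only so the example keeps running after the error; the variable names are illustrative:

```python
building = "Dana"

try:
    # strings do not support item assignment
    building[0] = "L"
except TypeError as error:
    print("TypeError:", error)

# instead of changing the string, build a new one from pieces of the old one
new_building = "L" + building[1:]
print(building)
print(new_building)
```

The original string is untouched; `new_building` refers to a brand new string object.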

## Looping Through List Items
Just like with strings, we can use the `in` operator or indices to iterate through items in a list:

In [6]:
candies = ["twix", "reeses", "oreos", "snickers"]

for candy in candies:
    print(candy)
    
i = 0
while i < len(candies):
    print(candies[i])
    i += 1
    
i = 0
for i in range(len(candies)):
    print(candies[i])

twix
reeses
oreos
snickers
twix
reeses
oreos
snickers
twix
reeses
oreos
snickers


## List Operators
### List Concatenation
Just like with strings, we can use the concatenation `+` operator to add lists together:

In [23]:
candies = ["twix", "reeses", "oreos", "peach rings"]

print(candies)
candies += ["m&ms", "starburst"]
print(candies)

['twix', 'reeses', 'oreos', 'peach rings']
['twix', 'reeses', 'oreos', 'peach rings', 'm&ms', 'starburst']


### List Repetition
Just like with strings, we can repeat items in a list with the repetition `*` operator:

In [24]:
bag_o_candies = 5 * ["twix"]
print(bag_o_candies)

bag_o_candies += 3 * ["peach rings"]
print(bag_o_candies)

['twix', 'twix', 'twix', 'twix', 'twix']
['twix', 'twix', 'twix', 'twix', 'twix', 'peach rings', 'peach rings', 'peach rings']


### List Slicing
Just like with strings, we can use the slice operator `:` with lists:

In [8]:
print(candies[1:3])
# returns a copy
print(candies[:])

['reeses', 'oreos']
['twix', 'reeses', 'oreos', 'snickers']
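
A full slice produces a new list rather than a second name for the same list. One way to check this, sketched below with an illustrative variable `candies_copy`, is to change the copy and confirm the original is unaffected:

```python
candies = ["twix", "reeses", "oreos", "snickers"]

# a full slice builds a new list containing the same items
candies_copy = candies[:]
candies_copy[0] = "skittles"

print(candies)
print(candies_copy)
```

Only `candies_copy` shows the new first item, because the two variables refer to different list objects.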


However, since lists are mutable, we can now change multiple items in a list at a time using slices:

In [22]:
candies = ["twix", "reeses", "oreos", "peach rings"]
print(candies)
candies[3:] = ["butterfinger", "heath", "swedish fish"]
print(candies)
candies[0:2] = ["carmello", "airheads"]
print(candies)

['twix', 'reeses', 'oreos', 'peach rings']
['twix', 'reeses', 'oreos', 'butterfinger', 'heath', 'swedish fish']
['carmello', 'airheads', 'oreos', 'butterfinger', 'heath', 'swedish fish']


## List Methods
Just like with strings, lists are objects that have methods we can utilize. 

### `append()`
For example, since lists are mutable, there is an `append(<new item>)` method to add an item to the end of a list:

In [3]:
cities = ["Pullman", "Spokane"]
print(cities)

# adds the string as an item
cities.append("Seattle")
print(cities)

# adds another string as an item
cities.append("Moscow")
print(cities)

['Pullman', 'Spokane']
['Pullman', 'Spokane', 'Seattle']
['Pullman', 'Spokane', 'Seattle', 'Moscow']


As review, how could we achieve the same functionality as `append()` without using `append()`?

In [None]:
cities = ["Pullman", "Spokane"]
print(cities)

# concatenates a one-item list, adding the string as an item
cities += ["Seattle"]
print(cities)

### `extend()`
`extend()` is similar to `append()`; however, `extend()` takes a list as an argument and adds each item to the list:

In [4]:
cities = ["Pullman", "Spokane"]
print(cities)

# adds each string in the list as an item
cities.extend(["Seattle", "Couer d'Alene"])
print(cities)

['Pullman', 'Spokane']
['Pullman', 'Spokane', 'Seattle', "Couer d'Alene"]


What would happen if we used `append()` instead of `extend()` in the above code?

In [5]:
cities = ["Pullman", "Spokane"]
print(cities)
cities.append(["Seattle", "Couer d'Alene"])
print(cities)

['Pullman', 'Spokane']
['Pullman', 'Spokane', ['Seattle', "Couer d'Alene"]]


`cities` becomes a nested list!

### `sort()`
Many applications require lists of items to be sorted. In CptS121, you will learn how to write your own sorting algorithms. For now, we will use the `sort()` list method:

In [6]:
cities = ["Pullman", "Spokane", "Seattle", "Couer d'Alene"]
print(cities)

# ascending order
cities.sort()
print(cities)

['Pullman', 'Spokane', 'Seattle', "Couer d'Alene"]
["Couer d'Alene", 'Pullman', 'Seattle', 'Spokane']


How would you sort a list in descending order? Try using `help(cities.sort)` to find out:

In [7]:
help(cities.sort)

Help on built-in function sort:

sort(...) method of builtins.list instance
    L.sort(key=None, reverse=False) -> None -- stable sort *IN PLACE*



In [8]:
print(cities)
cities.sort(reverse=True)
print(cities)

["Couer d'Alene", 'Pullman', 'Seattle', 'Spokane']
['Spokane', 'Seattle', 'Pullman', "Couer d'Alene"]
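
The help text shows that `sort()` works *in place* and returns `None`, so assigning its result to a variable is a common mistake. The sketch below illustrates this, and contrasts it with the built-in `sorted()` function (not otherwise covered in this lesson), which leaves the original list alone and returns a new sorted list:

```python
cities = ["Pullman", "Spokane", "Seattle"]

# sort() rearranges the list itself and returns None
result = cities.sort()
print(result)
print(cities)

numbers = [3, 1, 2]
# sorted() returns a new list; numbers keeps its original order
ordered = sorted(numbers)
print(numbers)
print(ordered)
```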


## Deleting Items in a List
Since lists are mutable, we can delete items in a list. 

### Single Item Deletes
We have two list methods to delete a *single* item in a list:
1. When you know the *index* of the item to delete
    * `pop(<index>)`
1. When you know the *value* of the item to delete
    * `remove(<item>)`

In [9]:
cities = ["Pullman", "Spokane", "Seattle", "Couer d'Alene"]

# pop returns the item removed
city = cities.pop(2)
print(city)
print(cities)

# remove does not return the item removed
cities.remove("Spokane")
print(cities)

Seattle
['Pullman', 'Spokane', "Couer d'Alene"]
['Pullman', "Couer d'Alene"]


### `del` Keyword and Multiple Item Deletes
Alternatively, we can delete an object using the `del` reserved keyword:

In [10]:
cities = ["Pullman", "Spokane", "Seattle", "Couer d'Alene"]
print(cities)

# del is not a function
del cities[1]
print(cities)

['Pullman', 'Spokane', 'Seattle', "Couer d'Alene"]
['Pullman', 'Seattle', "Couer d'Alene"]


We may want to delete multiple items at a time. We can do this with a slice and `del`:

In [11]:
cities = ["Pullman", "Spokane", "Seattle", "Couer d'Alene"]
print(cities)

del cities[0:3]
print(cities)

['Pullman', 'Spokane', 'Seattle', "Couer d'Alene"]
["Couer d'Alene"]


### Relationship Between Strings and Lists
A list of single character strings is not a string:

In [12]:
my_list = ["c", "p", "t", "s", "1", "1", "1"]
print("%s" %(my_list))

['c', 'p', 't', 's', '1', '1', '1']


### `join()` (string method)
However, we can turn a list of strings into a string with the `join()` string method. We need to specify a "delimiter" string to use to concatenate the individual strings in a list into a single string:

In [1]:
my_list = ["p", "y", "t", "h", "o", "n", "!"]
delimiter = '' # empty string
my_string = delimiter.join(my_list)
print("%s" %(my_string))

delimiter = ':)'
my_string = delimiter.join(my_list)
print("%s" %(my_string))

python!
p:)y:)t:)h:)o:)n:)!


### `list()` (function)
To convert the string back into a list, we can type cast the string into a list with `list()`:

In [3]:
my_string = "python!"
my_list = list(my_string)
print(my_list)

['p', 'y', 't', 'h', 'o', 'n', '!']


### `split()` (string method)
`split(<string delimiter>)` breaks a string into pieces at each `<string delimiter>`. The pieces are returned as a list: 

In [15]:
sentence = "hello how are you"
pieces = sentence.split(" ")
print(pieces)

['hello', 'how', 'are', 'you']
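
`split()` and `join()` undo each other when they use the same delimiter. A minimal sketch of the round trip, with the illustrative name `rebuilt`:

```python
sentence = "hello how are you"

# string -> list of pieces
pieces = sentence.split(" ")
# list of pieces -> string, using the same delimiter
rebuilt = " ".join(pieces)

print(pieces)
print(rebuilt)
print(rebuilt == sentence)
```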


## Aliasing
When we declare a list variable, as in `list1 = [0, 1, 2, 3]`, a list *object* is created. We say the variable `list1` is a *reference* to the list object `[0, 1, 2, 3]`. In memory, this looks like the following:
![](https://raw.githubusercontent.com/gsprint23/aha/master/lessons/figures/reference_example.png)

If we declare another list variable, `list2 = [0, 1, 2, 3]`, `list2` refers to a *different* list object, even though both objects that `list1` and `list2` refer to contain the same items:
![](https://raw.githubusercontent.com/gsprint23/aha/master/lessons/figures/references_multiple_example.png)

We can test if `list1` and `list2` refer to lists that contain the same elements:

In [2]:
list1 = [0, 1, 2, 3]
list2 = [0, 1, 2, 3]
print(list1 == list2)

True


To test if `list1` and `list2` *refer* to the same list object, we can use the Python reserved keyword, `is`. `is` tests whether two variables refer to the same object: 

In [3]:
list1 = [0, 1, 2, 3]
list2 = [0, 1, 2, 3]
print(list1 is list2)

False


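Note: Python can reuse immutable objects. Since strings are immutable, the interpreter is free to create only one object for identical string literals, as it does in the following code. This is an implementation detail, so use `==`, not `is`, when comparing strings: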

In [4]:
string1 = "hello"
string2 = "hello"
print(string1 == string2)
print(string1 is string2)

True
True


In the above code, both `string1` and `string2` refer to the same string object. This phenomenon is called *aliasing*. 

Let's return to our list example and see aliasing at work. 

If instead of assigning `list2` to a new list object, we assign `list2` to `list1`: `list2 = list1`, `list2` refers to the same object as `list1`.
![](https://raw.githubusercontent.com/gsprint23/aha/master/lessons/figures/alias_example.png)

We now say the object is *aliased*, because it has more than one reference, or alias.

If the aliased object is mutable, either reference can modify the object:

In [5]:
# same object aliased by list1 and list2
list1 = [0, 1, 2, 3]
list2 = list1
print(list1)
print(list2)
list2[2] = 100
print(list1)
print(list2)
print("\n")

# compared to creating two separate objects list1 and list2
list1 = [0, 1, 2, 3]
list2 = [0, 1, 2, 3]
print(list1)
print(list2)
list2[2] = 100
print(list1)
print(list2)

[0, 1, 2, 3]
[0, 1, 2, 3]
[0, 1, 100, 3]
[0, 1, 100, 3]


[0, 1, 2, 3]
[0, 1, 2, 3]
[0, 1, 2, 3]
[0, 1, 100, 3]


Aliasing is important to keep in mind, especially when passing lists as arguments.

## List Arguments
We can pass lists into functions as arguments:

In [None]:
def pretty_print_list(list_to_print):
    '''
    Print each item in list_to_print on one line, separated by spaces.
    '''
    for value in list_to_print:
        print(value, end=" ")

numbers = [0.0, 0.2, 0.4]
pretty_print_list(numbers)

When a list is passed as an argument to a function, the function parameter variable is a *reference* to the list, making the list *aliased*. This means that if we modify a list in our function, the change to the object persists and the calling code will see the change.

In the example above, `numbers` and `list_to_print` are aliases to the list object `[0.0, 0.2, 0.4]`. This means `pretty_print_list()` could use `list_to_print` to modify the object.

Let's write a new function, `add_one()`, that adds one to each value in a list:

In [None]:
def add_one(list_arg):
    '''
    Add one to each value in list_arg. The caller's list is modified in place.
    '''
    for i in range(len(list_arg)):
        list_arg[i] += 1

numbers = [0.0, 0.2, 0.4]
print(numbers)
add_one(numbers)
print(numbers)
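
Because `add_one()` works through an alias, the caller's list is changed. When the caller's list should stay as it was, one option is to build and return a new list instead. The following is a minimal sketch; `add_one_copy()` is a hypothetical name used only for this illustration:

```python
def add_one_copy(list_arg):
    '''
    Return a new list with one added to each value. list_arg is not modified.
    '''
    result = []
    for value in list_arg:
        result.append(value + 1)
    return result

numbers = [0, 1, 2]
bumped = add_one_copy(numbers)
print(numbers)
print(bumped)
```

Here `numbers` still holds its original values after the call, while `bumped` refers to the new list.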

## Returning Lists
We can write functions that return lists. Consider a function that returns a list of numbers from `start_index` up to, but not including, `end_index`:

In [None]:
def create_sequence(start_index, end_index):
    '''
    Return a list of the integers from start_index up to, but not including, end_index.
    '''
    sequence = []
    
    for i in range(start_index, end_index):
        sequence.append(i)
    return sequence

first_ten_nums = create_sequence(0, 10)
print(first_ten_nums)

## Command Line Arguments
We can pass arguments into our Python programs. The arguments will be stored in a list, referenced by `sys.argv`. Note: we will have to `import sys` to get access to `sys.argv`.

The first argument is always the name of the script, and is counted in the total number of command line arguments:

In [1]:
import sys

print(sys.argv)
print(len(sys.argv))

['C:\\Anaconda3\\lib\\site-packages\\ipykernel\\__main__.py', '-f', 'C:\\Users\\gsprint\\AppData\\Roaming\\jupyter\\runtime\\kernel-1e668b0e-6bd8-4985-b65d-41430143d283.json']
3


Note: to specify command line arguments in Spyder IDE, go to Run->Configure. In the dialog, under "General settings", check the "Command line options" box and enter the arguments in the text box:
<img src="https://raw.githubusercontent.com/gsprint23/aha/master/lessons/figures/spyder_cmd_line_args.png" width="300">
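
As a sketch of how a program might use its arguments, the function below adds up numeric command line arguments. The function name `sum_arguments()` and the script name `add.py` are assumptions made for this example. The list is passed in as a parameter so the function can be tried with a hand-built list as well as with `sys.argv`:

```python
import sys

def sum_arguments(argv):
    '''
    Add up the numeric command line arguments, skipping the script name.
    '''
    total = 0.0
    # argv[0] is the script name, so start at index 1
    for argument in argv[1:]:
        # arguments always arrive as strings, so convert before adding
        total += float(argument)
    return total

# simulates running: python add.py 1.5 2 3
print(sum_arguments(["add.py", "1.5", "2", "3"]))

# in a real script, pass in sys.argv instead:
# print(sum_arguments(sys.argv))
```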

## Tuples
Tuples are like lists, but immutable. They are declared as comma separated items, with or without parentheses:

In [6]:
my_tuple = "x", "y", "z"
print(my_tuple)
print(type(my_tuple))

# need a comma after a single element initialization
my_tuple2 = (1, )
print(my_tuple2)

# without the trailing comma, the parentheses only group: this is a string
not_a_tuple = ("a")
print(not_a_tuple)
print(type(not_a_tuple))

# creating an empty tuple
empty_tuple = tuple()
print(empty_tuple)
print(type(empty_tuple))

('x', 'y', 'z')
<class 'tuple'>
(1,)
a
<class 'str'>
()
<class 'tuple'>


Tuple indexing and slicing works the same as for lists:

In [7]:
my_tuple = ("x", "y", "z")
print(my_tuple[1])
print(my_tuple[0:2])

y
('x', 'y')


HOWEVER, tuples are immutable, so you cannot modify them. The following code demonstrates the immutability of tuples:

In [1]:
my_tuple = ("x", "y", "z")
# crashes! tuples are immutable, you cannot change them
my_tuple[2] = "a"

TypeError: 'tuple' object does not support item assignment

## Key-Value Pairs
Consider the following set of items:
* Your student ID number
* Your checking account number
* The VIN number on your car
* Your social security number

What do all of the above items have in common? They are all *unique* identifiers for something. For example, there may be several students named "John Smith" at WSU. How is the university to distinguish academic records for multiple John Smiths? They assign a unique *key* to identify each individual student:

|ID Number|Last name|First name|
|-|-|-|
|28905|Smith|Jane|
|19485|Smith|John|
|28450|Smith|John|
|25543|Smith|John|
|17834|Smith|Justin|

For the other examples, your checking account number is a key that uniquely identifies your account, the VIN is a key that uniquely identifies your car, and your SSN is a key that uniquely identifies you for government purposes.

Keys are useful because they *map* to values. In the example above, a student ID number of 28905 maps to the academic records of Jane Smith at WSU. The academic record of Jane Smith is called the *value* that the *key* (ID number) maps to. Together, the ID number (28905) and the record (Jane Smith's academic record) form a *key-value pair*.

Keys can be represented as a list of unique values (no duplicates). Values can be represented as a list as well (can have duplicates). A single data structure that combines key lists and value lists is called a *dictionary*.

## Dictionaries
A *dictionary is a list with keys as indices*. Keys can be integers, strings, file objects, etc. Keys cannot be lists. To declare a dictionary, we use the curly braces `{ }`:

In [8]:
# declares an empty dictionary
my_dict = {}
print(my_dict)
# can also use dict()
my_second_dict = dict()
print(my_second_dict)

{}
{}


We can initialize a dictionary with values using comma separated `key:value` pairs:

In [9]:
state_capitals = {'washington': 'olympia', 'idaho': 'boise', 'oregon': 'salem'}
print(state_capitals)

{'idaho': 'boise', 'washington': 'olympia', 'oregon': 'salem'}


We can create a dictionary from a list of tuples, where each tuple in the list is a key-value pair:

In [10]:
# roman numerals
key_values = [("I", 1), ("V", 5), ("X", 10), ("L", 50)]
roman_numerals = dict(key_values)
print(roman_numerals)

{'V': 5, 'L': 50, 'I': 1, 'X': 10}


We can also convert a dictionary back to a list of tuples with the dictionary method `items()` and the built-in function `list()`:

In [11]:
list_of_tuples = list(roman_numerals.items())
print(list_of_tuples)

[('V', 5), ('L', 50), ('I', 1), ('X', 10)]


### Compatible Dictionary Data Types
#### Keys
Dictionary keys can be integers, strings, files, tuples, etc. Lists cannot be keys because they are mutable.
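
The sketch below tries both a tuple key and a list key. The dictionary name `locations` is illustrative, and the `try`/`except` is used only so the example keeps running after the error:

```python
locations = {}

# a tuple is immutable, so it can be used as a key
locations[(0, 0)] = "origin"
print(locations)

try:
    # a list is mutable, so Python rejects it as a key
    locations[[1, 1]] = "diagonal"
except TypeError as error:
    print("TypeError:", error)
```

After the failed assignment, the dictionary still contains only the tuple key.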

#### Values
Values can be any type. For example, we can have string keys and list values:

In [12]:
fruit_colors = {'kiwi': ['brown', 'green'], 'banana': ['yellow'], 'watermelon': ['green', 'red']}
print(fruit_colors)

{'watermelon': ['green', 'red'], 'banana': ['yellow'], 'kiwi': ['brown', 'green']}


### Dictionary Indexing
We can access an item via a key using hard brackets `[ ]` (similar to indexing into a list):

In [13]:
state_capitals = {'washington': 'olympia', 'idaho': 'boise', 'oregon': 'salem'}
print("The capital of idaho is %s" %(state_capitals['idaho']))

The capital of idaho is boise


### Adding Key-Value Pairs
Since dictionaries are *mutable*, we can add key-value pairs to the dictionary using hard brackets `[ ]`:

In [14]:
state_capitals = {'washington': 'olympia', 'idaho': 'boise', 'oregon': 'salem'}
print(state_capitals)

state_capitals['montana'] = 'helena'
print(state_capitals)

{'idaho': 'boise', 'washington': 'olympia', 'oregon': 'salem'}
{'idaho': 'boise', 'washington': 'olympia', 'montana': 'helena', 'oregon': 'salem'}


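Note: keys in a dictionary are not sorted. The output above comes from an older Python version that did not guarantee any particular order; since Python 3.7, dictionaries preserve the order in which keys were inserted.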

### Dictionary Length with `len()`
We can still determine the number of items (key-value pairs) in a dictionary with `len()`:

In [15]:
state_capitals = {'washington': 'olympia', 'idaho': 'boise', 'oregon': 'salem'}
print(len(state_capitals))

3


### Existence of a Key
We can also test if a key is a valid key in the dictionary with the `in` keyword:

In [16]:
state_capitals = {'washington': 'olympia', 'idaho': 'boise', 'oregon': 'salem'}

print('california' in state_capitals)
print('idaho' in state_capitals)
print('olympia' in state_capitals)

False
True
False
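
The last result is `False` because `in` tests keys, not values. The sketch below shows two related uses, assuming a shortened version of the dictionary: guarding a lookup so a missing key does not cause an error, and searching the values with the `values()` dictionary method:

```python
state_capitals = {'washington': 'olympia', 'idaho': 'boise'}

state = 'california'
# check for the key first so the lookup cannot fail
if state in state_capitals:
    print(state_capitals[state])
else:
    print("No capital stored for %s" %(state))

# to test for a value instead of a key, search the values
print('olympia' in state_capitals.values())
```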


## Looping through a Dictionary
We can traverse a dictionary easily with a `for` loop that walks through each key in the dictionary:

In [17]:
sides = {'square': 4, 'triangle': 3, 'pentagon': 5, 'rectangle': 4}

for side in sides:
    print(side, sides[side], sep= ": ")

triangle: 3
rectangle: 4
square: 4
pentagon: 5
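
Since the `items()` method provides each key-value pair as a tuple, a loop can also receive the key and the value together rather than indexing with the key. A minimal sketch, with the illustrative names `shape`, `count`, and `total_sides`:

```python
sides = {'square': 4, 'triangle': 3, 'pentagon': 5, 'rectangle': 4}

total_sides = 0
# each pair from items() is unpacked into a key and its value
for shape, count in sides.items():
    print(shape, count, sep=": ")
    total_sides += count

print(total_sides)
```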


## Example Problem: Letter Frequencies
Suppose we want to keep track of the frequency of letters in a word. For example, the word "hello" has 4 distinct letters with the following frequencies:
* h: 1
* e: 1
* l: 2
* o: 1

Let's write a program to prompt the user to enter a word. Our program will tell the user the frequency of each letter in the word. We could solve this problem with either a list or a dictionary:
* List solution
    1. Create a list with 26 zeros
    1. Write a function to convert a letter into an integer in the range [0-25] to index into the list. We can do this with the `ord(<character>)` function and ASCII codes...
    1. Walk through the word and increment the corresponding list position for each letter
    1. Convert the index of non-zero list entries back to characters using `chr(<integer>)` to print out the histogram results
* Dictionary solution
    1. Create an empty dictionary
    1. Walk through the word and add the letter to the dictionary with a count of one if the letter is not already a key, increment otherwise.
    
The dictionary solution is better suited to this problem because we do not have to allocate space for all letters ahead of time and we do not have to perform a character to integer conversion to index into the data structure.

In [None]:
def compute_letter_frequencies(word):
    '''
    Return a dictionary mapping each letter in word to the number of times it appears.
    '''
    histogram = {}
    
    for letter in word:
        if letter in histogram:
            histogram[letter] += 1
        else:
            histogram[letter] = 1
    return histogram

print(compute_letter_frequencies("hello"))
print(compute_letter_frequencies("mississippi"))
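
The `if`/`else` in the dictionary solution can also be folded into a single line with the dictionary method `get()`, which returns a default value when the key is missing. The following variation is a sketch of that alternative; the function name `compute_letter_frequencies_get()` is made up for this example:

```python
def compute_letter_frequencies_get(word):
    '''
    Return a dictionary mapping each letter in word to its count.
    '''
    histogram = {}
    for letter in word:
        # get() returns the current count, or 0 if the letter is not yet a key
        histogram[letter] = histogram.get(letter, 0) + 1
    return histogram

print(compute_letter_frequencies_get("hello"))
print(compute_letter_frequencies_get("mississippi"))
```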

Compared to the list solution:

In [None]:
def letter_to_index(letter):
    '''
    
    '''
    ascii_val = ord(letter)
    index = ascii_val - ord('a')
    return index

def index_to_letter(index):
    '''
    
    '''
    ascii_val = index + ord('a')
    letter = chr(ascii_val)
    return letter
    
def compute_letter_frequencies_list(word):
    '''
    
    '''
    histogram = [0] * 26
    
    word = word.lower()  # strings are immutable, so keep the lowercased copy
    for letter in word:
        index = letter_to_index(letter)
        histogram[index] += 1
    return histogram

def pretty_print(histogram):
    '''
    
    '''
    for i in range(len(histogram)):
        if histogram[i] != 0:
            letter = index_to_letter(i)
            print("%s: %d" %(letter, histogram[i]), end=" ")
    print("")

histogram = compute_letter_frequencies_list("hello")
pretty_print(histogram)

Note: We have now seen lists of tuples, lists of lists, dictionaries of lists, etc. In general, we can have sequences of sequences. The types of sequences that can be nested and the number of nesting levels are up to you, the programmer!

## Objects
We have already been exposed to the notion of an *object*. For example, when we open a file for reading or writing, the `open()` function returns a *file object*. 

In [1]:
# infile is a file object
infile = open(r"files\transactions.txt", "r")
print(infile.readlines())
infile.close()

['13.42\n', '27.19\n', '9.98\n', '48.56\n', '33.71']


`infile` is a file object that has associated functions, called *methods*. We have already seen this notion of *methods* when we learned about string and list methods (think `my_string.upper()`, etc.). In the above example, `readlines()` and `close()` are methods belonging to file objects.

An *object* is a powerful programming concept that couples data storage (i.e. variables) with associated data operations and functionality (i.e. methods).

## Classes
We know of several Python data types:
* `int`
* `float`
* `str`
* `file`
* etc.

Today, we are going to learn how to define our own types! To do so, we will define *classes*. A class is a collection of *attributes* and *behaviors* that completely describes something. More on *attributes* and *behaviors* to come.

Programmatically, a class is a type definition, and an object is a variable of that type. We also say an object is an *instance* of a class.

Imagine we are writing a program to manage the status of books at a library or bookstore. For this program, it would be useful to have a class called `Book` where we could store information (think variables, called *attributes* when the variables belong to objects) and operations (think functions, called *methods* when the functions belong to objects) related to a book. Using the reserved keyword `class` to define a `Book` class, we can define this type:

In [1]:
class Book:
    '''
    
    '''

We have a definition for a `Book`! This class is not very powerful (yet). Let's see how we can make an instance of this class, called an object:

In [2]:
# my_book is a Book object, i.e. it is an instance of the Book class
my_book = Book()
print(type(my_book))

<class '__main__.Book'>


Now that we have a book class, let's add variables to the class to represent information about books, such as `title` (string), `author` (string), `isbn` (string), and `checked_out` (Boolean). We call variables associated with an object *attributes* to specify they are variables belonging to a class. We can declare and access the attributes of an object with the *member selection* (dot) operator:

In [4]:
my_book = Book()
my_book.title = "The Martian"
my_book.author = "Andy Weir"
my_book.isbn = "978-0-8041-3902-1"
my_book.checked_out = False # it's on the shelf

We have actually seen and used the dot notation to access variables and functions before. Recall accessing pi in the math module (`math.pi`), calling a library function (`math.sqrt(4.0)`), and calling a method of a file object (`infile.close()`) or a string object (`my_string.upper()`).

We can display the attribute values just like other variables:

In [5]:
if my_book.checked_out:
    print("The book \"%s\" is checked out" %(my_book.title))
else: # checked in
    print("The book \"%s\" is available on the shelf" %(my_book.title))

The book "The Martian" is available on the shelf


Objects are mutable! We can change the status of a `Book` object should someone check in or check out a book from the library:

In [6]:
my_book.checked_out = True

Now, let's modify an attribute 2 different ways:
1. Via a function
1. Via a method


## Objects and Functions

Remember when we learned about aliasing? We can pass a reference to an object into a function to create an alias. For example, suppose we have a `Book` object called `hp1`. We can make an alias called `book` for `hp1` if we pass `hp1` into a function with a parameter called `book`:

In [6]:
def display_book(book):
    '''
    
    '''
    print("%s by %s" %(book.title, book.author))

def display_book_status(book):
    '''
    
    '''
    print("%s is checked out: %s" %(book.title, book.checked_out))
    
    
def return_book(book):
    '''
    
    '''
    book.checked_out = False
    
hp1 = Book()
hp1.title = "The Sorcerer's Stone"
hp1.author = "J.K. Rowling"
hp1.isbn = "978-0439708180"
hp1.checked_out = True

display_book(hp1)
display_book_status(hp1)
return_book(hp1)
display_book_status(hp1)

The Sorcerer's Stone by J.K. Rowling
The Sorcerer's Stone is checked out: True
The Sorcerer's Stone is checked out: False


## Objects and Methods
If we place a function *inside* a class definition, the function is a *method* associated with an instance of the class.

In [8]:
class Book:
    '''
    
    '''
    # simply indent the method definition to associate it with the class
    # self is a reference to the calling object
    def display_book(self):
        '''

        '''
        print("%s by %s" %(self.title, self.author))
    
    def display_book_status(self):
        '''

        '''
        print("%s is checked out: %s" %(self.title, self.checked_out))
    
    def return_book(self):
        '''

        '''
        self.checked_out = False

We do have to change one aspect of our function definitions to do this. When we call a method of a class, we do so in the form: `<object>.<method>()`. The method needs a reference to the object in order to access that particular instance's attributes. In Python, the `self` reference provides access to the *current* object. `self` is the first parameter of every method of every class, and it is *implicitly* passed into the method. This means Python passes it in for us; we do not explicitly pass the object reference in as an argument of the method.
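
To see what "implicitly passed" means, the sketch below calls the same method two ways: the usual `<object>.<method>()` form, and a spelled-out form where the object is handed in as `self` by hand. The stripped-down `Book` class here is for illustration only:

```python
class Book:
    '''
    A minimal Book with a single method, for illustration only.
    '''
    def return_book(self):
        self.checked_out = False

hp1 = Book()
hp1.checked_out = True

# the usual form: Python passes hp1 in as self for us
hp1.return_book()
print(hp1.checked_out)

hp1.checked_out = True
# the same call written out explicitly: hp1 is passed as self by hand
Book.return_book(hp1)
print(hp1.checked_out)
```

Both calls have the same effect. The first form is the one used in practice.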

Now, if we have a `Book` object (instance of the `Book` class), we can use the member selection operator to call the `display_book_status()` and `return_book()` methods associated with `Book`s:

In [9]:
hp1 = Book()
hp1.title = "The Sorcerer's Stone"
hp1.author = "J.K. Rowling"
hp1.isbn = "978-0439708180"
hp1.checked_out = True

hp1.display_book()
hp1.display_book_status()
hp1.return_book()
hp1.display_book_status()
print(hp1)

The Sorcerer's Stone by J.K. Rowling
The Sorcerer's Stone is checked out: True
The Sorcerer's Stone is checked out: False
<__main__.Book object at 0x000000163E8D45C0>


## Special Methods

### The `__str__()` Method
The `__str__()` special method is called implicitly when a string representation of the object is required, such as `print(hp1)`. We have already written a method with similar functionality, `display_book()`. We just need to change the method identifier to `__str__()` and return the string instead of printing it, and we can achieve the `print(hp1)` functionality!

In [10]:
class Book:
    '''
    
    '''       
    def __str__(self):
        '''
        
        '''
        return "%s by %s" %(self.title, self.author)
        
hp1 = Book()
hp1.title = "The Sorcerer's Stone"
hp1.author = "J.K. Rowling"
hp1.isbn = "978-0439708180"
hp1.checked_out = True
print(hp1)

The Sorcerer's Stone by J.K. Rowling


Note: We can also explicitly call special methods: `hp1.__str__()`

### The `__init__()` Method
There is a special method identifier, `__init__()` (short for initialize), that is implicitly called by Python every time you instantiate a new object. The double underscores denote that this method is *special* in Python. We can write our own version of the `__init__()` method to specify attribute values at time of instantiation. Here is an example of the `__init__()` method for our `Book` class.

In [11]:
class Book:
    '''
    
    '''
    def __init__(self, book_title, book_author, book_isbn, book_checked_out):
        self.title = book_title
        self.author = book_author
        self.isbn = book_isbn
        self.checked_out = book_checked_out
        

And now we will instantiate a Harry Potter `Book` object:

In [12]:
hp1 = Book("The Sorcerer's Stone", "J.K. Rowling", "978-0439708180", False)

On this instantiation, the `__init__()` method we wrote is implicitly called and the attributes `title`, `author`, `isbn`, and `checked_out` are declared and initialized to the values we passed in as arguments.
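
Each `class Book:` definition in this lesson replaces the previous one, so a `Book` that can be both instantiated with arguments and printed needs both special methods in a single definition. A minimal sketch that combines the `__init__()` and `__str__()` methods shown above:

```python
class Book:
    '''
    A library book with a title, author, ISBN, and checked out status.
    '''
    def __init__(self, book_title, book_author, book_isbn, book_checked_out):
        self.title = book_title
        self.author = book_author
        self.isbn = book_isbn
        self.checked_out = book_checked_out

    def __str__(self):
        return "%s by %s" %(self.title, self.author)

hp1 = Book("The Sorcerer's Stone", "J.K. Rowling", "978-0439708180", False)
# print() implicitly calls __str__()
print(hp1)
```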

## Lists of Objects
Let's put together some of the topics we have learned so far to declare a bookshelf of `Book`s. This will be a list of `Book` objects. We can declare this list just like any other list, and populate it with `Book` objects:

In [4]:
book_shelf = []

hp1 = Book("The Sorcerer's Stone", "J.K. Rowling", "978-0439708180", True)
book_shelf.append(hp1)

hp2 = Book("The Chamber of Secrets", "J.K. Rowling", "978-0439708180", False)
book_shelf.append(hp2)

hp3 = Book("The Prisoner of Azkaban", "J.K. Rowling", "978-0439708180", True)
book_shelf.append(hp3)

hp4 = Book("The Goblet of Fire", "J.K. Rowling", "978-0439708180", True)
book_shelf.append(hp4)

hp5 = Book("The Order of the Phoenix", "J.K. Rowling", "978-0439708180", False)
book_shelf.append(hp5)

hp6 = Book("The Half Blood Prince", "J.K. Rowling", "978-0439708180", False)
book_shelf.append(hp6)

hp7 = Book("The Deathly Hallows", "J.K. Rowling", "978-0439708180", True)
book_shelf.append(hp7)

for book in book_shelf:
    print(book)

The Sorcerer's Stone by J.K. Rowling
The Chamber of Secrets by J.K. Rowling
The Prisoner of Azkaban by J.K. Rowling
The Goblet of Fire by J.K. Rowling
The Order of the Phoenix by J.K. Rowling
The Half Blood Prince by J.K. Rowling
The Deathly Hallows by J.K. Rowling
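
Because each list item is a `Book` object, we can read its attributes inside the loop. The sketch below collects the titles of books that are not checked out. It redefines a minimal `Book` and uses a shorter shelf so it runs on its own; the name `available_titles` is introduced only for this illustration.

```python
class Book:
    '''
    Minimal copy of the lesson's Book class so this sketch is self-contained.
    '''
    def __init__(self, book_title, book_author, book_isbn, book_checked_out):
        self.title = book_title
        self.author = book_author
        self.isbn = book_isbn
        self.checked_out = book_checked_out

    def __str__(self):
        return "%s by %s" %(self.title, self.author)

book_shelf = [
    Book("The Sorcerer's Stone", "J.K. Rowling", "978-0439708180", True),
    Book("The Chamber of Secrets", "J.K. Rowling", "978-0439708180", False),
    Book("The Prisoner of Azkaban", "J.K. Rowling", "978-0439708180", True),
]

# keep only the titles of books that are currently on the shelf
available_titles = []
for book in book_shelf:
    if not book.checked_out:
        available_titles.append(book.title)

print(available_titles)
```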


## Practice Problem
### Part 1
Define a class called `Point`. A `Point` represents a position in 2-dimensional space, defined by an x and a y coordinate (no need to define any methods *yet*).

Instantiate a `Point` object representing the origin (0,0):

In [None]:
class Point:
    '''
    
    '''

origin = Point()
origin.x = 0
origin.y = 0

### Part 2
Re-write your `Point` definition and instantiation of `Point` to make use of an `__init__()` method:

In [None]:
class Point:
    '''
    
    '''
    def __init__(self, x, y):
        '''
        
        '''
        self.x = x
        self.y = y
    
point = Point(1, 4)

### Part 3
Add a method to `Point` called `display_point()` that displays `Point` information in the form: `(x, y)`. Then call `display_point()` to print a `Point` object.

In [None]:
class Point:
    '''
    
    '''
    def __init__(self, x, y):
        '''
        
        '''
        self.x = x
        self.y = y
        
    def display_point(self):
        '''
        
        '''
        print("(%d,%d)" %(self.x, self.y), end="")
    
point = Point(1, 4)
point.display_point()

### Part 4
Convert `display_point()` into the special method `__str__()`, which returns the string instead of printing it. Then print a `Point` object.

In [2]:
class Point:
    '''
    
    '''
    def __init__(self, x, y):
        '''
        
        '''
        self.x = x
        self.y = y
        
    def __str__(self):
        '''
        
        '''
        return "(%d, %d)" %(self.x, self.y)
    
point = Point(1, 4)
print(point)

(1, 4)


### Part 5
Add a predicate method to `Point` called `equals()` that accepts another `Point` object and determines if it has the same `x` and `y` values as the calling object (think `self`). Then call `equals()` to determine if 2 `Point` objects store equivalent data.

In [None]:
class Point:
    '''
    
    '''
    def __init__(self, x, y):
        '''
        
        '''
        self.x = x
        self.y = y
        
    def display_point(self):
        '''
        
        '''
        print("(%d,%d)" %(self.x, self.y), end="")
        
    def equals(self, other_point):
        '''
        
        '''
        if self.x == other_point.x and self.y == other_point.y:
            return True
        return False
    
origin = Point(0, 0)

some_other_point = Point(0, 0)

origin.display_point()
print(" is equal to ", end="")
some_other_point.display_point()
print(": %s" %(origin.equals(some_other_point)))

## Object Oriented Programming
Object oriented programming (OOP) involves designing programs where most of the computation involves operations on objects. Classes are implemented to represent things in the real world and how they interact. While OOP is a vast subject (and sometimes more of an art than a science), we are going to just scratch the surface of how powerful OOP is with the following concepts:
* Operator overloading
* Composition

Other OOP concepts include:

* Abstraction
* Encapsulation
* Inheritance
* Polymorphism
* Among others!

You can read more about OOP concepts in Chapter 18 of the Downey textbook, as well as online and in other textbooks.

### Operator Overloading
What about changing the syntax to compare two `Point` objects for equality from `point1.equals(point2)` to `point1 == point2`? We can achieve such behavior with special methods for defining operator functionality. This is called *operator overloading*. In the equality example, we are going to define the behavior for comparing two `Point` objects with the `==` operator.

#### The `__eq__()` Method
All we have to do is modify our `equals()` method to implement the special method `__eq__()`:

In [13]:
class Point:
    '''
    
    '''
    def __init__(self, x, y):
        '''
        
        '''
        self.x = x
        self.y = y
        
    def __str__(self):
        '''
        
        '''
        return "(%d,%d)" %(self.x, self.y)
        
    def __eq__(self, other_point):
        '''
        
        '''
        if self.x == other_point.x and self.y == other_point.y:
            return True
        return False
    
point1 = Point(1, 4)
point2 = Point(3, -2)
point3 = Point(3, -2)

# different x,y values
print(point1 == point2)
# same x,y values
print(point2 == point3)
# confirm they are different objects 
print(point2 is point3)

False
True
False
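
Defining `__eq__()` affects more than the `==` operator. As a small sketch (the list name `points` and the result variables are chosen only for this illustration), the `in` operator on a list and the `!=` operator in Python 3 both rely on `__eq__()`:

```python
class Point:
    '''
    Trimmed-down Point with only what this sketch needs.
    '''
    def __init__(self, x, y):
        self.x = x
        self.y = y

    def __eq__(self, other_point):
        return self.x == other_point.x and self.y == other_point.y

points = [Point(1, 4), Point(3, -2)]

# "in" compares the left operand against each list item using ==
found = Point(3, -2) in points
missing = Point(0, 0) in points
# != falls back to the opposite of __eq__() when __ne__() is not defined
different = Point(1, 4) != Point(1, 4)

print(found)
print(missing)
print(different)
```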


#### Other Operators to Overload
Try implementing the functionality for other operators:
* `+`: `__add__()`
* `-`: `__sub__()`
* `<`: `__lt__()`
* `>`: `__gt__()`
* Read about more in the [Python documentation](https://docs.python.org/3/reference/datamodel.html#specialnames)
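
As a starting point for the list above, here is one possible sketch of `__sub__()` and `__lt__()`. The meaning of "less than" for points is not defined in this lesson; the version below assumes it means "closer to the origin", which is only one of several reasonable choices.

```python
class Point:
    '''
    Trimmed-down Point with only what this sketch needs.
    '''
    def __init__(self, x, y):
        self.x = x
        self.y = y

    def __str__(self):
        return "(%d,%d)" %(self.x, self.y)

    def __sub__(self, other_point):
        # build a new Point so that neither operand is changed
        return Point(self.x - other_point.x, self.y - other_point.y)

    def __lt__(self, other_point):
        # assumption: compare squared distances from the origin
        self_dist = self.x ** 2 + self.y ** 2
        other_dist = other_point.x ** 2 + other_point.y ** 2
        return self_dist < other_dist

point1 = Point(1, 4)
point2 = Point(3, -2)

difference = point1 - point2
print(difference)
print(point1 < point2)
print(point2 < point1)
```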

## Polymorphism
Suppose we want to overload the `+` add operator. We might want to define two types of functionality for `Point` adds:
1. Adding two `Point` objects (add x + x and y + y): `Point + Point`
1. Adding a numeric value to a single `Point` object (add value to x and y): `Point + 1`

We need to define *multiple behaviors* for the add method. When our functions/methods are able to handle multiple data types, they are called *polymorphic*. From Greek roots, poly means "many" and morphe means "form".

Let's write the `__add__()` method. We will have a parameter called `other`, that we will need to check the type of. If `other` is a `Point`, add the respective `x` and `y` values. Otherwise, add `other` as a numeric to each `x` and `y` of the current object.

In [14]:
class Point:
    '''
    
    '''
    def __init__(self, x, y):
        '''
        
        '''
        self.x = x
        self.y = y
        
    def __str__(self):
        '''
        
        '''
        return "(%d,%d)" %(self.x, self.y)
        
    def __eq__(self, other_point):
        '''
        
        '''
        if self.x == other_point.x and self.y == other_point.y:
            return True
        return False
    
    def __add__(self, other):
        '''
        Return a new Point holding the sum; neither operand is modified.
        '''
        if isinstance(other, Point):
            return Point(self.x + other.x, self.y + other.y)
        else: # not a Point object, for now, assume it is a numeric such as an int or float
            # in the future, we would want to write this code to be more robust
            return Point(self.x + other, self.y + other)
    
point1 = Point(1, 1)
point2 = Point(3, -2)
print("%s" %(point1))
print("%s + %s = %s" %(str(point1), str(point2), str(point1 + point2)))
offset = 10
print("%s + %d = %s" %(str(point1), offset, str(point1 + offset)))

(1,1)
(1,1) + (3,-2) = (4,-1)
(1,1) + 10 = (11,11)
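
The comment in `__add__()` mentions making the numeric branch more robust. One possible approach, offered as a sketch rather than the lesson's own solution, is to check for `int` or `float` explicitly and return the built-in `NotImplemented` value for anything else, which causes Python to raise a `TypeError`. This version builds a new `Point` for the result; the variable `unsupported_rejected` exists only for this illustration.

```python
class Point:
    '''
    Trimmed-down Point with only what this sketch needs.
    '''
    def __init__(self, x, y):
        self.x = x
        self.y = y

    def __str__(self):
        return "(%d,%d)" %(self.x, self.y)

    def __add__(self, other):
        if isinstance(other, Point):
            return Point(self.x + other.x, self.y + other.y)
        if isinstance(other, (int, float)):
            return Point(self.x + other, self.y + other)
        # signal that this operand type is not supported
        return NotImplemented

point = Point(1, 1)
print(point + Point(3, -2))
print(point + 10)

try:
    point + "ten"
    unsupported_rejected = False
except TypeError:
    unsupported_rejected = True
print(unsupported_rejected)
```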


### Composition
Objects can have attributes that are other objects. Let's define a `Circle` class that has 2 attributes:
1. `center`: a `Point` object representing the location of the center of a circle
1. `radius`: a numeric value representing the radius of the circle

In [15]:
class Circle:
    '''
    
    '''
    def __init__(self, x, y, radius):
        '''
        
        '''
        center = Point(x, y)
        self.center = center
        self.radius = radius
        
    def __str__(self):
        '''
        
        '''
        return "Circle with center: %s and radius %.2f" %(self.center, self.radius)
    
circle = Circle(0, 5, 100.0)
print(circle)

Circle with center: (0,5) and radius 100.00


Note: We can think of the relationship between a `Circle` and a `Point` as: "a `Circle` **has a** `Point`". The "has a" relationship is important to distinguish from the "is a" relationship of inheritance... 
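
To show the "has a" relationship in use, the sketch below adds a hypothetical `contains()` method to `Circle` (not part of the lesson's class) that reaches through `self.center` to the `Point`'s `x` and `y` attributes.

```python
import math

class Point:
    '''
    Trimmed-down Point with only what this sketch needs.
    '''
    def __init__(self, x, y):
        self.x = x
        self.y = y

class Circle:
    '''
    A Circle "has a" Point as its center.
    '''
    def __init__(self, x, y, radius):
        self.center = Point(x, y)
        self.radius = radius

    def contains(self, point):
        # hypothetical method: is the point within radius of the center?
        distance = math.hypot(point.x - self.center.x, point.y - self.center.y)
        return distance <= self.radius

circle = Circle(0, 5, 100.0)

# attribute access chains through the composed Point object
print(circle.center.x, circle.center.y)

inside = circle.contains(Point(3, 9))      # 5 units from the center
outside = circle.contains(Point(0, 200))   # 195 units from the center
print(inside)
print(outside)
```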

## Inheritance
We can define classes such that they are "extensions" of existing classes. For example, suppose we have a class called `Animal` that defines certain traits and behaviors that all animals exhibit:
1. A species name (string attribute)
1. An energy level (integer attribute)
1. A play activity (method that subtracts from the energy level)
1. A rest activity (method that adds to the energy level)

For each specific animal we define (`Lion`, `Tiger`, `Bear`, etc.), we don't want to have to implement these common attributes and methods each time. Instead, we could write a class for each animal and state that these classes *inherit* from the `Animal` class, and thus have all the traits and behaviors of `Animal`s. We could then define specific traits and behaviors unique to each animal. For example, a `Lion` might have an attribute called `mane_length` that a `Bear` wouldn't have.

In [16]:
class Animal:
    '''
    
    '''
    def __init__(self, species, energy):
        '''
        
        '''
        self.species = species
        self.energy = energy
        
    def __str__(self):
        '''
        
        '''
        return "%s with energy %d" %(self.species, self.energy)
        
    def play(self, expenditure):
        '''
        
        '''
        self.energy -= expenditure
        
    def rest(self, recovery):
        '''
        
        '''
        self.energy += recovery
        
    
        
class Lion(Animal):
    '''
    
    '''
    def __init__(self, species, energy, mane_length, roar):
        '''
        
        '''
        super().__init__(species, energy)
        self.mane_length = mane_length
        self.roar = roar
        
    def get_roar(self):
        '''
        
        '''
        return self.roar
    
king_lion = Lion("Lion", 100, 24, "GRRRRR")
cowardly_lion = Lion("Lion", 75, 12, "grr")

print(king_lion)
print(king_lion.get_roar())
print(cowardly_lion)
print(cowardly_lion.get_roar())

Lion with energy 100
GRRRRR
Lion with energy 75
grr


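`super()` returns a proxy object that forwards method calls to the parent class (`Animal` in this case). Thus, `super().__init__(species, energy)` invokes the `__init__()` method of `Animal`, which sets the `species` and `energy` attributes on the new `Lion`.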

How cool is it that when we print a `Lion` object, `Animal`'s `__str__()` is implicitly invoked? This happens because `Lion` does not define its own `__str__()`. Note: we could define a specific `__str__()` for `Lion` if we wanted to! Python looks for the method on the more "specific" class first (i.e. the child class, then the parent class).
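
As a sketch of that lookup order, the version of `Lion` below defines its own `__str__()` and reuses `Animal`'s text through `super()`. The exact wording of the returned string is an assumption made for this illustration.

```python
class Animal:
    '''
    Trimmed-down Animal with only what this sketch needs.
    '''
    def __init__(self, species, energy):
        self.species = species
        self.energy = energy

    def __str__(self):
        return "%s with energy %d" %(self.species, self.energy)

    def play(self, expenditure):
        self.energy -= expenditure

class Lion(Animal):
    '''
    A Lion "is an" Animal with a mane and a roar.
    '''
    def __init__(self, species, energy, mane_length, roar):
        super().__init__(species, energy)
        self.mane_length = mane_length
        self.roar = roar

    def __str__(self):
        # start from Animal's text, then add Lion-specific detail
        return "%s and mane length %d" %(super().__str__(), self.mane_length)

king_lion = Lion("Lion", 100, 24, "GRRRRR")
king_lion.play(30)                     # play() is inherited from Animal
print(king_lion)                       # Lion's __str__() is found first
print(isinstance(king_lion, Animal))   # a Lion is also an Animal
```

Here `play()` is found on `Animal` because `Lion` does not define it, while `__str__()` is found on `Lion` and never reaches `Animal` except through the explicit `super()` call.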