# Object Oriented Programming (Classes)

Object Oriented Programming is about how we *organize* our ideas in code.

Programs are made up of two fundamental, conceptual components:
    
  - Data
  - Algorithms to manipulate the data

So to have an expressive and useful programming language, we need ways to both

  - Create new types of data.
  - Create re-usable algorithms to manipulate that data.

Sometimes the algorithms we need to manipulate data are tied closely to the data itself, and in this case we would like to

  - Associate algorithms with specific data structures

### Objectives

  - Give basic examples of objects in Python (lists and dictionaries).
  - Give definitions of object, attribute, and method, along with examples.
  - Create a new datatype using a class.
  - Describe the difference between a class and an object.
  - Describe what `self` is.
  - Add natural behaviour to a class with dunder methods.

### Basic Example: Lists

Python `list`s are a very useful type of data structure, and they have lots of associated algorithms.  Let's take a closer look at lists and how they work.

In [1]:
# Q: Why didn't I call it 'l'?  Why didn't I call it 'list'?
lst = [1, 2, 2, 3, 4, 4, 4]

# Associated algorithm: count
print(lst.count(2))
print(lst.count(3))

print(lst)

2
1
[1, 2, 2, 3, 4, 4, 4]


The `count` function is associated with the `list` data type.

Functions that are associated with a specific data type in this way are called *methods*.  So we would say

> `count` is a method of the data type `list`

Methods are (generally) called using the `.` notation:

```python
data_element.method(additional_arguments)
```
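
As a small illustration of the dot notation (a sketch, not part of the original notebook), a method call on a list can also be written as a call to the function stored on the `list` type, with the data element passed in as the first argument:

```python
lst = [1, 2, 2, 3, 4, 4, 4]

# The usual method-call spelling: data_element.method(arguments)
via_method = lst.count(2)

# The same function, looked up on the type and given the list explicitly.
via_type = list.count(lst, 2)

print(via_method, via_type)
```

The two spellings do the same work; the dot notation is simply the convenient one.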

### Pure vs. Impure Methods

Some methods actually *change* the data they operate on:

In [2]:
print(lst)

lst.append(5)

print(lst)

[1, 2, 2, 3, 4, 4, 4]
[1, 2, 2, 3, 4, 4, 4, 5]


Methods that do **not** change the underlying data (`list.count`) are called **pure methods**; methods that *do* change the underlying data (`list.append`) are called **impure methods**.

Changing the data in place, without giving it a new name (or, at the extreme, copying it first), is called **mutating** the data.  Some data types protect against changing the data in place; they are called **immutable types**.

In [3]:
tup = (1, 2, 3)
tup[2] = 4

TypeError: 'tuple' object does not support item assignment
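
The sketch below (variable names are illustrative) shows why mutation matters: two names that refer to the same list both see an in-place change, while a copy does not.  Since a tuple cannot be changed in place, the usual workaround is to build a new tuple.

```python
original = [1, 2, 3]
alias = original            # a second name for the SAME list object
duplicate = list(original)  # a new, independent list with the same contents

original.append(4)          # impure method: mutates the list in place

print(alias)                # the alias sees the change
print(duplicate)            # the copy does not

tup = (1, 2, 3)
new_tup = tup[:2] + (4,)    # build a new tuple instead of assigning to tup[2]
print(tup, new_tup)
```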

### Magic Methods

Some things that do not look like methods actually are, indexing for example:

In [4]:
print(lst[2])
print(lst.__getitem__(2))

2
2


`__getitem__` is called a **magic method**.  Magic methods are spelled with two leading and two trailing underscores, and they are usually invoked through special syntax rather than by name.  The double underscores give them their other common name: **dunder methods**.  This one would be pronounced "dunder-get-item".

In [7]:
# lst[2] = 100
lst.__setitem__(2, 100)

# lst[2]
print(lst.__getitem__(2))

# len(lst)
print(lst.__len__())

# lst[1:5]
print(lst.__getitem__(slice(1, 5)))

100
8
[2, 100, 3, 4]
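
Other familiar syntax is translated into dunder calls in the same way.  The following sketch uses a fresh list (`nums` is just an illustrative name) and pairs a few operators with the magic methods that lists define for them:

```python
nums = [1, 2, 3]

# 2 in nums
has_two = nums.__contains__(2)

# nums + [4]
joined = nums.__add__([4])

# nums == [1, 2, 3]
same = nums.__eq__([1, 2, 3])

print(has_two, joined, same)
```

In everyday code you would write the operator form; calling the dunder method directly is only useful for seeing what the syntax means.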


## More Advanced Examples

The Python standard library has many examples of additional data types.  We will be re-implementing two of the more useful ones, `defaultdict` and `OrderedDict`.

In [8]:
from collections import defaultdict, OrderedDict

### defaultdict

`defaultdict` is a simple but effective alternative to a dictionary.

Recall that with a normal dictionary, attempting to look up a key that does not exist is an error.

In [9]:
D = {'a': 1, 'b': 2}

D['c']

KeyError: 'c'

A `defaultdict` allows you to specify a default value to return when a lookup of a non-existent key is attempted.

In [10]:
def default():
    """A function that returns a default value, called when we attempt to
    access a non-existent key in a default dictionary.
    """
    return 0

D = defaultdict(default, {'a': 1, 'b': 2})

print(D['a'])
print(D['c'])
print(D)

1
0
defaultdict(<function default at 0x10c729488>, {'a': 1, 'b': 2, 'c': 0})


**Note**: In our creation of the default dict above, the line `D = defaultdict(int, {'a': 1, 'b': 2})` is more idiomatic.  We chose to write it the way we did above as it makes more explicit what is going on.
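
The shorter spelling works because built-in types such as `int` and `list` can be called with no arguments to produce an "empty" value, so they can play the same role as our `default` function.  A quick check (the name `counts` is illustrative):

```python
from collections import defaultdict

# Calling a built-in type with no arguments gives its "empty" value.
print(int())    # 0
print(list())   # []

counts = defaultdict(int, {'a': 1, 'b': 2})
missing = counts['c']   # 'c' is absent, so int() is called and the result stored
print(missing, dict(counts))
```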

**Note**: It's a bit weird to have to pass in a function that returns the default value instead of the default value itself, but this is needed to avoid weird problems arising from mutable objects like lists.  Passing a function guarantees that this will work:

In [11]:
D = defaultdict(list, {})

print(D['a'])

D['a'].append(1)
D['a'].append(2)
D['b'].append(1)

print(D)

[]
defaultdict(<class 'list'>, {'a': [1, 2], 'b': [1]})


A more naive implementation would result in the **same** list being shared by all keys.
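
To see what "the same list being shared" looks like, here is a sketch that uses `dict.fromkeys`, which stores the one default object it is given under every key.  This is not how any default dictionary is implemented; it is only meant to illustrate the pitfall that passing a function avoids.

```python
# Every key is given the SAME list object as its value.
shared = dict.fromkeys(['a', 'b'], [])

shared['a'].append(1)

print(shared)                       # both keys show the appended value
print(shared['a'] is shared['b'])   # one list, two keys
```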

### Making Your Own Default Dict

Let's make our own default dictionaries.

**Note:** In practice, we would not do this.  Since the `defaultdict` datatype already exists, there is no benefit in reimplementing it for practical applications.  But it's instructive to see how we could do this if our needs were for something slightly different.

There are two concepts we need

  - A `class` is a template for a new data type.  It contains information on what data is needed to construct the data type, how to store the data internally, and what algorithms can be applied to the data type.
  - An instance of a class is a concrete object of the new data type.
  
A class is a recipe for constructing instances of that class.

**Question**: In the picture below, what are the classes, and what are the instances of these classes?

![Examples of Objects of Different Classes](classes-and-objects.png)

#### Example of a Class: defaultdict

`defaultdict` is a class

In [12]:
from inspect import isclass
isclass(defaultdict)

True

Using the class `defaultdict` as a function creates an instance of that class.

In [13]:
D = defaultdict(lambda: 0, {'a': 1, 'b': 2})
isinstance(D, defaultdict)

True

Note: The process of using the class itself as a function is called **construction**, and in this context the class is being used as a **constructor**.  The idea is that we are "constructing" a new object whose type is the class.

We usually abbreviate the phrase

> `D` is an instance of class `defaultdict`.

as

> `D` is a `defaultdict`.

In this way, `defaultdict` is thought of as a **type** (or datatype).  This is analogous to the `int`s, `float`s, `str`s, etc. that we base all our programs on.
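
A quick way to check this analogy is with the built-in `type` and `isinstance` functions, which treat `defaultdict` exactly like the built-in types (the variable names here are illustrative):

```python
from collections import defaultdict

D = defaultdict(int)
n = 3

print(type(n))    # the class that n is an instance of
print(type(D))    # the class that D is an instance of

n_is_int = isinstance(n, int)
D_is_defaultdict = isinstance(D, defaultdict)
print(n_is_int, D_is_defaultdict)
```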

#### Creating A Custom Class

The basics of creating a custom class in Python are very easy:

In [14]:
class MyClass(object):
    pass  # Do nothing.

We can create a new instance of the class easily:

In [15]:
my_instance = MyClass()

In [16]:
isinstance(my_instance, MyClass)

True

This is a pretty dumb class as it stands; it can't really *do* anything.  To get something useful we have to add data and behaviour to our class.

#### How to Store Data in a Class

The first step is to determine what data we need to store.  In this case it's pretty easy, we need

  - The underlying dictionary that we are going to attempt lookups into.
  - The default action to take when a lookup fails.

Let's mimic the way Python's built-in `defaultdict` works.  We need to add some functionality to **supply and then store** both of these data elements when we create an instance of the class.  This is done using a special *method*, `__init__`.

**Note:** `__init__` is pronounced *dunder-in-it*.

In [17]:
class MyDefaultDict(object):
    """A personal implementation of a default dictionary."""
    
    def __init__(self, default, dictionary):
        self.default = default
        self.dictionary = dictionary

There are a lot of new concepts in this code, but let's first see how it works.

In [18]:
MD = MyDefaultDict(lambda: 0, {'a': 1, 'b': 2})
print(MD.default)
print(MD.default())
print(MD.dictionary)

<function <lambda> at 0x10c754048>
0
{'a': 1, 'b': 2}


When we use a class like a function

```python
my_instance = MyClass()  # <- Called like a function.
```

it is to create *instances of that class*.  

We will often be working with more than one instance of a single class.

In [19]:
MD = MyDefaultDict(lambda: 0, {'a': 1, 'b': 2})
MD2 = MyDefaultDict(lambda: 1, {'a': 2, 'b': 3, 'c': 5})

print(MD.dictionary)
print(MD2.dictionary)

{'a': 1, 'b': 2}
{'a': 2, 'b': 3, 'c': 5}


In [20]:
MD['a']

TypeError: 'MyDefaultDict' object is not subscriptable

Note the important point: **Both** `MD` and `MD2` are instances of the **same class**, but they contain **different data**; they are **independent objects of the same type**.

The data we store in a class in this way is called an **instance variable**.  Each independent object can have its own independent values of its instance variables.
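
To make the independence concrete, the sketch below repeats the class from above and then changes the data stored on one instance; the other instance is unaffected.

```python
class MyDefaultDict(object):
    """A personal implementation of a default dictionary."""

    def __init__(self, default, dictionary):
        self.default = default
        self.dictionary = dictionary


MD = MyDefaultDict(lambda: 0, {'a': 1, 'b': 2})
MD2 = MyDefaultDict(lambda: 1, {'a': 2, 'b': 3, 'c': 5})

# Change the data stored on MD only.
MD.dictionary['z'] = 26

print(MD.dictionary)
print(MD2.dictionary)
print(MD.default(), MD2.default())
```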

#### The self Placeholder

A statement like

```python
self.default = default
```

creates what is known as an **instance variable** or **instance data**.  In this specific line, we attach the `default` function to the instance of the class currently being created.

There are two main ways that `self` is used:

  - References to `self` inside the `__init__` method refer to the object **currently being created**.
  - References to `self` in any other method (see more below) refer to the object on which the method was called.

For example, when we call a method like:

```python
some_object.some_method(an_argument, another_argument)
```

any references to `self` inside the definition of `some_method` will refer to `some_object`.

So our use of self in the `__init__` method

```python
def __init__(self, default, dictionary):
    self.default = default
    self.dictionary = dictionary
```

is setting up our `MyDefaultDict` objects so that, once created, each instance of `MyDefaultDict` stores both `default` and `dictionary` data.
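
One way to convince yourself that `self` is simply the object the method was called on is to call the method through the class and pass the object in by hand.  The `Greeter` class below is a made-up example, not part of this notebook's running example:

```python
class Greeter(object):
    """A hypothetical class used only to illustrate `self`."""

    def __init__(self, name):
        # Here self is the Greeter currently being created.
        self.name = name

    def greet(self, greeting):
        # Here self is whichever object greet was called on.
        return greeting + ", " + self.name


alice = Greeter("Alice")
bob = Greeter("Bob")

print(alice.greet("Hello"))           # self is alice
print(bob.greet("Hello"))             # self is bob
print(Greeter.greet(alice, "Hello"))  # same call, with self passed explicitly
```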

#### Adding Methods to Manipulate Data in a Class

Let's implement `__getitem__` and `__setitem__`, which will allow us to index into instances of our class like this

```python
MD['a']
# Means the same thing as MD.__getitem__('a')

MD['c'] = 3
# Means the same thing as MD.__setitem__('c', 3)
```

As a first attempt, let's ignore our goal of adding default behaviour; we can add that later on down the line.

In [21]:
class MyDefaultDict(object):
    """A personal implementation of a default dictionary."""
    
    def __init__(self, default, dictionary):
        self.default = default
        self.dictionary = dictionary
        
    def __getitem__(self, key):
        return self.dictionary[key]
    
    def __setitem__(self, key, value):
        self.dictionary[key] = value

Let's test it out.

In [22]:
MD = MyDefaultDict(default, {'a': 1, 'b': 2})

print(MD['a'])
print(MD['b'])

MD['c'] = 3

print(MD.dictionary)

MD['matt']

1
2
{'a': 1, 'b': 2, 'c': 3}


KeyError: 'matt'

#### Adding the Special Default Behaviour

Now let's add in the special behaviour on our indexing: we want to return the default value when an attempt is made to access a key that does not exist in the dictionary.

In [23]:
class MyDefaultDict(object):
    """A personal implementation of a default dictionary."""
    
    def __init__(self, default, dictionary):
        self.default = default
        self.dictionary = dictionary
        
    def __getitem__(self, key):
        if key in self.dictionary:
            return self.dictionary[key]
        else:
            self.dictionary[key] = self.default()
            return self.dictionary[key]
    
    def __setitem__(self, key, value):
        self.dictionary[key] = value

Now the whole thing works as intended.

In [24]:
MD = MyDefaultDict(default, {'a': 1, 'b': 2})

print(MD['a'])
print(MD['b'])
print(MD['c'])
print(MD.dictionary)

1
2
0
{'a': 1, 'b': 2, 'c': 0}
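
As a sketch of why this behaviour is useful, the class can be used to count things without first checking whether a key exists.  The word list is made up for illustration, and the class is repeated so that the example runs on its own:

```python
class MyDefaultDict(object):
    """A personal implementation of a default dictionary."""

    def __init__(self, default, dictionary):
        self.default = default
        self.dictionary = dictionary

    def __getitem__(self, key):
        if key in self.dictionary:
            return self.dictionary[key]
        else:
            self.dictionary[key] = self.default()
            return self.dictionary[key]

    def __setitem__(self, key, value):
        self.dictionary[key] = value


counts = MyDefaultDict(lambda: 0, {})
for word in ['spam', 'eggs', 'spam', 'spam']:
    # counts[word] += 1 calls __getitem__ (which supplies 0 for a new word)
    # and then __setitem__ with the incremented value.
    counts[word] += 1

print(counts.dictionary)
```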


#### Adding Other Dict-y Things

A few things that should work for dictionaries still don't work for our new datatype

In [25]:
len(MD)

TypeError: object of type 'MyDefaultDict' has no len()

Additionally, code like

```python
'c' in MD
```

and

```python
for key in MD:
    print(key, MD[key])
```

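will cause an infinite loop.  When a class defines `__getitem__` but neither `__contains__` nor `__iter__`, Python falls back to calling `__getitem__` with the integers 0, 1, 2, ... until an `IndexError` is raised, and our `__getitem__` never raises one.  (In the author's opinion, this fallback is a design error in Python itself.)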
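
The cause is easier to see on a class where the loop does terminate.  In this sketch, the hypothetical `Squares` class defines only `__getitem__`; iteration and `in` still work, because Python calls `__getitem__` with 0, 1, 2, ... and stops at the first `IndexError`.  A `__getitem__` that never raises, like ours, gives such a loop no way to stop.

```python
class Squares(object):
    """A hypothetical class with __getitem__ but no __iter__ or __contains__."""

    def __init__(self, n):
        self.n = n

    def __getitem__(self, index):
        if index >= self.n:
            # Raising IndexError is what tells the loop to stop.
            raise IndexError(index)
        return index * index


squares = Squares(4)
collected = [value for value in squares]   # iterates by calling __getitem__
found = 9 in squares                       # the in keyword does the same
print(collected, found)
```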

Let's fix that with more magic methods.

In [26]:
class MyDefaultDict(object):
    """A personal implementation of a default dictionary."""
    
    def __init__(self, default, dictionary):
        self.default = default
        self.dictionary = dictionary
        
    def __getitem__(self, key):
        if key in self.dictionary:
            return self.dictionary[key]
        else:
            self.dictionary[key] = self.default()
            return self.dictionary[key]
    
    def __setitem__(self, key, value):
        self.dictionary[key] = value
        
    def __len__(self):
        return len(self.dictionary)
        
    def __contains__(self, key):
        return key in self.dictionary
    
    def __iter__(self):
        for key in self.dictionary:
            yield key

We have a few new methods:

  - `__len__` allows our datatype to support calls to `len`.
  - `__contains__` allows our datatype to support the `in` keyword.
  - `__iter__` allows our datatype to support iteration, i.e., for loops.  The `yield` keyword here is new, and it is a powerful feature of Python you will see often.  You should find some time to read about it [here](http://stackoverflow.com/questions/231767/what-does-the-yield-keyword-do-in-python).
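
As a very small taste of `yield` (a sketch; `countdown` is a made-up function), a function containing `yield` hands back its values one at a time, pausing between them, which is exactly what a `for` loop needs:

```python
def countdown(n):
    """Yield n, n-1, ..., 1, one value per step of the loop."""
    while n > 0:
        yield n   # hand back one value, then pause here until the next is requested
        n -= 1


for value in countdown(3):
    print(value)

as_list = list(countdown(3))
print(as_list)
```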

Let's try out our new features.

In [28]:
MD = MyDefaultDict(lambda: 0, {'a': 1, 'b': 2})

print('a' in MD)

for key in MD:
    print(key, MD[key])

True
a 1
b 2


#### Non-Magic Methods

It's worth mentioning that not all methods are magic.  Here is a class that represents a simple quadratic polynomial, and has an `evaluate` method, which plugs a number into the polynomial.

In [29]:
class QuadraticPolynomial(object):
    """A class representing a polynomial like:
    
        a_0 + a_1 x + a_2 x^2
    """
    
    def __init__(self, a0, a1, a2):
        self.coefficients = (a0, a1, a2)
        
    def evaluate(self, x):
        a0, a1, a2 = self.coefficients
        return a2*x*x + a1*x + a0

In [30]:
q = QuadraticPolynomial(10, -1, 3)
q.evaluate(-2)

24

Of course, we are free to define any mix of magic and non-magic methods.  Here we define the `==` operator for our polynomials.

In [31]:
class QuadraticPolynomial(object):
    """A class representing a polynomial like:
    
        a_0 + a_1 x + a_2 x^2
    """
    
    def __init__(self, a0, a1, a2):
        self.coefficients = (a0, a1, a2)
        
    def evaluate(self, x):
        a0, a1, a2 = self.coefficients
        return a2*x*x + a1*x + a0
    
    def __eq__(self, other):
        return self.coefficients == other.coefficients

In [32]:
p = QuadraticPolynomial(10, -1, 3)
q == p

True

There are still some issues.

In [33]:
p = QuadraticPolynomial(2, 0, 0)
q == 2

False

Hmm.
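
One possible way to make comparisons with plain numbers behave sensibly is sketched below.  It treats a number as the constant polynomial with that value, and returns `NotImplemented` for anything else, which lets Python fall back to its default comparison.  This is only one design among several, and the class here is trimmed to what the example needs.

```python
class QuadraticPolynomial(object):
    """A class representing a polynomial like a_0 + a_1 x + a_2 x^2."""

    def __init__(self, a0, a1, a2):
        self.coefficients = (a0, a1, a2)

    def __eq__(self, other):
        if isinstance(other, QuadraticPolynomial):
            return self.coefficients == other.coefficients
        if isinstance(other, (int, float)):
            # Assumption: a number n means the constant polynomial n + 0x + 0x^2.
            return self.coefficients == (other, 0, 0)
        # Let Python try the other operand, then fall back to "not equal".
        return NotImplemented


p = QuadraticPolynomial(2, 0, 0)
print(p == 2)
print(p == QuadraticPolynomial(2, 0, 0))
print(p == "two")
```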

### OrderedDict

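The `OrderedDict` type is a dictionary that remembers the order in which keys were added.  In versions of Python before 3.7, a basic dictionary made no promise about order: iterating over a regular dictionary could visit the key, value pairs in an arbitrary order.  Iterating through an `OrderedDict` always visits the keys in the same order that they were added.  (Since Python 3.7 regular dictionaries also preserve insertion order, but reimplementing `OrderedDict` is still a good exercise.)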
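
Before implementing your own, it may help to see the behaviour you are aiming for.  The sketch below uses the standard library's `OrderedDict` with made-up keys:

```python
from collections import OrderedDict

od = OrderedDict()
od['first'] = 1
od['second'] = 2
od['third'] = 3

# Iteration visits the keys in the order they were added.
keys_in_order = list(od)
print(keys_in_order)
```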

Your task is to **implement an ordered dictionary**.  Here are some questions to ask yourself:

  - What data will you store on each instance?  Clearly you need a `dictionary`, just like in `defaultdict`.  How are you going to remember the order that keys were added to the dictionary?
  - What methods will you need to implement?  Which one is the important one, i.e., the one that adds the new and interesting behaviour?
  - What happens if you add a key twice?  This is an edge case, which your final implementation should account for.


### QuadraticPolynomial

These challenges involve extending our `QuadraticPolynomial` class to support more features.

1. Use the `__add__` magic method to allow something like `QuadraticPolynomial(1, 1, 1) + QuadraticPolynomial(1, 0, 1)`.  The new method should *return* another `QuadraticPolynomial`.

2. Write a class `LinearPolynomial`.  Add a method `differentiate` to `QuadraticPolynomial` that returns a `LinearPolynomial`.

3. Suppose we create a new `QuadraticPolynomial` like `QuadraticPolynomial(1, 1, 0)`.  Is this really a `QuadraticPolynomial`?  What should it be?  How can you resolve this weird inconsistency in data types?