# Object-Orientation and Design Patterns

This is the preliminary content for the next talk; I'll refactor this stuff into slides later.


Subjects to cover:

 * __Object Definition__: the structural concept of an entity with data and operations
   * Creation
   * Referencing objects and identity
   * Everything is an object
   
 * __Classes__: definition of the object blueprint
 
 * __Class Members (or features)__:
   * Attributes: data components of a class
   * Constructors: how objects are constructed
   * Methods: definition of object operations
   
 * __Notion of Interface__: the layer through which clients interact with objects
   * Abstraction
   * Programming to interfaces
   * Duck typing
   
 * __Inheritance__: definition of classes in terms of others
   * Concept of specialization
   * Single inheritance
   * Multiple inheritance
   * When to use composition vs. inheritance
   * Principle of Substitution
   * Inheritance of convenience: don't do this
   
 * __Method Overriding__: modifying behaviour of inherited code by replacing methods
 * __Things Not Present in Python__: method/constructor overloading, variable polymorphism
 
 * Design Patterns:
   * __Subject-Observer__
   * __Iterator__
   * __Template Method__
   * Maybe others, but time will be pressing by now
 
Key concepts:
 * __Encapsulation__: definitions of data entities are packaged together with the code which actuates their behaviour
 * __Abstraction__: the details of how objects function are hidden from clients while still permitting use
 * __Inheritance__: object definitions can be stated as extensions of existing ones, assuming their members for reuse in the new type
 * __Genericity__: code and data structures can operate with existing types or new ones introduced in later client code
 * __Polymorphism__: objects can have multiple types, and can behave differently from what would be expected of any one of these types
 * __Substitutability__: objects of a subtype should be substitutable for objects of the parent type while preserving correctness
 * __Patterns__: describe common design choices and good practices in a concise but generic architectural notion

Object-orientation is a programming paradigm centered on the concept of encapsulating data with operations. In the simplest form, an object is a piece of data with associated routines which manipulate that data. Object-oriented systems are composed of many objects aggregating together to form structures and cooperate in implementing a program's behaviour. 

This contrasts with procedural languages like Fortran or C, whose programs are composed of routines and of data defined separately as global or local variables. Here there is only a loose relationship between data and the code that manipulates it, and the structure of the code is much less flexible or reusable.

Data structures and their associated code are defined as abstract data types (ADTs) which are composed of a definition for a data entity and routines which manipulate such entities. For example, a `Dimension` type in C to represent a 2D area:

```
struct Dimension { int width, height; };
void init(struct Dimension* d, int w, int h);
int area(struct Dimension* d);
```

The abstract component of this ADT is evident in that the data entity and its routines are declared without being defined.
A client can still use this ADT without knowing the details of its implementation; abstraction is an important property to ensure a separation of concerns and modularity in code.
This however defines a weak link between the data entity `Dimension` and the routines, as well as creating a definition which cannot be changed or adapted in client code.

Object-oriented programming aims to make the connection between data and code tighter and explicit while preserving abstraction. Objects are instances of ADTs which encapsulate data and routines in one entity, as demonstrated in the C++ equivalent of the above:

```
class Dimension {
private:
  int width,height;
public:
  Dimension(int w, int h);
  virtual int area();
};
```

In Python:

In [52]:
class Dimension(object):
    def __init__(self,w,h):
        self.width=w
        self.height=h
    def area(self):
        return self.width*self.height

An object is created by instantiating the class, that is, creating an instance of the class which has the members the class defines:

In [53]:
d=Dimension(10,20)

The variable `d` references an instance of `Dimension`, therefore this object has the type `Dimension` as well as `object`. Multiple instances of a class can be created; each instance is an independent object with its own unique identity and distinct members:

In [54]:
d1=Dimension(15,30)
print id(d),id(d1)

60768608 60464544


In Python it is important to note that assigning to a variable only causes it to refer to a different object, that is to say a variable is a name that is bound to an object. This doesn't copy the object in any way but may make the previously referenced object inaccessible:

In [55]:
d1=d
print id(d),id(d1)

60768608 60768608
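
The binding behaviour can also be seen by changing an object through one name and reading it through another. The sketch below repeats a minimal `Dimension` so that it runs on its own, and uses `copy.copy` from the standard library to show how an independent copy could be made instead. It is written to run under both Python 2 and 3.

```python
import copy

class Dimension(object):
    def __init__(self, w, h):
        self.width = w
        self.height = h

a = Dimension(10, 20)
b = a                 # binds a second name to the same object, no copy is made
b.width = 99          # the change is visible through both names
print(a.width)        # 99
print(a is b)         # True: both names refer to one object

c = copy.copy(a)      # an explicit (shallow) copy creates a distinct object
c.width = 1
print(a.width)        # still 99
print(a is c)         # False
```

The `is` operator compares identities, which is the same comparison as checking that two `id()` values are equal.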


Each object has the members defined by the class. Members (or features in some literature) are the components of an object falling into these broad categories:

 * __Attributes__: named data values stored in the object
 * __Methods__: routines associated with the object and which can refer to the object by name
 * __Constructor__: special method used to set up a new object's state at the point of instantiation
 
Other types of members exist in other languages but they are special forms of either attributes or methods. A member of an object is accessed using the dot notation of the form __`object.member`__. This allows members to be accessed and mutated:

In [56]:
print d.width, d.height
d.width=12
print d.width, d.height
print d.area # methods can be accessed without being called

10 20
12 20
<bound method Dimension.area of <__main__.Dimension object at 0x00000000039F4160>>


One important name that exists in all methods is __self__ (the name is a convention, but a universal one), which refers to the object whose method was called. Recall the definition of `area()` from `Dimension`:

In [57]:
def area(self):
    return self.width*self.height

When a method of an object is called using the dot notation, the value for `self` is set to the object (called the receiver) within the scope of the call. Thus when calling `area()` with `d` as the receiver, `self` is bound to `d` and so allows access to its members:

In [58]:
print d.area()

240
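
The binding of `self` can be made visible by looking up `area` on the class rather than on an instance, in which case the receiver has to be passed explicitly. A small sketch, repeating the `Dimension` definition so that it is self-contained:

```python
class Dimension(object):
    def __init__(self, w, h):
        self.width = w
        self.height = h

    def area(self):
        return self.width * self.height

d = Dimension(12, 20)
via_instance = d.area()          # self is bound to d implicitly
via_class = Dimension.area(d)    # self is passed explicitly
print(via_instance == via_class)
```

Both calls run the same code with `self` referring to `d`; the dot notation on an instance simply supplies the first argument for you.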


Calling this method is equivalent to calling a function and passing `d` as the first argument, whose members are then accessed in the calculation:

In [59]:
def areafunc(self):
    return self.width*self.height

print areafunc(d)

240


This however breaks the relationship between instances of `Dimension` and the operation of calculating an area, especially in Python which does not declare types for function arguments. Nothing prevents the following, which is obviously wrong but only discoverable at runtime:

In [60]:
try:
    print areafunc('I am not a Dimension object')
except Exception as e:
    print e

'str' object has no attribute 'width'


A method defines an operation which is associated with an object, but it also defines an interface in terms of the process of calling it. The `area()` method for example defines an interface as a method named `area` which accepts no arguments and returns a single number. This represents all the information a caller (or client) needs to know about `area()` to be able to use it, that is the interface abstracts away the details of implementation. The interfaces of all the methods for an object, as well as the notions of accessing and mutating attributes, aggregate together to form the __object interface__ for that object. 

A client, whether another object or a routine, need only know about an object's interface to be able to interact with it. Since the interface is abstract there are no requirements regarding the type or implementation of this object, thus two objects can have the same interface but differing implementations. For example, any other class which defines an `area()` method with the same arguments and return type essentially defines the same interface:

In [61]:
class Dimension3(object):
    def __init__(self,w,h,d):
        self.width=w
        self.height=h
        self.depth=d
        
    def area(self):
        return 2*(self.width*(self.height+self.depth)+self.height*self.depth)
    
d3=Dimension3(10,12,15)
print d3.area()

900


If an instance of `Dimension` and another of `Dimension3` both implement the same interface (considering `area()` only), then a client can be implemented which functions with either:

In [62]:
def calcsquare(obj):
    area=obj.area()
    return int(area**0.5)

print calcsquare(d)
print calcsquare(d3)

15
30


The function `calcsquare()` uses only the interface of `obj` to interact with it, which is either `Dimension.area` or `Dimension3.area` depending on what object was passed in as `obj`. Defining clients in terms of interfaces is called __Programming to Interfaces__, naturally enough. It is an important mechanism for reuse since it allows routines, algorithms, or whole modules to be integrated with code introduced later whose implementations are unknown (for example your code).

In Python this can be done easily since `calcsquare()` doesn't check that its argument `obj` fulfills the interface it needs, it just tries to call `area()` and if it isn't suitable an error occurs at runtime. This is called __duck typing__ since if it looks like a duck, and quacks like a duck, it ain't a moose. In our example having the correct `area()` method constitutes quacking in the right manner.
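
As a sketch of duck typing in action, the hypothetical `Circle` class below shares no ancestry with `Dimension` beyond `object`, yet works with `calcsquare()` because it provides a suitable `area()`. A string does not, and the failure only appears when the call is attempted. `calcsquare()` is repeated here so the example is self-contained.

```python
import math

def calcsquare(obj):
    area = obj.area()
    return int(area ** 0.5)

class Circle(object):  # hypothetical class, unrelated to Dimension
    def __init__(self, radius):
        self.radius = radius

    def area(self):
        return math.pi * self.radius ** 2

side = calcsquare(Circle(10))   # works: Circle quacks correctly
print(side)

try:
    calcsquare('not a duck')    # str has no area() method
except AttributeError as e:
    failure = str(e)
    print(failure)
```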

In other languages like C++, C#, or Java which are statically typed there must be a type which defines the needed interface, and any object must inherit this type to be usable. For example in C++:

```
class AreaInterface { public: virtual float area() const = 0; };
float calcsquare(const AreaInterface& a) { ... }
```

__Inheritance__ allows a class to be defined in terms of another. The specific rules vary by language, but in general the inheriting type (a subtype or subclass) receives all the members of the type being inherited from (the supertype or superclass). This allows a class to acquire members without having to redefine them, which avoids reinventing the wheel in many cases and makes inheritance an important mechanism for reuse. Consider a class defining a specific rectangular area in 2D space:

In [63]:
class Rect(Dimension):
    def __init__(self,x,y,w,d):
        Dimension.__init__(self,w,d)
        self.x=x
        self.y=y
        
    def farCorner(self):
        return (self.x+self.width,self.y+self.height)
    
r=Rect(4,4,12,10)
print r.x, r.y, r.width, r.height
print r.area(), r.farCorner()
print isinstance(r,Rect), isinstance(r,Dimension)

4 4 12 10
120 (16, 14)
True True


The class `Rect` has inherited members from `Dimension` and introduced new ones in its constructor. It has an `area()` method which functions as before, and it also defines a new method, `farCorner()`. What is important to note is that instances of `Rect` are also instances of `Dimension`; this is one aspect of __polymorphism__.

Inheritance is more than just the copy-pasting of members, or at least should be. The idea with inheritance is that the subtype is a __specialization__ of the supertype: it represents a related concept that is more refined or specific to a particular context. In our example, a rectangle has a notion of dimension as well as location and so is a special type of dimension. Other subtypes of `Dimension` could be defined which represent some other specialization but which have no relation to `Rect`. Classes can also inherit from multiple supertypes, and should then be thought of as being specializations of multiple concepts at once.
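
A minimal sketch of multiple inheritance, using two hypothetical supertypes `Named` and `Sized` that are not part of the examples above. The subclass `Label` specializes both and so is an instance of each.

```python
class Named(object):
    def __init__(self, name):
        self.name = name

    def describe(self):
        return 'name=%s' % self.name

class Sized(object):
    def __init__(self, w, h):
        self.width = w
        self.height = h

    def area(self):
        return self.width * self.height

class Label(Named, Sized):  # inherits members from both supertypes
    def __init__(self, name, w, h):
        # call each supertype constructor explicitly, as Rect does above
        Named.__init__(self, name)
        Sized.__init__(self, w, h)

lab = Label('title', 8, 2)
print(lab.describe())
print(lab.area())
print(isinstance(lab, Named) and isinstance(lab, Sized))
```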

Complementary to the notion of specialization is the __principle of substitution__. This states that substituting an instance of a type in an algorithm with an instance of a subtype should not affect the algorithm's correctness. The algorithm may behave differently depending on the instance, but correctness should not be compromised. In our example, `calcsquare()` should function correctly with an instance of `Dimension` or one of `Rect`.

The `Rect` class is obviously substitutable since it only adds members. A subclass can however replace an inherited method with a new one, which is called __method overriding__. In Python only the name is relevant, so an overriding method need only have the same name; in other languages like Java and C++ there are more restrictions in terms of the argument and return types of the overriding method.

Overriding is used to modify the behaviour of objects by replacing the code which clients and other methods access. Consider new versions of `Dimension` and `Rect`:

In [64]:
class Dimension(object):
    def __init__(self,w,h):
        self.width=w
        self.height=h
        
    def midpoint(self):
        return (self.width*0.5,self.height*0.5)
    
    def name(self):
        return '%s, mid = %r'%(self.__class__.__name__,self.midpoint())
    
class Rect(Dimension):
    def __init__(self,x,y,w,d):
        Dimension.__init__(self,w,d)
        self.x=x
        self.y=y  
        
    def midpoint(self):
        return (self.x+self.width*0.5,self.y+self.height*0.5)
    
d=Dimension(10,15)
r=Rect(5,10,10,15)
print d.name()
print r.name()

Dimension, mid = (5.0, 7.5)
Rect, mid = (10.0, 17.5)


The method `midpoint()` is overridden in `Rect`. When `name()` is called on the instance of `Rect` this overriding method is called, even though `name()` itself is not overridden. This demonstrates that an inherited method is not hard wired to the methods defined in the superclass but instead accesses overriding methods where they exist. Python and other OO languages make extensive use of this in types whose behaviour is deliberately designed to be modified in subtypes, although the semantics of different languages vary. This is a reuse mechanism since these types can define algorithms or data structures in a way that allows them to be adapted to other applications without being rewritten. The principle of substitution factors into this process, dictating that such subtypes should be defined in a semantically substitutable way, otherwise clients reliant on the expected behaviour will not necessarily operate correctly.
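
To illustrate what a non-substitutable override can look like, the sketch below uses the earlier form of `Dimension` with an `area()` method and a hypothetical subclass `LabelledDimension` whose `area()` returns a string instead of a number. The override is legal Python, but a client such as `calcsquare()` that relies on a numeric result breaks.

```python
class Dimension(object):
    def __init__(self, w, h):
        self.width = w
        self.height = h

    def area(self):
        return self.width * self.height

class LabelledDimension(Dimension):  # hypothetical, deliberately badly behaved
    def area(self):
        # same name, but the return type no longer matches the interface
        return '%d square units' % (self.width * self.height)

def calcsquare(obj):
    return int(obj.area() ** 0.5)

print(calcsquare(Dimension(10, 10)))       # fine

try:
    calcsquare(LabelledDimension(10, 10))  # raising a str to a power fails
    broke = False
except TypeError:
    broke = True
print(broke)
```

Nothing in the language prevents this override; substitutability is a design discipline that the author of the subtype has to maintain.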

There are a number of object-oriented features not present in Python, due primarily to its nature as a dynamically typed language. A brief overview of these:
 * __Method Overloading__: defining multiple methods/constructors with the same name but which vary in terms of the arguments they accept. This permits defining different behaviours for different input types while presenting clients with what appears to be a single interface point.
 * __Variable Polymorphism__: languages like Java or C++ with statically typed variables allow instances of subtypes to be assigned to variables having the supertype. For example:
 ```
 Dimension *d=new Rect(5,10,10,15);
 ```
 * __Access Modifiers__: members can be assigned an access modifier which controls who can access or mutate the member. For example, `private` in C++ means a member is accessible to other members of the same class but not to external clients or subtypes, while `public` means anyone can access the member.
 * __Interface Types__: Java and C# have a notion of an interface type which defines only method signatures. These are unnecessary due to duck typing in Python.
 * __Templates__: C++ has a type template facility that allows definitions to be parameterized by a type; again, dynamic typing in Python makes this unnecessary.
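
Although method overloading is absent, a similar effect is commonly approximated in Python with default argument values. A rough sketch, assuming a variant of `Dimension` whose constructor accepts either a single side length or a width and a height:

```python
class Dimension(object):
    def __init__(self, w, h=None):
        # one constructor stands in for two overloads:
        # Dimension(side) and Dimension(width, height)
        if h is None:
            h = w
        self.width = w
        self.height = h

    def area(self):
        return self.width * self.height

square = Dimension(4)       # behaves like a one-argument overload
rect = Dimension(4, 5)      # behaves like a two-argument overload
print(square.area())
print(rect.area())
```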

### Design Patterns

A design pattern is a medium-level architectural idiom which encapsulates some useful organizational or creational notion in an object-oriented system. No implementation of a pattern is like another; the definition is deliberately vague because patterns are inherently adaptable. In real-world terms think of a window: in the vaguest sense, a framed opening in a wall closed off with glass which lets the light in. No two window designs are alike in terms of specific details but the characteristics which make them windows are obvious.

#### Subject-Observer

This pattern defines a relationship between subject objects and observer objects, usually in a one-to-many relationship.
The observer objects register their interest in the subject, and when a particular event occurs the subject notifies the observers. This is the Hollywood "don't call us, we'll call you" idea. This allows objects to keep track of when state changes and channel the process for reacting to change through a specific mechanism.

In [65]:
class Subject(object):
    def __init__(self,n):
        self.name=n
        self.observers=set() 
        
    def addObserver(self,o):
        self.observers.add(o)
        
    def removeObserver(self,o):
        self.observers.remove(o)
        
    def setName(self,n):
        for o in self.observers:
            o.notify(self,n) # notify observer
        self.name=n
            
class Observer(object):
    def __init__(self,n):
        self.name=n
        
    def notify(self,subject,newname):
        # react to notification
        print self.name,'saw that',subject.name,'is now',newname
        
s=Subject('Terry')

o1=Observer('John')
s.addObserver(o1)

o2=Observer('Eric')
s.addObserver(o2)

s.setName('Graham')

John saw that Terry is now Graham
Eric saw that Terry is now Graham
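
A variation on the above, sketched under the assumption that observers may be plain callables rather than objects with a `notify()` method. The method names follow the example, while `log` and `record` are introduced here purely for illustration. It also shows that a removed observer stops receiving notifications.

```python
class Subject(object):
    def __init__(self, n):
        self.name = n
        self.observers = []        # a list keeps notification order predictable

    def addObserver(self, o):
        self.observers.append(o)

    def removeObserver(self, o):
        self.observers.remove(o)

    def setName(self, n):
        for o in list(self.observers):   # copy so observers may unregister safely
            o(self, n)
        self.name = n

log = []

def record(subject, newname):
    log.append((subject.name, newname))

s = Subject('Terry')
s.addObserver(record)
s.setName('Graham')      # record is notified
s.removeObserver(record)
s.setName('Michael')     # nobody is listening any more
print(log)
```

The original example stores observers in a `set`, which prevents duplicate registration but gives no guarantee about the order in which observers are notified.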


#### Iterator

An iterator is an object which traverses a data structure by producing successive values upon request. The idea is to abstract away how this traversal works and present a common interface for multiple types of structure. Iterators are ubiquitous in Python though not always obviously so. An iterator object can be created using the built-in function `iter()`:

In [66]:
r=[0,1,2,3,4]
print r
it=iter(r)
print it
print next(it)
print next(it)

[0, 1, 2, 3, 4]
<listiterator object at 0x0000000003AD7EF0>
0
1


The relationship between the data structure and its iterators is one-to-many, allowing a single structure to be traversed by multiple iterators. In Python 2, as used here, an object implements the iterator pattern if it implements a `next()` method (named `__next__()` in Python 3) which returns the next value in its notional sequence when called, and raises `StopIteration` when the values are exhausted. An equivalent list iterator that does the same as the above:

In [67]:
class mylistiterator(object):
    def __init__(self,lst):
        self.pos=0
        self.lst=lst
    def next(self): # returns successive values
        if self.pos<len(self.lst):
            self.pos+=1
            return self.lst[self.pos-1]
        raise StopIteration # indicates no more items
        
it=mylistiterator(range(5))
print next(it), next(it), next(it), next(it), next(it)

try:
    print next(it) # try to get more items
except StopIteration:
    print 'No more'

0 1 2 3 4
No more
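
For an iterator to be usable directly in a `for` loop it also needs an `__iter__()` method that returns the iterator itself. The sketch below shows this with a hypothetical `countdown` class; it defines `__next__()` (the Python 3 name) and aliases it to `next` so the same class also works under Python 2.

```python
class countdown(object):
    def __init__(self, start):
        self.current = start

    def __iter__(self):
        return self              # an iterator is its own iterable

    def __next__(self):
        if self.current <= 0:
            raise StopIteration  # signals the end of the sequence
        self.current -= 1
        return self.current + 1

    next = __next__              # Python 2 looks for next() instead

values = [v for v in countdown(3)]
print(values)
```

The loop machinery catches `StopIteration` itself, so client code never has to handle the exception explicitly.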


#### Template Method

Method overriding allows a subclass to modify the behaviour of inherited code by selectively replacing some methods. This pattern explicitly takes advantage of this by defining behaviour which relies on methods without any definition. A template method calls other methods of the same type which have no code defined for them and so relies on a subtype to supply the code. Consider the example class which relies on methods for accessing a data structure to produce a pretty-printed report:

In [69]:
class PrettyList(object):
    def get(self,i):
        pass
    
    def length(self):
        pass
    
    def printList(self):
        print 'List:'
        for i in range(self.length()):
            print ' '+str(i)+': '+str(self.get(i))
        
class PrettyListImpl(PrettyList):
    def __init__(self,lst):
        self.lst=lst
        
    def get(self,i):
        return self.lst[i]
    
    def length(self):
        return len(self.lst)
    
p=PrettyListImpl([1,2,3,'Terry','Graham'])
p.printList()

List:
 0: 1
 1: 2
 2: 3
 3: Terry
 4: Graham


In the above `printList()` is the template method which relies on the abstract methods `get()` and `length()`. If an instance of `PrettyList` were created and `printList()` called, the algorithm would encounter an error, thus a subtype which provides the appropriate definitions for the abstract methods is needed. 
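
One common way to make the abstract methods more explicit is to have them raise `NotImplementedError` rather than silently returning `None`. A sketch of that variant follows; the hypothetical `formatList()` returns the report as a string instead of printing it, so that the result is easy to inspect.

```python
class PrettyList(object):
    def get(self, i):
        raise NotImplementedError('subclasses must define get()')

    def length(self):
        raise NotImplementedError('subclasses must define length()')

    def formatList(self):
        # template method: the algorithm is fixed, the data access is not
        lines = ['List:']
        for i in range(self.length()):
            lines.append(' %d: %s' % (i, self.get(i)))
        return '\n'.join(lines)

class PrettyListImpl(PrettyList):
    def __init__(self, lst):
        self.lst = lst

    def get(self, i):
        return self.lst[i]

    def length(self):
        return len(self.lst)

report = PrettyListImpl(['Terry', 'Graham']).formatList()
print(report)

try:
    PrettyList().formatList()   # no subtype has supplied the missing methods
    failed = False
except NotImplementedError:
    failed = True
print(failed)
```

With this variant the error message names the missing method, which tends to be easier to diagnose than the failure that results from a method body of `pass`.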