# Chapter 4: Object Oriented Programming and MyCapytain

---

## 1. Introduction to Object Oriented Programming

**OOP, or Object Oriented Programming**, is a way to structure your code. OOP is about code architecture: you define objects that will have instances. For example, the text of the Aeneid would be an instance of an object `Text`. OOP is also about data modelling: when you design an object, you need to ask yourself "OK, now that I have all these texts, what do they have in common?"

For a Text, it's pretty simple: they have an author (even if it is anonymous), a title (most of the time), a date and a text body. Because texts do not exist in a vacuum, they also have an incarnation, which is usually realised through a specific method. The edition of the Aeneid available on Perseus, for example, has the following properties:

- the method is "digitized"
- the text is the text
- the author is "Virgil"
- the title is "Aeneid"
- the age would be about 2000 years.

What does this translate to in Python ?

    Aeneid = Text(...) #
    print(Aeneid.title) # would print "Aeneid"
    print(Aeneid.author) # = "Virgil"
    print(type(Aeneid)) # would print something like "<class '__main__.Text'>"

The good thing with objects is that the same object can have multiple instances. An **instance** is an emanation of an object, a version of it. Think about the object as a mold. With a mold you can cast multiple items of the same shape, which you can then paint in different colors. Those are called instances. The Aeneid is an instance of Text. But you could also choose Ovid's Amores:


    Amores = Text(...) #
    print(Amores.title) # would print "Amores"
    print(Amores.author) # = "Ovid"

In previous chapters we saw that an object has methods and properties (or attributes). **Properties** are values assigned to an object. They are accessible using a dot and the name of the property, such as `Aeneid.title`. **Methods** are functions that can use all the information inside an object, e.g. `Amores.read()` should return a result such as:

    Amores, Ovid
    text...
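
As a preview of the syntax introduced in the next section, here is a minimal sketch of what such a `Text` object could look like. The constructor arguments and the exact format returned by `read()` are assumptions made for illustration, since the `Text(...)` calls above deliberately leave them open.

```python
class Text(object):
    """Hypothetical Text object: the argument order is an assumption."""

    def __init__(self, title, author, body):
        self.title = title
        self.author = author
        self.body = body

    def read(self):
        # Combine the properties: "title, author" on one line, then the body
        return self.title + ", " + self.author + "\n" + self.body


Amores = Text("Amores", "Ovid", "text...")
print(Amores.title)   # Amores
print(Amores.read())
```

The method only needs `self` to reach the title, the author and the body: everything it uses lives inside the instance.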
    
Objects can also have children. Not only do children inherit all the functionality and properties of the parent, but they can also specify additional information. For example, Text could have two children named Scroll and Book. Scroll would automatically have a `method` property set to "Hand" and a new method transcript(), while Book would have "Printed" and ocr().
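
The Scroll and Book children could be sketched as follows. This is only an illustration: the bodies of `transcript()` and `ocr()` are placeholders, and the two-argument constructor of `Text` is an assumption.

```python
class Text(object):
    def __init__(self, title, author):
        self.title = title
        self.author = author


class Scroll(Text):
    method = "Hand"  # shared by every Scroll

    def transcript(self):
        # Placeholder: a real implementation would return a transcription
        return "Transcription of " + self.title


class Book(Text):
    method = "Printed"  # shared by every Book

    def ocr(self):
        # Placeholder: a real implementation would run character recognition
        return "OCR of " + self.title


aeneid_scroll = Scroll("Aeneid", "Virgil")
aeneid_book = Book("Aeneid", "Virgil")
print(aeneid_scroll.method, aeneid_book.method)  # Hand Printed
print(aeneid_scroll.author)  # Virgil, inherited from Text
```

Both children reuse the initialisation of `Text` without repeating it, and each one only declares what makes it different.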

One last thing: "object" is a general term and is often confused with "class". Every object has a class: it is the blueprint of the object. To some extent, if an object is a mold, then the class is the plan required to build that mold. They are so closely bound together that many people mistake one for the other. The class for `Text`, for example, is what defines which properties and methods a text has.

![Classes and subclasses](images/Class.svg)

## 2\. Object in Python : class and \_\_init\_\_

To define an object in Python, you need to use the syntax `class NAME(object):` and give it a body, such as a method. This is the minimum requirement (mind the indentation!):

In [None]:
class Animal(object):
    def shout(self):
        print("")

The class Animal in this case has a `shout` method. Did you notice `self`? Methods require it as their first parameter: it refers to the instance itself. If Animal had a `.noise` property, `shout()` could use it like this:

In [None]:
class Animal(object):
    def shout(self):
        print(self.noise)

Now, if we'd like to create an Animal object, like Pluto the dog, we'd create the instance this way:

In [None]:
Pluto = Animal()

And we could call the method `shout()` this way :

In [None]:
Pluto.shout()

Damn, AttributeError. Of course! We did not define our attribute `noise` anywhere! But how do you define properties? You have to define an \_\_init\_\_ method, which initialises each new instance. To do so, you would simply type:

In [None]:
class Animal(object):
    def __init__(self):
        self.noise = "Wahwah"
    def shout(self):
        print(self.noise)

Pluto = Animal()
Pluto.shout()

It definitely works better. But not all animals shout the same things. To avoid this issue, just remember that \_\_init\_\_ works like any other function and can take arguments !

In [None]:
class Animal(object):
    def __init__(self, name, noise=""):
        self.noise = noise
        self.name  = name
    def shout(self):
        print(self.noise)

Pluto = Animal("Pluto", "WahWah")
Pluto.shout()
print(Pluto.name)
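
Since \_\_init\_\_ follows the usual rules for function arguments, the same class can be instantiated in several ways. The sketch below reuses the `Animal` class from the cell above; the animals Felix and Nemo are made up for the example.

```python
class Animal(object):
    def __init__(self, name, noise=""):
        self.noise = noise
        self.name = name

    def shout(self):
        print(self.noise)


# Positional arguments follow the order of the signature
pluto = Animal("Pluto", "WahWah")
# Keyword arguments can be given in any order
felix = Animal(noise="Meow", name="Felix")
# noise is optional: it falls back to the default empty string
nemo = Animal("Nemo")

print(felix.name, felix.noise)  # Felix Meow
print(repr(nemo.noise))         # ''
```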

**DIY**

Define a class Vehicle with a property "wheels" for the number of wheels and a property "passengers" for the maximum number of passengers. Create a method `enter(number)` where `number` represents how many passengers want to use the vehicle: this method should return False if there are too many people, and True otherwise! Create two instances named `car` and `auto` and run the code in the second block!

In [None]:
# Write your class here

# Create your instances here

In [None]:
#Run it to check it works 

if car.enter(2) is True:
    print("We can use the car with two people")
if auto.enter(3) is True:
    print(" WHAT IS YOUR AUTO BRAND ?") # Side-car does not count.

## 3\. Inheritance, \*args and \*\*kwargs

We saw earlier that OOP relies on inheritance between objects. This is also the case in Python, and we can create new subclasses for new animals. For example, humans have many more methods when it comes to language, like speaking. A subclass is declared by putting the name of the parent class in parentheses, such as:

In [None]:
class Human(Animal):
    def say(self, something):
        print(something)
        
Me = Human(name="Thibault")
Me.say("Hello !")
Me.say("My name is "+Me.name)
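
The relationship between the two classes can be checked with the built-in functions `isinstance()`, `issubclass()` and `hasattr()`. This short sketch repeats the classes above so that it runs on its own:

```python
class Animal(object):
    def __init__(self, name, noise=""):
        self.noise = noise
        self.name = name

    def shout(self):
        print(self.noise)


class Human(Animal):
    def say(self, something):
        print(something)


me = Human(name="Thibault")
print(isinstance(me, Human))      # True
print(isinstance(me, Animal))     # True: a Human is also an Animal
print(issubclass(Human, Animal))  # True
# Inheritance only goes one way: the parent does not gain the child's methods
print(hasattr(Animal("Pluto"), "say"))  # False
```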


A subclass keeps every method of its parent. In this case, the initialisation method has been kept because we did not override it. Though, adding only methods is quite a small advantage, so let's grow the properties!

In [None]:
class Human(Animal):
    def __init__(self, cloth):
        self.cloth = cloth
    def say(self, something):
        print(something)
        
Me = Human("T-Shirt")
Me.say("Hello !")
Me.say("I wear "+Me.cloth)

Great, but what's my name then ?

In [None]:
Me.name

Obviously we got an AttributeError. You could add it to your current \_\_init\_\_ :

In [None]:
class Human(Animal):
    def __init__(self, cloth, name, noise=""):
        self.name = name
        self.noise = noise
        self.cloth = cloth
    def say(self, something):
        print(something)
        
Me = Human("T-Shirt", "Thibault")
print(Me.name)

Though, this is not very practical: apart from keeping the methods, we end up rewriting everything the parent already does. Instead, a class can call its parent's \_\_init\_\_ method:

In [None]:
class Human(Animal):
    def __init__(self, cloth, name, noise=""):
        super(Human, self).__init__(name, noise)
        self.cloth = cloth
    def say(self, something):
        print(something)
        
Me = Human("T-Shirt", "Thibault")
print(Me.name)
print(Me.cloth)

The `super()` function takes two parameters here: a class, in this case `Human`, which is the class we are currently working on; and an instance, in this case `self`, which represents the Human instance. On the result of `super()`, we call the \_\_init\_\_ method of the parent class. `super` automatically works out the parent of the class we fed it.

After that, \_\_init\_\_ takes just the same arguments as it would for Animal, in this case `name` and `noise`.
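
In Python 3, `super()` can also be called without any argument inside a method, and it then finds the class and the instance by itself. The following sketch is the same `Human` class written with this shorter form:

```python
class Animal(object):
    def __init__(self, name, noise=""):
        self.noise = noise
        self.name = name


class Human(Animal):
    def __init__(self, cloth, name, noise=""):
        # Python 3 only: same effect as super(Human, self).__init__(name, noise)
        super().__init__(name, noise)
        self.cloth = cloth


me = Human("T-Shirt", "Thibault")
print(me.name, me.cloth)  # Thibault T-Shirt
```

The two-argument form used in this chapter works in both Python 2 and Python 3, which is one reason you will still meet it in existing code.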

**\*args and \*\*kwargs**

The issue with `__init__()` and parent classes is that you will spend quite a lot of time passing arguments along. Python has two nice helpers: \*args and \*\*kwargs.

\*args is a wildcard that collects any extra positional arguments in the order they were given, without you having to name them. Let's see:

In [None]:
class A(object):
    def __init__(self, b, c):
        print(b)
        print(c)
        
class B(A):
    def __init__(self, a, b, c):
        super(B, self).__init__(b, c)
        print(a)
        
b = B(1,2,3) # Works normally. Let's use *args now
print("---")

class C(A):
    def __init__(self, a, *args):
        super(C, self).__init__(*args)
        print(a)

c = C(4,5,6) # Behaves the same way

**\*args** behaves just like a list of arguments of arbitrary length. It keeps the order of the arguments given and lets you pass them on to other functions. Inside the function it is actually a tuple, which you can loop over just like a list:

In [None]:
# From http://stackoverflow.com/questions/3394835/args-and-kwargs/3394898#3394898
def print_everything(*args):
    for count, thing in enumerate(args):
        print('{0}. {1}'.format(count, thing))

print_everything(1,2,3,9,7,"(","hello")
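
To look at what \*args really contains, a function can simply return it. In this sketch, `collect` is a hypothetical helper written for the demonstration; it also shows the star used the other way round, to unpack a list when calling a function.

```python
def collect(*args):
    # Inside the function, args is a tuple holding the positional arguments
    return args


print(collect(1, 2, 3))   # (1, 2, 3)
print(type(collect()))    # <class 'tuple'>

# The star also works at the call site: it unpacks a list into separate arguments
values = ["a", "b", "c"]
print(collect(*values))   # ('a', 'b', 'c')
print(collect(values))    # (['a', 'b', 'c'],) : the list counts as one argument
```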

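In the same way that \*args behaves like a list of arguments, **\*\*kwargs** behaves like a dictionary of arguments. This means that if someone passes a keyword argument that is not declared in your method or function signature, it is automatically put into \*\*kwargs. In the example below, `b` and `c` are not in the signature of `B.__init__`, so they travel through \*\*kwargs: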

In [None]:
class A(object):
    def __init__(self, b, c):
        print(b)
        print(c)
        
class B(A):
    def __init__(self, a, **kwargs):
        super(B, self).__init__(**kwargs)
        print(a)
        
b = B(7,b=8,c=9)

And actually, we can see that it behaves like a dictionary :

In [None]:
# From http://stackoverflow.com/questions/3394835/args-and-kwargs/3394898#3394898
def table_things(**kwargs):
    for name, value in kwargs.items():
        print('{0} = {1}'.format(name, value))
        
table_things(a=1,b=2,c=3,d=4)
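
Two stars can also be used when calling a function, to unpack a dictionary into keyword arguments. The sketch below uses two hypothetical functions, `collect` and `strict`, to show both directions and what happens when a function without \*\*kwargs receives an unknown keyword.

```python
def collect(**kwargs):
    # Inside the function, kwargs is a regular dict: name -> value
    return kwargs


print(collect(a=1, b=2))  # {'a': 1, 'b': 2}

# Two stars at the call site unpack a dictionary into keyword arguments
settings = {"b": 8, "c": 9}
print(collect(**settings) == settings)  # True


def strict(b, c):
    return b + c


print(strict(**settings))  # 17
try:
    strict(a=7, **settings)  # 'a' is not in the signature and there is no **kwargs
except TypeError as error:
    print("TypeError:", error)
```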

The good thing with \*args and \*\*kwargs is that they can work together :

In [None]:
class A(object):
    def __init__(self, b, c):
        print(b)
        print(c)
        
class B(A):
    def __init__(self, a, *args, **kwargs):
        super(B, self).__init__(*args, **kwargs)
        print(a)
b = B(4,"args !", c="kwargs !")
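
To see how Python splits a call between the named parameter, \*args and \*\*kwargs, a function can return the three parts. `show_split` is a hypothetical name used only for this sketch; the call mirrors the one made on `B` above.

```python
def show_split(a, *args, **kwargs):
    # a takes the first positional argument; the rest is split by kind
    return a, args, kwargs


first, positional, named = show_split(4, "args !", c="kwargs !")
print(first)       # 4
print(positional)  # ('args !',)
print(named)       # {'c': 'kwargs !'}
```

In the class example, this is why `A.__init__` receives `"args !"` as `b` and `"kwargs !"` as `c`.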

**DIY**

Write out the subclass Baby for Human, with a parameter for age and a method called *cry* :

In [None]:
# Write your code here

## 4\. Some practical functions

Classes can have particular methods which greatly enhance the final object.

**\_\_str\_\_()** defines how an object is transformed into a string. If, for example, you have an object representing a text, you could use it to combine all the information together:

In [None]:
class Text(object):
    def __init__(self, text, author):
        self.text = text
        self.author = author
    
class TextSTR(Text):
    def __str__(self):
        return self.author + " said : \n" + self.text
    
hello_normal = Text("Hello world", "I")
hello_string = TextSTR("Hello world", "I")

print(str(hello_normal))
print(str(hello_string))
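
Calling `str()` explicitly is not the only way to trigger \_\_str\_\_(): `print()` and string formatting go through it as well. A short sketch, reusing the `TextSTR` class from the cell above:

```python
class Text(object):
    def __init__(self, text, author):
        self.text = text
        self.author = author


class TextSTR(Text):
    def __str__(self):
        return self.author + " said : \n" + self.text


hello_string = TextSTR("Hello world", "I")

print(hello_string)                # same output as print(str(hello_string))
print("{0}".format(hello_string))  # format() also goes through __str__
print(str(hello_string) == "I said : \nHello world")  # True
```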

**\_\_len\_\_()** is the method an object needs in order to have a length. To be able to call `len(instance)`, you need to define it. What the length refers to is up to you:

In [None]:
class Vehicle(object):
    def __init__(self, wheels, passengers):
        self.wheels=wheels
        self.passengers=passengers
        
    def __len__(self):
        """ Return the number of wheels """
        return self.wheels
    
class Text(object):
    def __init__(self, text, author):
        self.text = text
        self.author = author
    def __len__(self):
        """ Return length of the text attribute """
        return len(self.text)
    
moto = Vehicle(2, 2)
print(len(moto))

lorem = Text("Lorem ipsum dolor sit amet, consectetur, adipisci velit", "Pseudo-Cicero")
print(len(lorem))
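
A side effect worth knowing: when a class defines \_\_len\_\_() but no \_\_bool\_\_(), Python uses the length to decide whether an instance counts as true or false. The sketch below reuses the `Text` class; the empty anonymous text is made up for the example.

```python
class Text(object):
    def __init__(self, text, author):
        self.text = text
        self.author = author

    def __len__(self):
        """ Return length of the text attribute """
        return len(self.text)


empty = Text("", "Anonymous")
lorem = Text("Lorem ipsum", "Pseudo-Cicero")

print(len(empty), bool(empty))  # 0 False
print(len(lorem), bool(lorem))  # 11 True

if not empty:
    print("This text has no content")
```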

**\_\_eq\_\_()** provides a way to test equality between two objects. This is useful when you have to compare complex objects:

In [None]:
class Text(object):
    def __init__(self, text, author):
        self.text = text
        self.author = author
    def __eq__(self, other):
        """ Return True if other is a Text with the same text and author """
        return isinstance(other, Text) and self.text == other.text and self.author == other.author
    

lorem = Text("Lorem ipsum dolor sit amet, consectetur, adipisci velit", "Pseudo-Cicero")
ipsum = Text("Lorem ipsum dolor sit amet, consectetur, adipisci velit", "Pseudo-Cicero")
hello = Text("Hello world", "Pseudo-Cicero")
lorem_ipsum = "Lorem ipsum dolor sit amet, consectetur, adipisci velit"

print("lorem == ipsum is", lorem == ipsum) # True because both are Text instances with the same text and author
print("lorem == hello is", lorem == hello) # False because they do not have the same text attribute
print("lorem == lorem_ipsum is", lorem == lorem_ipsum) # False because one is a string and the other a Text instance
print("ipsum == ipsum is", ipsum == ipsum) # True: an object should always be equal to itself
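
Defining \_\_eq\_\_() has two consequences in Python 3 that are easy to miss: `!=` is derived from it automatically, and the instances stop being hashable unless \_\_hash\_\_() is defined too, so they can no longer be put in a set or used as dictionary keys. A minimal sketch with a shortened version of the `Text` class:

```python
class Text(object):
    def __init__(self, text, author):
        self.text = text
        self.author = author

    def __eq__(self, other):
        """ Two texts are equal when text and author match """
        return isinstance(other, Text) and self.text == other.text and self.author == other.author


lorem = Text("Lorem ipsum", "Pseudo-Cicero")
ipsum = Text("Lorem ipsum", "Pseudo-Cicero")
hello = Text("Hello world", "Pseudo-Cicero")

print(lorem != hello)  # True: Python 3 derives != from __eq__
print(lorem != ipsum)  # False
print(lorem is ipsum)  # False: equal, but still two distinct instances

try:
    {lorem}  # building a set needs a hash
except TypeError as error:
    print("TypeError:", error)
```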

**\_\_gt\_\_()** and **\_\_lt\_\_()** stand for *greater than* and *less than*. They let you compare objects according to a criterion of your choice, independently of their length:

In [None]:
class MyAuthors(object):
    def __init__(self, name, works, score):
        self.name = name
        self.works = works
        self.score = score
        
    def __len__(self):
        return self.works  # works is given as a number, so len() cannot be applied to it
    
    def __gt__(self, other):
        return self.score > other.score
    
    def __lt__(self, other):
        return self.score < other.score
        
cicero = MyAuthors("Cicero", 45, 6)
virgil = MyAuthors("Virgil", 10, 8)
caesar = MyAuthors("Caesar", 4, 5)

print("I like Caesar less than Cicero :", caesar < cicero)
print("I like Virgil more than Cicero :", virgil > cicero)
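
Once \_\_lt\_\_() is defined, the instances can also be sorted, because `sorted()` compares items with `<`. The sketch below uses a simplified `MyAuthors` class, without the `works` argument, to keep the example short:

```python
class MyAuthors(object):
    def __init__(self, name, score):
        self.name = name
        self.score = score

    def __lt__(self, other):
        return self.score < other.score


authors = [MyAuthors("Cicero", 6), MyAuthors("Virgil", 8), MyAuthors("Caesar", 5)]

# sorted() compares the items with <, so it goes through __lt__
ranking = [author.name for author in sorted(authors)]
print(ranking)  # ['Caesar', 'Cicero', 'Virgil']

# reverse=True puts the favourite first
print(sorted(authors, reverse=True)[0].name)  # Virgil
```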

## Going further

- A StackOverflow post about \*args and \*\*kwargs, http://stackoverflow.com/a/3394898/2390493
- Official function documentation, https://docs.python.org/3/tutorial/controlflow.html#more-on-defining-functions

## Exercises

**1\.** Transform the diagram explaining objects into Python. The method read() should return the text; quote() should take two integers and return a passage of the text. Don't worry about preserve(), transcript(), and ocr().

In [None]:
# Write your code here

**2\.** Create a class named "Point" with two properties, x and y. When printed, it should show "(x,y)", and one Point is greater than another if both its x and y are greater than the other Point's.

**3\.** MyCapytain is an implementation of the CTS norms in Python. CTS splits texts into passages, which are made of an identifier and a text body.


Create a text class, named `DictText` in the test below, that takes an ordered dict and registers each passage internally. When cast to a string, it should return the text with identifiers and line breaks, such as the following:

a. is a letter

b. is also a letter

In [None]:
from collections import OrderedDict
#Write your code here :

# Test the results
dic = DictText(
    OrderedDict(
        [
            ("a", " is a letter"),
            ("b", " is also a letter")
        ]
    )
)
print(str(dic) == "a. is a letter\nb. is also a letter")

**4\.** Using and updating the previous class, create a new class which creates XML elements based on the first init parameter, given in the form of an XPath-like pattern:

In [None]:
from collections import OrderedDict
from lxml import etree
#Write your code here :

# Test the results
dic = DictText(
    "/lg/l[@n=$1]",
    OrderedDict(
        [
            ("a", " is a letter"),
            ("b", " is also a letter")
        ]
    )
)

print(etree.tostring(dic.xml(), encoding=str))
# <l n='a'> is a letter</l>
# <l n='b'> is also a letter</l>

**5\.** Using and updating the previous class, create a new class which creates XML elements based on the first init parameter, given in the form of an XPath-like pattern, this time taking into account hierarchical references such as `a.1`, which describe a tree like this one:

- a
    - 1
    - 2
- b
    - 1
    - 2

In [None]:
from collections import OrderedDict
from lxml import etree
#Write your code here :

# Test the results
dic = DictText(
    "/root/div[@n=$1]/l[@n=$2]",
    OrderedDict(
        [
            ("a.1", " is the first definition of a letter"),
            ("a.2", " is the second definition of a letter"),
            ("b.1", " is the first definition of b letter"),
            ("b.2", " is the second definition of b letter")
        ]
    )
)

print(etree.tostring(dic.xml(), encoding=str))

---

<p><small><a rel="license" href="http://creativecommons.org/licenses/by-sa/4.0/"><img alt="Creative Commons License" style="border-width:0" src="https://i.creativecommons.org/l/by-sa/4.0/88x31.png" /></a><br /><span xmlns:dct="http://purl.org/dc/terms/" property="dct:title">Python Programming for the Humanities</span> by <a xmlns:cc="http://creativecommons.org/ns#" href="http://fbkarsdorp.github.io/python-course" property="cc:attributionName" rel="cc:attributionURL">http://fbkarsdorp.github.io/python-course</a> is licensed under a <a rel="license" href="http://creativecommons.org/licenses/by-sa/4.0/">Creative Commons Attribution-ShareAlike 4.0 International License</a>. Based on a work at <a xmlns:dct="http://purl.org/dc/terms/" href="https://github.com/fbkarsdorp/python-course" rel="dct:source">https://github.com/fbkarsdorp/python-course</a>.</small></p>