# MSDM5051 Tutorial 9 - Object Oriented Programming 

## Contents

1. Class & Object
2. Inheritance

---
# 1. Class & Object

Up to this point, you should already be familiar with the class syntax in Python. In the earlier tutorials, we frequently defined a `Node` class to demonstrate different data structures. In fact, every value in Python is an object, i.e. an instance of some class. We can show this using the `type()` function:

In [1]:
var1 = 42
var2 = "I love Python"

print(type(var1))
print(type(var2))

<class 'int'>
<class 'str'>


which says that `42` is an object of the class `int` (integer), and `"I love Python"` is an object of the class `str` (string). In lower-level languages like C++, built-in data types such as `int` are sets of rules that tell the computer how to interpret a piece of binary data. But in Python, the distinction between data type and class is not very clear: on one hand, you can use OOP-like syntax on built-in data types (e.g. check out the methods of the `int` class [here](https://docs.python.org/3/library/stdtypes.html#additional-methods-on-integer-types)); on the other hand, defining a new class is equivalent to creating a user-defined data type.
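
As a small illustration of the "OOP-like syntax on built-in types" mentioned above, the sketch below calls methods directly on an `int` and a `str`. The variable names are chosen only for this example.

```python
# Built-in values are objects, so they carry methods like any user-defined object
number = 42
text = "I love Python"

bits_needed = number.bit_length()   # int method: number of bits needed to represent 42
shouted = text.upper()              # str method: returns an upper-case copy

print(bits_needed)                  # 6
print(shouted)                      # I LOVE PYTHON

# Built-in types are themselves classes that derive from `object`
print(isinstance(number, object))   # True
print(issubclass(str, object))      # True
```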

## 1.1. Basic syntax

In general, a class definition consists of two main kinds of elements: attributes and methods.
- **Attribute** = A property of the object, i.e. a parameter that describes the object.
- **Method** = An action of the object, i.e. what the object can do or what can be done to it.


We always define a new class using the `class` keyword first. Then
- The attributes should be defined in the constructor function `__init__()`, i.e. it defines which parameters must be supplied in order to create an object of this class, and it runs immediately every time a new object is initialized.
- Finally, each method is added as an individual function following `__init__()`.

**Note 1:** It is possible to define additional attributes of the class without using `__init__()`, or even add attributes to the objects outside the class definition. But it will just make your code more painful to maintain. 


In [2]:
class My_Class1:
    def __init__(self, prop1, prop2):
        self.attribute1 = prop1
        self.attribute2 = prop2
    
    def method1(self):
        print("this is ", self.attribute1, self.attribute2, " outputting from method1") 
    
    def method2(self, input1, input2):
        self.attribute3 = input1 + input2    # we can create an extra attribute outside of __init__(), but this makes debugging painful
        
    def method3(self, input1, input2):
        pass                                 # "pass" means this function does nothing

In [3]:
obj1 = My_Class1(5,10)

obj1.method2(3,4)       # you CAN create extra attribute by calling a method that does so
obj1.attribute4 = 15    # you CAN add extra attribute to the object
                        # But it does not mean you SHOULD do so because this will be painful for debugging

print(obj1.attribute1, obj1.attribute2, obj1.attribute3, obj1.attribute4)

5 10 7 15


**Note 2:** Every ordinary (instance) method in the class definition, including `__init__()`, must take at least one argument. The first argument refers to the object itself. The word `self` is used by convention, but any valid name works.

In [4]:
# It works perfectly fine when "self" is replaced by another word

class My_Class2:
    def __init__(WOW, prop1, prop2):
        WOW.attribute1 = prop1
        WOW.attribute2 = prop2
        
    def method1(WOW):
        print("this is ", WOW.attribute1, WOW.attribute2, " outputting from method1") 

In [5]:
obj2 = My_Class2(1,4)
obj2.method1()

this is  1 4  outputting from method1
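
To make the role of the first argument more concrete, here is a minimal sketch showing that calling a method on an object is the same as calling the function on the class and passing the object explicitly. The `Greeter` class is hypothetical and used only for illustration.

```python
class Greeter:
    def __init__(self, name):
        self.name = name

    def greet(self):
        return "Hello, " + self.name


g = Greeter("Tom")

# These two calls are equivalent: Python passes `g` as the first argument
via_object = g.greet()
via_class = Greeter.greet(g)

print(via_object)                # Hello, Tom
print(via_object == via_class)   # True
```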


## 1.2. Class variable

When we create multiple objects of the same class, the data (attributes) of each object are stored separately, while the functions (methods) are stored once on the class and shared by all objects (there is no reason to make multiple copies of the same function). So what if we want to create some data that is shared by all objects of the same class? For example, a contact list that is shared by all people?

Python classes allow us to set up class variables which serve this purpose. Class variables are defined directly in the body of the `class` statement, outside any method. When we want to read or write a class variable outside of the class definition, we can refer to it with the syntax `ClassName.varName`.

In [6]:
class People:
    contact_list = {}        # define a class variable called contact_list as a dictionary, initially empty
    
    def __init__(self, name, phone):
        self.name = name
        self.phone = phone
        People.contact_list[name] = phone    # adding an item to the "contact_list" variable of the "People" class
        
        # Note that self.contact_list[name] = phone also works here, because the attribute lookup falls back to the class variable
    

In [7]:
p1 = People("Tom", 123456)
p2 = People("Mary", 789012)

People.contact_list    # reading the "contact_list" variable under "People" class from outside of the class's definition

{'Tom': 123456, 'Mary': 789012}
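
One pitfall worth knowing: reading a class variable through `self` works, but *assigning* through `self` creates a new attribute on that single object instead of updating the shared value. The sketch below illustrates this with a hypothetical `Counter` class (not part of the tutorial's examples).

```python
class Counter:
    total = 0                          # class variable shared by all objects

    def bump_shared(self):
        Counter.total += 1             # updates the variable stored on the class

    def bump_shadowed(self):
        self.total = self.total + 1    # reads the class value, then creates an instance attribute "total"


a = Counter()
b = Counter()

a.bump_shared()
print(Counter.total, a.total, b.total)   # 1 1 1

a.bump_shadowed()
print(Counter.total, a.total, b.total)   # 1 2 1
```

After `bump_shadowed()`, the object `a` has its own `total`, which hides the class variable for `a` only.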

## 1.3. Access control

In many languages like C++, we can control which code can access the attributes/methods of a class. Usually access is divided into 3 levels:

- **Private** - only methods within the same class can access them. 
- **Protected** - only methods from the same class, or from any subclass derived from this class (i.e. via inheritance), can access them. 
- **Public** - any function can access them.

The idea of access control gives convenience to abstraction and security to data. For example, in a `student` object, we expect that the `id` of the student should never be edited after it is created. Then we can set the `id` to be a private attribute and add a public `.read_id()` method, so that we can retrieve the student's ID, but we have no way to edit it.

***However, Python is an exception*** - all attributes/methods are public. There is no way to fully prevent the user from accessing attributes/methods that are supposed to be private. Instead, Python programmers follow these conventions:

- **The weaker version** - Prefix the internal attribute or method with a single underscore, i.e. `_`, as a reminder that it should not be used outside of the object. However, the attribute or method is still accessible as `_varName`. 

- **The stronger version** - Prefix the internal attribute or method with a double underscore, i.e. `__`. This triggers **name mangling**: Python changes the attribute or method's name into the form `_ClassName__varName`, adding an extra step that makes it more annoying to access from outside the object.


In [8]:
class Student:
    def __init__(self, name, ID):
        self._name = name        # the weaker version
        self.__id = ID           # the stronger version
  
    def display_student(self):
        print("Student ", self._name, " with ID ", self.__id)
  

In [9]:
s = Student("ABC", "12345")

# Any method of the object can access the attributes as normal
s.display_student()
  
# attribute with only 1 underscore can still be accessed outside as normal
print("student's name is ", s._name)

# but attribute with 2 underscores cannot be accessed by its original name (error should occur)
print("student's ID is ", s.__id)


Student  ABC  with ID  12345
student's name is  ABC


AttributeError: 'Student' object has no attribute '__id'

In [10]:
# We can obtain a list of available attributes and methods of the object via the dir() function
print(dir(s))

# We can see the "__id" attribute's name is changed to "_Student__id". With this we can access the id outside the object
print("\n student's ID is ", s._Student__id)


['_Student__id', '__class__', '__delattr__', '__dict__', '__dir__', '__doc__', '__eq__', '__format__', '__ge__', '__getattribute__', '__gt__', '__hash__', '__init__', '__init_subclass__', '__le__', '__lt__', '__module__', '__ne__', '__new__', '__reduce__', '__reduce_ex__', '__repr__', '__setattr__', '__sizeof__', '__str__', '__subclasshook__', '__weakref__', '_name', 'display_student']

 student's ID is  12345
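
For the read-only `id` scenario described at the start of this section, one commonly used Python idiom is the built-in `@property` decorator. The sketch below is only one possible approach, and the `StudentRecord` class name is hypothetical.

```python
class StudentRecord:
    def __init__(self, name, student_id):
        self._name = name
        self._id = student_id       # internal by convention only

    @property
    def id(self):                   # read access through record.id
        return self._id


record = StudentRecord("ABC", "12345")
print(record.id)                    # 12345

try:
    record.id = "99999"             # no setter is defined, so assignment fails
except AttributeError as err:
    blocked = True
    print("cannot assign:", err)
else:
    blocked = False
```

Note that this is still a convention rather than real protection: `record._id` can be reassigned directly.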


---
# 2. Inheritance

If we make multiple copies of the same or similar code in different places, it becomes a nightmare when we come to bug fixing. To avoid this problem, we want to reuse the same piece of code whenever possible. In OOP, one technique for this is "inheritance", i.e. defining a "parent-child" relation between class definitions. In textbooks you may see the terminology:

- **Superclass** = The parent class. It contains the common logic shared by all subclasses.
- **Subclass** = The child class. It extends the superclass with specific details. 

## 2.1. Basic inheritance

Technically, every class we create in Python uses inheritance. All Python classes are subclasses of the special built-in class named `object`. If no superclass is given in the `class` statement, the user-defined class automatically inherits from `object`.

In [11]:
### These two definitions are equivalent

class My_Class3:
    def __init__(self, prop1, prop2):
        self.attribute1 = prop1
        self.attribute2 = prop2

########################

class My_Class3(object):
    def __init__(self, prop1, prop2):
        self.attribute1 = prop1
        self.attribute2 = prop2
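
A quick way to check the claim above is to inspect the `__bases__` attribute of a class. In this sketch, `Implicit` and `Explicit` are hypothetical class names used only for the demonstration.

```python
class Implicit:            # no superclass given
    pass


class Explicit(object):    # superclass given explicitly
    pass


# Both classes list `object` as their only direct base class
print(Implicit.__bases__)              # (<class 'object'>,)
print(Explicit.__bases__)              # (<class 'object'>,)
print(issubclass(Implicit, object))    # True
```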

In a basic inheritance relation, the subclass can access all attributes, methods and class variables of its superclass. In addition, the subclass can extend the superclass by providing more methods and class variables specific to this subclass. For example, we can extend the `People` class from the previous example by defining a subclass called `Friend`:

In [12]:
class People:
    contact_list = {}
    
    def __init__(self, name, phone):
        self.name = name
        self.phone = phone
        People.contact_list[name] = phone        
        

class Friend(People):    # Friend is a subclass of People
    
    # contact_list and attributes in __init__() are automatically inherited
    # so the new object is added to the contact_list when created
    
    def play(self):
        print('Today I play with ', self.name, '. His number is ', self.phone)    # we can access the attributes from the People class

In [13]:
p1 = People("Tom", 123456)
p2 = People("Mary", 789012)
f = Friend("Eric", 999999)

f.play()
print(People.contact_list)


Today I play with  Eric . His number is  999999
{'Tom': 123456, 'Mary': 789012, 'Eric': 999999}
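
The parent-child relation can also be checked at runtime with the built-in functions `isinstance()` and `issubclass()`. The sketch below redefines minimal versions of `People` and `Friend` so that it runs on its own.

```python
class People:
    def __init__(self, name, phone):
        self.name = name
        self.phone = phone


class Friend(People):
    pass


f = Friend("Eric", 999999)
p = People("Tom", 123456)

print(isinstance(f, Friend))         # True
print(isinstance(f, People))         # True  - a Friend is also a People
print(isinstance(p, Friend))         # False - the relation goes one way only
print(issubclass(Friend, People))    # True
```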


## 2.2. Overriding

Note that if we define a method in the subclass with the same name as a method in the superclass, the subclass's version overrides the original method in the superclass. The same applies to the `__init__()` function: if we define a new `__init__()` in the subclass, the superclass's `__init__()` is no longer run automatically, so the attributes it would have set are never created and only the attributes in the new `__init__()` exist. 

In [14]:
class People:
    
    def __init__(self, name, phone):
        self.name = name
        self.phone = phone
        

class Friend(People):    # Friend is a subclass of People
    
    def __init__(self, nickname):        # the original __init__ is replaced by this new one,
        self.nickname = nickname         # so we can no longer create a Friend object the same way as a People object
    
    def play(self):
        print('Today I play with ', self.name, '. His number is ', self.phone)    # and the name and phone attributes are never created

In [15]:
f = Friend("Eric", 999999)    # We cannot provide the phone number to create a Friend object anymore

f.play()

TypeError: __init__() takes 2 positional arguments but 3 were given

The correct approach is to use the function `super()`, which returns a proxy object that lets us call the superclass's methods directly. 

- In `__init__()`, calling `super().__init__(...)` runs the superclass's initialization first, so its attributes are created on the new object; the subclass's `__init__()` then continues by setting up the remaining attributes.

- In other methods, it calls the original method defined in the superclass. This is convenient if we want to create a method of the same name (i.e. overriding) that reuses the result of the superclass's method. 

In [16]:
class People:
    
    def __init__(self, name, phone):
        self.name = name
        self.phone = phone
        
    def play(self):
        print('Today I play with ', self.name, '. His number is ', self.phone)

        
class Friend(People):    # Friend is a subclass of People
    
    def __init__(self, name, phone, nickname):
        super().__init__(name, phone)              # initiate as an object from People class first, using super()
        self.nickname = nickname                   # continue initiating with the attributes specific for Friend class
    
    def play(self):                                # the play() method of Friend class is overriding the original definition in People class
        print('I have a friend called ', self.name, '. He has a nickname ', self.nickname)
        super().play()                             # but we can still run the play() method of People class using super()

In [17]:
f = Friend("Eric", 999999, 'King')    

f.play()

I have a friend called  Eric . He has a nickname  King
Today I play with  Eric . His number is  999999


## 2.3. Multiple inheritance

A subclass is not restricted to inheriting from a single superclass. However, inheritance from multiple superclasses can make the code hard to maintain: if we add or edit a method of one superclass, we also have to check the other superclasses of the same subclass to see whether the change introduces new conflicts. Therefore multiple inheritance is generally not recommended. 

You may check out the [diamond problem](https://www.geeksforgeeks.org/multiple-inheritance-in-python/), a common issue that occurs when the names of some attributes and methods from the superclasses overlap, and how Python deals with it.
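
As a minimal sketch of the diamond situation, the example below uses four hypothetical classes (`Base`, `Left`, `Right`, `Child`). Python resolves the name clash by following the method resolution order (MRO), which can be inspected through the `__mro__` attribute.

```python
class Base:
    def hello(self):
        return "Base"


class Left(Base):
    def hello(self):
        return "Left"


class Right(Base):
    def hello(self):
        return "Right"


class Child(Left, Right):    # diamond: both parents derive from Base
    pass


# The MRO lists the classes in the order Python searches for a method
mro_names = [cls.__name__ for cls in Child.__mro__]
print(mro_names)             # ['Child', 'Left', 'Right', 'Base', 'object']

winner = Child().hello()
print(winner)                # Left - the first superclass in the MRO that defines hello()
```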

## 2.4. Polymorphism

Polymorphism is a fancy name for the following concept: we would like to use the *same* interface to carry out *different* operations, depending on the data type of the input, i.e. **overloading the functions / operators of the interface**. As an example, the operator `+` carries out addition on numbers, but concatenation on strings. Overloading operators in this way reduces the number of distinct operators the language needs, and makes the language more convenient to use.
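
User-defined classes can take part in operator overloading as well, by defining special methods such as `__add__()`. The sketch below uses a hypothetical `Vector2D` class purely for illustration.

```python
class Vector2D:
    def __init__(self, x, y):
        self.x = x
        self.y = y

    def __add__(self, other):       # called when evaluating  self + other
        return Vector2D(self.x + other.x, self.y + other.y)

    def __repr__(self):             # controls how the object is printed
        return f"Vector2D({self.x}, {self.y})"


total = Vector2D(1, 2) + Vector2D(3, 4)

print(1 + 2)          # 3     - addition on numbers
print("ab" + "cd")    # abcd  - concatenation on strings
print(total)          # Vector2D(4, 6) - component-wise addition defined by __add__
```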

In OOP, the idea of polymorphism is applied to subclasses which are derived from the same superclass - **it is convenient if we can use the same set of attribute and method names for all the subclasses**, even though the attributes and methods may carry similar but different values and functionalities. This design idea is closely related to the Liskov Substitution Principle, which states that an object of a subclass should be usable wherever an object of its superclass is expected.

To achieve such a design in Python, you can simply use the same names for the shared attributes and methods in every subclass. Take a look at the example:

In [18]:
class AudioFile:
    ext = None        # this line is optional, but serves as a reminder that every subclass should define a class variable "ext"
    
    def __init__(self, filepath):
        if not filepath[-4:] == self.ext:
            raise ValueError("Invalid file format")
        self.filepath = filepath
        

class MP3File(AudioFile):
    ext = ".mp3"
    def play(self):
        print("playing", self.filepath, "as mp3")

        
class WavFile(AudioFile):
    ext = ".wav"
    def play(self):
        print("playing", self.filepath, "as wav")

In [19]:
# Both subclasses share the class variable name "ext" and the method name "play()", but with different values or functionalities
# Even so, we can use the same line of code to get their corresponding "ext" or run their "play()" method

m1 = MP3File("path1.mp3")
m2 = WavFile("path2.wav")

for m in [m1,m2]:
    print(m.ext)
    m.play()


.mp3
playing path1.mp3 as mp3
.wav
playing path2.wav as wav


In fact, OOP polymorphism in Python is made very easy thanks to "[duck typing](https://en.wikipedia.org/wiki/Duck_typing)": the type of a variable is not explicitly fixed, and objects can be used interchangeably as long as they support the required attributes and methods. As you can see below, even though `OggFile` has no inheritance relation with `MP3File` or `WavFile`, the class variable and method of the same name can be used without any problem.

In [20]:
class OggFile:    # OggFile class is not a subclass of AudioFile, but still has the "ext" attribute and "play()" method
    ext = ".ogg"
    
    def __init__(self, filepath):
        if not filepath[-4:] == self.ext:
            raise ValueError("Invalid file format")
        self.filepath = filepath
        
    def play(self):
        print("playing", self.filepath, "as ogg")

In [21]:
m1 = MP3File("path1.mp3")
m2 = WavFile("path2.wav")
m3 = OggFile("path3.ogg")

for m in [m1,m2,m3]:        # Even then, we can use the same line of code to get their corresponding "ext" or run their "play()" method
    print(m.ext)
    m.play()

.mp3
playing path1.mp3 as mp3
.wav
playing path2.wav as wav
.ogg
playing path3.ogg as ogg
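
The flip side of duck typing is that nothing is checked in advance: an object that lacks the required method only fails at the moment the method is called. The sketch below shows one possible way to handle this, using hypothetical `Duck` and `Rock` classes and a helper `try_play()` that are not part of the tutorial's examples.

```python
class Duck:
    def play(self):
        return "quack"


class Rock:                  # has no play() method
    pass


def try_play(thing):
    """Call thing.play() if possible, otherwise return None."""
    try:
        return thing.play()
    except AttributeError:   # raised when the object does not provide play()
        return None


results = [try_play(Duck()), try_play(Rock())]
print(results)               # ['quack', None]
```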
