# Object-oriented programming

One of the main strengths of Python is that it supports object-oriented programming (OOP), although it is not unique in that sense: most code is built around that concept these days. When writing complex programs, or when developing Python packages or libraries, an object-oriented approach is the way to go. An object has unique properties and behaviour. Its properties are defined through *attributes*, and the *methods* of an object determine its behaviour. Once an object is defined, you can use it multiple times in your program, which has the obvious advantage that you don't have to repeat the same code over and over again. By letting objects interact with each other, you define how your program works.

When you are not used to this type of programming, OOP may seem like an abstract concept, and it is best explained by showing examples. But before we can do that, we have to talk about functions first.

## Functions

A function is a block of code that performs a specific task and is only executed when it is called from somewhere else inside your program. You typically create functions when certain tasks have to be executed several times. So as a general rule, when you're programming and you find that you are repeating the same lines of code twice or more, it is time to think about bundling them into a function. This leaves less room for errors and it makes your code much easier to read and understand. Moreover, it is less work to change the function than it is to change the code in all the places where repetition occurs.

The code block below demonstrates how a function is defined. The first line starts with `def`, which tells Python that you plan to define a function. After `def` comes the name of the function, followed by parentheses. In this example, the word `name` appears between parentheses, meaning that the function `describe_person` accepts an *argument*, which in this case is called `name`. Effectively, `name` is a variable that will be available inside (and only inside) the function. Note that the line with the function definition must end with a colon, and that all code that belongs to the function must be indented.

In [2]:
def describe_person(name):
    if name == "Vincent":
        print("Vincent is one of the instructors of the Python Masterclass")
    elif name == "Anushree":
        print("Anushree is the AWS support person during the Python Masterclass")
    else:
        print("Name is unknown, not sure what message to print for this person.")

The function contains a series of conditional statements that check the value of `name` and print a certain message to the screen depending on the name provided. Calling the function (this is programming terminology for executing the function) is done in the following way

In [3]:
describe_person("Vincent")

Vincent is one of the instructors of the Python Masterclass


It is not compulsory for a function to accept arguments, so the following function definition is fine as well

In [4]:
def func_without_args():
    print("This function does not accept arguments")

To execute the function it needs to be called like this

In [5]:
func_without_args()

This function does not accept arguments


***Exercise 1***: In the code cell below, try what happens when you enter the function name without parentheses.

In [6]:
func_without_args

<function __main__.func_without_args()>
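
The output above shows that a function name without parentheses refers to the function object itself rather than to the result of calling it. A minimal sketch of what this means in practice (the names `greet` and `alias` are made up for this illustration):

```python
def greet():
    return "Hello!"

# Without parentheses we get the function object; nothing is executed yet
alias = greet

# Adding parentheses calls the function, through either name
print(alias())
print(alias is greet)
```

Because `alias` and `greet` point to the same object, calling either one runs the same code.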

A function can also return a value. For example

In [7]:
def describe_person(name):
    if name == "Vincent":
        rv = "Vincent is one of the instructors of the Python Masterclass"
    elif name == "Anushree":
        rv = "Anushree is the AWS support person during the Python Masterclass"
    else:
        rv = "Name is unknown, not sure what message to print for this person."
    
    return rv

Now it can be called in the following way

In [8]:
msg = describe_person("Anushree")
print(msg)

Anushree is the AWS support person during the Python Masterclass


A function can have multiple arguments. It can also have optional arguments, which are called *keyword arguments* (or simply *kwargs*) in Python. An argument becomes optional when a default value is specified for it within the function definition; if the caller omits it, the default is used. In the example below `age` becomes a keyword argument because a default value of 23 years is assigned to it using the equal sign.

In [9]:
def describe_person(name, age=23):    
    return f"{name} is {age} years old."

The function can now be called like this

In [10]:
msg = describe_person("Vincent", 29)
print(msg)

Vincent is 29 years old.


Or like this

In [11]:
msg = describe_person("Anushree")
print(msg)

Anushree is 23 years old.


Note that keyword arguments must always come after non-keyword arguments, otherwise Python will raise a `SyntaxError`.
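
The ordering rule can be tried out without breaking the notebook. The sketch below is only an illustration: it passes a deliberately wrong definition as a string to the built-in `compile` function, so that the error can be caught and printed. The variable name `bad_definition` is hypothetical, and the exact wording of the error message differs between Python versions.

```python
def describe_person(name, age=23):
    return f"{name} is {age} years old."

# The optional argument can be passed by position or by name
print(describe_person("Vincent", 29))
print(describe_person("Vincent", age=29))

# A definition with the default-valued argument first is rejected
# as soon as Python tries to compile it
bad_definition = "def describe_person(age=23, name):\n    pass\n"
try:
    compile(bad_definition, "<example>", "exec")
except SyntaxError as err:
    print("SyntaxError:", err.msg)
```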

## Docstrings
It is common practice to document the behaviour of a function using a so-called *docstring*. First of all, a docstring should describe what the function's purpose is. Then it provides information about the arguments (or parameters) that must/can be passed to the function, and the value that the function returns.

In [12]:
def describe_person(name, age=23):
    """
    This function returns a message with a persons name and age.
    
    Parameters
    ----------
    name : str
        A string with the name of the person.
    age : int, optional
        An integer with the person's age in years. Default: 23.
        
    Returns
    -------
    result : str
        A string containing the message.
    """
    return f"{name} is {age} years old."

Suddenly our code got a lot longer and the time involved in creating good docstrings should not be underestimated. But they are an essential part of good coding practice because they allow others to understand what the function does. This includes the users of your program, other developers and yourself when you return to a piece of code several years after you last worked on it.

The style used in the code cell above follows the conventions for the <A href="https://numpydoc.readthedocs.io/en/latest/format.html">numpydoc extension for Sphinx</A>. Sphinx is a tool that creates the documentation of your code project partially based on the docstrings you have provided, so writing good docstrings can save you lots of time later on when you want to share your code with others.

The docstring also serves to provide interactive help about a function within an IDE, or a notebook environment. For example, we can now get help on our function by typing

In [13]:
describe_person?

Signature: describe_person(name, age=23)
Docstring:
This function returns a message with a persons name and age.

Parameters
----------
name : str
    A string with the name of the person.
age : int, optional
    An integer with the person's age in years. Default: 23.

Returns
-------
result : str
    A string containing the message.
File:      c:\users\vince\appdata\local\temp\ipykernel_17484\381386399.py
Type:      function

or

In [14]:
help(describe_person)

Help on function describe_person in module __main__:

describe_person(name, age=23)
    This function returns a message with a persons name and age.
    
    Parameters
    ----------
    name : str
        A string with the name of the person.
    age : int, optional
        An integer with the person's age in years. Default: 23.
        
    Returns
    -------
    result : str
        A string containing the message.
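
Both ways of asking for help rely on the docstring being stored on the function object, in its `__doc__` attribute. A small sketch with a shortened, one-line docstring (shortened here only to keep the example compact):

```python
def describe_person(name, age=23):
    """Return a message with a person's name and age."""
    return f"{name} is {age} years old."

# The docstring is an ordinary string attached to the function
print(describe_person.__doc__)
```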



In an IDE, typing the function name followed by the opening parenthesis generally results in a window popping up with the docstring. In a notebook environment you can try typing the function name followed by the opening parenthesis and then pressing Shift + Tab (from experience this does not work in all cases, but give it a try in the code cell below).

In [15]:
describe_person( # Place your cursor after the opening parenthesis and hit Shift + Tab to display the docstring

SyntaxError: incomplete input (4222750075.py, line 1)

## Checking arguments

The following function calculates the square of $x$ when $x \le 0$ and the square root of $x$ when $x > 0$ (not sure when you'd ever need this, but it is just an example)

In [17]:
import numpy as np

def funky_function(x):
    if x <= 0:
        return x ** 2
    else:
        return np.sqrt(x)

Printing the result for $x = -10$ and $x = 10$ demonstrates that it works

In [18]:
print(funky_function(-10))
print(funky_function(10))

100
3.1622776601683795


But what if we pass an array to the function? This results in an error because of the `if` statement

In [19]:
x = np.linspace(-10, 10, 3)
print(x)
funky_function(x)

[-10.   0.  10.]


ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()

To prevent such things from happening, it is important to add some code to your function that checks the type of `x`, and to rewrite the function in such a way that it works both when `x` is a scalar and when it is an array. There are multiple ways to do this, for example

In [20]:
def funky_function(x):
    x = np.atleast_1d(x)
    rv = np.empty_like(x)
    idx = x <= 0
    rv[idx] = x[idx] ** 2
    rv[~idx] = np.sqrt(x[~idx])
    
    return rv

Now let's try the code that did not work before

In [21]:
x = np.linspace(-10, 10, 3)
funky_function(x)

array([100.        ,   0.        ,   3.16227766])

and also try it with a scalar `x`

In [22]:
funky_function(10)

array([3])

That worked! Or did it? Is the square root of 10 really 3? Well, almost, but what happened here is that the result of the `sqrt` function was converted to an integer when it was stored, which discards the fractional part. Why would NumPy do that, since it is clearly wrong?! The answer has to do with variable types. In this case we passed 10, which is an integer. The number got converted to an array with NumPy's `atleast_1d` function, from which the array with the return values was derived using the `empty_like` function. Because the function's input argument `x` was an integer, both arrays end up being integer arrays and the numbers they contain will remain integers, even when you assign a float to an element.
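
The effect can be reproduced in isolation. The sketch below repeats the two NumPy calls from the function outside of it; the variable names `int_array` and `result` are chosen for this illustration only. The exact integer type that is printed depends on the platform.

```python
import numpy as np

int_array = np.atleast_1d(10)       # integer input gives an integer array
result = np.empty_like(int_array)   # inherits the integer type of int_array

result[0] = np.sqrt(10)             # the float 3.162... is stored as 3
print(int_array.dtype, result.dtype)
print(result)

result[0] = 3.9                     # the fractional part is dropped here too
print(result)
```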

This shows that writing functions can be very tricky and that you should always verify that their result is what you intended under the widest range of conceivable circumstances (there are in fact special techniques for this, which are known as unit testing). Now let's solve this issue by ensuring that `rv` is an array of floats, simply by adding `dtype=float` when we create it.

In [23]:
def funky_function(x):
    x = np.atleast_1d(x)
    rv = np.empty_like(x, dtype=float)
    idx = x <= 0
    rv[idx] = x[idx] ** 2
    rv[~idx] = np.sqrt(x[~idx])
    
    return rv

In [24]:
funky_function(10)

array([3.16227766])
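
Checks like the one above can be collected in a small test function, so that they can be repeated every time the code changes. This is only a sketch of the idea of unit testing: the name `test_funky_function` and the choice of test values are assumptions made for this example, and `np.testing.assert_allclose` is used to compare floating-point results within a tolerance.

```python
import numpy as np

def funky_function(x):
    x = np.atleast_1d(x)
    rv = np.empty_like(x, dtype=float)
    idx = x <= 0
    rv[idx] = x[idx] ** 2
    rv[~idx] = np.sqrt(x[~idx])
    return rv

def test_funky_function():
    # Integer scalar, float scalar and list input should all give correct floats
    np.testing.assert_allclose(funky_function(10), [np.sqrt(10)])
    np.testing.assert_allclose(funky_function(-10.0), [100.0])
    np.testing.assert_allclose(funky_function([-10, 0, 10]), [100.0, 0.0, np.sqrt(10)])
    # The return value should be a float array, whatever the input type
    assert funky_function(10).dtype == float

test_funky_function()
print("All checks passed")
```

If any of the checks fails, an `AssertionError` is raised and the final message is never printed.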

Does this mean we are done now? Not quite. What if a user calls our function and accidentally passes a string variable rather than a number?

In [25]:
funky_function('ten')

TypeError: '<=' not supported between instances of 'numpy.ndarray' and 'int'

Obviously that didn't work. If this were a real function that was being used inside a `for` loop, for example, the program would crash. As a programmer you have to think of ways to prevent such things from happening. One way would be to embed the code inside the function between `try` and `except` statements. The code under `try` gets executed, but when an error occurs, Python jumps to the `except` part where you can specify what needs to happen in that case. In this example, it probably makes most sense to issue some sort of warning and return an empty return value, which is available in Python as `None`

In [26]:
def funky_function(x):
    try:
        # The code lines below are the same as before
        x = np.atleast_1d(x)
        rv = np.empty_like(x, dtype=float)
        idx = x <= 0
        rv[idx] = x[idx] ** 2
        rv[~idx] = np.sqrt(x[~idx])

        return rv
    except Exception:  # a bare except would also swallow KeyboardInterrupt
        print("Warning: an error occurred. Check input argument.")
        return None

Now let's try if it worked by embedding the function inside a `for` loop

In [27]:
for x in [-10, 0, 10, 'ten']:
    y = funky_function(x)
    print(y)

[100.]
[0.]
[3.16227766]
None


Finally, let's have a look at how we can set up the code within the for loop to ensure that our program does not crash when we try to add the first element of the array `y` to a list. Since we know that `funky_function` returns `None` when an error occurred, and an array otherwise, we can check for this condition, and only add the first element of `y` when `y` is not `None`.

In [28]:
meaningful_results = []
for x in [-10, 0, 10, 'ten']:
    y = funky_function(x)
    if y is not None:
        meaningful_results.append(y[0])

print(meaningful_results)

[100.0, 0.0, 3.1622776601683795]


Checking for errors can be a lot of work and require lots and lots of lines of code. Nonetheless it should become an integral part of your coding as the potential for incorrect results is enormous.
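
An alternative to catching every error inside the function is to check the input explicitly and raise an informative exception, which leaves it to the calling code to decide what to do. The sketch below assumes that only numeric input should be accepted; testing the array type with `np.issubdtype` is one possible way to do that, not the only one.

```python
import numpy as np

def funky_function(x):
    x = np.atleast_1d(x)
    # Reject input that NumPy could not interpret as numbers (e.g. strings)
    if not np.issubdtype(x.dtype, np.number):
        raise TypeError(f"x must be numeric, got dtype {x.dtype}")
    rv = np.empty_like(x, dtype=float)
    idx = x <= 0
    rv[idx] = x[idx] ** 2
    rv[~idx] = np.sqrt(x[~idx])
    return rv

meaningful_results = []
for x in [-10, 0, 10, 'ten']:
    try:
        meaningful_results.append(funky_function(x)[0])
    except TypeError as err:
        # Only the expected error type is caught; anything else still surfaces
        print(f"Skipping {x!r}: {err}")

print(meaningful_results)
```

The difference with returning `None` is that the caller cannot accidentally ignore the problem: without the `try` and `except` in the loop, the program stops with a clear message.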

## Classes

In Python, classes form the blueprint for the structure and behaviour of objects. A class has properties (also called attributes) and methods, as shown in the code cell below. The definition of a class starts (not surprisingly) with the word `class`, which is followed by its name (and a colon to end the code line).

All code that defines the class is indented. In the example below, two functions are defined as part of the class (and therefore they should rather be called methods). The first one is `__init__`, which is the method that gets executed when an object gets initialized (more on that later). It can be seen that there are four arguments: `self`, `name`, `age` and `nationality`. The argument `self` is a bit of a special one. It occurs because the object needs to know about its own existence in the computer memory. A user doesn't have to worry about `self`, the argument gets passed to the function invisibly by Python, so as far as the user is concerned, there are really only three arguments for the function (`name`, `age` and `nationality`).

Within the `__init__` method, the arguments are used to set the properties of the class. These are `_name`, `_age` and `_nationality`. Because they are preceded by `self`, we know that they are properties of the class. The leading underscore is there because there is a convention in Python that if a variable is not supposed to be modified by the user, its name receives a leading underscore. That is not to say that the user can't change it (there is nothing that prevents them from doing this) but at least they've been warned not to touch it, as otherwise unexpected behaviour might occur.

The second method is a simple function that prints a message to the screen depending on the state (i.e. the values of the properties) of the object. Notice that the word `self` appears once more as an argument, simply because Python requires it because we are defining a class. In practice, the function gets called without any arguments, as will be demonstrated in the next code cell.

In [29]:
class Person:
    def __init__(self, name, age, nationality):
        self._name = name
        self._age = age
        self._nationality = nationality
    
    def introduce_yourself(self):
        print(f"Hello, my name is {self._name} and I am {self._age} years old.")

With the class `Person` now defined, we can create an instance of it. That means that we will initialize it as an object that has actual data. The resulting class instance is called `vincent` and the code looks like we're calling a function. In fact, we are, because Python understands that it needs to call the `__init__` method with the arguments that we pass in the parentheses behind `Person`. Once the class instance exists, its methods can be called; in this case we ask Vincent to introduce himself. Note that, as mentioned previously, there is no need to worry about that `self` argument, neither when initializing `Person`, nor when calling `introduce_yourself`.

In [30]:
vincent = Person("Vincent", 29, "Dutch")
vincent.introduce_yourself()

Hello, my name is Vincent and I am 29 years old.
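
Two points from the explanation above can be made concrete with a short sketch: what Python does with `self` behind the scenes, and the fact that the leading underscore is a convention rather than a lock. The class definition is repeated so that the example runs on its own.

```python
class Person:
    def __init__(self, name, age, nationality):
        self._name = name
        self._age = age
        self._nationality = nationality

    def introduce_yourself(self):
        print(f"Hello, my name is {self._name} and I am {self._age} years old.")

vincent = Person("Vincent", 29, "Dutch")

# These two calls do the same thing: in the first one Python fills in
# `self` for us, in the second one we pass the instance explicitly
vincent.introduce_yourself()
Person.introduce_yourself(vincent)

# Nothing stops a user from changing an underscore attribute,
# the underscore merely signals that they should not
vincent._age = 30
vincent.introduce_yourself()
```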


Besides `__init__` there are a few other special methods, for example `__str__`. This method can be defined to provide an elegant string representation of the class. The code cell below copies the class definition from before, but adds the `__str__` method. As can be seen, it returns a string with information about the class property values.

In [31]:
class Person:
    def __init__(self, name, age, nationality):
        self._name = name
        self._age = age
        self._nationality = nationality
    
    def introduce_yourself(self):
        print(f"Hello, my name is {self._name} and I am {self._age} years old.")
    
    def __str__(self):
        return f"Name: {self._name}, age: {self._age}, nationality: {self._nationality}"

Let's see what happens when we initialize `vincent` based on the new class definition and then combine it with a `print` statement

In [32]:
vincent = Person("Vincent", 29, "Dutch")
print(vincent)

Name: Vincent, age: 29, nationality: Dutch
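
The `print` function is not the only place where `__str__` is used. As a small sketch, the built-in `str` function and f-strings give the same text; the variable name `text` is introduced here only for the illustration.

```python
class Person:
    def __init__(self, name, age, nationality):
        self._name = name
        self._age = age
        self._nationality = nationality

    def __str__(self):
        return f"Name: {self._name}, age: {self._age}, nationality: {self._nationality}"

vincent = Person("Vincent", 29, "Dutch")

# str() calls the __str__ method and returns its result
text = str(vincent)
print(text)

# An f-string does the same when the object appears between braces
print(f"Instructor: {vincent}")
```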


There can be as many instances of the class as we want, so we can define multiple instances, which all follow the blueprint of `Person`, but each have different properties.

In [33]:
# Define additional persons
onno = Person("Onno", 23, "Dutch")
michael = Person("Michael", 23, "Australian")
stacey = Person("Stacey", 23, "Australian")

This is useful when we also have a class that defines a course, which is defined in the code cell below. The `__init__` method for this class works a little differently than before. No arguments are provided when `Course` is initialized. All the `__init__` method does is create two properties (`_attendees` and `_instructors`), which are empty lists. So directly after initialization, these lists are still empty, and they will only be populated when the method `add_person` gets called. This method expects an instance of the `Person` class and takes a keyword argument `role`, which the user can provide to indicate if a person is an attendee or an instructor. Depending on the value of `role`, the `Person` instance will get added to either the list `_attendees` or `_instructors`.

The method `list_persons` prints the attendees and the instructors to the screen, thereby making use of the fact that we added a `__str__` method to the `Person` class.

In [34]:
class Course:
    def __init__(self):
        self._attendees = []
        self._instructors = []
    
    def add_person(self, person, role="attendee"):
        if role == "attendee":
            self._attendees.append(person)
        elif role == "instructor":
            self._instructors.append(person)

    def list_persons(self):
        print("Attendees")
        for person in self._attendees:
            print('\t', person)
        print("Instructors")
        for person in self._instructors:
            print('\t', person)

We can now initialize `Course` and call the resulting object `PythonMasterclass`. Since we already defined multiple persons earlier, we can add them to the course and indicate their roles.

In [35]:
PythonMasterclass = Course()
PythonMasterclass.add_person(michael)
PythonMasterclass.add_person(stacey)
PythonMasterclass.add_person(vincent, role="instructor")
PythonMasterclass.add_person(onno, role="instructor")

Now let's see who is in this course

In [36]:
PythonMasterclass.list_persons()

Attendees
	 Name: Michael, age: 23, nationality: Australian
	 Name: Stacey, age: 23, nationality: Australian
Instructors
	 Name: Vincent, age: 29, nationality: Dutch
	 Name: Onno, age: 23, nationality: Dutch


### Inheritance

One of the strengths of object-oriented programming is that you can mould classes after other classes. This means that you can create a class with generic attributes and methods, and derive classes that inherit these but add specific properties and behaviour. Continuing with the course example above, we could have used the `Person` class to derive separate classes for attendees and instructors. They'd all be persons, but an attendee requires different attributes (for example, a record of their homework completions) than an instructor.

A class that derives from another class is called a child class, and the original class is the parent. Let's define two new classes `Attendee` and `Instructor`, which derive from `Person`. For `Attendee` we add a new property which is a list that contains the session numbers for which the homework assignments were submitted. For `Instructor` we add a property that describes their expertise.

Note that for `Attendee` the signature of the `__init__` method is identical to that of the `Person` class. Because the `Person` class uses the function arguments to assign values to the class properties, we must call that same method to make sure that this gets done. This is why the first line of code in the `__init__` method for `Attendee` uses the function `super()` to call the parent class's `__init__` method. The second line defines a new property `_homework_completed` for the `Attendee` class.

For `Instructor` the `__init__` method accepts one additional argument `expertise`. The properties that are also in `Person` are set by calling `super().__init__`, the new `_expertise` property is set using the value of the `expertise` argument.

In [38]:
class Attendee(Person):
    def __init__(self, name, age, nationality):
        super().__init__(name, age, nationality)
        self._homework_completed = []

class Instructor(Person):
    def __init__(self, name, age, nationality, expertise):
        super().__init__(name, age, nationality)
        self._expertise = expertise
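
A child class does not only reuse the parent's `__init__` method, it also inherits all other methods. The sketch below repeats the `Person` class (including `__str__`) so that it runs on its own, and shows that an `Attendee` can introduce itself and be printed even though `Attendee` defines neither method. The homework list is accessed directly here purely for illustration.

```python
class Person:
    def __init__(self, name, age, nationality):
        self._name = name
        self._age = age
        self._nationality = nationality

    def introduce_yourself(self):
        print(f"Hello, my name is {self._name} and I am {self._age} years old.")

    def __str__(self):
        return f"Name: {self._name}, age: {self._age}, nationality: {self._nationality}"

class Attendee(Person):
    def __init__(self, name, age, nationality):
        super().__init__(name, age, nationality)
        self._homework_completed = []

stacey = Attendee("Stacey", 23, "Australian")

# Methods defined on Person are available on the child class
stacey.introduce_yourself()
print(stacey)

# The property added by Attendee starts out as an empty list
stacey._homework_completed.append(1)
print(stacey._homework_completed)
```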

With these classes defined, we can also change the way the `Course` class works. Rather than having to pass `role` as a keyword argument, we can let the method decide to which of the two lists, `_attendees` or `_instructors`, the person needs to be assigned. For this purpose, Python has the function `isinstance`, which checks if a variable is an instance of a certain class.

In [48]:
class Course:
    def __init__(self):
        self._attendees = []
        self._instructors = []
    
    def add_person(self, person):
        if isinstance(person, Attendee):
            self._attendees.append(person)
        elif isinstance(person, Instructor):
            self._instructors.append(person)

    def list_persons(self):
        print("Attendees")
        for person in self._attendees:
            print('\t', person)
        print("Instructors")
        for person in self._instructors:
            print('\t', person)

The code cell below demonstrates how the new classes are used

In [49]:
# Define the instructors
vincent = Instructor("Vincent", 29, "Dutch", ["Python", "groundwater"])
onno = Instructor("Onno", 23, "Dutch", ["Python", "groundwater"])
# Define the attendees
michael = Attendee("Michael", 23, "Australian")
stacey = Attendee("Stacey", 23, "Australian")

# Initialize an instance of Course
PythonMasterclass = Course()

# Add the instructors and the attendees
PythonMasterclass.add_person(vincent)
PythonMasterclass.add_person(onno)
PythonMasterclass.add_person(michael)
PythonMasterclass.add_person(stacey)

# List everyone in the class
PythonMasterclass.list_persons()

Attendees
	 Name: Michael, age: 23, nationality: Australian
	 Name: Stacey, age: 23, nationality: Australian
Instructors
	 Name: Vincent, age: 29, nationality: Dutch
	 Name: Onno, age: 23, nationality: Dutch
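
It helps to know how `isinstance` treats parent and child classes. The sketch below uses stripped-down versions of the three classes (only a name is stored, which is a simplification made for this example) to show that an instance of a child class also counts as an instance of its parent, but not the other way around.

```python
class Person:
    def __init__(self, name):
        self._name = name

class Attendee(Person):
    pass

class Instructor(Person):
    pass

stacey = Attendee("Stacey")
someone = Person("Someone")

print(isinstance(stacey, Attendee))    # True
print(isinstance(stacey, Person))      # True: an Attendee is also a Person
print(isinstance(stacey, Instructor))  # False
print(isinstance(someone, Attendee))   # False: a plain Person is not an Attendee
```

One consequence for the `add_person` method shown earlier is that a plain `Person` instance matches neither the `if` nor the `elif` branch, so it is not added to either list and no message is shown.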


***Exercise 2***: Add a check to the `add_person` method that prevents instructors who don't have any Python expertise from being added to the course.

*Hint*: You can check if the word 'Python' occurs in the `_expertise` property (which is a list) of the `Instructor` class instance with the following code: `'Python' in person._expertise`

In [55]:
class Course:
    def __init__(self):
        self._attendees = []
        self._instructors = []
    
    def add_person(self, person):
        if isinstance(person, Attendee):
            self._attendees.append(person)
        elif isinstance(person, Instructor):
            self._instructors.append(person)

    # Note that the list_persons method was omitted for brevity

In [56]:
# Answer (delete before distributing)
class Course:
    def __init__(self):
        self._attendees = []
        self._instructors = []
    
    def add_person(self, person):
        if isinstance(person, Attendee):
            self._attendees.append(person)
        elif isinstance(person, Instructor):
            if 'Python' in person._expertise:
                self._instructors.append(person)
            else:
                print(f"Instructor {person._name} is not qualified to teach Python.")

    # Note that the list_persons method was omitted for brevity

darcy = Instructor("Henry", 175, "French", ["groundwater"])
PythonMasterclass = Course()
PythonMasterclass.add_person(darcy)

Instructor Henry is not qualified to teach Python.


***Exercise 3***: We can only add a single person each time, which gets a little annoying when there are many people in the course. Modify the definition of the `add_person` method in the code cell below to add multiple persons at the same time. 

*Hint*: You can pass a `list` as a function argument.

In [48]:
class Course:
    def __init__(self):
        self._attendees = []
        self._instructors = []
    
    def add_person(self, person):
        if isinstance(person, Attendee):
            self._attendees.append(person)
        elif isinstance(person, Instructor):
            self._instructors.append(person)

    def list_persons(self):
        print("Attendees")
        for person in self._attendees:
            print('\t', person)
        print("Instructors")
        for person in self._instructors:
            print('\t', person)

In [63]:
# Answer (delete before distributing)
class Course:
    def __init__(self):
        self._attendees = []
        self._instructors = []
    
    def add_persons(self, persons):
        for person in persons:
            if isinstance(person, Attendee):
                self._attendees.append(person)
            elif isinstance(person, Instructor):
                self._instructors.append(person)

    def list_persons(self):
        print("Attendees")
        for person in self._attendees:
            print('\t', person)
        print("Instructors")
        for person in self._instructors:
            print('\t', person)
            
PythonMasterclass = Course()
PythonMasterclass.add_persons([vincent, onno, stacey, michael])
PythonMasterclass.list_persons()

Attendees
	 Name: Stacey, age: 23, nationality: Australian
	 Name: Michael, age: 23, nationality: Australian
Instructors
	 Name: Vincent, age: 29, nationality: Dutch
	 Name: Onno, age: 23, nationality: Dutch


In [59]:
# Raise a TypeError
