# Subclasses and Inheritance

In the last Jupyter notebook, we covered how you can write custom classes to create Python objects with behaviours defined by you. However, we did not discuss a major feature of classes: their extensibility. In this notebook, we will cover creating subclasses to extend classes, i.e., create a copy of the base class with additional or modified behaviour. To explore subclasses, we will consider two scenarios:

1. extending or modifying classes written by someone else (e.g., Python built-in types)
2. writing multiple related classes yourself

## Writing a basic subclass

Let's start with an example of functionality that we have already covered in this course: an iterable `int`. An iterable `int` is a great example of when you might want to create a subclass. We want a class that has all the characteristics and behaviours of an `int`, but that is also iterable, i.e., we want to add a little bit of extra functionality to something that already exists.

To refresh your memory, the approach we have taken before to iterate over an `int` is to convert it to a `str` and then iterate over the characters of the `str`, yielding each as an `int`. That function looked something like this:

In [1]:
from typing import Iterator

def iter_int(num: int) -> Iterator[int]:
    num_string = str(num)
    
    for char in num_string:
        yield int(char)
    
for x in iter_int(123):
    print(x)

1
2
3


Perhaps we don't want to have to call this function all the time, but instead just want iteration to happen for `int`s like it does for `str` and other iterable classes. In order for a class to support iteration, it needs to have an implemented `__iter__()` method. Note the double underscores which tell us that this is a method that Python will look for under some circumstances.
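
As a quick check of that claim, the sketch below asks Python whether `int` and `str` define `__iter__()` and shows what happens when we try to iterate over a plain `int`. The variable names are only for illustration.

```python
# int defines no __iter__() method, whereas str does.
int_has_iter = hasattr(int, "__iter__")
str_has_iter = hasattr(str, "__iter__")
print(int_has_iter)  # False
print(str_has_iter)  # True

# Without __iter__(), asking for an iterator over an int raises a TypeError.
message = ""
try:
    iter(123)
except TypeError as error:
    message = str(error)

print(message)  # 'int' object is not iterable
```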

Let's start by creating a subclass of `int` that is simply a copy of the `int` class. Here, I'm using the word copy to mean that our new class has the same methods etc. However, it's not strictly a copy due to differences in inheritance which we will discuss shortly.

In [2]:
class IterInt(int):
    pass

We now have a minimal subclass of `int`, which is named `IterInt`. If you compare this minimal subclass to the minimal class defined in the last notebook, you will see that the only difference is the inclusion of parentheses, containing the name of the base class, after the class name. Indeed, that's all there is to it if you want to create a subclass. Now that we have our subclass of `int`, we can go about customizing its behaviour exactly as if we were writing a class from scratch. The only difference between the classes created in the last notebook and a subclass is how Python looks for attributes and methods. Python searches using a system of "inheritance". Let's go on a quick tangent to discuss what inheritance means before we get into making our `IterInt` class.
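
Before that tangent, here is a small sketch confirming that the empty subclass already behaves like an `int`. It only uses the built-in `issubclass()` and `isinstance()` functions and one ordinary `int` method.

```python
class IterInt(int):
    pass


x = IterInt(123)

print(issubclass(IterInt, int))  # True
print(isinstance(x, int))        # True

# Everything int can do is available without us writing any code.
print(x + 1)                     # 124
print(x.bit_length())            # 7
```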

## Inheritance

In the `IterInt` class definition above, we included `int` in parentheses after the class name. That declares that `IterInt` is a subclass of `int`. What that means in practice is that `IterInt` will inherit its methods and attributes from `int`, i.e., unless we override a method, any time a method is called for our `IterInt` class, the `int` method will be used. We can see that more clearly if we quickly throw together a simple example of a custom base class and a subclass of it.

In [3]:
class BaseClass:
    some_attribute = "I am an attribute of BaseClass"
    
    def some_method(self):
        print("This is BaseClass.some_method()")

class SubClass(BaseClass):
    pass

Our `BaseClass` works just like the classes we've already discussed.

In [4]:
x = BaseClass()

print(x.some_attribute)
x.some_method()

I am an attribute of BaseClass
This is BaseClass.some_method()


Our `SubClass` behaves the same way, as we haven't changed or overridden anything yet.

In [5]:
y = SubClass()

print(y.some_attribute)
y.some_method()

I am an attribute of BaseClass
This is BaseClass.some_method()


But if our `SubClass` redefines something that was defined in our `BaseClass`, which does Python use?

In [6]:
class SubClass(BaseClass):
    some_attribute = "Now I'm an attribute of SubClass"
    
z = SubClass()

print(z.some_attribute)
z.some_method()

Now I'm an attribute of SubClass
This is BaseClass.some_method()


Now our `SubClass` has an attribute with the same name as one in `BaseClass`. Creating something in a subclass that has the same name as something in the base class is called overriding. It works by taking advantage of how Python figures out what to use when you access an attribute or call a method.

Python identifies what it should run by performing the following search (using the example of `z.some_method()` in the above code block):

1. Is `some_method()` found within the namespace of the instance, `z`, of `SubClass`? (You can define functions and store them in individual instances if you like)
2. Is `some_method()` found within the namespace of the class `SubClass`?
3. Is `some_method()` found within the namespace of the class `BaseClass`?

As soon as Python finds an attribute or method that matches, it uses that attribute or method and stops searching. That means that if you override anything in your base class, Python will see the version in your subclass first and therefore use that.
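
The three steps above can be traced with a short sketch. It redefines the two example classes so that it runs on its own, and `some_method()` returns its message rather than printing it so the result is easy to inspect. The strings and variable names are illustrative.

```python
class BaseClass:
    some_attribute = "I am an attribute of BaseClass"

    def some_method(self):
        return "This is BaseClass.some_method()"


class SubClass(BaseClass):
    some_attribute = "I am an attribute of SubClass"


z = SubClass()

# Step 1 finds nothing: the instance namespace starts out empty.
print(vars(z))  # {}

# Step 2 succeeds for some_attribute, so BaseClass is never consulted.
from_subclass = z.some_attribute
print(from_subclass)

# some_method is in neither z nor SubClass, so step 3 finds it in BaseClass.
print(z.some_method())

# A value stored on the instance is found at step 1 and shadows both classes.
z.some_attribute = "I am stored on the instance z"
from_instance = z.some_attribute
print(from_instance)

# Removing the instance value uncovers the SubClass version again.
del z.some_attribute
print(z.some_attribute)
```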

Before we move back to our `IterInt` class, there are two more things you should know about the above search, which have implications for how you will create subclasses:

1. The search is recursive, i.e., Python will check the classes inherited by each class from your current subclass all the way up to the end of the chain. All classes inherit from the class `object` in Python, so the chain will terminate there. If you don't declare a subclass with parentheses, your class still inherits from `object`. You can see [the docs for object here](https://docs.python.org/3/library/functions.html#object).
2. The search proceeds from left to right through base classes if there is more than one. You can specify more than one base class when defining a class. Consider the example `class NewClass(BaseA, BaseB):`. When searching for methods or attributes accessed on an instance of `NewClass`, Python first searches the namespace of the instance, then `NewClass`, then `BaseA` and the parents of `BaseA`, then `BaseB` and the parents of `BaseB`. A class that appears in both branches, such as `object`, is searched only once, after every class that inherits from it. This ordering is called the method resolution order (MRO).
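
Python records the exact order it will search in the `__mro__` attribute of every class, so you can inspect it directly. The sketch below uses the `NewClass(BaseA, BaseB)` example; the `who()` method is a made-up name for illustration. Note that the shared ancestor `object` is checked only once, at the very end.

```python
class BaseA:
    def who(self):
        return "BaseA"


class BaseB:
    def who(self):
        return "BaseB"


class NewClass(BaseA, BaseB):
    pass


# The search order for class-level lookups, left to right.
search_order = [cls.__name__ for cls in NewClass.__mro__]
print(search_order)  # ['NewClass', 'BaseA', 'BaseB', 'object']

# Both bases define who(), and BaseA comes first in the order.
print(NewClass().who())  # BaseA
```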

## Writing and Overriding Methods

Let's get back to the `IterInt` class we were working on. To refresh your memory, we were creating a subclass of `int` that implements a method to allow iteration. Iteration is implemented using the `__iter__()` method. To demonstrate that is the case, let's first implement something silly just to show that we truly have complete control over how iteration will work in our class.

In [7]:
class IterInt(int):
    def __iter__(self):
        for s in ["look", "at", "me", "go"]:
            yield s

x = IterInt(123)

for i in x:
    print(i)

look
at
me
go


As you see above, simply implementing an `__iter__()` method is sufficient to allow iteration with an instance of our class. Furthermore, we don't actually need to iterate over something relating to our class. We can really do whatever we want. However, it's probably more sensible in most cases to implement an `__iter__()` method that iterates over some attribute of your class instance. Let's do that instead.

In [8]:
class IterInt(int):
    def __iter__(self):
        for s in str(self):
            yield IterInt(s)

x = IterInt(123)

for i in x:
    print(i)

1
2
3


Note that what we are actually doing in this implementation of the `__iter__()` method is using `str()` to get a `str` representation of our object. It's important to keep in mind that, [as described in the docs](https://docs.python.org/3/library/stdtypes.html#str), all `str()` does is call the `__str__()` method of the object, falling back to `__repr__()` if no `__str__()` is defined. The above approach works because, for an `int`, that simply returns the number as a `str`. If we tried this with one of the classes we wrote `__str__()` and `__repr__()` methods for in the last notebook, it wouldn't work as well as this. In short, remember that Python isn't magically able to tell what we want it to do. It is simply running functions with defined behaviour. It is important we keep in mind exactly what those functions are and how they will behave so that we can foresee or deal with errors or bugs in our code.
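
One concrete consequence of relying on `str()` is what happens with a negative number: its string form starts with `-`, which cannot be converted back to an integer. The sketch below shows the failure and one possible workaround. `AbsIterInt` is a hypothetical variant, and choosing to ignore the sign is an assumption about the desired behaviour rather than the only reasonable option.

```python
class IterInt(int):
    def __iter__(self):
        for s in str(self):
            yield IterInt(s)


print(str(IterInt(-12)))  # -12

# The first character is "-", and IterInt("-") raises a ValueError.
try:
    digits = list(IterInt(-12))
except ValueError as error:
    digits = None
    print("iteration failed:", error)


class AbsIterInt(int):
    """Hypothetical variant that ignores the sign when iterating."""

    def __iter__(self):
        for s in str(abs(self)):
            yield AbsIterInt(s)


print(list(AbsIterInt(-12)))  # [1, 2]
```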

Now we have a class that does what we wanted. It has all the behaviours of an `int`, except it is also iterable. However, we might not be satisfied just yet. Let's see what could bug us in the future:

In [9]:
print(type(x))
y = x - 5
print(type(y))

<class '__main__.IterInt'>
<class 'int'>


If we add an `int` to one of our `IterInt`s, subtract an `int` from it, or perform other mathematical operations on it, would we want to get another `IterInt` as the result, or are we happy to get an `int`? It could definitely be reasonable to want to get out an instance of our new class.

What if we subtract an `IterInt` from another `IterInt`?

In [10]:
z = IterInt(5)
y = x - z
print(type(y))

<class 'int'>


Getting an `int` out when performing mathematical operations with two `IterInt`s is very unlikely to be the expected behaviour. We would certainly want it to return an `IterInt` instead. There are a couple of ways we could achieve that. First, we could manually reimplement all of the methods of `int` in our `IterInt` class. However, at that point what was the point of subclassing `int` to begin with? For simple methods that's not going to be too much work, but I'm sure you can imagine cases where you might want to subclass a class that has complex methods which would be burdensome to rewrite. Luckily, there is a better way. Whenever you want to call a method defined in a class that your subclass inherits from, you can use a special function, `super()`. `super()` returns a proxy object that forwards method calls to the classes from which your subclass inherits, and Python searches those classes as described above for any method you name. Let's write a modification of the `__sub__()` method that is called when you use the subtraction operator `-`. This modified `__sub__()` method should simply run the `int.__sub__()` method and convert the result to our new `IterInt` class.

In [11]:
class IterInt(int):
    def __iter__(self):
        for s in str(self):
            yield IterInt(s)
            
    def __sub__(self, num):
        int_result = super().__sub__(num)
        return IterInt(int_result)

x = IterInt(123)
print(type(x))
y = x - 5
print(y)
print(type(y))

<class '__main__.IterInt'>
118
<class '__main__.IterInt'>


Now our `IterInt` returns an instance of `IterInt` whenever you subtract an `int` from it. We can implement the other methods so that our class can handle other mathematical operations too. The dunder methods for these operations are listed [in the docs here](https://docs.python.org/3/reference/datamodel.html?highlight=iadd#emulating-numeric-types). We'll just implement a few here in the interest of time.

In [12]:
class IterInt(int):
    def __iter__(self):
        for s in str(self):
            yield IterInt(s)
            
    def __sub__(self, num): # called by -
        int_result = super().__sub__(num)
        return IterInt(int_result)
    
    def __add__(self, num): # called by +
        int_result = super().__add__(num)
        return IterInt(int_result)
    
    def __mul__(self, num): # called by *
        int_result = super().__mul__(num)
        return IterInt(int_result)
    
    def __truediv__(self, num): # called by /
        # int.__truediv__ returns a float, so IterInt() truncates it (123 / 5 gives 24, not 24.6)
        float_result = super().__truediv__(num)
        return IterInt(float_result)
    
    def __floordiv__(self, num): # called by //
        int_result = super().__floordiv__(num)
        return IterInt(int_result)

x = IterInt(123)

print(type(x + 5))
print(type(x - 5))
print(type(x * 5))
print(type(x / 5))
print(type(x // 5))

# but
print("didn't implement x**y")
print(type(x ** 5))

<class '__main__.IterInt'>
<class '__main__.IterInt'>
<class '__main__.IterInt'>
<class '__main__.IterInt'>
<class '__main__.IterInt'>
didn't implement x**y
<class 'int'>

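The methods above only cover the case where our object is on the left of the operator. When a plain `int` is on the left, as in `5 - x`, Python uses the reflected method `__rsub__()` of the right-hand operand if its class is a subclass that overrides it. The sketch below contrasts a class without `__rsub__()` and one with it. Both class names are hypothetical and `__iter__()` is left out for brevity.

```python
class WithoutRsub(int):
    def __sub__(self, num):  # handles WithoutRsub - something
        return WithoutRsub(super().__sub__(num))


class WithRsub(int):
    def __sub__(self, num):  # handles WithRsub - something
        return WithRsub(super().__sub__(num))

    def __rsub__(self, num):  # handles something - WithRsub
        return WithRsub(super().__rsub__(num))


a = WithoutRsub(123)
b = WithRsub(123)

print(type(5 - a) is int)       # True: falls back to int.__sub__
print(type(5 - b) is WithRsub)  # True: our __rsub__ is used
print(5 - b)                    # -118
```

The same pattern applies to the other reflected methods, such as `__radd__()` and `__rmul__()`.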

## Writing Subclasses of Our Own Classes

When writing simple software, you will commonly encounter situations in which you need custom functionality that inspires you to write a class. It will be less common for you to need to write multiple related classes. However, there will be times when that is called for. For example, in the last notebook we wrote a `BLASTResult` class to read in the output of BLAST. We then added to it by writing some methods that were very much tailored to isPCR. In doing so, we no longer had a class that was generic to BLAST output. Perhaps a better approach would have been to write a generic class to read BLAST output, then subclass it to write specific classes that provide functionality for each of the different analyses we might want to perform on our BLAST output. One subclass could handle BLAST results for primers, while another could be tailored to BLAST results corresponding to protein domains for which we want to identify homologs. Both of those subclasses would need the basic functionality of reading in the fields of a BLAST output, but each would then need different methods for analysing it. If you ever find yourself in a situation where you want to create multiple different objects with custom functionality AND those objects are going to have some overlap in their functionality, you should consider creating a base class that handles the overlapping functionality.

The approach we are going to discuss here is simply an application of the concepts covered above. However, to help keep this as clear as possible, I'm going to use a simple example rather than reimplementing something complicated like the `BLASTResult` class from the last notebook. Instead, let's create some classes for different shapes.

In [13]:
class Shape:
    def __init__(self, sides: int):
        self.sides = sides

All shapes have some number of sides, but other attributes (or at least the attributes you might wish to use) can differ between shapes. Let's make some subclasses that extend our `Shape` class to be more specialized.

In [14]:
class Circle(Shape):
    def __init__(self, radius: float):
        self.radius = radius
        super().__init__(1) # Call the init of our base class to initialize the generic parts
    
    def area(self) -> float:
        return 3.14 * (self.radius ** 2)
    
    def circumference(self) -> float:
        return 2 * 3.14 * self.radius

c = Circle(2)
print(vars(c))
print(c.area())

{'radius': 2, 'sides': 1}
12.56


We can also create a different subclass for other shapes.

In [15]:
class Rectangle(Shape):
    def __init__(self, height: float, width: float):
        self.height = height
        self.width = width
        super().__init__(4) # Call the init of our base class to initialize the generic parts
    
    def area(self) -> float:
        return self.height * self.width
    
    def perimeter(self) -> float:
        return 2 * (self.height + self.width)
    
r = Rectangle(2, 4)
print(vars(r))
print(r.area())

{'height': 2, 'width': 4, 'sides': 4}
8


As you can see, both subclasses now work as independent classes that do not share the more specialized components that we defined, but do share the `.sides` attribute. We can then further subclass our `Circle` and `Rectangle` classes to create new classes with additional functionality.

In [16]:
class Sphere(Circle):
    def __init__(self, radius: float):
        super().__init__(radius) # Spheres don't need anything more than the init implemented for Circle
    
    def volume(self) -> float:
        return (4 / 3) * 3.14 * (self.radius ** 3)
    
    def area(self) -> float:
        return 4 * 3.14 * (self.radius ** 2)
    
    def cross_sectional_area(self) -> float: # We can rename the area method for this class
        return super().area()

sphere = Sphere(3)
print(vars(sphere))
print(sphere.volume())
print(sphere.area()) # overrides the method from Circle
print(sphere.cross_sectional_area())

{'radius': 3, 'sides': 1}
113.03999999999999
113.04
28.26


In [17]:
class Cuboid(Rectangle):
    def __init__(self, width: float, height: float, depth: float):
        self.depth = depth
        super().__init__(height, width)
    
    def volume(self) -> float:
        return self.width * self.height * self.depth
    
    def area(self) -> float:
        return 2 * ((self.width * self.height) + (self.height * self.depth) + (self.width * self.depth))
    
    def cross_sectional_area(self, axis: str) -> float:
        """ area for a slice when facing axis parallel to named dimension """
        if axis == "width":
            return self.height * self.depth
        
        if axis == "depth":
            return self.height * self.width
        
        if axis == "height":
            return self.width * self.depth
        
        raise ValueError("axis must be one of {'width' , 'height', 'depth'}")
    
cuboid = Cuboid(2, 3, 4)
print(vars(cuboid))
print(cuboid.volume())
print(cuboid.area()) # uses the overriding method instead of the base class method
print(cuboid.cross_sectional_area("width"))

{'depth': 4, 'height': 3, 'width': 2, 'sides': 4}
24
52
12


As this example with shapes shows, when you have attributes or methods that you want to use in different ways or with variations, an effective strategy can be to create a hierarchy of classes that inherit desired functionality from base classes. Furthermore, you are never trapped into keeping the functionality of base classes. Your subclasses are always free to rename or override the methods and attributes implemented in their base classes.
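
One practical payoff of such a hierarchy is that code using the objects does not need to know which subclass it has been given, as long as it sticks to the shared names. The sketch below repeats trimmed versions of `Shape`, `Circle` and `Rectangle` so that it runs on its own, then treats both shapes uniformly. The `shapes` list is only for illustration.

```python
class Shape:
    def __init__(self, sides: int):
        self.sides = sides


class Circle(Shape):
    def __init__(self, radius: float):
        self.radius = radius
        super().__init__(1)

    def area(self) -> float:
        return 3.14 * (self.radius ** 2)


class Rectangle(Shape):
    def __init__(self, height: float, width: float):
        self.height = height
        self.width = width
        super().__init__(4)

    def area(self) -> float:
        return self.height * self.width


shapes = [Circle(2), Rectangle(2, 4)]

for shape in shapes:
    # .sides comes from Shape; .area() comes from whichever subclass this is.
    print(type(shape).__name__, shape.sides, shape.area())
```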