# Advanced inheritance

**Outline**:
- Reminder on inheritance
- Proper multiple inheritance
- Composition over inheritance
- Metaclasses
- Dynamic class creation

## Reminder on inheritance

### Inheritance basics

Just a small reminder on simple inheritance:

In [None]:
from abc import ABCMeta, abstractmethod

class Shape(metaclass=ABCMeta):

    @abstractmethod
    def compute_area(self):
        raise NotImplementedError()

    @abstractmethod
    def compute_perimeter(self):
        raise NotImplementedError()

class Rectangle(Shape):
    
    def __init__(self, width, height):
        super().__init__()
        self.width = width
        self.height = height

    def compute_area(self):
        return self.width * self.height

    def compute_perimeter(self):
        return 2 * (self.width + self.height)

class Square(Rectangle):
    def __init__(self, side_length):
        super().__init__(side_length, side_length)


### Multiple inheritance and MRO


Python allows multiple inheritance. A classic issue with this mechanism is the diamond problem. Consider the following inheritance diagram:
```
      Person        
     /      \       
Teacher  Researcher 
     \      /       
    Professor       
```
What happens if a method of a `Professor` instance is called when
- the method is defined only in `Professor`;
- the method is defined only in `Person`;
- the method is re-defined both in `Teacher` and `Researcher`?

The most common such method is the `__init__` dunder.

In [None]:
class Person:
    def __init__(self):
        print("Person init")

class Teacher(Person):
    def __init__(self):
        print("Teacher init")
        super().__init__()

class Researcher(Person):
    def __init__(self):
        print("Researcher init")
        super().__init__()


In [None]:
class Professor1(Researcher, Teacher):
    def __init__(self):
        print("Professor init (R, T)")
        super().__init__()

_ = Professor1()

In [None]:
class Professor2(Teacher, Researcher):
    def __init__(self):
        print("Professor init (T, R)")
        super().__init__()

_ = Professor2()

Python uses the C3 linearization algorithm to determine the order of the calls. In practice, a deterministic order is established and stored in the `__mro__` (Method Resolution Order) attribute of the **class**, also accessible via the `mro()` method of the class:

In [None]:
print("Professor 1:", Professor1.__mro__, Professor1.mro())
print("Professor 2:", Professor2.__mro__)

The `super` call follows the MRO. The rest of this chapter covers multiple inheritance in more detail.

> Note how the last entry is always the `object` base class.

> :warning: The C3 algorithm is neither a breadth-first nor a depth-first exploration of the hierarchy DAG. Rather, it is a merging algorithm which aims to enforce three properties (local precedence, monotonicity and consistency).
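
To make the merging idea concrete, here is a minimal pure-Python sketch of C3. The helpers `c3_merge` and `c3_linearize` and the `Top`/`Left`/`Right`/`Bottom` classes are hypothetical names introduced for illustration; the interpreter's own implementation remains the reference.

```python
def c3_merge(sequences):
    """Merge several linearizations following the C3 rule."""
    sequences = [list(seq) for seq in sequences if seq]
    result = []
    while sequences:
        for seq in sequences:
            head = seq[0]
            # A valid head must not appear in the tail of any sequence
            if not any(head in other[1:] for other in sequences):
                break
        else:
            raise TypeError("Inconsistent hierarchy: no valid linearization")
        result.append(head)
        # Remove the chosen head everywhere and drop exhausted sequences
        sequences = [[c for c in seq if c is not head] for seq in sequences]
        sequences = [seq for seq in sequences if seq]
    return result


def c3_linearize(cls):
    """Recompute the MRO of `cls` from its direct bases (illustrative only)."""
    bases = list(cls.__bases__)
    return [cls] + c3_merge([c3_linearize(b) for b in bases] + [bases])


class Top: pass
class Left(Top): pass
class Right(Top): pass
class Bottom(Left, Right): pass

# The sketch should agree with the MRO computed by Python itself
same_order = c3_linearize(Bottom) == list(Bottom.__mro__)
```

The list of direct bases is passed as the last sequence to merge, which is what preserves the local precedence order written in the `class` statement.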

Even in the case of simple inheritance, Python relies on the MRO for the lookup, although it is trivial then:

In [None]:
class Rectangle:

    def __init__(self, width, height):
        self.width = width
        self.height = height

    def area(self):
        return self.width * self.height
    

class Square(Rectangle):
    def __init__(self, side):
        super().__init__(side, side)
        
Square.mro()

## Proper multiple inheritance

### Pitfalls
The MRO ensures a clear ordering of the class hierarchy, with good properties and deterministic behavior. It does not solve every problem, however. Consider the following piece of code:

In [None]:
class Printer:
    def output(self):
        return "Printing..."

class Logger:
    def output(self):
        return "Logging..."

class Service(Printer, Logger):
    pass

s = Service()
print(s.output())


In this case, we have a collision between the `output` methods: the one from the first class in the MRO (`Printer` here) masks the other. This happens with common names, for both methods and attributes. A special case is dunder methods (where the name is imposed), with the most frequent issue being `__init__`.

In [None]:
class A:
    def __init__(self):
        print("Initializing A")

class B:
    def __init__(self):
        print("Initializing B")
        
class C(A, B):
    def __init__(self):
        super().__init__()
        print("Initializing C")

C()  # Skipping B

Note that you can partially circumvent the above issue with more specific calls, but this bypasses the MRO and leads to other issues:

In [None]:
class A:
    def __init__(self):
        print("Initializing A")

class B(A):
    def __init__(self):
        print("Initializing B")
        super().__init__()

class C(A):
    def __init__(self):
        print("Initializing C")
        super().__init__()

class D(B, C):
    def __init__(self):
        print("Initializing D")
        B.__init__(self) # Explicit choice of super class
        C.__init__(self) # Explicit choice of super class 

_ = D()  # C and A are initialized twice

The double initialization might just be a waste of time, but it can also be a real problem when the initializers have side effects.
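
For comparison, here is a minimal sketch of the same diamond where every class delegates with `super()` instead of naming its parent. The class names and the `calls` list are hypothetical and only serve to record the order of execution.

```python
calls = []  # records the order in which the initializers run


class Root:
    def __init__(self):
        calls.append("Root")
        super().__init__()


class BranchB(Root):
    def __init__(self):
        calls.append("BranchB")
        super().__init__()


class BranchC(Root):
    def __init__(self):
        calls.append("BranchC")
        super().__init__()


class Leaf(BranchB, BranchC):
    def __init__(self):
        calls.append("Leaf")
        super().__init__()  # a single call walks the whole MRO


_ = Leaf()
# calls is now ["Leaf", "BranchB", "BranchC", "Root"]: each initializer runs once
```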

The remainder of this section is concerned with best practices in the context of multiple inheritance.

### Interface/protocol

In OOP, a common way of doing "soft" multiple inheritance is to have *interfaces*. An interface is like an API: it exposes some methods to the user, and implementing them is the responsibility of the concrete class. It is a contract with the users saying that they can rely on those methods. Typically, an object can implement several interfaces without risking name collisions, since such "parents" provide no real implementation to mask.

Prior to typing, this was not widely used in Python, where the philosophy was duck typing (try instead of check). Typing introduced the need for a `Protocol` mechanism, which can also be used to define something similar to an interface.


In [None]:
from typing import Protocol, runtime_checkable


@runtime_checkable  # To be able to use isinstance at runtime
class SupportsFlush(Protocol):
    def flush(self) -> None: ...


class Serializer(SupportsFlush):
    def flush(self) -> None:
        print("Flushing data...")


When used a priori to build a class hierarchy, we inherit from the `Protocol` child class. A `Protocol` can also be used a posteriori, without being part of the class hierarchy.
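
The a posteriori usage can be sketched as follows. `SupportsClose`, `Connection` and `shutdown` are hypothetical names chosen for this example; note that `isinstance` with a runtime-checkable protocol only checks that the method exists, not its signature.

```python
from typing import Protocol, runtime_checkable


@runtime_checkable
class SupportsClose(Protocol):
    def close(self) -> None: ...


class Connection:  # no inheritance from SupportsClose
    def __init__(self) -> None:
        self.closed = False

    def close(self) -> None:
        self.closed = True


def shutdown(resource: SupportsClose) -> None:
    # Type checkers accept any object that structurally matches the protocol
    resource.close()


conn = Connection()
matches = isinstance(conn, SupportsClose)  # structural check at runtime
shutdown(conn)
```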

### Mixin

A common way to use multiple inheritance is the mixin construct. A mixin is a piece of code designed to encapsulate a common behavior, so that the implementation can be shared between classes that are not part of the same hierarchy.

Here is an example:

In [None]:
class ClonableMixin:
    def clone(self, deep: bool = False):
        import copy
        return copy.deepcopy(self) if deep else copy.copy(self)
    
class ReprMixin: 
    def __repr__(self):
        return f"<{self.__class__.__name__} {self.__dict__}>"
        
    
class Shape:
    def area(self) -> float:
        raise NotImplementedError()
    
    
class Rectangle(ClonableMixin, ReprMixin, Shape):
    def __init__(self, width: float, height: float):
        self.width = width
        self.height = height

    def area(self) -> float:
        return self.width * self.height
    
class Person(ReprMixin, ClonableMixin):  # Reusable
    def __init__(self, name):
        self.name = name

Rectangle(10, 20).clone()

Mixin is a powerful construct, provided some good practices are observed:
- a mixin scope should be as small as possible (one mixin = one purpose), prefer chaining orthogonal scopes;
- suffix the name with `Mixin` to indicate a clear purpose;
- no `__init__` (to avoid the issues highlighted above) and as little state as possible (you can use class attributes for customization);
- always place the mixins first in the MRO (why :question:);
- don't subclass mixins;
- avoid mixins when the shared behavior is non-trivial;
- unit-test mixins in isolation and perform integration tests over the full classes.
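
The advice about customizing through class attributes rather than `__init__` can be sketched as follows. `JsonMixin`, `json_indent` and the `Measurement` classes are hypothetical names used for illustration only.

```python
import json


class JsonMixin:
    # Class-level knob: no __init__ and no instance state in the mixin
    json_indent = None

    def to_json(self) -> str:
        return json.dumps(vars(self), indent=self.json_indent, sort_keys=True)


class Measurement(JsonMixin):
    def __init__(self, value, unit):
        self.value = value
        self.unit = unit


class PrettyMeasurement(Measurement):
    json_indent = 2  # customization without touching any initializer


compact = Measurement(1, "m").to_json()
pretty = PrettyMeasurement(1, "m").to_json()
```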

> :pushpin: Mixins are used a lot in Scikit-learn.

> Although mixins should avoid statefulness and initialization, there are ways to circumvent that (https://realpython.com/python-mixin/#how-can-you-use-stateful-mixins-safely). Whether this is a good idea is not that clear.

In [None]:
# TODO exercise with EnergyUnitMixin

### Cooperation

Another way to take advantage of multiple inheritance is to create a collection of classes in a cooperative manner. This can take several flavors:
- cooperative `__init__`
- chainable hooks (cooperative super)
- mergeable result (cooperative super)


#### Cooperative initialization
One way to make initialization work in a complex hierarchy is for each class to consume its own arguments and forward the remaining ones through `**kwargs`:

In [None]:
class Swimmer:
    def __init__(self, swim_speed: float, **kwargs):
        self.swim_speed = swim_speed
        super().__init__(**kwargs)

class Flyer:
    def __init__(self, fly_speed: float, **kwargs):
        self.fly_speed = fly_speed
        super().__init__(**kwargs)

class FlyingFish(Swimmer, Flyer):
    def __init__(self, swim_speed: float, fly_speed: float):
        super().__init__(swim_speed=swim_speed, fly_speed=fly_speed)

fish = FlyingFish(swim_speed=10, fly_speed=20)
print(f"Swim speed: {fish.swim_speed}, fly speed: {fish.fly_speed}")

Note in the example above how `Swimmer` and `Flyer` can be swapped, since each `super` call passes the initialization on to the next class in the MRO. The `kwargs` make sure the signatures are compatible.
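
A minimal sketch of the swapped order is given here. `GlidingFish` and the `wingspan` argument are hypothetical. The last lines illustrate a limitation: a keyword argument that no class consumes travels up to `object.__init__`, which rejects it.

```python
class Swimmer:
    def __init__(self, swim_speed: float, **kwargs):
        self.swim_speed = swim_speed
        super().__init__(**kwargs)


class Flyer:
    def __init__(self, fly_speed: float, **kwargs):
        self.fly_speed = fly_speed
        super().__init__(**kwargs)


class GlidingFish(Flyer, Swimmer):  # bases swapped compared to FlyingFish
    def __init__(self, swim_speed: float, fly_speed: float):
        super().__init__(swim_speed=swim_speed, fly_speed=fly_speed)


glider = GlidingFish(swim_speed=10, fly_speed=20)
mro_names = [cls.__name__ for cls in GlidingFish.mro()]

# An argument nobody consumes reaches object.__init__ and is rejected
try:
    Swimmer(swim_speed=1, wingspan=2)
    leftover_rejected = False
except TypeError:
    leftover_rejected = True
```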

#### Cooperative returns (merging)

Another way to design classes cooperatively is to make their results mergeable:

In [None]:
class Base:
    def validate(self):
        return []

class NameValidation:
    def validate(self):
        errors = super().validate()
        if not getattr(self, "name", None):
            errors.append("Missing name")
        return errors

class AgeValidation:
    def validate(self):
        errors = super().validate()
        if getattr(self, "age", 0) < 0:
            errors.append("Age invalid")
        return errors

class PermissionValidation:
    def validate(self):
        errors = super().validate()
        if not getattr(self, "is_admin", False):
            errors.append("Not an admin")
        return errors

class User(NameValidation, AgeValidation, PermissionValidation, Base):
    def __init__(self, name, age, is_admin):
        self.name = name
        self.age = age
        self.is_admin = is_admin


user = User(name="", age=-1, is_admin=False)
print(user.validate())

Note how the `Base` is still included; that is where chained calls end up and create the empty list which gets filled up with the other calls. This pattern can be used with dictionaries, or as part of more complex design patterns (builder, visitor). 
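
As an illustration of the dictionary variant, here is a minimal sketch with hypothetical names (`BaseSettings`, `RetryDefaults`, `FastDefaults`, `Client`). Each class updates the dictionary returned by the rest of the chain, so classes placed earlier in the MRO have the last word.

```python
class BaseSettings:
    def defaults(self) -> dict:
        return {}  # end of the chain: creates the dictionary


class RetryDefaults:
    def defaults(self) -> dict:
        merged = super().defaults()
        merged.update({"retries": 3, "backoff": 2.0})
        return merged


class FastDefaults:
    def defaults(self) -> dict:
        merged = super().defaults()
        merged.update({"retries": 1})  # overrides the value set further down
        return merged


class Client(FastDefaults, RetryDefaults, BaseSettings):
    pass


settings = Client().defaults()
```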

> :pushpin: the transform and composition pattern works like this (cf. `Pytorch`, `Hermes`)

> The validation classes can be seen as mixins.

#### Cooperative hooks

Another form of cooperation is to offer overridable hooks:

In [None]:
class Pipeline:
    def run(self, data):
        data = self.pre(data)
        data = self.process(data)
        data = self.post(data)
        return data

    # Noop defaults to end up the MRO
    def pre(self, d): return d
    def process(self, d): return d
    def post(self, d): return d

class LoggingMixin:
    def pre(self, d):
        print("pre >>")
        return super().pre(d)
    
    def post(self, d):
        print("<< post")
        return super().post(d)

class Upper(Pipeline):
    def process(self, d): 
        print("<upperizing>")
        return d.upper()

class Job(LoggingMixin, Upper): pass


job = Job()
print(job.run("hello"))  # prints "pre >>", "<upperizing>", "<< post", then "HELLO"

Note how the hooks are no-ops rather than raising a `NotImplementedError`: this lets every `super()` chain end cleanly in the base class.
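
The sketch below stacks two hook mixins and then shows what would happen with a raising default. All class names except `Pipeline` are hypothetical and introduced for this example.

```python
class Pipeline:
    def run(self, data):
        return self.post(self.process(self.pre(data)))

    # No-op hooks: the end of every super() chain
    def pre(self, d): return d
    def process(self, d): return d
    def post(self, d): return d


class StripMixin:
    def pre(self, d):
        return super().pre(d.strip())


class ExclaimMixin:
    def post(self, d):
        return super().post(d + "!")


class Shout(StripMixin, ExclaimMixin, Pipeline):
    def process(self, d):
        return d.upper()


result = Shout().run("  hello ")


# Counter-example: a raising default breaks any mixin that chains with super()
class StrictPipeline(Pipeline):
    def pre(self, d):
        raise NotImplementedError()


class StrictShout(StripMixin, StrictPipeline):
    pass


try:
    StrictShout().run(" hello ")
    chain_broken = False
except NotImplementedError:
    chain_broken = True
```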

> :pushpin: hooks are used a lot in `Pytorch` and `Lightning`. This is also used for the data loading in `Hermes`.

:microphone: What do you think of those approaches?

## Composition over inheritance

One common piece of advice is that composition should be preferred over inheritance. Here is how the validation example above can be structured with composition:



In [None]:
from abc import ABC, abstractmethod
from typing import Collection


class Validator(ABC):
    @abstractmethod
    def validate(self, obj) -> Collection[str]:
        raise NotImplementedError()
    


class CompositeValidator:
    def __init__(self, *validators: Validator):
        self._validators = validators
    
    def validate(self, obj) -> Collection[str]:
        errors = []
        for v in self._validators:
            errors.extend(v.validate(obj))
        return errors
    
class NameValidator(Validator):
    def validate(self, obj) -> Collection[str]:
        if not getattr(obj, "name", None):
            return ["Missing name"]
        return []
    
class AgeValidator(Validator):
    def validate(self, obj) -> Collection[str]:
        if getattr(obj, "age", 0) < 0:
            return ["Age invalid"]
        return []
    
class PermissionValidator(Validator):
    def validate(self, obj) -> Collection[str]:
        if not getattr(obj, "is_admin", False):
            return ["Not an admin"]
        return []
    
class User:
    _validator = CompositeValidator(
        NameValidator(),
        AgeValidator(),
        PermissionValidator(),
    )
    
    def validate(self):
        return self._validator.validate(self)
    
    def __init__(self, name, age, is_admin):
        self.name = name
        self.age = age
        self.is_admin = is_admin

user = User(name="", age=-1, is_admin=False)
print(user.validate())

:wrench: Convert the `Swimmer`, `Flyer`, `FlyingFish` case to a compositional pattern.

In [None]:
# Swim, fly, fish

Although "composition over inheritance" is a common saying, the truth is a bit more nuanced, each having pros and cons.

| Aspect                          | Composition                                                                                                    | Inheritance                                                                |
| ------------------------------- | -------------------------------------------------------------------------------------------------------------- | -------------------------------------------------------------------------- |
| **Reusability**                 | Components can be reused across unrelated classes easily (e.g., a `Logger` can be injected into many classes). ✅ | Reuse is tied to the hierarchy → only subclasses benefit (although mixins and cooperative patterns mitigate this). |
| **Flexibility**                 | Swap components at runtime (different strategies, mocks in tests). Loose coupling ✅                             | Class behavior fixed at class-definition time.  Tight coupling            |
| **Hierarchy depth**             | Flat structures; avoids fragile deep trees. ✅                                                                   | Encourages deep hierarchies that are harder to maintain.                   |
| **Name collision**              | Components live in their own namespace, fewer accidental overrides. ✅                                           | Method/attribute collision possible (diamond problem).            |
| **Evolution**                   | Easy to extend by adding/removing components ✅                                                                  | Extending requires modifying hierarchy; can lead to brittle base classes.  |
| **Testing**                     | Components can be tested in isolation. ✅                                                                        | Harder to test base classes without concrete subclasses.                   |
| **Runtime behavior**            | Behavior can be delegated dynamically (Strategy, State patterns). ✅                                             | Behavior fixed by parent methods unless overridden.                        |
| **Discoverability**             | Can obscure where a method is defined (delegated via `__getattr__`, forwarding).                                 | Hierarchy shows all inherited methods; IDEs handle it well (although following several inheritance paths is cognitively demanding). ✅ |
| **Boilerplate**                 | Requires explicit delegation/wrappers (`self.comp.method()`).                                                    | Inherited methods work “for free” once defined. ✅                           |
| **Performance**                 | Slight overhead for delegation calls.                                                                            | Direct method lookup is slightly faster. ✅                                  |
| **Simplicity (small projects)** | Might feel verbose for trivial extensions.                                                                       | Inheritance can be simpler when just adding tiny customizations. ✅          |

In short, composition offers more flexibility and reusability but is more verbose, since the indirection layer must be set up explicitly. Inheritance patterns work best when the full scope is known a priori and we can commit to the design.
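
The runtime flexibility mentioned in the table can be sketched with a small strategy-style example. `Checkout`, `no_discount` and `ten_percent_off` are hypothetical names; prices are integers (cents) to keep the arithmetic exact.

```python
from typing import Callable

Discount = Callable[[int], int]  # takes and returns a price in cents


def no_discount(cents: int) -> int:
    return cents


def ten_percent_off(cents: int) -> int:
    return cents - cents // 10


class Checkout:
    def __init__(self, discount: Discount = no_discount):
        self.discount = discount  # injected component

    def total(self, cents: int) -> int:
        return self.discount(cents)  # explicit delegation


checkout = Checkout()
full_price = checkout.total(1000)

checkout.discount = ten_percent_off  # swapped at runtime, no subclass needed
reduced_price = checkout.total(1000)
```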

## Metaclasses

A metaclass creates classes in the same way a class creates instances. Metaclasses offer a mechanism to customize a hierarchy of classes: they can be used to control how classes are defined and how instances are created.

:warning: Since it is a class creation mechanism and not an instance creation one, the metaclass `__new__`/`__init__` are called when each class using it is created, subclasses included (that is, when the `class` statement is executed). The metaclass `__call__` is called when an instance is created.


To create a metaclass, you must inherit from `type`, then use it in the class definition via the `metaclass` keyword argument. See the example below.

In [None]:
# Customize instance creation
class Singleton(type):
    _instances = {}
    
    def __call__(cls, *args, **kwargs):
        if cls not in cls._instances:
            print(f"Creating instance of {cls.__name__} with args={args}, kwargs={kwargs}")
            instance = super().__call__(*args, **kwargs)
            cls._instances[cls] = instance
        return cls._instances[cls]
    

class Logger(metaclass=Singleton):
    def log(self, msg):
        print(f"[LOG] {msg}")


logger1 = Logger()
print("---")
logger2 = Logger() 
print("---")
print(logger1 is logger2)  # True, same instance
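
To see when each metaclass method runs, here is a minimal tracing sketch. `TracingMeta`, `Widget` and the `events` list are hypothetical names introduced for illustration.

```python
events = []  # records which method runs, and when


class TracingMeta(type):
    def __new__(mcls, name, bases, attrs):
        events.append(f"meta.__new__({name})")
        return super().__new__(mcls, name, bases, attrs)

    def __init__(cls, name, bases, attrs):
        events.append(f"meta.__init__({name})")
        super().__init__(name, bases, attrs)

    def __call__(cls, *args, **kwargs):
        events.append(f"meta.__call__({cls.__name__})")
        return super().__call__(*args, **kwargs)


class Widget(metaclass=TracingMeta):
    def __init__(self):
        events.append("Widget.__init__")


# Only the class statement has run so far: no instance exists yet
events_after_definition = list(events)

Widget()
events_after_instantiation = events[len(events_after_definition):]
```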

In [None]:
# Customize class creation
class Cloneable(type):
    def __new__(cls, name, bases, attrs):
        print(f"Defining class {name} with Cloneable metaclass")
        def clone(self, deep: bool = False):
            import copy
            return copy.deepcopy(self) if deep else copy.copy(self)
        attrs['clone'] = clone
        return super().__new__(cls, name, bases, attrs)
    
print("---")
class Person(metaclass=Cloneable):
    def __init__(self, name):
        self.name = name

    def __repr__(self):
        return f"Person(name={self.name!r}) @ {id(self)}"
print("---")

alice = Person("Alice")
print(alice)
print(alice.clone())
    

> :skull: Besides `__new__`, `__init__`, and `__call__`, it is also possible to override `__prepare__`, which controls the namespace dict before the class body executes.
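
As an illustration of `__prepare__`, the sketch below returns a custom namespace that rejects names defined twice in a class body. `NoDuplicates` and `StrictMeta` are hypothetical names, and this is only one possible use of the hook.

```python
class NoDuplicates(dict):
    """Namespace that refuses to bind the same name twice."""

    def __setitem__(self, key, value):
        if key in self:
            raise TypeError(f"{key!r} is defined twice")
        super().__setitem__(key, value)


class StrictMeta(type):
    @classmethod
    def __prepare__(mcls, name, bases, **kwargs):
        # Called before the class body runs; returns the namespace to fill
        return NoDuplicates()

    def __new__(mcls, name, bases, namespace, **kwargs):
        # Hand a plain dict over to type once the body has been executed
        return super().__new__(mcls, name, bases, dict(namespace), **kwargs)


class Fine(metaclass=StrictMeta):
    def area(self):
        return 1


try:
    class Clashing(metaclass=StrictMeta):
        def area(self):
            return 1

        def area(self):  # accidental redefinition
            return 2
    duplicate_detected = False
except TypeError:
    duplicate_detected = True
```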


Metaclasses are a powerful mechanism, yet they are seldom used for a couple of reasons. Firstly, they add complexity in an unusual way. More importantly, a class can only have **one metaclass**, so combining metaclasses is not straightforward. In case of multiple inheritance with different (incompatible) metaclasses, Python raises a `TypeError`, making it hard to work in contexts where several metaclasses coexist. As a consequence, compositional patterns are to be preferred, like class decoration:

In [None]:
def cloneable(cls):
    def clone(self, deep: bool = False):
        import copy
        return copy.deepcopy(self) if deep else copy.copy(self)
    cls.clone = clone
    return cls

@cloneable
class Person:
    def __init__(self, name):
        self.name = name

    def __repr__(self):
        return f"Person(name={self.name!r}) @ {id(self)}"
    
class Person2(Person): pass

alice = Person("Alice")
print(alice)
print(alice.clone())
print("---")
print("Is 'clone' in Person.__dict__? ", "clone" in Person.__dict__)

Person2("Bob").clone()  # Works as well

Hence the `cloneable` decorator is reusable and does not interact with the class hierarchy.
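
By contrast, mixing classes whose metaclasses are unrelated fails as soon as the class is defined. The sketch below reproduces the conflict with hypothetical names and shows one possible workaround: a metaclass deriving from both, which only works when the two metaclasses are themselves designed to cooperate.

```python
class RegistryMeta(type):
    pass


class ValidationMeta(type):
    pass


class Registered(metaclass=RegistryMeta):
    pass


class Validated(metaclass=ValidationMeta):
    pass


try:
    class Broken(Registered, Validated):
        pass
    conflict_message = None
except TypeError as exc:
    conflict_message = str(exc)


# Possible workaround: an explicit metaclass deriving from both
class CombinedMeta(RegistryMeta, ValidationMeta):
    pass


class Working(Registered, Validated, metaclass=CombinedMeta):
    pass
```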

:question: How come the clone method is inherited as well?

:wrench: You want to make some class immutable (to be precise, you want instance immutability: once initialized you can no longer change the attributes of the instance). How would you proceed? Design the code. 

> You might need to allow mutability for the initialization and prevent it afterwards.

In [None]:
# Make the class Point immutable

class Point:
    def __init__(self, x: float, y: float):
        self.x = x
        self.y = y


try:
    Point(1, 2).x = 10 
    print("**KO** the class is not immutable")
except AttributeError:
    print("Ok, the class is immutable")


> Metaclasses should be used to control **class creation**, not **instance creation** (hence the Singleton example is not the best use case); this further reduces their scope. They are also discouraged when the `__init_subclass__` hook can be used instead.
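
A typical case where `__init_subclass__` is enough is a registry of subclasses. The sketch below uses hypothetical names (`Exporter`, `key`, `registry`) and is one way to do it, not the only one.

```python
class Exporter:
    registry = {}  # maps a key to the subclass registered under it

    def __init_subclass__(cls, key=None, **kwargs):
        # Runs once per subclass, when its class statement is executed
        super().__init_subclass__(**kwargs)
        Exporter.registry[key or cls.__name__.lower()] = cls


class CsvExporter(Exporter, key="csv"):
    pass


class JsonExporter(Exporter):  # falls back to the lowercased class name
    pass


registered_keys = sorted(Exporter.registry)
```

No metaclass is involved here, so `Exporter` subclasses can still be combined with classes that do use one.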

## Dynamic class creation

Python allows for creating classes programmatically:

In [None]:
from types import new_class

class Person:
    def __init__(self, name: str):
        self.name = name

def exec_body(namespace):
    def teacher_init(self, name, matter):
        Person.__init__(self, name)  # zero-argument super() only works inside a class body
        self.matter = matter

    namespace['__init__'] = teacher_init
    namespace['say_hello'] = lambda self: f"Hello, I'm {self.name}, I am teaching {self.matter}"
    

Teacher = new_class("Teacher", (Person,), {}, exec_body)


alice_the_teacher = Teacher("Alice", "math")
print(alice_the_teacher.say_hello())
print("isinstance of Person:", isinstance(alice_the_teacher, Person))
print("---")
print("Instance dict:", alice_the_teacher.__dict__)
print("Class dict:", Teacher.__dict__)
print("MRO:", Teacher.mro())

See https://docs.python.org/3/library/types.html for more.

Python also allows creating dataclasses on the fly via `dataclasses.make_dataclass`, with a syntax very close to the functional way of creating a namedtuple (https://docs.python.org/3/library/dataclasses.html#dataclasses.make_dataclass).
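
A minimal sketch of `make_dataclass` follows. The `Reading` class, its fields and the `label` method are hypothetical and only serve as an illustration.

```python
from dataclasses import asdict, field, make_dataclass

# Fields are given as (name, type) or (name, type, field(...)) tuples
Reading = make_dataclass(
    "Reading",
    [("sensor", str), ("value", float), ("unit", str, field(default="kWh"))],
    # Extra attributes (here a method) go through the namespace argument
    namespace={"label": lambda self: f"{self.sensor}: {self.value} {self.unit}"},
)

reading = Reading("meter-1", 3.5)
as_dict = asdict(reading)
text = reading.label()
```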

Once again, this is a very powerful tool with which subtle bugs can be introduced easily. It is advisable to use it only when the data structure is not known until runtime (e.g. ORMs, testing). Below is a case where we read a YAML schema, generate a class for that schema with a `to_json` method, and load some data.

In [None]:
import yaml
import json

person_yml = """
name: str
age: int
active: bool
"""

person_csv = """
name,age,active
Alice,30,true
Bob,25,false
"""

def build_class_from_schema(name, schema: dict):
    TYPES = {
        "str": str,
        "int": int,
        "float": float,
        "bool": lambda x: x.lower() in ("1", "true", "yes"),
    }
    # constructor for instances
    def __init__(self, **kwargs):
        for field, ftype in schema.items():
            value = kwargs.get(field)
            if value is not None:
                value = TYPES[ftype](value)
            setattr(self, field, value)

    def to_json(self):
        return json.dumps(self.__dict__)

    # namespace dict for the class
    attrs = {
        "__init__": __init__,
        "to_json": to_json,
        "__annotations__": schema,  # nice to have for typing
    }
    return type(name, (object,), attrs)

person_schema = yaml.safe_load(person_yml)
Person = build_class_from_schema("Person", person_schema)

persons = []
field_names = None
for i, line in enumerate(person_csv.strip().split("\n")):
    if i == 0:
        field_names = line.split(",")
    else:
        fields = line.split(",")
        data = dict(zip(field_names, fields))
        persons.append(Person(**data))

[p.to_json() for p in persons]


## Closing words

This module was about advanced inheritance concepts. We refreshed our memory on the basics and on the issues of multiple inheritance. We then covered a few clean ways to build complex hierarchies. Finally, we delved into specific constructs around classes: metaclasses (including their limitations and alternatives) and dynamic class creation.

The general advice is to use all those features sparingly, and with clear communication and intent. 

> :wrench: Let's imagine--for the sake of the exercise--that you want to design an abstraction called `Connector`. A connector is about reading a given source for a given purpose. We want to support loading a specific chunk of data (e.g. a slice over a datetime index), but also, for some connectors, targeting a specific value (e.g. a partition) or the last update.
>
> Not all sources are in the same databases and rely on the same technology to access the data (and there is no 1:1 mapping between technologies and sources). 
>
> It would also be nice to log the source, do some processing, have a snapshot of the end result, and enforce some quality checks.
>
> How would you proceed so that it is very user-friendly, while being efficient for the developer as well?
