# The Composition Over Inheritance Principle

## Problem: the subclass explosion
A crucial weakness of inheritance as a design strategy is that a class often needs to be specialized along several different design axes at once, leading to what the Gang of Four call “a proliferation of classes” in their Bridge chapter and “an explosion of subclasses to support every combination” in their Decorator chapter.

Python’s logging module is a good example in the Standard Library itself of a module that follows the Composition Over Inheritance principle, so let’s use logging as our example. Imagine a base logging class that has gradually gained subclasses as developers needed to send log messages to new destinations.

In [5]:
import sys
import syslog  # Unix-only standard library module used by SyslogLogger

# The initial class.

class Logger(object):
    def __init__(self, file):
        self.file = file

    def log(self, message):
        self.file.write(message + '\n')
        self.file.flush()

# Two more classes, that send messages elsewhere.

class SocketLogger(Logger):
    def __init__(self, sock):
        self.sock = sock

    def log(self, message):
        self.sock.sendall((message + '\n').encode('ascii'))

class SyslogLogger(Logger):
    def __init__(self, priority):
        self.priority = priority

    def log(self, message):
        syslog.syslog(self.priority, message)

The problem arises when this first axis of design is joined by another. Let’s imagine that log messages now need to be filtered — some users only want to see messages with the word “Error” in them, and a developer responds with a new subclass of `Logger`:

In [6]:
# New design direction: filtering messages.

class FilteredLogger(Logger):
    def __init__(self, pattern, file):
        self.pattern = pattern
        super().__init__(file)

    def log(self, message):
        if self.pattern in message:
            super().log(message)

# It works.

f = FilteredLogger('Error', sys.stdout)
f.log('Ignored: this is not important')
f.log('Error: but you want to see this')

Error: but you want to see this


The trap has now been laid, and will be sprung the moment the application needs to filter messages but write them to a socket instead of a file. None of the existing classes covers that case. If the developer plows on ahead with subclassing and creates a `FilteredSocketLogger` that combines the features of both classes, then the subclass explosion is underway.
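To make the trap concrete, here is a minimal sketch of what that combined class might look like. The body of `FilteredSocketLogger` is an assumption (the text only names the class), and `SocketLogger` is repeated without its base class so the block runs on its own. Note that the filtering logic simply duplicates what `FilteredLogger` already does.

```python
import socket

class SocketLogger:
    def __init__(self, sock):
        self.sock = sock

    def log(self, message):
        self.sock.sendall((message + '\n').encode('ascii'))

# Hypothetical combined class: the filtering code from FilteredLogger
# has to be written out a second time for the socket destination.
class FilteredSocketLogger(SocketLogger):
    def __init__(self, pattern, sock):
        self.pattern = pattern
        super().__init__(sock)

    def log(self, message):
        if self.pattern in message:
            super().log(message)

sock1, sock2 = socket.socketpair()
logger = FilteredSocketLogger('Error', sock1)
logger.log('Warning: dropped by the filter')
logger.log('Error: sent over the socket')

received = sock2.recv(512)
print('The socket received: %r' % received)

sock1.close()
sock2.close()
```

Every further destination would need the same copy of the filter, which is exactly the duplication the rest of this section tries to avoid.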

Maybe the programmer will get lucky and no further combinations will be needed. But in the general case the application will wind up with 3×2=6 classes:

```
Logger           FilteredLogger
SocketLogger     FilteredSocketLogger
SyslogLogger     FilteredSyslogLogger
```

With `m` output destinations and `n` filtering variants, the total number of classes is the product `m×n`, so it grows multiplicatively as either axis gains options. This is the “proliferation of classes” and “explosion of subclasses” that the Gang of Four want to avoid.
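The arithmetic can be sketched in a few lines. The lists below are illustrative stand-ins for the two design axes, not part of any real API. The point is only that inheritance needs one class per combination, while a composed design needs roughly one class per option.

```python
from itertools import product

destinations = ['', 'Socket', 'Syslog']   # '' stands for plain file output
filtering = ['', 'Filtered']              # '' stands for no filtering

# Inheritance: one class for every (destination, filtering) combination.
subclass_names = [f + d + 'Logger' for d, f in product(destinations, filtering)]
print(subclass_names)

def classes_needed(m, n):
    """Return (classes under inheritance, classes under composition)."""
    return m * n, m + n

print(classes_needed(3, 2))
print(classes_needed(5, 4))
```

With three destinations and two filtering variants the difference is small (6 versus 5), but at five and four it is already 20 versus 9.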

The solution is to recognize that a class responsible for both filtering messages and logging messages is too complicated. In modern Object Oriented practice, it would be accused of violating the “Single Responsibility Principle.”

But how can we distribute the two features of message filtering and message output across different classes?

### Solution #1: The Adapter Pattern
One solution is the `Adapter Pattern`: to decide that the original logger class doesn’t need to be improved, because any mechanism for outputting messages can be wrapped up to look like the file object that the logger is expecting.

1. So we keep the original `Logger`.
2. And we also keep the `FilteredLogger`.
3. But instead of creating destination-specific subclasses, we adapt each destination to the behavior of a file and then pass the adapter to a Logger as its output file.

Here are adapters for each of the other two outputs:

In [None]:
import socket

class FileLikeSocket:
    def __init__(self, sock):
        self.sock = sock

    def write(self, message_and_newline):
        self.sock.sendall(message_and_newline.encode('ascii'))

    def flush(self):
        pass

class FileLikeSyslog:
    def __init__(self, priority):
        self.priority = priority

    def write(self, message_and_newline):
        message = message_and_newline.rstrip('\n')
        syslog.syslog(self.priority, message)

    def flush(self):
        pass

Python encourages duck typing, so an adapter’s only responsibility is to offer the right methods — our adapters, for example, are exempt from the need to inherit from either the classes they wrap or from the `file` type they are imitating. They are also under no obligation to re-implement the full slate of more than a dozen methods that a real file offers. Just as it’s not important that a duck can walk if all you need is a quack, our adapters only need to implement the two file methods that the `Logger` really uses.

And so the subclass explosion is avoided! Logger objects and adapter objects can be freely mixed and matched at runtime without the need to create any further classes:

In [None]:
sock1, sock2 = socket.socketpair()

fs = FileLikeSocket(sock1)
logger = FilteredLogger('Error', fs)
logger.log('Warning: message number one')
logger.log('Error: message number two')

print('The socket received: %r' % sock2.recv(512))

Note that it was only for the sake of example that the `FileLikeSocket` class is written out above — in real life that adapter comes built-in to Python’s Standard Library. Simply call any socket’s `makefile()` method to receive a complete adapter that makes the socket look like a file.
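As a rough sketch of that built-in route, the block below passes the result of `makefile()` straight to the original `Logger`. The `Logger` class is repeated so the block is self-contained. The explicit `encoding` and `newline` arguments are choices made here to keep the bytes on the wire predictable, not something the text above requires.

```python
import socket

class Logger:
    def __init__(self, file):
        self.file = file

    def log(self, message):
        self.file.write(message + '\n')
        self.file.flush()

sock1, sock2 = socket.socketpair()

# makefile() returns a file object wrapping the socket; 'w' asks for a
# text-mode writer, so str messages are encoded on the way out.
file_like = sock1.makefile('w', encoding='ascii', newline='\n')

logger = Logger(file_like)
logger.log('Error: message number two')

received = sock2.recv(512)
print('The socket received: %r' % received)

file_like.close()
sock1.close()
sock2.close()
```

Unlike the hand-written adapter, the file object returned by `makefile()` is buffered, so here the logger's call to `flush()` does real work instead of being a no-op.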

### Solution #2: The Bridge Pattern
The Bridge Pattern splits a class’s behavior between an outer “abstraction” object that the caller sees and an “implementation” object that’s wrapped inside. We can apply the Bridge Pattern to our logging example if we make the (perhaps slightly arbitrary) decision that filtering belongs out in the “abstraction” class while output belongs in the “implementation” class.

As in the Adapter case, a separate echelon of classes now governs writing. But instead of having to contort our output classes to match the interface of a Python `file` object — which required the awkward maneuver of adding a newline in the logger that sometimes had to be removed again in the adapter — we now get to define the interface of the wrapped class ourselves.

So let’s design the inner “implementation” object to accept a raw message, rather than needing a newline appended, and reduce the interface to only a single method `emit()` instead of also having to support a `flush()` method that was usually a no-op.

In [None]:
# The “abstractions” that callers will see.

class Logger(object):
    def __init__(self, handler):
        self.handler = handler

    def log(self, message):
        self.handler.emit(message)

class FilteredLogger(Logger):
    def __init__(self, pattern, handler):
        self.pattern = pattern
        super().__init__(handler)

    def log(self, message):
        if self.pattern in message:
            super().log(message)

# The “implementations” hidden behind the scenes.

class FileHandler:
    def __init__(self, file):
        self.file = file

    def emit(self, message):
        self.file.write(message + '\n')
        self.file.flush()

class SocketHandler:
    def __init__(self, sock):
        self.sock = sock

    def emit(self, message):
        self.sock.sendall((message + '\n').encode('ascii'))

class SyslogHandler:
    def __init__(self, priority):
        self.priority = priority

    def emit(self, message):
        syslog.syslog(self.priority, message)

Abstraction objects and implementation objects can now be freely combined at runtime:

In [None]:
handler = FileHandler(sys.stdout)
logger = FilteredLogger('Error', handler)

logger.log('Ignored: this will not be logged')
logger.log('Error: this is important')

This presents more symmetry than the Adapter. Instead of file output being native to the Logger but non-file output requiring an additional class, a functioning logger is now always built by composing an abstraction with an implementation.

Once again, the subclass explosion is avoided because two kinds of class are composed together at runtime without requiring either class to be extended.
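One way to see the benefit is to add a new implementation and check that neither abstraction has to change. The `ListHandler` below is a hypothetical handler invented for this sketch (it just collects messages in memory); the two logger classes are repeated from above so the block runs on its own.

```python
class Logger:
    def __init__(self, handler):
        self.handler = handler

    def log(self, message):
        self.handler.emit(message)

class FilteredLogger(Logger):
    def __init__(self, pattern, handler):
        self.pattern = pattern
        super().__init__(handler)

    def log(self, message):
        if self.pattern in message:
            super().log(message)

# Hypothetical new "implementation": keeps messages in a list.
class ListHandler:
    def __init__(self):
        self.messages = []

    def emit(self, message):
        self.messages.append(message)

handler = ListHandler()
plain = Logger(handler)
filtered = FilteredLogger('Error', handler)

plain.log('Noisy: kept by the plain logger')
filtered.log('Ignored: dropped by the filter')
filtered.log('Error: kept by both')

print(handler.messages)
```

Both abstractions work with the new handler unmodified, because the only thing they rely on is its `emit()` method.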

### Solution #3: The Decorator Pattern
What if we wanted to apply two different filters to the same log? Neither of the above solutions supports multiple filters — say, one filtering by priority and the other matching a keyword.

Look back at the filters defined in the previous section. The reason we cannot stack two filters is that there’s an asymmetry between the interface they offer and the interface they wrap: they offer a `log()` method but call their handler’s `emit()` method. Wrapping one filter in another would result in an `AttributeError` when the outer filter tried to call the inner filter’s `emit()`.
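The failure is easy to reproduce. This sketch repeats the Bridge classes from the previous section and wraps one `FilteredLogger` in another; the use of `io.StringIO` as the output file is just a convenience for the example.

```python
import io

class Logger:
    def __init__(self, handler):
        self.handler = handler

    def log(self, message):
        self.handler.emit(message)

class FilteredLogger(Logger):
    def __init__(self, pattern, handler):
        self.pattern = pattern
        super().__init__(handler)

    def log(self, message):
        if self.pattern in message:
            super().log(message)

class FileHandler:
    def __init__(self, file):
        self.file = file

    def emit(self, message):
        self.file.write(message + '\n')
        self.file.flush()

inner = FilteredLogger('Error', FileHandler(io.StringIO()))
outer = FilteredLogger('severe', inner)   # a filter used as a "handler"

error = None
try:
    # The outer filter accepts the message, then calls inner.emit(),
    # but the inner filter only offers log().
    outer.log('Error: this is pretty severe')
except AttributeError as exc:
    error = exc

print(repr(error))
```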

If we instead pivot our filters and handlers to offering the same interface, so that they all alike offer a `log()` method, then we have arrived at the Decorator Pattern:

In [None]:
# The loggers all perform real output.

class FileLogger:
    def __init__(self, file):
        self.file = file

    def log(self, message):
        self.file.write(message + '\n')
        self.file.flush()

class SocketLogger:
    def __init__(self, sock):
        self.sock = sock

    def log(self, message):
        self.sock.sendall((message + '\n').encode('ascii'))

class SyslogLogger:
    def __init__(self, priority):
        self.priority = priority

    def log(self, message):
        syslog.syslog(self.priority, message)

# The filter calls the same method it offers.

class LogFilter:
    def __init__(self, pattern, logger):
        self.pattern = pattern
        self.logger = logger

    def log(self, message):
        if self.pattern in message:
            self.logger.log(message)

For the first time, the filtering code has moved outside of any particular logger class. Instead, it’s now a stand-alone feature that can be wrapped around any logger we want.

As with our first two solutions, filtering can be combined with output at runtime without building any special combined classes:

In [None]:
log1 = FileLogger(sys.stdout)
log2 = LogFilter('Error', log1)

log1.log('Noisy: this logger always produces output')

log2.log('Ignored: this will be filtered out')
log2.log('Error: this is important and gets printed')

And because Decorator classes are symmetric — they offer exactly the same interface they wrap — we can now stack several different filters atop the same log!

In [None]:
log3 = LogFilter('severe', log2)

log3.log('Error: this is bad, but not that bad')
log3.log('Error: this is pretty severe')

But note the one place where the symmetry of this design breaks down: while filters can be stacked, output routines cannot be combined or stacked. Log messages can still only be written to one output.

### Solution #4: Beyond the Gang of Four patterns
Python’s logging module wanted even more flexibility: not only to support multiple filters, but to support multiple outputs for a single stream of log messages. Based on the design of logging modules in other languages — see PEP 282’s “Influences” section for the main inspirations — the Python logging module implements its own Composition Over Inheritance pattern.

1. The `Logger` class that callers interact with doesn’t itself implement either filtering or output. Instead, it maintains a list of filters and a list of handlers.
2. For each log message, the logger calls each of its filters. The message is discarded if any filter rejects it.
3. For each log message that’s accepted by all the filters, the logger loops over its output handlers and asks every one of them to `emit()` the message.

Or, at least, that’s the core of the idea. The Standard Library’s logging is in fact more complicated. For example, each handler can carry its own list of filters in addition to those listed by its logger. And each handler also specifies a minimum message “level” like `INFO` or `WARNING` that, rather confusingly, is enforced neither by the handler itself nor by any of the handler’s filters, but instead by an `if` statement buried deep inside the logger where it loops over the handlers. The total design is thus a bit of a mess.
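A short sketch with the real `logging` module shows the list-of-filters and list-of-handlers idea in action. The logger name, the `KeywordFilter` class, and the sample messages are all made up for this example. Only `logging.Filter`, `StreamHandler`, `addFilter()`, `addHandler()`, and `setLevel()` come from the Standard Library.

```python
import io
import logging

class KeywordFilter(logging.Filter):
    """Illustrative filter: accept records whose message has a keyword."""
    def __init__(self, keyword):
        super().__init__()
        self.keyword = keyword

    def filter(self, record):
        return self.keyword in record.getMessage()

logger = logging.getLogger('composition_demo')
logger.setLevel(logging.DEBUG)
logger.propagate = False          # keep the root logger out of the example

everything = io.StringIO()
warnings_only = io.StringIO()

handler1 = logging.StreamHandler(everything)
handler2 = logging.StreamHandler(warnings_only)
handler2.setLevel(logging.WARNING)   # this handler's minimum level

logger.addFilter(KeywordFilter('disk'))
logger.addHandler(handler1)
logger.addHandler(handler2)

logger.info('disk usage at 40%')
logger.warning('disk usage at 95%')
logger.warning('network unreachable')   # rejected by the logger's filter

print(everything.getvalue())
print(warnings_only.getvalue())
```

The first handler receives both disk messages, while the second receives only the warning, which illustrates one stream of messages fanning out to several outputs.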

But we can use the Standard Library logger’s basic insight — that a logger’s messages might deserve both multiple filters and multiple outputs — to decouple filter classes and handler classes entirely:

In [None]:
# There is now only one logger.

class Logger:
    def __init__(self, filters, handlers):
        self.filters = filters
        self.handlers = handlers

    def log(self, message):
        if all(f.match(message) for f in self.filters):
            for h in self.handlers:
                h.emit(message)

# Filters now know only about strings!

class TextFilter:
    def __init__(self, pattern):
        self.pattern = pattern

    def match(self, text):
        return self.pattern in text

# Handlers look like “loggers” did in the previous solution.

class FileHandler:
    def __init__(self, file):
        self.file = file

    def emit(self, message):
        self.file.write(message + '\n')
        self.file.flush()

class SocketHandler:
    def __init__(self, sock):
        self.sock = sock

    def emit(self, message):
        self.sock.sendall((message + '\n').encode('ascii'))

class SyslogHandler:
    def __init__(self, priority):
        self.priority = priority

    def emit(self, message):
        syslog.syslog(self.priority, message)

Note that only with this final pivot in our design do filters really shine forth with the simplicity they deserve. For the first time, they accept only a string and return only a verdict. All of the previous designs either hid filtering inside one of the logging classes itself, or saddled filters with additional duties beyond simply rendering a verdict.

In fact, the word “log” has dropped entirely away from the name of the filter class, and for a very important reason: there’s no longer anything about it that’s specific to logging! The `TextFilter` is now entirely reusable in any context that happens to involve strings. Finally decoupled from the specific concept of logging, it will be easier to test and maintain.
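To illustrate that reuse, here is a small sketch in which `TextFilter` screens a list of strings that have nothing to do with logging. The sample lines are invented for the example.

```python
class TextFilter:
    def __init__(self, pattern):
        self.pattern = pattern

    def match(self, text):
        return self.pattern in text

# Made-up sample data: any strings will do.
lines = [
    'GET /index.html 200',
    'GET /missing 404',
    'POST /login 200',
]

not_found = TextFilter('404')
matching = [line for line in lines if not_found.match(line)]
print(matching)
```

The same property makes the class trivial to unit test: a test only has to pass in a string and check the boolean that comes back.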

Again, as with all Composition Over Inheritance solutions to a problem, classes are composed at runtime without needing any inheritance:

In [None]:
f = TextFilter('Error')
h = FileHandler(sys.stdout)
logger = Logger([f], [h])

logger.log('Ignored: this will not be logged')
logger.log('Error: this is important')
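The example above uses one filter and one handler, but the design exists to support several of each. A minimal sketch, repeating the classes so the block runs on its own and using an `io.StringIO` buffer as a stand-in for a second output:

```python
import io
import sys

class Logger:
    def __init__(self, filters, handlers):
        self.filters = filters
        self.handlers = handlers

    def log(self, message):
        if all(f.match(message) for f in self.filters):
            for h in self.handlers:
                h.emit(message)

class TextFilter:
    def __init__(self, pattern):
        self.pattern = pattern

    def match(self, text):
        return self.pattern in text

class FileHandler:
    def __init__(self, file):
        self.file = file

    def emit(self, message):
        self.file.write(message + '\n')
        self.file.flush()

buffer = io.StringIO()   # stand-in for a second destination
logger = Logger(
    filters=[TextFilter('Error'), TextFilter('severe')],
    handlers=[FileHandler(sys.stdout), FileHandler(buffer)],
)

logger.log('Error: this is bad, but not that bad')   # fails the second filter
logger.log('Error: this is pretty severe')           # reaches both outputs

print(repr(buffer.getvalue()))
```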

There’s a crucial lesson here: design principles like Composition Over Inheritance are, in the end, more important than individual patterns like the Adapter or Decorator. Always follow the principle. But don’t always feel constrained to choose a pattern from an official list. The design at which we’ve now arrived is both more flexible and easier to maintain than any of the previous designs, even though they were based on official Gang of Four patterns but this final design is not. Sometimes, yes, you will find an existing Design Pattern that’s a perfect fit for your problem — but if not, your design might be stronger if you move beyond them.

### Dodge: “if” statements
I suspect that the above code has startled many readers. To a typical Python programmer, such heavy use of classes might look entirely contrived — an awkward exercise in trying to make old ideas from the 1980s seem relevant to modern Python.

When a new design requirement appears, does the typical Python programmer really go write a new class? No! “Simple is better than complex.” Why add a class, when an `if` statement will work instead? A single logger class can gradually accrete conditionals until it handles all the same cases as our previous examples:

In [None]:
# Each new feature as an “if” statement.

class Logger:
    def __init__(self, pattern=None, file=None, sock=None, priority=None):
        self.pattern = pattern
        self.file = file
        self.sock = sock
        self.priority = priority

    def log(self, message):
        if self.pattern is not None:
            if self.pattern not in message:
                return
        if self.file is not None:
            self.file.write(message + '\n')
            self.file.flush()
        if self.sock is not None:
            self.sock.sendall((message + '\n').encode('ascii'))
        if self.priority is not None:
            syslog.syslog(self.priority, message)

# Works just fine.

logger = Logger(pattern='Error', file=sys.stdout)

logger.log('Warning: not that important')
logger.log('Error: this is important')

You may recognize this example as more typical of the Python design practices you’ve encountered in real applications.

The `if` statement approach is not entirely without benefit. This class’s whole range of possible behaviors can be grasped in a single reading of the code from top to bottom. The parameter list might look verbose but, thanks to Python’s optional keyword arguments, most calls to the class won’t need to provide all four arguments.

(It’s true that this class can handle only one file and one socket, but that’s an incidental simplification for the sake of readability. We could easily pivot the `file` and `sock` parameters to lists named `files` and `sockets`.)
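A rough sketch of that pivot might look like the following. The parameter names `files` and `sockets` come from the sentence above; everything else, including dropping the syslog branch to keep the sketch portable, is an assumption made for illustration.

```python
import io

class Logger:
    # Sketch only: the syslog "priority" feature is left out here.
    def __init__(self, pattern=None, files=(), sockets=()):
        self.pattern = pattern
        self.files = list(files)
        self.sockets = list(sockets)

    def log(self, message):
        if self.pattern is not None:
            if self.pattern not in message:
                return
        for file in self.files:
            file.write(message + '\n')
            file.flush()
        for sock in self.sockets:
            sock.sendall((message + '\n').encode('ascii'))

first = io.StringIO()
second = io.StringIO()
logger = Logger(pattern='Error', files=[first, second])

logger.log('Warning: not that important')
logger.log('Error: this is important')

print(repr(first.getvalue()), repr(second.getvalue()))
```

The loops replace two of the `if` statements, but each feature still lives partly in the initializer and partly in `log()`.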

Given that every Python programmer learns `if` quickly, but can take much longer to understand classes, it might seem a clear win for code to rely on the simplest possible mechanism that will get a feature working. But let’s balance that temptation by making explicit what’s been lost by dodging Composition Over Inheritance:

1. **Locality** - Reorganizing the code to use `if` statements hasn’t been an unmitigated win for readability. If you are tasked with improving or debugging one particular feature (say, the support for writing to a socket) you will find that you can’t read its code all in one place. The code behind that single feature is scattered between the initializer’s parameter list, the initializer’s code, and the `log()` method itself.

2. **Deletability** - An underappreciated property of good design is that it makes deleting features easy. Perhaps only veterans of large and mature Python applications will strongly enough appreciate the importance of code deletion to a project’s health. In the case of our class-based solutions, we can trivially delete a feature like logging to a socket by removing the `SocketHandler` class and its unit tests once the application no longer needs it. By contrast, deleting the socket feature from the forest of `if` statements not only requires caution to avoid breaking adjacent code, but raises the awkward question of what to do with the `sock` parameter in the initializer. Can it be removed? Not if we need to keep the list of positional parameters consistent: we would need to retain the parameter, but raise an exception if it’s ever used.

3. **Dead code analysis** - Related to the previous point is the fact that when we use Composition Over Inheritance, dead code analyzers can trivially detect when the last use of `SocketHandler` in the codebase disappears. But dead code analysis is often helpless to make a determination like “you can now remove all the attributes and `if` statements related to socket output, because no surviving call to the initializer passes anything for `sock` other than `None`.”

4. **Testing** - One of the strongest signals about code health that our tests provide is how many lines of irrelevant code have to run before reaching the line under test. Testing a feature like logging to a socket is easy if the test can simply spin up a `SocketHandler` instance, pass it a live socket, and ask it to `emit()` a message. No code runs except code relevant to the feature. But testing socket logging in our forest of `if` statements will run at least three times the number of lines of code. Having to set up a logger with the right combination of several features merely to test one of them is an important warning sign that might seem trivial in this small example but becomes crucial as a system grows larger.

5. **Efficiency** - I’m deliberately putting this point last, because readability and maintainability are generally more important concerns. But the design problems with the forest of `if` statements are also signalled by the approach’s inefficiency. Even if you want a simple unfiltered log to a single file, every single message will be forced to run an `if` statement against every possible feature you could have enabled. The technique of composition, by contrast, only runs code for the features you’ve composed together.
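The Testing point in the list above can be sketched as a tiny test. The test function name and the sample message are invented for this example, and `socket.socketpair()` stands in for the “live socket”; the `SocketHandler` class is repeated from Solution #4.

```python
import socket

class SocketHandler:
    def __init__(self, sock):
        self.sock = sock

    def emit(self, message):
        self.sock.sendall((message + '\n').encode('ascii'))

def test_socket_handler_emits_one_line():
    sock1, sock2 = socket.socketpair()
    try:
        # Only the feature under test runs: no filter checks,
        # no file or syslog branches to step around.
        SocketHandler(sock1).emit('Error: disk full')
        assert sock2.recv(512) == b'Error: disk full\n'
    finally:
        sock1.close()
        sock2.close()

test_socket_handler_emits_one_line()
print('ok')
```

An equivalent test against the single conditional-laden `Logger` would have to construct the whole logger and pass through the pattern and file checks before reaching the socket code.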

For all of these reasons, I suggest that the apparent simplicity of the `if` statement forest is, from the point of view of software design, largely an illusion. The ability to read the logger top-to-bottom as a single piece of code comes at the cost of several other kinds of conceptual expense that will grow sharply with the size of the codebase.
