# Lecture 4: software architecture principles

This lecture is about software development in Python, but as a side effect the principles carry over to every other programming language.

More precisely, we are concerned here with the **good practices** required to write production-ready, quality software.

Up to now you have learned the basics of programming. The goal is now to apply this knowledge and these skills to real-world code... Well, our examples will remain small applications, but the principles are valid for real applications, including those written by PhD students, by a single developer, or by many developers working together on the same project for years.

In any kind of application, the first question to ask is about its **architecture**. What do we mean by that? The architecture describes how the different classes are organised, and how the different levels (libraries and functionalities) are grouped by topic, such as logging, UI, etc. Depending on the context, an application can be built upon micro-services, around a central bus, using distributed components, or around kernels nested like Russian dolls, etc.

So, the main questions are:
- how to test the application (unit test, integration test)?
- how to organise the communication between the objects?

While these questions can have different answers, in this lecture we focus on some very **basic principles** that should simplify the coder's life, whatever architecture is finally chosen. We start with the *must-be-used-else-you-are-fired* **SOLID** ones (so, five principles, one per letter), then **DRY**, and we conclude with a **KISS**.

## Single responsibility principle
This principle is the S in SOLID. If you take a look at Wikipedia, you will see the following simple definition coming from *Uncle Bob* (aka *Robert C. Martin*):
```quote
A module should be responsible to one, and only one, actor
```
Another, perhaps more comprehensible definition is:
```quote
There should never be more than one reason for a class to change.
```

You may find various examples of classes that break this principle. Here is the one from Wikipedia:
```quote
As an example, consider a module that compiles and prints a report. Imagine such a module can be changed for two reasons. First, the content of the report could change. Second, the format of the report could change. These two things change for different causes. The single-responsibility principle says that these two aspects of the problem are really two separate responsibilities, and should, therefore, be in separate classes or modules. It would be a bad design to couple two things that change for different reasons at different times.
```

Obviously, *module* should be understood here as *class* or *object*, as this principle is agnostic of the programming language used. Let us take another example, in Python this time.

```python
# Bad design: breaks the Single Responsibility Principle

class CoffeeMaker:
    def make_black_coffee(self): ...
    def add_milk(self): ...
    def make_foam(self): ...
    def prepare_an_order(self, use_milk, do_foam): ...
    def serve(self): ...
    def take_order(self): ...
    def take_payment(self): ...
```

Well, you do not need 53 years of programming experience to see that there is a design problem here, do you?

Of course, this class has a lot of responsibilities:
- it can do different recipes of coffee, 
- it can manage the payment,
- it can serve a coffee (think of a vending machine)...

We should use more classes, each of them having exactly one responsibility (no more, and no less)...
We should split it into at least three different classes:

```python
# A more responsible architecture (ha ha ha)


class Coffee:
    def __init__(self, milk=False, foam=False):
        self.milk = milk
        self.foam = foam
        ...
    

class Command:
    """One coffee at a time"""
    # sugar needs a default value: a parameter without default
    # cannot follow parameters that have one
    def __init__(self, milk=False, foam=False, sugar: int = 0):
        self.__milk = milk
        self.__foam = foam
        self.__sugar = sugar
    
    @property
    def milk(self):
        return self.__milk
    
    @property
    def foam(self):
        return self.__foam
    
    @property
    def sugar(self):
        return self.__sugar


class CoffeeChief:
    """Chief can prepare one coffee at a time"""
    def prepare(self, command: Command) -> Coffee:
        # boil water and do what is necessary (assuming a vending machine)
        ...

class Waiter:
    def take_command(self) -> Command:
        # may ask the user politely or just go through a UI
        ...

    def delivers(self, command: Command):
        # may open the vending machine door for instance
        ...

class Cashier:
    """Take the payment (cash, card...) """
    def take_payment(self, command: Command) -> bool:
        # take the payment, validate it, and return whether it succeeded
        ...

class CoffeeMaker:
    """Here we loop taking command, payment, preparing, serving"""
    ...
```

Obviously, this example application is not finished, as we would need to add many things related to the vending machine itself. 
But it shows what a responsibility is here:
- The `CoffeeMaker` serves one client at a time in an infinite loop.
- The `Waiter` takes one command, asking for milk and foam, possibly the size of the coffee, the brand, and so on. Finally, he serves the prepared coffee, but does not prepare it. Notice that **this can be split in two!**
- A `Command` stores all the details about the coffee we want to prepare and serve.
- A `Cashier` is responsible for taking the payment and validating the command. 
- The `CoffeeChief` prepares one command at a time.
- A `Coffee` is an object here, but only as a simple example: a real vending machine would have to make it for real 😁

You have probably understood by now where we are going: with the Single Responsibility Principle, our classes become lighter, generally with only a few methods, except perhaps for some data-storing classes where many getters and setters may be present (and even then...). 
One way to limit the proliferation of very small classes is to apply the following advice from *Uncle Bob*:
```quote
Gather together the things that change for the same reasons.
Separate those things that change for different reasons.
```
Hence, a coffee can be a single class with many properties, for instance...

Let us try another funny example:
```python
# Before the single responsibility principle
class Model:
  
    def pre_process(self):
        pass  

    def train(self):
        pass
  
    def evaluate(self):
        pass  
    
    def predict(self):
        pass
```

Tss, that's ugly, isn't it? Especially considering the following steps for the pre-processing:
```python
    def pre_process(self):
        # importing data
        # converting data types
        # handling missing values
        # handling outliers
        # transforming data
        ...
```
A better solution is then the following:
```python
import abc
# After the single responsibility principle applied

# Inheriting from abc.ABC is what makes @abstractmethod enforced
class PreProcess(abc.ABC):
    @abc.abstractmethod
    def import_data(self): 
        pass
    @abc.abstractmethod
    def convert_data_type(self): 
        pass
    @abc.abstractmethod
    def handle_missing_values(self): 
        pass
    @abc.abstractmethod
    def handle_outliers(self): 
        pass
    @abc.abstractmethod
    def transform_data(self): 
        pass

class Train:
    pass

class Evaluate:
    pass

class Predict:
    pass
```
In this example, we use `@abstractmethod` to tell Python that a method has no implementation at this level, which leads to an interface rather than an implementation. Of course, the **abstract methods** must be overridden in the subclasses, to obtain one or more implementations. This is a key concept in any real-world application.

As a matter of fact, the different abstract methods of the `PreProcess` class could and should themselves be implemented through abstract classes, as we will see later with the `D` of SOLID.

Notice that the single responsibility principle is quite close to KISS. But that's another story...

## Open-closed principle
The second principle of SOLID is the **open-closed principle**. Open *and* closed at the same time? It sounds contradictory, but *Bertrand Meyer* (who first defined it) meant something quite precise. 
This principle states the following:
```quote
Software entities (classes, modules, functions, etc.) should be open for extension, but closed for modification.
```
It is quite important to explain the difference here:
- A module will be said to be **open if it is still available for extension**. For example, it should be possible to add fields to the data structures it contains, or new elements to the set of functions it performs.
- A module will be said to be **closed if it is available for use by other modules**. This assumes that the module has been given a well-defined, stable description (the interface in the sense of information hiding).

The modern way to apply this is, once again, to use interfaces rather than concrete classes.

Then, this principle becomes the following:
- **Use interfaces and build the hierarchy on them, as inner nodes**.
- **Write as many implementations as necessary and use them as the outer nodes of your class hierarchy**. 
    An implementation can be modified in the future without breaking anything (they are leaves...).

As a consequence, in modern Python, implementations should make extensive use of the `final` decorator (see [PEP 591](https://peps.python.org/pep-0591/)).
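The two rules above can be sketched as follows. This is only a minimal illustration: the `Discount` hierarchy and the `checkout` function are hypothetical names invented for the example. Note also that `@final` is checked by static type checkers (mypy, pyright...), not by the interpreter at run time.

```python
from abc import ABC, abstractmethod
from typing import final


class Discount(ABC):
    """Interface (inner node): stable, closed for modification."""

    @abstractmethod
    def apply(self, price: float) -> float:
        ...


@final
class NoDiscount(Discount):
    """Implementation (leaf): nobody should inherit from it."""

    def apply(self, price: float) -> float:
        return price


@final
class PercentageDiscount(Discount):
    """Another leaf, added later without touching existing code."""

    def __init__(self, percent: float):
        self.percent = percent

    def apply(self, price: float) -> float:
        return price * (1 - self.percent / 100)


def checkout(price: float, discount: Discount) -> float:
    """Client code: depends only on the interface."""
    return round(discount.apply(price), 2)


print(checkout(100.0, NoDiscount()))
print(checkout(100.0, PercentageDiscount(20)))
```

Adding a new kind of discount means adding a new leaf class (extension), while `Discount` and `checkout` stay untouched (closed for modification).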

## Liskov Substitution Principle
This principle was first proposed by Barbara Liskov in a 1987 keynote (published in 1988).
It is close to the *Design by Contract* approach.

In short it can be summarized by the following rule:
```quote
Functions that use an instance of a base class A must be able to use an instance of any subclass of A without knowing it. 
``` 

Liskov's principle defines a notion of substitutability for objects: instances of the subclasses can be used in place of an instance of the parent class, without altering the correctness of the function or program. 

In Python, you already rely on this principle all the time, e.g. with the `len()` function. 
Indeed, this function takes as parameter an instance of a class that should be some kind of container, and returns the number of elements it contains. 
How is this function written? 
To follow the Liskov Substitution Principle (LSP), it relies on a specific method that every valid argument must support: the `__len__()` method. 
Let us recall the corresponding part of our previously seen `LinkedList` class:
```python
class LinkedList:
    ...
    def __len__(self) -> int:
        counter = 0
        node = self.head
        while node is not None:
            counter = counter + 1
            node = node.next
        return counter
    ...
```
The `len()` function works with any object having the `__len__(self)->int` method:
```python
def len(container) -> int:
    return container.__len__()
```
In a sense, it delegates the responsibility of computing the length to its parameter. 
If the container does not have a `__len__()` method, the built-in `len()` raises a `TypeError`.
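As a small sketch of this substitutability, the function below works with any object that honours the `__len__` contract. The `Playlist` class and the `describe` function are hypothetical names used only for this illustration.

```python
class Playlist:
    """Hypothetical container, unrelated to list or LinkedList."""

    def __init__(self, *titles: str):
        self._titles = list(titles)

    def __len__(self) -> int:
        return len(self._titles)


def describe(container) -> str:
    """Works with any object that supports len()."""
    return f"{len(container)} element(s)"


print(describe([1, 2, 3]))
print(describe(Playlist("a", "b")))

# An object without __len__ breaks the contract: len() raises TypeError
try:
    len(object())
except TypeError as error:
    print(f"TypeError: {error}")
```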

Below is an example of simple Python code that does not follow this principle (taken from the [Python tutorial web site](https://www.pythontutorial.net/python-oop/python-liskov-substitution-principle/)):
```python
from abc import ABC, abstractmethod


class Notification(ABC):
    @abstractmethod
    def notify(self, message, email):
        pass


class Email(Notification):
    def notify(self, message, email):
        print(f'Send {message} to {email}')


class SMS(Notification):
    def notify(self, message, phone):
        print(f'Send {message} to {phone}')


class Contact:
    def __init__(self, name, email, phone):
        self.name = name
        self.email = email
        self.phone = phone


class NotificationManager:
    def __init__(self, notification, contact):
        self.contact = contact
        self.notification = notification

    def send(self, message):
        if isinstance(self.notification, Email):
            self.notification.notify(message, self.contact.email)
        elif isinstance(self.notification, SMS):
            self.notification.notify(message, self.contact.phone)
        else:
            raise Exception('The notification is not supported')
```
Starting with the first classes, we can already see a design problem with the notifications:
the `notify` method receives an `email` parameter, which is renamed to `phone` by the `SMS` subclass (this is fine syntactically, but it changes the meaning of the parameter). 
The main problem appears afterwards, with the `NotificationManager.send(self, message)` method, which relies on the `Notification` instance received at initialisation: to work properly, this method has to check the real type of the notification in order to pass either the email address or the phone number... This is where LSP is broken!

Yuck!

Ugly it is, isn't it? 

To respect the LSP we have to do the following modifications:
- Remove the `email` parameter from the `Notification` class.
- Add the email or phone number to the constructor of the two subclasses `SMS` and `Email`.
- Modify (simplify) the `send` method of the `NotificationManager` class. 
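A possible result of these three modifications is sketched below. One liberty is taken for the sake of the example: `notify` returns the text instead of printing it, so that the behaviour is easy to check. The addresses used are of course made up.

```python
from abc import ABC, abstractmethod


class Notification(ABC):
    @abstractmethod
    def notify(self, message: str) -> str:
        ...


class Email(Notification):
    def __init__(self, email: str):
        self.email = email

    def notify(self, message: str) -> str:
        return f'Send {message} to {self.email}'


class SMS(Notification):
    def __init__(self, phone: str):
        self.phone = phone

    def notify(self, message: str) -> str:
        return f'Send {message} to {self.phone}'


class NotificationManager:
    def __init__(self, notification: Notification):
        self.notification = notification

    def send(self, message: str) -> str:
        # No isinstance() check: any Notification can be substituted
        return self.notification.notify(message)


print(NotificationManager(Email('john@test.com')).send('Hello'))
print(NotificationManager(SMS('(408)-888-9999')).send('Hello'))
```

The manager no longer needs the `Contact` at all, and a new notification channel can be added without touching `send`.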

## Interface Segregation Principle
This principle can be summed up by the following sentence (from Wikipedia):
```quote
Clients should not be forced to depend upon interfaces that they do not use.
```
At first glance it seems obvious, which indicates that we do not really understand it...

What does it really mean?

It means that when you receive an instance, which necessarily implements an interface if the previous principles are followed, you should actually need every method defined by this interface. 
Well, with an example it will be easier to see the point.
Let us consider the following example coming from [python tutorial](https://www.pythontutorial.net/python-oop/python-interface-segregation-principle/):
```python
from abc import ABC, abstractmethod


class Vehicle(ABC):
    @abstractmethod
    def go(self):
        pass

    @abstractmethod
    def fly(self):
        pass


class Aircraft(Vehicle):
    def go(self):
        print("Taxiing")

    def fly(self):
        print("Flying")


class Car(Vehicle):
    def go(self):
        print("Going")

    def fly(self):
        raise Exception('The car cannot fly')
```
The classes `Vehicle` and `Aircraft` seem correct at first, because the latter can both fly and move along the taxiway.
But we can clearly see that the `Car` class is weird, since we have to raise an exception in the `fly()` method (flying cars aside...).

How can we fix this design flaw? 
That is simple, actually: by splitting the interface into different *roles*:
```python
from abc import ABC, abstractmethod

class Movable(ABC):
    @abstractmethod
    def go(self):
        pass


class Flyable(Movable):
    @abstractmethod
    def fly(self):
        pass
```
Unlike Python, some languages do not allow multiple inheritance of classes (e.g. Java). 
In that case, the roles have to be declared with `interface` rather than with abstract classes.
In fact, the main idea of the SOLID principles is to use interfaces, and since interfaces do not really exist in Python, we can only use abstract classes. This is done by inheriting from the `ABC` class defined in the `abc` module (*Abstract Base Class*). 
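With the roles split as above, the concrete classes only implement what they can really do. The sketch below repeats the two interfaces so that it is self-contained; as an assumption made for the example, the methods return their text instead of printing it.

```python
from abc import ABC, abstractmethod


class Movable(ABC):
    @abstractmethod
    def go(self) -> str:
        ...


class Flyable(Movable):
    @abstractmethod
    def fly(self) -> str:
        ...


class Car(Movable):
    # A car only has to honour the Movable role
    def go(self) -> str:
        return "Going"


class Aircraft(Flyable):
    # An aircraft honours both roles
    def go(self) -> str:
        return "Taxiing"

    def fly(self) -> str:
        return "Flying"


print(Car().go())
print(Aircraft().go(), Aircraft().fly())
```

`Car` no longer has a `fly()` method raising an exception: a client that needs something flying simply asks for a `Flyable`.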

## Dependency Inversion Principle
This principle is very simple and used in so many frameworks that even dinosaurs seem to have used it.
It is a fundamental principle for testing, and since testing is absolutely required for any code (otherwise your code is ready for the trash), it should be considered an axiom.

Dependency Inversion Principle (DIP) can be stated as follows:
```quote
Depend upon abstractions, [not] concretions.
```
Wow, such a short principle! 
But if you take a look at Wikipedia or other pages, you will find very long articles with lots of explanations. 
We try to stay short here, so we do not add too many extra details. 

In short, DIP states:
- High-level modules should not import anything from low-level modules. Both should depend on abstractions (interfaces).
- Abstractions should not depend on details. Details (concrete implementations) should depend on abstractions.

By dictating that both high-level and low-level objects must depend on the same abstraction, this design principle inverts the way some people may think about object-oriented programming.
This is a very important way to think about DIP. There are two ways to build software: one goes from interfaces to implementations, the other considers only implementations and barely any interface. Of course, the second way is the *dark side of coding*, which should be banished, while the first is the *light side of coding*. The dark side produces code more quickly when we start an application... but as the application grows, it becomes harder and harder to modify, and in most cases this ends with a rewrite. The light side requires more thinking before coding, but afterwards it allows the application to be modified and extended quite simply.

You have probably already used this principle in many different *frameworks* in Java, Python, C++, C#... 
A **service** is typically the result of applying DIP. Instead of treating a service as a concrete implementation, most frameworks offer abstractions of services. These abstractions are used at different levels of the application, and with **dependency injection** (DI) the service providers are plugged in, for instance in the configuration layer...

Let us take an example for DIP, again from [python tutorial](https://www.pythontutorial.net/python-oop/python-dependency-inversion-principle/). 
```python
class FXConverter:
    def convert(self, from_currency, to_currency, amount):
        print(f'{amount} {from_currency} = {amount * 1.2} {to_currency}')
        return amount * 1.2


class App:
    def start(self):
        converter = FXConverter()
        converter.convert('EUR', 'USD', 100)


if __name__ == '__main__':
    app = App()
    app.start()
```
Well, you should have spotted the problem: `App.start()` directly uses the `FXConverter` class.
The issue is not that we call the `converter.convert()` method, but that we build an instance of this class ourselves... 
This should not be done here, since it breaks DIP (we depend on an implementation, not on an abstraction).

To solve this dark mistake, we must use an interface... 
```python
from abc import ABC, abstractmethod


class CurrencyConverter(ABC):
    @abstractmethod
    def convert(self, from_currency, to_currency, amount) -> float:
        pass
```
Then we may add some implementations:
```python
class FXConverter(CurrencyConverter):
    def convert(self, from_currency, to_currency, amount) -> float:
        print('Converting currency using FX API')
        print(f'{amount} {from_currency} = {amount * 1.2} {to_currency}')
        return amount * 1.2
```
Now, DIP can be applied to the `App` class as follows:
```python
class App:
    def __init__(self, converter: CurrencyConverter):
        self.converter = converter

    def start(self):
        self.converter.convert('EUR', 'USD', 100)
```
As you have noticed, the `start()` method no longer creates any converter instance... The instance is received by the constructor (but it could be received by a setter as well). 

Recalling the previous lecture, it is crystal clear that this is mandatory for proper *unit testing*: the `App` class should be tested by mocking all the other classes (otherwise, errors in these other classes will probably make the test of `App` fail...). 
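To make this concrete, here is a minimal sketch of such a test. Two assumptions are made for the example: `start()` returns the converted amount (the version above discards it), and `StubConverter` is a hypothetical hand-written test double rather than a mock from a library.

```python
from abc import ABC, abstractmethod


class CurrencyConverter(ABC):
    @abstractmethod
    def convert(self, from_currency, to_currency, amount) -> float:
        ...


class App:
    def __init__(self, converter: CurrencyConverter):
        self.converter = converter

    def start(self) -> float:
        # Assumption for this sketch: the result is returned
        return self.converter.convert('EUR', 'USD', 100)


class StubConverter(CurrencyConverter):
    """Test double: records the calls and returns a fixed value."""

    def __init__(self):
        self.calls = []

    def convert(self, from_currency, to_currency, amount) -> float:
        self.calls.append((from_currency, to_currency, amount))
        return 42.0


def test_start_delegates_to_converter():
    stub = StubConverter()
    app = App(stub)
    assert app.start() == 42.0
    assert stub.calls == [('EUR', 'USD', 100)]


test_start_delegates_to_converter()
```

The test exercises `App` alone: no real exchange-rate service is involved, so a failure can only come from `App` itself.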

## Test Driven Development
Test Driven Development (TDD) is not exactly a coding principle, but rather a **coding methodology**. It states the following:
- First, the unit and integration tests you will write. 
- Second, the functionality you will implement. 

It means that the first step in programming is to write the tests (which assumes the interfaces are defined), before writing the concrete implementation. 
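As a tiny sketch of this order of work, the tests below are written first, against a function that does not exist yet; the implementation comes second. The `coffee_price` function and its prices are hypothetical and only serve the illustration.

```python
# Step 1: the tests come first and describe the expected behaviour.
def test_black_coffee_has_base_price():
    assert coffee_price(milk=False, foam=False) == 1.0


def test_milk_and_foam_are_extras():
    assert coffee_price(milk=True, foam=True) == 1.5


# Step 2: the simplest implementation that makes the tests pass.
def coffee_price(milk: bool, foam: bool) -> float:
    price = 1.0
    if milk:
        price += 0.25
    if foam:
        price += 0.25
    return price


# A test runner such as pytest would collect these automatically;
# here they are simply called by hand.
test_black_coffee_has_base_price()
test_milk_and_foam_are_extras()
```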

This methodology seems crazy to most beginner programmers, who want to start with the concrete implementation and end up playing on their phone instead of writing good tests...
But let us recall that code with no tests belongs in `/dev/null`... 
Tests matter more than the concrete implementation: if your implementation has no tests or does not pass them, it cannot be integrated into the rest of the application. 

This is really the case in production, where the **continuous integration** (CI) pipeline attached to the git/svn repository runs the tests before accepting or rejecting your push. Acceptance depends on the test results, of course.

Maybe you think that one way around this is to write no tests at all? 
Ha, a little trickster you are then 😉 
The CI pipeline should also reject your code if no tests are provided! 
This can be checked using **code coverage**, which is roughly the ratio of the code exercised by the tests (methods, lines or branches) to the total amount of code.

Of course, code coverage checks that tests exist, not that they are good enough... That is the responsibility of the coder, and it is checked when you apply for a job, and regularly afterwards by the TCBCF, i.e. your supervisor (TCBCF for *Test Checker and Bad Coder Firer*).

## Don't Repeat Yourself!
This principle is not part of SOLID, but it is one of the very important principles you learnt during your bachelor's degree.

**Don't Repeat Yourself** (DRY) is quite simple to understand, actually: it means that you should not write the same piece of code more than once. If you do, then you must move it into a function or a class. Following SOLID, this implies writing one or more interfaces if they are missing, and at least one implementation...
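A minimal sketch of the idea: instead of copying the same cleaning and checking code into two functions, the rule lives in a single place. All names, and the (deliberately naive) email rule, are hypothetical.

```python
def normalise_email(email: str) -> str:
    """The only place where the rule is written."""
    cleaned = email.strip().lower()
    if "@" not in cleaned:
        raise ValueError(f"invalid email: {email!r}")
    return cleaned


def register_user(email: str) -> dict:
    # Reuses the rule instead of repeating strip/lower/check here
    return {"email": normalise_email(email), "active": True}


def invite_user(email: str) -> str:
    # Same rule, same function: a change is made only once
    return f"Invitation sent to {normalise_email(email)}"


print(register_user("  Bob@Example.COM "))
print(invite_user("A@b.c"))
```

If the rule changes tomorrow, only `normalise_email` has to be modified and tested.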

To be honest, this very important principle is very difficult to follow in a big application, because to avoid repeating themselves a programmer may need to add a new interface/implementation... But what happens if it already exists? What happens if another coder needs the same thing at the same time? There are many situations where the DRY principle leads to design flaws, to repetition across the application, or simply to reinventing the wheel. 

To avoid these bad consequences, we must think about the application architecture: where will the classes go? Into which module? This is actually another principle, called **Separation of Concerns** (SoC). The same thing at a higher level...

There are different kinds of applications. The most common one follows the Russian dolls model, with a central kernel providing some basic functionalities, on top of which a first layer is added with other functionalities, and so on. It is actually not very efficient for applications written by many different programmers. 

Another architecture relies on different vertical layers, where each layer holds one responsibility such as UI, inputs, logging, database connection, and so on. This can be combined with Russian dolls, where each layer contains different parts... 

Well, as you may have understood, the most important things are then the documentation of the architecture and the communication between the programmers. With Agile methods, there are a lot of meetings (daily, weekly, start/stop...) to coordinate the different levels of information, and to exchange a lot about what everyone is doing. 

Being a good programmer is not only about technical skills: it also requires more general skills, communication among them.

## KISS
As stated in the discussion of the DRY principle, a good programmer needs to work on their extra skills, like communication, sociability and so on. She/he must consider the group (the other programmers, but also the customers) as very important to achieve a valuable application.

For this purpose we should follow the KISS principle. 
What is it about? No, we will not kiss each other: first because of covid, second because it is an acronym! KISS means *Keep It Simple, Stupid*.

In other words, it is very important to try to write things as simply as possible. 
Being simple, staying clear. 
Notice that a common rule attributed to Uncle Bob is that the body of a function should not need more than 2 or 3 lines. 

Ouch, that's short!

In practice this implies some very simple rules:
- Use good identifiers, which mean something to others (try them on your colleagues).
- Use small functions (2-3 lines), calling other small functions.
- Use syntactic sugar (depends on the language).
- Stay coherent: use enumerations and short interfaces to give meaning to your code. 
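These rules can be sketched with a short example: meaningful identifiers, an enumeration instead of magic numbers, and tiny functions calling other tiny functions. The names and prices are hypothetical.

```python
from enum import Enum


class Size(Enum):
    SMALL = 1
    LARGE = 2


BASE_PRICE = {Size.SMALL: 1.0, Size.LARGE: 1.5}
MILK_EXTRA = 0.25


def milk_extra(with_milk: bool) -> float:
    # Conditional expression: a bit of syntactic sugar
    return MILK_EXTRA if with_milk else 0.0


def coffee_price(size: Size, with_milk: bool) -> float:
    return BASE_PRICE[size] + milk_extra(with_milk)


print(coffee_price(Size.LARGE, with_milk=True))
```

Each function fits in one line and can be read, tested and replaced on its own.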