# 2. Inheritance, Polymorphism, Abstraction, and Encapsulation

In the previous lesson we learnt how to separate concerns using functions and classes. However, each of these separations creates a new level of granularity that can be difficult to handle as your project grows.

We saw how to download images of animals. Once we have those images, however, we might want to clean them or use them to train a model. Adding model-related methods to the same class would be confusing: after all, that class was responsible solely for downloading the images. For those cases, we can introduce more levels of granularity.

# Abstraction

The term abstract in programming is very similar to the concept of abstract in art: extracting certain ideas and removing their specifics. When you browse a webpage, you don't worry about the intricacies of what your browser is doing. You just look at the layout _(the main idea)_ and strip away the specifics _(the work behind it)_. In other words, data abstraction is like a black box.

> ### Abstraction lets you hide the internal details of the code, showing just the basic functionality

<sub>We already saw abstract classes. Don't confuse data abstraction (black boxes) with class abstraction (forcing a subclass to have a specific method). Abstract classes might help with the separation of concerns, but they are different topics.</sub>


Let's see an example.

The image below shows the functions (or methods) used to download animal images:

![](images/AnimalScraper.png)

Imagine that now we need to clean the images: they should all have the same size, have 3 channels (RGB), and have normalized pixels (values between 0 and 1). Of course, we could simply add these steps to the right-hand side of the graph, but they are not part of the scraper anymore. Enter abstraction.

We can apply abstraction to bundle functions that we know will have a specific behaviour. The bundled functions accept certain types of input and return certain types of output. This means that if you find a bug, or you want to tweak something inside one of the functions, you can change it without affecting the rest of the code, as long as the type of the output stays the same.

In this example, the AnimalScraper class shouldn't be concerned with the ImageCleaner. However, its output will affect ImageCleaner. Thus, we can _'abstract'_ AnimalScraper, so we 'forget' about its intricacies. If we need to change anything inside AnimalScraper, we can do it without changing ImageCleaner.

![](images/AnimalScraper_Cleaned.png)
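
The separation described above can be sketched in a few lines. This is only an illustration under loud assumptions: `download` and `normalize` are hypothetical method names, and the "images" are plain lists of pixel values rather than real files, so that the example runs without a network connection.

```python
from typing import List

Image = List[int]  # hypothetical stand-in: one image as a flat list of pixel values


class AnimalScraper:
    """Black box: callers only rely on download() returning a list of images."""

    def download(self, animal: str) -> List[Image]:
        # Assumption: the real class would fetch images from the web.
        # Here we return fake pixel data so the sketch is self-contained.
        return [[0, 51, 255], [102, 204, 153]]


class ImageCleaner:
    """Depends only on the type of the scraper's output, not on how it was produced."""

    def normalize(self, images: List[Image]) -> List[List[float]]:
        # Scale every pixel from the 0-255 range to the 0-1 range
        return [[pixel / 255 for pixel in image] for image in images]


images = AnimalScraper().download('koala')
cleaned = ImageCleaner().normalize(images)
print(cleaned)
```

As long as `download` keeps returning a list of images, its internals can change without touching `ImageCleaner`.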

As you keep adding code to your project, you will add more behaviour to it. When implementing abstraction, you can think of your code as an ogre: it has many layers (or think about it as an onion if you don't get the reference). Low-level layers deal with small concerns. Being in a higher-level layer doesn't mean that the layer adds larger functionality; it means that it involves more concerns.

<p align=center><img src=images/Layers.png width=500></p>



Small concerns (such as AnimalScraper) are going to be used over and over, so we should design them so that they rarely have to change. DataLoader or ModelTrain, on the other hand, might vary because we need to obtain different models.

When your code is long, it can be difficult to trace an error back to its source, which is a sign of poor maintainability. Abstraction helps by letting us pinpoint where the error is. Let's see an example (different from the animal scraper, for a change):

In [None]:
import re

product_review = '''This is a fine milk, but the product
line appears to be limited in available colors. I
could only find white.''' 

sentence_pattern = re.compile(r'(.*?\.)(\s|$)', re.DOTALL) 
matches = sentence_pattern.findall(product_review) 
sentences = [match[0] for match in matches]


In [None]:

word_pattern = re.compile(r"([\w\-']+)([\s,.])?") 
for sentence in sentences:
    matches = word_pattern.findall(sentence)
    words = [match[0] for match in matches] 
    print(words)

The code above finds every sentence ending with a period. Then, for each sentence, it prints a list where each item is a word from that sentence.

Even though they don't address the same concern, `sentences` and `words` are obtained with a similar procedure, so we can create a new function:

In [None]:
product_review = '''This is a fine milk, but the product
line appears to be limited in available colors. I
could only find white.''' 
sentence_pattern = re.compile(r'(.*?\.)(\s|$)', re.DOTALL) 
matches = sentence_pattern.findall(product_review) 

In [None]:
for match in matches:
    print(match[0])

In [None]:
import re


def get_matches_for_pattern(pattern, string):
    matches = pattern.findall(string)
    return [match[0] for match in matches]

product_review = '''This is a fine milk, but the product
line appears to be limited in available colors. I
could only find white.''' 

sentence_pattern = re.compile(r'(.*?\.)(\s|$)', re.DOTALL)
sentences = get_matches_for_pattern(sentence_pattern, product_review)

word_pattern = re.compile(r"([\w\-']+)([\s,.])?")
for sentence in sentences:
    words = get_matches_for_pattern(word_pattern, sentence)
    print(words)


In the example above, we can forget about the internal code of `get_matches_for_pattern`. In this example it is fairly simple to know what it does because it is so short. In longer code, however, abstraction helps you reduce cognitive load, because you don't have to remember what the function has inside (that is one reason to give your functions descriptive names).

# Encapsulation

Another very useful practice in OOP is _Encapsulation_. Encapsulation is the process of wrapping (or __encapsulating__) similar concerns AND data into a larger construct. Very often you will see abstraction and encapsulation back to back. The difference is that abstraction shows the main functionality of a piece of code without worrying about its internal structure, whereas encapsulation consists of taking related functionalities and grouping them together into a larger construct.

> ## __Encapsulation__ is the procedure of bundling related functionalities together into a larger construct



Abstraction and Encapsulation are often confused, and the first time you see these terms, they look almost the same. However, let's see an example to understand the differences. We saw the AnimalScraper class, and that class was abstracted from the rest of the code. At first glance, this also looks like encapsulation, since after all we are bundling together a bunch of methods. However, the keyword here is the word _related_.

AnimalScraper grouped together some methods, but at a closer look, these methods are only related in the sense that they together form a pipeline (a series of steps). Nevertheless, _related_ methods don't necessarily have to be working in tandem. They can work separately, and then we orchestrate them using encapsulation.

Let's define two new functions: get_taxonomy and get_class. 

1. get_taxonomy will obtain a list of zoological synonyms (so you might find that animal in another webpage using a different name)
2. get_class will obtain the animal class (mammals, birds, amphibians, reptiles or fish)

In [None]:
import re
import requests
from typing import List
from bs4 import BeautifulSoup

def get_class(animal:str) -> str:
    ROOT = 'https://en.wikipedia.org/wiki/'
    r = requests.get(ROOT + animal)
    soup = BeautifulSoup(r.content, 'html.parser')
    class_row = soup.find('td', text = re.compile('Class:'))
    animal_class = class_row.find_next_sibling().text.strip()
    return animal_class

def get_taxonomy(animal:str) -> List:
    ROOT = 'https://en.wikipedia.org/wiki/'
    r = requests.get(ROOT + animal)
    soup = BeautifulSoup(r.content, 'html.parser')
    # Find the link whose text contains 'Synonyms'
    syn_text = soup.find('a', text = re.compile('Synonyms'))
    if syn_text:
        syn_header = syn_text.find_parent('tr')
        syn_table = syn_header.find_next_sibling()
        # Collect the text of every <i> tag in the following row
        contents = syn_table.find_all('i')
        # An empty result still yields a list, never None
        return [x.text for x in contents]
    # No 'Synonyms' link was found on the page
    return []

print(get_class('koala'))
print(get_taxonomy('koala'))


Notice that these functions are independent one from another, but their concerns are in the same field (extracting information about the animal). Thus, we could bundle them together under the same class, so next time we need information about an animal, we can go to that class and use the corresponding method.

In [None]:
class AnimalReporter:
    __priv = 2
    def __init__(self, animal: str):
        self.animal = animal
    
    def _say_hello_priv(self):
        print('Hi, Im a private method')
    
    def say_hello_public(self):
        print('Hi, Im a public method')
        self._say_hello_priv()

    def get_class(self) -> str:
        ROOT = 'https://en.wikipedia.org/wiki/'
        r = requests.get(ROOT + self.animal)
        soup = BeautifulSoup(r.content, 'html.parser')
        class_row = soup.find('td', text = re.compile('Class:'))
        animal_class = class_row.find_next_sibling().text.strip()
        return animal_class
    
    def get_taxonomy(self) -> List:
        ROOT = 'https://en.wikipedia.org/wiki/'
        r = requests.get(ROOT + self.animal)
        soup = BeautifulSoup(r.content, 'html.parser')
        syn_text = soup.find('a', text = re.compile('Synonyms'))
        if syn_text:
            syn_header = syn_text.find_parent('tr')
            syn_table = syn_header.find_next_sibling()
            contents = syn_table.find_all('i')
            # An empty result still yields a list, never None
            return [x.text for x in contents]
        # No 'Synonyms' link was found on the page
        return []

ar = AnimalReporter('koala')

In [None]:
ar._say_hello_priv()

There is something not very OOP here... I am tired of writing code, so it's your turn now to define two new methods:

1. _get_request(self) -> Union[bytes, str]
2. _get_soup(self, html: Union[bytes, str]) -> BeautifulSoup

Wait, why is there an underscore? One of the beautiful things about encapsulation is privacy. In many languages you can define protected variables and methods that the user can't access. Python doesn't technically enforce this: there is no truly protected method, but there is a convention that a single leading underscore means you shouldn't access or change that member from outside (Python trusts you not to do it). These protected methods are (or should be) only accessed within the class, or the module, as we will see later.

If you want a higher level of privacy, you can define private methods by adding two leading underscores. Python then mangles the name (`__name` becomes `_ClassName__name`), so the attribute or method can't be reached under its original name from outside the class. It is meant to be accessed only within the class.
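
A minimal sketch of the difference between one and two leading underscores is shown below. The class `Vault` and its attribute names are hypothetical and exist only for this illustration. Note that double underscores trigger name mangling rather than real access control, so the value is still reachable under its mangled name.

```python
class Vault:
    def __init__(self):
        self._protected = 'convention only'   # single underscore: "please don't touch"
        self.__private = 'name-mangled'       # double underscore: stored as _Vault__private

    def reveal(self) -> str:
        # Inside the class, the double-underscore name works as written
        return self.__private


vault = Vault()

# Nothing stops us from reading a single-underscore attribute
protected_value = vault._protected

# The double-underscore name does not exist under that spelling outside the class
try:
    vault.__private
    lookup_failed = False
except AttributeError:
    lookup_failed = True

# The mangled name is still reachable, so this is not true privacy
mangled_value = vault._Vault__private

print(protected_value, lookup_failed, mangled_value)
```

The takeaway is that both forms are signals to other programmers; the double underscore mainly protects against accidental name clashes in subclasses.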

> <font size=+1>Encapsulation sets boundaries for your methods, so they are private and only accessible within the class or module</font>

On the other hand, public methods are also called the interface, because those are the methods that are visible to the public.

Think about encapsulation as building walls around your class. Private methods are within the walls, and public methods are the gates that give access to them.

Now, for real, it's your turn. _Tip_: read the type hints that I left to know what type of variables you should return.

In [None]:
from bs4 import BeautifulSoup
from typing import Union
from typing import List
import requests
import re

class AnimalReporter:
    def __init__(self, animal: str):
        self.animal = animal
    
    def _get_request(self) -> Union[bytes, str]:
        ROOT = 'https://en.wikipedia.org/wiki/'
        r = requests.get(ROOT + self.animal)
        return r.text

    def _get_soup(self, html: Union[bytes, str]) -> BeautifulSoup:
        soup = BeautifulSoup(html, 'html.parser')
        return soup
        
    def get_class(self) -> str:
        html = self._get_request()
        soup = self._get_soup(html)
        class_row = soup.find('td', text = re.compile('Class:'))
        animal_class = class_row.find_next_sibling().text.strip()
        return animal_class
    
    def get_taxonomy(self) -> List:
        html = self._get_request()
        soup = self._get_soup(html)
        syn_text = soup.find('a', text = re.compile('Synonyms'))
        if syn_text:
            syn_header = syn_text.find_parent('tr')
            syn_table = syn_header.find_next_sibling()
            contents = syn_table.find_all('i')
            # An empty result still yields a list, never None
            return [x.text for x in contents]
        # No 'Synonyms' link was found on the page
        return []

ar = AnimalReporter('koala')
ar.get_class()

# Abstraction and Encapsulation 

Now you have two classes, AnimalScraper and AnimalReporter, and they are related in the sense that if we need data about an animal we can go to one of them. However, grouping them into the same class would be quite inefficient and vague. Instead, we can use a module to gather them into a single script. Modules are even higher-level than classes, and they are a type of encapsulation that groups multiple related classes and functions together.

> ### Modules are another type of encapsulation that bundles related functions or classes

<p align=center><img src=images/animal_module.png width=400></p>

Notice that we are using both abstraction and encapsulation to create this module. Usually, abstraction and encapsulation work together by grouping related functionalities and hiding the parts that don't matter to anyone else. This allows you to change the internal code rapidly without affecting the output.

If the difference is still not very clear, here is a table summarizing a comparison:

<p align=center><img src=images/Abstraction_vs_Encapsulation.png width=400></p>

# Inheritance and Polymorphism

> ## If it walks like a duck and it quacks like a duck, then it must be a duck

This is the principle of _duck typing_. In essence, it means that you don't have to explicitly specify the requirements that your objects have to meet: Python doesn't check an object's type in advance, it simply tries to use the attribute or method you asked for and raises an error only if it isn't there. Duck typing is typical of dynamically typed languages, like Python.

In [None]:
class DuckTest:
    quack = 2
    def quake(self):
        print('Should I quack?')

duck = DuckTest()
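# DuckTest has no quack() method (only quake()), but Python does not check that in advance
# It simply looks up the name `quack` on the object when this line runs
duck.quack
# The lookup succeeds because `quack` exists as a class attribute (the integer 2)
# If the name did not exist at all, Python would raise an AttributeError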

Thanks to duck typing, Python achieves a degree of polymorphism, which is a way of providing specialized behaviour behind a consistent method name.
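
To make this concrete, here is a minimal sketch of duck typing with unrelated classes. The names `Duck`, `Robot`, `Stone`, and `make_it_quack` are hypothetical. The function never checks what type it receives; it only tries to call `quack()`.

```python
class Duck:
    def quack(self) -> str:
        return 'Quack!'


class Robot:
    # Not related to Duck in any way, but it has the same method name
    def quack(self) -> str:
        return 'Beep... quack.'


class Stone:
    # No quack method at all
    pass


def make_it_quack(thing) -> str:
    # No isinstance check: anything with a quack() method is accepted
    return thing.quack()


results = [make_it_quack(Duck()), make_it_quack(Robot())]

try:
    make_it_quack(Stone())
    failure = None
except AttributeError as error:
    # Python only complains at the moment the missing method is used
    failure = str(error)

print(results)
print(failure)
```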

> ## Polymorphism is the procedure by which the same method presents different behaviour

The classical example is the Animal class that speaks:

In [None]:
class Animal:
    def __init__(self, name: str):
        self.name = name
    def speak(self):
        print(f'My name is {self.name}')

class Dog(Animal):
    def speak(self):
        print('Woof!')

class Cat(Animal):
    def speak(self):
        print('Meow')

jake = Dog('Jake')
felix = Cat('Felix')
jake.speak()
felix.speak()


This is a basic type of polymorphism: Cat and Dog inherit from Animal, and they override the `.speak()` method. So you have two objects with the same method name, but the method is doing something different depending on the instance that called it.
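
One practical consequence, sketched below, is that calling code can treat every animal the same way. The classes are redefined here so the example is self-contained, `speak` returns a string instead of printing (a small change made only so the results can be collected), and the `Fish` class is a hypothetical addition.

```python
class Animal:
    def __init__(self, name: str):
        self.name = name

    def speak(self) -> str:
        return f'My name is {self.name}'


class Dog(Animal):
    def speak(self) -> str:
        return 'Woof!'


class Cat(Animal):
    def speak(self) -> str:
        return 'Meow'


class Fish(Animal):
    # No override: Fish falls back to Animal.speak
    pass


animals = [Dog('Jake'), Cat('Felix'), Fish('Nemo')]
# The loop never checks the type; each object uses its own version of speak()
sounds = [animal.speak() for animal in animals]
print(sounds)
```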

Here, polymorphism is achieved through inheritance. However, inheritance can present problems if you are not careful! For example, take a look at the next figure:

<p align=center><img src=images/Inheritance.png width=500></p>

In this case, the class Dog inherits from Canine, and Canine in turn inherits from Quadruped, which in turn inherits from Mammal. That sounds good, right? A dog is a canine, a quadruped, and a mammal, but wait... According to this, all quadrupeds are mammals, and of course that's not always true! We need to find an alternative to this rigid structure.
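
The problem can be reproduced in a few lines. The method names `feeds_milk` and `number_of_legs`, and the `Lizard` class, are hypothetical and only serve to make the issue visible.

```python
class Mammal:
    def feeds_milk(self) -> bool:
        return True


class Quadruped(Mammal):
    def number_of_legs(self) -> int:
        return 4


class Lizard(Quadruped):
    # A lizard is a quadruped, so this looks reasonable...
    pass


lizard = Lizard()
legs = lizard.number_of_legs()   # inherited from Quadruped, as intended
milk = lizard.feeds_milk()       # inherited from Mammal, which is wrong for a lizard
print(legs, milk)
```

Because `Quadruped` sits below `Mammal` in the chain, every quadruped silently picks up mammal behaviour.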

Before you scroll down, try to think of a solution (you don't have to be very technical)



## Using Composition

Here is the solution: instead of creating a rigid inheritance structure, we can leverage Python's duck typing and multiple inheritance. As we saw, Python allows a class to inherit from several classes at once.

Following this principle, composition is a more flexible alternative to inheritance. It is possible to create a class that contains characteristics from many parent classes, so we can use that feature to ONLY inherit what we want.

_Consider composition as pieces of a Lego set. We can combine these pieces to create a complex object. But those pieces can also be used to create a different object._

<p align=center><img src=images/composition.png width=500></p>

> ## Composition is the converse of decomposition: pieces with different functionalities are combined to create a whole.

Many languages implement composition through interfaces, which are formal definitions of methods and data that a particular class MUST implement. Python has no dedicated interface construct, but by using multiple inheritance we can build a similar mechanism, which in Python is referred to as a mixin.

A mixin is a class that provides methods to other classes but is not meant to be used as a base class on its own. For example, a dog can speak and roll_over, and eventually you will want to create another class that can speak and roll_over. You can therefore create small classes whose only purpose is to be inherited, adding the speaking and rolling-over abilities to other objects.

In [None]:
class SpeakMixin:
    def speak(self):
        name = self.__class__.__name__.lower()
        print(f'The {name} says: "hello... I mean... woof!"')


class RollOverMixin:
    def roll_over(self):
        print('Look at me, I am rolling!')


class Dog(RollOverMixin, SpeakMixin):
    pass

class Cat(SpeakMixin):
    pass

jake = Dog()
jake.speak()
jake.roll_over()


This structure is incredibly useful when we are dealing with classes that share several behaviours, and we want to keep some of these behaviours separated.

# Composition in Python

You are likely to encounter other implementations of composition in other books. Due to the non-strict behaviour of Python, some concepts that are characteristic of other languages work differently in Python.

Composition is one of these terms. You might find other resources using this term to refer to a class that instantiates another class inside it. For example:

In [None]:
class Leg:

    def __init__(self, position):
        self.position = position

    def __repr__(self):
        return f'I am the {self.position} leg'

class Dog:
    def __init__(self, name):
        self.name = name
        self.back_left_leg = Leg('Back_Left')
        self.back_right_leg = Leg('Back_Right')
        self.front_left_leg = Leg('Front_Left')
        self.front_right_leg = Leg('Front_Right')

You can see that the goal of both types of composition is similar: adding features to classes without resorting to a strict inheritance structure. The way they do it, however, is quite different. Also, think about what happens if you delete an instance of the Dog class in this type of composition: all of its Leg instances are deleted too, which makes this relationship a _tight coupling_.
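
A small experiment can make this visible. The sketch below uses `weakref` to observe a `Leg` without keeping it alive, and the classes are trimmed to a single leg for brevity. It relies on CPython's reference counting (with an explicit `gc.collect()` as a precaution), so treat the exact timing of the cleanup as implementation-dependent.

```python
import gc
import weakref


class Leg:
    def __init__(self, position):
        self.position = position


class Dog:
    def __init__(self, name):
        self.name = name
        # The Leg is created inside Dog, so Dog holds the only reference to it
        self.front_left_leg = Leg('Front_Left')


dog = Dog('Jake')
# A weak reference lets us watch the leg without keeping it alive
leg_watcher = weakref.ref(dog.front_left_leg)
alive_before = leg_watcher() is not None

del dog
gc.collect()
# With the Dog gone, nothing references the Leg anymore
alive_after = leg_watcher() is not None
print(alive_before, alive_after)
```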

To solve that issue, you can use __Aggregation__



# Aggregation

If, instead of instantiating Leg inside the class, you pass the Leg instances to the constructor as an argument, there will be no problem if you delete the Dog instance:

In [None]:
class Leg:

    def __init__(self, position):
        self.position = position

    def __repr__(self):
        return f'I am the {self.position} leg'

back_left_leg = Leg('Back_Left')
back_right_leg = Leg('Back_Right')
front_left_leg = Leg('Front_Left')
front_right_leg = Leg('Front_Right')

list_legs = [back_left_leg, back_right_leg, front_left_leg, front_right_leg]

class Dog:
    def __init__(self, name, list_legs):
        self.name = name
        self.legs = list_legs


If you remove an instance of Dog, list_legs will still exist. Now, having some loose legs walking around is another problem, but that's not our concern.

Speaking of loose legs, this type of relationship is called _loose coupling_, where instances are not so dependent on each other.
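
The same kind of experiment shows the difference for aggregation. This is a sketch with two legs only, and it again uses `weakref` purely as an observation tool: because the legs are created outside `Dog`, the outer list still references them after the dog is deleted.

```python
import gc
import weakref


class Leg:
    def __init__(self, position):
        self.position = position


class Dog:
    def __init__(self, name, list_legs):
        self.name = name
        # Dog only stores a reference to legs that were created elsewhere
        self.legs = list_legs


list_legs = [Leg('Front_Left'), Leg('Front_Right')]
dog = Dog('Jake', list_legs)
leg_watcher = weakref.ref(list_legs[0])

del dog
gc.collect()
# The legs outlive the dog because list_legs still holds them
still_alive = leg_watcher() is not None
print(still_alive, list_legs[0].position)
```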

# UML diagrams

Unified Modeling Language diagrams are a way to represent the relationships between the pieces that constitute your code. The reason for using UML diagrams is to keep all the dependencies mapped, so you know how to access a specific class, method or function just by looking at the diagram.

UML diagrams have existed for a long time, and as such, they were designed for older programming languages. Thus, the composition term we are going to see is based on the latter definition of composition we saw, where there is a tight coupling between classes.

The next image represents the basic syntax of a class in a UML diagram:
<p align=center><img src=images/UML1.png width=500></p>



As we start adding more classes to the project, they will have relationships, like inheritance, composition or aggregation, and each of these is represented with a different arrow:
<p align=center><img src=images/UML2.png width=500></p>

To finish off, let's take a look at a real life UML diagram
<p align=center><img src=images/UML3.png width=600></p>


# Summary

- Abstraction is a tool to hide complexity, so the user doesn't need to be aware of the implementation details.
- Encapsulation is a tool to group related functionalities together.
- Inheritance and Polymorphism are useful tools, but be aware of rigid inheritance structures.
- Composition can solve said problem by building a wider structure.
- Creating UML diagrams can help you obtain information about the structure of your program.

# Practical

## Mixin class for private methods
1. Create a mixin class named AsDictMixin 
2. This class will be just inherited, so don't use a constructor for it
3. You just need to define the following method: `to_dict(self)` which returns a `dict` representation of the object that inherits this mixin class.
4. You might want to use the `__dict__` attribute, which holds a dictionary of an object's instance attributes.
5. The class should look like this:

In [None]:
class AsDictMixin:
    def to_dict(self):
        ### Your code here
        pass
    def _represent(self, value):
        if isinstance(value, object):
            if hasattr(value, 'to_dict'):
                return value.to_dict()
            else:
                return str(value)
        else:
            return value

    def _is_internal(self, prop):
        return prop.startswith('_')

#### So when running the following code, the to_dict() method doesn't return private attributes.

```
class Person(AsDictMixin):
    def __init__(self, name, address, salary):
        self.name = name
        self.address = address
        self._salary = salary

ivan = Person('Ivan', 'London', '100000000')
ivan.to_dict()
```

{'name': 'Ivan', 'address': 'London'} (No salary is shown, because it's private)

# Assessment

### 1. Look up information about modules, packages, and how they are organized (we will see more on this in the next sections, so just read about them)
### 2. How does encapsulation benefit from modules? 
### 3. How does encapsulation benefit from packages?