# Session 10 - Object Oriented Programming

## Notions of object oriented programming
Object-oriented programming is a programming paradigm that provides a means of structuring programs so that properties and behaviors are bundled into individual objects (definition from [Real Python](https://realpython.com/python3-object-oriented-programming/)).

For instance, an object could represent the results of an experiment, with properties such as the date, the acquisition parameters and the raw data, together with associated transformations (e.g. plotting the data or computing a given feature).
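As a rough sketch of that idea, the class below bundles a few properties with one behavior. The name `ExperimentResult`, its attributes and the values used are all made up for illustration, and the class syntax itself is detailed in the "Creating a class" section.

```python
import statistics


class ExperimentResult:
    "Hypothetical container for the results of one experiment."

    def __init__(self, date, parameters, raw_data):
        self.date = date              # property: acquisition date
        self.parameters = parameters  # property: acquisition parameters
        self.raw_data = raw_data      # property: measured values

    def mean(self):
        # Behavior: compute a feature from the raw data
        return statistics.mean(self.raw_data)


# Illustrative values only, not real measurements
result = ExperimentResult("2024-01-15", {"gain": 2}, [1.0, 2.0, 3.0])
print(result.date, result.mean())  # 2024-01-15 2.0
```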

Just like Monsieur Jourdain, who had been speaking prose all his life without knowing it, you have been doing object-oriented programming since the beginning! Let's look, for instance, at the `np.ndarray` object you have been manipulating for quite some time:

In [None]:
import matplotlib.pyplot as plt
import numpy as np

In [None]:
arr = np.array([1, 2, 3, 4, 5, 6])  # initialize an array from a list
arr

In [None]:
arr.shape  # property/attribute of np.ndarray

In [None]:
np.reshape(arr, (3, 2))  # function that can take a np.ndarray as input

In [None]:
arr.reshape(3, 2)  # np.ndarray method

In [None]:
arr + 1  # special behavior of the "+" sign

### Creating a class
- Methods whose names start and end with a double underscore (e.g. `__init__`, `__len__`) have a special meaning in Python. They are what make operations such as `+` or `len(...)` work on an object.
- The `__init__` method in particular is called when running `ClassName(...)` and initializes the newly created object.
- The first argument of a class's methods, traditionally called `self`, is a reference to the object itself.
- The `property` decorator makes the result of a method accessible with the attribute syntax (without parentheses).
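The points above can be illustrated with a minimal sketch. `Vector2D` is a hypothetical class written only for this example; it is not part of the exercise.

```python
class Vector2D:
    "Hypothetical 2D vector illustrating special methods and properties."

    def __init__(self, x, y):
        # Called by Vector2D(x, y); `self` is the object being initialized
        self.x = x
        self.y = y

    def __add__(self, other):
        # Called when evaluating `self + other`
        return Vector2D(self.x + other.x, self.y + other.y)

    def __len__(self):
        # Called by len(self); a 2D vector always has two components
        return 2

    def __repr__(self):
        # Text shown when the object is displayed
        return f"Vector2D({self.x}, {self.y})"

    @property
    def norm(self):
        # Accessed as `v.norm`, without parentheses
        return (self.x ** 2 + self.y ** 2) ** 0.5


v = Vector2D(1, 4) + Vector2D(2, 0)
print(v, len(v), v.norm)  # Vector2D(3, 4) 2 5.0
```

Without `__repr__`, displaying the object would only show its class name and memory address.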

In [None]:
from matplotlib.patches import Rectangle as RectanglePlot


class Rectangle():
    "Class representing a rectangle"

    def __init__(self, h, w, origin_x=0., origin_y=0.):
        """Instantiate a rectangle from a height, a width and an origin.
        
        The origin here corresponds to the bottom-left point.
        """
        self.h = h  # height
        self.w = w  # width
        self.origin = [origin_x, origin_y]

    @property
    def area(self):
        "Return the area of the rectangle"
        return self.h * self.w    
    
    def translate(self, x=0., y=0.):
        "Translate the rectangle."
        self.origin[0] += x
        self.origin[1] += y
    
    def get_corners(self):
        """Return a list of the coordinates of the rectangle's corners.
        
        The coordinates are given from bottom to top then from left to right.
        """
        bottom_left = self.origin
        ... # TODO
        return [bottom_left, bottom_right, top_left, top_right]
    
    def plot(self, ax=None, **kwargs):
        if ax is None:
            ax = plt.gca()  # grab the currently active axis
        rect = RectanglePlot(self.origin, self.w, self.h, **kwargs)
        ax.add_patch(rect)
    
    def adjust_plot_limits(self, ax):
        "Change the limits of an input ax to match the rectangle's dimension"
        # TODO
        ax.set_xlim([..., ...])
        ax.set_ylim([..., ...])


In [None]:
my_rect = Rectangle(4, 2)
my_rect.translate(x=2)
my_rect

In [None]:
my_rect.h

In [None]:
my_rect.origin

In [None]:
my_rect.area  # result of a method with the property syntax

In [None]:
fig, ax = plt.subplots()
my_rect.plot(ax=ax, color="pink")  # Note the kwarg passed to RectanglePlot
# Change the limits of the x and y axis
ax.set_xlim([0, 6])
ax.set_ylim([-1, 5])

**TODO**
- Complete the method `get_corners` above so that it returns the coordinates of the four corners of the rectangle.
- Complete `adjust_plot_limits` using the corners' coordinates so that the rectangle is visible when plotting it.
- Add a new method called `rescale` which rescales the height and width by a common factor given as input.
- **Optional**: create a new method called `rotate` which takes a center and an angle as input and rotates the rectangle. Adapt the `plot` method accordingly.

### Class inheritance
All the properties and methods of a given class can be inherited by a new class using the following syntax:

In [None]:
class MyCustomList(list):
    "Example of an extension of the built-in list"
    
    def linear_transformation(self, scale, offset=0.):
        "Apply a linear transformation (multiply by a scale and add an offset)"
        return [e * scale + offset for e in self]
    
# Same __init__ as a list !
l = MyCustomList([1, 2, 42])
l

In [None]:
l.linear_transformation(3, -1)
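A subclass can also redefine an inherited method and still call the parent's version through `super()`. The sketch below assumes a hypothetical `BoundedList` class, invented for this example, that refuses to grow past a maximum size.

```python
class BoundedList(list):
    "Hypothetical list subclass that refuses to grow past a maximum size."

    def __init__(self, iterable=(), max_size=3):
        super().__init__(iterable)  # reuse the initialization of list
        self.max_size = max_size

    def append(self, item):
        # Override list.append, then delegate to the parent implementation
        if len(self) >= self.max_size:
            raise ValueError("list is full")
        super().append(item)


b = BoundedList([1, 2])
b.append(3)  # allowed: the list now holds three items
print(b, len(b))  # [1, 2, 3] 3
```

Calling `b.append(4)` at this point would raise a `ValueError`, while every other `list` method (indexing, `len`, iteration...) is inherited unchanged.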

## Exercise: Reader for an aln file
Using the data and functions already implemented during **session 6** (originally inspired by some content by [Antonin Affholder](https://www.ens.psl.eu/actualites/antonin-affholder), data by [Guillaume Borrel](https://research.pasteur.fr/en/member/guillaume-borrel/) from Institut Pasteur), we will create our own `class` for parsing and processing aln files. 

### Data loading

In [None]:
import aln_parser
import numpy as np
import pandas as pd
from pathlib import Path

In [None]:
# Path to the data
path_aln = Path.cwd() / "data" / "Ftr_A.aln"
path_metadata = Path.cwd() / "data" / "metadata.csv"

# Check that the path exists
path_aln.exists()

In [None]:
data = aln_parser.file_parsing.load_aln_data(path_aln)
data.head(5)

In [None]:
metadata = pd.read_csv(path_metadata)
metadata.head(5)

### Number of substitutions between two sequences

In [None]:
import aln_parser.substitutions

In [None]:
# 1st and 2nd columns of the dataframe
seq1 = data.iloc[:, 0]
seq2 = data.iloc[:, 1]
seq1

In [None]:
aln_parser.substitutions.compute_substitution_count(seq1, seq2)

### Substitution matrix between two sets of sequences

In [None]:
multiple_sequences_1 = data.iloc[:, :10].values.T
multiple_sequences_2 = data.iloc[:, 10:20].values.T

multiple_sequences_1

In [None]:
aln_parser.substitutions.compute_substition_matrix(multiple_sequences_1, multiple_sequences_2)

### TODO: A class for parsing the data and visualizing a substitution matrix

In [None]:
class AlnParser():
    "Object used for parsing and visualizing data from an aln file."

    def __init__(self, path):
        "Read an aln file and store its content as a dataframe."
        self.path = path
        # TODO
        pass
    
    @property
    def len_sequence(self):
        "Return the length of the amino-acid sequences."
        # TODO
        pass
    
    def __len__(self):
        "Return the number of organisms in the stored data"
        # TODO
        pass
    
    def substitution_count(self, organism_1, organism_2):
        "Return the substitution count for two organisms"
        # TODO
        pass
    
    def substitution_matrix(self, organism_1_all, organism_2_all):
        "Return a substitution matrix for two lists of organisms"
        # TODO
        pass
    
    def plot_substitution_matrix(self):
        "Plot the substitution matrix for the data"
        # TODO
        pass
    

## Aln file parsing script
Copy and paste your `AlnParser` class into the `__init__.py` file of the `aln_parser` folder.

Write a Python script that takes as input the path to an aln file and an output path, parses the aln file and saves a figure of the resulting substitution matrix at the specified output path.
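One possible way to read the two paths from the command line is the standard `argparse` module. The sketch below only covers argument handling; the argument names `aln_path` and `output_path` and the helper `parse_args` are assumptions made for this example, and the parsing and plotting steps are left to your `AlnParser` class.

```python
import argparse
from pathlib import Path


def parse_args(argv=None):
    "Parse the command-line arguments (argument names are illustrative)."
    parser = argparse.ArgumentParser(
        description="Save the substitution matrix of an aln file as a figure."
    )
    parser.add_argument("aln_path", type=Path, help="path to the input aln file")
    parser.add_argument("output_path", type=Path, help="path of the figure to save")
    # With argv=None, argparse reads the arguments from the command line
    return parser.parse_args(argv)


# Simulate a call such as: python script.py data/Ftr_A.aln matrix.png
# ("matrix.png" is a made-up output name)
args = parse_args(["data/Ftr_A.aln", "matrix.png"])
print(args.aln_path, args.output_path)
```

In the actual script, `parse_args()` would be called without arguments so that the paths come from the command line, and the figure could then be saved with matplotlib's `savefig`.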