## Object-oriented programming
### Creating your own object types


You can do perfectly good data science _without_ ever writing a `class`. 

However, using `Object-Oriented Programming` can make your data science <u>easier to write</u>, <u>easier to read</u>, and <u>more intuitive</u> while also making it **more shareable/extensible**.

---
#### Object-Oriented Programming

Whenever you code in Python, keep asking yourself the same two questions throughout your workflow: "What do I have?" and "What do I need?" While working on the subcomponents of a function, also ask yourself: "What ***kind*** of object am I working with, and what does it do?"

In Python, ***EVERYTHING*** is an object!

In [1]:
# Different types

type(list)

type

In [3]:
type({1:"d"})

dict

In [4]:
dir(list)

['__add__',
 '__class__',
 '__contains__',
 '__delattr__',
 '__delitem__',
 '__dir__',
 '__doc__',
 '__eq__',
 '__format__',
 '__ge__',
 '__getattribute__',
 '__getitem__',
 '__gt__',
 '__hash__',
 '__iadd__',
 '__imul__',
 '__init__',
 '__init_subclass__',
 '__iter__',
 '__le__',
 '__len__',
 '__lt__',
 '__mul__',
 '__ne__',
 '__new__',
 '__reduce__',
 '__reduce_ex__',
 '__repr__',
 '__reversed__',
 '__rmul__',
 '__setattr__',
 '__setitem__',
 '__sizeof__',
 '__str__',
 '__subclasshook__',
 'append',
 'clear',
 'copy',
 'count',
 'extend',
 'index',
 'insert',
 'pop',
 'remove',
 'reverse',
 'sort']
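
To make the "everything is an object" claim a bit more concrete, here is a small sketch (the function `double` is a made-up example, not part of the lesson) showing that numbers, strings, functions, and even types themselves are instances of `object` and carry their own methods and attributes.

```python
def double(x):
    """Return twice the input."""
    return 2 * x

# Numbers, strings, functions, and types are all objects
things = [3, "ACGT", double, list]
all_objects = all(isinstance(t, object) for t in things)
print(all_objects)

# Objects carry methods and attributes
print((3).bit_length())   # an int has methods
print(double.__name__)    # a function has attributes
print(double.__doc__)
```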

### So what _are_ objects?

<img width = 450 src='https://ih1.redbubble.net/image.9426655.9925/fc,550x550,silver.jpg'/>

---
### A New Frontier

Up to this point, we have used objects that were already defined for us. However, we are not limited to those: we can *make* our own objects with the `class` keyword.

<img src='https://ds055uzetaobb.cloudfront.net/image_optimizer/9996aa83f77a2837f41a4de7f2ab517168716532.png' width = 500/>

Defining a `class` looks much like defining a function with `def`. Later on, we also get to play around with some of those 'dunder' (\_\_) methods we have been steering you away from.

### First, the syntax

<img src='class_def.png' width=700 align='left'/>

### The Big Idea 
> The idea behind objects is to **bundle** coherent <u>methods</u> (things the object can _do_) and <u>attributes</u> (things the object _has_) that logically go together into a well-defined _interface_.

They are a data abstraction that has 2 main jobs:
1. Captures internal *representation* of the data it is abstracting
2. Creates an *interface* for the abstracted data

#### Let's think about a gene and basic information we want to store about it.
We could make a dictionary:

```python

{"symbol": "Gene1", "seq": "AACGT"}

```


While this structure works perfectly fine, it becomes hard to keep track of everything once we add new elements or functions specific to the gene.
It makes sense to wrap all of these things up into a single object that is easier to manage and extend.
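
As a rough illustration of where the dictionary approach starts to strain, the sketch below keeps the data in a dictionary and the behavior in a separate function. The helper `get_startseq` and the misspelled key are hypothetical, chosen only to show the problem.

```python
# A gene stored as a plain dictionary
gene = {"symbol": "Gene1", "seq": "AACGT"}

# Behavior lives in a separate, free-standing function (hypothetical helper)
def get_startseq(gene_dict, n):
    """Return the first n bases of the gene's sequence."""
    return gene_dict["seq"][:n]

start = get_startseq(gene, 2)
print(start)

# Nothing stops a misspelled key from silently creating a new entry
gene["sequnce"] = "TTTT"
print(sorted(gene.keys()))
```

The data and the function that operates on it are only connected by convention, and the dictionary accepts any key without complaint.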

#### Let's make a `Gene` object 

- <font color = "red">Use the `__init__` method to define instance attributes</font>
- Write functions in the class for extra functionality
- Use `self` to refer to the object you are creating



In [5]:
class Gene:
    def __init__(self, psymbol = "Gene1", pseq = "AACGT"):
        self.symbol = psymbol
        self.sequence = pseq
        
    def __str__(self):
        return self.symbol
    
    def get_startseq(self, n):
        return self.sequence[:n]
  

---
We need to take a second to talk about 3 things real quick:
1. Functions within `class`es (like `get_startseq`) are called ***methods*** or procedural attributes
2. `self.symbol` and `self.sequence` are called ***attributes*** since they only contain data
3. What in the world is `self`?

**PS**: `self` is a parameter that lets an object refer to itself, specifically to the current *instance*. Outside of the class definition, however, you never pass `self` explicitly: Python supplies it when you call a method on an instance.
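
One way to see what `self` is doing: calling a method on an instance is equivalent to calling the function through the class and passing the instance as the first argument. The sketch below repeats the `Gene` class from the cell above so that it runs on its own.

```python
class Gene:
    def __init__(self, psymbol="Gene1", pseq="AACGT"):
        self.symbol = psymbol
        self.sequence = pseq

    def __str__(self):
        return self.symbol

    def get_startseq(self, n):
        return self.sequence[:n]

g = Gene()

# Usual call: Python passes g as `self` automatically
via_instance = g.get_startseq(2)

# Equivalent explicit call through the class
via_class = Gene.get_startseq(g, 2)

print(via_instance, via_class)
```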

Let's use our new class.

In [6]:
help(Gene)

Help on class Gene in module __main__:

class Gene(builtins.object)
 |  Gene(psymbol='Gene1', pseq='AACGT')
 |  
 |  Methods defined here:
 |  
 |  __init__(self, psymbol='Gene1', pseq='AACGT')
 |      Initialize self.  See help(type(self)) for accurate signature.
 |  
 |  __str__(self)
 |      Return str(self).
 |  
 |  get_startseq(self, n)
 |  
 |  ----------------------------------------------------------------------
 |  Data descriptors defined here:
 |  
 |  __dict__
 |      dictionary for instance variables (if defined)
 |  
 |  __weakref__
 |      list of weak references to the object (if defined)



In [7]:
dir(Gene)

['__class__',
 '__delattr__',
 '__dict__',
 '__dir__',
 '__doc__',
 '__eq__',
 '__format__',
 '__ge__',
 '__getattribute__',
 '__gt__',
 '__hash__',
 '__init__',
 '__init_subclass__',
 '__le__',
 '__lt__',
 '__module__',
 '__ne__',
 '__new__',
 '__reduce__',
 '__reduce_ex__',
 '__repr__',
 '__setattr__',
 '__sizeof__',
 '__str__',
 '__subclasshook__',
 '__weakref__',
 'get_startseq']

In [None]:
type(Gene)

In [8]:
g = Gene()

In [9]:
g

<__main__.Gene at 0x110de3370>

In [10]:
str(g)

'Gene1'

In [11]:
g.sequence

'AACGT'

In [12]:
g.symbol

'Gene1'

In [14]:
help(g.get_startseq)

Help on method get_startseq in module __main__:

get_startseq(n) method of __main__.Gene instance



In [15]:
g.get_startseq(2)

'AA'

In [16]:
g.sequence = "AACTTGAA"

In [17]:
g.sequence

'AACTTGAA'

In [18]:
len(g)

TypeError: object of type 'Gene' has no len()

____

#### Let's add more functionality to the `Gene` class
- Implement specific dunder methods for added functionality 
    - the name of each dunder method reflects the function/operation it supports (e.g. `__len__` for `len()`, `__add__` for `+`)
- Use `@property` to set up read-only attributes
- Add documentation using docstrings

In [40]:
# Gene object
class Gene:
    """
    Contains the information about a Gene such as the symbol, description,
    exon number, and sequence 
    
    Attributes:
    symbol (str):  gene symbol
    description (str):  gene description
    exon_no (int): total number of exons for the gene
    sequence (str): the gene sequence
    status (str): the gene status, one of 'new', 'current', 'deprecated', 'in process' 
    """
    def __init__(self, psymbol = "Gene1", pdesc = "Gene for testing", 
                 pexon_no = 1, pseq = "AACGT"):
        self.symbol = psymbol
        self.description = pdesc
        self.exon_no = pexon_no
        self.sequence = pseq
        self.__status = "current"
        
    def __str__(self):
        return f"Hello Gene('{self.symbol}','{self.description}',{self.exon_no},'{self.sequence}')"
    
    def __repr__(self):
        return f"Gene('{self.symbol}','{self.description}',{self.exon_no},'{self.sequence}')"

    def __len__(self):
        return len(self.sequence)
    
    def __add__(self, gene):
        new_gene = Gene(self.symbol + gene.symbol,
                    self.description + gene.description,
                    self.exon_no + gene.exon_no,
                    self.sequence + gene.sequence)
        new_gene.update_status("new")
        return new_gene

        
    @property # getter
    def status(self):
        """
        Get the status for the gene
        """
        return self.__status
    
    # @status.setter # property setter - same name as the property
    # def status(self, pstatus):
    #    self.__status = pstatus
             
    def update_status(self, pstatus):
        """
        Updates the status of a gene
        """
        self.__status = pstatus
        
        

In [41]:
# Use Gene - explore
# create/init Gene objects, we need the __init__ method
# The constructor without arguments works if we have default values
# The cell will use the __repr__ method to display the representation of the object

g = Gene()
g

Gene('Gene1','Gene for testing',1,'AACGT')

In [42]:
g.update_status("deprecated")
print(g.status)
g.status = "current"
g.status


deprecated


AttributeError: can't set attribute

_____

See, I didn't need to use `self` on the outside.
_____

Wait, with just that, we made a new object? I don't believe you...

In [43]:
# what did we create? type()
type(g)



__main__.Gene

In [44]:
help(g)

Help on Gene in module __main__ object:

class Gene(builtins.object)
 |  Gene(psymbol='Gene1', pdesc='Gene for testing', pexon_no=1, pseq='AACGT')
 |  
 |  Contains the information about a Gene such as the symbol, description,
 |  exon number, and sequence 
 |  
 |  Attributes:
 |  symbol (str):  gene symbol
 |  description (str):  gene description
 |  exon_no (int): total number of exons for the gene
 |  sequence (str): the gene sequence
 |  status (str): the gene status, one of 'new', 'current', 'deprecated', 'in process'
 |  
 |  Methods defined here:
 |  
 |  __add__(self, gene)
 |  
 |  __init__(self, psymbol='Gene1', pdesc='Gene for testing', pexon_no=1, pseq='AACGT')
 |      Initialize self.  See help(type(self)) for accurate signature.
 |  
 |  __len__(self)
 |  
 |  __repr__(self)
 |      Return repr(self).
 |  
 |  __str__(self)
 |      Return str(self).
 |  
 |  update_status(self, pstatus)
 |      Updates the status of a gene
 |  
 |  ---------------------------------------

In [45]:
# we need the __str__ method to print the object value

print(g)



Hello Gene('Gene1','Gene for testing',1,'AACGT')


In [46]:
# dir
dir(g)


['_Gene__status',
 '__add__',
 '__class__',
 '__delattr__',
 '__dict__',
 '__dir__',
 '__doc__',
 '__eq__',
 '__format__',
 '__ge__',
 '__getattribute__',
 '__gt__',
 '__hash__',
 '__init__',
 '__init_subclass__',
 '__le__',
 '__len__',
 '__lt__',
 '__module__',
 '__ne__',
 '__new__',
 '__reduce__',
 '__reduce_ex__',
 '__repr__',
 '__setattr__',
 '__sizeof__',
 '__str__',
 '__subclasshook__',
 '__weakref__',
 'description',
 'exon_no',
 'sequence',
 'status',
 'symbol',
 'update_status']
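
The `_Gene__status` entry at the top of this listing comes from Python's name mangling: an attribute written as `self.__status` inside a class body is stored under the name `_ClassName__status`. A minimal sketch, using a stripped-down class (`MiniGene` is a hypothetical name used only here):

```python
class MiniGene:
    def __init__(self):
        self.__status = "current"   # stored as _MiniGene__status

    @property
    def status(self):
        return self.__status

mg = MiniGene()

print(mg.status)
print("_MiniGene__status" in vars(mg))

# The unmangled name does not exist outside the class body
try:
    mg.__status
except AttributeError:
    print("no attribute named __status from the outside")

# Mangling helps avoid accidental access; it is not real privacy
print(mg._MiniGene__status)
```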

In [47]:
# check attributes
g.exon_no


1

In [48]:
g.sequence

'AACGT'

In [49]:
# check the status

g.status

'deprecated'

In [50]:
# try to change the status
g.status = "test"


AttributeError: can't set attribute

In [51]:
# Use the update_status method to change the status of the gene
g.update_status("current")


In [52]:
g.status

'current'
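
The commented-out `@status.setter` in the class definition is the other route to updating the status: with a setter in place, plain assignment (`g.status = ...`) works, and the setter can validate the value first. A minimal sketch, assuming the allowed values are the four listed in the class docstring; `StatusGene` and `VALID_STATUSES` are hypothetical names for illustration.

```python
class StatusGene:
    VALID_STATUSES = ("new", "current", "deprecated", "in process")

    def __init__(self, psymbol="Gene1"):
        self.symbol = psymbol
        self.__status = "current"

    @property
    def status(self):
        """Get the status for the gene"""
        return self.__status

    @status.setter
    def status(self, pstatus):
        # Reject anything outside the documented set of statuses
        if pstatus not in self.VALID_STATUSES:
            raise ValueError(f"unknown status: {pstatus!r}")
        self.__status = pstatus

sg = StatusGene()
sg.status = "deprecated"      # goes through the setter
print(sg.status)

try:
    sg.status = "test"        # not in VALID_STATUSES
except ValueError as err:
    print(err)
```

Whether to expose a setter or an explicit method such as `update_status` is a design choice; the setter keeps the attribute-style syntax while still controlling what gets stored.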

In [53]:
# get the gene length - works only if __len__ is implemented

len(g)


5

In [54]:
g.sequence

'AACGT'

In [55]:
g.sequence = "TTTTGGAGTA"

In [56]:
len(g)

10

In [57]:
# add two genes - works only if __add__ is implemented

g1 = Gene("GeneToAdd", "The gene I use to test add", 3, "CCGTAACCAA")

In [58]:
g1

Gene('GeneToAdd','The gene I use to test add',3,'CCGTAACCAA')

In [59]:
g

Gene('Gene1','Gene for testing',1,'TTTTGGAGTA')

In [60]:
g + g1

Gene('Gene1GeneToAdd','Gene for testingThe gene I use to test add',4,'TTTTGGAGTACCGTAACCAA')
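
As written, `__add__` assumes the right-hand operand is a `Gene`; adding something else (say, an integer) would fail with an `AttributeError` from inside the method. A common convention is to return `NotImplemented` for unsupported operands so that Python raises the usual `TypeError`. A sketch with a pared-down class (`SeqGene` is a hypothetical name):

```python
class SeqGene:
    def __init__(self, psymbol="Gene1", pseq="AACGT"):
        self.symbol = psymbol
        self.sequence = pseq

    def __repr__(self):
        return f"SeqGene('{self.symbol}','{self.sequence}')"

    def __add__(self, other):
        # Only another SeqGene can be added
        if not isinstance(other, SeqGene):
            return NotImplemented
        return SeqGene(self.symbol + other.symbol,
                       self.sequence + other.sequence)

combined = SeqGene("A", "AAC") + SeqGene("B", "GT")
print(combined)

try:
    SeqGene() + 5
except TypeError as err:
    print("TypeError:", err)
```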

In [61]:
# multiply two genes - works only if __mul__ is implemented

g1 * g


TypeError: unsupported operand type(s) for *: 'Gene' and 'Gene'

____

### Expanding classes 
#### Create a general base class then a more specific child class


Design an object called `Cell`:
1. Takes three attributes: 
    - `type`: epithelial, connective, muscle, or nervous
    - `organism`: human, mouse, ....
    - `level`: number - division level
1. Has a method called `divide` that returns a tuple of two cells of the same type

In [1]:
class Cell:
    def __init__(self, ctype = "epithelial", corganism = "human", 
                 clevel = 0, cstatus = "living"):
        self.type = ctype
        self.organism = corganism
        self.level = clevel
        self.status = cstatus
        
    def __str__(self):
        return f"Cell('{self.type}','{self.organism}',{self.level},'{self.status}')"
    
    def __repr__(self):
        return f"Cell('{self.type}','{self.organism}',{self.level},'{self.status}')"
    
    def divide(self):
        return (Cell(self.type, self.organism, self.level + 1),
                Cell(self.type, self.organism, self.level + 1))
        
        
    

In [2]:
# Explore the cell

c = Cell()

In [3]:
c.type

'epithelial'

In [4]:
c.divide()

(Cell('epithelial','human',1,'living'), Cell('epithelial','human',1,'living'))

In [5]:
c.divide()[0].type

'epithelial'

In [6]:
# type, isinstance

type(c)

__main__.Cell

In [7]:
isinstance(c, list)

False

In [8]:
isinstance(c, Cell)

True

### Expanding the Cell class  

- <font color = "red">Add the parent class in parentheses after the class name to build on its functionality</font>
- Use the `super()` function to access functionality from the parent class

Design an object called `ImmuneCell`:
1. Takes three attributes: 
    - `type`: connective
    - `organism`: human, mouse, ....
    - `level`: number - division level
1. Has a method called `divide` that returns a tuple of two cells of the same type
1. Has a method called `kill_cell` that marks the cell given as an argument as dead


In [10]:
class ImmuneCell(Cell):
    def __init__(self, corganism = "human", 
                 clevel = 0, cstatus = "living"):
        # call the __init__ of the Cell (parent) class;
        # the type of an immune cell is fixed to "connective"
        super().__init__("connective", corganism, clevel, cstatus)
        
        
    def __str__(self):
        return f"ImmuneCell('{self.organism}',{self.level})"
    
    def __repr__(self):
        return f"ImmuneCell('{self.organism}',{self.level})"
    
    def divide(self):
        return super(ImmuneCell,self).divide()
    
    def kill_cell(self, c):
        c.status = "dead"

In [11]:
# Explore the new type

ic = ImmuneCell()
ic

ImmuneCell('human',0)

In [12]:
type(ic)

__main__.ImmuneCell

In [13]:
# use isinstance to check whether the immune cell is also a Cell

isinstance(ic, ImmuneCell)


True

In [14]:
isinstance(ic, Cell)

True

In [16]:
c

Cell('epithelial','human',0,'living')

In [17]:
isinstance(c, ImmuneCell)

False

In [18]:
# use the immune cell ic to kill the cell c

ic.kill_cell(c)


In [19]:
c

Cell('epithelial','human',0,'dead')
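
One consequence of delegating to the parent's `divide` is that the daughter cells are plain `Cell` objects rather than `ImmuneCell`s, because `Cell.divide` constructs `Cell` explicitly. If the daughters should keep the subclass, `divide` can be overridden to build `ImmuneCell` instances instead. A minimal sketch, with both classes trimmed to what the example needs and the type set to "connective" as in the design list above:

```python
class Cell:
    def __init__(self, ctype="epithelial", corganism="human",
                 clevel=0, cstatus="living"):
        self.type = ctype
        self.organism = corganism
        self.level = clevel
        self.status = cstatus

    def divide(self):
        return (Cell(self.type, self.organism, self.level + 1),
                Cell(self.type, self.organism, self.level + 1))


class ImmuneCell(Cell):
    def __init__(self, corganism="human", clevel=0, cstatus="living"):
        super().__init__("connective", corganism, clevel, cstatus)

    def divide(self):
        # Build ImmuneCell daughters instead of plain Cell objects
        return (ImmuneCell(self.organism, self.level + 1),
                ImmuneCell(self.organism, self.level + 1))


daughters = ImmuneCell().divide()
print([type(d).__name__ for d in daughters])
print(daughters[0].type, daughters[0].level)
```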

Resources:    
https://docs.python.org/3/tutorial/classes.html#inheritance    
https://realpython.com/python-super/    
https://docs.python.org/3/library/functions.html#super

---

### Extra Practice

Design an object called `Point`:
1. Takes two attributes: `x` and `y`
1. Has a method called `distance` that returns the Euclidean distance from another point 

In [None]:
# Define Point here

Design an object called `Line`:
1. Takes two attributes that are both `Point`s: `start` and `stop` 
1. Has a method called `length` that returns the distance between `start` and `stop`

In [None]:
# Define Line here 

In [None]:
# check attributes and methods



Design an object called `Rectangle`:
1. Takes 3 attributes: 
    * `origin` (the lower left `Point` of the `Rectangle`)
    * `height`
    * `width`
1. Has a method called `perimeter` that returns the length of the perimeter of the `Rectangle`
1. Has a method called `area` that returns the area of the `Rectangle`

In [None]:
# Define Rectangle here

In [None]:
# check attributes and methods


