## Object-oriented programming
### Creating your own object types

### BIOINF 575 - Fall 2022

---
##### Adapted from material created by Marcus Sherman
---


You can do perfectly good data science _without_ ever writing a `class`. 

However, using `Object-Oriented Programming` can make your data science <u>easier to write</u>, <u>easier to read</u>, and <u>more intuitive</u> while also making it **more shareable/extensible**.

---
#### Object-Oriented Programming

Whenever you code in Python, there are two questions you should keep asking yourself throughout your workflow: "What do I have?" and "What do I need?". While working on subcomponents of a function, you should also ask yourself "What ***kind*** of object am I working with, and what does it do?"

In Python, ***EVERYTHING*** is an object!

In [None]:
# Different types

type(list)

In [None]:
{1:"d"}

In [None]:
# dir(list)
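
One way to convince yourself of this is to ask several very different values for their type. The snippet below is a minimal sketch; the helper name `describe` and the sample values are made up for illustration.

```python
# Every value has a type, including functions and the types themselves.
def describe(value):
    """Return the name of the class that `value` is an instance of (illustrative helper)."""
    return type(value).__name__

samples = [42, "ACGT", [1, 2], {1: "d"}, len, list, None]
type_names = [describe(s) for s in samples]
print(type_names)

# Each of these values is also an instance of the base class `object`.
all_objects = all(isinstance(s, object) for s in samples)
print(all_objects)
```

Even `len` (a function) and `list` (a type) report a type of their own, which is what "everything is an object" means in practice.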

### So what _are_ objects?

<img width = 450 src='https://ih1.redbubble.net/image.9426655.9925/fc,550x550,silver.jpg'/>

---
### A New Frontier

Up to this point, we have used objects already defined for us. However, we are not limited by those boundaries: we can *make* our own objects. This is done through the `class` keyword.

<img src='https://ds055uzetaobb.cloudfront.net/image_optimizer/9996aa83f77a2837f41a4de7f2ab517168716532.png' width = 500/>

Defining a `class` looks much like defining a function with `def`. However, later on we get to play around with some of those 'dunder' (\_\_) methods we have been steering you away from.

### First, the syntax

<img src='class_def.png' width=700 align='left'/>

### The Big Idea 
> The idea behind objects is to **bundle** coherent <u>methods</u> (things the object can _do_) and <u>attributes</u> (things the object _has_) that logically go together into a well-defined _interface_.

They are a data abstraction that has 2 main jobs:
1. Captures internal *representation* of the data it is abstracting
2. Creates an *interface* for the abstracted data
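
The distinction between what an object _has_ and what it can _do_ is already visible on built-in types. As a small illustration (the variable names are arbitrary), a Python `complex` number exposes its data through attributes and its behavior through methods:

```python
z = complex(3, 4)

# Attributes: data the object *has*
real_part = z.real
imag_part = z.imag

# Method: something the object can *do*
mirrored = z.conjugate()

print(real_part, imag_part, mirrored)
```

We never need to know how `complex` stores its two parts internally; the attributes and methods are the interface.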

#### Let's think about a gene and basic information we want to store about it.
We could make a dictionary:

```python

{"symbol": "Gene1", "seq": "AACGT"}

```


While this structure works perfectly fine, it becomes hard to keep track of everything once we want to add new elements or functions specific to the gene.
Because a plain dictionary is hard to manage and hard to extend with new functionality, it makes sense to wrap all of these things up into a single object.
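
To make the problem concrete, here is a rough sketch of the dictionary approach with one gene-specific function bolted on. The function name and the second dictionary are assumptions for illustration only.

```python
# Dictionary-based approach: the data and the functions that use it live apart.
gene = {"symbol": "Gene1", "seq": "AACGT"}

def get_startseq(gene_dict, n):
    """Return the first n bases; relies on the caller passing a dict with a 'seq' key."""
    return gene_dict["seq"][:n]

start = get_startseq(gene, 3)
print(start)

# Nothing stops a dict with a different key name from being passed in.
other = {"symbol": "Gene2", "sequence": "GGCTA"}
try:
    get_startseq(other, 3)
except KeyError as err:
    missing_key = err.args[0]
    print("missing key:", missing_key)
```

The function only works if every dictionary happens to follow the same unwritten convention, which is exactly the bookkeeping a class takes over for us.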

#### Let's make a `Gene` object 

- <font color = "red">Use the `__init__` method to define instance variables/attributes</font>
- Write functions in the class for extra functionality
- Use `self` to refer to the object you are creating



In [None]:
class Gene:
    def __init__(self, psymbol = "Gene1", pseq = "AACGT"):
        self.symbol = psymbol
        self.sequence = pseq
        
    def __str__(self):
        return self.symbol
    
    def get_startseq(self, n):
        return self.sequence[:n]
  

---
We need to take a second to talk about 3 things real quick:
1. Functions within `class`es (like `self.get_startseq`) are called ***methods*** or procedural attributes
2. `self.symbol` and `self.sequence` are called ***attributes*** since they only contain data
3. What in the world is `self`?

**PS**: `self` is a parameter that allows an object to refer to itself, specifically to the current *instance*. When you call a method on an object, Python passes `self` for you, so outside of writing the class you never have to pass `self` into the methods yourself.
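
A small sketch may help demystify `self`. It uses a trimmed-down copy of the class, named `MiniGene` here (a hypothetical name, chosen so it does not overwrite `Gene`), and calls the same method in two equivalent ways:

```python
class MiniGene:
    def __init__(self, psymbol="Gene1", pseq="AACGT"):
        self.symbol = psymbol
        self.sequence = pseq

    def get_startseq(self, n):
        return self.sequence[:n]

mg = MiniGene()

# Usual call: Python passes `mg` as `self` automatically.
via_instance = mg.get_startseq(3)

# Equivalent explicit call through the class: here we pass the instance ourselves.
via_class = MiniGene.get_startseq(mg, 3)

print(via_instance, via_class)
```

Both calls return the same result; the first form is simply shorthand for the second, which is why `self` must appear in the method definition but not in everyday calls.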

Let's use our new class.

In [None]:
dir(Gene)

In [None]:
type(Gene)

In [None]:
g = Gene()

In [None]:
g

In [None]:
str(g)

In [None]:
g.sequence

____

#### Let's add more functionality to the `Gene` class
- Implement specific dunder methods for added functionality 
    - the dunder method name reflects the function/operation it supports (for example, `__len__` is used by `len()`)
- Use @property to set up read-only attributes
- Add documentation using docstrings
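
Before the full class, it may help to see these pieces in isolation. The following is a minimal sketch on a hypothetical `Read` class (not part of the course material), combining a docstring, one dunder method, and a read-only property:

```python
class Read:
    """A short sequencing read (hypothetical class, for illustration only)."""

    def __init__(self, bases="ACGT"):
        self.bases = bases
        self._quality = "unchecked"  # leading underscore: internal by convention

    def __len__(self):
        # Called by len(read)
        return len(self.bases)

    @property
    def quality(self):
        """Read-only: there is no setter, so assignment fails."""
        return self._quality

r = Read("ACGTAC")
read_length = len(r)

try:
    r.quality = "good"
except AttributeError:
    assignment_failed = True
else:
    assignment_failed = False

print(read_length, r.quality, assignment_failed)
```

Because `quality` has a getter but no setter, the value can be read like a normal attribute but cannot be reassigned from outside the class.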

In [None]:
# Gene object
class Gene:
    """
    Contains the information about a Gene such as the symbol, description,
    exon number, and sequence 
    
    Attributes:
    symbol (str):  gene symbol
    description (str):  gene description
    exon_no (int): total number of exons for the gene
    sequence (str): the gene sequence
    status (str): the gene status, one of 'new', 'current', 'deprecated', 'in process' 
    """
    def __init__(self, psymbol = "Gene1", pdesc = "Gene for testing", 
                 pexon_no = 1, pseq = "AACGT"):
        self.symbol = psymbol
        self.description = pdesc
        self.exon_no = pexon_no
        self.sequence = pseq
        self.__status = "current"
        
    def __str__(self):
        return f"Gene('{self.symbol}','{self.description}',{self.exon_no},'{self.sequence}')"
    
    def __repr__(self):
        return f"Gene('{self.symbol}','{self.description}',{self.exon_no},'{self.sequence}')"

    def __len__(self):
        return len(self.sequence)
    
    def __add__(self, gene):
        new_gene = Gene(self.symbol + gene.symbol,
                    self.description + gene.description,
                    self.exon_no + gene.exon_no,
                    self.sequence + gene.sequence)
        new_gene.update_status("new")
        return new_gene

        
    @property # getter
    def status(self):
        """
        Get the status for the gene
        """
        return self.__status
    
    # @status.setter # property setter - same name as the property
    # def status(self, pstatus):
    #    self.__status = pstatus
             
    def update_status(self, pstatus):
        """
        Updates the status of a gene
        """
        self.__status = pstatus
        
        

In [None]:
# Use Gene - explore
# create/init Gene objects, we need the __init__ method
# The constructor without arguments works if we have default values
# The cell will use the __repr__ method to display the representation of the object

g = Gene()
g

In [None]:
g.update_status("deprecated")
print(g.status)
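# status is a read-only property (its setter is commented out above),
# so the assignment below raises AttributeError
g.status = "current"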
g.status


_____

See, I didn't need to use `self` on the outside.
_____

Wait, with just that, we made a new object? I don't believe you...

In [None]:
# what did we create? type()




In [None]:
# we need the __str__ method to print the object value





In [None]:
# dir



In [None]:
# check attributes



In [None]:
# check the status



In [None]:
# try to change the status



In [None]:
# Use the update_status method to change the status of the gene



In [None]:
# get the gene length - works only if __len__ is implemented

In [None]:
# add two genes - works only if __add__ is implemented



In [None]:
# multiply two genes - works only if __mul__ is implemented



#### Resources
https://docs.python.org/3/tutorial/classes.html     
https://docs.python.org/3/reference/datamodel.html      
https://python-textbok.readthedocs.io/en/1.0/Classes.html      
https://www.w3schools.com/python/python_classes.asp          
https://www.geeksforgeeks.org/python-classes-and-objects/          
https://www.tutorialspoint.com/python/python_classes_objects.htm   
https://python-course.eu/oop/inheritance.php     
https://www.geeksforgeeks.org/python-oops-concepts/      
https://gist.github.com/rurtubia/f5c506f414bb85efc4d8
