# Day 3 - Classes and Objects

In Python, classes are a fundamental concept of object-oriented programming (OOP). Classes provide a way to organize and structure code, allowing you to define custom objects with their own properties (attributes) and behaviors (methods). Fundamentally, you are defining what your data should look like and which methods it should have.

Let's say we are defining a class named House. I'd expect it to have things like "Number of doors", "Number of windows", "Address", "People who live there" and so on.

We could store the data in dictionaries or lists, but by building a class I am then able to write something like ```new_house = House(2, 8, "123 new street", ['John', 'Paul', 'George', 'Ringo'])``` and then access the data easily and reliably.

A class can also contain specialised functions (methods) that relate directly to it, for example
```
new_house.distance_to("Liverpool")
```
could be written to use the object's internal data together with the new argument, with everything stored away inside the class. In practice methods work much like string manipulation methods, except that we define them ourselves.
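As a rough sketch of the idea, here is a hypothetical ```House``` class. The attribute names and the ```resident_count``` method are assumptions made for illustration only (a real ```distance_to``` would need location data that we do not have here).

```python
class House:
    """Hypothetical House class, sketched only to illustrate the idea."""

    def __init__(self, doors, windows, address, residents):
        # Store each piece of data on the object itself
        self.doors = doors
        self.windows = windows
        self.address = address
        self.residents = residents

    def resident_count(self):
        # A method can use the data stored inside the object
        return len(self.residents)


new_house = House(2, 8, "123 new street", ["John", "Paul", "George", "Ringo"])

print(new_house.address)           # attribute access
print(new_house.resident_count())  # method call, much like "abc".upper() on a string
```

The data and the functions that work on it travel together in one object, which is the main convenience over a loose dictionary or list.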
 
Classes are particularly useful in bioinformatics and biology because they let you model real-world entities and their interactions, especially when working with less common types of data.


Classes are important because they contain all of the methods and functions available to ***every*** object that is created from them. These come in two types:

- Class specific user defined functions (plain text, lower case)
    
    The list that we see could include functions, variables, methods and constants. All of them can be referred to from any object created from the class.

- System prescribed functions or "special methods" (contained within double-underscores or "**dunders**")

    Noted as ```__method__```, they are typically run automatically or implement common operations. You'll often see dunder methods like ```__len__``` or ```__str__``` to calculate a length or return a string.

The most important dunder method is ```__init__```, usually called the constructor, as it sets up each new object and is called whenever an object is created.
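To make the dunder idea concrete, the sketch below uses a made-up ```Primer``` class (not part of this course's code) to show which built-in call triggers which dunder method.

```python
class Primer:
    """Hypothetical class used only to show dunder methods in action."""

    def __init__(self, bases):
        # Runs automatically when Primer(...) is called
        self.bases = bases

    def __len__(self):
        # Runs when len() is called on the object
        return len(self.bases)

    def __str__(self):
        # Runs when str() or print() is called on the object
        return f"Primer of {len(self)} bases: {self.bases}"


p = Primer("ACGTAC")  # Python calls __init__ here
print(len(p))         # Python calls p.__len__()
print(str(p))         # Python calls p.__str__()
```

We never call ```p.__len__()``` directly; Python does it for us when we use the familiar ```len()```.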


## Creating our own classes

All good and complex, but let's actually create our own class to better understand what we are looking at.

Note that convention has classes named in CapWords style (each word capitalised, no underscores). I find it easiest to do the same as everyone else for simplicity and code reviewing.

Let's first make a nearly empty class and then create three objects from it (known as "instantiating" the class). Each object has the structure of the class. Note how we use ```def __init__(self)``` to say what to do on initialisation (creating the object).

In [8]:
class Experiment:
    def __init__(self):
        print("Created a new Experiment")

# Creating three objects from the class
first = Experiment()
second = Experiment()
third = Experiment()

Created a new Experiment
Created a new Experiment
Created a new Experiment


So far that is not doing much! Let's add a variable. After ```self```, the first piece of data in the brackets is the first variable read in by the ```__init__``` function (just like a normal function!).

Note how the ```__init__``` constructor takes the input variables and stores them in the object. We use ```self``` to refer to the object itself, and it must come first in the parameter list of every method. Strictly speaking ```self``` is a naming convention rather than a reserved keyword, but everyone follows it. Assigning the variables to ```self.variable``` keeps them tightly attached to the object.

In [1]:
class Experiment:
    def __init__(self, species_code):
        self.species_code = species_code

        print("Created a new Experiment metadata object for species:", self.species_code)


# Creating three objects and giving each one piece of information
Athal = Experiment('at')
human = Experiment('hg')
foxy = Experiment('vv')


Created a new Experiment metadata object for species: at
Created a new Experiment metadata object for species: hg
Created a new Experiment metadata object for species: vv


Right now it doesn't look like anything exciting, but we now have three objects which are ready to be populated.
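To check that each object really does carry its own data, we can read the attribute back with dot notation. In this sketch the class is repeated without the ```print``` so that the block runs on its own, and the lower-case variable names are chosen only for this example.

```python
class Experiment:
    def __init__(self, species_code):
        self.species_code = species_code


athal = Experiment("at")
human = Experiment("hg")

# Each object keeps its own value for the attribute
print(athal.species_code)
print(human.species_code)

# Attributes can be reassigned after the object has been created
human.species_code = "mm"
print(human.species_code)

# Changing one object does not affect the other
print(athal.species_code)
```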



Let's now add a function to the code. I've also put a built-in dictionary in the class. It isn't a passed-in variable, so it can live outside ```__init__``` as a class attribute shared by every object, but inside the methods it is still referred to as ```self.species_names```.



In [9]:
class Experiment:
    species_names = {"at": "Arabidopsis thaliana", "hg": "Homo sapiens", "mm": "Mus musculus", "vv": "Vulpes vulpes"}

    def __init__(self, species_code):
        self.species_code = species_code

    def get_full_name(self):
        if self.species_code in self.species_names:
            return self.species_names[self.species_code]

# Creating our objects again now that the class has more features.
Athal = Experiment('at')
human = Experiment('hg')
foxy = Experiment('vv')

# Using the class's one function to search the species dictionary
print("This object's species name is", Athal.get_full_name())
print("This object's species name is", human.get_full_name())
print("This object's species name is", foxy.get_full_name())


This object's species name is Arabidopsis thaliana
This object's species name is Homo sapiens
This object's species name is Vulpes vulpes
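Two details of this class are worth checking for yourself. The sketch below uses a trimmed copy of the class, plus a made-up species code ```"zz"``` that is deliberately missing from the dictionary, to show that the dictionary is shared between all objects and that ```get_full_name``` returns ```None``` when the code is not found.

```python
class Experiment:
    # Class attribute: one dictionary shared by every Experiment object
    species_names = {"at": "Arabidopsis thaliana", "hg": "Homo sapiens"}

    def __init__(self, species_code):
        # Instance attribute: each object stores its own value
        self.species_code = species_code

    def get_full_name(self):
        if self.species_code in self.species_names:
            return self.species_names[self.species_code]


athal = Experiment("at")
unknown = Experiment("zz")  # "zz" is a made-up code that is not in the dictionary

print(athal.species_names is Experiment.species_names)  # the very same dictionary
print(athal.get_full_name())
print(unknown.get_full_name())  # no matching key, so the method returns None
```

Because a function with no matching ```return``` gives back ```None```, code that joins the result onto a string would fail for unknown species, so it can be worth handling that case explicitly.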


We've created a new class, but I want a list of everything it contains. We can use the ```dir()``` function to output that.

In [None]:
dir(Experiment)
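The full ```dir()``` output is dominated by inherited dunder names. One way to see only the names we wrote ourselves is to filter those out, as in this sketch (which again uses a trimmed copy of the class so it runs on its own).

```python
class Experiment:
    species_names = {"at": "Arabidopsis thaliana"}

    def __init__(self, species_code):
        self.species_code = species_code

    def get_full_name(self):
        return self.species_names.get(self.species_code)


# Keep only the names that do not start with a double underscore
own_names = [name for name in dir(Experiment) if not name.startswith("__")]
print(own_names)
```

Note that ```species_code``` does not appear when we ask the class itself: it is only created on each object inside ```__init__```, so it shows up in ```dir()``` of an object rather than of the class.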

We can now use these methods on our objects to interrogate them. Let's add a few more details to the class to really show off. The code is starting to look long and complex, but if you break it down it is just three functions and a dictionary inside one class.

In [33]:
from datetime import datetime

class Experiment:
    species_names = {"at": "Arabidopsis thaliana", "hg": "Homo sapiens", "mm": "Mus musculus", "vv": "Vulpes vulpes"}

    # Reading in our class variables
    def __init__(self, species_code, collected, replicates):
        self.species_code = species_code
        self.collected = collected
        self.replicates = replicates

    # Get the full name from the dictionary
    def get_full_name(self):
        if self.species_code in self.species_names:
            return self.species_names[self.species_code]

    # Check whether more years have passed since collection than the allowed number
    def is_expired(self, years_to_expire):
        years_since_collection = datetime.now().year - self.collected
        if years_since_collection > years_to_expire:
            return f"Throw it out, this sample expired {years_since_collection - years_to_expire} years ago!"
        else:
            return f"It's fine, use it! This sample only expires in {years_to_expire - years_since_collection} years"

# Creating our objects again now that the class has more features.
Athal_01 = Experiment('at', 2018, 5)
Athal_02 = Experiment('at', 2015, 4)
Athal_03 = Experiment('at', 2019, 8)
Athal_04 = Experiment('at', 2012, 6)

print(Athal_02.is_expired(5))
print(Athal_03.is_expired(5))

Throw it out, this sample expired 3 years ago!
It's fine, use it! This sample only expires in 1 years
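Because ```is_expired``` uses ```datetime.now()```, its answer depends on the year in which the cell is run (the output shown above corresponds to running it in 2023). A possible way to make such a calculation reproducible is to allow the reference year to be passed in. The helper below, ```years_since```, is a hypothetical stand-alone function written for this sketch and is not part of the class above.

```python
from datetime import datetime


def years_since(collected, current_year=None):
    """Hypothetical helper: whole years between collection and a reference year."""
    if current_year is None:
        # Fall back to the real current year when none is supplied
        current_year = datetime.now().year
    return current_year - collected


# Fixing the reference year gives the same answer every time the cell is run
print(years_since(2015, current_year=2023))
print(years_since(2019, current_year=2023))
```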


One extra thing that can be useful: what happens if we ```print()``` our object?

In [34]:
print(Athal_01)

<__main__.Experiment object at 0x7f051cae3820>


Well it's an object which contains many things. That's fine, but we could also add a default output option for when we print, just to make our lives easier!

This is where we can use the ```__repr__``` function to specify the "Representation".

Exercise: Take this function and add it to the class above. Try running the print again now!

In [None]:
def __repr__(self):
  output_string = "Experimental object for " + self.get_full_name()
  return output_string
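You may also come across ```__str__```, which does a similar job. As a small sketch of how the two relate, the made-up ```Sample``` class below defines only ```__repr__```; ```print()``` then falls back to it, and containers such as lists use it for their items.

```python
class Sample:
    """Hypothetical class, not part of the Experiment code above."""

    def __init__(self, label):
        self.label = label

    def __repr__(self):
        # !r inserts the repr of the label, so the quotes are shown
        return f"Sample({self.label!r})"


s = Sample("leaf_01")
print(s)       # print() falls back to __repr__ because no __str__ is defined
print([s, s])  # containers show the __repr__ of their items
```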

One last thing we can look at is creating a class whose objects contain a number of other objects. Let's group all of our Arabidopsis experiments together.

In [None]:
class ExperimentCollection:
    def __init__(self, experiments):
        self.experiments = experiments

    # Counting up the number of replicates, as defined in the __init__ of our Experiment class
    def number_of_reps(self):
        reps_tally = 0
        # A plain local loop variable is enough; it does not need to be stored on self
        for exp in self.experiments:
            reps_tally += exp.replicates

        return reps_tally

Arabidopsis_experiments = ExperimentCollection([Athal_01, Athal_02, Athal_03, Athal_04])

print("We have a total of", Arabidopsis_experiments.number_of_reps(), "samples in our collection")
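A collection class is also a natural place for dunder methods. The sketch below is one possible variation, not the only way to write it: it adds ```__len__``` so that ```len()``` works on the collection, and uses ```sum()``` in place of the explicit tally loop. The ```Experiment``` class is repeated in trimmed form so that the block runs on its own.

```python
class Experiment:
    def __init__(self, species_code, collected, replicates):
        self.species_code = species_code
        self.collected = collected
        self.replicates = replicates


class ExperimentCollection:
    def __init__(self, experiments):
        self.experiments = experiments

    def __len__(self):
        # Lets len(collection) report how many experiments are stored
        return len(self.experiments)

    def number_of_reps(self):
        # sum() over a generator expression replaces the manual tally
        return sum(exp.replicates for exp in self.experiments)


collection = ExperimentCollection([
    Experiment("at", 2018, 5),
    Experiment("at", 2015, 4),
])

print(len(collection))
print(collection.number_of_reps())
```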

## Exercise - Building classes

Build your own DNA sequence toolbox! Let's create a class named Sequence that takes at least two parameters: "ID" and "raw sequence".

Give your class at least the following:
- The ```__init__``` function assigning your two inputs
- A ```__repr__``` function so that when you print your object it returns ID and sequence length.
- A function doing our favourite thing, GC%!

Extension ideas:
- Create a function that transcribes the sequence into RNA (T -> U)
- Create a function that searches the string for a defined motif (test with: "TGACTG"). Return a frequency count.
- What other things can you think of?
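As a hint rather than a solution, the built-in string methods below are likely to be useful inside your class. The ```toy``` sequence is made up for this sketch and is not one of the test sequences.

```python
toy = "ATGGATGG"  # short made-up sequence, used only for this hint

print(len(toy))               # number of bases
print(toy.count("G"))         # occurrences of a single base
print(toy.count("ATG"))       # non-overlapping occurrences of a substring
print(toy.replace("G", "g"))  # returns a new string; the original is unchanged
```

Keep in mind that ```str.count``` only counts non-overlapping matches, which matters for motifs that can overlap with themselves.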


In [None]:
## Create your class here

In [None]:
## Some tests here

my_first_sequence = Sequence("HSA01", "TGTGTCATGCAAAACTAGGTCATGCGTCCGCTGACTGATGACTGACACTGGTGGCACAACTGACTGAC")
my_second_sequence = Sequence("MET01", "AAAAAAACGCGACTACGCGGCGACTATGTGTCATGCAAAACTAGGTCATGCGTCCGCTTGTGTGTGCAACGATGCGACTA")

print(my_first_sequence)
print(my_second_sequence)

print(my_first_sequence.calc_GC())
print(my_second_sequence.calc_GC())