# Bioinformatics Introduction to Coding

## Programming Basics 5

### Last lesson recap:

- control flow
    - conditional statements
    - for loops
    - while loops
- scope

### Coming up this lesson:

- Procedural Programming vs. Object Orientation
- Classes and Objects: Blueprints and Buildings
- Sequence Exercise

## Procedural programming vs Object Orientation

Remember how I said we would learn more about objects back in notebook 2? Well here we are, so brace yourself.

Most of what we've done so far is called **procedural programming**, where we organize code around a particular task or tasks. This is a pretty natural way to think about programming, especially as scientists who are often using code to perform sequential analyses. First we might write some code that will reformat our data, then we would graphically display the data for exploration, then we might write more code to trim any outliers before we finally feed our data into some sort of model. Step after step, each chunk of code is defined by the overall process we wish to complete.

Another coding style, called **object orientation**, became popular in the 80's and 90's and has stayed prominent ever since. Instead of organizing code by task, object orientation groups data and code around the concept or thing they belong to. Instead of storing a bunch of student names in one list and a bunch of student ID numbers in a separate list, object-oriented programming would create a student object type and then associate the ID and name strings with each object of that type.
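
The contrast between parallel lists and a student object can be sketched roughly as follows. This is only an illustration: `StudentRecord`, the names, and the ID values are made up for the example, and the `class` syntax itself is explained in the next section.

```python
# Procedural style: two parallel lists that have to be kept in sync by hand
student_names = ["Ada", "Grace"]
student_ids = ["0001", "0002"]
print(student_names[0], student_ids[0])


# Object-oriented style: each student object carries its own name and ID
class StudentRecord:  # hypothetical class name, for illustration only
    def __init__(self, name, ident):
        self.name = name
        self.ident = ident


students = [StudentRecord("Ada", "0001"), StudentRecord("Grace", "0002")]
print(students[0].name, students[0].ident)
```

With the lists, removing or reordering a student means editing both lists the same way. With objects, the name and ID travel together.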

The merits of each style have been argued by people way smarter than me, so you can develop a preference for whichever you like. However, given the ubiquity of objects in today's programs, it's important to at least learn the basics. Also, Python is an object-oriented language in which **everything** is an object: every variable, function, method, class, type, module, package, etc. is treated as an object.
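
A quick way to see the "everything is an object" claim for yourself is to ask Python directly. The snippet below is a small sketch using the built-in `type()` and `isinstance()` functions; the `greet` function is just a throwaway example.

```python
import math


def greet():
    return "hello"


# A number, a string, a function, a type, and a module are all objects
for thing in (42, "ACGT", greet, int, math):
    print(type(thing).__name__, isinstance(thing, object))

# Because functions are objects, they can be stored in variables like any other value
say_hi = greet
print(say_hi())
```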

## Classes and Objects: Blueprints and Buildings

A **class** is a way to define collections of data and code, and each instance of a class is one of the **objects** we talked about earlier. You can think of the class as a blueprint, and each house we build from that blueprint is considered an object of that class.

In [None]:
# class is a reserved keyword, decide what you want to call your class after that!
# let's start by building our generic student class
class Student:
    # as before, code belonging to the class/loop/whatever will be indented
    # a variable associated with an object is often called a data field
    name = "default"
    birthdate = "01/01/1900"
    ID = "00000"

Just because we've designed our blueprint doesn't mean we've built any houses though. Let's do some building.

In [None]:
# let's make the variable name for our first student s1
# we initialize a student object by calling the class with parentheses
s1 = Student()
s2 = Student()

# we can access any data fields by simply typing variablename.datafield
print(s1.name)
print(s1.birthdate)
print(s1.ID)

Having data clustered together this way is very useful, but we need to be able to change the information dynamically to really do impressive things. Let's explore adding methods to our classes! Remember, methods are just functions associated with an object.

Note: When defining methods for your class, there **must** be at least one argument, and that argument is traditionally named `self`. This is because the object itself is always passed as the first argument to every method call. That's why you will see lots of `self.something`-like statements; we are assigning variables that belong to the object itself.
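
To make the role of `self` a bit more concrete, here is a minimal sketch. The `Greeter` class is hypothetical and only exists for this illustration. Calling a method on an object and calling it through the class with the object passed in by hand do the same thing.

```python
class Greeter:  # hypothetical class, for illustration only
    name = "Rosalind"

    def greet(self):
        # self is whichever object the method was called on
        return "Hello, " + self.name


g = Greeter()

# The usual call: Python passes g in as self automatically
print(g.greet())

# The same call written out explicitly through the class
print(Greeter.greet(g))
```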

In [None]:
# let's rewrite our Student class so it's a little more flexible
class Student:
    # The __init__ method is a special method that you should use for every class you design
    # It is run automagically when you initialize the class; you should never explicitly call __init__()
    # Along with self, you can define additional arguments just like a regular function
    def __init__(self, name, birthdate, ident):
        # we need to assign each argument's value to a self.<name> variable so that it belongs to the object
        # that goes for all other variables/data fields as well
        self.name = name
        self.birthdate = birthdate
        self.ID = ident

    # we can also make other methods to do things related to our class
    # here we can overwrite the value already stored in the name datafield
    def change_name(self, new_name):
        self.name = new_name
        
    # HERE: add more methods to change all the other data fields
    

In [None]:
# now we'll create some Student objects using our class
# notice that we now have to pass in the arguments we named in __init__
s1 = Student("James", "09/13/1940", "0001")
# Create another student. It could even be you!
s2 = Student("Josh", "02/23/2002", "9345")

# let's check out the name datafield for each student object we made
print(s1.name)
print(s2.name)

# now let's use our method to change one up and make sure it works
s1.change_name("Jimbo")
print(s1.name)

Make sure you wrote methods to change the other data fields in the Student class. Change the ID numbers of the Students to be 80082 and 1337, respectively.

In [None]:
# change student IDs here by completing the code.
s1.change_ID()
s2.


## Sequence Exercise

For this exercise, you'll be designing a sequence class that stores and transforms sequence data. Here are the things we want to be able to do with our class:

1. Create the object by giving it sequence data that is either DNA, RNA, or protein.
2. Automatically detect what type of sequence the object contains.
3. BONUS: Transcribe the sequence from DNA to RNA. (Only if you have time)
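
The exercise cell below uses the `re` module for item 2. As a warm-up, this sketch shows how `re.search` behaves with a character class. The vowel pattern and the test words are toy examples and are deliberately unrelated to the exercise solution.

```python
import re

# A character class matches any single character listed between the brackets
vowels = "[AEIOU]"  # toy pattern, for illustration only

# No listed character occurs in this string, so re.search returns None
no_match = re.search(vowels, "RHYTHM")
print(no_match)

# Here a listed character does occur, so we get a match object back
match = re.search(vowels, "GENOME")
print(match.group())  # the first character that matched
```

Checking `is None` on the result is a common way to ask "did the pattern appear anywhere in this string?"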

In [None]:
# Python regular expression library; useful for matching patterns in text
import re

class Sequence:
    # HERE: Our __init__ method needs to take two arguments
    def __init__():
        # HERE: assign immediately relevant variables from arguments
        self.seq =
        # HERE: let's go ahead and standardize the sequence case to Upper; don't forget to assign it back to the variable name
        
        # We want our Sequence objects to automatically check what kind of data they contain when we initialize them.
        # We should define a method below that will do this for us and then call it here.
        self.sequence_type = self.get_type()
        
    def get_type(self):
        # need to decide whether something is RNA, DNA, or protein
        amino_acids = "[BDEFHIJKLMNOPQRSVWXYZ]"
        # check to see if the sequence we've found contains any of the letters used exclusively for protein sequences
        check = re.search(amino_acids, self.seq)
        # if we didn't find anything, we're looking at RNA or DNA
        if check is None:  # re.search() returns None if the pattern was not found
            if "U" in self.seq:
                sequence_type = 
            else:
                sequence_type = 
        else:
            sequence_type = 
        # What should we return?
        return 

    def transcribe(self):
        if self.sequence_type != "DNA":
            # This is called an early return. Because hitting a return will always stop
            # execution of a function, we can use this to not do anything and return from the method early
            return
        # Otherwise, we have DNA and can continue transcribing
        # For this method, don't return the transcribed sequence; change the object's seq datafield instead
        # Don't forget to also change the sequence type!
        

        

In [None]:
# Let's make a list of a few Sequence objects
sequences = [
    Sequence("ATACGGCACTCATCACGGCATCT"),
    Sequence("AUACGGCACUCAUCACGGCAUCU"),
    Sequence("MGHTYA"),
]

for seq in sequences:
    #look at the sequence type here
    print(seq.sequence_type)
    # If you implemented the transcribe method, uncomment the below lines to test
    #seq.transcribe()
    #print(seq.sequence_type, seq.seq)
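
The early return used in `transcribe` is a general pattern for methods that change an object in place. Here is a small sketch of the same idea on an unrelated, made-up `Sample` class (the class, its field, and the numbers are all hypothetical), so it does not give away the exercise.

```python
class Sample:  # hypothetical class, for illustration only
    def __init__(self, volume_ul):
        self.volume_ul = volume_ul

    def withdraw(self, amount_ul):
        if amount_ul > self.volume_ul:
            # Early return: not enough liquid, so leave the object unchanged
            return
        # Otherwise, update the object's data field in place
        self.volume_ul = self.volume_ul - amount_ul


tube = Sample(100)
tube.withdraw(500)  # too much, so nothing changes
print(tube.volume_ul)
tube.withdraw(30)  # fine, so the volume is updated
print(tube.volume_ul)
```

A bare `return` hands back `None`, which is fine here because the method's job is to modify the object rather than to produce a value.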