In [None]:
# In colab run this cell first to setup the file structure!
%cd /content
!rm -rf MOL518-Intro-to-Data-Analysis

!git clone https://github.com/shaevitz/MOL518-Intro-to-Data-Analysis.git
%cd MOL518-Intro-to-Data-Analysis/Precept_3

# Precept 3

This precept will supplement material discussing functions and control flow.

The goals of this precept are:
1. To introduce code organization through creating functions
2. To introduce error handling within functions
3. To practice implementing code with a more abstract lens

### How to write your own function

Some of the code you write will be long and will repeat the same complex tasks many times. In these cases, functions can shorten the code and make it easier to read.

To get these benefits, you will often need to write your own functions. Python provides the `def` syntax for this.

#### Creating your own function
<p align="center">
<img src="media/FunctionCreation.png" alt="Creating your own function" width="1000" />
</p>
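
As a minimal sketch of the syntax (the function `rectangle_area` and its parameter names are made up for illustration and are not part of the course materials), a definition starts with `def`, followed by the function name, the parameters in parentheses, and a colon. The indented body runs each time the function is called, and `return` hands a value back to the caller.

```python
# Hypothetical example: the function name and parameters are illustrative only.
def rectangle_area(width, height):
    # The indented lines form the body of the function.
    area = width * height
    # `return` sends the result back to wherever the function was called.
    return area

# Call the function with two arguments and store the result.
room_area = rectangle_area(3, 4)
print(room_area)
```

Variables created inside the body, such as `area`, exist only while the function runs, which is why the result has to be returned and stored under a new name.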

### Handling your own errors

As you've seen during this course, improper handling of variables and data can result in errors. Python catches and reports these errors because ignoring them can sometimes lead to disastrous consequences.

But what happens if your code makes a subtle mistake that Python does not flag? It can still lead to poor consequences for your data and analyses. When you write functions, make sure they raise errors when the inputs are unintended or would lead to nonsensical results.

Let's look at the structure of error handling and assertions in this function below.

In [None]:
# The header of a defined function.
# The function name is 'orangify' and takes in one input.
def orangify(colorless_string):
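    # Raise an error when the input is not a string.
    if not isinstance(colorless_string, str):
        raise TypeError("Target of orangify needs to be a string!")

    vibrant_string = "orange " + colorless_string

    # Sanity check: stop with an AssertionError if the result lacks "orange".
    assert "orange" in vibrant_string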

    return vibrant_string

my_accessory = "belt" # Preparing the string to orangify
my_accessory = orangify(my_accessory) # Orangifying my accessory and overwriting the variable
print(my_accessory) # Checking my accessory
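
To see the type check in action, the sketch below passes a non-string to `orangify` and catches the resulting error with `try`/`except`, so the cell keeps running instead of stopping. The input `42` and the variable name `error_message` are illustrative choices, not part of the original example.

```python
# Same behavior as orangify above, repeated so this example runs on its own.
def orangify(colorless_string):
    if not isinstance(colorless_string, str):
        raise TypeError("Target of orangify needs to be a string!")
    vibrant_string = "orange " + colorless_string
    assert "orange" in vibrant_string
    return vibrant_string

error_message = None  # Hypothetical variable used to keep the message for inspection
try:
    orangify(42)  # 42 is an integer, not a string
except TypeError as error:
    # The raise statement inside orangify sends execution here.
    error_message = str(error)
    print("Caught a TypeError:", error_message)
```

Without the `try`/`except`, the same call would stop the cell and display the `TypeError` message, which is usually what you want while developing an analysis.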


A very common example of a function that CS students learn to make is an adder. Let's make an adder!

#### Exercise 3

In [None]:
# Write a function called adder() that takes two arguments, adds them together, and returns the result.
# Test your function by using it to add two numbers.
# What happens if you feed your adder two strings? What about a string and an integer?
# Use an assertion or raise an error if your function receives arguments of the wrong type.

### The power of simplification


Let's take a look at two code blocks. Which one makes its purpose easier to understand?

#### Example 1

In [None]:
from pandas import read_csv
from pathlib import Path
import matplotlib.pyplot as plt

infile = Path("data/pssm.csv")
data_df = read_csv(infile)

identity = data_df["identity"]
pos = data_df["position"]

smoothed_identity = list()
for i in range(len(identity)):
    start = i - 25
    if i < 25:
        start = 0

    end = i + 25
    if i > (len(identity) - 25):
        end = len(identity)

    data_average = 0
    for j in range(start, end):
        data_average = data_average + identity[j]

    data_average = data_average / (end - start)
    smoothed_identity.append(data_average)

double_smoothed_identity = list()
for i in range(len(smoothed_identity)):
    start = i - 25
    if i < 25:
        start = 0

    end = i + 25
    if i > (len(smoothed_identity) - 25):
        end = len(smoothed_identity)

    data_average = 0
    for j in range(start, end):
        data_average = data_average + smoothed_identity[j]

    data_average = data_average / (end - start)
    double_smoothed_identity.append(data_average)

plt.plot(pos, double_smoothed_identity)
plt.xlabel("Position")
plt.ylabel("Identity")
plt.show()


#### Example 2

In [None]:
from pandas import read_csv
from pathlib import Path
import matplotlib.pyplot as plt

def smooth_data(data, window=50):
    smoothed_data = list()
    half_window = window // 2
    for i in range(len(data)):
        start = i - half_window
        if i < half_window:
            start = 0

        end = i + half_window
        if i > (len(data) - half_window):
            end = len(data)

        data_average = 0
        for j in range(start, end):
            data_average = data_average + data[j]

        data_average = data_average / (end - start)
        smoothed_data.append(data_average)

    return smoothed_data


def plot_data(X, Y, xlabel="X", ylabel="Y"):
    plt.plot(X, Y)
    plt.xlabel(xlabel)
    plt.ylabel(ylabel)
    plt.show()



infile = Path("data/pssm.csv")
data_df = read_csv(infile)

identity = data_df["identity"]
pos = data_df["position"]

smoothed_identity = smooth_data(identity)
double_smoothed_identity = smooth_data(smoothed_identity)

plot_data(pos, double_smoothed_identity, xlabel="Position", ylabel="Identity")
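
One payoff of the function version is that the same code can be reused with different settings. As a small sketch, the cell below runs a copy of `smooth_data` on a short made-up list with a window of 2, which is small enough to check by hand. The names `toy_data` and `toy_smoothed` and the values in the list are invented for illustration and do not come from `pssm.csv`.

```python
# A copy of smooth_data from Example 2 so this example runs on its own.
def smooth_data(data, window=50):
    smoothed_data = list()
    half_window = window // 2
    for i in range(len(data)):
        # Clamp the start of the window at the beginning of the data.
        start = i - half_window
        if i < half_window:
            start = 0

        # Clamp the end of the window at the end of the data.
        end = i + half_window
        if i > (len(data) - half_window):
            end = len(data)

        # Average the values inside the window.
        data_average = 0
        for j in range(start, end):
            data_average = data_average + data[j]

        data_average = data_average / (end - start)
        smoothed_data.append(data_average)

    return smoothed_data

toy_data = [0, 10, 0, 10, 0]  # Made-up values that alternate sharply
toy_smoothed = smooth_data(toy_data, window=2)  # Override the default window
print(toy_smoothed)
```

Changing the window in Example 1 would mean editing the number 25 in eight places, while here it is a single argument.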

    

Let's put it all together.

Make the code below run properly!

#### Exercise 4

In [None]:
# Write functions that will perform the expected behaviors.
# A helper, deep_copy(original), is provided here. It returns a new list holding the same items.

def deep_copy(original):
    # The parameter is not named `list`, because that would hide the built-in list type.
    if not isinstance(original, list):
        raise TypeError("deep_copy can only take arguments of type list")

    copy = list()
    for item in original:
        copy.append(item)

    return copy


data = fabricate_data()

filtered_data = filter_data(data)

plot_data(filtered_data)