In [1]:
'''
Preamble
'''

# Time to start practicing your good habits! 
# Don't forget to execute this cell before doing anything with your code below

import pandas as pd
import random

# Introduction

Welcome to Lab 2! This interactive document will walk you through an active learning approach to several important Python concepts, and also give you more practice working in Python and applying the theories presented in lecture. This document will also serve as your lab report - as you work through the file, there are coding cells designated for your own code. 

### To submit your lab report:

* rename this file as LastName_Lab_2.ipynb,
* render your file using Quarto,
* submit the rendered file on Sakai.

# Module 1: Dictionaries

Dictionaries are a close cousin of the list in Python. Like a list, a dictionary stores a collection of items, but with one important difference: in a list, we generally reference items by their position, or index. In a dictionary, you have the added feature of a *key*, which gives a name to each item in the dictionary. In this way, it acts like a traditional dictionary: a word (key) paired with a definition (value). You define a dictionary in the following way:

    d = {
            <key>: <value>,
            <key>: <value>,
            .
            .
            .
            <key>: <value>,
    }

And you reference the values in a dictionary much like you would in a list (using '[]'), but in a dictionary, instead of a numerical index, you call the value by its key.
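
As a minimal sketch of that syntax, the hypothetical dictionary below (`codons`, which is not part of the lab data) maps a few RNA codons to the amino acids they encode, then looks up and adds entries by key.

```python
# Hypothetical example dictionary: RNA codon -> amino acid (or stop signal)
codons = {
    'AUG': 'Met',
    'UGG': 'Trp',
    'UAA': 'Stop',
}

# Look up a value by its key, much like indexing a list by position
start = codons['AUG']
print(start)

# Assigning to a new key adds an item to the dictionary
codons['GGU'] = 'Gly'

# List every key currently stored
print(list(codons.keys()))

# Number of key/value pairs
print(len(codons))
```

Asking for a key that does not exist (for example `codons['XYZ']`) raises a `KeyError`, so it helps to know which keys your dictionary contains before you reference them.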

Work within the following coding block to explore dictionaries. For this lab, you will

* Add comments to the coding block (2 pts)
* Perform the tasks described in the existing comments (1 pt each)

In [11]:
''' 
Dictionaries
'''

e_coli = {
    'water': 2E10,
    'mem_prot': 1E6,
    'ion': 6E7,
    'lipid': 5E7,
    'protein': 2E6,
    'mrna': 2E3,
    'ribosome': 2E4,
    'dna': 5E6
}

# Task 1.1 - calculate the ratio of ribosomes to mRNA, and output the result

e_coli['volume'] = 1E-15

# Task 1.2 - create a variable that stores the volume of a single molecule of water (you'll have to look it up!)

# Task 1.3 - find out what percentage of an E. coli cell is water, by volume

# Module 2 - DataFrames

The easiest way to think of DataFrames is 'Lists on Steroids' - DataFrames are the Python equivalent of spreadsheets, which in the data world are considered a 2-D data container. The pandas module was developed in 2008 so that Python could compete with the more powerful `R` (not that I'm biased) and work with large imported datasets. We will ignore data import for now and focus solely on the mechanics of DataFrames.

You'll notice in the coding block below that a dataframe looks a *lot* like a dictionary - there are keys and values. The main differences are:

1. The value for each key must be a list
2. There is a greater variety of functions and properties specific to dataframes
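
To make those two differences concrete, here is a rough sketch using the same heart-rate values as the coding block below. The variable names `n_rows`, `n_cols`, `col_names`, `mean_hr`, and `high_hr` are chosen just for this illustration, and the cutoff of 100 is arbitrary.

```python
import pandas as pd

# Same structure as a dictionary: each key maps to a list of equal length
hr_data = {
    'heartrate': [70, 120, 90, 100, 105],
    'duration': [60, 60, 30, 45, 70],
}
hr_df = pd.DataFrame(hr_data)

# Properties specific to DataFrames
n_rows, n_cols = hr_df.shape          # (rows, columns)
col_names = list(hr_df.columns)       # column labels come from the dictionary keys

# Functions specific to DataFrames
mean_hr = hr_df['heartrate'].mean()        # average of one column
high_hr = hr_df[hr_df['heartrate'] > 100]  # keep only rows that pass a condition

print(n_rows, n_cols)
print(col_names)
print(mean_hr)
print(high_hr)
```

None of these operations exist on a plain dictionary, which is the main reason to load your data into a DataFrame before analyzing it.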

In [3]:
'''
DataFrames
'''

# Import the pandas module
import pandas as pd

# Create data frame for exercise heartrates
hr_data = {
    'heartrate': [70, 120, 90, 100, 105],
    'duration': [60, 60, 30, 45, 70]
}

# Load data into DataFrame object
hr_df = pd.DataFrame(hr_data)
print(hr_df)

# Reference a DataFrame row
print(hr_df.loc[1])

# Reference a DataFrame column
print(hr_df['heartrate'])

# Reference Row 4, column 1 (remember the rules for indexes!)
print(hr_df.iloc[3,0])

   heartrate  duration
0         70        60
1        120        60
2         90        30
3        100        45
4        105        70
heartrate    120
duration      60
Name: 1, dtype: int64
0     70
1    120
2     90
3    100
4    105
Name: heartrate, dtype: int64
100


Now it's time for you to try! Below, I have created a sample dataframe of the concentration of CheY-P over time (3 minutes). Find the following information:

1. The maximum concentration of CheY-P (1 pt)
2. The time-step at which CheY-P is at a maximum (hint: you will want to use the Python list method `index()`) (1 pt)
3. Based on Monday's lecture, what is the approximate probability (or bias) of E. coli being in the CCW configuration, based on the CheY-P concentration? (1 pt)

Don't forget to comment your code! (2 pts)

In [4]:
'''
Che-Y Study
'''

# Time Step
time = list(range(1,181))
# Randomized Concentration
conc = []
# Note: don't write for-loops like this
for i in range(0,len(time)): conc.append(random.uniform(0.1, 8.0))

# Assemble data
cheyp_data = {
    'Time (s)': time,
    'Concentration (uM)': conc
}
cheyp_df = pd.DataFrame(cheyp_data)
print(cheyp_df)

     Time (s)  Concentration (uM)
0           1            2.570304
1           2            6.227533
2           3            5.042491
3           4            2.796762
4           5            7.210369
..        ...                 ...
175       176            2.575867
176       177            5.799804
177       178            7.467018
178       179            7.379387
179       180            0.327175

[180 rows x 2 columns]


# Module 3 - Loops

Enter the humble loop. The powerhouse of programming. The secret weapon of serial work. Loops can be your greatest friend, or your worst enemy (sometimes both!).  

### If/else/elif
Our first loop is...technically not a loop. The `if` statement is a very important conditional, used when you have multiple categories of outcome - that is, what you do next is determined by some value that has multiple possibilities. The simplest version is the `if/else` conditional:

    if <condition>:
        <action>
    else:
        <alternative action>

This is used when there are only 2 possible outcomes. However, you may have *multiple* possible outcomes, in which case you would add the `elif` statement. It's very important when creating `if/else/elif` that you know what all the possible outcomes are, otherwise you'll end up with errors in your code (think back to the assumptions lecture).

    if <condition1>:
        <action1>
    elif <condition2>:
        <action2>
        .
        .
        .
    elif <conditionN>:
        <actionN>
    else:
        <alternative action>

Let's show this by creating a game. You have 2 players, A and B. They both roll a 12-sided die, and whoever has the higher roll wins. If they roll the same number, it's a tie. Once you've played the game a few times (every time you execute the code cell, it's a new "roll"), create your own if statement - determine if you're eating in the Commons or at GnTs, based on the menu. (2 points)

(Hint: you'll probably want to make use of the `in` operator)
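
If you haven't met `in` before, here is a small sketch of how it pairs with `if/elif/else`. The lists and the `base` variable are made up for illustration and have nothing to do with the menu task.

```python
# Hypothetical example: classify a DNA base with the `in` operator
purines = ['A', 'G']
pyrimidines = ['C', 'T']
base = 'G'

# `in` evaluates to True when the item appears in the list
if base in purines:
    kind = 'purine'
elif base in pyrimidines:
    kind = 'pyrimidine'
else:
    # Catch-all for anything that matches neither list
    kind = 'unknown'

print(base, 'is a', kind)
```

Note that the final `else` handles every value you did not plan for, which is one way to guard against the missed-outcome errors mentioned above.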

Don't forget your comments! (2 points)

In [5]:
'''
if/else/elif
'''
# Generate the roll
a = random.randint(1,12)
b = random.randint(1,12)

# Determine who won
if a > b:
    print("A won!")
elif a < b: 
    print("B won!")
else:
    print("It's a tie!")

B won!


In [6]:
'''
Task 3.1
'''

# Wednesday Menu
main = ['pork chop','potatoes','broccoli','carrots','pasta','chicken']
pizza = ['white','pierogi','cheese','pepperoni']
gf_veg = ['ratatouille','hash','cauliflower']
grill = ['hamburger','sloppy joe','fries']
gnts_open = True



### For Loops

`For` loops are useful when you know you want to repeat an action multiple times over a sequence, such as a list, a tuple (we don't talk about tuples), or even a DataFrame. For loops are best when you have a *fixed number* of times you want to complete an action (you'll see why this distinction is important in a moment). 

A good example of this is data normalization. In science, we sometimes want to measure values as they relate to the *maximum* value - so we adjust our measured data for analysis purposes:

In [7]:
'''
For Loop
'''
# Generate random data
data = []
# Create 50 data points between 1.0 and 25.0
for i in range(0,50): data.append(random.uniform(1.0, 25.0))

# Determine max value
max_val = max(data)
# Normalize all data
for x in range(0,len(data)):
    data[x] = data[x]/max_val
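
For comparison, the same normalization can be sketched as a list comprehension, which loops over the values directly instead of over their indexes. This version uses a short fixed list (an assumption made so the result is easy to check by hand) rather than random data, and stores the result in a new list called `normalized`.

```python
# Fixed example values (the lab cell uses random data instead)
data = [2.0, 5.0, 10.0, 4.0]

# Determine max value
max_val = max(data)

# Build a new normalized list in one line, leaving the original untouched
normalized = [value / max_val for value in data]

print(normalized)
```

Either form is fine for this lab. The index-based loop overwrites `data` in place, while the comprehension keeps the raw measurements available for later.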
    

Let's go back to our dataframe of CheY-P data. Let's combine `if` statements and `for` loops. Loop through the data over time (2 points), and determine at each point if E. coli is in the CW or CCW configuration (2 points). Don't forget your comments! (2 points)
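
One possible way to walk through a DataFrame row by row is the pandas `iterrows()` method. The sketch below applies it to the heart-rate data from Module 2 rather than the CheY-P data, and the `threshold` of 100 and the labels `'high'` and `'normal'` are hypothetical values chosen only for illustration.

```python
import pandas as pd

hr_df = pd.DataFrame({
    'heartrate': [70, 120, 90, 100, 105],
    'duration': [60, 60, 30, 45, 70],
})

# Hypothetical cutoff, chosen only for illustration
threshold = 100

labels = []
# iterrows() yields (index, row) pairs, one per row of the DataFrame
for idx, row in hr_df.iterrows():
    # Decide on a label for this row based on a single column
    if row['heartrate'] > threshold:
        labels.append('high')
    else:
        labels.append('normal')

print(labels)
```

You could also loop over a single column directly (`for value in hr_df['heartrate']:`) when you only need one value per row.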

In [8]:
'''
Task 3.2
'''



'\nTask 3.2\n'

### While Loops

`While` loops are useful when you want an action to be performed *while* (get it?) a condition is met. Most biological reactions act like `while` loops. *While* there is sufficient energy, need, and supplies, the action will be carried out. Once one of these 3 things drops below a certain threshold (often determined by concentration), the process will stop or slow down. 

In [9]:
# Define starting concentration of our chemical (uM)
st_conc = 75

# Carry out biological reaction that uses 4 uM of the chemical each time
# stop when there is insufficient supply

while st_conc >= 4:
    st_conc = st_conc - 4
    print(st_conc)

71
67
63
59
55
51
47
43
39
35
31
27
23
19
15
11
7
3
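
A small variation on that loop, sketched here with two extra hypothetical variables (`used` and `n_reactions`), counts how many reactions complete before the supply runs out. Unlike a `for` loop, you do not need to know that number in advance.

```python
# Starting concentration of the chemical (uM), as in the cell above
st_conc = 75
# Amount consumed per reaction (uM)
used = 4
# Hypothetical counter for how many reactions complete
n_reactions = 0

# Keep reacting while there is enough supply for one more reaction
while st_conc >= used:
    st_conc = st_conc - used
    n_reactions = n_reactions + 1

print(n_reactions, st_conc)
```

Make sure something inside the loop moves the condition toward being false. If `st_conc` never decreased, this loop would run forever.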


# Module 4 - Functions

Sometimes in Python (as in life), we find ourselves repeating the same task over and over again. It sure would be useful if we could have Python do the work, instead of typing out the same code over and over again. Luckily, we can! Python allows us to create `functions` which allow us to save specific lines of code to use over and over again easily. 

Functions take the following format:

    def <func_name>(<arguments>):
        <OPERATIONS>
        .
        .
        .
        <Output>

As an example, it's a common problem in introductory coding courses to write code that determines if a word is a palindrome. The first step of this is to take the word of interest and write it backwards. This is easy if I know what the specific word is, but to do this manually for any word? That would take forever. It would be much easier if I could have *one* piece of code that performs this task. I create the function `palindrome` below to do just that. I also want to add in the option for whether or not I want to print out the reversed word. 

After you believe that my function works, try writing your own! (2 pts)


In [10]:
'''
Functions
'''

# Function to check whether a word is a palindrome
def palindrome(word):
    # Use a slicing trick to reverse the word
    rev_word = word[::-1]
    # Check to see if it's a palindrome
    if rev_word == word:
        print("Congratulations, you have a palindrome!")
    else:
        print("Better Luck Next Time!")

# Testing the function:
palindrome('tacocat')

# Task 4.1 Create a function that calculates the average value of a list
# Make sure it ONLY calculates the average if the list contains numbers


# Test list
test = [3,1,4,1,5,9,2,6,5]


Congratulations, you have a palindrome!

# DocString

It's great that you can now write functions - but what if you put your file away and take your code out years later? How are you going to remember how your function works, even with exceptional commenting? What if you want someone else to use your function? How are they going to know how to use it? That's where DocString comes in. DocString is a specific type of commenting that's used to create documentation for user-created functions. It has a whole host of properties for creating help menus and professional documentation, but for now it's enough to know that it's a more efficient way of storing information about your function. Let's recreate the above function, only now, let's add the DocString. Look through the example below, and then go back up to your function for Task 4.1 and add the DocString. (2 points)

I'm also going to change the function a bit. Instead of having the function print the outcome, I'm going to have the function **return** a value. This way, I can use the output from the function as a *value* in another calculation or piece of code (this is more useful when the function is doing calculations of its own).

In [None]:
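def palindrome(word, to_print = 'N'):
    """
    Check whether a word or phrase is a palindrome.

    Parameters
    ----------
    word : str
        Word or phrase to be reversed
    to_print : str, optional
        'Y' to print the reversed word, 'N' (default) to skip printing

    Returns
    -------
    bool
        True if word reads the same backwards, False otherwise
    """
    # Use a slicing trick to reverse the word
    rev_word = word[::-1]
    # Optionally print the reversed word ('Y' is assumed to mean yes)
    if to_print == 'Y':
        print(rev_word)
    # Check to see if it's a palindrome
    if rev_word == word:
        return True
    else:
        return False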
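
To see why returning a value and writing a DocString both pay off, consider this sketch of a different, hypothetical function, `gc_content`, which is not part of the lab tasks. Its return value feeds straight into a second calculation, and its DocString can be pulled up with Python's built-in `help()`.

```python
def gc_content(seq):
    """
    Calculate the fraction of G and C bases in a DNA sequence.

    Parameters
    ----------
    seq : str
        DNA sequence written with the letters A, C, G, and T

    Returns
    -------
    float
        Fraction of bases that are G or C (between 0 and 1)
    """
    # Count the G and C bases, then divide by the sequence length
    gc_count = seq.count('G') + seq.count('C')
    return gc_count / len(seq)

# The returned value can be stored and used in another calculation
fraction = gc_content('ATGCGC')
percent = fraction * 100
print(percent)

# The DocString is stored with the function and displayed by help()
help(gc_content)
```

This sketch assumes a non-empty, upper-case sequence. An empty string would cause a division by zero, which is exactly the kind of limitation worth recording in the DocString.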

# Module 5 - Homework

Complete the following problems, based on problems from Physical Biology of the Cell (PBOC) 2nd edition.

### Problem 1 (PBOC 2.1)
A scientist tried to estimate the mass of an E. coli cell by assuming the cell had the same density as water. However, it turns out that the density of the macromolecules of the cell is 1.3 times that of water. In Task 1.3, you calculated what percentage of the E. coli cell is water, by volume. Assume that the remaining percentage is macromolecular. 
What is the percentage error made by treating the macromolecular density as the same as that of water?
(Hint: density is calculated as mass/volume)
(5 points)

(type answer here)

### Problem 2 (PBOC 2.8)

Find the approximate volume of an HIV virion. Assume that it has the same density as E. coli, and calculate its mass. Be sure to include the source where you found the volume. 
(5 points)

(type answer here)

### Problem 3 (PBOC 3.5(b))
Recall that we completed part (a) in class. You will need the results from that part of the problem to solve this problem. 

With a 3000 s division time for E. coli, about 25% of its protein is ribosomal. Note that the microbe *Salmonella typhimurium* is very similar to E. coli. Using these numbers and your results from above, what fraction of the protein would be ribosomal for the highest growth rate studied in the paper? *The highest growth rate studied was 2.5 gen/h, meaning that the population doubles 2.5 times per hour* (5 points)

How does this compare to their measured ratio of ribosomes to soluble protein, R/P, at these growth rates? *The paper's relative ribosomal concentration was measured at 0.035-0.09* (2 points)

How does the predicted R/P at high growth rates change if you now assume that dP/dt is proportional to R as they did in the paper? (3 points)

(type answer here)