# Part I: Counting Genes
After performing a sequence analysis, you discover that the sequence in one region of interest is the following:
```
GGATAGTGCTTGAGTCTGTGCTGACCTATTATTAAGAACTAAATGGACAATATTATGGAGCATCTCATGTATAAATTGGTGCGTAAAATCGTTGGATCTCTCTTCTAAGTACATCCTACTATAACAATCAAGAAAAACAAGAAAATCGGACAAAACAATCAAGTATGGATTCTAGAACAGTTGGTATATTAGGAGGGGGACAATTGGGACGTATGATTGTTGAGTAAGCTAACAGGCTCAACATTAAGACGGTAATACTAGATGCTGAAAATTCTCCTGCCAAACAAATAAGCAACTCCAATGACCACGTTAATGGCTCCTTGTCCAATCCTCTTGATATCGAAAAACTAGCTGAAAAATGTGATGTGCTAACGATTGAGATTGAGCATGTTGATGTTCCTACACTAAAGAATCTTCAAGTAAAACATCCCAAATTAAAAATCTACCCTTCTCCAGAAACAATCAGATTGATACAAGACA
```
First, run the cell below to assign the sequence to a variable, `sequence`.

In [None]:
sequence = 'GGATAGTGCTTGAGTCTGTGCTGACCTATTATTAAGAACTAAATGGACAATATTATGGAGCATCTCATGTATAAATTGGTGCGTAAAATCGTTGGATCTCTCTTCTAAGTACATCCTACTATAACAATCAAGAAAAACAAGAAAATCGGACAAAACAATCAAGTATGGATTCTAGAACAGTTGGTATATTAGGAGGGGGACAATTGGGACGTATGATTGTTGAGTAAGCTAACAGGCTCAACATTAAGACGGTAATACTAGATGCTGAAAATTCTCCTGCCAAACAAATAAGCAACTCCAATGACCACGTTAATGGCTCCTTGTCCAATCCTCTTGATATCGAAAAACTAGCTGAAAAATGTGATGTGCTAACGATTGAGATTGAGCATGTTGATGTTCCTACACTAAAGAATCTTCAAGTAAAACATCCCAAATTAAAAATCTACCCTTCTCCAGAAACAATCAGATTGATACAAGACA'

## Q1
You'd like to know if this region of DNA codes for phenylalanine, which is encoded by either `TTT` or `TTC`. Follow the steps below to find out whether either of these codons is in your data.

1. Create a Boolean `has_TTT` that indicates whether `TTT` is in your DNA string.
2. Create a Boolean `has_TTC` that indicates whether `TTC` is in your DNA string.
3. Create a Boolean `has_phenyl` that indicates whether `TTT` **or** `TTC` is in your sequence.

In [None]:
### BEGIN SOLUTION
has_TTT = 'TTT' in sequence
has_TTC = 'TTC' in sequence
has_phenyl = has_TTT or has_TTC
print(has_phenyl)
### END SOLUTION

In [None]:
# Tests for Q1, note: hidden tests. Worth 10 points.
assert isinstance(has_TTT,bool)
assert isinstance(has_TTC,bool)
assert isinstance(has_phenyl,bool)

### BEGIN HIDDEN TESTS
assert has_TTT == False
assert has_TTC == True
assert has_phenyl == True
### END HIDDEN TESTS

## Q2
Wait, you just realized that you forgot about *codons*! Gene sequences are read by the ribosome three nucleotides at a time. So, we have to step through the DNA in consecutive three-nucleotide chunks and check whether each chunk codes for phenylalanine.
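
To see why a plain substring search is not enough, here is a minimal sketch on a made-up six-base sequence (`toy` is hypothetical and not part of the assignment data). It assumes the reading frame starts at position 0, as the rest of this exercise does.

```python
# Hypothetical toy sequence, chosen only to illustrate the reading-frame issue.
toy = 'ATTCGA'

# A substring search ignores codon boundaries: 'TTC' sits at positions 1-3.
found_anywhere = 'TTC' in toy

# Reading in threes from position 0 gives the codons 'ATT' and 'CGA'.
toy_codons = [toy[0:3], toy[3:6]]
found_in_frame = 'TTC' in toy_codons

print(found_anywhere, found_in_frame)  # True False
```

The substring check reports a match, but neither in-frame codon is `TTC`, which is why the next questions walk through the sequence codon by codon.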

Use the following steps to set this up:
1. Determine the number of codons in your sequence by dividing the length of the sequence by three. Assign this to `num_codons_float`.
2. Convert `num_codons_float` to an integer, and assign this to `num_codons`. This is how many times you'll need to run your loop in the next question.
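
As a small illustration of why the conversion step matters, the sketch below uses a made-up seven-base sequence (`toy` and the other names are hypothetical). In Python 3, `/` always returns a float, and `int()` drops the fractional part. The built-in `divmod` is shown only as one way to see how many bases are left over.

```python
# Hypothetical toy sequence with 7 bases: two full codons plus one leftover base.
toy = 'ATGTTCA'

toy_float = len(toy) / 3               # true division always returns a float
toy_int = int(toy_float)               # int() truncates toward zero
full, leftover = divmod(len(toy), 3)   # whole codons and leftover bases

print(type(toy_float).__name__, toy_int, full, leftover)  # float 2 2 1
```

A float cannot be passed to `range`, so the integer version is the one a loop needs.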

In [None]:
### BEGIN SOLUTION
num_codons_float = len(sequence)/3
num_codons = int(num_codons_float)
print(num_codons)
### END SOLUTION

In [None]:
# Tests for Q2, Note: hidden tests. Worth 5 points.

assert isinstance(num_codons,int)

### BEGIN HIDDEN TESTS
assert num_codons_float == 160.0
assert num_codons == 160
### END HIDDEN TESTS

## Q3

Now that we know how many codons there are, we can use this to determine how many times we need to run our loop.

Create a for loop that looks through each codon and determines whether or not it is a `TTT` or `TTC` sequence by following these steps:
1. Create a for loop that runs through a range equal to the number of codons in your sequence.
2. For each subsequent codon, we'll need to slice the sequence. The first codon is `sequence[0:3]`, the second is `sequence[3:6]`, then `sequence[6:9]`, etc. Hint: you can programmatically identify these start and stop points using a `start_id` that is equal to the codon number (`i`) times 3, and a `stop_id` that is equal to `(i + 1)` times 3. You can then use these ids to slice your sequence differently in each loop, and assign this slice to `this_codon`.
3. Within each loop, check whether `this_codon` is equal to `TTT` or `TTC`.
4. Use a counter (`num_phenyl`) to add up how many of these sequences you find. Print this after your loop.
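
The steps above can be traced on a short made-up sequence before running them on the real data. This is only a sketch: `toy` and `toy_count` are hypothetical names, and the print inside the loop is there to show how `start_id` and `stop_id` advance.

```python
# Hypothetical toy sequence with four codons: TTT, AAA, TTC, GGG.
toy = 'TTTAAATTCGGG'
toy_count = 0

for i in range(len(toy) // 3):
    start_id = i * 3          # 0, 3, 6, 9
    stop_id = (i + 1) * 3     # 3, 6, 9, 12
    this_codon = toy[start_id:stop_id]
    print(i, start_id, stop_id, this_codon)
    if this_codon == 'TTT' or this_codon == 'TTC':
        toy_count += 1

print(toy_count)  # 2
```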

In [None]:
### BEGIN SOLUTION
num_phenyl = 0

for i in range(num_codons):
    start_id = i*3
    stop_id = (i+1)*3
    this_codon = sequence[start_id:stop_id]
    # Count the codon if it is either of the two phenylalanine codons
    if this_codon == 'TTT' or this_codon == 'TTC':
        num_phenyl += 1

print(num_phenyl)

### END SOLUTION

In [None]:
## Tests for Q3, Note: hidden tests. Worth 20 points.

assert isinstance(num_phenyl,int)
assert isinstance(this_codon,str)

### BEGIN HIDDEN TESTS
assert num_phenyl == 3
### END HIDDEN TESTS

As you may remember, RNA has uracil instead of thymine.

## Q4

Convert our DNA sequence to RNA by replacing every "T" with a "U". Assign this to `sequence_RNA`. Don't overthink this -- you may use built-in methods of strings to complete this task!
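
One detail worth knowing: Python strings are immutable, so `str.replace` returns a new string and leaves the original untouched. A minimal sketch on a made-up sequence (`toy_dna` and `toy_rna` are hypothetical names):

```python
# Hypothetical toy sequence, used only to show how str.replace behaves.
toy_dna = 'ATGTTC'
toy_rna = toy_dna.replace('T', 'U')  # returns a new string

print(toy_dna)  # ATGTTC (unchanged)
print(toy_rna)  # AUGUUC
```

This is why the result has to be assigned to a new variable rather than expecting the original to change.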

In [None]:
### BEGIN SOLUTION
sequence_RNA = sequence.replace('T','U')
print(sequence_RNA)
### END SOLUTION

In [None]:
## Tests for Q4, Note: hidden tests. Worth 5 points.
assert isinstance(sequence_RNA,str)

### BEGIN HIDDEN TESTS
assert sequence_RNA == 'GGAUAGUGCUUGAGUCUGUGCUGACCUAUUAUUAAGAACUAAAUGGACAAUAUUAUGGAGCAUCUCAUGUAUAAAUUGGUGCGUAAAAUCGUUGGAUCUCUCUUCUAAGUACAUCCUACUAUAACAAUCAAGAAAAACAAGAAAAUCGGACAAAACAAUCAAGUAUGGAUUCUAGAACAGUUGGUAUAUUAGGAGGGGGACAAUUGGGACGUAUGAUUGUUGAGUAAGCUAACAGGCUCAACAUUAAGACGGUAAUACUAGAUGCUGAAAAUUCUCCUGCCAAACAAAUAAGCAACUCCAAUGACCACGUUAAUGGCUCCUUGUCCAAUCCUCUUGAUAUCGAAAAACUAGCUGAAAAAUGUGAUGUGCUAACGAUUGAGAUUGAGCAUGUUGAUGUUCCUACACUAAAGAAUCUUCAAGUAAAACAUCCCAAAUUAAAAAUCUACCCUUCUCCAGAAACAAUCAGAUUGAUACAAGACA'
### END HIDDEN TESTS

# Part II: Neuron Spikes

We can think of a neuron spike train as a list of Booleans: did the cell spike (fire an action potential), or not? Let's say that we sample our neuron once a millisecond to see if it spikes (`spike_train`). A **True** value means the neuron spiked in that sample, a **False** value means it did not spike.
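
Because Python treats `True` as 1 and `False` as 0 in arithmetic, a list of Booleans can be counted directly. The sketch below uses a made-up five-sample train (`toy_train` and the other names are hypothetical):

```python
# Hypothetical five-sample spike train, one sample per millisecond.
toy_train = [False, True, True, False, True]

print(True + True)            # 2, since bool is a subclass of int

toy_spikes = sum(toy_train)   # number of True values
toy_samples = len(toy_train)  # number of samples, here also milliseconds

print(toy_spikes, toy_samples)  # 3 5
```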

## Q5

1. Programmatically determine how many times your neuron fires an action potential and assign this to `num_spikes`.
2. Determine the length of your spike train in milliseconds and assign this to `train_length`.
3. Determine your neuron's firing rate in spikes per second. Divide the number of spikes by the length of the train (in milliseconds), then multiply by 1000 to convert from spikes per millisecond to spikes per second. Assign this to `spike_rate`.
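
The unit conversion in step 3 can be checked on made-up numbers. This sketch assumes one sample per millisecond, as stated above. All names in it are hypothetical.

```python
# Hypothetical values: 2 spikes observed over a 10 ms train.
toy_spikes = 2
toy_length_ms = 10

rate_per_ms = toy_spikes / toy_length_ms         # spikes per millisecond
rate_per_s = 1000 * toy_spikes / toy_length_ms   # 1000 ms in one second

print(rate_per_ms, rate_per_s)  # 0.2 200.0
```

Two spikes in 10 ms corresponds to 200 spikes per second.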

In [None]:
# First, run this cell to assign the spike train.
spike_train = [False,False,True,False,False,False,False,True,False,False,False,False,False,False,False,True,False,True,False,False,False,False,False,True,False,False,True,False,False,False,False]

In [None]:
### BEGIN SOLUTION
num_spikes = sum(spike_train)
train_length = len(spike_train)
spike_rate = 1000*num_spikes/train_length
### END SOLUTION

In [None]:
## Tests for Q5, worth 15 points

assert isinstance(num_spikes,int)
assert isinstance(train_length,int)
assert isinstance(spike_rate,float)

### BEGIN HIDDEN TESTS
assert num_spikes == 6
assert train_length == 31
# Compare with a tolerance so a different order of operations still passes
assert abs(spike_rate - 193.5483870967742) < 1e-9
### END HIDDEN TESTS

## Q6
As your final challenge, write a function called `calculate_spike_rate` that takes in `spike_train` and returns the `spike_rate`. First, you'll write the function, and then, we'll use it.

You may use the code you wrote above to complete this task. In the first cell below, write your function. 

In [None]:
### WRITE YOUR FUNCTION HERE

### BEGIN SOLUTION

def calculate_spike_rate(spike_train):
    num_spikes = sum(spike_train)
    train_length = len(spike_train)
    spike_rate = 1000*num_spikes/train_length
    
    return spike_rate

### END SOLUTION

In [None]:
## Tests for Q6, worth 10 points.

# The lines below check that you have a callable function called "calculate_spike_rate"
assert "calculate_spike_rate" in dir()
assert callable(calculate_spike_rate)

Finally, let's test your function on a new spike train. Run the cell below to assign `new_spike_train`. Then, in the following cell, use your function to calculate the spike rate in this new spike train and assign the output to `my_spike_rate`.

In [None]:
new_spike_train = [True,False,False,False,False,False,False,False,False,False]

In [None]:
## CALL YOUR FUNCTION HERE

### BEGIN SOLUTION
my_spike_rate = calculate_spike_rate(new_spike_train)
### END SOLUTION

In [None]:
## Additional tests for Q6, worth 5 points. Note: hidden tests.

### BEGIN HIDDEN TESTS
assert my_spike_rate==100.0
### END HIDDEN TESTS