<a href="https://colab.research.google.com/github/tmckim/materials-sp24-colab/blob/main/projects/project2/Project2.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Project 2: Defining cell types by their electrophysiology


![](https://media.springernature.com/full/springer-static/image/art%3A10.1038%2Fnrn.2017.85/MediaObjects/41583_2017_Article_BFnrn201785_Fig1_HTML.jpg)
[Zeng & Sanes (2017)](https://www.nature.com/articles/nrn.2017.85)<br>
<br>
![](https://drive.google.com/uc?export=view&id=148YBIn8dLuCFexo99V77WzlFRmaHnhe-)
<br>
[Scala et al. (2021)](https://www.nature.com/articles/s41586-020-2907-3)

---
### The brain is complex and contains thousands of cells. How do we make sense of this information in a meaningful way?


<font size=4> We can define neurons by features such as:
* <font size=4><b>gene expression patterns</b> 🧬
* <font size=4><b>electrophysiology features</b> ⚡
* <font size=4><b>structure</b> 🔬


<font size=4>We'll use these three features to compare and contrast cell types in the brain.

<font size=4><b>Note</b>: there are additional ways to classify neurons as well, including function, location, neurotransmitters, and morphology!


<font size=4>This notebook will help us investigate specific features in the [electrophysiology cell types dataset](https://celltypes.brain-map.org/) from the Allen Institute for Brain Science.

<font size=4>You will apply the coding skills you've learned in Python 🐍 along the way.



---
### Prior activities that provide a foundation for using this notebook:

- <font size=4>Complete [the web-based activity from the Project Instructions](https://webcampus.unr.edu/courses/109833/assignments/1377062), which asks you to look at and work with this data on the Allen Institute website.
- <font size=4>Work through Lab10 from the course that introduced `pandas` and related it to what you've already learned in the `datascience` package


<hr>

**Learning Objectives:**  🙋


*   <font size=4> Understand the metrics that we can use to compare cell types 💭 🧠 🤯
*   <font size=4> Practice and apply Python coding skills to access the AllenSDK 💻 🐍
*   <font size=4> Compare electrophysiological characteristics of neurons between humans and mice 📈 🔎
*   <font size=4> Interpret basic science data and communicate results 📢 ❗





---
### Logistics ⚠

<font size=4> **Rules** ❗ Don't share your code with anybody but your partner. You are welcome to discuss questions with other students, but don't share the answers. The experience of solving the problems in this project will prepare you for exams (and life). If someone asks you for the answer, resist! Instead, you can demonstrate how you would solve a similar problem.

<font size=4> **Support** 💪 You are not alone! Come to office hours and talk to your classmates. If you want to ask about the details of your solution to a problem, come see me. If you're ever feeling overwhelmed or don't know how to make progress, email for help.


<font size=4> **Free Response Questions:** 🖊  Make sure that you put the answers to the written questions in the indicated cell we provide. **Every free response question should include an explanation** that adequately answers the question.


<font size=4> **Advice** 💬 Develop your answers incrementally. Break your code up into steps, perform each step on a different line, give a new name to each result, and check that each intermediate result is what you expect. You can add any additional names, etc. if you want to in the provided cells. Make sure that you are using distinct and meaningful variable names throughout the notebook.

<font size=4> You **never** have to use just one line in this project or any others. Use intermediate variables and multiple lines as much as you would like!

---
## 💾 Before you start - Save this notebook!

<font size=4> When you open a new Colab notebook from WebCampus (like you hopefully did for this one), you cannot save changes. So it's best to store the Colab notebook in your personal drive `"File > Save a copy in drive..."` **before** you do anything else.

<font size=4> The file will open in a new tab in your web browser, and it is automatically named something like: "**Copy of Project2.ipynb**". You can rename this to just the title of the assignment "**Project2.ipynb**". Make sure you do keep an informative name (like the name of the assignment) so that you know which files to submit back to WebCampus for grading! More instructions on this are at the end of the notebook.


<font size=4> **Where does the notebook get saved in Google Drive?**

<font size=4> By default, the notebook will be copied to a folder called “Colab Notebooks” at the root (home directory) of your Google Drive. If you use this for other courses or personal code notebooks, I recommend creating a folder for this course and then moving the assignments AFTER you have completed them. <br>

<font size=4> I also recommend you give the folder where you save your notebooks a different name than the folder we create below, which stores the notebook resources you need each time you work through a course notebook. This includes any data files you will need, links to the images that appear in the notebook, and the files associated with the autograder for answer checking.<br>
You should select a name other than '**NS499-DataSci-course-materials**'. <br>
This folder gets overwritten with each assignment you work on in the course, so you should **NOT** store your notebooks in this folder that we use for course materials! <br><br>For example, you could create a folder called 'NS499-**notebooks**' or something along those lines.

---
# Setup 🧰 🛠

<font size=4> **Do not skip this!**

1. [Set up our coding environment](#setup)

## Part 1

<font size=4> **Data from a single cell**

<font size=4> Steps 2-4 are a demonstration using a single cell. It might be the same as the cell you chose in the [web-based activity](https://webcampus.unr.edu/courses/109833/assignments/1377062), or a new one.

2. [Import data for a single cell](#import)
3. [Plot a raw sweep of data](#plotsweep)
4. [Plot the morphology of the cell](#morphology)

<font size=4> Complete Part 1 to reach the checkpoint for this project. Submit your notebook in WebCampus after this is completed (more below).

## Part 2

**Data from many cells**

<font size=4> *Steps 5-7: we will review and plot pre-computed features for all of the cells in the database.*

5. [Analyze computed features](#metrics)
6. [Compare action potential waveforms](#waveforms)
7. [Compare cell types](#compare)

<font size=4> Complete Part 2 to finish this project. Submit your final notebook in WebCampus after this is completed (more below).



---
<a id="setup"></a>

## Step 1.1: Set up coding environment 🔧 📚
<font size=4> Each time we start an analysis in Python, we must import the necessary code packages. The cells below will install packages into your coding environment (reminder, these are not installed on your computer).

### Install the AllenSDK 💻
<font size=4> The Allen Institute has compiled a set of code and tools called a **Software Development Kit** (SDK). These tools will help us import and analyze the cell types data. See [Technical Notes](#technical) at the end of this notebook for more information about working with the AllenSDK.

**Technical notes about installing the allensdk** 🔄
- <font size=4> If you receive an error, contact your instructor and also check out the documentation <a href="http://alleninstitute.github.io/AllenSDK/install.html">here</a>.

In [None]:
#@title Task
from IPython.display import HTML

alert_info = '''
<div style= "font-size: 20px"; class="alert alert-info" role="alert">
  <h4 class="alert-heading">Task</h4>
Run the following cell to install the AllenSDK. </div>
'''

display(HTML('<link href="https://nbviewer.org/static/build/styles.css" rel="stylesheet">'))
display(HTML(alert_info))

In [None]:
# Install the AllenSDK and other packages we need
# Keep this to a minimum (no-deps)

!pip install allensdk --no-deps
!pip install SimpleITK
!pip install pynwb
!pip install simplejson
!pip install requests_toolbelt

**Note**: You may receive notifications from above that not all package dependencies were installed. In this case, it should be okay to run this notebook. It was tested with the minimum requirements and ran without issues. <br><br>

**If you do run into issues, contact the instructor or TA for help and solutions right away!** <br>

You can review the images here and see if they are similar to what appears for you. Notice that the packages in red are the ones that were not installed, although the notebook still worked. If the output you see in your notebook matches, then the notebook should work.
![](https://drive.google.com/uc?export=view&id=1Nge8zf_USFaQLd2wxFhNmjzsKRMqwc58)

![](https://drive.google.com/uc?export=view&id=1RNoFvTPcf_w80e9bbUKgRfUEPd55pC14)

In [None]:
#@title Task
from IPython.display import HTML

alert_info = '''
<div style= "font-size: 20px"; class="alert alert-info" role="alert">
  <h4 class="alert-heading">Task</h4>
Run this cell to check that the install was successful.
</div>
'''

display(HTML('<link href="https://nbviewer.org/static/build/styles.css" rel="stylesheet">'))
display(HTML(alert_info))

In [None]:
# Ensure that the AllenSDK is installed
try:
    import allensdk
    print('allensdk imported')
except ImportError as e:
    !pip install allensdk

In [None]:
#@title Task
from IPython.display import HTML

alert_info = '''
<div style= "font-size: 20px"; class="alert alert-info" role="alert">
  <h4 class="alert-heading">Task</h4>
Run the following cell to install other packages (the usual ones we've used!) that we need.
</div>
'''

display(HTML('<link href="https://nbviewer.org/static/build/styles.css" rel="stylesheet">'))
display(HTML(alert_info))

<font size=4>These are our familiar favorites:
* [NumPy](https://numpy.org/)
* [Pandas](https://pandas.pydata.org/)
* [Matplotlib](https://matplotlib.org/)
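
<font size=4>If you are curious whether a package is available without actually importing it, Python's standard library offers `importlib.util.find_spec`. The sketch below is only an illustration: the helper name `is_installed` is made up for this example and is not part of the course materials.

```python
from importlib.util import find_spec


def is_installed(package_name):
    """Return True if Python can locate a top-level package with this name."""
    return find_spec(package_name) is not None


# Report on the three packages used throughout this notebook
for name in ["numpy", "pandas", "matplotlib"]:
    status = "installed" if is_installed(name) else "missing"
    print(name, "is", status)
```

<font size=4>The code cell that follows takes a different route (it tries the import and installs the package if the import fails), which works just as well here.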




In [None]:
# Ensure that NumPy, Pandas, and Matplotlib are installed
try:
    import numpy
    print('numpy already installed')
except ImportError as e:
    !pip install numpy
try:
    import pandas
    print('pandas already installed')
except ImportError as e:
    !pip install pandas
try:
    import matplotlib
    print('matplotlib already installed')
except ImportError as e:
    !pip install matplotlib

---
## Step 1.2: Import common packages 📦 📚
<font size=4>Below, we'll `import` a common selection of packages that will help us analyze and plot our data. We'll also configure the plotting in our notebook.


In [None]:
#@title Task
from IPython.display import HTML

alert_info = '''
<div style= "font-size: 20px"; class="alert alert-info" role="alert">
  <h4 class="alert-heading">Task</h4>
Import the numpy module nicknamed as <code>np</code>. Add a <code>print</code> message at the end that says "Packages imported!" so that you know the code ran. Fill in the code below where you see <code>...</code><br>
<b>Hint:</b> You've done this many times in past notebooks. Look back at one of those if you are unsure.
</div>
'''

display(HTML('<link href="https://nbviewer.org/static/build/styles.css" rel="stylesheet">'))
display(HTML(alert_info))

In [None]:
## Edit this ---
# Import our plotting package from matplotlib
import matplotlib.pyplot as plt

# Specify that all plots will happen inline & in high resolution
%matplotlib inline
%config InlineBackend.figure_format = 'retina'

# Import pandas for working with databases
import pandas as pd

# Import numpy below as np
...

# Add your print() statement below
...


---
## Step 1.3: Import the CellTypesCache from the allensdk 🖥
<font size=4>With the allensdk installed, we can `import` the **CellTypesCache module**.

<font size=4>The CellTypesCache that we're importing provides tools to allow us to get information from the cell types database. We're giving it a **manifest** filename as well. CellTypesCache will create this manifest file, which contains metadata about the cache. If you want, you can look in the cell_types folder in your code directory and take a look at the file.

In [None]:
#@title Task
from IPython.display import HTML

alert_info = '''
<div style= "font-size: 20px"; class="alert alert-info" role="alert">
  <h4 class="alert-heading">Task</h4>
Run the cell below.
</div>
'''

display(HTML('<link href="https://nbviewer.org/static/build/styles.css" rel="stylesheet">'))
display(HTML(alert_info))

In [None]:
# Import the "Cell Types Cache" from the AllenSDK core package
from allensdk.core.cell_types_cache import CellTypesCache

# Initialize the cache as 'ctc' (cell types cache)
ctc = CellTypesCache(manifest_file='cell_types/manifest.json')

print('CellTypesCache imported.')

---
# Part 1

<a id="import"></a>

## Step 2. Import Cell Types data 🧠
<font size=4>Now that we have the module that we need, let's import a raw sweep of the data. The cell below will grab the data for the same (or similar) experiment you just looked at on the website. This depends on whether the cell ID you chose for the worksheet is the same as the one you are given below. Don't worry if they are different, please use the one given below. This data is in the form of a [**Neurodata Without Borders** (NWB)](https://www.nwb.org/) file.

In [None]:
#@title Task
from IPython.display import HTML

alert_info = '''
<div style= "font-size: 20px"; class="alert alert-info" role="alert">
  <h4 class="alert-heading">Task</h4>
Run the cell below with the ID that is given for you. Notice that this is also the first one that appears when you open the Allen Institute Website, and it is also located in the URL.<br><br>
<b>Note:</b> This might take a minute or two. Please wait until you have a green check mark in the upper right of the Colab window to continue.
</div>
'''

display(HTML('<link href="https://nbviewer.org/static/build/styles.css" rel="stylesheet">'))
display(HTML(alert_info))

In [None]:
# Use cell_id below
cell_id = 474626527

# Get the electrophysiology (ephys) data for that cell
data = ctc.get_ephys_data(cell_id)
print('Data retrieved')

---

<font size=4>Thankfully, our NWB file has some built-in **methods** to enable us to pull out a recording sweep. We can access methods of objects like our `data` object by adding a period, and then the method. That's what we're doing below, with `data.get_sweep()`.
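
<font size=4>If the "object, period, method" pattern feels abstract, here is a minimal sketch using a made-up class. `ToyDataSet` and its values are hypothetical stand-ins invented for this example; this is not the AllenSDK class, only something with the same calling pattern.

```python
class ToyDataSet:
    """Hypothetical stand-in for a data object (not the AllenSDK class)."""

    def __init__(self, sweeps):
        # sweeps maps a sweep number to a dictionary of recorded values
        self.sweeps = sweeps

    def get_sweep(self, sweep_number):
        # Look up one sweep by its number
        return self.sweeps[sweep_number]


# Made-up values, only for illustration
toy_data = ToyDataSet({
    1: {"sampling_rate": 50000.0},
    2: {"sampling_rate": 200000.0},
})

# The object, then a period, then the method name and its argument
toy_sweep = toy_data.get_sweep(2)
print(toy_sweep["sampling_rate"])
```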

In [None]:
#@title Question 1 (1 point)
from IPython.display import HTML

alert_info = '''
<div style= "font-size: 20px"; class="alert alert-info" role="alert">
  <h4 class="alert-heading">Question 1 (1 point)</h4>
Choose your favorite sweep number below.<br>
<b>Hint:</b> Go back to the website to see what the sweep numbers are. <br><br>
<i>Note:</i> If you get an <code>H5pyDeprecationWarning</code>, don't worry about it - this is out of our control.
</div>
'''

display(HTML('<link href="https://nbviewer.org/static/build/styles.css" rel="stylesheet">'))
display(HTML(alert_info))

In [None]:
## Edit this ---
# Assign your favorite sweep number to a variable "sweep_number" below.
sweep_number = ...

sweep_data = data.get_sweep(sweep_number)
print('Sweep obtained')

In [None]:
#@title Task
from IPython.display import HTML

alert_info = '''
<div style= "font-size: 20px"; class="alert alert-info" role="alert">
  <h4 class="alert-heading">Task</h4>
Run the cell below and review the data. Notice the format is one we've only briefly touched on in this class: a dictionary.
</div>
'''

display(HTML('<link href="https://nbviewer.org/static/build/styles.css" rel="stylesheet">'))
display(HTML(alert_info))

In [None]:
# Take a look at the data - this looks a little different than the data we've previously worked with in tables and arrays
# This is another format of data in Python
sweep_data

In [None]:
# Notice it's in the format 'dict' with {key:value} pairs
type(sweep_data)

<font size=4>Refer back to Lab10 if you need a refresher on this new type in Python or google any questions you have :)
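
<font size=4>As a quick refresher, here is a small sketch of how a dictionary is built and read. The numbers are made up; only the key names (`stimulus`, `response`, `sampling_rate`) mirror the ones used later in this notebook.

```python
# Made-up values that mimic the keys used later in this notebook
toy_sweep = {
    "stimulus": [0.0, 1e-10, 1e-10, 0.0],           # amps
    "response": [-0.070, -0.065, -0.060, -0.070],   # volts
    "sampling_rate": 4.0,                           # Hz
}

# List every key stored in the dictionary
print(list(toy_sweep.keys()))

# Use a key inside square brackets to get its value
first_response = toy_sweep["response"][0]
print(first_response)

# Loop over {key: value} pairs and show what type each value is
for key, value in toy_sweep.items():
    print(key, type(value).__name__)
```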

---
<a id="plotsweep"></a>
## Step 3. Plot a raw sweep of data 📉
<font size=4>So far, we:

*   <font size=4>loaded the data ⌛
*   <font size=4>chose a cell ID 🆔
*   <font size=4>chose a sweep number 🧹

<font size=4>Now, let's plot that data. 📈
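
<font size=4>Before plotting, the raw traces are converted to convenient units (amps to pA, volts to mV) and given a time axis built from the sampling rate. Here is a minimal sketch of that arithmetic on a tiny made-up recording; every `toy_` name and value is invented for illustration.

```python
import numpy as np

# Made-up recording: 5 samples collected at 10 samples per second
toy_stimulus_amps = np.array([0.0, 5e-11, 5e-11, 0.0, 0.0])
toy_response_volts = np.array([-0.070, -0.068, -0.066, -0.069, -0.070])
toy_sampling_rate = 10.0  # Hz

# 1 A = 1e12 pA, so 5e-11 A becomes 50 pA
toy_stimulus_pa = toy_stimulus_amps * 1e12

# 1 V = 1e3 mV, so -0.070 V becomes -70 mV
toy_response_mv = toy_response_volts * 1e3

# Sample number divided by samples-per-second gives time in seconds
toy_timestamps = np.arange(len(toy_response_mv)) / toy_sampling_rate
print(toy_timestamps)
```

<font size=4>With 5 samples at 10 Hz, the time axis runs from 0 s to 0.4 s in steps of 0.1 s.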

In [None]:
#@title Task
from IPython.display import HTML

alert_info = '''
<div style= "font-size: 20px"; class="alert alert-info" role="alert">
  <h4 class="alert-heading">Task</h4>
Run the cell below to get the stimulus and recorded response information from the dataset.
</div>
'''

display(HTML('<link href="https://nbviewer.org/static/build/styles.css" rel="stylesheet">'))
display(HTML(alert_info))

In [None]:
# Get the stimulus trace (in amps) and convert to pA
stim_current = sweep_data['stimulus'] * 1e12

# Get the voltage trace (in volts) and convert to mV
response_voltage = sweep_data['response'] * 1e3

# Get the sampling rate and create a time axis for our data
sampling_rate = sweep_data['sampling_rate'] # in Hz
timestamps = (np.arange(0, len(response_voltage)) * (1.0 / sampling_rate))

In [None]:
#@title Question 2 (4 points)
from IPython.display import HTML

alert_info = '''
<div style= "font-size: 20px"; class="alert alert-info" role="alert">
  <h4 class="alert-heading">Question 2 (4 points)</h4>
Now we want to plot our voltage trace. We will use <code>plt.plot(x,y)</code>.<br>
<br>
<ul>
    <li>You will need two arguments to generate the plot, which are variables we created above: <code>timestamps</code> (x-axis) and <code>response_voltage</code> (y).</li>
    <li>Without changing the limits on the x-axis, you won't be able to see individual action potentials.</li>
    <li>Modify the x-axis using <code>plt.xlim([min,max])</code> to specify the limits by replacing <code>min</code> and <code>max</code> with numbers that make sense for this x-axis.</li>
    <li>Add correct labels using <code>plt.xlabel</code> and <code>plt.ylabel</code></li>
    <li>Add a <code>title</code> that includes text about what you are plotting</li>
</ul>
</div>
'''

display(HTML('<link href="https://nbviewer.org/static/build/styles.css" rel="stylesheet">'))
display(HTML(alert_info))

<font size=4>Here is the documentation for `matplotlib.pyplot`: https://matplotlib.org/stable/tutorials/pyplot.html#sphx-glr-tutorials-pyplot-py

In [None]:
## Edit this ---
# Plot the raw recording here
plt.plot(x, y)                # change these to correct variable names
plt.xlabel(...)               # add the string to describe the x-axis values
plt.ylabel(...)               # add the string to describe the y-axis values
plt.title(...)                # Add a descriptive title

# run this cell to see the plot. Then change the code below and run again to adjust the x-axis
# plt.xlim(...,...)

<font size=4>Below is an example plotting both the stimulus and the response.

In [None]:
#@title Task
from IPython.display import HTML

alert_info = '''
<div style= "font-size: 20px"; class="alert alert-info" role="alert">
Just run this cell to review the output.
</div>
'''

display(HTML('<link href="https://nbviewer.org/static/build/styles.css" rel="stylesheet">'))
display(HTML(alert_info))

In [None]:
# More complex- plot both stim and resp
# Set up our plot- general layout parameters
fig, axes = plt.subplots(2, 1, sharex=True,figsize=(8,6))

# axes 0 is our first plot, of the recorded voltage data
axes[0].plot(timestamps, response_voltage, color='blue')    # code to actually plot
axes[0].set_ylabel('mV')                                    # make sure to label our y-axis values
axes[0].set_xlim(0,3)                                       # determines the scaling of the x-axis
axes[0].set_title('whole-cell patch recording')             # make sure to set a title for what we are looking at

# axes 1 is our second plot, of the stimulus trace
axes[1].plot(timestamps, stim_current, color='gray')        # code to actually plot
axes[1].set_ylabel('pA')                                    # make sure to label our y-axis values- different units from above!
axes[1].set_xlabel('seconds')                               # label here and not above because they are shared axis values. It will look best at bottom and not cluttered
axes[1].set_title('stimulus')                               # set the title- it's not the same as the first plot!

plt.show()

<font size=4>Review this combined plot. Does the stimulus shape match the sweep number you selected? For example, the sweep numbers vary and are associated with the 'stimulus type'. The options included:

*   <font size=4>Long square
*   <font size=4>Noise 1
*   <font size=4>Noise 2
*   <font size=4>Short square

![](https://drive.google.com/uc?export=view&id=1Av1UTB1tr2v9gciQ4FesozCicKI8CDD1)



In [None]:
#@title Task
from IPython.display import HTML

alert_info = '''
<div style= "font-size: 20px"; class="alert alert-info" role="alert">
  <h4 class="alert-heading">Task</h4>
In the cells below, select a <b>different sweep number</b> from <b>a different stimulus type</b> and plot below to compare.
</div>
'''

display(HTML('<link href="https://nbviewer.org/static/build/styles.css" rel="stylesheet">'))
display(HTML(alert_info))

<font size=4>You can refer back to the Allen Cell Types Dataset website [here](https://celltypes.brain-map.org/experiment/electrophysiology/474626527). <br>
Make sure to check the cell ID so it is consistent with what you input to the notebook at the beginning! <br>
You can 'Browse Electrophysiology Data' to review the sweeps and also change the stimulus type to see what your plot should look like at the end of the steps below!

In [None]:
#@title Question 3 (10 points)
from IPython.display import HTML

alert_info = '''
<div style= "font-size: 20px"; class="alert alert-info" role="alert">
  <h4 class="alert-heading">Question 3 (10 points)</h4>
We will repeat the process above. You can copy and paste the same code, but make sure to <b>rename</b> your variables so the names do not overwrite your original. <br>
#Comments provided below - replace the <code>...</code> with the correct information.
</div>
'''

display(HTML('<link href="https://nbviewer.org/static/build/styles.css" rel="stylesheet">'))
display(HTML(alert_info))

<font size=5>**Step 3.1**: Below, create a new variable to hold the new sweep number so it doesn't overwrite your original choice. <font size=4>(**2 points**)

In [None]:
## Edit this ---
# Assign your favorite sweep number to a variable "a_new_sweep_number" below.
a_new_sweep_number = ...

a_new_sweep_data = data.get_sweep(...)                    # insert the correct variable name here
print('Another sweep obtained')

<font size=5>**Step 3.2**: Get the information needed about this sweep for plotting. Create new variables so we don't overwrite the original data. <font size=4>(**4 points**)

In [None]:
## Edit this ---
# Get the stimulus trace (in amps) and convert to pA
new_stim_current = a_new_sweep_data['...'] * 1e12          # insert the dictionary key for the stimulus trace

# Get the voltage trace (in volts) and convert to mV
new_response_voltage = a_new_sweep_data['...'] * 1e3       # insert the dictionary key for the response trace. same pattern as the line right above, except we select 'response' instead of 'stimulus'

# Get the sampling rate and create a time axis for our data
new_sampling_rate = ...['sampling_rate']                   # insert the variable name for our new data. this value has units of Hertz (Hz)
new_timestamps = (np.arange(0, len(...)) * (1.0 / ...))    # insert variables we recently used in the two lines right before this. if you're unsure, look back up at the original code, but use the modified names

<font size=5>**Step 3.3**: Adapt your plotting code from above to plot the new sweep number data you selected. Adjust the x-axis as needed to see the data here. <font size=4>(**4 points**)

In [None]:
## Edit this ---

# Plot both stim and resp
#Set up our plot
fig, axes = plt.subplots(2, 1, sharex=True,figsize=(8,6))

# axes 0 is our first plot, of the recorded voltage data
axes[0].plot(..., ..., color='magenta')                       # insert new variable names
axes[0].set_ylabel(...)
axes[0].set_xlim(...,...)                                     # adjust as needed to set the scaling of the x-axis
axes[0].set_title(...)

# axes 1 is our second plot, of the stimulus trace
axes[1].plot(..., ..., color='black')                         # insert new variable names
axes[1].set_ylabel(...)
axes[1].set_xlabel(...)
axes[1].set_title(...)

plt.show()

<font size=4>Your plot should look different from the first one you created above. If it is exactly the same, then you need to check that you changed the correct variables in selecting a new sweep and the plotting code.

<font size=4>Does your plot match what you saw on the website? Check that your x-axis limits (`set_xlim`) allow you to see enough of the plot. If it doesn't, go back and change those numbers to expand your axis values.
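
<font size=4>If you are unsure which x-axis limits to try, one option is to look for where the stimulus is switched on. The sketch below does this on a made-up stimulus; the `toy_` names, the 0.2 s padding, and the values are all assumptions for illustration. Real sweeps may contain more than one nonzero segment, so treat the result as a starting point and compare it with the website.

```python
import numpy as np

# Made-up stimulus: 3 seconds sampled at 100 Hz, switched on for the middle second
toy_sampling_rate = 100.0
toy_timestamps = np.arange(300) / toy_sampling_rate
toy_stimulus = np.zeros(300)
toy_stimulus[100:200] = 50.0  # pA

# Indices of every sample where the stimulus is not zero
active = np.flatnonzero(toy_stimulus != 0)

# Times of the first and last nonzero samples
stim_start = toy_timestamps[active[0]]
stim_end = toy_timestamps[active[-1]]

# Add a little padding on each side so the response is easy to see
padding = 0.2  # seconds
suggested_xlim = (max(stim_start - padding, 0.0), stim_end + padding)
print(suggested_xlim)
```

<font size=4>The two numbers in `suggested_xlim` could then be passed to `set_xlim` as the minimum and maximum.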

---
<a id="morphology"></a>
## Step 4. Plot the morphology of the cell ❄
<font size=4>The Cell Types Database also contains **3D reconstructions** of neuronal morphologies. Here, we'll plot the reconstruction of our cell's morphology. We took a look at these already when interacting with the website and completing the Data Sheet.

<font size=4>We will now use code to produce plots!

<font size=4>*Note*: It could take up to several minutes to run the cell below, possibly longer over a slow internet connection.
Try it out. If it doesn't work for you, that is okay- it is a less fancy version of what is already on the website!

In [None]:
#@title Task
from IPython.display import HTML

alert_info = '''
<div style= "font-size: 20px"; class="alert alert-info" role="alert">
  <h4 class="alert-heading">Task</h4>
Just run this cell and review the plot.
</div>
'''

display(HTML('<link href="https://nbviewer.org/static/build/styles.css" rel="stylesheet">'))
display(HTML(alert_info))

In [None]:
# Import necessary toolbox
from allensdk.core.swc import Marker

# Download and open morphology and marker files
morphology = ctc.get_reconstruction(cell_id)
markers = ctc.get_reconstruction_markers(cell_id)

# Set up our plot
fig, axes = plt.subplots(1, 2, sharey=True, sharex=True, figsize=(10,10))
axes[0].set_aspect('equal')
axes[1].set_aspect('equal')

# Make a line drawing of x-y and y-z views
for n in morphology.compartment_list:
    for c in morphology.children_of(n):
        axes[0].plot([n['x'], c['x']], [n['y'], c['y']], color='black')
        axes[1].plot([n['z'], c['z']], [n['y'], c['y']], color='black')

# cut dendrite markers
dm = [ m for m in markers if m['name'] == Marker.CUT_DENDRITE ]
axes[0].scatter([m['x'] for m in dm], [m['y'] for m in dm], color='#3333ff') # blue
axes[1].scatter([m['z'] for m in dm], [m['y'] for m in dm], color='#3333ff')

# no reconstruction markers
nm = [ m for m in markers if m['name'] == Marker.NO_RECONSTRUCTION ]
axes[0].scatter([m['x'] for m in nm], [m['y'] for m in nm], color='#333333') # grey
axes[1].scatter([m['z'] for m in nm], [m['y'] for m in nm], color='#333333')

axes[0].set_ylabel('y')
axes[0].set_xlabel('x')
axes[1].set_xlabel('z')
plt.show()

<font size=4>Notes on the plot above:<br>
* <font size=4>We used the marker file, which contained information about the reconstruction.
* <font size=4>The blue circles  🔵 are locations where dendrites have been truncated due to slicing. Grey circles mark locations where part of the cell (for example, an axon) was not reconstructed.
* <font size=4>Depending on the cell, these markers may not appear in the image at all if the reconstruction contains none of them!
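
<font size=4>The line drawing above works by connecting each compartment to its children, one short segment at a time. Here is a toy sketch of that idea with three made-up compartments stored in a plain dictionary. This structure is a hypothetical simplification, not the AllenSDK morphology object.

```python
import math

# Hypothetical compartments: an id mapped to 3D coordinates and child ids
toy_compartments = {
    0: {"x": 0.0, "y": 0.0, "z": 0.0, "children": [1, 2]},
    1: {"x": 3.0, "y": 4.0, "z": 0.0, "children": []},
    2: {"x": 0.0, "y": 6.0, "z": 8.0, "children": []},
}

# Pair every compartment with each of its children, as the plotting loop does
segments = []
for parent in toy_compartments.values():
    for child_id in parent["children"]:
        child = toy_compartments[child_id]
        segments.append((parent, child))

# Add up the straight-line length of every parent-to-child segment
total_length = sum(
    math.dist((p["x"], p["y"], p["z"]), (c["x"], c["y"], c["z"]))
    for p, c in segments
)
print(len(segments), total_length)
```

<font size=4>Each pair in `segments` corresponds to one black line in a plot like the one above.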

<font size=4>Great job! You have reached the checkpoint. Please save your work and submit your notebook with code completed up until this point.

---
## Checkpoint Submission ✅
<font size=4>Please submit the above sections of your completed notebook for Part 1 as a checkpoint to the project. Follow the instructions below.

### **Important submission steps:**
1. <font size=4>Choose **Save** (and make sure you've already saved a copy in your drive) from the **File** menu.
2. <font size=4>You will make sure your notebook file is saved in the following steps.
3. <font size=4>You will submit the notebook for this assignment to the corresponding Assignment on the WebCampus (Canvas) course website.

<font size=4>**It is your responsibility to make sure your work is saved before following the instructions in the last cell.**



### **Submission** 📩

<font size=4>Make sure you have run all cells in your notebook in order before running the cell below, so that all images/graphs appear in the output.
**Please save (or check again) before exporting!**
You will save the notebook file (.ipynb):


1.  <font size=4>Go to `"File > Download"` and choose the **.ipynb format** (first option)
  - <font size=4>This will save a copy of the python notebook file- extension .ipynb- in the Downloads folder on your computer (or wherever you have opted to save files)


2. <font size=4>If the above option is not available to you, make sure to use ctrl + s on a pc (press both keys at same time, do not include the + sign) or command + s (press both keys at same time, do not include the + sign) for apple devices. Look at the top of the Menu in google colab, and toward the middle, it might say that changes were saved.
  * <font size=4>If you want to check that things were saved recently, go to your Google drive (via an online browser or from the app) and check the timestamp for when your notebook was last updated. If it wasn't saved recently, go back to the tab where you have your notebook open and resave.
  * <font size=4>The notebook file `"Copy of Project2.ipynb"` will be in your google drive under the `"Colab Notebooks"` folder. (see info at top for more on where things get saved)

<hr>

# Part 2

<font size = 4>When you are ready to start this part of the notebook, you will need to rerun the steps in the initial [Set up our coding environment](#setup) section before you start here. <br> You do not need to repeat Part 1 though if you've already completed it and are re-opening/returning to your notebook. <br>Rerunning the import of the allensdk and packages you need should be sufficient to allow you to jump to this point and start 🚀

<a id="metrics"></a>
## Step 5. Analyze pre-computed features 💠

<font size=4> The Cell Types Database contains a set of features that have already been computed, which could serve as good starting points for analysis. We can query the database to get these features. Below, we'll use the `pandas` package that we imported above to create a **[dataframe](https://pandas.pydata.org/pandas-docs/stable/user_guide/dsintro.html#dataframe)** of our data. We practiced working with `pandas` during lab time for the course the other week.
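
<font size=4>As a small preview of the pattern, here is a sketch that turns a list of dictionaries into a dataframe indexed by `specimen_id`. The three records and their numbers are made up for illustration; only the column names are borrowed from the dataset.

```python
import pandas as pd

# Made-up records: one dictionary per cell
toy_features = [
    {"specimen_id": 101, "adaptation": 0.02, "avg_isi": 95.0},
    {"specimen_id": 102, "adaptation": 0.10, "avg_isi": 40.5},
    {"specimen_id": 103, "adaptation": 0.00, "avg_isi": 130.2},
]

# Each dictionary becomes a row. drop=False keeps specimen_id as a regular
# column in addition to using it as the row index.
toy_dataframe = pd.DataFrame(toy_features).set_index("specimen_id", drop=False)

# Look up a single value by row label and column name
print(toy_dataframe.loc[102, "avg_isi"])
```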

In [None]:
#@title Task
from IPython.display import HTML

alert_info = '''
<div style= "font-size: 20px"; class="alert alert-info" role="alert">
  <h4 class="alert-heading">Task</h4>
Run the cell below. Note we have a new method to access the ephys features called <code>ctc.get_ephys_features()</code>.<br>
Scroll to the right to see many of the different features available in this dataset.
<br><br>
<b>Note:</b> It may take ~10 seconds. The number of cells with available features will be printed, followed by a dataframe, which looks like the tables you've worked with before.
</div>
'''

display(HTML('<link href="https://nbviewer.org/static/build/styles.css" rel="stylesheet">'))
display(HTML(alert_info))

In [None]:
# Download all electrophysiology features for all cells using `.get_ephys_features`
ephys_features = ctc.get_ephys_features()
dataframe = pd.DataFrame(ephys_features).set_index('specimen_id', drop=False)   # this is the main dataset (dataframe in pandas) we will work with

print('Ephys features available for %d cells:' % len(dataframe))
dataframe.head()                                                                # Just show the first 5 rows (the head) of our dataframe

<font size=4> What if we want to see **all** columns in our dataframe? Notice there are 56 columns, and not all of them are shown. This is indicated by the `...` in the middle of the table column labels.

In [None]:
#@title Question 4 (1 point)
from IPython.display import HTML

alert_info = '''
<div style= "font-size: 20px"; class="alert alert-info" role="alert">
  <h4 class="alert-heading">Question 4 (1 point)</h4>
Fill in the code to show all of the columns in the dataframe. There are multiple ways to do this! Try googling how to do this with a dataframe in <code>pandas</code>, or you can go directly to the pandas documentation and search.<br>
We've also done this in previous lab and hw notebooks! <br>
<br>
<b>Hint</b>: I'm not looking for a specific format here. They can be a list, an object, etc. The output should show all the names. You can refer back to Lab10 for an example.</div>
'''

display(HTML('<link href="https://nbviewer.org/static/build/styles.css" rel="stylesheet">'))
display(HTML(alert_info))

In [None]:
## Edit this ---
# Viewing *all* column labels from our dataframe
...

---
## Available data includes:

### Electrophysiological features
*  <font size=4>The `adaptation` is the rate at which firing speeds up or slows down during a stimulus
*  <font size=4>The `avg_isi` is the mean value of all interspike intervals in a sweep
*  <font size=4>The `electrode_0_pa` is the pipette offset
*  <font size=4>The `f_i_curve_slope` is the slope of the curve between firing rate (f) and current injected (i)
<br><font size=4><b>fast_trough</b>
  * <font size=4>the timestamp (<code>_t</code>) or voltage (<code>_v</code>) of the trough within the interval 5 ms after the peak in response to the stimulus
*  <font size=4>The `fast_trough_t_long_square` is the fast trough timestamp for the long square stimulus
*  <font size=4>The `fast_trough_t_ramp` is the fast trough timestamp for the ramp stimulus
*  The `fast_trough_t_short_square` is the fast trough timestamp for the short square stimulus
*  The `fast_trough_v_long_square` is the minimum voltage value of the membrane potential in the interval lasting 5 ms after the peak for the long square stimulus
*  The `fast_trough_v_ramp` is the minimum voltage value of the membrane potential in the interval lasting 5 ms after the peak for the ramp stimulus
*  The `fast_trough_v_short_square` is the minimum voltage value of the membrane potential in the interval lasting 5 ms after the peak for the short square stimulus
*  The `has_burst` is a boolean (True/False). A spike train was defined as having a burst if the first two ISIs were both less than or equal to 5 ms
*  The `has_delay` is a boolean (True/False). A spike train was defined as having a delayed start to firing if the latency was greater than the average ISI
*  The `has_pause` is a boolean (True/False). A spike train was defined as having a pause if any ISI was more than 3 times the duration of the ISIs immediately before and after it (i.e., at least two spikes on average were 'skipped')
*  The `id` is the internal id for the cell. Can be used to cross-reference with morphology data or web viewer
*  The `input_resistance_mohm` is the input resistance of the cell in megaohms (MΩ)
*  The `latency` is the time from the stimulus onset to the threshold of the first spike
<br><font size=4><b>peak</b>
  * <font size=4>the timestamp (<code>_t</code>) or voltage (<code>_v</code>) of the maximum value of the membrane potential during the action potential (i.e. between the action potential's threshold and the time of the next action potential, or the end of the response)
*  The `peak_t_long_square` is the timestamp of the maximum value of the membrane potential during the action potential (i.e. between the action potential's threshold and the time of the next action potential, or the end of the response) in response to a long square stimulus
*  The `peak_t_ramp` is the timestamp of the maximum value of the membrane potential during the action potential (i.e. between the action potential's threshold and the time of the next action potential, or the end of the response) in response to a ramp stimulus
*  The `peak_t_short_square` is the timestamp of the maximum value of the membrane potential during the action potential (i.e. between the action potential's threshold and the time of the next action potential, or the end of the response) in response to a short square stimulus
*  The `peak_v_long_square` is the voltage of the maximum value of the membrane potential during the action potential (i.e. between the action potential's threshold and the time of the next action potential, or the end of the response) in response to a long square stimulus
*  The `peak_v_ramp` is the voltage of the maximum value of the membrane potential during the action potential (i.e. between the action potential's threshold and the time of the next action potential, or the end of the response) in response to a ramp stimulus
*  The `peak_v_short_square` is the voltage of the maximum value of the membrane potential during the action potential (i.e. between the action potential's threshold and the time of the next action potential, or the end of the response) in response to a short square stimulus <br>
<b>rheobase:</b> the minimum current needed to elicit an action potential. When the current is below the rheobase, an action potential will never occur regardless of how long the stimulation is.
*  The `rheobase_sweep_id` is the sweep ID corresponding to the rheobase current
*  The `rheobase_sweep_number` is the sweep number corresponding to the rheobase current
*  The `ri` is the input resistance
*  The `sag` is a measurement of the return to a steady state divided by the peak deflection (see figure in Ephys White Paper for more details)
*  The `seal_gohm` is the measurement of the gigaohm seal obtained in whole-cell patch clamp configuration
<br><font size=4><b>slow_trough</b>
  * <font size=4>the timestamp (<code>_t</code>) or voltage (<code>_v</code>) of the minimum value of the membrane potential in the interval between the peak and the time of the next action potential in response to the stimulus. If the time between the peak and the next action potential was less than 5 ms, this value was identical to the fast trough
*  The `slow_trough_t_long_square` is the timestamp of the minimum value of the membrane potential in the interval between the peak and the time of the next action potential in response to the long square stimulus
*  The `slow_trough_t_ramp` is the timestamp of the minimum value of the membrane potential in the interval between the peak and the time of the next action potential in response to the ramp stimulus
*  The `slow_trough_t_short_square` is the timestamp of the minimum value of the membrane potential in the interval between the peak and the time of the next action potential in response to the short square stimulus
*  The `slow_trough_v_long_square` is the voltage of the minimum value of the membrane potential in the interval between the peak and the time of the next action potential in response to the long square stimulus
*  The `slow_trough_v_ramp` is the voltage of the minimum value of the membrane potential in the interval between the peak and the time of the next action potential in response to the ramp stimulus
*  The `slow_trough_v_short_square` is the voltage of the minimum value of the membrane potential in the interval between the peak and the time of the next action potential in response to the short square stimulus
*  The `specimen_id` is a unique identifier for the cell (note that an individual can contribute multiple cells; if that's the case, the `specimen_id` is different across rows, but `donor_id` is the same)
*  The `tau` is the time constant of the membrane in milliseconds
<br><font size=4><b>threshold</b>
  * <font size=4>the current (<code>_i</code>) or timestamp (<code>_t</code>) or voltage (<code>_v</code>) of the threshold current necessary depending on the stimulus
*  The `threshold_i_long_square` is the current (pA) needed to elicit an action potential (threshold) when a 1 s long square current is injected
*  The `threshold_i_ramp` is the current to elicit threshold for the ramp stimulus
*  The `threshold_i_short_square` is the current to elicit threshold for the short square stimulus
*  The `threshold_t_long_square` is the timestamp of the threshold from a long square stimulus
*  The `threshold_t_ramp` is the timestamp of the threshold from a ramp stimulus
*  The `threshold_t_short_square` is the timestamp of the threshold from a short square stimulus
*  The `threshold_v_long_square` is the voltage of the threshold from a long square stimulus
*  The `threshold_v_ramp` is the voltage of the threshold from a ramp stimulus
*  The `threshold_v_short_square` is the voltage of the threshold from a short square stimulus
*  The `thumbnail_sweep_id` is the sweep ID chosen for a thumbnail image
<br><font size=4><b>trough</b>
  * <font size=4>the timestamp (<code>_t</code>) or voltage (<code>_v</code>) of the minimum value of the membrane potential during the after-hyperpolarization
*  The `trough_t_long_square` is the timestamp of the minimum value of the membrane potential during the after-hyperpolarization for the long square stimulus
*  The `trough_t_ramp` is the timestamp of the minimum value of the membrane potential during the after-hyperpolarization for the ramp stimulus
*  The `trough_t_short_square` is the timestamp of the minimum value of the membrane potential during the after-hyperpolarization for the short square stimulus
*  The `trough_v_long_square` is the minimum value of the membrane potential during the after-hyperpolarization for the long square stimulus
*  The `trough_v_ramp` is the minimum value of the membrane potential during the after-hyperpolarization for the ramp stimulus
*  The `trough_v_short_square` is the minimum value of the membrane potential during the after-hyperpolarization for the short square stimulus
<br><font size=4><b>upstroke_downstroke_ratio</b>
    * <font size=4>is the ratio between the absolute values of the action potential peak upstroke and the action potential peak downstroke
*  The `upstroke_downstroke_ratio_long_square` is the ratio for the long square stimulus
*  The `upstroke_downstroke_ratio_ramp` is the ratio for the ramp stimulus
*  The `upstroke_downstroke_ratio_short_square` is the ratio for the short square stimulus
*  The `vm_for_sag` is the peak deflection at which sag is measured (targeted at -100 mV, but not always exact)
*  The `vrest` is the resting membrane potential in mV
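
<font size=4>The `has_burst`, `has_delay`, and `has_pause` definitions above can be sketched in code. This is only an illustration of the written definitions, not the Allen Institute's implementation: the function name `spike_train_flags` and the spike times are hypothetical, and latency is approximated here as the time from stimulus onset to the first spike time (rather than to the spike threshold).

```python
import numpy as np

def spike_train_flags(spike_times_s, stimulus_onset_s):
    """Apply the burst/delay/pause definitions to a list of spike times (seconds)."""
    spike_times = np.asarray(spike_times_s, dtype=float)
    isis = np.diff(spike_times)                  # interspike intervals (ISIs)
    latency = spike_times[0] - stimulus_onset_s  # simplified latency (assumption)

    # Burst: the first two ISIs are both less than or equal to 5 ms
    has_burst = len(isis) >= 2 and bool(np.all(isis[:2] <= 0.005))

    # Delay: the latency is greater than the average ISI
    has_delay = len(isis) >= 1 and bool(latency > isis.mean())

    # Pause: some ISI is more than 3 times the ISIs right before and after it
    has_pause = False
    for k in range(1, len(isis) - 1):
        if isis[k] > 3 * isis[k - 1] and isis[k] > 3 * isis[k + 1]:
            has_pause = True

    return {'has_burst': has_burst, 'has_delay': has_delay, 'has_pause': has_pause}

# Two made-up spike trains, both with the stimulus starting at 0.1 s
bursty_flags = spike_train_flags([0.102, 0.105, 0.109, 0.150, 0.190], 0.1)
paused_flags = spike_train_flags([0.50, 0.52, 0.54, 0.64, 0.66], 0.1)
print(bursty_flags)
print(paused_flags)
```

<font size=4>The first toy train starts with two very short ISIs, so only the burst flag is set; the second starts late and contains one long gap between regularly spaced spikes, so the delay and pause flags are set instead.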

---

In [None]:
#@title Question 5 (3 points)
from IPython.display import HTML

alert_info = '''
<div style= "font-size: 20px"; class="alert alert-info" role="alert">
  <h4 class="alert-heading">Question 5 (3 points)</h4>
Demonstrate your understanding of the line of code below by writing a short description of what is happening in Python. You can always refer back to the lab or hw for similar examples.
<br><br>
<b>Hint</b>: Try starting with the part in the blue brackets first: <code>dataframe['specimen_id'] == cell_id</code>. State what that does and/or returns. What is the name of that type of variable that is returned with the <code>==</code> sign? How do those values then relate to what we get out from our <code>dataframe</code> as the <code>cell_ephys_features</code> that is printed out for you? Feel free to test it out with lines of code below to 'show your work'. <br>
Refer back to Lab10 if you need another example.
'''

display(HTML('<link href="https://nbviewer.org/static/build/styles.css" rel="stylesheet">'))
display(HTML(alert_info))

In [None]:
# Examine the data for our cell_id from above
cell_ephys_features = dataframe[dataframe['specimen_id'] == cell_id]
cell_ephys_features

<font size=4>**Make sure to include your written response OR work through it with code (or do both).**

*Type your response here*

In [None]:
# Add your code here if you would like
...

In [None]:
# Add your code here if you would like
...

---
<font size=4>As you can see in the dataframe above, there are many **pre-computed features** available in this dataset. [Here's a glossary](https://drive.google.com/file/d/1yBfYm1yMtFSFB2erhfZ0SpeeuoWJNMEk/view?usp=sharing), in case you're curious.

![](https://drive.google.com/uc?export=view&id=1bCj0kl4Dd5J_Qf2DlmxRUva57yK5xVf_)


<font size=4>Image from the <a href="https://community.brain-map.org/t/documentation-cell-types-database/2845">Allen Institute Cell Types Database Technical Whitepaper: Electrophysiology.</a>
<br><br>

<font size=4>Let's first look at the **depth of the fast trough**, and the **ratio between the upstroke and downstroke** of the action potential:
- <font size=4>**Action potential fast trough** (<code>fast_trough_v_long_square</code>): Minimum value of the membrane potential in the interval lasting 5 ms after the peak.
- <font size=4>**Upstroke/downstroke ratio** (<code>upstroke_downstroke_ratio_long_square</code>): The ratio between the absolute values of the action potential peak upstroke and the action potential peak downstroke.

<font size=4>We created a `pandas` dataframe above of all of these features. Here, we'll assign the columns we're interested in to two different **variables**, so that they will contain the datapoints we're interested in. Remember, we can access different columns of the dataframe by using the syntax `dataframe['column_of_interest']`. The columns of interest here are `fast_trough_v_long_square` and `upstroke_downstroke_ratio_long_square`.

In [None]:
#@title Question 6 (2 points)
from IPython.display import HTML

alert_info = '''
<div style= "font-size: 20px"; class="alert alert-info" role="alert">
  <h4 class="alert-heading">Question 6 (2 points)</h4>
Edit the code where you see <code>...</code> and run the cell below to store these columns into our two new variables.
</div>
'''

display(HTML('<link href="https://nbviewer.org/static/build/styles.css" rel="stylesheet">'))
display(HTML(alert_info))

In [None]:
## Edit this ---
fast_trough = ...
upstroke_downstroke = ...

In [None]:
# Just run this cell to show the data we just extracted
fast_trough

<font size=4>Note that when working with dataframes, the index values (we have labeled them with our `specimen_id`) also appear when you select certain columns you want to work with. This is a good way to help you keep track of the data.
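
<font size=4>As a small illustration of the index travelling with a column, here is a sketch on a toy dataframe. The names `toy` and `toy_vrest` and all values are made up for this example.

```python
import pandas as pd

# Toy dataframe labeled by specimen_id (hypothetical values)
toy = pd.DataFrame(
    {'specimen_id': [101, 102, 103], 'vrest': [-70.1, -65.3, -72.8]}
).set_index('specimen_id', drop=False)

# Selecting one column returns a Series that keeps the same row labels
toy_vrest = toy['vrest']
print(toy_vrest)

# The labels let us look up the value for one specific cell
print(toy_vrest.loc[102])
```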

In [None]:
#@title Question 7 (3 points)
from IPython.display import HTML

alert_info = '''
<div style= "font-size: 20px"; class="alert alert-info" role="alert">
  <h4 class="alert-heading">Question 7 (3 points)</h4>
Create a scatterplot that plots the <b>fast trough</b> (x axis) versus the <b>upstroke-downstroke ratio</b> (y axis). Label your axes accordingly using <code>plt.xlabel()</code> and <code>plt.ylabel()</code>.
</div>
'''

display(HTML('<link href="https://nbviewer.org/static/build/styles.css" rel="stylesheet">'))
display(HTML(alert_info))

<font size=4><u>Hint</u>: If you need help, see the [`plt.scatter()` documentation](https://matplotlib.org/stable/api/_as_gen/matplotlib.pyplot.scatter.html), [`plt.xlabel` documentation](https://matplotlib.org/stable/api/_as_gen/matplotlib.pyplot.xlabel.html), and [`plt.ylabel` documentation](https://matplotlib.org/stable/api/_as_gen/matplotlib.pyplot.ylabel.html).

In [None]:
## Edit this ---
# Your scatterplot here
...

<font size=4>It looks like there may be roughly two clusters in the data above. Maybe they relate to whether the cells are presumably excitatory (spiny) cells or inhibitory (aspiny) cells. Let's query the API and split up the two sets to see.

In [None]:
#@title Task
from IPython.display import HTML

alert_info = '''
<div style= "font-size: 20px"; class="alert alert-info" role="alert">
  <h4 class="alert-heading">Task</h4>
Run the cells below. These will load new data containing the dendrite type of these cells using <code>.get_cells()</code>. The next cell then turns this into a dataframe we can view. Review the code comments and then run the cells.
</div>
'''

display(HTML('<link href="https://nbviewer.org/static/build/styles.css" rel="stylesheet">'))
display(HTML(alert_info))

In [None]:
# Get information about our cells' dendrites
# this is different info that we will need to incorporate into the dataframe we've already been working with
# we call the `Cell Types Cache (ctc)` and use  `.get_cells()`
cells = ctc.get_cells()


In [None]:
# Make this a dataframe so we can view the data
cells = pd.DataFrame(cells)
cells

<font size=4>You can see we have a lot of information here, but not as many columns as the previous data we were working with. Some of it isn't relevant right now, but notice that you worked with similar data in the previous homework. <br> <br>

---
### Available data include:

*  <font size=4>The `reporter_status` is the expression of the reporter (mouse only- positive or negative; human data is blank)
*  <font size=4>The `cell_soma_location` contains xyz coordinates for morphological reconstruction
*  <font size=4>The `species` variable indicates the species: mouse (mus musculus) 🐭 or human (homo sapiens) 🧑
*   <font size=4>The `id` is a unique identifier for the cell (note that an individual can contribute multiple cells; if that's the case, the `id` is different across rows, but `donor_id` is the same)
*   <font size=4>The `specimen_name` is a unique identifier
    * <font size=4>human cells start with 'H'
    * <font size=4>each id is unique, but `donor_id` is the same for cells from the same individual
*   <font size=4>The `structure_layer_name` indicates the cortical layer of the brain
*   <font size=4>The `structure_area_id` is a unique number for the brain structure
*   <font size=4>The `structure_area_abbrev` is a text abbreviation for the brain region
*   <font size=4>The `transgenic_line` indicates the cell types that were labeled with Cre-recombinase (genetic manipulation, only in mice)
*  <font size=4>The `dendrite_type` indicates the type of the dendritic spines: spiny, aspiny, or sparsely spiny
*  <font size=4>The `apical` indicates the extent of apical dendrite preservation (intact or truncated), if applicable (N/A)
*  <font size=4>The `reconstruction_type` indicates full or dendrite only, if applicable (empty if N/A)
*  <font size=4>The `disease_state` indicates disease state for human tissue: tumor or epilepsy
*  <font size=4>The `donor_id` is a unique ID given to the individual
*  <font size=4>The `specimen_hemisphere` indicates whether the sample is from the right or left side of the brain
*  <font size=4>The `normalized_depth` is the depth of the cell soma normalized between pia (0) and white matter (1)
---


<font size=4>What we need to do is to combine two dataframes to get all the info we need to plot. We will use `join` to do this! 💍
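
<font size=4>Here is a minimal sketch of how `join` lines up rows, using two toy dataframes. All names and values (`toy_features`, `toy_cells`, `toy_full`) are made up. By default, `join` matches the index of the left dataframe against the index of the right one and keeps every row of the left dataframe.

```python
import pandas as pd

# Left dataframe: indexed by specimen_id (hypothetical values)
toy_features = pd.DataFrame(
    {'specimen_id': [1, 2, 3], 'vrest': [-70.0, -65.0, -72.0]}
).set_index('specimen_id', drop=False)

# Right dataframe: the matching identifier lives in a column called 'id'
toy_cells = pd.DataFrame(
    {'id': [2, 1, 4], 'dendrite_type': ['aspiny', 'spiny', 'spiny']}
)

# Make 'id' the index of the right dataframe so the two indexes can be matched
toy_full = toy_features.join(toy_cells.set_index('id'))
print(toy_full)
```

<font size=4>The row order of the right dataframe doesn't matter, since rows are matched by label. Cell 3 has no match in `toy_cells`, so its `dendrite_type` is missing (`NaN`), and cell 4 is dropped because it isn't in the left dataframe.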

In [None]:
#@title Task
from IPython.display import HTML

alert_info = '''
<div style= "font-size: 20px"; class="alert alert-info" role="alert">
  <h4 class="alert-heading">Task</h4>
The cell below will <code>join</code> our dataframes. Review the comments, run the cell, and check out the columns in the dataframe.
</div>
'''

display(HTML('<link href="https://nbviewer.org/static/build/styles.css" rel="stylesheet">'))
display(HTML(alert_info))

In [None]:
# Use join to combine our dataframes
# dataframe is the original we've been working with so far
# cells is the new one we just imported and want to combine with
# setting the index of cells to be the 'id', indicates which column to join on
# id column from cells dataframe is equal to specimen_id column from dataframe
full_dataframe = dataframe.join(cells.set_index('id'))
full_dataframe

<font size=4>We now have even more columns of data to work with, but again, you can't see them all.

In [None]:
#@title Question 8 (1 point)
from IPython.display import HTML

alert_info = '''
<div style= "font-size: 20px"; class="alert alert-info" role="alert">
  <h4 class="alert-heading">Question 8 (1 point)</h4>
Repeat what you did above in <b>Step 5</b> to show all of the column names. Notice that the new columns got added to where the previous ones ended (after <code>vrest</code>). Refer back to Lab10 if you need an example of how to do this.
</div>
'''

display(HTML('<link href="https://nbviewer.org/static/build/styles.css" rel="stylesheet">'))
display(HTML(alert_info))

In [None]:
## Edit this ---
# Show all columns of our full_dataframe
...

<font size=4>Now it's time to plot the data again, but using the information we have about `dendrite_type` on our plot.

In [None]:
#@title Question 9 (2 points)
from IPython.display import HTML

alert_info = '''
<div style= "font-size: 20px"; class="alert alert-info" role="alert">
  <h4 class="alert-heading">Question 9 (2 points)</h4>
Create a similar scatterplot, but this time each dot will be colored by dendrite type. Insert the appropriate column name <code>insert_col_name</code> and cell type string from the column in the table <code>insert_name</code>. There should only be two types of dendrites. <br>
</div>
'''

display(HTML('<link href="https://nbviewer.org/static/build/styles.css" rel="stylesheet">'))
display(HTML(alert_info))

In [None]:
## Edit this ---
# Create a dataframe for spiny cells, and a dataframe for aspiny cells
spiny_df = full_dataframe[full_dataframe['<insert_col_name>'] == 'insert_name']           #**EDIT** change the names here accordingly
aspiny_df = full_dataframe[full_dataframe['<insert_col_name>'] == 'insert_name']          #**EDIT** change the names here accordingly

# Create our plot! Calling scatter twice like this will draw both of these on the same plot.
plt.scatter(spiny_df['fast_trough_v_long_square'],spiny_df['upstroke_downstroke_ratio_long_square'], c = '#d95f02') # orange
plt.scatter(aspiny_df['fast_trough_v_long_square'],aspiny_df['upstroke_downstroke_ratio_long_square'], c = '#7570b3') # purple

plt.ylabel('Upstroke-Downstroke Ratio')
plt.xlabel('Fast Trough Depth (mV)')
plt.legend(['Spiny','Aspiny'])

plt.show()

In [None]:
#@title Question 10 (3 points)
from IPython.display import HTML

alert_info = '''
<div style= "font-size: 20px"; class="alert alert-info" role="alert">
  <h4 class="alert-heading">Question 10 (3 points)</h4>
Demonstrate your understanding of the two lines of code above where we define <code>spiny_df</code> and <code>aspiny_df</code>.
Write a short description of what is happening in python, or add lines of code with comments to explain your logic. You can always refer back to labs and hw for similar examples.
<br><br>
<b>Hint</b>: This is similar to a previous question when we were looking for the row with a certain cell ID. In this case, we are taking a subset of the data and getting many values for each <code>dendrite_type</code>. Feel free to test it out with lines of code below to 'show your work'. Refer to previous questions that were similar, or back to Lab10 for another example.
'''

display(HTML('<link href="https://nbviewer.org/static/build/styles.css" rel="stylesheet">'))
display(HTML(alert_info))

<font size=4>**Make sure to include your written response OR work through it with code (or do both).**

*Type your response here*

In [None]:
# Add your code here if you would like
...

In [None]:
# Add your code here if you would like
...

<font size=4>Looks like these two clusters do partially relate to the dendritic type.

<font size=4>Cells with spiny dendrites (which are typically excitatory cells) have a big ratio of upstroke:downstroke, and a more shallow trough (less negative).

<font size=4>Cells with aspiny dendrites (typically inhibitory cells) are a little bit more varied. But <i>only</i> aspiny cells have a low upstroke:downstroke ratio and a deeper trough (more negative).

In [None]:
#@title Question 11 (2 points)
from IPython.display import HTML

alert_info = '''
<div style= "font-size: 20px"; class="alert alert-info" role="alert">
  <h4 class="alert-heading">Question 11 (2 points)</h4>
What else can we say about the aspiny dendrites compared to the spiny dendrites? Review the plot and describe where (location) you see purple (aspiny) versus orange (spiny).
'''

display(HTML('<link href="https://nbviewer.org/static/build/styles.css" rel="stylesheet">'))
display(HTML(alert_info))

*Type your answer here*

---
<a id="waveforms"></a>

## Step 6. Compare waveforms 🌀 🔍
<font size=4>Let's take a closer look at the action potentials of these cells to see what these features actually mean for the action potential waveform. We will choose one of the cells with the highest upstroke:downstroke ratio. Our first line of code, where it says [`dataframe.sort_values()`](https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.sort_values.html), is the line of code that will arrange our dataframe by the
**upstroke_downstroke_ratio_long_square** column.


In [None]:
#@title Question 12 (1 point)
from IPython.display import HTML

alert_info = '''
<div style= "font-size: 20px"; class="alert alert-info" role="alert">
  <h4 class="alert-heading">Question 12 (1 point)</h4>
To choose one of the cells with the highest <code>upstroke_downstroke_ratio_long_square</code>, we need a way to arrange our data based on this value. If we want the <b>highest values</b> at the top of this column in our dataframe, what method can we use to put them in this order?
<br><br>
<b>Hint</b>: We've done this many times with the <code>datascience</code> package, and the name of the method is very similar, but slightly different. Think about what steps you would go through if you wanted to find the <code>max</code> value in a column from your table (it's not <code>max</code>, though).
If you aren't sure how to order your data from highest to lowest in that column, review the previous lab notebook.

'''

display(HTML('<link href="https://nbviewer.org/static/build/styles.css" rel="stylesheet">'))
display(HTML(alert_info))

In [None]:
## Edit this---
# Fill in the code below at locations with `...` and <insert_method_here>
... = dataframe.<insert_method_here>('upstroke_downstroke_ratio_long_square',ascending=...)        #**EDIT** here accordingly
#^ for the name, you should look at the code in the next cell to determine what this name should be. It needs to match what is below so the next cell will run

In [None]:
## Don't change this cell- just run it

# Assign one of the top cells in our dataframe (default = 2)
specimen_id = sorted_dataframe.index[2]                     # the index holds the specimen IDs, and we want the third entry - remember this is the third entry because we index starting with 0

# Also get the ratio value so we can print the info together
ratio = sorted_dataframe.iloc[2]['upstroke_downstroke_ratio_long_square']

# Print our results so that we can see them
print('Specimen ID: ' + str(specimen_id) + ' with upstroke-downstroke ratio: ' + str(ratio))

<font size=4>Notice we've used several methods we've practiced many times with the `datascience` package, and now you know how to do these in `pandas` too!

In [None]:
#@title Question 13 (3 points)
from IPython.display import HTML

alert_info = '''
<div style= "font-size: 20px"; class="alert alert-info" role="alert">
  <h4 class="alert-heading">Question 13 (3 points)</h4>
Demonstrate your understanding of the line of code above that obtained the <code>ratio</code>.
<br><br>
<b>Hint</b>: Start with the first part of this line: <code> ratio = sorted_dataframe.iloc[2]</code>. What does this do? Test that part out in the cells below to show your work. Also describe in words what it does. After this, try out the remainder of the code using the <code>ratio</code> variable you just created and referencing the column name like it appears above <code>['upstroke_downstroke_ratio_long_square']</code>. Again, write briefly what this part does. Use the empty cells below to fill in your code and either write comments or type an answer into the textbox. <br>
Refer back to Lab10 for an example.

'''

display(HTML('<link href="https://nbviewer.org/static/build/styles.css" rel="stylesheet">'))
display(HTML(alert_info))

In [None]:
# Explanation and code part 1
...

In [None]:
# Explanation and code part 2
...

In [None]:
## Add any other cells or text boxes that you need!

<font size=4>*If you don't comment your code to explain what it does, write out your description of each step here*

<font size=4>Now we can take a closer look at the action potential for that cell by grabbing a raw sweep of recording from it, just like we did above. You will just use the sweep that is already setup for you.
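
<font size=4>Preparing a sweep for plotting mostly comes down to unit conversion and building a time axis from the sampling rate. Here is a minimal sketch on a made-up sweep. The dictionary `toy_sweep`, its values, and the 50 kHz sampling rate are assumptions for illustration only; the sketch assumes the stimulus is stored in amperes, the response in volts, and the sampling rate in samples per second.

```python
import numpy as np

# A made-up "sweep" with 5 samples (hypothetical values)
toy_sweep = {
    'stimulus': np.full(5, 1e-10),   # 1e-10 A at every sample
    'response': np.full(5, -0.07),   # -0.07 V at every sample
    'sampling_rate': 50000.0,        # samples per second (assumed)
}

# Convert to the units we usually plot
toy_current_pa = toy_sweep['stimulus'] * 1e12   # A -> pA
toy_voltage_mv = toy_sweep['response'] * 1e3    # V -> mV

# Each sample is 1/sampling_rate seconds after the previous one
toy_timestamps_s = np.arange(len(toy_voltage_mv)) / toy_sweep['sampling_rate']

print(toy_current_pa)
print(toy_voltage_mv)
print(toy_timestamps_s)
```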

In [None]:
#@title Task
from IPython.display import HTML

alert_info = '''
<div style= "font-size: 20px"; class="alert alert-info" role="alert">
  <h4 class="alert-heading">Task</h4>
Run the cell below. This may take a minute or so.<br><br>
<i>Note</i>: If you receive a 'H5pyDeprecationWarning', you can ignore it.

'''

display(HTML('<link href="https://nbviewer.org/static/build/styles.css" rel="stylesheet">'))
display(HTML(alert_info))

In [None]:
# Get the data for our specimen
upstroke_data = ctc.get_ephys_data(specimen_id)

# Get one sweep for our specimen (I've already handselected a gorgeous one for you, 45)
upstroke_sweep = upstroke_data.get_sweep(45)

# Get the current & voltage traces
current = upstroke_sweep['stimulus'] * 1e12 # in A, converted to pA
voltage = upstroke_sweep['response'] * 1e3 # converted to mV

# Get the time stamps for our voltage trace
timestamps = (np.arange(0, len(voltage)) * (1.0 / upstroke_sweep['sampling_rate']))

print('Sweep obtained')

In [None]:
#@title Question 14 (4 points)
from IPython.display import HTML

alert_info = '''
<div style= "font-size: 20px"; class="alert alert-info" role="alert">
  <h4 class="alert-heading">Question 14 (4 points)</h4>
Plot the sweep we obtained above. <br><br>
<b>Hint</b>: You'll want to use <code>plt.plot(x,y)</code> where <code>x</code> is the <code>timestamps</code> and <code>y</code> is the <code>voltage</code>.<br> You also may want to again change the x-axis limits using <code>plt.xlim</code>.<br> Be sure to give your plot accurate labels- including x and y values (and units) and a title.

'''

display(HTML('<link href="https://nbviewer.org/static/build/styles.css" rel="stylesheet">'))
display(HTML(alert_info))

In [None]:
## Edit this ---
# Plot the new sweep here
...                           # our solution was 4 lines of code

In [None]:
#@title Question 15 (6 points)
from IPython.display import HTML

alert_info = '''
<div style= "font-size: 20px"; class="alert alert-info" role="alert">
  <h4 class="alert-heading">Question 15 (6 points)</h4>
Generate a similar plot for a cell with a <b>low</b> upstroke:downstroke ratio. Fill in the code in the cells below where you see <code>...</code>. Similar to above, zoom in on the x-axis so that you can actually see the shape of the action potential waveform.<br><br>
<b>Hint</b>: You only need to change <b>one</b> value in all of the code in this step in order to make this change. How did we arrange our dataframe at first?

'''

display(HTML('<link href="https://nbviewer.org/static/build/styles.css" rel="stylesheet">'))
display(HTML(alert_info))

<font size=5>**Step 15.1:** Re-sort the dataframe <font size=4>(**1 point**)

In [None]:
## Edit this ---
## Part 1

# Sort the dataframe and reassign
low_sorted_dataframe = ...                         # #**EDIT** here accordingly  - use this name since the line below depends on this value

# Assign one of the top cells in our dataframe (default = 2) and the ratio to different variables
low_specimen_id = low_sorted_dataframe.index[2]
low_ratio = low_sorted_dataframe.iloc[2]['upstroke_downstroke_ratio_long_square']

# Print our results so that we can see them
print('Specimen ID: ' + str(low_specimen_id) + ' with upstroke-downstroke ratio: ' + str(low_ratio))

<font size=5>**Step 15.2:** Subset the data needed from the dataframe based on what was defined above. <font size=4>(**1 point**)

In [None]:
## Edit this ---
## Part 2

# Get the data for our specimen
low_upstroke_data = ...                            # #**EDIT** here accordingly - similar to the first plot, but make sure to use the *new* variable name

# Don't change the lines from here down
# Get one sweep for our specimen (I've already handselected a gorgeous one for you, 45)
low_upstroke_sweep = low_upstroke_data.get_sweep(45)

# Get the current & voltage traces
low_current = low_upstroke_sweep['stimulus'] * 1e12 # in A, converted to pA
low_voltage = low_upstroke_sweep['response'] * 1e3 # converted to mV!

# Get the time stamps for our voltage trace
low_timestamps = (np.arange(0, len(low_voltage)) * (1.0 / low_upstroke_sweep['sampling_rate']))

print('Sweep obtained')
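
<font size=4>If the unit conversions and time stamps above feel abstract, here is a small sketch using a made-up stand-in for a sweep. The dictionary keys match the ones used above, but `toy_sweep`, its values, and the 50 kHz sampling rate are assumptions for illustration only:

```python
import numpy as np

# Hypothetical stand-in for a sweep dictionary (all values are made up)
toy_sweep = {
    "stimulus": np.zeros(100000),          # current in amperes
    "response": np.full(100000, -0.070),   # voltage in volts
    "sampling_rate": 50000.0,              # samples per second
}

toy_current = toy_sweep["stimulus"] * 1e12  # A converted to pA
toy_voltage = toy_sweep["response"] * 1e3   # V converted to mV

# Sample number i was recorded at i / sampling_rate seconds
toy_timestamps = np.arange(0, len(toy_voltage)) * (1.0 / toy_sweep["sampling_rate"])

# A boolean mask keeps only the samples inside a time window (in seconds)
window = (toy_timestamps >= 1.0) & (toy_timestamps <= 1.1)

print("Last time stamp (s):", toy_timestamps[-1])
print("Samples in window:", window.sum())
```

<font size=4>The time stamps are in seconds, so any x-axis limits you choose when zooming in should also be in seconds.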

<font size=5>**Step 15.3:** Plot the low upstroke data.<font size=4> (**4 points**)

In [None]:
## Edit this ---
# Plot the new sweep here
...

As you'll hopefully see, cells that differ in even that one feature, the upstroke:downstroke ratio, have dramatically different action potential shapes. The other feature we looked at above, size of the trough, is highly correlated with upstroke:downstroke. You can see that by comparing the two cells here. Cells with high upstroke:downstroke tend to have less negative troughs (undershoots) after the action potential.


<img src="https://drive.google.com/uc?export=view&id=1AgZCFiLaBcggWoErdO1rfMvEyzY6StUu" width=800><br>

<font size=4>Image from the <a href="https://community.brain-map.org/t/documentation-cell-types-database/2845">Allen Institute Website </a> on Upstroke:Downstroke Ratio.
<br><br>
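
<font size=4>To make the ratio concrete, here is a rough sketch that computes it for an idealized, made-up "spike". It assumes the ratio is the absolute peak upstroke (fastest rise in voltage) divided by the absolute peak downstroke (fastest fall), which is how we read the Allen Institute documentation. The waveform is synthetic and the Allen pipeline's exact procedure may differ:

```python
import numpy as np

# Synthetic, idealized spike (made-up numbers, not real data):
# rest at -70 mV, rise to +30 mV in 0.5 ms, then fall back to -70 mV in 1.0 ms
time_ms = np.arange(0, 3.0, 0.02)
voltage_mv = np.interp(
    time_ms,
    [0.0, 1.0, 1.5, 2.5, 3.0],
    [-70.0, -70.0, 30.0, -70.0, -70.0],
)

# Rate of change of the voltage, in mV per ms
dv_dt = np.gradient(voltage_mv, time_ms)

peak_upstroke = dv_dt.max()     # fastest rise
peak_downstroke = dv_dt.min()   # fastest fall (a negative number)

toy_ratio = abs(peak_upstroke) / abs(peak_downstroke)
print("Upstroke:downstroke ratio:", toy_ratio)
```

<font size=4>In this toy waveform the voltage rises twice as fast as it falls, so the ratio comes out close to 2. A narrow, symmetric spike would give a ratio closer to 1.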

---
<a id="compare"></a>

## Step 7. Compare cell types 👷

<font size=4>Let's get out of the action potential weeds a bit. What if we want to ask a big-picture question, such as: <br>
 * <font size=4>*Are human cells different from mouse cells?* <br>
 <font size=4>or
 * <font size=4>*How are excitatory cells different from inhibitory cells?*

<font size=4>To ask these questions, we can pull out the data for two different cell types, defined by their species, dendrite type, or transgenic line.

<font size=4>**About Transgenic Cre Lines.** The Allen Institute for Brain Science uses transgenic mouse lines that have Cre-expressing cells to mark specific types of cells in the brain. This technology is called the **Cre-Lox system**, and is a common way in neuroscience (and some other fields) to target cells based on their expression of specific genetic promoters. For more information about Cre/Lox technology, see [this website](https://old.abmgood.com/marketing/knowledge_base/Cre-Lox_Recombination.php). Information about the different Cre lines that are available can be found in [this glossary](https://drive.google.com/file/d/1nI3tFHaP5Fp-DLj93ObMu3bLxZ7oZ--w/view?usp=sharing) or on the [Allen Institute's website](http://connectivity.brain-map.org/transgenic).

<font size=4>**For this final step, it's up to you to choose which cell types to compare.** You'll also decide which pre-computed feature to compare between these cell types.

- <font size=4>If you'd like to compare cells from different **species**, the column name is `species`.
- <font size=4>If you'd like to compare **spiny vs. aspiny cells**, the column name is `dendrite_type`.
- <font size=4>If you'd like to compare two **transgenic lines** (mouse cells only), the column name is `transgenic_line`. This comparison asks whether genetically identified cell types have different intrinsic physiology.


In [None]:
#@title Question 18 (1 point)
from IPython.display import HTML

alert_info = '''
<div style="font-size: 20px;" class="alert alert-info" role="alert">
  <h4 class="alert-heading">Question 18 (1 point)</h4>
Assign <code>column_name</code> below to the name of your column to see the unique values in that column. Make sure your column name is a <b>string</b>.
'''

display(HTML('<link href="https://nbviewer.org/static/build/styles.css" rel="stylesheet">'))
display(HTML(alert_info))

In [None]:
## Edit this ---
# Define your column name below
# The output will tell you which strings you should use in the following cell
column_name = ...

print(full_dataframe[column_name].unique())
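
<font size=4>`unique()` lists the distinct values in a column but not how many cells have each one. A quick sketch with `value_counts()` on a toy dataframe (the column name `cell_group` and its values are hypothetical placeholders):

```python
import pandas as pd

# Toy dataframe; "cell_group" and its values are made-up placeholders
toy_df = pd.DataFrame({"cell_group": ["type_a", "type_b", "type_a", None, "type_a"]})

# Distinct values in the column (missing values are listed too)
print(toy_df["cell_group"].unique())

# Number of rows per value; dropna=False also counts the missing entries
counts = toy_df["cell_group"].value_counts(dropna=False)
print(counts)

# Number of rows with no value recorded in this column
n_missing = toy_df["cell_group"].isna().sum()
```

<font size=4>Checking counts first helps you avoid picking a cell type with only a handful of cells. Missing values may show up in a column that only applies to some cells.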

<font size=4>Using the possible values in your column, create two separate dataframes by **taking a subset of** the dataframe below. You can think of this as similar to using a method like `where` from the `datascience` package. We want only the rows of the dataframe where the value in the column `column_name` is equal to a specific string value (stored in `celltype_1` for the first dataframe and `celltype_2` for the second).
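
<font size=4>Here is a minimal sketch of that kind of subsetting in `pandas`, using a toy dataframe. The column name `cell_group` and all values are made up for illustration:

```python
import pandas as pd

# Toy dataframe with made-up values
toy_df = pd.DataFrame(
    {
        "cell_group": ["type_a", "type_b", "type_a", "type_b", "type_a"],
        "vrest": [-70.1, -65.3, -72.8, -66.0, -69.5],
    },
    index=[11, 12, 13, 14, 15],
)

# Comparing a column to a string gives one True/False value per row
mask = toy_df["cell_group"] == "type_a"
print(mask.tolist())

# Indexing the dataframe with that mask keeps only the rows marked True
type_a_df = toy_df[mask]
print(len(type_a_df))
```

<font size=4>The comparison is exact, so a string with different capitalization or an extra space will match no rows and give you an empty dataframe.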

In [None]:
#@title Question 19 (2 points)
from IPython.display import HTML

alert_info = '''
<div style="font-size: 20px;" class="alert alert-info" role="alert">
  <h4 class="alert-heading">Question 19 (2 points)</h4>
Assign <code>celltype_1</code> and <code>celltype_2</code> to the names of your cell types from above.<br>
For example, if you chose <code>dendrite_type</code> they would be <code>'spiny'</code> and <code>'aspiny'</code>.<br> Make sure your cell type names are in quotes (they should be strings) and <b>exactly</b> match what is found in the dataframe.
'''

display(HTML('<link href="https://nbviewer.org/static/build/styles.css" rel="stylesheet">'))
display(HTML(alert_info))

In [None]:
## Edit this ---
# Define your cell type variables below
celltype_1 = ...
celltype_2 = ...

# Create two separate dataframes for each type
celltype_1_df = full_dataframe[full_dataframe[column_name] == celltype_1]
celltype_2_df = full_dataframe[full_dataframe[column_name] == celltype_2]

# Tell us how many cells there are per type
print("Type 1 # Cells: %d" % len(celltype_1_df))
print("Type 2 # Cells: %d" % len(celltype_2_df))

<font size=4>Let's start by plotting a distribution of the recorded resting membrane potential (`vrest`) for one cell type versus the other cell type.

In [None]:
#@title  Question 20 (3 points)
from IPython.display import HTML

alert_info = '''
<div style="font-size: 20px;" class="alert alert-info" role="alert">
  <h4 class="alert-heading">Question 20 (3 points)</h4>
Fill in the <code>...</code> in the cell below, update the cell type labels based on what you chose above, and then run the cell. Use the documentation below to get the exact name of the feature (<code>vrest</code>), and change the x-axis label so that we know what you're plotting. Also include a y-axis label and a title.
'''

display(HTML('<link href="https://nbviewer.org/static/build/styles.css" rel="stylesheet">'))
display(HTML(alert_info))

In [None]:
## Edit this ---
plt.figure()

# Start with this feature
feature = 'vrest'

# Plot the histogram, with density = True
plt.hist([celltype_1_df[feature],celltype_2_df[feature]],density = True)

# Change the labels below
plt.xlabel(...)                           #**EDIT** -feature name
plt.ylabel('Normalized Number of Cells')
plt.legend(['Cell Type 1','Cell Type 2']) # **EDIT** -adjust these according to your feature values
plt.title(...)                            # **EDIT** - descriptive title
plt.show()

- <font size=4>Note that the distribution is normalized by the total count (`density=True`), since there may be very different numbers of cells for your two cell types. You can set `density=False` to plot the raw numbers of cells.
- <font size=4>You can also specify the number of bins with `bins=<number of bins>`.
- <font size=4>Look through the [`plt.hist()` documentation](https://matplotlib.org/stable/api/_as_gen/matplotlib.pyplot.hist.html) for more information.
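
<font size=4>To see what `density=True` does, here is a sketch using `np.histogram` on randomly generated numbers (not real cells). As we understand it, `plt.hist` applies the same kind of normalization to each dataset separately when you pass it a list of datasets:

```python
import numpy as np

# Made-up data: one large group and one small group
rng = np.random.default_rng(0)
group_a = rng.normal(-70, 3, size=500)
group_b = rng.normal(-65, 3, size=50)

# Use the same bin edges for both groups so they are comparable
bin_edges = np.linspace(-85, -50, 15)
widths = np.diff(bin_edges)

# Raw counts depend heavily on how many cells are in each group
counts_a, _ = np.histogram(group_a, bins=bin_edges)
counts_b, _ = np.histogram(group_b, bins=bin_edges)

# With density=True, bar height times bar width sums to 1 for each group
density_a, _ = np.histogram(group_a, bins=bin_edges, density=True)
density_b, _ = np.histogram(group_b, bins=bin_edges, density=True)

area_a = (density_a * widths).sum()
area_b = (density_b * widths).sum()
print("Largest raw counts:", counts_a.max(), counts_b.max())
print("Areas:", area_a, area_b)
```

<font size=4>With raw counts, the smaller group would be barely visible next to the larger one. With density, both histograms have the same total area, so their shapes can be compared directly.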

In [None]:
#@title  Question 21 (4 points)
from IPython.display import HTML

alert_info = '''
<div style="font-size: 20px;" class="alert alert-info" role="alert">
  <h4 class="alert-heading">Question 21 (4 points)</h4>
Now it's your turn. Pick a pre-computed feature (options below) of your choice and compare between your two cell types by plotting a histogram.
'''

display(HTML('<link href="https://nbviewer.org/static/build/styles.css" rel="stylesheet">'))
display(HTML(alert_info))

<font size=4>Here are a few additional pre-computed features you might consider comparing (you can find a complete glossary [here](https://drive.google.com/file/d/1yBfYm1yMtFSFB2erhfZ0SpeeuoWJNMEk/view?usp=sharing)):

- <font size=4><b>Tau (<code>tau</code>)</b>: Time constant of the membrane, in milliseconds<br>
- <font size=4><b>Adaptation ratio (<code>adaptation</code>)</b>: The rate at which firing speeds up or slows down during a stimulus<br>
- <font size=4><b>Average ISI (<code>avg_isi</code>)</b>: The mean value of all interspike intervals in a sweep<br>
- <font size=4><b>Slope of f/I curve (<code>f_i_curve_slope</code>)</b>: Slope of the curve between firing rate (f) and current injected (I)<br>
- <font size=4><b>Input Resistance (<code>input_resistance_mohm</code>)</b>: The input resistance of the cell, in megaohms<br>
- <font size=4><b>Voltage of after-hyperpolarization (<code>trough_v_short_square</code>)</b>: Minimum value of the membrane potential during the after-hyperpolarization

In [None]:
## Edit this ---
plt.figure()

#**EDIT** Change your feature below.
feature = '...'

# Plot the histogram, with density = True
plt.hist([celltype_1_df[feature],celltype_2_df[feature]],density = True)

# Change the labels below
plt.xlabel(...)                           #**EDIT** -feature name
plt.ylabel('Normalized Number of Cells')
plt.legend(['Cell Type 1','Cell Type 2']) # **EDIT** -adjust these according to your feature values
plt.title(...)                            # **EDIT** - descriptive title
plt.show()

In [None]:
#@title Question 22 (3 points)
from IPython.display import HTML

alert_info = '''
<div style="font-size: 20px;" class="alert alert-info" role="alert">
  <h4 class="alert-heading">Question 22 (3 points)</h4>
Describe what you see in the plots. Compare the first plot (<code>vrest</code>) with the plot of the feature that you chose. Do the values differ based on the feature you selected? Why or why not? Refer to the plot(s) to support your answers.

'''

display(HTML('<link href="https://nbviewer.org/static/build/styles.css" rel="stylesheet">'))
display(HTML(alert_info))

*Type your answer here*

In [None]:
#@title Task
from IPython.display import HTML

alert_info = '''
<div style="font-size: 20px;" class="alert alert-info" role="alert">
  <h4 class="alert-heading">Task</h4>
Now you will choose your own comparison. First, pick which <b>cell types</b> you would like to compare. You can stick with what you already did above, or pick a different one from the 3 options below.
'''

display(HTML('<link href="https://nbviewer.org/static/build/styles.css" rel="stylesheet">'))
display(HTML(alert_info))

- <font size=4>If you'd like to compare cells from different **species**, the column name is `species`.
- <font size=4>If you'd like to compare **spiny vs. aspiny cells**, the column name is `dendrite_type`.
- <font size=4>If you'd like to compare two **transgenic lines** (mouse cells only), the column name is `transgenic_line`. This comparison asks whether genetically identified cell types have different intrinsic physiology.


In [None]:
#@title Question 23 (10 points total)
from IPython.display import HTML

alert_info = '''
<div style="font-size: 20px;" class="alert alert-info" role="alert">
  <h4 class="alert-heading">Question 23.1 (3 points)</h4>
Complete the code below for your new cell types. If you are sticking with the same cell types, copy and paste your earlier values (you will choose a different feature in the next step).
'''

display(HTML('<link href="https://nbviewer.org/static/build/styles.css" rel="stylesheet">'))
display(HTML(alert_info))

In [None]:
## Edit this ---
# Define your column name below
# The output will tell you which strings you should use in the following cell
column_name2 = ...

# Don't change this line ---
print(full_dataframe[column_name2].unique())

In [None]:
## Edit this ---
# Define your cell type variables below
celltype_3 = ...
celltype_4 = ...

# Don't change these lines ---
# Create two separate dataframes for each type
celltype_3_df = full_dataframe[full_dataframe[column_name2] == celltype_3]
celltype_4_df = full_dataframe[full_dataframe[column_name2] == celltype_4]

# Tell us how many cells there are per type
print("Type 3 # Cells: %d" % len(celltype_3_df))
print("Type 4 # Cells: %d" % len(celltype_4_df))

In [None]:
#@title Task
from IPython.display import HTML

alert_info = '''
<div style="font-size: 20px;" class="alert alert-info" role="alert">
  <h4 class="alert-heading">Task</h4>
The next step is to define your pre-computed feature. Here is the list again. You could also use <code>vrest</code> again if you picked a different cell type.
'''

display(HTML('<link href="https://nbviewer.org/static/build/styles.css" rel="stylesheet">'))
display(HTML(alert_info))

<font size=4>Here are a few additional pre-computed features you might consider comparing (you can find a complete glossary [here](https://drive.google.com/file/d/1yBfYm1yMtFSFB2erhfZ0SpeeuoWJNMEk/view?usp=sharing)):

- <font size=4><b>Tau (<code>tau</code>)</b>: Time constant of the membrane, in milliseconds<br>
- <font size=4><b>Adaptation ratio (<code>adaptation</code>)</b>: The rate at which firing speeds up or slows down during a stimulus<br>
- <font size=4><b>Average ISI (<code>avg_isi</code>)</b>: The mean value of all interspike intervals in a sweep<br>
- <font size=4><b>Slope of f/I curve (<code>f_i_curve_slope</code>)</b>: Slope of the curve between firing rate (f) and current injected (I)<br>
- <font size=4><b>Input Resistance (<code>input_resistance_mohm</code>)</b>: The input resistance of the cell, in megaohms<br>
- <font size=4><b>Voltage of after-hyperpolarization (<code>trough_v_short_square</code>)</b>: Minimum value of the membrane potential during the after-hyperpolarization

In [None]:
#@title Question 23.2 (4 points)
from IPython.display import HTML

alert_info = '''
<div style="font-size: 20px;" class="alert alert-info" role="alert">
  <h4 class="alert-heading">Question 23.2 (4 points)</h4>
Edit the code below for the feature you selected.
'''

display(HTML('<link href="https://nbviewer.org/static/build/styles.css" rel="stylesheet">'))
display(HTML(alert_info))

In [None]:
## Edit this ---
plt.figure()

#**EDIT** Change your feature below.
feature = '...'

# Plot the histogram, with density = True
plt.hist([celltype_3_df[feature],celltype_4_df[feature]],density = True)

# Change the labels below
plt.xlabel(...)                           #**EDIT** -feature name
plt.ylabel('Normalized Number of Cells')
plt.legend(['Cell Type 3','Cell Type 4']) # **EDIT** -adjust these according to your feature values
plt.title(...)                            # **EDIT** - descriptive title
plt.show()

In [None]:
#@title Question 23.3 (3 points)
from IPython.display import HTML

alert_info = '''
<div style="font-size: 20px;" class="alert alert-info" role="alert">
  <h4 class="alert-heading">Question 23.3 (3 points)</h4>
Describe what you see in the plots. Do the values differ based on the feature you selected? Why or why not? Refer to the graph to support your answers. <br>
Also compare this to your previous plot(s). Although it is a different feature, do they look the same or different? Refer to the plots and what the data looks like to support your answer.
'''

display(HTML('<link href="https://nbviewer.org/static/build/styles.css" rel="stylesheet">'))
display(HTML(alert_info))

*Type your answer here*

In [None]:
#@title Question 24 (3 points)
from IPython.display import HTML

alert_info = '''
<div style="font-size: 20px;" class="alert alert-info" role="alert">
  <h4 class="alert-heading">Question 24 (3 points)</h4>
It's more common to plot summary statistics like a mean or median, so let's compare our two cell types with a boxplot. To do so, we can use <code>plt.boxplot()</code>.<br>
The code below is already set up for you -- just run it and edit your labels as necessary.
'''

display(HTML('<link href="https://nbviewer.org/static/build/styles.css" rel="stylesheet">'))
display(HTML(alert_info))

<font size=4>Note that this demonstrates how to make a boxplot using pyplot from [matplotlib](https://matplotlib.org/3.3.2/api/_as_gen/matplotlib.pyplot.boxplot.html). If you would prefer to use `seaborn`, as we did in the previous lab notebook or homework, feel free to import that package here and write your own code to do so!

In [None]:
## Edit this ---
# Boxplot creation lines below
# This uses the first pair of cell types and whichever `feature` you set most recently.
# You are welcome to switch to the most recent cell types above if you'd like. Either option is fine
plt.boxplot([celltype_1_df[feature],celltype_2_df[feature]])
plt.ylabel(...)                                                 #**EDIT** adjust y-axis label
plt.xticks([1, 2], ['Cell Type 1','Cell Type 2'])               #**EDIT** adjust these values

# Plot title -- be sure to update!
plt.title(...)                                                  #**EDIT**

plt.show()
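
<font size=4>One thing to watch for: some pre-computed features may have missing values (`NaN`) for some cells, and the statistics a boxplot is built from are undefined when `NaN` is present, so matplotlib may draw an empty box. The sketch below uses a made-up series to show how to count and drop missing values. Whether your chosen feature has any missing values is something to check, not something we are assuming:

```python
import numpy as np
import pandas as pd

# Made-up feature values with two missing entries
toy_feature = pd.Series([0.5, 0.8, np.nan, 1.1, 0.9, np.nan])

# Count how many values are missing
n_missing = toy_feature.isna().sum()
print("Missing values:", n_missing)

# Drop the missing values before computing statistics or plotting
clean_feature = toy_feature.dropna()

# A quartile is undefined (nan) while NaN is present, and defined afterwards
q1_with_nan = np.percentile(toy_feature.to_numpy(), 25)
q1_clean = np.percentile(clean_feature.to_numpy(), 25)
print(q1_with_nan, q1_clean)
```

<font size=4>If one of your boxes looks empty, passing `celltype_1_df[feature].dropna()` instead of `celltype_1_df[feature]` is one thing to try.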

In [None]:
#@title Question 25 (2 points)
from IPython.display import HTML

alert_info = '''
<div style="font-size: 20px;" class="alert alert-info" role="alert">
  <h4 class="alert-heading">Question 25 (2 points)</h4>
What is the name of the value shown as an orange line in the boxplot? What are the names of the values shown by the black whisker lines? If you aren't sure, review the documentation link above, or search Google for an answer =)
'''

display(HTML('<link href="https://nbviewer.org/static/build/styles.css" rel="stylesheet">'))
display(HTML(alert_info))

*Type your answer here*

---
# Project 2 Complete 🎉

In [None]:
from IPython.display import HTML
print('Your neurons are firing!')
HTML('<img src="https://media.giphy.com/media/OBIBNR9ATt3HdpcmLC/giphy.gif">')

---
# Final Submission 📩

### **Important submission steps:**
1. <font size=4>Choose **Save** from the **File** menu (and make sure you've already saved a copy in your drive).
2. <font size=4>Confirm that your notebook file is saved by following the steps below.
3. <font size=4>Submit the notebook for this assignment to the corresponding Assignment on the WebCampus (Canvas) course website.

<font size=4>**It is your responsibility to make sure your work is saved before following the instructions in the last cell.**

## Submission

<font size=4>Make sure you have run all cells in your notebook in order before downloading it, so that all images/graphs appear in the output.
**Please save (or check again) before exporting!**
You will save the notebook file (.ipynb) as follows:


1.  <font size=4>Go to `"File > Download"` and choose the **.ipynb format** (first option)
  - <font size=4>This will save a copy of the Python notebook file (extension .ipynb) in the Downloads folder on your computer (or wherever you have opted to save files).


2. <font size=4>If the above option is not available to you, save with Ctrl + S on a PC or Command + S on Apple devices (press both keys at the same time; do not type the + sign). Look at the top of the menu bar in Google Colab; toward the middle, it might say that changes were saved.
  * <font size=4>If you want to check that things were saved recently, go to your Google Drive (via an online browser or from the app) and check the timestamp for when your notebook was last updated. If it wasn't saved recently, go back to the tab where you have your notebook open and save again.
  * <font size=4>The notebook file `"Copy of Project2.ipynb"` will be in your Google Drive under the `"Colab Notebooks"` folder (see the info at the top for more on where things get saved).

---
# Errors and Troubleshooting ❗

If you run into any issues with the `allensdk`, I've included code here so that you can import the individual datasets, as well as the one we combined, from CSV files.

In [None]:
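## Import the individual ephys features and cells datasets, plus the combined dataframe
import pandas as pd  # in case pandas has not been imported yet in this session
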
dataframe = pd.read_csv('ephys_features_allencelltypes.csv')
cells = pd.read_csv('cells_allencelltypes.csv')
full_dataframe = pd.read_csv('combined_ephys_cells_df_allencelltypes.csv')
print('data imported!')
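
<font size=4>One caveat if you use this fallback: earlier cells read specimen IDs from the dataframe's **index**. By default, `pd.read_csv` creates a fresh 0, 1, 2, ... index instead. The sketch below shows the difference on a tiny made-up CSV. The column name `specimen_id` is an assumption for illustration, so check `full_dataframe.head()` to see how the real files are laid out:

```python
import io
import pandas as pd

# Made-up CSV text standing in for a saved dataframe (hypothetical columns)
toy_csv = io.StringIO("specimen_id,vrest\n501,-70.1\n502,-65.3\n")

# Default behavior: a new 0, 1, 2, ... index; the IDs stay in a regular column
default_df = pd.read_csv(toy_csv)
print(default_df.index.tolist())

# Rewind the text buffer so it can be read a second time
toy_csv.seek(0)

# index_col tells pandas which column to use as the index
indexed_df = pd.read_csv(toy_csv, index_col="specimen_id")
print(indexed_df.index.tolist())
```

<font size=4>If the IDs in your imported dataframe end up in a regular column, `set_index` on that column is another way to get the same result.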

-----------
<a id="technical"></a>

# Technical notes & credits 👏 🧑


<font size=4>This notebook demonstrates most of the features of the AllenSDK that help manipulate data in the Cell Types Database. The main entry point is the `CellTypesCache` class. `CellTypesCache` is responsible for downloading Cell Types Database data to a standard directory structure, so you do not have to keep track of where your data lives.

<font size=4>Much more information can be found in their <a href="https://community.brain-map.org/t/documentation-cell-types-database/2845"> documentation</a>.


<font size=4>This file was modified from <a href='https://alleninstitute.github.io/AllenSDK/cell_types.html'>these</a> notebooks.

<font size=4>In case you're curious, <a href="https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.Series.plot.html">here's documentation</a> for plotting pandas series (which we do quite a bit above).

<font size=4>This notebook was modified from a notebook by [Ashley Juavinett, PhD at UCSD](https://sites.google.com/ucsd.edu/neuroedu/home). The results from teaching this lesson plan have also been published and are available [online](https://www.funjournal.org/2020-volume-19-issue-1/): Juavinett, A. Learning How to Code While Analyzing an Open Access Electrophysiology Dataset. J Undergrad Neurosci Educ (JUNE). 2020 Dec 31;19(1):A94-A104.