# Introduction
Today we will be using electrophysiology data to better understand how a particular set of drugs affects GABA release. Electrophysiology is a technique used to measure electrical changes in cells. In this example, we will be using whole-cell voltage clamp recording data. With this technique, we lower a glass pipette down to the surface of the cell and form a tight seal with the cell membrane. Then we "break in" to the cell, meaning that the fluid within our pipette tip becomes continuous with the intracellular fluid within the cell. This gives us electrical access to the cell.

Using this electrical access, we can inject current to maintain a specific voltage within the cell. In voltage clamp mode, we hold the cell at a fixed potential (-70 mV). Ion channels in the cell's membrane allow current to flow into or out of the cell. Normally these currents would lead to a change in the cell's membrane voltage. However, we inject current into the cell to counteract these currents, thereby maintaining the cell's potential at -70 mV. Here, we record the amount of current injected into the cell as a proxy for the current generated by the ion channels. These data are then used to deduce how these ion channels work.

In this experiment, we used electrophysiology to assess the GABAergic miniature inhibitory post-synaptic current (mIPSC).

<img src = "Mini1.jpg" width="600" >
<img src = "Mini_zoom.jpg" width="600" >

Action potentials generate large currents that mask the small contributions from post-synaptic currents. By adding TTX (tetrodotoxin, a blocker of voltage-gated sodium channels) to the incubation medium, we can prevent these action potentials from forming, thereby enabling us to measure the small currents generated by various ion channels. To isolate the role of a particular channel (or family of channels), we need to use various agonists and antagonists. For example, NBQX is an antagonist of AMPA receptors (i.e., it blocks the activity of these receptors). By adding NBQX to the incubation medium, we can rule out AMPA receptors as the source of the currents we measure. Cadmium (Cd) blocks voltage-gated calcium channels. By using this drug, we are able to assess whether a cellular process depends on these voltage-gated calcium channels. Rimonabant (RIM) is a cannabinoid-1 (CB1) receptor antagonist. By adding RIM to the incubation medium, we can assess whether the CB1 receptor has 'tone' within this brain region, i.e., whether the CB1 receptor has ongoing activity that is modulating neural activity.
    
Using these drugs, we are able to isolate spontaneous GABA release. GABA is an inhibitory neurotransmitter whose transmission hyperpolarizes the neuron (through chloride and potassium channels). GABAergic mIPSCs are a way to investigate spontaneous GABA release: by blocking action potentials and glutamate signaling, we ensure that the events left in these recordings reflect spontaneous GABA release.

Each of the downward deflections in a recording represents GABA release onto our recorded cell, as we are measuring the post-synaptic effect, i.e., the post-synaptic current. Drugs can have a major effect on neural signaling processes. The goal of this experiment is to examine the effect of these two different drugs on GABAergic mIPSC characteristics. The characteristics that we are interested in are the amplitude, the rise time, the decay, and the inter-event intervals, as well as the frequency of the events within a condition (drug vs. no drug).
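
To make the link between inter-event intervals and event frequency concrete, here is a minimal sketch. The interval values are made up for illustration and are not taken from the recordings, and the variable names are hypothetical. It assumes that frequency is summarized as the reciprocal of the mean interval between events.

```python
import numpy as np

# Hypothetical inter-event intervals in seconds (made-up values, not real data)
inter_event_intervals = np.array([0.25, 0.50, 0.125, 0.125])

# Mean time between consecutive events
mean_interval = inter_event_intervals.mean()

# Event frequency in Hz is the reciprocal of the mean interval
frequency_hz = 1.0 / mean_interval

print(mean_interval, frequency_hz)
```

Shorter intervals between events therefore mean a higher event frequency, which is why the two measures are usually examined together.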

# Importing libraries

First, we need to import the libraries that we will be using to quantify these data. Today, we will be using pandas and numpy, two libraries that we have seen recently. We will also be utilizing matplotlib.pyplot to create our histograms. 

In [None]:
# Import the libraries 
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt

# This ensures that plots are shown below each code cell
%matplotlib inline

# Limit the max number of lines shown in a dataframe
pd.options.display.max_rows = 7

# Loading and Organizing our data

### Exercise

First, let's import our data from `minis data.csv` and store it in a variable, `minis_data`. Once you've loaded the data, display the contents of the dataframe below the cell so you can understand the contents better.


Remember that, in Jupyter notebooks, cells ending in a variable name or unassigned output of a statement will display that variable without a need for a print statement.

Auto display using a variable name:

    minis_data = ...
    minis_data
   
Auto display using unassigned output of a statement:

    minis_data.head()
    
In this example, `head` is a dataframe method that returns the first few rows of the dataframe. Since you're not assigning it to a new variable, Jupyter will automatically display the return value.

If you were to end the line with a semicolon, `;`, you can suppress this auto-printing:

    minis_data.head();
    
Hint: Recall that we have already imported the pandas library, which has a function that we can use to read a csv file. We learned how to use this function in the first and second classes. Look back at these notebooks if you need a review.

_Thought question: 
When do you use the 'full path' versus the 'relative path' to retrieve your file?_ 

Disclaimer: You can always use either full path or relative path, but sometimes it's easier (or more desirable) to use one vs the other.
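
As a small illustration of the difference, the sketch below builds both kinds of path with the standard-library `pathlib` module. It assumes the file sits in the current working directory; the variable names are hypothetical, and the file does not need to exist for the code to run.

```python
from pathlib import Path

# Relative path: interpreted relative to the current working directory
relative_path = Path('minis data.csv')

# Full (absolute) path: built here by prepending the current working directory
full_path = Path.cwd() / relative_path

print(relative_path.is_absolute())
print(full_path.is_absolute())
print(full_path)
```

A relative path keeps working when the whole project folder is moved or shared, whereas a full path keeps working when the notebook is run from a different working directory.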

In [None]:
# Answer
minis_data = pd.read_csv('minis data.csv') 
minis_data

### Exercise

There are our data, excellent! Is there anything odd that you notice about the data? 

Hint: Look at the `Cell` column. 

Can you figure out how to print all of the *unique* cell names? Extract the cell column and save it as a new variable, `cell_column`. Use tab completion to look for a method that will give you the *unique* values in the column.
 

In [None]:
cell_column = minis_data['Cell']

# Answer
unique_cells = cell_column.unique()
unique_cells

The `NaN` values appear because the program that generates the CSV is "lazy" about filling in repeated values. When reading the data, pandas sees blank cells and substitutes the value `NaN` (short for "not a number"). The simple solution is to fill in the missing values by "forward-filling" the last cell name found.

### Exercise

Now we are going to `forward fill` our `cell_column` to replace `NaN` with the cell identifier. There is a method that allows us to perform this operation:

    cell_column.fillna(...)
    
Figure out what arguments to pass into this function and run it.

In [None]:
cell_column.fillna?

In [None]:
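# Answer
# `ffill()` is the current spelling of `fillna(method='ffill')`.
# Assign the result back so that `minis_data` is updated as well.
minis_data['Cell'] = cell_column.ffill()
minis_data

We assigned the filled column back to `minis_data` explicitly. In older versions of pandas, `cell_column` acted as a "view" into the data stored in the `minis_data` dataframe, so an in-place call such as `cell_column.fillna(method='ffill', inplace=True)` also updated `minis_data`. Recent versions of pandas deprecate the `method` argument and no longer guarantee that changes made via `cell_column` show up in `minis_data`, so explicit assignment is the safer habit.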
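
For comparison, here is a small sketch of forward-filling on a made-up dataframe (the names `toy`, `cell_a` and `cell_b` are invented for illustration). It uses `Series.ffill()` and assigns the result back to the dataframe, so the outcome does not depend on whether the extracted column is a view.

```python
import numpy as np
import pandas as pd

# Made-up table mimicking the "lazy" CSV: each cell name appears only once per block
toy = pd.DataFrame({
    'Cell': ['cell_a', np.nan, np.nan, 'cell_b', np.nan],
    'Value': [1, 2, 3, 4, 5],
})

# Forward-fill the column and assign it back to the dataframe explicitly
toy['Cell'] = toy['Cell'].ffill()

print(toy['Cell'].tolist())
```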

## The == operator

Now we want to save the subset of data from the 'TTX + NBQX' condition into a new variable called `minis_baseline`. How can we use the information from a column to create a new variable with a subset of the data? 

You're already familiar with the assignment operator, `=`, e.g.:
    
    my_favorite_number = 'four'

We now introduce a new operator that looks very similar, `==`. This operator [checks for equality](https://dbader.org/blog/difference-between-is-and-equals-in-python). What is equality? Let's demonstrate.

In [None]:
my_favorite_number = 'four'

Will the following cells print True or False?

In [None]:
my_favorite_number == 'four'

In [None]:
my_favorite_number == 4

In [None]:
my_favorite_number == 'five'

In [None]:
her_favorite_number = 'four'
my_favorite_number == her_favorite_number

What are `True` and `False`? They're special objects in Python known as boolean values. 

How does the `==` operator work with pandas objects? When you extract a single column from a Dataframe, do you get a series or dataframe?

In [None]:
minis_data['Drug']

In [None]:
mask = minis_data['Drug'] == 'TTX + NBQX'
mask

### Exercise

It looks like using the equality operator with a pandas series returns a new series containing boolean values (note the `dtype: bool` at the bottom).

Wait a second! Shouldn't the first three rows be True? Why might this be happening? If you're not sure, how might you take a closer look at the value for `Drug` in the first row? Remember how to index a dataframe or series?

In [None]:
# Answer
minis_data.loc[0, 'Drug']

## Troubleshooting typos in the spreadsheet

Remember how we used the `unique` method on series objects to check for *unique* elements in a column? 

In [None]:
minis_data['Drug'].unique()

Yeah, our data need a bit of cleaning up. There are several approaches we can use, but let's stick with boolean masks. First, let's make a mask that marks the cells with the typo.

Knowing how to debug errors like this is very important. A key part of data analysis is cleaning up data from various sources. Another error you will commonly encounter is when you are expecting a column label, row label or cell value to be a number (e.g., `132`), but pandas loaded it as a string instead (e.g., `'132'`). This is why it's important to know how to *inspect* an object.
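
Here is a minimal sketch of what inspecting an object can look like. The values are made up for illustration; `repr` and `type` are Python built-ins, and `pd.to_numeric` is the pandas function for converting text to numbers.

```python
import pandas as pd

# Made-up column in which numbers were loaded as text
loaded = pd.Series(['132', '140', '151'])
first_value = loaded.iloc[0]

# repr shows the quotes, which reveals that this value is a string
print(repr(first_value), type(first_value))

# A string never equals the corresponding number
print(first_value == 132)

# Convert the whole column to numbers before comparing
as_numbers = pd.to_numeric(loaded)
print(as_numbers.iloc[0] == 132)

# repr also exposes invisible characters such as a trailing space
label_with_space = 'TTX + NBQX '
print(repr(label_with_space))
```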

Remember how we made a mask earlier?
    
    mask = minis_data['Drug'] == 'TTX + NBQX'
    
We can use this mask to pull out only the rows NOT containing the typo.

In [None]:
minis_data.loc[mask]

We can even drill down to a single column.

In [None]:
minis_data.loc[mask, 'Drug']

We've only used `loc` to extract data from a dataframe. We can also use it to update data in a dataframe:

    dataframe.loc[rows, cols] = value

To correct the typo 'TTX + NBQX ' (with a trailing space), we need to create a `typo_mask` to locate the rows with the typo and then assign them the correct name.

In [None]:
typo_mask = minis_data['Drug'] == 'TTX + NBQX '
minis_data.loc[typo_mask, 'Drug'] = 'TTX + NBQX'
minis_data.loc[0, 'Drug']

## Back to organizing our data

### Exercise

Great! Now that we've fixed the typo, let's go back to our original exercise. We want to extract the subset of rows from `minis_data` where `TTX + NBQX` was used and save those rows as a new dataframe, `minis_baseline`. Go ahead and try it.

In [None]:
# Answer
mask = minis_data['Drug'] == 'TTX + NBQX'
minis_baseline = minis_data.loc[mask]
minis_baseline

### Exercise

Now do the same for `TTX + NBQX + Cd` (save as `minis_drug1`) and `TTX + NBQX + Cd + RIM` (save as `minis_drug2`).

You can use `display` when you want to show the contents of more than one dataframe below the cell.

In [None]:
# Answer
mask1 = minis_data['Drug'] == 'TTX + NBQX + Cd'
minis_drug1 = minis_data[mask1]
display(minis_drug1)

mask2 = minis_data['Drug'] == 'TTX + NBQX + Cd + RIM'
minis_drug2 = minis_data[mask2]
display(minis_drug2)

Now we are almost ready to plot some histograms! Let's first look at the mIPSC amplitude of the baseline condition. The code below extracts the amplitude data and stores it in a new variable named `amp_baseline`.

In [None]:
amp_baseline = minis_baseline['mIPSC amplitude (pA)']

# Plotting a histogram

Now it's time to plot the distribution of mIPSC amplitudes! Read the documentation of `plt.hist` to figure out how to generate a histogram for `amp_baseline` with eight bins.

### Exercise

In [None]:
# Answer
plt.hist(amp_baseline, bins=8);

Next we want to compare the mIPSC amplitude distribution of different conditions by stacking their histograms on the same axes.

So we already have amp_baseline. We also need to extract the amplitude data from minis_drug1 and minis_drug2. Recall the column is named 'mIPSC amplitude (pA)'.

### Exercise

Save the `amplitude` in variables called `amp_drug1` and `amp_drug2`.

In [None]:
# Answer
amp_drug1 = minis_drug1['mIPSC amplitude (pA)']
amp_drug2 = minis_drug2['mIPSC amplitude (pA)']

Now let's plot the distributions of the amplitudes for drug 1 and drug 2. 

In [None]:
plt.subplot(121)
plt.hist(amp_drug1, bins=8)
plt.title('Drug 1')
plt.xlabel('mIPSC amplitude (pA)')
plt.ylabel('Number of events')

plt.subplot(122)
plt.hist(amp_drug2, bins=8)
plt.title('Drug 2')
plt.xlabel('mIPSC amplitude (pA)')
plt.ylabel('Number of events')

# Autofix issues where text labels overlap
plt.tight_layout()

### Exercise

Now plot the histograms of amp_baseline, amp_drug1 and amp_drug2 on the same axes;
specify their colors to be grey, blue and red respectively; and set the number of bins to 50.

The end product should look like this:

<img src = "hist.png" width="400" >

This is only one step away from the correct answer:

<img src = "hist_wrong.png" width="400" >


In [None]:
# Answer
data = [amp_baseline, amp_drug1, amp_drug2]
color = ['grey', 'blue', 'red']
plt.hist(data, color=color, bins=50, stacked=True);

### Exercise

To make this a more descriptive figure, add axis labels (i.e., x and y), a title and a legend. You've already seen how to add plot titles and axis labels. Legends are a new concept. You can use the function `plt.legend`. Take a look at the documentation to see how to use this function. Hint: `plt.hist` can take multiple "labels" (one for each dataset provided). Review the documentation for `plt.hist` as well if needed.

In [None]:
plt.legend?

In [None]:
# Answer
data = [amp_baseline, amp_drug1, amp_drug2]
colors = ['grey','blue','red']
labels = ['Baseline','Drug1','Drug2']

plt.hist(data, color=colors, label=labels, bins=50, stacked=True)
plt.legend()
plt.xlabel('Amplitude (pA)')
plt.ylabel('Frequency')
plt.title('Amplitude Histogram');

## Fitting the histograms using curve_fit

This histogram shows us important information about how our drugs are affecting GABA miniature events, but we can dig into these data a little bit further. For instance, the statistical fit of the data within each `drug` condition gives us important information about the mean amplitude and the variance.

Brad's note: The various distributions have a `fit` method that returns the fitted parameters. `norm` has two parameters, loc and scale. These translate to mu and sigma (mean and standard deviation). `scipy.stats` has a formulation for the various distributions that does not map well to the more common formulations used in the biological sciences. I think the parameter names in `scipy.stats` tend to be oriented towards the ones used by statisticians and mathematicians, so I always have to Google to find out the correct mapping of loc, scale, etc. to the values I want.

First, let's try a normal distribution:

Notice that we're switching from a histogram, where we calculate the number of occurrences per bin, to a density plot, which is an approximation of the probability density function. This is a normalized histogram in which the area of each bar (its height times the bin width) is the probability of making that particular observation (i.e., that the mIPSC amplitude will fall within that bin). We make this switch so that the y-values returned by `norm.pdf` can be compared directly with the histogram.
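
The relationship between counts and density can be checked numerically. The sketch below uses `np.histogram`, which computes the same bin values as `plt.hist` without drawing anything, on made-up amplitudes drawn from a random number generator (not the real data).

```python
import numpy as np

# Made-up amplitudes for illustration only
rng = np.random.default_rng(0)
fake_amplitudes = rng.gamma(shape=4.0, scale=10.0, size=1000)

# Plain histogram: number of observations per bin
counts, edges = np.histogram(fake_amplitudes, bins=50)

# Density histogram: counts divided by (total count * bin width)
density, _ = np.histogram(fake_amplitudes, bins=50, density=True)
widths = np.diff(edges)

# The counts add up to the number of observations
print(counts.sum())

# The area under the density histogram adds up to 1 (up to rounding)
print(np.sum(density * widths))
```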

Remember how we used `curve_fit` last week? We'll use it again this week. However, rather than fitting our own equation, we will fit an equation already made available via the `scipy.stats` module. Specifically, we will fit the function provided by `norm.pdf` to our data. The function `norm.pdf` takes three arguments (`x`, `loc` and `scale`) and returns the PDF (probability density function) at each value in `x`. The `loc` and `scale` parameters map to the mean and standard deviation of the normal distribution.
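
To see how `loc` and `scale` map onto the familiar formula for the normal distribution, the sketch below compares `norm.pdf` with the textbook expression. The mean and standard deviation are arbitrary values chosen for illustration, not estimates from our data.

```python
import numpy as np
from scipy.stats import norm

# Arbitrary illustrative parameters (not fitted values)
mu, sigma = 50.0, 10.0
x = np.array([30.0, 50.0, 70.0])

# PDF from scipy, with loc as the mean and scale as the standard deviation
pdf_values = norm.pdf(x, loc=mu, scale=sigma)

# The same PDF written out by hand
manual = np.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * np.sqrt(2 * np.pi))

print(pdf_values)
print(manual)
```

The largest value occurs at `x = mu`, and the values at equal distances on either side of the mean are identical.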

### Exercise

To perform the curve-fitting, we need to do the following steps:

* Generate the x and y values for our data. Here, we are fitting a histogram so we can take the `bins` and `density` calculated by `plt.hist` and use that as our data for fitting.
* Come up with an initial guess for our mean and standard deviation.
* Do the curve fit.
* Plot the results.

In [None]:
from scipy.optimize import curve_fit
curve_fit?

In [None]:
from scipy.optimize import curve_fit
from scipy.stats import norm

# plt.hist returns three values: the value for each bin,
# the bin edges (including the rightmost edge) and the plot objects.
density, bins, _ = plt.hist(amp_baseline, bins=50, density=True)

# For curve fitting, len(x) must equal len(y). There is one more bin edge than
# there are bins, so we use the center of each bin as its x value.
x = (bins[:-1] + bins[1:]) / 2

# Start with an initial guess that's in the correct ballpark.
# If we start too far off, the curve fitting may fail.
# Plus, curve_fit can't figure out how many arguments `norm.pdf`
# takes due to how the developers wrote norm.pdf. So, by providing
# p0, curve_fit can figure out how many variables it's supposed to fit.
p0 = [50, 50]

# Answer
# Now, do the fit. We use the underscore, _, to indicate that we don't care
# about the second value returned by curve_fit.
p0_fitted, _ = curve_fit(norm.pdf, x, density, p0)

# Compute the fitted line and plot it.
y = norm.pdf(x, *p0_fitted)
plt.plot(x, y)

The normal distribution fits our data pretty well, but perhaps another distribution might work better. Let's repeat the fit with a gamma distribution. Since gamma has three parameters instead of two, our initial guess, `p0`, should have three values.
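
As a hedged aside on what the three numbers mean: in `scipy.stats.gamma`, the parameters are a shape parameter `a`, followed by `loc` and `scale`. The sketch below uses arbitrary values (not fitted to our data) to show how they map onto the mean and variance of the distribution.

```python
import numpy as np
from scipy.stats import gamma

# Arbitrary illustrative parameters (not fitted values)
a, loc, scale = 4.0, 5.0, 10.0

# Mean and variance as reported by scipy
dist_mean = gamma.mean(a, loc=loc, scale=scale)
dist_var = gamma.var(a, loc=loc, scale=scale)

# The same quantities from the parameter mapping
print(dist_mean, a * scale + loc)
print(dist_var, a * scale ** 2)

# loc shifts the distribution: the density is zero below loc
print(gamma.pdf(loc - 1.0, a, loc=loc, scale=scale))
```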

In [None]:
from scipy.stats import gamma

density, bins, _ = plt.hist(amp_baseline, bins=50, density=True)
x = (bins[:-1] + bins[1:]) / 2  # bin centers
p0 = [10, 10, 10]

# Answer
p0_fitted, _ = curve_fit(gamma.pdf, x, density, p0)
y = gamma.pdf(x, *p0_fitted)
plt.plot(x, y)

## Bonus: using norm.fit and gamma.fit

Alternatively, we can use norm.fit to estimate best-fit parameters for norm.pdf. 
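
As a sketch of what `norm.fit` returns, the example below runs it on made-up data (random draws, not our recordings). For the normal distribution, the fitted `loc` is the sample mean and the fitted `scale` is the standard deviation computed with a denominator of `n`.

```python
import numpy as np
from scipy.stats import norm

# Made-up data for illustration only
rng = np.random.default_rng(1)
fake_data = rng.normal(loc=50.0, scale=10.0, size=1000)

# Maximum-likelihood estimates of loc (mean) and scale (standard deviation)
mu_hat, sigma_hat = norm.fit(fake_data)

# Compare with the values computed directly by numpy.
# Note: numpy's std divides by n by default, whereas pandas' std divides by n - 1.
print(mu_hat, fake_data.mean())
print(sigma_hat, fake_data.std())
```

Unlike `curve_fit`, which fits a curve to the binned histogram, `norm.fit` works on the raw observations, so the result does not depend on the number of bins.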

In [None]:
from scipy.stats import norm

mu, sigma = norm.fit(amp_baseline)

# density=True gives us an estimate of the PDF
# (i.e., the fraction of observations in each bin divided by the bin width).
plt.hist(amp_baseline, bins=50, density=True)

x_fit = np.arange(0, 250)
y_pdf = norm.pdf(x_fit, mu, sigma)
plt.plot(x_fit, y_pdf)

### Exercise

We can see that this distribution does not really fit our data. What distribution might work better to fit our data? 

Cut and paste the code above to make it work with a different distribution. Go ahead and try it. Work with the following template to make a different distribution to fit the data. 

    from scipy.stats import ???
    ??? = ???.fit(amp_baseline)

    plt.hist(amp_baseline, bins=50, density=True)

    x_fit = np.arange(0, 250)
    y_pdf = ???.pdf(x_fit, ???)
    plt.plot(x_fit, y_pdf)

In [None]:
from scipy.stats import gamma
#Answer
shape, loc, scale = gamma.fit(amp_baseline)

plt.hist(amp_baseline, bins=50, density=True)

x_fit = np.arange(0, 250)
y_pdf = gamma.pdf(x_fit, shape, loc, scale)
plt.plot(x_fit, y_pdf)

# Bonus! Creating Loops

In the previous exercise, we extracted three subsets of the dataframe by experimental condition using the following code:

    minis_baseline = minis_data[minis_data['Drug'] == 'TTX + NBQX']
    minis_drug1 = minis_data[minis_data['Drug'] == 'TTX + NBQX + Cd']
    minis_drug2 = minis_data[minis_data['Drug'] == 'TTX + NBQX + Cd + RIM']

Then we plotted the stacked histogram:

    plt.hist([amp_baseline, amp_drug1, amp_drug2], ..., label=['Baseline', 'Drug1', 'Drug2'])

There's a lot of repetitive typing! Let's try simplifying the process with a for loop. Run the cell below and see if you can figure out the syntax of a for loop that loops through keys and values.

In [None]:
drug_map = {
    'TTX + NBQX': 'baseline',
    'TTX + NBQX + Cd': 'Drug 1',
    'TTX + NBQX + Cd + RIM': 'Drug 2'
}

template = 'The drug name is "{x}" and the label for the plot is "{y}"' 
for key, value in drug_map.items():
    print(template.format(x=key, y=value))

### Exercise

Now, plot the `mIPSC amplitude (pA)` column using a similar for loop as discussed above. Fill in the `...` to create the information needed for the histogram.


    drug_map = {
        'TTX + NBQX': 'baseline',
        'TTX + NBQX + Cd': 'Drug 1',
        'TTX + NBQX + Cd + RIM': 'Drug 2'
    }
    hist_data = []
    hist_labels = []
    col_name = 'mIPSC amplitude (pA)'
    
    for drug_name, drug_label in drug_map.items():
        ...
        
    plt.hist(hist_data, label=hist_labels, stacked=True, bins=50)
    
Hint:   You can extract the subset of the data you want by making a mask and then doing:

    subset = minis_data.loc[rows, cols]

In [None]:
# Answer: 
drug_map = {
    'TTX + NBQX': 'baseline',
    'TTX + NBQX + Cd': 'Drug 1',
    'TTX + NBQX + Cd + RIM': 'Drug 2',
}

hist_data = []
hist_labels = []
col_name = 'mIPSC amplitude (pA)'

for drug_name, drug_label in drug_map.items():
    mask = minis_data['Drug'] == drug_name
    subset = minis_data.loc[mask, col_name]
    hist_data.append(subset)
    hist_labels.append(drug_label)
    #print (hist_data, hist_labels)

plt.hist(hist_data, label=hist_labels, stacked=True, bins=50)
plt.legend()

# Creating a function

## Exercise

Now let's wrap the code above inside a function:

    def plot_data(data, column, drugs):
        """
        Short description of function.

        Parameters
        ----------
        data : pandas DataFrame
            Data containing column of interest.
        column : string
            Name of column containing data to plot
        drugs : dict
            Dictionary mapping drug name (key) to legend label (value) in the plot.
        """
        ???
        
Make sure to run your function and see if it performs as expected.


In [None]:
# Answer
def plot_data(data, column, drugs):
    """
    Plots stacked histograms.
    
    Parameters
    ----------
    data : pandas DataFrame
        Data to plot
    column : string
        Name of column containing data to plot
    drugs : dict
        Dictionary mapping drug name (key) to legend label (value) in the plot.
    """
    hist_data = []
    hist_labels = []
    for drug_name, drug_label in drugs.items():
        mask = data['Drug'] == drug_name
        subset = data.loc[mask, column]
        hist_data.append(subset)
        hist_labels.append(drug_label)

    plt.hist(hist_data, label=hist_labels, stacked=True, bins=50)
    plt.legend()
    plt.xlabel(column)
    plt.ylabel('frequency')


plot_data(minis_data, 'Decay (ms)', drug_map)

Because we added a docstring, we can actually look at the documentation for our own custom function! Try it:

    plot_data?
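
The `?` syntax only works in Jupyter and IPython. As a sketch of how the same docstring can be read in plain Python, the example below defines a hypothetical function, `double`, and retrieves its documentation with the standard-library `inspect` module.

```python
import inspect

def double(value):
    """
    Return twice the input.

    Parameters
    ----------
    value : number
        Value to double.
    """
    return value * 2

# The docstring is stored on the function itself.
# inspect.getdoc returns it with the indentation cleaned up;
# the built-in help(double) prints a similar summary.
doc_text = inspect.getdoc(double)
print(doc_text)
```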

In [None]:
plot_data?

Plotting should work the same for all columns (i.e., 'mIPSC amplitude (pA)', 'Rise-Time (ms)', 'Decay (ms)' and 'Inter-Event Intervals (s)'). Next, we will write code that asks the user to choose a column and assigns the chosen column to a variable called `column_name`.

We can acquire user input in the following format:  
    
    response = input(text_to_display)  
    
Try running the cell below and see what happens.
    

In [None]:
message = '''
Which parameter should I plot (please specify by letter)?
    A.amplitude 
    B.rise-time 
    C.decay-time 
    D.IEI
'''
choice = input(message)
choice

## Exercise

Now that we have a letter (A/B/C/D) that specifies the parameter of interest, how would you assign the correct column name based on the letter?

You can create a dictionary called `choice_map`, which uses A/B/C/D as keys and the corresponding column names as values. (Refer to `drug_map` if you don't remember the format.) You can then extract the correct column name from `choice_map` by looking up the key specified by the user input.

In [None]:
choice = input(message)
#Answer
choice_map = {
    'A': 'mIPSC amplitude (pA)',
    'B': 'Rise-Time (ms)',
    'C': 'Decay (ms)',
    'D': 'Inter-Event Intervals (s)'
}
column_name = choice_map[choice.upper()]
print ('column_name is {x}'.format(x = column_name))

What if the user typed in something other than A, B, C or D? With the code above, you would get a `KeyError` because Python cannot find the key in `choice_map`. To get around this problem, you can use `try`/`except`. If an error is encountered in the `try` block, instead of printing an error message and terminating the cell, Python will turn to the `except` block and execute the command(s) under `except`.

Let's first look at an example together.

In [None]:
a = ['apple','amazon','airbnb']
for ind in range (0,5):
    try:
        print (a[ind])
    except IndexError:
        print ('Index {i} exceeds the upper bound of list a.'.format(i=ind))
    

Here, the largest index in a list of 3 elements is 2 (recall that Python indexing starts from 0). However, `range(0, 5)` goes up to 4, which exceeds the largest index of `a`. Therefore, when you try `a[3]` or `a[4]`, there will be an `IndexError`. This is when Python turns to the `except` block; hence you see the printed statements.
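
The same pattern applies to dictionaries, where a missing key raises a `KeyError` rather than an `IndexError`. The sketch below uses a made-up dictionary and a hypothetical helper function, `lookup_color`, to show that the `except` clause has to name the type of error you expect.

```python
# Made-up dictionary used only for illustration
fruit_colors = {'apple': 'red', 'banana': 'yellow'}

def lookup_color(fruit):
    """Return the color for a fruit, or None if the fruit is unknown."""
    try:
        return fruit_colors[fruit]
    except KeyError:
        # Only reached when the key is missing from the dictionary
        print('Unknown fruit: {f}'.format(f=fruit))
        return None

print(lookup_color('apple'))
print(lookup_color('cherry'))
```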

Now fill in the `...` to assign a string to `column_name` based on user input and the `choice_map`. Your code should print a reminder when the user inputs something other than A/B/C/D.

    choice = input(message)
    choice_map = {
            'A': 'mIPSC amplitude (pA)',
            'B': 'Rise-Time (ms)',
            'C': 'Decay (ms)',
            'D': 'Inter-Event Intervals (s)'
        }
    try:
        ...
    except ...:
        print('Please specify a letter: A/B/C/D')

In [None]:
choice = input(message)
choice_map = {
        'A': 'mIPSC amplitude (pA)',
        'B': 'Rise-Time (ms)',
        'C': 'Decay (ms)',
        'D': 'Inter-Event Intervals (s)'
    }
try:
    column_name = choice_map[choice.upper()]
    print ('The column name is {x}'.format(x = column_name))
except KeyError:
    print('Please specify a letter: A/B/C/D')

Finally let's combine column selection with plotting.

1) Get the user input using:
    
    choice = input(message)

2) Make a dictionary called `drug_map`:

    drug_map = {
            'TTX + NBQX': 'baseline',
            'TTX + NBQX + Cd': 'Drug 1',
            'TTX + NBQX + Cd + RIM': 'Drug 2',
        }  
    
3) Define a function `choice_to_column_name` that takes one parameter `choice` and returns the column name. 
        
    def choice_to_column_name(choice):
         ... use your answer above to populate this  

4) Recall that we have defined another helper function `plot_data`, which plots a histogram given the dataframe, the column name, and the drug map. You can run the function by doing the following:

    plot_data(data, column, drugs)  
    
5) Using the two helper functions, define `plot_hist`, where data is the general dataframe (e.g. minis_data); choice is a string that specifies user selection; drugs is a dictionary that links experimental conditions with drug names.
    
    def plot_hist (data, choice, drugs): 
        ...           

6) Try running `plot_hist` and see if it works properly!

    


In [None]:
# Answer
choice = input(message)
drug_map = {
    'TTX + NBQX': 'baseline',
    'TTX + NBQX + Cd': 'Drug 1',
    'TTX + NBQX + Cd + RIM': 'Drug 2',
}

def choice_to_column_name(choice):
    choice_map = {
        'A': 'mIPSC amplitude (pA)',
        'B': 'Rise-Time (ms)',
        'C': 'Decay (ms)',
        'D': 'Inter-Event Intervals (s)'
    }
    try:
        return choice_map[choice.upper()]
    except KeyError:
        print('Please specify a letter: A/B/C/D')

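def plot_hist(data, choice, drugs):
    column_name = choice_to_column_name(choice)
    # choice_to_column_name returns None for an invalid choice, so skip plotting.
    if column_name is not None:
        # Use the `drugs` argument rather than the global `drug_map`.
        plot_data(data, column_name, drugs)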

    
plot_hist(minis_data, choice, drug_map)

# Bonus: Pyplot vs object-oriented Matplotlib interface

There are two interfaces to Matplotlib. The first, which we've used extensively in this class, is `pyplot`. The second is known as the object-oriented interface. While `pyplot` is designed to offer a MATLAB-style plotting experience, the object oriented interface is much more powerful and allows you to customize your plots in greater detail.

Let's compare how we might use each of the two interfaces to plot the histograms.

In the example below, `subplot`, `hist` and `title` are all functions available through the `pyplot` module. Internally, `pyplot` has to remember what the *current* axes (i.e., subplot) is. This means you must make your subplot, title it and label the axes before moving onto the next subplot.
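
As a small sketch of this "current axes" behavior, the example below uses `plt.gca()` ("get current axes") to show which subplot `pyplot` is tracking at each step. The variable names and the title are hypothetical and no data are plotted.

```python
import matplotlib.pyplot as plt

# Start a fresh figure so earlier plots do not interfere
plt.figure()

# After this call, the left subplot is the current axes
plt.subplot(121)
left_axes = plt.gca()

# After this call, the right subplot becomes the current axes
plt.subplot(122)
right_axes = plt.gca()

# plt.title acts on the current axes, i.e., the right subplot
plt.title('Right panel')

print(repr(left_axes.get_title()))
print(repr(right_axes.get_title()))
```

Only the right subplot receives the title, because it was the current axes when `plt.title` was called.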

In [None]:
plt.subplot(121)
plt.hist(amp_drug1, bins=8)
plt.title('Drug 1')
plt.xlabel('mIPSC amplitude (pA)')
plt.ylabel('Number of events')

plt.subplot(122)
plt.hist(amp_drug2, bins=8)
plt.title('Drug 2')
plt.xlabel('mIPSC amplitude (pA)')
plt.ylabel('Number of events')

# Autofix issues where text labels overlap
plt.tight_layout()

The following example uses the object-oriented interface. First, we use a function from `pyplot`, `subplots`, to generate our figure and the set of subplots we will be using. Then, we take each axes object returned by `subplots` and use the `hist`, `set_title`, `set_xlabel` and `set_ylabel` methods to format the plot. Since we are calling methods on each axes object, Matplotlib knows which subplot we are attempting to manipulate.

In [None]:
# object oriented interface
figure, axes = plt.subplots(1, 2, sharex=True, sharey=True)

# Get a reference to each individual axes
axes_left = axes[0]
axes_right = axes[1]

# Note that you are calling `hist` as a method on the Axes object
axes_left.hist(amp_drug1, bins=8)
axes_right.hist(amp_drug2, bins=8)

axes_left.set_title('Drug 1')
axes_right.set_title('Drug 2')

for ax in axes:
    ax.set_xlabel('mIPSC amplitude (pA)')
    ax.set_ylabel('Number of events')
    
figure.tight_layout()

# Adjusting the dimensions of a figure and exporting

Bonus code! The code below changes the dimensions of a figure. Once the figure looks right, it can be exported with the figure's `savefig` method.

In [None]:
data = [amp_baseline, amp_drug1, amp_drug2]
colors = ['grey','blue','red']
plt.hist(data, color=colors, bins=20, ec=None)
fig = plt.gcf()
fig_size = fig.get_size_inches()
print(fig_size)
fig.set_size_inches((fig_size[0]*1.5, fig_size[1]*1.5))
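
For the exporting step, here is a minimal sketch using `fig.savefig`. The data are made up, and the file name `amplitude_hist.png`, the temporary output folder, the figure size and the resolution are all arbitrary choices for illustration; in practice you would save into your own project folder.

```python
import os
import tempfile

import matplotlib.pyplot as plt
import numpy as np

# Made-up amplitudes for illustration only
rng = np.random.default_rng(0)
fake_amplitudes = rng.gamma(shape=4.0, scale=10.0, size=500)

# Set the figure dimensions (width, height in inches) when creating the figure
fig, ax = plt.subplots(figsize=(9, 6))
ax.hist(fake_amplitudes, bins=20)
ax.set_xlabel('Amplitude (pA)')
ax.set_ylabel('Number of events')

# Hypothetical output location: a file in the system's temporary folder
output_path = os.path.join(tempfile.gettempdir(), 'amplitude_hist.png')

# The file extension determines the format; dpi sets the resolution
fig.savefig(output_path, dpi=300)
print(output_path)
```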