# neuro boot camp day 1
## wrangling ephys data

## contents
* [1. types of data neuroscientists acquire](#data)
* [2. voltage trace as a vector (everything is a vector)](#vect)
* [3. peak detection (from scratch)](#peak)
* [4. filtering: low, high pass; notch](#filt)
* [5. spike trains, raster plots](#plot)
* [6. simple spike sorting](#sort)


## 0. preliminaries

(FYI: if you are an advanced student and you breeze through these exercises, I would point you to [Neural Data Science](https://www.sciencedirect.com/book/9780128040430/neural-data-science) by Nylen and Wallisch; a full-text PDF has been uploaded to GitHub. You can push yourself to work through some of the more advanced examples there. Everyone else may find it a useful set of prompts to consider down the road.)

In [3]:
import numpy as np
import matplotlib
%matplotlib tk

import matplotlib.pyplot as pl

import scipy.io



<a id="data"></a>
## 1. types of data neuroscientists acquire

In [None]:
# load in .npy and plot it
# what should x be???

In [None]:
# some of these are really flat?  plot one of the flattest-looking ones by itself.

In [None]:
# okay... so how can we plot these comparably?
# some people might do subplots, but we want to lean toward scaling 
# to build an intuition of Z-scoring
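One way to put traces with very different amplitudes on a comparable scale is to z-score each one: subtract its mean, then divide by its standard deviation. A minimal sketch, assuming a hypothetical 2-D array `traces` with one trace per row; the synthetic data below only stands in for the `.npy` file, whose layout is not specified here.

```python
import numpy as np

# Hypothetical stand-in data: 3 traces (rows) x 500 samples, on very different scales.
rng = np.random.default_rng(0)
scales = np.array([[1.0], [50.0], [0.01]])
offsets = np.array([[0.0], [200.0], [-3.0]])
traces = rng.standard_normal((3, 500)) * scales + offsets

def zscore(traces):
    """Z-score each row: subtract its mean, divide by its standard deviation."""
    mean = traces.mean(axis=1, keepdims=True)
    sd = traces.std(axis=1, keepdims=True)
    return (traces - mean) / sd

z = zscore(traces)
print(z.mean(axis=1))   # each row is now centered near 0
print(z.std(axis=1))    # and has unit standard deviation
```

After z-scoring, the y axis is in units of standard deviations rather than the original units, so the "flat" traces become as visible as the large ones.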

<a id="vect"></a>
## 2. voltage trace as a vector (everything is a vector)

In [None]:
# not sure what questions to ask here.

# we saw this laboriously with an image on Friday.
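As one possible starting point, the sketch below treats a trace as a plain NumPy vector: the time axis is built from the sampling rate, a time window becomes a slice, and arithmetic applies to every sample at once. The sampling rate and the synthetic `voltage` values are assumptions for illustration only.

```python
import numpy as np

Fs = 1000                                   # assumed sampling rate (Hz)
t = np.arange(2 * Fs) / Fs                  # 2 s of sample times
voltage = 5.0 * np.sin(2 * np.pi * 3 * t)   # synthetic trace (arbitrary units)

# A time window is just a slice: convert seconds to sample indices.
start, stop = int(0.5 * Fs), int(1.0 * Fs)
window = voltage[start:stop]

# Arithmetic is elementwise, so no loop is needed to rescale or offset a trace.
rescaled = voltage / 1000.0
baseline_subtracted = voltage - voltage.mean()

print(voltage.shape, window.shape)
```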

<a id="peak"></a>
## 3. peak detection (from scratch)

Python has some libraries that allow for peak detection, but I think doing some manual peak finding is a useful way to hone your python skills as well as think about ephys traces.

For the early parts of this exercise, we'll start with a simple sine wave.  But even this step requires a little bit of thought.  We're not going to find peaks on an abstract or analog sine wave, but rather on one that is explicitly sampled over time.  Play with the sampling frequencies (Fs) and sine wave frequency (f) in the code block below to see if you can build some intuition about how Fs needs to relate to f in order to be able to pick out individual peaks in our sine wave.

(add during lecture: link to discussion of Nyquist limit)
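To make the relationship concrete: a sine of frequency f can only be recovered if Fs is more than twice f (the Nyquist criterion). Below that, the samples look like a slower sine. A minimal sketch, using a hypothetical helper `apparent_frequency` that reports the strongest frequency present in the sampled data via an FFT:

```python
import numpy as np

def apparent_frequency(f, Fs, duration=1.0):
    """Frequency (Hz) with the most power in a sine of frequency f sampled at Fs."""
    n = int(round(duration * Fs))
    t = np.arange(n) / Fs                     # sample times
    y = np.sin(2 * np.pi * f * t)
    spectrum = np.abs(np.fft.rfft(y))         # magnitude at each frequency bin
    freqs = np.fft.rfftfreq(n, d=1 / Fs)      # bin frequencies, 0 .. Fs/2
    return freqs[np.argmax(spectrum)]

for Fs in [400, 30, 10]:
    print(Fs, apparent_frequency(8, Fs))
```

At Fs = 400 and Fs = 30 the 8 Hz sine is reported at 8 Hz. At Fs = 10, the samples are indistinguishable from those of a 2 Hz sine, which is why the peaks in that trace do not line up with the real ones.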

**exercise 3.1:** write a function that will find the local maxima in a sine wave.

In [None]:
#use this code block to develop your function

def find_maxima(sinewave):
    
    # your code here
    
    return result  # what should result look like?



In [None]:
# interlude
#
# what should the input sinewave look like??

# exercise 3.1a: plot 1 second of a sine wave of frequency 8 Hz.

In [4]:
Fses = [400,30,10]   # sampling frequencies
f = 8                # freq of sine wave
sample = 100

for Fs in Fses:
    x = np.arange(sample*Fs/400)
    y = np.sin(2 * np.pi * f * x /Fs) #why /Fs???
    pl.plot(x/Fs, y, "o-", label=Fs)
    
#pl.xlabel('time (s)')
#pl.legend()

In [5]:
# let's try to duplicate that more simply.
Fses = [400,28,10]
f=8
pl.figure(figsize=(8,3))

for Fs in Fses:
    #t = np.linspace(0,1,num=Fs)
    # instead of:
    t = np.arange(0.0, 1.0, 1/Fs)
    y = np.sin(2*np.pi*f*t)
    pl.plot(t, y, ".-", label=Fs)
    
#pl.xlabel('time (s)')
#pl.legend()

    
# Fs = 150;
# t = 0:1/Fs:1;
# x = cos(2*pi*10*t);

Let's proceed with Fs = 400 and f = 8.  Your first task is to find the local maxima in this trace.

In [None]:
# set up sine wave for local maxima detection
Fs = 400
f = 8
#f = 8.01

times = np.arange(0.0, 0.4, 1/Fs)
sinewave = np.sin(2*np.pi*f*times)

pl.figure()
pl.plot(times,sinewave)
pl.xlabel('time (s)')

In [None]:
sinewave.size

In [None]:
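# placeholder: fill_between needs x plus lower and upper bounds, e.g. a mean +/- sd band
# (mean and sd are not defined yet, so this stays commented out)
# pl.fill_between(times, mean - sd, mean + sd)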

In [None]:
# If the overall task is too daunting, you can start with a small piece -- 
# crop the sinewave to a region where you know there's only one peak.
#
#cropped_times = np.arange(0.0, 0.065, 1/Fs)
#cropped_sinewave = np.sin(2*np.pi*f*cropped_times);
#pl.plot(cropped_times,cropped_sinewave)
#pl.xlabel('time (s)')

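# draft solution using for loop
print("i\ttime\ty")
# start at 1 and stop one short of the end so both neighbors exist
# (at i=0, sinewave[i-1] would silently wrap around to the last sample)
for i in range(1, len(sinewave) - 1):
    x = sinewave[i]
    print(i, "\t%0.4f\t%0.9f" % (times[i], x))
    if x > sinewave[i-1] and x > sinewave[i+1]:
        print("max@", i, "\t", times[i], "\t", x)
print("\n done")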

In [None]:
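print("i\ttime\ty")

# compare against neighbors two samples away; skip two samples at each end
# so that negative indices never wrap around to the end of the array
for i in range(2, len(sinewave) - 2):
    x = sinewave[i]
    if x > sinewave[i-2] and x > sinewave[i+2]:
        print(i, "\t", times[i], "\t", x)
print("\n done")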

In [None]:
# draft solution using max

#start here -- max is green!  we'd be re-defining what max() means!
#max = max(sinewave)

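sinemax = max(sinewave)
# (index, value) pairs for every sample equal to the global maximum
timemax = [(i,x) for i,x in enumerate(sinewave) if x==sinemax]

# same idea with boolean indexing: index into times to get when the maximum occurs
timemax_a = times[sinewave==sinemax]

print(timemax, timemax_a)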

In [None]:
# can you put your code inside an executable function?

def findpeaks(signal,times):
    
    # your code here
    
    print(peaktimes,peaks)
    

In [None]:
# run findpeaks and print output describing local maxima

In [None]:
# let's embellish findpeaks by having it plot the input and the detected peaks.

In [None]:
# okay, time for the moment of truth: let's run your findpeaks function on a series of trig functions

what's up in that wobbly one?  data aren't so clean!  we need some additional scaffolding to feel good about the peaks that we're detecting

for reference, see the [MATLAB findpeaks documentation](http://www.mathworks.com/help/signal/ref/findpeaks.html) and their [peak finding tutorial](https://www.mathworks.com/help/signal/examples/peak-analysis.html).
* prominence
* min peak distance
* height threshold

![alt text](http://www.mathworks.com/help/examples/signal/win64/DeterminePeakWidthsExample_02.png "peak features")
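In Python, `scipy.signal.find_peaks` accepts the same three kinds of constraint. The sketch below is only an illustration on a made-up "wobbly" signal (an 8 Hz sine plus a small 60 Hz ripple); the threshold values are arbitrary assumptions, not recommendations.

```python
import numpy as np
from scipy.signal import find_peaks

Fs = 400
t = np.arange(Fs) / Fs                                   # 1 s of samples
wobbly = np.sin(2 * np.pi * 8 * t) + 0.2 * np.sin(2 * np.pi * 60 * t)

# No constraints: every local maximum counts, including the small wobbles.
all_peaks, _ = find_peaks(wobbly)

# With constraints: keep only tall, well-separated, prominent peaks.
peaks, props = find_peaks(
    wobbly,
    height=0.5,               # height threshold
    distance=int(0.05 * Fs),  # min peak distance, in samples
    prominence=0.5,           # how far a peak stands out from its surroundings
)

print(len(all_peaks), "unconstrained peaks;", len(peaks), "after filtering")
print(t[peaks])               # times of the surviving peaks
```

The constrained call returns one peak per cycle of the 8 Hz sine, while the unconstrained call also picks up the ripple. `props` holds the measured heights and prominences, which is handy for choosing thresholds on real data.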

In [None]:
# how do we do on real data?  Let's pull out a certain amount of time around each spike and show the average.

# we can compare your peak detector to others.
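A rough sketch of the "window around each spike, then average" idea. The helper name `average_waveform`, the window sizes, and the synthetic trace are all assumptions for illustration; on real data, `peak_indices` would come from your own peak detector.

```python
import numpy as np

def average_waveform(trace, peak_indices, pre, post):
    """Average the samples from `pre` before to `post` after each peak index."""
    snippets = [
        trace[p - pre : p + post]
        for p in peak_indices
        if p - pre >= 0 and p + post <= len(trace)   # skip peaks too close to an edge
    ]
    return np.mean(snippets, axis=0)

# Synthetic stand-in for a recording: noise plus identical "spikes" at known samples.
rng = np.random.default_rng(1)
trace = 0.1 * rng.standard_normal(2000)
spike_shape = np.array([0.0, 1.0, 3.0, 1.0, -1.0, 0.0])   # peak is at position 2
spike_indices = np.array([300, 800, 1500])                # sample index of each peak
for p in spike_indices:
    trace[p - 2 : p + 4] += spike_shape

mean_spike = average_waveform(trace, spike_indices, pre=10, post=20)
print(mean_spike.shape)
```

Because every snippet is aligned on its peak, the noise averages toward zero while the spike shape survives, with its peak sitting `pre` samples into the averaged window.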

<a id="filt"></a>
## 4. filtering: low, high pass; notch

In [None]:
# time to talk a bit about file i/o
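For the filtering part of this section, here is a minimal sketch of low-pass, high-pass, and notch filters using `scipy.signal`. The sampling rate, the 30 Hz cutoff, the filter order, and the notch `Q` are arbitrary assumptions chosen so the effect is easy to see on a synthetic signal (an 8 Hz sine plus 60 Hz "line noise").

```python
import numpy as np
from scipy.signal import butter, filtfilt, iirnotch

Fs = 1000                                   # assumed sampling rate (Hz)
t = np.arange(5 * Fs) / Fs                  # 5 s of samples
slow = np.sin(2 * np.pi * 8 * t)            # 8 Hz component
hum = 0.5 * np.sin(2 * np.pi * 60 * t)      # 60 Hz line noise
raw = slow + hum

# 4th-order Butterworth filters with a 30 Hz cutoff.
b_lo, a_lo = butter(4, 30, btype="low", fs=Fs)
b_hi, a_hi = butter(4, 30, btype="high", fs=Fs)
# Notch centered on 60 Hz; Q sets how narrow the notch is.
b_n, a_n = iirnotch(60, Q=10, fs=Fs)

# filtfilt runs each filter forward and backward, so the output is not delayed.
low_passed = filtfilt(b_lo, a_lo, raw)
high_passed = filtfilt(b_hi, a_hi, raw)
notched = filtfilt(b_n, a_n, raw)

def amplitude_at(x, f):
    """Amplitude of the frequency-f component of x (exact for whole numbers of cycles)."""
    return 2 * np.abs(np.mean(x * np.exp(-2j * np.pi * f * t)))

for name, x in [("raw", raw), ("low", low_passed), ("high", high_passed), ("notch", notched)]:
    print(name, round(amplitude_at(x, 8), 3), round(amplitude_at(x, 60), 3))
```

The low-pass keeps the 8 Hz component and removes the hum, the high-pass does the opposite, and the notch removes only a narrow band around 60 Hz.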



<a id="plot"></a>
## 5. spike trains, raster plots

In [None]:
A = np.random.choice([0,1], 1000, p=[0.9,0.1]).reshape(10,100)

fig5 = pl.figure()
pl.subplot(2,1,1)

spiketimes = [i for i,x in enumerate(A[0]) if x==1]
pl.vlines(spiketimes,0,1)

print(spiketimes)

# solution to prompt to plot ten of them together.
# same line of thinking if you want to show ten traces together.

pl.subplot(2,1,2)
for i,rows in enumerate(A):
    times = [j for j,x in enumerate(rows) if x==1]
    pl.vlines(times,i,i+1)




**provocation/challenge**:  let's look all the way back at part 1.  Can you use vlines to draw a color-coded y scale bar matched to each trace in your plot?

Here's a reminder of the scale units for each trace:
* 1: µV
* 2: dB

In [None]:
# copy your code that generated the trace here, then add to it.

# put it outside your axes and then: pl.plot(x, y, ... , clip_on=False)

<a id="sort"></a>
## 6. simple spike sorting

We can do simple spike sorting by k-means as illustrated in the tutorial docs for Neu350.  The main problem here is that there's no "algorithm" for them to discover.

*Maybe*, we start with threshold clustering??  Histogram of amplitudes.

*Then* we give them the k-means example, with some caveats.
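To make the k-means step less of a black box, here is a bare-bones version written out by hand and applied to spike amplitudes only, in the spirit of the amplitude histogram above. This is a sketch under stated assumptions, not the Neu350 tutorial code: the function name `kmeans_1d` and the synthetic two-unit amplitudes are made up, and real spike sorting usually clusters richer waveform features.

```python
import numpy as np

def kmeans_1d(values, k, n_iter=20):
    """Plain Lloyd's algorithm on 1-D data; returns (centers, labels)."""
    # Deterministic start: spread the initial centers across the data range.
    centers = np.linspace(values.min(), values.max(), k)
    for _ in range(n_iter):
        # Assign each value to its nearest center.
        labels = np.argmin(np.abs(values[:, None] - centers[None, :]), axis=1)
        # Move each center to the mean of its members (leave it alone if it has none).
        for j in range(k):
            if np.any(labels == j):
                centers[j] = values[labels == j].mean()
    # Final assignment against the final centers.
    labels = np.argmin(np.abs(values[:, None] - centers[None, :]), axis=1)
    return centers, labels

# Synthetic spike amplitudes (µV) from two hypothetical units.
rng = np.random.default_rng(2)
amplitudes = np.concatenate([rng.normal(50, 5, 200), rng.normal(120, 8, 100)])

centers, labels = kmeans_1d(amplitudes, k=2)
print(centers.round(1), np.bincount(labels))
```

One caveat worth raising in discussion: k must be chosen in advance, and k-means will happily split the data into k groups even when fewer real units are present.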

Finally, if time permits, we should let people play with https://github.com/tridesclous/tridesclous.  No problem just showing them some screenshots.

In [None]:
# if time permits

## ben did an exercise on Friday showing an IFR.  How do we get there?

## do we touch *neo*?
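On the IFR question: one common definition of instantaneous firing rate is the reciprocal of each interspike interval. A minimal sketch, assuming a hypothetical array of sorted spike times in seconds:

```python
import numpy as np

# Hypothetical spike times in seconds, sorted in increasing order.
spike_times = np.array([0.010, 0.030, 0.040, 0.090, 0.190])

isi = np.diff(spike_times)   # interspike intervals (s)
ifr = 1.0 / isi              # instantaneous firing rate (Hz), one value per interval

for start, rate in zip(spike_times[:-1], ifr):
    print(f"from {start:.3f} s: {rate:.1f} Hz")
```

Note that there is one fewer rate than there are spikes, so plotting the IFR against time means deciding whether each value belongs to the start, middle, or end of its interval.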