## Review exercises with electrophysiological data

In [1]:
import numpy as np
import matplotlib.pyplot as plt

In [2]:
fileName = "/home/kevin/Downloads/shortRaw.npy" # Change the path to the file you downloaded  
# or 
fileName = "../data/shortRaw.npy" # binder users or people with a local git repository
print(fileName)

../data/shortRaw.npy


In [3]:
dat = np.load(fileName)
type(dat)

numpy.ndarray

## Exercises with NumPy arrays

Here are a few problems to test your NumPy skills.

### NumPy Exercise 1

We will load a NumPy array from a file. Tell me as much information as you can about it.


In [22]:
fn = "../data/mysteryData.npy"
mysteryData = np.load(fn)
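As a starting point, here is a minimal sketch of the attributes and methods that describe an array. The `example` array is a made-up stand-in, not the content of `mysteryData.npy`; apply the same calls to `mysteryData` in the notebook.

```python
import numpy as np

# Hypothetical stand-in array; in the notebook, use mysteryData instead.
example = np.arange(12, dtype=np.float64).reshape(3, 4)

print(example.ndim)    # number of dimensions
print(example.shape)   # size along each dimension
print(example.size)    # total number of elements
print(example.dtype)   # data type shared by all elements
print(example.min(), example.max(), example.mean())  # basic statistics
```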

### NumPy Exercise 2

We will use the electrophysiological recordings that you loaded from a file. 

Could you transform the data from each recording channel into a z-score?
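One possible approach is sketched below on synthetic data. It assumes that each row is a channel and each column is a sample; check `dat.shape` to see whether that holds for the real recordings. The array `fake` and its dimensions are invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical stand-in for dat: 4 channels (rows) x 1000 samples (columns).
fake = rng.normal(loc=5.0, scale=2.0, size=(4, 1000))

# Mean and standard deviation of each channel.
# keepdims=True keeps the shape (4, 1) so that broadcasting works row by row.
chan_mean = fake.mean(axis=1, keepdims=True)
chan_std = fake.std(axis=1, keepdims=True)

# z-score: subtract the channel mean, divide by the channel standard deviation.
z = (fake - chan_mean) / chan_std

print(z.mean(axis=1))  # close to 0 for every channel
print(z.std(axis=1))   # close to 1 for every channel
```

If the channels are stored along the columns instead, use `axis=0` in both calls.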

### NumPy Exercise 3

Still using the electrophysiological recordings, which channel contains the largest value?
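A small sketch of one way to answer this kind of question, again assuming one channel per row. The 3 x 3 array `fake` is a made-up example, not the recording.

```python
import numpy as np

# Hypothetical stand-in for dat: 3 channels (rows) x 3 samples (columns).
fake = np.array([[1, 5, 2],
                 [9, 0, 3],
                 [4, 4, 4]])

# Largest value within each channel, then the channel holding the overall maximum.
max_per_channel = fake.max(axis=1)
best_channel = np.argmax(max_per_channel)

# Alternative: locate the overall maximum directly as (row, column).
row, col = np.unravel_index(np.argmax(fake), fake.shape)

print(best_channel, row, col)
```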

### NumPy and Matplotlib Exercise 4

1. Can you plot a sine wave?

2. Can you add some random noise to your sine wave and plot it?

3. Can you plot a cosine and a sine wave on the same plot?
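If you get stuck, the sketch below shows one possible way to combine the three parts in a single figure. The number of points, the noise level, and the labels are arbitrary choices for illustration.

```python
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(0)

# 200 points covering one full cycle.
t = np.linspace(0, 2 * np.pi, 200)
sine = np.sin(t)

# Gaussian noise with a standard deviation of 0.2, one value per time point.
noisy = sine + rng.normal(scale=0.2, size=t.shape)

fig, ax = plt.subplots()
ax.plot(t, sine, label="sine")
ax.plot(t, np.cos(t), label="cosine")
ax.plot(t, noisy, label="noisy sine", alpha=0.5)
ax.set_xlabel("t (radians)")
ax.set_ylabel("amplitude")
ax.legend()
```

In a notebook the figure is displayed automatically; in a plain script, call `plt.show()` at the end.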

## Pandas Data Frames

In NumPy arrays, all elements are of the same type. 

Pandas Data Frames allow you to have tabular data where you have elements of different types.

The main objects in Pandas are 

* DataFrame
* Series
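The difference between a single element type and one type per column can be seen in a short sketch. The variable names `arr` and `table` are chosen for illustration only.

```python
import numpy as np
import pandas as pd

# Mixing a number and a string in a NumPy array forces one common type:
# the number is converted to a string.
arr = np.array([82, "Luke"])
print(arr.dtype)

# A DataFrame keeps a separate type for each column.
table = pd.DataFrame({"name": ["Luke"], "grade": [82]})
print(table.dtypes)
```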


In [14]:
import pandas as pd

We can create a DataFrame from a dictionary containing lists.

In [17]:
myDict = {"name": ["Luke","Stephany","Peter","Antonio","Felix"],
          "grade": [82,95,92,85,83],
          "country": ["Canada","Germany","Germany","Peru","India"]}
myDict

{'name': ['Luke', 'Stephany', 'Peter', 'Antonio', 'Felix'],
 'grade': [82, 95, 92, 85, 83],
 'country': ['Canada', 'Germany', 'Germany', 'Peru', 'India']}

In [18]:
df = pd.DataFrame(myDict)
df

|   | name | grade | country |
|---|------|-------|---------|
| 0 | Luke | 82 | Canada |
| 1 | Stephany | 95 | Germany |
| 2 | Peter | 92 | Germany |
| 3 | Antonio | 85 | Peru |
| 4 | Felix | 83 | India |


### Pandas Series

A Pandas Series is a 1D array-like object containing a sequence of values, together with an associated array of data labels called its index.  

In [20]:
obj = pd.Series([4,7,-5,3])
obj

0    4
1    7
2   -5
3    3
dtype: int64

In [39]:
type(obj)

pandas.core.series.Series

It shows the index on the left and the values on the right. You can get them separately.

In [21]:
obj.index

RangeIndex(start=0, stop=4, step=1)

In [22]:
obj.values

array([ 4,  7, -5,  3])

We can set the index when creating the Series.

In [24]:
obj = pd.Series([4,7,-5,3],index=["a","b","c","d"])
obj.index

Index(['a', 'b', 'c', 'd'], dtype='object')

You can select a set of values with the label of the index.

In [25]:
obj['a']

4

In [27]:
obj[["a","d"]]

a    4
d    3
dtype: int64

We can select some elements using NumPy-like operations.

In [28]:
obj>3

a     True
b     True
c    False
d    False
dtype: bool

In [29]:
obj[obj>3]

a    4
b    7
dtype: int64

We can apply NumPy functions or other math operations to the values.

In [32]:
np.abs(obj)

a    4
b    7
c    5
d    3
dtype: int64
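Other math operations work in the same element-wise way and keep the index. A brief sketch, using a Series `s` with the same values as above:

```python
import numpy as np
import pandas as pd

s = pd.Series([4, 7, -5, 3], index=["a", "b", "c", "d"])

# Arithmetic with a scalar is applied to every value; the index is preserved.
doubled = s * 2
print(doubled)

# NumPy functions return a Series with the same index.
print(np.exp(s))

# Reductions return a single number.
print(s.mean())
```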

When doing arithmetic with two Series, Pandas automatically aligns the data by index label.

In [34]:
obj1 = pd.Series([1,1,1,1],index=["a","b","c","d"])
obj2 = pd.Series([1,2,3,4],index=["d","c","b","a"])
obj1+obj2

a    5
b    4
c    3
d    2
dtype: int64
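Alignment also matters when the two Series do not share all their labels. The sketch below, with made-up Series `s1` and `s2`, shows that labels present in only one Series give `NaN`, and that the `add` method with `fill_value` is one way to avoid this.

```python
import pandas as pd

s1 = pd.Series([1, 2, 3], index=["a", "b", "c"])
s2 = pd.Series([10, 20, 30], index=["b", "c", "d"])

# Labels "a" and "d" exist in only one Series, so their sum is NaN.
total = s1 + s2
print(total)

# Treat a missing value as 0 before adding.
filled = s1.add(s2, fill_value=0)
print(filled)
```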

### Pandas DataFrame

* A Pandas DataFrame is a rectangular table of data.
* It contains an ordered collection of columns, each of which can be a different value type (numeric, string, boolean, etc.).
* The DataFrame has both a row and column index.

 



In [35]:
df = pd.DataFrame(myDict)
df

|   | name | grade | country |
|---|------|-------|---------|
| 0 | Luke | 82 | Canada |
| 1 | Stephany | 95 | Germany |
| 2 | Peter | 92 | Germany |
| 3 | Antonio | 85 | Peru |
| 4 | Felix | 83 | India |


In [38]:
type(df)

pandas.core.frame.DataFrame

You can get the values of a particular column by using its name as an attribute.

In [36]:
df.name

0        Luke
1    Stephany
2       Peter
3     Antonio
4       Felix
Name: name, dtype: object

In [37]:
type(df.name)

pandas.core.series.Series

You can also use `[]` to get a column. 

In [40]:
df["country"]

0     Canada
1    Germany
2    Germany
3       Peru
4      India
Name: country, dtype: object

To get one row, you can use the `loc` attribute. It returns the row that has the requested index.

In [45]:
df.loc[0]

name         Luke
grade          82
country    Canada
Name: 0, dtype: object

In [46]:
type(df.loc[0])

pandas.core.series.Series
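Because `loc` works with index labels rather than positions, the distinction becomes visible when the index is not 0, 1, 2, and so on. A minimal sketch with a made-up DataFrame `people` whose index labels are 10, 20, 30; `iloc` is the position-based counterpart.

```python
import pandas as pd

people = pd.DataFrame(
    {"name": ["Luke", "Stephany", "Peter"], "grade": [82, 95, 92]},
    index=[10, 20, 30],  # illustrative labels that differ from the positions
)

# loc selects by index label: the row labelled 20.
by_label = people.loc[20]
print(by_label)

# iloc selects by position: the first row, whatever its label.
by_position = people.iloc[0]
print(by_position)
```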

We can add columns to a DataFrame.

In [51]:
df["age"] = [28,35,46,23,np.nan]
df

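|   | name | grade | country | age |
|---|------|-------|---------|-----|
| 0 | Luke | 82 | Canada | 28.0 |
| 1 | Stephany | 95 | Germany | 35.0 |
| 2 | Peter | 92 | Germany | 46.0 |
| 3 | Antonio | 85 | Peru | 23.0 |
| 4 | Felix | 83 | India | NaN |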
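A new column can also be computed from existing ones, and missing values (`NaN`) are skipped by default in summary statistics. The sketch below uses a small made-up DataFrame `students`; the column name `passed` and the threshold of 90 are hypothetical.

```python
import numpy as np
import pandas as pd

students = pd.DataFrame({
    "name": ["Luke", "Stephany", "Peter"],
    "grade": [82, 95, 92],
    "age": [28, 35, np.nan],
})

# New boolean column derived from an existing column (hypothetical threshold).
students["passed"] = students["grade"] >= 90

# The mean ignores the missing age.
mean_age = students["age"].mean()
print(mean_age)

# isna() marks which entries are missing.
print(students["age"].isna())
```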


Pandas DataFrames are rather complex. You can manipulate them in many ways. 

We won't go into more details here. 

## Pandas exercise 1

Load a DataFrame from a CSV file.
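As a hint, `pd.read_csv` accepts a file path or a file-like object. The sketch below reads from an in-memory string so that it runs without any file; the CSV content is made up, and in the exercise you would pass the path of your own file instead.

```python
import io
import pandas as pd

# Made-up CSV content; the first line is used as the column names.
csv_text = """name,grade,country
Luke,82,Canada
Stephany,95,Germany
"""

# io.StringIO wraps the string in a file-like object.
loaded = pd.read_csv(io.StringIO(csv_text))
print(loaded.shape)
print(loaded.columns)
```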

## Matplotlib