# Time series, frequencies, and data visualization
Here, we'll demonstrate how you can use NumPy and Matplotlib to visualize time series in Python.

### By the end of this notebook, you'll be able to:
* Import a Python library and use the functions it contains
* Create NumPy arrays, and execute methods & access attributes of arrays
* Demonstrate how we can add sine waves to create complex waveforms
* Plot time series using Matplotlib
<hr>

## Importing packages
**NumPy** is a useful library that can help store and wrangle homogeneous data. "Homogeneous" means that all of the values in the data are of the same [data type](https://jakevdp.github.io/PythonDataScienceHandbook/02.01-understanding-data-types.html). In general, you should use this library when you want to do fancy things with lots of numbers, especially if you have matrices or arrays.

Before we can use NumPy, we need to import the package. We can also nickname the modules when we import them. The convention is to import `numpy` as `np`.

In [None]:
# Import packages
import numpy as np

# Use whos to see available modules
%whos

Importing a library is like getting a piece of lab equipment out of a storage locker and setting it up on the bench. Libraries provide additional functionality to the basic Python package, much like a new piece of equipment adds functionality to a lab space. Just like in the lab, importing too many libraries can sometimes complicate and slow down your programs - so we only import what we need for each program.

## Creating NumPy arrays
Once we’ve imported the library, we can create arrays in a few different ways:
* from a Python list or tuple
* by using functions that are dedicated to generating NumPy arrays, such as `arange`, `linspace`, `empty`, `zeros`, etc.
* reading data from files

## What is an array?
![](https://miro.medium.com/max/1400/1*X0Dg7QfSYtWhSAu-afi8-g.png)

(from [this article](https://towardsdatascience.com/numpy-array-cookbook-generating-and-manipulating-arrays-in-python-2195c3988b09))

In [None]:
# Create a list
lst = [1,2,3,4,5]

# Make our list into an array
my_vector = ...

In [None]:
# If we give numpy a list of lists, it will create a matrix
my_matrix = ...

### Benefits of using arrays
In addition to being less clunky & a bit faster than lists of lists, arrays can do a lot of things that lists can't. For example, we can add and multiply them element by element. We can also use the `sum` method to sum along a specific axis.
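
To make the difference concrete, here's a minimal sketch using made-up values (the variable names below are only for illustration and aren't used elsewhere in this notebook). Notice that `+` glues two lists together, but adds two arrays element by element.

```python
import numpy as np

# Hypothetical example values, not taken from the notebook
list_a = [1, 2, 3]
list_b = [10, 20, 30]
array_a = np.array(list_a)
array_b = np.array(list_b)

# Adding lists concatenates them; adding arrays works element by element
list_sum = list_a + list_b
array_sum = array_a + array_b
array_product = array_a * array_b

# The sum method can collapse a 2D array along one axis
matrix = np.array([list_a, list_b])
column_totals = matrix.sum(axis=0)  # one total per column
row_totals = matrix.sum(axis=1)     # one total per row

print(list_sum)
print(array_sum)
print(array_product)
print(column_totals, row_totals)
```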

In [None]:
sum_list = ...
sum_array = ...
mult_array = ...

print(sum_list)
print(mult_array)

### NumPy also includes some very useful array-generating functions:

* `arange`: like `range`, but gives you a useful NumPy array instead of a `range` object, and can use more than just integers
* `linspace`: creates an array with given start and end points, and a desired number of points
* `logspace`: same as `linspace`, but the points are evenly spaced on a log scale
* `random`: can create arrays of random numbers (there are <a href="https://docs.scipy.org/doc/numpy-1.14.0/reference/routines.random.html">many different ways to use this</a>)
* `concatenate`: joins two or more arrays along an existing axis [<a href="https://docs.scipy.org/doc/numpy/reference/generated/numpy.concatenate.html">documentation</a>]
* `hstack` and `vstack`: stack arrays horizontally or vertically

Whenever we call these, we need to use whatever name we imported numpy as (here, `np`).
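
Here's a quick, non-exhaustive sketch of what calling these functions can look like. The start, stop, and size values are arbitrary choices for illustration, and `np.random.rand` is just one of the many functions in the `random` module.

```python
import numpy as np

steps = np.arange(0, 1, 0.25)          # start, stop (excluded), step
evenly_spaced = np.linspace(0, 1, 5)   # start, stop (included), number of points
powers_of_ten = np.logspace(0, 3, 4)   # 10**0 through 10**3, 4 points
noise = np.random.rand(3)              # 3 random floats between 0 and 1

joined = np.concatenate([steps, evenly_spaced])  # 1D arrays joined end to end
side_by_side = np.hstack([steps, steps])         # still 1D, twice as long
stacked_rows = np.vstack([steps, steps])         # 2D, one row per input array

print(steps)
print(evenly_spaced)
print(powers_of_ten)
print(joined.shape, side_by_side.shape, stacked_rows.shape)
```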

In [None]:
# Try the functions above


## Use NumPy to generate simple sine waves

Now that we understand the basics of NumPy, let's generate a sine wave using the `np.sin` function. We'll then generate a second sine wave and add these together to understand what a Fourier transform of this data would look like. **Sine waves** are defined by their frequency, amplitude, and phase.
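
As a rough sketch of how those three properties enter the math, a sine wave can be written as `amplitude * sin(2 * pi * frequency * t + phase)`. The helper `make_sine` below is hypothetical (it isn't part of NumPy or of this notebook), and its default duration and sampling rate are arbitrary choices.

```python
import numpy as np

def make_sine(freq_hz, amplitude, phase_rad, duration_s=1.0, sampling_hz=1000):
    """Hypothetical helper: return a time vector and one sine wave."""
    # Build the time vector from whole sample numbers, then convert to seconds
    t = np.arange(int(duration_s * sampling_hz)) / sampling_hz
    signal = amplitude * np.sin(2 * np.pi * freq_hz * t + phase_rad)
    return t, signal

# Same frequency and amplitude, but the second wave is shifted by a quarter cycle
t, wave = make_sine(freq_hz=5, amplitude=2, phase_rad=0)
t, shifted = make_sine(freq_hz=5, amplitude=2, phase_rad=np.pi / 2)

print(wave[0], shifted[0])  # the unshifted wave starts at 0, the shifted one at its peak
print(wave.max())           # the peak height is set by the amplitude
```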

In [None]:
sampling_freq = 1024 # sampling frequency
dur = 1 # 1 second of signal
freq = 7 # 7 Hz signal
freq2 = 130 # 130 Hz signal

t = np.arange(0, dur, 1/sampling_freq) # Generate a time vector

sig1 = np.sin(2 * np.pi * freq * t) # Generate a sine wave
sig1 = 2 * sig1 # Double the amplitude (and so increase the power) of signal 1

sig2 = np.sin(2 * np.pi * freq2 * t) # Generate another sine wave

If we want to plot our sine waves, we need another package! We'll use `matplotlib.pyplot` and import it as `plt`.

In [None]:
# Import plotting package here
import matplotlib.pyplot as plt
%whos

><b>Task</b>:
> 1. Add the two sine waves together, to create a third signal `combined_signal`.
> 2. Plot this signal using `plt.plot(combined_signal)`.
> 3. Show the plot cleanly with `plt.show()`
    

In [None]:
# Add & plot!


## Import and filter real sleep data

Normal physiological data is never as regular as the data above -- it's usually chock full of lots of different waves, as well as noise. Now that we have a sense of the tools we need, let's work with some real data.

The data we'll import here is a real 30-second extract of slow-wave sleep from a young individual, collected by the Walker Lab at UC Berkeley. This data was collected at 100 Hz from channel 'F3'. This sampling frequency is fine for EEG data, but wouldn't be enough for high-frequency spiking data. That kind of data is typically sampled at 40 **kilo**hertz.

> **Note**: Here we're using a tool called `urllib.request` to download data that is hosted online. However, when you have data that you'd like to use (as a .txt or .csv file, for example), you can use [`np.loadtxt`](https://numpy.org/doc/stable/reference/generated/numpy.loadtxt.html) to load the file into the notebook from your computer.
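
For a local file, a minimal sketch might look like the following. The file name and values are made up: we write a tiny comma-delimited file to a temporary folder first, just so there's something to load.

```python
import os
import tempfile
import numpy as np

# Write a tiny comma-delimited file to stand in for your own data
# (the file name and values here are made up for illustration)
tmp_dir = tempfile.mkdtemp()
example_path = os.path.join(tmp_dir, 'example_data.txt')
with open(example_path, 'w') as f:
    f.write('1.5,2.5,3.5\n4.5,5.5,6.5\n')

# Load the file into a NumPy array and check its shape
example = np.loadtxt(fname=example_path, delimiter=',')
print(example.shape)
```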

In [None]:
import os
import urllib.request

# URL of data to download
data_url = 'https://raphaelvallat.com/images/tutorials/bandpower/data.txt'

# Make sure the Data folder exists, then get the data and save it as "sleep_data.txt"
os.makedirs('./Data', exist_ok=True)
sleep_data, headers = urllib.request.urlretrieve(data_url, './Data/sleep_data.txt')

><b>Task</b>:
> 1. Using the NumPy function `loadtxt`, load your text file and assign it to `data`. Give `loadtxt` two arguments: `fname` and `delimiter`. This is a comma-delimited file, so `delimiter` is set to `','`.
> 2. Check the shape of your data.

In [None]:
# Load the .txt file


First, we'll plot the raw data. The data gives us the voltage values for the y axis. However, we also need to create a time vector, so that we can accurately label the x axis. The `np.arange()` function is very useful for this. We need to create a vector that is the same length as our data, but where each number is in terms of **time**, not **samples**. To do so, we'll divide a vector of sample numbers by the sampling rate.
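
Here's a small sketch of that idea on a stand-in signal. The 5-second duration and the `example_` names are assumptions for illustration, not the sleep data itself.

```python
import numpy as np

# Hypothetical stand-in for a recording: 5 seconds sampled at 100 Hz
example_fs = 100
example_signal = np.zeros(5 * example_fs)

sample_numbers = np.arange(len(example_signal))  # 0, 1, 2, ... in samples
example_time = sample_numbers / example_fs       # same length, now in seconds

print(len(example_time) == len(example_signal))
print(example_time[:3])
print(example_time[-1])
```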

In [None]:
# Define sampling frequency, num_samples, and time vector
sampling_freq = 100 # sampling frequency, in Hz
num_samples = len(data)
time_vector = np.arange(num_samples) / sampling_freq # time of each sample, in seconds (the signal is 30 seconds long)
time_vector

<font color='blue'><h2>Breakout Room Tasks</h2>First, plot the entire 30 seconds of signal! You only need to add a line of code where it says `# ADD CODE HERE`</font>

In [None]:
# Plot the signal
fig, ax = plt.subplots(figsize=(12, 4))

# ADD CODE HERE


<font color='blue'>Using NumPy slicing, plot **just the first second** of the signal. Remember that this data was sampled at 100 Hz. In other words, there are 100 data points per second.</font>
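
If slicing by time feels abstract, here's a sketch on a placeholder signal that pulls out a different window (seconds 2 to 3). The names and values are assumptions for illustration. The key step is converting seconds into sample numbers by multiplying by the sampling rate.

```python
import numpy as np

example_fs = 100                             # samples per second (assumed)
example_signal = np.arange(10 * example_fs)  # 10 seconds of placeholder samples

start_s, stop_s = 2, 3                       # window of interest, in seconds
window = example_signal[start_s * example_fs : stop_s * example_fs]

print(window.shape)
print(window[0], window[-1])
```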

In [None]:
# Plot the first second here


## About this notebook

Some of the code in this notebook was adapted from [this tutorial](https://raphaelvallat.com/bandpower.html) by Raphael Vallat, [these tutorials](https://github.com/voytekresearch/Tutorials) from Torben Noto, and _Neural Data Science_ by Pascal Wallisch.