# Time series

Time series are a classic example of multivariate analysis, and one which we will investigate in more depth in a later class.

# FMRI

Seaborn's example FMRI dataset is taken from https://github.com/mwaskom/Waskom_CerebCortex_2017

If you are curious about further analysis, see the following article related to the data:
* Michael L. Waskom, Michael C. Frank, Anthony D. Wagner. "Adaptive Engagement of Cognitive Control in Context-Dependent Decision Making." Cerebral Cortex, Volume 27, Issue 2, February 2017, Pages 1270–1284, https://doi.org/10.1093/cercor/bhv333

In [None]:
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

In [None]:
fmri = sns.load_dataset("fmri")

In [None]:
fmri

We can use matplotlib to make any of the plots in this notebook. (Remember that the pandas and seaborn plotting routines are built on matplotlib.)

In [None]:
plt.plot(fmri['timepoint'],
         fmri['signal'],
         'ko')

Of course, there's more structure to the data than what's visible here.  We have different temporal evolution depending on the values in the 'subject', 'event', and 'region' columns.

In [None]:
fmri['subject'].unique()

In [None]:
fmri['event'].unique()

In [None]:
fmri['region'].unique()

In [None]:
fmri.loc[(fmri['subject']=='s0') & 
         (fmri['event']=='stim') & 
         (fmri['region']=='parietal')].sort_values(by='timepoint')

In [None]:
fmri_s0 = fmri.loc[(fmri['subject']=='s0') & 
                   (fmri['event']=='stim') & 
                   (fmri['region']=='parietal')].sort_values(by='timepoint').copy()
fmri_s1 = fmri.loc[(fmri['subject']=='s1') & 
                   (fmri['event']=='stim') & 
                   (fmri['region']=='parietal')].sort_values(by='timepoint').copy()
                   
plt.plot(fmri_s0['timepoint'],
         fmri_s0['signal'],
         'k')
plt.plot(fmri_s1['timepoint'],
         fmri_s1['signal'],
         'b')
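
The same three-condition filter recurs throughout this notebook, so it may be worth wrapping it in a small helper. The sketch below is one possible way to do that: the function name `select_series` and the toy dataframe are hypothetical, and the toy values are made up purely to mirror the column names of `fmri`.

```python
import pandas as pd

def select_series(df, subject, event, region):
    """Return one subject/event/region time series, ordered by timepoint."""
    mask = (
        (df["subject"] == subject)
        & (df["event"] == event)
        & (df["region"] == region)
    )
    return df.loc[mask].sort_values(by="timepoint").copy()

# Toy stand-in with the same column names as fmri (values are made up).
toy = pd.DataFrame({
    "subject":   ["s0", "s0", "s1", "s1", "s0", "s0"],
    "timepoint": [1, 0, 0, 1, 0, 1],
    "event":     ["stim", "stim", "stim", "stim", "cue", "cue"],
    "region":    ["parietal"] * 6,
    "signal":    [0.2, 0.1, 0.3, 0.4, 0.0, 0.05],
})

toy_s0 = select_series(toy, "s0", "stim", "parietal")
print(toy_s0)
```

With the real data, `select_series(fmri, 's0', 'stim', 'parietal')` should give the same rows as the explicit filter used for `fmri_s0`.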

Pandas dataframes have built-in plotting methods, which can be a bit easier to work with out of the box than matplotlib.

In [None]:
# Not quite this easy! Every subject, event, and region is joined into a single line.
fmri.plot(x='timepoint',y='signal')

In [None]:
ax = fmri_s0.plot(x='timepoint',y='signal',color='k')
fmri_s1.plot(x='timepoint',y='signal',color='b', ax=ax)
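
Another way to get one line per subject from pandas is to reshape the data to wide format first, since `DataFrame.plot` draws one line per column. The sketch below shows only the reshaping step on a made-up toy dataframe (an assumption for illustration). With the real data you would first filter to a single event and region, because `pivot` raises an error when an index/column pair appears more than once.

```python
import pandas as pd

# Toy stand-in: two subjects, three timepoints, one event/region (values are made up).
toy = pd.DataFrame({
    "subject":   ["s0", "s0", "s0", "s1", "s1", "s1"],
    "timepoint": [0, 1, 2, 0, 1, 2],
    "signal":    [0.1, 0.3, 0.2, 0.0, 0.4, 0.1],
})

# One row per timepoint, one column per subject.
wide = toy.pivot(index="timepoint", columns="subject", values="signal")
print(wide)

# Calling wide.plot() would then draw one line per subject column.
```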

Seaborn can also work with Pandas dataframes, and it has fantastic capabilities for making a lot of exploratory visualizations, particularly for the multivariate data we are now interested in exploring.

"lineplot" will draw a line plot with the possibility of semantic groupings. (https://seaborn.pydata.org/generated/seaborn.lineplot.html)

In [None]:
sns.lineplot(data=fmri,x='timepoint',y='signal')

What is the above showing?

It is showing the mean signal at each timepoint (the line) and a 95% confidence interval for that mean (the shaded band).
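
As a rough check on what that band represents, the sketch below computes a per-timepoint mean and a percentile bootstrap interval by hand. It is not seaborn's own code: the helper name `bootstrap_ci`, the toy dataframe, and the fixed seed are assumptions made for illustration, and the 1000 resamples are chosen to match seaborn's documented default for `n_boot`.

```python
import numpy as np
import pandas as pd

def bootstrap_ci(values, n_boot=1000, level=95, seed=0):
    """Percentile bootstrap confidence interval for the mean of `values`."""
    rng = np.random.default_rng(seed)
    values = np.asarray(values, dtype=float)
    # Resample with replacement and record the mean of each resample.
    idx = rng.integers(0, len(values), size=(n_boot, len(values)))
    boot_means = values[idx].mean(axis=1)
    lo = (100 - level) / 2
    return np.percentile(boot_means, [lo, 100 - lo])

# Toy stand-in: four observations at each of two timepoints (values are made up).
toy = pd.DataFrame({
    "timepoint": [0, 0, 0, 0, 1, 1, 1, 1],
    "signal":    [0.0, 0.1, -0.1, 0.2, 0.3, 0.5, 0.4, 0.2],
})

rows = []
for t, grp in toy.groupby("timepoint"):
    low, high = bootstrap_ci(grp["signal"])
    rows.append({"timepoint": t, "mean": grp["signal"].mean(),
                 "ci_low": low, "ci_high": high})

summary = pd.DataFrame(rows)
print(summary)
```

Passing `fmri` in place of `toy` should give numbers close to the line and band in the plot, though not identical, since the resampling is random.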

In [None]:
sns.lineplot(data=fmri.loc[fmri['subject']=='s0'],x='timepoint',y='signal')

Wait... the above shows a mean too?

Yes: the rows for subject s0 still include every event and region, so there are several signal values at each timepoint.
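
One way to confirm this is to count the rows per timepoint for a single subject. The sketch below uses a made-up toy dataframe (an assumption for illustration) with one row for each combination of two events and two regions, which gives four observations per timepoint for seaborn to average.

```python
import pandas as pd

# Toy stand-in: one subject, two timepoints, two events, two regions (values are made up).
toy = pd.DataFrame(
    [(t, e, r, 0.1 * t)
     for t in (0, 1)
     for e in ("stim", "cue")
     for r in ("parietal", "frontal")],
    columns=["timepoint", "event", "region", "signal"],
)
toy["subject"] = "s0"

# Number of signal values available at each timepoint for this subject.
per_timepoint = toy.loc[toy["subject"] == "s0"].groupby("timepoint").size()
print(per_timepoint)
```

Running the same `groupby` on `fmri.loc[fmri['subject']=='s0']` shows how many values are averaged at each timepoint in the plot above.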

In [None]:
sns.lineplot(data=fmri_s0,x='timepoint',y='signal')

Seaborn does make it easy to "split" the visualizations up using colors (hues), columns, and rows.

In [None]:
sns.lineplot(data=fmri, x='timepoint', y='signal', hue='subject')

In [None]:
sns.lineplot(data=fmri, x='timepoint', y='signal', hue='event')

In [None]:
sns.lineplot(data=fmri, x='timepoint', y='signal', hue='event', style='region')

Here again we note that the styling of a visualization merits some thought. Hue and style are useful for breaking the data into pieces for comparison, but it often takes some experimentation to find the most useful mapping.

In [None]:
sns.lineplot(data=fmri, x='timepoint', y='signal', style='event', hue='region')

The above is easier for comparison since the colors are not right next to each other.

Statistical note: by default, lineplot shows the mean and a 95% confidence interval. You can instead use the standard error, the standard deviation, or a percentile interval. See https://seaborn.pydata.org/tutorial/error_bars.html
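
To make the alternatives concrete, the sketch below computes the standard deviation, the standard error, and a 95% percentile interval per timepoint with plain pandas. The toy dataframe is made up for illustration, and this is an approximation of the quantities involved rather than seaborn's own implementation.

```python
import pandas as pd

# Toy stand-in: four observations at each of two timepoints (values are made up).
toy = pd.DataFrame({
    "timepoint": [0, 0, 0, 0, 1, 1, 1, 1],
    "signal":    [0.0, 0.1, -0.1, 0.2, 0.3, 0.5, 0.4, 0.2],
})

grouped = toy.groupby("timepoint")["signal"]
spread = pd.DataFrame({
    "mean":    grouped.mean(),
    "sd":      grouped.std(),            # sample standard deviation (ddof=1)
    "se":      grouped.sem(),            # standard error of the mean: sd / sqrt(n)
    "pi_low":  grouped.quantile(0.025),  # lower bound of a 95% percentile interval
    "pi_high": grouped.quantile(0.975),  # upper bound of a 95% percentile interval
})
print(spread)
```

The standard deviation and percentile interval describe the spread of the data, while the standard error and confidence interval describe uncertainty in the mean, so the latter shrink as more observations are added.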

In [None]:
sns.lineplot(data=fmri, x='timepoint', y='signal', style='event', hue='region', errorbar=None)

The above can be useful if you want to clear the plot of error markings and simply focus on the trend in mean.

In [None]:
sns.lineplot(data=fmri, x='timepoint', y='signal', style='event', hue='region', 
             errorbar=('se',1))

"relplot" is useful for drawing relational plots (like line and scatter plots), onto a FacetGrid (separating values of a given variable along columns or rows).

https://seaborn.pydata.org/generated/seaborn.relplot.html

In [None]:
sns.relplot(data=fmri, x='timepoint', y='signal', hue='event', col='region')

In [None]:
# Remember that fmri_s0 retains only the parietal region (and only the stim event),
# so this grid has a single column and a single hue.

sns.relplot(data=fmri_s0, x='timepoint', y='signal', hue='event', col='region')

In [None]:
sns.relplot(data=fmri, x='timepoint', y='signal', hue='event', col='region', kind='line')

# New material

Let's be more methodical about time series.

In [None]:
fmri_s0

In [None]:
plt.plot(fmri_s0['timepoint'], fmri_s0['signal'], 'ko')

In [None]:
plt.plot(fmri_s0['timepoint'], fmri_s0['signal'], 'ko')
plt.plot(fmri_s0['timepoint'], fmri_s0['signal'])

In [None]:
plt.plot(fmri_s0['timepoint'], fmri_s0['signal'])

In [None]:
plt.fill_between(fmri_s0['timepoint'], fmri_s0['signal'])

In [None]:
sns.scatterplot(data=fmri.loc[(fmri['subject'].isin(['s0','s1','s2'])) &
                              (fmri['event']=='stim') &
                              (fmri['region']=='parietal')], x='timepoint', y='signal', hue='subject')

In [None]:
sns.lineplot(data=fmri.loc[(fmri['subject'].isin(['s0','s1','s2'])) &
                              (fmri['event']=='stim') &
                              (fmri['region']=='parietal')], x='timepoint', y='signal', hue='subject')

In [None]:
sns.lineplot(data=fmri.loc[(fmri['subject'].isin(['s0','s1','s2'])) &
                              (fmri['event']=='stim') &
                              (fmri['region']=='parietal')], x='timepoint', y='signal', hue='subject', legend=False)
sns.scatterplot(data=fmri.loc[(fmri['subject'].isin(['s0','s1','s2'])) &
                              (fmri['event']=='stim') &
                              (fmri['region']=='parietal')], x='timepoint', y='signal', hue='subject')

In [None]:
d = fmri.loc[(fmri['subject'].isin(['s0','s1','s2'])) &
                              (fmri['event']=='stim') &
                              (fmri['region']=='parietal')]

ax = sns.lineplot(data=d, x='timepoint', y='signal', hue='subject',
                  palette = ['blue','red','green'], 
                  legend=False)

# First, adjust axes limits so annotations fit in the plot
ax.set_xlim(-1.0, 22.0)

# Positions
LABEL_Y = [
    -0.10,  # s2
    -0.05,  # s1
    0.0,    # s0
]

x_start = 18
x_end = 20
PAD = 0.1

COLOR_SCALE = ['blue', 'red', 'green']

# Add labels
for idx, subs in enumerate(d["subject"].unique()):
    data = d[(d["subject"] == subs) & (d["timepoint"] == 18)]
    color = COLOR_SCALE[idx]
    
    # Subject name
    text = data["subject"].values[0]
    
    # Vertical start of line
    y_start = data["signal"].values[0]
    # Vertical end of line
    y_end = LABEL_Y[idx]
    
    # Add line based on three points
    ax.plot(
        [x_start, (x_start + x_end - PAD) / 2 , x_end - PAD], 
        [y_start, y_end, y_end], 
        color=color, 
        alpha=0.5, 
        ls="dashed"
    )
    
    # Add text for subject name
    ax.text(
        x_end, 
        y_end, 
        text, 
        color=color, 
        fontsize=14, 
        weight="bold", 
        va="center"
    )

In [None]:
ds0 = fmri.loc[(fmri['subject'] == 's0') &
                 (fmri['region']=='parietal')].sort_values(by='timepoint',ignore_index=True)
dstim = fmri.loc[(fmri['subject'] == 's0') &
                 (fmri['event'] == 'stim') &
                 (fmri['region']=='parietal')].sort_values(by='timepoint',ignore_index=True)
dcue = fmri.loc[(fmri['subject'] == 's0') &
                 (fmri['event'] == 'cue') &
                 (fmri['region']=='parietal')].sort_values(by='timepoint',ignore_index=True)
dstim['event_diff'] = (dstim['signal'] - dcue['signal'])
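
The subtraction above relies on `dstim` and `dcue` having identical row order after sorting and re-indexing. A more explicit alternative, sketched below on a made-up toy dataframe (an assumption for illustration), is to pivot the events into columns so that the two signals are aligned by timepoint before taking the difference.

```python
import pandas as pd

# Toy stand-in: one subject and region, two events, three timepoints (values are made up).
toy = pd.DataFrame({
    "timepoint": [0, 1, 2, 0, 1, 2],
    "event":     ["stim", "stim", "stim", "cue", "cue", "cue"],
    "signal":    [0.10, 0.30, 0.20, 0.05, 0.10, 0.25],
})

# One row per timepoint, one column per event.
by_event = toy.pivot(index="timepoint", columns="event", values="signal")
by_event["event_diff"] = by_event["stim"] - by_event["cue"]
print(by_event)
```

Applied to `ds0`, the same pivot should reproduce the `event_diff` values, with `timepoint` as the index instead of a column.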

In [None]:
fig,ax = plt.subplots(2,1)
sns.lineplot(data=dstim, x='timepoint', y='signal', ax=ax[0])
sns.lineplot(data=dstim, x='timepoint', y='event_diff', ax=ax[1])
ax[1].set_ylabel('difference in stim vs cue')
plt.show()

In [None]:
fig,ax = plt.subplots(2,1)
sns.lineplot(data=ds0, x='timepoint', y='signal', hue='event', ax=ax[0])
ax[1].fill_between(dstim['timepoint'],dstim['event_diff'])
ax[1].set_ylabel('difference in stim vs cue')

In [None]:
dstim

In [None]:
plt.plot(dstim['signal'], dstim['event_diff'])

To color this by time, we can build it up in pieces and assign the pieces different colors.

In [None]:
import matplotlib.cm as cm

In [None]:
cm.jet(2)

In [None]:
dstim['timepoint'].values

In [None]:
[cm.jet(i) for i in dstim['timepoint'].values]

In [None]:
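# Draw the trajectory one segment at a time, colored by the segment's starting timepoint.
# An integer argument indexes jet's lookup table, so i*10 spreads the timepoints across it.
for i in dstim['timepoint'].values[:-1]:
    # .iloc[0] picks the single matching row, so the plotted values are scalars
    i0 = dstim.loc[dstim['timepoint'] == i].iloc[0]
    i1 = dstim.loc[dstim['timepoint'] == i+1].iloc[0]
    plt.scatter(i0['signal'], i0['event_diff'], color=cm.jet(i*10))
    plt.plot([i0['signal'], i1['signal']], [i0['event_diff'], i1['event_diff']], color=cm.jet(i*10))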
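
The piece-by-piece idea can also be expressed with matplotlib's `LineCollection`, which takes all the segments at once and maps a value per segment through a colormap. The sketch below uses a made-up trajectory (`t`, `x`, `y` are assumptions for illustration) instead of the fMRI columns.

```python
import numpy as np
import matplotlib.pyplot as plt
from matplotlib.collections import LineCollection

# Made-up trajectory: x and y both evolve with time t.
t = np.arange(6)
x = np.cos(t / 2.0)
y = np.sin(t / 2.0)

# Build n-1 segments, each shaped [[x_i, y_i], [x_{i+1}, y_{i+1}]].
points = np.column_stack([x, y]).reshape(-1, 1, 2)
segments = np.concatenate([points[:-1], points[1:]], axis=1)

# Color each segment by the time at its starting point.
lc = LineCollection(segments, cmap="jet", norm=plt.Normalize(t.min(), t.max()))
lc.set_array(t[:-1])

fig, ax = plt.subplots()
ax.add_collection(lc)
ax.scatter(x, y, c=t, cmap="jet", zorder=2)
ax.autoscale()
fig.colorbar(lc, ax=ax, label="timepoint")
```

For the fMRI example, `x`, `y`, and `t` would be `dstim['signal']`, `dstim['event_diff']`, and `dstim['timepoint']` converted to arrays. The colorbar gives the reader a key for the time coloring, which the loop version lacks.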