# Welcome to Mind-reading with Movies and Neuroimaging, 2023!

Congratulations, if you are viewing this Jupyter notebook, you have already acquired many of the skills necessary to excel in this course and you are well on your way to learning cutting-edge methods for cognitive neuroscience!

In this course we will use a variety of tools, many of which will likely be new to you. Don't worry if you are having trouble wrapping your head around them now: by the end of this course you will be proficient in not only these useful skills but also the exciting analyses that use them. We also hope that this course gives you the confidence to adopt methods that aren't covered here!

## Goal of this script
Familiarize yourself with the tools that will be used in the course exercise notebooks.


## Table of Contents
[0. Survey](#survey)

[1. GitHub](#git)  

[2. Jupyter](#jupyter)  

[3. Python](#python)  

[4. fMRI](#fMRI)  

[5. Movies](#movies)  

[6. BrainIAK](#brainiak)  

[7. Getting started](#starting) 

Exercises
>[Survey](#survey)  
>[Exercise 1](#ex1)    [2](#ex2)    [3](#ex3)  

## 0. Survey <a id="survey"></a>

Before we jump into the material we would like to survey you to get to know you better and to find out your familiarity with the content we will be covering. Only your instructors will be able to see what you say. We want this course to be accessible, but it will be a challenge for students who have no experience with the skills we will be developing here. Next week we will check in to see if this course is right for you.

Please fill out the survey by editing the [markdown](https://en.wikipedia.org/wiki/Markdown) text after where it says "A:"

**Q1**: Have you ever used Jupyter Notebooks before?  
If your answer is no, you might not be able to even answer the question! We encourage you to go to this [Jupyter notebook tutorial](https://mybinder.org/v2/gh/ipython/ipython-in-depth/master?filepath=binder/Index.ipynb). In particular, the first 3 tutorials are critical: **1. Notebook Basics**, **2. IPython - beyond plain python**, and **3. Markdown Cells**.  


A: 

**Q2**: List the programming languages you feel comfortable with:

A: 

**Q3**: Describe your experience (if any!) with git:  
  
A: 

**Q4**: Have you taken Psych204A or B? If you have not, please go to the [fMRI](#fMRI) section and watch the four videos listed (a total of ~2 hours). Please confirm that you have done so in the answer below.

A:

**Q5**: Describe your experience with fMRI data, including relevant coursework, data collection, data analysis, etc. If you have done preprocessing, what software did you use (AFNI, FSL, SPM, BrainVoyager, fmriprep etc.)?

A: 

**Q6**: **Bold** the following methods you have heard of before, *italicize* the methods you have used before (don't know how to format markdown text like this? Make sure you did the tutorial in Q1!):

A:  
- Support Vector Machine
- Searchlights
- Representational Similarity Analysis  
- Intersubject Connectivity Analysis 
- Shared Response Modeling
- Hidden Markov Modeling
- Encoding models

**Q7**: Describe your experience with high-performance computing. Have you logged in to Stanford's cluster (e.g., Farmshare, Sherlock) before? Have you used other clusters on campus or in the cloud?

A: 

**Q8**: Help us understand why you are taking this course. Create a new cell below this prompt containing three goals of what you hope to learn/accomplish this semester (don't know how to add a cell? Make sure you have done the tutorials in Q1):

## 1. GitHub<a id="git"></a>

To be looking at this notebook means you must have forayed into GitHub. GitHub is a website for hosting repositories managed with git, a version control system that allows you to manage code when working with others and to share code publicly.

GitHub can be scary, especially given that a lot of the language used to explain it and its goals can be unfamiliar. The results you get from Googling 'github for beginners' usually assume a level of knowledge that many don't have. A good step zero for understanding git is found [here](https://readwrite.com/2013/09/30/understanding-github-a-journey-for-beginners-part-1/) (yes, this is from 2013, but it is still relevant and a great first step!). Another useful way to get familiar with the basics of GitHub is by doing this quick [tutorial](https://guides.github.com/activities/hello-world/).

**Some additional things you should be aware of:**

*It is hard to break git:* because git is a version control system, most of the time you will have backups of everything you do. It is actually very hard to delete what you have done. That leads to...

*It is easy to tangle up a git:* your local repository (i.e., the folder on your computer/cluster) can easily fall out of sync with your remote (i.e., the copy on the GitHub website) if you are collaborating with others or you have two copies of a single git repository. To minimize the risk of this, you should...

*Commit often:* This is a cardinal rule. If you are regularly committing then you will catch inconsistencies between repositories before they fester. 

*Don't put data on there!* This is very important. Firstly, fMRI data can contain confidential information and GitHub is not a secure place to store it. Moreover, you shouldn't store more than 100 MB in a repo because it makes things really slow.


## 2. Jupyter<a id="jupyter"></a>

Jupyter Lab is a convenient GUI for running simple code and visualizing results. Jupyter is not an ideal place to learn a new language like Python since it won't provide a lot of the helpful nudges that coding environments like [VS Code](https://code.visualstudio.com/), [PyCharm](https://www.jetbrains.com/pycharm/) or [Spyder](https://www.coursera.org/lecture/python-programming-introduction/introduction-to-the-spyder-ide-ywcuv) do, but it is a perfect tool to debug analyses and distribute results once you have some Python expertise. 

We will mainly use the notebook feature of Jupyter in this course. As mentioned earlier, a great tutorial is here: [Jupyter notebook tutorial](https://mybinder.org/v2/gh/ipython/ipython-in-depth/master?filepath=binder/Index.ipynb).

Although most of your experience with Jupyter will be using a Python kernel, you can actually use Jupyter for multiple languages, like [R](https://github.com/IRkernel/IRkernel) or [MATLAB](https://github.com/Calysto/matlab_kernel).


**Really (really) useful shortcuts:**  
`ESC` - Change from editing mode into command mode  

In command mode:

`a` - make new cell above  
`b` - make new cell below  
`dd` - delete cell  
`m` - convert a code cell into a markdown cell  

While writing code:  
`[TAB]` - Autocomplete code  
`[SHIFT] + [TAB]` - Open the Help on a function  

Jupyter Magics (magics are meta-parameters you can set in a notebook to control how the notebook works):  
`%matplotlib inline` - render any plots generated in-line


<div class="alert alert-block alert-warning">
<strong>Don't forget to save regularly!</strong> The jupyter-notebook will run for 4 hours at a time, and if you don't save before your job dies, you CANNOT save it, even if you restart a new job. This means you will lose everything unsaved...   <br>
    <br>
...That said, the window will stay open in your browser even if you lose internet connection or the job ends, so you could copy over the changes
</div>

## 3. Python<a id="python"></a>

Over recent years, Python has grown tremendously in popularity, particularly for analysis of big data. It boasts a number of benefits: open source, free, and developed extensively for machine learning. It is probably now the main language used for advanced neuroimaging analyses. 

Learning to code in Python is moderately difficult compared to other languages: sometimes the syntax is demanding (like C but unlike MATLAB) but it has a ton of online support, solid inline help, and useful error messages (like MATLAB but unlike C).

Included in this directory is a short notebook called `python_introduction.ipynb` with a basic Python intro. This intro will be good for people who are comfortable learning new computer languages but have no background in Python. Another resource that might be helpful if you are coming from MATLAB is this [cheat sheet](https://mathesaurus.sourceforge.net/matlab-numpy.html). 

If you are unfamiliar with Python and other languages a better place to start is [here](https://www.datacamp.com/courses/intro-to-python-for-data-science) or [here](https://www.learnpython.org/) (just do the free content). That said, if you are learning to code for the first time, this course probably isn't for you.  

When learning a language you should try to learn the conventions at the same time. Conventions make your code more readable, and thus more understandable and reproducible. [PEP8](https://peps.python.org/pep-0008/) is a standard convention system in Python.
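As a small illustration of the kind of conventions PEP8 describes, here is a minimal sketch. The function and variable names are made up for this example; only the naming and spacing style is the point.

```python
import numpy as np

# Constants are written in UPPER_CASE
N_TIMEPOINTS = 5


def z_score(time_series):
    """Return the time series rescaled to mean 0 and standard deviation 1."""
    # Function and variable names use lowercase_with_underscores
    centered = time_series - np.mean(time_series)
    return centered / np.std(time_series)


# Spaces around operators and after commas keep lines readable
example_data = np.arange(N_TIMEPOINTS, dtype=float)
normalized = z_score(example_data)
print(normalized)
```

Descriptive names like `z_score` and `example_data` mean that a reader can often follow the code without needing extra comments.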


## 4. fMRI<a id="fMRI"></a>

In this class we are going to use fMRI data to understand the mind. We do not have the space or time to teach you about fMRI, so a basic understanding is expected. There are great courses available at Stanford to teach you all you need to know about fMRI, namely [PSYCH204A](https://explorecourses.stanford.edu/search?view=catalog&filter-coursestatus-Active=on&q=PSYCH%20204A:%20Human%20Neuroimaging%20Methods&academicYear=20162017) and [PSYCH204B](https://explorecourses.stanford.edu/search?view=catalog&filter-coursestatus-Active=on&q=PSYCH%20204B:%20Computational%20Neuroimaging&academicYear=20182019). If you need to brush up on the details, below are four videos, primarily from MIT, that provide a great introduction of the **critical** concepts we will be treating as assumed knowledge.

In [1]:
from IPython.display import YouTubeVideo

print('Physics introduction')
YouTubeVideo('NlYXqRG7lus')

Physics introduction


In [4]:
print('Bootcamp part 1')
YouTubeVideo('yA65FuSpOMs')

Bootcamp part 1


In [6]:
print('Bootcamp part 2')
YouTubeVideo('SsJjuJJjNHM')

Bootcamp part 2


In [7]:
print('Bootcamp part 3')
YouTubeVideo('BYEWA_jsJbM')

Bootcamp part 3


To make sure we are on the same page, here is a glossary of common terms used to describe fMRI experiments and components:

> *The functional*: The f in fMRI: a volume collected every 1-2s that reflects blood flow in the brain (which reflects metabolism which reflects activity).  
> *The anatomical*: The high resolution volume used to define the participant's brain anatomy. This is used as a reference to which we align functional data.  
> *Voxel*: A 3 dimensional pixel that is the smallest divisible unit of MRI and fMRI data.  
> *TR*: Also known as Repetition Time. It is the time interval at which pulses occur and signal is collected. It can thus be considered as the sampling period of the BOLD signal. More details can be found here: https://mriquestions.com/tr-and-te.html  
> *Stimulus*: The item/event/movie that is presented to the participant in an experiment. Typically a picture, movie or a sound.  
> *Trial*: A complete unit of the experiment that typically involves an exemplar of the condition. For instance, this could be a presentation of a stimulus, but also multiple stimuli could be presented in a trial.  
> *Block*: A sequence of trials, usually grouped because they belong to the same condition or a counter-balanced set of conditions. Usually there is a break between blocks.  
> *Run*: Another name for a functional acquisition. A run of fMRI is typically less than 15 minutes; a session is broken up into runs so that you can check in on the participant.  
> *Counter-balancing*: It is important that we have a balance between the conditions and that the order they are presented is balanced to avoid order effects.  
> *Rest time*: We insert pauses between trials in a study in order to avoid fatigue.  
> *Region of interest (ROI)*: This is a region of the brain that has been selected for further analysis. It can be based on anatomical boundaries, on boundaries defined by a test of function, or be an arbitrary shape (e.g., a sphere)
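To connect a few of these terms to how data are usually handled in code, here is a minimal sketch with made-up numbers. The array sizes, the TR of 2 seconds, and the cubic ROI are all assumptions chosen for illustration, and the data are just zeros.

```python
import numpy as np

# Hypothetical acquisition: a 4 x 4 x 3 voxel grid sampled at 10 TRs
n_x, n_y, n_z, n_trs = 4, 4, 3, 10
tr_duration = 2.0  # assumed TR in seconds

# A run of functional data: one 3D volume per TR
run_data = np.zeros((n_x, n_y, n_z, n_trs))

# The time course of a single voxel is a vector with one value per TR
voxel_time_course = run_data[1, 2, 0, :]

# Convert TR indices into seconds since the start of the run
acquisition_times = np.arange(n_trs) * tr_duration

# A small cubic ROI defined as a boolean mask over voxels
roi_mask = np.zeros((n_x, n_y, n_z), dtype=bool)
roi_mask[1:3, 1:3, 0:2] = True

# Indexing with the mask yields a (voxels in ROI) x (TRs) matrix
roi_data = run_data[roi_mask]
print(voxel_time_course.shape, acquisition_times[-1], roi_data.shape)
```

Storing a run as a 4D array (three spatial dimensions plus time) is a common layout, which is why selecting a voxel or an ROI amounts to indexing the first three dimensions.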

## 5. Movies<a id="movies"></a>

In this course we will exclusively be using naturalistic stimuli. Most of the time, that means participants watching and listening to movies while undergoing fMRI; however, sometimes it will mean participants are listening to audio like music or stories. When I say that movies are naturalistic, I really just mean that they are *more* naturalistic than traditional task-based designs (i.e., experiments with traditional trial/block designs and manipulated conditions). Movies usually depict real people in real scenes, with realistic dynamics. 

Movies, of course, are not real and there are a lot of ways that they are *not* naturalistic. For instance, most movies are not from a first person perspective, most movies have scene cuts which don't happen in real life, movies often display fantastical things, and participants are passively experiencing movies rather than actively navigating the content. Nonetheless, movies typically contain more context and complexity than traditional task-based experiments.

This richness comes at the cost of experimental control, so we might need to work harder on the back end to eliminate the effect of confounds and ensure that we are testing the thing we want to test. Nonetheless, movies respect the richness of human experience in ways that might let us better capture how the mind works. Moreover, there is a folk wisdom amongst people in this field that movies evoke stronger neural responses than traditional task stimuli. Finally, movies are compatible with many populations: from infants, to children, to patients, to older adults. This creates a common ground on which all individuals can be compared. It is for these reasons that movies are incredibly popular in cognitive neuroscience, with dozens of publicly available datasets.

We will use many datasets in this course to answer the different questions we are focused on. Some of the datasets will contain minimally processed data (e.g., the functional data might just be aligned to standard space), while other datasets will include heavily preprocessed data (e.g., functional data that has been masked to include activity only from a specific region).

If you would like to learn more about the potential of movies for advancing cognitive neuroscience, [this](https://snastase.github.io/files/Nastase_NIMG_2020a.pdf) is a great paper by leaders in the field.

## 6. BrainIAK<a id="brainiak"></a>

The Brain Imaging Analysis Kit ([BrainIAK](http://brainiak.org/)) is an open source toolbox coming out of a collaboration between computer scientists and neuroscientists. It uses machine learning and high-performance parallel computing to take analyses that took months and make them run in seconds. BrainIAK contains a number of advanced tools that cannot be found anywhere else. We will cover these tools extensively so that by the end of the course you will be conducting some of the most sophisticated fMRI analyses currently possible.

A good introduction to BrainIAK comes from a recent Nature Neuroscience review:

Cohen, J.D., Daw, N., Engelhardt, B., Hasson, U., Li, K., Niv, Y., Norman, K.A., Pillow, J., Ramadge, P.J., Turk-Browne, N.B. and Willke, T.L., (2017). [Computational approaches to fMRI analysis](https://ntblab.yale.edu/wp-content/uploads/2015/01/Cohen_NN_2017.pdf). *Nature Neuroscience, 20(3)*, 304-313.

Towards the end of the course we will go beyond what BrainIAK offers, giving you insight into how to use tools at the bleeding edge of research.

## 7. Getting started<a id="starting"></a>

Below we are going to get started with some code using Python and BrainIAK.

Up until now all of the cells (sections in jupyter notebook) have been markdown, meaning that they are treated as text. Below we have cells that are treated as (Python) code.

The first thing we do is import "modules" that we are going to use. Python on its own offers very few commands for you to use. To open up the world of things you can easily do with Python, we import modules. 

In [None]:
# suppress warnings
import warnings
import sys 
if not sys.warnoptions:
    warnings.simplefilter("ignore")

# Essential module for data organization and manipulation. Used ubiquitously
import numpy as np #numpy's "nickname" is np

# Import the BrainIAK module used to simulate fMRI data
import brainiak.utils.fmrisim as sim

# The plotting tool we will be using in this course
import matplotlib.pyplot as plt

# this lets you peek what's inside a function
import inspect 

# Display any generated plots inline 
%matplotlib inline 

# Hint: you can run this cell by pressing shift-enter

### Brain template 

We are now going to use some of the tools we just loaded. First we'll call a function from `brainiak` to load a gray matter mask from the MNI152 standard brain. Here's an article talking about different anatomical standards, including MNI152: <a href="https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4324755/">Structural Brain Atlases: Design, Rationale, and Applications in Normal and Pathological Cohorts</a>


In [None]:
# Set the size (in terms of X, Y, Z) of the volume we want to create
dimensions = np.asarray([64, 64, 64])

# Generate an anatomical image with the size above of brain voxels in gray matter
# This outputs variables for two versions of the image, binary (mask) and probabilistic (template)
mask, template = sim.mask_brain(dimensions, mask_self=False)

Congrats, you just ran a command from BrainIAK!!
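To make the difference between a probabilistic image and a binary mask concrete, here is a minimal sketch on a synthetic array. It does not use the MNI152 data, and the threshold of 0.5 is an arbitrary assumption; it is not a claim about how `sim.mask_brain` builds its own mask.

```python
import numpy as np

# Synthetic stand-in for a probabilistic template (NOT the MNI152 data):
# each value is treated as the probability that a voxel is gray matter
rng = np.random.default_rng(0)
toy_template = rng.random((4, 4, 4))

# Binarize with an arbitrary threshold chosen for this example
threshold = 0.5
toy_mask = toy_template > threshold

# The template holds continuous values; the mask holds only True/False
print(toy_template.dtype, toy_mask.dtype)
print('Voxels in mask:', toy_mask.sum())
```

A binary mask like this is handy for selecting voxels, whereas the probabilistic version keeps information about how confident we are in each voxel.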

We are now going to take a slice from that template and display it.

In [None]:
# Get an axial (a.k.a. transverse or horizontal) slice halfway through the brain

mid_idx = dimensions[2] // 2 # two division signs (//) perform integer division, which is necessary because array indices must be integers
axial_slice = template[:, :, mid_idx]

# imshow can visualize a 2d array 
plt.imshow(axial_slice)
plt.title('An axial brain slice');
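The slicing code above uses `//` rather than `/`. The short sketch below, using an empty array with an assumed size of 64 in each dimension, shows why: a single `/` always produces a float, and NumPy refuses to index with a float.

```python
import numpy as np

n_slices = 64
volume = np.zeros((n_slices, n_slices, n_slices))

float_idx = n_slices / 2   # true division always returns a float: 32.0
int_idx = n_slices // 2    # floor division of two integers returns an integer: 32

# An integer index selects a 2D slice
print(volume[:, :, int_idx].shape)

# A float index is rejected, even though 32.0 is a whole number
try:
    volume[:, :, float_idx]
except IndexError as err:
    print('IndexError:', err)
```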

**Exercise 1:**<a id="ex1"></a> Building on the axial brain slice you generated above, use the mask and template versions of the MNI152 anatomical standard to generate *three* new visualizations. For example: slice along a different dimension (coronal or sagittal), add subplots with multiple axial slices, show a histogram of values in the slice, etc. The point of this exercise is to practice plotting.  

In [None]:
# Insert your code here

### "help()"

`help` is a very useful function in Python. If you type `help(function_name)` in Python, you will get some basic information about how to use this function. If you run the following line, you will see that `sim.mask_brain` takes the dimensions of x, y, and z, and then outputs an MNI152 template with the specified dimensions. Note, you can also do this by typing [SHIFT] + [TAB] while the cursor is hovering over a function name or by typing `function_name?`. 

In [None]:
help(sim.mask_brain)
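The text that `help` prints comes largely from a function's signature and its docstring. As a minimal sketch, the hypothetical function below (not part of BrainIAK) shows how writing a docstring makes your own functions self-documenting in the same way.

```python
import inspect


def scale_signal(signal, gain=2.0):
    """Multiply every value in `signal` by `gain` (hypothetical helper)."""
    return [value * gain for value in signal]


# The docstring is the descriptive text that help() displays
print(inspect.getdoc(scale_signal))

# The signature lists the arguments and their default values
print(inspect.signature(scale_signal))
```

Calling `help(scale_signal)` would show both pieces of information together.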

### Look at the source code
If you want to see the source code, you can use the `getsource` function from the `inspect` package. 

Run the following code to see the source code of `sim.mask_brain`. 

In [None]:
source_code = inspect.getsource(sim.mask_brain)
print(source_code)

### Hemodynamic Response Function

The brain response that we are measuring with fMRI is the Blood Oxygen Level Dependent Response (BOLD), which has a stereotyped behavior called the Hemodynamic Response Function (HRF). Understanding what causes the HRF and how fMRI measures it is not in the scope of this course. However, it is important to see what the HRF looks like because it influences how we interpret fMRI data. 

Below is code to plot an HRF that we typically assume: a double gamma HRF. 

In [None]:
# Preset a figure
plt.figure()

# Functions that start with '_' are hidden when you use [TAB] to list the functions. This reduces clutter. However, you can still use them, like below
hrf = sim._double_gamma_hrf(temporal_resolution=1)

# Plot the figure and add labels
plt.plot(hrf) 
plt.xlabel('Time since onset (s)')
plt.ylabel('Response amplitude');
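The name "double gamma" refers to the shape being the difference of two gamma functions: one for the main positive response and a smaller, later one for the undershoot. The sketch below builds such a curve by hand to show the idea. The parameter values are commonly used illustrative choices and are assumptions here; they are not necessarily the ones BrainIAK uses internally.

```python
import math

import numpy as np


def gamma_density(t, shape):
    """Gamma probability density with unit scale, evaluated at times t."""
    return t ** (shape - 1) * np.exp(-t) / math.gamma(shape)


# Assumed illustrative parameters (not necessarily those used by BrainIAK)
peak_shape = 6            # controls when the positive response peaks
undershoot_shape = 16     # controls when the undershoot occurs
undershoot_ratio = 1 / 6  # size of the undershoot relative to the peak

time = np.arange(0, 30, 1.0)  # seconds, sampled once per second
toy_hrf = (gamma_density(time, peak_shape)
           - undershoot_ratio * gamma_density(time, undershoot_shape))

print('Peak at', time[np.argmax(toy_hrf)], 's')
print('Lowest point at', time[np.argmin(toy_hrf)], 's')
```

Passing `toy_hrf` to `plt.plot` should give a curve with the same general shape as the one above: a rise to a peak a few seconds after onset, followed by a dip below baseline.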

Here is an example of how the brain might respond to events. Specifically, we are going to create some events that are arbitrary here but could represent faces appearing for example. We then **convolve** the HRF with those events to get a predicted time course. Convolution is a tricky topic, but [here](https://betterexplained.com/articles/intuitive-convolution/) is a great intuitive introduction.

In [None]:
# When do events occur
event_sequence = [5, 12, 14, 20, 32]

# Insert the events into an empty vector, where each element is a timepoint (called a TR)
stim_function = np.zeros((40, 1))
stim_function[event_sequence] = 1

# Convolve the stim_function with the double gamma HRF
response = sim.convolve_hrf(stim_function, tr_duration = 1, temporal_resolution=1)

# Plot the convolved response
plt.plot(response)

# Plot the event onset (i.e., the stim_function)
plt.vlines(event_sequence, 0, 1, 'k')

# Describe the plot
plt.legend(['Convolved brain response', 'Event onset'])
plt.xlabel('Time (s)')
plt.ylabel('Response');
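To see what convolution is doing, it can help to work through a tiny case by hand. The sketch below uses NumPy's `np.convolve` with a made-up four-point kernel instead of the double gamma HRF, so the numbers are easy to check. The kernel values and event times are assumptions for illustration only.

```python
import numpy as np

# Toy response kernel (hypothetical values, NOT the double gamma HRF)
kernel = np.array([0.0, 0.5, 1.0, 0.5])

# Events at timepoints 2 and 4 in a 10-timepoint vector
events = np.zeros(10)
events[[2, 4]] = 1

# Full convolution, trimmed back to the original number of timepoints
predicted = np.convolve(events, kernel)[:len(events)]
print(predicted)
```

Each event places a copy of the kernel starting at its onset, and where the copies overlap (timepoint 5 here) their values are added together.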

**Exercise 2**<a id="ex2"></a>: How should we change **brain data** so that the onset of events (e.g., a person appears in the scene, or when words are spoken) corresponds to the peak of the brain's response to the events? Consider the plot above to form your answer.

**A:**

**Exercise 3:**<a id="ex3"></a> Congratulations, you just completed your first assignment! Now, you need to submit your assignment to GitHub Classroom.

First, make sure this notebook is saved (it will automatically save every 5 seconds, but never bad to double check!). Then, go to the toolbar at the top and under "Kernel" select "Restart Kernel and Run All Cells". This runs all the code that you've added here, in order from top to bottom, providing us a clean notebook to grade with all your output. **Make sure that your code has run all the way through by checking the output of each cell. If there is a bug in your code, the entire notebook will stop executing, and we cannot grade it if your code hasn't been run!**

Open a shell terminal window and change directory (`cd`) to go into the folder where this notebook is stored. Then type the following commands on the command line:

`git add -A` This tells git to stage all new, changed, and deleted files in your repository so that they are included in your next commit. 
- You can also add individual filenames here, such as `git add mmn23-week01-setup.ipynb` instead, if you only want to add specific files. 

`git commit -m 'MESSAGE'` This records the files you added as a commit in your local repository, with whatever message you want (e.g., “final homework submission”)

`git push` This uploads your commits to your repository on GitHub! 

Then, log into your GitHub account in a web browser and make sure it all looks okay online. You can update your homework assignment as many times as you would like before the deadline.

## Contributions<a id="contributions"></a>
  
M. Kumar, C. Ellis and N. Turk-Browne produced the initial notebook  01/2018  
T. Meissner minor edits  
Q. Lu: switch to matplotlib, fix dead links, add resources, encapsulate brainiak fmrisim  
C. Ellis updated with comments from cmhn-s19    
T. Yates edited notebook for cmhn-s21        
E. Busch edited notebook for cmhn-s22  
E. Busch edited for Grace cluster and Jupyter Lab for cmhn-s23  
C. Ellis edited for mmn23