Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
Merge pull request #34 from lisaannyu/master
Design matrix and array of voxels in top 20% of t-statistics
Showing 14 changed files with 186 additions and 100 deletions.
Empty file.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,51 @@ | ||
from __future__ import absolute_import | ||
from .. import parse_demographics | ||
|
||
import os | ||
import csv | ||
|
||
|
||
def prepare_for_tests(): | ||
    with open('demographics.csv', 'w') as csvfile: | ||
        file_writer = csv.writer(csvfile, delimiter=',', quotechar='"') | ||
        file_writer.writerow(['id', 'gender', 'age', 'forrest_seen_count']) | ||
        file_writer.writerow(['1', 'm', '30-35', '5']) | ||
        file_writer.writerow(['2', 'm', '30-35', '1']) | ||
    test_object = parse_demographics.parse_csv('demographics.csv') | ||
    return test_object | ||
|
||
|
||
def test_seen_most_times(): | ||
    test_subjects = prepare_for_tests() | ||
    seen_count = parse_demographics.seen_most_times(test_subjects) | ||
    assert seen_count[0] == 5 | ||
    assert seen_count[1] == 1 | ||
    delete_file() | ||
|
||
|
||
def test_seen_least_times(): | ||
    test_subjects = prepare_for_tests() | ||
    seen_count = parse_demographics.seen_least_times(test_subjects) | ||
    assert seen_count[0] == 1 | ||
    assert seen_count[1] == 2 | ||
    delete_file() | ||
|
||
|
||
def test_find_id_by_gender(): | ||
    test_subjects = prepare_for_tests() | ||
    id_list = parse_demographics.find_id_by_gender(test_subjects, 'm') | ||
    assert len(id_list) == 2 | ||
    assert id_list[0] == 'm' | ||
    assert id_list[1] == 'm' | ||
    delete_file() | ||
|
||
|
||
def test_find_count_by_id(): | ||
    test_subjects = prepare_for_tests() | ||
    count = parse_demographics.find_count_by_id(test_subjects, 1) | ||
    assert count == 5 | ||
    delete_file() | ||
|
||
|
||
def delete_file(): | ||
    os.remove('demographics.csv') |
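The tests above write `demographics.csv` into the working directory and remove it with `delete_file()` as their last statement, so a failing assertion would leave the file behind. One possible alternative is sketched below, assuming a context manager that writes the fixture to a temporary directory and always cleans up. The names `temporary_demographics_csv` and `read_seen_counts` are hypothetical and are not part of this commit; `read_seen_counts` only stands in for the project's `parse_demographics` module, whose behavior is not shown here.

```python
import csv
import os
import tempfile
from contextlib import contextmanager


@contextmanager
def temporary_demographics_csv(rows):
    """Write rows to a throwaway CSV, yield its path, and always clean up."""
    directory = tempfile.mkdtemp()
    path = os.path.join(directory, 'demographics.csv')
    try:
        with open(path, 'w', newline='') as csvfile:
            writer = csv.writer(csvfile, delimiter=',', quotechar='"')
            writer.writerow(['id', 'gender', 'age', 'forrest_seen_count'])
            writer.writerows(rows)
        yield path
    finally:
        # Runs even if the body of the `with` block raises.
        if os.path.exists(path):
            os.remove(path)
        os.rmdir(directory)


def read_seen_counts(path):
    """Hypothetical stand-in parser: map subject id to seen count."""
    with open(path, newline='') as csvfile:
        return {row['id']: int(row['forrest_seen_count'])
                for row in csv.DictReader(csvfile)}


# Same two subjects as in prepare_for_tests().
rows = [['1', 'm', '30-35', '5'], ['2', 'm', '30-35', '1']]
with temporary_demographics_csv(rows) as path:
    counts = read_seen_counts(path)
```

With this shape, each test would wrap its assertions in the `with` block and no explicit `delete_file()` call would be needed.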
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,9 +1,13 @@ | ||
-.PHONY: all clean | ||
+.PHONY: all clean progress.pdf final.pdf | ||
|
||
-all: clean progress.pdf | ||
+all: clean progress.pdf final.pdf | ||
|
||
clean: | ||
-	rm -f progress.pdf | ||
+	rm -f progress.pdf final.pdf | ||
|
||
progress.pdf: progress.md | ||
	pandoc -t beamer -s progress.md -o progress.pdf | ||
|
||
|
||
+final.pdf: final.md | ||
+	pandoc -t beamer -s final.md -o final.pdf |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,83 @@ | ||
% Project Lambda | ||
% Jordeen Chang, Alon Daks, Ying Luo, Lisa Ann Yu | ||
% December 3, 2015 | ||
|
||
## The Paper | ||
- A high-resolution 7-Tesla fMRI dataset from complex natural stimulation with an audio movie | ||
|
||
## Abstract | ||
|
||
- Brief Overview of Original Study: | ||
    - 20 participants recorded at a high field strength of 7 Tesla while listening to *Forrest Gump* | ||
    - Collected fMRI data for entire movie in .nii.gz format with additional information regarding movie scenes | ||
    - Used MRI scanner and pulse oximetry to conduct blood oxygen level dependent (BOLD) imaging, structural MRI imaging, and physiological assay | ||
    - Its goal is to provide data for others to explore auditory cognition, language and music perception, social perception, etc. | ||
- Our Goal: | ||
    - Reproduce a subset of the analysis conducted by Hanke et al. | ||
    - Apply machine learning to see if we can predict if a subject was listening to a day or night movie scene based on brain state | ||
|
||
# Data Extraction & Exploration | ||
|
||
## Extraction | ||
- First had to overcome the large amount of data to sift through | ||
- Testing only on Subject 1 for sake of time and efficiency: limited project scope from 320 GB to 5 GB | ||
- Of the raw, linearly aligned, and nonlinearly aligned data, chose to use only the nonlinearly aligned data | ||
- Smoothed data by applying Gaussian filter | ||
- Designed a clean schema by introducing a data_path.json file to reference where each raw data file is located | ||
|
||
## | ||
![This is a plot of the SDs across volumes in the 4-D array for subject 1, run 1 and task 1. Though the SDs are all within the range (171, 173.5), there is still quite a bit of variability within that range.](sd.jpg?raw=true) | ||
|
||
# Methods & Results | ||
|
||
## Process Overview: Reproducing | ||
- The paper mentions two methods for correlation; we chose to reproduce the first: taking the BOLD time series and calculating the voxel-wise Pearson correlation on the raw data that Matthew provided. | ||
- Using multiple EC2 instances to run the analysis, so that we can load both two-hour time courses in a pair and parallelize the process, since correlation takes 10 minutes on our laptops. | ||
- Defined a UNIX environment variable STAT159_CACHED_DATA that allows users to choose if they want data to be recomputed. | ||
- Results are still being processed. | ||
|
||
## Analysis Overview: Predicting | ||
- Parsed through “scenes.csv” and “demographics.csv” for information about movie scenes and subjects | ||
- Conducted t-test to determine if brain signal is significantly different between day and night scenes for each voxel | ||
- Retrieved top 32 voxels with biggest change between the groups. | ||
|
||
## | ||
Step 1: Determine which slices are on and which are off | ||
|
||
![Day-night time course for Task 1, Run 1. Very few scenes take place at night.](day_night_on_off.png) | ||
|
||
## | ||
Step 2: Create a Design Matrix | ||
|
||
![Intercept, day/night, linear drift](design_matrix_new_3.png?raw=true) | ||
|
||
## | ||
Step 3: Calculate the betas for each voxel | ||
|
||
![Day_Night](day_night_3.png?raw=true) | ||
|
||
## | ||
![Linear Drift](linear_drift_3.png?raw=true) | ||
|
||
## | ||
![Intercept (baseline)](intercept_3.png?raw=true) | ||
|
||
## | ||
Step 4: Test the hypothesis that beta = 0 to generate t-statistics | ||
![Largest 32 t-statistics. Note: We took the absolute value of the t-statistics to find the farthest distances from 0, so every value shown is positive even where the underlying t-statistic is negative.](top_32_bar.png?raw=true) | ||
|
||
## | ||
![Time course for the voxel with the highest t-statistic.](highest_t_day_night.png?raw=true) | ||
|
||
## Analysis Overview: Predicting (cont.) | ||
- Using random forest with 1000 in the voxels as the features | ||
- 80% training, 20% testing | ||
- 83% accuracy, but 86% of the data is day, so the model does no better than always predicting day. | ||
- Next steps include cross validation on feature set and running prediction on other parts of the data like sentiment and interior/exterior | ||
|
||
## Future Work | ||
- Considered exploring possible physiological responses to certain movie scenes by gender and age, but not enough computing resources and data to do so | ||
- Predict familiarity with *Forrest Gump* | ||
- Difference in voxel activation between participants with high and low amounts of prior exposure | ||
- Machine learning | ||
- However, there are a relatively small number of participants, since fMRI is expensive, so the predictive ability of that study would be rather limited |
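Steps 2 through 4 of the slides (build a design matrix, estimate a beta per voxel, test beta = 0) can be sketched with ordinary least squares. This is a minimal illustration on synthetic data, not the code from this commit: the function names `glm_t_statistics` and `top_voxels`, the array shapes, and the simulated day/night regressor are all assumptions made for the example. The design matrix uses the three columns named on the slide (intercept, day/night, linear drift).

```python
import numpy as np


def glm_t_statistics(data_2d, design, contrast):
    """Fit Y = X @ beta + error per voxel; return (betas, t) for a contrast.

    data_2d  : (n_volumes, n_voxels) array of signal values
    design   : (n_volumes, n_regressors) design matrix X
    contrast : (n_regressors,) vector c, testing H0: c @ beta = 0
    """
    X = np.asarray(design, dtype=float)
    Y = np.asarray(data_2d, dtype=float)
    c = np.asarray(contrast, dtype=float)

    betas = np.linalg.pinv(X) @ Y                 # (n_regressors, n_voxels)
    residuals = Y - X @ betas
    dof = X.shape[0] - np.linalg.matrix_rank(X)   # error degrees of freedom
    sigma2 = (residuals ** 2).sum(axis=0) / dof   # error variance per voxel
    c_var = c @ np.linalg.pinv(X.T @ X) @ c       # scalar design term
    t = (c @ betas) / np.sqrt(sigma2 * c_var)
    return betas, t


def top_voxels(t, k):
    """Indices of the k voxels with the largest |t|, largest first."""
    return np.argsort(np.abs(t))[::-1][:k]


# Synthetic stand-in for one run: 120 volumes, 50 voxels of pure noise.
rng = np.random.default_rng(0)
n_vols, n_voxels = 120, 50
day_night = (np.arange(n_vols) % 10 < 8).astype(float)  # 1 = day, 0 = night
drift = np.linspace(-1, 1, n_vols)
X = np.column_stack([np.ones(n_vols), day_night, drift])

Y = rng.normal(size=(n_vols, n_voxels))
Y[:, 7] += 5.0 * day_night   # plant a strong day/night effect in voxel 7

betas, t = glm_t_statistics(Y, X, contrast=[0, 1, 0])
best = top_voxels(t, 5)
```

A t-statistic computed this way can be negative when the estimated beta is below zero, which is why the ranking uses the absolute value. In this toy setup the planted voxel should come out on top of the ranking.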