Statistics 159/259: Reproducible and Collaborative Statistical Data Science
UC Berkeley | Fall 2015
About This Repository
This repository contains our analysis and documentation of the visual object recognition data from OpenfMRI. Our main goal is to first test the reproducibility of the original study's results, then add further statistical analysis for additional insight.
Detailed descriptions of the data can be found at ds105_old. This link leads to the first dataset downloaded by make data. For the cleaner, processed version, which is the second dataset, see ds105_new. Download sizes are listed under make data below.
Please run the following commands in order. After cloning, run each make command from the root of the project folder. This will set you up to follow our project. There may be additional Makefiles in subdirectories, but they are all wrapped by the one in the root.
git clone https://github.com/berkeley-stat159/project-zeta.git: This will clone our repo to your local machine. The clone process should be quite short.
make structure: This creates the directory skeleton necessary for our project.
make data: This downloads two datasets. Their sizes are 2 GB and 12.5 GB respectively. Both are needed to complete our analysis, so please ensure that your computer has sufficient disk space and that your internet connection is stable during the download.
make validate: This validates that the datasets were downloaded correctly. It will check for appropriate hash values.
make analysis: This will run all the statistical analysis of our study. All models and figures will be generated from this command. Ensure you have enough memory. The process will take about 60 to 80 minutes to finish.
make figures: This will copy all the figures generated by make analysis into their appropriate directories. This step is required in order to generate the report.
make report: This will generate the PDF of our final report. This includes detailed write-ups for our analysis and corresponding graphs. You must have LaTeX installed.
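The make validate step above checks the downloaded files against expected hash values. As a rough illustration of that idea only (this is not the project's actual validation script), here is a minimal sketch. The function names `file_digest` and `validate` are hypothetical, and the choice of MD5 as the default algorithm is an assumption, since the hash type used by the project is not stated here.

```python
import hashlib
import os


def file_digest(path, algorithm="md5", chunk_size=1 << 20):
    """Return the hex digest of the file at `path`.

    The file is read in chunks so that multi-gigabyte downloads
    do not need to fit in memory. `algorithm` is an assumption;
    any name accepted by hashlib.new() can be passed.
    """
    hasher = hashlib.new(algorithm)
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest()


def validate(expected, algorithm="md5"):
    """Check each file against its expected digest.

    `expected` maps file paths to expected hex digests (hypothetical
    input format). Returns a dict mapping each path to True if the
    file exists and its digest matches, and False otherwise.
    """
    results = {}
    for path, digest in expected.items():
        if not os.path.isfile(path):
            results[path] = False
        else:
            results[path] = file_digest(path, algorithm) == digest.lower()
    return results
```

Calling `validate` with a dictionary of paths and known digests returns a per-file pass/fail result, so a failed or interrupted download can be identified before starting the long analysis step.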
Here are some other commands you will find helpful.
make all: Runs every step above over the entire project, then removes temporary files at the end.
make clean: Removes compiled .pyc files and other unnecessary files throughout the project directory.
make test: Will run all our tests of the data and our scripts.
make verbose: Runs the same tests as make test, but with verbose output that gives more information.
make coverage: Generates a coverage report for the functions in the
Tzu-Chieh Chen tcchenbtx
Edith Ho edithhcw
Zubair Marediya Zubair-Marediya
Michael Tran miketranx4
Dongping Zhang dpzhang
A big thank you to Jarrod Millman, Matthew Brett, J-B Poline, and Ross Barnowski for your teaching and advice throughout the semester.