Skip to content
code and data from Latent Structure modeling paper
Shell Other
Fetching latest commit…
Cannot retrieve the latest commit at this time.
Failed to load latest commit information.
CCA
NIF-Disorders
clustering
cogatlas
src
stopwords
topic_modeling
README

README

This repository contains code and relevant files for the Latent Structure
modeling project described in:

Poldrack RA, Mumford JA, Schonberg T, Kalar D, Barman B, Yarkoni T (2012). Discovering relations between mind, brain, and mental disorders using topic mapping. PLOS Computational Biology, in press.

Preprint available at:

http://talyarkoni.org/papers/Poldrack_et_al_in_press_PLoS_CompBio.pdf

Guide to subdirectories:
CCA - contains results from CCA analysis

NIF-Disorders - contains files used for disorders topic modeling

clustering - contains files used for clustering of disorders

cogatlas - contains files used for cogatlas topic modeling

src: contains the source files used for all analyses. these require the following
external libraries:

http://numpy.scipy.org/
http://scikit-learn.org/stable/
http://rpy.sourceforge.net/rpy2.html
http://nipy.sourceforge.net/nibabel/

also note that src/utils needs to be in the python path

1_db_foci_to_image.py - obtains coordinates from neurosynth database and creates images
- uses utils/tal_to_mni.py

1.1_merge_peakfiles.sh - creates merged image and computes mask of voxels with activation
on at least 1% of papers

1.2_extract_nidag_text.py - extract full text of each article from database

2_get_cogat_concepts.py - read cognitive atlas concepts from RDF and get loading for each document

3_create_disorder_list.py - load NIF dysfunction ontology, grab synonyms, and add
additional missing terms

4_get_disorders_NIF.py - get loading for each disorder term from corpus

5_mk_cogatlas_docs.py - make documents based on cogatlas loadings

6_mk_disorders_docs.py - make documents based on disorder loadings

7_mk_pickled_data.py - created a pickle with the full dataset to make loading easier

8_mk_8fold_cogatlas.py - make files for 8-fold CV

9_mk_8fold_disorders.py - make files for 8-fold CV

10_mk_8fold_cogatlas_mallet_data.py - make mallet data from 8-fold files

11_mk_8fold_disorders_mallet_data.py - make mallet data from 8-fold files

12_mk_8fold_cogatlas_topic_models.py - create scripts to run mallet jobs

13_mk_8fold_disorders_topic_models.py - create scripts to run mallet jobs

14_get_best_topic_likelihood.py - check likelihoods to get get dimenstionality

15.1_get_disorders_dimensionality.py - generate additional topic models to get disorder dimensionality

15.3_get_best_disorder_dimensionality.py - get dimensionality that has unique topic dists

15_run_topic_models.py - generate scripts to run final topic models

16_mk_cogatlas_loadingdata.py - load topic data and save loadingdata.txt

17_mk_disorders_loadingdata.py - load topic data and save loadingdata.txt

18_mk_all_chisq_maps.py - make all chisquare maps - uses utils/mk_chisq_maps_filter.py

18.2_mk_6mm_chisq_maps.sh - make 6mm versions of topic

19_mk_slice_images_cogatlas.py - make slice images using p values

20_mk_slice_images_disorders.py - make slice images using p values

22_mk_latexreport_cogatlas.py - create latex report

23_mk_latexreport_disorders.py - create latex report

24_run_CCA_nonneg.py - run CCA analysis

25_cluster_disorders.R - run clustering on disorders data

26_make_cognitive_topic_figure.py - generate figure 2 from initial submission

27_make_disorders_topic_figure.py - generate figure 3 from initial submission

28_mk_8fold_fulltext_topic_models.py - create scripts to run fulltext topic models

28.1_mk_8fold_fulltext_mallet_data.py - make 8-fold data for full text analysis

28.2_run_all_fulltext_nfold.sh - run mallet jobs on 8-fold full text

28.3_get_best_dimensionality.py - get best dimensionality for full text

30_get_wordhist.py - get histograms of # of docs for each filtered paper

31_count_locations.py - count # of locations reported across all papers

32_doc_topic_hist.py - make histograms of docs/topic and topics/doc (for Figure 3)

33_plot_likelihood.py - plot empirical likelihood as function of # of topics (for Figure 2)

34_run_CCA_on_topics.py - run CCA directly on topic distributions

35_mk_fold1_loadingdata.py - get loadingdata for fold1 for each ntopics

36.1_mk_slice_corr_images.disorders.py - make images using correlation rather than p value

36_mk_slice_corr_images.cogatlas.py - make images using correlation rather than p value (for figure 

37_make_cognitive_topic_corr_figure.py - make Figure 4

38_make_disorder_topic_corr_figure.py - make Figure 6

39_get_topic_hierarchy.py - create graphviz .dot file for topic hierarchy (Figure 5)
Something went wrong with that request. Please try again.