The following instructions explain how to replicate the results and figures in the paper:
Kajić, I., Gosmann, J., Stewart, T., Wennekers, T., Eliasmith, C.: "Towards a Cognitively Realistic Representation of Word Associations"
Most of the requirements should be installable with pip.
For running the spiking neural network model:
For processing the raw Google n-gram data:
Normally, the preprocessed n-gram data is fetched from figshare, so this dependency is not required; it is only needed to regenerate the data published on figshare.
Clone the repository into the folder where you want to save the project:
git clone git@github.com:ctn-archive/kajic-cogsci2016.git
Getting and processing the data can take a long time (up to a few hours, depending on the machine).
This project uses data available from other online sources:
To fetch this data and generate the corresponding matrices used in the paper, run:
doit
in the cloned repository. This command fetches the data from the corresponding sources and generates matrices in the following folders:
Free association norms and Google n-grams: ./data/associationmatrices/
SVD-reduced representations of free norms and n-gram data: ./data/semanticpointers/
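The SVD reduction step can be sketched as follows. This is a generic illustration with a toy random matrix, not the repository's actual code; the real pipeline operates on the association matrices generated above:

```python
import numpy as np

# Toy stand-in for a word-association matrix (rows/columns = words).
# The real matrices live in ./data/associationmatrices/.
rng = np.random.default_rng(0)
assoc = rng.random((50, 50))

# Truncated SVD: keep only the k strongest singular directions.
k = 10
U, s, Vt = np.linalg.svd(assoc, full_matrices=False)
reduced = U[:, :k] * s[:k]  # k-dimensional representation per word

print(reduced.shape)  # (50, 10)
```

Each row of `reduced` is a compact vector for one word; these low-dimensional vectors are the kind of representation stored in ./data/semanticpointers/.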
The raw Google n-grams are over 120 GB; for that reason, the default setting in the script will not attempt to download the raw data. Instead, it fetches the processed data stored on figshare, which is around 200 MB.
To obtain the data for Figure 3 A) and B) in the paper, you need to run the network simulation that actually produces the spikes. This requires up to 6 GB of memory and can be done with:
python sparat/model/benchmark.py
To reproduce the RMSE plot in Figure 3 C), we need to run several different models, each with a different number of neurons. Again, this is a computationally demanding step which can require up to 6 GB of memory and some time. To do so, run:
psy-doit
and go grab a coffee. This will produce psywork/result.h5. Copy or move this file to data/neural-accuracy.h5. Alternatively, you can download the data we used in the paper from figshare.
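The RMSE reported in Figure 3 C) is the root-mean-square error between decoded and target values. As a minimal sketch of the metric itself (the exact quantities compared are defined in the benchmark code, not here):

```python
import numpy as np

def rmse(decoded, target):
    """Root-mean-square error between decoded and target vectors."""
    decoded = np.asarray(decoded, dtype=float)
    target = np.asarray(target, dtype=float)
    return np.sqrt(np.mean((decoded - target) ** 2))

print(rmse([1.0, 2.0, 3.0], [1.0, 2.0, 3.0]))  # 0.0
print(rmse([0.0, 0.0], [3.0, 4.0]))            # sqrt((9 + 16) / 2) ~ 3.5355
```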
All the figures in this paper have been generated using Python scripts in Jupyter notebooks in the directory notebook.
The following notebooks reproduce the data:
Target positions.ipynb: Table 1, Figure 1
Match with experimental data with curve fitting.ipynb: Table 2, Figure 2
Neural Accuracy.ipynb: Figure 3 C)
Neural.ipynb: Figure 3 A), B)
Running these notebooks will generate contents in:
./txt/cogsci-paper/{figures,tables}
Data files are not included in the repository; they are either downloaded from external resources or generated by scripts.
Figures used in the paper.
Jupyter notebooks with data analyses and plotting.
Task definition files for the serial farming tool psyrun.
Data processing scripts meant to be invoked from the command line.
Python source code for processing the data and the model, excluding command line tools.
Generated LaTeX tables for the paper.
Documentation including the CogSci paper.