Towards a Cognitively Realistic Representation of Word Associations


Instructions for reproducing the results and plots

The following instructions explain how to replicate the results and figures in the paper:

Kajić, I., Gosmann, J., Stewart, T., Wennekers, T., Eliasmith, C.: "Towards a Cognitively Realistic Representation of Word Associations"

1. Requirements

Most of the requirements should be installable with pip.

For running the spiking neural network model:

For processing the raw Google n-gram data:

Normally, the preprocessed n-gram data is fetched from figshare, so this dependency is not required. It is only needed to regenerate the data published on figshare.

Clone the repository into the folder where you want to save the project:

git clone git@github.com:ctn-archive/kajic-cogsci2016.git

2. Fetching and processing data

Getting and processing the data can take a long time (up to a few hours, depending on the machine).

This project uses data available from other online sources:

To fetch this data and generate corresponding matrices used in the paper, run the script:

doit

in the cloned repository. This script will fetch the data from the corresponding sources and generate matrices in the following folders:

Free association norms and Google n-gram matrices: ./data/associationmatrices/

SVD-reduced representations of the free association norms and n-gram data: ./data/semanticpointers/

The raw Google n-grams are over 120 GB. For that reason, the default setting in the script does not attempt to download the raw data. Instead, it fetches the processed data stored on figshare, which is around 200 MB.
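Once `doit` has finished, it can be useful to confirm that the two output folders named above were actually populated. The following is a minimal sketch of such a check, not part of the repository: the helper name `missing_outputs` is hypothetical, and it only looks at whether each folder exists and is non-empty, not at the files inside.

```python
from pathlib import Path

# Output folders named in this README, relative to the repository root.
EXPECTED_DIRS = (
    "data/associationmatrices",
    "data/semanticpointers",
)


def missing_outputs(repo_root="."):
    """Return the expected output folders that are absent or empty.

    Hypothetical helper for illustration; it is not shipped with the project.
    """
    root = Path(repo_root)
    missing = []
    for rel in EXPECTED_DIRS:
        folder = root / rel
        # A folder counts as missing if it does not exist or has no entries.
        if not folder.is_dir() or not any(folder.iterdir()):
            missing.append(rel)
    return missing


if __name__ == "__main__":
    absent = missing_outputs()
    if absent:
        print("Missing or empty:", ", ".join(absent))
    else:
        print("All expected data folders are populated.")
```

Run it from the root of the cloned repository; an empty result means both folders contain at least one entry.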

To obtain the data for Figure 3 A) and B) in the paper, you need to run the network simulation that produces the spikes. This requires up to 6 GB of memory and can be done with:

python sparat/model/benchmark.py

To reproduce the RMSE plot in Figure 3 C), several different models need to be run, each with a different number of neurons. Again, this is a computationally expensive step that can require up to 6 GB of memory and take some time. To do so, run:

psy-doit

and go grab a coffee. This will produce psywork/result.h5. Copy or move this file to data/neural-accuracy.h5. Alternatively, you can download the data used in the paper from figshare.
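The copy step can also be scripted. The sketch below assumes it is run from the repository root and uses the two paths given above as defaults. The function name `install_result` is hypothetical and not part of the project.

```python
import shutil
from pathlib import Path


def install_result(src="psywork/result.h5", dst="data/neural-accuracy.h5"):
    """Copy the psy-doit result to the location named in this README.

    Hypothetical helper for illustration; the default paths are the ones
    given in the instructions above.
    """
    src_path, dst_path = Path(src), Path(dst)
    if not src_path.is_file():
        raise FileNotFoundError(f"{src_path} not found; has psy-doit finished?")
    # Create the destination folder if needed, then copy with metadata preserved.
    dst_path.parent.mkdir(parents=True, exist_ok=True)
    shutil.copy2(src_path, dst_path)
    return dst_path
```

Calling `install_result()` with no arguments copies rather than moves the file, so the original result stays in psywork/.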

3. Creating tables and plots

All figures in the paper were generated with Python scripts in Jupyter notebooks in the notebooks directory.

The following notebooks reproduce the tables and figures:

  • Target positions.ipynb: Table 1, Figure 1
  • Match with experimental data with curve fitting.ipynb: Table 2, Figure 2
  • Neural Accuracy.ipynb: Figure 3 C)
  • Neural.ipynb: Figure 3 A), B)
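The notebooks listed above can be opened interactively, or executed from the command line with `jupyter nbconvert --execute`. The sketch below builds one such command per notebook. It assumes the notebooks live in a `notebooks` directory (the name used in the repository organization section) and that the data from step 2 is already in place. The function names are hypothetical, and the notebooks may expect a particular working directory, so treat this as a starting point.

```python
import subprocess
from pathlib import Path

# Notebook file names as listed in this README.
NOTEBOOKS = (
    "Target positions.ipynb",
    "Match with experimental data with curve fitting.ipynb",
    "Neural Accuracy.ipynb",
    "Neural.ipynb",
)


def nbconvert_command(notebook, notebook_dir="notebooks"):
    """Build the argument list that executes one notebook in place."""
    path = Path(notebook_dir) / notebook
    # Passing a list (not a shell string) keeps file names with spaces intact.
    return [
        "jupyter", "nbconvert",
        "--to", "notebook",
        "--execute",
        "--inplace",
        str(path),
    ]


def run_all(notebook_dir="notebooks"):
    """Execute every listed notebook, stopping at the first failure."""
    for name in NOTEBOOKS:
        subprocess.run(nbconvert_command(name, notebook_dir), check=True)
```

Calling `run_all()` from the repository root executes the four notebooks in the order listed and raises `subprocess.CalledProcessError` at the first one that fails.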

Running these notebooks will generate contents in:

./txt/cogsci-paper/{figures,tables}

Repository organization

data

Data files are not included in the repository; they are either downloaded from external resources or generated by scripts.

figures

Figures used in the paper.

notebooks

Jupyter notebooks with data analyses and plotting.

psy-tasks

Task definition files for the serial farming tool psyrun.

scripts

Data processing scripts meant to be invoked from the command line.

sparat

Python source code for data processing and the model, excluding command-line tools.

tables

Generated LaTeX tables for the paper.

txt

Documentation including the CogSci paper.
