precis
Precis (formerly scythe) is a Python package for automated questionnaire abbreviation, as introduced and described in Yarkoni (2010). precis uses customizable genetic algorithms to rapidly abbreviate long questionnaire measures, often reducing their length by as much as 80-90% with relatively little loss of fidelity.
Installation
Assuming Python and pip are installed, precis can be installed from PyPI via the command line:
pip install precis
Alternatively, to get the latest (development) version, install directly from GitHub:
pip install git+https://github.com/tyarkoni/precis.git
Dependencies
Aside from the standard scientific Python packages (numpy, matplotlib, and pandas, all conveniently included in the Anaconda bundle), the only current dependency is deap, which can be installed from PyPI ("pip install deap").
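If you are unsure whether these packages are already present, a quick check along the following lines may help. This is a minimal sketch using only the standard library; the helper name missing_packages is made up for illustration and is not part of precis.

```python
from importlib.util import find_spec

# Packages named in the Dependencies section above.
REQUIRED = ["numpy", "matplotlib", "pandas", "deap"]


def missing_packages(names=REQUIRED):
    """Return the subset of `names` that cannot be found in this environment."""
    # find_spec returns None for a top-level package that is not installed.
    return [name for name in names if find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_packages()
    if missing:
        print("Missing packages: " + ", ".join(missing))
    else:
        print("All dependencies found.")
```

Anything reported as missing can then be installed with pip in the usual way.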
Quickstart
This example reproduces the core results in Eisenbarth, Lilienfeld, & Yarkoni (2014). For a more comprehensive and detailed walk-through, including generation of all the figures in the manuscript, see the demo IPython notebook, which can be rendered online. All data needed to run the example below can be found in examples/PPI-R/data; the relative paths in the code assume it is run from examples/PPI-R.
import precis
# Initialize the measure/questionnaire we want to abbreviate.
# We drop all rows with a missing value for at least one item.
ppi = precis.Measure(X='data/PPI-R_German_data.txt', missing='drop')
# Generate scale scores using the PPI-R scoring key, providing names for the columns.
ppi.score(key='data/PPI-R_scoring_key.txt', columns=['B','Ca','Co','F','M','R','So','St'], rescale=True)
# Initialize a new measure generator
gen = precis.Generator()
# Run the generator for 1000 generations
# We'll seed the random number generator to ensure deterministic results.
gen.run(ppi, n_gens=1000, seed=64)
# Save the resulting abbreviated version
abb_ppi = gen.abbreviate()
abb_ppi.save(prefix='abbreviated')
That's it! We should now have two text files in our working directory--one that provides a basic summary of the abbreviated measure, and one that contains a scoring key we can use to automatically score the abbreviated measure's scales using item scores for the original measure.
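To illustrate what "scoring the abbreviated scales from the original item scores" means, here is a small self-contained sketch. The layout of the key file that precis writes is not described here, so the key structure below (a dict mapping each scale to its retained items and weights) is a hypothetical stand-in, as is the function name score_scales.

```python
def score_scales(responses, key):
    """Score each respondent on each scale.

    responses: list of rows, one per respondent, each a list of item scores
               for the ORIGINAL (full-length) measure.
    key:       hypothetical layout, {scale_name: {item_index: weight}}, with
               weight +1 for a normally keyed item and -1 for a reversed one.
               (Real reverse-scoring conventions may differ; this is a toy.)
    Returns a list of {scale_name: score} dicts, one per respondent.
    """
    scored = []
    for row in responses:
        scored.append({
            scale: sum(weight * row[item] for item, weight in items.items())
            for scale, items in key.items()
        })
    return scored


# Toy data: 2 respondents, 6 original items (0-indexed).
responses = [
    [1, 4, 2, 5, 3, 1],
    [3, 2, 4, 1, 5, 2],
]

# Abbreviated key that keeps only 4 of the 6 original items.
abbreviated_key = {
    "A": {0: 1, 3: 1},
    "B": {1: 1, 4: -1},
}

scores = score_scales(responses, abbreviated_key)
print(scores)
```

The point is only that items dropped from the key are ignored, so existing full-length response data can be rescored without re-administering anything.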
precis provides many more options, including the ability to customize many aspects of the evaluation and abbreviation process (e.g., to adjust the amount of desired abbreviation), as well as various plotting functions that can help us evaluate the quality of the result and track the evolutionary process over successive generations. For a more detailed walk-through of some of these features, see the demo IPython notebook.
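For readers who want intuition for the trade-off between measure length and fidelity, the following toy genetic algorithm may be useful. It is not precis's implementation (precis builds on deap) and it does not use the precis API. Everything in it is an illustrative assumption: the cost function (a per-item penalty plus the share of variance in the full score that the short form fails to explain), the item_cost parameter, the selection and crossover scheme, and the synthetic data.

```python
import random


def pearson_r(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / (sxx * syy) ** 0.5


def cost(mask, data, full_scores, item_cost):
    """Item-count penalty plus the variance in the full score left unexplained."""
    if not any(mask):
        return float("inf")  # an empty measure is never acceptable
    short = [sum(v for v, keep in zip(row, mask) if keep) for row in data]
    r = pearson_r(short, full_scores)
    return item_cost * sum(mask) + (1.0 - r * r)


def abbreviate(data, item_cost=0.02, pop_size=40, n_gens=40, mut_rate=0.05, seed=0):
    """Evolve a keep/drop mask over items; returns (mask, cost, best cost per generation)."""
    rng = random.Random(seed)  # seeded for reproducible runs
    n_items = len(data[0])
    full_scores = [sum(row) for row in data]

    def fitness(mask):
        return cost(mask, data, full_scores, item_cost)

    pop = [[rng.random() < 0.5 for _ in range(n_items)] for _ in range(pop_size)]
    history = []
    for _ in range(n_gens):
        pop.sort(key=fitness)
        history.append(fitness(pop[0]))
        parents = pop[: pop_size // 2]  # truncation selection; the best half survives
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, n_items)  # one-point crossover
            child = a[:cut] + b[cut:]
            # Flip each keep/drop bit with probability mut_rate.
            child = [(not g) if rng.random() < mut_rate else g for g in child]
            children.append(child)
        pop = parents + children
    best = min(pop, key=fitness)
    return best, fitness(best), history


def make_toy_data(n_people=100, n_items=20, seed=1):
    """Synthetic responses: every item is one latent trait plus independent noise."""
    rng = random.Random(seed)
    data = []
    for _ in range(n_people):
        trait = rng.gauss(0, 1)
        data.append([trait + rng.gauss(0, 1) for _ in range(n_items)])
    return data


data = make_toy_data()
best, best_cost, history = abbreviate(data)
print("items kept:", sum(best), "of", len(best))
print("cost: first generation %.3f, final %.3f" % (history[0], best_cost))
```

In this sketch, raising item_cost pushes the search toward shorter forms, while lowering it favors fidelity to the full score. The history list plays the role of a per-generation trace of the evolutionary process.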