Cross-platform working memory span task


Py-Span-Task is a simple application for testing working memory span.


July 14, 2017:
  • Switch to fullscreen on start-up.
  • Prevent users from accidentally focusing the text field when it’s not needed. (Focusing the text field previously got the program stuck as it was impossible to escape the text field.)
May 28, 2016:
  • Allow sloppy spelling in Spanish operation span task. If people ignore the accents on words, their responses are still counted as correct. However, other one-letter typos are also accepted.
  • Improved instructions in Spanish operation span task.
June 1, 2015:
  • Added a Japanese operation span task (contributed by Kentaro Nakatani).
  • Added a German reading span task and a German operation span task (contributed by Paul Metzner).
  • More robust handling of Unicode and UTF-8-encoded materials.
  • Working memory score (partial credit unit score) is calculated and stored in results file.
  • Important settings are stored in the results file.
  • Added detailed documentation (see below).

Key features:

  • Runs on Linux, OS X, and Windows.
  • Follows the recommendations given in Conway, Kane, Bunting, Hambrick, Wilhelm, & Engle (Psychonomic Bulletin & Review, 2005).
  • Performs operation and reading span tests.
  • Is highly configurable.
  • Supports non-western scripts via Unicode and UTF-8.
  • Computes partial credit unit scores.
  • Saves a protocol of the test procedure in a simple text file that can be loaded in R, Excel, SPSS, …
  • Includes an R script for calculating a range of other measures.

Py-Span-Task is written in Python and uses the Tk toolkit for its graphical user interface. On Linux and OS X systems, the software needed to run Py-Span-Task should already be installed. Running Py-Span-Task on Windows may require installing Python (which usually includes the Tk toolkit).

Test batteries

Here is a list of test batteries that are currently available in this repository. Please note that neither the authors of Py-Span-Task nor the authors of the test batteries will be liable for any damages or incorrect results. See the license terms for details.


This software and the included test batteries are released under the terms of the GNU General Public License, version 2, or later versions of the license. Before using the software please take a moment to read the license terms.


The configuration file is a Python file that sets a number of settings described below. As an example, see the configuration of the Spanish operation span task included in this repository. When you save the configuration file, make sure you save it in UTF-8 encoding. This way special non-ASCII characters are correctly processed and displayed.

Apart from the configuration file, a task consists of a file containing the “target items” (the items that the participants have to memorize) and a file containing the “processing items”, i.e. the items used in the distractor task (equations or sentences or …). Here is the target items file of the Spanish operation span task and here is the processing items file.


All settings below are mandatory. There are no default values.


fontsize = 22


fontname = "Helvetica"


File containing the items for the processing task (also called the verification task or distractor task).

processing_items_file = "operations.txt"

Format of the file: One item per line. Each item is a sentence in the case of a reading span task, or an equation in the case of an operation span task. Each line contains first the item, then a delimiting tab, and then the correct answer for the verification task (y or n). Examples:

The queen of England is smoking secretly.	y
( 1 * 2 ) + 1 = 3	y

Make sure that your editor stores tabs as real tabs and does not expand them to spaces.
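
As a rough illustration of this format, the following sketch parses such lines in Python. The function name parse_processing_items is hypothetical and not part of Py-Span-Task; the sketch only shows how the tab separates the item from the expected answer, and how a line whose tab was expanded to spaces can be detected.

```python
def parse_processing_items(lines):
    """Split each non-empty line into (item, expected_answer) at the last tab.

    Hypothetical helper for illustration; not taken from Py-Span-Task.
    """
    items = []
    for number, line in enumerate(lines, start=1):
        line = line.rstrip("\n")
        if not line.strip():
            continue  # skip blank lines
        item, sep, answer = line.rpartition("\t")
        if not sep:
            # No real tab on this line, e.g. because the editor
            # expanded the tab to spaces.
            raise ValueError(f"line {number}: no tab found")
        items.append((item.strip(), answer.strip()))
    return items


example = [
    "The queen of England is smoking secretly.\ty\n",
    "( 1 * 2 ) + 1 = 3\ty\n",
]
parsed = parse_processing_items(example)
```

Running a check like this on a new materials file before a session is one way to catch formatting problems early.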


The file containing the items that the participants have to memorize. In this file, there’s one item per line. Items can be letters, digits or sentences – almost any string is ok. Note that the test is case insensitive. The target items will be displayed as they are stored in this file, but when they are compared with user input the case will be ignored.

target_items_file = "target_words_spanish.txt"


Possible responses and their respective keys: Before the colon is the response as indicated in the file with the processing items (processing_items_file). After the colon you can specify the key on the keyboard that the participants should use to indicate that response.

responses = {


Text shown at the beginning of the test.

welcome_text = """¡Bienvenido!"""


Text shown on page two. Should give an explanation of the first round of practice trials. In this phase only processing items are shown and there is no memory task. The reaction time of the participants is measured to calculate a timeout after which trials are aborted if no response was given. This allows every participant to work at their own pace. People who are really good at checking equations will not have extra time to rehearse memory items.

instructions1 = """En este test, debe indicar …"""


Whether or not minor typos are tolerated when people enter recalled items. If set to True, the entered item is counted as correct if it contains at most one typo of the following types: omission of a character, addition of a character, substitution of a character. NOTE: Don’t use this if your target items are very short, e.g. single digits, because by substitution every digit can be turned into the correct one.

allow_sloppy_spelling = False
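
The rule described above amounts to accepting answers within an edit distance of one. The sketch below shows one way to implement such a check; the function name within_one_typo is hypothetical and the code is not taken from Py-Span-Task, whose actual comparison may differ in detail.

```python
def within_one_typo(entered, target):
    """True if entered matches target exactly or with a single typo.

    A typo is one omitted, one added, or one substituted character.
    Case is ignored. Hypothetical helper for illustration only.
    """
    a, b = entered.lower(), target.lower()
    if a == b:
        return True
    if abs(len(a) - len(b)) > 1:
        return False
    if len(a) > len(b):
        a, b = b, a  # make a the shorter (or equally long) string
    # Find the first position where the two strings differ.
    i = 0
    while i < len(a) and a[i] == b[i]:
        i += 1
    if len(a) == len(b):
        # Same length: allow one substituted character.
        return a[i + 1:] == b[i + 1:]
    # b is one character longer: allow one extra character in b.
    return a[i:] == b[i + 1:]
```

With this check, within_one_typo("3", "7") is True, which illustrates why sloppy spelling should not be combined with single-digit target items.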


Number of processing items for the first practice phase. Don’t set this number too low. The reaction times are measured during these practice trials and the mean + time_out_factor * SD is used as timeout during the actual test.

practice_processing_items = 2


The timeout is the mean reaction time in the practice trials plus this factor times the standard deviation of the reaction times. When the timeout expires, the presentation of the processing item is interrupted and the response is counted as wrong.

time_out_factor = 2.5


Text shown when a participant took too much time to judge a processing item.

time_out_message = """¡Demasiado lento!"""


When first exposed to the task, participants often take much longer than later. Therefore, it’s advisable to measure processing time only after a number of practice trials. This variable controls when the measurements start.

measure_time_after_trial = 3
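
The timeout calculation described above can be sketched as follows. This is an illustration under stated assumptions, not the program's actual code: the function name compute_timeout is hypothetical, the sample standard deviation is used (the text does not say which estimator the program uses), and measure_time_after_trial is read as the number of initial trials that are skipped.

```python
import statistics


def compute_timeout(reaction_times_ms, time_out_factor, measure_time_after_trial):
    """Return mean + time_out_factor * SD of the measured reaction times.

    Assumption: the first measure_time_after_trial trials are ignored.
    Assumption: SD is the sample standard deviation.
    """
    measured = reaction_times_ms[measure_time_after_trial:]
    if len(measured) < 2:
        raise ValueError("need at least two measured trials to estimate the SD")
    mean = statistics.mean(measured)
    sd = statistics.stdev(measured)
    return mean + time_out_factor * sd


# Made-up reaction times in milliseconds; the first three are slow warm-up trials.
rts = [4000, 3500, 3000, 2000, 2200, 1800, 2000]
timeout = compute_timeout(rts, time_out_factor=2.5, measure_time_after_trial=3)
```

In this made-up example the slow warm-up trials are excluded, so the timeout reflects the participant's settled pace rather than their first contact with the task.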


If the order of recalled items does not matter, set this to False. If recalled items should be entered in the order in which they were presented, set this to True. Items that are correctly recalled but in the wrong position will then not count towards the score.

heed_order = False
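
To make the effect of this setting concrete, the following sketch counts correctly recalled items in both modes. The function name count_correct is hypothetical, and the position-by-position comparison used for the ordered case is an assumption about how "wrong position" is determined; the program itself may handle this differently.

```python
def count_correct(presented, recalled, heed_order):
    """Count correctly recalled items, ignoring case.

    Hypothetical helper for illustration only.
    """
    presented = [p.lower() for p in presented]
    recalled = [r.lower() for r in recalled]
    if heed_order:
        # Assumption: an item only counts if it is in the same position.
        return sum(p == r for p, r in zip(presented, recalled))
    # Order is ignored, but each presented item can be credited only once.
    remaining = list(presented)
    correct = 0
    for r in recalled:
        if r in remaining:
            remaining.remove(r)
            correct += 1
    return correct


presented = ["c", "x", "z", "l", "v", "t"]
recalled = ["c", "x", "z", "v", "l", "t"]  # "l" and "v" are swapped
ordered = count_correct(presented, recalled, heed_order=True)
unordered = count_correct(presented, recalled, heed_order=False)
```

Here the two swapped items are credited only when the order is ignored.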


This controls the order in which target items are presented. If pseudo_random_targets is set to True, the list of items is shuffled and its elements are presented one after the other; when the list is exhausted, it is shuffled again and the process starts over. If set to False, items are drawn randomly from the set of all items. The crucial difference is that, with random drawing, an item can appear in two consecutive trials. If there are only a few target items, say the digits from 0 to 9, true random selection is preferable. Otherwise, people can easily guess: if they saw 1, 3, 5, 7, 9 in the last trial, they can infer that they will see 0, 2, 4, 6, 8 in the next. If the number of target items is large, shuffled presentation is better because it avoids repetitions.

pseudo_random_targets = True
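
The two selection schemes can be sketched as follows. The function names are hypothetical and the code is only an illustration of the idea, not the program's implementation. In particular, drawing without repetition within a trial in the random scheme is an assumption.

```python
import itertools
import random


def shuffled_stream(items, rng):
    """Yield items from a shuffled copy; reshuffle whenever it is exhausted."""
    while True:
        deck = list(items)
        rng.shuffle(deck)
        yield from deck


def trials_shuffled(items, sizes, rng):
    """Shuffled presentation: trials take consecutive items from the stream.

    Note: in this simple sketch, a trial that spans a reshuffle may
    contain the same item twice.
    """
    stream = shuffled_stream(items, rng)
    return [list(itertools.islice(stream, n)) for n in sizes]


def trials_random(items, sizes, rng):
    """Random selection: every trial is drawn independently of the others."""
    return [rng.sample(items, n) for n in sizes]


digits = list("0123456789")
rng = random.Random(0)  # fixed seed so that the example is repeatable
shuffled = trials_shuffled(digits, [5, 5], rng)
independent = trials_random(digits, [5, 5], rng)
```

With ten digits and two trials of five items, the shuffled scheme always shows each digit exactly once across both trials, which is the guessing problem described above. The independent draws do not have this property.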


Text shown after the first practice phase. Introduces the combined task with processing items and target items for memorization. This phase gives participants a feeling for the timeout and gives them a chance to ask questions before the main test begins.

instructions2 = """En la segunda parte, …"""


In each trial, a number of processing and target items are shown. This variable specifies which numbers of items are presented, in the example below, either two or four. The order of the numbers doesn’t matter.

practice_levels = (2, 4)


Number of trials in the second practice phase per level. In the present example, there would be 6 practice trials because there are 2 levels (2 and 4) and 3 trials per level.

practice_items_per_level = 3
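
The arithmetic can be checked with a short sketch. The variable names practice_trials, number_of_trials, and total_target_items are introduced here for illustration only, and the order of the list is not meant to reflect the order in which the program presents the trials.

```python
practice_levels = (2, 4)
practice_items_per_level = 3

# One entry per trial; the value is the number of items shown in that trial.
practice_trials = [
    level
    for level in practice_levels
    for _ in range(practice_items_per_level)
]

number_of_trials = len(practice_trials)    # 2 levels * 3 trials per level
total_target_items = sum(practice_trials)  # items shown across all practice trials
```

The same calculation applies to levels and items_per_level in the main test, and gives a quick estimate of how many items a session will present.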


Response given in the second practice phase if a processing item was correctly judged. (No feedback will be given during the main experiment.)

practice_correct_response = """¡Muy bien!"""


Response given in the second practice phase if a processing item was incorrectly judged. (No feedback will be given during the main experiment.)

practice_incorrect_response = """¡Lo siento, incorrecto!"""


Summary presented when the second practice phase is finished.

practice_summary = """De %(total)s operaciones, ha obtenido %(correct)s
respuestas correctas.

Presione la barra espaciadora para continuar."""
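
The placeholders %(total)s and %(correct)s follow Python's printf-style string formatting. The sketch below shows how such a template can be filled in with the % operator and a dictionary. That the program fills the template in exactly this way is an assumption, and the numbers are made up.

```python
practice_summary = """De %(total)s operaciones, ha obtenido %(correct)s
respuestas correctas.

Presione la barra espaciadora para continuar."""

# Made-up counts for illustration.
message = practice_summary % {"total": 18, "correct": 15}
print(message)
```

With this style of formatting, a literal percent sign in the text has to be written as %%.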


This text appears after the familiarization period (phase two) and prepares participants for the main test.

instructions3 = """En este momento ya debe …"""


The levels of memory load that are tested in the main test. The same as practice_levels. Order doesn’t matter.

levels = (2, 3, 4, 5, 6)


Number of trials per level in the main test. Like practice_items_per_level.

items_per_level = 1


Text shown before each trial.

next_message = """Cuando esté preparado, sitúe los dedos índice sobre las teclas marcadas y presione la barra espaciadora con el dedo pulgar para continuar."""


Text shown when the main test is finished.

finished_message = """¡Bien hecho!

Presione la barra espaciadora para continuar."""


Specifies how long the target items will be displayed (in milliseconds).

target_display_time = 1000


Specifies how long the feedback (correct or wrong) will be displayed during the practice trials.

response_display_time = 1000


Text shown at the end of the test.

good_bye_text = """¡Gracias por su colaboración!"""

Running the test

To run the test, open a terminal, change to the directory containing the program and the configuration file of the test, and execute the following command:


The test will prompt for a subject id and conduct some sanity checks on the test materials. For example, it will check whether there are enough target items and whether they are sufficiently different to be uniquely identified when sloppy spelling is tolerated.

Results file

The results will be stored in a file whose name consists of the subject id and the suffix .tsv. The format of the results file is tab-separated-values and can be read by statistical software such as GNU R and spreadsheet applications such as LibreOffice Calc.

A sample output file from the Japanese operation span task can be found here.

Analyzing the results

In GNU R, the following command can be used to read a results file:

d <- read.table("subject1.tsv", sep="\t", header=TRUE)

The resulting data frame contains rows for both the practice trials and the test trials.

However, this repository also includes a function that reads the data and calculates the usual working memory scores (described in Conway et al., 2005).


To process the data of all subjects in an experiment, you can use the following code:

file.names <- list.files("/path/to/results.files/", "subject.*.tsv")
d <- data.frame(t(sapply(file.names, wm.scores)))
d$subject <- file.names

See the manual of list.files for details.


What’s the state of this project?

We wrote the first version of Py-Span-Task in 2010. Since then, researchers in a number of labs have successfully used this software to obtain working memory scores. The software can thus be considered to be relatively reliable and ready for production use.

Why have we developed this software?

Operation and reading span tests play an important role in our research area. Applications for testing working memory span were already available; however, running them required expensive software licenses. Since these memory tests are actually relatively simple, we decided to write our own software. Apart from saving money, another benefit is that we know exactly what the software is doing and can fix it ourselves when something doesn’t work as it is supposed to. Since we publish the code for our test software, other researchers can also check how exactly we obtained our data.

Can anyone use this software?

Yes, everybody is invited to freely use our software. We provide material for different operation and reading span tests in several languages. You can use, modify, and improve this material if you want. Note, however, that we can’t take any responsibility for the correctness of the software or its results (see the license terms for details).

How can this software be cited?

If you use our software in your research, we would appreciate it if you could acknowledge that in your publications.

- von der Malsburg, T. (2015). Py-Span-Task -- A software for testing
  working memory span. doi: 10.5281/zenodo.18238

Below is a BibTeX entry:

  author       = {von der Malsburg, Titus},
  title        = {{Py-Span-Task -- A Software for Testing Working Memory Span}},
  month        = jun,
  year         = 2015,
  doi          = {10.5281/zenodo.18238},
  url          = {}

Can anyone modify the test software and the test batteries?

Yes, feel free to do so. If you modify the test software or the test material, please consider sharing these changes with us so that we may integrate them in our version. If you create new test materials, or if you translate one of our tests into another language, we would also be happy to integrate these materials in our repository. Your contribution will be duly acknowledged on this page.

Does Py-Span-Task support non-western scripts?

Yes, it does, provided that your configuration files and test materials are saved with the appropriate character encoding (UTF-8) and provided that you are using a font that supports these scripts. On OS X and modern Linux distributions, the default encoding scheme is UTF-8, so it should work out of the box. As far as I know, Windows does not use UTF-8 as its default encoding scheme. Therefore you have to make sure to select UTF-8 when you save the material in your text editor. Create a new entry in the issue tracker in case you run into problems.

What if I find an error in the software or the test materials?

If you find bugs in the software or errors in the material, please let us know and we will try to fix them. To report a problem, please use the issue tracker.

Who are the authors of Py-Span-Task?

Py-Span-Task was originally written by Titus von der Malsburg during his dissertation project in Shravan Vasishth’s lab at the University of Potsdam. Paul Metzner and Bruno Nicenboim made various contributions in the form of suggestions for improvements, code, and test batteries. Kentaro Nakatani contributed the Japanese operation span task.