A directory of publicly available data sets from psycholinguistic studies

politzerahles edited this page Nov 8, 2018 · 35 revisions

A directory of publicly available data sets from psycholinguistic studies. At this time it is anything but complete.

Who can contribute: Anyone can contribute their own data sets or add information about other people’s data sets if there is a link, the data was published with the author’s consent, and there is a publication that describes the data.

How to contribute: Click the button labeled “Edit” at the top right of this page. Editing requires a GitHub account. Markdown is used for formatting. Before saving the document, add a one-sentence summary of what you changed in the field labeled “Edit Message.”

Structure of this document: We’ll start with a simple list of data sets and will evolve this into something more organized as the list grows. However, for each entry please provide at least the following information:

  1. Reference for the paper in which the data was first described, ideally with a link
  2. Type of data (e.g., self-paced reading, grammaticality judgments, language, …)
  3. URL of the data set

List of data types (feel free to extend if necessary):

  • eye-tracking
  • self-paced reading
  • grammaticality judgments
  • code (data analysis)
  • code (simulation)
  • speed-accuracy tradeoff
  • stimuli
  • working-memory capacity
  • rapid automatized naming
  • computational model
  • event-related potentials
  • language: Spanish, German, English, Cantonese, Mandarin, ...

Entries are separated by a horizontal rule (---).
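For readers who want to work with the list programmatically, here is a minimal sketch of how the three-field entry format could be parsed and filtered by data type. It assumes each field starts its own line with the labels `Reference:`, `Type of data:`, and `Link:`, and that entries are separated by a line containing only `---`. The function names (`parse_entries`, `missing_fields`, `with_type`) are hypothetical helpers, not part of this wiki or of any existing tool, and the sample text reuses two entries from the list below.

```python
import re

# Sample text in the entry format described above; the two entries are
# copied from this page. The helper names below are illustrative only.
SAMPLE = """\
Reference: Gibson, E. & H. Wu (2013). Processing Chinese relative clauses in context. Language and Cognitive Processes, 28, 125-155.

Type of data: self-paced reading

Link: https://github.com/vasishth/NicenboimVasishthPart2/blob/master/gibsonwu2012data.txt

---

Reference: Haendler, Y., Kliegl, R. & Adani, F. (2015). Discourse accessibility constraints in children's processing of object relative clauses. Frontiers in Psychology 6:860.

Type of data: Looking-while-listening eye-tracking data

Link: https://github.com/yhaendler/Haendler-Kliegl-Adani-2015
"""

REQUIRED = ("Reference", "Type of data", "Link")
FIELD = re.compile(r"^(Reference|Type of data|Link):\s*(.*)$")


def parse_entries(text):
    """Split the text on horizontal rules and collect each entry's labeled fields."""
    entries = []
    for chunk in re.split(r"^[ \t]*---[ \t]*$", text, flags=re.MULTILINE):
        entry = {}
        for line in chunk.splitlines():
            match = FIELD.match(line.strip())
            if match:
                entry[match.group(1)] = match.group(2).strip()
        if entry:  # skip chunks that contain no recognized field
            entries.append(entry)
    return entries


def missing_fields(entry):
    """Return the required field names that are absent or empty in an entry."""
    return [name for name in REQUIRED if not entry.get(name)]


def with_type(entries, keyword):
    """Return entries whose 'Type of data' field mentions the keyword (case-insensitive)."""
    return [e for e in entries if keyword.lower() in e.get("Type of data", "").lower()]


entries = parse_entries(SAMPLE)
for entry in with_type(entries, "eye-tracking"):
    print(entry["Link"])
```

A check such as `missing_fields(entry)` could be run before saving an edit to confirm that a new entry carries all three required fields.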


Reference: Chen, I., Huang, C., & Politzer-Ahles, S. (2018). Determining the types of contrasts: the influences of prosody on pragmatic inferences. Frontiers in Psychology, 9, 2110.

Type of data: Offline binary judgments

Link: https://osf.io/nsgfv/

Reference: Politzer-Ahles, S., & Husband, E. (2018). Eye movement evidence for context-sensitive derivation of scalar inferences. Collabra: Psychology, 4, 3.

Type of data: Eye-tracking while reading, and accuracy.

Link: https://www.collabra.org/article/10.1525/collabra.100/

Reference: Haendler, Y., & Adani, F. (accepted). Testing the effect of an arbitrary subject pronoun on relative clause comprehension: A study with Hebrew-speaking children. Journal of Child Language.

Type of data: Response accuracy data

Link: https://osf.io/km8pe/

Reference: Haendler, Y., Kliegl, R. & Adani, F. (2015). Discourse accessibility constraints in children's processing of object relative clauses. Frontiers in Psychology 6:860.

Type of data: Looking-while-listening eye-tracking data

Link: https://github.com/yhaendler/Haendler-Kliegl-Adani-2015

Reference: Cheung, C., Politzer-Ahles, S., Hwang, H., Chui, R., Leung, M., & Tang, T. (2017). Comprehension of presuppositions in school-age Cantonese-speaking children with and without autism spectrum disorders. Clinical Linguistics and Phonetics, 31, 557-572.

Type of data: Offline judgments (children)

Link: https://osf.io/u2wsz/

Reference: Politzer-Ahles, S., Xiang, M., & Almeida, D. (2017). "Before" and "after": Investigating the relationship between temporal connectives and chronological ordering using event-related potentials. PLoS ONE, 12(4), e017519.

Type of data: ERP (averages and raw continuous data)

Link: https://osf.io/gevfz/

Reference: Nieuwland, M., Politzer-Ahles, S., Heyselaar, E., Segaert, K., Darley, E., Kazanina, N., Von Grebmer Zu Wolfsthurn, S., Bartolozzi, F., Kogan, V., Ito, A., Mézière, D., Barr, D., Rousselet, G., Ferguson, H., Busch-Moreno, S., Fu, X., Tuomainen, J., Kulakova, E., Husband, E., Donaldson, D., Kohút, Z., Rueschemeyer, S., & Huettig, F. (ms.). Limits on prediction in language comprehension: A multi-lab failure to replicate evidence for probabilistic pre-activation of phonology. bioRxiv preprint.

Type of data: single-trial ERP amplitudes (from single time window and channel selection)

Link: https://osf.io/eyzaq/

Reference: Politzer-Ahles, S. (ms.). Self-paced reading experiment on SOME and MOST. Unpublished manuscript.

Type of data: self-paced reading, Likert scale acceptability judgments

Link: https://osf.io/7swt2/

Reference: Gibson, E. & H. Wu (2013). Processing Chinese relative clauses in context. Language and Cognitive Processes, 28, 125-155.

Type of data: self-paced reading

Link: https://github.com/vasishth/NicenboimVasishthPart2/blob/master/gibsonwu2012data.txt

Reference: Politzer-Ahles, S. & R. Fiorentino (2013). The Realization of Scalar Inferences: Context Sensitivity without Processing Cost. PLoS ONE, 8, e63943.

Type of data: self-paced reading, comprehension question accuracy

Link: http://journals.plos.org/plosone/article?id=10.1371/journal.pone.0063943#s5 (File S2)

Reference: Politzer-Ahles, S., K. Schluter, K. Wu, & D. Almeida (2016). Asymmetries in the perception of Mandarin tones: evidence from mismatch negativity. Journal of Experimental Psychology: Human Perception and Performance, 42, 1547-1570.

Type of data: preprocessed ERP averages

Link: http://supp.apa.org/psycarticles/supplemental/xhp0000242/xhp0000242_supp.html ('Supp3.zip')

Reference: White, A. S., D. Reisinger, K. Sakaguchi, T. Vieira, S. Zhang, R. Rudinger, K. Rawlins, & B. Van Durme. (2016). Universal decompositional semantics on universal dependencies. In Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing, 1713–1723, Association for Computational Linguistics.

Type of data: inference likelihood judgments, veridicality judgments, word sense judgments

Link: https://github.com/aaronstevenwhite/UniversalDecompositionalSemantics

Reference: von der Malsburg, T., & Angele, B. (2016). False positives and other statistical errors in standard analyses of eye movements in reading. Journal of Memory and Language, (in press).

Type of data: eye-tracking, reading, code (simulation, data analysis)

Link: https://github.com/tmalsburg/MalsburgAngele2016JML

Reference: Schotter, E. R., Tran, R., & Rayner, K. (2014). Don’t believe what you read (only once): Comprehension is supported by regressions during reading. Psychological Science, 25(6), 1218–1226. http://dx.doi.org/10.1177/0956797614531148

Type of data: eye-tracking, reading, stimuli, English, garden-pathing

Link: http://library.ucsd.edu/dc/object/bb4916286v

Reference: White, A. S. & K. Rawlins. (2016). A computational model of S-selection. In Semantics and Linguistic Theory 26, 641-663. Ithaca, NY: CLC Publications.

Type of data: acceptability judgments

Link: https://github.com/aaronstevenwhite/MegaAttitudeProject

Reference: Schmalz, X., Robidoux, S., Castles, A., Coltheart, M., Marinus, E. (preprint): German and English bodies: No evidence for cross-linguistic differences in preferred grain size.

Type of data: lexical decision times, English, German

Link: https://osf.io/myfk3/

Reference: Nicenboim, B., Vasishth, S., Gattei, C., Sigman, M., & Kliegl, R. (2015). Working memory differences in long-distance dependency resolution. Frontiers in Psychology. http://dx.doi.org/10.3389/fpsyg.2015.00312

Type of data: eye-tracking, self-paced reading, working memory capacity, Spanish

Link: https://github.com/bnicenboim/papers/tree/master/NicenboimEtAl2015.%20Working%20memory%20differences%20in%20long-distance%20dependency%20resolution

Reference: Nicenboim, B., Logačev, P., Gattei, C., & Vasishth, S. (2016). When high-capacity readers slow down and low-capacity readers speed up: Working memory and locality effects. Frontiers in Psychology, 7. http://dx.doi.org/10.3389/fpsyg.2016.00280

Type of data: self-paced reading, working memory capacity, Spanish, German

Link: https://github.com/bnicenboim/papers/tree/master/NicenboimEtAl2016.%20When%20High-Capacity%20Readers%20Slow%20Down%20and%20Low-Capacity%20Readers%20Speed%20Up:%20Working%20Memory%20and%20Locality%20Effects

Reference: Lau, J. H., Clark, A., & Lappin, S. (2016). Grammaticality, Acceptability, and Probability: A Probabilistic View of Linguistic Knowledge. Cognitive Science. http://dx.doi.org/10.1111/cogs.12414

Type of data: Acceptability judgments, English

Link: http://www.dcs.kcl.ac.uk/staff/lappin/smog/?page=research

Reference: Enochson, K., & Culbertson, J. (2015). Collecting Psycholinguistic Response Time Data Using Amazon Mechanical Turk. PLoS ONE, 10(3), e0116946. http://doi.org/10.1371/journal.pone.0116946

Type of data: Self-paced reading, English, Mechanical Turk

Link: http://mars.gmu.edu/handle/1920/9116

Reference: Logačev, P., & Vasishth, S. (2016). Understanding underspecification: A comparison of two computational implementations. Quarterly Journal of Experimental Psychology, 69(5). http://www.tandfonline.com/doi/full/10.1080/17470218.2015.1134602

Type of data: computational model, code (data analysis)

Link: https://github.com/plogacev/manuscript_LogacevVasishth_TQJEP_Underspecification

Reference: Nicenboim, B., & Vasishth, S. (2016). Statistical methods for linguistic research: Foundational Ideas - Part II. Language and Linguistics Compass. In Press.

Type of data: code (data analysis)

Link: https://github.com/vasishth/NicenboimVasishthPart2

Reference: Patil, U., Hanne, S., Burchert, F., De Bleser, R., & Vasishth, S. (2016). A computational evaluation of sentence comprehension deficits in aphasia. Cognitive Science, 40. http://onlinelibrary.wiley.com/doi/10.1111/cogs.12250/abstract;jsessionid=4D66BEAD359E8F97604C5BB0E7A9BC18.f04t04

Type of data: computational model, code (data analysis), German

Link: http://cogsci.uni-osnabrueck.de/~upatil/src/Patil-EtAl-2014-AphasiaModels.zip

Reference: Patil, U., Vasishth, S., & Lewis, R. L. (2016). Retrieval interference in syntactic processing: The case of reflexive binding in English. Frontiers in Psychology. http://journal.frontiersin.org/article/10.3389/fpsyg.2016.00329/full

Type of data: computational model, code (data analysis), English

Link: http://cogsci.uni-osnabrueck.de/~upatil/src/Reflexives-Data-Analysis.zip

Reference: Safavi, M. S., Husain, S., & Vasishth, S. (2016). Dependency resolution difficulty increases with distance in Persian separable complex predicates: Implications for expectation and memory-based accounts. Frontiers in Psychology, 7. http://journal.frontiersin.org/article/10.3389/fpsyg.2016.00403/full

Type of data: self-paced reading, eye-tracking, code (data analysis), Persian

Link: http://www.ling.uni-potsdam.de/~vasishth/code/SafaviEtAl2016DataCode.zip

Reference: Vasishth, S., & Nicenboim, B. (2016). Statistical Methods for Linguistic Research: Foundational Ideas – Part I. Language and Linguistics Compass, 10(8). http://onlinelibrary.wiley.com/doi/10.1111/lnc3.12201/abstract

Type of data: code (simulation, data analysis)

Link: https://github.com/vasishth/VasishthNicenboimPart1

Reference: Frank, S. L., Trompenaars, T., & Vasishth, S. (2015). Cross-linguistic differences in processing double-embedded relative clauses: Working-memory constraints or language statistics? Cognitive Science.

Type of data: self-paced reading, code (data analysis), Dutch, English, German

Link: https://github.com/vasishth/StanJAGSexamples/tree/master/FrankEtAlCogSci2015

Reference: Logačev, P., & Vasishth, S. (2015). A Multiple-Channel Model of Task-Dependent Ambiguity Resolution in Sentence Comprehension. Cognitive Science.

Type of data: self-paced reading, code (data analysis), German

Link: https://github.com/plogacev/manuscript_LogacevVasishth_CogSci_SMCM

Reference: Hofmeister, P., & Vasishth, S. (2014). Distinctiveness and encoding effects in online sentence comprehension.

Type of data: self-paced reading, code (data analysis), English

Link: http://www.ling.uni-potsdam.de/~vasishth/code/HofmeisterVasishth2014.zip

Reference: Husain, S., Vasishth, S., & Srinivasan, N. (2014). Strong Expectations Cancel Locality Effects: Evidence from Hindi. PLoS ONE, 9(7).

Type of data: self-paced reading, code (data analysis), Hindi

Link: http://www.ling.uni-potsdam.de/~vasishth/code/HusainEtAl2014PLoSONE.zip

Reference: Vasishth, S., Chen, Z., Li, Q., & Guo, G. (2013). Processing Chinese Relative Clauses: Evidence for the Subject-Relative Advantage. PLoS ONE, 8(10).

Type of data: self-paced reading, code (data analysis), Chinese

Link: http://www.ling.uni-potsdam.de/~vasishth/code/PLoSOneVasishthetaldata.zip

Reference: Keshtiari, N., & Vasishth, S. (2013). Reactivation of antecedents by overt vs null pronouns: Evidence from Persian.

Type of data: self-paced reading, Persian

Link: http://www.ling.uni-potsdam.de/~vasishth/code/KeshtiariVasishthJLM2013.zip

Reference: McCurdy, K., Kentner, G., & Vasishth, S. (2013). Implicit prosody and contextual bias in silent reading. Journal of Eye Movement Research, 6(2).

Type of data: eye-tracking, code (data analysis), German

Link: http://www.ling.uni-potsdam.de/~vasishth/code/McCurdyetalJEMRdata.zip

Reference: Vasishth, S., Shaher, R., & Srinivasan, N. (2012). The role of clefting, word order and given-new ordering in sentence comprehension: Evidence from Hindi. Journal of South Asian Linguistics.

Type of data: code (data analysis), Hindi

Link: http://www.ling.uni-potsdam.de/~vasishth/code/VasishthShaherSrinivasan2012JSAL.zip

Reference: Bartek, B., Lewis, R. L., Vasishth, S., & Smith, M. (2011). In Search of On-line Locality Effects in Sentence Comprehension. Journal of Experimental Psychology: Learning, Memory and Cognition, 37(5).

Type of data: self-paced reading, eye-tracking, code (data analysis)

Link: http://www.ling.uni-potsdam.de/~vasishth/code/BarteketalJEP2011data.zip

Reference: Chen, Z., Jäger, L., & Vasishth, S. (2011). How structure sensitive is the parser? Evidence from Mandarin Chinese. Empirical approaches to linguistic theory: Studies of meaning and structure, Studies in Generative Grammar, Mouton de Gruyter.

Type of data: self-paced reading, code (data analysis), Chinese

Link: http://www.ling.uni-potsdam.de/~vasishth/code/Chenetal2010LingEvidence.zip

Reference: Vasishth, S., Suckow, K., Lewis, R. L., & Kern, S. (2011). Short-term forgetting in sentence comprehension: Crosslinguistic evidence from head-final structures. Language and Cognitive Processes, 25(4).

Type of data: self-paced reading, eye-tracking, code (data analysis), German, English

Link: http://www.ling.uni-potsdam.de/~vasishth/code/VSLK_LCP.zip

Reference: Beck, S., & Vasishth, S. (2009). Multiple Focus. Journal of Semantics.

Type of data: code (data analysis), English

Link: http://www.ling.uni-potsdam.de/~vasishth/code/BeckVasishthJoS2009.zip

Reference: Boston, M. F., Hale, J. T., Patil, U., Kliegl, R., & Vasishth, S. (2008). Parsing costs as predictors of reading difficulty: An evaluation using the Potsdam Sentence Corpus. Journal of Eye Movement Research, 2(1).

Type of data: eye-tracking, code (data analysis), German

Link: http://www.ling.uni-potsdam.de/~vasishth/code/JEMRSurprisal.zip

Reference: Vasishth, S., Bruessow, S., Lewis, R. L., & Drenhaus, H. (2008). Processing Polarity: How the ungrammatical intrudes on the grammatical. Cognitive Science, 32(4).

Type of data: eye-tracking, computational model, German

Link: http://www.ling.uni-potsdam.de/~vasishth/code/VasishthBruessowetal2008CogSci.zip

Reference: Vasishth, S., & Lewis, R. L. (2006). Argument-head distance and processing complexity: Explaining both locality and antilocality effects. Language, 82(4).

Type of data: self-paced reading, Hindi

Link: http://www.ling.uni-potsdam.de/~vasishth/code/VasishthLewis2006.zip

Reference: Lewis, R. L., & Vasishth, S. (2005). An activation-based model of sentence processing as skilled memory retrieval. Cognitive Science, 29.

Type of data: computational model, reading times

Link: http://www.ling.uni-potsdam.de/~vasishth/code/LewisVasishthModel05.tar.gz
