This repository is the landing page for code and resources for the Tool for the Automatic Analysis of Syntactic Sophistication and Complexity (version >= 2.0).
TAASSC draws on at least four perspectives on the analysis of syntactic use:
- Syntactic complexity via clausal subordination (e.g., Ortega, 2003; Wolfe-Quintero et al., 1998)
- Syntactic complexity via phrasal elaboration (e.g., Biber, 2011; Kyle & Crossley, 2018; Lu, 2010)
- Construction grammar/usage-based theories of language development (e.g., Ellis, 2002; Ellis & Ferreira-Junior, 2009; Goldberg, 1995, 2006; Tomasello, 2003)
- Lexicogrammatical variation (Biber, 1988; Biber et al., 2004; Biber et al., 2011; Biber et al., 2014)
The original version of TAASSC (Kyle, 2016; Kyle & Crossley, 2017, 2018) used Stanford CoreNLP (Chen & Manning, 2014) and corpus data drawn from the Corpus of Contemporary American English (COCA; Davies, 2009). Beginning with TAASSC 2.0, spaCy (Explosion AI, 2020) was used for part-of-speech tagging and dependency parsing, primarily because spaCy is written in Python and some end users had difficulty installing the Java dependencies for Stanford CoreNLP. Additionally, because Mark Davies does not want frequency lists from COCA distributed publicly, TAASSC 1.x could not be truly open source. Accordingly, in TAASSC 2.x, corpus data was drawn from sections of the Corpora from the Web project (COW; Schäfer, 2015; Schäfer & Bildhauer, 2012).
TAASSC 1.x: Please see information at https://www.linguisticanalysistools.org/.
TAASSC 2.0.0.58: Version used in Kyle et al. (2021) | version notes | download code
This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.