The complexity of a learner text is considered a significant indicator of foreign language proficiency. We study syntactic complexity (SC), which is usually understood as the variety and degree of complexity of the syntactic structures present in a text.
The research was carried out on 984 learner texts written in English by Russian speakers and collected in the REALEC corpus (Kuzmenko & Kutuzov, 2014). Each text has a grade assigned by independent experts and counts for seven types of syntactic errors identified by annotators.
This study examines SC evaluation with automated analysis tools: TAASSC (Kyle, 2016), L2SCA (Lu, 2010), and Inspector (Lyashevskaya et al., 2021). It has not yet been established which SC constructions, or which errors in their use, are most frequent among Russian learners of English. We hypothesize that the level of language proficiency correlates both with the number of syntactic errors and with the values of SC parameters. Hence, our study addresses the following research questions: Which SC parameters most accurately reflect the level of English proficiency among Russian speakers? How can the results of SC evaluation be explained? Is there a correlation between the level of language proficiency, the number of syntactic errors, and SC? For the analysis we used rank correlation coefficients.
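The rank correlation analysis described above can be sketched as follows. This is a minimal illustration, not the authors' actual pipeline: the numbers are invented placeholders standing in for one SC parameter and the expert grades, not REALEC data.

```python
# Illustrative sketch of Spearman's rank correlation between one SC
# parameter and essay grades, using only the standard library.
# All values below are hypothetical placeholders.

def rank(values):
    """Assign 1-based ranks, averaging ranks over tied values."""
    s = sorted(values)
    return [s.index(v) + (s.count(v) + 1) / 2 for v in values]

def spearman(x, y):
    """Spearman's rho = Pearson correlation computed on the ranks."""
    rx, ry = rank(x), rank(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sum((a - mx) ** 2 for a in rx) ** 0.5
    sy = sum((b - my) ** 2 for b in ry) ** 0.5
    return cov / (sx * sy)

# Hypothetical per-text values: an SC parameter (e.g., mean length
# of clause) and the expert grade for five texts.
sc_parameter = [4.2, 5.1, 6.3, 5.8, 7.0]
grades = [55, 64, 72, 60, 80]

print(f"Spearman's rho = {spearman(sc_parameter, grades):.3f}")
# prints: Spearman's rho = 0.900
```

In practice a library routine such as `scipy.stats.spearmanr` would be used, which also reports a p-value; the hand-rolled version above only makes the rank-based computation explicit.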
As a result, we identified the SC parameters of learner texts that correlate most strongly with the essay grade or with the number of syntactic errors. We cannot report a strong correlation (the maximum value of Spearman's correlation coefficient is 0.439). The correlation between the SC parameters and the number of syntactic errors was found to be much weaker than the correlation between the same parameters and the grade.