CzechIT! - A linguistic corpus of Czech learners acquiring Italian

Browse

Browse the texts here.

Aims

Second Language Acquisition (SLA) is a fertile field of research in linguistics, both from applied and empirical standpoints and from theoretical and general perspectives. This corpus supports comparative and contrastive analyses of the structures and patterns that learners exhibit across languages along their acquisitional path.

Data

The project is based on quantitative analyses of the corpus, which comprises several kinds of data in order to capture a wide range of linguistic behaviors and styles:

Email communications
Text messages (SMS, Chat)
Oral production
Grammaticality self-judgements

Methods

Data are marked up and annotated with NLP tools running in a Python environment.
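The project does not name its tools, so the following is only a minimal sketch of what a first annotation pass in Python might look like. It uses the standard library alone; the `Token` record, the `annotate` function, and the sample sentence are hypothetical and are not part of the corpus pipeline.

```python
import re
from dataclasses import dataclass
from typing import List

# Assumption: a simple regex tokenizer stands in for the (unnamed) NLP tools.
# It splits text into word tokens and single punctuation marks.
TOKEN_RE = re.compile(r"(?P<word>\w+)|(?P<punct>[^\w\s])")


@dataclass
class Token:
    index: int      # position of the token in the text
    form: str       # surface form as written by the learner
    start: int      # character offset where the token begins
    end: int        # character offset just past the token
    is_punct: bool  # True for punctuation marks


def annotate(text: str) -> List[Token]:
    """Return one Token record per word or punctuation mark in `text`."""
    tokens = []
    for i, match in enumerate(TOKEN_RE.finditer(text)):
        tokens.append(
            Token(
                index=i,
                form=match.group(),
                start=match.start(),
                end=match.end(),
                is_punct=match.lastgroup == "punct",
            )
        )
    return tokens


if __name__ == "__main__":
    # Hypothetical learner sentence, not taken from the corpus.
    for token in annotate("Ciao, come stai?"):
        print(token)
```

Keeping character offsets alongside each token means that later annotation layers (for example part-of-speech tags or error labels) can point back to the original learner text without altering it.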

Timeline

The project runs from July 2017 and has no fixed end date, so please check the news for updates.

Usage

The corpus itself will be released as soon as possible in an open file format under a CC0 license.
