ILCoWE

Israeli Learner Corpus of Written English

Learner corpora (datasets that reflect the language of non-native speakers) are instrumental for research on language learning and development, as well as for practical applications, mainly in teaching and education. Such corpora now exist for many native-foreign language pairs, but until recently none of them reflected native Hebrew speakers, and very few reflected native Arabic speakers.

We introduce a recently released corpus of English essays authored by learners in Israel. The corpus consists of two sub-corpora: one written by native Arabic speakers, the other mainly by native Hebrew speakers. We report on the composition and curation of the datasets; specifically, we processed the data so that both sub-corpora are now uniformly represented, facilitating research and computational processing of the data. We provide statistical information on the corpora and outline a few research projects that have already used them. All the resources related to the corpus are freely available.

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.

Folders and files:

texts
README.md
metadata.csv
paper.pdf
prompts.json
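
As a rough illustration of how the resources listed above might be read from a local clone, here is a minimal Python sketch. It is not part of the released corpus tooling. The function name `load_corpus` is hypothetical, and the sketch assumes that `texts` is a folder of UTF-8 encoded essay files, that `metadata.csv` has a header row, and that `prompts.json` is valid JSON. It makes no assumption about column names or about the internal structure of the prompts file.

```python
import csv
import json
from pathlib import Path


def load_corpus(root):
    """Read metadata.csv, prompts.json and texts/ from a local clone (hypothetical helper)."""
    root = Path(root)

    # metadata.csv: each row becomes a dict keyed by whatever header the file has.
    with open(root / "metadata.csv", newline="", encoding="utf-8") as f:
        metadata = list(csv.DictReader(f))

    # prompts.json: parsed as-is; its internal structure is not assumed here.
    with open(root / "prompts.json", encoding="utf-8") as f:
        prompts = json.load(f)

    # texts/: ASSUMPTION: essays are UTF-8 files somewhere under this folder.
    texts_dir = root / "texts"
    texts = {
        p.relative_to(texts_dir).as_posix(): p.read_text(encoding="utf-8")
        for p in sorted(texts_dir.rglob("*"))
        if p.is_file()
    }
    return metadata, prompts, texts
```

Calling `load_corpus("path/to/ILCoWE")` on a clone would return the metadata rows, the parsed prompts, and a mapping from each file's path relative to `texts` to its contents. How essays are linked to metadata rows depends on the actual columns, which should be checked in `metadata.csv` itself.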

HaifaCLG/ILCoWE

Folders and files

Latest commit

History

Repository files navigation

ILCoWE

About

Resources

Stars

Watchers

Forks

Releases

Packages 0

Contributors 2

Packages