HaifaCLG/ILCoWE

ILCoWE

Israeli Learner Corpus of Written English

Learner corpora, datasets that reflect the language of non-native speakers, are instrumental for research on language learning and development, as well as for practical applications, mainly in teaching and education. Such corpora now exist for a plethora of pairs of native and foreign languages, but until recently none of them reflected native Hebrew speakers, and very few reflected native Arabic speakers.

We introduce a recently released corpus of English essays authored by learners in Israel. The corpus consists of two sub-corpora: one written by native Arabic speakers and the other written mainly by native Hebrew speakers. We report on the composition and curation of the datasets; specifically, we processed the data so that both sub-corpora are now uniformly represented, facilitating seamless research and computational processing of the data. We provide statistical information on the corpora and outline a few research projects that have already used them. All the resources related to the corpus are freely available.

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.
