Native language cognate effects on second language lexical choice

ellarabi/reddit-l2

reddit-l2

A script for cleanup, simple truecasing, POS tagging, and named-entity recognition (NER) of text collected from Reddit.
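To make the cleanup step concrete, here is a minimal sketch of the kind of normalization Reddit text typically needs before tagging. It is an illustration only: the function name `clean_reddit_text` and the specific rules (dropping quoted lines, unwrapping Markdown links, removing bare URLs) are assumptions, not a description of what the script in this repository does.

```python
import html
import re

# All patterns below are illustrative assumptions about Reddit markup.
QUOTE_RE = re.compile(r"^\s*>.*$", re.MULTILINE)     # lines quoting another comment
MD_LINK_RE = re.compile(r"\[([^\]]+)\]\([^)]*\)")    # [label](url)
URL_RE = re.compile(r"https?://\S+")                 # bare URLs
EMPHASIS_RE = re.compile(r"[*~`]+")                  # bold, strikethrough, code marks
WS_RE = re.compile(r"\s+")


def clean_reddit_text(raw: str) -> str:
    """Return a plain-text version of a raw Reddit comment body."""
    text = html.unescape(raw)            # "&amp;" -> "&", "&gt;" -> ">"
    text = QUOTE_RE.sub(" ", text)       # drop quoted lines
    text = MD_LINK_RE.sub(r"\1", text)   # keep the link label, drop the target
    text = URL_RE.sub(" ", text)         # drop remaining bare URLs
    text = EMPHASIS_RE.sub("", text)     # strip emphasis markers
    return WS_RE.sub(" ", text).strip()  # collapse whitespace
```

Underscores are deliberately left alone so that identifiers and usernames containing them survive the cleanup.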

"Native language cognate effects on second language lexical choice" (Ella Rabinovich, Yulia Tsvetkov and Shuly Wintner)

The code uses COCA n-grams (https://www.ngrams.info/download_coca.asp), the spaCy NLP package (https://spacy.io/), and the polyglot language detector (http://polyglot.readthedocs.io/en/latest/Detection.html).
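One common way to do simple truecasing with frequency counts is to restore, for each token, the casing that occurs most often in a reference corpus. The sketch below shows that idea on a toy token list. It is an assumption about how such counts could be used, not the repository's implementation: the helper names `build_case_model` and `truecase` are hypothetical, and no particular COCA file format is assumed.

```python
from collections import Counter, defaultdict


def build_case_model(reference_tokens):
    """Count each surface form seen for every lowercased word."""
    model = defaultdict(Counter)
    for tok in reference_tokens:
        model[tok.lower()][tok] += 1
    return model


def truecase(tokens, model):
    """Replace each token with its most frequent cased form.

    Tokens that never occur in the reference counts are left unchanged.
    """
    result = []
    for tok in tokens:
        forms = model.get(tok.lower())
        result.append(forms.most_common(1)[0][0] if forms else tok)
    return result


# Toy reference text standing in for real corpus counts (hypothetical data).
reference = "She moved to Toronto because Toronto is where I work".split()
model = build_case_model(reference)
restored = truecase("i moved to toronto yesterday".split(), model)
```

With real counts, sentence-initial capitalization in the reference text biases the statistics, so those positions are usually skipped or down-weighted when building the model.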

The data (extended through Sep 2018) can be downloaded from http://www.cs.toronto.edu/~ella/reddit.l2.post.tar.gz (posts) and http://www.cs.toronto.edu/~ella/reddit.l2.sent.tar.gz (sentences).
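The archives can be fetched and unpacked with the Python standard library alone. The following is a minimal sketch: the helper names `fetch` and `extract_archive` are hypothetical, and nothing is assumed about the directory layout inside the archives, so the function simply returns the member names it finds.

```python
import tarfile
import urllib.request
from pathlib import Path


def fetch(url, target):
    """Download url to the local path target and return that path."""
    urllib.request.urlretrieve(url, target)
    return target


def extract_archive(archive_path, dest_dir):
    """Unpack a .tar.gz archive into dest_dir and return its member names."""
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    with tarfile.open(archive_path, "r:gz") as tar:
        names = tar.getnames()
        if hasattr(tarfile, "data_filter"):
            # Newer Python versions: refuse unsafe paths and special files.
            tar.extractall(dest, filter="data")
        else:
            tar.extractall(dest)
    return names


# Example usage (not run here, since it downloads a large file):
# path = fetch("http://www.cs.toronto.edu/~ella/reddit.l2.sent.tar.gz",
#              "reddit.l2.sent.tar.gz")
# print(extract_archive(path, "data")[:5])
```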
