Extracting Lemma Frequencies from Spanish OPUS Files
This project uses Python to create a lemma frequency dictionary from (mainly) spoken-language data, assembled from corpora downloaded from OPUS.
The following corpora are included:
- OPUS OpenSubs: http://opus.lingfil.uu.se/OpenSubtitles2016.php
- OPUS TED: http://opus.lingfil.uu.se/TED2013.php
- OPUS WMT News: http://opus.lingfil.uu.se/WMT-News.php
- OPUS books: http://opus.lingfil.uu.se/Books.php
Thanks to everyone who contributed to this great open-source linguistic project!
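As a rough illustration of what a lemma frequency dictionary looks like, here is a minimal sketch. It is not the project's actual pipeline: the `TOY_LEMMAS` table, the `lemma_frequencies` function, and the sample sentences are all hypothetical, and a real run would replace the toy lookup with a proper Spanish lemmatizer and read lines from the downloaded OPUS files.

```python
import re
from collections import Counter

# Hypothetical toy lemma table, for illustration only.
# A real pipeline would use a proper Spanish lemmatizer instead.
TOY_LEMMAS = {
    "gatos": "gato",
    "comen": "comer",
    "come": "comer",
    "los": "el",
}

# Runs of letters only (Unicode-aware, so accented characters are kept).
TOKEN_RE = re.compile(r"[^\W\d_]+")


def lemma_frequencies(lines, lemmas=TOY_LEMMAS):
    """Count lemma occurrences over an iterable of text lines."""
    counts = Counter()
    for line in lines:
        for token in TOKEN_RE.findall(line.lower()):
            # Fall back to the lowercased token when no lemma is known.
            counts[lemmas.get(token, token)] += 1
    return counts


# Made-up sample sentences standing in for corpus lines.
sample = ["Los gatos comen pescado.", "El gato come pescado."]
freqs = lemma_frequencies(sample)
```

Because `lemma_frequencies` accepts any iterable of strings, an open file handle can be passed in directly, so a large corpus file does not need to be loaded into memory at once.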