volkancirik/sprml13-word-embeddings

sprml13-word-embeddings

  • A better, more general version of this repository, with new word vectors, is available here.

This repository contains word embeddings generated with the method of [1] for the system described in [2], a submission to the SPMRL 2013 Shared Task.

  • For German, use german2.embeddings for better coverage. german.embeddings is the file used in the SPMRL 2013 Shared Task submission.

  • For English, we used a 120M-token WSJ corpus as the language model (LM) corpus and the 1M-token Penn Treebank corpus as the target corpus. These embeddings may have coverage problems. Use [seed number]english50.embeddings or [seed number]english50+f.embeddings. The first was extracted without morphological and orthographic features; the second was extracted with them.
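Since coverage is the main caveat for both languages, it can be worth measuring it on your own data before picking a file. The sketch below is only an illustration: it assumes each line of an embeddings file holds a word followed by its whitespace-separated vector values, which this README does not confirm, and `load_embeddings` and `coverage` are hypothetical helper names, not part of this repository.

```python
import io


def load_embeddings(lines):
    """Parse lines of the ASSUMED form: word v1 v2 ... vN."""
    vectors = {}
    for line in lines:
        parts = line.split()
        if len(parts) < 2:
            continue  # skip blank or malformed lines
        vectors[parts[0]] = [float(x) for x in parts[1:]]
    return vectors


def coverage(vectors, tokens):
    """Return the fraction of tokens that have an embedding."""
    if not tokens:
        return 0.0
    return sum(1 for t in tokens if t in vectors) / len(tokens)


# Tiny in-memory stand-in for a file such as german2.embeddings;
# the words and values here are made up for the example.
sample = io.StringIO("haus 0.1 0.2 0.3\nbaum 0.4 0.5 0.6\n")
vectors = load_embeddings(sample)
print(coverage(vectors, ["haus", "baum", "zug"]))
```

With a real file, the same function could be given an open file handle instead of the in-memory sample. If the files turn out to start with a header line or use a different separator, the parser would need adjusting.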

If you use these word embeddings in your work you may want to cite:

[1] Learning Syntactic Categories Using Paradigmatic Representations of Word Context. In Proceedings of the 2012 Conference on Empirical Methods in Natural Language Processing (EMNLP-CoNLL 2012), Jeju, Korea, July. Association for Computational Linguistics.

[2] The AI-KU System at the SPMRL 2013 Shared Task: Unsupervised Features for Dependency Parsing. In Proceedings of the Fourth Workshop on Statistical Parsing of Morphologically-Rich Languages, pp. 78-85, Seattle, Washington, USA, October. Association for Computational Linguistics.

About

25-dimensional word embeddings for the SPMRL 2013 Shared Task
