Extraction of a German Reddit Corpus
Barbaresi, Adrien (2015). Collection, Description, and Visualization of the German Reddit Corpus, in Proceedings of the 2nd Workshop on Natural Language Processing for Computer-Mediated Communication, pp. 7-11, German Society for Computational Linguistics & Language Technology.
Tools released for the NLP4CMC workshop.
The full Reddit corpus is available from archive.org.