Extraction of a German Reddit Corpus
LICENSE
README.md
extract-de.py


german-reddit

Extraction of a German Reddit Corpus

References

Barbaresi, Adrien (2015). Collection, Description, and Visualization of the German Reddit Corpus, in Proceedings of the 2nd Workshop on Natural Language Processing for Computer-Mediated Communication, pp. 7-11, German Society for Computational Linguistics & Language Technology.

Tools released for the NLP4CMC workshop.

Requirements

The complete Reddit corpus is available from archive.org.

Requirements: