This project performs a sentiment analysis of articles published on Spiegel Online between 01.01.2000 and 05.05.2018.
The code for crawling the articles from Spiegel Online is contained in spiegel_scraper.py.
Everything else (except the SentiWS files and the trained punkt tokenizer model) is contained in notebook.ipynb.
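For readers unfamiliar with lexicon-based sentiment scoring, the following is a minimal sketch of the general idea, not the code from the notebook. It assumes the SentiWS files use a tab-separated layout of `word|POS`, weight, and comma-separated inflected forms; the two sample lines and their weights are made up for illustration, and the function names `load_lexicon` and `score_text` are hypothetical. A simple regular expression stands in for the punkt tokenizer.

```python
import re

# Two made-up lines in the layout assumed for SentiWS files:
# "word|POS<TAB>weight<TAB>inflected,forms". The weights are illustrative only.
SAMPLE_LINES = [
    "gut|ADJX\t0.5\tgute,guter,gutes",
    "schlecht|ADJX\t-0.5\tschlechte,schlechter",
]


def load_lexicon(lines):
    """Map every base form and inflected form (lowercased) to its weight."""
    lexicon = {}
    for line in lines:
        parts = line.strip().split("\t")
        if len(parts) < 2:
            continue  # skip malformed or empty lines
        base = parts[0].split("|")[0]
        weight = float(parts[1])
        forms = parts[2].split(",") if len(parts) > 2 and parts[2] else []
        for form in [base] + forms:
            lexicon[form.lower()] = weight
    return lexicon


def score_text(text, lexicon):
    """Sum the weights of all known words; unknown words count as 0."""
    # Regex word splitting replaces the punkt tokenizer in this sketch.
    tokens = re.findall(r"\w+", text.lower())
    return sum(lexicon.get(token, 0.0) for token in tokens)


LEXICON = load_lexicon(SAMPLE_LINES)
```

Calling `score_text("Das Wetter ist gut", LEXICON)` sums the weights of the matched words, so a text with only positive matches gets a positive score and positive and negative matches cancel each other out.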
Because we use pyspark to parallelize both the crawling and the sentiment analysis, sufficient memory should be provided.
If you want to use the mybinder.org link, you need to decrease the sample size, as mybinder.org only guarantees 1 GB of memory.
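One way to decrease the sample size is to draw a reproducible random subset of the articles before processing them. The sketch below does this in plain Python; the helper `sample_items`, the fraction, the seed and the placeholder article names are all assumptions for illustration, and the notebook may control the sample size differently (for example with pyspark's own `RDD.sample`).

```python
import random


def sample_items(items, fraction, seed=42):
    """Return a reproducible random subset holding about `fraction` of items."""
    if not 0.0 < fraction <= 1.0:
        raise ValueError("fraction must be in (0, 1]")
    items = list(items)
    # Keep at least one item, but never more than are available.
    k = min(len(items), max(1, int(len(items) * fraction)))
    # A dedicated Random instance keeps the result independent of global state.
    return random.Random(seed).sample(items, k)


# Placeholder identifiers standing in for crawled articles.
articles = [f"article-{i}" for i in range(1000)]
small_sample = sample_items(articles, 0.1)
```

Fixing the seed means repeated runs analyze the same subset, which keeps results comparable when memory is limited.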
sschauss/css
Computational Social Science Project SoSe 2018