This project downloads every new article from zeit.de as XML, scrapes it, and writes the data to a CSV table. My inspiration was a talk by David Kriesel at 33c3. The project runs on a Raspberry Pi Zero via a scheduled cron job.
- Install the requirements from `requirements.txt`: `pip install -r requirements.txt`
- OPTIONAL: Edit the `config.ini` file to use PushNotifier. For more info see pushnotifier.de
- Execute the `run.py` file. (`run.py -e` to enable PushNotifier)
- Have fun with your data!!!
author | genre | ressort | sub_ressort | edited | ... |
---|---|---|---|---|---|
Max Mustermann | Kommentar | Sport | Fussball | Yes | ... |
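
As a rough illustration of how one article could end up as a row like the one above, here is a minimal sketch using only the Python standard library. The XML layout in `SAMPLE_XML` and the helper names `article_to_row` and `append_rows` are assumptions made for this example; the real zeit.de XML and the code in `run.py` will look different.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Hypothetical XML layout for illustration only; the real article XML differs.
SAMPLE_XML = """<article>
  <author>Max Mustermann</author>
  <genre>Kommentar</genre>
  <ressort>Sport</ressort>
  <sub_ressort>Fussball</sub_ressort>
  <edited>Yes</edited>
</article>"""

# Column names taken from the example table above.
FIELDS = ["author", "genre", "ressort", "sub_ressort", "edited"]


def article_to_row(xml_text):
    """Extract the table columns from one article document (assumed layout)."""
    root = ET.fromstring(xml_text)
    # findtext returns the default when the element is missing.
    return {field: root.findtext(field, default="") for field in FIELDS}


def append_rows(csv_file, rows, write_header=False):
    """Write rows to an open text file object in CSV format."""
    writer = csv.DictWriter(csv_file, fieldnames=FIELDS)
    if write_header:
        writer.writeheader()
    writer.writerows(rows)


# Write to an in-memory buffer here; a real run would open a .csv file instead.
buffer = io.StringIO()
row = article_to_row(SAMPLE_XML)
append_rows(buffer, [row], write_header=True)
print(buffer.getvalue())
```

Using `csv.DictWriter` keeps the column order fixed by `FIELDS`, so a missing element in one article produces an empty cell instead of shifting the other columns.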
These charts were made with matplotlib. The source code is in `visualization`.
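
For readers who want to build a similar chart from their own CSV, the following is a minimal sketch and not the code from the visualization folder. The function names `count_by_column` and `plot_counts`, the sample rows, and the output path are all made up for this example; only the column names come from the table above.

```python
import csv
import io
from collections import Counter


def count_by_column(csv_file, column):
    """Count how many rows share each value of the given column."""
    reader = csv.DictReader(csv_file)
    return Counter(row[column] for row in reader)


def plot_counts(counts, title, out_path):
    """Save a bar chart of the counts; expects at least one entry."""
    # Imported here so the counting part works without matplotlib installed.
    import matplotlib
    matplotlib.use("Agg")  # render to a file, no display needed (e.g. on a Pi)
    import matplotlib.pyplot as plt

    labels, values = zip(*counts.most_common())
    fig, ax = plt.subplots()
    ax.bar(labels, values)
    ax.set_title(title)
    ax.set_ylabel("articles")
    fig.savefig(out_path)
    plt.close(fig)


# Made-up sample rows; a real run would open the scraped CSV file instead.
sample = io.StringIO(
    "author,genre,ressort,sub_ressort,edited\n"
    "Max Mustermann,Kommentar,Sport,Fussball,Yes\n"
    "Erika Mustermann,Nachricht,Politik,Ausland,No\n"
    "Max Mustermann,Nachricht,Sport,Fussball,No\n"
)
counts = count_by_column(sample, "ressort")
print(counts.most_common())
```

Calling `plot_counts(counts, "Articles per ressort", "ressort.png")` would then write the chart to a PNG file, assuming matplotlib is installed.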
- Visualization of the scraped data
  - on a webpage
  - with chart.js