A small data science project I recently started for fun, analysing the letters between two famous German poets: J. W. v. Goethe and J. C. F. v. Schiller :).
- scrape.py downloads 14 HTML files from Projekt Gutenberg (www.projekt-gutenberg.org) containing ~1000 letters exchanged between Goethe and Schiller.
- preprocess.py extracts the number, writer and content of every letter from the raw HTML files and writes them to a single CSV file, which is used for further analysis.
- all_letters.csv is this CSV file.
- The Jupyter notebooks show the results of the analyses.
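
As a rough idea of how the CSV can be used, here is a minimal sketch that counts letters per writer. The column names (number, writer, content) and the sample rows are assumptions made up for illustration; check the header of all_letters.csv for the actual names.

```python
import csv
import io
from collections import Counter


def letters_per_writer(csv_file, writer_column="writer"):
    """Count how many rows (letters) each writer contributed.

    `writer_column` is a hypothetical column name; adjust it to the
    real header of all_letters.csv.
    """
    reader = csv.DictReader(csv_file)
    return Counter(row[writer_column] for row in reader)


# Tiny in-memory stand-in for all_letters.csv (made-up rows, not real data).
sample = io.StringIO(
    "number,writer,content\n"
    "1,Schiller,placeholder text\n"
    "2,Goethe,placeholder text\n"
    "3,Schiller,placeholder text\n"
)
counts = letters_per_writer(sample)
print(counts.most_common())
```

To run this on the real file, pass an open file handle instead of the sample, for example `open("all_letters.csv", newline="", encoding="utf-8")` (the encoding is also an assumption).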
A writeup of the analyses and results can be found on my blog: https://mmfischer.de/003_letters/003_letters.html