This project generates word clouds from TF-IDF scores computed for .txt files that you upload through a web UI. The computation runs on a Spark cluster using PySpark.
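To illustrate the scoring idea, here is a minimal plain-Python sketch of one common TF-IDF variant (relative term frequency times `log(N / df)`). It is not the project's Spark job: the function names `tokenize` and `tf_idf`, the tokenization rule, and the sample documents are assumptions made for this example only.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens (assumed rule)."""
    return re.findall(r"[a-z]+", text.lower())


def tf_idf(documents):
    """Return one {word: score} dict per document.

    tf  = count of the word in the document / total words in the document
    idf = log(N / df), where df is the number of documents containing the word
    """
    tokenized = [tokenize(doc) for doc in documents]
    n_docs = len(tokenized)

    # Document frequency: in how many documents does each word occur?
    df = Counter()
    for tokens in tokenized:
        df.update(set(tokens))

    scores = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens)
        scores.append({
            word: (count / total) * math.log(n_docs / df[word])
            for word, count in counts.items()
        })
    return scores


# Hypothetical sample input, standing in for two uploaded .txt files.
docs = ["spark makes big data simple", "word clouds make data pretty"]
scores = tf_idf(docs)
```

Words that occur in every document (here `data`) get a score of zero, while words unique to one document score highest. In a word cloud, a higher score would typically map to a larger font size.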
- Anton Bracke
- Jan Mayer
- Julian Becker
- Spark jobs: spark/
- Frontend: frontend/
- Backend: backend/
- Fake HDFS: fake-hdfs/
This project can be opened as a .devcontainer in VS Code.
Continue with Start backend / frontend
make docker-up
Open a bash shell inside the app container with: make docker-bash
Continue with Start backend / frontend
make start # start webserver
make start-frontend # start frontend
- add frontend
- add backend
- add text file upload
- generate wordcloud
- embed wordcloud in frontend
- add database to store document frequencies of words
- add manual trigger for batch job to update document frequencies of words