- Install `openjdk`
- Install `apache-spark`
- Create a venv and activate it
- Install deps
- Run `python -m unittest discover tests` to test (not working right now)
- Create a `.env` file in the project root with the keys `POLLUTIONAPIKEY`, `REDISPASS`, `REDISHOST`, and `REDISPORT`. `POLLUTIONAPIKEY` can be obtained from https://aqicn.org/json-api
- clean up code and tests
- add requirements.txt for deps (requests, redis, pyspark, python-dotenv)
- figure out how to add spark back into data workflow
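
For the `.env` step above, a small sanity check can catch a missing key before anything tries to reach Redis or the pollution API. This is only a sketch, not code from this project: the helper name `missing_env_keys` is made up for illustration, and it assumes the `.env` file has already been loaded into the process environment (for example with `load_dotenv()` from `python-dotenv`).

```python
import os

# The four keys the .env file is expected to define (taken from the setup notes).
REQUIRED_KEYS = ("POLLUTIONAPIKEY", "REDISPASS", "REDISHOST", "REDISPORT")


def missing_env_keys(env=None):
    """Return the required keys that are absent or empty in `env`.

    `env` defaults to os.environ; passing a plain dict makes the check
    easy to test. Hypothetical helper, not part of the existing code.
    """
    env = os.environ if env is None else env
    return [key for key in REQUIRED_KEYS if not env.get(key)]


if __name__ == "__main__":
    # Assumes the .env values were loaded into os.environ beforehand.
    missing = missing_env_keys()
    if missing:
        print("Missing .env keys: " + ", ".join(missing))
    else:
        print("All required .env keys are set.")
```

Passing a dict instead of relying on `os.environ` keeps the check usable from the unit tests without touching the real environment.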