This setup follows the original repo: https://github.com/cordon-thiago/airflow-spark. I added PgAdmin to make PostgreSQL easier to manage, and Kafka with Zookeeper to handle streaming data.
forked from cordon-thiago/airflow-spark
PyProjectIE221/kafka-spark-airflow-for-stock-data
About
Docker with Airflow and Spark standalone cluster
Languages
- Python 61.3%
- Scala 12.2%
- Java 8.5%
- Jupyter Notebook 7.2%
- HTML 6.4%
- R 1.8%
- Other 2.6%