- Airflow as the scheduler
- Airbyte as the ingestion tool
- Spark as the processing tool
- MySQL as the data source
- Postgres as the data warehouse
You need Docker to run the data stack. To start it, run:
```shell
docker compose --profile <profile-name-in-compose-file> up -d
```
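As an illustration of how the `--profile` flag selects which services start, services in the compose file can be tagged with profiles so that only a subset comes up at once. The service names, images, and profile names below are hypothetical, not taken from this stack's actual compose file:

```yaml
services:
  airflow:
    image: apache/airflow:2.9.0
    profiles: ["orchestration"]   # started only when --profile orchestration is passed
  spark:
    image: bitnami/spark:3.5
    profiles: ["processing"]      # started only when --profile processing is passed
```

With this layout, `docker compose --profile orchestration up -d` would start Airflow but not Spark; services with no `profiles` key are always started.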
You can read the full documentation for the data stack on my Medium profile.