Sample Project to Learn Data Engineering.
Here is a sample project I built using an open source data set. See the full blog post for more detail: https://www.confessionsofadataguy.com/build-your-data-engineering-skills-with-open-source-data/
You will learn ...
- Docker
- PySpark
- Data Research
- Analysis
- Regex
- Window Functions
- and more.
To build the Docker image, run:
docker build --tag=example-project .
To run the tests, run:
docker-compose up test