Some example projects for Data Engineers to build, end-to-end.

danielbeach/DataEngineeringProjects

Data Engineering Project(s)

The purpose of this repository is to give Data Engineers the chance to complete a Data Engineering project end-to-end. Complete instructions will be given on the desired architecture and the steps to take to complete each project.

The expectation of these Project(s) is that you will do everything yourself, including Bash, Dockerfiles, READMEs, coding, etc. Nothing is going to be done for you; this forces you to stop relying on others and to stop skipping things you might not be familiar with. Growth comes with struggle.

Similar to how work for a project might be handed down in a Data Team, some of the instructions will be specific, some will be ambiguous, and the solution you choose will generally be up to you.

These Project(s) will test a Data Engineer's abilities across multiple technologies and concepts, including but not limited to:

  • Docker
  • Bash
  • Python
  • Airflow
  • Async
  • Data Modeling
  • Postgres
  • Delta Lake
  • PySpark
  • Parquet/CSV
  • BytesIO
  • Lazy Evaluation
  • SQL
  • Analytics
  • Dashboards
  • AWS Cloud

Good Data Engineers are well-rounded: they can work across multiple technologies and concepts, interpret both clear and unclear directions, and develop architecture to support the requirements.

Project 1

In this first Data Engineering project the idea is to set up a Data Platform
that provides the ability to visually build a data pipeline capable of
downloading some raw TSV data, processing it, and depositing the results into
a Lake House, and then displaying a Dashboard of the results.

This project tests your ability to understand high-level requirements and turn them
into technical details without much guidance.

It also tests your ability to work across the entire Data Engineering stack, from `bash`
to `Python` and `Docker`, as well as various tools.
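
As a rough starting point for the "download raw TSV data and process it" step described above, here is a minimal sketch using only the Python standard library. It is not the intended solution, just one way to begin. It reads TSV bytes held in memory with `BytesIO`, yields rows lazily, and counts them by one column. The function names (`iter_tsv_rows`, `count_by`), the column names, and the sample data are all hypothetical. The download itself is left out because the project does not name a source here, and writing to the Lake House is also out of scope for this sketch.

```python
import csv
import io
from collections import Counter
from typing import Dict, Iterable, Iterator


def iter_tsv_rows(raw: bytes) -> Iterator[Dict[str, str]]:
    """Lazily yield one dict per TSV row from in-memory bytes.

    `raw` stands in for the body of a downloaded file (assumption:
    UTF-8 encoded, with a header row).
    """
    buffer = io.BytesIO(raw)
    text = io.TextIOWrapper(buffer, encoding="utf-8", newline="")
    reader = csv.DictReader(text, delimiter="\t")
    for row in reader:
        # Rows are produced one at a time; nothing is materialized up front.
        yield row


def count_by(rows: Iterable[Dict[str, str]], column: str) -> Dict[str, int]:
    """Count how many rows share each value of `column`."""
    counts: Counter = Counter()
    for row in rows:
        counts[row[column]] += 1
    return dict(counts)


# Hypothetical sample standing in for real downloaded TSV data.
SAMPLE = b"id\tcategory\n1\ta\n2\tb\n3\ta\n"

if __name__ == "__main__":
    print(count_by(iter_tsv_rows(SAMPLE), "category"))
```

Because `iter_tsv_rows` is a generator, the same pattern can be pointed at a much larger payload without loading every parsed row into memory at once. For the real pipeline you would likely swap the aggregation for PySpark and the output for Delta Lake, as listed above.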
