A list of useful resources to learn Data Engineering from scratch

How To Become a Data Engineer

Useful articles


Algorithms & Data Structures




Distributed Systems




  • Martin Kleppmann, author of Designing Data-Intensive Applications
  • BaseDS by Vaidehi Joshi about Distributed Systems


  • Apache Airflow is a platform to programmatically author, schedule and monitor workflows in Python
  • Apache Spark is a unified analytics engine for large-scale data processing
  • Apache Kafka is a distributed streaming platform
  • Luigi is a Python package that helps you build complex pipelines of batch jobs.
  • is a system for building modern data applications.
  • Prefect includes everything you need to create and run data applications.

Cloud Platforms

Data Engineering Jobs


Newsletters & Digests

  • Data Eng Weekly - Your weekly Data Engineering news
  • SF Data Weekly - A weekly email of useful links for people interested in building data platforms
  • Data Elixir - An email newsletter that keeps you on top of the tools and trends in Data Science