A list of useful resources to learn Data Engineering from scratch
README.md Added distributed consensus reading list Nov 5, 2019

README.md

How To Become a Data Engineer

Useful articles

Talks

Algorithms & Data Structures

SQL

Programming

Databases

Distributed Systems

Books

Courses

Blogs

  • Martin Kleppmann, author of Designing Data-Intensive Applications
  • BaseDS by Vaidehi Joshi, a series about distributed systems

Tools

  • Apache Airflow is a platform to programmatically author, schedule and monitor workflows in Python
  • Apache Spark is a unified analytics engine for large-scale data processing
  • Apache Kafka is a distributed streaming platform
  • Luigi is a Python package that helps you build complex pipelines of batch jobs
  • Dagster.io is a system for building modern data applications
  • Prefect includes everything you need to create and run data applications
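
Several of the tools above (Airflow, Luigi, Dagster, Prefect) are built around the same idea: a pipeline is a set of tasks plus the dependencies between them, and the tool runs each task only after the tasks it depends on have finished. A minimal sketch of that idea follows, using only the Python standard library (3.9+) rather than any of these tools' own APIs. The task names and the `run_pipeline` helper are hypothetical and chosen for illustration.

```python
from graphlib import TopologicalSorter

# Hypothetical pipeline: each task maps to the set of tasks it depends on.
dependencies = {
    "extract": set(),
    "transform": {"extract"},
    "load": {"transform"},
    "report": {"load"},
}


def run_pipeline(deps, tasks):
    """Call every task after all of its dependencies; return the run order."""
    order = []
    # static_order() yields task names so that dependencies come first.
    for name in TopologicalSorter(deps).static_order():
        tasks[name]()
        order.append(name)
    return order


# Stand-in task bodies: each one only records that it ran.
results = []
tasks = {name: (lambda n=name: results.append(n)) for name in dependencies}

order = run_pipeline(dependencies, tasks)
print(order)
```

Real workflow tools add scheduling, retries, monitoring and distributed execution on top of this ordering step, so treat the sketch as a mental model, not as a substitute for any of them.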

Cloud Platforms

Data Engineering Jobs

Other

Newsletters & Digests

  • Data Eng Weekly - Your weekly Data Engineering news
  • SF Data Weekly - A weekly email of useful links for people interested in building data platforms
  • Data Elixir - An email newsletter that keeps you on top of the tools and trends in Data Science