This repository contains my solutions and assignments for the DeepLearning.AI Data Engineering Professional Certificate, offered on Coursera in collaboration with Amazon Web Services (AWS).
The program, taught by Joe Reis (co-author of Fundamentals of Data Engineering), provides hands-on training in building scalable, reliable, and production-ready data systems.
Organizations today generate massive amounts of data, and data engineers play a critical role in transforming raw data into valuable business insights.
This 4-course professional certificate covers the data engineering lifecycle, including:
- Generating & ingesting data
- Storing data in cloud-based architectures (AWS, Data Lakes, Warehouses)
- Transforming and modeling data for analytics and machine learning
- Serving data to downstream consumers
By completing the program, learners gain the ability to design end-to-end pipelines, orchestrate workflows, and apply best practices in data architecture.
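The four lifecycle stages listed above (ingest, store, transform, serve) can be illustrated with a toy, standard-library-only sketch. It is not taken from the course material: the sample data, table names, and function names are all hypothetical, and an in-memory SQLite database stands in for the cloud storage and warehouse services the program actually uses.

```python
import csv
import io
import sqlite3

# Hypothetical raw data standing in for a source system export.
RAW_CSV = """order_id,customer,amount
1,alice,20.50
2,bob,5.00
3,alice,4.50
"""


def ingest(raw_text):
    """Ingest: parse raw CSV text into a list of dict rows."""
    return list(csv.DictReader(io.StringIO(raw_text)))


def store(conn, rows):
    """Store: load the raw rows into a staging table."""
    conn.execute(
        "CREATE TABLE orders (order_id INTEGER, customer TEXT, amount REAL)"
    )
    # Named placeholders are filled from the keys of each dict row.
    conn.executemany(
        "INSERT INTO orders VALUES (:order_id, :customer, :amount)", rows
    )


def transform(conn):
    """Transform: aggregate order amounts per customer into a new table."""
    conn.execute(
        "CREATE TABLE customer_totals AS "
        "SELECT customer, SUM(amount) AS total FROM orders GROUP BY customer"
    )


def serve(conn):
    """Serve: hand the modeled table to a downstream consumer."""
    return conn.execute(
        "SELECT customer, total FROM customer_totals ORDER BY customer"
    ).fetchall()


def run_pipeline(raw_text):
    """Run all four stages against a throwaway in-memory database."""
    conn = sqlite3.connect(":memory:")
    try:
        store(conn, ingest(raw_text))
        transform(conn)
        return serve(conn)
    finally:
        conn.close()


if __name__ == "__main__":
    print(run_pipeline(RAW_CSV))
```

In a real pipeline each stage would typically be a separate, independently scheduled task (for example, one task per stage in an orchestrator) rather than a chain of function calls in a single process.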
- Data Engineering Lifecycle & Frameworks
- Data Ingestion & Pipelines (Batch + Streaming)
- Data Storage Architectures (Data Lakes, Data Lakehouses, Warehouses)
- SQL, Pandas, Spark for Data Processing
- Apache Airflow for Orchestration
- AWS Cloud Data Services
- ETL / ELT Processes
- Data Modeling & Transformation
- Serving Data for Analytics & ML
This repo includes my completed Jupyter notebooks for the course assignments:
deeplearning-ai-data-engineering/
- C3_W1_Assignment.ipynb # Data Storage & Queries - Week 1
- C3_W3_Assignment.ipynb # Data Storage & Queries - Week 3
- C4_W2_Assignment.ipynb # Data Modeling & Transformation - Week 2
- C4_W4_Assignment_1.ipynb # Advanced Assignment - Part 1
- C4_W4_Assignment_2.ipynb # Advanced Assignment - Part 2
Additional notebooks and project work will be added as I progress through the program.
- Introduction to Data Engineering
  - Foundations, lifecycle, business applications.
- Source Systems, Data Ingestion, and Pipelines
  - Building ingestion pipelines, batch vs streaming, connectivity.
- Data Storage and Queries
  - Data lakes, warehouses, Spark SQL, query optimization.
- Data Modeling, Transformation, and Serving
  - Modeling for analytics/ML, distributed processing, serving systems.
Upon completion, learners earn the DeepLearning.AI Data Engineering Professional Certificate, an industry-recognized credential for data engineers.
- DeepLearning.AI: Data Engineering Certificate
- Instructor: Joe Reis
- Fundamentals of Data Engineering by Joe Reis & Matt Housley