Udacity Data Engineering Nanodegree

Postgres, Cassandra, AWS, RedShift, S3, EMR, Spark, Airflow, ETL, ELT, Data Modelling, Database Schema, Data Warehousing, Data Lakes, Data Engineering, Udacity

About The Nanodegree

The data engineering field is expected to continue growing rapidly over the next several years, and there’s huge demand for data engineers across industries. This Data Engineer Nanodegree program comprises content and curriculum to support six (6) projects. The program is estimated to take five (5) months to complete at 10 hours per week.

Each project is reviewed by the Udacity reviewer network, which provides feedback. A student who does not pass is asked to resubmit the project until it passes.

The objective is to learn to design data models, build data warehouses and data lakes, automate data pipelines, and work with massive datasets.

At the end of the program, the student will combine the newly acquired skills by completing a capstone project.

Educational Objectives:

Create user-friendly relational and NoSQL data models
Create scalable and efficient data warehouses
Work efficiently with massive datasets
Build and interact with a cloud-based data lake
Automate and monitor data pipelines
Develop proficiency in Spark, Airflow, and AWS tools

Certificate

TO BE ATTACHED!

Program Details

During this program, the student will complete four courses and five projects. Throughout the projects, they will play the part of a data engineer at a music streaming company. They will work with the same type of data in each project, but with increasing data volume, velocity, and complexity. Below is a course-by-course breakdown.

Associated notebooks for this course can be found here.

Course 1 – Data Modeling

In this course, the student will learn to fit the diverse needs of data consumers, understand the differences between data models, and choose the appropriate data model for a given situation. They will also build fluency in PostgreSQL and Apache Cassandra.

Project 01 - Data Modeling with Postgres

In this project, the student will model user activity data for a music streaming app called Sparkify. They will create a relational database and an ETL pipeline designed to optimize queries for understanding what songs users are listening to. They will also define fact and dimension tables in PostgreSQL and insert data into the newly created tables.
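To make the fact/dimension idea concrete, here is a minimal sketch of a star schema and a tiny ETL step. It is not the project's actual code: it uses Python's built-in `sqlite3` as a stand-in for PostgreSQL so it runs without a server, and the table names (`users`, `songs`, `songplays`), the columns, and the sample `EVENTS` are all hypothetical.

```python
import sqlite3

# Hypothetical star schema: two dimension tables (users, songs)
# and one fact table (songplays) that references them.
SCHEMA = """
CREATE TABLE users (
    user_id    INTEGER PRIMARY KEY,
    first_name TEXT,
    level      TEXT
);
CREATE TABLE songs (
    song_id TEXT PRIMARY KEY,
    title   TEXT NOT NULL
);
CREATE TABLE songplays (
    songplay_id INTEGER PRIMARY KEY AUTOINCREMENT,
    start_time  TEXT NOT NULL,
    user_id     INTEGER NOT NULL REFERENCES users (user_id),
    song_id     TEXT NOT NULL REFERENCES songs (song_id)
);
"""

# Made-up activity events, standing in for the app's log data.
EVENTS = [
    {"ts": "2018-11-01T10:00:00", "user_id": 1, "first_name": "Ana",
     "level": "free", "song_id": "S1", "title": "Song A"},
    {"ts": "2018-11-01T10:05:00", "user_id": 2, "first_name": "Ben",
     "level": "paid", "song_id": "S1", "title": "Song A"},
    {"ts": "2018-11-01T10:09:00", "user_id": 1, "first_name": "Ana",
     "level": "free", "song_id": "S2", "title": "Song B"},
]


def build_database(events):
    """Create the schema in memory and load the events into it."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    for e in events:
        # Dimension rows are inserted once; repeats are ignored.
        conn.execute("INSERT OR IGNORE INTO users VALUES (?, ?, ?)",
                     (e["user_id"], e["first_name"], e["level"]))
        conn.execute("INSERT OR IGNORE INTO songs VALUES (?, ?)",
                     (e["song_id"], e["title"]))
        # Every event becomes one row in the fact table.
        conn.execute(
            "INSERT INTO songplays (start_time, user_id, song_id) "
            "VALUES (?, ?, ?)",
            (e["ts"], e["user_id"], e["song_id"]))
    conn.commit()
    return conn


def top_songs(conn):
    """Return (title, play count) pairs, most played first."""
    return conn.execute(
        "SELECT s.title, COUNT(*) AS plays "
        "FROM songplays sp JOIN songs s ON s.song_id = sp.song_id "
        "GROUP BY s.title ORDER BY plays DESC, s.title"
    ).fetchall()


if __name__ == "__main__":
    print(top_songs(build_database(EVENTS)))
```

The analytical query joins the narrow fact table to a single dimension, which is the access pattern a star schema is meant to keep simple. With PostgreSQL the same statements would need small syntax changes (for example `SERIAL` and `ON CONFLICT DO NOTHING`).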

Link for Project 01 - Link

Project 02 - Data Modeling with Apache Cassandra

In this project, the student will model user activity data for a music streaming app called Sparkify. They will create a database and an ETL pipeline in Apache Cassandra, modeling the data so that they can run specific queries provided by the analytics team at Sparkify.
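Modeling data around specific queries usually means one table per query, with the query's filter as the partition key. The sketch below illustrates that idea only. The query ("all songs played in a given session, in play order"), the table name `songs_by_session`, and its columns are hypothetical, and the Python class is an in-memory stand-in for a Cassandra table, not the Cassandra driver.

```python
from collections import defaultdict

# CQL for the hypothetical table, shown for reference and not executed here.
# session_id is the partition key; item_in_session is the clustering column.
CREATE_SONGS_BY_SESSION = """
CREATE TABLE IF NOT EXISTS songs_by_session (
    session_id      int,
    item_in_session int,
    artist          text,
    song            text,
    PRIMARY KEY ((session_id), item_in_session)
)
"""


class SongsBySession:
    """In-memory stand-in that mimics how the table above is read."""

    def __init__(self):
        # partition key -> {clustering key -> remaining columns}
        self._partitions = defaultdict(dict)

    def insert(self, session_id, item_in_session, artist, song):
        # Writing the same primary key twice overwrites the row (an upsert).
        self._partitions[session_id][item_in_session] = (artist, song)

    def select(self, session_id):
        # A read touches one partition and returns rows ordered
        # by the clustering column.
        rows = self._partitions.get(session_id, {})
        return [(item, *rows[item]) for item in sorted(rows)]


if __name__ == "__main__":
    table = SongsBySession()
    table.insert(338, 2, "Artist B", "Song Y")
    table.insert(338, 1, "Artist A", "Song X")
    print(table.select(338))
```

The point of the design is that the query never filters on anything other than the partition key, so each read stays within a single partition.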

Link for Project 02 - Link

License

Distributed under the MIT License. See LICENSE for more information.

Contact

Djan Magno - djan.magno@gmail.com

Project Link - https://github.com/djanmagno/Udacity-Data-Engineer-Nanodegree

Footer

Leave a star on GitHub, give a clap on Medium, and share this guide if you found it helpful.
