Capstone Project for the IBM Data Engineering Professional Certification.

c85/ibm-de-capstone

Introduction to Data Engineering Capstone Project

Welcome to the Data Engineering Capstone Project.

Congratulations on making it this far!

At this point, you have completed all twelve courses in the Data Engineering Professional Certificate program, from Introduction to Data Engineering through Getting Started with Data Warehousing and BI Analytics.

You have learned an incredible set of skills that will be useful in your career as a Data Engineer.

In this Capstone project, you will:

  • Collect and understand data from multiple sources.

  • Design a database and data warehouse.

  • Analyze the data and create a dashboard.

  • Extract data from OLTP, NoSQL and MongoDB databases, transform it, and load it into the data warehouse.

  • Create an ETL pipeline and deploy machine learning models.

This Capstone provides you with practical hands-on experience to demonstrate all of the Data Engineering skills you have picked up in this Professional Certificate program.  

As part of the capstone project, you will assume the role of an Associate Data Engineer who has recently joined the organization. You will be presented with a business challenge that requires building a data platform for retailer data analytics.

In Module 1 you will design the OLTP database for an E-Commerce website, populate the OLTP database with the provided data, and automate the export of the daily incremental data into the data warehouse.
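The daily incremental export in Module 1 can be sketched roughly as follows. This is a minimal illustration, not the course's solution: an in-memory SQLite database stands in for the OLTP database, and the `sales_data` table, its columns, and the `export_incremental` helper are all hypothetical names chosen for the example. The idea is to remember the highest key already exported and to export only rows above it.

```python
import csv
import io
import sqlite3


def export_incremental(conn, last_exported_id, out):
    """Write rows with sales_id > last_exported_id to `out` as CSV.

    Returns the new high-water mark (the largest sales_id exported so far).
    """
    rows = conn.execute(
        "SELECT sales_id, product_id, customer_id, price, quantity, timestamp "
        "FROM sales_data WHERE sales_id > ? ORDER BY sales_id",
        (last_exported_id,),
    ).fetchall()
    csv.writer(out).writerows(rows)
    # If nothing new was found, keep the previous mark unchanged.
    return rows[-1][0] if rows else last_exported_id


# In-memory SQLite stands in for the OLTP database (assumption for this demo).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sales_data (sales_id INTEGER PRIMARY KEY, product_id INTEGER, "
    "customer_id INTEGER, price REAL, quantity INTEGER, timestamp TEXT)"
)
conn.executemany(
    "INSERT INTO sales_data VALUES (?, ?, ?, ?, ?, ?)",
    [
        (1, 101, 1, 9.5, 2, "2020-01-01 10:00:00"),
        (2, 102, 2, 20.0, 1, "2020-01-01 11:30:00"),
        (3, 101, 3, 9.5, 4, "2020-01-02 09:15:00"),
    ],
)

first_batch = io.StringIO()
mark = export_incremental(conn, 0, first_batch)      # exports all three rows
second_batch = io.StringIO()
mark = export_incremental(conn, mark, second_batch)  # nothing new to export
```

In a real deployment the high-water mark would be persisted between runs and the function triggered by a scheduler; both are left out of this sketch.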

In Module 2 you will set up a NoSQL database to store the catalog data for an E-Commerce website, load the E-Commerce catalog data into the NoSQL database, and query the E-Commerce catalog data in the NoSQL database.

In Module 3 you will design the schema for a data warehouse based on the schema of the OLTP and NoSQL databases. You’ll then create the schema and load the data into fact and dimension tables, automate the daily incremental data insertion into the data warehouse, and create cubes and rollups to make the reporting easier.
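To make the idea of a rollup concrete, here is a small pure-Python sketch of what a SQL `GROUP BY ROLLUP(year, category)` computes. It is only an illustration under assumed column names (`year`, `category`, `amount`); in the project itself the aggregation would run in the data warehouse, not in Python.

```python
from collections import defaultdict


def rollup(rows):
    """Emulate GROUP BY ROLLUP(year, category) over (year, category, amount) rows.

    None in a key position marks a subtotal level, much as NULL does in SQL.
    """
    totals = defaultdict(float)
    for year, category, amount in rows:
        totals[(year, category)] += amount  # detail level
        totals[(year, None)] += amount      # subtotal per year
        totals[(None, None)] += amount      # grand total
    return dict(totals)


# Made-up sample data for the demonstration.
sales = [
    (2020, "Electronics", 100.0),
    (2020, "Electronics", 50.0),
    (2020, "Books", 25.0),
    (2021, "Books", 10.0),
]
result = rollup(sales)
```

A cube over the same two columns would additionally contain the per-category subtotals, that is, keys of the form `(None, category)`.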

In Module 4 you will create a Cognos data source that points to a data warehouse table, create a bar chart of quarterly sales of cell phones, create a pie chart of sales of electronic goods by category, and create a line chart of total sales per month for the year 2020.

In Module 5 you will extract data from OLTP, NoSQL, and MongoDB databases into CSV format. You will then transform the OLTP data to suit the data warehouse schema, and then load the transformed data into the data warehouse. Finally, you will verify that the data is loaded properly.
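The transform step in Module 5 might look something like the sketch below. The input columns (`product_id`, `price`, `quantity`, `timestamp`) and the output fact columns (`date_id`, `year`, `quarter`, `month`, `total_amount`) are assumptions made for illustration; the actual warehouse schema is whatever you design in Module 3.

```python
import csv
import io
from datetime import datetime


def transform_row(row):
    """Map one extracted OLTP sales row (a dict) onto hypothetical fact columns."""
    ts = datetime.strptime(row["timestamp"], "%Y-%m-%d %H:%M:%S")
    return {
        "product_id": int(row["product_id"]),
        "date_id": int(ts.strftime("%Y%m%d")),  # e.g. 20200517
        "year": ts.year,
        "quarter": (ts.month - 1) // 3 + 1,     # months 1-3 -> 1, 4-6 -> 2, ...
        "month": ts.month,
        "total_amount": round(float(row["price"]) * int(row["quantity"]), 2),
    }


# A small in-memory CSV stands in for the extracted OLTP file (made-up data).
oltp_csv = io.StringIO(
    "product_id,price,quantity,timestamp\n"
    "101,9.50,2,2020-05-17 10:00:00\n"
    "102,20.00,1,2020-11-03 11:30:00\n"
)
fact_rows = [transform_row(r) for r in csv.DictReader(oltp_csv)]
```

A simple way to verify the load afterwards is to compare the number of transformed rows with the row count reported by the warehouse table.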

In the sixth and final module you will use your skills in Big Data Analytics to create a Spark connection to the data warehouse, and then deploy a machine learning model on SparkML for making sales projections.

Each module includes a short quiz to test your knowledge. You will be evaluated based on the quizzes in each module.

For the final report, you will finish the assignments and upload screenshots of all tasks as proof of completion. Your peers will then review and grade your final submission of the project. You will also perform a peer review for a fellow student.

To begin, take a few minutes to explore the course site. Review the material that will be covered each week, and preview the assignments you’ll need to complete to pass the course. Explore the forums where you can discuss the course material with fellow learners and the course team. If you have any questions about course content, post them in these forums to get help from others in the course community. For technical problems with the platform, visit the help or support center.

When you successfully complete the Capstone Project, you will earn your Data Engineering Professional Certificate. We are excited to have you join us and hope you enjoy the course. Good luck and let’s get started!
