
RishiSankineni/Machine-Learning-Pipeline-LR-Pyspark


MLPipeline-Lab1-EdX


Power Plant Machine Learning Pipeline Application - edX - Lab 1 - Big Data Analysis with Apache Spark

This notebook is an end-to-end exercise: it performs Extract-Transform-Load (ETL) and Exploratory Data Analysis (EDA) on a real-world dataset, then applies several machine learning algorithms to solve a supervised regression problem on that dataset.

**This notebook covers:**

  • Part 1: Business Understanding
  • Part 2: Load Your Data
  • Part 3: Explore Your Data
  • Part 4: Visualize Your Data
  • Part 5: Data Preparation
  • Part 6: Data Modeling
  • Part 7: Tuning and Evaluation

Our goal is to accurately predict power output given a set of environmental readings from various sensors in a natural gas-fired power generation plant.
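For readers new to supervised regression, the prepare, model, and evaluate steps can be sketched without any libraries. This is not the notebook's code, which uses Spark. It is a plain-Python illustration that fits a one-feature least-squares line and scores it with root-mean-squared error (RMSE) on held-out rows. The single temperature feature, the readings, and the names `fit_line` and `rmse` are all hypothetical, and the numbers are deliberately made up to lie on a perfect line.

```python
import math


def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x (one feature)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of squared deviations of x, and the x-y cross deviations.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def rmse(actual, predicted):
    """Root-mean-squared error between two equal-length sequences."""
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))


# Made-up (temperature, power output) pairs, for illustration only.
rows = [
    (10.0, 480.0), (15.0, 470.0), (20.0, 460.0),
    (25.0, 450.0), (30.0, 440.0), (35.0, 430.0),
]

# Data preparation: hold out the last two rows for evaluation.
train, test = rows[:4], rows[4:]

# Data modeling: fit on the training rows only.
intercept, slope = fit_line([x for x, _ in train], [y for _, y in train])

# Evaluation: predict the held-out rows and measure the error.
predictions = [intercept + slope * x for x, _ in test]
test_rmse = rmse([y for _, y in test], predictions)

print(f"power = {intercept:.1f} + ({slope:.1f}) * temperature")
print(f"held-out RMSE = {test_rmse:.3f}")
```

Real sensor data is noisy and has several features, so the held-out error would not be zero as it is on this toy line. The flow is the same, though: split, fit on the training rows, score on rows the model has not seen.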
