Skip to content

In this repo, I build a LogisticRegression prediction model with Dask and PySpark and initialize an AWS EMR cluster to run the entire pipeline.

Notifications You must be signed in to change notification settings

arjunsawhney1/scalable-ML

 
 

Repository files navigation

Scaleable-Ml

This project predicts the deliquency rate in the Freddie Mac dataset in a distributed way using Dask and PySpark.

About

In this repo, I build a LogisticRegression prediction model with Dask and PySpark and initialize an AWS EMR cluster to run the entire pipeline.

Topics

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages

  • Python 94.5%
  • Shell 5.5%