RidhimaSharma1/pyspark

pyspark

ML project using PySpark

This project uses PySpark to predict whether a flight will be delayed.

Several factors are taken into consideration, such as miles, departure, carrier, and org. Basic PySpark functions are used for data exploration. The dataset includes multiple categorical variables, which are handled by converting them to quantitative values. The labels are created from the "delay" column: values greater than 20 are categorized as 1, and all other values as 0. The models used are "Decision Tree" and "Logistic Regression".

The models do not perform particularly well, but the project demonstrates the basic usage of PySpark for machine learning.
