TatevKaren/PySpark_Tutorial

PySpark Cheat Sheet For Big Data Analytics


For this article, we used the Stroke Prediction Dataset, which is publicly available on Kaggle.

The following topics are covered in this tutorial:
  • Loading Data
  • Viewing Data
  • Selecting Data
  • Counting Data
  • Unique Values
  • Filtering Data
  • Ordering Data
  • Creating New Variables
  • Deleting Data
  • Changing Data Types
  • Conditions
  • Data Aggregation

Detailed explanations and sample outputs can be found in the Medium article "PySpark Cheat Sheet For Big Data Analytics".
