This article uses the Stroke Prediction Dataset, which is publicly available on Kaggle.
The following topics are covered in this tutorial:
- Loading Data
- Viewing Data
- Selecting Data
- Counting Data
- Unique Values
- Filtering Data
- Ordering Data
- Creating New Variables
- Deleting Data
- Changing Data Types
- Conditions
- Data Aggregation
Detailed explanations and sample outputs can be found in the Medium article PySpark Cheat Sheet For Big Data Analytics.