Run a Decision Tree-based Classification process in PySpark and evaluate the model
This exercise is part of Coursera's Big Data Specialisation in Machine Learning, offered by the University of California, San Diego.
The data, available as a CSV file, contains San Diego weather measurements collected over a period of three years. More information regarding the data capture can be found here. The idea is to classify each day as low-humidity or high-humidity based on weather data recorded in the morning. The model's predictions are then compared against the afternoon sensor data.
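The workflow described above could be sketched roughly as follows. This is a minimal illustration, not the course's reference solution: the column names, the humidity cutoff of 25.0, the 80/20 split, and the tree depth are all assumptions chosen for the example and should be adjusted to match the actual CSV header and exercise settings. The PySpark imports sit inside the functions so that the plain-Python labelling helper can be used without a Spark installation.

```python
# Hypothetical column names and cutoff; adjust to the actual CSV header.
FEATURE_COLS = ["air_pressure_9am", "air_temp_9am", "relative_humidity_9am"]
TARGET_COL = "relative_humidity_3pm"
HUMIDITY_THRESHOLD = 25.0  # assumed boundary between low and high humidity


def label_for(humidity, threshold=HUMIDITY_THRESHOLD):
    """Plain-Python version of the labelling rule applied by Binarizer:
    values strictly above the threshold become 1.0, everything else 0.0."""
    return 1.0 if humidity > threshold else 0.0


def build_pipeline(feature_cols=FEATURE_COLS, target_col=TARGET_COL,
                   threshold=HUMIDITY_THRESHOLD):
    """Assemble the stages: binarize target -> build feature vector -> tree."""
    from pyspark.ml import Pipeline
    from pyspark.ml.classification import DecisionTreeClassifier
    from pyspark.ml.feature import Binarizer, VectorAssembler

    # Turn the afternoon humidity reading into a 0.0/1.0 label column.
    binarizer = Binarizer(threshold=threshold, inputCol=target_col,
                          outputCol="label")
    # Pack the morning measurements into a single feature vector column.
    assembler = VectorAssembler(inputCols=feature_cols, outputCol="features")
    tree = DecisionTreeClassifier(labelCol="label", featuresCol="features",
                                  maxDepth=5)
    return Pipeline(stages=[binarizer, assembler, tree])


def train_and_evaluate(csv_path, seed=42):
    """Train on 80% of the rows and return accuracy on the remaining 20%."""
    from pyspark.ml.evaluation import MulticlassClassificationEvaluator
    from pyspark.sql import SparkSession
    from pyspark.sql.functions import col

    spark = SparkSession.builder.appName("humidity-tree").getOrCreate()
    df = spark.read.csv(csv_path, header=True, inferSchema=True)

    # Drop rows with missing sensor values; Binarizer needs a double column.
    df = df.dropna(subset=FEATURE_COLS + [TARGET_COL])
    df = df.withColumn(TARGET_COL, col(TARGET_COL).cast("double"))

    train, test = df.randomSplit([0.8, 0.2], seed=seed)
    model = build_pipeline().fit(train)
    predictions = model.transform(test)

    evaluator = MulticlassClassificationEvaluator(
        labelCol="label", predictionCol="prediction", metricName="accuracy")
    return evaluator.evaluate(predictions)
```

Calling train_and_evaluate with the path to the weather CSV would return a single accuracy figure for the held-out rows. Putting the binarizer inside the pipeline keeps the labelling rule and the model together, so the same transformation is applied to both the training and the test split.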