Brushed up on the basics by going through some Python libraries.
- pandas, NumPy and scikit-learn
- There are 31527 samples in the dataset.
- All the columns have a few NaN values.
- Wind Direction is a categorical feature with 16 categories representing different directions.
- Converted all the numeric data from object data type to float/int data type.
- Filled all the NaN cells with suitable values.
- Converted the given datetime data to a standard format.
- Exported the cleaned data as Clean_Data.csv.
- Visualized the features with various graphs using matplotlib and seaborn.
- Explained the observations from the graphs briefly.
- Plotted the correlation between features using a heatmap.
- Predicted PM2.5 using various models.
- Tried ARIMA and VAR models, but both failed to give satisfactory results.
- So, in the end, we used a Random Forest Regressor.
- Successfully deployed our model.
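
The cleaning steps listed above (object-to-numeric conversion, NaN filling, datetime parsing, CSV export) can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the column names other than PM2.5 and Wind Direction, the sample values, the datetime format, and the fill strategies (linear interpolation for numeric columns, most frequent value for the categorical one) are all assumptions, since the notes only say "suitable values" were used.

```python
import pandas as pd

# Hypothetical raw data: column names, values, and the datetime format
# are illustrative assumptions, not taken from the real dataset.
raw = pd.DataFrame({
    "Datetime": ["01-01-2020 00:00", "01-01-2020 01:00",
                 "01-01-2020 02:00", "01-01-2020 03:00"],
    "PM2.5": ["35.2", None, "41.0", "38.5"],        # numbers stored as strings
    "Wind Direction": ["N", "NNE", None, "N"],      # categorical feature
})


def clean(df):
    """Return a cleaned copy of df (assumed strategy, see lead-in)."""
    out = df.copy()
    # Parse the datetime strings into a proper datetime64 column.
    out["Datetime"] = pd.to_datetime(out["Datetime"], format="%d-%m-%Y %H:%M")
    # Convert the object column to float; unparseable entries become NaN.
    out["PM2.5"] = pd.to_numeric(out["PM2.5"], errors="coerce")
    # Assumed fill for numeric gaps: linear interpolation between neighbours.
    out["PM2.5"] = out["PM2.5"].interpolate(limit_direction="both")
    # Assumed fill for the categorical column: the most frequent category.
    most_common = out["Wind Direction"].mode()[0]
    out["Wind Direction"] = out["Wind Direction"].fillna(most_common)
    return out


cleaned = clean(raw)
# With no path, to_csv returns the CSV text; pass "Clean_Data.csv" as the
# first argument to write the file to disk instead.
csv_text = cleaned.to_csv(index=False)
```

Interpolation suits a time-ordered series like hourly PM2.5 readings, but a column mean or median fill would be an equally plausible reading of "suitable values".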