Rainfall Prediction using Machine Learning
Overview
This project builds an intelligent weather prediction system that uses machine learning to forecast:
Whether it will rain or not (Classification)
The chance of rain (%)
The total rainfall (mm)
It collects real-time weather data, processes it, trains multiple ML models, and provides an interactive Streamlit dashboard for predictions and visualizations.
Features
Automated data collection from WeatherAPI
Data cleaning and feature engineering
Multi-model training with GridSearchCV
Separate models for classification and regression
Best model auto-selection and saving as .pkl files
Interactive Streamlit web app for visualization and prediction
EDA and correlation heatmaps for analysis
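The multi-model training and best-model selection listed above could look roughly like the sketch below. It is only an illustration under stated assumptions: the data is synthetic (the real project reads Weather.csv), and the parameter grids, the `candidates` dictionary, and the output file name `best_classifier.pkl` are hypothetical rather than taken from the project.

```python
import os
import pickle
import tempfile

from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the weather features (assumption: not the real dataset).
X, y = make_classification(n_samples=300, n_features=6, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)

# Candidate classifiers with small, illustrative (hypothetical) parameter grids.
candidates = {
    "logistic_regression": (LogisticRegression(max_iter=1000), {"C": [0.1, 1.0, 10.0]}),
    "decision_tree": (DecisionTreeClassifier(random_state=42), {"max_depth": [3, 5, None]}),
    "random_forest": (
        RandomForestClassifier(random_state=42),
        {"n_estimators": [50, 100], "max_depth": [5, None]},
    ),
}

# Run one grid search per candidate, scored by cross-validated F1.
searches = {}
for name, (estimator, grid) in candidates.items():
    search = GridSearchCV(estimator, grid, scoring="f1", cv=5)
    search.fit(X_train, y_train)
    searches[name] = search

# Keep the candidate with the highest cross-validated score.
best_name = max(searches, key=lambda n: searches[n].best_score_)
best_model = searches[best_name].best_estimator_

# Persist the winner as a .pkl file (hypothetical file name, written to a temp directory).
model_path = os.path.join(tempfile.gettempdir(), "best_classifier.pkl")
with open(model_path, "wb") as f:
    pickle.dump(best_model, f)
```

The same loop would apply to the regression targets by swapping in regressors and a regression scorer such as `"r2"`.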
Machine Learning Models Used
Classification: target daily_will_it_rain; algorithms Logistic Regression, Decision Tree, Random Forest; metrics F1-Score, ROC-AUC
Regression (Chance of Rain): target daily_chance_of_rain; algorithms Linear Regression, Decision Tree, Random Forest; metrics R², MAE, RMSE
Regression (Total Precipitation): target totalprecip_mm; algorithms Linear Regression, Decision Tree, Random Forest; metrics R², MAE, RMSE
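As a rough illustration of how the evaluation metrics named above can be computed with scikit-learn, the sketch below uses tiny hand-made arrays. The values and variable names are made up for the example and are not results from this project.

```python
import numpy as np
from sklearn.metrics import (
    f1_score,
    mean_absolute_error,
    mean_squared_error,
    r2_score,
    roc_auc_score,
)

# Classification: true labels, hard predictions, and predicted rain probabilities (toy values).
y_true = np.array([0, 1, 1, 0, 1, 0])
y_pred = np.array([0, 1, 0, 0, 1, 1])
y_prob = np.array([0.2, 0.9, 0.4, 0.1, 0.8, 0.6])

f1 = f1_score(y_true, y_pred)        # harmonic mean of precision and recall
auc = roc_auc_score(y_true, y_prob)  # ranking quality of the predicted probabilities

# Regression: true and predicted rainfall in mm (toy values).
rain_true = np.array([0.0, 2.0, 5.0, 1.0])
rain_pred = np.array([0.5, 1.5, 4.0, 1.0])

r2 = r2_score(rain_true, rain_pred)
mae = mean_absolute_error(rain_true, rain_pred)
rmse = np.sqrt(mean_squared_error(rain_true, rain_pred))  # RMSE is the square root of MSE

print(f"F1={f1:.3f} ROC-AUC={auc:.3f} R2={r2:.3f} MAE={mae:.3f} RMSE={rmse:.3f}")
```

Note that ROC-AUC is computed from probabilities (for example, the output of `predict_proba`), while F1 needs hard class labels.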
Dataset Details
Source: WeatherAPI
Cities Covered: Hyderabad, Mumbai, Chennai, Kolkata, Thiruvananthapuram, Shimla, Manali, Srinagar, Kochi, Gangtok
Records: 100 days of historical data per city
File Generated: Weather.csv
Columns:
date: Date of record
city: City name
maxtemp, mintemp, avgtemp: Temperature metrics (°C)
avghumidity: Average humidity (%)
maxwind: Maximum wind speed (km/h)
totalprecip_mm: Total rainfall (mm)
daily_chance_of_rain: Probability of rain (%)
daily_will_it_rain: Binary rain occurrence (1 = Yes, 0 = No)
Data Preprocessing
Removed irrelevant columns: totalsnow_cm, daily_chance_of_snow, daily_will_it_snow, date, condition
Encoded city using LabelEncoder
Balanced dataset using RandomOverSampler
Split data: 80% training, 20% testing
Prevented leakage by excluding rainfall-related features during classification