Skip to content

codinghardwork/weather_prediction

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

6 Commits
 
 
 
 

Repository files navigation

  1. Time Series Data Preparation
  • Reads weather data with dates as the index, recognizing it's temporal (time-based).
  • Treats missing data carefully by analyzing null patterns and forward-filling when appropriate.
  • Filters out unreliable columns (with too many missing values) to ensure model integrity.

  1. Index Handling & Time Features
  • Converts the index to datetime to unlock powerful time-based functionality.
  • Extracts features like year, month, and day-of-year — crucial for time series seasonality analysis.

  1. Feature Engineering for Forecasting
  • Constructs a forecasting target by shifting the temperature data forward — predicting tomorrow using today’s data.
  • Adds rolling averages to capture short-term trends and momentum in temperature and precipitation.
  • Introduces percent change in rolling features — highlighting trend direction and volatility.
  • Computes seasonal averages (monthly & day-of-year) using expanding window statistics to reflect cumulative climate behavior over time.

  1. Machine Learning on Time Series
  • Implements Ridge Regression, a linear model that adds regularization to prevent overfitting.
  • Excludes non-predictive or leakage-prone columns (target, station, name) from the input features.

  1. Backtesting: Simulated Real-World Forecasting
  • Defines a custom backtesting loop: simulates how a model would perform in production by iteratively training on historical data and testing on the future.
  • Ensures temporal integrity (i.e., never trains on future data).
  • Stores prediction results and evaluates both accuracy and magnitude of error.

  1. Model Evaluation and Interpretation
  • Uses mean absolute error (MAE) to measure forecast accuracy — a common metric in regression problems.
  • Explores distribution of prediction error over time to identify model behavior and reliability.
  • Sorts and filters high-error days to better understand when the model struggles (e.g., extreme weather events).

  1. Visualization for Insight
  • Plots weather metrics (e.g., snow depth) and prediction error frequencies — essential for communicating patterns, anomalies, or performance degradation visually.

  1. Key Modeling Principles Reinforced
  • Importance of data quality filtering before modeling.
  • Use of time-aware features like lag, rolling stats, and expanding averages for better temporal modeling.
  • Practice of modular thinking (via reusable functions like backtest() and compute_rolling()).
  • Insight into the limitations of simple models and when additional complexity might be warranted (e.g., nonlinear models, exogenous variables).

About

No description, website, or topics provided.

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages