# Daily Blog #60 - Time Series Forecasting for Oceanographic Data
### June 29, 2025  

## 1. **What is Time Series Forecasting?**

A **time series** is a sequence of data points collected over time intervals (e.g., hourly, daily, monthly).
**Time series forecasting** is the use of models to predict future values based on previously observed data.

### Examples in Oceanography:

* Forecasting **sea surface temperatures (SST)** to detect El Niño.
* Predicting **chlorophyll concentration** to monitor algal blooms.
* Estimating **tidal patterns** or **wave heights** for coastal safety.

## 2. **Oceanographic Data Types (Time-Series)**

| Variable                      | Unit     | Sensor/Source          | Importance                        |
| ----------------------------- | -------- | ---------------------- | --------------------------------- |
| Sea Surface Temperature (SST) | °C       | Satellites, Buoys      | Climate models, coral bleaching   |
| Salinity                      | PSU      | CTD, Argo floats       | Circulation, rainfall-evaporation |
| Ocean Currents                | m/s      | ADCP, HF radar         | Navigation, oil spill tracking    |
| Sea Level                     | meters   | Tide gauges, altimetry | Flood prediction                  |
| Dissolved Oxygen              | mg/L     | CTD with sensors       | Hypoxia zones, marine health      |
| pH                            | pH units | Water quality sensors  | Ocean acidification               |

> Strategic Tip: Use **open datasets** from NOAA, NASA, Copernicus, Argo, etc.


## 3. **Preprocessing Oceanographic Time Series Data**

Ocean data is messy. Here’s a practical workflow:

### Data Cleaning

* Handle **missing values**: forward fill, interpolation, or seasonal fill (e.g., the long-term mean for that calendar day or month).
* Remove outliers (e.g., via IQR or Z-score).
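
A minimal sketch of the two cleaning steps above, assuming a pandas `Series` with a `DatetimeIndex`. The function name `clean_series`, the sample readings, and the defaults (`max_gap=3`, `k=1.5`) are illustrative choices, not values from any particular dataset.

```python
import numpy as np
import pandas as pd

def clean_series(s: pd.Series, max_gap: int = 3, k: float = 1.5) -> pd.Series:
    """Mask IQR outliers, then fill short gaps by time-based interpolation."""
    q1, q3 = s.quantile(0.25), s.quantile(0.75)
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    # Replace outliers with NaN so they are handled like missing readings.
    masked = s.where(s.between(lower, upper))
    # Interpolate along the time index; at most max_gap consecutive NaNs are filled.
    return masked.interpolate(method="time", limit=max_gap)

# Hypothetical daily SST readings (°C) with one gap and one sensor spike.
idx = pd.date_range("2024-01-01", periods=8, freq="D")
sst = pd.Series([28.1, 28.3, np.nan, 28.7, 45.0, 28.9, 29.0, 29.1], index=idx)
cleaned = clean_series(sst)
print(cleaned)
```

Masking outliers before interpolating keeps a spike from leaking into its neighbours. For long outages, leaving the gap as `NaN` is usually safer than inventing values.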

### Resampling

* Aggregate or interpolate to uniform intervals (e.g., daily mean SST).
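
As a rough illustration of aggregating to daily means, here is a sketch using synthetic hourly readings. The minimum of 18 samples per day is an assumed quality rule, not a standard.

```python
import numpy as np
import pandas as pd

# Hypothetical hourly buoy SST over three days, with a daily cycle.
hours = np.arange(72)
idx = pd.Timestamp("2024-03-01") + pd.to_timedelta(hours, unit="h")
sst = pd.Series(28.0 + 0.5 * np.sin(2 * np.pi * hours / 24), index=idx)
sst = sst.drop(sst.index[[5, 6, 30]])  # simulate dropped transmissions

daily_mean = sst.resample("D").mean()
daily_count = sst.resample("D").count()

# Assumed rule: keep a daily mean only if at least 18 hourly samples exist.
MIN_SAMPLES = 18
daily_mean = daily_mean.where(daily_count >= MIN_SAMPLES)
print(daily_mean)
```

Tracking the sample count alongside the mean makes it easy to spot days where the "daily mean" is really an average of a few readings.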

### Stationarity

Check and correct for **non-stationarity**:

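* Use the **ADF test** (null hypothesis: the series has a unit root, i.e., is non-stationary) or the **KPSS test** (null hypothesis: the series is stationary). Running both gives a useful cross-check.
* Apply differencing or transformations (log, Box-Cox). Log and Box-Cox require strictly positive values.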

### Feature Engineering

* **Lag features** (previous observations).
* **Rolling means** (e.g., 7-day average).
* **Seasonal indicators** (month, day of year).
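
The three feature types above could be built roughly like this. `make_features` and its column names are hypothetical; the sine/cosine encoding of day of year is one common way to keep December 31 and January 1 close together.

```python
import numpy as np
import pandas as pd

def make_features(s: pd.Series, lags=(1, 7), window: int = 7) -> pd.DataFrame:
    """Build lag, rolling-mean, and seasonal features from a daily series."""
    df = pd.DataFrame({"y": s})
    for lag in lags:
        df[f"lag_{lag}"] = s.shift(lag)
    # Shift before rolling so the window only sees values before the current day.
    df[f"roll_mean_{window}"] = s.shift(1).rolling(window).mean()
    doy = s.index.dayofyear.to_numpy()
    df["month"] = s.index.month.to_numpy()
    df["doy_sin"] = np.sin(2 * np.pi * doy / 365.25)
    df["doy_cos"] = np.cos(2 * np.pi * doy / 365.25)
    return df.dropna()  # early rows lack a full history

idx = pd.date_range("2024-01-01", periods=30, freq="D")
series = pd.Series(np.arange(30, dtype=float), index=idx)  # toy values 0..29
features = make_features(series)
print(features.head())
```

The `shift(1)` before `rolling` matters: without it, the rolling mean would include the value being predicted.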


## 4. **Forecasting Methods**

### A. **Statistical Models**

| Model                       | Use-case                        | Pros                           | Cons                        |
| --------------------------- | ------------------------------- | ------------------------------ | --------------------------- |
| ARIMA                       | Non-seasonal + linear trends    | Simple, interpretable          | Assumes linearity           |
| SARIMA                      | Seasonal ARIMA                  | Good for monthly/annual cycles | Complex parameter tuning    |
| Exponential Smoothing (ETS) | Smoothing + Trend + Seasonality | Fast, handles seasonality      | Not good for irregular data |

> Use `statsmodels` or `pmdarima` in Python.

### B. **Machine Learning Models**

| Model                                 | Notes                                      |
| ------------------------------------- | ------------------------------------------ |
| Random Forest                         | Use for multi-variate, tabular time series |
| Gradient Boosting (XGBoost, LightGBM) | Works with feature-engineered lags         |
| SVR / KNN                             | Not preferred for long-term dependencies   |
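
Tree models have no notion of time, so the history has to be handed to them as columns. A minimal sketch with scikit-learn's `RandomForestRegressor` on synthetic data (the 30-day cycle, seven lags, and 80/20 split are assumptions for illustration):

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(1)
n = 400
# Synthetic daily series with a 30-day cycle and small noise.
y = pd.Series(28.0 + 2.0 * np.sin(2 * np.pi * np.arange(n) / 30) + rng.normal(0, 0.1, n))

# Lag features: the previous 7 observations become the model inputs.
X = pd.concat({f"lag_{k}": y.shift(k) for k in range(1, 8)}, axis=1).dropna()
target = y.loc[X.index]

# Chronological split: never shuffle time series before splitting.
split = int(len(X) * 0.8)
X_train, X_test = X.iloc[:split], X.iloc[split:]
y_train, y_test = target.iloc[:split], target.iloc[split:]

model = RandomForestRegressor(n_estimators=100, random_state=0)
model.fit(X_train, y_train)
pred = model.predict(X_test)

mae = float(np.mean(np.abs(pred - y_test.to_numpy())))
print(f"One-step-ahead MAE: {mae:.3f}")
```

Note that tree ensembles cannot predict values outside the range seen in training, which matters for series with a long-term warming trend.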

### C. **Deep Learning Models**

| Model                                                       | Strengths                                   |
| ----------------------------------------------------------- | ------------------------------------------- |
| RNN / LSTM                                                  | Non-linear; LSTM gating captures longer-term dependencies than a plain RNN |
| GRU                                                         | Lightweight alternative to LSTM             |
| TCN (Temporal Convolutional Network)                        | Faster training, parallelizable             |
| Transformer-based (e.g., Time Series Transformer, Informer) | Designed for long sequences; tends to need more training data |

> Use `TensorFlow`, `PyTorch`, or `GluonTS`.
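
Whichever framework is used, sequence models generally want the data cut into fixed-length windows. The sketch below (helper name `make_windows` is hypothetical) produces the `(samples, timesteps, features)` layout that Keras recurrent layers and PyTorch's `nn.LSTM` with `batch_first=True` accept.

```python
import numpy as np

def make_windows(series, n_in, n_out=1):
    """Slice a 1-D series into (input window, target window) pairs."""
    values = np.asarray(series, dtype=np.float32)
    n_samples = len(values) - n_in - n_out + 1
    if n_samples <= 0:
        raise ValueError("series is too short for the requested window sizes")
    X = np.stack([values[i : i + n_in] for i in range(n_samples)])
    y = np.stack([values[i + n_in : i + n_in + n_out] for i in range(n_samples)])
    # Add a trailing feature axis: (samples, timesteps, 1) for a univariate series.
    return X[..., np.newaxis], y

series = np.arange(10)  # toy stand-in for daily SST
X, y = make_windows(series, n_in=4, n_out=2)
print(X.shape, y.shape)
print(X[0, :, 0], "->", y[0])
```

Here each sample uses 4 past steps to predict the next 2. Scale the series using statistics from the training portion only before windowing.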

---

## 5. **Model Evaluation for Forecasting**

Use **forecast-specific metrics**:

| Metric | Meaning                                          |
| ------ | ------------------------------------------------ |
| MAE    | Mean Absolute Error                              |
| RMSE   | Root Mean Squared Error (penalizes large errors) |
| MAPE   | Mean Absolute Percentage Error (undefined when an actual value is zero) |
| WAPE   | Weighted Absolute Percentage Error (total absolute error ÷ total absolute actuals) |
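
The four metrics are short enough to write out by hand, which helps make the differences concrete. A sketch with made-up numbers (the function name `forecast_metrics` is an assumption):

```python
import numpy as np

def forecast_metrics(actual, predicted):
    """Compute MAE, RMSE, MAPE (%), and WAPE (%) for two equal-length arrays."""
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    err = predicted - actual
    abs_err = np.abs(err)
    return {
        "MAE": float(abs_err.mean()),
        "RMSE": float(np.sqrt((err ** 2).mean())),
        # MAPE divides by each actual value, so zeros in `actual` break it.
        "MAPE": float(100 * np.mean(abs_err / np.abs(actual))),
        # WAPE divides total error by total actuals, so small values matter less.
        "WAPE": float(100 * abs_err.sum() / np.abs(actual).sum()),
    }

actual = [10, 20, 30, 40]
predicted = [12, 19, 27, 40]
metrics = forecast_metrics(actual, predicted)
print(metrics)
# MAE = 1.5, RMSE ≈ 1.871, MAPE = 8.75, WAPE = 6.0
```

MAPE (8.75%) is higher than WAPE (6%) here because the 2-unit miss on the smallest actual value counts as a 20% error. For anomaly series that cross zero, MAE or RMSE are the safer choices.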

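To validate forecasts (including **multi-step** ones) without leaking future data into training, use:

* **Rolling window forecast**: slide a fixed-length training window forward and forecast the next block each time.
* **Walk-forward validation**: train on everything up to a cutoff, forecast the next block, move the cutoff forward, and repeat.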
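
A bare-bones sketch of walk-forward validation with an expanding training window. Both helpers (`walk_forward_splits`, `seasonal_naive`) are hypothetical names, and the seasonal-naive forecaster is just a baseline to plug into the loop.

```python
import numpy as np

def walk_forward_splits(n_obs, initial_train, horizon, step=None):
    """Yield (train_idx, test_idx) pairs with an expanding training window."""
    step = step or horizon
    start = initial_train
    while start + horizon <= n_obs:
        yield np.arange(0, start), np.arange(start, start + horizon)
        start += step

def seasonal_naive(train, horizon, period):
    """Baseline: repeat the last observed season."""
    last_season = train[-period:]
    reps = int(np.ceil(horizon / period))
    return np.tile(last_season, reps)[:horizon]

# Noise-free monthly cycle, so the seasonal-naive baseline should be exact.
t = np.arange(48)
sst = 28.0 + 2.0 * np.sin(2 * np.pi * t / 12)

fold_mae = []
for train_idx, test_idx in walk_forward_splits(len(sst), initial_train=24, horizon=6):
    forecast = seasonal_naive(sst[train_idx], horizon=len(test_idx), period=12)
    fold_mae.append(float(np.mean(np.abs(forecast - sst[test_idx]))))

print(len(fold_mae), "folds, MAE per fold:", fold_mae)
```

For a rolling (fixed-length) window instead, replace `np.arange(0, start)` with `np.arange(start - initial_train, start)`.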

---

## 6. **Real-World Application Examples**

### Fisheries Monitoring

* **Forecast SST or chlorophyll** to predict fish migration.
* Use LSTM to anticipate periods of low oxygen (hypoxia zones).

### Climate Risk

* Forecast **sea level rise** for coastal planning using SARIMA + LSTM hybrids.

### Marine Policy

* Predict changes in **pH levels** to inform marine protected area (MPA) boundaries.


## 7. Tools & Libraries to Learn

| Tool                    | Purpose                                       |
| ----------------------- | --------------------------------------------- |
| `xarray`                | Work with multi-dimensional NetCDF ocean data |
| `pandas`                | Time series manipulation                      |
| `statsmodels`           | ARIMA, SARIMA                                 |
| `prophet`               | Quick seasonal models by Meta                 |
| `scikit-learn`          | ML models                                     |
| `tensorflow`, `pytorch` | Deep learning                                 |
| `pyts`, `tslearn`       | Time series clustering, classification        |
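
To show how `xarray` and `pandas` fit together, here is a sketch that collapses a gridded field into a single regional time series. The grid is synthetic and the coordinates are placeholders; a real file would typically be loaded with `xr.open_dataset(path)` instead.

```python
import numpy as np
import pandas as pd
import xarray as xr

time = pd.date_range("2024-01-01", periods=5, freq="D")
lat = np.array([5.0, 10.0, 15.0])   # placeholder latitudes (°N)
lon = np.array([120.0, 125.0])      # placeholder longitudes (°E)

# Spatially uniform field warming 1 °C per day, so the expected mean is obvious.
data = 28.0 + np.arange(5, dtype=float)[:, None, None] * np.ones((5, 3, 2))
sst = xr.DataArray(
    data,
    coords={"time": time, "lat": lat, "lon": lon},
    dims=("time", "lat", "lon"),
    name="sst",
)

# Weight by cos(latitude) because grid cells cover less area toward the poles.
weights = np.cos(np.deg2rad(sst["lat"]))
regional_mean = sst.weighted(weights).mean(dim=("lat", "lon"))

series = regional_mean.to_series()  # pandas Series indexed by time
print(series)
```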

---

## 8. Project Idea

**Project**: Build an **AI-based dashboard** that predicts sea surface temperature, salinity, and pH levels in Philippine waters using Argo and NOAA datasets.

### Goals:

* Use LSTM to forecast SST anomalies.
* Visualize real-time + predicted conditions.
* Add alerts for ecological events (e.g., risk of coral bleaching).
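
One possible starting point for the anomaly and alert goals: compute SST anomalies against a monthly climatology and flag months above a threshold. The data is synthetic, and `ALERT_THRESHOLD_C` is a placeholder, not an operational bleaching criterion.

```python
import numpy as np
import pandas as pd

def monthly_anomalies(sst: pd.Series) -> pd.Series:
    """Subtract each calendar month's long-term mean from the series."""
    climatology = sst.groupby(sst.index.month).transform("mean")
    return sst - climatology

# Synthetic monthly SST for three years with one warm spike in July 2023.
idx = pd.date_range("2021-01-01", periods=36, freq="MS")
sst = pd.Series(28.0 + 2.0 * np.sin(2 * np.pi * np.arange(36) / 12), index=idx)
sst.iloc[30] += 3.0

anomalies = monthly_anomalies(sst)

ALERT_THRESHOLD_C = 1.0  # hypothetical threshold, for illustration only
alerts = anomalies > ALERT_THRESHOLD_C
print(anomalies[alerts])
```

In the dashboard, the same function could run on forecast values appended to the observed record. With a short record, the spike itself inflates the climatology (the July anomaly comes out as 2 °C rather than 3 °C), so a longer baseline period is worth having.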

---

## 9. Summary Checklist

| Task                                  | Status |
| ------------------------------------- | ------ |
| Understand time series concepts       | 🟩     |
| Clean and preprocess ocean data       | ⬜      |
| Try ARIMA and LSTM on sample dataset  | ⬜      |
| Evaluate with MAPE, RMSE              | ⬜      |
| Build simple dashboard (Plotly, Dash) | ⬜      |


## 10. Mental Triggers

* Oceanographic data has **seasonality** and **multi-scale patterns**, so go beyond linear models.
* Environmental systems are **noisy**, so use robust smoothing and ensemble methods.
* Time series forecasting isn't just numbers: **your predictions could save ecosystems**.
