# Task 1 - Problem Formulation


## 1.1 Power Forecasting
### Task (T)
We want a model that takes current weather-sensor readings and past power output and predicts how much electricity a solar inverter will produce over the next few minutes or the next hour.
This is a regression problem because the model predicts continuous values (power in kW).
* **Target Variable:** per-inverter instantaneous power (kW) and plant-level total power (the sum over inverters).
* **ML Problem Type:** supervised regression (single-step real-time nowcast and 1-hour-ahead forecast).
* **Prediction Horizon:**
  * Nowcast / real-time: predict current (or immediately following) power from concurrent sensor readings (useful for short-term control).
  * Short horizon: 1 hour ahead (single step). If the data cadence is sub-hourly, 1 hour ahead corresponds to N steps ahead.
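
The "N steps ahead" framing can be sketched as follows. This is a minimal illustration only: the 15-minute cadence and the helper names `steps_ahead` and `make_supervised_pairs` are assumptions, not properties of the dataset.

```python
def steps_ahead(horizon_minutes: int, cadence_minutes: int) -> int:
    """Convert a forecast horizon into a number of sampling steps."""
    if horizon_minutes % cadence_minutes != 0:
        raise ValueError("horizon must be a multiple of the sampling cadence")
    return horizon_minutes // cadence_minutes


def make_supervised_pairs(power, n_steps):
    """Pair the observation at index t with the target at index t + n_steps."""
    return [(power[t], power[t + n_steps]) for t in range(len(power) - n_steps)]


# Assumed cadence of 15 minutes: a 1-hour horizon is then 4 steps ahead.
n = steps_ahead(60, 15)
pairs = make_supervised_pairs([0.0, 1.0, 2.0, 3.0, 4.0, 5.0], n)
```

With the actual cadence of the data substituted in, the same shift defines the regression target for the 1-hour-ahead model.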

### Performance (P)
* **Metric:**
  * Primary metrics:
    * MAE (Mean Absolute Error): interpretable in kW (maps directly to expected energy/revenue error) and less sensitive to outliers than RMSE.
    * RMSE (Root Mean Squared Error): penalises large errors (important because large underestimates can cause revenue/contract penalties).
    * Skill score vs. a persistence baseline (e.g., percent improvement over persistence or a simple clear-sky model): business-relevant, since it shows the value added over a trivial model.
  * Secondary metrics:
    * sMAPE or NMAE for normalised comparisons across inverters or plants with different capacities. MAPE is unstable when values are near zero (night), so it is computed only on the daytime subset.

* **Justification:** Absolute error (MAE) converts directly to lost energy (kW × time) and hence to revenue loss; RMSE penalises occasional large misses that may trigger grid penalties or imbalance charges. Skill vs. persistence demonstrates practical improvement over a trivial forecast.
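
The metrics above can be written down compactly. The sketch below assumes an MAE-based skill score (an RMSE-based one is equally common) and uses made-up numbers purely for illustration; the function names are hypothetical.

```python
import math


def mae(y_true, y_pred):
    """Mean absolute error, in the same unit as the inputs (kW)."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Root mean squared error; weights large misses more heavily than MAE."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def nmae(y_true, y_pred, capacity_kw):
    """MAE normalised by rated capacity, for comparison across inverters."""
    return mae(y_true, y_pred) / capacity_kw


def skill_score(y_true, y_model, y_baseline):
    """1 - MAE_model / MAE_baseline; positive values mean the model beats the baseline."""
    return 1.0 - mae(y_true, y_model) / mae(y_true, y_baseline)


# Illustrative values only (kW). Persistence repeats the previous observation;
# the first entry (8.0) stands in for the value observed just before the window.
observed = [10.0, 12.0, 15.0, 11.0]
model = [11.0, 12.0, 14.0, 11.0]
persistence = [8.0, 10.0, 12.0, 15.0]

model_mae = mae(observed, model)
model_rmse = rmse(observed, model)
skill = skill_score(observed, model, persistence)
```

A skill score of 0 means no improvement over persistence, and a negative value means the model is worse than the trivial baseline.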

### Experience (E) - (Refer to general data description below)

Available features (examples): timestamp, inverter_id, instantaneous inverter power, plant aggregate power, plant sensors (global horizontal irradiance or POA, ambient temp, module temp, wind speed, humidity), possible status flags. Derived features: sun elevation/azimuth, clear-sky irradiance, rolling lags and statistics (lags of power and irradiance, rolling mean/std), time-of-day, day-of-year, weekday/weekend.
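
As a minimal sketch of the lag and rolling-statistic features listed above, assuming a regularly sampled power series held in a plain list (the name `build_features` and the default lags and window are illustrative choices):

```python
from statistics import mean


def build_features(power, lags=(1, 2), window=3):
    """Build one feature row per timestamp using only values strictly before it."""
    rows = []
    start = max(max(lags), window)
    for t in range(start, len(power)):
        row = {f"lag_{k}": power[t - k] for k in lags}
        # The rolling window ends at t - 1, so the target never leaks into the features.
        row[f"roll_mean_{window}"] = mean(power[t - window:t])
        row["target"] = power[t]
        rows.append(row)
    return rows


rows = build_features([1.0, 2.0, 3.0, 4.0, 5.0])
```

The same pattern applies to irradiance lags; the key design choice is that every feature is computed from data available before the prediction time.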

Temporal coverage & structure: 34 days — short but acceptable for short-term models; must avoid leakage and overfitting. Expect diurnal cycles and night zeros.

Quality issues & pre-processing: missing timestamps, sensor dropouts, noisy irradiance (cloud spikes), outliers, intermittent inverter drops to zero. Impute short gaps by interpolation, but handle long gaps separately (flag and exclude them, or mark them as missing). Remove or mask true night periods for certain metrics. Normalise power by each inverter's capacity.
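
The short-gap versus long-gap rule can be sketched as below. The sketch assumes missing readings are represented as `None` in a regularly sampled list, and the `max_gap` limit of 2 steps is an arbitrary illustrative value.

```python
def fill_short_gaps(values, max_gap=2):
    """Linearly interpolate runs of None up to max_gap long; leave longer runs as None."""
    filled = list(values)
    i, n = 0, len(filled)
    while i < n:
        if filled[i] is not None:
            i += 1
            continue
        start = i
        while i < n and filled[i] is None:
            i += 1
        end = i  # index of the first observed value after the gap, or n
        gap_len = end - start
        has_neighbours = start > 0 and end < n
        if gap_len <= max_gap and has_neighbours:
            left, right = filled[start - 1], filled[end]
            for k in range(start, end):
                frac = (k - start + 1) / (gap_len + 1)
                filled[k] = left + frac * (right - left)
    return filled


cleaned = fill_short_gaps([1.0, None, 3.0, None, None, None, 7.0], max_gap=2)
```

Gaps left as `None` can then be flagged and excluded from training and evaluation rather than being filled with invented values.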

Data split (time-aware): chronological split to prevent leakage. Example split for 34 days: first 24 days train (≈70%), next 5 days validation (≈15%), final 5 days test (≈15%). Use rolling-origin validation (expanding window) for robust model selection.
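
The example split can be reproduced with a few lines. This sketch assumes days are numbered 1 to 34 and split at whole-day boundaries; the function name is hypothetical.

```python
def chronological_split(days, train_frac=0.70, val_frac=0.15):
    """Split an ordered collection of days into train/validation/test without shuffling."""
    days = sorted(days)
    n_train = round(len(days) * train_frac)
    n_val = round(len(days) * val_frac)
    return (
        days[:n_train],
        days[n_train:n_train + n_val],
        days[n_train + n_val:],
    )


train_days, val_days, test_days = chronological_split(range(1, 35))
```

Because the split is by position in time, every validation and test day lies strictly after every training day.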

Baselines: persistence (last observed value), simple linear regression on irradiance & temp, and clear-sky scaled model. These are required to compute skill scores.

## 1.2 Operating Condition Classification
### Task (T)
We want to detect when an inverter is behaving abnormally, for example when it produces much less power than the others.
This is a classification problem because the model decides whether an inverter's state is normal or faulty.
* **Target Variable:** 
 binary labels (normal vs. faulty) or multi-class labels (e.g., normal, degraded, maintenance required; or normal, inverter fault, curtailment, communication loss, sensor fault).

* **ML Problem Type:**  binary classification (fault/no fault) or multi-class classification depending on label availability. Also consider anomaly detection (unsupervised) if labelled faults are rare or absent.

* **Prediction Horizon:** real-time or near-real-time detection (detect a fault within minutes/hours of occurrence).

### Performance (P)

Primary metrics for labelled classification:

Recall (Sensitivity) — high priority to detect true faults (missing a real fault leads to lost production).

Precision — to avoid excessive false alarms (which cost maintenance time and desensitize operators).

F1 score: a balanced summary of precision and recall when a single metric is required.

ROC-AUC and PR-AUC — PR-AUC is more informative when classes are imbalanced (faults rare).

For anomaly / unsupervised methods: precision@k, false positive rate at operational thresholds, time-to-detect (latency).

Business tradeoffs: typically prefer higher recall with an acceptable precision (operator can triage alarms). Tune threshold using validation to meet maintenance budget constraints.
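
One way to express the recall-versus-precision trade-off is to pick, on validation data, the threshold with the highest recall among those that meet a minimum precision. The sketch below assumes binary labels (1 = fault) and a model that outputs a fault score; the labels, scores, and the precision floor of 0.55 are invented for illustration.

```python
def precision_recall_f1(y_true, y_pred):
    """Compute precision, recall and F1 for binary labels (1 = fault)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1


def pick_threshold(y_true, scores, min_precision):
    """Return (threshold, recall, precision) with the highest recall meeting min_precision."""
    best = None
    for thr in sorted(set(scores)):
        y_pred = [1 if s >= thr else 0 for s in scores]
        p, r, _ = precision_recall_f1(y_true, y_pred)
        if p >= min_precision and (best is None or r > best[1]):
            best = (thr, r, p)
    return best


labels = [0, 0, 1, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8, 0.7, 0.9]
best = pick_threshold(labels, scores, min_precision=0.55)
```

The chosen threshold should then be frozen and applied unchanged to the test period.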

### Experience (E)

Features: short-window features per inverter (last N minutes/hours of power, normalised by clear-sky output or by inverter capacity), deviation from peer inverters, residual (observed − expected from model), plant sensors, inverter telemetry if present. Peer comparison (fleet anomaly) is very powerful: an inverter that behaves differently from identical peers under the same conditions is likely faulty.
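
A minimal sketch of the peer-comparison feature, assuming one power reading per inverter at a single timestamp. The inverter IDs, readings, and the -30% flagging rule are hypothetical.

```python
from statistics import median


def peer_deviation(power_by_inverter):
    """Relative deviation of each inverter from the fleet median at one timestamp."""
    fleet_median = median(power_by_inverter.values())
    if fleet_median <= 0:
        # At night (or with the whole plant down) the ratio is not meaningful.
        return {inv: 0.0 for inv in power_by_inverter}
    return {
        inv: (p - fleet_median) / fleet_median
        for inv, p in power_by_inverter.items()
    }


snapshot = {"INV_A": 50.0, "INV_B": 52.0, "INV_C": 20.0, "INV_D": 51.0}
deviation = peer_deviation(snapshot)
# Illustrative rule: flag inverters more than 30% below the fleet median.
flagged = [inv for inv, d in deviation.items() if d < -0.30]
```

The median is used rather than the mean so that a single failing inverter does not drag down the reference it is compared against.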

Label availability & imbalance: faults are usually rare, so expect heavy class imbalance. If labels are sparse, use semi-supervised or unsupervised anomaly detectors (one-class SVM, autoencoders, isolation forest), or use peer residuals and rule-based heuristics to generate weak labels.

Train/validation/test: use chronological splits, but ensure the training set contains fault events if the approach is supervised; if that is impossible, use unsupervised methods and evaluate on held-out, known labelled events. Validation must preserve temporal order and use different fault events from the test set. Select data by event (stratified) rather than by random sampling of timestamps.

Evaluation protocol: evaluate per-event (detecting the faulty window) and per-timestamp (instant detection). Report confusion matrix, PR curve, and recall at fixed precision (or precision at fixed recall) as operationally meaningful numbers.
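
Per-event evaluation can be sketched as follows, assuming each fault event is an inclusive `(start, end)` window of timestamp indices and alarms are the indices at which the detector fired. The event windows and alarm indices are invented for illustration.

```python
def event_recall(events, alarm_times):
    """Fraction of fault events with at least one alarm inside their window."""
    if not events:
        return 0.0
    detected = sum(
        1 for start, end in events if any(start <= t <= end for t in alarm_times)
    )
    return detected / len(events)


def detection_latencies(events, alarm_times):
    """Steps from event start to first alarm inside the window; None if the event was missed."""
    latencies = []
    for start, end in events:
        hits = [t for t in alarm_times if start <= t <= end]
        latencies.append(min(hits) - start if hits else None)
    return latencies


events = [(10, 20), (40, 45)]
alarms = {12, 13, 30}
recall = event_recall(events, alarms)
latencies = detection_latencies(events, alarms)
```

Here the alarm at index 30 falls outside every event window, so it would count as a false alarm in the per-timestamp view.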

Thresholding & operationalization: calibrate decision thresholds in validation to meet an acceptable false alarm rate (e.g., < X alarms/day per plant).

## 1.3 Temporal Forecasting
### Task (T)
We want to predict a sequence of future power values, such as the whole next hour or next day, rather than a single step.
This is also a regression problem, but over multiple future time steps.

Targets: sequence of future power values (per inverter and plant aggregate) for a multi-step horizon — e.g., hourly profile for next 24 hours (day-ahead) or next 6–12 steps if sampling is sub-hourly.

Problem type: sequence-to-sequence regression / multi-output regression (can be framed as multi-step forecasting).

Prediction horizons: multi-step (e.g., 6–96 steps depending on cadence), serving business uses such as scheduling and bidding.

### Performance (P)

Metrics per horizon step: MAE and RMSE per forecast horizon (report progression as horizon increases).

Aggregate metrics: average MAE over horizon, and energy error (kWh) over forecast window (directly maps to revenue).
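
The per-horizon and energy-error metrics can be sketched as below. The sketch assumes each forecast run is a list of H power values in kW at a fixed step length; the numbers and the 15-minute step (0.25 h) are illustrative assumptions.

```python
def mae_per_horizon(actuals, forecasts):
    """MAE at each horizon step, averaged over forecast runs."""
    horizon = len(actuals[0])
    n_runs = len(actuals)
    return [
        sum(abs(a[h] - f[h]) for a, f in zip(actuals, forecasts)) / n_runs
        for h in range(horizon)
    ]


def energy_error_kwh(actual_kw, forecast_kw, step_hours):
    """Signed energy error over one forecast window: sum of power errors times step length."""
    return sum(f - a for a, f in zip(actual_kw, forecast_kw)) * step_hours


# Two forecast runs, three steps each (kW).
actuals = [[10.0, 20.0, 30.0], [12.0, 22.0, 32.0]]
forecasts = [[11.0, 18.0, 33.0], [12.0, 25.0, 28.0]]

per_step = mae_per_horizon(actuals, forecasts)
energy_err = energy_error_kwh(actuals[0], forecasts[0], step_hours=0.25)
```

Reporting `per_step` as a curve makes the degradation with horizon visible, while the signed energy error shows whether the forecast over- or under-delivers across the window.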

Skill vs naive baselines: persistence and last-day profile. For day-ahead, compare to simple climatology or clear-sky scaled profile.

Coverage metrics (for probabilistic forecasts, optional): Prediction interval coverage probability (PICP) and interval width (sharpness). Probabilistic forecasts reduce exposure to uncertainty in bidding.
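
For the optional probabilistic metrics, a minimal sketch assuming the model outputs a lower and an upper bound per timestamp (values are made up):

```python
def picp(y_true, lower, upper):
    """Prediction interval coverage probability: share of observations inside their interval."""
    inside = sum(1 for y, lo, hi in zip(y_true, lower, upper) if lo <= y <= hi)
    return inside / len(y_true)


def mean_interval_width(lower, upper):
    """Average interval width (sharpness); narrower is better at equal coverage."""
    return sum(hi - lo for lo, hi in zip(lower, upper)) / len(lower)


observed = [10.0, 12.0, 15.0, 11.0]
lower = [8.0, 11.0, 16.0, 9.0]
upper = [12.0, 14.0, 18.0, 13.0]

coverage = picp(observed, lower, upper)
width = mean_interval_width(lower, upper)
```

Coverage should be compared with the nominal level of the interval (for example, a 90% interval should cover roughly 90% of observations), and width should be read alongside it.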

### Experience (E)

Sequence inputs: history window (lags of power/irradiance), exogenous inputs (weather forecasts if available — very helpful for day-ahead), calendar features, sun position. For multi-step models, include deterministic features per future timestamp (sun position, forecasted irradiance if available).

Model families: Seq2Seq LSTM/GRU/Transformer, temporal convolutional networks (TCN), and gradient-boosted trees used with a recursive or direct multi-step strategy (e.g., LightGBM with one model per forecast step).
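
For the direct multi-output framing, the training data can be arranged as fixed-length history windows paired with fixed-length future windows. A minimal sketch, assuming a regularly sampled series with no gaps (the name `make_windows` and the window sizes are illustrative):

```python
def make_windows(series, n_lags, horizon):
    """Return (X, Y): each X row holds n_lags past values, each Y row the next `horizon` values."""
    X, Y = [], []
    for t in range(n_lags, len(series) - horizon + 1):
        X.append(series[t - n_lags:t])
        Y.append(series[t:t + horizon])
    return X, Y


X, Y = make_windows([0, 1, 2, 3, 4, 5, 6, 7], n_lags=3, horizon=2)
```

A sequence model can consume `X` and `Y` directly, whereas a tree-based direct strategy would fit one regressor per column of `Y`.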

Validation: use rolling-origin (walk-forward) cross-validation: repeatedly train on days 1..t, validate on t+1..t+k to mimic production forecasting. Report averaged metrics across folds.
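
The walk-forward scheme can be sketched as a fold generator over day numbers. The specific sizes below (29 days available for model selection, an initial 14-day training window, 5-day validation blocks) are assumptions chosen only to illustrate the mechanics.

```python
def rolling_origin_folds(n_days, initial_train, val_size, step=None):
    """Expanding-window folds: train on days 1..t, validate on days t+1..t+val_size."""
    step = step or val_size
    folds = []
    train_end = initial_train
    while train_end + val_size <= n_days:
        train = list(range(1, train_end + 1))
        val = list(range(train_end + 1, train_end + val_size + 1))
        folds.append((train, val))
        train_end += step
    return folds


folds = rolling_origin_folds(n_days=29, initial_train=14, val_size=5)
```

Metrics are computed on each validation block and then averaged, so no fold ever trains on data from after its validation period.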

Data sufficiency: 34 days is limited for training complex deep sequence models that must generalise day-ahead; augment with physical features (clear-sky model) and compare against strong baselines. Use careful regularisation, and fall back to simpler models if overfitting is observed.



## 1.4 General Data Experience (E Summary)
* **Available Features:** ...
* **Temporal Coverage:** ...
* **Quality Issues:** ...
* **Splitting Strategy:** ... (e.g., Chronological 70/15/15 split)
