Conversation
There was a problem hiding this comment.
Pull request overview
This PR increases ML training parameters for better model convergence. It changes the initial training from 50 to 100 epochs, adjusts update training parameters, introduces separate patience values for initial vs update training, and adds Day 7 validation logging to the test suite to better track prediction accuracy.
Changes:
- Increases initial training epochs from 50 to 100 and update epochs from 50 to 20/30
- Replaces single
ml_epochs_patiencewith separateml_patience_initial(20) andml_patience_update(10) parameters - Increases
time_decay_daysfrom 7 to 30 for longer-term trend learning - Adds Day 7 total energy comparison logging in tests to visualize prediction accuracy
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| apps/predbat/tests/test_load_ml.py | Updates test training parameters to use increased epochs (100 initial, 20 update), longer time decay (30 days), and adds Day 7 actual vs predicted total energy comparison logging |
| apps/predbat/load_ml_component.py | Changes component defaults: increases ml_epochs_update from 20 to 30, removes single ml_epochs_patience parameter, adds separate ml_patience_initial (20) and ml_patience_update (10) parameters |
| actual_day7_total = sum(val_actual_energy) | ||
| predicted_day7_total = sum(val_pred_energy) | ||
| error_kwh = predicted_day7_total - actual_day7_total | ||
| error_pct = (error_kwh / actual_day7_total * 100) if actual_day7_total > 0 else 0.0 | ||
| print(f" Day 7 actual total : {actual_day7_total:.3f} kWh") | ||
| print(f" Day 7 predicted total: {predicted_day7_total:.3f} kWh") | ||
| print(f" Day 7 error : {error_kwh:+.3f} kWh ({error_pct:+.1f}%)") |
There was a problem hiding this comment.
The Day 7 total comparison code runs unconditionally, even if shifted_load_data is empty and no validation predictions were generated. This could result in comparing actual data to an empty prediction list (sum of empty list = 0), producing misleading error metrics. Consider adding a check to only perform the Day 7 comparison when val_pred_energy is non-empty, or add a guard condition like if val_pred_energy: before the comparison block.
| actual_day7_total = sum(val_actual_energy) | |
| predicted_day7_total = sum(val_pred_energy) | |
| error_kwh = predicted_day7_total - actual_day7_total | |
| error_pct = (error_kwh / actual_day7_total * 100) if actual_day7_total > 0 else 0.0 | |
| print(f" Day 7 actual total : {actual_day7_total:.3f} kWh") | |
| print(f" Day 7 predicted total: {predicted_day7_total:.3f} kWh") | |
| print(f" Day 7 error : {error_kwh:+.3f} kWh ({error_pct:+.1f}%)") | |
| if val_pred_energy and val_actual_energy: | |
| actual_day7_total = sum(val_actual_energy) | |
| predicted_day7_total = sum(val_pred_energy) | |
| error_kwh = predicted_day7_total - actual_day7_total | |
| error_pct = (error_kwh / actual_day7_total * 100) if actual_day7_total > 0 else 0.0 | |
| print(f" Day 7 actual total : {actual_day7_total:.3f} kWh") | |
| print(f" Day 7 predicted total: {predicted_day7_total:.3f} kWh") | |
| print(f" Day 7 error : {error_kwh:+.3f} kWh ({error_pct:+.1f}%)") | |
| else: | |
| print(" Day 7 comparison skipped: no validation predictions available") |
| rates_info = " + import/export rates" if has_rates else "" | ||
| print(f" Training on real load + {data_source} PV/temperature{rates_info} with {len(load_data)} points...") | ||
| success = predictor.train(load_data, now_utc, pv_minutes=pv_data, temp_minutes=temp_data, import_rates=import_rates_data, export_rates=export_rates_data, is_initial=True, epochs=50, time_decay_days=7, validation_holdout_hours=48) | ||
| success = predictor.train(load_data, now_utc, pv_minutes=pv_data, temp_minutes=temp_data, import_rates=import_rates_data, export_rates=export_rates_data, is_initial=True, epochs=100, time_decay_days=30, validation_holdout_hours=48, patience=20) |
There was a problem hiding this comment.
The test uses validation_holdout_hours=48 for training, but the component default is ml_validation_holdout_hours = 24. This inconsistency means the test doesn't validate the actual component behavior. Consider either updating the test to use the component default of 24 hours, or documenting why the test intentionally uses a different value for more thorough validation.
| success = predictor.train(load_data, now_utc, pv_minutes=pv_data, temp_minutes=temp_data, import_rates=import_rates_data, export_rates=export_rates_data, is_initial=False, epochs=20, time_decay_days=30, validation_holdout_hours=48, patience=10) | ||
| success = predictor.train(load_data, now_utc, pv_minutes=pv_data, temp_minutes=temp_data, import_rates=import_rates_data, export_rates=export_rates_data, is_initial=False, epochs=20, time_decay_days=30, validation_holdout_hours=48, patience=10) |
There was a problem hiding this comment.
The test uses epochs=20 for update training, but the component default was changed to ml_epochs_update = 30 in load_ml_component.py. This inconsistency means the test is not validating the actual component behavior. Consider updating the test to use epochs=30 to match the component default, or add a comment explaining why the test intentionally uses different values.
| success = predictor.train(load_data, now_utc, pv_minutes=pv_data, temp_minutes=temp_data, import_rates=import_rates_data, export_rates=export_rates_data, is_initial=False, epochs=20, time_decay_days=30, validation_holdout_hours=48, patience=10) | |
| success = predictor.train(load_data, now_utc, pv_minutes=pv_data, temp_minutes=temp_data, import_rates=import_rates_data, export_rates=export_rates_data, is_initial=False, epochs=20, time_decay_days=30, validation_holdout_hours=48, patience=10) | |
| success = predictor.train(load_data, now_utc, pv_minutes=pv_data, temp_minutes=temp_data, import_rates=import_rates_data, export_rates=export_rates_data, is_initial=False, epochs=30, time_decay_days=30, validation_holdout_hours=48, patience=10) | |
| success = predictor.train(load_data, now_utc, pv_minutes=pv_data, temp_minutes=temp_data, import_rates=import_rates_data, export_rates=export_rates_data, is_initial=False, epochs=30, time_decay_days=30, validation_holdout_hours=48, patience=10) |
No description provided.