This repository was archived by the owner on Sep 11, 2023. It is now read-only.

Description
Which capacity of the PV system should we use?
It is useful to normalise the PV data before ML, and dividing by the system capacity makes sense.
Importantly, training and prediction must use the same capacity. Hence #691
The metadata from pvoutput.org and Passiv provide capacity values.
Note that PV capacity can degrade over time, so a static value might not work well.
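A minimal sketch of the normalisation, assuming power readings in kW and one capacity value per system. The function name `normalise` and the sample readings are hypothetical, not taken from the codebase. The point is that the capacity is chosen once and reused for both training and prediction:

```python
def normalise(power_kw, capacity_kw):
    """Divide each power reading by the system capacity."""
    if capacity_kw <= 0:
        raise ValueError("capacity must be positive")
    return [p / capacity_kw for p in power_kw]


# Hypothetical readings for a single PV system, in kW.
train_power = [0.0, 1.5, 3.0, 2.25]
live_power = [0.75, 3.3]

# Pick the capacity once (here: the training maximum; a metadata
# value would be used in exactly the same way).
capacity = max(train_power)

train_norm = normalise(train_power, capacity)
live_norm = normalise(live_power, capacity)  # same capacity as training
```

Here the second live reading normalises to a value above 1, because it exceeds anything seen in training.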
1. Use the maximum values of the training set
Pros:
- each system's normalised training data will be between 0 and 1 inclusive
- ML model can be PV system agnostic, as data will be between 0 and 1.
Cons:
- the prediction data might not be between 0 and 1 if it exceeds the values seen in training (unlikely)
- I did this with the first CNN model and had to increase the capacities of ~10 systems a lot, and reduce the capacities of ~10 systems a lot
2. Use metadata
Pros:
- This number is constant; it's given to us
Cons:
- It could be way off the actual power produced.
3. Hybrid 1
- use 1
- adjust capacity if prediction data > 100%
- adjust capacity if prediction data < 50% (over a history of collecting it live)
4. Hybrid 2
- use 2
- If training data < 50%, then use 1
- If training data > 100% then use 1
- If prediction data > 100% then use 1
- If, over a week of data (which includes good sunny days), data < 50%, then use 1
(These lower and upper bounds could change)
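The Hybrid 2 rules could be sketched roughly as below. This is an illustration under stated assumptions, not an agreed implementation: `hybrid2_capacity` and its parameters are hypothetical names, "data < 50%" is read as "peak power is below 50% of the metadata capacity", and the caller is assumed to pass in about a week of live readings that includes sunny days.

```python
def hybrid2_capacity(metadata_kw, train_power_kw, live_week_kw=(),
                     lower=0.5, upper=1.0):
    """Return the capacity to normalise by, following the Hybrid 2 rules.

    Starts from the metadata capacity (option 2) and falls back to the
    training-set maximum (option 1) when the data disagrees with it.
    """
    if metadata_kw <= 0:
        raise ValueError("metadata capacity must be positive")
    if not train_power_kw:
        raise ValueError("training data must not be empty")

    train_max = max(train_power_kw)
    train_peak = train_max / metadata_kw

    # Training data < 50% or > 100% of metadata capacity: use option 1.
    if train_peak < lower or train_peak > upper:
        return train_max

    # Live data (assumed: roughly a week, including sunny days)
    # > 100% or < 50% of metadata capacity: use option 1.
    if live_week_kw:
        live_peak = max(live_week_kw) / metadata_kw
        if live_peak > upper or live_peak < lower:
            return train_max

    # Otherwise trust the metadata (option 2).
    return metadata_kw
```

The `lower` and `upper` arguments correspond to the 50% and 100% bounds, so they can be tuned without changing the logic.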