Merge pull request #66 from FZJ-IEK3-VSA/develop
+ fix Hypertuning class
+ set highspy as default solver
l-kotzur committed Nov 29, 2022
2 parents 2d91c08 + 91c6da3 commit a8dd928
Showing 14 changed files with 191 additions and 91 deletions.
23 changes: 5 additions & 18 deletions .travis.yml
@@ -1,26 +1,13 @@
# Modified from
# https://github.com/PyPSA/PyPSA/master/.travis.yml
language: python
sudo: false

before_install:
- wget https://repo.continuum.io/miniconda/Miniconda3-latest-Linux-x86_64.sh -O miniconda.sh;
- bash miniconda.sh -b -p $HOME/miniconda
- export PATH="$HOME/miniconda/bin:$PATH"
- hash -r
- conda config --set always_yes yes --set changeps1 no
- conda update -q conda
- conda info -a

python:
- "3.9"
install:
- conda env update -q --file=requirements.yml
- conda env update -q --file=requirements_dev.yml
- source activate tsam
- pip install -r requirements.txt
- pip install pytest
- pip install pytest-cov
- pip install codecov
- pip install --no-cache-dir -e .

script:
- source activate tsam
- pytest --cov=./tsam

after_success:
23 changes: 12 additions & 11 deletions README.md
@@ -1,23 +1,18 @@
[![Build Status](https://travis-ci.com/FZJ-IEK3-VSA/tsam.svg?branch=master)](https://travis-ci.com/FZJ-IEK3-VSA/tsam) [![Version](https://img.shields.io/pypi/v/tsam.svg)](https://pypi.python.org/pypi/tsam) [![Documentation Status](https://readthedocs.org/projects/tsam/badge/?version=latest)](https://tsam.readthedocs.io/en/latest/) [![PyPI - License](https://img.shields.io/pypi/l/tsam)](https://github.com/FZJ-IEK3-VSA/tsam/blob/master/LICENSE.txt) [![codecov](https://codecov.io/gh/FZJ-IEK3-VSA/tsam/branch/master/graph/badge.svg)](https://codecov.io/gh/FZJ-IEK3-VSA/tsam)
[![badge](https://img.shields.io/badge/launch-binder-579aca.svg)](https://mybinder.org/v2/gh/FZJ-IEK3-VSA/voila-tsam/HEAD?urlpath=voila/render/Time-Series-Aggregation-Module.ipynb)

<a href="https://www.fz-juelich.de/iek/iek-3"><img src="https://www.fz-juelich.de/SiteGlobals/StyleBundles/Bilder/NeuesLayout/logo.jpg?__blob=normal" alt="Forschungszentrum Juelich Logo"></a>
<a href="https://www.fz-juelich.de/en/iek/iek-3"><img src="https://www.fz-juelich.de/static/media/Logo.2ceb35fc.svg" alt="Forschungszentrum Juelich Logo" width="230px"></a>

# tsam - Time Series Aggregation Module
tsam is a Python package that uses machine learning algorithms to aggregate time series. The aggregation can be performed along two freely combinable dimensions: by representing the time series with a user-defined number of typical periods, or by decreasing the temporal resolution.
tsam was originally designed to reduce the computational load of large-scale energy system optimization models by aggregating their input data, but it is applicable to all kinds of time series, e.g., weather data, load data, both simultaneously, or arbitrary other groups of time series.
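The two aggregation dimensions can be illustrated with a plain-numpy toy sketch (a hypothetical illustration of the concept, not the tsam API — tsam itself offers many more methods and a pandas interface):

```python
import numpy as np

rng = np.random.default_rng(0)
# a synthetic "year": 365 daily profiles with 24 hourly values each
raw = np.sin(np.linspace(0, 2 * np.pi, 24))[None, :] + 0.1 * rng.standard_normal((365, 24))

def typical_days(profiles, k=8, iters=20):
    """Dimension 1: represent the 365 days by k typical days (naive k-means)."""
    centers = profiles[rng.choice(len(profiles), size=k, replace=False)]
    for _ in range(iters):
        # assign each day to its nearest typical day, then update the centers
        labels = np.argmin(((profiles[:, None, :] - centers[None]) ** 2).sum(-1), axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = profiles[labels == j].mean(axis=0)
    return centers, labels

def segment(profiles, n_segments=6):
    """Dimension 2: decrease temporal resolution by merging adjacent hours."""
    return profiles.reshape(len(profiles), n_segments, -1).mean(axis=-1)

centers, labels = typical_days(raw)   # 8 typical days instead of 365
reduced = segment(centers)            # ...at 6 segments per day instead of 24 hours
```

Because each center is the mean of its cluster members, reconstructing the year from the typical days preserves the overall mean of the data — the same property the tsam test suite checks for its representation methods.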

If you want to use tsam in a published work, **please kindly cite** one of our latest journal articles:
* Hoffmann et al. (2022):\
[**The Pareto-Optimal Temporal Aggregation of Energy System Models**](https://www.sciencedirect.com/science/article/abs/pii/S0306261922004342)
* Hoffmann et al. (2021):\
[**Typical periods or typical time steps? A multi-model analysis to determine the optimal temporal aggregation for energy system models**](https://www.sciencedirect.com/science/article/abs/pii/S0306261921011545)

The documentation of the tsam code can be found [**here**](https://tsam.readthedocs.io/en/latest/index.html).

## Features
* flexible handling of multidimensional time series via the pandas module
* several aggregation methods (averaging, k-means, exact k-medoids, hierarchical, k-maxoids, k-medoids with contiguity), based on scikit-learn or implemented in-house with Pyomo
* hypertuning of the aggregation parameters to find the optimal combination of the number of segments per period and the number of typical periods
* novel representation methods that keep statistical attributes, such as the distribution
* flexible integration of extreme periods as their own cluster centers
* weighting of multidimensional time series to represent their relevance
@@ -41,7 +36,7 @@ Or install directly via python as

python setup.py install

In order to use the k-medoids clustering, make sure that you have installed a MILP solver. As default coin-cbc is used. Nevertheless, in case you have access to a license we recommend commercial solvers (e.g. Gurobi or CPLEX) since they have a better performance.
In order to use the k-medoids clustering, make sure that a MILP solver is installed. By default, [HiGHS](https://github.com/ERGO-Code/HiGHS) is used. If you have access to a license, we nevertheless recommend a commercial solver (e.g. Gurobi or CPLEX), since they perform better.
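What the MILP solver is needed for can be shown on a toy problem: exact k-medoids selects the k data points (medoids) that minimize the total distance of all points to their nearest medoid. The brute-force sketch below is purely illustrative — tsam formulates this selection as a mixed-integer linear program in Pyomo and hands it to the solver:

```python
import numpy as np
from itertools import combinations

def exact_kmedoids_bruteforce(points, k):
    """Enumerate every choice of k medoids and keep the one minimizing the
    total distance of all points to their nearest medoid -- the objective
    that the MILP formulation optimizes."""
    d = np.abs(points[:, None] - points[None, :])  # pairwise distances (1-D data)
    best_cost, best_medoids = np.inf, None
    for medoids in combinations(range(len(points)), k):
        cost = d[:, list(medoids)].min(axis=1).sum()
        if cost < best_cost:
            best_cost, best_medoids = cost, medoids
    return best_medoids, best_cost

pts = np.array([0.0, 0.1, 0.2, 5.0, 5.1, 9.9, 10.0, 10.1])
medoids, cost = exact_kmedoids_bruteforce(pts, k=3)
```

Brute force is exponential in the number of candidates, which is why realistic problem sizes require a MILP solver such as HiGHS, Gurobi, or CPLEX.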


## Examples
@@ -95,20 +90,26 @@ The example time series are based on a department [publication](https://www.mdpi

MIT License

Copyright (C) 2016-2019 Leander Kotzur (FZJ IEK-3), Maximilian Hoffmann (FZJ IEK-3), Peter Markewitz (FZJ IEK-3), Martin Robinius (FZJ IEK-3), Detlef Stolten (FZJ IEK-3)
Copyright (C) 2016-2022 Leander Kotzur (FZJ IEK-3), Maximilian Hoffmann (FZJ IEK-3), Peter Markewitz (FZJ IEK-3), Martin Robinius (FZJ IEK-3), Detlef Stolten (FZJ IEK-3)

You should have received a copy of the MIT License along with this program.
If not, see https://opensource.org/licenses/MIT

The core developer team is based at the [Institute of Energy and Climate Research - Techno-Economic Energy Systems Analysis (IEK-3)](https://www.fz-juelich.de/iek/iek-3/EN/Home/home_node.html) of [Forschungszentrum Jülich](https://www.fz-juelich.de/).

## Further Reading
## Citing and further reading

If you want to use tsam in a published work, **please kindly cite** our latest journal articles:
* Hoffmann et al. (2022):\
[**The Pareto-Optimal Temporal Aggregation of Energy System Models**](https://www.sciencedirect.com/science/article/abs/pii/S0306261922004342)


If you are further interested in the impact of time series aggregation on the cost-optimal results of different energy system use cases, a publication that validates the methods and describes their capabilities is available via the following [**link**](https://www.sciencedirect.com/science/article/pii/S0960148117309783). A second publication introduces a method for modeling state variables (e.g., the state of charge of energy storage components) between the aggregated typical periods, which can be found [**here**](https://www.sciencedirect.com/science/article/pii/S0306261918300242). Finally, the potential of time series aggregation to simplify mixed-integer linear problems is investigated [**here**](https://www.mdpi.com/1996-1073/12/14/2825).

The publications about time series aggregation for energy system optimization models published alongside the development of tsam are listed below:
* Hoffmann et al. (2022):\
[**The Pareto-Optimal Temporal Aggregation of Energy System Models**](https://arxiv.org/abs/2111.12072)
[**The Pareto-Optimal Temporal Aggregation of Energy System Models**](https://www.sciencedirect.com/science/article/abs/pii/S0306261922004342)\
(an open-access manuscript is available [**here**](https://arxiv.org/abs/1710.07593))
* Hoffmann et al. (2021):\
[**Typical periods or typical time steps? A multi-model analysis to determine the optimal temporal aggregation for energy system models**](https://www.sciencedirect.com/science/article/abs/pii/S0306261921011545)
* Hoffmann et al. (2020):\
Expand Down
5 changes: 3 additions & 2 deletions requirements.txt
@@ -1,6 +1,7 @@
scikit-learn>=0.0
pandas>=0.18.1
numpy>=1.11.0
pyomo>=5.3
pyomo>=6.4.3
networkx
tqdm
tqdm
highspy
8 changes: 3 additions & 5 deletions requirements.yml
@@ -3,8 +3,6 @@ channels:
- conda-forge
dependencies:
- python
- scikit-learn
- pandas>=0.18.1
- numpy>=1.11.0
- coincbc
- tqdm
- pip
- pip:
- -r requirements.txt
4 changes: 2 additions & 2 deletions setup.py
@@ -8,9 +8,9 @@

setuptools.setup(
name="tsam",
version="2.1.0",
version="2.2.2",
author="Leander Kotzur, Maximilian Hoffmann",
author_email="l.kotzur@fz-juelich.de, max.hoffmann@fz-juelich.de",
author_email="leander.kotzur@googlemail.com, max.hoffmann@fz-juelich.de",
description="Time series aggregation module (tsam) to create typical periods",
long_description=long_description,
long_description_content_type="text/markdown",
31 changes: 31 additions & 0 deletions test/test_durationRepresentation.py
@@ -89,6 +89,8 @@ def test_distributionMinMaxRepresentation():
aggregation = tsam.TimeSeriesAggregation(
raw,
noTypicalPeriods=8,
segmentation=True,
noSegments=8,
hoursPerPeriod=24,
sortValues=False,
clusterMethod="hierarchical",
@@ -110,6 +112,35 @@
)


def test_distributionRepresentation_keeps_mean():

raw = pd.read_csv(
os.path.join(os.path.dirname(__file__), "..", "examples", "testdata.csv"),
index_col=0,
)

aggregation = tsam.TimeSeriesAggregation(
raw,
noTypicalPeriods=8,
hoursPerPeriod=24,
segmentation=True,
noSegments=8,
sortValues=False,
clusterMethod="hierarchical",
representationMethod="distributionRepresentation",
distributionPeriodWise=False,
rescaleClusterPeriods=False, # even without rescaling
)

predictedPeriods = aggregation.predictOriginalData()

assert np.isclose(
raw.mean(),
predictedPeriods.mean(),
atol=1e-4
).all()




if __name__ == "__main__":
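The property that `test_distributionRepresentation_keeps_mean` above checks is worth spelling out: in a distribution ("duration curve") based representation, each cluster representative is built from the sorted member profiles, and since sorting only permutes values within a period, the total sum — and hence the mean — of the data is preserved. A minimal numpy sketch of the idea (a conceptual illustration, not tsam's actual implementation):

```python
import numpy as np

rng = np.random.default_rng(1)
values = rng.random((10, 24))          # 10 periods with 24 time steps each
labels = rng.integers(0, 3, size=10)   # a given assignment to 3 clusters

# represent each cluster by the element-wise mean of its *sorted* member
# profiles (a duration-curve representation); per-cluster totals are unchanged
rep = {j: np.sort(values[labels == j], axis=1).mean(axis=0)
       for j in np.unique(labels)}
predicted = np.stack([rep[j] for j in labels])
```

The reconstructed data keeps the overall mean exactly, even though individual time steps within each period are reordered — which is why the test can assert mean preservation even with `rescaleClusterPeriods=False`.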
63 changes: 60 additions & 3 deletions test/test_hypertuneAggregation.py
@@ -31,7 +31,7 @@ def test_optimalPair():
index_col=0,
)

datareduction=0.1
datareduction=0.01

# just take wind
aggregation_wind = tune.HyperTunedAggregations(
@@ -47,7 +47,7 @@
)

# and identify the best combination for a data reduction to ~1%.
windSegments, windPeriods= aggregation_wind.identifyOptimalSegmentPeriodCombination(dataReduction=datareduction)
windSegments, windPeriods, windRMSE= aggregation_wind.identifyOptimalSegmentPeriodCombination(dataReduction=datareduction)

# just take solar irradiation
aggregation_solar = tune.HyperTunedAggregations(
@@ -63,7 +63,7 @@
)

# and identify the best combination for a data reduction to ~1%.
solarSegments, solarPeriods = aggregation_solar.identifyOptimalSegmentPeriodCombination(dataReduction=datareduction)
solarSegments, solarPeriods, solarRMSE = aggregation_solar.identifyOptimalSegmentPeriodCombination(dataReduction=datareduction)


# according to Hoffmann et al. 2022, more segments and fewer typical days are better for solar than for wind
@@ -74,6 +74,63 @@
assert windPeriods * windSegments <= len(raw["Wind"])*datareduction
assert windPeriods * windSegments >= len(raw["Wind"])*datareduction * 0.8
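The bounds asserted here are plain arithmetic: the aggregated size (typical periods times segments per period) must fall within a tolerance band around the requested share of the original time steps. A sketch with the values assumed in this test (8760 hourly steps, a 1% reduction target, the 20% tolerance from the asserts):

```python
total_steps = 8760                      # one year of hourly data
data_reduction = 0.01                   # requested share of time steps to keep
upper = total_steps * data_reduction    # at most 87.6 aggregated time steps
lower = 0.8 * upper                     # lower edge of the tolerance band

# e.g. 10 typical periods of 8 segments each (80 steps) fall inside the band
periods, segments = 10, 8
inside = lower <= periods * segments <= upper
```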


def test_steepest_gradient_leads_to_optima():
"""
Based on a hint from Eva Simarik, check that the RMSE of the optimized
combination of segments and periods is smaller than that of a segmentation-only approach
"""

raw = pd.read_csv(
os.path.join(os.path.dirname(__file__), "..", "examples", "testdata.csv"),
index_col=0,
)

SEGMENTS_TESTED = 5

datareduction = (SEGMENTS_TESTED*365)/8760

# just take wind
tunedAggregations = tune.HyperTunedAggregations(
tsam.TimeSeriesAggregation(
raw,
hoursPerPeriod=24,
clusterMethod="hierarchical",
representationMethod="meanRepresentation",
rescaleClusterPeriods=False,
segmentation=True,
)
)

# and identify the best combination for a data reduction.
segmentsOpt, periodsOpt, RMSEOpt = tunedAggregations.identifyOptimalSegmentPeriodCombination(dataReduction=datareduction)

# test steepest
tunedAggregations.identifyParetoOptimalAggregation(untilTotalTimeSteps=365*SEGMENTS_TESTED)
steepestAggregation = tunedAggregations.aggregationHistory[-1]
RMSEsteepest = steepestAggregation.totalAccuracyIndicators()["RMSE"]

# only segments
aggregation = tsam.TimeSeriesAggregation(
raw,
noTypicalPeriods=365,
hoursPerPeriod=24,
segmentation=True,
noSegments=SEGMENTS_TESTED,
clusterMethod="hierarchical",
representationMethod="meanRepresentation",
)

RMSESegments = aggregation.totalAccuracyIndicators()["RMSE"]

assert RMSEsteepest < RMSESegments

assert np.isclose(RMSEsteepest, RMSEOpt, atol=1e-3)
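The RMSE indicator compared in these asserts can be sketched as a plain root-mean-square error over all time steps (an assumed formula for illustration; tsam's `totalAccuracyIndicators` may normalize differently):

```python
import numpy as np

def rmse(original, predicted):
    """Root-mean-square error between an original and an aggregated series."""
    original = np.asarray(original, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    return float(np.sqrt(np.mean((original - predicted) ** 2)))
```

A lower RMSE means the aggregated representation reproduces the original series more closely, which is why the test accepts the steepest-gradient path only if it matches the optimum found by the exhaustive search.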






def test_paretoOptimalAggregation():

raw = pd.read_csv(
4 changes: 4 additions & 0 deletions test/test_samemean.py
@@ -43,5 +43,9 @@ def test_samemean():
)






if __name__ == "__main__":
test_samemean()
