Merge pull request #66 from FZJ-IEK3-VSA/develop
+ fix Hypertuning class
+ set highspy as default solver
l-kotzur committed Nov 29, 2022
2 parents 2d91c08 + 91c6da3 commit a8dd928
Showing 14 changed files with 191 additions and 91 deletions.
23 changes: 5 additions & 18 deletions .travis.yml
@@ -1,26 +1,13 @@
# Modified from
# https://github.com/PyPSA/PyPSA/master/.travis.yml
language: python
sudo: false

before_install:
- wget https://repo.continuum.io/miniconda/Miniconda3-latest-Linux-x86_64.sh -O miniconda.sh;
- bash miniconda.sh -b -p $HOME/miniconda
- export PATH="$HOME/miniconda/bin:$PATH"
- hash -r
- conda config --set always_yes yes --set changeps1 no
- conda update -q conda
- conda info -a

python:
- "3.9"
install:
- conda env update -q --file=requirements.yml
- conda env update -q --file=requirements_dev.yml
- source activate tsam
- pip install -r requirements.txt
- pip install pytest
- pip install pytest-cov
- pip install codecov
- pip install --no-cache-dir -e .

script:
- source activate tsam
- pytest --cov=./tsam

after_success:
23 changes: 12 additions & 11 deletions README.md
@@ -1,23 +1,18 @@
[![Build Status](https://travis-ci.com/FZJ-IEK3-VSA/tsam.svg?branch=master)](https://travis-ci.com/FZJ-IEK3-VSA/tsam) [![Version](https://img.shields.io/pypi/v/tsam.svg)](https://pypi.python.org/pypi/tsam) [![Documentation Status](https://readthedocs.org/projects/tsam/badge/?version=latest)](https://tsam.readthedocs.io/en/latest/) [![PyPI - License](https://img.shields.io/pypi/l/tsam)](https://github.com/FZJ-IEK3-VSA/tsam/blob/master/LICENSE.txt) [![codecov](https://codecov.io/gh/FZJ-IEK3-VSA/tsam/branch/master/graph/badge.svg)](https://codecov.io/gh/FZJ-IEK3-VSA/tsam)
[![badge](https://img.shields.io/badge/launch-binder-579aca.svg)](https://mybinder.org/v2/gh/FZJ-IEK3-VSA/voila-tsam/HEAD?urlpath=voila/render/Time-Series-Aggregation-Module.ipynb)

<a href="https://www.fz-juelich.de/iek/iek-3"><img src="https://www.fz-juelich.de/SiteGlobals/StyleBundles/Bilder/NeuesLayout/logo.jpg?__blob=normal" alt="Forschungszentrum Juelich Logo"></a>
<a href="https://www.fz-juelich.de/en/iek/iek-3"><img src="https://www.fz-juelich.de/static/media/Logo.2ceb35fc.svg" alt="Forschungszentrum Juelich Logo" width="230px"></a>

# tsam - Time Series Aggregation Module
tsam is a Python package that uses machine learning algorithms to aggregate time series. The aggregation can be performed along two freely combinable dimensions: by representing the time series with a user-defined number of typical periods, or by decreasing the temporal resolution.
tsam was originally designed to reduce the computational load of large-scale energy system optimization models by aggregating their input data, but it is applicable to all kinds of time series, e.g., weather data, load data, both simultaneously, or arbitrary other groups of time series.
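The two aggregation dimensions can be illustrated with a plain-numpy toy sketch (a hypothetical illustration of the concept, not the tsam API — tsam itself offers many more methods and a pandas interface):

```python
import numpy as np

rng = np.random.default_rng(0)
# a synthetic "year": 365 daily profiles with 24 hourly values each
raw = np.sin(np.linspace(0, 2 * np.pi, 24))[None, :] + 0.1 * rng.standard_normal((365, 24))

def typical_days(profiles, k=8, iters=20):
    """Dimension 1: represent the 365 days by k typical days (naive k-means)."""
    centers = profiles[rng.choice(len(profiles), size=k, replace=False)]
    for _ in range(iters):
        # assign each day to its nearest typical day, then update the centers
        labels = np.argmin(((profiles[:, None, :] - centers[None]) ** 2).sum(-1), axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = profiles[labels == j].mean(axis=0)
    return centers, labels

def segment(profiles, n_segments=6):
    """Dimension 2: decrease temporal resolution by merging adjacent hours."""
    return profiles.reshape(len(profiles), n_segments, -1).mean(axis=-1)

centers, labels = typical_days(raw)   # 8 typical days instead of 365
reduced = segment(centers)            # ...at 6 segments per day instead of 24 hours
```

Because each center is the mean of its cluster members, reconstructing the year from the typical days preserves the overall mean of the data — the same property the tsam test suite checks for its representation methods.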

If you want to use tsam in a published work, **please kindly cite** one of our latest journal articles:
* Hoffmann et al. (2022):\
[**The Pareto-Optimal Temporal Aggregation of Energy System Models**](https://www.sciencedirect.com/science/article/abs/pii/S0306261922004342)
* Hoffmann et al. (2021):\
[**Typical periods or typical time steps? A multi-model analysis to determine the optimal temporal aggregation for energy system models**](https://www.sciencedirect.com/science/article/abs/pii/S0306261921011545)

The documentation of the tsam code can be found [**here**](https://tsam.readthedocs.io/en/latest/index.html).

## Features
* flexible handling of multidimensional time series via the pandas module
* several aggregation methods (averaging, k-means, exact k-medoids, hierarchical, k-maxoids, k-medoids with contiguity), based on scikit-learn or implemented in-house with Pyomo
* hypertuning of the aggregation parameters to find the optimal combination of the number of segments per period and the number of typical periods
* novel representation methods that keep statistical attributes, such as the distribution
* flexible integration of extreme periods as their own cluster centers
* weighting of multidimensional time series to represent their relevance
@@ -41,7 +36,7 @@ Or install directly via python as

python setup.py install

In order to use the k-medoids clustering, make sure that you have installed a MILP solver. As default coin-cbc is used. Nevertheless, in case you have access to a license we recommend commercial solvers (e.g. Gurobi or CPLEX) since they have a better performance.
In order to use the k-medoids clustering, make sure that a MILP solver is installed. By default, [HiGHS](https://github.com/ERGO-Code/HiGHS) is used. If you have access to a license, we nevertheless recommend a commercial solver (e.g. Gurobi or CPLEX), since they perform better.
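What the MILP solver is needed for can be shown on a toy problem: exact k-medoids selects the k data points (medoids) that minimize the total distance of all points to their nearest medoid. The brute-force sketch below is purely illustrative — tsam formulates this selection as a mixed-integer linear program in Pyomo and hands it to the solver:

```python
import numpy as np
from itertools import combinations

def exact_kmedoids_bruteforce(points, k):
    """Enumerate every choice of k medoids and keep the one minimizing the
    total distance of all points to their nearest medoid -- the objective
    that the MILP formulation optimizes."""
    d = np.abs(points[:, None] - points[None, :])  # pairwise distances (1-D data)
    best_cost, best_medoids = np.inf, None
    for medoids in combinations(range(len(points)), k):
        cost = d[:, list(medoids)].min(axis=1).sum()
        if cost < best_cost:
            best_cost, best_medoids = cost, medoids
    return best_medoids, best_cost

pts = np.array([0.0, 0.1, 0.2, 5.0, 5.1, 9.9, 10.0, 10.1])
medoids, cost = exact_kmedoids_bruteforce(pts, k=3)
```

Brute force is exponential in the number of candidates, which is why realistic problem sizes require a MILP solver such as HiGHS, Gurobi, or CPLEX.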


## Examples
@@ -95,20 +90,26 @@ The example time series are based on a department [publication](https://www.mdpi

MIT License

Copyright (C) 2016-2019 Leander Kotzur (FZJ IEK-3), Maximilian Hoffmann (FZJ IEK-3), Peter Markewitz (FZJ IEK-3), Martin Robinius (FZJ IEK-3), Detlef Stolten (FZJ IEK-3)
Copyright (C) 2016-2022 Leander Kotzur (FZJ IEK-3), Maximilian Hoffmann (FZJ IEK-3), Peter Markewitz (FZJ IEK-3), Martin Robinius (FZJ IEK-3), Detlef Stolten (FZJ IEK-3)

You should have received a copy of the MIT License along with this program.
If not, see https://opensource.org/licenses/MIT

The core developer team is based at the [Institute of Energy and Climate Research - Techno-Economic Energy Systems Analysis (IEK-3)](https://www.fz-juelich.de/iek/iek-3/EN/Home/home_node.html) of [Forschungszentrum Jülich](https://www.fz-juelich.de/).

## Further Reading
## Citing and further reading

If you want to use tsam in a published work, **please kindly cite** our latest journal articles:
* Hoffmann et al. (2022):\
[**The Pareto-Optimal Temporal Aggregation of Energy System Models**](https://www.sciencedirect.com/science/article/abs/pii/S0306261922004342)


If you are further interested in the impact of time series aggregation on the cost-optimal results of different energy system use cases, a publication that validates the methods and describes their capabilities is available via the following [**link**](https://www.sciencedirect.com/science/article/pii/S0960148117309783). A second publication introduces a method for modeling state variables (e.g., the state of charge of energy storage components) between the aggregated typical periods, which can be found [**here**](https://www.sciencedirect.com/science/article/pii/S0306261918300242). Finally, the potential of time series aggregation to simplify mixed-integer linear problems is investigated [**here**](https://www.mdpi.com/1996-1073/12/14/2825).

The publications about time series aggregation for energy system optimization models published alongside the development of tsam are listed below:
* Hoffmann et al. (2022):\
[**The Pareto-Optimal Temporal Aggregation of Energy System Models**](https://arxiv.org/abs/2111.12072)
[**The Pareto-Optimal Temporal Aggregation of Energy System Models**](https://www.sciencedirect.com/science/article/abs/pii/S0306261922004342)\
(an open-access manuscript is available [**here**](https://arxiv.org/abs/1710.07593))
* Hoffmann et al. (2021):\
[**Typical periods or typical time steps? A multi-model analysis to determine the optimal temporal aggregation for energy system models**](https://www.sciencedirect.com/science/article/abs/pii/S0306261921011545)
* Hoffmann et al. (2020):\
Expand Down
5 changes: 3 additions & 2 deletions requirements.txt
@@ -1,6 +1,7 @@
scikit-learn>=0.0
pandas>=0.18.1
numpy>=1.11.0
pyomo>=5.3
pyomo>=6.4.3
networkx
tqdm
tqdm
highspy
8 changes: 3 additions & 5 deletions requirements.yml
@@ -3,8 +3,6 @@ channels:
- conda-forge
dependencies:
- python
- scikit-learn
- pandas>=0.18.1
- numpy>=1.11.0
- coincbc
- tqdm
- pip
- pip:
- -r requirements.txt
4 changes: 2 additions & 2 deletions setup.py
@@ -8,9 +8,9 @@

setuptools.setup(
name="tsam",
version="2.1.0",
version="2.2.2",
author="Leander Kotzur, Maximilian Hoffmann",
author_email="l.kotzur@fz-juelich.de, max.hoffmann@fz-juelich.de",
author_email="leander.kotzur@googlemail.com, max.hoffmann@fz-juelich.de",
description="Time series aggregation module (tsam) to create typical periods",
long_description=long_description,
long_description_content_type="text/markdown",
31 changes: 31 additions & 0 deletions test/test_durationRepresentation.py
@@ -89,6 +89,8 @@ def test_distributionMinMaxRepresentation():
aggregation = tsam.TimeSeriesAggregation(
raw,
noTypicalPeriods=8,
segmentation=True,
noSegments=8,
hoursPerPeriod=24,
sortValues=False,
clusterMethod="hierarchical",
@@ -110,6 +112,35 @@
)


def test_distributionRepresentation_keeps_mean():

raw = pd.read_csv(
os.path.join(os.path.dirname(__file__), "..", "examples", "testdata.csv"),
index_col=0,
)

aggregation = tsam.TimeSeriesAggregation(
raw,
noTypicalPeriods=8,
hoursPerPeriod=24,
segmentation=True,
noSegments=8,
sortValues=False,
clusterMethod="hierarchical",
representationMethod="distributionRepresentation",
distributionPeriodWise=False,
rescaleClusterPeriods=False, # even without rescaling
)

predictedPeriods = aggregation.predictOriginalData()

assert np.isclose(
raw.mean(),
predictedPeriods.mean(),
atol=1e-4
).all()




if __name__ == "__main__":
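The property that `test_distributionRepresentation_keeps_mean` above checks is worth spelling out: in a distribution ("duration curve") based representation, each cluster representative is built from the sorted member profiles, and since sorting only permutes values within a period, the total sum — and hence the mean — of the data is preserved. A minimal numpy sketch of the idea (a conceptual illustration, not tsam's actual implementation):

```python
import numpy as np

rng = np.random.default_rng(1)
values = rng.random((10, 24))          # 10 periods with 24 time steps each
labels = rng.integers(0, 3, size=10)   # a given assignment to 3 clusters

# represent each cluster by the element-wise mean of its *sorted* member
# profiles (a duration-curve representation); per-cluster totals are unchanged
rep = {j: np.sort(values[labels == j], axis=1).mean(axis=0)
       for j in np.unique(labels)}
predicted = np.stack([rep[j] for j in labels])
```

The reconstructed data keeps the overall mean exactly, even though individual time steps within each period are reordered — which is why the test can assert mean preservation even with `rescaleClusterPeriods=False`.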
63 changes: 60 additions & 3 deletions test/test_hypertuneAggregation.py
@@ -31,7 +31,7 @@ def test_optimalPair():
index_col=0,
)

datareduction=0.1
datareduction=0.01

# just take wind
aggregation_wind = tune.HyperTunedAggregations(
@@ -47,7 +47,7 @@
)

# and identify the best combination for a data reduction to ~1%.
windSegments, windPeriods= aggregation_wind.identifyOptimalSegmentPeriodCombination(dataReduction=datareduction)
windSegments, windPeriods, windRMSE= aggregation_wind.identifyOptimalSegmentPeriodCombination(dataReduction=datareduction)

# just take solar irradiation
aggregation_solar = tune.HyperTunedAggregations(
@@ -63,7 +63,7 @@
)

# and identify the best combination for a data reduction to ~1%.
solarSegments, solarPeriods = aggregation_solar.identifyOptimalSegmentPeriodCombination(dataReduction=datareduction)
solarSegments, solarPeriods, solarRMSE = aggregation_solar.identifyOptimalSegmentPeriodCombination(dataReduction=datareduction)


# according to Hoffmann et al. 2022, more segments and fewer typical days are better for solar than for wind
@@ -74,6 +74,63 @@
assert windPeriods * windSegments <= len(raw["Wind"])*datareduction
assert windPeriods * windSegments >= len(raw["Wind"])*datareduction * 0.8
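The bounds asserted here are plain arithmetic: the aggregated size (typical periods times segments per period) must fall within a tolerance band around the requested share of the original time steps. A sketch with the values assumed in this test (8760 hourly steps, a 1% reduction target, the 20% tolerance from the asserts):

```python
total_steps = 8760                      # one year of hourly data
data_reduction = 0.01                   # requested share of time steps to keep
upper = total_steps * data_reduction    # at most 87.6 aggregated time steps
lower = 0.8 * upper                     # lower edge of the tolerance band

# e.g. 10 typical periods of 8 segments each (80 steps) fall inside the band
periods, segments = 10, 8
inside = lower <= periods * segments <= upper
```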


def test_steepest_gradient_leads_to_optima():
"""
Based on a hint from Eva Simarik, check that the RMSE of the optimized
combination of segments and periods is smaller than that of a segmentation-only approach
"""

raw = pd.read_csv(
os.path.join(os.path.dirname(__file__), "..", "examples", "testdata.csv"),
index_col=0,
)

SEGMENTS_TESTED = 5

datareduction = (SEGMENTS_TESTED*365)/8760

# just take wind
tunedAggregations = tune.HyperTunedAggregations(
tsam.TimeSeriesAggregation(
raw,
hoursPerPeriod=24,
clusterMethod="hierarchical",
representationMethod="meanRepresentation",
rescaleClusterPeriods=False,
segmentation=True,
)
)

# and identify the best combination for a data reduction.
segmentsOpt, periodsOpt, RMSEOpt = tunedAggregations.identifyOptimalSegmentPeriodCombination(dataReduction=datareduction)

# test steepest
tunedAggregations.identifyParetoOptimalAggregation(untilTotalTimeSteps=365*SEGMENTS_TESTED)
steepestAggregation = tunedAggregations.aggregationHistory[-1]
RMSEsteepest = steepestAggregation.totalAccuracyIndicators()["RMSE"]

# only segments
aggregation = tsam.TimeSeriesAggregation(
raw,
noTypicalPeriods=365,
hoursPerPeriod=24,
segmentation=True,
noSegments=SEGMENTS_TESTED,
clusterMethod="hierarchical",
representationMethod="meanRepresentation",
)

RMSESegments = aggregation.totalAccuracyIndicators()["RMSE"]

assert RMSEsteepest < RMSESegments

assert np.isclose(RMSEsteepest, RMSEOpt, atol=1e-3)
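The RMSE indicator compared in these asserts can be sketched as a plain root-mean-square error over all time steps (an assumed formula for illustration; tsam's `totalAccuracyIndicators` may normalize differently):

```python
import numpy as np

def rmse(original, predicted):
    """Root-mean-square error between an original and an aggregated series."""
    original = np.asarray(original, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    return float(np.sqrt(np.mean((original - predicted) ** 2)))
```

A lower RMSE means the aggregated representation reproduces the original series more closely, which is why the test accepts the steepest-gradient path only if it matches the optimum found by the exhaustive search.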






def test_paretoOptimalAggregation():

raw = pd.read_csv(
4 changes: 4 additions & 0 deletions test/test_samemean.py
@@ -43,5 +43,9 @@ def test_samemean():
)






if __name__ == "__main__":
test_samemean()
