# Tutorial: Using tsam with oemof.tabular

@oemof anniversary meeting 2024

by srhbrnds

## Useful literature

- https://energyinformatics.springeropen.com/articles/10.1186/s42162-022-00208-5 (Blanke et al.: **Time series aggregation for energy system design: review and extension of modelling seasonal storages**, 2022)

- https://www.mdpi.com/1996-1073/13/3/641 (Kotzur et al.: **A Review on Time Series Aggregation Methods for Energy System Models**, 2020)

- https://www.sciencedirect.com/science/article/abs/pii/S0306261922004342 (Hoffmann et al.: **The Pareto-Optimal Temporal Aggregation of Energy System Models**, 2022)

- **oemof-solph example**: https://github.com/oemof/oemof-solph/tree/feature/integrate_tsam/examples/tsam

- **oemof-tabular example**: https://github.com/oemof/oemof-tabular/tree/features/add-tsam-to-datapackage/src/oemof/tabular/examples/datapackages/dispatch_tsam_without_multi_periods

## Limitations
- Segmentation is not supported yet.
- It currently works only on experimental branches of oemof.solph and oemof.tabular.
- Python versions: 3.9 and 3.10.

## How it works

### oemof.tabular example

#### Make imports

In [36]:
import os
import time

import pandas as pd
import tsam.timeseriesaggregation as tsam

from pathlib import Path

import oemof_tsam_helpers as helpers

from oemof.solph import EnergySystem, Model, processing

# DONT REMOVE THIS LINE!
from oemof.tabular import datapackage  # noqa
from oemof.tabular.constraint_facades import CONSTRAINT_TYPE_MAP
from oemof.tabular.facades import TYPEMAP
from oemof.tabular.postprocessing import calculations

#### Set scenario names for origin and target (tsam)

In [73]:
datapackage_name_origin = "dispatch"
datapackage_name_tsam = "dispatch_tsam"

#### Specify paths for datapackage origin and target (tsam)

In [74]:
# specify the paths to the origin and target datapackages
datapackage_path_origin = helpers.DATAPACKAGE_DIR / datapackage_name_origin
datapackage_path_tsam = helpers.DATAPACKAGE_DIR / datapackage_name_tsam

print('Origin path:', datapackage_path_origin)
print('Origin path exists?:', datapackage_path_origin.exists())

print('Target path:', datapackage_path_tsam)
print('Target path exists?:', datapackage_path_tsam.exists())

Origin path: /home/sarah/git_repos/oemof_tsam_tutorial/dispatch
Origin path exists?: True
Target path: /home/sarah/git_repos/oemof_tsam_tutorial/dispatch_tsam
Target path exists?: True


#### Prepare time series data

All time series data must be stored in a single DataFrame. For the default settings, see oemof_tsam_helpers.py.

In [68]:
"""Crawl the sequences csv-files in the data/sequences path of an oemof-tabular datapackage and merge them into one single DataFrame.
    Parameters
    ----------
    path : Path
        The path object pointing to the datapackage JSON-file.

    Returns
    -------
    profiles : pd.DataFrame
        DataFrame that contains all sequence data specified in the
        oemof-tabular datapackage.

    file_columns : dict
        Dictionary containing the file paths of the csv-files in the sequences
        path as keys and the column names of each csv-file as values."""

sequences, sequence_dict = helpers.crawl_sequences_data(path=datapackage_path_origin / "data" / "sequences")

sequences.head()

| timeindex | electricity-load-profile | pv-profile | wind-profile |
|---|---|---|---|
| 2050-01-01T00:00:00Z | 7.5e-05 | 0.0 | 0.147532 |
| 2050-01-01T01:00:00Z | 7.1e-05 | 0.0 | 0.184181 |
| 2050-01-01T02:00:00Z | 6.9e-05 | 0.0 | 0.223937 |
| 2050-01-01T03:00:00Z | 6.7e-05 | 0.0 | 0.255732 |
| 2050-01-01T04:00:00Z | 6.6e-05 | 0.0 | 0.26844 |
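To make the merging step more concrete, here is a minimal sketch of how several sequence files could be combined into one DataFrame. It is not the code of `helpers.crawl_sequences_data`: the function name `merge_sequence_files`, the `;` separator and the two demo files are assumptions made for illustration only.

```python
import tempfile
from pathlib import Path

import pandas as pd


def merge_sequence_files(path, sep=";"):
    """Merge all csv-files in `path` into one DataFrame (hypothetical helper)."""
    frames = []
    file_columns = {}
    for csv_file in sorted(Path(path).glob("*.csv")):
        df = pd.read_csv(csv_file, sep=sep, index_col=0)
        # remember which columns came from which file
        file_columns[csv_file.name] = list(df.columns)
        frames.append(df)
    # place the columns of all files side by side, aligned on the time index
    profiles = pd.concat(frames, axis=1)
    return profiles, file_columns


# demo with two tiny, made-up sequence files
with tempfile.TemporaryDirectory() as tmp:
    Path(tmp, "load_profile.csv").write_text(
        "timeindex;electricity-load-profile\n"
        "2050-01-01T00:00:00Z;7.5e-05\n"
        "2050-01-01T01:00:00Z;7.1e-05\n"
    )
    Path(tmp, "volatile_profile.csv").write_text(
        "timeindex;pv-profile;wind-profile\n"
        "2050-01-01T00:00:00Z;0.0;0.147532\n"
        "2050-01-01T01:00:00Z;0.0;0.184181\n"
    )
    demo_profiles, demo_file_columns = merge_sequence_files(tmp)

print(demo_profiles)
print(demo_file_columns)
```

The dictionary of file names and column names is what later allows the aggregated DataFrame to be split back into one csv-file per original sequence file.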


#### Specify and run tsam 
See also the tsam example notebook: https://github.com/FZJ-IEK3-VSA/tsam/blob/master/examples/aggregation_example.ipynb

For all parameters, see: https://tsam.readthedocs.io/en/latest/timeseriesaggregationDoc.html#time-series-aggregation-class

In [71]:
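# configure the time series aggregation with tsam;
# the clustering itself runs later, when the typical periods are created

aggregation = tsam.TimeSeriesAggregation(
    sequences,
    noTypicalPeriods=10,  # number of typical periods (clusters)
    hoursPerPeriod=24,  # length of one period in hours
    sortValues=False,
    clusterMethod="k_means",
    rescaleClusterPeriods=False,
    extremePeriodMethod="replace_cluster_center",
    representationMethod="durationRepresentation",
)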

aggregation

<tsam.timeseriesaggregation.TimeSeriesAggregation at 0x7d8191f8f250>

#### Prepare oemof-tabular datapackage for tsam

##### Take tsa_aggregation object and derive tsa_parameters, tsa_sequences, tsa_timeindex.

    Parameters
    ----------
    tsa_aggregation: tsam.TimeSeriesAggregation
        contains all relevant parameters and values as a result of executing
        the time series aggregation

    Returns
    -------
    tsa_sequences: pd.DataFrame
        contains typical periods and data of all oemof-tabular sequences
    tsa_parameters: pd.DataFrame
        contains meta information for the oemof.solph model when using tsam
    tsa_timeindex: pd.Index
        contains the time index of the aggregated data and is used as index
        for the sequences data

In [72]:
aggregated_sequences, parameters, timeindex = helpers.prepare_oemof_parameters(aggregation)

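ValueError: n_samples=5 should be >= n_clusters=10.

The error means that the input data contains only 5 candidate periods of 24 hours, which is fewer than the 10 typical periods requested with noTypicalPeriods. The clustering only succeeds if noTypicalPeriods is at most the number of candidate periods.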
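A small guard before running the aggregation can catch this kind of failure early. The sketch below assumes an hourly resolution by default and counts only complete periods (tsam itself may treat an incomplete trailing period differently); the function names `count_candidate_periods` and `check_typical_periods` are hypothetical and not part of tsam or the helpers.

```python
def count_candidate_periods(n_timesteps, hours_per_period=24, resolution=1.0):
    """Count the complete periods in the input data (hypothetical helper).

    `resolution` is the length of one time step in hours.
    """
    timesteps_per_period = round(hours_per_period / resolution)
    return n_timesteps // timesteps_per_period


def check_typical_periods(n_timesteps, no_typical_periods, hours_per_period=24):
    """Raise early if more typical periods are requested than exist."""
    n_candidates = count_candidate_periods(n_timesteps, hours_per_period)
    if no_typical_periods > n_candidates:
        raise ValueError(
            f"noTypicalPeriods={no_typical_periods} exceeds the "
            f"{n_candidates} candidate periods in the data."
        )
    return n_candidates


# 120 hourly time steps form 5 candidate periods of 24 hours
print(check_typical_periods(120, no_typical_periods=5))

try:
    check_typical_periods(120, no_typical_periods=10)
except ValueError as err:
    print(err)
```

In the notebook, a call such as `check_typical_periods(len(sequences), 10)` could be placed before the tsam cell.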

##### Convert and save aggregated time series dataframe into oemof-tabular sequence files

    Parameters
    ----------
    tsa_sequences: pd.DataFrame
        contains typical periods and data of all oemof-tabular sequences

    tsa_timeindex: pd.Index
        contains the time index of the aggregated data and is used as index
        for the sequences data.

    file_columns : dict
        Dictionary containing the file paths of the csv-files in the sequences
        path as keys and the column names of each csv-file as values.

    path : Path
        The Path object pointing to the oemof-tabular sequences directory
        (data/sequences) in which the tsa_profiles will be stored.

    Returns
    -------
    None

In [53]:
helpers.convert_tsa_sequences_to_oemof_sequences(aggregated_sequences, timeindex, sequence_dict, path=datapackage_path_tsam / "data" / "sequences")

##### Store tsa_parameters to path.
    
    Parameters
    ----------
    tsa_parameters : pd.DataFrame
        Contains meta information for the oemof.solph model when using tsam.
    path : Path
        The path to the oemof-tabular datapackage data/tsam.

    Returns
    -------
    None

In [54]:
helpers.store_tsa_parameter(parameters, path=datapackage_path_tsam / "data" / "tsam")
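The stored parameters describe how the typical periods relate to the original time series. In tsam, the attribute `clusterOrder` lists, for each original period, the index of the typical period that represents it. Assuming that the `order` entry of the tsa_parameters carries this mapping, the following sketch shows how a full-length series can be rebuilt from typical periods. The function name `expand_typical_periods` and the demo values are made up for illustration.

```python
import numpy as np


def expand_typical_periods(typical_periods, order):
    """Rebuild a full-length series from typical periods (hypothetical helper).

    typical_periods : array-like, shape (n_typical_periods, timesteps_per_period)
    order : for each original period, the index of its typical period
    """
    typical_periods = np.asarray(typical_periods, dtype=float)
    # pick the representing typical period for every original period
    # and flatten the result into one continuous series
    return typical_periods[np.asarray(order)].reshape(-1)


# two typical periods with three time steps each (made-up values)
typical = [[0.0, 1.0, 0.0], [0.5, 0.5, 0.5]]
# four original periods: the first and last look like typical period 0
order = [0, 1, 1, 0]

full = expand_typical_periods(typical, order)
print(full)
```

The model only has to be solved for the typical periods, while the order keeps the information needed to map results back to the original time axis.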

##### Create and store periods into oemof-tabular datapackage.

This is necessary for multi-period optimization in oemof. If no_of_periods=0, the function passes None.

    Parameters
    ----------
    tsa_timeindex: pd.Index
        Contains the time index of the aggregated data and is used as index
        for the sequences data.

    no_of_periods : int
        Number of periods used in oemof NOT in time series aggregation.

    timeincrement : int
        Timeincrement for each period and timestep to allow for
        segmentation.

    path : Path, Default: data/periods
        The Path object pointing to the tsam oemof-tabular datapackage directory
        in which the periods will be stored.

    Returns
    -------
    periods: pd.DataFrame
        Dataframe that maps timeindex, periods and timeincrement for
        each period.

In [55]:
helpers.create_oemof_periods_csv(timeindex, path=datapackage_path_tsam / "data" / "periods")
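As an illustration of the mapping described in the Returns section, the sketch below builds a small periods table for a single period. The column names `periods` and `timeincrement`, the function name `build_periods` and the demo time index are assumptions; the actual helper may organize the data differently.

```python
import pandas as pd


def build_periods(timeindex, period=0, timeincrement=1):
    """Map every time step to a period and a time increment (hypothetical helper)."""
    return pd.DataFrame(
        {"periods": period, "timeincrement": timeincrement},
        index=pd.Index(timeindex, name="timeindex"),
    )


# made-up hourly time index covering two days
demo_timeindex = pd.date_range(
    "2050-01-01", periods=48, freq=pd.Timedelta(hours=1)
)
demo_periods = build_periods(demo_timeindex)
print(demo_periods.head())
```

A constant time increment of 1 corresponds to equally long time steps; with segmentation, which is listed under the limitations, the increments would differ between time steps.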

##### Copy data from the origin directory to the goal directory if the goal directory does not exist.

    Parameters
    ----------
    origin_path : Union[str, Path]
        The path to the origin directory to copy from. Defaults to elements_original_path.
    goal_path : Union[str, Path]
        The path to the goal directory to copy to. Defaults to elements_path.

    Returns
    -------
    None

In [56]:
helpers.copy_elements_data(origin_path=datapackage_path_origin / "data" / "elements", goal_path=datapackage_path_tsam / "data" / "elements")
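The copy-if-missing behaviour described above can be sketched with the standard library. This is not the implementation of `helpers.copy_elements_data`; the function name `copy_if_missing` and the demo directories are hypothetical.

```python
import shutil
import tempfile
from pathlib import Path


def copy_if_missing(origin_path, goal_path):
    """Copy a directory tree only if the goal directory does not exist yet."""
    origin_path, goal_path = Path(origin_path), Path(goal_path)
    if goal_path.exists():
        # leave an existing goal directory untouched
        return False
    shutil.copytree(origin_path, goal_path)
    return True


# demo in a temporary directory with a made-up elements file
with tempfile.TemporaryDirectory() as tmp:
    origin = Path(tmp, "dispatch", "data", "elements")
    goal = Path(tmp, "dispatch_tsam", "data", "elements")
    origin.mkdir(parents=True)
    (origin / "bus.csv").write_text("name;type\nbus0;bus\n")

    first_call = copy_if_missing(origin, goal)
    second_call = copy_if_missing(origin, goal)
    copied_file_exists = (goal / "bus.csv").exists()

print(first_call, second_call, copied_file_exists)
```

Because an existing goal directory is never overwritten, changes made to the elements of the origin datapackage are not picked up until the goal directory is removed.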

##### Copy and append datapackage.json

Copy datapackage.json from the origin directory to the goal directory if the goal directory does not exist, and append the following resource for the tsa_parameters:

    """
    resource= {
            "path": "data/tsam/tsa_parameters.csv",
            "profile": "tabular-data-resource",
            "name": "tsa_parameters",
            "format": "csv",
            "mediatype": "text/csv",
            "encoding": "utf-8",
            "schema": {
                "fields": [
                    {
                        "name": "period",
                        "type": "integer",
                        "format": "default"
                    },
                    {
                        "name": "timesteps_per_period",
                        "type": "integer",
                        "format": "default"
                    },
                    {
                        "name": "order",
                        "type": "array",
                        "format": "default"
                    }
                ],
                "missingValues": [
                    ""
                ]
            }
        }
    """

In [57]:
helpers.copy_and_append_datapackage_json(origin_path=datapackage_path_origin, goal_path=datapackage_path_tsam)
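Appending a resource to a datapackage.json can be sketched with the `json` module. The following is an illustration under assumptions, not the code of `helpers.copy_and_append_datapackage_json`: the function name `append_resource`, the demo package and the shortened resource (without its schema) are made up.

```python
import json
import tempfile
from pathlib import Path


def append_resource(origin_json, goal_json, resource):
    """Copy a datapackage.json and add one resource (hypothetical helper)."""
    package = json.loads(Path(origin_json).read_text(encoding="utf-8"))
    resources = package.setdefault("resources", [])
    # avoid duplicates if the resource was appended before
    if all(r.get("name") != resource["name"] for r in resources):
        resources.append(resource)
    goal_json = Path(goal_json)
    goal_json.parent.mkdir(parents=True, exist_ok=True)
    goal_json.write_text(json.dumps(package, indent=4), encoding="utf-8")
    return package


# shortened version of the tsa_parameters resource shown above
tsa_resource = {
    "path": "data/tsam/tsa_parameters.csv",
    "profile": "tabular-data-resource",
    "name": "tsa_parameters",
}

with tempfile.TemporaryDirectory() as tmp:
    origin_json = Path(tmp, "dispatch", "datapackage.json")
    goal_json = Path(tmp, "dispatch_tsam", "datapackage.json")
    origin_json.parent.mkdir(parents=True)
    origin_json.write_text(json.dumps({"name": "dispatch", "resources": []}))

    append_resource(origin_json, goal_json, tsa_resource)
    written = json.loads(goal_json.read_text(encoding="utf-8"))

print([r["name"] for r in written["resources"]])
```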



#### Create oemof-Model from datapackage

In [59]:
scenario_name = datapackage_name_tsam

In [60]:
datapackage_path = Path.cwd() / scenario_name
results_path = datapackage_path / "results"

# create the results directory if it does not exist yet
results_path.mkdir(parents=True, exist_ok=True)

print(results_path)

/home/sarah/git_repos/oemof_tsam_tutorial/dispatch_tsam/results


In [61]:
# create energy system object
es = EnergySystem.from_datapackage(
    os.path.join(datapackage_path, "datapackage.json"),
    attributemap={},
    typemap=TYPEMAP,
)



In [62]:
# create model from energy system (this is just oemof.solph)
m = Model(es)

# add constraints from datapackage to the model
m.add_constraints_from_datapackage(
    os.path.join(datapackage_path, "datapackage.json"),
    constraint_type_map=CONSTRAINT_TYPE_MAP,
)

# if you want dual variables / shadow prices uncomment line below
# m.receive_duals()

startTime = time.time()
# select solver 'gurobi', 'cplex', 'glpk' etc
m.solve('cbc')
executionTime = time.time() - startTime

es.params = processing.parameter_as_dict(es)
es.results = m.results()

# postprocess the results and write them to a csv-file in oemof-tabular
# format
postprocessed_results = calculations.run_postprocessing(es)
postprocessed_results.to_csv(os.path.join(results_path, "results.csv"))
executionTime

  flow_dict[flow] = ts.interpolate(method="pad").reset_index("timestep")

5.034213304519653

#### Compare results

In [63]:
results_origin = pd.read_csv(datapackage_path_origin / "results" / "results.csv")
results_tsam = pd.read_csv(datapackage_path_tsam / "results" / "results.csv")

# add the tsam results as a new column next to the original results
results_origin["var_value_tsam"] = results_tsam["var_value"]
results_origin.drop(columns=["region", "type", "carrier", "tech"], inplace=True)
results_origin

| Unnamed: 0 | name | var_name | var_value | var_value_tsam |
|---|---|---|---|---|
| 0 | coal | flow_out_bus0 | 0.0 | 0.0 |
| 1 | coal | summed_variable_costs_out_bus0 | 0.0 | 0.0 |
| 2 | gas | flow_out_bus1 | 0.0 | 0.0 |
| 3 | gas | summed_variable_costs_out_bus1 | 0.0 | 0.0 |
| 4 | lignite | flow_out_bus0 | 5981.578947 | 5981.578933 |
| 5 | lignite | summed_variable_costs_out_bus0 | 119631.578946 | 119631.578669 |
| 6 | el-storage1 | flow_in_bus0 | 0.0 | 0.0 |
| 7 | el-storage2 | flow_in_bus0 | 0.0 | 0.0 |
| 8 | el-storage1 | flow_out_bus0 | 0.0 | 0.0 |
| 9 | el-storage2 | flow_out_bus0 | 0.0 | 0.0 |
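To quantify how close the two result sets are, the deviation can be computed per row. The sketch below uses three rows from the table above and joins the two result sets on `name` and `var_name` instead of relying on identical row order; this join is a suggested variation, not what the notebook does.

```python
import pandas as pd

# three rows taken from the comparison above
origin = pd.DataFrame(
    {
        "name": ["coal", "lignite", "lignite"],
        "var_name": [
            "flow_out_bus0",
            "flow_out_bus0",
            "summed_variable_costs_out_bus0",
        ],
        "var_value": [0.0, 5981.578947, 119631.578946],
    }
)
# same rows from the tsam run, deliberately in a different order
tsam_results = pd.DataFrame(
    {
        "name": ["lignite", "lignite", "coal"],
        "var_name": [
            "summed_variable_costs_out_bus0",
            "flow_out_bus0",
            "flow_out_bus0",
        ],
        "var_value": [119631.578669, 5981.578933, 0.0],
    }
)

# match rows by their keys, not by their position
comparison = origin.merge(
    tsam_results, on=["name", "var_name"], suffixes=("", "_tsam")
)
comparison["abs_deviation"] = (
    comparison["var_value_tsam"] - comparison["var_value"]
).abs()
# relative deviation is left as NaN where the reference value is zero
reference = comparison["var_value"].abs()
comparison["rel_deviation"] = comparison["abs_deviation"] / reference.where(
    reference > 0
)
print(comparison)
```

For the rows shown here, the relative deviation between the original and the aggregated run stays below 1e-8.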
