# Module - Benchmarking
Ontime provides a Benchmark class that runs a set of prediction models against a set of datasets and reports timing and accuracy metrics for each model/dataset pair.

In [1]:
# Add the src directory to the path so the ontime package can be imported
import sys
sys.path.insert(0, '../../../src')

from ontime.module.benchmarking.benchmark import Benchmark

Datasets submitted to a Benchmark must be of type TimeSeries and models must implement AbstractModel. To ensure that, you may have to create a simple wrapper class for the models, as shown below.

In [2]:
from ontime.core.model.abstract_model import AbstractModel

# A simple wrapper ensuring that the wrapped model exposes fit() and predict() methods
class ModelWrapper(AbstractModel):
    def __init__(self, model):
        self.model = model

    def fit(self, dataset):
        self.model.fit(dataset)

    def predict(self, horizon):
        return self.model.predict(horizon)
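
A wrapper becomes more useful when the underlying model does not already expose fit() and predict(). The sketch below illustrates that case without depending on Ontime: AbstractModelSketch is a stand-in that assumes AbstractModel only requires fit() and predict(), and LastValueForecaster with its train() and forecast() methods is a hypothetical model invented for this example.

```python
from abc import ABC, abstractmethod


class AbstractModelSketch(ABC):
    """Stand-in for ontime's AbstractModel (assumed interface: fit and predict)."""

    @abstractmethod
    def fit(self, dataset):
        ...

    @abstractmethod
    def predict(self, horizon):
        ...


class LastValueForecaster:
    """Hypothetical model whose method names differ from fit/predict."""

    def train(self, values):
        # Remember the most recent observation
        self.last = values[-1]

    def forecast(self, steps):
        # Repeat the last observation for every future step
        return [self.last] * steps


class LastValueWrapper(AbstractModelSketch):
    """Adapter mapping fit/predict onto the wrapped model's own method names."""

    def __init__(self, model):
        self.model = model

    def fit(self, dataset):
        self.model.train(dataset)

    def predict(self, horizon):
        return self.model.forecast(horizon)


wrapped = LastValueWrapper(LastValueForecaster())
wrapped.fit([112, 118, 132])
forecast = wrapped.predict(2)
print(forecast)
```

With Ontime installed, the same adapter would presumably inherit from AbstractModel instead of the stand-in class.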

In [3]:
from darts.models import ARIMA
from darts.models import BATS
from ontime.module.data.datasets import DatasetLoader

m1 = ModelWrapper(ARIMA(p=12, d=1, q=2))
m2 = ModelWrapper(BATS(use_trend=True))

d1 = DatasetLoader.AirPassengersDataset.load()
d2 = DatasetLoader.AusBeerDataset.load()

Models and datasets can be added to a Benchmark upon its instantiation or afterwards with add_model() and add_dataset(). The add_model() and add_dataset() methods also let you name each model and dataset; these names are then used when generating the report.

In [4]:
# adding directly
b1 = Benchmark([d1, d2], [m1, m2])

# adding one by one, with names
b2 = Benchmark()
b2.add_dataset(d1, "Air Passengers")
b2.add_dataset(d2, "Aus Beer")
b2.add_model(m1, "ARIMA")
b2.add_model(m2, "BATS")

Once the models and datasets have been added, the run() method trains an instance of every model on every dataset individually and generates metrics for each pair. Setting the verbose parameter to True prints the status and results of the process as it progresses.

In [5]:
b1.run(verbose=True)
print("------------------------------------------------")
b2.run(verbose=False)

Starting evaluation...
Evaluation for model 1
on dataset 1 
train 

  warn('Non-stationary starting autoregressive parameters'


done, took 0.7006685733795166
infer done, took 0.005942821502685547
on dataset 2 
train 

  warn('Non-stationary starting autoregressive parameters'


done, took 0.82480788230896
infer done, took 0.007047176361083984
Evaluation for model 2
on dataset 1 
train done, took 7.4535181522369385
infer done, took 0.002725362777709961
on dataset 2 
train done, took 8.802074909210205
infer done, took 0.0041124820709228516
------------------------------------------------


  warn('Non-stationary starting autoregressive parameters'
  warn('Non-stationary starting autoregressive parameters'
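
The log above shows one "train" step and one "infer" step per model/dataset pair, each with its duration in seconds. A minimal sketch of a loop that would produce this kind of record follows. It is not Ontime's implementation: run_sketch, MeanModel, the deep copy per pair and the hold-out of the last horizon points are all assumptions made for illustration.

```python
import copy
import time


class MeanModel:
    """Toy stand-in model: forecasts the mean of the training values."""

    def fit(self, train):
        self.mean = sum(train) / len(train)

    def predict(self, horizon):
        return [self.mean] * horizon


def run_sketch(datasets, models, horizon=2, verbose=False):
    """Fit a fresh copy of every model on every dataset and time both steps."""
    timings = {}
    for m, model in enumerate(models, start=1):
        for d, series in enumerate(datasets, start=1):
            # Assumption: each pair gets its own untrained copy of the model
            candidate = copy.deepcopy(model)
            # Assumption: the last `horizon` points are held out for testing
            train = series[:-horizon]

            start = time.perf_counter()
            candidate.fit(train)
            train_time = time.perf_counter() - start

            start = time.perf_counter()
            forecast = candidate.predict(horizon)
            test_time = time.perf_counter() - start

            timings[(m, d)] = {
                "training time": train_time,
                "test time": test_time,
                "forecast": forecast,
            }
            if verbose:
                print(f"model {m} on dataset {d}: "
                      f"train took {train_time}, infer took {test_time}")
    return timings


timings = run_sketch(
    [[1.0, 2.0, 3.0, 4.0], [10.0, 10.0, 10.0, 10.0]],
    [MeanModel()],
    verbose=True,
)
```

Copying the model before fitting keeps the results for one dataset from leaking into the next, which matches the description of training on each dataset individually.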


To view the results, call get_report() and print the returned value. Training and test times are given in seconds.

In [7]:
print(b1.get_report())

Model 1:
dataset 1
Results:
training time: 0.7006685733795166
test time: 0.005942821502685547
MAPE:  7.2210438134661645
dataset 2
Results:
training time: 0.82480788230896
test time: 0.007047176361083984
MAPE:  6.732026657640882

Model 2:
dataset 1
Results:
training time: 7.4535181522369385
test time: 0.002725362777709961
MAPE:  9.080914702612809
dataset 2
Results:
training time: 8.802074909210205
test time: 0.0041124820709228516
MAPE:  9.514618068126637



As mentioned above, when datasets and models have been added with names, those names replace the numeric indices in the report.

In [8]:
print(b2.get_report())

Model ARIMA:
dataset Air Passengers
Results:
training time: 1.2020156383514404
test time: 0.008520841598510742
MAPE:  7.2210438134661645
dataset Aus Beer
Results:
training time: 0.9088840484619141
test time: 0.0074732303619384766
MAPE:  6.732026657640882

Model BATS:
dataset Air Passengers
Results:
training time: 7.6037890911102295
test time: 0.0025300979614257812
MAPE:  9.080914702612809
dataset Aus Beer
Results:
training time: 8.48682188987732
test time: 0.0032923221588134766
MAPE:  9.514618068126637
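
The MAPE values in these reports are easier to interpret with the formula in hand. Assuming the reported metric follows the usual definition of mean absolute percentage error, expressed in percent, it can be sketched as below. The mape function here is written for illustration and is not taken from Ontime.

```python
def mape(actual, predicted):
    """Mean absolute percentage error, in percent.

    Undefined when any actual value is zero, since each error is
    divided by the corresponding actual value.
    """
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    total = sum(abs((a - p) / a) for a, p in zip(actual, predicted))
    return 100 * total / len(actual)


# Each forecast is off by 10% of the actual value, so the MAPE is about 10
score = mape([100.0, 200.0], [110.0, 180.0])
print(score)
```

Under this definition, a MAPE of about 7.2 means the forecasts deviate from the actual values by roughly 7.2% on average. Lower is better.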

