# Data-based problem

If a representation of the Pareto efficient front of the problem to be optimized has already been computed, the problem can be defined as a ScalarDataProblem.

Suppose we have a problem with 2 decision variables and 4 objectives. In this case, it is the river pollution problem as defined in https://ieeexplore.ieee.org/document/35354.

The computed Pareto efficient solutions and the corresponding objective vector values are stored in the file 'riverpollution.dat'. There are 500 entries in total. Begin by importing the relevant classes and loading the data.

In [None]:
from desdeo.problem.Problem import ScalarDataProblem

import numpy as np

data = np.loadtxt("./data/riverpollution.dat")

The first 2 entries of each row are the decision variables, and the last 4 the objective function values.

In [None]:
xs, fs = data[:, 0:2], data[:, 2:]
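As a quick sanity check of the column split, the following sketch applies the same slicing to a small synthetic array. The array `demo` and its values are made up for illustration and are not taken from 'riverpollution.dat'; only the layout (2 decision variables followed by 4 objectives per row) mirrors the description above.

```python
import numpy as np

# Synthetic stand-in for the loaded data: 3 rows, each with
# 2 decision variables followed by 4 objective values.
# Illustrative values only; not taken from riverpollution.dat.
demo = np.arange(18, dtype=float).reshape(3, 6)

# Same slicing as used for the real data.
demo_xs, demo_fs = demo[:, 0:2], demo[:, 2:]

print(demo_xs.shape)  # (3, 2)
print(demo_fs.shape)  # (3, 4)
```

If the real file has the expected layout, `xs.shape` and `fs.shape` should come out as `(500, 2)` and `(500, 4)` respectively.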

The problem can now be defined:

In [None]:
problem = ScalarDataProblem(xs, fs)

That's it. The problem is now defined and can be used further. Notice that there are no constraints: all the entries in the data file are assumed to be feasible. The warning arises because the data is discrete, so evaluations for arbitrary decision variable values have to be approximated somehow. At the moment, the data is searched for the decision vector closest to the given one. Later on, a surrogate model might be built from the data to give a better approximation.
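The closest-point approximation described above can be illustrated with a minimal NumPy sketch. This is not the actual implementation inside ScalarDataProblem: the helper `nearest_evaluate` is hypothetical, and the use of Euclidean distance is an assumption made for the example.

```python
import numpy as np


def nearest_evaluate(xs, fs, x):
    """Return the objective vector of the stored decision vector closest to x.

    Hypothetical helper for illustration; Euclidean distance is an assumption.
    """
    xs = np.asarray(xs, dtype=float)
    fs = np.asarray(fs, dtype=float)
    x = np.asarray(x, dtype=float)

    # Distance from x to every stored decision vector (one per row of xs).
    distances = np.linalg.norm(xs - x, axis=1)

    # Index of the closest stored decision vector.
    idx = int(np.argmin(distances))
    return fs[idx]


# Tiny made-up data set: 3 decision vectors with 2 objectives each.
demo_xs = np.array([[0.0, 0.0], [0.5, 0.5], [1.0, 1.0]])
demo_fs = np.array([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])

print(nearest_evaluate(demo_xs, demo_fs, [0.4, 0.5]))  # [3. 4.]
```

Because the lookup returns the objective values of an existing data point, the result is exact only when the queried decision vector is itself in the data; otherwise the quality of the approximation depends on how densely the data covers the decision space.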

In [None]:
print("N of objectives:", problem.n_of_objectives)
print("N of variables:", problem.n_of_variables)

print("Single decision vector evaluation:", problem.evaluate([0.4, 0.5]))