## Using the models on your data

In [1]:
import os
import sys

sys.path.append("../")

### Data formatting

The data must be passed to the model as two np.ndarrays: X and Y. <br>
They need to have the same shape. X holds the criteria values of the preferred alternatives and Y those of the non-preferred alternatives, so that row i of X is preferred to row i of Y.

Let's say that we observe:
- A > B > C
- B > D
- D > C

Each observed preference becomes one (preferred, non-preferred) pair, so we will have:

X = [A, B, B, D]
Y = [B, C, D, C]
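As a minimal sketch of how these two lists can be derived, the observed rankings can be split into consecutive pairs. The helper `pairs_from_chains` is a hypothetical name used for illustration only; it is not part of the library.

```python
def pairs_from_chains(chains):
    """Split each ranking (best first) into consecutive (preferred, non-preferred) pairs."""
    preferred, non_preferred = [], []
    for chain in chains:
        # zip(chain, chain[1:]) yields each alternative with the one ranked just below it
        for better, worse in zip(chain, chain[1:]):
            preferred.append(better)
            non_preferred.append(worse)
    return preferred, non_preferred


# The three observations above: A > B > C, B > D, D > C
chains = [["A", "B", "C"], ["B", "D"], ["D", "C"]]
X_labels, Y_labels = pairs_from_chains(chains)
print(X_labels)  # ['A', 'B', 'B', 'D']
print(Y_labels)  # ['B', 'C', 'D', 'C']
```

Only directly observed pairs are kept here; pairs implied by transitivity (such as A > C) are not added.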

Now, each alternative must be replaced by its criteria values. The criteria are considered increasing, meaning that the higher the better.
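If one of your criteria is naturally decreasing (for example a price, where lower is better), it has to be flipped before use. The sketch below shows one possible way to do it, by reversing and rescaling the column. The data and the rescaling to [0, 1] are assumptions made for illustration, not a requirement stated by the library.

```python
import numpy as np

# Hypothetical raw data: column 0 is already increasing,
# column 1 is a price (lower is better).
raw = np.array([[0.5, 120.0],
                [0.2, 80.0]])

increasing = raw.copy()
price = raw[:, 1]
# Reverse the column so that the cheapest alternative gets the highest value
increasing[:, 1] = (price.max() - price) / (price.max() - price.min())
print(increasing)
```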

In [2]:
import numpy as np

A = [0.5, 0.8]
B = [0.2, 0.3]
C = [0.1, 0.35]
D = [0.5, 0.2]

X = np.stack([A, B, B, D], axis=0)
Y = np.stack([B, C, D, C], axis=0)
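A quick sanity check on the arrays before fitting can catch formatting mistakes early. This is only a sketch that repeats the data from the cell above so it can run on its own:

```python
import numpy as np

A = [0.5, 0.8]
B = [0.2, 0.3]
C = [0.1, 0.35]
D = [0.5, 0.2]

X = np.stack([A, B, B, D], axis=0)
Y = np.stack([B, C, D, C], axis=0)

# One row per preference pair, one column per criterion
assert X.shape == Y.shape
n_pairs, n_criteria = X.shape
print(n_pairs, n_criteria)  # 4 2
```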

In [3]:
from python.models import ClusterUTA, UTA
from python.heuristics import PLSHeuristic

Now, you can fit the model with .fit() and then access the learned coefficients with .coeffs:

**For the simple UTA model:**

In [None]:
model = UTA(
    n_pieces=5, # Number of linear pieces on each criterion
    epsilon=0.01, # Minimum utility difference between preference pairs
)
residual_error, u_x, u_y = model.fit(X, Y)

In [None]:
print(model.coeffs)
print(model.predict_utility(X))
print(model.predict_utility(Y))
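One way to read the predicted utilities is to check how many preference pairs are respected with at least the `epsilon` margin. The sketch below assumes that `predict_utility` returns one utility per row; the numbers are made-up placeholders standing in for its output, not results from the model.

```python
import numpy as np

epsilon = 0.01

# Placeholder values standing in for model.predict_utility(X) and
# model.predict_utility(Y); they are NOT actual model outputs.
u_x = np.array([0.90, 0.40, 0.40, 0.30])  # utilities of A, B, B, D
u_y = np.array([0.40, 0.10, 0.30, 0.10])  # utilities of B, C, D, C

# A pair is respected when the preferred alternative wins by at least epsilon
margins = u_x - u_y
satisfied = margins >= epsilon
print(f"{satisfied.sum()} of {satisfied.size} pairs respected")
```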

**For the MILO ClusterUTA:**

In [None]:
model = ClusterUTA(
    n_clusters=3, # Number of clusters
    n_pieces=5, # Number of linear pieces on each criterion
    epsilon=0.01, # Minimum utility difference between preference pairs
)
residual_error, cluster_attributions, u_x, u_y = model.fit(
    X,
    Y,
    cluster_grouping=[0, 0, 1, 2], # Use same label for pairs that must be clustered together
    time_limit=None, # Maximum time allowed for optimization in seconds
)
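The grouping labels typically come from knowing who expressed each preference pair. As a minimal sketch, assuming a hypothetical list `decision_makers` with one entry per pair (the names are invented for illustration), integer labels can be built with `np.unique`:

```python
import numpy as np

# Hypothetical: who expressed each of the four preference pairs
decision_makers = ["alice", "alice", "bob", "carol"]

# return_inverse gives, for each entry, the index of its value among the
# sorted unique values, i.e. one integer label per decision maker
_, labels = np.unique(decision_makers, return_inverse=True)
grouping = labels.tolist()
print(grouping)  # [0, 0, 1, 2]
```

The resulting list has the same form as the one passed as `cluster_grouping` above: pairs sharing a label are forced into the same cluster.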

In [None]:
print(model.coeffs)
print(model.predict_utility(X))
print(model.predict_utility(Y))

**Finally, for the PLS heuristic:**

In [8]:
model = PLSHeuristic(
    n_clusters=2, # Number of clusters
    models_class=UTA, # base model class
    models_params={"n_pieces": 1, # Number of linear pieces on each criterion
                   "epsilon": 0.01},
    n_init=10, # Number of initializations and optimizations
    max_iter_by_init=100, # Maximum number of iterations for each optimization
)
losses_by_init, best_losses = model.fit(
    X,
    Y,
    group_ids=[0, 0, 1, 2], # Use same label for pairs that must be clustered together
)

In [None]:
print(model.coeffs)
print(model.predict_utility(X))
print(model.predict_utility(Y))