# Combining everything together

Now we have all the pieces needed to implement the interactive optimization process, using clustering as a surrogate and different scalarization functions.

In [1]:
%matplotlib inline
import seaborn
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import os
from ASF import ASF
from gradutil import *
from pyomo.opt import SolverFactory
seedn = 1

First, let's load all the data.

In [2]:
%%time
revenue, carbon, deadwood, ha = init_boreal()
n_revenue = nan_to_bau(revenue)
n_carbon = nan_to_bau(carbon)
n_deadwood = nan_to_bau(deadwood)
n_ha = nan_to_bau(ha)
ide = ideal(False)
nad = nadir(False)
opt = SolverFactory('glpk')

In [3]:
x = pd.concat((n_revenue, n_carbon, n_deadwood, n_ha), axis=1)
x_stack = np.dstack((n_revenue, n_carbon, n_deadwood, n_ha))

Normalize all the columns to the 0-1 scale.

In [4]:
%%time
x_norm = normalize(x.values)
x_norm_stack = normalize(x_stack)
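
The `normalize` helper comes from `gradutil` and its implementation is not shown here. As a rough illustration of what 0-1 scaling usually means, here is a minimal sketch of column-wise min-max scaling. The function name `minmax_columns` and the sample numbers are hypothetical, and whether `gradutil.normalize` scales per column or in some other way is an assumption.

```python
import numpy as np

def minmax_columns(values):
    """Scale each column of a 2D array to the range [0, 1]."""
    values = np.asarray(values, dtype=float)
    lo = values.min(axis=0)
    span = values.max(axis=0) - lo
    # Avoid division by zero: constant columns are mapped to 0.
    span = np.where(span == 0, 1.0, span)
    return (values - lo) / span

# Made-up data: two columns with very different magnitudes.
demo = np.array([[1.0, 10.0],
                 [3.0, 30.0],
                 [2.0, 20.0]])
scaled = minmax_columns(demo)
```

After scaling, both columns span the same range, so no single objective dominates the distance computations used in clustering.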

Cluster the data and calculate the corresponding weights (the number of stands in each cluster).

In [5]:
%%time 
nclust1 = 50
c, xtoc, dist = cluster(x_norm, nclust1, seedn, verbose=1)
w = np.array([sum(xtoc == i) for i in range(len(c))])
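
The weights above are simply the cluster sizes. The sketch below shows the same counting on made-up data, together with `np.bincount`, which gives the same numbers without a Python-level loop. The random points and the way centers are picked are purely illustrative assumptions and are not how `cluster` from `gradutil` works.

```python
import numpy as np

rng = np.random.default_rng(1)
points = rng.random((200, 4))                          # stand-in for x_norm
centers = points[rng.choice(200, 5, replace=False)]    # hypothetical centers

# Squared distance from every point to every center, shape (200, 5).
d2 = ((points[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
labels = d2.argmin(axis=1)                             # plays the role of xtoc

# Cluster sizes counted the same way as in the cell above...
sizes_loop = np.array([np.sum(labels == i) for i in range(len(centers))])
# ...and with a single vectorized call.
sizes_fast = np.bincount(labels, minlength=len(centers))
```

The `minlength` argument keeps the result aligned with the number of centers even if the last clusters happen to be empty.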

Calculate new cluster centers as the average of the normalized data in each cluster.

In [6]:
c_new = np.array([x_norm_stack[xtoc == i].mean(axis=0) for i in range(nclust1)])
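
To make the shapes concrete, here is a small sketch of the same averaging on a made-up array. It assumes the stacked data has the layout (stands, regimes, objectives), which is a guess based on how the array is sliced later in this notebook. The names `stack` and `labels` are hypothetical.

```python
import numpy as np

# 6 stands, 2 regimes, 3 objectives (made-up numbers).
stack = np.arange(36, dtype=float).reshape(6, 2, 3)
labels = np.array([0, 0, 1, 1, 1, 2])    # hypothetical stand-to-cluster map
n_clusters = 3

# The boolean mask selects the stands of one cluster; averaging over axis 0
# leaves one (regimes, objectives) matrix per cluster.
centers = np.array([stack[labels == i].mean(axis=0) for i in range(n_clusters)])
```

Note that an empty cluster would produce a row of NaN values here, so it is worth checking that every cluster has at least one member.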

In [7]:
%%time
ref = np.array((ide[0], 0, 0, 0))
asf = ASF(ide, nad, ref, c_new, weights=w)
opt.solve(asf.model)

In [None]:
model_to_real_values(x_stack, xtoc, asf.model)

## Trying to calculate ideal and nadir using clustering

In [None]:
%%time 
nclust2 = 1000
c2, xtoc2, dist2 = cluster(x_norm, nclust2, seedn, verbose=0)
w2 = np.array([sum(xtoc2 == i) for i in range(len(c2))])
c2_new = np.array([x_norm_stack[xtoc2 == i].mean(axis=0) for i in range(nclust2)])
c2_new_unscale = np.array([x_stack[xtoc2 == i].mean(axis=0) for i in range(nclust2)])

In [None]:
%%time
data = c2_new
weights = w2
solver = SolverFactory('glpk')
problems = []
for i in range(np.shape(data)[-1]):
    problems.append(BorealWeightedProblem(data[:, :, i], weights))
for j in range(len(problems)):
    solver.solve(problems[j].model)
payoff = [
    [np.sum(cluster_to_value(c2_new_unscale[:, :, i], res_to_list(problems[j].model), weights))
     for i in range(np.shape(data)[-1])]
    for j in range(len(problems))
]
ide_clust = np.max(payoff, axis=0)
nad_clust = np.min(payoff, axis=0)
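
The cell above builds a payoff table: row `j` holds the values of all objectives at the solution that maximizes objective `j` alone. A minimal sketch with made-up numbers shows how the ideal and nadir estimates are read from such a table when every objective is maximized.

```python
import numpy as np

# Made-up payoff table: row j = objective values at the optimum of objective j.
payoff = np.array([
    [100.0, 20.0,  5.0],
    [ 60.0, 45.0,  8.0],
    [ 40.0, 30.0, 12.0],
])

ideal = payoff.max(axis=0)   # best value seen for each objective
nadir = payoff.min(axis=0)   # worst value seen for each objective
```

The ideal values sit on the diagonal of the table. The nadir read from a payoff table is only an estimate, because the true worst value over the whole Pareto front need not appear in any of the single-objective optima.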

In [None]:
ide_clust, nad_clust

In [None]:
ide, nad

There are clear differences between the vectors. The surrogate itself cannot be updated, so there is no way to improve the attained results.

In [None]:
ide-ide_clust

In [None]:
nad-nad_clust
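
Since the four objectives are presumably measured in different units, absolute differences can be hard to compare across objectives. One option is to look at differences relative to the real values. The helper `relative_difference` and the numbers below are hypothetical and are not results from this notebook.

```python
import numpy as np

def relative_difference(reference, estimate):
    """Difference between two vectors as a fraction of the reference values."""
    reference = np.asarray(reference, dtype=float)
    estimate = np.asarray(estimate, dtype=float)
    return (reference - estimate) / np.abs(reference)

# Made-up vectors standing in for a real and a surrogate ideal.
real = np.array([200.0, 50.0, 10.0, 4.0])
surrogate = np.array([180.0, 49.0, 10.0, 3.0])
rel = relative_difference(real, surrogate)
```

With the actual data this could be called as `relative_difference(ide, ide_clust)` and `relative_difference(nad, nad_clust)`, provided none of the real values is zero.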

The interesting question now is whether the different vectors really affect the results. Even though the surrogate ideal and nadir are both more averaged than the real ones, the optimization still operates on the same averaged clusters.

## Effect of ideal and nadir

We can test this by running the same optimization (same reference point) with different ideal and nadir values. The "edges" of the Pareto front are especially interesting.
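
To see why the ideal and nadir can matter at all, it helps to look at a typical achievement scalarizing function. The implementation of the `ASF` class is not shown here, so the sketch below assumes a common Wierzbicki-style form for maximized objectives, in which each objective is scaled by `ideal - nadir`. The function `achievement`, the `rho` value, and the candidate points are all illustrative assumptions.

```python
import numpy as np

def achievement(f, ref, ideal, nadir, rho=1e-6):
    """Assumed ASF for maximized objectives; smaller values are better."""
    scaled = (np.asarray(ref, float) - np.asarray(f, float)) / (
        np.asarray(ideal, float) - np.asarray(nadir, float))
    # Worst scaled shortfall plus a small augmentation term.
    return scaled.max(axis=-1) + rho * scaled.sum(axis=-1)

# Three made-up candidate solutions with two objectives each.
candidates = np.array([[8.0, 2.0], [5.0, 4.0], [2.0, 5.0]])
ref = np.array([0.0, 0.0])

# Same reference point, two different ideal/nadir pairs.
best_wide = int(np.argmin(achievement(candidates, ref, [10, 10], [0, 0])))
best_narrow = int(np.argmin(achievement(candidates, ref, [10, 5], [0, 2])))
```

In this toy case the two scalings select different candidates (index 1 and index 0) for the same reference point, because changing the range of the second objective changes how much weight a unit of it carries.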

#### Reference to 0 0 0 0 

In [None]:
ref_test = np.array((0,0,0,0))

In [None]:
asf = ASF(ide, nad, ref_test, c_new, weights=w)
opt.solve(asf.model)
real_0 = model_to_real_values(x_stack, xtoc, asf.model)

In [None]:
asf = ASF(ide_clust, nad_clust, ref_test, c_new, weights=w)
opt.solve(asf.model)
cluster_0 = model_to_real_values(x_stack, xtoc, asf.model)

In [None]:
real_0-cluster_0

Well, there is a difference. As we can see, the real ideal and nadir give a smaller value for revenue and greater values for all the rest.

#### Reference to ideal

In this test it is important to note whether we use the real ideal or the ideal of the clusters as the reference. The results are of course different in the two cases.

In [None]:
asf = ASF(ide, nad, ide, c_new, weights=w)
opt.solve(asf.model)
real_ide = model_to_real_values(x_stack, xtoc, asf.model)

In [None]:
asf = ASF(ide_clust, nad_clust, ide_clust, c_new, weights=w)
opt.solve(asf.model)
cluster_ide = model_to_real_values(x_stack, xtoc, asf.model)

In [None]:
real_ide-cluster_ide

The differences are still big, but in a different way than before. I still don't know what to say about that.

The essence of this could be better described if we try to optimize just one objective. So let's use the ideal of the carbon objective as the reference.

In [None]:
asf = ASF(ide, nad, np.array((0,ide[1],0,0)), c_new, weights=w)
opt.solve(asf.model)
real_carbon = model_to_real_values(x_stack, xtoc, asf.model)

In [None]:
asf = ASF(ide_clust, nad_clust, np.array((0,ide_clust[1],0,0)), c_new, weights=w)
opt.solve(asf.model)
cluster_carbon = model_to_real_values(x_stack, xtoc, asf.model)

In [None]:
real_carbon-cluster_carbon

This is exactly the same as when using the ideal!?

In [None]:
real_ide - real_carbon

We even get the same point...

How about the deadwood?

In [None]:
asf = ASF(ide, nad, np.array((0,0,ide[2],0)), c_new, weights=w)
opt.solve(asf.model)
real_deadwood = model_to_real_values(x_stack, xtoc, asf.model)

In [None]:
asf = ASF(ide_clust, nad_clust, np.array((0,0,ide_clust[2],0)), c_new, weights=w)
opt.solve(asf.model)
cluster_deadwood = model_to_real_values(x_stack, xtoc, asf.model)

In [None]:
real_deadwood-cluster_deadwood

And the habitat index?

In [None]:
asf = ASF(ide, nad, np.array((0,0,0,ide[3])), c_new, weights=w)
opt.solve(asf.model)
real_ha = model_to_real_values(x_stack, xtoc, asf.model)

In [None]:
asf = ASF(ide_clust, nad_clust, np.array((0,0,0,ide_clust[3])), c_new, weights=w)
opt.solve(asf.model)
cluster_ha = model_to_real_values(x_stack, xtoc, asf.model)

In [None]:
real_ha-cluster_ha

So it looks like there are some interesting relationships between the ideals:
- For revenue and carbon, the surrogate ideal and nadir vectors differ more from the real values. Probably because of this, the results attained with reference points also differ more between the real ideal and the clustered ideal.
- For deadwood and the habitat index, the differences are smaller, so the results obtained with the reference points are also more accurate.