# MESSAGE-ix matrix improvement tool

This Jupyter notebook is a prototype of the MESSAGE-ix matrix improvement tool.
The tool aims to automatically improve the coefficient matrix quality of a MESSAGE-ix scenario, and is then used to convert the results obtained from the scaled MESSAGE-ix matrix back to their originally intended values.

This tool is derived from the tool developed by Makowski & Sosnowski, 1981 (https://pure.iiasa.ac.at/id/eprint/1766/1/CP-81-037.pdf)

According to Curtis and Reid (1972), matrix A can be described as well-scaled if:

$
\sum_{i} \sum_{j} \left( \log_{10} |a_{i,j}| \right)^2 \leq v \qquad \qquad \text{(Eq. 1)}
$

where $v$ is an acceptable threshold for the matrix quality criterion.
If $ax_{i,j} = (\log_{10} |a_{i,j}|)^2$ for each non-zero $a_{i,j}$, these entries form the matrix $Ax$.
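
As a rough illustration of Eq. 1, the sketch below sums the squared base-10 logarithms of the non-zero entries of a small dense array. The helper name `scaling_criterion` and both 2x2 matrices are hypothetical and are not taken from any MESSAGE-ix scenario.

```python
import numpy as np

def scaling_criterion(a):
    """Left-hand side of Eq. 1: sum of (log10|a_ij|)^2 over non-zero entries."""
    a = np.asarray(a, dtype=float)
    nonzero = a[a != 0]                      # zeros are skipped, as in the text
    return float(np.sum(np.log10(np.abs(nonzero)) ** 2))

# Hypothetical example matrices (assumptions for illustration only)
badly_scaled = np.array([[1e-6, 1.0], [0.0, 1e6]])
well_scaled = np.array([[0.1, 1.0], [0.0, 10.0]])

print(scaling_criterion(badly_scaled))   # about 72: (-6)^2 + 0^2 + 6^2
print(scaling_criterion(well_scaled))    # about 2:  (-1)^2 + 0^2 + 1^2
```

A smaller value means the exponents of the coefficients are clustered closer to zero, which is what the scaling steps in this notebook try to achieve.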

In [1]:
import numpy as np
import pandas as pd
from pyomo.environ import *
from datetime import datetime
import xarray as xr

import matplotlib.pyplot as plt

def showme(df):
    return df["val"].unstack()

In [2]:
# Absolute bound on the base-10 exponent of the matrix coefficients
bound = 4

def solv(df, bound):
    """
    Return the entries of a log-scale coefficient
    dataframe whose value (the log10 of the coefficient)
    is <= -bound or >= bound.
    """
    df_solv = df.loc[(df["val"] >= bound) |
                     (df["val"] <= -bound)]
    return df_solv

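def make_logdf(df):
    """
    Replace every non-zero value in df["val"] with the log10 of
    its absolute value. Modifies df in place and returns it.
    """
    df.loc[df['val']!=0,'val'] = np.log10(np.absolute(df.loc[df['val']!=0,'val']))
    return df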
    

# Load the whole matrix
matrix       = (pd.read_csv('data/matrix.csv')
               .set_index(['row','col'],drop=True)[['val']])

# calculate log base 10 of the absolute value of the matrix
log_absmatrix = matrix.copy()
log_absmatrix = make_logdf(log_absmatrix)

# Keep only the very small and very large coefficients (|exponent| >= bound)
log_absmatrix_solv  = solv(log_absmatrix,bound=bound)
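
To make the filtering step concrete, here is a minimal sketch on a three-entry toy matrix in the same `row`/`col`/`val` layout. The labels `r1`, `r2`, `c1`, `c2` and the values are made up; the real data come from `data/matrix.csv`.

```python
import numpy as np
import pandas as pd

# Hypothetical toy matrix in long format, indexed like the notebook's `matrix`
toy = pd.DataFrame(
    {"row": ["r1", "r1", "r2"], "col": ["c1", "c2", "c1"], "val": [1e-6, 2.0, 5e4]}
).set_index(["row", "col"])

# Same transformation as make_logdf: log10 of |val| for non-zero entries
log_toy = toy.copy()
nonzero = log_toy["val"] != 0
log_toy.loc[nonzero, "val"] = np.log10(log_toy.loc[nonzero, "val"].abs())

# Same filter as solv: keep exponents at or beyond the bound
bound = 4
outliers = log_toy[log_toy["val"].abs() >= bound]
print(outliers)
```

Only the entries `(r1, c1)` and `(r2, c1)` remain, since their exponents are about -6 and 4.7, while `(r1, c2)` has an exponent of about 0.3.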


## Start Looping

In [3]:
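# SCALE BY ROW
# Populate the row scale factors for every row that holds an out-of-bound
# coefficient (the objective row "_obj" is left unscaled)
RSFs = {row:[] for row in set(log_absmatrix_solv.index.get_level_values(0))-set(["_obj"])}
for k in RSFs.keys():
    rval = log_absmatrix.loc[(k),"val"]
    # Scale so that the midpoint of the row's exponent range moves to zero
    lb,ub = min(rval),max(rval)
    mid = np.mean([lb,ub])
    RSFs[k] = 10**(-mid)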

# Create DataFrame of row scaler
row_scaler = pd.DataFrame(data=RSFs, index=["val"]).transpose()
row_scaler.index.name = 'row'

# Create new matrix with scaled rows
matrix0 = matrix.copy()
index_mod = matrix0.index.get_level_values('row').isin(row_scaler.index)

matrix0.loc[index_mod] = matrix0.loc[index_mod].mul(row_scaler)
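
The midpoint rule used for each row can be checked on a single hypothetical row. This is only a sketch: the helper `midpoint_scale_factor` is not part of the notebook, and it assumes zero entries are ignored when the exponent range is computed.

```python
import numpy as np

def midpoint_scale_factor(values):
    """Return 10**(-mid), where mid is the midpoint of the log10 range of |values|."""
    values = np.asarray(values, dtype=float)
    logs = np.log10(np.abs(values[values != 0]))   # assumption: zeros are ignored
    mid = (logs.min() + logs.max()) / 2
    return 10.0 ** (-mid)

row = np.array([1e-6, 0.0, 1e2])      # hypothetical row, exponents in [-6, 2]
factor = midpoint_scale_factor(row)   # midpoint is -2, so the factor is about 1e2
scaled = row * factor                 # exponents now in [-4, 4]
print(factor, scaled)
```

After scaling, the smallest and largest exponents of the row are symmetric around zero, so the widest exponent in absolute terms drops from 6 to 4.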

In [4]:
# SCALE BY COL
# Populate the column scale factors for every column of the row-scaled matrix
# that still holds an out-of-bound coefficient ("constobj" is left unscaled)
log_absmatrix0 = matrix0.copy()
log_absmatrix0 = make_logdf(log_absmatrix0)
log_absmatrix0_solv = solv(log_absmatrix0,bound=bound)

CSFs = {col:[] for col in set(log_absmatrix0_solv.index.get_level_values(1))-set(["constobj"])}
for k in CSFs.keys():
    cval = log_absmatrix0.loc[(log_absmatrix0.index.get_level_values('col') == k),"val"]
    # Scale so that the midpoint of the column's exponent range moves to zero
    lb,ub = min(cval),max(cval)
    mid = np.mean([lb,ub])
    CSFs[k] = 10**(-mid)

# Create DataFrame of col scaler
col_scaler = pd.DataFrame(data=CSFs, index=["val"]).transpose()
col_scaler.index.name = 'col'

# Create new matrix with scaled columns
new_matrix = matrix0.copy()
index_mod = new_matrix.index.get_level_values('col').isin(col_scaler.index)
new_matrix.loc[index_mod] = new_matrix.loc[index_mod].mul(col_scaler)

In [5]:
def report(text,df):
    log_absdf = df.copy()
    log_absdf.loc[log_absdf['val']!=0,'val']=(np.log10(
        np.absolute(
            log_absdf.loc[log_absdf['val']!=0,'val'])))
    
    # Print the exponent range, truncated to integers
    print(f"{text}:","[",int(log_absdf["val"].min()),",",int(log_absdf["val"].max()),"]")

report("Original value",matrix)
report("Row scaling val",matrix0)
report("New Matrix",new_matrix)

Original value: [ -6 , 6 ]
Row scaling val: [ -4 , 4 ]
New Matrix: [ -3 , 3 ]
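
The cells above stop at the scaled matrix. For the second purpose named in the introduction, converting results back to their original values, one standard relation can be sketched as follows. Assume the scaled matrix is $A' = R A C$ with diagonal $R$ (row factors) and $C$ (column factors). Then a solution $x'$ of the scaled system maps back as $x = C x'$. The example uses a small square linear system solved with `numpy.linalg.solve` as a stand-in for the actual optimization problem, and all numbers are hypothetical.

```python
import numpy as np

# Hypothetical 2x2 system A x = b (stand-in for the real problem)
A = np.array([[1e-4, 2.0], [3.0, 5e3]])
b = np.array([1.0, 2.0])

# Hypothetical row and column scale factors (diagonals of R and C)
r = np.array([1e2, 1e-2])
c = np.array([1e1, 1e-1])

# Scaled system: A' = R A C and b' = R b
A_scaled = r[:, None] * A * c[None, :]
b_scaled = r * b

# Solve in scaled space, then map back to the original variables: x = C x'
x_scaled = np.linalg.solve(A_scaled, b_scaled)
x = c * x_scaled

print(np.allclose(A @ x, b))  # the unscaled solution satisfies the original system
```

In this sketch each variable is multiplied by its own column factor. How the objective row and dual values are treated in the actual tool is not covered here.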


## Modify below later

np.log10(524288.0)