This notebook implements "grey fixed weight clustering".

[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](http://colab.research.google.com/github/liviucotfas/grey-systems-book/blob/main/grey-fixed-weight-clustering.ipynb)

Note: the implementation is inspired by the one in the "Grey Modeling Software (version 6)".

## Load the Excel file with the necessary data

**Step 1:** Download the Excel template and adjust the object parameters (Sheet 1), the whitenization weight functions (Sheet 2) and the weight of parameters (Sheet 3). The template is available at: https://github.com/liviucotfas/grey-systems-book/raw/main/grey-fixed-weight-clustering-template.xls .

Note: if you only wish to test this notebook, you can leave the default values in the Excel file.

In [None]:
import pandas as pd

**Step 2:** If you are using **Google Colab**, upload the Excel file that you downloaded at **Step 1** when prompted. Otherwise, if you are running the notebook locally, set the `file_name` variable to the path of the Excel file.

Note: if you get an error related to the version of the `xlrd` library while running the code below, please uncomment the first line of the cell `!pip install --upgrade xlrd`.

In [None]:
# !pip install --upgrade xlrd # Uncomment this line if you get an error related to xlrd while running this cell.

if 'google.colab' in str(get_ipython()): # Check if the code is running in Google Colab.
    from google.colab import files
    uploaded = files.upload()
    file_name = list(uploaded.keys())[0]
else: # The notebook is running locally
    file_name = "https://github.com/liviucotfas/grey-systems-book/raw/main/grey-fixed-weight-clustering-template.xls"

df_objects = pd.read_excel(file_name)
df_white_weight_functions = pd.read_excel(file_name, "Sheet2")
df_ratios = pd.read_excel(file_name, "Sheet3")

**Step 3:** You can check below if the values for the object parameters (Sheet 1), the whitenization weight functions (Sheet 2) and the weight of parameters (Sheet 3) have been loaded correctly.

(Sheet 1) Sequence of object parameters' values

In [None]:
print(df_objects)

(Sheet 2) Whitenization weight functions

In [None]:
print(df_white_weight_functions)

(Sheet 3) Weight of parameters

In [None]:
print(df_ratios)

**Step 4:** The results will be displayed in the **Output** section below (located after the **Implementation** section).

## Implementation

In [None]:
class WhitenizationWeightFunction(object):
    """Class used for representing whitenization weight functions."""

    def __init__(self, function_as_string: str):
        # Split the string into turning points and strip any surrounding whitespace.
        self.turning_points = [s.strip() for s in function_as_string.split(",")]
        # Validate the number of turning points.
        if len(self.turning_points) != 4:
            raise ValueError("The string should contain 4 turning points separated by `,`: x,x,x,x")
    def __call__(self, value : float) -> float:
        # 1. Whitenization weight function of lower measure
        if self.turning_points[0] == "-" and self.turning_points[1] == "-" and self.turning_points[2] != "-" and self.turning_points[3] != "-":
            return self.type1(value)
        # 2. Whitenization weight function of upper measure
        elif self.turning_points[0] != "-" and self.turning_points[1] != "-" and self.turning_points[2] == "-" and self.turning_points[3] == "-":
            return self.type2(value)
        # 3. Whitenization weight function of moderate measure
        elif self.turning_points[0] != "-" and self.turning_points[1] != "-" and self.turning_points[2] == "-" and self.turning_points[3] != "-":
            return self.type3(value)
        # 4. Typical whitenization weight function
        elif self.turning_points[0] != "-" and self.turning_points[1] != "-" and self.turning_points[2] != "-" and self.turning_points[3] != "-":
            return self.type4(value)
        else:
            raise ValueError("Unsupported combination of turning points: " + ",".join(self.turning_points))
    def type1(self, value):
        if value < 0.0 or value > float(self.turning_points[3]):
            return 0.0
        elif value < float(self.turning_points[2]) and value >= 0.0:
            return 1.0
        elif value >= float(self.turning_points[2]) and value < float(self.turning_points[3]):
            return (float(self.turning_points[3]) - value) / (float(self.turning_points[3]) - float(self.turning_points[2]))
        else:
            return 0
    def type2(self, value):
        if value < float(self.turning_points[0]):
            return 0.0
        elif value >= float(self.turning_points[0]) and value < float(self.turning_points[1]):
            return (value - float(self.turning_points[0])) / (float(self.turning_points[1]) - float(self.turning_points[0]))
        elif value >= float(self.turning_points[1]):
            return 1.0
        else:
            return 0
    def type3(self, value):
        if value < float(self.turning_points[0]) or value > float(self.turning_points[3]):
            return 0
        elif value >= float(self.turning_points[0]) and value < float(self.turning_points[1]):
            return (value - float(self.turning_points[0])) / (float(self.turning_points[1]) - float(self.turning_points[0]))
        elif value >= float(self.turning_points[1]) and value <= float(self.turning_points[3]):
            return (float(self.turning_points[3]) - value) / (float(self.turning_points[3]) - float(self.turning_points[1]))
        else:
            return 0
    def type4(self, value):
        if value < float(self.turning_points[0]) or value > float(self.turning_points[3]):
            return 0
        elif value >= float(self.turning_points[0]) and value <= float(self.turning_points[1]):
            return (value - float(self.turning_points[0])) / (float(self.turning_points[1]) - float(self.turning_points[0]))
        elif value >= float(self.turning_points[1]) and value < float(self.turning_points[2]):
            return 1
        elif value >= float(self.turning_points[2]) and value <= float(self.turning_points[3]):
            return (float(self.turning_points[3]) - value) / (float(self.turning_points[3]) - float(self.turning_points[2]))
        else:
            return 0
# Alternative example:
# text = "-,-,841,859"
# f1 = WhitenizationWeightFunction(text)
# f1(5)
text = "-,-,300,600"
f1 = WhitenizationWeightFunction(text)
f1(600)
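
As an illustration of the lower-measure case checked at the end of the cell above, the following is a minimal standalone sketch and not the notebook's class. The name `lower_measure_weight` and its arguments `x3` and `x4` are hypothetical; the branch logic is meant to mirror `type1`, including the return value of 0 for negative inputs.

```python
def lower_measure_weight(value: float, x3: float, x4: float) -> float:
    """Lower-measure whitenization weight function for turning points "-,-,x3,x4".

    Assumes x3 < x4. Mirrors `type1` above: 1 on [0, x3), a linear descent
    on [x3, x4), and 0 elsewhere (including negative values).
    """
    if value < 0.0 or value >= x4:
        return 0.0
    if value < x3:
        return 1.0
    return (x4 - value) / (x4 - x3)


# Turning points taken from the "-,-,300,600" example above.
print(lower_measure_weight(100, 300, 600))  # flat part: 1.0
print(lower_measure_weight(450, 300, 600))  # halfway down the slope: 0.5
print(lower_measure_weight(600, 300, 600))  # right endpoint: 0.0
```

The last call corresponds to `f1(600)`, which likewise evaluates to 0 because the value sits exactly on the right turning point.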

In [None]:
# Build a list of lists of WhitenizationWeightFunction objects from df_white_weight_functions (one inner list per row; the first column is skipped).
whitenization_weight_functions = []
for i in range(df_white_weight_functions.shape[0]):
    whitenization_weight_functions.append([])
    for j in range(df_white_weight_functions.shape[1] - 1):
        function_as_string = df_white_weight_functions.iloc[i, j+1]
        whitenization_weight_functions[i].append(WhitenizationWeightFunction(function_as_string))

In [None]:
no_of_obj = len(df_objects)
no_of_obj #length1

In [None]:
no_of_param = len(df_objects.columns) - 1 # we subtract 1 because the first column includes the names
no_of_param #length2

In [None]:
no_of_functions = len(df_white_weight_functions)
no_of_functions #length3

In [None]:
# Compute the clustering coefficients

# Start by creating a list of lists filled with 0
clustering_coefficents = []
for i in range(no_of_obj):
    clustering_coefficents.append([])
    for j in range(no_of_functions): 
        clustering_coefficents[i].append(0)

# Actually compute the values
for i in range(no_of_functions):
  for j in range(no_of_obj):
    result = 0

    for k in range(no_of_param):
      value = df_objects.iloc[j, k+1] # +1 because the first column includes a description
      whitenization_weight_function = whitenization_weight_functions[i][k]

      function_value = whitenization_weight_function(value)
      ratio_value = df_ratios.iloc[0, k+1] # +1 because the first column includes a description

      result += function_value * ratio_value

    clustering_coefficents[j][i] = result
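
The nested loops above compute, for each object and each class, the sum over all parameters of the whitenization function value multiplied by the corresponding parameter weight. A compact sketch of the same computation on plain Python lists might look as follows. The function name `clustering_coefficients`, the step-shaped functions and the data are hypothetical and serve only as an illustration; they are not taken from the Excel template.

```python
def clustering_coefficients(values, functions, weights):
    """Return sigma[i][k] = sum over j of functions[k][j](values[i][j]) * weights[j].

    values:    one row per object, one entry per parameter
    functions: functions[k][j] is the weight function of class k for parameter j
    weights:   one weight per parameter
    """
    return [
        [
            sum(f(x) * w for f, x, w in zip(class_functions, row, weights))
            for class_functions in functions
        ]
        for row in values
    ]


# Hypothetical data: 2 objects, 2 parameters, 2 classes.
values = [[10.0, 40.0],
          [30.0, 20.0]]
weights = [0.6, 0.4]
functions = [
    [lambda x: 1.0 if x <= 20 else 0.0, lambda x: 1.0 if x <= 30 else 0.0],  # class "low"
    [lambda x: 1.0 if x > 20 else 0.0, lambda x: 1.0 if x > 30 else 0.0],    # class "high"
]

sigma = clustering_coefficients(values, functions, weights)
print(sigma)  # [[0.6, 0.4], [0.4, 0.6]]
```

The result is indexed as `sigma[object][class]`, the same layout used by the notebook's list of lists.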

In [None]:
cluster = []

for coefficents in clustering_coefficents:
    max_value = coefficents[0] # avoid shadowing the built-in max()
    index = 0
    for i in range(1, len(coefficents)):
        if coefficents[i] > max_value:
            max_value = coefficents[i]
            index = i
    cluster.append(index)

In [None]:
df_results = df_objects.copy() # Create a copy of the original objects.
df_results["cluster"] = cluster # Add the cluster column

## Output

Clustering coefficients

In [None]:
clustering_coefficents

Cluster for each object

In [None]:
df_results

Objects in each cluster

In [None]:
for i in range(no_of_functions):
    cluster_index = i+1
    object_indexes = str(list(df_results[df_results['cluster'] == i].index + 1)) # +1 so that the object indexes start from 1
    print(str(cluster_index) + ": " + object_indexes)
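
The loop above lists, for each cluster, the 1-based positions of the objects assigned to it. The same grouping can be sketched on a plain list of 0-based cluster assignments, as below. The helper name `objects_per_cluster` and the sample assignments are hypothetical.

```python
def objects_per_cluster(assignments, no_of_clusters):
    """Map each 1-based cluster number to the 1-based indexes of its objects.

    assignments: 0-based cluster index of each object, in object order.
    Clusters without any object are kept with an empty list.
    """
    groups = {k + 1: [] for k in range(no_of_clusters)}
    for object_index, cluster_index in enumerate(assignments, start=1):
        groups[cluster_index + 1].append(object_index)
    return groups


# Hypothetical assignments for 4 objects and 3 clusters.
for cluster_number, members in objects_per_cluster([0, 2, 0, 1], 3).items():
    print(str(cluster_number) + ": " + str(members))
```

Keeping empty clusters in the dictionary matches the loop above, which prints a line for every cluster even when no object is assigned to it.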