This notebook implements "grey fixed weight clustering".

[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](http://colab.research.google.com/github/liviucotfas/grey-systems-book/blob/main/grey-fixed-weight-clustering.ipynb)

Note: the implementation is inspired by the one in "Grey Modeling Software (version 6)".

## Load the Excel file with the necessary data

**Step 1:** Download the Excel template and adjust the object parameters (Sheet 1), the whitenization weight functions (Sheet 2) and the weights of the parameters (Sheet 3). The template is available at: https://github.com/liviucotfas/grey-systems-book/raw/main/grey-fixed-weight-clustering-template.xls

Note: if you only wish to test this notebook, you can leave the default values in the Excel file.

**Step 2:** If you are using **Google Colab**, upload the Excel file downloaded in **Step 1** when prompted. If you are running the notebook locally, set the path to the Excel file in the `file_name` variable.

Note: if you get an error related to the version of the `xlrd` library while running the code below, uncomment the first line of the cell (`!pip install --upgrade xlrd`).

In [1]:
import pandas as pd

In [2]:
# !pip install --upgrade xlrd # Uncomment this line if you get an error related to xlrd while running this cell.

if 'google.colab' in str(get_ipython()): # Check if the code is running in Google Colab.
    from google.colab import files
    uploaded = files.upload()
    file_name = list(uploaded.keys())[0]
else: # The notebook is running locally
    file_name = "https://github.com/liviucotfas/grey-systems-book/raw/main/grey-fixed-weight-clustering-template.xls"

df_objects = pd.read_excel(file_name, sheet_name=0)  # Object parameters (first sheet)
df_white_weight_functions = pd.read_excel(file_name, sheet_name="Sheet2")  # Whitenization weight functions
df_ratios = pd.read_excel(file_name, sheet_name="Sheet3")  # Weights of the parameters

**Step 3:** You can check below if the values for the object parameters (Sheet 1), the whitenization weight functions (Sheet 2) and the weight of parameters (Sheet 3) have been loaded correctly.

(Sheet 1) Sequence of object parameters' values

In [3]:
print(df_objects)

   Object\Parameter  Parameter1  Parameter2  Parameter3  Parameter4
0           Object1       22.50         4.0           0        0.00
1           Object2       79.37         6.0         600        0.75
2           Object3      144.00         7.0         300        0.75
3           Object4      300.00         6.1         189       12.00
4           Object5      456.00        12.0         250       12.00
5           Object6      189.00         8.0         700        1.50
6           Object7      369.00         8.0        1300        2.25
7           Object8     1127.11        16.2         550        3.00
8           Object9      260.00        11.0         600        1.00
9          Object10      200.00         8.0         600        1.25
10         Object11      475.00        10.0        1000        0.75
11         Object12      314.10         8.0         900        0.75
12         Object13      282.80         7.4        1300        0.50
13         Object14      240.00         8.0     

(Sheet 2) Whitenization weight functions

In [4]:
print(df_white_weight_functions)

  Whitenization weight function\Parameter    Parameter1 Parameter2  \
0          Whitenization weight function1   100,300,-,-   3,10,-,-   
1          Whitenization weight function2  50,150,-,250   2,6,-,10   
2          Whitenization weight function3    -,-,50,100    -,-,4,8   

       Parameter3     Parameter4  
0    200,1000,-,-  0.25,1.25,-,-  
1  100,600,-,1100      0,0.5,-,1  
2     -,-,300,600   -,-,0.25,0.5  
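
Each cell of Sheet 2 encodes one whitenization weight function as four comma-separated turning points, with `-` marking a turning point that is absent. As a rough sketch of how such a string maps to a function type, the pattern of missing points can be inspected as follows. The helper name `classify_turning_points` and the returned labels are illustrative and not part of the notebook's implementation.

```python
def classify_turning_points(spec: str) -> str:
    """Return the function type implied by an 'x1,x2,x3,x4' string (hypothetical helper)."""
    points = [p.strip() for p in spec.split(",")]
    if len(points) != 4:
        raise ValueError("expected 4 comma-separated turning points")
    # True where a turning point is present, False where it is "-".
    present = tuple(p != "-" for p in points)
    kinds = {
        (False, False, True, True): "lower measure",
        (True, True, False, False): "upper measure",
        (True, True, False, True): "moderate measure",
        (True, True, True, True): "typical",
    }
    if present not in kinds:
        raise ValueError(f"unsupported turning point pattern: {spec!r}")
    return kinds[present]


# Strings taken from the Parameter1 column of Sheet 2.
print(classify_turning_points("100,300,-,-"))   # upper measure
print(classify_turning_points("50,150,-,250"))  # moderate measure
print(classify_turning_points("-,-,50,100"))    # lower measure
```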


(Sheet 3) Weight of parameters

In [5]:
print(df_ratios)

  Weight\Parameter  Parameter1  Parameter2  Parameter3  Parameter4
0           Weight         0.3        0.25        0.25         0.2
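
In fixed weight clustering the parameter weights are normally expected to sum to 1, so a quick check before running the implementation can catch a mistyped value in Sheet 3. A minimal sketch, using the default weights printed above as a plain list (the variable names are illustrative):

```python
import math

weights = [0.3, 0.25, 0.25, 0.2]  # default values from Sheet 3

total = sum(weights)
# Compare with a tolerance, since floating-point sums are rarely exact.
weights_ok = math.isclose(total, 1.0, abs_tol=1e-9)
print(weights_ok)
```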


**Step 4:** The results will be displayed in the **Output** section below (located after the **Implementation** section).

## Implementation

In [6]:
class WhitenizationWeightFunction(object):
    """Class used for representing whitenization weight functions."""

    def __init__(self, function_as_string: str):
        # Split the string into turning points and strip surrounding whitespace
        self.turning_points = [s.strip() for s in function_as_string.split(",")]
        # Validate the number of turning points
        if len(self.turning_points) != 4:
            raise ValueError("The string should contain 4 turning points separated by `,`: x,x,x,x")

    def __call__(self, value : float) -> float:
        # 1. Whitenization weight function of lower measure
        if self.turning_points[0] == "-" and self.turning_points[1] == "-" and self.turning_points[2] != "-" and self.turning_points[3] != "-":
            return self.type1(value)
        # 2. Whitenization weight function of upper measure
        elif self.turning_points[0] != "-" and self.turning_points[1] != "-" and self.turning_points[2] == "-" and self.turning_points[3] == "-":
            return self.type2(value)
        # 3. Whitenization weight function of moderate measure
        elif self.turning_points[0] != "-" and self.turning_points[1] != "-" and self.turning_points[2] == "-" and self.turning_points[3] != "-":
            return self.type3(value)
        # 4. Typical whitenization weight function
        elif self.turning_points[0] != "-" and self.turning_points[1] != "-" and self.turning_points[2] != "-" and self.turning_points[3] != "-":
            return self.type4(value)
        else:
            raise ValueError("Unsupported combination of turning points: " + ",".join(self.turning_points))

    def type1(self, value):
        if value < 0.0 or value > float(self.turning_points[3]):
            return 0.0
        elif value < float(self.turning_points[2]) and value >= 0.0:
            return 1.0
        elif value >= float(self.turning_points[2]) and value < float(self.turning_points[3]):
            return (float(self.turning_points[3]) - value) / (float(self.turning_points[3]) - float(self.turning_points[2]))
        else:
            return 0
    def type2(self, value):
        if value < float(self.turning_points[0]):
            return 0.0
        elif value >= float(self.turning_points[0]) and value < float(self.turning_points[1]):
            return (value - float(self.turning_points[0])) / (float(self.turning_points[1]) - float(self.turning_points[0]))
        elif value >= float(self.turning_points[1]):
            return 1.0
        else:
            return 0
    def type3(self, value):
        if value < float(self.turning_points[0]) or value > float(self.turning_points[3]):
            return 0
        elif value >= float(self.turning_points[0]) and value < float(self.turning_points[1]):
            return (value - float(self.turning_points[0])) / (float(self.turning_points[1]) - float(self.turning_points[0]))
        elif value >= float(self.turning_points[1]) and value <= float(self.turning_points[3]):
            return (float(self.turning_points[3]) - value) / (float(self.turning_points[3]) - float(self.turning_points[1]))
        else:
            return 0
    def type4(self, value):
        if value < float(self.turning_points[0]) or value > float(self.turning_points[3]):
            return 0
        elif value >= float(self.turning_points[0]) and value <= float(self.turning_points[1]):
            return (value - float(self.turning_points[0])) / (float(self.turning_points[1]) - float(self.turning_points[0]))
        elif value >= float(self.turning_points[1]) and value < float(self.turning_points[2]):
            return 1
        elif value >= float(self.turning_points[2]) and value <= float(self.turning_points[3]):
            return (float(self.turning_points[3]) - value) / (float(self.turning_points[3]) - float(self.turning_points[2]))
        else:
            return 0
##text = "-,-,841,859"
##f1 =  WhitenizationWeightFunction(text)
##f1(5)
text = "-,-,300,600"
f1 =  WhitenizationWeightFunction(text)
f1(600)

0
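
Written out, the four cases handled by the class correspond to the piecewise definitions below, where $x_1, x_2, x_3, x_4$ denote the turning points in the order in which they appear in the string. The symbols are chosen here for illustration, and the boundary handling follows the code above.

```latex
% Lower measure, string "-,-,x3,x4"
f(x) = \begin{cases}
  1, & 0 \le x < x_3 \\
  \dfrac{x_4 - x}{x_4 - x_3}, & x_3 \le x \le x_4 \\
  0, & \text{otherwise}
\end{cases}

% Upper measure, string "x1,x2,-,-"
f(x) = \begin{cases}
  0, & x < x_1 \\
  \dfrac{x - x_1}{x_2 - x_1}, & x_1 \le x < x_2 \\
  1, & x \ge x_2
\end{cases}

% Moderate measure, string "x1,x2,-,x4"
f(x) = \begin{cases}
  \dfrac{x - x_1}{x_2 - x_1}, & x_1 \le x < x_2 \\
  \dfrac{x_4 - x}{x_4 - x_2}, & x_2 \le x \le x_4 \\
  0, & \text{otherwise}
\end{cases}

% Typical, string "x1,x2,x3,x4"
f(x) = \begin{cases}
  \dfrac{x - x_1}{x_2 - x_1}, & x_1 \le x \le x_2 \\
  1, & x_2 < x < x_3 \\
  \dfrac{x_4 - x}{x_4 - x_3}, & x_3 \le x \le x_4 \\
  0, & \text{otherwise}
\end{cases}
```

For the test call above, the string `"-,-,300,600"` evaluated at 600 gives (600 - 600) / (600 - 300) = 0, which matches the printed output.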

In [7]:
# Starting from the df_white_weight_functions create a list of lists holding WhitenizationWeightFunction 
whitenization_weight_functions = []
for i in range(df_white_weight_functions.shape[0]):
    whitenization_weight_functions.append([])
    for j in range(df_white_weight_functions.shape[1] - 1):
        function_as_string = df_white_weight_functions.iloc[i, j+1]
        whitenization_weight_functions[i].append(WhitenizationWeightFunction(function_as_string))

In [8]:
no_of_obj = len(df_objects)
no_of_obj #length1

17

In [9]:
no_of_param = len(df_objects.columns) - 1 # -1 because the first column includes the names
no_of_param #length2

4

In [10]:
no_of_functions = len(df_white_weight_functions)
no_of_functions #length3

3

In [11]:
# Compute the clustering coefficients

# Start by creating a list of lists filled with 0
clustering_coefficients = []
for i in range(no_of_obj):
    clustering_coefficients.append([])
    for j in range(no_of_functions):
        clustering_coefficients[i].append(0)

# Compute the values: for each function (class) i and object j,
# sum the weighted whitenization values over all parameters k
for i in range(no_of_functions):
    for j in range(no_of_obj):
        result = 0

        for k in range(no_of_param):
            value = df_objects.iloc[j, k+1]  # +1 because the first column holds the object names
            whitenization_weight_function = whitenization_weight_functions[i][k]

            function_value = whitenization_weight_function(value)
            ratio_value = df_ratios.iloc[0, k+1]  # +1 because the first column holds the row label

            result += function_value * ratio_value

        clustering_coefficients[j][i] = result
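
The loop above computes, for each object and each class, the sum over all parameters of the whitenization value multiplied by the parameter weight. As a hand check, the sketch below reproduces the coefficient of Object1 for the first class using the default template values. The function `upper_measure` is a simplified stand-in written for this example, not the notebook's class.

```python
def upper_measure(x: float, x1: float, x2: float) -> float:
    """Upper-measure whitenization weight function with turning points x1 < x2."""
    if x < x1:
        return 0.0
    if x < x2:
        return (x - x1) / (x2 - x1)
    return 1.0


# Object1 parameter values, the turning points of "Whitenization weight
# function1" for each parameter, and the parameter weights (default template).
object1 = [22.5, 4.0, 0.0, 0.0]
turning_points = [(100, 300), (3, 10), (200, 1000), (0.25, 1.25)]
weights = [0.3, 0.25, 0.25, 0.2]

coefficient = sum(
    upper_measure(x, x1, x2) * w
    for x, (x1, x2), w in zip(object1, turning_points, weights)
)
print(round(coefficient, 6))  # 0.035714
```

Only Parameter2 contributes here: (4 - 3) / (10 - 3) × 0.25 ≈ 0.035714, which agrees with the first entry of the clustering coefficients shown in the Output section.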

In [12]:
cluster = []

for coefficients in clustering_coefficients:
    # Find the index of the largest coefficient (the first one wins on ties)
    max_value = coefficients[0]
    index = 0
    for i in range(1, len(coefficients)):
        if coefficients[i] > max_value:
            max_value = coefficients[i]
            index = i
    cluster.append(index)
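
Each object is assigned to the class with the largest clustering coefficient, and on a tie the loop above keeps the lowest index. The same rule can be written more compactly with the built-in `max` and a `key` function, which also returns the first maximum. The sketch below applies it to a few coefficient rows copied from the default results in the Output section; the variable names are illustrative.

```python
rows = [
    [0.035714, 0.125, 1.0],        # Object1
    [0.332143, 0.68811, 0.24878],  # Object2
    [0.610714, 0.28825, 0.36875],  # Object4
]

# Index of the largest coefficient in each row (first one wins on ties).
clusters = [max(range(len(row)), key=row.__getitem__) for row in rows]
print(clusters)  # [2, 1, 0]
```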

In [13]:
# Create a copy of the original objects.
df_results = df_objects.copy()
# Add the cluster column
df_results["cluster"] = cluster

## Output

Clustering coefficients

In [17]:
df_clustering_coefficients = pd.DataFrame(clustering_coefficients)
df_clustering_coefficients

          0        1         2
0  0.035714  0.12500  1.000000
1  0.332143  0.68811  0.248780
2  0.340107  0.66950  0.312500
3  0.610714  0.28825  0.368750
4  0.765625  0.07500  0.250000
5  0.668321  0.50800  0.000000
6  0.928571  0.12500  0.000000
7  0.859375  0.22500  0.041667
8  0.765000  0.25000  0.000000
9  0.653571  0.52500  0.000000


Cluster for each object

In [15]:
df_results

  Object\Parameter  Parameter1  Parameter2  Parameter3  Parameter4  cluster
0          Object1       22.50         4.0           0        0.00        2
1          Object2       79.37         6.0         600        0.75        1
2          Object3      144.00         7.0         300        0.75        1
3          Object4      300.00         6.1         189       12.00        0
4          Object5      456.00        12.0         250       12.00        0
5          Object6      189.00         8.0         700        1.50        0
6          Object7      369.00         8.0        1300        2.25        0
7          Object8     1127.11        16.2         550        3.00        0
8          Object9      260.00        11.0         600        1.00        0
9         Object10      200.00         8.0         600        1.25        0


Objects in each cluster

In [16]:
for i in range(no_of_functions):
    cluster_index = i + 1  # Display the clusters starting from 1
    object_indexes = list(df_results[df_results['cluster'] == i].index + 1)  # +1 to have the indexes starting from 1
    print(f"{cluster_index}: {object_indexes}")

1: [4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 16]
2: [2, 3, 15]
3: [1, 17]
