## Fuzzy C-means

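Fuzzy C-means (FCM) is an unsupervised clustering algorithm widely used in machine learning and data analysis. It extends the traditional (hard) C-means algorithm, better known as k-means, with fuzzy logic principles: instead of assigning each data point to exactly one cluster, it gives every point a membership degree between 0 and 1 for each cluster, and the degrees of a point sum to 1. These degrees can be read much like probabilities, so the model provides graded answers instead of definitive predictions.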

This graded output is particularly valuable when a wrong prediction carries significant consequences. For instance, when diagnosing a patient as sick or healthy, instead of offering a binary prediction (true or false, 0 or 1), FCM produces a vector such as [0.37, 0.63], indicating a 0.37 degree (loosely, a probability) of the patient being sick and 0.63 of the patient being healthy. The final decision is then made by the doctor, who can weigh these values alongside other factors.

By avoiding hard, potentially misleading assignments, FCM helps mitigate potentially damaging consequences. It provides a more nuanced view of the data, allowing decision-makers to weigh the membership degrees and make safer, better-informed choices based on context and expertise.

### Algorithm Steps:

1. **Initialization:** Specify the number of desired clusters (c), the input dataset (X), the fuzziness parameter (m, with m > 1), and the maximum number of iterations (tmax).

2. **Membership Assignment:** Randomly select the initial cluster center coordinates between the minimums and maximums of the provided data, which speeds up convergence, and assign membership values to each data point. These values represent the degree to which each data point belongs to each cluster. The membership degree of data point $x_i$ in cluster $j$ is:
$$\mu_{ij} = \left(\sum_{k=1}^{c} \left(\frac{d(x_i, v_j)}{d(x_i, v_k)}\right)^{\frac{2}{m-1}}\right)^{-1}$$
where $d(x_i, v_j)$ is the Euclidean distance between $x_i$ and the centroid $v_j$.


3. **Centroid Update:** Update the cluster centers (centroids) as the weighted average of the data points, using the membership values raised to the power $m$ as weights. The centroid of cluster $j$ is:
$$v_j = \frac{\sum_{i=1}^{N} \left(\mu_{ij}\right)^m \cdot x_i}{\sum_{i=1}^{N} \left(\mu_{ij}\right)^m}$$
where $N$ is the number of data points.

4. **Membership Update:** Recalculate the membership values for each data point based on their proximity to the updated centroids.

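5. **Convergence:** Repeat the centroid and membership updates until a stopping criterion is met, for example when the maximum number of iterations (tmax) is reached or when the memberships change by less than a small tolerance between two iterations.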

6. **Result:** Assign final membership values to each data point, indicating the degree of belongingness to each cluster.
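
The two update formulas in the steps above can be sketched in vectorized NumPy as follows. This is a minimal illustration, not the notebook's implementation: the function names, the `eps` guard against a zero distance, and the toy data are assumptions made for the example. The membership matrix is stored with shape (c, N), so `U[j, i]` corresponds to $\mu_{ij}$.

```python
import numpy as np

def update_memberships(X, V, m, eps=1e-12):
    """Return U of shape (c, N); U[j, i] is the membership of point i in cluster j."""
    # Euclidean distance between every centroid and every point, shape (c, N).
    dist = np.linalg.norm(V[:, None, :] - X[None, :, :], axis=2)
    # Assumption: clamp tiny distances so a point lying on a centroid does not divide by zero.
    dist = np.maximum(dist, eps)
    inv = dist ** (-2.0 / (m - 1.0))
    # Normalising over clusters is algebraically the same as the formula for mu_ij.
    return inv / inv.sum(axis=0, keepdims=True)

def update_centroids(X, U, m):
    """Return V of shape (c, p): means of the points weighted by U**m."""
    W = U ** m
    return (W @ X) / W.sum(axis=1, keepdims=True)

# Toy data (hypothetical): two groups of points on a line and two starting centroids.
X = np.array([[0.0], [1.0], [9.0], [10.0]])
V = np.array([[2.0], [8.0]])

U = update_memberships(X, V, m=2.0)
V_new = update_centroids(X, U, m=2.0)
print(U.sum(axis=0))  # every column sums to 1
print(V_new)          # each centroid moves toward its own group
```

Normalising the inverse-distance terms over the clusters avoids the explicit double sum over $k$ while giving the same values.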

Fuzzy C-means allows for soft clustering, taking into consideration situations where data points may belong to multiple clusters with varying degrees of membership.
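
How soft the clustering is depends on the fuzziness parameter m. As a small worked example, assume a point lies at distance 1 from one centroid and distance 2 from another (hypothetical values). With m = 2 the membership formula gives degrees of 0.8 and 0.2; values of m close to 1 push the result toward a hard assignment, while larger values flatten it. The helper name `memberships` is an assumption for this sketch.

```python
import numpy as np

def memberships(distances, m):
    """Membership degrees of one point, given its distances to each centroid."""
    inv = np.asarray(distances, dtype=float) ** (-2.0 / (m - 1.0))
    return inv / inv.sum()

d = [1.0, 2.0]  # hypothetical distances to two centroids

for m in (1.1, 2.0, 5.0):
    # m near 1 gives almost crisp memberships; larger m gives more uniform ones.
    print(m, memberships(d, m))
```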



In [27]:
import numpy as np
from sklearn.metrics import f1_score
from sklearn.datasets import load_iris


iris = load_iris()
X = iris.data
D = iris.target
n = X.shape[0]
p = X.shape[1]
c = 3

class FCM:
    def __init__(self, c, m, tmax, X):
        '''
        c    : number of clusters
        m    : fuzziness parameter (m > 1)
        tmax : maximum number of iterations
        X    : the data, one row per data point
        '''
        self.c = c
        self.m = m
        self.p = X.shape[1]
        self.tmax = tmax
        self.X = X
        # Per-column minimums and maximums of the data; the centroids are initialised between them.
        min_ = X.min(axis=0)
        max_ = X.max(axis=0)
        # Draw each centroid coordinate uniformly inside the data range to speed up convergence.
        self.V = np.random.uniform(min_, max_, size=(c, X.shape[1]))
        # To measure the clustering performance we use labeled data, so each class must be
        # matched with the right cluster. Etqt stores that mapping: Etqt[k] is the cluster
        # assigned to class k.
        self.Etqt = []
        
    def calculer_U(self):
        '''
        Compute the membership matrix U of shape (c, X.shape[0]). Column k holds the
        membership degrees of data point k to each of the c clusters and sums to 1.
        '''
        a = -2 / (self.m - 1)
        # term_1[i][k] = ||x_k - v_i|| ** (-2 / (m - 1))
        term_1 = [[np.power(np.linalg.norm(self.X[k] - self.V[i]), a) for k in range(self.X.shape[0])] for i in range(self.c)]
        # Normalise over the clusters so that each column sums to 1.
        term_2 = np.sum(term_1, axis=0)
        self.U = np.divide(term_1, term_2)
        return self.U
    
    def calculer_V(self):
        '''
        V is the matrix of centroids, of shape (c, X.shape[1]). This method updates the
        centroids from the new membership degrees, as means of the data weighted by U**m.
        '''
        Um = self.U ** self.m
        term1 = np.dot(Um, self.X)   # weighted sums of the data points, shape (c, p)
        term2 = np.sum(Um, axis=1)   # sum of the weights of each cluster, shape (c,)
        self.V = np.divide(term1.T, term2).T
        return self.V
    def etiqueterV(self):
        '''
        To evaluate the model, we run it on labeled data and compare the predictions with the
        ground truth. To do that, each class must first be matched with a cluster: we compute
        the distances between the class means and the cluster centroids and assign to each
        class its nearest cluster.
        In the resulting array Etqt, the indexes are the classes and the values are the clusters.
        For example, Etqt = [2, 0, 1] means that the third cluster represents the first class,
        the first cluster represents the second class, and the second cluster represents the
        third class.
        '''
        # Mean of each class, computed from the global ground-truth labels D.
        class_means = [np.mean(self.X[D == i], axis=0) for i in range(self.c)]
        # Find the closest centroid to each class mean.
        self.Etqt = []
        for i in range(self.c):
            centroid_distances = [np.linalg.norm(class_means[i] - self.V[j]) for j in range(self.c)]
            closest_centroid = np.argmin(centroid_distances)
            self.Etqt.append(closest_centroid)
        
    def classe(self, vect):
        '''
        Compute the membership degrees of "vect", pick the most likely cluster with argmax,
        and return the corresponding class using the precomputed array Etqt. This lets us
        compare the predicted classes with the ground truth using a performance metric
        (here, the F1-score).
        '''
        # ||vect - v_i||**2 raised to -1/(m-1) equals ||vect - v_i|| ** (-2/(m-1)), as in calculer_U.
        term1 = [pow(sum(np.square(vect - self.V[i])), -1. / (self.m - 1)) for i in range(self.c)]
        term2 = sum(term1)
        u = [term1[i] / term2 for i in range(self.c)]
        # Etqt[k] is the cluster matched to class k, so index() maps the winning cluster back
        # to its class. It raises ValueError if no class was matched to that cluster.
        return self.Etqt.index(np.argmax(u))  # class of vect
    
    def fit(self):
        '''
        Run the training loop for tmax iterations, then match the clusters with the classes
        and predict the class of every training point, so no separate predict method is needed.
        '''
        for i in range(self.tmax):
            self.calculer_U()
            self.calculer_V()
        self.etiqueterV()
        self.y_pred = [self.classe(x) for x in self.X]
        print(np.asarray(self.y_pred))
        print(self.Etqt)
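
The `fit` method above always runs for `tmax` iterations. As a hedged alternative, the sketch below stops early once the memberships change by less than a tolerance between two iterations. The function name `fcm_with_tolerance`, the `tol` and `seed` parameters, and the toy data are assumptions for illustration and are not part of the notebook's class.

```python
import numpy as np

def fcm_with_tolerance(X, c, m=2.0, tmax=100, tol=1e-5, seed=0):
    """Run FCM and stop early when max |U - U_prev| < tol. Returns (U, V, iterations)."""
    rng = np.random.default_rng(seed)
    # Initial centroids drawn uniformly inside the bounding box of the data.
    V = rng.uniform(X.min(axis=0), X.max(axis=0), size=(c, X.shape[1]))
    U_prev = None
    t = 0
    for t in range(1, tmax + 1):
        # Membership update, shape (c, N); tiny distances are clamped to avoid dividing by zero.
        dist = np.linalg.norm(V[:, None, :] - X[None, :, :], axis=2)
        inv = np.maximum(dist, 1e-12) ** (-2.0 / (m - 1.0))
        U = inv / inv.sum(axis=0, keepdims=True)
        # Centroid update: means of the points weighted by U**m.
        W = U ** m
        V = (W @ X) / W.sum(axis=1, keepdims=True)
        # Stopping criterion: largest change in any membership degree.
        if U_prev is not None and np.max(np.abs(U - U_prev)) < tol:
            break
        U_prev = U
    return U, V, t

# Toy data (hypothetical): two small groups of 2-D points.
X_demo = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0],
                   [10.0, 10.0], [10.0, 11.0], [11.0, 10.0]])
U_demo, V_demo, n_iter = fcm_with_tolerance(X_demo, c=2)
print(n_iter)
print(V_demo)
```

Comparing successive membership matrices is only one possible criterion; the change in the centroids or in the objective function could be monitored instead.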


    
    

In [28]:
fcm = FCM(c=3, m=2, tmax=100, X=X)
print('--- Matrix of centroids before training ---')
print(np.asarray(fcm.V))
fcm.fit()
print('--- Matrix of centroids after training --- ')
print(np.asarray(fcm.V))
preds = fcm.y_pred


--- Matrix of centroids before training ---
[[5.94185194 3.88442231 2.17807531 1.33416265]
 [6.43269245 2.11148099 4.58451463 0.5092579 ]
 [4.53418573 4.27732529 6.697229   2.04015364]]
[0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
 0 0 0 0 0 0 0 0 0 0 0 0 0 2 1 2 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
 1 1 1 2 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 2 1 2 2 2 2 1 2 2 2 2
 2 2 1 2 2 2 2 2 1 2 1 2 1 2 2 1 1 2 2 2 2 2 1 2 2 2 2 1 2 2 2 1 2 2 2 1 2
 2 1]
[0, 1, 2]
--- Matrix of centroids after training --- 
[[5.00396596 3.41408886 1.48281553 0.25354632]
 [5.88893236 2.76106936 4.36395164 1.39731504]
 [6.77501122 3.05238227 5.64678178 2.05354666]]


In [25]:
fcm.U.shape

(3, 150)
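
Since each column of `U` holds the membership degrees of one data point, hard labels and a simple ambiguity flag can be derived from it. The sketch below uses a small hand-written matrix rather than the trained one, and the 0.6 threshold is an arbitrary assumption chosen for illustration.

```python
import numpy as np

# Hypothetical membership matrix of shape (c, N) = (3, 4); each column sums to 1.
U_toy = np.array([
    [0.90, 0.10, 0.40, 0.05],
    [0.07, 0.80, 0.35, 0.15],
    [0.03, 0.10, 0.25, 0.80],
])

hard_labels = U_toy.argmax(axis=0)   # most likely cluster of each point
confidence = U_toy.max(axis=0)       # membership degree of that cluster
threshold = 0.6                      # assumption: illustrative cut-off
ambiguous = confidence < threshold   # points better left to a human decision

print(hard_labels.tolist())  # [0, 1, 0, 2]
print(ambiguous.tolist())    # [False, False, True, False]
```

The third point is assigned to cluster 0 by `argmax`, but with a degree of only 0.40 it is flagged as ambiguous, which is the kind of case where the soft output is more informative than a hard label.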

In [29]:
f1 = f1_score(preds,D,average='weighted')
f1

0.8944107744107744

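We can see that the model does well on the iris dataset, reaching a weighted F1-score of about 0.89. Note that `f1_score` expects its arguments in the order `(y_true, y_pred)`, whereas the call above passes `preds` first: the per-class F1 values are unaffected, but the weighted average then uses the predicted class counts as weights instead of the true ones.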
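
One caveat of matching each class with its nearest centroid, as `etiqueterV` does, is that two classes can end up matched with the same cluster, leaving another cluster without a class. A possible alternative, sketched below, searches for the one-to-one matching with the smallest total distance. This is a swapped-in technique, not the notebook's method: the function name and the example coordinates are hypothetical, and the brute-force search over permutations is only practical for a small number of clusters.

```python
from itertools import permutations

import numpy as np

def match_clusters_to_classes(class_means, centroids):
    """Return a tuple where entry k is the cluster matched to class k (one-to-one)."""
    class_means = np.asarray(class_means, dtype=float)
    centroids = np.asarray(centroids, dtype=float)
    c = len(centroids)
    # cost[k, j] = distance between the mean of class k and centroid j.
    cost = np.linalg.norm(class_means[:, None, :] - centroids[None, :, :], axis=2)
    # Try every one-to-one assignment (c! of them) and keep the cheapest.
    return min(permutations(range(c)),
               key=lambda perm: sum(cost[k, perm[k]] for k in range(c)))

# Hypothetical example in which classes 1 and 2 are both nearest to centroid 0.
class_means = [[0.0, 0.0], [5.0, 5.0], [6.0, 6.0]]
centroids = [[5.4, 5.4], [0.1, 0.0], [9.0, 9.0]]

mapping = match_clusters_to_classes(class_means, centroids)
print(mapping)  # (1, 0, 2)
```

With a one-to-one mapping, every cluster has a class, so looking up the class of a winning cluster cannot fail.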