# 4. Fuzzy C-Means

## 4.0 Introduccion

***Particion dura*** : Es aquella donde un $x_i$ pertenece a un unico cluster , por ejemplo K-means.


***Particion Suave*** : Es aquella donde un $x_i$ pertenece a varios cluster , por ejemplo FCM.


La FCM trata de minimizar la funcion objetivo , lo hace actualizando los coeficiente de pertenencia ( $u_{ij}$ ) y los centroides ( $c_j$ ).

\begin{equation}
J={\sum_{i=1}^N}{\sum_{j=1}^k}u_{ij}^m|| x_i-c_j||^2
\end{equation}

***Los coeficientes de pertenencia*** se actualizan con la siguiente expresion:

\begin{equation}
u_{ij}={\sum_{l=1}^k}(\frac{|| x_i-c_j||}{|| x_i-c_l||})^{\frac{2}{1-m}}
\end{equation}

***Los centroides*** se actualizan con la siguiente expresion:
\begin{equation}
c_j=\frac{{\sum_{i=1}^N}u_{ij}^m.x_i}{{\sum_{i=1}^N}u_{ij}^m}
\end{equation}





## 4.1 Algoritmo

1. Inicializamos la matriz $U$ , esta contiene los coeficiente de pertenencia  ( $ u_{ij} $ ).
2. Actualizamos los valores de los centroides ( $ c_j $ ).
3. Una vez actualizado los valores de los centroides  ( $ c_j $ ), actualizaremos los valores de los coeficiente de pertenencia ( $ u_{ij} $ ).
4. Aplicamos la condicion de parada $||U^{k+1}-U^{k}|| < Error$ , si no se cumple se repite el paso 2.

Pd. La operacion $||U||$ denota la norma de frobenius.


\begin{equation}
||U||_{F}=({\sum_{i=1}^m}{\sum_{j=1}^n}a_{ij}^2)^{0.5}
\end{equation}


### 4.1.0 Sklearn

In [15]:
!pip install fuzzy-c-means



In [0]:
from fcmeans import FCM
from sklearn.datasets import make_blobs
from matplotlib import pyplot as plt
from seaborn import scatterplot as scatter


# create artifitial dataset
n_samples = 60
n_bins = 3  # use 3 bins for calibration_curve as we have 3 clusters here
centers = [(-5, -5), (0, 0), (5, 5)]

X,_ = make_blobs(n_samples=n_samples, n_features=2, cluster_std=1.0,
                  centers=centers, shuffle=False, random_state=42)

print(X[0])
# fit the fuzzy-c-means
fcm = FCM(n_clusters=3)
fcm.fit(X)

# outputs
fcm_centers = fcm.centers
fcm_labels  = fcm.u.argmax(axis=1)
print(fcm_labels)

# plot result

f, axes = plt.subplots(1, 2, figsize=(11,5))
scatter(X[:,0], X[:,1], ax=axes[0])
scatter(X[:,0], X[:,1], ax=axes[1], hue=fcm_labels)
scatter(fcm_centers[:,0], fcm_centers[:,1], ax=axes[1],marker="s",s=200)
plt.show()

### 4.1.1 Numpy

In [19]:
import numpy as np
import matplotlib.pyplot as plt

k=3
m=2

#Creacion de la data
np.random.seed(200)
N_datos=7
cluster1 = np.random.uniform(1.5, 2.5, (2, N_datos))
cluster2 = np.random.uniform(3.5, 4.5, (2, N_datos))
cluster3 = np.random.uniform(5.5, 6.5, (2, N_datos))

X = np.hstack((cluster1, cluster2,cluster3))


#Inicializacion ramdon de los coeficiente de pertenencia
u=np.random.uniform(0, 1, (k, X.shape[1]))
#print(u)
s=u[0]+u[1]+u[2]
u_n=u/s
print(u_n)

#Actualizacion de centroides
def New_centroides(X,u_n,m,k):
  New_centroides_x0=[]
  New_centroides_x1=[]
  for j in np.arange(k):
    New_centroides_x0.append( sum( (pow(u_n[j],m)*X)[0] )/sum( (pow(u_n[j],m)) ) )
    New_centroides_x1.append( sum( (pow(u_n[j],m)*X)[1] )/sum( (pow(u_n[j],m)) ) )
    
  return np.array([New_centroides_x0,New_centroides_x1]).reshape((3, 2))

Centroides=New_centroides(X,u_n,m,k)
print(Centroides)


# u_i_j
def New_u(X,Centroides,i,j,m):
  x_i=np.array([X[0][i],X[1][i]])
  c_j=Centroides[j]
  x_c_i_j=pow(sum(pow(x_i-c_j,2)) , 0.5 )
  new_u_i_j=0
  for l in np.arange(len(Centroides)):
    x_c_i_l=pow(sum(pow(x_i-Centroides[l],2)) , 0.5 )
    new_u_i_j=new_u_i_j+pow(x_c_i_j/x_c_i_l,2/(1-m))

  return new_u_i_j




u=New_u(X,Centroides,1,1,m)
#print(u)



[[0.36989102 0.24135356 0.34228837 0.13244423 0.43216331 0.23492334
  0.20507639 0.5192734  0.48956567 0.81557253 0.02029103 0.17067765
  0.35975172 0.83603127 0.10557494 0.1897499  0.02994366 0.38644571
  0.2780744  0.31067568 0.24069532]
 [0.49420835 0.33568047 0.38171261 0.30755006 0.43679761 0.35284736
  0.6423366  0.22572171 0.50739124 0.05321212 0.01174003 0.39832824
  0.26206252 0.05179558 0.45833135 0.57137585 0.87464641 0.32558594
  0.29100619 0.42921122 0.73811975]
 [0.13590063 0.42296597 0.27599903 0.56000571 0.13103908 0.4122293
  0.15258702 0.2550049  0.00304309 0.13121534 0.96796894 0.43099411
  0.37818576 0.11217315 0.43609371 0.23887425 0.09540993 0.28796835
  0.43091941 0.26011311 0.02118493]]
[[4.05324339 4.41572405]
 [3.73954437 4.0266853 ]
 [4.51959527 3.9375822 ]]


In [0]:
def New_Matriz_u(X,Centroides,m,k):
  Matriz_U=np.zeros( (k, X.shape[1]) )
  for i in np.arange(X.shape[1]):
    for j in np.arange(k):
      Matriz_U[j][i]=New_u(X,Centroides,i,j,m)
  
  U=Matriz_U/(Matriz_U[0]+Matriz_U[1]+Matriz_U[2])
  return U

U=New_Matriz_u(X,Centroides,m,k)
#print(New_Matriz_u(X,Centroides,m,k))


def Error(u_n,U):
  error=pow( sum(sum(pow(u_n-U,2))) , 0.5 )
  return error



In [0]:
#Creacion de la data
np.random.seed(200)
N_datos=5
cluster1 = np.random.uniform(1.5, 2.5, (2, N_datos))
cluster2 = np.random.uniform(3.5, 4.5, (2, N_datos))
cluster3 = np.random.uniform(5.5, 6.5, (2, N_datos))

X = np.hstack((cluster1, cluster2,cluster3))

u=np.random.uniform(0, 1, (k, X.shape[1]))
s=u[0]+u[1]+u[2]
u_n=u/s

e=2
k=3
m=2

In [22]:

for i in range(5):
  Centroides=New_centroides(X,u_n,m,k)
  U=New_Matriz_u(X,Centroides,m,k)
  e=Error(u_n,U)
  u_n=U


print(Centroides)
fcm = FCM(n_clusters=3)
fcm.fit(X.T)
fcm_centers = fcm.centers
print(fcm_centers)


[[5.3732029  2.99528343]
 [3.50029785 5.22234848]
 [2.95080024 3.13944734]]
[[2.08624533 2.04600393]
 [4.29546291 3.93544474]
 [6.14963333 5.99413904]]


In [23]:
Centroides=New_centroides(X,u_n,m,k)
U_1=New_Matriz_u(X,Centroides,m,k)
e=Error(u_n,U_1)
print(e)

Centroides=New_centroides(X,U_1,m,k)
U_2=New_Matriz_u(X,Centroides,m,k)
e=Error(U_1,U_2)  
print(e)

1.9300410476386856
2.0987768887955287
