# Deep Embedded Clustering for VPCF Image Analysis

This notebook demonstrates how to use the IDEC algorithm to cluster VPCF images and identify key structural features in ferroelectric materials.

## 1. Setup

In [None]:
import sys
sys.path.append('../src')
from IDEC import IDEC
from DEC import autoencoder, ClusteringLayer
import metrics
import numpy as np
import matplotlib.pyplot as plt
from keras.datasets import mnist
from keras.optimizers import SGD  # used when compiling the IDEC model in section 5

## 2. Load Data

**TODO:** Load your VPCF image data here. The data should be a NumPy array of shape `(n_samples, n_features)`, where `n_features` is the number of pixels in one flattened image.
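
As a rough illustration of the expected layout, the sketch below flattens a stack of 2-D images into `(n_samples, n_features)`. The helper `flatten_images` and the random `demo_stack` are hypothetical stand-ins, not part of this notebook's source, and the global min-max scaling is an assumption chosen to mirror the `[0, 1]` range used in the MNIST example; real VPCF data may call for a different normalization.

```python
import numpy as np

def flatten_images(images):
    """Flatten an image stack to (n_samples, n_features) and scale it to [0, 1]."""
    images = np.asarray(images, dtype=np.float32)
    if images.ndim < 2:
        raise ValueError("expected shape (n_samples, height, width[, channels])")
    # Keep the sample axis, collapse every remaining axis into one feature axis
    flat = images.reshape(images.shape[0], -1)
    # Min-max scale over the whole stack (assumption); guard against constant input
    lo, hi = flat.min(), flat.max()
    if hi > lo:
        flat = (flat - lo) / (hi - lo)
    else:
        flat = np.zeros_like(flat)
    return flat

# Synthetic stand-in for a stack of 100 images of 32 x 32 pixels
rng = np.random.default_rng(0)
demo_stack = rng.random((100, 32, 32))
demo_x = flatten_images(demo_stack)
print('Data shape:', demo_x.shape)
```

The resulting array can then take the place of `x` in the cells that follow.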

In [None]:
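# Example using MNIST as a stand-in; replace this cell with your VPCF data
(x_train, y_train), (x_test, y_test) = mnist.load_data()
# Merge the train and test splits, since clustering is unsupervised
x = np.concatenate((x_train, x_test))
y = np.concatenate((y_train, y_test))  # ground-truth labels, needed for the evaluation in section 6
# Flatten each 28 x 28 image to a 784-vector and scale pixel values to [0, 1]
x = x.reshape(x.shape[0], -1) / 255.0
print('Data shape:', x.shape)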

## 3. Define Model

The autoencoder used here has a four-layer encoder. The `dims` list defines its architecture: the first element is the input dimension, each following element is the width of one encoder layer, and the last element is the dimension of the latent space. The decoder mirrors the encoder.

In [None]:
input_dim = x.shape[1]
dims = [input_dim, 500, 500, 2000, 10] # 4-layer encoder
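
To make the mapping from `dims` to layers concrete, here is a small sketch that lists the layer widths a symmetric autoencoder would have. The helper `autoencoder_layer_sizes` is hypothetical and only illustrates the description above; the actual layer construction lives in the `DEC`/`IDEC` modules, which are not shown here. The value 784 is the flattened MNIST input size from the example data.

```python
def autoencoder_layer_sizes(dims):
    """Return (encoder_sizes, decoder_sizes) for a symmetric autoencoder."""
    if len(dims) < 2:
        raise ValueError("dims needs at least an input size and a latent size")
    # Encoder: every entry after the input size, ending with the latent layer
    encoder = list(dims[1:])
    # Decoder: the encoder in reverse, ending with the input size
    decoder = list(dims[-2::-1])
    return encoder, decoder

enc, dec = autoencoder_layer_sizes([784, 500, 500, 2000, 10])
print('Encoder layers:', enc)
print('Decoder layers:', dec)
```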

## 4. Pre-train Autoencoder

In [None]:
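# n_clusters=10 matches the ten MNIST digit classes; set it to the number of clusters expected in your VPCF data
idec = IDEC(dims=dims, n_clusters=10)
# Pre-train the autoencoder before the clustering objective is added
idec.pretrain(x, epochs=200)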

## 5. Train IDEC Model

In [None]:
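# SGD with learning rate 0.01 and momentum 0.9
# Loss weights: 0.1 for the clustering loss (KL divergence), 1.0 for the reconstruction loss (MSE)
idec.compile(optimizer=SGD(0.01, 0.9), loss=['kld', 'mse'], loss_weights=[0.1, 1.0])
y_pred = idec.fit(x, y=y, tol=0.001, maxiter=int(2e4), update_interval=140)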
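
The `kld` loss above compares soft cluster assignments with a sharpened target distribution. The sketch below follows the published DEC/IDEC formulation (Student's t soft assignment, squared-and-renormalized targets) in plain NumPy. It is not the code of the local `IDEC` module, which is not shown here. Reading `tol` as a threshold on the fraction of samples whose label changed, and `update_interval` as the number of iterations between target updates, is likewise an assumption; all function names are hypothetical.

```python
import numpy as np

def soft_assign(z, centers, alpha=1.0):
    """Student's t soft assignment q_ij of embedded points z to cluster centers."""
    # Squared distance between every point and every center, shape (n, k)
    d2 = ((z[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    q = (1.0 + d2 / alpha) ** (-(alpha + 1.0) / 2.0)
    return q / q.sum(axis=1, keepdims=True)

def target_distribution(q):
    """Target p_ij proportional to q_ij^2 / f_j, where f_j = sum_i q_ij."""
    weight = q ** 2 / q.sum(axis=0)
    return weight / weight.sum(axis=1, keepdims=True)

def label_change_fraction(y_new, y_old):
    """Fraction of samples whose hard cluster label changed between two updates."""
    return float(np.mean(np.asarray(y_new) != np.asarray(y_old)))

# Toy embedding: six 2-D points and two cluster centers
rng = np.random.default_rng(0)
z = rng.normal(size=(6, 2))
centers = np.array([[-1.0, 0.0], [1.0, 0.0]])
q = soft_assign(z, centers)
p = target_distribution(q)
print('Hard labels:', q.argmax(axis=1))
```

Under these assumptions, training would stop once `label_change_fraction` drops below `tol`, or after `maxiter` iterations.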

## 6. Evaluate Clustering

In [None]:
acc = np.round(metrics.acc(y, y_pred), 5)
nmi = np.round(metrics.nmi(y, y_pred), 5)
ari = np.round(metrics.ari(y, y_pred), 5)
print('Accuracy:', acc)
print('Normalized Mutual Information:', nmi)
print('Adjusted Rand Index:', ari)
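
Cluster indices are arbitrary, so clustering accuracy needs a mapping from predicted clusters to true labels before anything can be counted. A common way to compute it, sketched below, finds the best one-to-one mapping with the Hungarian algorithm via `scipy.optimize.linear_sum_assignment`. Whether `metrics.acc` works this way is an assumption, since the `metrics` module is not shown; `cluster_accuracy` is a hypothetical name. All three metrics require ground-truth labels, which the MNIST example provides.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def cluster_accuracy(y_true, y_pred):
    """Accuracy under the best one-to-one mapping of clusters to labels."""
    y_true = np.asarray(y_true, dtype=np.int64)
    y_pred = np.asarray(y_pred, dtype=np.int64)
    n = int(max(y_true.max(), y_pred.max())) + 1
    # counts[i, j] = number of samples in predicted cluster i with true label j
    counts = np.zeros((n, n), dtype=np.int64)
    for pred, true in zip(y_pred, y_true):
        counts[pred, true] += 1
    # The solver minimizes cost, so negate the counts to maximize matches
    rows, cols = linear_sum_assignment(-counts)
    return counts[rows, cols].sum() / y_true.size

# Same partition with permuted cluster indices
print(cluster_accuracy([0, 0, 1, 1, 2, 2], [1, 1, 0, 0, 2, 2]))
```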