# **Visual Assessment of (Cluster) Tendency (VAT) - Introduction**

The **VAT algorithm** helps us **see** whether a dataset contains groups (clusters), and roughly how many, before applying clustering methods like K-Means.



### How It Works
1. **Measure Dissimilarity**  
   - VAT computes the dissimilarity between every pair of data points.  
   - This is done using a distance measure (e.g., Euclidean distance), which gives an n × n distance matrix.  

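2. **Reorder the Data**  
   - VAT reorders the rows and columns of the distance matrix so that similar points sit next to each other.  
   - The ordering follows the way Prim's algorithm grows a minimum spanning tree: each step adds the unordered point closest to the points already ordered.  
   - This makes it easier to spot groups in the data.  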

3. **Create a Visual Map**  
   - VAT displays the reordered distance matrix as an image (heatmap).  
   - Dark blocks along the diagonal of the image show clusters (groups of similar points).  
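
The three steps above can be sketched in a few lines. This is a minimal illustration, not the implementation used later in this notebook: the helper name `vat_order` and the toy two-blob dataset are assumptions made for the example.

```python
import numpy as np
from scipy.spatial.distance import pdist, squareform


def vat_order(dist):
    """Return a VAT-style (Prim-like) ordering for a square distance matrix."""
    n = dist.shape[0]
    # Start from one endpoint of the largest pairwise distance
    start = int(np.unravel_index(np.argmax(dist), dist.shape)[0])
    order = [start]
    in_order = np.zeros(n, dtype=bool)
    in_order[start] = True
    # min_dist[j] = smallest distance from point j to any already-ordered point
    min_dist = dist[start].copy()
    min_dist[in_order] = np.inf
    for _ in range(n - 1):
        nxt = int(np.argmin(min_dist))  # closest unordered point
        order.append(nxt)
        in_order[nxt] = True
        min_dist = np.minimum(min_dist, dist[nxt])
        min_dist[in_order] = np.inf
    return np.array(order)


rng = np.random.default_rng(0)
# Toy data (assumption): two well-separated 2-D blobs, shuffled
points = np.vstack([rng.normal(0, 0.3, (20, 2)), rng.normal(5, 0.3, (20, 2))])
labels = np.repeat([0, 1], 20)
perm = rng.permutation(40)
points, labels = points[perm], labels[perm]

dist = squareform(pdist(points, metric="euclidean"))  # step 1: pairwise distances
order = vat_order(dist)                               # step 2: reorder
reordered = dist[np.ix_(order, order)]                # step 3: matrix to display
```

Passing `reordered` to `plt.imshow(reordered, cmap="gray")` maps small distances to dark pixels, so each blob should appear as a dark square on the diagonal.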

### How to Read the VAT Image
- **Dark squares along the diagonal** → Show clusters (groups).  
- **Bright regions between squares** → Show large distances, i.e. gaps between clusters.  
- **More dark squares** → More clusters in the data.  
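
To see why the blocks look this way, the sketch below compares distances inside and between clusters on toy data. The three cluster centers and the point counts are assumptions chosen for illustration, and the points are generated already grouped by cluster to mimic a reordered matrix.

```python
import numpy as np
from scipy.spatial.distance import pdist, squareform

rng = np.random.default_rng(1)
centers = np.array([[0.0, 0.0], [6.0, 0.0], [0.0, 6.0]])
# 15 points per cluster, stacked cluster by cluster
points = np.vstack([c + rng.normal(0, 0.3, (15, 2)) for c in centers])
dist = squareform(pdist(points))

block = np.repeat(np.arange(3), 15)
same = block[:, None] == block[None, :]  # True inside the diagonal blocks
within_mean = dist[same].mean()          # small values: rendered dark
between_mean = dist[~same].mean()        # large values: rendered bright
```

Here `within_mean` is much smaller than `between_mean`, which is what produces three dark squares on a bright background when small distances are drawn dark.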

### Why Use VAT?
- **Easy to understand** – It gives a clear picture of data structure.  
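- **No need to set the number of clusters in advance** – The image itself suggests how many groups exist.  
- **Works for many types of data** – Can be used whenever a pairwise distance can be computed, including numeric data and suitably encoded categories.  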
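
For categorical features, one possible approach (an assumption for this example, not necessarily what the code below does) is to integer-code each column and use the Hamming distance, which counts the fraction of features on which two rows disagree. The small `df` table is made up for illustration.

```python
import numpy as np
import pandas as pd
from scipy.spatial.distance import pdist, squareform

# Hypothetical categorical dataset
df = pd.DataFrame({
    "color": ["red", "red", "blue", "blue"],
    "shape": ["circle", "circle", "square", "circle"],
})

# Integer-code each column; the codes only need to distinguish categories
codes = np.column_stack([pd.factorize(df[col])[0] for col in df.columns])

# Hamming distance = fraction of features that differ between two rows
dist = squareform(pdist(codes, metric="hamming"))
```

The resulting `dist` matrix can be reordered and displayed in the same way as a Euclidean distance matrix.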

VAT is useful when you don’t know how many clusters are in the data.


import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from scipy.spatial.distance import pdist, squareform
from scipy.sparse.csgraph import shortest_path
from sklearn.preprocessing import StandardScaler, OneHotEncoder
from sklearn.impute import SimpleImputer
from sklearn.neighbors import NearestNeighbors
from sklearn.decomposition import PCA
from sklearn.datasets import make_circles, load_iris
from matplotlib.colors import LinearSegmentedColormap
import warnings

class VAT:
    
    def __init__(self, normalize=True, colormap='gray_r', n_samples_max=5000):
        self.normalize = normalize
        self.n_samples_max = n_samples_max
        if colormap == 'gray_r':
            self.cmap = plt.cm.gray_r
        else:
            # Any other value falls back to a custom black-to-white colormap
            self.cmap = LinearSegmentedColormap.from_list('vat_cmap', ['black', 'white'], N=256)