# Example: Custom Evaluation Function for Clustering

This example demonstrates how to define and integrate a custom evaluation function for clustering problems using the **IOHClustering** framework.

In [1]:
# Import necessary libraries
from iohclustering import create_cluster_problem, general_cluster_metric
import ioh
import numpy as np
import os


## Step 1: Define the Dataset

We use the `iris_pca` dataset as an example. It is already preprocessed, so it can be referenced by name and passed directly to `create_cluster_problem` in Step 3.

In [2]:
# Example dataset
dataset = "iris_pca"

## Step 2: Define Custom Functions

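We define a custom error function and a custom distance function to evaluate the clustering results. The distance function receives a single data point and the centroids and returns one value per centroid; the error function receives the dataset, the centroids, and the label of each point and returns a single score.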

In [3]:
# Define a custom error function (MAE)
def custom_error_function(X, centroids, labels):
    """
    Custom error function to evaluate clustering performance using mean absolute error (MAE).
    The absolute deviations from the assigned centroid are summed over all features
    and averaged over the number of data points.
    Parameters:
        X: np.ndarray - The dataset, one data point per row.
        centroids: np.ndarray - The cluster centroids, one per row.
        labels: np.ndarray - The cluster label (centroid index) of each data point.
    Returns:
        float - The computed mean absolute error.
    """
    total_error = 0.0
    for i, centroid in enumerate(centroids):
        # Select the points assigned to cluster i
        cluster_points = X[labels == i]
        total_error += np.sum(np.abs(cluster_points - centroid))
    return total_error / len(X)

# Define a custom distance function (cosine distance)
def custom_distance_function(x, centroids):
    """
    Custom distance function to compute the cosine distance (1 - cosine similarity)
    between a point and each centroid.
    Parameters:
        x: np.ndarray - A single data point.
        centroids: np.ndarray - The cluster centroids, one per row.
    Returns:
        np.ndarray - The cosine distance to each centroid (smaller means closer).
    """
    # Compute the norms of the input point and of each centroid
    x_norm = np.linalg.norm(x)
    centroids_norm = np.linalg.norm(centroids, axis=1)

    # Compute cosine similarity (undefined if x or a centroid is the zero vector)
    cosine_similarity = np.dot(centroids, x) / (centroids_norm * x_norm)

    # Convert similarity to a distance so that smaller values mean closer,
    # assuming the framework assigns each point to the lowest-valued centroid
    return 1.0 - cosine_similarity
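
A quick way to check an error function like the one above is to evaluate it on a tiny dataset where the answer can be worked out by hand. The sketch below is self-contained: `mean_abs_deviation` repeats the logic of `custom_error_function`, and the toy arrays `X`, `centroids`, and `labels` are made up for illustration.

```python
import numpy as np

def mean_abs_deviation(X, centroids, labels):
    """Same logic as custom_error_function, repeated so this block runs on its own."""
    total_error = 0.0
    for i, centroid in enumerate(centroids):
        total_error += np.sum(np.abs(X[labels == i] - centroid))
    return total_error / len(X)

# Toy data (illustrative only): three 2-D points and two centroids
X = np.array([[0.0, 0.0], [2.0, 0.0], [10.0, 10.0]])
centroids = np.array([[1.0, 0.0], [10.0, 10.0]])
labels = np.array([0, 0, 1])  # first two points belong to centroid 0

# Points (0, 0) and (2, 0) each deviate by 1 from centroid (1, 0);
# point (10, 10) coincides with its centroid, so the result is (1 + 1 + 0) / 3.
err = mean_abs_deviation(X, centroids, labels)
print(err)
```

Because the deviations are summed over features before averaging over points, the value grows with the number of features; dividing by `X.size` instead would give a per-coordinate average.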

## Step 3: Create the Clustering Problem

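We use the `general_cluster_metric` function to combine the custom distance and error functions into a single metric. We then pass this metric to `create_cluster_problem` as `error_metric`, together with the dataset name and the number of clusters (`k=2`). The call returns the problem and a second object, `retransform`, which is not used in this example.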

In [4]:
# Combine custom functions into a clustering metric
clustering_function = general_cluster_metric(custom_distance_function, custom_error_function)

# Create a clustering problem with the custom function
clustering_problem, retransform = create_cluster_problem(
    dataset=dataset,
    k=2,
    error_metric=clustering_function
)
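
To make the role of the two callbacks concrete, the sketch below shows one plausible way a distance function and an error function could be composed into a single metric. `combine_metric` is a hypothetical stand-in written for illustration; it is not the implementation of `general_cluster_metric`, and the nearest-centroid assignment via `argmin` is an assumption. A Euclidean distance is used here to keep the toy example easy to verify by hand.

```python
import numpy as np

def combine_metric(distance_fn, error_fn):
    """Hypothetical stand-in, NOT the IOHClustering implementation."""
    def metric(X, centroids):
        # Assumption: each point is assigned to the centroid with the smallest distance
        labels = np.array([np.argmin(distance_fn(x, centroids)) for x in X])
        return error_fn(X, centroids, labels)
    return metric

def euclidean(x, centroids):
    # Distance from one point to every centroid (one value per centroid)
    return np.linalg.norm(centroids - x, axis=1)

def mean_abs_dev(X, centroids, labels):
    # Sum of absolute deviations to the assigned centroid, averaged over points
    total = sum(np.sum(np.abs(X[labels == i] - c)) for i, c in enumerate(centroids))
    return total / len(X)

# Toy data (illustrative only): two well-separated pairs of points
X = np.array([[0.0, 0.0], [0.0, 2.0], [10.0, 10.0], [12.0, 10.0]])
centroids = np.array([[0.0, 1.0], [11.0, 10.0]])

metric = combine_metric(euclidean, mean_abs_dev)
value = metric(X, centroids)  # every point deviates by exactly 1 from its centroid
print(value)
```

Keeping assignment and scoring as separate callbacks means either one can be swapped without touching the other.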

## Step 4: Define the Random Search Algorithm

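We implement a simple random search algorithm that samples solutions uniformly at random within the problem bounds and evaluates each one. The total number of evaluations is `budget_factor` times the number of problem variables.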

In [5]:
# Define a simple random search algorithm
class RandomSearch:
    """Simple random search algorithm"""
    def __init__(self, budget_factor: int):
        self.budget_factor: int = budget_factor

    def __call__(self, problem: ioh.problem.RealSingleObjective) -> None:
        """Evaluate the problem `budget_factor * DIM` times with a randomly generated solution"""
        for _ in range(self.budget_factor * problem.meta_data.n_variables):
            x = np.random.uniform(problem.bounds.lb, problem.bounds.ub)
            problem(x)
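
The class above discards the returned objective values and relies on an attached logger to record them. When experimenting without a logger, it can help to track the best solution directly. The sketch below is a standalone variant that does so on a plain Python objective; `random_search`, the sphere objective, and the bounds are illustrative assumptions and not part of IOHClustering or IOHexperimenter.

```python
import numpy as np

def random_search(objective, lb, ub, budget_factor, seed=0):
    """Uniform random search that keeps the best (minimal) solution seen so far."""
    rng = np.random.default_rng(seed)
    dim = len(lb)
    best_x, best_y = None, np.inf
    n_evals = 0
    for _ in range(budget_factor * dim):
        x = rng.uniform(lb, ub)  # one sample inside the box [lb, ub]
        y = objective(x)
        n_evals += 1
        if y < best_y:
            best_x, best_y = x, y
    return best_x, best_y, n_evals

# Illustrative objective: the sphere function on [-5, 5]^3
lb = np.full(3, -5.0)
ub = np.full(3, 5.0)
best_x, best_y, n_evals = random_search(
    lambda x: float(np.sum(x ** 2)), lb, ub, budget_factor=200
)
print(n_evals, best_y)
```

Passing a seed makes runs repeatable, which is useful when comparing metrics on the same sequence of candidate solutions.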

## Step 5: Set Up the Logger

We use `ioh.logger.Analyzer` to log the results of the random search for further analysis. The log files are written under the current working directory, in a folder named after `folder_name`.

In [6]:
# Set up a logger to store results
logger = ioh.logger.Analyzer(
    root=os.getcwd(),
    folder_name="Custom_Metric_Random_Search_Test",
    algorithm_name="RandomSearch",
)

## Step 6: Run the Random Search Algorithm

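We attach the logger to the clustering problem and run the random search. With `budget_factor=2000`, the search performs 2000 evaluations per problem variable. Afterwards, `reset()` clears the problem's internal state so it can be reused for another run.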

In [7]:
# Attach the logger and run the random search
RS = RandomSearch(budget_factor=2000)
clustering_problem.attach_logger(logger)
RS(clustering_problem)
clustering_problem.reset()

## Step 7: Close the Logger

Finally, we close the logger to ensure all results are saved properly.

In [8]:
# Close the logger after the run
logger.close()

## Conclusion

This example demonstrates how to integrate a custom evaluation function into the **IOHClustering** framework. You can extend it by swapping in other error and distance functions with the same signatures to suit your clustering needs.