# Lesson 1: Mastering Cluster Validation with Silhouette Scores and Visualization in Python

Welcome! In this lesson, we’ll explore cluster validation with a focus on the Silhouette Score. We’ll cover the underlying theory, implement the score from scratch in Python, visualize the resulting clusters, and compare our implementation with scikit-learn’s built-in function.

## Understanding Cluster Validation and the Silhouette Score
Cluster validation evaluates how well a clustering fits the data. Because clustering is unsupervised, there are usually no ground-truth labels to check against, so validation metrics help guard against overfitting and against incorrect assumptions about the optimal number of clusters.

One of the most widely used validation metrics is the Silhouette Score, which measures how similar a data point is to its own cluster compared to other clusters. Mathematically, for a sample i, the Silhouette Score s(i) is defined as:

s(i) = (b(i) − a(i)) / max{a(i), b(i)}

where:
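• a(i) = mean intra-cluster distance of point i (the average distance from i to all other points in the same cluster),
• b(i) = mean nearest-cluster distance of point i (the smallest average distance from i to the points of any other cluster).

By convention, s(i) = 0 when point i is the only member of its cluster.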
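
To make the definitions concrete, here is a small worked example for a single point. The toy points and their cluster assignment are made up for illustration and are not part of the lesson's datasets.

```python
import numpy as np

# Illustrative toy data (an assumption for this sketch): two clusters of three points.
own_cluster = np.array([[1, 2], [1, 3], [2, 2]], dtype=float)
other_cluster = np.array([[6, 7], [7, 7], [7, 8]], dtype=float)

point = own_cluster[0]  # the point i = (1, 2)

# a(i): mean distance to the *other* members of the point's own cluster.
dists_own = np.linalg.norm(own_cluster[1:] - point, axis=1)
a_i = dists_own.mean()

# b(i): mean distance to the members of the nearest other cluster.
# With only one other cluster, that cluster is the nearest by default.
dists_other = np.linalg.norm(other_cluster - point, axis=1)
b_i = dists_other.mean()

s_i = (b_i - a_i) / max(a_i, b_i)
print(f"a(i) = {a_i:.3f}, b(i) = {b_i:.3f}, s(i) = {s_i:.3f}")
```

This prints a(i) = 1.000, b(i) = 7.789, s(i) = 0.872: the point is much closer to its own cluster than to the other one, so its score is close to 1.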

## Interpreting the Silhouette Score
• Score close to 1: The point is well-matched with its own cluster and poorly matched with any other cluster → indicates strong clustering.
• Score close to 0: The point lies near the boundary between two clusters → indicates uncertainty about cluster assignment.
• Score close to -1: The point is likely in the wrong cluster, being closer to a neighboring cluster than to its own.

Ideally, each point would have a Silhouette Score of 1; however, this is rarely achievable in real-world scenarios.
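
The interpretation above applies to individual points, so it can help to look at per-point scores rather than only the average. The sketch below uses scikit-learn's silhouette_samples for this; the data are invented for illustration, with the last point deliberately given the "wrong" label.

```python
import numpy as np
from sklearn.metrics import silhouette_samples

# Illustrative data (an assumption for this sketch): the last point sits
# inside cluster 0's region but is deliberately labeled as cluster 1.
points = np.array([[0, 0], [0, 1], [1, 0], [10, 10], [10, 11], [0.5, 0.5]])
labels = np.array([0, 0, 0, 1, 1, 1])

# One silhouette value per point, in the same order as `points`.
per_point = silhouette_samples(points, labels, metric="euclidean")

for p, lab, s in zip(points, labels, per_point):
    flag = "  <- possibly misassigned" if s < 0 else ""
    print(f"point={p}, label={lab}, s(i)={s:.3f}{flag}")

# Indices of points whose score is negative.
suspect_indices = np.where(per_point < 0)[0]
print("Suspect indices:", suspect_indices)
```

Only the mislabeled point receives a negative score here, which matches the "close to -1" case in the list above.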

## Python Implementation of the Silhouette Score and Cluster Visualization

(1) Euclidean Distance Function
--------------------------------
```py
import numpy as np

def euclidean_distance(a, b):
    # Calculate the Euclidean distance between points a and b.
    return np.sqrt(np.sum((np.array(a) - np.array(b)) ** 2))
```
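
As a quick sanity check, the function can be tried on a pair of points whose distance is known in advance. The sketch below repeats the definition so that it runs on its own, and compares the result with np.linalg.norm, which computes the same quantity.

```python
import numpy as np

def euclidean_distance(a, b):
    # Same definition as in the lesson: square root of the summed squared differences.
    return np.sqrt(np.sum((np.array(a) - np.array(b)) ** 2))

# A 3-4-5 right triangle gives a distance of exactly 5.
d = euclidean_distance([0, 0], [3, 4])
print(d)  # 5.0

# np.linalg.norm of the difference vector computes the same quantity.
d_norm = np.linalg.norm(np.array([0, 0]) - np.array([3, 4]))
print(np.isclose(d, d_norm))  # True
```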

(2) Calculate a(i)
--------------------------------
```py
import numpy as np

def calculate_a(point, cluster):
    # Calculate the average distance from 'point' to other points in the same cluster.
    if len(cluster) <= 1:
        return 0
    distances = [euclidean_distance(point, other) for other in cluster 
                 if not np.array_equal(point, other)]
    return sum(distances) / (len(cluster) - 1)
```

(3) Calculate b(i)
--------------------------------
```py
def calculate_b(point, clusters):
    # Calculate the lowest average distance from 'point' to points in other clusters.
    min_average_distance = float('inf')
    for cluster in clusters:
        # Skip the cluster containing the current point
        if any(np.array_equal(point, other) for other in cluster):
            continue
        distances = [euclidean_distance(point, other) for other in cluster]
        average_distance = sum(distances) / len(cluster)
        if average_distance < min_average_distance:
            min_average_distance = average_distance
    return min_average_distance
```
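
One caveat: calculate_b identifies the point's own cluster by comparing coordinates, so if an identical point also appears in another cluster, that cluster is skipped as well. A possible alternative, sketched below, skips the own cluster by its label instead. It assumes the clusters are stored in a dict keyed by label (as custom_silhouette_score builds them); the name calculate_b_by_label and the data are hypothetical.

```python
import numpy as np

def calculate_b_by_label(point, own_label, clusters_by_label):
    # Hypothetical variant of calculate_b: skip the point's own cluster by
    # its label instead of by searching for matching coordinates.
    point = np.asarray(point, dtype=float)
    min_average_distance = float("inf")
    for label, members in clusters_by_label.items():
        if label == own_label:
            continue
        members = np.asarray(members, dtype=float)
        average_distance = float(np.linalg.norm(members - point, axis=1).mean())
        min_average_distance = min(min_average_distance, average_distance)
    return min_average_distance

# Illustrative data: the point [1, 1] appears in both clusters.
clusters_by_label = {0: [[0, 0], [1, 1]], 1: [[1, 1], [4, 5]]}
b = calculate_b_by_label([1, 1], own_label=0, clusters_by_label=clusters_by_label)
print(b)  # 2.5
```

With the coordinate-based check, both clusters would be skipped for this point and the function would return infinity. K-means never puts identical points in different clusters, so the issue mainly matters for other algorithms or hand-made labels.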

(4) Custom Silhouette Score
--------------------------------
```py
from collections import defaultdict

def custom_silhouette_score(points, labels):
    # Group points by cluster label
    clusters = defaultdict(list)
    for point, label in zip(points, labels):
        clusters[label].append(point)

    cluster_list = list(clusters.values())
    scores = []

    for point, label in zip(points, labels):
        a_val = calculate_a(point, clusters[label])
        b_val = calculate_b(point, cluster_list)
        # By convention, a point that is alone in its cluster scores 0.
        if len(clusters[label]) > 1 and max(a_val, b_val) > 0:
            score = (b_val - a_val) / max(a_val, b_val)
        else:
            score = 0
        scores.append(score)

    # Return the average silhouette score
    return sum(scores) / len(scores)
```
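
The loop-based functions call euclidean_distance once per pair of points, which becomes slow on larger datasets. As a rough alternative, the sketch below computes all pairwise distances once with NumPy broadcasting. The function name silhouette_vectorized is hypothetical, and two behaviors are choices made for this sketch: it raises an error when fewer than two clusters are present, and it assigns a score of 0 to points in singleton clusters.

```python
import numpy as np

def silhouette_vectorized(points, labels):
    X = np.asarray(points, dtype=float)
    labels = np.asarray(labels).ravel()
    unique, inverse, counts = np.unique(
        labels, return_inverse=True, return_counts=True
    )
    if len(unique) < 2:
        raise ValueError("The silhouette score needs at least two clusters.")

    # Pairwise Euclidean distances via broadcasting: dist[i, j] = ||X[i] - X[j]||.
    diff = X[:, None, :] - X[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))

    n = len(X)
    a = np.zeros(n)
    b = np.full(n, np.inf)
    for k, size in zip(unique, counts):
        in_k = labels == k
        # Total distance from every point to the members of cluster k.
        total_to_k = dist[:, in_k].sum(axis=1)
        if size > 1:
            # Own cluster: the self-distance is 0, so divide by size - 1.
            a[in_k] = total_to_k[in_k] / (size - 1)
        # Other clusters: keep the smallest mean distance seen so far.
        b[~in_k] = np.minimum(b[~in_k], total_to_k[~in_k] / size)

    denom = np.maximum(a, b)
    safe_denom = np.where(denom > 0, denom, 1.0)  # avoid division by zero
    s = np.where(denom > 0, (b - a) / safe_denom, 0.0)
    s[counts[inverse.ravel()] == 1] = 0.0  # singleton clusters score 0
    return float(s.mean())

# Small illustrative dataset with two well-separated clusters.
pts = [[5, 3], [5, 4], [6, 3], [15, 16], [16, 15], [16, 17]]
avg = silhouette_vectorized(pts, [0, 0, 0, 1, 1, 1])
print(f"Average silhouette (vectorized): {avg:.4f}")
```

The distance matrix needs memory proportional to the square of the number of points, so this trade-off suits small and medium datasets better than very large ones.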

Practical Example Using the Iris Dataset
--------------------------------
```py
from sklearn.cluster import KMeans
import matplotlib.pyplot as plt
from sklearn import datasets

# Load and cluster the Iris dataset
X = datasets.load_iris().data
kmeans_model = KMeans(n_clusters=3, random_state=0, n_init=10).fit(X)

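# Plot the clusters using the first two features (sepal length and sepal width)
plt.scatter(X[:, 0], X[:, 1], c=kmeans_model.labels_)
plt.xlabel("Sepal length (cm)")
plt.ylabel("Sepal width (cm)")
plt.show()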

# Calculate the silhouette score using the custom implementation
average_score = custom_silhouette_score(X, kmeans_model.labels_)
print(f"Silhouette Score (Custom): {average_score}")  # ~0.55
```

Silhouette Score Calculation Using Scikit-learn
--------------------------------
```py
from sklearn.metrics import silhouette_score

score = silhouette_score(X, kmeans_model.labels_, metric='euclidean')
print("Silhouette score (sklearn):", score)  # ~0.55
```
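
Since validation is often used to decide on the number of clusters, one common pattern is to compute the average silhouette score for several candidate values of k and compare them. The sketch below does this for the Iris data; the candidate range 2 to 6 is an arbitrary choice for illustration.

```python
from sklearn import datasets
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score

X = datasets.load_iris().data

# Candidate cluster counts; the range 2..6 is an arbitrary choice for this sketch.
scores = {}
for k in range(2, 7):
    labels = KMeans(n_clusters=k, random_state=0, n_init=10).fit_predict(X)
    scores[k] = silhouette_score(X, labels, metric="euclidean")

for k, s in scores.items():
    print(f"k={k}: silhouette={s:.3f}")

best_k = max(scores, key=scores.get)
print(f"Highest average silhouette at k={best_k}")
```

The highest-scoring k is not automatically the "right" one. It is best treated as one piece of evidence alongside plots and knowledge of the data.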

## Summary
In this lesson, we:
• Reviewed the theory and purpose of cluster validation.
• Discussed the Silhouette Score metric and its interpretation.
• Implemented a custom silhouette scoring function in Python.
• Visualized clusters and compared our results with sklearn’s built-in silhouette_score.

Practice these techniques on different datasets and clustering algorithms to solidify your understanding. Happy learning!

-------------------------------------------------------------------------------

## Visualizing Clusters and Calculating Silhouette Score

This exercise computes the Silhouette Score with NumPy handling the distance computations. The code below defines functions for the Euclidean distance and for the silhouette components a and b, then computes the average silhouette score for a list of points and their corresponding cluster labels. Run it to see the average silhouette score for our data points!

```py
import numpy as np
from collections import defaultdict

def euclidean_distance(a, b):
    return np.sqrt(np.sum((np.array(a) - np.array(b)) ** 2))

def calculate_a(point, cluster):
    if len(cluster) <= 1:
        return 0
    distances = [euclidean_distance(point, other) for other in cluster if not np.array_equal(point, other)]
    return sum(distances) / (len(cluster) - 1)

def calculate_b(point, clusters):
    min_average_distance = float('inf')
    for cluster in clusters:
        if any(np.array_equal(point, other) for other in cluster):
            continue
        distances = [euclidean_distance(point, other) for other in cluster]
        average_distance = sum(distances) / len(cluster)
        if average_distance < min_average_distance:
            min_average_distance = average_distance
    return min_average_distance

def custom_silhouette_score(points, labels):
    clusters = defaultdict(list)
    for point, label in zip(points, labels):
        clusters[label].append(point)
    cluster_list = list(clusters.values())
    scores = []
    for point, label in zip(points, labels):
        a = calculate_a(point, clusters[label])
        b = calculate_b(point, cluster_list)
        score = (b - a) / max(a, b) if max(a, b) > 0 else 0
        scores.append(score)
    return sum(scores) / len(scores)

# Sample Data
points = [[1, 2], [1, 3], [2, 2], [6, 7], [7, 7], [7, 8]]
labels = [0, 0, 0, 1, 1, 1]
average_score = custom_silhouette_score(points, labels)

print(f"Average Silhouette Score: {average_score}")

```

Execution Result:

```
Average Silhouette Score: 0.8440403462345903
```


## Crafting the Distance Function

Stellar job on your last task! You've mastered using predefined functions, but can you create one from scratch? Craft a function that finds the minimum average distance of a point to points in clusters to which it does not belong.

```py
from collections import defaultdict
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score

def euclidean_distance(a, b):
    # Calculate the Euclidean distance between points a and b.
    return np.sqrt(np.sum((np.array(a) - np.array(b)) ** 2))

def calculate_a(point, cluster):
    # Calculate the average distance from 'point' to other points in the same cluster.
    if len(cluster) <= 1:
        return 0
    distances = [euclidean_distance(point, other) for other in cluster if not np.array_equal(point, other)]
    return sum(distances) / (len(cluster) - 1)
    
def calculate_b(point, clusters):
    # Calculate the lowest average distance from 'point' to points in other clusters.
    min_average_distance = float('inf')
    for cluster in clusters:
        # Check if point is in the current cluster by comparing all elements
        if any(np.array_equal(point, other) for other in cluster):
            continue
        # TODO: Calculate the distances between the point and all the other points in the cluster
        # TODO: Calculate average_distance
        # TODO: Update min_average_distance
    return min_average_distance

def custom_silhouette_score(points, labels):
    # Group points by cluster label.
    clusters = defaultdict(list)
    for point, label in zip(points, labels):
        clusters[label].append(point)

    # Convert clusters to a list for easier access.
    cluster_list = list(clusters.values())

    # Calculate silhouette score for each point.
    scores = []
    for point, label in zip(points, labels):
        a = calculate_a(point, clusters[label])
        b = calculate_b(point, cluster_list)
        score = (b - a) / max(a, b) if max(a, b) > 0 else 0
        scores.append(score)

    # Return the average silhouette score.
    return sum(scores) / len(scores)

X = [[1, 2], [2, 1], [1, 1], [7, 7], [8, 8], [7, 8]]
kmeans_model = KMeans(n_clusters=2, random_state=0, n_init=10).fit(X)
avg_score_manual = custom_silhouette_score(X, kmeans_model.labels_)
print(f"Average Silhouette Score (manual): {avg_score_manual}")
score_sklearn = silhouette_score(X, kmeans_model.labels_, metric='euclidean')
print(f"Silhouette score (sklearn): {score_sklearn}")

```

Below is the complete code with the “calculate_b” function filled in. The key step is to compute the distance from the current point to all points in each “other” cluster, then take the average distance, and finally keep track of the minimum such average.

```python
from collections import defaultdict
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score

def euclidean_distance(a, b):
    # Calculate the Euclidean distance between points a and b.
    return np.sqrt(np.sum((np.array(a) - np.array(b)) ** 2))

def calculate_a(point, cluster):
    # Calculate the average distance from 'point' to other points in the same cluster.
    if len(cluster) <= 1:
        return 0
    distances = [euclidean_distance(point, other) for other in cluster if not np.array_equal(point, other)]
    return sum(distances) / (len(cluster) - 1)

def calculate_b(point, clusters):
    # Calculate the lowest average distance from 'point' to points in other clusters.
    min_average_distance = float('inf')
    for cluster in clusters:
        # Skip the cluster that contains the current point.
        if any(np.array_equal(point, other) for other in cluster):
            continue
        
        # Calculate distances between 'point' and all points in this 'other' cluster.
        distances = [euclidean_distance(point, other) for other in cluster]
        average_distance = sum(distances) / len(distances)
        
        # Keep track of the minimum average distance.
        if average_distance < min_average_distance:
            min_average_distance = average_distance
    
    return min_average_distance

def custom_silhouette_score(points, labels):
    # Group points by cluster label.
    clusters = defaultdict(list)
    for point, label in zip(points, labels):
        clusters[label].append(point)

    # Convert clusters to a list for easier access.
    cluster_list = list(clusters.values())

    # Calculate silhouette score for each point.
    scores = []
    for point, label in zip(points, labels):
        a = calculate_a(point, clusters[label])
        b = calculate_b(point, cluster_list)
        score = (b - a) / max(a, b) if max(a, b) > 0 else 0
        scores.append(score)

    # Return the average silhouette score.
    return sum(scores) / len(scores)

# Example usage
X = [[1, 2], [2, 1], [1, 1], [7, 7], [8, 8], [7, 8]]
kmeans_model = KMeans(n_clusters=2, random_state=0, n_init=10).fit(X)
avg_score_manual = custom_silhouette_score(X, kmeans_model.labels_)
print(f"Average Silhouette Score (manual): {avg_score_manual}")

score_sklearn = silhouette_score(X, kmeans_model.labels_, metric='euclidean')
print(f"Silhouette score (sklearn): {score_sklearn}")
```

Explanation of the “calculate_b” function:
• We iterate through each cluster in “clusters.”  
• If the cluster contains the point, we skip it (since we only want distances to clusters in which the point does not belong).  
• We calculate Euclidean distances from our point to every point in this “other” cluster, compute their mean, and track the minimum such average distance.  
• Ultimately, “calculate_b” returns this minimum average distance, which represents how close the point is to its nearest neighboring cluster.
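
The "minimum" in the last two steps only matters when there are at least two other clusters. The sketch below illustrates this with a point and two other clusters; the data and the cluster names "B" and "C" are made up for illustration.

```python
import numpy as np

point = np.array([0.0, 0.0])

# Illustrative "other" clusters (assumed data, not from the exercise).
other_clusters = {
    "B": np.array([[3.0, 0.0], [5.0, 0.0]]),    # distances 3 and 5 -> mean 4
    "C": np.array([[0.0, 10.0], [0.0, 12.0]]),  # distances 10 and 12 -> mean 11
}

# Mean distance from the point to each other cluster.
mean_distances = {
    name: float(np.linalg.norm(members - point, axis=1).mean())
    for name, members in other_clusters.items()
}

# b is the smallest of these means, i.e. the distance to the nearest cluster.
nearest = min(mean_distances, key=mean_distances.get)
b = mean_distances[nearest]
print(mean_distances)
print(f"b = {b} (nearest cluster: {nearest})")
```

Here cluster B is the nearest neighbor, so b is 4.0 and the more distant cluster C does not affect the score.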

Execution Result:

```
Average Silhouette Score (manual): 0.8693434766646458
Silhouette score (sklearn): 0.8693434766646458
```


## Calculating the Average Silhouette Score

Great going, Space Voyager! We're now going to test your understanding of cluster validation. Your mission is to complete the silhouette score calculation from scratch using the helper functions provided. Remember the mathematics behind it, and bring the code to life by filling in the score calculation!

```py
from sklearn.metrics import silhouette_score
import numpy as np
from collections import defaultdict

# Calculate Euclidean distance.
def euclidean_distance(a, b):
    return np.sqrt(np.sum((np.array(a) - np.array(b)) ** 2))

# Calculate 'a(i)' for a point.
def calculate_a(point, cluster):
    if len(cluster) <= 1:
        return 0
    distances = [euclidean_distance(point, other) for other in cluster if point != other]
    return sum(distances) / (len(cluster) - 1)

# Calculate 'b(i)' for a point.
def calculate_b(point, clusters):
    min_average_distance = float('inf')
    for cluster in clusters:
        if point in cluster:
            continue
        distances = [euclidean_distance(point, other) for other in cluster]
        average_distance = sum(distances) / len(cluster)
        if average_distance < min_average_distance:
            min_average_distance = average_distance
    return min_average_distance

# Calculate silhouette score for each point and print individual scores.
def custom_silhouette_score(points, labels):
    clusters = defaultdict(list)
    for point, label in zip(points, labels):
        clusters[label].append(point)
      
    cluster_list = list(clusters.values())
  
    scores = []
    for point, label in zip(points, labels):
        a = calculate_a(point, clusters[label])
        # TODO: Calculate "b" (the mean distance to the nearest other cluster) using calculate_b and cluster_list.
        # TODO: Now compute the actual silhouette score for the data point using "a" and "b".
    # TODO: Finally, return the average silhouette score for all data points.
  
# Custom data points and their labels
points = [[5, 3], [5, 4], [6, 3], [15, 16], [16, 15], [16, 17]]
labels = [0, 0, 0, 1, 1, 1]

# TODO: Use the custom silhouette_score function to calculate the average silhouette score
# TODO: Print the computed silhouette score

```

Below is a filled-in version of the code with the silhouette score calculation added to the custom_silhouette_score function. For each point, we compute a(i) and b(i) and then derive the silhouette score as s(i) = (b(i) - a(i)) / max(a(i), b(i)). Lastly, we return the average of all the per-point scores.

```python
from sklearn.metrics import silhouette_score
import numpy as np
from collections import defaultdict

# Calculate Euclidean distance.
def euclidean_distance(a, b):
    return np.sqrt(np.sum((np.array(a) - np.array(b)) ** 2))

# Calculate 'a(i)' for a point.
def calculate_a(point, cluster):
    if len(cluster) <= 1:
        return 0
    distances = [euclidean_distance(point, other) for other in cluster if point != other]
    return sum(distances) / (len(cluster) - 1)

# Calculate 'b(i)' for a point.
def calculate_b(point, clusters):
    min_average_distance = float('inf')
    for cluster in clusters:
        # Skip if the point is in this cluster
        if point in cluster:
            continue
        distances = [euclidean_distance(point, other) for other in cluster]
        average_distance = sum(distances) / len(distances)
        if average_distance < min_average_distance:
            min_average_distance = average_distance
    return min_average_distance

# Calculate silhouette score for each point and print individual scores.
def custom_silhouette_score(points, labels):
    clusters = defaultdict(list)
    for point, label in zip(points, labels):
        clusters[label].append(point)
        
    cluster_list = list(clusters.values())
    
    scores = []
    for point, label in zip(points, labels):
        a = calculate_a(point, clusters[label])
        b = calculate_b(point, cluster_list)
        
        # Silhouette score: s(i) = (b - a) / max(a, b)
        if max(a, b) > 0:
            s = (b - a) / max(a, b)
        else:
            s = 0
        
        scores.append(s)
        print(f"Point {point}, Silhouette Score: {s}")
    
    # Return the average silhouette score
    return sum(scores) / len(scores)

# Custom data points and their labels
points = [[5, 3], [5, 4], [6, 3], [15, 16], [16, 15], [16, 17]]
labels = [0, 0, 0, 1, 1, 1]

# Use the custom silhouette_score function to calculate the average silhouette score
avg_score = custom_silhouette_score(points, labels)
print(f"Average Silhouette Score (Custom): {avg_score}")
```

Explanation:
1. We group points into clusters based on their labels (using a defaultdict of lists).
2. For each point, we compute:
   • a(i) (intra-cluster distance) via calculate_a(point, clusters[label]).  
   • b(i) (inter-cluster distance) via calculate_b(point, cluster_list).  
3. We then derive the silhouette score with the standard formula s(i) = (b - a) / max(a, b).  
4. Finally, we average these silhouette scores over all points to get the overall measure and print out the result.

Execution Result:

```
Point [5, 3], Silhouette Score: 0.9405758601779645
Point [5, 4], Silhouette Score: 0.9248785898591119
Point [6, 3], Silhouette Score: 0.9255432134241226
Point [15, 16], Silhouette Score: 0.9113032595495214
Point [16, 15], Silhouette Score: 0.8920820117592949
Point [16, 17], Silhouette Score: 0.9015859926632127
Average Silhouette Score (Custom): 0.9159948212388714
```
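
The first printed value can be checked by hand. The sketch below recomputes a, b, and s for the point [5, 3] using only the standard library, with the cluster memberships taken from the exercise's labels.

```python
import math

# Point [5, 3] from the exercise, its cluster mates, and the other cluster.
point = (5, 3)
same_cluster = [(5, 4), (6, 3)]
other_cluster = [(15, 16), (16, 15), (16, 17)]

# a: mean distance to the other members of the point's own cluster.
a = sum(math.dist(point, q) for q in same_cluster) / len(same_cluster)

# b: mean distance to the members of the only other cluster.
b = sum(math.dist(point, q) for q in other_cluster) / len(other_cluster)

s = (b - a) / max(a, b)
print(f"a = {a:.4f}, b = {b:.4f}, s = {s:.4f}")
```

This prints a = 1.0000, b = 16.8282, s = 0.9406, in agreement with the first line of the execution result.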


## Silhouette Score: Write the Code from Scratch

Great job on learning cluster validation and the silhouette score, Space Voyager! Let's practice using sklearn's powerful tools for cluster validation. In this exercise, you'll need to calculate the silhouette score using the silhouette_score function from sklearn.metrics. You're provided with a predefined dataset and the corresponding cluster labels.

```py
from sklearn.metrics import silhouette_score

# Sample data points and their labels
points = [[1, 2], [1, 3], [2, 2], [6, 7], [6, 8], [7, 7]]
labels = [0, 0, 0, 1, 1, 1]

# TODO: Calculate the silhouette score using the silhouette_score function from sklearn.metrics

# TODO: Print the silhouette score

```

Below is the completed code. It passes the points and their labels directly to sklearn's silhouette_score with the Euclidean metric and prints the resulting average score.

```python
from sklearn.metrics import silhouette_score

# Sample data points and their labels
points = [[1, 2], [1, 3], [2, 2], [6, 7], [6, 8], [7, 7]]
labels = [0, 0, 0, 1, 1, 1]

# Calculate the silhouette score using sklearn's silhouette_score
score = silhouette_score(points, labels, metric='euclidean')

# Print the silhouette score
print("Silhouette Score (sklearn):", score)
```

Execution Result:

```
Silhouette Score (sklearn): 0.8398163312586742
```
