import numpy as np

# PSO Parameters
NUM_POINTS = 50  # Number of data points
NUM_CLUSTERS = 3  # Number of clusters
NUM_PARTICLES = 30
NUM_ITERATIONS = 100
INERTIA_WEIGHT = 0.5
COGNITIVE_CONSTANT = 1.5
SOCIAL_CONSTANT = 1.5

# Randomly generate data points
data_points = np.random.rand(NUM_POINTS, 2)  # 2D data points

# Objective function: sum of Euclidean distances from each point to its nearest centroid
def objective_function(centroids):
    total_distance = 0
    for point in data_points:
        distances = [np.linalg.norm(point - centroid) for centroid in centroids]
        total_distance += min(distances)
    return total_distance

# Initialize particles (each particle represents a set of cluster centroids)
particles = [np.random.rand(NUM_CLUSTERS, 2) for _ in range(NUM_PARTICLES)]
velocities = [np.random.uniform(-1, 1, (NUM_CLUSTERS, 2)) for _ in range(NUM_PARTICLES)]

# Personal and global best positions and scores
personal_best_positions = np.copy(particles)
personal_best_scores = np.array([objective_function(p) for p in particles])

global_best_position = personal_best_positions[np.argmin(personal_best_scores)]
global_best_score = np.min(personal_best_scores)

def particle_swarm_clustering():
    global global_best_position, global_best_score

    for iteration in range(NUM_ITERATIONS):
        for i in range(NUM_PARTICLES):
            velocities[i] = (INERTIA_WEIGHT * velocities[i] +
                             COGNITIVE_CONSTANT * np.random.random() * (personal_best_positions[i] - particles[i]) +
                             SOCIAL_CONSTANT * np.random.random() * (global_best_position - particles[i]))

            particles[i] += velocities[i]

            current_score = objective_function(particles[i])
            if current_score < personal_best_scores[i]:
                personal_best_positions[i] = particles[i]
                personal_best_scores[i] = current_score

                if current_score < global_best_score:
                    global_best_position = particles[i].copy()  # copy: particles[i] is later modified in place
                    global_best_score = current_score

    return global_best_position, global_best_score

# Run the PSO clustering
best_centroids, best_score = particle_swarm_clustering()
print(f"Best Cluster Centroids: {best_centroids}, Best Score: {best_score}")

Best Cluster Centroids: [[0.59754598 0.17170333]
 [0.1825933  0.71918648]
 [0.68880233 0.66659985]], Best Score: 10.678539870863613


This code implements **Particle Swarm Optimization (PSO)** to solve a clustering problem akin to **k-means clustering**: it searches for centroids that minimize the sum of Euclidean distances between each data point and its closest centroid (k-means minimizes the sum of *squared* distances instead). Let's break it down step by step.

---

## **Code Walkthrough**

### 1. **PSO Parameters**

```python
NUM_POINTS = 50           # Number of data points to cluster
NUM_CLUSTERS = 3          # Number of clusters (k)
NUM_PARTICLES = 30        # Number of particles in the swarm
NUM_ITERATIONS = 100      # Number of iterations
INERTIA_WEIGHT = 0.5      # Inertia weight (ω) to control the velocity
COGNITIVE_CONSTANT = 1.5  # Cognitive component (φ1) for personal best influence
SOCIAL_CONSTANT = 1.5     # Social component (φ2) for global best influence
```

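These parameters set the problem size (`NUM_POINTS`, `NUM_CLUSTERS`), the search effort (`NUM_PARTICLES`, `NUM_ITERATIONS`), and how strongly each particle is driven by its previous velocity, its own best position, and the swarm's best position.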
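
For reference, the three coefficients enter the update rule used later in `particle_swarm_clustering` as written below. The symbols are notation introduced here, not names from the code: $x_i$ is particle $i$'s position, $v_i$ its velocity, $p_i$ its personal best, $g$ the global best, and $r_1, r_2$ are scalars drawn uniformly from $[0, 1)$ at every update.

$$
v_i \leftarrow \omega\, v_i + \varphi_1\, r_1\, (p_i - x_i) + \varphi_2\, r_2\, (g - x_i),
\qquad
x_i \leftarrow x_i + v_i
$$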

---

### 2. **Generate Random Data Points**

```python
data_points = np.random.rand(NUM_POINTS, 2)  # Generate 2D data points
```

- Creates `NUM_POINTS` random data points in a 2D space with coordinates between 0 and 1.
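
Because the points are drawn without a fixed seed, every run clusters a different dataset. A minimal sketch of a reproducible alternative, assuming NumPy's `default_rng` generator is acceptable in place of `np.random.rand` (the `SEED` value is an arbitrary choice for illustration):

```python
import numpy as np

NUM_POINTS = 50
SEED = 42  # hypothetical value; any fixed integer makes the run repeatable

rng = np.random.default_rng(SEED)
data_points = rng.random((NUM_POINTS, 2))  # uniform samples in [0, 1)

# Re-creating the generator with the same seed reproduces the same points.
same_points = np.random.default_rng(SEED).random((NUM_POINTS, 2))
print(np.array_equal(data_points, same_points))  # True
```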

---

### 3. **Objective Function**

```python
def objective_function(centroids):
    total_distance = 0
    for point in data_points:
        distances = [np.linalg.norm(point - centroid) for centroid in centroids]
        total_distance += min(distances)  # Sum the minimum distance to the closest centroid
    return total_distance
```

- This function sums, over all data points, the Euclidean distance to the nearest cluster centroid.
- The goal is to **minimize** this total. It resembles the k-means objective, which sums *squared* distances instead.
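
The same quantity can be computed without Python loops by using broadcasting. The sketch below is one possible formulation, not the notebook's code: both function names are hypothetical, and the data is passed as an argument rather than read from a global. The second function shows the squared variant that k-means minimizes, for comparison.

```python
import numpy as np

def objective_vectorized(centroids, data_points):
    """Sum of Euclidean distances from each point to its nearest centroid."""
    # Broadcasting yields shape (num_points, num_clusters, num_dims).
    diffs = data_points[:, np.newaxis, :] - centroids[np.newaxis, :, :]
    distances = np.linalg.norm(diffs, axis=2)  # (num_points, num_clusters)
    return distances.min(axis=1).sum()

def objective_squared(centroids, data_points):
    """Sum of squared distances to the nearest centroid (k-means objective)."""
    diffs = data_points[:, np.newaxis, :] - centroids[np.newaxis, :, :]
    squared = (diffs ** 2).sum(axis=2)
    return squared.min(axis=1).sum()

points = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 2.0]])
centroids = np.array([[0.0, 0.0], [1.0, 0.0]])
print(objective_vectorized(centroids, points))  # 2.0
print(objective_squared(centroids, points))     # 4.0
```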

---

### 4. **Initialize Particles and Velocities**

```python
particles = [np.random.rand(NUM_CLUSTERS, 2) for _ in range(NUM_PARTICLES)]
velocities = [np.random.uniform(-1, 1, (NUM_CLUSTERS, 2)) for _ in range(NUM_PARTICLES)]
```

- **Particles**: Each particle represents a candidate solution (a set of `NUM_CLUSTERS` centroids).
- **Velocities**: Each particle's velocity is initialized with random values between -1 and 1 for each dimension.

---

### 5. **Personal and Global Best**

```python
personal_best_positions = np.copy(particles)
personal_best_scores = np.array([objective_function(p) for p in particles])

global_best_position = personal_best_positions[np.argmin(personal_best_scores)]
global_best_score = np.min(personal_best_scores)
```

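- **Personal Best**: The best solution each particle has encountered so far. `np.copy(particles)` converts the list of particles into a single array of shape `(NUM_PARTICLES, NUM_CLUSTERS, 2)`.
- **Global Best**: The best solution encountered by the entire swarm, initialized to the personal best with the lowest score.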
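
Because particle positions are updated in place with `+=`, a stored best position has to be an independent copy rather than a second name for the same array. A small illustration of the difference (the variable names are made up for this example):

```python
import numpy as np

position = np.array([[0.2, 0.4], [0.6, 0.8]])

best_alias = position        # second name for the same array, no copy made
best_copy = position.copy()  # independent snapshot

position += 0.1              # in-place update, like `particles[i] += velocities[i]`

print(best_alias is position)                # True: the "best" moved with the particle
print(np.array_equal(best_copy, position))   # False: the snapshot kept the old values
```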

---

### 6. **PSO Clustering Function**

```python
def particle_swarm_clustering():
    global global_best_position, global_best_score

    for iteration in range(NUM_ITERATIONS):
        for i in range(NUM_PARTICLES):
            # Update velocity
            velocities[i] = (INERTIA_WEIGHT * velocities[i] +
                             COGNITIVE_CONSTANT * np.random.random() * (personal_best_positions[i] - particles[i]) +
                             SOCIAL_CONSTANT * np.random.random() * (global_best_position - particles[i]))

            # Update particle position
            particles[i] += velocities[i]

            # Evaluate the new position
            current_score = objective_function(particles[i])

            # Update personal best
            if current_score < personal_best_scores[i]:
                personal_best_positions[i] = particles[i]
                personal_best_scores[i] = current_score

                # Update global best
                if current_score < global_best_score:
                    global_best_position = particles[i].copy()  # copy: particles[i] is later modified in place
                    global_best_score = current_score

    return global_best_position, global_best_score
```

#### **Key Steps:**

1. **Update Velocities**:  
   The velocity is updated using:
   - The inertia term (`INERTIA_WEIGHT * velocities[i]`).
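   - The cognitive term, which pulls the particle toward its own personal best, scaled by `COGNITIVE_CONSTANT` and a random scalar in [0, 1).
   - The social term, which pulls the particle toward the swarm's global best, scaled by `SOCIAL_CONSTANT` and a second random scalar in [0, 1).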

2. **Update Positions**:  
   Add the updated velocity to each particle's position.

3. **Evaluate Fitness**:  
   Calculate the current score (objective function) for each particle.

4. **Update Bests**:  
   - If the current position is better than the particle's personal best, update it.
   - If the current position is better than the global best, update it.
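
The walkthrough's code draws one random scalar per term, so every coordinate of a particle is scaled by the same factor. A common PSO variant, which is not what the code above does, draws an independent random factor for each coordinate. A hedged sketch of that variant, with a hypothetical function name and signature:

```python
import numpy as np

def update_velocity(velocity, position, personal_best, global_best,
                    rng, w=0.5, c1=1.5, c2=1.5):
    """Velocity update with one random factor per coordinate (assumed variant)."""
    r1 = rng.random(position.shape)  # same shape as the particle
    r2 = rng.random(position.shape)
    return (w * velocity
            + c1 * r1 * (personal_best - position)
            + c2 * r2 * (global_best - position))

rng = np.random.default_rng(0)
position = np.ones((3, 2))
velocity = np.full((3, 2), 0.2)

# When the particle already sits on both bests, only the inertia term remains.
new_velocity = update_velocity(velocity, position, position, position, rng)
print(new_velocity)  # every entry equals 0.5 * 0.2 = 0.1
```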

---

### 7. **Run the PSO Clustering**

```python
best_centroids, best_score = particle_swarm_clustering()
print(f"Best Cluster Centroids: {best_centroids}, Best Score: {best_score}")
```

- Runs the PSO clustering function and prints the final cluster centroids and the best score (the sum of distances from each point to its nearest centroid).
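
The function returns centroids only. To obtain an actual clustering, each point still has to be assigned to its nearest centroid. One way to do that, using a hypothetical helper `assign_clusters` and toy inputs chosen for illustration:

```python
import numpy as np

def assign_clusters(data_points, centroids):
    """Return, for each point, the index of its nearest centroid."""
    diffs = data_points[:, np.newaxis, :] - centroids[np.newaxis, :, :]
    distances = np.linalg.norm(diffs, axis=2)  # (num_points, num_clusters)
    return distances.argmin(axis=1)

points = np.array([[0.1, 0.1], [0.9, 0.9], [0.2, 0.0]])
centroids = np.array([[0.0, 0.0], [1.0, 1.0]])

labels = assign_clusters(points, centroids)
print(labels)                                         # [0 1 0]
print(np.bincount(labels, minlength=len(centroids)))  # [2 1] points per cluster
```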

---

## **Sample Output**

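One run produced the output below. The values differ from run to run because neither the data nor the swarm is seeded.

```plaintext
Best Cluster Centroids: [[0.59754598 0.17170333]
 [0.1825933  0.71918648]
 [0.68880233 0.66659985]], Best Score: 10.678539870863613
```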

---

## **Considerations and Enhancements**

1. **Visualization**:  
   Plot the data points and the final centroids to visualize the clustering result:

   ```python
   import matplotlib.pyplot as plt

   plt.scatter(data_points[:, 0], data_points[:, 1], c='blue', label='Data Points')
   plt.scatter(best_centroids[:, 0], best_centroids[:, 1], c='red', marker='X', s=100, label='Centroids')
   plt.legend()
   plt.title("PSO Clustering Result")
   plt.xlabel("X Coordinate")
   plt.ylabel("Y Coordinate")
   plt.show()
   ```

2. **Parameter Tuning**:  
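   Experiment with `NUM_PARTICLES`, `NUM_ITERATIONS`, `INERTIA_WEIGHT`, `COGNITIVE_CONSTANT`, and `SOCIAL_CONSTANT`. A larger inertia weight favors exploration, while larger cognitive and social constants pull particles more strongly toward positions already known to be good.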

3. **Constraints**:  
   To keep centroids within the data bounds (e.g., between 0 and 1), consider clipping each particle immediately after its position update:

   ```python
   particles[i] = np.clip(particles[i], 0, 1)
   ```

4. **Performance**:  
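   For larger datasets or higher-dimensional data, increasing the number of particles and iterations may improve results, at the cost of proportionally more objective evaluations (roughly `NUM_PARTICLES * NUM_ITERATIONS` in total).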
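
Several of these suggestions can be combined in one function. The following is a sketch under stated assumptions rather than a reference implementation: `pso_cluster` and its parameter names are hypothetical, the random generator is seeded, positions are clipped to the bounding box of the data, the objective is vectorized, and best positions are stored as copies.

```python
import numpy as np

def pso_cluster(data, num_clusters=3, num_particles=30, num_iterations=100,
                w=0.5, c1=1.5, c2=1.5, seed=0):
    """PSO search for centroids minimizing the summed nearest-centroid distance."""
    rng = np.random.default_rng(seed)
    low, high = data.min(axis=0), data.max(axis=0)  # bounding box of the data
    shape = (num_particles, num_clusters, data.shape[1])

    def score(centroids):
        diffs = data[:, None, :] - centroids[None, :, :]
        return np.linalg.norm(diffs, axis=2).min(axis=1).sum()

    positions = low + rng.random(shape) * (high - low)
    velocities = rng.uniform(-1, 1, shape)

    pbest = positions.copy()
    pbest_scores = np.array([score(p) for p in positions])
    g = pbest_scores.argmin()
    gbest, gbest_score = pbest[g].copy(), pbest_scores[g]

    for _ in range(num_iterations):
        for i in range(num_particles):
            r1, r2 = rng.random(), rng.random()  # scalar factors, as in the walkthrough
            velocities[i] = (w * velocities[i]
                             + c1 * r1 * (pbest[i] - positions[i])
                             + c2 * r2 * (gbest - positions[i]))
            # Move, then keep every centroid inside the data's bounding box.
            positions[i] = np.clip(positions[i] + velocities[i], low, high)

            s = score(positions[i])
            if s < pbest_scores[i]:
                pbest[i] = positions[i]      # assignment into the array copies values
                pbest_scores[i] = s
                if s < gbest_score:
                    gbest, gbest_score = positions[i].copy(), s

    return gbest, gbest_score

data = np.random.default_rng(1).random((50, 2))
centroids, best = pso_cluster(data)
print(centroids.shape, best)
```

With a fixed `seed` and fixed data, repeated calls return the same centroids, which makes parameter comparisons easier to interpret.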