### Aim: K-Means & K-Medoids Clustering on the Iris Dataset

#### Dataset: Iris
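The Iris dataset contains four measurements (sepal length, sepal width, petal length, and petal width, all in cm) for 150 iris flowers, 50 from each of three species (setosa, versicolor, virginica).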

#### Objective
We apply K-Means and K-Medoids clustering to group the flowers by their measurements. Both methods are unsupervised, so the species labels are not used during fitting. The labels serve only afterwards, to evaluate how well the clusters match the actual species.

In [None]:
! pip install scikit-learn
! pip install pandas
! pip install matplotlib
! pip install scikit-learn-extra

#### Import the Iris dataset from the `sklearn.datasets` module
```
from sklearn.datasets import load_iris
iris = load_iris()
```
#### Create a DataFrame and a feature matrix
```
X = iris.data  # feature matrix, shape (150, 4)
```

In [None]:
from sklearn.datasets import load_iris
import pandas as pd

iris = load_iris()
data = pd.DataFrame(iris.data, columns=iris.feature_names)
X = iris.data
data.head()  # preview the first five of the 150 rows
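
As a quick sanity check on the dataset description above, the following sketch prints the shape of the feature matrix, the feature names, and the number of samples per species. The variable name `class_counts` is introduced here for illustration only.

```python
import numpy as np
from sklearn.datasets import load_iris

iris = load_iris()

# Feature matrix: one row per flower, one column per measurement
print(iris.data.shape)  # (150, 4)
print(iris.feature_names)

# Number of samples for each species label (0, 1, 2)
class_counts = np.bincount(iris.target)
for name, count in zip(iris.target_names, class_counts):
    print(f"{name}: {count}")
```

The species labels in `iris.target` are inspected here only to describe the data; they play no part in fitting the clustering models.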

In [None]:
import matplotlib.pyplot as plt

# Plot the first two features; cmap is omitted because no color values (c) are passed
plt.scatter(X[:, 0], X[:, 1], edgecolors='k')
plt.title('Visualization of Iris Dataset (2D)')
plt.xlabel('Sepal length (cm)')
plt.ylabel('Sepal width (cm)')
plt.show()

In [None]:
from sklearn.cluster import KMeans
from sklearn_extra.cluster import KMedoids

# Iris contains three species, so we look for three clusters
k = 3
kmeans = KMeans(n_clusters=k, n_init=10, random_state=42)
kmeans.fit(X)

In [None]:
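# Cluster index (0 to k-1) assigned to each sample
cluster_assignments = kmeans.labels_
# Centroid coordinates, one row per cluster and one column per feature
cluster_centers = kmeans.cluster_centers_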

In [None]:
plt.scatter(X[:, 0], X[:, 1], c=cluster_assignments, cmap='viridis', edgecolors='k')
plt.scatter(cluster_centers[:, 0], cluster_centers[:, 1], c='red', marker='X', s=200, label='Cluster Centers')
plt.title('K-means Clustering of Iris Dataset (2D)')
plt.xlabel('Sepal length (cm)')
plt.ylabel('Sepal width (cm)')
plt.legend()
plt.show()

#### Extra: Scatter Plot Parameters

| Parameter                   | Description                                          |
|-----------------------------|------------------------------------------------------|
| `X[:, 0], X[:, 1]`          | Coordinates of data points in the scatter plot.      |
| `c=cluster_assignments`     | Colors of data points based on cluster assignments. |
| `cmap='viridis'`             | Colormap used for coloring the data points.          |
| `edgecolors='k'`             | Color of the edges of the markers in the plot.       |
| `cluster_centers[:, 0], cluster_centers[:, 1]` | Coordinates of cluster centers.       |
| `c='red'`                    | Color of the markers representing cluster centers.   |
| `marker='X'`                 | Marker style for cluster centers (X for cross).      |
| `s=200`                      | Size of the markers for cluster centers.              |
| `label='Cluster Centers'`   | Label for the legend indicating cluster centers.     |
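
The objective mentions evaluating the clusters against the actual species labels. One possible way to do this, sketched below, is the adjusted Rand index (ARI) together with a contingency table. The sketch assumes three clusters (one per species) and refits its own K-Means model so that it runs on its own; the names `labels`, `ari`, and `table` are illustrative. Cluster indices are arbitrary, so a direct accuracy comparison with `iris.target` would not be meaningful, whereas ARI is unaffected by how the clusters are numbered.

```python
import pandas as pd
from sklearn.cluster import KMeans
from sklearn.datasets import load_iris
from sklearn.metrics import adjusted_rand_score

iris = load_iris()
X, y = iris.data, iris.target

# Assumption: three clusters, matching the three species
labels = KMeans(n_clusters=3, n_init=10, random_state=42).fit_predict(X)

# ARI is 1.0 for perfect agreement and close to 0 for random labelings
ari = adjusted_rand_score(y, labels)
print(f"Adjusted Rand index: {ari:.3f}")

# Rows are true species, columns are cluster indices
table = pd.crosstab(
    pd.Series(iris.target_names[y], name="species"),
    pd.Series(labels, name="cluster"),
)
print(table)
```

The same two calls can be applied to any other label array of length 150, such as the `labels_` attribute of a fitted K-Medoids model.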


In [None]:
kmedoids = KMedoids(n_clusters=k, random_state=42)
kmedoids.fit(X)

# Get medoid assignments and medoid centers
medoid_assignments = kmedoids.labels_
medoid_centers = X[kmedoids.medoid_indices_, :]

# Visualize the data in 2D (for simplicity, considering the first two features)
plt.scatter(X[:, 0], X[:, 1], c=medoid_assignments, cmap='viridis', edgecolors='k')
plt.scatter(medoid_centers[:, 0], medoid_centers[:, 1], c='red', marker='X', s=200, label='Medoids')
plt.title('K-medoids Clustering of Iris Dataset (2D)')
plt.xlabel('Sepal length (cm)')
plt.ylabel('Sepal width (cm)')
plt.legend()
plt.show()
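
The two plots mark different kinds of cluster centers. A K-Means center is the mean of the cluster's points and need not coincide with any sample, whereas a medoid is always one of the samples, which is why the medoids above can be looked up by row index in `X`. The minimal sketch below illustrates the difference on a small made-up cluster. The point values are hypothetical (not taken from Iris), and Euclidean distance is assumed.

```python
import numpy as np

# Toy cluster with hypothetical values; the last point is an outlier
points = np.array([[1.0, 1.0], [2.0, 2.0], [3.0, 1.0], [8.0, 8.0]])

# Centroid: coordinate-wise mean of the cluster members
centroid = points.mean(axis=0)

# Medoid: the member with the smallest total distance to all other members
pairwise = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=2)
medoid_index = int(pairwise.sum(axis=1).argmin())
medoid = points[medoid_index]

print("centroid:", centroid)
print("medoid:", medoid, "(row", medoid_index, ")")
```

In this toy cluster the outlier pulls the centroid away from the three nearby points, while the medoid remains one of the actual samples.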