# K-means Clustering on the Iris Dataset

This notebook demonstrates how to perform K-means clustering on the classic Iris dataset using Python, scikit-learn, and Google Colab.  
K-means is an unsupervised machine learning algorithm that partitions data into k clusters by repeatedly assigning each sample to its nearest cluster centroid and then recomputing the centroids.
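
To make the idea concrete, here is a minimal pure-Python sketch of the usual assign-and-update loop (often called Lloyd's algorithm). It is only an illustration: the function name `kmeans_sketch`, its parameters, and the toy points are hypothetical, and scikit-learn's `KMeans` uses a more careful initialisation and a much faster implementation.

```python
import math
import random


def kmeans_sketch(points, k, n_iters=100, seed=0):
    """Toy K-means: returns (centroids, labels) for a list of numeric tuples."""
    rng = random.Random(seed)
    # Initialise centroids by sampling k distinct points (a simple choice,
    # not the k-means++ scheme that scikit-learn uses by default).
    centroids = [list(p) for p in rng.sample(points, k)]
    labels = [0] * len(points)
    for _ in range(n_iters):
        # Assignment step: each point joins its nearest centroid.
        new_labels = [
            min(range(k), key=lambda j: math.dist(p, centroids[j]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, new_labels) if lab == j]
            if members:  # keep the old centroid if a cluster ends up empty
                centroids[j] = [sum(c) / len(members) for c in zip(*members)]
        # Stop once the assignments no longer change.
        if new_labels == labels:
            break
        labels = new_labels
    return centroids, labels


# Hypothetical toy data: two well-separated groups of three points each.
toy = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1), (5.0, 5.0), (5.1, 4.9), (4.8, 5.2)]
centroids, labels = kmeans_sketch(toy, k=2)
print(labels)
```

The cluster ids themselves are arbitrary; what matters is which points share an id.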

**In this notebook, you will:**
- Load the Iris dataset directly from the UCI Machine Learning Repository
- Apply K-means clustering to group the samples into three clusters
- Visualize the resulting clusters on a scatter plot of sepal length against sepal width

No local setup or manual downloads are required. The dataset is fetched over the network, so with an internet connection you can simply run each cell in order to see the results.

In [None]:
# Step 1: Import libraries
import pandas as pd
from sklearn.cluster import KMeans
import matplotlib.pyplot as plt

# Step 2: Load the Iris dataset from the UCI Machine Learning Repository.
# The file has no header row, so column names are supplied explicitly.
url = "https://archive.ics.uci.edu/ml/machine-learning-databases/iris/iris.data"
cols = ['sepal_length', 'sepal_width', 'petal_length', 'petal_width', 'species']
df = pd.read_csv(url, header=None, names=cols)

# Step 3: Prepare data (drop species column for clustering)
X = df.drop('species', axis=1)

# Step 4: Run K-means clustering
# n_init is set explicitly because its default differs across scikit-learn versions.
kmeans = KMeans(n_clusters=3, n_init=10, random_state=42)
df['cluster'] = kmeans.fit_predict(X)

# Step 5: Show results
print(df.head())

# Step 6: Visualize clusters (plotting only the first two of the four features used for clustering)
plt.scatter(df['sepal_length'], df['sepal_width'], c=df['cluster'])
plt.xlabel('Sepal Length')
plt.ylabel('Sepal Width')
plt.title('K-means Clusters on Iris Dataset')
plt.show()
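
Because the `species` column is dropped before clustering, one way to sanity-check the result is to cross-tabulate the cluster ids against the known species. The sketch below is an optional follow-up, not part of the original steps. As an assumption, it loads scikit-learn's bundled copy of Iris through `load_iris` so that it runs offline; the names `X_check`, `labels_check`, and `table` are illustrative.

```python
import pandas as pd
from sklearn.cluster import KMeans
from sklearn.datasets import load_iris

# Assumption: use the copy of Iris bundled with scikit-learn instead of the UCI URL.
iris = load_iris(as_frame=True)
X_check = iris.data
# Map the integer targets (0, 1, 2) to their species names.
species = iris.target.map(dict(enumerate(iris.target_names)))

labels_check = KMeans(n_clusters=3, n_init=10, random_state=42).fit_predict(X_check)

# Rows are species, columns are cluster ids (the ids themselves are arbitrary).
table = pd.crosstab(species, labels_check, rownames=['species'], colnames=['cluster'])
print(table)
```

A species that falls almost entirely into one column was recovered well by the clustering. On Iris, setosa typically separates cleanly, while versicolor and virginica tend to overlap.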
