## Assignment 8 (Classification and Clustering)

### 1. Import Iris dataset

### 2. Apply Classification Algorithms: Naive Bayes and K-Nearest Neighbors (K-NN)

### 3. Apply Clustering Algorithm: K-Means

### Code:

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.cluster import KMeans
from sklearn.metrics import accuracy_score, silhouette_score

def evaluate_classifiers_and_clustering():
    # Load the Iris dataset (150 samples, 4 features, 3 species)
    iris = load_iris()
    X = iris.data
    y = iris.target

    # Hold out 30% of the samples for testing
    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)

    # 1. Naive Bayes classifier
    nb_classifier = GaussianNB()
    nb_classifier.fit(X_train, y_train)
    nb_predictions = nb_classifier.predict(X_test)
    nb_accuracy = accuracy_score(y_test, nb_predictions)

    # 2. K-Nearest Neighbors classifier
    knn_classifier = KNeighborsClassifier(n_neighbors=3)
    knn_classifier.fit(X_train, y_train)
    knn_predictions = knn_classifier.predict(X_test)
    knn_accuracy = accuracy_score(y_test, knn_predictions)

    # 3. K-Means clustering on all samples (labels are not used)
    kmeans = KMeans(n_clusters=3, random_state=42)
    kmeans.fit(X)
    kmeans_labels = kmeans.labels_

    # Silhouette score for the 3-cluster solution
    silhouette_avg = silhouette_score(X, kmeans_labels)

    # Print results
    print("Classification Results:")
    print(f"Naive Bayes Accuracy: {nb_accuracy:.4f}")
    print(f"K-NN Accuracy: {knn_accuracy:.4f}")

    print("\nClustering Results:")
    print(f"K-Means Silhouette Score: {silhouette_avg:.4f}")

    # Additional analysis: compare silhouette scores for K = 2..5
    silhouette_scores = []
    K = range(2, 6)

    for k in K:
        kmeans = KMeans(n_clusters=k, random_state=42)
        kmeans.fit(X)
        silhouette_scores.append(silhouette_score(X, kmeans.labels_))

    print("\nK-Means Silhouette Scores for different K:")
    for k, score in zip(K, silhouette_scores):
        print(f"K={k}: {score:.4f}")

if __name__ == "__main__":
    evaluate_classifiers_and_clustering()
```

### Output:

```text
Classification Results:
Naive Bayes Accuracy: 0.9778
K-NN Accuracy: 1.0000

Clustering Results:
K-Means Silhouette Score: 0.5512

K-Means Silhouette Scores for different K:
K=2: 0.6810
K=3: 0.5512
K=4: 0.4976
K=5: 0.4931
```
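
The silhouette scores reported above are averages of a per-sample value `s = (b - a) / max(a, b)`, where `a` is the mean distance from a sample to the other members of its own cluster and `b` is the smallest mean distance to the members of any other cluster. Values near 1 indicate compact, well-separated clusters, and negative values indicate samples that sit closer to another cluster. The following is a minimal pure-Python sketch of that calculation, assuming Euclidean distance; the function name `silhouette_score_simple` and the four toy points are made up for illustration and are not part of the assignment code or of scikit-learn.

```python
import math

def silhouette_score_simple(points, labels):
    """Mean silhouette coefficient with Euclidean distance (illustrative sketch)."""
    # Group sample indices by cluster label
    clusters = {}
    for idx, label in enumerate(labels):
        clusters.setdefault(label, []).append(idx)
    if len(clusters) < 2:
        raise ValueError("silhouette needs at least 2 clusters")

    scores = []
    for i, label in enumerate(labels):
        own = [j for j in clusters[label] if j != i]
        if not own:
            # A sample alone in its cluster gets a score of 0 by convention
            scores.append(0.0)
            continue
        # a: mean distance to the other members of the same cluster
        a = sum(math.dist(points[i], points[j]) for j in own) / len(own)
        # b: smallest mean distance to the members of any other cluster
        b = min(
            sum(math.dist(points[i], points[j]) for j in members) / len(members)
            for other, members in clusters.items()
            if other != label
        )
        denom = max(a, b)
        scores.append((b - a) / denom if denom > 0 else 0.0)
    return sum(scores) / len(scores)

# Hypothetical toy data: two tight pairs of points, far apart
toy_points = [(0, 0), (0, 1), (10, 0), (10, 1)]
good = silhouette_score_simple(toy_points, [0, 0, 1, 1])  # pairs grouped correctly
bad = silhouette_score_simple(toy_points, [0, 1, 0, 1])   # pairs split across clusters
print(f"good labelling: {good:.4f}")
print(f"bad labelling:  {bad:.4f}")
```

The correct grouping scores close to 1 and the deliberately wrong grouping scores below 0, which is the same scale on which the K=2 to K=5 results are compared.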


### Explanation:
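1. **Naive Bayes** (`GaussianNB`) and **K-NN** (`KNeighborsClassifier` with `n_neighbors=3`) are trained on 70% of the Iris dataset and evaluated on the remaining 30% (45 samples). Naive Bayes reaches a test accuracy of 0.9778 (44 of 45 correct) and K-NN reaches 1.0000 (45 of 45 correct).
2. **K-Means** clustering groups all 150 samples into 3 clusters without using the species labels; `n_clusters=3` is chosen to match the number of Iris species. The silhouette score for K=3 is 0.5512. Among K=2 to K=5, K=2 scores highest (0.6810), which suggests that two of the species overlap in feature space, so the clusters should not be assumed to match the species exactly.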
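
Since K-Means never sees the species labels, one way to check how closely the clusters follow the species is to count, for each cluster, how many samples of each true class it contains. The sketch below does this with the standard library only. The helper names `contingency_table` and `purity` and the toy label lists are hypothetical; the printed numbers describe the toy data, not the Iris run.

```python
from collections import Counter

def contingency_table(true_labels, cluster_labels):
    """Count how many samples of each true class land in each cluster."""
    counts = Counter(zip(cluster_labels, true_labels))
    clusters = sorted(set(cluster_labels))
    classes = sorted(set(true_labels))
    return {c: {t: counts[(c, t)] for t in classes} for c in clusters}

def purity(true_labels, cluster_labels):
    """Fraction of samples that belong to the majority class of their cluster."""
    table = contingency_table(true_labels, cluster_labels)
    majority_total = sum(max(row.values()) for row in table.values())
    return majority_total / len(true_labels)

# Hypothetical toy labels for illustration only (not Iris results).
# Cluster ids are arbitrary, so cluster 1 holding class 0 is not an error.
toy_species = [0, 0, 0, 1, 1, 1, 2, 2, 2]
toy_clusters = [1, 1, 1, 0, 0, 2, 2, 2, 2]

table = contingency_table(toy_species, toy_clusters)
for cluster_id, row in table.items():
    print(f"cluster {cluster_id}: {row}")
toy_purity = purity(toy_species, toy_clusters)
print(f"purity: {toy_purity:.4f}")
```

To apply this to the assignment, the function above would need to return or expose `kmeans_labels`, after which `contingency_table(iris.target.tolist(), kmeans_labels.tolist())` gives the per-cluster species counts. scikit-learn also provides `sklearn.metrics.adjusted_rand_score` and `sklearn.metrics.cluster.contingency_matrix` for the same kind of comparison.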