
# Day 10 — Introduction to Scikit-Learn

**Author:** Dhairya Patel  

This notebook covers:
1. What is Scikit-Learn?
2. Built-in Datasets
3. Train-Test Split
4. First Classifier: k-Nearest Neighbors (kNN)


In [None]:

import numpy as np
import matplotlib.pyplot as plt
from sklearn import datasets
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.metrics import accuracy_score


## What is Scikit-Learn?
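Scikit-Learn is an open-source Python library for machine learning. It provides tools for preprocessing data, training models, and evaluating them, and its estimators share a common interface: `fit` to learn from data, then `predict` or `transform` to apply what was learned.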
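
As a rough illustration of those three pieces working together, the sketch below scales the features, trains a model, and scores it on held-out data. The use of `StandardScaler`, the 25% hold-out, and `random_state=0` are assumptions made for this example, not choices this notebook prescribes.

```python
from sklearn.datasets import load_iris
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.preprocessing import StandardScaler

X, y = load_iris(return_X_y=True)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25, random_state=0)

# Preprocessing: learn mean and std on the training data only,
# then apply the same scaling to both sets.
scaler = StandardScaler().fit(X_tr)
X_tr_scaled = scaler.transform(X_tr)
X_te_scaled = scaler.transform(X_te)

# Model training: every estimator exposes fit().
model = KNeighborsClassifier(n_neighbors=3).fit(X_tr_scaled, y_tr)

# Evaluation: compare predictions with the true held-out labels.
acc = accuracy_score(y_te, model.predict(X_te_scaled))
print(f"Held-out accuracy: {acc:.3f}")
```

Fitting the scaler on the training rows alone keeps information about the test rows out of the preprocessing step.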

## Loading a Built-in Dataset (Iris)

In [None]:

iris = datasets.load_iris()
X, y = iris.data, iris.target

print("Features shape:", X.shape)
print("Target shape:", y.shape)
print("Feature names:", iris.feature_names)
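
Beyond the shapes, it can help to look at what the labels mean and how the samples are spread across classes. The following is a small sketch using `iris.target_names` and NumPy's `np.bincount`; the variable name `class_counts` is introduced here purely for illustration.

```python
import numpy as np
from sklearn import datasets

iris = datasets.load_iris()
X, y = iris.data, iris.target

# The integer labels in y index into target_names.
print("Class names:", list(iris.target_names))

# Count how many samples belong to each class.
class_counts = np.bincount(y)
for name, count in zip(iris.target_names, class_counts):
    print(f"{name}: {count} samples")

# Peek at the first three rows alongside their species.
for row, label in zip(X[:3], y[:3]):
    print(row, "->", iris.target_names[label])
```

Iris is balanced, with 50 samples in each of its three classes, which is one reason plain accuracy is a reasonable metric for it.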


## Train-Test Split

In [None]:

# Hold out 30% of the samples for testing; random_state makes the shuffle reproducible
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)
print("Training set size:", X_train.shape[0])
print("Test set size:", X_test.shape[0])
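
The split above samples rows at random, so the class proportions in the two sets can drift from those of the full dataset. One common option, shown here as a sketch rather than as part of the original workflow, is the `stratify` parameter of `train_test_split`, which keeps the class proportions the same in both sets. The `_s` suffix on the variable names is an assumption of this example, used to avoid overwriting the notebook's own variables.

```python
import numpy as np
from sklearn import datasets
from sklearn.model_selection import train_test_split

X, y = datasets.load_iris(return_X_y=True)

# Same settings as the split above, plus stratify=y (added for this sketch).
X_train_s, X_test_s, y_train_s, y_test_s = train_test_split(
    X, y, test_size=0.3, random_state=42, stratify=y
)

# Per-class sample counts in each set.
train_counts = np.bincount(y_train_s)
test_counts = np.bincount(y_test_s)
print("Train class counts:", train_counts)
print("Test class counts:", test_counts)
```

With 150 balanced samples and `test_size=0.3`, the stratified split puts 35 samples per class in the training set and 15 per class in the test set.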


## First Classifier: k-Nearest Neighbors (kNN)

In [None]:

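# Classify each sample by majority vote among its 3 nearest training samples
knn = KNeighborsClassifier(n_neighbors=3)
knn.fit(X_train, y_train)
# Predict labels for the held-out test set
y_pred = knn.predict(X_test)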

print("kNN Accuracy:", accuracy_score(y_test, y_pred))
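
To make the idea behind the classifier concrete, here is a minimal from-scratch sketch of the kNN rule: measure the distance from a query point to every training point, take the `k` closest, and return the most common label among them. The helper `knn_predict` and the toy data are hypothetical and exist only for this illustration. It uses Euclidean distance and unweighted votes, which matches the defaults of `KNeighborsClassifier`, but its tie-breaking may differ from scikit-learn's.

```python
from collections import Counter

import numpy as np


def knn_predict(X_train, y_train, x_query, k=3):
    """Predict one query point's label by majority vote of its k nearest neighbors."""
    # Euclidean distance from the query to every training point.
    distances = np.linalg.norm(X_train - x_query, axis=1)
    # Indices of the k smallest distances.
    nearest = np.argsort(distances)[:k]
    # Most common label among those neighbors.
    votes = Counter(y_train[nearest].tolist())
    return votes.most_common(1)[0][0]


# Tiny made-up dataset: two well-separated clusters in 2-D.
toy_X = np.array([[0.0, 0.0], [0.1, 0.2], [0.2, 0.1],
                  [5.0, 5.0], [5.1, 4.9], [4.8, 5.2]])
toy_y = np.array([0, 0, 0, 1, 1, 1])

print(knn_predict(toy_X, toy_y, np.array([0.3, 0.3])))  # 0
print(knn_predict(toy_X, toy_y, np.array([4.9, 5.0])))  # 1
```

Note that kNN has no real training phase: `fit` mostly stores the data, and the work happens at prediction time.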



---

### Notes
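- **Scikit-Learn** bundles datasets, preprocessing utilities, and models behind a consistent `fit`/`predict` interface.
- **Iris dataset**: a classic beginner dataset for classification, with 150 samples, 4 features, and 3 classes.
- **Train-test split**: holds out part of the data (30% here) so accuracy is measured on samples the model never saw during training.
- **kNN**: classifies a sample by majority vote among its k closest training samples (k = 3 here).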

**End of Day 10.**
