
# 🌳 Decision Tree Classifier with the Iris Dataset

## 📦 Step 1: Import Libraries

```python
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
%matplotlib inline
from sklearn.datasets import load_iris
```

We import:

- **pandas** and **numpy** for data manipulation
- **matplotlib.pyplot** for data visualization
- `%matplotlib inline` to display plots inside the notebook
- `load_iris` from sklearn's datasets to load a famous dataset for classification

---

## 🌼 Step 2: Load the Iris Dataset

```python
iris = load_iris()
```

- The Iris dataset is built into sklearn. It includes:
  - 150 samples of iris flowers
  - 4 features: sepal length, sepal width, petal length, and petal width
  - 3 classes: Setosa, Versicolor, Virginica
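
The counts above can be checked directly on the loaded object. A minimal sketch, assuming nothing beyond scikit-learn and NumPy being installed:

```python
import numpy as np
from sklearn.datasets import load_iris

iris = load_iris()

# Feature matrix: one row per flower, one column per measurement
print(iris.data.shape)             # (150, 4)

# Number of samples per class, in the order of iris.target_names
print(np.bincount(iris.target))    # [50 50 50]
```

The classes are perfectly balanced, which is part of why plain accuracy is a reasonable metric for this dataset.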

---

## 🔍 Step 3: Explore the Data

```python
iris.data
```

Returns the feature values as a NumPy array of shape (150, 4): one row per flower, one column per feature.

```python
iris.feature_names
```

Get names of the features (columns):

```
['sepal length (cm)', 'sepal width (cm)', 'petal length (cm)', 'petal width (cm)']
```

```python
iris.target
```

Returns the label of each sample (0, 1, or 2), one integer per flower type.

```python
iris.target_names
```

Gives the names of the target labels:

```
['setosa', 'versicolor', 'virginica']
```
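
Because `iris.target_names` is a NumPy array, the integer labels can be used to index into it and recover a readable name for every sample. A small sketch; the variable name `species` is chosen here for illustration:

```python
from sklearn.datasets import load_iris

iris = load_iris()

# Fancy indexing: each integer label picks the matching entry of target_names
species = iris.target_names[iris.target]

print(species[0], species[50], species[100])  # setosa versicolor virginica
```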

---

## 🗃️ Step 4: Convert to a DataFrame

```python
df = pd.DataFrame(iris.data, columns=iris.feature_names)
df['target'] = iris.target
df.head()
```

- Converts the feature matrix into a pandas DataFrame, using the feature names as column names
- Adds the target labels as a `target` column
- Shows the first 5 rows using `head()`
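
As an alternative to building the DataFrame by hand, `load_iris` accepts `as_frame=True` in reasonably recent scikit-learn versions and returns pandas objects directly. A hedged sketch; the names `iris_frame`, `df_alt`, and the extra `species` column are illustrative and not part of the walkthrough above:

```python
from sklearn.datasets import load_iris

# as_frame=True asks scikit-learn to return pandas objects
iris_frame = load_iris(as_frame=True)
df_alt = iris_frame.frame          # 4 feature columns plus a 'target' column

# Optional readable label column next to the integer target
df_alt['species'] = iris_frame.target_names[df_alt['target'].to_numpy()]

print(df_alt.shape)                # (150, 6)
```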

---

## 📊 Step 5: Visualize the Data

```python
plt.scatter(df['sepal length (cm)'], df['sepal width (cm)'], c=df['target'])
```

- Scatter plot of Sepal Length vs Sepal Width
- Points are colored by their target class
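
A single `scatter` call colored by `c=` has no legend, so it is hard to tell which color is which class. One way around that, sketched below under the assumption that one `scatter` call per class is acceptable, is to plot each class separately. This variant uses the petal measurements (columns 2 and 3) instead of the sepal ones, purely as an illustration:

```python
import matplotlib.pyplot as plt
from sklearn.datasets import load_iris

iris = load_iris()
fig, ax = plt.subplots()

# One scatter call per class so each class gets its own legend entry
for class_index, class_name in enumerate(iris.target_names):
    mask = iris.target == class_index
    ax.scatter(iris.data[mask, 2], iris.data[mask, 3], label=class_name)

ax.set_xlabel(iris.feature_names[2])   # petal length (cm)
ax.set_ylabel(iris.feature_names[3])   # petal width (cm)
ax.legend()
```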

---

## 🧠 Step 6: Train a Decision Tree Classifier

```python
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
```

Split the dataset and train the model:

```python
X_train, X_test, y_train, y_test = train_test_split(df.drop(['target'], axis=1), df['target'], test_size=0.2)
model = DecisionTreeClassifier()
model.fit(X_train, y_train)
```

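- `train_test_split`: randomly splits the data into a training set and a test set; `test_size=0.2` holds out 20% of the rows (30 samples) for testing
- `DecisionTreeClassifier`: creates a classifier object with default settings
- `fit()`: trains the model on the training set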
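
The split above is random, so every run produces different training and test sets. If reproducibility matters, `train_test_split` accepts `random_state`, and `stratify` keeps the class proportions equal in both parts. A minimal sketch; the seed value 42 is an arbitrary choice:

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split

X, y = load_iris(return_X_y=True)

# Fixed seed for repeatable results; stratify=y keeps classes balanced
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)

print(X_train.shape, X_test.shape)   # (120, 4) (30, 4)
print(np.bincount(y_test))           # [10 10 10]
```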

---

## ✅ Step 7: Evaluate the Model

```python
model.score(X_test, y_test)
```

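- Returns the mean accuracy on the test data: the fraction of test samples classified correctly
- Because the split in Step 6 is random, the value varies from run to run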
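
For a classifier, `score()` is shorthand for predicting on the test set and computing accuracy. The sketch below shows the equivalence; the fixed `random_state` values are assumptions added only to make the run repeatable:

```python
from sklearn.datasets import load_iris
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)

model = DecisionTreeClassifier(random_state=0).fit(X_train, y_train)

# score() predicts on X_test and compares against y_test internally
via_score = model.score(X_test, y_test)
via_metric = accuracy_score(y_test, model.predict(X_test))
manual = (model.predict(X_test) == y_test).mean()

print(via_score == via_metric)   # True
```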

---

## 🌲 Step 8: Visualize the Tree

```python
from sklearn.tree import plot_tree
plt.figure(figsize=(15,10))
plot_tree(model, filled=True, feature_names=iris.feature_names, class_names=iris.target_names)
```

- Plots the decision tree
- `filled=True` shows color-coded nodes
- Labels nodes with feature names and class names
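
When a figure is inconvenient (for example in a terminal or a log file), the same tree can be printed as indented text rules with `export_text`. A minimal sketch, assuming a shallow tree trained on the full dataset just for display:

```python
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier, export_text

iris = load_iris()
model = DecisionTreeClassifier(max_depth=2, random_state=0).fit(iris.data, iris.target)

# Each line is a split condition or a leaf with its predicted class
rules = export_text(model, feature_names=list(iris.feature_names))
print(rules)
```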



```python
iris = load_iris()
```

```python
iris.data
```

```python
iris.target
```

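```python
import seaborn as sns
```

```python
df = sns.load_dataset("iris")
df.head()
```

```python
# Features come from the seaborn copy of Iris; labels come from scikit-learn's copy.
# This pairing assumes both copies list the flowers in the same row order.
X = df.drop("species", axis=1)
y = iris.target
```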

```python
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.33, random_state=42)
```

```python
# Pre-pruning: cap the tree depth so it stops growing early
from sklearn.tree import DecisionTreeClassifier
clf = DecisionTreeClassifier(random_state=0, max_depth=2)
clf.fit(X_train, y_train)
```
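
Limiting `max_depth` stops the tree from growing in the first place, which is usually called pre-pruning. Post-pruning in scikit-learn is available through cost-complexity pruning, controlled by `ccp_alpha`. The sketch below is one possible way to explore it and is not part of the original notebook; it reloads the data so it runs on its own:

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.33, random_state=42)

# Effective alphas at which the fully grown tree would lose another subtree
path = DecisionTreeClassifier(random_state=0).cost_complexity_pruning_path(X_train, y_train)

leaf_counts = []
for alpha in path.ccp_alphas:
    alpha = max(float(alpha), 0.0)  # guard against tiny negative values from rounding
    pruned = DecisionTreeClassifier(random_state=0, ccp_alpha=alpha).fit(X_train, y_train)
    leaf_counts.append(pruned.get_n_leaves())
    print(f"alpha={alpha:.4f}  leaves={pruned.get_n_leaves()}  "
          f"test accuracy={pruned.score(X_test, y_test):.3f}")
```

Larger values of `ccp_alpha` remove more of the tree, so the number of leaves shrinks as alpha grows.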

```python
from sklearn import tree
plt.figure(figsize=(15, 10))
tree.plot_tree(clf, filled=True)
plt.show()
```

```python
y_pred = clf.predict(X_test)
y_pred
```

```python
from sklearn.metrics import accuracy_score, classification_report
# scikit-learn metrics take the true labels first, then the predictions
score = accuracy_score(y_test, y_pred)
score
```

```python
print(classification_report(y_test, y_pred))
```
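
Argument order matters for per-class metrics. scikit-learn's metric functions expect the true labels first and the predictions second; accuracy is symmetric, but swapping the arguments exchanges precision and recall. A tiny sketch with hand-made labels (illustrative values, not Iris results; `y_hat` is a made-up name for the predictions):

```python
import numpy as np
from sklearn.metrics import precision_score, recall_score

# Hand-made example labels, chosen only to show the effect
y_true = [0, 0, 1, 1, 2, 2]
y_hat = [0, 1, 1, 1, 2, 0]

precision = precision_score(y_true, y_hat, average=None)
recall = recall_score(y_true, y_hat, average=None)

# With the arguments swapped, the reported "precision" is really the recall
swapped_precision = precision_score(y_hat, y_true, average=None)

print(np.allclose(swapped_precision, recall))   # True
```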

```python
parameter = {
    'criterion': ['gini', 'entropy', 'log_loss'],
    'splitter': ['best', 'random'],
    'max_depth': [1, 2, 3, 4, 5, 6],
    # number of features considered at each split
    'max_features': ['sqrt', 'log2']
}
```

```python
from sklearn.model_selection import GridSearchCV
# max_depth set here is overridden by the max_depth values in the grid
clf = DecisionTreeClassifier(max_depth=2)
cv = GridSearchCV(clf, parameter, cv=5, scoring='accuracy')
cv.fit(X_train, y_train)
```

```python
cv.best_params_
```

```python
y_pred = cv.predict(X_test)
```

```python
accuracy_score(y_test, y_pred)
```
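
Beyond `best_params_`, a fitted `GridSearchCV` also exposes the cross-validated score of the winning combination and a full table of results. A self-contained sketch with a deliberately small grid to keep it fast; the names `small_grid`, `search`, and `results` are illustrative:

```python
import pandas as pd
from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.33, random_state=42)

# Small grid: 2 criteria x 3 depths = 6 combinations
small_grid = {'criterion': ['gini', 'entropy'], 'max_depth': [1, 2, 3]}
search = GridSearchCV(DecisionTreeClassifier(random_state=0), small_grid,
                      cv=5, scoring='accuracy')
search.fit(X_train, y_train)

print(search.best_params_)
print(search.best_score_)   # mean cross-validated accuracy of the best combination

# One row per parameter combination, best rank first
results = pd.DataFrame(search.cv_results_)
print(results[['params', 'mean_test_score', 'rank_test_score']]
      .sort_values('rank_test_score'))

# With refit=True (the default), the best model is retrained on all of X_train
print(search.score(X_test, y_test))
```

Note that `best_score_` is measured on the cross-validation folds of the training data, so it can differ from the accuracy on the held-out test set.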