### Yusif Hajizade CS-020 id: 22022735
# **Artificial Intelligence Practical Work 1**
## **Lab: Iris dataset classification using a Decision Tree**

### The objective of this lab is to implement a Decision Tree in order to classify Irises 
Specific objectives:
- Observe the data, understand their nature and how to adapt them (if needed) so you can use them
    in a Decision Tree model.
- Understand how Decision Trees work so as to implement this model in a computer program.
- Evaluate the results and put them into perspective with what we know about the data.

## **1 The data**

During this lab, we will work on a classification problem (i.e. assigning labels
to data so as to group these data into distinct categories).
The dataset used here is the Iris dataset, a classic machine learning dataset featuring three Iris species with 50 instances each. Each instance is described by four attributes (petal width, petal length, sepal width, and sepal length), which together allow discrimination between species. Visualization, such as 2D scatterplots, simplifies the understanding of the 150 instances. Notably, Setosa is linearly separable from the other two species, while distinguishing Versicolor and Virginica visually is more complex.

#### **Question:**
- #### ***Is the Decision Tree model used for supervised or unsupervised classification? Explain your answer***

**Answer:** **The Decision Tree model is used for supervised classification.**
It is trained on a labeled dataset and then predicts the class labels of new instances, which is the definition of supervised classification.

- Supervised learning involves training a model on a labeled dataset, where the input data (features) is associated with corresponding output labels.
- In the given dataset example, each instance has a set of features (SepalLengthCm, SepalWidthCm, PetalLengthCm, PetalWidthCm) and a corresponding label (Species), which indicates the class or category to which the instance belongs (e.g., Iris-setosa).
- The objective of the Decision Tree model is to learn a mapping from the input features to the output labels based on the labeled training data.
- The model is trained to make predictions on new, unseen instances by generalizing from the patterns observed in the training data.
- In contrast, unsupervised learning involves tasks where the algorithm is given unlabeled data and needs to find patterns, relationships, or structure in the data without explicit guidance on the output.
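
The contrast above can be sketched with a few lines of scikit-learn. This is a minimal illustration, not part of the lab code: the four-row sample `X_toy` / `y_toy` is hypothetical, and `KMeans` is used here only as one example of an unsupervised algorithm.

```python
from sklearn.cluster import KMeans
from sklearn.tree import DecisionTreeClassifier

# Hypothetical toy sample: one feature (petal length in cm) per flower.
X_toy = [[1.4], [1.3], [4.7], [4.5]]
y_toy = ["setosa", "setosa", "versicolor", "versicolor"]

# Supervised: the labels y_toy are passed to fit(), so the model can
# return a species name for a new measurement.
clf = DecisionTreeClassifier(random_state=0).fit(X_toy, y_toy)
predicted_species = clf.predict([[1.5]])[0]

# Unsupervised: fit() only sees the features and returns anonymous
# cluster ids, with no notion of species names.
km = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X_toy)
cluster_ids = km.labels_

print(predicted_species)
print(cluster_ids)
```

The Decision Tree needs `y_toy` to train, whereas the clustering step groups the flowers without ever seeing a label.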

In [3]:
# Import necessary libraries
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier, export_text
from sklearn.metrics import accuracy_score, classification_report
import matplotlib.pyplot as plt
import seaborn as sn

In [4]:

# Load the dataset
dataset_url = r"C:\Users\User\Desktop\AI\Iris.csv"
column_names = ["Id", "SepalLength", "SepalWidth", "PetalLength", "PetalWidth", "Species"]

df = pd.read_csv(dataset_url, names=column_names, header=0)

# Display a sample of the dataset
print(df.head())

   Id  SepalLength  SepalWidth  PetalLength  PetalWidth      Species
0   1          5.1         3.5          1.4         0.2  Iris-setosa
1   2          4.9         3.0          1.4         0.2  Iris-setosa
2   3          4.7         3.2          1.3         0.2  Iris-setosa
3   4          4.6         3.1          1.5         0.2  Iris-setosa
4   5          5.0         3.6          1.4         0.2  Iris-setosa


---------------

## **2 Building a Decision Tree**

In this section, we aim to build a Decision Tree for the Iris dataset, utilizing its four attributes to determine their discriminative power. The process involves identifying the attribute with the highest discriminative power and subsequently ranking the attributes. The goal is to establish a Decision Tree model that can predict the species of a new instance based on its attribute values.
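
One common way to quantify the "discriminative power" of an attribute is information gain, i.e. the reduction in class entropy obtained by partitioning on that attribute. The sketch below assumes this criterion (the lab handout may use a different one) and runs on a hypothetical six-flower sample; the group names and the helper functions `entropy` and `information_gain` are illustrative only.

```python
from collections import Counter
from math import log2

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((c / total) * log2(c / total) for c in Counter(labels).values())

def information_gain(groups, labels):
    """Entropy reduction obtained by splitting `labels` according to `groups`."""
    total = len(labels)
    members = {}
    for group, label in zip(groups, labels):
        members.setdefault(group, []).append(label)
    remainder = sum(len(m) / total * entropy(m) for m in members.values())
    return entropy(labels) - remainder

# Hypothetical toy sample: two flowers per species.
labels = ["setosa", "setosa", "versicolor", "versicolor", "virginica", "virginica"]
partitions = {
    # Each group contains a single species: perfectly discriminative.
    "petal_group": ["short", "short", "average", "average", "long", "long"],
    # Each group mixes two species: weakly discriminative.
    "sepal_group": ["short", "average", "long", "short", "average", "long"],
}

gains = {name: information_gain(g, labels) for name, g in partitions.items()}
ranking = sorted(gains, key=gains.get, reverse=True)
print(gains)
print(ranking)
```

The attribute ranked first would be used for the root of the tree, and the same computation can be repeated inside each resulting group.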

#### **Preliminary Question:**
- #### ***What is the nature of the attributes of the dataset?***

Understanding the nature of the attributes is crucial for building an effective Decision Tree. It provides insights into the types of measurements and characteristics represented by the dataset.

In [5]:
# Preliminary Question 1
# What is the nature of the attributes of the dataset?

# Display a sample of the dataset
print(df.head())

   Id  SepalLength  SepalWidth  PetalLength  PetalWidth      Species
0   1          5.1         3.5          1.4         0.2  Iris-setosa
1   2          4.9         3.0          1.4         0.2  Iris-setosa
2   3          4.7         3.2          1.3         0.2  Iris-setosa
3   4          4.6         3.1          1.5         0.2  Iris-setosa
4   5          5.0         3.6          1.4         0.2  Iris-setosa


**Answer: The attributes are numerical measurements related to the dimensions of Sepal and Petal.** They all describe attributes of the Iris flower.

- #### ***Do you think it is necessary to transform the attributes (scaling, standardization, ...)?***

Determining the need for attribute transformation is crucial. Some machine learning algorithms, such as Decision Trees, are not sensitive to feature scaling, but it's important to consider for other algorithms.

In [6]:
# Preliminary Question 2
# Do you think it is necessary to transform the attributes (scaling, standardization, ...)?
X = df.drop(["Id", "Species"], axis=1)
y = df["Species"]

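# Candidate thresholds at median -/+ one standard deviation, per attribute
# (computed for reference only; the quantile-based partition below is used instead)
short_threshold = X.median() - X.std()
long_threshold = X.median() + X.std()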

# Define three partitions based on attribute values using quantiles
X["Partition"] = pd.qcut(X["SepalLength"], q=[0, 1/3, 2/3, 1], labels=["short", "average", "long"])

# Split the dataset into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X.drop("Partition", axis=1), y, test_size=0.2, random_state=42)


**Answer:** **It depends on the specific algorithm used. Decision Trees are not sensitive to feature scaling, so feature scaling is not necessary for Decision Trees.**
The code splits the dataset into training and testing sets without applying feature scaling. This is consistent with the observation that Decision Trees are not sensitive to feature scaling.
In the context of Decision Trees, which is the algorithm used in the provided code, scaling or standardization of features is generally not necessary. Decision Trees make decisions by comparing the value of a single feature to a splitting point selected during training, so multiplying a feature by a positive constant only moves the splitting point and leaves the resulting partition unchanged. If your dataset involves other algorithms or models that are sensitive to feature scales (e.g., Support Vector Machines, k-Nearest Neighbors), scaling might be necessary. It's always good practice to check the requirements of the specific algorithms you are using.
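
The scale-invariance argument can be checked on a single split. The following is a minimal pure-Python sketch of a threshold search based on Gini impurity, not scikit-learn's actual implementation; the six petal lengths and the helper names `gini` and `best_threshold` are hypothetical.

```python
def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((labels.count(c) / n) ** 2 for c in set(labels))

def best_threshold(values, labels):
    """Return the midpoint threshold that minimizes the weighted Gini impurity."""
    distinct = sorted(set(values))
    best_t, best_score = None, float("inf")
    for low, high in zip(distinct, distinct[1:]):
        t = (low + high) / 2
        left = [lab for v, lab in zip(values, labels) if v <= t]
        right = [lab for v, lab in zip(values, labels) if v > t]
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
        if score < best_score:
            best_t, best_score = t, score
    return best_t

# Hypothetical toy sample: the same petal lengths in centimetres and millimetres.
petal_cm = [1.4, 1.3, 1.5, 4.7, 4.5, 5.1]
species = ["setosa"] * 3 + ["versicolor"] * 3
petal_mm = [v * 10 for v in petal_cm]

t_cm = best_threshold(petal_cm, species)
t_mm = best_threshold(petal_mm, species)

# The threshold is rescaled, but the flowers sent to the left branch are the same.
left_cm = [v <= t_cm for v in petal_cm]
left_mm = [v <= t_mm for v in petal_mm]
print(t_cm, t_mm, left_cm == left_mm)
```

Changing the unit moves the threshold by the same factor, so the split, and therefore the tree, is unaffected.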

- #### ***How are you going to use real value attributes to build your Decision Tree?***

Understanding how to utilize real-value attributes is fundamental for Decision Tree construction. Decision Trees naturally handle real-value attributes, and this step explores how to leverage them in the model.

In [11]:
# Preliminary Question 3
# How are you going to use real value attributes to build your Decision Tree?

X = df.drop(["Id", "Species"], axis=1)
y = df["Species"]

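# Candidate thresholds at median -/+ one standard deviation, per attribute
# (computed for reference only; the quantile-based partition below is used instead)
short_threshold = X.median() - X.std()
long_threshold = X.median() + X.std()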

# Define three partitions based on attribute values using quantiles
X["Partition"] = pd.qcut(X["SepalLength"], q=[0, 1/3, 2/3, 1], labels=["short", "average", "long"])

# Split the dataset into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X.drop("Partition", axis=1), y, test_size=0.2, random_state=42)

# Train a Decision Tree classifier
dt_classifier = DecisionTreeClassifier()
dt_classifier.fit(X_train, y_train)

# Make predictions on the test set
y_pred = dt_classifier.predict(X_test)

# Evaluate the classifier
accuracy = accuracy_score(y_test, y_pred)
classification_report_str = classification_report(y_test, y_pred)

print(f"Accuracy: {accuracy:.2f}")
print("Classification Report:\n", classification_report_str)

Accuracy: 1.00
Classification Report:
                  precision    recall  f1-score   support

    Iris-setosa       1.00      1.00      1.00        10
Iris-versicolor       1.00      1.00      1.00         9
 Iris-virginica       1.00      1.00      1.00        11

       accuracy                           1.00        30
      macro avg       1.00      1.00      1.00        30
   weighted avg       1.00      1.00      1.00        30



#### **Answer: Decision Trees naturally handle real-value attributes for classification.**
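The code builds three quantile-based partitions (short, average, long) from the real-value attribute SepalLength, but this "Partition" column is dropped before the train/test split, so the classifier is trained on the four raw measurements. At each node, the Decision Tree compares a single attribute to a learned threshold (attribute <= threshold) and sends the instance to the left or right branch, which is how real-value attributes are used without prior discretization. The resulting rules show which attribute values drive each prediction.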
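
To see which thresholds a fitted tree actually applies to the real-valued attributes, its rules can be printed with `export_text` (imported at the top of the notebook). The sketch below is an assumption-laden stand-in for the lab code: it loads scikit-learn's bundled copy of Iris instead of the local `Iris.csv`, and the `max_depth=2` and `random_state=0` settings are chosen only to keep the printout short and repeatable.

```python
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier, export_text

# Bundled copy of the dataset (assumption: equivalent to the local Iris.csv).
iris = load_iris()
feature_names = ["SepalLength", "SepalWidth", "PetalLength", "PetalWidth"]

# A shallow tree keeps the printed rules short.
tree = DecisionTreeClassifier(max_depth=2, random_state=0)
tree.fit(iris.data, iris.target)

# Each printed line is a test of the form "attribute <= threshold" or a leaf class.
rules = export_text(tree, feature_names=feature_names)
print(rules)
```

Every internal node in the printout compares one measurement to a numeric threshold, which is the binary discretization the tree learns on its own.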

- #### ***Have a look back at Fig. 2 p. 5: what relationship is there between this figure and the class repartitions of Table 1?***
**Answer:** Figure 2 on page 5 visualizes the data in a 2D scatterplot, allowing us to observe the grouping of instances into different classes (species). The relationship with Table 1 lies in the correspondence between the visual representation and the class repartitions. The scatterplot visually represents how instances are distributed among different classes, while Table 1 quantifies this distribution for each group based on the discriminative attribute.

- #### ***Do you wish to continue adding branches in the tree? Explain your answer.***

**Answer:** The decision to continue adding branches in the tree depends on the analysis of the current branches and their effectiveness in discriminating between classes. Further branches may lead to a more detailed and specific classification but could also risk overfitting the model to the training data. It's essential to strike a balance between model complexity and generalization to achieve optimal performance on new, unseen instances.
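
The trade-off between adding branches and generalizing can be explored by capping the depth of the tree. This is a hedged sketch rather than the lab's procedure: it uses scikit-learn's bundled Iris data, the same split settings as the notebook (`test_size=0.2`, `random_state=42`), and an arbitrary cap of `max_depth=2`.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X_iris, y_iris = load_iris(return_X_y=True)
X_tr, X_te, y_tr, y_te = train_test_split(
    X_iris, y_iris, test_size=0.2, random_state=42
)

# Fully grown tree: branches are added until the leaves cannot be split further.
full = DecisionTreeClassifier(random_state=0).fit(X_tr, y_tr)
# Restricted tree: growth stops after two levels of tests.
shallow = DecisionTreeClassifier(max_depth=2, random_state=0).fit(X_tr, y_tr)

for name, model in [("full", full), ("max_depth=2", shallow)]:
    print(
        f"{name}: depth={model.get_depth()}, leaves={model.get_n_leaves()}, "
        f"train acc={model.score(X_tr, y_tr):.2f}, test acc={model.score(X_te, y_te):.2f}"
    )
```

Comparing training and test accuracy for the two trees indicates whether the extra branches capture real structure or only memorize the training set.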

### **Continued Segmentation:**
- #### ***Does the use of this attribute allow discrimination between classes?***
**Answer:** Table 2 presents the class repartitions for the attribute "SW" (sepal width). The attribute discriminates between classes only to the extent that each group is dominated by a single class; if the classes appear in similar proportions across the groups, the attribute discriminates poorly.
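
A class repartition table of this kind can be produced with `pd.crosstab`. The sketch below is illustrative only and is not claimed to reproduce Table 2: it uses scikit-learn's bundled Iris data (column name `sepal width (cm)`) and the same three quantile-based groups as the notebook's code.

```python
import pandas as pd
from sklearn.datasets import load_iris

iris = load_iris(as_frame=True)
frame = iris.frame  # four measurement columns plus an integer "target" column

# Map the integer targets to species names for readable column headers.
species = frame["target"].map(dict(enumerate(iris.target_names)))

# Same quantile-based grouping as in the notebook, applied to sepal width.
sw_group = pd.qcut(
    frame["sepal width (cm)"],
    q=[0, 1 / 3, 2 / 3, 1],
    labels=["short", "average", "long"],
)

# Rows: sepal-width group; columns: species; cells: number of instances.
table = pd.crosstab(sw_group, species)
print(table)
```

A row dominated by one species indicates a group where sepal width is informative; rows with mixed counts need a further split on another attribute.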

- #### ***Explain (and implement if you still have time) the procedure to continue segmenting the data.***

Implementation:

In [7]:
# Continue segmentation with another attribute (e.g., SepalWidth)
X["Partition_SW"] = pd.qcut(X["SepalWidth"], q=[0, 1/3, 2/3, 1], labels=["short", "average", "long"])
X_train_SW, X_test_SW, y_train_SW, y_test_SW = train_test_split(X.drop(["Partition", "Partition_SW"], axis=1), y, test_size=0.2, random_state=42)

# Train a new Decision Tree classifier (both partition columns were dropped above,
# so it is fit on the same four raw measurements as before)
dt_classifier_SW = DecisionTreeClassifier()
dt_classifier_SW.fit(X_train_SW, y_train_SW)

# Make predictions on the test set
y_pred_SW = dt_classifier_SW.predict(X_test_SW)

# Evaluate the classifier
accuracy_SW = accuracy_score(y_test_SW, y_pred_SW)
classification_report_str_SW = classification_report(y_test_SW, y_pred_SW)
print(f"Accuracy (with SepalWidth): {accuracy_SW:.2f}")
print("Classification Report (with SepalWidth):\n", classification_report_str_SW)


Accuracy (with SepalWidth): 1.00
Classification Report (with SepalWidth):
                  precision    recall  f1-score   support

    Iris-setosa       1.00      1.00      1.00        10
Iris-versicolor       1.00      1.00      1.00         9
 Iris-virginica       1.00      1.00      1.00        11

       accuracy                           1.00        30
      macro avg       1.00      1.00      1.00        30
   weighted avg       1.00      1.00      1.00        30



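**Answer:** The code creates a second set of partitions from the attribute "SepalWidth", but both partition columns are dropped before training, so the classifier is fit on the same four raw measurements as before, which is why the reported results match the previous run. To continue segmenting the data by hand, the procedure is repeated inside each group produced by the previous split: select the most discriminative remaining attribute, partition the group's instances on it, and stop when a group contains a single class or is too small to split further.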