### Observations from the two implemented Classifiers

Using the Iris dataset, we apply K-Nearest Neighbors (KNN) and Decision Tree (DT) classifiers to predict iris species. We generate the classification report and confusion matrix with the scikit-learn package. The classification report lists precision, recall, F1-score, and support (the number of test observations in each class, i.e. each iris species), along with the overall accuracy and the macro and weighted averages.
The confusion matrix has one row per true class and one column per predicted class. Correct predictions lie on the diagonal from top left to bottom right; every off-diagonal entry is a misclassification. For a given class, treated one-vs-rest, the matrix yields the number of true positives (TP), true negatives (TN), false positives (FP, Type 1 errors), and false negatives (FN, Type 2 errors).
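
As a minimal sketch of how these counts can be read off a multiclass confusion matrix, the snippet below derives per-class TP, FP, FN, and TN in a one-vs-rest fashion. The matrix is copied from the Decision Tree output printed later in this notebook; the helper name `per_class_counts` is hypothetical and not part of scikit-learn.

```python
import numpy as np

# Decision Tree confusion matrix copied from the notebook output
# (rows = true class, columns = predicted class).
cm = np.array([[13, 0, 0],
               [0, 16, 2],
               [0, 1, 18]])


def per_class_counts(cm):
    """Return per-class TP, FP, FN, TN arrays, treating each class one-vs-rest."""
    tp = np.diag(cm)
    fp = cm.sum(axis=0) - tp      # predicted as the class, but truly another class
    fn = cm.sum(axis=1) - tp      # truly the class, but predicted as another class
    tn = cm.sum() - tp - fp - fn  # everything that does not involve the class
    return tp, fp, fn, tn


tp, fp, fn, tn = per_class_counts(cm)
precision = tp / (tp + fp)
recall = tp / (tp + fn)

for k in range(cm.shape[0]):
    print(f"class {k}: TP={tp[k]} FP={fp[k]} FN={fn[k]} TN={tn[k]} "
          f"precision={precision[k]:.2f} recall={recall[k]:.2f}")
```

The rounded precision and recall values should agree with the corresponding rows of the Decision Tree classification report.
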
The trained DT classifier made 50 predictions on the test set, of which 47 were correct (the diagonal of the confusion matrix) and 3 were wrong (the off-diagonal entries). The trained KNN classifier also made 50 predictions, 44 of which were correct and 6 of which were wrong. For both classifiers, the predictions for class 0 are entirely correct: precision, recall, and F1-score are all 1. For the DT classifier, recall is higher than precision for class 2 (0.95 vs. 0.90), so class 2 has fewer FN (1) than FP (2). Conversely, class 1 has more FN (2) than FP (1), because its precision is higher than its recall (0.94 vs. 0.89). For the KNN classifier, precision and recall are equal within each class (0.83 for class 1, 0.84 for class 2), so FP and FN are equal as well: 3 each.
On this test split, the Decision Tree classifier performs better than the KNN classifier (accuracy 0.94 vs. 0.88). Precision and recall are both useful metrics for evaluating a classifier, and which one matters more depends on the cost of each error type. If type 1 errors (FP) are more costly than type 2 errors (FN), precision is the better evaluation metric; if FN are more costly, recall is. It is therefore crucial to choose the evaluation metric that fits the scenario at hand.
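
The comparison between the two classifiers can be checked directly from the confusion matrices printed in the cells below. The following sketch counts correct and wrong predictions as the diagonal and off-diagonal totals; the helper name `summarize` is hypothetical.

```python
import numpy as np

# Confusion matrices copied from the notebook output
# (rows = true class, columns = predicted class).
cm_dt = np.array([[13, 0, 0], [0, 16, 2], [0, 1, 18]])
cm_knn = np.array([[13, 0, 0], [0, 15, 3], [0, 3, 16]])


def summarize(cm):
    """Return (correct, wrong, accuracy) for a confusion matrix."""
    correct = int(np.trace(cm))  # diagonal entries are the correct predictions
    total = int(cm.sum())
    return correct, total - correct, correct / total


for name, cm in [("Decision Tree", cm_dt), ("KNN (k=3)", cm_knn)]:
    correct, wrong, acc = summarize(cm)
    print(f"{name}: {correct} correct, {wrong} wrong, accuracy {acc:.2f}")
```

These figures come from a single train/test split, so the ranking of the two models may change with a different `random_state`.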

### Setting up Libraries

In [13]:
import sklearn.datasets
from sklearn.metrics import classification_report, confusion_matrix
from sklearn.model_selection import KFold, cross_val_score, train_test_split

# classifiers
from sklearn.tree import DecisionTreeClassifier
from sklearn.neighbors import KNeighborsClassifier

In [10]:
# Load the Iris data and split it into training (67%) and test (33%) sets
X, y = sklearn.datasets.load_iris(return_X_y=True)
X_trainingData, X_testData, y_trainingData, y_testData = train_test_split(
    X, y, test_size=0.33, random_state=27)

## Decision-Tree Classifier

In [11]:
# Build the model and predict
# Note: no random_state is set on the tree, so results can vary slightly between runs
myClassifier = DecisionTreeClassifier()
myClassifier.fit(X_trainingData, y_trainingData)  # train the decision tree
prediction = myClassifier.predict(X_testData)  # predict the species of the test samples

# Print the results
print("***Classification Report for Decision Tree Classifier***\n")
print(classification_report(y_testData, prediction))
print("\n")
print("***Confusion Matrix for Decision Tree Classifier***\n")
print(confusion_matrix(y_testData, prediction))
print("\n")

***Classification Report for Decision Tree Classifier***

              precision    recall  f1-score   support

           0       1.00      1.00      1.00        13
           1       0.94      0.89      0.91        18
           2       0.90      0.95      0.92        19

    accuracy                           0.94        50
   macro avg       0.95      0.95      0.95        50
weighted avg       0.94      0.94      0.94        50



***Confusion Matrix for Decision Tree Classifier***

[[13  0  0]
 [ 0 16  2]
 [ 0  1 18]]




## Cross-Validation Score

In [15]:
# 10-fold cross-validation on the full dataset (KFold does not shuffle by default)
myScores = cross_val_score(DecisionTreeClassifier(), X, y, scoring='accuracy', cv=KFold(n_splits=10))
print(myScores)

[1.         1.         1.         0.93333333 0.93333333 0.86666667
 1.         0.86666667 0.93333333 1.        ]
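
`load_iris` returns the samples ordered by class, and `KFold` does not shuffle by default, so the folds above are contiguous blocks that mostly contain a single species. As a hedged alternative, the sketch below uses `StratifiedKFold` with shuffling so that each fold keeps the class proportions, and summarizes the scores by mean and standard deviation. The seed 27 is an assumption that simply mirrors the one used for `train_test_split`; any fixed seed would do.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)

# Shuffled, stratified folds: every test fold contains all three species.
cv = StratifiedKFold(n_splits=10, shuffle=True, random_state=27)

# random_state on the tree makes the scores reproducible between runs.
scores = cross_val_score(DecisionTreeClassifier(random_state=27), X, y,
                         scoring="accuracy", cv=cv)

print(scores)
print(f"mean accuracy: {scores.mean():.3f} (std {scores.std():.3f})")
```

The individual fold scores will differ from the unshuffled run above, since the folds are composed differently.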


## KNN Classifier

In [12]:
# Assume k = 3 neighbors (n_neighbors=3), with all neighbors weighted equally
myClassifier = KNeighborsClassifier(n_neighbors=3, weights='uniform')

# Fit the model on the training data
myClassifier.fit(X_trainingData, y_trainingData)

# Apply the model to the test data
prediction = myClassifier.predict(X_testData)

# Print the results
print("***Classification Report for KNN Classifier***\n")
print(classification_report(y_testData, prediction))
print("\n")
print("***Confusion Matrix for KNN Classifier***\n")
print(confusion_matrix(y_testData, prediction))
print("\n")


***Classification Report for KNN Classifier***

              precision    recall  f1-score   support

           0       1.00      1.00      1.00        13
           1       0.83      0.83      0.83        18
           2       0.84      0.84      0.84        19

    accuracy                           0.88        50
   macro avg       0.89      0.89      0.89        50
weighted avg       0.88      0.88      0.88        50



***Confusion Matrix for KNN Classifier***

[[13  0  0]
 [ 0 15  3]
 [ 0  3 16]]
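
The value k = 3 above is an assumption rather than a tuned choice. One possible way to pick k, sketched below, is to compare a few candidate values by cross-validation on the training split only, leaving the test set untouched. The candidate list and the variable names are hypothetical; the split parameters repeat those used earlier in this notebook.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import cross_val_score, train_test_split
from sklearn.neighbors import KNeighborsClassifier

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.33, random_state=27)

candidate_ks = [1, 3, 5, 7, 9, 11]  # hypothetical grid of k values
mean_scores = {}
for k in candidate_ks:
    model = KNeighborsClassifier(n_neighbors=k, weights="uniform")
    # With an integer cv and a classifier, scikit-learn uses stratified folds.
    fold_scores = cross_val_score(model, X_train, y_train, cv=5, scoring="accuracy")
    mean_scores[k] = fold_scores.mean()

# Pick the k with the highest mean cross-validated accuracy.
best_k = max(mean_scores, key=mean_scores.get)
for k, score in mean_scores.items():
    print(f"k={k}: mean CV accuracy {score:.3f}")
print("best k:", best_k)
```

On a dataset this small the differences between neighboring k values are often within the fold-to-fold noise, so the selected k should be read as a rough guide.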


