

1. **Importing Libraries and Loading the Dataset:**
   - This segment imports the libraries and modules the notebook needs and loads the data.
   - It imports `numpy` for numerical computations, `load_iris` from `sklearn.datasets` to load the Iris dataset, `train_test_split` from `sklearn.model_selection` to split the dataset into training and test sets, `LogisticRegression` from `sklearn.linear_model` to train a logistic regression model, and several evaluation metrics from `sklearn.metrics`.
   - It then loads the Iris dataset with the `load_iris()` function and assigns the feature matrix to `X` and the class labels to `y`.


In [2]:
import numpy as np
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import (
    confusion_matrix,
    accuracy_score,
    precision_score,
    recall_score,
    f1_score,
    roc_auc_score,
)

In [3]:
# Load the Iris dataset
data = load_iris()
X = data.data
y = data.target
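
As a quick sanity check on what was just loaded, the sketch below inspects the array shapes and the class balance. It relies only on `load_iris()`; the variable name `counts` is illustrative and not part of the notebook.

```python
import numpy as np
from sklearn.datasets import load_iris

data = load_iris()
X, y = data.data, data.target

# One row per flower, one column per measurement
print("X shape:", X.shape, "y shape:", y.shape)

# Names of the measurements and of the species encoded in y
print("features:", data.feature_names)
print("classes:", data.target_names.tolist())

# Number of samples in each class
counts = np.bincount(y)
print("samples per class:", dict(zip(data.target_names.tolist(), counts.tolist())))
```

A balanced dataset like this one is a useful thing to know before choosing an averaging strategy for the metrics later on.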


2. **Splitting the Data into Train and Test Sets:**
   - This segment splits the loaded dataset into training and test sets using the `train_test_split` function.
   - Setting `test_size=0.2` assigns 80% of the samples (120 of 150) to the training set (`X_train` and `y_train`) and 20% (30 samples) to the test set (`X_test` and `y_test`).
   - Setting `random_state=42` fixes the shuffle, so the same split is produced on every run.


In [4]:
# Split the data into train and test sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
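
A plain random split does not guarantee equal class shares in the test set; in this notebook the test set ends up with 10, 9, and 11 samples per class, as the confusion matrix in the final output shows. The sketch below compares that split with a stratified one. Passing `stratify=y` is an optional variation, not something the notebook itself does, and the name `y_test_strat` is illustrative.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split

X, y = load_iris(return_X_y=True)

# Plain random split, same arguments as the notebook cell
_, _, _, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Stratified variant: every class keeps its share of the test set
_, _, _, y_test_strat = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)

print("random split class counts:    ", np.bincount(y_test, minlength=3))
print("stratified split class counts:", np.bincount(y_test_strat, minlength=3))
```

Stratification matters more on small or imbalanced datasets, where a random split can leave a class underrepresented in the test set.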


3. **Training the Logistic Regression Model and Making Predictions:**
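   - This segment creates a logistic regression model with default settings using `LogisticRegression()` and trains it on the training data using the `fit` method.
   - With the default iteration limit, the solver stops before it converges on the unscaled features, which is why scikit-learn prints the iteration-limit message after the cell. The model is still fitted and usable.
   - Once the model is trained, the `predict` method produces labels for the test set, which are assigned to `y_pred`.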


In [5]:
# Train a logistic regression model
model = LogisticRegression()
model.fit(X_train, y_train)

STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.

Increase the number of iterations (max_iter) or scale the data as shown in:
    https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
    https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
  n_iter_i = _check_optimize_result(
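
The message above suggests two remedies: raising `max_iter` or scaling the data. The sketch below applies both in a pipeline. `StandardScaler`, `make_pipeline`, and the value `max_iter=1000` are choices made for this illustration, not settings taken from the notebook.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# Standardize the features, then fit; max_iter=1000 is an illustrative budget
scaled_model = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
scaled_model.fit(X_train, y_train)

# Iterations the solver actually used, and accuracy on the held-out data
n_iter = scaled_model[-1].n_iter_
test_accuracy = scaled_model.score(X_test, y_test)
print("solver iterations:", n_iter)
print("test accuracy:", test_accuracy)
```

Wrapping the scaler in a pipeline means it is fitted on the training data only, so no information from the test set leaks into preprocessing.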


In [6]:
# Make predictions on the test set
y_pred = model.predict(X_test)


4. **Computing Evaluation Metrics:**
   - This segment computes various evaluation metrics to assess the performance of the classification model on the test set.
   - It calculates the confusion matrix using the `confusion_matrix` function, passing the true labels (`y_test`) and the predicted labels (`y_pred`).
   - It computes the accuracy using `accuracy_score`, passing the true labels and the predicted labels.
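   - Precision, recall, and F1 score are computed using `precision_score`, `recall_score`, and `f1_score`, respectively. Because the target has three classes, an `average` strategy is required; `'weighted'` averages the per-class scores, weighting each class by its number of true samples in the test set.
   - The AUC-ROC score is calculated using `roc_auc_score`, passing the true labels, the predicted class probabilities (`model.predict_proba(X_test)`), and `multi_class='ovr'`, so that each class is scored one-vs-rest and the per-class scores are averaged.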
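
To make the `'weighted'` averaging concrete, the sketch below reproduces it by hand on a small made-up example. The labels `y_true` and `y_hat` are hypothetical and are not the Iris results.

```python
import numpy as np
from sklearn.metrics import precision_score

# Hypothetical labels for three classes (not taken from the notebook)
y_true = np.array([0, 0, 0, 0, 1, 1, 2, 2, 2, 2])
y_hat = np.array([0, 0, 1, 0, 1, 2, 2, 2, 2, 0])

# Precision of each class on its own
per_class = precision_score(y_true, y_hat, average=None)

# Support: how many true samples each class has
support = np.bincount(y_true)

# 'weighted' is the support-weighted mean of the per-class scores
manual_weighted = np.average(per_class, weights=support)
weighted = precision_score(y_true, y_hat, average="weighted")

# 'macro' is the unweighted mean, shown for comparison
macro = precision_score(y_true, y_hat, average="macro")

print("per-class precision:", per_class)
print("weighted (manual / sklearn):", manual_weighted, weighted)
print("macro:", macro)
```

The two averages differ whenever classes with different supports have different per-class scores; on a nearly balanced test set they stay close to each other.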


In [7]:
# Compute evaluation metrics
conf_matrix = confusion_matrix(y_test, y_pred)
accuracy = accuracy_score(y_test, y_pred)
precision = precision_score(y_test, y_pred, average='weighted')
recall = recall_score(y_test, y_pred, average='weighted')
f1 = f1_score(y_test, y_pred, average='weighted')
roc_auc = roc_auc_score(y_test, model.predict_proba(X_test), multi_class='ovr')
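
The one-vs-rest AUC can also be reconstructed by hand, which shows what `multi_class='ovr'` does with its default macro averaging: each class is treated as the positive class against the other two, and the binary AUCs are averaged. The sketch below assumes `max_iter=1000` to avoid the convergence message; the names `clf`, `per_class_auc`, `manual_ovr`, and `library_ovr` are illustrative.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

clf = LogisticRegression(max_iter=1000).fit(X_train, y_train)
proba = clf.predict_proba(X_test)  # one column of probabilities per class

# Binary AUC for each class against the rest (classes are 0, 1, 2 here)
per_class_auc = [
    roc_auc_score((y_test == k).astype(int), proba[:, k]) for k in clf.classes_
]
manual_ovr = np.mean(per_class_auc)

# Same quantity computed by scikit-learn in one call
library_ovr = roc_auc_score(y_test, proba, multi_class="ovr")

print("per-class AUC:", per_class_auc)
print("manual OvR AUC:", manual_ovr, "library OvR AUC:", library_ovr)
```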



5. **Printing the Evaluation Metrics:**
   - This segment prints the computed evaluation metrics to the console.
   - It first displays the confusion matrix using `print("Confusion Matrix:")` and `print(conf_matrix)`.
   - Then it prints the accuracy, precision, recall, F1 score, and AUC-ROC score using `print("Accuracy:", accuracy)`, `print("Precision:", precision)`, `print("Recall:", recall)`, `print("F1 Score:", f1)`, and `print("AUC-ROC Score:", roc_auc)`.

Running this code prints the evaluation metrics for the classifier, so you can assess its performance on the test set in terms of accuracy, precision, recall, F1 score, and AUC-ROC score.

In [8]:
# Print the evaluation metrics
print("Confusion Matrix:")
print(conf_matrix)
print("Accuracy:", accuracy)
print("Precision:", precision)
print("Recall:", recall)
print("F1 Score:", f1)
print("AUC-ROC Score:", roc_auc)


Confusion Matrix:
[[10  0  0]
 [ 0  9  0]
 [ 0  0 11]]
Accuracy: 1.0
Precision: 1.0
Recall: 1.0
F1 Score: 1.0
AUC-ROC Score: 1.0
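
The averaged scores above hide how each species fares individually. As a possible complement, the sketch below prints a per-class breakdown with `classification_report`, which the notebook does not use. It assumes `max_iter=1000` to avoid the convergence message.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import classification_report
from sklearn.model_selection import train_test_split

data = load_iris()
X_train, X_test, y_train, y_test = train_test_split(
    data.data, data.target, test_size=0.2, random_state=42
)

model = LogisticRegression(max_iter=1000).fit(X_train, y_train)
y_pred = model.predict(X_test)

# Precision, recall, F1, and support for each species, plus the averages
report = classification_report(
    y_test, y_pred, target_names=list(data.target_names)
)
print(report)
```

The `support` column in the report also shows how many test samples each class contributes, which helps put the scores from a 30-sample test set in context.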
