# Multiclass classification 

- Multiclass classification is a type of classification task where the goal is to categorize input instances into one of three or more classes. In other words, instead of simply predicting binary outcomes (e.g., yes or no, spam or not spam), the model must assign one of several possible labels to each input.



#### Multinomial vs. One-vs-Rest (OvR) Logistic Regression

Multinomial logistic regression and One-vs-Rest (OvR) logistic regression are two approaches to multiclass classification with logistic regression.


1. Multinomial Logistic Regression:

- In multinomial logistic regression, the logistic regression model is extended to handle multiple classes directly. Instead of fitting a separate binary classification model for each class (as in OvR), multinomial logistic regression estimates the probabilities of each class directly.

- The model computes a linear score for each class from the features and that class's coefficients, then applies the softmax function to those scores. The softmax function ensures that the predicted probabilities sum to 1 across all classes.


2. One-vs-Rest (OvR) Logistic Regression:

- In OvR logistic regression, also known as one-vs-all, a separate binary logistic regression model is trained for each class.
  For each class, the model is trained to distinguish that class from all other classes combined.

- During prediction, every binary classifier scores the input, and the class whose classifier outputs the highest probability is selected as the final prediction.


In summary, multinomial logistic regression models the probabilities of all classes jointly in a single model, while OvR logistic regression trains multiple binary classifiers, each distinguishing one class from the rest.
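
The two prediction rules described above can be sketched in a few lines of NumPy. This is only an illustration: the three scores are made-up values standing in for the per-class linear scores, and the helper names `softmax` and `sigmoid` are defined here, not taken from the notebook.

```python
import numpy as np

def softmax(scores):
    """Multinomial rule: turn K class scores into probabilities that sum to 1."""
    shifted = scores - scores.max(axis=-1, keepdims=True)  # subtract the max for numerical stability
    exp = np.exp(shifted)
    return exp / exp.sum(axis=-1, keepdims=True)

def sigmoid(scores):
    """OvR rule: each binary classifier converts its own score independently."""
    return 1.0 / (1.0 + np.exp(-scores))

# Hypothetical linear scores for one sample and three classes
scores = np.array([2.0, 1.0, -1.0])

multinomial_probs = softmax(scores)  # one joint distribution over the classes
ovr_probs = sigmoid(scores)          # three separate "class k vs. rest" probabilities

print(multinomial_probs, multinomial_probs.sum())
print(ovr_probs, ovr_probs.sum())
print(np.argmax(multinomial_probs), np.argmax(ovr_probs))
```

Both rules pick the class with the largest score here, but only the softmax output is a single probability distribution; the OvR probabilities come from independent classifiers and need not sum to 1.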

In [3]:
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import classification_report, accuracy_score, confusion_matrix

In [4]:
# Load the dataset
iris = load_iris()

In [6]:
print(iris.DESCR)

.. _iris_dataset:

Iris plants dataset
--------------------

**Data Set Characteristics:**

    :Number of Instances: 150 (50 in each of three classes)
    :Number of Attributes: 4 numeric, predictive attributes and the class
    :Attribute Information:
        - sepal length in cm
        - sepal width in cm
        - petal length in cm
        - petal width in cm
        - class:
                - Iris-Setosa
                - Iris-Versicolour
                - Iris-Virginica
                
    :Summary Statistics:

                    Min  Max   Mean    SD   Class Correlation
    sepal length:   4.3  7.9   5.84   0.83    0.7826
    sepal width:    2.0  4.4   3.05   0.43   -0.4194
    petal length:   1.0  6.9   3.76   1.76    0.9490  (high!)
    petal width:    0.1  2.5   1.20   0.76    0.9565  (high!)

    :Missing Attribute Values: None
    :Class Distribution: 33.3% for each of 3 classes.
    :Creator: R.A. Fisher
    :Donor: Michael Marshall (MARSHALL%PLU@io.arc.nasa.gov)
    :

In [7]:
iris.keys()

dict_keys(['data', 'target', 'frame', 'target_names', 'DESCR', 'feature_names', 'filename', 'data_module'])

In [8]:
iris.data

array([[5.1, 3.5, 1.4, 0.2],
       [4.9, 3. , 1.4, 0.2],
       [4.7, 3.2, 1.3, 0.2],
       [4.6, 3.1, 1.5, 0.2],
       [5. , 3.6, 1.4, 0.2],
       [5.4, 3.9, 1.7, 0.4],
       [4.6, 3.4, 1.4, 0.3],
       [5. , 3.4, 1.5, 0.2],
       [4.4, 2.9, 1.4, 0.2],
       [4.9, 3.1, 1.5, 0.1],
       [5.4, 3.7, 1.5, 0.2],
       [4.8, 3.4, 1.6, 0.2],
       [4.8, 3. , 1.4, 0.1],
       [4.3, 3. , 1.1, 0.1],
       [5.8, 4. , 1.2, 0.2],
       [5.7, 4.4, 1.5, 0.4],
       [5.4, 3.9, 1.3, 0.4],
       [5.1, 3.5, 1.4, 0.3],
       [5.7, 3.8, 1.7, 0.3],
       [5.1, 3.8, 1.5, 0.3],
       [5.4, 3.4, 1.7, 0.2],
       [5.1, 3.7, 1.5, 0.4],
       [4.6, 3.6, 1. , 0.2],
       [5.1, 3.3, 1.7, 0.5],
       [4.8, 3.4, 1.9, 0.2],
       [5. , 3. , 1.6, 0.2],
       [5. , 3.4, 1.6, 0.4],
       [5.2, 3.5, 1.5, 0.2],
       [5.2, 3.4, 1.4, 0.2],
       [4.7, 3.2, 1.6, 0.2],
       [4.8, 3.1, 1.6, 0.2],
       [5.4, 3.4, 1.5, 0.4],
       [5.2, 4.1, 1.5, 0.1],
       [5.5, 4.2, 1.4, 0.2],
       [4.9, 3

In [9]:
iris.target_names

array(['setosa', 'versicolor', 'virginica'], dtype='<U10')

In [10]:
iris.target

array([0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
       0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
       0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
       1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
       1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2,
       2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2,
       2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2])

In [11]:
X = iris.data
y = iris.target

In [12]:
X, y

(array([[5.1, 3.5, 1.4, 0.2],
        [4.9, 3. , 1.4, 0.2],
        [4.7, 3.2, 1.3, 0.2],
        [4.6, 3.1, 1.5, 0.2],
        [5. , 3.6, 1.4, 0.2],
        [5.4, 3.9, 1.7, 0.4],
        [4.6, 3.4, 1.4, 0.3],
        [5. , 3.4, 1.5, 0.2],
        [4.4, 2.9, 1.4, 0.2],
        [4.9, 3.1, 1.5, 0.1],
        [5.4, 3.7, 1.5, 0.2],
        [4.8, 3.4, 1.6, 0.2],
        [4.8, 3. , 1.4, 0.1],
        [4.3, 3. , 1.1, 0.1],
        [5.8, 4. , 1.2, 0.2],
        [5.7, 4.4, 1.5, 0.4],
        [5.4, 3.9, 1.3, 0.4],
        [5.1, 3.5, 1.4, 0.3],
        [5.7, 3.8, 1.7, 0.3],
        [5.1, 3.8, 1.5, 0.3],
        [5.4, 3.4, 1.7, 0.2],
        [5.1, 3.7, 1.5, 0.4],
        [4.6, 3.6, 1. , 0.2],
        [5.1, 3.3, 1.7, 0.5],
        [4.8, 3.4, 1.9, 0.2],
        [5. , 3. , 1.6, 0.2],
        [5. , 3.4, 1.6, 0.4],
        [5.2, 3.5, 1.5, 0.2],
        [5.2, 3.4, 1.4, 0.2],
        [4.7, 3.2, 1.6, 0.2],
        [4.8, 3.1, 1.6, 0.2],
        [5.4, 3.4, 1.5, 0.4],
        [5.2, 4.1, 1.5, 0.1],
        [5

In [13]:
# Split the dataset into training and test sets.
# An integer test_size is an absolute count: 20 of the 150 samples are held out.
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=20, random_state=42)
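
Because `test_size` accepts either an integer or a float, it is easy to confuse the two. The sketch below compares both forms on the same data; the variable names with `_int` and `_frac` suffixes are introduced here purely for illustration.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split

X, y = load_iris(return_X_y=True)

# Integer test_size: an absolute number of test samples (20 of 150)
X_tr_int, X_te_int, y_tr_int, y_te_int = train_test_split(
    X, y, test_size=20, random_state=42)

# Float test_size: a fraction of the dataset (0.2 * 150 = 30 samples)
X_tr_frac, X_te_frac, y_tr_frac, y_te_frac = train_test_split(
    X, y, test_size=0.2, random_state=42)

print(X_tr_int.shape, X_te_int.shape)    # (130, 4) (20, 4)
print(X_tr_frac.shape, X_te_frac.shape)  # (120, 4) (30, 4)
```

With a test set this small, passing `stratify=y` is one way to keep the class proportions similar in both splits.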

In [14]:
X_train 

array([[4.7, 3.2, 1.6, 0.2],
       [6.1, 3. , 4.9, 1.8],
       [5. , 3.4, 1.6, 0.4],
       [6.4, 2.8, 5.6, 2.1],
       [7.9, 3.8, 6.4, 2. ],
       [6.7, 3. , 5.2, 2.3],
       [6.7, 2.5, 5.8, 1.8],
       [6.8, 3.2, 5.9, 2.3],
       [4.8, 3. , 1.4, 0.3],
       [4.8, 3.1, 1.6, 0.2],
       [4.6, 3.6, 1. , 0.2],
       [5.7, 4.4, 1.5, 0.4],
       [6.7, 3.1, 4.4, 1.4],
       [4.8, 3.4, 1.6, 0.2],
       [4.4, 3.2, 1.3, 0.2],
       [6.3, 2.5, 5. , 1.9],
       [6.4, 3.2, 4.5, 1.5],
       [5.2, 3.5, 1.5, 0.2],
       [5. , 3.6, 1.4, 0.2],
       [5.2, 4.1, 1.5, 0.1],
       [5.8, 2.7, 5.1, 1.9],
       [6. , 3.4, 4.5, 1.6],
       [6.7, 3.1, 4.7, 1.5],
       [5.4, 3.9, 1.3, 0.4],
       [5.4, 3.7, 1.5, 0.2],
       [5.5, 2.4, 3.7, 1. ],
       [6.3, 2.8, 5.1, 1.5],
       [6.4, 3.1, 5.5, 1.8],
       [6.6, 3. , 4.4, 1.4],
       [7.2, 3.6, 6.1, 2.5],
       [5.7, 2.9, 4.2, 1.3],
       [7.6, 3. , 6.6, 2.1],
       [5.6, 3. , 4.5, 1.5],
       [5.1, 3.5, 1.4, 0.2],
       [7.7, 2

In [16]:
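scaler = StandardScaler()
X_train = scaler.fit_transform(X_train)
# Caution: the scaled test set is stored in lowercase `x_test`;
# `X_test`, which the prediction cells below use, remains unscaled.
x_test = scaler.transform(X_test)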
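
As a rough check of what the scaler does, the following sketch standardizes a fresh copy of the split and verifies the result by hand. The names `X_train_scaled`, `X_test_scaled`, and `manual` are chosen here for clarity and do not appear in the notebook.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=20, random_state=42)

scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)  # learns mean_ and scale_ from the training data only
X_test_scaled = scaler.transform(X_test)        # reuses the training statistics

# Training columns end up with mean 0 and standard deviation 1
print(X_train_scaled.mean(axis=0).round(6))
print(X_train_scaled.std(axis=0).round(6))

# transform() is (x - mean_) / scale_ using the stored training statistics
manual = (X_test - scaler.mean_) / scaler.scale_
print(np.allclose(manual, X_test_scaled))
```

Fitting the scaler on the training data alone keeps information about the test set out of the preprocessing step.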

In [17]:
X_train, X_test

(array([[-1.33479525,  0.30812582, -1.21295773, -1.30685472],
        [ 0.3291276 , -0.14706005,  0.65079179,  0.7938082 ],
        [-0.97824036,  0.76331169, -1.21295773, -1.04427185],
        [ 0.68568249, -0.60224592,  1.0461326 ,  1.18768249],
        [ 2.46845698,  1.67368342,  1.49795066,  1.05639106],
        [ 1.04223739, -0.14706005,  0.82022356,  1.45026536],
        [ 1.04223739, -1.28502472,  1.15908711,  0.7938082 ],
        [ 1.16108902,  0.30812582,  1.21556437,  1.45026536],
        [-1.21594362, -0.14706005, -1.32591224, -1.17556328],
        [-1.21594362,  0.08053288, -1.21295773, -1.30685472],
        [-1.45364689,  1.21849755, -1.55182128, -1.30685472],
        [-0.14627893,  3.03924102, -1.26943499, -1.04427185],
        [ 1.04223739,  0.08053288,  0.3684055 ,  0.26864247],
        [-1.21594362,  0.76331169, -1.21295773, -1.30685472],
        [-1.69135015,  0.30812582, -1.3823895 , -1.30685472],
        [ 0.56683086, -1.28502472,  0.70726905,  0.92509963],
        

In [18]:
# Initialize multinomial logistic regression model
multi_model = LogisticRegression(multi_class='multinomial', solver='lbfgs', max_iter=1000, random_state=42)

In [19]:
# Train the multinomial logistic regression model
multi_model.fit(X_train , y_train)

LogisticRegression(max_iter=1000, multi_class='multinomial', random_state=42)
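
To connect the fitted model back to the softmax description, the sketch below recomputes the class probabilities by hand from the model's linear scores. It assumes a recent scikit-learn in which `LogisticRegression` with the default `lbfgs` solver fits a multinomial model on multiclass data, so `multi_class` is left unset; it also fits on the full, freshly scaled dataset only to keep the example short.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.preprocessing import StandardScaler

X, y = load_iris(return_X_y=True)
X_scaled = StandardScaler().fit_transform(X)

# Assumption: default settings give a multinomial (softmax) model for 3 classes
model = LogisticRegression(max_iter=1000).fit(X_scaled, y)

print(model.coef_.shape)       # one row of coefficients per class
print(model.intercept_.shape)  # one intercept per class

# Linear scores for every sample and class, then softmax over the classes
scores = model.decision_function(X_scaled)
exp = np.exp(scores - scores.max(axis=1, keepdims=True))
manual_proba = exp / exp.sum(axis=1, keepdims=True)

proba = model.predict_proba(X_scaled)
print(np.allclose(manual_proba, proba))
print(np.allclose(proba.sum(axis=1), 1.0))
```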

In [20]:
# Make predictions using the multinomial logistic regression model

y_pred_multi = multi_model.predict(X_test)

In [21]:
y_pred_multi

array([2, 1, 2, 2, 2, 1, 2, 2, 2, 2, 2, 1, 1, 1, 1, 2, 2, 2, 2, 2])

In [22]:
y_test

array([1, 0, 2, 1, 1, 0, 1, 2, 1, 1, 2, 0, 0, 0, 0, 1, 2, 1, 1, 2])

In [24]:
print(" Multinomial Logistic Regression:")
print("Confusion Matrix")
print(confusion_matrix(y_test , y_pred_multi))
print("\nAccuracy:")
print(accuracy_score(y_test , y_pred_multi))
print("\nClassification Report:")
print(classification_report(y_test, y_pred_multi))

 Multinomial Logistic Regression:
Confusion Matrix
[[0 6 0]
 [0 0 9]
 [0 0 5]]

Accuracy:
0.25

Classification Report:
              precision    recall  f1-score   support

           0       0.00      0.00      0.00         6
           1       0.00      0.00      0.00         9
           2       0.36      1.00      0.53         5

    accuracy                           0.25        20
   macro avg       0.12      0.33      0.18        20
weighted avg       0.09      0.25      0.13        20
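
The figures in the report can be reproduced from the confusion matrix alone. The sketch below copies the matrix printed above and derives precision, recall, and accuracy with plain NumPy; it is a worked check of the arithmetic, not a replacement for `classification_report`.

```python
import numpy as np

# Confusion matrix copied from the output above
# (rows: true class, columns: predicted class)
cm = np.array([[0, 6, 0],
               [0, 0, 9],
               [0, 0, 5]])

true_positives = np.diag(cm)
predicted_per_class = cm.sum(axis=0)  # column sums
actual_per_class = cm.sum(axis=1)     # row sums, the "support" column

# Classes that were never predicted get precision 0 instead of a division by zero
precision = np.divide(true_positives, predicted_per_class,
                      out=np.zeros(3), where=predicted_per_class > 0)
recall = true_positives / actual_per_class
accuracy = true_positives.sum() / cm.sum()

print(precision.round(2))
print(recall.round(2))
print(accuracy)
```

Class 2 is predicted 14 times and is correct 5 times, which gives the precision of 5/14 ≈ 0.36 and the recall of 5/5 = 1.00 shown in the report.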





In [25]:
# Initialize OvR logistic regression model
ovr_model = LogisticRegression(multi_class='ovr', solver='lbfgs', max_iter=1000, random_state=42)

In [26]:
# Train the OvR logistic regression model
ovr_model.fit(X_train , y_train)

LogisticRegression(max_iter=1000, multi_class='ovr', random_state=42)
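
The one-classifier-per-class structure can be made visible with `OneVsRestClassifier` from `sklearn.multiclass`. Note that this wrapper is swapped in here as an alternative to the `multi_class='ovr'` option used in the notebook, and the model is fitted on the full, freshly scaled dataset just to keep the sketch short.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.multiclass import OneVsRestClassifier
from sklearn.preprocessing import StandardScaler

X, y = load_iris(return_X_y=True)
X_scaled = StandardScaler().fit_transform(X)

# OneVsRestClassifier fits one binary LogisticRegression per class
ovr = OneVsRestClassifier(LogisticRegression(max_iter=1000)).fit(X_scaled, y)
print(len(ovr.estimators_))  # number of binary classifiers

# Score every sample with each "class k vs. rest" classifier,
# then pick the class whose classifier gives the highest score
scores = np.column_stack(
    [est.decision_function(X_scaled) for est in ovr.estimators_])
manual_pred = ovr.classes_[np.argmax(scores, axis=1)]

print(np.array_equal(manual_pred, ovr.predict(X_scaled)))
```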

In [28]:
# Make predictions using the OvR logistic regression model
y_pred_ovr = ovr_model.predict(X_test)

In [29]:
y_pred_ovr

array([2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2])

In [30]:
print(" OVR Logistic Regression:")
print("Confusion Matrix")
print(confusion_matrix(y_test , y_pred_ovr))
print("\nAccuracy:")
print(accuracy_score(y_test , y_pred_ovr))
print("\nClassification Report:")
print(classification_report(y_test, y_pred_ovr))

 OVR Logistic Regression:
Confusion Matrix
[[0 0 6]
 [0 0 9]
 [0 0 5]]

Accuracy:
0.25

Classification Report:
              precision    recall  f1-score   support

           0       0.00      0.00      0.00         6
           1       0.00      0.00      0.00         9
           2       0.25      1.00      0.40         5

    accuracy                           0.25        20
   macro avg       0.08      0.33      0.13        20
weighted avg       0.06      0.25      0.10        20
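
The accuracy of 0.25 in both reports most likely reflects the scaling cell rather than the models: the scaled test data is assigned to `x_test`, while `predict` receives `X_test`, which still holds raw centimetre values. The sketch below reruns the workflow with consistent names. It makes two assumptions: `LogisticRegression` with default settings is used as the multinomial model, and `OneVsRestClassifier` stands in for `multi_class='ovr'`. The names `X_train_scaled`, `X_test_scaled`, `models`, and `accuracies` are introduced here.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.multiclass import OneVsRestClassifier
from sklearn.preprocessing import StandardScaler

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=20, random_state=42)

scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)
X_test_scaled = scaler.transform(X_test)  # the same name is used for prediction below

models = {
    "multinomial": LogisticRegression(max_iter=1000),
    "ovr": OneVsRestClassifier(LogisticRegression(max_iter=1000)),
}

accuracies = {}
for name, model in models.items():
    model.fit(X_train_scaled, y_train)
    accuracies[name] = accuracy_score(y_test, model.predict(X_test_scaled))
    print(name, accuracies[name])
```

Predicting on data scaled with the training statistics should give accuracies far above the 0.25 recorded above, since the iris classes are largely separable by petal measurements.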



