# Multi Class - predict a single category out of multiple categories for a data point

- Logistic Regression with multi_class='multinomial' (softmax)
- OneVsRest Logistic Regression with a BASE binary classifier
    can also be used for multiple labels if y is a 2D matrix of **0s or 1s** **only**
- OneVsOne Logistic Regression with a BASE binary classifier

# Multi Label - predict multiple categories out of multiple categories for a data point

- MultiOutputClassifier with a BASE binary/multi class classifier

**Binary classifiers give binary class probabilities**
- y is 1D, 0 or 1 (or any two distinct values)
- proba(0), proba(1) [the function can be a sigmoid]
- a cut-off other than 0.5 can be chosen
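
A minimal sketch of the cut-off idea, using made-up scores rather than a fitted model: the sigmoid turns a raw score into proba(1), and the predicted class depends on where the cut-off is placed. The `scores` values and the 0.3 cut-off are illustrative assumptions.

```python
import numpy as np

def sigmoid(z):
    # Map raw scores (logits) to probabilities of class 1
    return 1.0 / (1.0 + np.exp(-z))

# Hypothetical raw scores for four data points
scores = np.array([-2.0, -0.5, 0.3, 1.5])
proba_1 = sigmoid(scores)
proba_0 = 1.0 - proba_1

# Default cut-off of 0.5 versus a lower cut-off of 0.3
pred_default = (proba_1 >= 0.5).astype(int)
pred_lower = (proba_1 >= 0.3).astype(int)

print(np.round(proba_1, 3))
print(pred_default, pred_lower)
```

Lowering the cut-off flips the second point from class 0 to class 1 while the probabilities themselves stay the same.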

**Multi class classifiers give n-class probabilities**
- y is 1D, 0,1,2,..n
- y can't be 2D
- proba(0), proba(1), proba(2), proba(3) [4 classes with softmax for example, probabilities sum to 1]
- the highest-probability class is chosen, no cut-off
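
The softmax step can be sketched in a few lines of NumPy. The logits below are invented for illustration; the point is that every row sums to 1 and the winning class is simply the argmax, even when its probability is below 0.5.

```python
import numpy as np

def softmax(logits):
    # Subtract the row max for numerical stability; the result is unchanged
    shifted = logits - logits.max(axis=1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)

# Hypothetical logits for two data points and four classes
logits = np.array([[1.0, 0.8, 0.5, 0.2],
                   [0.2, 0.1, 3.0, 0.0]])
probs = softmax(logits)

row_sums = probs.sum(axis=1)          # each row sums to 1
pred_classes = probs.argmax(axis=1)   # highest-probability class, no cut-off

print(np.round(probs, 3))
print(row_sums, pred_classes)
```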

**One vs Rest is similar to multi class**
- y is 1D, 0,1,2,..n
- proba(0), proba(1), proba(2), proba(3) [4 classes; the Yes prob of each binary model is divided by the row sum, so the probabilities sum to 1]
- the highest-scoring class is chosen, no cut-off

- y can be 2D but binarized (one hot encoding)
- Ex: [[0,1,0,0],[0,0,1,0]]
- proba(0), proba(1), proba(2), proba(3) [4 classes; raw Yes prob of each binary model, total prob sum != 1]
- sklearn treats any 2D 0/1 target as multi label: predict returns a 0/1 row with a 1 wherever the Yes prob is above 0.5

- y can be 2D but binarized (can have multi output)
- Ex: [[0,1,1,0],[1,1,1,0]]
- proba(0), proba(1), proba(2), proba(3) [4 classes; raw Yes prob of each binary model, total prob sum != 1]
- each label is decided independently with the 0.5 cut-off, so a row can have zero, one or several 1s
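
A small check of how a 2D 0/1 target is handled, assuming a reasonably recent scikit-learn: with `OneVsRestClassifier` and a multi label y, `predict` appears to agree with thresholding each Yes prob at 0.5 rather than picking a single best class. The toy data is the same kind of 2-feature array used in the cells further down.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.multiclass import OneVsRestClassifier

X = np.array([[10, 10], [8, 10], [-5, 5.5], [-5.4, 5.5], [-20, -20], [-15, -20]])
# 2D 0/1 target: each column is one label, a row may contain several 1s
Y = np.array([[0, 1], [0, 1], [1, 0], [1, 1], [1, 1], [0, 1]])

ovr = OneVsRestClassifier(LogisticRegression()).fit(X, Y)
X_new = np.array([[-19, -20], [9, 9]])

yes_probs = ovr.predict_proba(X_new)          # one Yes prob per label, rows need not sum to 1
pred = ovr.predict(X_new)                     # 0/1 indicator rows
thresholded = (yes_probs > 0.5).astype(int)   # manual 0.5 cut-off per label

print(yes_probs)
print(pred)
print(thresholded)
```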

**One vs One**
- y **can only be 1D** (multi class, not multi label)
- Ex: [0,1,2,4,2]
- No probabilities, only classes


**MultiOutput**
- y must be 2D
- [[1,1,0],[1,0,0]] or [[1,2,3],[0,2,1],[1,3,0]]
- one independent model for each column
- each model is a binary or multi class model as above
- predict_proba returns a list with one probabilities array per model
- each array has one column per distinct label in that target column; in the second example, column 3 has three labels



In [1]:
import numpy as np
import pandas as pd

from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import confusion_matrix
from sklearn.metrics import classification_report
from sklearn.metrics import f1_score
from sklearn.metrics import recall_score
from sklearn.metrics import precision_score

from sklearn.preprocessing import LabelBinarizer

In [2]:
test_lr=LogisticRegression(multi_class='multinomial',solver='lbfgs')

ovr2_x = np.array([
    [10, 10],
    [8, 10],
    [-5, 5.5],
    [-5.4, 5.5],
    [-20, -20],
   [-15, -20]
])
# Plain LogisticRegression only accepts a 1D y.
# One-hot encode the labels with LabelBinarizer to show that a 2D y is rejected.
lb=LabelBinarizer()
lb.fit([0,1,2])
print(lb.transform([0,1,1,2,1,1]))
over2_y=lb.transform([0,1,1,2,1,1])
## It has to be BINARY labels
#over2_y = np.array([[0,1],[0,1],[1,0],[1,1], [1,1], [0,1]])

ovr2 =LogisticRegression().fit(ovr2_x, over2_y)

ovr2_preds_classes=ovr2.predict([[-19, -20],[9,9]])
ovr2_preds_probs=ovr2.predict_proba([[-19, -20],[9,9]])

ovr2_preds_classes,ovr2_preds_classes.shape,ovr2_preds_probs,ovr2_preds_probs.shape,ovr2_preds_probs.sum(axis=1)

[[1 0 0]
 [0 1 0]
 [0 1 0]
 [0 0 1]
 [0 1 0]
 [0 1 0]]


ValueError: y should be a 1d array, got an array of shape (6, 3) instead.

In [3]:
# Note that lr takes string labels as well
# multinomial uses cross entropy loss
test_lr=LogisticRegression(multi_class='multinomial',solver='lbfgs')
test_lr.fit([[1,2,3,4],[4,3,2,1]],['P','Q'])
test_lr.predict([[4,2,3,3]])
confusion_matrix(['P'],test_lr.predict([[4,2,3,3]]))

array([[0, 1],
       [0, 0]], dtype=int64)

In [4]:
test_lr.predict_proba([[4,2,3,3]]),test_lr.predict_proba([[4,2,3,3]]).sum(axis=1)

(array([[0.39517592, 0.60482408]]), array([1.]))

In [5]:
print('default is ovr')
#test_lr=LogisticRegression(multi_class='ovr',solver='lbfgs')
test_lr=LogisticRegression(solver='lbfgs')
test_lr.fit([[1,2,3,4],[4,3,2,1]],['P','Q'])
test_lr.predict([[4,2,3,3]])
confusion_matrix(['P'],test_lr.predict([[4,2,3,3]]))

default is ovr


array([[0, 1],
       [0, 0]], dtype=int64)

In [6]:
test_lr.predict_proba([[4,2,3,3]]),test_lr.predict_proba([[4,2,3,3]]).sum(axis=1)

(array([[0.41904338, 0.58095662]]), array([1.]))

In [7]:
test_lr.classes_

array(['P', 'Q'], dtype='<U1')

In [8]:
iris=load_iris()
print(iris.data.shape,iris.target.shape)
# Predictions were too accurate, so the training set is made much smaller than the test set
# A = X_train, C = X_test, B = y_train, D = y_test
A, C, B, D = train_test_split(iris.data, iris.target, test_size=0.75, random_state=25)

(150, 4) (150,)


In [9]:
lr=LogisticRegression(multi_class='multinomial',solver='lbfgs')
#lr.fit(iris.data,iris.target)
lr.fit(A,B)

LogisticRegression(multi_class='multinomial')

iris.data[-1,:] returns a single row as a 1D array of shape (4,)
iris.data[-1:,:] returns a 2D array of shape (1, 4); a pandas DataFrame behaves the same way with df.iloc[-1] (Series) and df.iloc[-1:] (DataFrame)

In [10]:
iris.data[-2:,:].shape,iris.data[-1:,:].shape,iris.data[-1,:].shape, iris.target[-1].shape,iris.target[-1:].shape
#iris.data[-1:,:]


((2, 4), (1, 4), (4,), (), (1,))

In [11]:
df=pd.DataFrame(iris.data)
df.iloc[-1],iris.data[-1,:],df.iloc[-1].shape,iris.data[-1,:].shape

(0    5.9
 1    3.0
 2    5.1
 3    1.8
 Name: 149, dtype: float64,
 array([5.9, 3. , 5.1, 1.8]),
 (4,),
 (4,))

In [12]:
df.iloc[-1:],iris.data[-1:,:],df.iloc[-1:].shape,iris.data[-1:,:].shape

(       0    1    2    3
 149  5.9  3.0  5.1  1.8,
 array([[5.9, 3. , 5.1, 1.8]]),
 (1, 4),
 (1, 4))

In [13]:
iris.target[-100:]

array([1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
       1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
       1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2,
       2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2,
       2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2])

In [14]:
#pred_classes=lr.predict(iris.data[-100:,:])
#pred_probs=lr.predict_proba(iris.data[-100:,:])

In [15]:
# predict takes a 2D array
# shape: batch size x features
pred_classes=lr.predict(C)
pred_probs=lr.predict_proba(C)

print(pred_classes.shape,pred_probs.shape)
print(pred_classes)
# Softmax probabilities
print(pred_probs)
# Each row of probabilities sums to 1
print(pred_probs.sum(axis=1))

(113,) (113, 3)
[0 2 2 1 2 1 2 0 1 1 0 0 0 2 0 1 2 2 1 1 1 1 1 0 0 2 1 2 2 0 1 2 2 0 2 1 1
 0 0 0 0 0 0 0 2 0 0 1 0 2 2 0 0 2 1 2 2 1 2 1 2 2 1 0 2 1 2 0 1 2 0 0 2 1
 1 0 2 1 2 1 2 0 0 1 0 0 1 2 0 2 1 1 1 2 1 0 2 0 0 1 2 2 2 1 0 2 0 0 1 1 0
 0 0]
[[9.24828716e-01 7.51528698e-02 1.84139895e-05]
 [5.14288601e-03 4.81816483e-01 5.13040631e-01]
 [9.74730444e-03 3.77111350e-01 6.13141346e-01]
 [2.85214537e-02 5.82142495e-01 3.89336052e-01]
 [3.38176294e-04 8.94896073e-02 9.10172216e-01]
 [2.33456851e-02 7.34413664e-01 2.42240651e-01]
 [5.85345072e-03 3.50364333e-01 6.43782216e-01]
 [9.48747996e-01 5.12394754e-02 1.25282403e-05]
 [9.16209975e-02 8.57147402e-01 5.12316003e-02]
 [1.83880419e-01 7.91627455e-01 2.44921257e-02]
 [9.31663349e-01 6.83077285e-02 2.89220215e-05]
 [8.98394804e-01 1.01566659e-01 3.85369767e-05]
 [9.04781258e-01 9.51698998e-02 4.88424492e-05]
 [3.94662870e-03 3.79936489e-01 6.16116882e-01]
 [9.42848195e-01 5.71327672e-02 1.90376275e-05]
 [1.49975338e-02 6.91916689e-01 2

In [16]:
def all_metrics(y_true, y_pred):
    # Confusion matrix plus per-class recall, precision and F1
    return (
        confusion_matrix(y_true, y_pred),
        recall_score(y_true, y_pred, average=None),
        precision_score(y_true, y_pred, average=None),
        f1_score(y_true, y_pred, average=None),
    )

In [17]:
all_metrics(D,pred_classes)

(array([[41,  0,  0],
        [ 0, 35,  4],
        [ 0,  0, 33]], dtype=int64),
 array([1.       , 0.8974359, 1.       ]),
 array([1.        , 1.        , 0.89189189]),
 array([1.        , 0.94594595, 0.94285714]))
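
The per-class scores can be recomputed by hand from the confusion matrix, which makes the row/column convention explicit. This is a sketch using only the matrix printed above, with rows as true classes and columns as predicted classes.

```python
import numpy as np

# Confusion matrix copied from the output above (rows = true class, columns = predicted class)
cm = np.array([[41, 0, 0],
               [0, 35, 4],
               [0, 0, 33]])

true_positives = np.diag(cm)
recall = true_positives / cm.sum(axis=1)      # divide by the number of true samples per class
precision = true_positives / cm.sum(axis=0)   # divide by the number of predictions per class
f1 = 2 * precision * recall / (precision + recall)

print(recall)
print(precision)
print(f1)
```

Class 1 loses recall because 4 of its 39 samples were predicted as class 2, and class 2 loses precision for the same reason.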

In [18]:
print(classification_report(D,pred_classes))

              precision    recall  f1-score   support

           0       1.00      1.00      1.00        41
           1       1.00      0.90      0.95        39
           2       0.89      1.00      0.94        33

    accuracy                           0.96       113
   macro avg       0.96      0.97      0.96       113
weighted avg       0.97      0.96      0.96       113



In [19]:
lr.classes_

array([0, 1, 2])

# Multi Class - predict a single category out of multiple categories for a data point
## OneVsRestClassifier
- **y is 1D or 2D**
- generates num_of_models = num_of_labels
- each model = one vs rest
- so the probabilities array has shape num_of_test_data_points x num_of_models
(technically each model generates 2 binary probabilities, Yes Prob = 0... and No Prob = 0..., and only the Yes Prob is kept)
- when y is 1D, the **Yes Prob** of each model is divided by the sum over all models, so the probabilities sum to 1 and the top-scoring class is picked
- when y is 2D, the raw **Yes Prob** of each model is reported, so the probabilities need not sum to 1, and each label is set to 1 when its Yes Prob is above 0.5
- so the classes array has shape num_of_test_data_points for 1D y, and num_of_test_data_points x num_of_labels for 2D y
- Note: with 1D y, if probs are 0.2, 0.4, 0.3 the second one is picked. It need not be > 0.5, just as with softmax logistic regression
- SIMILAR TO SOFTMAX/MULTINOMIAL for 1D y
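
A quick way to see where the 1D-y probabilities come from is to rebuild them from the fitted binary models. The sketch below assumes current scikit-learn behaviour, where `predict_proba` for a 1D y appears to be each model's Yes prob divided by the row sum (a plain normalization, not a softmax over logits).

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.multiclass import OneVsRestClassifier

X = np.array([[10, 10], [8, 10], [-5, 5.5], [-5.4, 5.5], [-20, -20], [-15, -20]])
y = np.array([0, 0, 1, 1, 2, 2])          # 1D target, three classes
X_new = np.array([[-19, -20], [9, 9]])

ovr = OneVsRestClassifier(LogisticRegression()).fit(X, y)

# Yes prob (column 1) from each of the three fitted one-vs-rest models
yes_probs = np.column_stack([est.predict_proba(X_new)[:, 1] for est in ovr.estimators_])
# Divide each row by its sum so the probabilities add up to 1
normalized = yes_probs / yes_probs.sum(axis=1, keepdims=True)

print(yes_probs.sum(axis=1))       # raw Yes probs, generally not summing to 1
print(normalized)
print(ovr.predict_proba(X_new))    # expected to match `normalized`
```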

In [20]:

from sklearn.multiclass import OneVsRestClassifier

ovr_x = np.array([
    [10, 10],
    [8, 10],
    [-5, 5.5],
    [-5.4, 5.5],
    [-20, -20],
   [-15, -20]
])
print('y is 1D of distinct values')
print('single data point can belong ONLY 1 class')
over_y = np.array([0, 0, 1, 1, 2, 2])
ovr = OneVsRestClassifier(LogisticRegression()).fit(ovr_x, over_y)
ovr_preds_classes=ovr.predict([[-19, -20],[9,9]])
ovr_preds_probs=ovr.predict_proba([[-19, -20],[9,9]])

print('Sum of probabilities is 1: Yes-probs are divided by their row sum')
ovr_preds_classes,ovr_preds_classes.shape,ovr_preds_probs,ovr_preds_probs.shape,ovr_preds_probs.sum(axis=1)


y is 1D of distinct values
single data point can belong ONLY 1 class
Sum of probabilities is 1: Yes-probs are divided by their row sum


(array([2, 0]),
 (2,),
 array([[2.13656050e-07, 1.98122741e-02, 9.80187512e-01],
        [9.79177154e-01, 2.05881382e-02, 2.34708228e-04]]),
 (2, 3),
 array([1., 1.]))

In [21]:
from sklearn.multiclass import OneVsRestClassifier

ovr2_x = np.array([
    [10, 10],
    [8, 10],
    [-5, 5.5],
    [-5.4, 5.5],
    [-20, -20],
   [-15, -20]
])
print('y is 2D of 0s or 1s - one hot')
print('single data point can belong to ONLY 1 class')
# One-hot encode the 1D labels with LabelBinarizer to get a 2D 0/1 target
lb=LabelBinarizer()
lb.fit([0,1,2])
print(lb.transform([0, 0, 1, 1, 2, 2]))
over2_y=lb.transform([0, 0, 1, 1, 2, 2])
## It has to be BINARY labels
#over2_y = np.array([[0,1],[0,1],[1,0],[1,1], [1,1], [0,1]])

ovr2 = OneVsRestClassifier(LogisticRegression()).fit(ovr2_x, over2_y)

ovr2_preds_classes=ovr2.predict([[-19, -20],[9,9]])
ovr2_preds_probs=ovr2.predict_proba([[-19, -20],[9,9]])

print('Sum of probabilities is not 1, each Yes-prob above 0.5 is reported as 1')
ovr2_preds_classes,ovr2_preds_classes.shape,ovr2_preds_probs,ovr2_preds_probs.shape,ovr2_preds_probs.sum(axis=1)


y is 2D of 0s or 1s - one hot
single data point can belong to ONLY 1 class
[[1 0 0]
 [1 0 0]
 [0 1 0]
 [0 1 0]
 [0 0 1]
 [0 0 1]]
Sum of probabilities is not 1, each Yes-prob above 0.5 is reported as 1


(array([[0, 0, 1],
        [1, 0, 0]]),
 (2, 3),
 array([[2.16935949e-07, 2.01164183e-02, 9.95234667e-01],
        [9.80416304e-01, 2.06141926e-02, 2.35005252e-04]]),
 (2, 3),
 array([1.0153513, 1.0012655]))

In [22]:
from sklearn.multiclass import OneVsRestClassifier

ovr2_x = np.array([
    [10, 10],
    [8, 10],
    [-5, 5.5],
    [-5.4, 5.5],
    [-20, -20],
   [-15, -20]
])
print('y is 2D of 0s or 1s - single data point can belong to 1 or more classes')
print('single data point can belong to 1 or MORE classes')
# y is already a 2D 0/1 indicator matrix, so no label binarization is needed.
# Each column is a separate label and gets its own binary model.
over2_y=np.array([[1 ,1 ,1],[0 ,1,0],[0 ,1 ,0],[0 ,0 ,1],[0, 1 ,0],[0 ,1, 0]])
ovr2 = OneVsRestClassifier(LogisticRegression()).fit(ovr2_x, over2_y)

ovr2_preds_classes=ovr2.predict([[-19, -20],[9,9]])
ovr2_preds_probs=ovr2.predict_proba([[-19, -20],[9,9]])


print('Sum of probabilities is not 1, each Yes-prob above 0.5 is reported as 1')
ovr2_preds_classes,ovr2_preds_classes.shape,ovr2_preds_probs,ovr2_preds_probs.shape,ovr2_preds_probs.sum(axis=1)

y is 2D of 0s or 1s - single data point can belong to 1 or more classes
single data point can belong to 1 or MORE classes
Sum of probabilities is not 1, each Yes-prob above 0.5 is reported as 1


(array([[0, 1, 0],
        [0, 1, 0]]),
 (2, 3),
 array([[5.88330448e-09, 9.87264033e-01, 1.11036018e-02],
        [4.99769937e-01, 9.88226100e-01, 4.92769421e-01]]),
 (2, 3),
 array([0.99836764, 1.98076546]))

In [23]:
from sklearn.multiclass import OneVsRestClassifier

ovr2_x = np.array([
    [10, 10],
    [8, 10],
    [-5, 5.5],
    [-5.4, 5.5],
    [-20, -20],
   [-15, -20]
])
# A 2D y must contain only 0s and 1s. Non-binary values such as
# np.array([[0,1],[0,2],[1,0],[1,1],[2,1],[2,1]]) raise
# "Multioutput target data is not supported with label binarization"
over2_y = np.array([[0,1],[0,1],[1,0],[1,1], [1,1], [0,1]])
ovr2 = OneVsRestClassifier(LogisticRegression()).fit(ovr2_x, over2_y)

ovr2_preds_classes=ovr2.predict([[-19, -20],[9,9]])
ovr2_preds_probs=ovr2.predict_proba([[-19, -20],[9,9]])

ovr2_preds_classes,ovr2_preds_classes.shape,ovr2_preds_probs,ovr2_preds_probs.shape,ovr2_preds_probs.sum(axis=1)


(array([[1, 1],
        [0, 1]]),
 (2, 2),
 array([[0.77027017, 0.98508217],
        [0.00621082, 0.97810504]]),
 (2, 2),
 array([1.75535235, 0.98431586]))

In [24]:
from sklearn.multiclass import OneVsRestClassifier

ovr2_x = np.array([
    [10, 10],
    [8, 10],
    [-5, 5.5],
    [-5.4, 5.5],
    [-20, -20],
   [-15, -20]
])
print('y is 2D but values are not 0s or 1s')
# Values other than 0/1 in a 2D y make this a multiclass-multioutput target,
# which OneVsRestClassifier rejects
over2_y = np.array([[0,1],[0,2],[1,0],[1,1], [2,1], [2,1]])
# A binary version such as [[0,1],[0,1],[1,0],[1,1],[1,1],[0,1]] would be accepted
ovr2 = OneVsRestClassifier(LogisticRegression()).fit(ovr2_x, over2_y)

ovr2_preds_classes=ovr2.predict([[-19, -20],[9,9]])
ovr2_preds_probs=ovr2.predict_proba([[-19, -20],[9,9]])

ovr2_preds_classes,ovr2_preds_classes.shape,ovr2_preds_probs,ovr2_preds_probs.shape,ovr2_preds_probs.sum(axis=1)

y is 2D but values are not 0s or 1s


ValueError: Multioutput target data is not supported with label binarization

## OneVsOne
- **y is 1D only**

- generates num_of_models = num_of_labels choose 2 (nC2)
- each model = one pair of class labels
- there is no probabilities array, since OneVsOneClassifier has no predict_proba
- so classes array shape = num_of_test_data_points
- the class with the majority of votes is picked
- Note: unlike One vs Rest, the decision is based on vote counts across the pairwise models, not on one probability per class

https://machinelearningmastery.com/one-vs-rest-and-one-vs-one-for-multi-class-classification/
Each binary classification model may predict one class label and the model with the most predictions or votes is predicted by the one-vs-one strategy.


An alternative is to introduce K(K − 1)/2 binary discriminant functions, one for every possible pair of classes. This is known as a one-versus-one classifier. Each point is then classified according to a majority vote amongst the discriminant functions.

ex: 0,1,2,3 are the labels
nC2 is 4C2 is 6
0,1 gives prob(0), prob(1) => choose highest prob class
0,2 gives prob(0), prob(2) => choose highest prob class
0,3 gives prob(0), prob(3) => choose highest prob class
1,2 gives prob(1), prob(2) => choose highest prob class
1,3 gives prob(1), prob(3) => choose highest prob class
2,3 gives prob(2), prob(3) => choose highest prob class

- in One vs Rest, each model is for one particular class, so predict_proba makes sense

- in One vs One, each model is not for one class vs the rest. Class 2 gets three values, from (0,2), (1,2) and (2,3), so a single predict_proba per class does not make sense.


- Count the votes and pick the class that wins the most pairwise comparisons.
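
The pairwise voting can be sketched without any model at all. The winners in `pair_winners` are hypothetical, standing in for what each fitted binary classifier would predict for one data point.

```python
from collections import Counter
from itertools import combinations

labels = [0, 1, 2, 3]
pairs = list(combinations(labels, 2))   # one binary model per pair of classes

# Hypothetical winner of each pairwise model for ONE data point
# (in a real model this would come from each binary classifier's prediction)
pair_winners = {(0, 1): 1, (0, 2): 2, (0, 3): 0, (1, 2): 2, (1, 3): 1, (2, 3): 2}

votes = Counter(pair_winners[pair] for pair in pairs)
predicted_class = votes.most_common(1)[0][0]   # class with the most pairwise wins

print(len(pairs), dict(votes), predicted_class)
```

Here class 2 wins all three of its comparisons and is predicted. This simple count ignores ties; as far as I know, scikit-learn also uses the pairwise confidence scores to break them.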

In [25]:
from sklearn.multiclass import OneVsOneClassifier

ovo_x = np.array([
    [10, 10],
    [8, 10],
    [-5, 5.5],
    [-5.4, 5.5],
    [-20, -20],
   [-15, -20]
])

print('y cannot be 2D')
# A 2D y is rejected by OneVsOneClassifier, even when it only holds 0s and 1s
#ovo_y = np.array([[0,1],[0,2],[1,0],[1,1], [2,1], [2,1]])
ovo_y = np.array([[0,1],[0,1],[1,0],[1,1], [1,1], [0,1]])
ovo = OneVsOneClassifier(LogisticRegression()).fit(ovo_x, ovo_y)
ovo_preds_classes=ovo.predict([[-19, -20],[9,9]])

## OneVsOneClassifier has no predict_proba method
#ovo_preds_probs=ovo.predict_proba([[-19, -20],[9,9]])

ovo_preds_classes,ovo_preds_classes.shape
#ovo_preds_probs,ovo_preds_probs.shape

y cannot be 2D


ValueError: y should be a 1d array, got an array of shape (6, 2) instead.

In [26]:
from sklearn.multiclass import OneVsOneClassifier

ovo_x = np.array([
    [10, 10],
    [8, 10],
    [-5, 5.5],
    [-5.4, 5.5],
    [-20, -20],
   [-15, -20]
])
print('y is 1D of distinct values')
print('single data point can belong ONLY 1 class')
## 4 classes => 6 models
ovo_y = np.array([0, 0, 1, 1, 2, 3])
ovo = OneVsOneClassifier(LogisticRegression()).fit(ovo_x, ovo_y)
ovo_preds_classes=ovo.predict([[-19, -20],[9,9]])

## OneVsOneClassifier has no predict_proba method
#ovo_preds_probs=ovo.predict_proba([[-19, -20],[9,9]])

ovo_preds_classes,ovo_preds_classes.shape
#ovo_preds_probs,ovo_preds_probs.shape

y is 1D of distinct values
single data point can belong ONLY 1 class


(array([2, 0]), (2,))

### Other Multi Class Classification

## MultiOutput (predict multiple categories out of multiple categories for a data point)
- generates num_of_models = num_of_labels
- Each column is treated as an independent/separate model
- **Target is 2D**
- Target is generally 
- For each data point, the target is an array of num_of_labels values, e.g. [0, 1, 1]
- For regular classification, the target for each data point is a scalar label, e.g. 2
- each model = one for each column/label
- so predict_proba returns a list of num_of_models arrays, each of shape num_of_test_data_points x num_of_classes_in_that_column
- so classes array shape = num_of_test_data_points x num_labels
- each output picks the label in each model, [1 1 0] for example
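
To make "one independent model per column" concrete, the sketch below fits `MultiOutputClassifier` and also fits a separate `LogisticRegression` on each target column by hand, then compares the predictions. The `max_iter=1000` setting is an assumption added only to avoid convergence warnings.

```python
import numpy as np
from sklearn.datasets import make_multilabel_classification
from sklearn.linear_model import LogisticRegression
from sklearn.multioutput import MultiOutputClassifier

X, Y = make_multilabel_classification(n_classes=3, random_state=0)

multi = MultiOutputClassifier(LogisticRegression(max_iter=1000)).fit(X, Y)

# Fit one separate model per target column by hand
manual_preds = np.column_stack([
    LogisticRegression(max_iter=1000).fit(X, Y[:, col]).predict(X)
    for col in range(Y.shape[1])
])

probs = multi.predict_proba(X[-2:])   # list with one array per target column
print(len(multi.estimators_), len(probs), [p.shape for p in probs])
print(np.array_equal(multi.predict(X), manual_preds))
```

Because the columns are fitted independently, any correlation between labels is ignored by this wrapper.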


In [27]:
from sklearn.datasets import make_multilabel_classification
from sklearn.multioutput import MultiOutputClassifier

X, y = make_multilabel_classification(n_classes=3, random_state=0)
X.shape , X ,y


((100, 20),
 array([[3., 6., 1., ..., 1., 5., 0.],
        [3., 5., 5., ..., 1., 1., 1.],
        [3., 3., 5., ..., 0., 2., 1.],
        ...,
        [3., 7., 3., ..., 2., 4., 2.],
        [7., 2., 1., ..., 1., 1., 1.],
        [3., 5., 3., ..., 1., 2., 4.]]),
 array([[0, 1, 0],
        [0, 1, 0],
        [1, 1, 1],
        [1, 1, 1],
        [0, 1, 0],
        [1, 1, 0],
        [0, 0, 0],
        [1, 0, 0],
        [0, 0, 1],
        [0, 0, 0],
        [0, 1, 0],
        [0, 0, 0],
        [0, 1, 1],
        [1, 0, 0],
        [1, 1, 0],
        [1, 1, 1],
        [0, 0, 0],
        [0, 1, 0],
        [0, 0, 0],
        [1, 1, 1],
        [0, 1, 1],
        [0, 0, 1],
        [1, 0, 1],
        [0, 1, 0],
        [0, 1, 0],
        [1, 1, 0],
        [0, 1, 1],
        [0, 0, 1],
        [1, 1, 1],
        [1, 1, 0],
        [0, 1, 0],
        [0, 0, 1],
        [0, 0, 1],
        [0, 1, 0],
        [0, 0, 0],
        [0, 1, 0],
        [0, 1, 0],
        [1, 1, 1],
        [1, 1, 1]

In [28]:
clf = MultiOutputClassifier(LogisticRegression()).fit(X, y)
m_pred_classes=clf.predict(X[-2:])
m_pred_probs=clf.predict_proba(X[-2:])

print(m_pred_classes,m_pred_classes.shape,m_pred_probs,len(m_pred_probs))
#m_pred_probs.shape


[[1 1 1]
 [1 0 1]] (2, 3) [array([[0.14044346, 0.85955654],
       [0.16267135, 0.83732865]]), array([[0.09446415, 0.90553585],
       [0.78465578, 0.21534422]]), array([[0.453594  , 0.546406  ],
       [0.38098685, 0.61901315]])] 3


In [29]:
for i,model_probs in enumerate(m_pred_probs,0):
    print("Model",i)
    print(model_probs)
    print(model_probs.shape)

Model 0
[[0.14044346 0.85955654]
 [0.16267135 0.83732865]]
(2, 2)
Model 1
[[0.09446415 0.90553585]
 [0.78465578 0.21534422]]
(2, 2)
Model 2
[[0.453594   0.546406  ]
 [0.38098685 0.61901315]]
(2, 2)


Model 1 - 2 datapoints X binary 
[array([[0.14044346, 0.85955654],
       [0.16267135, 0.83732865]]), 
Model 2 - 2 datapoints X binary       
array([[0.09446415, 0.90553585],
       [0.78465578, 0.21534422]]), 
Model 3 - 2 datapoints X binary        
 array([[0.453594  , 0.546406  ],
       [0.38098685, 0.61901315]])] 3

### If y contains only a single column, only one model is generated
- here y is a single column with (0,1,2) as class labels
- so the prediction is an array with a single column holding 0, 1 or 2

In [30]:
z=np.zeros((100,1))
z[-25:]=2
z[25:50]=1
z

print('y is 2D . Each Column is a Category(Ex:Social Status, Education) of sub-classes')
print('single data point can belong ONLY 1 class')

clf = MultiOutputClassifier(LogisticRegression(max_iter=500)).fit(X, z)

m_pred_classes=clf.predict(X[-10:])
m_pred_probs=clf.predict_proba(X[-10:])

m_pred_classes.shape,m_pred_classes,m_pred_probs
#,m_pred_probs.shape

y is 2D . Each Column is a Category(Ex:Social Status, Education) of sub-classes
single data point can belong ONLY 1 class


((10, 1),
 array([[2.],
        [2.],
        [2.],
        [2.],
        [1.],
        [0.],
        [2.],
        [2.],
        [0.],
        [1.]]),
 [array([[0.29735607, 0.01669833, 0.6859456 ],
         [0.35424857, 0.03472184, 0.61102959],
         [0.28626994, 0.08585234, 0.62787772],
         [0.38757585, 0.01791295, 0.59451121],
         [0.3591262 , 0.5287604 , 0.11211341],
         [0.54803632, 0.08543395, 0.36652973],
         [0.13241586, 0.01890283, 0.84868131],
         [0.34492806, 0.19627413, 0.45879782],
         [0.51556317, 0.08515241, 0.39928442],
         [0.4569614 , 0.46048421, 0.08255439]])])

In [31]:
z=np.zeros((100,2))
z[-25:,:]=2
z[25:50,:]=1
z

clf = MultiOutputClassifier(LogisticRegression(max_iter=500)).fit(X, z)

m_pred_classes=clf.predict(X[-10:])
m_pred_probs=clf.predict_proba(X[-10:])

m_pred_classes.shape,m_pred_classes,m_pred_probs
#,m_pred_probs.shape

((10, 2),
 array([[2., 2.],
        [2., 2.],
        [2., 2.],
        [2., 2.],
        [1., 1.],
        [0., 0.],
        [2., 2.],
        [2., 2.],
        [0., 0.],
        [1., 1.]]),
 [array([[0.29735607, 0.01669833, 0.6859456 ],
         [0.35424857, 0.03472184, 0.61102959],
         [0.28626994, 0.08585234, 0.62787772],
         [0.38757585, 0.01791295, 0.59451121],
         [0.3591262 , 0.5287604 , 0.11211341],
         [0.54803632, 0.08543395, 0.36652973],
         [0.13241586, 0.01890283, 0.84868131],
         [0.34492806, 0.19627413, 0.45879782],
         [0.51556317, 0.08515241, 0.39928442],
         [0.4569614 , 0.46048421, 0.08255439]]),
  array([[0.29735607, 0.01669833, 0.6859456 ],
         [0.35424857, 0.03472184, 0.61102959],
         [0.28626994, 0.08585234, 0.62787772],
         [0.38757585, 0.01791295, 0.59451121],
         [0.3591262 , 0.5287604 , 0.11211341],
         [0.54803632, 0.08543395, 0.36652973],
         [0.13241586, 0.01890283, 0.84868131],
       

In [32]:
z=np.zeros((100,))
z[-25:]=2
z[25:50]=1
print(z.shape)
print ('y cannot be 1D')

clf = MultiOutputClassifier(LogisticRegression(max_iter=500)).fit(X, z)

m_pred_classes=clf.predict(X[-10:])
m_pred_probs=clf.predict_proba(X[-10:])

m_pred_classes.shape,m_pred_classes,m_pred_probs
#,m_pred_probs.shape

(100,)
y cannot be 1D


ValueError: y must have at least two dimensions for multi-output regression but has only one.