# One-vs-All example

### Multiclass classification

**This One-vs-All object uses n logistic regressors, one per class.**
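To make the idea concrete, here is a minimal NumPy sketch of the one-vs-all scheme: one binary logistic regressor is trained per class ("this class" against "everything else"), and prediction picks the class whose regressor gives the highest score. The names `fit_binary`, `fit_ovr` and `predict_ovr` are hypothetical and are not part of the `one_vs_rest` module; its internals may well differ.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def add_bias(X):
    # Prepend a column of ones so the first weight acts as the intercept.
    return np.hstack([np.ones((X.shape[0], 1)), X])

def fit_binary(X, t, epochs=100, learning_rate=0.1):
    """Batch gradient descent for one binary logistic regressor (t in {0, 1})."""
    Xb = add_bias(X)
    w = np.zeros(Xb.shape[1])
    for _ in range(epochs):
        grad = Xb.T @ (sigmoid(Xb @ w) - t) / len(t)  # mean log-loss gradient
        w -= learning_rate * grad
    return w

def fit_ovr(X, y, **kwargs):
    """Train one regressor per class; returns the class labels and a weight matrix."""
    classes = np.unique(y)
    W = np.stack([fit_binary(X, (y == c).astype(float), **kwargs) for c in classes])
    return classes, W

def predict_ovr(X, classes, W):
    """Pick, for each row, the class whose regressor is most confident."""
    scores = sigmoid(add_bias(X) @ W.T)  # shape (n_samples, n_classes)
    return classes[np.argmax(scores, axis=1)]
```

With five classes, as in the data below, this trains five weight vectors, and `predict_ovr` compares five probabilities per sample.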

In [1]:
import numpy as np
import pandas as pd

import one_vs_rest
from sklearn.metrics import confusion_matrix

### Generate training set

**Note: this set is not entirely linearly separable, but it works well enough for the example.**

In [2]:
size = 1000
thr = 0.5

x = np.random.normal(size=(size,4))
y = []
for i in x:
    if (i[0]>thr):
        y.append(0)
    elif (i[1]>thr):
        y.append(1)
    elif (i[2]>thr):
        y.append(2)
    elif (i[3]>thr):
        y.append(3)
    else:
        y.append(4)    
    
y = np.asarray(y)

# convert to pandas DataFrame
df = pd.DataFrame(np.hstack((x,y[:,None])))
df.columns = ['x1','x2','x3','x4','y']
 
# check balance
print(df['y'].value_counts())
print(df['y'].value_counts().std())
df.head()

0.0    331
4.0    218
1.0    199
2.0    144
3.0    108
Name: y, dtype: int64
85.27309071448038


| | x1 | x2 | x3 | x4 | y |
|---|---|---|---|---|---|
| 0 | 1.761553 | -0.311067 | 0.502025 | 0.109265 | 0.0 |
| 1 | -0.407302 | 0.42391 | 0.616338 | 0.114228 | 2.0 |
| 2 | 0.942139 | 1.128345 | -1.555455 | -1.953168 | 0.0 |
| 3 | 0.151805 | -0.793728 | -1.253582 | -0.817049 | 4.0 |
| 4 | -0.730682 | -0.209059 | 0.788876 | 0.11073 | 2.0 |


### Test set

In [3]:
tx = np.random.normal(size=(size,4))
ty = []
for i in tx:
    if (i[0]>thr):
        ty.append(0)
    elif (i[1]>thr):
        ty.append(1)
    elif (i[2]>thr):
        ty.append(2)
    elif (i[3]>thr):
        ty.append(3)
    else:
        ty.append(4)    

ty = np.asarray(ty)
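The training and test cells apply the same labelling rule: the label is the index of the first feature above `thr`, or 4 if none is. As a sketch, that rule can be written once in vectorised form. The helper `make_dataset` and its `seed` argument are hypothetical additions; the cells above call `np.random.normal` without a seed, so their counts change from run to run.

```python
import numpy as np

def make_dataset(size=1000, thr=0.5, seed=None):
    """Draw `size` samples of 4 standard-normal features and label them."""
    rng = np.random.default_rng(seed)
    x = rng.normal(size=(size, 4))
    above = x > thr  # boolean array, shape (size, 4)
    # argmax on booleans returns the first True column;
    # rows with no feature above thr fall through to label 4.
    y = np.where(above.any(axis=1), above.argmax(axis=1), 4)
    return x, y

x_demo, y_demo = make_dataset(size=1000, thr=0.5, seed=0)
print(np.bincount(y_demo, minlength=5))  # class counts for labels 0..4
```

Because the features are checked in order, the classes are unbalanced by construction: with `thr = 0.5` each feature exceeds the threshold with probability of about 0.31, so the expected shares are roughly 31%, 21%, 15%, 10% and 23% for labels 0 to 4, which is in line with the counts printed for the training set.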

### Classification

In [4]:
ovr = one_vs_rest.OvR()

In [5]:
ovr.fit(x,y,epochs=100)

**Prediction**

In [6]:
pred = ovr.predict(tx)

In [7]:
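# Arguments are passed as (pred, ty), so rows are predicted labels and columns are true labels.
# sklearn's own convention is confusion_matrix(y_true, y_pred), which would give the transpose.
confusion_matrix(pred,ty)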

array([[313,  24,  27,  25,  45],
       [  2, 173,   9,   7,  15],
       [  1,   2,  81,   4,   1],
       [  3,   3,   4,  51,   0],
       [  1,  10,  16,  19, 164]])

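**Performance is decent (782 of 1000 test samples lie on the diagonal) considering the quality of the training set.**
<br><br>
Note: y=3 is the least linearly separable class and consequently performs the worst: only 51 of its 106 test samples (the y=3 column) are classified correctly.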
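The per-class figures can be read directly off the matrix printed above. The sketch below copies that matrix and computes accuracy, recall and precision, keeping in mind that the call was `confusion_matrix(pred, ty)`, so rows are predictions and columns are true labels.

```python
import numpy as np

# Matrix copied from the output above: rows = predicted label, columns = true label.
cm = np.array([[313,  24,  27,  25,  45],
               [  2, 173,   9,   7,  15],
               [  1,   2,  81,   4,   1],
               [  3,   3,   4,  51,   0],
               [  1,  10,  16,  19, 164]])

correct = np.diag(cm)           # correctly classified samples per class
true_counts = cm.sum(axis=0)    # column sums: how many samples truly belong to each class
pred_counts = cm.sum(axis=1)    # row sums: how many samples were assigned to each class

accuracy = correct.sum() / cm.sum()   # → 0.782
recall = correct / true_counts        # share of each true class that was found
precision = correct / pred_counts     # share of each predicted class that was right

print("accuracy :", accuracy)
print("recall   :", np.round(recall, 3))
print("precision:", np.round(precision, 3))
print("worst recall: class", recall.argmin())
```

Recall for y=3 comes out at about 0.48, the lowest of the five classes, while class 0 absorbs most of the misclassified samples and therefore has the lowest precision (about 0.72).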

**Input parameters are similar to those of a single logistic regressor:**

In [8]:
ovr.fit(x,y,starting_coeff=True,method='stochastic',bin_size=1,epochs=100,learning_rate=0.00001)
ovr.fit(x,y,starting_coeff=True,method='batch', epochs=100, learning_rate=0.00001)
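The notebook does not show what the two `method` values do internally. As an illustration only, the sketch below shows the usual distinction for a single binary logistic regressor: `'batch'` makes one update per epoch from the whole training set, while `'stochastic'` makes one update per small group of samples. Treating `bin_size` as the number of samples per update is an assumption, and the function `train` is hypothetical rather than the library's API.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def gradient(w, X, t):
    # Mean log-loss gradient over the rows of X.
    return X.T @ (sigmoid(X @ w) - t) / len(t)

def train(X, t, method="batch", bin_size=1, epochs=100, learning_rate=0.1, seed=0):
    """Fit weights w for targets t in {0, 1}; X is expected to include a bias column."""
    rng = np.random.default_rng(seed)
    w = np.zeros(X.shape[1])
    for _ in range(epochs):
        if method == "batch":
            # One update per epoch, using every sample.
            w -= learning_rate * gradient(w, X, t)
        elif method == "stochastic":
            # Many updates per epoch, each from `bin_size` shuffled samples (assumed meaning).
            order = rng.permutation(len(t))
            for start in range(0, len(t), bin_size):
                idx = order[start:start + bin_size]
                w -= learning_rate * gradient(w, X[idx], t[idx])
        else:
            raise ValueError(f"unknown method: {method!r}")
    return w
```

Under this reading, the stochastic variant performs `len(t) / bin_size` updates per epoch instead of one, so with the same `learning_rate` and `epochs` it moves the coefficients much further than the batch variant.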

**Also works with a pandas DataFrame or Series:**

In [9]:
ovr.fit(df.drop('y',axis=1),df['y'],epochs=100)
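How `OvR.fit` handles pandas input is not shown here. One common approach, sketched below with a hypothetical helper `as_arrays`, is to convert both arguments to NumPy arrays up front so that the rest of the code only deals with arrays. The sketch also casts the labels to integers, since the `y` column of `df` ended up as floats (`0.0` to `4.0`) after `np.hstack`.

```python
import numpy as np
import pandas as pd

def as_arrays(X, y):
    """Accept DataFrame/Series or array-like input and return plain NumPy arrays."""
    X_arr = np.asarray(X, dtype=float)    # DataFrame -> 2-D float array
    y_arr = np.asarray(y).ravel()         # Series -> 1-D array
    if np.issubdtype(y_arr.dtype, np.floating):
        y_arr = y_arr.astype(int)         # 0.0, 1.0, ... -> 0, 1, ...
    return X_arr, y_arr

# Small frame with the same column layout as `df` above.
demo = pd.DataFrame({"x1": [0.1, 0.9], "x2": [0.7, -0.2],
                     "x3": [0.0, 0.3], "x4": [1.2, -1.0],
                     "y": [1.0, 0.0]})
X_arr, y_arr = as_arrays(demo.drop("y", axis=1), demo["y"])
print(X_arr.shape, y_arr)
```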