# **Multinomial regression on Iris dataset**

**By Philip Blumin and Paul Cucchiara**

# **Dataset info** 

R.A. Fisher's Iris dataset was obtained from the University of California, Irvine (UCI) Machine Learning Repository.

Attributes:
1. **SL:** sepal length in cm
2. **SW:** sepal width in cm
3. **PL:** petal length in cm
4. **PW:** petal width in cm

Classes:

-- Iris Setosa - 0

-- Iris Versicolour - 1

-- Iris Virginica - 2

*Numbers indicate the position of each class in the one-hot encoded label vector.



# **Stretch Goal #2 (Multiclass)**

In [None]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
from numpy import random
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score
from sklearn.preprocessing import StandardScaler



In [None]:
columns = ['sepal length','sepal width','petal length', 'petal width','Classification']
dsiris = pd.read_csv('iris.data', header=None, names=columns)

x = dsiris.iloc[:, :-1].values
Y = dsiris.iloc[:, -1].values


xtrain, xtest, ytrain, ytest = train_test_split(x, Y, test_size = 0.2, random_state = 1)

sc = StandardScaler()
xtrain = sc.fit_transform(xtrain)
xtest = sc.transform(xtest)

# Prepend a column of ones so the first weight of each class acts as the bias
xtrain = np.insert(xtrain, 0, 1.0, axis=1)
xtest = np.insert(xtest, 0, 1.0, axis=1)

# Initial weight matrix: one row per input (bias + 4 features), one column per class
weights = [[0.1, 0.1,0.1],[0.1, 0.2,0.3],
           [0.1, 0.2,0.3],[0.1, 0.2,0.3],
           [0.1, 0.2,0.3]]
weights = np.array(weights)

**Above, the data is split into training and test sets, the features are standardized, and an extra column of ones is added to each feature matrix to account for the bias. The weight matrix has dimensions (number of features + 1) by (number of classes), here 5 by 3.**
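As a minimal sketch of how the shapes line up, the snippet below builds a small stand-in feature matrix, prepends the bias column, and multiplies it by a weight matrix. The names `features`, `with_bias`, `demo_weights`, and `scores` and the numbers are made up for illustration and are not part of the notebook.

```python
import numpy as np

# Hypothetical stand-in for standardized features (3 samples, 4 features).
features = np.array([[0.3, -0.1, 0.4, 0.2],
                     [2.2, -0.1, 1.3, 1.4],
                     [-0.3, -1.2, 0.1, -0.2]])

# Prepend a column of ones so the first weight in each column acts as the bias.
with_bias = np.hstack([np.ones((features.shape[0], 1)), features])

n_classes = 3
# One row per input (bias + 4 features), one column per class.
demo_weights = np.full((with_bias.shape[1], n_classes), 0.1)

# Each row of scores holds one raw score per class for one sample.
scores = with_bias @ demo_weights
print(with_bias.shape, demo_weights.shape, scores.shape)
```

Because the bias is folded into the feature matrix, a single matrix product yields the raw class scores for every sample.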

In [None]:
columns = ['Bias', 'SL','SW','PL','PW']
xtrain = pd.DataFrame(xtrain, columns= columns)
xtrain.head()

|   | Bias | SL | SW | PL | PW |
|---|------|----|----|----|----|
| 0 | 1.0 | 0.315537 | -0.036122 | 0.447486 | 0.234531 |
| 1 | 1.0 | 2.244933 | -0.036122 | 1.29804 | 1.396429 |
| 2 | 1.0 | -0.2874 | -1.240184 | 0.050561 | -0.152768 |
| 3 | 1.0 | 0.677298 | -0.517747 | 1.014522 | 1.138229 |
| 4 | 1.0 | -0.046225 | -0.517747 | 0.731004 | 1.525529 |


In [None]:
# One-hot encode the class labels: setosa -> [1,0,0], versicolor -> [0,1,0], virginica -> [0,0,1]

iris = []
for i in range(0, len(ytrain)):
  if ytrain[i] == 'Iris-setosa':
    h = [1,0,0]
  elif ytrain[i] == 'Iris-versicolor':
    h = [0,1,0]
  else:
    h = [0,0,1]
  iris.append(h)
ytrain = iris
ytrain = np.array(ytrain)


iris1 = []
for i in range(0, len(ytest)):
  if ytest[i] == 'Iris-setosa':
    h = [1,0,0]
  elif ytest[i] == 'Iris-versicolor':
    h = [0,1,0]
  else:
    h = [0,0,1]
  iris1.append(h)
ytest = iris1
ytest = np.array(ytest)



**Above, the class we are trying to predict is one-hot encoded: the first position stands for 'Iris-setosa', the second for 'Iris-versicolor', and the third for 'Iris-virginica'.**
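The same encoding can be written without an explicit loop over `if`/`elif` branches. The sketch below is one possible way to do it; the helper `one_hot` and the list `class_names` are hypothetical names, not part of the notebook. Unlike the loops above, it raises a `KeyError` for an unknown label instead of treating it as 'Iris-virginica'.

```python
import numpy as np

# Class order matches the encoding described above.
class_names = ['Iris-setosa', 'Iris-versicolor', 'Iris-virginica']

def one_hot(labels, classes=class_names):
    """Return a (n_samples, n_classes) 0/1 matrix for string labels."""
    index = {name: k for k, name in enumerate(classes)}
    ids = np.array([index[label] for label in labels])
    # Row k of the identity matrix is the one-hot vector for class k.
    return np.eye(len(classes), dtype=int)[ids]

demo_labels = ['Iris-virginica', 'Iris-setosa', 'Iris-versicolor']
encoded = one_hot(demo_labels)
print(encoded)
```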

In [None]:
a = 0.1  # learning rate
# Softmax: probability that sample x belongs to class j under weights w
def softmax(x, w, j):
  scores = np.dot(w.T, x)            # one raw score per class
  scores = scores - np.max(scores)   # shift for numerical stability; the result is unchanged
  exp_scores = np.exp(scores)
  return exp_scores[j] / np.sum(exp_scores)


**The softmax function above takes the current weight matrix and a single sample from the training matrix. It returns the probability that the sample belongs to class j.**
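Since all three class probabilities share the same denominator, they can also be computed in one call. The sketch below is a hypothetical variant, `softmax_all`, that returns the whole probability vector. It subtracts the largest score before exponentiating, a common trick to avoid overflow that leaves the probabilities unchanged. The values in `demo_w` and `demo_x` are made up for illustration.

```python
import numpy as np

def softmax_all(x, w):
    """Return the probabilities of all classes for one sample x (bias included)."""
    scores = w.T @ x                 # one raw score per class
    scores = scores - scores.max()   # constant shift; does not change the result
    exp_scores = np.exp(scores)
    return exp_scores / exp_scores.sum()

demo_w = np.array([[0.1, 0.1, 0.1],
                   [0.1, 0.2, 0.3],
                   [0.1, 0.2, 0.3],
                   [0.1, 0.2, 0.3],
                   [0.1, 0.2, 0.3]])
demo_x = np.array([1.0, 0.5, -0.2, 0.3, 0.1])

probs = softmax_all(demo_x, demo_w)
print(probs, probs.sum())
```

The entries of `probs` always sum to 1, and `probs[j]` corresponds to the probability for class j.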

**The multinomial classification model is trained below by passing over every sample in the training data 1000 times (1000 epochs). Once trained, the model predicts the classes of the xtest samples, and the predictions are compared with the actual classes.**
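For comparison, the same kind of model can be trained with vectorized, full-batch gradient descent. The sketch below is an assumption-laden illustration, not the notebook's method: it updates the weights once per pass using the average gradient over all samples rather than once per sample, it starts from zero weights, and the names `softmax_rows`, `train_softmax`, `X_demo`, and `T_demo` as well as the toy data are made up.

```python
import numpy as np

def softmax_rows(scores):
    """Row-wise softmax for a (n_samples, n_classes) score matrix."""
    shifted = scores - scores.max(axis=1, keepdims=True)
    exp_scores = np.exp(shifted)
    return exp_scores / exp_scores.sum(axis=1, keepdims=True)

def train_softmax(X, T, lr=0.1, epochs=1000):
    """Full-batch gradient descent on the cross-entropy loss.

    X: (n_samples, n_inputs) with a leading bias column.
    T: (n_samples, n_classes) one-hot targets.
    """
    W = np.zeros((X.shape[1], T.shape[1]))
    for _ in range(epochs):
        P = softmax_rows(X @ W)
        grad = X.T @ (P - T) / X.shape[0]   # average gradient over all samples
        W -= lr * grad
    return W

# Tiny, made-up, well-separated data set (bias column first), for illustration only.
X_demo = np.array([[1.0, 2.0, 0.1], [1.0, 2.2, -0.1],
                   [1.0, -1.0, 1.8], [1.0, -1.1, 1.6],
                   [1.0, -1.0, -1.7], [1.0, -1.1, -1.8]])
T_demo = np.eye(3)[[0, 0, 1, 1, 2, 2]]

W_demo = train_softmax(X_demo, T_demo)
pred = np.argmax(softmax_rows(X_demo @ W_demo), axis=1)
print(pred)
```

The gradient `X.T @ (P - T)` collects the per-sample terms `x * (o - t)` for every class at once, so no inner loop over classes is needed.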

In [None]:
a = 0.1  # learning rate
xtrain = np.array(xtrain)
n_train = len(xtrain)

#--------------Training Model using train data-------------#
for k in range(0, 1000):        # 1000 passes over the training data
  for i in range(0, n_train):
    # Class probabilities for sample i under the current weights
    prob = [softmax(xtrain[i], weights, j) for j in range(0, 3)]
    for j in range(0, 3):
      t = ytrain[i][j]
      o = prob[j]
      # Gradient of the cross-entropy loss with respect to the weights of class j
      g = -1/n_train * xtrain[i] * (t - o)
      weights[:,j] = weights[:,j] - (a * g)
#print(weights)

yguess = []
predguess = []
probT = []

#------Determining effectiveness of model on test data-----#

for i in range (0,len(xtest)):
  for j in range(0,3):
    check = softmax(xtest[i],weights,j)
    probT.append(check)

  predguess.append(np.argmax(probT))
  yguess.append(np.argmax(ytest[i]))
 
  probT = []
score = accuracy_score(yguess, predguess)
yguess = np.array(yguess)
predguess = np.array(predguess)
#print(np.concatenate((predguess.reshape(len(predguess),1), yguess.reshape(len(yguess),1)),1))
print("Accuracy Score: ", score)

# Baseline: always predict class 0 (Iris Setosa)
baseline = np.sum([0 == np.argmax(ytest[i]) for i in range(0, len(ytest))])
ascore_baseline = baseline/len(ytest)
print("Accuracy Score of baseline: ", ascore_baseline)

Accuracy Score:  0.9666666666666667
Accuracy Score of baseline:  0.36666666666666664


**With a test accuracy of about 96.7%, the model misclassifies only one of the 30 test samples. This is far better than the baseline of always predicting Iris Setosa, which is correct for only about 36.7% of the test samples.**
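A confusion matrix shows which classes get mixed up, which a single accuracy number cannot. The sketch below builds one with NumPy on made-up labels; `true_ids` and `pred_ids` are hypothetical and are not the notebook's actual labels or predictions.

```python
import numpy as np

# Made-up labels for illustration only; these are NOT the notebook's predictions.
true_ids = np.array([0, 0, 1, 1, 1, 2, 2, 2])
pred_ids = np.array([0, 0, 1, 2, 1, 2, 2, 2])

n_classes = 3
confusion = np.zeros((n_classes, n_classes), dtype=int)
# Rows are true classes, columns are predicted classes.
np.add.at(confusion, (true_ids, pred_ids), 1)

# Correct predictions lie on the diagonal.
accuracy = np.trace(confusion) / confusion.sum()
# Baseline that always predicts class 0 (Iris Setosa).
baseline_accuracy = np.mean(true_ids == 0)
print(confusion)
print(accuracy, baseline_accuracy)
```

Applied to the notebook's `yguess` and `predguess` arrays, the off-diagonal entry would identify which pair of classes the single misclassified sample falls between.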