# E-commerce Project - Logistic Regression
We are now going to train **logistic regression** with softmax. My other walkthroughs on logistic regression only covered binary classification, so they used the **sigmoid** function rather than softmax. This gives us the chance to see how logistic regression performs compared to a neural network.
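
For reference, here is the standard softmax definition and how it reduces to the sigmoid when there are only two classes. The symbols are chosen to match the code in this walkthrough: $a$ is the vector of activations for one sample and $K$ is the number of classes.

$$
\text{softmax}(a)_k = \frac{e^{a_k}}{\sum_{j=1}^{K} e^{a_j}}, \qquad k = 1, \dots, K
$$

$$
K = 2: \quad \text{softmax}(a)_1 = \frac{e^{a_1}}{e^{a_1} + e^{a_2}} = \frac{1}{1 + e^{-(a_1 - a_2)}} = \sigma(a_1 - a_2)
$$

So binary logistic regression with a sigmoid is the two-class special case of the softmax model.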

We can start with our imports.

In [33]:
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
import os

from sklearn.utils import shuffle

And let's define our `get_data` function:

In [38]:
def get_data():
  df = pd.read_csv('data/ecommerce_data.csv')
  data = df.to_numpy()  # as_matrix() was removed in pandas 1.0
  np.random.shuffle(data)
  X = data[:,:-1]
  Y = data[:,-1].astype(np.int32)

  # one-hot encode the categorical data
  N, D = X.shape
  X2 = np.zeros((N, D+3))
  X2[:,0:(D-1)] = X[:,0:(D-1)] # non-categorical

  # one-hot
  for n in range(N):
    t = int(X[n,D-1])
    X2[n,t+D-1] = 1
  X = X2

  # split train and test
  Xtrain = X[:-100]
  Ytrain = Y[:-100]
  Xtest = X[-100:]
  Ytest = Y[-100:]

  # normalize columns 1 and 2
  for i in (1, 2):
    m = Xtrain[:,i].mean()
    s = Xtrain[:,i].std()
    Xtrain[:,i] = (Xtrain[:,i] - m) / s
    Xtest[:,i] = (Xtest[:,i] - m) / s

  return Xtrain, Ytrain, Xtest, Ytest
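
To make the one-hot step inside `get_data` easier to follow, here is a minimal sketch on a toy array. The values are made up for illustration, and it assumes (as the `D+3` in the function implies) that the last column holds a category id in {0, 1, 2, 3}. It uses NumPy fancy indexing in place of the loop, which should give the same result.

```python
import numpy as np

# Toy stand-in for the feature matrix (hypothetical values):
# the last column holds a category id in {0, 1, 2, 3}.
X = np.array([
    [1.0, 0.5, 2.0],
    [0.0, 1.5, 0.0],
    [1.0, 2.5, 3.0],
])
N, D = X.shape

# One categorical column becomes four indicator columns, so D-1+4 = D+3 in total.
X2 = np.zeros((N, D + 3))
X2[:, :D - 1] = X[:, :D - 1]          # copy the non-categorical columns

cats = X[:, D - 1].astype(int)        # category id for each row
X2[np.arange(N), cats + D - 1] = 1    # set the matching indicator column

print(X2)
```

Each row ends up with exactly one 1 among its last four columns, at the position given by its category id.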

We are going to need a function to get the indicator matrix from the targets.

In [39]:
def y2indicator(y, K):
    N = len(y)
    ind = np.zeros((N,K))
    for i in range(N):
        ind[i, y[i]] = 1 
    return ind
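
As an aside, the same indicator matrix can be built without a Python loop. This is only a sketch of an alternative; `y2indicator_vectorized` is a hypothetical name, not something used elsewhere in this walkthrough.

```python
import numpy as np

def y2indicator_vectorized(y, K):
    # Hypothetical loop-free variant of y2indicator.
    y = np.asarray(y, dtype=int)
    ind = np.zeros((len(y), K))
    # Row i gets a 1 in column y[i].
    ind[np.arange(len(y)), y] = 1
    return ind

# Small made-up example: four samples, four classes.
example = y2indicator_vectorized([0, 2, 1, 3], 4)
print(example)
```

For a few hundred samples the loop version is perfectly fine; the indexed version mainly matters on larger datasets.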

Now we can get our data.

In [41]:
Xtrain, Ytrain, Xtest, Ytest = get_data()
D = Xtrain.shape[1]
K = len(set(Ytrain) | set(Ytest))

And convert our `Y` data into an indicator matrix. 

In [49]:
Ytrain_ind = y2indicator(Ytrain, K)
Ytest_ind = y2indicator(Ytest, K)

`Ytrain_ind` has shape (400, 4) because Y is made up of four classes.

In [47]:
Ytrain_ind.shape

(400, 4)

Now we randomly initialize our weights and set the biases to zero.

In [50]:
W = np.random.randn(D, K)
b = np.zeros(K)

And we can define our `softmax` function:

In [None]:
def softmax(a):
    