My goal here is to implement an ANN using PyTorch. There are a lot of tutorials that use MNIST, and I had a lot of guidance from a tutorial created by the YouTube channel "Sentdex." In particular, I used the text tutorial given here
https://pythonprogramming.net/data-deep-learning-neural-network-pytorch/ to start me off.

Once I had an idea of how to use torch, I tried it on my own with a toy dataset (digits), first without any batching to check that I understood everything, and then with a batching procedure to speed things up. I would also like to try a parallel implementation of the batching, but that is for later.
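
As a pointer for that later step: PyTorch ships `TensorDataset` and `DataLoader` in `torch.utils.data`, which handle shuffling and mini-batching and can load batches in worker processes through the `num_workers` argument. The following is only a minimal sketch on random stand-in data (`X_demo` and `y_demo` are made-up names that mimic the shape of the digits training set); whether extra workers actually help for data that is already in memory is something I would have to measure.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# Stand-in data with the same shape as the scaled digits training set
# (1347 samples, 64 features, 10 classes). Random values, for illustration only.
torch.manual_seed(0)
X_demo = torch.randn(1347, 64)
y_demo = torch.randint(0, 10, (1347,))

dataset = TensorDataset(X_demo, y_demo)
# num_workers=0 loads batches in the main process; a positive value
# would spread the loading over that many worker processes.
loader = DataLoader(dataset, batch_size=50, shuffle=True, num_workers=0)

# Iterating the loader yields (features, labels) pairs, one per mini-batch.
batch_sizes = [len(yb) for Xb, yb in loader]
print(len(batch_sizes), batch_sizes[-1])  # 27 batches, the last one holds 47 samples
```

With `shuffle=True` the samples are reordered every epoch, which the hand-written slicing later in this notebook does not do.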

In [None]:
import numpy as np
import torch
import torch.nn as nn
import torch.nn.functional as F
import torch.optim as optim

from sklearn import datasets
from sklearn.preprocessing import StandardScaler
from sklearn.model_selection import train_test_split
from sklearn import preprocessing

In [None]:
# train = datasets.MNIST('', train=True, download=True,
#                        transform=transforms.Compose([
#                            transforms.ToTensor()
#                        ]))

# test = datasets.MNIST('', train=False, download=True,
#                        transform=transforms.Compose([
#                            transforms.ToTensor()
#                        ]))
# trainset = torch.utils.data.DataLoader(train, shuffle=True)
# testset = torch.utils.data.DataLoader(test, shuffle=False)

In [None]:
X, y = datasets.load_digits(return_X_y = True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, random_state=42)
print(X_train.shape)
scaler = StandardScaler()
scaler.fit(X_train)   # compute per-feature mean and standard deviation on the training set only
X_train_norm = scaler.transform(X_train)  # standardize X_train
X_test_norm = scaler.transform(X_test)    # standardize X_test with the training statistics

(1347, 64)


In [None]:
X_train_tensor = torch.from_numpy(X_train_norm)
X_test_tensor = torch.from_numpy(X_test_norm)
y_train_tensor = torch.from_numpy(y_train)
y_test_tensor = torch.from_numpy(y_test)
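
A note on dtypes: `torch.from_numpy` keeps the NumPy dtype, so the scaled features arrive as `float64`, while the weights of `nn.Linear` default to `float32`. That mismatch is why the later cells call `.float()` on every forward pass. A minimal sketch of converting once up front instead (`X_np` and `y_np` are small stand-in arrays, not the digits data):

```python
import numpy as np
import torch

X_np = np.zeros((4, 64))        # NumPy floating-point arrays default to float64
y_np = np.array([0, 1, 2, 3])   # integer class labels

X_t = torch.from_numpy(X_np)
print(X_t.dtype)                              # torch.float64
print(torch.nn.Linear(64, 32).weight.dtype)   # torch.float32

X_f = X_t.float()                     # one-off conversion to float32
y_l = torch.from_numpy(y_np).long()   # class indices as int64, as nll_loss expects
```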

In [None]:
def accuracy(ypred, yexact):
    p = np.array(ypred == yexact, dtype=int)
    return np.sum(p) / float(len(yexact))

In [None]:
class ANN(nn.Module):
    def __init__(self):
        super(ANN, self).__init__()
        self.Linear1 = nn.Linear(64, 32)
        self.Linear2 = nn.Linear(32, 32)
        self.Linear3 = nn.Linear(32, 10)

    def forward(self, x):
        #print(x.shape)
        x = F.relu(self.Linear1(x))
        x = F.relu(self.Linear2(x))
        x = self.Linear3(x)
        return F.log_softmax(x, dim = 1)

In [None]:
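model = ANN()
# Not used below: the loop applies F.nll_loss to the model's log_softmax output instead.
loss_func = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(model.parameters())

# Full-batch training: every epoch is one gradient step on the whole training set.
for epoch in range(1000):
    model.zero_grad()                          # clear gradients from the previous step
    output = model(X_train_tensor.float())     # forward pass (float64 -> float32)
    loss = F.nll_loss(output, y_train_tensor)  # negative log-likelihood of the true labels
    loss.backward()                            # backpropagate
    optimizer.step()                           # Adam update
print(torch.argmax(output, dim=1))  # predicted training labels from the last forward pass
print(y_train)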

tensor([1, 1, 8,  ..., 2, 7, 1])
[1 1 8 ... 2 7 1]
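
The loop above trains with `F.nll_loss` on log-probabilities rather than with the `nn.CrossEntropyLoss` instance stored in `loss_func`. The two routes compute the same loss: `nn.CrossEntropyLoss` applies log-softmax followed by negative log-likelihood to raw logits. A small check on random stand-in values (`logits` and `targets` are illustrative, not outputs of the model):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)
logits = torch.randn(5, 10)            # stand-in for the raw output of the last linear layer
targets = torch.randint(0, 10, (5,))   # stand-in class labels

# Route 1: explicit log_softmax followed by nll_loss (what this notebook does).
loss_a = F.nll_loss(F.log_softmax(logits, dim=1), targets)
# Route 2: CrossEntropyLoss applied directly to the raw logits.
loss_b = nn.CrossEntropyLoss()(logits, targets)

print(torch.allclose(loss_a, loss_b))  # True
```

So either `forward` returns log-probabilities and the loss is `nll_loss`, or `forward` returns raw logits and the loss is `CrossEntropyLoss`.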


In [None]:
with torch.no_grad():
    output = model(X_test_tensor.float())
    out_np = output.detach().numpy()
predict = np.argmax(out_np, axis = 1)
acc = accuracy(predict, y_test)
print(acc)

0.9747474747474747


Okay, it seems like I have a grasp of how to handle everything. Next, I'll write some code to train in batches.

In [None]:
model = ANN()
# Not used below: the loop applies F.nll_loss to the model's log_softmax output instead.
loss_func = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(model.parameters())

max_epochs = 50
batch_size = 50
num_datapoints = X_train.shape[0]

for epoch in range(max_epochs):
    # Step through the training set in chunks of batch_size. Slicing past the
    # end is safe, so the last batch simply holds the remaining samples.
    for start in range(0, num_datapoints, batch_size):
        X_batch = X_train_tensor[start:start + batch_size, :]
        y_batch = y_train_tensor[start:start + batch_size]
        model.zero_grad()                   # clear gradients from the previous batch
        output = model(X_batch.float())     # forward pass on this batch
        loss = F.nll_loss(output, y_batch)
        loss.backward()
        optimizer.step()

with torch.no_grad():
    output = model(X_test_tensor.float())
    out_np = output.detach().numpy()
predict = np.argmax(out_np, axis = 1)
print(predict)
print(y_test)
            
        

[6 9 3 7 2 2 5 2 5 2 1 8 4 0 4 2 3 7 8 8 4 3 9 7 5 6 3 5 6 3 4 9 1 4 4 6 9
 4 7 6 6 9 1 3 6 1 3 0 6 5 5 1 9 5 6 0 9 0 0 1 0 4 5 2 4 5 7 0 7 5 9 9 5 4
 7 0 4 5 5 9 9 0 2 3 8 0 6 4 4 9 1 2 8 3 5 2 9 0 4 4 4 3 5 3 1 3 5 9 4 2 7
 7 4 4 1 9 2 7 8 7 2 6 9 4 0 7 2 7 5 8 7 5 7 9 0 6 6 4 2 8 0 9 4 6 9 9 6 9
 0 5 5 6 6 0 6 4 2 9 3 8 7 2 9 0 4 5 3 6 5 9 9 8 4 2 1 3 7 7 2 2 3 9 8 0 3
 2 2 5 6 9 9 4 1 5 4 2 3 6 4 8 5 9 5 7 1 9 4 8 1 5 4 4 9 6 1 8 6 0 4 5 2 7
 4 6 4 5 6 0 3 2 3 6 7 1 5 1 4 7 6 5 8 5 5 1 0 2 8 8 9 9 7 6 2 2 2 3 4 8 8
 3 6 0 9 7 7 0 1 0 4 5 1 5 3 6 0 4 1 0 0 3 6 5 9 7 3 5 5 9 9 8 5 3 3 2 0 5
 8 3 4 0 2 4 6 4 3 4 5 0 5 2 1 3 1 4 1 1 7 0 1 5 2 1 2 8 7 0 6 4 8 8 5 1 8
 4 5 8 7 9 8 6 0 6 2 0 7 9 1 9 5 2 7 7 1 8 7 4 3 8 3 5 6 0 0 3 0 5 0 0 4 1
 2 8 4 5 9 6 3 1 8 8 4 2 3 8 9 8 8 5 0 6 3 3 7 1 6 4 1 2 1 1 6 4 7 4 8 3 4
 0 5 1 9 4 5 7 6 3 7 0 5 9 7 5 9 7 4 2 1 9 0 7 5 2 3 6 3 9 6 9 5 0 1 5 5 8
 3 3 6 2 6 5 6 2 0 8 7 3 7 0 2 2 3 5 8 7 3 6 5 9 9 2 9 6 3 0 7 1 1 9 6 1 1
 0 0 2 9 3 9 9 3 7 7 1 3 

Looks good, so I'm done for now. There is still room for improvement at the end, though: I should learn how to do the evaluation entirely in torch, without converting the output back to a NumPy array, which should be more efficient.
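
Evaluating without the round trip to NumPy could look roughly like this. It is a minimal sketch: `accuracy_torch` is a hypothetical helper name, and the scores and labels are a tiny hand-made example, not the model's output.

```python
import torch

def accuracy_torch(logits, targets):
    """Fraction of rows whose arg-max class equals the target label."""
    predictions = torch.argmax(logits, dim=1)
    return (predictions == targets).float().mean().item()

# Tiny hand-made example: 3 of the 4 rows are classified correctly.
scores = torch.tensor([[2.0, 0.1], [0.3, 1.5], [0.9, 0.2], [0.1, 0.4]])
labels = torch.tensor([0, 1, 1, 1])
print(accuracy_torch(scores, labels))  # 0.75
```

In this notebook the call would presumably be `accuracy_torch(output, y_test_tensor)` inside the `torch.no_grad()` block, since `output` already holds one row of log-probabilities per test sample.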