# Long Short-Term Memory Networks with PyTorch

### Model A: 1 Hidden Layer

* Unroll 28 time steps
    * Each step input size: 28 x 1
    * Total per unroll: 28 x 28
        * Feedforward Neural Network input size: 28 x 28
* 1 Hidden Layer
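
As a minimal sketch of what "unrolling 28 time steps" means here, the snippet below reshapes an image batch into a sequence of rows. The tensor `fake_images` is a random stand-in for an MNIST batch (an assumption for illustration, not real data).

```python
import torch

# Random stand-in for a batch of 100 MNIST images: (batch, channel, height, width)
fake_images = torch.rand(100, 1, 28, 28)

seq_dim = 28    # number of time steps (image rows)
input_dim = 28  # features per time step (pixels in one row)

# Each image becomes a sequence of 28 rows, each row a 28-dimensional input
sequences = fake_images.view(-1, seq_dim, input_dim)
print(sequences.size())  # torch.Size([100, 28, 28])

# Time step t of image n is simply row t of that image
first_step = sequences[0, 0]
assert torch.equal(first_step, fake_images[0, 0, 0])
```

The same `view(-1, seq_dim, input_dim)` call appears in the training loop, where it is applied to every batch before the forward pass.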

#### Steps: 

<li> Step 1: Load Dataset
<li> Step 2: Make Dataset Iterable
<li> Step 3: Create Model Class
<li> Step 4: Instantiate Model Class
<li> Step 5: Instantiate Loss Class
<li> Step 6: Instantiate Optimizer Class
<li> Step 7: Train Model

#### Step 1: Load Dataset


In [1]:
import torch
import torch.nn as nn
import torchvision.transforms as transforms
import torchvision.datasets as dsets
from torch.autograd import Variable

In [2]:
train_dataset = dsets.MNIST(root = './data',
                            train = True,
                            transform = transforms.ToTensor(),
                            download = True)

test_dataset = dsets.MNIST(root = './data',
                           train = False,
                           transform = transforms.ToTensor())

In [3]:
print(train_dataset.data.size())

torch.Size([60000, 28, 28])


In [4]:
print(train_dataset.targets.size())

torch.Size([60000])


In [5]:
print(test_dataset.data.size())

torch.Size([10000, 28, 28])


In [6]:
print(test_dataset.targets.size())

torch.Size([10000])


#### Step 2: Make Dataset Iterable

In [7]:
batch_size = 100
n_iters = 3000

num_epochs = n_iters / (len(train_dataset) / batch_size)
num_epochs = int(num_epochs)

train_loader = torch.utils.data.DataLoader(dataset=train_dataset,
                                           batch_size=batch_size,
                                           shuffle=True)

test_loader = torch.utils.data.DataLoader(dataset=test_dataset,
                                        batch_size=batch_size,
                                        shuffle=False)
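
The epoch count above follows from simple arithmetic. The sketch below repeats it with plain numbers, assuming the 60,000-image training set shown in Step 1.

```python
# Values taken from the cell above; 60000 is the MNIST training-set size
train_size = 60000
batch_size = 100
n_iters = 3000

iters_per_epoch = train_size / batch_size    # 600.0 batches per pass over the data
num_epochs = int(n_iters / iters_per_epoch)  # 3000 / 600 = 5

print(iters_per_epoch, num_epochs)  # 600.0 5
```

With these settings the training loop runs 5 epochs of 600 iterations each, which is why the log later stops at iteration 3000.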

#### Step 3: Create Model Class

In [8]:
class LSTMModel(nn.Module):
    
    def __init__(self, input_dim, hidden_dim, layer_dim, output_dim):
        super(LSTMModel, self).__init__()
        
        # Hidden Dimensions
        self.hidden_dim = hidden_dim
        
        # Number of hidden layers
        self.layer_dim = layer_dim
        
        # Building your LSTM
        # batch_first = True causes input/output tensors to be of shape
        # (batch_dim, seq_dim, feature_dim)
        self.lstm = nn.LSTM(input_dim, hidden_dim, layer_dim, batch_first=True)
        
        # Readout layer
        self.fn = nn.Linear(hidden_dim, output_dim)

        
    def forward(self, x):
        # Initialize hidden state with zeroes
        # (layer_dim, batch_size, hidden_dim)
        h0 = Variable(torch.zeros(self.layer_dim, x.size(0), self.hidden_dim))
        
        # Initialize Cell State
        c0 = Variable(torch.zeros(self.layer_dim, x.size(0), self.hidden_dim)) 
        
        # Run the LSTM over all 28 time steps
        out, (hn, cn) = self.lstm(x, (h0, c0))
        
        # Index the hidden state of the last time step
        # out.size() ---> 100, 28, 100 (batch, time steps, hidden units)
        # out[:, -1, :] ---> 100, 100 (only the last time step's hidden state)
        out = self.fn(out[:, -1, :])
        # out.size() ---> 100, 10
        return out
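
The shape comments in `forward` can be checked directly. The following is a small sketch that runs a random batch (a stand-in for images, not real data) through a bare `nn.LSTM` with the same dimensions and confirms that the last time step of `out` is the final hidden state.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

batch_size, seq_dim, input_dim, hidden_dim, layer_dim = 100, 28, 28, 100, 1

lstm = nn.LSTM(input_dim, hidden_dim, layer_dim, batch_first=True)
x = torch.rand(batch_size, seq_dim, input_dim)  # random stand-in for an image batch

# When no initial state is passed, nn.LSTM defaults h0 and c0 to zeros
out, (hn, cn) = lstm(x)

print(out.size())  # torch.Size([100, 28, 100]): hidden state at every time step
print(hn.size())   # torch.Size([1, 100, 100]): final hidden state per layer

# The last time step of `out` matches the final hidden state of the top layer
last_step = out[:, -1, :]
print(torch.allclose(last_step, hn[-1]))
```

Because the defaults are zeros, passing explicit zero tensors for `h0` and `c0` as the model class does is equivalent; it just makes the initial state visible.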

#### Step 4: Instantiate Model Class

* 28 time steps
    * Each time step: input dimension = 28
* 1 Hidden Layer
* MNIST 0-9 digits ----> output dimension = 10

In [9]:
input_dim = 28
hidden_dim = 100
layer_dim = 1
output_dim = 10

model = LSTMModel(input_dim, hidden_dim, layer_dim, output_dim)

#### Step 5: Instantiate Loss Class

In [10]:
criterion = nn.CrossEntropyLoss()

#### Step 6: Instantiate Optimizer Class

In [11]:
learning_rate = 0.1
optimizer = torch.optim.SGD(model.parameters(), lr=learning_rate)

**Parameters In-Depth**



In [12]:
len(list(model.parameters()))

6

In [13]:
for i in range(len(list(model.parameters()))):
    print(list(model.parameters())[i].size())

torch.Size([400, 28])
torch.Size([400, 100])
torch.Size([400])
torch.Size([400])
torch.Size([10, 100])
torch.Size([10])


#### Parameters
- **Input** $\rightarrow$ **Gates**
    - $[400, 28] \rightarrow w_1, w_3, w_5, w_7$
    - $[400] \rightarrow b_1, b_3, b_5, b_7$
- **Hidden State** $\rightarrow$ **Gates**
    - $[400, 100] \rightarrow w_2, w_4, w_6, w_8$
    - $[400] \rightarrow b_2, b_4, b_6, b_8$
- **Hidden State** $\rightarrow$ **Output**
    - $[10, 100] \rightarrow w_9$
    - $[10] \rightarrow b_9$
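
To see where the 400 comes from, the sketch below lists the LSTM parameters by name and splits one of them into its four gate blocks. It assumes the same dimensions as Model A. According to the PyTorch documentation, the four blocks are stacked in the order input, forget, cell, output, which does not map one-to-one onto the $w_1 \dots w_8$ numbering used above.

```python
import torch
import torch.nn as nn

hidden_dim = 100
lstm = nn.LSTM(28, hidden_dim, 1, batch_first=True)

# nn.LSTM names its parameters weight_ih_l0, weight_hh_l0, bias_ih_l0, bias_hh_l0
shapes = {name: tuple(p.size()) for name, p in lstm.named_parameters()}
for name, shape in shapes.items():
    print(name, shape)

# Each [400, ...] tensor stacks the four gate blocks along dimension 0,
# so splitting it into 4 chunks gives one [100, ...] block per gate
gate_blocks = lstm.weight_ih_l0.chunk(4, dim=0)
print([tuple(block.size()) for block in gate_blocks])
```

Each gate therefore has a `[100, 28]` input weight, a `[100, 100]` hidden weight and two `[100]` biases, and 4 x 100 = 400 rows in total.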

#### Step 7: Train Model

Process

1. <b>Convert input/labels to variables</b>
    * LSTM input: (1, 28)
    * RNN input: (1, 28)
    * CNN input: (1, 28, 28)
    * Feedforward NN input: (1, 28*28)
2. Clear gradient buffers
3. Get output given inputs
4. Get loss
5. Get gradients w.r.t. parameters
6. Update parameters using gradients
    * parameters = parameters - learning_rate * parameter_gradients
7. Repeat
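
Steps 2 to 6 can be sketched on a toy problem. The tiny linear model and random data below are hypothetical stand-ins chosen so the update rule is easy to verify by hand; the calls themselves are the same ones used in the training loop.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Tiny stand-in model and random data (hypothetical, for illustration only)
model = nn.Linear(4, 3)
inputs = torch.rand(8, 4)
labels = torch.randint(0, 3, (8,))

criterion = nn.CrossEntropyLoss()
learning_rate = 0.1
optimizer = torch.optim.SGD(model.parameters(), lr=learning_rate)

optimizer.zero_grad()              # 2. clear gradient buffers
outputs = model(inputs)            # 3. get output given inputs
loss = criterion(outputs, labels)  # 4. get loss
loss.backward()                    # 5. get gradients w.r.t. parameters

# Manual version of step 6, computed before the optimizer changes the weights
expected = model.weight.detach() - learning_rate * model.weight.grad

optimizer.step()                   # 6. update parameters using gradients

print(torch.allclose(model.weight.detach(), expected))
```

The check holds because plain SGD (no momentum, no weight decay) applies exactly the update rule written in step 6.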

In [14]:
# Number of steps to unroll
seq_dim = 28
iter = 0

for epoch in range(num_epochs):
    for i, (images, labels) in enumerate(train_loader):
        
        images = Variable(images.view(-1, seq_dim, input_dim))
        labels = Variable(labels)
        
        optimizer.zero_grad()
        
        outputs = model(images)
        
        loss = criterion(outputs, labels)
        
        loss.backward()
        
        optimizer.step()
        
        iter += 1
        
        if iter % 500 == 0:
            correct = 0
            total = 0
            
            for images, labels in test_loader:
                
                images = Variable(images.view(-1, seq_dim, input_dim))
                outputs = model(images)
                
                _, predicted = torch.max(outputs.data, 1)
                
                total += labels.size(0)
                
                correct += (predicted == labels).sum()
                
            accuracy = 100 * correct / total
            
            print('Iterations: {}, Loss: {}, Accuracy: {}'.format(iter, loss.data, accuracy.float()))

Iterations: 500, Loss: 2.2654316425323486, Accuracy: 18.0
Iterations: 1000, Loss: 1.1688305139541626, Accuracy: 50.0
Iterations: 1500, Loss: 0.4424925148487091, Accuracy: 82.0
Iterations: 2000, Loss: 0.35868415236473083, Accuracy: 87.0
Iterations: 2500, Loss: 0.24794839322566986, Accuracy: 93.0
Iterations: 3000, Loss: 0.27900707721710205, Accuracy: 94.0
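
The accuracy bookkeeping inside the evaluation loop can be checked on a toy batch. The logits and labels below are made-up values for illustration only.

```python
import torch

# Hypothetical logits for 4 samples and 3 classes, plus their true labels
outputs = torch.tensor([[2.0, 0.1, 0.3],
                        [0.2, 1.5, 0.1],
                        [0.3, 0.2, 0.9],
                        [1.1, 0.4, 0.2]])
labels = torch.tensor([0, 1, 1, 0])

# torch.max over dimension 1 returns (max values, argmax indices)
_, predicted = torch.max(outputs, 1)

correct = (predicted == labels).sum().item()  # .item() converts to a Python int
total = labels.size(0)
accuracy = 100 * correct / total

print(predicted.tolist(), accuracy)  # [0, 1, 2, 0] 75.0
```

When evaluating a real model, wrapping the loop in `torch.no_grad()` avoids building the autograd graph for predictions that are never backpropagated.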


In [17]:
import gc
gc.collect()

828

### Model B: 2 Hidden Layers

* Unroll 28 time steps
    * Each step input size: 28 x 1
    * Total per unroll: 28 x 28
        * Feedforward Neural Network input size: 28 x 28
* <b>2 Hidden Layers</b>

#### Steps: 

<li> Step 1: Load Dataset
<li> Step 2: Make Dataset Iterable
<li> Step 3: Create Model Class
<li> Step 4: Instantiate Model Class
<li> Step 5: Instantiate Loss Class
<li> Step 6: Instantiate Optimizer Class
<li> Step 7: Train Model

In [18]:
import torch
import torch.nn as nn
import torchvision.transforms as transforms
import torchvision.datasets as dsets
from torch.autograd import Variable


'''
STEP 1: LOAD DATASET
'''
train_dataset = dsets.MNIST(root = './data',
                            train = True,
                            transform = transforms.ToTensor(),
                            download = True)

test_dataset = dsets.MNIST(root = './data',
                           train = False,
                           transform = transforms.ToTensor())


'''
STEP 2: MAKE DATASET ITERABLE
'''
batch_size = 100
n_iters = 3000

num_epochs = n_iters / (len(train_dataset) / batch_size)
num_epochs = int(num_epochs)

train_loader = torch.utils.data.DataLoader(dataset=train_dataset,
                                           batch_size=batch_size,
                                           shuffle=True)

test_loader = torch.utils.data.DataLoader(dataset=test_dataset,
                                        batch_size=batch_size,
                                        shuffle=False)



'''
STEP 3: CREATE MODEL CLASS
'''
class LSTMModel(nn.Module):
    
    def __init__(self, input_dim, hidden_dim, layer_dim, output_dim):
        super(LSTMModel, self).__init__()
        
        # Hidden Dimensions
        self.hidden_dim = hidden_dim
        
        # Number of hidden layers
        self.layer_dim = layer_dim
        
        # Building your LSTM
        # batch_first = True causes input/output tensors to be of shape
        # (batch_dim, seq_dim, feature_dim)
        self.lstm = nn.LSTM(input_dim, hidden_dim, layer_dim, batch_first=True)
        
        # Readout layer
        self.fn = nn.Linear(hidden_dim, output_dim)

        
    def forward(self, x):
        # Initialize hidden state with zeroes
        # (layer_dim, batch_size, hidden_dim)
        h0 = Variable(torch.zeros(self.layer_dim, x.size(0), self.hidden_dim))
        
        # Initialize Cell State
        c0 = Variable(torch.zeros(self.layer_dim, x.size(0), self.hidden_dim)) 
        
        # Run the LSTM over all 28 time steps
        out, (hn, cn) = self.lstm(x, (h0, c0))
        
        # Index the hidden state of the last time step
        # out.size() ---> 100, 28, 100 (batch, time steps, hidden units)
        # out[:, -1, :] ---> 100, 100 (only the last time step's hidden state)
        out = self.fn(out[:, -1, :])
        # out.size() ---> 100, 10
        return out
    
    
    
'''
STEP 4: INSTANTIATE MODEL CLASS
'''
input_dim = 28
hidden_dim = 100
layer_dim = 2
output_dim = 10

model = LSTMModel(input_dim, hidden_dim, layer_dim, output_dim)


print('-'*60)
print(model)
print('-'*60)
print(len(list(model.parameters())))
print('-'*60)
for i in range(len(list(model.parameters()))):
    print(list(model.parameters())[i].size())
print('-'*60)
print('-'*60)



'''
STEP 5: INSTANTIATE LOSS CLASS
'''
criterion = nn.CrossEntropyLoss()

'''
STEP 6: INSTANTIATE OPTIMIZER CLASS
'''
learning_rate = 0.1
optimizer = torch.optim.SGD(model.parameters(), lr=learning_rate)


'''
STEP 7: TRAIN THE MODEL
'''
# Number of steps to unroll
seq_dim = 28
iter = 0

for epoch in range(num_epochs):
    for i, (images, labels) in enumerate(train_loader):
        
        images = Variable(images.view(-1, seq_dim, input_dim))
        labels = Variable(labels)
        
        optimizer.zero_grad()
        
        outputs = model(images)
        
        loss = criterion(outputs, labels)
        
        loss.backward()
        
        optimizer.step()
        
        iter += 1
        
        if iter % 500 == 0:
            correct = 0
            total = 0
            
            for images, labels in test_loader:
                
                images = Variable(images.view(-1, seq_dim, input_dim))
                outputs = model(images)
                
                _, predicted = torch.max(outputs.data, 1)
                
                total += labels.size(0)
                
                correct += (predicted == labels).sum()
                
            accuracy = 100 * correct / total
            
            print('Iterations: {}, Loss: {}, Accuracy: {}'.format(iter, loss.data, accuracy.float()))

            
            
print('-'*60)


------------------------------------------------------------
LSTMModel(
  (lstm): LSTM(28, 100, num_layers=2, batch_first=True)
  (fn): Linear(in_features=100, out_features=10, bias=True)
)
------------------------------------------------------------
10
------------------------------------------------------------
torch.Size([400, 28])
torch.Size([400, 100])
torch.Size([400])
torch.Size([400])
torch.Size([400, 100])
torch.Size([400, 100])
torch.Size([400])
torch.Size([400])
torch.Size([10, 100])
torch.Size([10])
------------------------------------------------------------
------------------------------------------------------------
Iterations: 500, Loss: 2.301187038421631, Accuracy: 12.0
Iterations: 1000, Loss: 1.746108889579773, Accuracy: 37.0
Iterations: 1500, Loss: 0.5666611194610596, Accuracy: 81.0
Iterations: 2000, Loss: 0.3617479205131531, Accuracy: 91.0
Iterations: 2500, Loss: 0.18505558371543884, Accuracy: 92.0
Iterations: 3000, Loss: 0.06820815801620483, Accuracy: 95.0
--------

In [20]:
import gc
gc.collect()

0

### Model C: 3 Hidden Layers

* Unroll 28 time steps
    * Each step input size: 28 x 1
    * Total per unroll: 28 x 28
        * Feedforward Neural Network input size: 28 x 28
* <b>3 Hidden Layers</b>

#### Steps: 

<li> Step 1: Load Dataset
<li> Step 2: Make Dataset Iterable
<li> Step 3: Create Model Class
<li> Step 4: Instantiate Model Class
<li> Step 5: Instantiate Loss Class
<li> Step 6: Instantiate Optimizer Class
<li> Step 7: Train Model

In [21]:
import torch
import torch.nn as nn
import torchvision.transforms as transforms
import torchvision.datasets as dsets
from torch.autograd import Variable


'''
STEP 1: LOAD DATASET
'''
train_dataset = dsets.MNIST(root = './data',
                            train = True,
                            transform = transforms.ToTensor(),
                            download = True)

test_dataset = dsets.MNIST(root = './data',
                           train = False,
                           transform = transforms.ToTensor())


'''
STEP 2: MAKE DATASET ITERABLE
'''
batch_size = 100
n_iters = 3000

num_epochs = n_iters / (len(train_dataset) / batch_size)
num_epochs = int(num_epochs)

train_loader = torch.utils.data.DataLoader(dataset=train_dataset,
                                           batch_size=batch_size,
                                           shuffle=True)

test_loader = torch.utils.data.DataLoader(dataset=test_dataset,
                                        batch_size=batch_size,
                                        shuffle=False)



'''
STEP 3: CREATE MODEL CLASS
'''
class LSTMModel(nn.Module):
    
    def __init__(self, input_dim, hidden_dim, layer_dim, output_dim):
        super(LSTMModel, self).__init__()
        
        # Hidden Dimensions
        self.hidden_dim = hidden_dim
        
        # Number of hidden layers
        self.layer_dim = layer_dim
        
        # Building your LSTM
        # batch_first = True causes input/output tensors to be of shape
        # (batch_dim, seq_dim, feature_dim)
        self.lstm = nn.LSTM(input_dim, hidden_dim, layer_dim, batch_first=True)
        
        # Readout layer
        self.fn = nn.Linear(hidden_dim, output_dim)

        
    def forward(self, x):
        # Initialize hidden state with zeroes
        # (layer_dim, batch_size, hidden_dim)
        h0 = Variable(torch.zeros(self.layer_dim, x.size(0), self.hidden_dim))
        
        # Initialize Cell State
        c0 = Variable(torch.zeros(self.layer_dim, x.size(0), self.hidden_dim)) 
        
        # Run the LSTM over all 28 time steps
        out, (hn, cn) = self.lstm(x, (h0, c0))
        
        # Index the hidden state of the last time step
        # out.size() ---> 100, 28, 100 (batch, time steps, hidden units)
        # out[:, -1, :] ---> 100, 100 (only the last time step's hidden state)
        out = self.fn(out[:, -1, :])
        # out.size() ---> 100, 10
        return out
    
    
    
'''
STEP 4: INSTANTIATE MODEL CLASS
'''
input_dim = 28
hidden_dim = 100
layer_dim = 3
output_dim = 10

model = LSTMModel(input_dim, hidden_dim, layer_dim, output_dim)

print('-'*60)
print(model)
print('-'*60)
print(len(list(model.parameters())))
print('-'*60)
for i in range(len(list(model.parameters()))):
    print(list(model.parameters())[i].size())
print('-'*60)
print('-'*60)



'''
STEP 5: INSTANTIATE LOSS CLASS
'''
criterion = nn.CrossEntropyLoss()

'''
STEP 6: INSTANTIATE OPTIMIZER CLASS
'''
learning_rate = 0.1
optimizer = torch.optim.SGD(model.parameters(), lr=learning_rate)


'''
STEP 7: TRAIN THE MODEL
'''
# Number of steps to unroll
seq_dim = 28
iter = 0

for epoch in range(num_epochs):
    for i, (images, labels) in enumerate(train_loader):
        
        images = Variable(images.view(-1, seq_dim, input_dim))
        labels = Variable(labels)
        
        optimizer.zero_grad()
        
        outputs = model(images)
        
        loss = criterion(outputs, labels)
        
        loss.backward()
        
        optimizer.step()
        
        iter += 1
        
        if iter % 500 == 0:
            correct = 0
            total = 0
            
            for images, labels in test_loader:
                
                images = Variable(images.view(-1, seq_dim, input_dim))
                outputs = model(images)
                
                _, predicted = torch.max(outputs.data, 1)
                
                total += labels.size(0)
                
                correct += (predicted == labels).sum()
                
            accuracy = 100 * correct / total
            
            print('Iterations: {}, Loss: {}, Accuracy: {}'.format(iter, loss.data, accuracy.float()))

            
            
print('-'*60)


------------------------------------------------------------
LSTMModel(
  (lstm): LSTM(28, 100, num_layers=3, batch_first=True)
  (fn): Linear(in_features=100, out_features=10, bias=True)
)
------------------------------------------------------------
14
------------------------------------------------------------
torch.Size([400, 28])
torch.Size([400, 100])
torch.Size([400])
torch.Size([400])
torch.Size([400, 100])
torch.Size([400, 100])
torch.Size([400])
torch.Size([400])
torch.Size([400, 100])
torch.Size([400, 100])
torch.Size([400])
torch.Size([400])
torch.Size([10, 100])
torch.Size([10])
------------------------------------------------------------
------------------------------------------------------------
Iterations: 500, Loss: 2.293963670730591, Accuracy: 11.0
Iterations: 1000, Loss: 2.295689344406128, Accuracy: 11.0
Iterations: 1500, Loss: 2.1494925022125244, Accuracy: 21.0
Iterations: 2000, Loss: 1.1335668563842773, Accuracy: 63.0
Iterations: 2500, Loss: 0.6934176087379456, Ac

### Comparison with RNN

|     Model A RNN    |     Model B RNN    |   Model C RNN      |
|--------------------|--------------------|--------------------|
|   ReLU             |  ReLU              |  TanH              |
|  1 Hidden Layer    |  2 Hidden Layers   |  2 Hidden Layers   |
|  100 Hidden Units  |  100 Hidden Units  |  100 Hidden Units  |
|     > 86 %         |      > 95 %        |       > 95 %       |

|     Model A LSTM   |     Model B LSTM   |   Model C LSTM     |
|--------------------|--------------------|--------------------|
|  1 Hidden Layer    |  2 Hidden Layers   |  3 Hidden Layers   |
|  100 Hidden Units  |  100 Hidden Units  |  100 Hidden Units  |
|    > 94 %          |     > 95 %         |      89 %          |

### Deep Learning

- Two ways to expand a long short-term memory network
    - More Hidden Units
        - $(o, i, f, g)$ gates
    - More Hidden layers
- Cons
    - Need a larger dataset
        - Curse of Dimensionality
    - Does not necessarily mean higher accuracy
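
To make the cost of both expansion routes concrete, here is a rough sketch that counts parameters for the architecture used in this notebook (a stacked LSTM followed by one linear readout). The helper `lstm_param_count` is a hypothetical name introduced for illustration; its formula follows the parameter shapes printed for Models A, B and C.

```python
def lstm_param_count(input_dim, hidden_dim, layer_dim, output_dim):
    """Count parameters of a stacked LSTM followed by one linear readout layer."""
    total = 0
    for layer in range(layer_dim):
        # The first layer reads the input; deeper layers read the previous layer's hidden state
        in_features = input_dim if layer == 0 else hidden_dim
        # 4 gates, each with input weights, hidden weights and two bias vectors
        total += 4 * hidden_dim * (in_features + hidden_dim + 2)
    # Readout layer: weights plus bias
    total += hidden_dim * output_dim + output_dim
    return total


for layers in (1, 2, 3):
    print(layers, lstm_param_count(28, 100, layers, 10))
# 1 53010
# 2 133810
# 3 214610

print(lstm_param_count(28, 200, 1, 10))  # doubling hidden units: 186010
```

Each extra layer adds 80,800 parameters, while doubling the hidden units of a single layer more than triples the count, so both routes increase the amount of data needed to train the model well.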

## Long Short-Term Memory Networks with PyTorch (GPU)

### Model B: 2 Hidden Layers

* Unroll 28 time steps
    * Each step input size: 28 x 1
    * Total per unroll: 28 x 28
        * Feedforward Neural Network input size: 28 x 28
* <b>2 Hidden Layers</b>

#### Steps: 

<li> Step 1: Load Dataset
<li> Step 2: Make Dataset Iterable
<li> <b>Step 3: Create Model Class</b>
<li> <b>Step 4: Instantiate Model Class</b>
<li> Step 5: Instantiate Loss Class
<li> Step 6: Instantiate Optimizer Class
<li> <b>Step 7: Train Model</b>
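
The bolded steps appear to be the ones that change when a GPU is used: the model, its initial states and every batch must live on the same device. As an alternative to branching on `torch.cuda.is_available()` at each of those points, the sketch below uses a single `torch.device` object. The class name `DeviceAwareLSTM` is hypothetical and the input is a random stand-in for an image batch.

```python
import torch
import torch.nn as nn

# Pick the GPU when one is available, otherwise fall back to the CPU
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')


class DeviceAwareLSTM(nn.Module):  # hypothetical name, mirrors LSTMModel
    def __init__(self, input_dim, hidden_dim, layer_dim, output_dim):
        super().__init__()
        self.hidden_dim = hidden_dim
        self.layer_dim = layer_dim
        self.lstm = nn.LSTM(input_dim, hidden_dim, layer_dim, batch_first=True)
        self.fn = nn.Linear(hidden_dim, output_dim)

    def forward(self, x):
        # Create the initial states on whatever device the input lives on
        h0 = torch.zeros(self.layer_dim, x.size(0), self.hidden_dim, device=x.device)
        c0 = torch.zeros(self.layer_dim, x.size(0), self.hidden_dim, device=x.device)
        out, _ = self.lstm(x, (h0, c0))
        return self.fn(out[:, -1, :])


model = DeviceAwareLSTM(28, 100, 2, 10).to(device)
images = torch.rand(4, 28, 28).to(device)  # random stand-in for an image batch
outputs = model(images)
print(outputs.size(), outputs.device.type)
```

This version runs unchanged on machines with or without CUDA, which keeps the CPU and GPU code paths from drifting apart.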

In [None]:
import torch
import torch.nn as nn
import torchvision.transforms as transforms
import torchvision.datasets as dsets
from torch.autograd import Variable


'''
STEP 1: LOAD DATASET
'''
train_dataset = dsets.MNIST(root = './data',
                            train = True,
                            transform = transforms.ToTensor(),
                            download = True)

test_dataset = dsets.MNIST(root = './data',
                           train = False,
                           transform = transforms.ToTensor())


'''
STEP 2: MAKE DATASET ITERABLE
'''
batch_size = 100
n_iters = 3000

num_epochs = n_iters / (len(train_dataset) / batch_size)
num_epochs = int(num_epochs)

train_loader = torch.utils.data.DataLoader(dataset=train_dataset,
                                           batch_size=batch_size,
                                           shuffle=True)

test_loader = torch.utils.data.DataLoader(dataset=test_dataset,
                                        batch_size=batch_size,
                                        shuffle=False)



'''
STEP 3: CREATE MODEL CLASS
'''
class LSTMModel(nn.Module):
    
    def __init__(self, input_dim, hidden_dim, layer_dim, output_dim):
        super(LSTMModel, self).__init__()
        
        # Hidden Dimensions
        self.hidden_dim = hidden_dim
        
        # Number of hidden layers
        self.layer_dim = layer_dim
        
        # Building your LSTM
        # batch_first = True causes input/output tensors to be of shape
        # (batch_dim, seq_dim, feature_dim)
        self.lstm = nn.LSTM(input_dim, hidden_dim, layer_dim, batch_first=True)
        
        # Readout layer
        self.fn = nn.Linear(hidden_dim, output_dim)

        
    def forward(self, x):
        # Initialize hidden state with zeroes
        # (layer_dim, batch_size, hidden_dim)
        if torch.cuda.is_available():
            h0 = Variable(torch.zeros(self.layer_dim, x.size(0), self.hidden_dim).cuda())
            # Initialize Cell State
            c0 = Variable(torch.zeros(self.layer_dim, x.size(0), self.hidden_dim).cuda()) 
        else:
            h0 = Variable(torch.zeros(self.layer_dim, x.size(0), self.hidden_dim))
            # Initialize Cell State
            c0 = Variable(torch.zeros(self.layer_dim, x.size(0), self.hidden_dim)) 
        

        
        # Run the LSTM over all 28 time steps
        out, (hn, cn) = self.lstm(x, (h0, c0))
        
        # Index the hidden state of the last time step
        # out.size() ---> 100, 28, 100 (batch, time steps, hidden units)
        # out[:, -1, :] ---> 100, 100 (only the last time step's hidden state)
        out = self.fn(out[:, -1, :])
        # out.size() ---> 100, 10
        return out
    
    
    
'''
STEP 4: INSTANTIATE MODEL CLASS
'''
input_dim = 28
hidden_dim = 100
layer_dim = 2
output_dim = 10

model = LSTMModel(input_dim, hidden_dim, layer_dim, output_dim)

if torch.cuda.is_available():
    model.cuda()

print('-'*60)
print(model)
print('-'*60)
print(len(list(model.parameters())))
print('-'*60)
for i in range(len(list(model.parameters()))):
    print(list(model.parameters())[i].size())
print('-'*60)
print('-'*60)



'''
STEP 5: INSTANTIATE LOSS CLASS
'''
criterion = nn.CrossEntropyLoss()

'''
STEP 6: INSTANTIATE OPTIMIZER CLASS
'''
learning_rate = 0.1
optimizer = torch.optim.SGD(model.parameters(), lr=learning_rate)


'''
STEP 7: TRAIN THE MODEL
'''
# Number of steps to unroll
seq_dim = 28
iter = 0

for epoch in range(num_epochs):
    for i, (images, labels) in enumerate(train_loader):
        
        if torch.cuda.is_available():
            images = Variable(images.view(-1, seq_dim, input_dim).cuda())
            labels = Variable(labels.cuda())
        else:
            images = Variable(images.view(-1, seq_dim, input_dim))
            labels = Variable(labels)
        
        optimizer.zero_grad()
        
        outputs = model(images)
        
        loss = criterion(outputs, labels)
        
        loss.backward()
        
        optimizer.step()
        
        iter += 1
        
        if iter % 500 == 0:
            correct = 0
            total = 0
            
            for images, labels in test_loader:
                if torch.cuda.is_available():
                    images = Variable(images.view(-1, seq_dim, input_dim).cuda())
                else:
                    images = Variable(images.view(-1, seq_dim, input_dim))
                    
                outputs = model(images)
                
                _, predicted = torch.max(outputs.data, 1)
                
                total += labels.size(0)
                
                if torch.cuda.is_available():
                    correct += (predicted.cpu() == labels.cpu()).sum()
                else:
                    correct += (predicted == labels).sum()
                
            accuracy = 100 * correct / total
            
            print('Iterations: {}, Loss: {}, Accuracy: {}'.format(iter, loss.data, accuracy.float()))

            
            
print('-'*60)


------------------------------------------------------------
LSTMModel(
  (lstm): LSTM(28, 100, num_layers=2, batch_first=True)
  (fn): Linear(in_features=100, out_features=10, bias=True)
)
------------------------------------------------------------
10
------------------------------------------------------------
torch.Size([400, 28])
torch.Size([400, 100])
torch.Size([400])
torch.Size([400])
torch.Size([400, 100])
torch.Size([400, 100])
torch.Size([400])
torch.Size([400])
torch.Size([10, 100])
torch.Size([10])
------------------------------------------------------------
------------------------------------------------------------
Iterations: 500, Loss: 2.301875352859497, Accuracy: 12.0
Iterations: 1000, Loss: 1.8230276107788086, Accuracy: 39.0


In [None]:
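import os

save_model = True
if save_model:
    # Save only the parameters (state_dict), not the whole model object
    os.makedirs('Models', exist_ok=True)  # torch.save does not create missing directories
    torch.save(model.state_dict(), 'Models/LSTMPytorch.pkl')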
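
Since only the parameters are saved, loading them back requires rebuilding the same architecture first. The sketch below shows the round trip with a bare `nn.LSTM` as a stand-in for `LSTMModel`, writing to a temporary directory; the file name `lstm_params.pkl` is hypothetical.

```python
import os
import tempfile

import torch
import torch.nn as nn

# Small stand-in model; the same pattern applies to LSTMModel
model = nn.LSTM(28, 100, 2, batch_first=True)

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, 'lstm_params.pkl')  # hypothetical file name
    torch.save(model.state_dict(), path)

    # A state_dict holds only tensors, so rebuild the architecture first
    restored = nn.LSTM(28, 100, 2, batch_first=True)
    restored.load_state_dict(torch.load(path))

restored.eval()  # switch to evaluation mode before inference

same = all(torch.equal(a, b) for a, b in zip(model.parameters(), restored.parameters()))
print(same)
```

If the parameters were saved from a GPU model and are loaded on a CPU-only machine, `torch.load` accepts a `map_location` argument to remap the tensors.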

---------