# Pyrat Deep Learning Processing

## Setup Environment

Required libraries for Data Preprocessing

In [33]:
# Import libraries

import os
import numpy as np
import tqdm
import ast
import scipy
import scipy.sparse
import sys
import pickle

Required libraries for Network Training

In [34]:
# Import libraries

import torch
import torch.nn as nn
import torch.nn.functional as F
import torch.optim as optim
import inspect

# Data splitting and batching utilities

from sklearn.model_selection import train_test_split
from torch.utils.data import DataLoader, TensorDataset

Required libraries for Visualization

In [35]:
import matplotlib.pyplot as plt
%matplotlib inline

Define the path for the saved Pyrat Games. **If you change your Pyrat repo location UPDATE THIS!**

In [36]:
# Set your path to the saves folder here
directory = 'D:\\PyRat-1\\saves\\'  # backslashes are escaped so Python does not read them as escape sequences

Set the name for the pickled file containing the Pyrat dataset preprocessed for supervised learning.

In [37]:
dataset_name = "pyrat_dataset.pkl"

Define our **device** as the first visible CUDA device if we have CUDA available:

In [38]:
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
print('Using device '+ str(device))

Using device cuda


## Creating Pyrat Games

If you have not done so already, you need the latest version of PyRat. To obtain it, clone the [official PyRat repository](https://github.com/BastienPasdeloup/PyRat-1). 

Note: PyRat requires pygame. To install it, open a terminal and run:

<pre>pip install pygame</pre> 

In the context of the AI course, we are going to simplify the rules of PyRat a bit.
In fact, we are going to remove all walls and mud penalties. Also, we are not going to consider symmetric mazes anymore.

As such, a default game is launched with the following parameters. Please try now (note that you may have to type python instead of python3): 

<pre>python pyrat.py -p 40 -md 0 -d 0 --nonsymmetric</pre>

An empty labyrinth will appear.

Check out all the options offered by PyRat by running:

<pre>python pyrat.py -h</pre>

Importantly, there are options to change the size of the map and the number of pieces of cheese, which will be very useful later to benchmark your own solutions.

In the supervised and unsupervised projects, we are going to look at plays between two greedy algorithms. PyRat can generate 1000 such games and save their data with a single command.

Open another terminal to launch the next command line. Generating 1000 games will take a few minutes.

<pre>python pyrat.py --width 21 --height 15 -p 40 -md 0 -d 0 --nonsymmetric --rat AIs/manh.py --python AIs/manh.py --tests 1000 --nodrawing --synchronous --save</pre>

The 1000 generated games will be in the "saves" folder. Each time you execute the command, new games are added to the saves folder. You have to manually delete the old games if you do not want to use them (for example, if you change the size of the labyrinth or if you want to train your AI on new games).

As a bonus, to watch a visual simulation of PyRat with both players controlled by the greedy AI, run the following command:

<pre>python pyrat.py --width 20 --height 20 -p 40 -md 0 -d 0 --nonsymmetric --rat AIs/manh.py --python AIs/manh.py</pre>

## Preprocessing Tools

### Constant Definitions

Preprocessing Constant Definitions

In [39]:
PHRASES = {
    "# Random seed\n": "seed",
    "# MazeMap\n": "maze",
    "# Pieces of cheese\n": "pieces"    ,
    "# Rat initial location\n": "rat"    ,
    "# Python initial location\n": "python"   , 
    "rat_location then python_location then pieces_of_cheese then rat_decision then python_decision\n": "play"
}
 
MOVE_DOWN = 'D'
MOVE_LEFT = 'L'
MOVE_RIGHT = 'R'
MOVE_UP = 'U'
 
translate_action = {
    MOVE_LEFT:0,
    MOVE_RIGHT:1,
    MOVE_UP:2,
    MOVE_DOWN:3
}

### Function Definitions

**Define a function** to process a Pyrat game file.

In [40]:
def process_file(filename):
    """
    This function receives a pyrat file save and returns its parameters.
    """
    params = dict(play=list())
    # Open the file in a context manager so that it is always closed
    with open(filename, "r") as f:
        info = f.readline()
        # readline() returns an empty string at end of file, never None
        while info:
            if info.startswith("{"):
                params["end"] = ast.literal_eval(info)
                break
            if "turn " in info:
                info = info[info.find('rat_location'):]
            if info in PHRASES:
                param = PHRASES[info]
                if param == "play":
                    rat = ast.literal_eval(f.readline())
                    python = ast.literal_eval(f.readline())
                    pieces = ast.literal_eval(f.readline())
                    rat_decision = f.readline().replace("\n","")
                    python_decision = f.readline().replace("\n","")
                    play_dict = dict(
                        rat=rat,python=python,piecesOfCheese=pieces,
                        rat_decision=rat_decision,python_decision=python_decision)
                    params[param].append(play_dict)
                else:
                    params[param] = ast.literal_eval(f.readline())
            else:
                print("did not understand:", info)
                break
            info = f.readline()
    return params

Process a sample game to **understand** its contents

In [41]:
sample_game = process_file("sample_game8x8")
sample_game

{'play': [{'rat': (0, 0),
   'python': (7, 7),
   'piecesOfCheese': [(3, 7),
    (6, 3),
    (4, 5),
    (2, 3),
    (3, 2),
    (1, 4),
    (2, 4),
    (4, 0),
    (5, 3),
    (6, 6),
    (5, 5),
    (3, 5),
    (1, 5),
    (0, 4),
    (1, 1),
    (0, 2),
    (0, 5),
    (6, 0),
    (5, 6),
    (3, 6),
    (7, 2),
    (7, 3),
    (2, 7),
    (0, 7),
    (5, 2),
    (1, 3),
    (7, 4),
    (4, 7),
    (7, 6),
    (4, 2),
    (4, 1),
    (5, 0),
    (6, 5),
    (7, 5),
    (4, 3),
    (1, 2),
    (2, 5),
    (5, 1),
    (6, 2),
    (3, 0)],
   'rat_decision': 'R',
   'python_decision': 'D'},
  {'rat': (1, 0),
   'python': (7, 6),
   'piecesOfCheese': [(3, 7),
    (6, 3),
    (4, 5),
    (2, 3),
    (3, 2),
    (1, 4),
    (2, 4),
    (4, 0),
    (5, 3),
    (6, 6),
    (5, 5),
    (3, 5),
    (1, 5),
    (0, 4),
    (1, 1),
    (0, 2),
    (0, 5),
    (6, 0),
    (5, 6),
    (3, 6),
    (7, 2),
    (7, 3),
    (2, 7),
    (0, 7),
    (5, 2),
    (1, 3),
    (7, 4),
    (4, 7),
    (4, 2

**Understand** how to interpret the size of a maze from a saved game file.

In [42]:
# The key 'rat' from a saved game file data dictionary contains the initial coordinates of the rat
# The rat always starts at bottom left corner (0,0)
print( f'Rat initial position {sample_game["rat"]}' )

# The key 'python' from a saved game file data dictionary contains the initial coordinates of the python
# The python starts at the top right corner (mazeWidth-1,mazeHeight-1)
print( f'Python initial position {sample_game["python"]}' )

# Get the maze size from a single saved game file
sampleWidth =  sample_game["python"][0] + 1
sampleHeight = sample_game["python"][1] + 1

print(f'Maze size: {sampleWidth, sampleHeight}')

Rat initial position (0, 0)
Python initial position (7, 7)
Maze size: (8, 8)


**Define a function** to create a canvas.

In [43]:
def convert_input(player, maze, opponent, mazeHeight, mazeWidth, piecesOfCheese):
    """
    The goal of this function is to create a canvas, which will be the vector used to train a classifier. 
    As we want to predict the next move, we create a canvas that is centered on the player, which makes the representation translation invariant.
    """
    
    # We consider twice the size of the maze to simplify the creation of the canvas. 
    # The canvas is initialized as a numpy tensor with 3 dimensions, the third one corresponding to "layers" of the canvas. 
    # Here, we just use one layer, but you can define other ones to put more information on the play (e.g. the location of the opponent could be put in a second layer).
    
    im_size = (2*mazeHeight-1,2*mazeWidth-1,1)
    #im_size = (2*mazeHeight-1,2*mazeWidth-1,2)

    # We initialize a canvas with only zeros.
    canvas = np.zeros(im_size)

    # Coordinates of the player in the original maze
    (x,y) = player
    # Coordinates of the opponent in the original maze
    (x_opponent, y_opponent) = opponent
    
    # Center of the canvas, which represents the view of the centered player
    canvas_x_center = mazeWidth - 1
    canvas_y_center = mazeHeight - 1
        
    # Fill in the first layer of the canvas with the value 1 at the location of the cheeses, relative to the position of the player (i.e. the canvas is centered on the player location).
    #for (x_cheese,y_cheese) in piecesOfCheese:
    #    canvas[ canvas_x_center+(x_cheese-x), canvas_y_center+(y_cheese-y), 0] = 1
        
    for (x_cheese,y_cheese) in piecesOfCheese:
        # Fill the player layer with cheese
        canvas[ y_cheese+canvas_y_center-y, x_cheese+canvas_x_center-x, 0] = 1
        # Fill the opponent layer with cheese
        #canvas[ y_cheese+canvas_y_center-y_opponent, x_cheese+canvas_x_center-x_opponent, 1] = 1
    # Fill the opponent layer with the opponent's position with respect to the player
    #canvas[y_opponent+canvas_y_center-y,x_opponent+canvas_x_center-x,1] = 1

    return canvas
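
To make the centering concrete, here is a minimal sketch on a 3x3 maze. The function `centered_canvas` is a hypothetical, single-layer restatement of the indexing used in `convert_input` (it drops the unused `maze` and `opponent` arguments); the maze size and cheese positions are made up for the example.

```python
import numpy as np

def centered_canvas(player, pieces_of_cheese, maze_width, maze_height):
    """Single-layer, player-centered canvas (simplified restatement of convert_input)."""
    canvas = np.zeros((2 * maze_height - 1, 2 * maze_width - 1))
    x, y = player
    center_x, center_y = maze_width - 1, maze_height - 1
    for x_cheese, y_cheese in pieces_of_cheese:
        # Row index follows y, column index follows x, both relative to the player
        canvas[y_cheese + center_y - y, x_cheese + center_x - x] = 1
    return canvas

# 3x3 maze -> 5x5 canvas whose center cell (2, 2) always stands for the player
a = centered_canvas(player=(0, 0), pieces_of_cheese=[(1, 0)], maze_width=3, maze_height=3)
b = centered_canvas(player=(1, 2), pieces_of_cheese=[(2, 2)], maze_width=3, maze_height=3)

print(a)
print(np.array_equal(a, b))  # True: in both cases the cheese is one step away in +x
```

Two different absolute positions with the same relative offset produce the same canvas, which is the translation invariance mentioned in the docstring. The canvas needs `2*size-1` cells per axis so that every cell of the maze stays visible wherever the player stands.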

Visualize what a single play contains to **understand** the next preprocessing functions.

In [44]:
# Print the first play of the sample game
sample_game["play"][0]

{'rat': (0, 0),
 'python': (7, 7),
 'piecesOfCheese': [(3, 7),
  (6, 3),
  (4, 5),
  (2, 3),
  (3, 2),
  (1, 4),
  (2, 4),
  (4, 0),
  (5, 3),
  (6, 6),
  (5, 5),
  (3, 5),
  (1, 5),
  (0, 4),
  (1, 1),
  (0, 2),
  (0, 5),
  (6, 0),
  (5, 6),
  (3, 6),
  (7, 2),
  (7, 3),
  (2, 7),
  (0, 7),
  (5, 2),
  (1, 3),
  (7, 4),
  (4, 7),
  (7, 6),
  (4, 2),
  (4, 1),
  (5, 0),
  (6, 5),
  (7, 5),
  (4, 3),
  (1, 2),
  (2, 5),
  (5, 1),
  (6, 2),
  (3, 0)],
 'rat_decision': 'R',
 'python_decision': 'D'}

**Define a function** to vectorize a Pyrat game data dictionary.

In [45]:
def dict_to_x_y(end, rat, python, maze, piecesOfCheese, rat_decision, python_decision,
                mazeWidth, mazeHeight):
    """
    This function vectorizes a Pyrat game data dictionary.
    It returns a (canvas, one-hot decision) pair, or False if the play cannot be used.
    """
    # We only use the winner
    if end["win_python"] == 1:
        player = python
        opponent = rat
        decision = python_decision
    elif end["win_rat"] == 1:
        player = rat
        opponent = python
        decision = rat_decision
    else:
        return False
    if decision == "None" or decision == "": # No play
        return False
    x_1 = convert_input(player, maze, opponent, mazeHeight, mazeWidth, piecesOfCheese)
    y = np.zeros((1,4),dtype=np.int8)
    y[0][translate_action[decision]] = 1
    return x_1,y

**Understand** the vectorized version of the sample game data dictionary.

In [46]:
# Create a vectorized version of the sample game data dictionary
sample_x_y = dict_to_x_y(**(sample_game["play"][0]),
                         maze = sample_game["maze"],
                         end = sample_game["end"],
                         mazeWidth = sampleWidth,
                         mazeHeight = sampleHeight)

# Ensure that the created canvas matches the formula ( 2*mazeHeight-1, 2*mazeWidth-1, number of layers )
print(f'Maze size:   {sampleWidth, sampleHeight}')
print(f'Canvas size: {(sample_x_y[0]).shape}')

Maze size:   (8, 8)
Canvas size: (15, 15, 2)


In [47]:
# Check who is the winner
print("This game's winner is...")
print(f"Rat: {sample_game['end']['win_rat']}")
print(f"Python: {sample_game['end']['win_python']}\n")

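# Print one layer of the canvas. Index 1 is the opponent layer, which exists only when the
# two-layer im_size is enabled in convert_input; use index 0 for the cheese layer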
print(sample_x_y[0][:,:,1])

This game's winner is...
Rat: 0
Python: 1

[[1. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
 [0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
 [0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
 [0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
 [0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
 [0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
 [0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
 [0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
 [0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
 [0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
 [0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
 [0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
 [0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
 [0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
 [0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]]


In [144]:
# Print the player's choice
sample_x_y[1]

array([[0, 0, 0, 1]], dtype=int8)

### Main Preprocessing

The following code parses the saves directory to generate a database file called 'pyrat_dataset.pkl'.
It is taken exactly as it is presented in the generate_dataset.py file for supervised learning.

The saved pickle file stores both the canvas and the decision vector as sparse matrices. Because this data consists mostly of zeros, a sparse representation uses much less memory than a dense multidimensional array.

For a good introduction to sparse matrices, see this [link](https://machinelearningmastery.com/sparse-matrices-for-machine-learning/#:~:text=Matrices%20that%20contain%20mostly%20zero,non%2Dzero%2C%20called%20dense).
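
As a rough illustration of the memory argument, the sketch below compares a dense flattened canvas with its CSR version. The canvas is synthetic: 40 randomly placed ones in a 39x39 grid (the canvas size for a 20x20 maze), not data from a real game, and the byte counts ignore Python object overhead.

```python
import numpy as np
import scipy.sparse

# A flattened 39x39 canvas with 40 cells set to 1; the positions are arbitrary
rng = np.random.default_rng(0)
dense = np.zeros((1, 39 * 39))
dense[0, rng.choice(39 * 39, size=40, replace=False)] = 1

sparse = scipy.sparse.csr_matrix(dense)

# A CSR matrix stores only the non-zero values, their column indices, and row pointers
dense_bytes = dense.nbytes
sparse_bytes = sparse.data.nbytes + sparse.indices.nbytes + sparse.indptr.nbytes
print(f"dense: {dense_bytes} bytes, CSR: {sparse_bytes} bytes")

# Stacking per-move rows and converting back recovers the original values
stacked = scipy.sparse.vstack([sparse, sparse]).toarray()
print(stacked.shape)  # (2, 1521)
```

The same stack-then-densify round trip is what the loading cells perform on the full dataset.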

In [145]:
games = list()

for root, dirs, files in os.walk(directory):
    for filename in tqdm.tqdm(files):
        if filename.startswith("."):
            continue
        # Join with root so that files found in subfolders resolve to the correct path
        game_params = process_file(os.path.join(root, filename))
        games.append(game_params)
        
# Check if all games are on mazes of same dimension
mazeWidth=games[0]["python"][0] + 1
mazeHeight=games[0]["python"][1] + 1
print(mazeWidth, mazeHeight)
for game in games :
    if game["python"][0] + 1 != mazeWidth or game["python"][1] + 1 != mazeHeight :
        # Raise an error instead of calling exit() so the problem is reported inside the notebook
        raise ValueError("Saves directory contains games of various dimensions")
        
x_1_train = list()
y_train = list()
wins_python = 0
wins_rat = 0
        
for game in tqdm.tqdm(games):
    if game["end"]["win_python"] == 1: 
        wins_python += 1
    elif game["end"]["win_rat"] == 1:
        wins_rat += 1
    else:
        continue
    plays = game["play"]
    for play in plays:
        x_y = dict_to_x_y(**play,maze=game["maze"],end=game["end"], mazeWidth=mazeWidth, mazeHeight=mazeHeight)
        if x_y:
            x1, y = x_y
            y_train.append(scipy.sparse.csr_matrix(y.reshape(1,-1)))
            x_1_train.append(scipy.sparse.csr_matrix(x1.reshape(1,-1)))

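# Draws are the games that neither player won
print("Greedy/Draw/Random Greedy, {}/{}/{}".format(wins_rat, len(games) - wins_python - wins_rat, wins_python)) 

# Save under dataset_name so that the loading cell below reads the same file
with open(dataset_name, "wb") as f:
    pickle.dump([x_1_train, y_train, mazeWidth, mazeHeight], f)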

100%|██████████████████████████████████████████████████████████████████████████████| 1000/1000 [00:42<00:00, 23.75it/s]


20 20


100%|██████████████████████████████████████████████████████████████████████████████| 1000/1000 [00:18<00:00, 52.98it/s]


Greedy/Draw/Random Greedy, 435/113/452


## Train the Network

### Prepare the Network's Inputs

Load the pickled dataset containing the plays canvas (x) and the associated decision vectors (y) as scipy sparse matrices representations.

In [146]:
### This line reloads the pyrat_dataset that was stored as a pkl file by the generate dataset script. 
x, y, mazeWidth, mazeHeight = pickle.load(open(dataset_name,"rb"))

As the dataset was stored as SciPy sparse matrices to save space, we first convert it back to dense matrices. 
Note that you could keep the sparse representation if you work with a machine learning method that accepts sparse arrays.

In [147]:
x = scipy.sparse.vstack(x).todense()
y = scipy.sparse.vstack(y).todense()

Turn the dense matrices into PyTorch tensors.

In [148]:
# For reshape a single dimension may be -1, in which case it’s inferred from the remaining dimensions and the number of elements in input.
x = torch.FloatTensor(x).reshape(-1,(2*mazeHeight-1)*(2*mazeWidth-1))  # (number of moves, size of the canvas)
y = torch.argmax(torch.FloatTensor(y), dim=1)  # (number of moves,)

# This is the number of features contained in a canvas (canvasWidth * canvasHeight)
canvas_size = x.shape[1]

print(f'Number of features/maze-cells contained in a canvas (canvasWidth * canvasHeight): {canvas_size}')
print(x.shape, y.shape)

Number of features/maze-cells contained in a canvas (canvasWidth * canvasHeight): 1521
torch.Size([67359, 1521]) torch.Size([67359])


In the previous cell, **torch.argmax** with `dim=1` returns, for each row of the input tensor, the index of its maximum value. This is a convenient transformation for our expected output tensor: originally each row of 'y' was a one-hot vector of length 4 (e.g. [0, 0, 0, 1]) whose single '1' marked the agent's move for that play, so the index of the '1' is the class label of the move. Our transformed 'y' tensor looks as follows:

In [149]:
y[0:30]

tensor([0, 0, 0, 0, 0, 0, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 0, 0, 3, 0, 0, 0, 0,
        2, 0, 0, 2, 2, 2])
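
A minimal sketch of this conversion on three hand-written one-hot rows (the rows are made up for the example; the index-to-move mapping is the one from `translate_action`):

```python
import torch

# Three one-hot decision vectors: LEFT (index 0), DOWN (index 3), UP (index 2)
one_hot = torch.tensor([[1., 0., 0., 0.],
                        [0., 0., 0., 1.],
                        [0., 0., 1., 0.]])

labels = torch.argmax(one_hot, dim=1)  # index of the maximum within each row
print(labels)        # tensor([0, 3, 2])
print(labels.dtype)  # torch.int64
```

Integer class indices of this type are the target format that `nn.CrossEntropyLoss` accepts, which is why the one-hot vectors are converted before training.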

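### Split data into **training**, **validation**, and **test** sets.

**20%** of the samples are held out as the **test set**.

The remaining **80%** is split again: 80% of it (64% of all samples) forms the **training set** and 20% of it (16% of all samples) forms the **validation set**.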

In [165]:
# Split technique proposed by the IMT Introduction to Artificial Intelligence course (SKIPPED)

'''## Split your data into x_train, x_test, y_train, y_test.

n = int(x.shape[0] * 80/100)  # number of examples in the train set
x_train = x[:n]
x_test  = x[n:]
y_train = y[:n]
y_test  = y[n:]

print(f'{n} samples chosen for the TRAINING SET.')
print(f'{x.shape[0] - n} samples chosen for the TEST SET.')'''

53887 samples chose for the TRAINING SET.
13472 samples chose for the TEST SET.


In [171]:
# Split technique using classic Scikit Learn approach which is compatible with Pytorch tensors

# Split data into training, validation, and test sets

x_train, x_test, y_train, y_test = train_test_split(x, y, test_size=0.2, random_state=42)
x_train, x_val, y_train, y_val = train_test_split(x_train, y_train, test_size=0.2, random_state=42)

print(f'{x_train.shape} samples of type {type(x_train)} chosen for the TRAINING SET.')
print(f'{x_test.shape} samples of type {type(x_test)} chosen for the TEST SET.')
print(f'{x_val.shape} samples of type {type(x_val)} chosen for the VALIDATION SET.')

torch.Size([43109, 1521]) samples of type <class 'torch.Tensor'> chosen for the TRAINING SET.
torch.Size([13472, 1521]) samples of type <class 'torch.Tensor'> chosen for the TEST SET.
torch.Size([10778, 1521]) samples of type <class 'torch.Tensor'> chosen for the VALIDATION SET.
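
The set sizes printed above follow from applying a 20% hold-out twice. The sketch below reproduces the arithmetic, assuming that `train_test_split` rounds a fractional `test_size` up to the next whole sample; that assumption is consistent with the sizes printed above.

```python
import math

n_samples = 67359  # number of moves in the dataset loaded above

# First split: 20% of all samples are held out as the test set
n_test = math.ceil(0.2 * n_samples)
n_remaining = n_samples - n_test

# Second split: 20% of the remaining samples become the validation set
n_val = math.ceil(0.2 * n_remaining)
n_train = n_remaining - n_val

print(n_train, n_val, n_test)  # 43109 10778 13472
for name, n in [("train", n_train), ("validation", n_val), ("test", n_test)]:
    print(f"{name}: {100 * n / n_samples:.1f}% of all samples")
```

The effective proportions are therefore about 64% training, 16% validation, and 20% test.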


### Create DataLoaders
Create **DataLoaders**. 

[DataLoader documentation](https://pytorch.org/docs/stable/data.html)

[Why shouldn't we shuffle the test and validation loaders?](https://discuss.pytorch.org/t/shuffle-true-or-shuffle-false-for-val-and-test-dataloaders/143720)

In [173]:
# Define how many samples per batch to load

batch_size = 20

# Define data loaders

train_data = TensorDataset(x_train, y_train)
train_loader = DataLoader(train_data, batch_size=batch_size, shuffle=True)

val_data = TensorDataset(x_val, y_val)
val_loader = DataLoader(val_data, batch_size=batch_size, shuffle=False)

test_data = TensorDataset(x_test, y_test)
test_loader = DataLoader(test_data, batch_size=batch_size, shuffle=False)
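
To see what a loader yields, here is a minimal sketch on a toy dataset of 7 samples (the data is made up; only `TensorDataset` and `DataLoader` come from the cell above). With `shuffle=False` the samples come out in their original order, and the last batch is smaller when the batch size does not divide the dataset size.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# Toy dataset: 7 samples with 3 features each, labels 0..6 so the order is easy to see
features = torch.arange(21, dtype=torch.float32).reshape(7, 3)
labels = torch.arange(7)

loader = DataLoader(TensorDataset(features, labels), batch_size=3, shuffle=False)

print(len(loader))  # 3 batches: ceil(7 / 3)
for batch_x, batch_y in loader:
    print(batch_x.shape, batch_y.tolist())
# torch.Size([3, 3]) [0, 1, 2]
# torch.Size([3, 3]) [3, 4, 5]
# torch.Size([1, 3]) [6]
```

Because batches can differ in size, a loss averaged per batch should be weighted by the batch size when it is accumulated over an epoch.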

### Define Network Topology

Now you have to train a classifier using supervised learning and evaluate its performance.

To begin with, **define** a neural network with **two hidden layers**. In PyTorch, each fully connected layer is a module of type "Linear"; the network below places two hidden Linear layers between an input layer and an output layer.

You need to make sure that the input size of the first layer corresponds to the width of your X vector.

*Feel free to try a different number of layers and other non-linear functions.*

In [197]:
class Net(nn.Module):
    def __init__(self, input_dim, hidden_dim, output_dim):
        super(Net, self).__init__()
        self.input_layer = nn.Linear(input_dim, hidden_dim)
        self.hidden_layer1 = nn.Linear(hidden_dim, hidden_dim)
        self.hidden_layer2 = nn.Linear(hidden_dim, hidden_dim)
        self.output_layer = nn.Linear(hidden_dim, output_dim)
    
    def forward(self, x):
        x = torch.relu(self.input_layer(x))
        x = torch.relu(self.hidden_layer1(x))
        x = torch.relu(self.hidden_layer2(x))
        x = self.output_layer(x)
        return x

Let's **initialize** the neural network!

In [152]:
# Define model parameters

input_dim = canvas_size
hidden_dim = 20
output_dim = 4

# Instantiate model and move it to device

net = Net(input_dim, hidden_dim, output_dim)
net.to(device=device)

Net(
  (input_layer): Linear(in_features=1521, out_features=20, bias=True)
  (hidden_layer1): Linear(in_features=20, out_features=20, bias=True)
  (hidden_layer2): Linear(in_features=20, out_features=20, bias=True)
  (output_layer): Linear(in_features=20, out_features=4, bias=True)
)
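
As a quick sanity check on the size of this network, the number of trainable parameters can be computed by hand from the layer shapes printed above. The helper `linear_params` is a hypothetical name used only for this sketch.

```python
def linear_params(in_features, out_features):
    """Weights plus biases of one fully connected layer."""
    return in_features * out_features + out_features

# (in_features, out_features) of the four Linear layers printed above
layer_sizes = [(1521, 20), (20, 20), (20, 20), (20, 4)]
per_layer = [linear_params(i, o) for i, o in layer_sizes]

print(per_layer)       # [30440, 420, 420, 84]
print(sum(per_layer))  # 31364
```

Almost all parameters sit in the first layer, because the flattened canvas has 1521 inputs.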

### Loss Function and Optimizer

Define a **loss function** and **optimizer**.

In [153]:
# Define the loss function as cross-entropy
criterion = nn.CrossEntropyLoss()

# Set Adam as the optimizer
optimizer = optim.Adam(net.parameters(), lr=0.001)

# Set stochastic gradient descent as the optimizer
# optimizer = optim.SGD(net.parameters(), lr=0.01)
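
To show what the cross-entropy criterion measures, here is a minimal NumPy sketch of the same quantity: the mean negative log-probability of the correct class, computed from raw scores (logits). The function `cross_entropy` is a hypothetical re-implementation for illustration, not the PyTorch code itself.

```python
import numpy as np

def cross_entropy(logits, targets):
    """Mean cross-entropy of raw scores (logits) against integer class labels."""
    # Subtract the row maximum before exponentiating for numerical stability
    shifted = logits - logits.max(axis=1, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    return -log_probs[np.arange(len(targets)), targets].mean()

# Uniform scores over the 4 moves: the loss equals log(4), the chance-level value
uniform = np.zeros((2, 4))
print(cross_entropy(uniform, np.array([0, 3])))  # about 1.386

# A confident, correct prediction gives a loss close to 0
confident = np.array([[10.0, 0.0, 0.0, 0.0]])
print(cross_entropy(confident, np.array([0])))
```

A loss near log(4) ≈ 1.386 therefore indicates a network that has not learned anything about the four moves yet.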

### Train the network

In [178]:
def training(n_epochs, train_loader, valid_loader, model, criterion, optimizer):

    train_losses, valid_losses = [], []
    # initialize tracker for minimum validation loss
    valid_loss_min = np.inf  # set initial "min" to infinity (np.Inf was removed in NumPy 2.0)

    for epoch in range(n_epochs):
        train_loss, valid_loss = 0, 0 # monitor losses
      
        # train the model
        model.train() # prep model for training
        for data, label in train_loader:
            data = data.to(device=device)
            label = label.to(device=device)
            optimizer.zero_grad() # clear the gradients of all optimized variables
            output = model(data) # forward pass: compute predicted outputs by passing inputs to the model
            loss = criterion(output, label) # calculate the loss
            loss.backward() # backward pass: compute gradient of the loss with respect to model parameters
            optimizer.step() # perform a single optimization step (parameter update)
            train_loss += loss.item() * data.size(0) # update running training loss
      
        # validate the model
        model.eval()
        for data, label in valid_loader:
            data = data.to(device=device)
            label = label.to(device=device)
            with torch.no_grad():
                output = model(data)
            loss = criterion(output,label)
            valid_loss += loss.item() * data.size(0)
      
        # calculate average loss over an epoch
        train_loss /= len(train_loader.sampler)
        valid_loss /= len(valid_loader.sampler)
        train_losses.append(train_loss)
        valid_losses.append(valid_loss)
      
        print('epoch: {} \ttraining Loss: {:.6f} \tvalidation Loss: {:.6f}'.format(epoch+1, train_loss, valid_loss))

        # save model if validation loss has decreased
        if valid_loss <= valid_loss_min:
            print('validation loss decreased ({:.6f} --> {:.6f}).  Saving model ...'.format(
            valid_loss_min,
            valid_loss))
            torch.save(model.state_dict(), 'model.pt')
            valid_loss_min = valid_loss
  
    print('Finished Training')
      
    return train_losses, valid_losses    

In [179]:
# Set number of epochs to train the model
n_epoch = 30

train_losses_1, valid_losses_1 = training(n_epoch, train_loader, val_loader, net, criterion, optimizer)

epoch: 1 	training Loss: 0.007269 	validation Loss: 0.840631
validation loss decreased (inf --> 0.840631).  Saving model ...
epoch: 2 	training Loss: 0.010548 	validation Loss: 0.831311
validation loss decreased (0.840631 --> 0.831311).  Saving model ...
epoch: 3 	training Loss: 0.009007 	validation Loss: 0.834387
epoch: 4 	training Loss: 0.006274 	validation Loss: 0.872081
epoch: 5 	training Loss: 0.005814 	validation Loss: 0.876442
epoch: 6 	training Loss: 0.006413 	validation Loss: 0.894185
epoch: 7 	training Loss: 0.007214 	validation Loss: 0.908008
epoch: 8 	training Loss: 0.005517 	validation Loss: 0.911897
epoch: 9 	training Loss: 0.006570 	validation Loss: 0.934281
epoch: 10 	training Loss: 0.004626 	validation Loss: 0.944269
epoch: 11 	training Loss: 0.003888 	validation Loss: 1.006526
epoch: 12 	training Loss: 0.006800 	validation Loss: 0.999785
epoch: 13 	training Loss: 0.007387 	validation Loss: 1.005184
epoch: 14 	training Loss: 0.005396 	validation Loss: 0.966720
epoch: 1

In [181]:
net.load_state_dict(torch.load('model.pt', map_location=device))

<All keys matched successfully>

In [182]:
def evaluation(model, test_loader, criterion, n_classes=4):

    # initialize trackers for test loss and per-class accuracy (one class per move)
    test_loss = 0.0
    class_correct = [0.0] * n_classes
    class_total = [0.0] * n_classes

    model.eval() # prep model for evaluation
    for data, label in test_loader:
        data = data.to(device=device, dtype=torch.float32)
        label = label.to(device=device, dtype=torch.long)
        with torch.no_grad():
            output = model(data) # forward pass: compute predicted outputs by passing inputs to the model
        loss = criterion(output, label)
        test_loss += loss.item()*data.size(0)
        _, pred = torch.max(output, 1) # convert output scores (logits) to predicted class
        correct = pred.eq(label) # compare predictions to true label
        # calculate test accuracy for each move class
        for i in range(len(label)):
            move = label[i].item()
            class_correct[move] += correct[i].item()
            class_total[move] += 1

    # calculate and print avg test loss
    test_loss = test_loss/len(test_loader.sampler)
    print('test Loss: {:.6f}\n'.format(test_loss))
    for i in range(n_classes):
        if class_total[i] > 0:
            print('test accuracy of %1s: %2d%% (%2d/%2d)' % (str(i), 100 * class_correct[i] / class_total[i], class_correct[i], class_total[i]))
        else:
            # avoid dividing by zero for a class with no test samples
            print('test accuracy of %1s: N/A (no test samples)' % str(i))
    print('\ntest accuracy (overall): %2.2f%% (%2d/%2d)' % (100. * np.sum(class_correct) / np.sum(class_total), np.sum(class_correct), np.sum(class_total)))

In [185]:
evaluation(net, test_loader, criterion)

test Loss: 0.765983

test accuracy of 0: 88% (3016/3415)
test accuracy of 1: 90% (3115/3426)
test accuracy of 2: 93% (3034/3259)
test accuracy of 3: 92% (3133/3372)

In [190]:
# Evaluate the model on the test data
net.eval()
with torch.no_grad():
    test_loss = 0
    test_acc = 0
    for inputs, targets in test_loader:
        # Move inputs and targets to device
        inputs, targets = inputs.to(device), targets.to(device)

        # Forward pass
        y_pred = net(inputs)

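        # Compute loss
        # Note: criterion returns the mean over the batch, so this accumulates per-batch means;
        # after the division by the number of samples below, the printed value is roughly
        # 1/batch_size of the per-sample average loss
        test_loss += criterion(y_pred, targets).item()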

        # Compute accuracy
        test_acc += (y_pred.argmax(dim=1) == targets).float().sum().item()

    # Normalize test loss and accuracy
    test_loss /= len(test_loader.dataset)
    test_acc /= len(test_loader.dataset)

# Print the test results
print('Test Loss: {:.4f}, Test Accuracy: {:.2f}%'.format(test_loss, test_acc * 100))

Test Loss: 0.0469, Test Accuracy: 91.11%
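
The loss printed by this cell is on a different scale from the one printed by `evaluation`, because the two cells normalize differently: `evaluation` weights each batch mean by the batch size before dividing by the number of samples, while the cell above divides the plain sum of batch means by the number of samples. The sketch below shows the difference on made-up per-sample losses.

```python
# Hypothetical per-sample losses for 5 samples, processed in batches of 2
sample_losses = [0.5, 1.5, 1.0, 3.0, 2.0]
batch_size = 2
batches = [sample_losses[i:i + batch_size] for i in range(0, len(sample_losses), batch_size)]
batch_means = [sum(b) / len(b) for b in batches]  # what a mean-reduced criterion returns per batch

# Weighted accumulation (as in evaluation): exact per-sample average
weighted = sum(m * len(b) for m, b in zip(batch_means, batches)) / len(sample_losses)

# Unweighted accumulation (as in the cell above): sum of batch means over the sample count
unweighted = sum(batch_means) / len(sample_losses)

print(weighted)    # 1.6
print(unweighted)  # 1.0
```

With the batch size of 20 used in this notebook, the unweighted value is roughly 20 times smaller than the per-sample average.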


In [189]:
# Train the model
num_epochs = 10
for epoch in range(num_epochs):
    train_loss = 0.0
    train_acc = 0.0
    val_loss = 0.0
    val_acc = 0.0

    # Training loop
    net.train()
    for i, (inputs, targets) in enumerate(train_loader):
        # Move inputs and targets to device
        inputs, targets = inputs.to(device), targets.to(device)

        # Forward pass
        y_pred = net(inputs)

        # Compute loss
        loss = criterion(y_pred, targets)

        # Zero gradients
        optimizer.zero_grad()

        # Backward pass
        loss.backward()

        # Update parameters
        optimizer.step()

        # Accumulate training loss and accuracy
        train_loss += loss.item() * inputs.size(0)
        train_acc += (y_pred.argmax(dim=1) == targets).sum().item()

    # Validation loop
    net.eval()
    with torch.no_grad():
        for inputs, targets in val_loader:
            # Move inputs and targets to device
            inputs, targets = inputs.to(device), targets.to(device)

            # Forward pass
            y_pred = net(inputs)

            # Compute loss
            loss = criterion(y_pred, targets)

            # Accumulate validation loss and accuracy
            val_loss += loss.item() * inputs.size(0)
            val_acc += (y_pred.argmax(dim=1) == targets).sum().item()

    # Normalize losses and accuracies
    train_loss /= len(train_loader.dataset)
    train_acc /= len(train_loader.dataset)
    val_loss /= len(val_loader.dataset)
    val_acc /= len(val_loader.dataset)

    # Print training and validation results for each epoch
    print('Epoch [{}/{}], Train Loss: {:.4f}, Train Acc: {:.2f}%, Val Loss: {:.4f}, Val Acc: {:.2f}%'
          .format(epoch+1, num_epochs, train_loss, train_acc * 100, val_loss, val_acc * 100))

Epoch [1/10], Train Loss: 0.0130, Train Acc: 99.56%, Val Loss: 0.8178, Val Acc: 90.64%
Epoch [2/10], Train Loss: 0.0082, Train Acc: 99.74%, Val Loss: 0.8219, Val Acc: 90.81%
Epoch [3/10], Train Loss: 0.0038, Train Acc: 99.88%, Val Loss: 0.8499, Val Acc: 90.78%
Epoch [4/10], Train Loss: 0.0062, Train Acc: 99.77%, Val Loss: 0.8766, Val Acc: 90.87%
Epoch [5/10], Train Loss: 0.0065, Train Acc: 99.80%, Val Loss: 0.9093, Val Acc: 90.76%
Epoch [6/10], Train Loss: 0.0048, Train Acc: 99.86%, Val Loss: 0.8945, Val Acc: 90.82%
Epoch [7/10], Train Loss: 0.0058, Train Acc: 99.83%, Val Loss: 0.9270, Val Acc: 91.10%
Epoch [8/10], Train Loss: 0.0062, Train Acc: 99.80%, Val Loss: 0.9456, Val Acc: 90.64%
Epoch [9/10], Train Loss: 0.0049, Train Acc: 99.84%, Val Loss: 0.9636, Val Acc: 90.78%
Epoch [10/10], Train Loss: 0.0044, Train Acc: 99.86%, Val Loss: 0.9734, Val Acc: 90.99%


**Training Remarks**

If the training accuracy is about 25%, the network predicts no better than chance (4 possible choices).

When you train a neural network, you have to analyze your results:

- If, after training, your training accuracy is far from 100%, your network is underfitting (high bias).
  - Try to train the network longer or tune the optimization (more epochs, a different learning rate or batch size).
  - Or define a bigger network (more hidden layers, bigger out_features).

- If your test accuracy is far from your training accuracy, your network is overfitting (high variance).
  - Try to regularize your optimization (look at L2 regularization, weight decay, dropout, early stopping...).
  - Try to use more data.
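
As one example of the regularization options listed above, early stopping can be sketched as a patience rule on the validation loss: stop once the loss has not improved for a given number of epochs and keep the best epoch. The helper `early_stopping_epoch` and the patience value are assumptions made for this sketch; the loss values are the first six validation losses from the training log earlier in this notebook.

```python
def early_stopping_epoch(valid_losses, patience):
    """Return (stop_epoch, best_epoch), both 1-based, for a patience-based rule (hypothetical helper)."""
    best_loss = float("inf")
    best_epoch = 0
    for epoch, loss in enumerate(valid_losses, start=1):
        if loss < best_loss:
            best_loss, best_epoch = loss, epoch
        elif epoch - best_epoch >= patience:
            return epoch, best_epoch  # no improvement for `patience` epochs in a row
    return len(valid_losses), best_epoch

# Validation losses of the first epochs from the training log
logged = [0.840631, 0.831311, 0.834387, 0.872081, 0.876442, 0.894185]
print(early_stopping_epoch(logged, patience=3))  # (5, 2)
```

On this log, training would stop at epoch 5 and keep the weights from epoch 2, which is also the last epoch at which the `training` function saved `model.pt`.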

In [192]:
net.state_dict()

OrderedDict([('input_layer.weight',
              tensor([[ 2.1269e-02,  2.4909e-01, -4.8094e-02,  ..., -4.1285e-01,
                       -2.8764e-01,  1.1323e-02],
                      [-1.3283e-02,  1.6664e-01, -1.3403e-01,  ..., -3.6669e-01,
                       -2.3599e-01,  8.2365e-04],
                      [ 1.5724e-02,  1.1396e-01,  1.6748e-02,  ...,  1.5410e-01,
                        2.1780e-01, -8.4343e-03],
                      ...,
                      [-1.8870e-02,  4.7227e-02, -4.5484e-02,  ..., -4.2070e-01,
                       -1.0073e-01, -1.3574e-02],
                      [ 1.1586e-02, -1.9704e-05,  4.4684e-01,  ..., -3.6990e-02,
                       -7.5150e-02,  5.5947e-03],
                      [ 2.4980e-02, -2.1176e-01, -2.2873e-02,  ..., -2.9665e-01,
                       -1.7624e-01, -7.6903e-03]], device='cuda:0')),
             ('input_layer.bias',
              tensor([-0.2572,  0.1904,  0.3292,  0.4441,  0.2741,  0.2929,  0.6503,  0.0469,
   

Command line to play a game between the greedy rat and the trained supervised player (controlling the python):
<pre>python pyrat.py -p 40 -x 20 -y 20 -d 0 -md 0 --rat AIs/manh.py --python D:/introduction-to-ai/session2/lab/supervised_playing/source/supervised_player.py --nonsymmetric</pre>