## Investigating what happens to regions of data when passed through neural networks and nonlinearities

This notebook visualizes what happens to points as they pass through the linear layers and nonlinearities of a neural network, one step at a time. It exists because `02-space_stretching.ipynb` only shows the effect of one linear transformation followed by one nonlinear transformation in 2D. Here you can see the result after each step of a full neural network with hidden layers in 3D. A section on 1D transformations is at the end of the notebook.

You can choose whether the input points form a 2-D Mesh Grid or a 2-D Gaussian Cloud, and whether the nonlinearity (NL) is `TanH` or `ReLU`.

The code runs the input through two networks:

1. A 2-layer network (Linear Layer, NL, Linear Layer)
2. A 3-layer network (Linear Layer, NL, Linear Layer, NL, Linear Layer)

You can also optionally initialize the weights of the neural network yourself. Initializing the weights at different scales lets you make visualizations similar to those in `02-space_stretching.ipynb`.
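
As a rough illustration of why the weight scale matters, the sketch below applies `tanh` after a scaled identity map to a Gaussian cloud and reports how many outputs land near the saturation limits of ±1. The scale values, the 0.99 threshold, and the helper name `saturated_fraction` are arbitrary choices made for this example, not values taken from the notebook.

```python
import torch

torch.manual_seed(0)
points = torch.randn(1000, 2)  # 2-D Gaussian cloud

def saturated_fraction(scale, threshold=0.99):
    """Fraction of tanh outputs whose magnitude exceeds `threshold`."""
    W = scale * torch.eye(2)        # scaled identity weight (illustrative)
    out = torch.tanh(points @ W.T)  # linear map followed by tanh
    return (out.abs() > threshold).float().mean().item()

fractions = {s: saturated_fraction(s) for s in (0.5, 1.0, 3.0, 10.0)}
for s, frac in fractions.items():
    print(f"scale={s:>4}: {frac:.1%} of outputs saturated")
```

Larger scales push more points into the flat regions of `tanh`, which is what makes the cloud collapse toward the faces and corners of a square (or cube) in the plots.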



### Select your choices below

Note: Mesh Grid with TanH, and Gaussian Cloud with TanH and the default manual weight initialization, give particularly nice visualizations.

### Input Type ('Gaussian Cloud' or 'Mesh Grid')

In [29]:
input_type = 'Gaussian Cloud'
# input_type = 'Mesh Grid'

In [30]:
assert input_type in ['Mesh Grid', 'Gaussian Cloud']

### Non Linearity Type ('relu' or 'tanh')

In [31]:
nl_type = 'tanh'
# nl_type = 'relu'

In [32]:
assert nl_type in ['relu', 'tanh']

### Weight Initialization Type ('manual' or 'random')

In [33]:
init_type = 'manual'
# init_type = 'random'

In [34]:
assert init_type in ['manual','random']

The code cell that initializes the weights of the first layer of the network appears further below. You can modify it to suit your needs and change the other layers as well. Refer to `Weight Initialization` below.

In [35]:
# Load libraries
import numpy as np
import torch
import torch.nn as nn
import matplotlib.pyplot as plt

In [36]:
def set_default(figsize=(12, 12)):
    plt.style.use(['dark_background', 'bmh'])
    plt.rc('axes', facecolor='k')
    plt.rc('figure', facecolor='k')
    plt.rc('figure', figsize=figsize)

def show_scatterplot(X, colors, title=''):
    colors = colors.cpu().numpy()
    X = X.cpu().numpy()
    plt.figure()
    plt.axis('equal')
    plt.scatter(X[:, 0], X[:, 1], c=colors, s=30)
    # plt.grid(True)
    plt.title(title)
    plt.axis()

device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
set_default()
%matplotlib widget

## 3D

### 3-D plotting libraries installation instructions

Install ipympl and the Jupyter extensions by following the instructions at https://github.com/matplotlib/jupyter-matplotlib#installatio
The commands are given below for your convenience. \
**Note**: These installations have to be done on top of the dl-minicourse conda environment.

In [37]:
# !conda install -c conda-forge ipympl

In [38]:
# # If using the Notebook
# !conda install -c conda-forge widgetsnbextension

In [39]:
# # If using JupyterLab
# !conda install nodejs
# !jupyter labextension install @jupyter-widgets/jupyterlab-manager
# !jupyter labextension install jupyter-matplotlib

In [40]:
from mpl_toolkits.mplot3d import Axes3D
def show_scatterplot_3D(X, colors, title=''):
    colors = colors.cpu().numpy()
    X = X.cpu().numpy()
    fig = plt.figure()
    ax = fig.add_subplot(111, projection='3d')
    ax.set_xlabel('x')
    ax.set_ylabel('y')
    ax.set_zlabel('z')
    #ax.axis('equal')
    ax.scatter(X[:, 0], X[:, 1], X[:, 2], c=colors, s=30)
    #plt.show()
    # plt.grid(True)
    plt.title(title)
    #ax.axis('off')

In [41]:
if input_type == 'Mesh Grid':
    # Regular grid covering [-5, 5) x [-5, 5) with step 0.1
    x = np.arange(-5, 5, 0.1)
    y = np.arange(-5, 5, 0.1)
    xx, yy = np.meshgrid(x, y, indexing='ij')
    # One (x, y) pair per row, in the same x-major order as before
    inputs = torch.from_numpy(np.stack([xx.ravel(), yy.ravel()], axis=1))
elif input_type == 'Gaussian Cloud':    
    inputs = torch.randn(1000,2)

In [42]:
colors=inputs[:,0]

In [43]:
if nl_type == 'relu':
    NL = nn.ReLU()
elif nl_type == 'tanh':
    NL = nn.Tanh()

### There are 6 network models defined below. 
1. Only the first linear layer (2D -> 3D)
2. The first linear (2D -> 3D) and non linear layer
3. Linear (2D -> 3D), NL and Linear layer (3D -> 2D)
4. Linear (2D -> 3D), NL and Hidden layer (3D -> 3D)
5. Linear (2D -> 3D), NL, Hidden (3D -> 3D), NL layer
6. Linear (2D -> 3D), NL, Hidden (3D -> 3D), NL and Linear layer (3D -> 2D)

All 6 networks are given the same first-layer weight in the `Weight Initialization` section below. You can change this weight manually if you want.
The 3 networks with a hidden layer are also given the same hidden-layer weight. You can change this manually too.
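
Five of the six models (all but number 3) are prefixes of the longest one, so an alternative is to build a single `nn.Sequential` and slice it. Slices reuse the same layer objects, so the weights agree automatically. This is only a sketch of that alternative, not the approach the notebook takes, and the names `full` and `stages` are made up for the example.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
n_data, n_hidden = 2, 3

full = nn.Sequential(
    nn.Linear(n_data, n_hidden, bias=False),    # 2D -> 3D
    nn.Tanh(),
    nn.Linear(n_hidden, n_hidden, bias=False),  # 3D -> 3D
    nn.Tanh(),
    nn.Linear(n_hidden, n_data, bias=False),    # 3D -> 2D
)

x = torch.randn(5, n_data)

# full[:k] is itself an nn.Sequential made of the first k modules,
# sharing its parameters with `full`.
with torch.no_grad():
    stages = [full[:k](x) for k in range(1, len(full) + 1)]

for k, out in enumerate(stages, start=1):
    print(f"after {k} module(s): shape {tuple(out.shape)}")
```

Defining the models separately, as the notebook does, keeps each one explicit at the cost of copying weights between them by hand.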


In [44]:
n_data = 2
n_hidden = 3
bias = False


linear_model = nn.Sequential(
        nn.Linear(n_data, n_hidden, bias=bias)
)

linear_model.to(device)

non_linear_model = nn.Sequential(
        nn.Linear(n_data, n_hidden, bias=bias),
        NL
)

non_linear_model.to(device)


full_model = nn.Sequential(
        nn.Linear(n_data, n_hidden, bias=bias),
        NL,
        nn.Linear(n_hidden,n_data, bias=bias),
)

full_model.to(device)


model_extra_hidden_layer = nn.Sequential(
        nn.Linear(n_data, n_hidden, bias=bias),
        NL,
        nn.Linear(n_hidden,n_hidden, bias=bias)
)

model_extra_hidden_layer.to(device)

model_extra_hidden_layer_nl = nn.Sequential(
        nn.Linear(n_data, n_hidden, bias=bias),
        NL,
        nn.Linear(n_hidden,n_hidden, bias=bias),
        NL
)

model_extra_hidden_layer_nl.to(device)


full_model_extra_hidden_layer = nn.Sequential(
        nn.Linear(n_data, n_hidden, bias=bias),
        NL,
        nn.Linear(n_hidden,n_hidden, bias=bias),
        NL,
        nn.Linear(n_hidden,n_data, bias=bias),
)

full_model_extra_hidden_layer.to(device)

Sequential(
  (0): Linear(in_features=2, out_features=3, bias=False)
  (1): Tanh()
  (2): Linear(in_features=3, out_features=3, bias=False)
  (3): Tanh()
  (4): Linear(in_features=3, out_features=2, bias=False)
)

### Weight Initialization

In [45]:
S = 3
if n_hidden == 2:
    first_layer_initial_weights = S*torch.eye(2)  
else:
    if n_hidden == 4:
        additional_weight_matrix = torch.ones(2,2)
    elif n_hidden == 3:
        additional_weight_matrix = torch.ones(1,2)
    first_layer_initial_weights = S*torch.cat((torch.eye(2),2*additional_weight_matrix),0)

In [46]:
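if init_type == 'manual':
    W1 = first_layer_initial_weights
    # Also set the linear-only model, so that all 6 networks share W1
    linear_model[0].weight.data.copy_(W1)
elif init_type == 'random':
    W1 = linear_model[0].weight.data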
non_linear_model[0].weight.data.copy_(W1)
full_model[0].weight.data.copy_(W1)
model_extra_hidden_layer[0].weight.data.copy_(W1)
model_extra_hidden_layer_nl[0].weight.data.copy_(W1)
full_model_extra_hidden_layer[0].weight.data.copy_(W1)

W3 = full_model[2].weight.data

full_model_extra_hidden_layer[4].weight.data.copy_(W3)

W2 = model_extra_hidden_layer[2].weight.data
model_extra_hidden_layer_nl[2].weight.data.copy_(W2)
full_model_extra_hidden_layer[2].weight.data.copy_(W2)

print('Weights of the first linear layer')
print(W1)
print('Weights of the last linear layer')
print(W3)
print('Weights of the hidden linear layer')
print(W2)

Weights of the first linear layer
tensor([[3., 0.],
        [0., 3.],
        [6., 6.]])
Weights of the last linear layer
tensor([[ 0.1345, -0.5229,  0.5245],
        [ 0.2360,  0.4346, -0.5221]])
Weights of the hidden linear layer
tensor([[ 0.3411,  0.0557,  0.3757],
        [-0.2658, -0.1838, -0.3370],
        [-0.4643,  0.2287, -0.4700]])
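
The hidden and last layers above keep their default random initialization. If you want to set one of them by hand as well, the same `copy_` pattern applies. The sketch below uses a rotation about the z-axis as the hidden weight purely as an example; the angle, the name `W2_manual`, and the stand-in `model` are assumptions introduced here, not part of the notebook.

```python
import math
import torch
import torch.nn as nn

# Stand-in for `model_extra_hidden_layer`: Linear(2 -> 3), NL, Linear(3 -> 3)
model = nn.Sequential(
    nn.Linear(2, 3, bias=False),
    nn.Tanh(),
    nn.Linear(3, 3, bias=False),
)

theta = math.pi / 4  # illustrative rotation angle
W2_manual = torch.tensor([
    [math.cos(theta), -math.sin(theta), 0.0],
    [math.sin(theta),  math.cos(theta), 0.0],
    [0.0,              0.0,             1.0],
])

# copy_ writes the values in place, so the shape must match
# the layer's (out_features, in_features) = (3, 3)
with torch.no_grad():
    model[2].weight.copy_(W2_manual)

print(model[2].weight)
```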


In [47]:
inputs = inputs.double().to(device)  # match the models' dtype and device
linear_model=linear_model.double()
non_linear_model=non_linear_model.double()
full_model=full_model.double()
model_extra_hidden_layer = model_extra_hidden_layer.double()
model_extra_hidden_layer_nl = model_extra_hidden_layer_nl.double()
full_model_extra_hidden_layer = full_model_extra_hidden_layer.double()

In [48]:
outputs_1 = linear_model(inputs)
outputs_2 = non_linear_model(inputs)
outputs_3 = full_model(inputs)
outputs_4 = model_extra_hidden_layer(inputs)
outputs_5 = model_extra_hidden_layer_nl(inputs)
outputs_6 = full_model_extra_hidden_layer(inputs)

### Step-by-step visualization of what happens to the input after each layer
The color scheme stays constant throughout, which makes it easier to follow what happens to regions of points.

In [49]:
print('Input Data')
show_scatterplot(inputs.data,colors = colors, title='Input')

print('Weights of first linear layer')
print(W1)
show_scatterplot_3D(outputs_1.data,colors = colors, title='After Linear Layer (2D -> 3D)')


print('Weights of first linear layer')
print(W1)
print('Non Linearity:')
print(NL)
show_scatterplot_3D(outputs_2.data,colors = colors, title='After Linear Layer (2D -> 3D) + NL')


print('Weights of first linear layer')
print(W1)
print('Non Linearity:')
print(NL)
print('Weights of last linear layer')
print(W3)
show_scatterplot(outputs_3.data,colors = colors, title='After Linear Layer (2D -> 3D) + NL + Linear Layer (3D -> 2D)')


print('Weights of first linear layer')
print(W1)
print('Non Linearity:')
print(NL)
print('Weights of hidden linear layer')
print(W2)
show_scatterplot_3D(outputs_4.data,colors = colors, title='After Linear Layer (2D -> 3D) + NL + Hidden Layer (3D -> 3D)')


print('Weights of first linear layer')
print(W1)
print('Non Linearity:')
print(NL)
print('Weights of hidden linear layer')
print(W2)
print('Non Linearity:')
print(NL)
show_scatterplot_3D(outputs_5.data,colors = colors, title='After Linear Layer (2D -> 3D) + NL + Hidden Layer (3D -> 3D) + NL')


print('Weights of first linear layer')
print(W1)
print('Non Linearity:')
print(NL)
print('Weights of hidden linear layer')
print(W2)
print('Non Linearity:')
print(NL)
print('Weights of last linear layer')
print(W3)
show_scatterplot(outputs_6.data,colors = colors, title='After Linear Layer (2D -> 3D) + NL + Hidden Layer (3D -> 3D) + NL + Linear Layer (3D -> 2D)')

Input Data


[Figure: 2-D scatter plot, "Input"]

Weights of first linear layer
tensor([[3., 0.],
        [0., 3.],
        [6., 6.]])


[Figure: 3-D scatter plot, "After Linear Layer (2D -> 3D)"]

Weights of first linear layer
tensor([[3., 0.],
        [0., 3.],
        [6., 6.]])
Non Linearity:
Tanh()


[Figure: 3-D scatter plot, "After Linear Layer (2D -> 3D) + NL"]

Weights of first linear layer
tensor([[3., 0.],
        [0., 3.],
        [6., 6.]])
Non Linearity:
Tanh()
Weights of last linear layer
tensor([[ 0.1345, -0.5229,  0.5245],
        [ 0.2360,  0.4346, -0.5221]])


[Figure: 2-D scatter plot, "After Linear Layer (2D -> 3D) + NL + Linear Layer (3D -> 2D)"]

Weights of first linear layer
tensor([[3., 0.],
        [0., 3.],
        [6., 6.]])
Non Linearity:
Tanh()
Weights of hidden linear layer
tensor([[ 0.3411,  0.0557,  0.3757],
        [-0.2658, -0.1838, -0.3370],
        [-0.4643,  0.2287, -0.4700]])


[Figure: 3-D scatter plot, "After Linear Layer (2D -> 3D) + NL + Hidden Layer (3D -> 3D)"]

Weights of first linear layer
tensor([[3., 0.],
        [0., 3.],
        [6., 6.]])
Non Linearity:
Tanh()
Weights of hidden linear layer
tensor([[ 0.3411,  0.0557,  0.3757],
        [-0.2658, -0.1838, -0.3370],
        [-0.4643,  0.2287, -0.4700]])
Non Linearity:
Tanh()


[Figure: 3-D scatter plot, "After Linear Layer (2D -> 3D) + NL + Hidden Layer (3D -> 3D) + NL"]

Weights of first linear layer
tensor([[3., 0.],
        [0., 3.],
        [6., 6.]])
Non Linearity:
Tanh()
Weights of hidden linear layer
tensor([[ 0.3411,  0.0557,  0.3757],
        [-0.2658, -0.1838, -0.3370],
        [-0.4643,  0.2287, -0.4700]])
Non Linearity:
Tanh()
Weights of last linear layer
tensor([[ 0.1345, -0.5229,  0.5245],
        [ 0.2360,  0.4346, -0.5221]])


[Figure: 2-D scatter plot, "After Linear Layer (2D -> 3D) + NL + Hidden Layer (3D -> 3D) + NL + Linear Layer (3D -> 2D)"]

## 1D

In [22]:
n=50
if input_type == 'Mesh Grid':
    inputs = torch.linspace(-1,1,n)
elif input_type == 'Gaussian Cloud':    
    inputs = torch.randn(n,1)
inputs = inputs.view(-1, 1).to(device)
# Convert 1-D inputs (x) -> (x, 0) so they can be shown as a scatter plot
inputs_in_2d = torch.cat((inputs, torch.zeros_like(inputs)), dim=1)

### There are 3 network models defined below. 
1. Only the first linear layer
2. The first linear and non linear layer
3. Linear, NL and Linear layer

All 3 networks are given the same first-layer weight in the cell that follows the model definitions. You can change this weight manually if you want.

The full network converts the 1-D data into 2-D, passes it through the nonlinearity, and converts it back to 1-D.
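
Written out, the full network computes `W_last @ NL(W_first @ x)` for each scalar input `x`, where `W_first` has shape (2, 1) and `W_last` has shape (1, 2). The sketch below checks this against a stand-in `nn.Sequential` with ReLU; the names `net`, `W_first`, and `W_last` are introduced only for this example.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
net = nn.Sequential(
    nn.Linear(1, 2, bias=False),  # 1-D -> 2-D
    nn.ReLU(),
    nn.Linear(2, 1, bias=False),  # 2-D -> 1-D
)

x = torch.linspace(-1, 1, 5).view(-1, 1)  # five scalar inputs, shape (5, 1)
W_first = net[0].weight.detach()          # shape (2, 1)
W_last = net[2].weight.detach()           # shape (1, 2)

# nn.Linear stores weights as (out_features, in_features), hence the transposes
manual = torch.relu(x @ W_first.T) @ W_last.T
with torch.no_grad():
    from_net = net(x)

print(torch.allclose(manual, from_net))
```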

In [23]:
n_data = 1
n_hidden = 2
bias = False
if nl_type == 'relu':
    NL = nn.ReLU()
elif nl_type == 'tanh':
    NL = nn.Tanh()

linear_model = nn.Sequential(
        nn.Linear(n_data, n_hidden, bias=bias)
)

linear_model.to(device)

non_linear_model = nn.Sequential(
        nn.Linear(n_data, n_hidden, bias=bias),
        NL
)

non_linear_model.to(device)

full_model = nn.Sequential(
        nn.Linear(n_data, n_hidden, bias=bias),
        NL,
        nn.Linear(n_hidden,n_data, bias=bias)
)

full_model.to(device)

Sequential(
  (0): Linear(in_features=1, out_features=2, bias=False)
  (1): ReLU()
  (2): Linear(in_features=2, out_features=1, bias=False)
)

In [24]:
W = linear_model[0].weight.data
print('Weights of the first linear layer')
print(W)
non_linear_model[0].weight.data.copy_(W)
full_model[0].weight.data.copy_(W)
print('Weights of the last linear layer')
print(full_model[2].weight.data)

Weights of the first linear layer
tensor([[-0.8563],
        [ 0.9086]])
Weights of the last linear layer
tensor([[ 0.1840, -0.2933]])


In [25]:
outputs_1 = linear_model(inputs)
outputs_2 = non_linear_model(inputs)
outputs_3 = full_model(inputs)
# Convert 1-D outputs (y) -> (y, 0) so they can be shown as a scatter plot
outputs_3_in_2d = torch.cat((outputs_3.data, torch.zeros_like(outputs_3.data)), dim=1)

In [26]:
colors = torch.cat((torch.zeros(n // 2), torch.ones(n - n // 2)))  # first half vs. second half of the inputs

### Step-by-step visualization of what happens to the input after each layer
The color scheme stays constant throughout, which makes it easier to follow what happens to regions of points.

In [27]:
print('Input Data')
show_scatterplot(inputs_in_2d.data,colors=colors)

print('Weights of the first linear layer')
print(W)
show_scatterplot(outputs_1.data,colors=colors)

print('Weights of the first linear layer')
print(W)
print('Non Linearity')
print(NL)
show_scatterplot(outputs_2.data,colors=colors)


print('Weights of the first linear layer')
print(W)
print('Non Linearity')
print(NL)
print('Weights of the last linear layer')
print(full_model[2].weight.data)
show_scatterplot(outputs_3_in_2d,colors=colors)

Input Data


[Figure: scatter plot of the 1-D inputs embedded as (x, 0)]

Weights of the first linear layer
tensor([[-0.8563],
        [ 0.9086]])


[Figure: 2-D scatter plot after the first linear layer]

Weights of the first linear layer
tensor([[-0.8563],
        [ 0.9086]])
Non Linearity
ReLU()


[Figure: 2-D scatter plot after the first linear layer and the nonlinearity]

Weights of the first linear layer
tensor([[-0.8563],
        [ 0.9086]])
Non Linearity
ReLU()
Weights of the last linear layer
tensor([[ 0.1840, -0.2933]])


[Figure: scatter plot of the full network's 1-D outputs embedded as (y, 0)]

In [28]:
# from celluloid import Camera
# import time
# anim_data_np = [inputs_in_2d.data.numpy(),outputs_1.data.numpy(),outputs_2.data.numpy(),outputs_3_in_2d.data.numpy()]
# anim_data = [inputs_in_2d.data,outputs_1.data,outputs_2.data,outputs_3_in_2d.data]
# fig = plt.figure()
# camera = Camera(fig)
# for i in range(len(anim_data)):
#     plt.scatter(anim_data[i][:, 0], anim_data[i][:, 1], c=colors.numpy(), s=30)
#     #plt.plot(anim_data[i][:,0],anim_data[i][:,1])
#     camera.snap()
#     time.sleep(20)
    
# animation = camera.animate()
# animation.save('celluloid_minimal.gif', writer = 'imagemagick')