# MNIST with Tuning Neurons

This notebook studies the effect of tuning neurons on the MNIST task. The results should be compared with other kinds of ANNs already known in the industry.

**The main points to compare are**:
 - Performance (errors, accuracy, other metrics)
 - Learning Speed (in cycles)
 - Results of BOTH types of networks at initialization, with a final linear decoder
 - Operation complexity (for example, the number of operations each network needs to reach a meaningful representation)
 - A random set of Tuning Neuron Ensembles (TNEs) with a final linear decoder
 - Several ideas for attention mechanisms on the TNEs, to generate different Deep TNEs Networks (DTNENs) and compare the results

*NOTE*: A TensorFlow implementation of TNEs might be needed for this task, as it could take a long time on the CPU.
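
As a rough illustration of the "random TNEs with a final linear decoder" comparison, here is a minimal NumPy sketch. It is not the notebook's TensorFlow code: the two-class synthetic data stands in for MNIST, and every name in it (`tne_features`, `decoder`, and so on) is hypothetical. The ensemble is sampled once and never trained; only the least-squares decoder is fitted.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic two-class data standing in for MNIST (assumption, illustration only)
n_per_class, n_inputs, n_neurons = 100, 2, 50
X = np.vstack([rng.normal(-0.5, 0.15, (n_per_class, n_inputs)),
               rng.normal(+0.5, 0.15, (n_per_class, n_inputs))])
labels = np.repeat([0, 1], n_per_class)

# Fixed random ensemble: sampled ONCE and never trained
W = rng.uniform(0.0, 1.0, (n_inputs, n_neurons))   # synaptic weights
x1 = rng.uniform(-1.0, 1.0, n_neurons)             # input where the response is 0
x2 = rng.choice([-1.0, 1.0], n_neurons)            # sets the sign of the slope
y2 = rng.uniform(0.5, 1.5, n_neurons)              # response at x2
a = (0.0 - y2) / (x1 - x2)                         # slope of each tuning curve
b = -a * x1                                        # intercept, so a*x1 + b == 0
sat = rng.uniform(0.8, 1.0, n_neurons)             # saturation level

def tne_features(inputs):
    """Clipped linear tuning curves applied to the weighted input."""
    current = inputs @ W
    return np.clip(a * current + b, 0.0, sat)

# Linear decoder: least squares onto +-1 targets, with a bias column
F = np.hstack([tne_features(X), np.ones((len(X), 1))])
targets = 2.0 * labels - 1.0
decoder, *_ = np.linalg.lstsq(F, targets, rcond=None)
pred = (F @ decoder > 0).astype(int)
train_accuracy = float(np.mean(pred == labels))
print("training accuracy:", train_accuracy)
```

The same recipe could be applied to flattened MNIST images, with the baseline network's initial features passed through an identical decoder for the comparison.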

In [1]:
import tensorflow as tf
import matplotlib.pyplot as plt
import random

from sklearn import linear_model
from sklearn.model_selection import train_test_split

import numpy as np
import pandas as pd

%matplotlib inline

In [2]:
from tensorflow.examples.tutorials.mnist import input_data
mnist = input_data.read_data_sets("MNIST_data/", one_hot=True)

Extracting MNIST_data/train-images-idx3-ubyte.gz
Extracting MNIST_data/train-labels-idx1-ubyte.gz
Extracting MNIST_data/t10k-images-idx3-ubyte.gz
Extracting MNIST_data/t10k-labels-idx1-ubyte.gz


## Tuning Neurons

In [3]:
# Workaround for a TensorFlow issue, reference here: https://github.com/tensorflow/tensorflow/issues/6698
# The session needs this config: allow_growth allocates GPU memory on demand
config = tf.ConfigProto()
config.gpu_options.allow_growth = True
sess = tf.InteractiveSession(config=config)

### Definitions and neuron creation

In [4]:
def weight_variable(shape):
  initial = tf.truncated_normal(shape, stddev=0.1)
  return tf.Variable(initial)

def bias_variable(shape):
  initial = tf.constant(0.1, shape=shape)
  return tf.Variable(initial)

def conv2d(x, W):
  return tf.nn.conv2d(x, W, strides=[1, 1, 1, 1], padding='SAME')

def max_pool_2x2(x):
  return tf.nn.max_pool(x, ksize=[1, 2, 2, 1],
                        strides=[1, 2, 2, 1], padding='SAME')

In [5]:
# def create_tuning_neuron_2d(shape, min_y=0.5,
#                                max_y=1.5, min_weight=0.0, max_weight=0.8,
#                                min_x=-1.0, max_x=1.0, saturation=None,
#                               ):
#     height, width, input_channels, output_channels = shape #NHWC
#     n_neurons = n_synapses = width * height
#     # the first point is for y=0
#     nshape = [height, width] #, output_channels]
#     y1 = tf.zeros((n_neurons, ))
#     x1 = tf.random_uniform((n_neurons, ), -1, 1)
#     # the second point is for x = +-1
#     y2 = tf.random_uniform((n_neurons, ),min_y, max_y)
#     vec = np.random.choice([-1,1], (n_neurons, ))
#     x2 = tf.convert_to_tensor(vec, tf.float32) #will define a's sign
#     #a = tf.Variable( tf.divide(tf.subtract(y1,y2), tf.subtract(x1,x2)))
#     a = tf.divide(tf.subtract(y1,y2), tf.subtract(x1,x2))
#     A = a
#     #A = tf.stack([a for i in range(output_channels)])#, axis=1)
#     b = tf.subtract(y1, (a - x1))
#     B = b
#     #b = tf.Variable(tf.subtract(y1, (a - x1)))
#     #B = tf.stack([b for i in range(output_channels)])#, axis=1)
#     W = tf.Variable(tf.random_uniform([n_synapses,n_neurons]))
#     if(saturation is None):
#         saturation = [random.uniform(0.8, 1.0),1.0]
#     sat = tf.random_uniform((n_neurons, ), *saturation)
#     SAT = sat
#     return (A, B, SAT, W)

In [10]:
def create_tuning_neuron_2d(shape, min_y=0.5,
                               max_y=1.5, min_weight=0.0, max_weight=0.8,
                               min_x=-1.0, max_x=1.0, saturation=None,
                              ):
    height, width, input_channels, output_channels = shape # [H, W, C_in, C_out]
    n_neurons = n_synapses = width * height
    # the first point is for y=0
    y1 = tf.zeros((n_neurons, ))
    x1 = tf.random_uniform((n_neurons, ), min_x, max_x)
    # the second point is for x = +-1
    y2 = tf.random_uniform((n_neurons, ), min_y, max_y)
    vec = np.random.choice([-1,1], (n_neurons, ))
    x2 = tf.convert_to_tensor(vec, tf.float32) #will define a's sign
    #a = tf.Variable( tf.divide(tf.subtract(y1,y2), tf.subtract(x1,x2)))
    a = tf.divide(tf.subtract(y1,y2), tf.subtract(x1,x2))
    A = a
    #A = tf.stack([a for i in range(output_channels)])#, axis=1)
    # intercept of the line through (x1, y1): b = y1 - a*x1
    b = tf.subtract(y1, tf.multiply(a, x1))
    #b = tf.Variable(tf.subtract(y1, tf.multiply(a, x1)))
    #B = tf.stack([b for i in range(output_channels)])#, axis=1)
    B = b
    #W = tf.Variable(tf.random_uniform([n_synapses,n_neurons]))
    # NOTE: as a plain random op (not a Variable), W is re-sampled on every sess.run()
    W = tf.random_uniform([n_synapses,n_neurons])

    if(saturation is None):
        saturation = [random.uniform(0.8, 1.0),1.0]
    sat = tf.random_uniform((n_neurons, ), *saturation)
    SAT = sat
    #SAT = tf.stack([sat for i in range(output_channels)])#, axis=1)
    #outputs placeholder
    return (A,B,SAT,W)
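
Written out, the construction above amounts to a straight line through the two points $(x_1, y_1)$ and $(x_2, y_2)$, clipped below at zero and above at the saturation level. The symbols $u$ (the weighted input, `xcurrent` when the neuron is evaluated) and $s$ (`SAT`) are introduced only for this summary:

$$
a = \frac{y_1 - y_2}{x_1 - x_2}, \qquad b = y_1 - a\,x_1, \qquad y(u) = \max\bigl(0,\ \min(a\,u + b,\ s)\bigr)
$$

With $y_1 = 0$ the line crosses zero at $u = x_1$, and since $x_2 = \pm 1$ with $y_2 > 0$, the sign of $x_2$ sets the sign of the slope $a$.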

In [11]:
def evaluate_tuning_neuron(shape, A,B,SAT,W,x):
    height, width, input_channels, output_channels = shape # [H, W, C_in, C_out]
    # x arrives as NHWC: [batch, height, width, output_channels]
    # reshaping seems to be the only way to express this in TensorFlow:
    # flatten the spatial axes and move the channels in front of them -> [N, C, H*W]
    xresh = tf.transpose(tf.reshape(x, [-1, height*width, output_channels ]),perm=[0,2,1])
    # one row per (sample, channel) map, so a plain 2-D matmul with W works
    xresh2 = tf.reshape(xresh, [-1, height*width])
    xcurrent = tf.matmul(xresh2, W)
    #print(A.shape, B.shape, W.shape, SAT.shape, x.shape, xcurrent.shape, xresh.shape)
    # piecewise-linear tuning curve: a*x + b clipped to [0, SAT]
    yb = tf.maximum(tf.minimum(tf.multiply(xcurrent, A) + B, SAT), 0 )
    # undo the channel move to get back to NHWC
    yresh = tf.transpose(tf.reshape(yb, [-1, output_channels, height*width]),perm=[0,2,1])
    y = tf.reshape(yresh, [-1, height, width , output_channels])
    return y
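
Because `x` arrives in NHWC layout, reshaping it directly to `[-1, height*width]` would not give one row per channel map; the channel axis has to be moved in front of the spatial axes first. A small NumPy sketch with toy sizes (not MNIST, and with hypothetical variable names) shows the difference:

```python
import numpy as np

n, h, w, c = 2, 3, 3, 4
x_nhwc = np.arange(n * h * w * c, dtype=np.float32).reshape(n, h, w, c)

# Direct reshape: rows are consecutive memory chunks and mix channels
naive = x_nhwc.reshape(-1, h * w)

# Channel-first reshape: each row is the full h*w map of one (sample, channel)
per_channel = x_nhwc.transpose(0, 3, 1, 2).reshape(-1, h * w)

expected_row = x_nhwc[0, :, :, 1].ravel()   # sample 0, channel 1
print(np.array_equal(per_channel[1], expected_row))   # True
print(np.array_equal(naive[1], expected_row))         # False

# Undoing the channel move restores the original NHWC tensor
restored = per_channel.reshape(n, c, h, w).transpose(0, 2, 3, 1)
print(np.array_equal(restored, x_nhwc))               # True
```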

In [12]:
tx = tf.placeholder(tf.float32, shape=[None, 784])
ty_ = tf.placeholder(tf.float32, shape=[None, 10])
batch_size = 100

In [13]:
#patch [height, width, input channels, output channels]
tW_conv1 = weight_variable([5, 5, 1, 32])
tb_conv1 = bias_variable([32]) #one bias per output channel
#4d tensor [ N# batch inputs, height, width, n#channels]
tx_image = tf.reshape(tx, [-1, 28, 28, 1])

thconv1 = conv2d(tx_image, tW_conv1) + tb_conv1
#thconv1 = tf.Print(thconv1, [thconv1.shape])
shape1 = [28, 28, 1, 32 ]
A1,B1,SAT1,W1 = create_tuning_neuron_2d(shape1, saturation=[0.8,10])
th_conv1 = evaluate_tuning_neuron(shape1, A1,B1,SAT1,W1, thconv1)
# th_conv1 = tf.Print(th_conv1, [th_conv1.shape])
th_pool1 = max_pool_2x2(th_conv1)

# th_conv1 = tf.nn.relu(conv2d(tx_image, tW_conv1) + tb_conv1)
# th_pool1 = max_pool_2x2(th_conv1)
# th_pool1 = tf.Print(th_pool1, [th_pool1.shape])
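
For the operation-complexity comparison, a back-of-the-envelope count of multiply-accumulate operations (MACs) per image for this first layer can be sketched as follows. The helper names are hypothetical, the convolution count includes padded positions, and the tuning-neuron count covers only the `[784, 784]` matmul applied to each of the 32 channel maps, not the element-wise clipping.

```python
def conv2d_macs(height, width, k_h, k_w, c_in, c_out):
    """MACs per image for a stride-1, SAME-padded convolution."""
    return height * width * k_h * k_w * c_in * c_out

def dense_tne_macs(height, width, channels):
    """MACs per image for an (H*W x H*W) matmul applied to every channel map."""
    n = height * width
    return channels * n * n

conv1_macs = conv2d_macs(28, 28, 5, 5, 1, 32)
tne1_macs = dense_tne_macs(28, 28, 32)
print(conv1_macs)               # 627200
print(tne1_macs)                # 19668992
print(tne1_macs / conv1_macs)   # 31.36
```

Under these assumptions the dense tuning-neuron stage costs roughly 31 times as many MACs as the convolution that feeds it.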

In [14]:
th_pool1

<tf.Tensor 'MaxPool:0' shape=(?, 14, 14, 32) dtype=float32>

In [15]:
tW_conv2 = weight_variable([5, 5, 32, 64])
tb_conv2 = bias_variable([64])

# thconv2 = conv2d(th_pool1, tW_conv2) + tb_conv2
# (a, b, sat, W, x, th_conv2, y_ )= create_tuning_neuron_2d([14, 14, 1, 64], thconv2,
#                                                    saturation=[0.8,10])
# th_pool2 = max_pool_2x2(th_conv2)

# shape2 = [14, 14, 1, 64 ]
# A2,B2,SAT2,W2 = create_tuning_neuron_2d(shape2, saturation=[0.8,10])
# th_conv2 = evaluate_tuning_neuron(shape2, A2,B2,SAT2,W2, thconv2)
# th_conv2 = tf.Print(th_conv1, [th_conv1.shape])
# th_pool2 = max_pool_2x2(th_conv2)

th_conv2 = tf.nn.relu(conv2d(th_pool1, tW_conv2) + tb_conv2)
th_pool2 = max_pool_2x2(th_conv2)


In [16]:
th_pool2

<tf.Tensor 'MaxPool_1:0' shape=(?, 7, 7, 64) dtype=float32>

In [17]:
#Densely connected layer
tW_fc1 = weight_variable([7 * 7 * 64, 1024])
tb_fc1 = bias_variable([1024])

th_pool2_flat = tf.reshape(th_pool2, [-1, 7*7*64])

In [18]:
# HERE is where I have to swap in my Tuning Neurons

# thconv1 = conv2d(tx_image, tW_conv1) + tb_conv1
# (a, b, sat, W, x, th_conv1, y_ )= create_tuning_neuron_2d([28, 28, 1, 32 ], 
#                                                           thconv1, saturation=[0.8,10])
# th_pool1 = max_pool_2x2(th_conv1)

th_fc1 = tf.nn.relu(tf.matmul(th_pool2_flat, tW_fc1) + tb_fc1)

#Dropout

tkeep_prob = tf.placeholder(tf.float32)
th_fc1_drop = tf.nn.dropout(th_fc1, tkeep_prob)

#Readout Layer

tW_fc2 = weight_variable([1024, 10])
tb_fc2 = bias_variable([10])

ty_conv = tf.matmul(th_fc1_drop, tW_fc2) + tb_fc2

In [19]:
cross_entropy = tf.reduce_mean(
    tf.nn.softmax_cross_entropy_with_logits(labels=ty_, logits=ty_conv))
train_step = tf.train.AdamOptimizer(1e-4).minimize(cross_entropy)
correct_prediction = tf.equal(tf.argmax(ty_conv, 1), tf.argmax(ty_, 1))
accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))
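
As a sanity check on what the loss and accuracy above compute, here is a NumPy sketch of the same two quantities. The function names are hypothetical and chosen so they do not shadow the notebook's tensors. With uniform logits the loss equals log(10), about 2.3026, which is the chance-level value for ten classes.

```python
import numpy as np

def np_softmax_cross_entropy(logits, one_hot_labels):
    """Mean cross-entropy between softmax(logits) and one-hot labels."""
    shifted = logits - logits.max(axis=1, keepdims=True)   # numerical stability
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    return float(-(one_hot_labels * log_probs).sum(axis=1).mean())

def np_accuracy(logits, one_hot_labels):
    """Fraction of rows whose arg-max logit matches the arg-max label."""
    return float(np.mean(logits.argmax(axis=1) == one_hot_labels.argmax(axis=1)))

demo_logits = np.zeros((4, 10))            # an "uninformed" model
demo_labels = np.eye(10)[[3, 1, 4, 1]]     # one-hot labels for classes 3, 1, 4, 1
chance_loss = np_softmax_cross_entropy(demo_logits, demo_labels)
chance_acc = np_accuracy(demo_logits, demo_labels)
print(chance_loss)   # log(10), about 2.3026
print(chance_acc)    # 0.0, since argmax of all-zero logits is class 0
```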

In [20]:
# NOTE: this redefines train_step, replacing the Adam optimizer above with plain gradient descent
cross_entropy = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(labels=ty_, logits=ty_conv))
train_step = tf.train.GradientDescentOptimizer(0.01).minimize(cross_entropy)

# sess.run(tf.global_variables_initializer())

# for _ in range(1000):
#   batch = mnist.train.next_batch(batch_size)
#   train_step.run(feed_dict={tx: batch[0], ty_: batch[1]})

In [21]:
%%time 

sess.run(tf.global_variables_initializer())
for i in range(2000):
    batch = mnist.train.next_batch(50)
    if i > 0 and i % 100 == 0:
        # accuracy on the current training batch only, with dropout disabled
        train_accuracy = accuracy.eval(feed_dict={
            tx: batch[0], ty_: batch[1], tkeep_prob: 1.0})
        print('step %d, training accuracy %g' % (i, train_accuracy))
        if train_accuracy > 0.999:
            break
    train_step.run(feed_dict={tx: batch[0], ty_: batch[1], tkeep_prob: 0.5})

step 100, training accuracy 0.04
step 200, training accuracy 0.16
step 300, training accuracy 0.08
step 400, training accuracy 0.1
step 500, training accuracy 0.1
step 600, training accuracy 0.14
step 700, training accuracy 0.1
step 800, training accuracy 0.08
step 900, training accuracy 0.12
step 1000, training accuracy 0.1
step 1100, training accuracy 0.14
step 1200, training accuracy 0.12
step 1300, training accuracy 0.08
step 1400, training accuracy 0.08
step 1500, training accuracy 0.2
step 1600, training accuracy 0.12
step 1700, training accuracy 0.14
step 1800, training accuracy 0.12
step 1900, training accuracy 0.12
CPU times: user 9.98 s, sys: 1.55 s, total: 11.5 s
Wall time: 11.3 s


The results look completely random, as if a NEW encoding were being created at every step.

I really don't understand it yet.
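
One possible explanation, offered as a hypothesis rather than a confirmed diagnosis: in TensorFlow 1.x graph mode, an op such as `tf.random_uniform` draws fresh values on every `sess.run` unless its result is stored in a `tf.Variable`, and `create_tuning_neuron_2d` returns such ops directly. The NumPy analogy below (hypothetical names, not the notebook's code) shows why a re-sampled encoding gives the downstream layers nothing stable to learn from:

```python
import numpy as np

rng = np.random.default_rng(0)
probe = np.ones((1, 4))

def resampled_encoding(inputs):
    """Analogue of a bare random op: new weights on every call."""
    weights = rng.uniform(size=(4, 4))
    return inputs @ weights

fixed_weights = rng.uniform(size=(4, 4))   # analogue of a Variable or constant

def fixed_encoding(inputs):
    """Weights sampled once, so the same input always maps to the same code."""
    return inputs @ fixed_weights

same_resampled = np.allclose(resampled_encoding(probe), resampled_encoding(probe))
same_fixed = np.allclose(fixed_encoding(probe), fixed_encoding(probe))
print(same_resampled)   # False
print(same_fixed)       # True
```

In the notebook, the analogous change would be to sample the parameters once, for example by wrapping them in `tf.Variable(..., trainable=False)` or by drawing them with NumPy before building the graph.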

In [None]:

print('test accuracy %g' % accuracy.eval(feed_dict={
    tx: mnist.test.images, ty_: mnist.test.labels, tkeep_prob: 1.0}))


There are still issues with the dimensions:

    InvalidArgumentError: In[0].dim(0) and In[1].dim(0) must be the same: [10000,32,784] vs [1,784,784]
         [[Node: MatMul_3 = BatchMatMul[T=DT_FLOAT, adj_x=false, adj_y=false, _device="/job:localhost/replica:0/task:0/gpu:0"](transpose, Reshape_9)]]
         [[Node: Mean_3/_7 = _Recv[client_terminated=false, recv_device="/job:localhost/replica:0/task:0/cpu:0", send_device="/job:localhost/replica:0/task:0/gpu:0", send_device_incarnation=1, tensor_name="edge_239_Mean_3", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/cpu:0"]()]]

    During handling of the above exception, another exception occurred:
    
After this I must check HOW to set up the training correctly...

In [None]:
vals = sess.run(tf.trainable_variables())

In [None]:
for k,v in zip([v.name for v in tf.trainable_variables()], vals):
    print ("var: "+k+" = "+str(v.shape))