# MNIST with Tuning Neurons

This notebook studies the effect of tuning neurons on the MNIST task. The results should be compared with other kinds of ANNs already known in the industry.

**The main points to compare are**:
 - Performance (errors, accuracy, other metrics)
 - Learning speed (in cycles)
 - Results of both types of networks at initialization, each with a final linear decoder
 - Operation complexity (for example, the number of operations each network needs to reach a meaningful representation)
 - A random set of Tuning Neuron Ensembles (TNEs) with a final linear decoder
 - Several attention mechanisms for the TNEs, used to build different Deep TNE Networks (DTNENs), with a comparison of their results
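
The operation-complexity point can be made concrete by counting multiply-accumulate operations (MACs) per input image. The sketch below is only an illustration: the helper names `dense_macs` and `conv_macs` are hypothetical, the layer shapes are those of the two baselines defined later in this notebook, and biases, activations, and pooling are ignored.

```python
def dense_macs(n_in, n_out):
    """Multiply-accumulates for one fully connected layer on one input."""
    return n_in * n_out


def conv_macs(out_h, out_w, k_h, k_w, c_in, c_out):
    """Multiply-accumulates for one stride-1 convolution on one input."""
    return out_h * out_w * k_h * k_w * c_in * c_out


# Single-layer softmax baseline: 784 inputs -> 10 classes.
softmax_total = dense_macs(784, 10)

# Convolutional baseline; 'SAME' padding keeps 28x28 and 14x14 outputs.
cnn_layers = {
    "conv1": conv_macs(28, 28, 5, 5, 1, 32),
    "conv2": conv_macs(14, 14, 5, 5, 32, 64),
    "fc1": dense_macs(7 * 7 * 64, 1024),
    "fc2": dense_macs(1024, 10),
}
cnn_total = sum(cnn_layers.values())

print("softmax baseline MACs:", softmax_total)
print("CNN baseline MACs:", cnn_total)
```

The same two helpers could be applied to a tuning-neuron layer, whose `matmul` costs the same as a dense layer of equal size, so the comparison would mostly come down to how many neurons each approach needs.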

*NOTE*: Implementing the TNEs in TensorFlow might be needed for this task, since running them on the CPU might take a long time.

In [1]:
import tensorflow as tf
import matplotlib.pyplot as plt
import random

from sklearn import linear_model
from sklearn.model_selection import train_test_split

import numpy as np
import pandas as pd

%matplotlib inline

In [2]:
from tensorflow.examples.tutorials.mnist import input_data
mnist = input_data.read_data_sets("MNIST_data/", one_hot=True)

Extracting MNIST_data/train-images-idx3-ubyte.gz
Extracting MNIST_data/train-labels-idx1-ubyte.gz
Extracting MNIST_data/t10k-images-idx3-ubyte.gz
Extracting MNIST_data/t10k-labels-idx1-ubyte.gz


## Baseline - Single-Layer Network

MNIST from Google [TensorFlow examples](https://www.tensorflow.org/get_started/mnist/pros)

### Starting TensorFlow InteractiveSession

In [3]:
# Issue with TensorFlow, reference: https://github.com/tensorflow/tensorflow/issues/6698
# The session config below is needed as a workaround

import tensorflow as tf
config = tf.ConfigProto()
config.gpu_options.allow_growth = True
sess = tf.InteractiveSession(config=config)

### Softmax Regression Model

In [4]:
x = tf.placeholder(tf.float32, shape=[None, 784])
y_ = tf.placeholder(tf.float32, shape=[None, 10])

In [5]:
W = tf.Variable(tf.zeros([784,10]))
b = tf.Variable(tf.zeros([10]))


In [6]:
# Initialize the variables defined above; the session cannot use them before this step
sess.run(tf.global_variables_initializer())


In [7]:
#Defining regression model

y = tf.matmul(x,W) + b

cross_entropy = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(labels=y_, logits=y))
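
The loss above applies a softmax to the logits `y` and averages the cross-entropy against the one-hot labels `y_`. As a rough illustration of what that computes for a single example, here is a pure-Python sketch; the helper names are hypothetical and this is not how TensorFlow implements it internally.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    shift = max(logits)
    exps = [math.exp(v - shift) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def softmax_cross_entropy(logits, one_hot):
    """Cross-entropy between softmax(logits) and a one-hot label."""
    probs = softmax(logits)
    # Only the entries where the label is non-zero contribute.
    return -sum(t * math.log(p) for t, p in zip(one_hot, probs) if t > 0)


# With W and b initialised to zero, every logit is 0, so each class gets
# probability 1/10 and the loss is ln(10) regardless of the label.
zero_logits = [0.0] * 10
label = [0.0] * 10
label[3] = 1.0
initial_loss = softmax_cross_entropy(zero_logits, label)
print(initial_loss, math.log(10))
```

A starting loss near ln(10) is therefore a quick sanity check for this model before any training step has run.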

In [8]:
#Training the model

train_step = tf.train.GradientDescentOptimizer(0.5).minimize(cross_entropy)

for _ in range(1000):
  batch = mnist.train.next_batch(100)
  train_step.run(feed_dict={x: batch[0], y_: batch[1]})

In [9]:
#Evaluating the model

correct_prediction = tf.equal(tf.argmax(y,1), tf.argmax(y_,1))

accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))

print(accuracy.eval(feed_dict={x: mnist.test.images, y_: mnist.test.labels}))


0.9154


## Baseline Comparison - Multi-Layer Convolutional Network

#### Weight Initialization

In [10]:
def weight_variable(shape):
  initial = tf.truncated_normal(shape, stddev=0.1)
  return tf.Variable(initial)

def bias_variable(shape):
  initial = tf.constant(0.1, shape=shape)
  return tf.Variable(initial)

#### Convolution and Pooling

In [11]:
def conv2d(x, W):
  return tf.nn.conv2d(x, W, strides=[1, 1, 1, 1], padding='SAME')

def max_pool_2x2(x):
  return tf.nn.max_pool(x, ksize=[1, 2, 2, 1],
                        strides=[1, 2, 2, 1], padding='SAME')

In [12]:
# filter shape: [height, width, input channels, output channels]
W_conv1 = weight_variable([5, 5, 1, 32])
b_conv1 = bias_variable([32])  # one bias per output channel
# 4-D input tensor: [batch size, height, width, number of channels]
x_image = tf.reshape(x, [-1, 28, 28, 1])

#defining convolutional layers
h_conv1 = tf.nn.relu(conv2d(x_image, W_conv1) + b_conv1)
h_pool1 = max_pool_2x2(h_conv1)

W_conv2 = weight_variable([5, 5, 32, 64])
b_conv2 = bias_variable([64])

h_conv2 = tf.nn.relu(conv2d(h_pool1, W_conv2) + b_conv2)
h_pool2 = max_pool_2x2(h_conv2)

#Densely connected layer
W_fc1 = weight_variable([7 * 7 * 64, 1024])
b_fc1 = bias_variable([1024])

h_pool2_flat = tf.reshape(h_pool2, [-1, 7*7*64])
h_fc1 = tf.nn.relu(tf.matmul(h_pool2_flat, W_fc1) + b_fc1)

#Dropout

keep_prob = tf.placeholder(tf.float32)
h_fc1_drop = tf.nn.dropout(h_fc1, keep_prob)

#Readout Layer

W_fc2 = weight_variable([1024, 10])
b_fc2 = bias_variable([10])

y_conv = tf.matmul(h_fc1_drop, W_fc2) + b_fc2
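
The `7 * 7 * 64` input size of the densely connected layer follows from the padding and strides used above. With `'SAME'` padding, the spatial output size is the input size divided by the stride, rounded up. A small sketch of that arithmetic, where `same_output_size` is a hypothetical helper name:

```python
import math


def same_output_size(size, stride):
    """Spatial output size for TensorFlow-style 'SAME' padding."""
    return math.ceil(size / stride)


side = 28                          # MNIST images are 28x28
side = same_output_size(side, 1)   # conv1, stride 1: size unchanged
side = same_output_size(side, 2)   # 2x2 max-pool, stride 2: size halved
side = same_output_size(side, 1)   # conv2, stride 1: size unchanged
side = same_output_size(side, 2)   # 2x2 max-pool, stride 2: size halved

flat_features = side * side * 64   # 64 channels after conv2
print(side, flat_features)
```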

#### Train and Evaluate the Model

In [13]:
cross_entropy = tf.reduce_mean(
    tf.nn.softmax_cross_entropy_with_logits(labels=y_, logits=y_conv))
train_step = tf.train.AdamOptimizer(1e-4).minimize(cross_entropy)
correct_prediction = tf.equal(tf.argmax(y_conv, 1), tf.argmax(y_, 1))
accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))

In [14]:
%%time 

with tf.Session() as sess1:
  sess1.run(tf.global_variables_initializer())
  for i in range(20000):
    batch = mnist.train.next_batch(50)
    if i % 100 == 0:
      train_accuracy = accuracy.eval(feed_dict={
          x: batch[0], y_: batch[1], keep_prob: 1.0})
      print('step %d, training accuracy %g' % (i, train_accuracy))
    train_step.run(feed_dict={x: batch[0], y_: batch[1], keep_prob: 0.5})

  print('test accuracy %g' % accuracy.eval(feed_dict={
      x: mnist.test.images, y_: mnist.test.labels, keep_prob: 1.0}))

step 0, training accuracy 0.12
step 100, training accuracy 0.78
step 200, training accuracy 0.94
step 300, training accuracy 0.8
step 400, training accuracy 0.92
step 500, training accuracy 0.98
step 600, training accuracy 0.94
step 700, training accuracy 0.94
step 800, training accuracy 0.9
step 900, training accuracy 0.92
step 1000, training accuracy 0.94
step 1100, training accuracy 0.92
step 1200, training accuracy 0.98
step 1300, training accuracy 0.96
step 1400, training accuracy 0.94
step 1500, training accuracy 1
step 1600, training accuracy 0.98
step 1700, training accuracy 0.98
step 1800, training accuracy 0.98
step 1900, training accuracy 1
step 2000, training accuracy 0.96
step 2100, training accuracy 0.98
step 2200, training accuracy 0.94
step 2300, training accuracy 0.96
step 2400, training accuracy 0.96
step 2500, training accuracy 1
step 2600, training accuracy 1
step 2700, training accuracy 0.98
step 2800, training accuracy 0.98
step 2900, training accuracy 0.96
[output truncated]

## Tuning Neurons

### Definitions and neuron creation

In [15]:
def create_tuning_neuron_layer(n_neurons, n_synapses, min_y=0.5,
                               max_y=1.5, min_weight=0.0, max_weight=0.8,
                               min_x=-1.0, max_x=1.0, saturation=None,
                               attention_function=None  # future use, e.g. a Gaussian
                              ):
    """
    Creates a layer of neurons based on tuning curves.

    Each neuron applies a linear function, defined by two points in the (x, y)
    plane, to its weighted input and clips the result to [0, saturation].
    Choosing the two points controls where the line crosses the axes; the
    slope a and the intercept b are computed from them.

    Saturation gives the neuron a maximum spiking rate. The spiking rate is
    interpreted as a value in [0, 1], loosely like a capacitor integrating
    the spikes.

    min_weight, max_weight, min_x, max_x and attention_function are not used
    yet. Weights are meant to have a maximum value to avoid monopolising the
    next layer. Attention is a future mechanism that lets different neurons
    give more importance to certain groups of input weights than others,
    which should encourage sparsity; it will probably only be useful for
    high-dimensional inputs.
    """
    # first point: the zero crossing (y = 0)
    y1 = tf.zeros(n_neurons)
    x1 = tf.random_uniform((n_neurons, ), -1, 1)
    # second point: at x = +1 or x = -1
    y2 = tf.random_uniform((n_neurons, ), min_y, max_y)
    vec = np.random.choice([-1, 1], n_neurons)
    x2 = tf.convert_to_tensor(vec, tf.float32)  # determines the sign of a
    # slope of the line through (x1, y1) and (x2, y2); not trainable
    # NOTE: a and sat are built from random ops, so in TensorFlow 1.x graph
    # mode they are re-sampled on every sess.run call. Wrapping them in
    # tf.Variable(..., trainable=False) would keep them fixed.
    a = tf.divide(tf.subtract(y1, y2), tf.subtract(x1, x2))
    # intercept b = y1 - a * x1, so the line passes through (x1, y1); trainable
    b = tf.Variable(tf.subtract(y1, tf.multiply(a, x1)))
    W = tf.Variable(tf.random_uniform([n_synapses, n_neurons]))
    if saturation is None:
        saturation = [random.uniform(0.8, 1.0), 1.0]
    sat = tf.random_uniform((n_neurons,), *saturation)

    # inputs placeholder
    x = tf.placeholder(tf.float32, shape=[None, n_synapses])
    # outputs placeholder
    y_ = tf.placeholder(tf.float32, shape=[None, n_neurons])
    # layer output: linear tuning curve clipped to [0, sat]
    y = tf.maximum(tf.minimum(tf.multiply(tf.matmul(x, W), a) + b, sat), 0)

    return (a, b, sat, W, x, y, y_)
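
To make the geometry of a single tuning curve easier to check by hand, here is a scalar sketch in plain Python. It assumes the curve is the line through the zero crossing `(x1, 0)` and a second point `(x2, y2)`, clipped to `[0, sat]`; for that line the slope is `(0 - y2) / (x1 - x2)` and the intercept is `0 - a * x1`. The function name `tuning_curve` and the parameter values are hypothetical, and `u` stands for the weighted input that `tf.matmul(x, W)` produces for one neuron.

```python
def tuning_curve(x1, x2, y2, sat):
    """Return (a, b, f) with f(u) = clip(a*u + b, 0, sat).

    The unclipped line passes through (x1, 0) and (x2, y2).
    """
    a = (0.0 - y2) / (x1 - x2)   # slope from the two points
    b = 0.0 - a * x1             # intercept, so that f(x1) == 0

    def f(u):
        return max(min(a * u + b, sat), 0.0)

    return a, b, f


# Hypothetical parameters: zero crossing at 0.2, value 1.2 at x = 1, cap 0.9.
a, b, f = tuning_curve(x1=0.2, x2=1.0, y2=1.2, sat=0.9)
print("slope:", a, "intercept:", b)
print("at zero crossing:", f(0.2))
print("inside linear range:", f(0.5))
print("above the cap:", f(1.0))
print("below zero:", f(-1.0))
```

Inputs below the zero crossing give 0 and inputs past the cap give `sat`, so only the band in between carries a non-zero gradient with respect to the input.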
        
    
    

#### First Experiment

Recreate the simple softmax model from the beginning of the notebook, but with Tuning Neurons.

In [16]:
(a,b,sat,W,x,y,y_) = create_tuning_neuron_layer(10, 784, saturation = [0.8,10])

In [17]:
sess = tf.InteractiveSession(config=config)
sess.run(tf.global_variables_initializer())

In [26]:
cross_entropy = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(labels=y_, logits=y))
#Training the model

train_step = tf.train.GradientDescentOptimizer(0.5).minimize(cross_entropy)

for _ in range(1000):
  batch = mnist.train.next_batch(100)
  train_step.run(feed_dict={x: batch[0], y_: batch[1]})

In [27]:
#Evaluating the model

correct_prediction = tf.equal(tf.argmax(y,1), tf.argmax(y_,1))

accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))

print(accuracy.eval(feed_dict={x: mnist.test.images, y_: mnist.test.labels}))

0.5692


Basically, this result is poor: 0.5692 is far below the 0.9154 of the softmax baseline (a purely random choice among ten classes would give about 0.10).

It does not seem good.

Maybe I'm not using the right training function?

Maybe I'm not working with it correctly?

Maybe it is just useless for this part?

Maybe it is useful only for bigger models?


What it DOES look like is that the evaluation is slow.

Also, changing the SATURATION values improves the result by roughly 300 to 450%.
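
One possible contributing factor to the saturation effect, not verified in this notebook: the layer output is clipped to `[0, sat]` and then used directly as softmax logits. When every logit is confined to that range, the softmax can never assign more than a bounded probability to the correct class, so the cross-entropy has a floor that depends on the cap. The sketch below computes that bound; the function names are hypothetical.

```python
import math


def max_softmax_probability(sat, n_classes=10):
    """Largest softmax probability reachable when every logit lies in [0, sat]."""
    # Best case: the target logit sits at the cap and all others at 0.
    top = math.exp(sat)
    return top / (top + (n_classes - 1))


def min_cross_entropy(sat, n_classes=10):
    """Smallest per-example cross-entropy reachable under the same bound."""
    return -math.log(max_softmax_probability(sat, n_classes))


for cap in (1.0, 10.0):
    print(cap, max_softmax_probability(cap), min_cross_entropy(cap))
```

With a cap of 1.0 the best reachable probability for the correct class stays below 0.25, while a cap of 10 allows it to approach 1. This affects the loss and its gradients rather than the `argmax` used for accuracy, so it is at most a partial explanation.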

#### Second experiment

Recreate the structure of the deep model and check what happens.