# Demonstration of TensorFlow: FizzBuzz

This notebook demonstrates how to solve the FizzBuzz problem using a neural network. Big thanks to Joel Grus for the TF code. This notebook is paired with the presentation, 'Deep Learning Explained for Developers'.

### Import necessary libraries

In [3]:
import numpy as np
import tensorflow as tf

### Step 1: How will we represent our data?
To generate some sample data, we'll convert each number into a binary vector. This gives the neural network a fixed-width numeric input that it can work with directly.

**Input**
A binary representation of a number, least significant bit first (the order that binary_encode below produces). For example, with four digits: [1, 0, 1, 0] <<-- the number '5'
  
**Output** 
The probabilities of: 
  a. The number itself.
  b. 'Fizz'
  c. 'Buzz'
  d. 'FizzBuzz'
  
For example: [0.1, 0.5, 0.1, 0.3]

In [5]:
def binary_encode(i, num_digits):
    # Bit d of i for d = 0 .. num_digits - 1 (least significant bit first)
    return np.array([i >> d & 1 for d in range(num_digits)])

def fizz_buzz_encode(i):
    # One-hot label in the order [number, fizz, buzz, fizzbuzz]
    if   i % 15 == 0: return np.array([0, 0, 0, 1])
    elif i % 5  == 0: return np.array([0, 0, 1, 0])
    elif i % 3  == 0: return np.array([0, 1, 0, 0])
    else:             return np.array([1, 0, 0, 0])
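
To make the encoding concrete, here is a small check of what the two helpers produce. The two functions are copied from the cell above; binary_decode is a hypothetical helper added only for this illustration and is not part of the original notebook.

```python
import numpy as np

# Copied from the cell above so this snippet runs on its own.
def binary_encode(i, num_digits):
    return np.array([i >> d & 1 for d in range(num_digits)])

def fizz_buzz_encode(i):
    if   i % 15 == 0: return np.array([0, 0, 0, 1])
    elif i % 5  == 0: return np.array([0, 0, 1, 0])
    elif i % 3  == 0: return np.array([0, 1, 0, 0])
    else:             return np.array([1, 0, 0, 0])

# Hypothetical helper (an assumption for illustration): invert binary_encode.
def binary_decode(bits):
    return sum(int(b) << d for d, b in enumerate(bits))

print(binary_encode(5, 4))                  # [1 0 1 0]  (least significant bit first)
print(binary_decode(binary_encode(5, 4)))   # 5
print(fizz_buzz_encode(30))                 # [0 0 0 1]  ('FizzBuzz')
```

Because the bits are listed least significant first, the vector reads "backwards" compared with the usual way of writing binary numbers. The network does not care about the order, as long as it is the same for every input.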

### Step 2: Generate Sample Data!

Note: We need all the data (training samples and otherwise) to be in the same format, so we need to decide how many binary digits we'll consider. Also, we don't want to train on the numbers we'll predict later (1 to 100), so the training data starts at 101.

In [7]:
NUM_DIGITS = 10
trX = np.array([binary_encode(i, NUM_DIGITS) for i in range(101, 2 ** NUM_DIGITS)])
trY = np.array([fizz_buzz_encode(i)          for i in range(101, 2 ** NUM_DIGITS)])
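
It can help to check how much training data this produces and how the four classes are distributed. The sketch below recomputes the labels in plain Python, assuming the same range of 101 to 2 ** NUM_DIGITS - 1; label_of is a hypothetical helper that mirrors fizz_buzz_encode and is not part of the original notebook.

```python
from collections import Counter

NUM_DIGITS = 10

# Hypothetical helper: the class name for a number, mirroring fizz_buzz_encode.
def label_of(i):
    if i % 15 == 0:
        return "fizzbuzz"
    if i % 5 == 0:
        return "buzz"
    if i % 3 == 0:
        return "fizz"
    return "number"

training_numbers = range(101, 2 ** NUM_DIGITS)
counts = Counter(label_of(i) for i in training_numbers)

print(len(training_numbers))  # 923 training examples
print(counts)
```

The classes are unbalanced: 493 of the 923 examples are plain numbers, 246 are 'Fizz', 122 are 'Buzz', and only 62 are 'FizzBuzz'. A model that always answers with the number itself would already be right about 53% of the time on this data, which is worth keeping in mind when reading accuracy figures.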

### Time to set up our model! 

1. We choose a number of hidden units and layers to use. 
2. We define the input and output formats, and make some choices on how to 'activate' each layer.

In [10]:
NUM_HIDDEN = 100
# An input variable with width NUM_DIGITS digits, and output variable with length 4
X = tf.placeholder("float", [None, NUM_DIGITS])
Y = tf.placeholder("float", [None, 4])

The input and output placeholders are defined above (X with width NUM_DIGITS, Y with width 4). Next, we initialize the weights and define the model:

In [12]:
# Initialize weights randomly around 0.
def init_weights(shape):
    return tf.Variable(tf.random_normal(shape, stddev=0.01))

# Weights for our hidden layer and output layer
# w_h shape: [NUM_DIGITS, NUM_HIDDEN] = [10, 100]; w_o shape: [NUM_HIDDEN, 4] = [100, 4]
w_h = init_weights([NUM_DIGITS, NUM_HIDDEN])
w_o = init_weights([NUM_HIDDEN, 4])

# Our model has one hidden layer, and one output layer
def model(X, w_h, w_o):
    h = tf.nn.relu(tf.matmul(X, w_h))
    return tf.matmul(h, w_o)
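
To see what the model computes without running TensorFlow, here is a minimal NumPy sketch of the same forward pass. It assumes the same layer sizes as above; the random weights, the fake input batch, and the name forward are illustrative only.

```python
import numpy as np

NUM_DIGITS, NUM_HIDDEN, NUM_CLASSES = 10, 100, 4
rng = np.random.default_rng(0)

# Random weights with the same shapes as w_h and w_o (values are illustrative).
w_h = rng.normal(scale=0.01, size=(NUM_DIGITS, NUM_HIDDEN))
w_o = rng.normal(scale=0.01, size=(NUM_HIDDEN, NUM_CLASSES))

def forward(x, w_h, w_o):
    h = np.maximum(x @ w_h, 0.0)   # ReLU hidden layer: [batch, 100]
    return h @ w_o                 # raw output scores (logits): [batch, 4]

# Three fake binary inputs, just to check the shapes.
batch = rng.integers(0, 2, size=(3, NUM_DIGITS)).astype(float)
logits = forward(batch, w_h, w_o)
print(logits.shape)  # (3, 4)
```

Note that the model returns raw scores rather than probabilities. The softmax that turns them into probabilities is applied inside the cost function.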

### Let the learning begin!

Now that we have our model, we can watch it "learn" from our data, in other words, minimize its error. For the more technically inclined, we're using softmax cross-entropy as our cost function.

In [14]:
py_x = model(X, w_h, w_o)

cost = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(logits=py_x, labels=Y))
train_op = tf.train.GradientDescentOptimizer(0.05).minimize(cost)

# Our prediction is simply the index of the largest value in the output vector
predict_op = tf.argmax(py_x, 1)
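
For readers who want to see what the cost function does, the following is a rough NumPy sketch of softmax cross-entropy averaged over a batch. It is an illustration of the formula, not TensorFlow's implementation; the function names and the example values are made up for this snippet.

```python
import numpy as np

def softmax(logits):
    # Subtract the row max for numerical stability; the result is unchanged.
    shifted = logits - logits.max(axis=1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)

def softmax_cross_entropy(logits, labels):
    # Mean over the batch of -sum(labels * log(softmax(logits))).
    probs = softmax(logits)
    return float(np.mean(-np.sum(labels * np.log(probs), axis=1)))

# Illustrative values: two examples, four classes each.
logits = np.array([[2.0, 1.0, 0.0, 0.0],
                   [0.0, 0.0, 0.0, 0.0]])
labels = np.array([[1, 0, 0, 0],
                   [0, 0, 0, 1]])

print(softmax(logits).sum(axis=1))            # each row sums to 1
print(softmax_cross_entropy(logits, labels))  # lower is better
print(np.argmax(logits, axis=1))              # predicted class per row
```

Since softmax preserves the ordering of its inputs, taking the argmax of the raw scores gives the same prediction as taking the argmax of the probabilities.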

In [15]:
# Let's run the training
NUM_EPOCHS = 10000
BATCH_SIZE = 128

# Create the session; the data is reshuffled at the start of each epoch
sess = tf.Session()

with sess.as_default():
    tf.global_variables_initializer().run()
    
    for epoch in range(NUM_EPOCHS):
        # Shuffle the training data, keeping inputs and labels aligned
        p = np.random.permutation(range(len(trX)))
        trX, trY = trX[p], trY[p]
        
        # Grab batches of size BATCH_SIZE and do some training!
        for start in range(0, len(trX), BATCH_SIZE):
            end = start + BATCH_SIZE
            sess.run(train_op, feed_dict={X: trX[start:end], Y: trY[start:end]})
            
        accuracy = np.mean(np.argmax(trY, axis=1) ==
                           sess.run(predict_op, feed_dict={X: trX, Y: trY}))
        print(epoch, accuracy)
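
The shuffling and batching logic in the loop can be examined on its own. The sketch below assumes the 923 training examples produced earlier and only tracks batch sizes; it does no training.

```python
import numpy as np

BATCH_SIZE = 128
num_examples = 923  # len(range(101, 2 ** 10)), the training set size used here

# A random permutation of the indices. Applying the same permutation to
# inputs and labels keeps each input paired with its label.
p = np.random.permutation(num_examples)

batch_sizes = []
for start in range(0, num_examples, BATCH_SIZE):
    end = start + BATCH_SIZE
    batch_sizes.append(len(p[start:end]))  # slicing past the end is safe

print(batch_sizes)  # [128, 128, 128, 128, 128, 128, 128, 27]
```

Each epoch therefore performs eight gradient steps, the last one on a smaller batch of 27 examples.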

### Make Predictions on 1 to 100!

In [17]:
# Create a list of the numbers 1 to 100, binary encode them
numbers = np.arange(1, 101)
teX = np.transpose(binary_encode(numbers, NUM_DIGITS))

def fizz_buzz(i, prediction):
    return [str(i), "fizz", "buzz", "fizzbuzz"][prediction]
  
teY = sess.run(predict_op, feed_dict={X: teX})
output = np.vectorize(fizz_buzz)(numbers, teY)

for i in range(1, len(output) + 1):
  print(str(i) + ":  " + str(output[i-1]))
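
Reading 100 lines of output by eye is error-prone, so one option is to compare the predictions with the correct answers. The sketch below is one way to do that; fizz_buzz_truth and accuracy_on are hypothetical helpers, and the predictions here are stand-ins rather than real model output.

```python
# Hypothetical helper: the correct FizzBuzz output for a number.
def fizz_buzz_truth(i):
    if i % 15 == 0:
        return "fizzbuzz"
    if i % 5 == 0:
        return "buzz"
    if i % 3 == 0:
        return "fizz"
    return str(i)

# Hypothetical helper: fraction of predictions that match the true answers.
def accuracy_on(numbers, predicted):
    correct = sum(fizz_buzz_truth(n) == p for n, p in zip(numbers, predicted))
    return correct / len(predicted)

# Stand-in predictions for illustration only: a "model" that never says
# fizz or buzz. In the notebook, the values in output would go here instead.
numbers = list(range(1, 101))
predicted = [str(n) for n in numbers]

print(accuracy_on(numbers, predicted))  # 0.53
```

The 0.53 baseline shows that a model can look more than half right on 1 to 100 without learning anything about divisibility, so a useful result needs to be well above that.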