#"Will It Learn?" - Word Jumble
Will a simple convolutional network learn to unscramble a word? Let's make this hard and give it only pixels.

###First we will create some functions to draw a scrambled word:

In [0]:
#This is a code cell. Click on the cell to make it active. 
#Then click the arrow-in-a-circle button in the left corner to run it. Or use the keyboard shortcut ctrl+enter.
import numpy as np
from PIL import Image, ImageDraw
import random
import matplotlib.pyplot as plt
from matplotlib.animation import FuncAnimation
%matplotlib inline

#We will render each word as text into a small RGB image
word = "jumble"
img_shape = (24,60,3)

def img_word(img_shape, word):
  w = img_shape[1]
  h = img_shape[0]
  img = Image.new('RGB', size=(w, h))
  draw = ImageDraw.Draw(img)
  draw.text((10, h/2 - 5), word)
  
  return img
    
img = img_word(img_shape, word)
plt.axis('off')
plt.imshow(img)
plt.show()

#This should show an image of the word jumble

##Next we will create some routines to scramble words

In [0]:
from random import shuffle, randint
import numpy as np

def scramble(word):
  #Shuffle the letters in place and join them back into a string
  letters = list(word)
  shuffle(letters)
  return ''.join(letters)

six_letter_words = ['jumble', 'puzzle', 'lizard', 'wonder', 'object', 'zombie', 'quiver', 'jungle', 'pickup', 'chunky', 'aliens', 'amazon', 'animal',
                   'barely', 'burley', 'bobble', 'wildly', 'wicked', 'buying', 'cactus', 'chatty', 'charms', 'charge', 'champs', 'change', 'chewed',
                   'dangle', 'decide', 'deluxe', 'dazzle', 'defeat', 'deduce']

print('We have', len(six_letter_words), 'six letter words')

def get_data(words, image_shape):
  iW = randint(0, len(words)-1)
  word = words[iW]
  scrambled = scramble(word)
  img = img_word(image_shape, scrambled)
  label = np.zeros(len(words))
  label[iW] = 1.0
  return np.array(img) / 255.0, label, word

img, label, word = get_data(six_letter_words, img_shape)
plt.axis('off')
plt.imshow(img)
plt.show()
print(word)

##What do you think?
Will this somewhat simple network accomplish the task of learning which of the 32 words was chosen and then scrambled, given only the pixels?


Let's find out...

First let's make sure you are using the GPU. Run the cell below to double check:

In [0]:
import tensorflow as tf
print(tf.__version__)
tf.test.gpu_device_name()
#If you see something like '/device:GPU:0', GPU-accelerated training is enabled. If not, the CPU will work, just slower.

##Our Neural Network

Here's what our network will look like. Pretty straightforward: 4 layers of convolutions, 64 filters per layer, a 3x3 kernel and relu activation in each.

We will give it the rendered image of the scrambled word. We will ask it to activate one output neuron per word in our list to represent which word it thinks was scrambled.
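
To get a feel for the size of such a network, the layer arithmetic can be worked out by hand. The sketch below assumes Keras defaults for `Conv2D` (`padding="valid"`, one bias per filter), a 24 x 60 x 3 input, and a 100-unit dense layer followed by one output per word (32 here). The helper names are illustrative and not part of the notebook.

```python
def conv_valid(h, w, c_in, filters, k=3):
    """Output shape and parameter count of a k x k, stride-1, unpadded convolution."""
    params = k * k * c_in * filters + filters  # kernel weights + one bias per filter
    return (h - k + 1, w - k + 1, filters), params

def dense(n_in, n_out):
    """Parameter count of a fully connected layer with biases."""
    return n_in * n_out + n_out

h, w, c = 24, 60, 3          # input image: height, width, channels
total = 0
for _ in range(4):           # four Conv2D layers with 64 filters each
    (h, w, c), p = conv_valid(h, w, c, 64)
    total += p

flat = h * w * c             # size after Flatten
total += dense(flat, 100)    # hidden dense layer
total += dense(100, 32)      # softmax over the assumed 32 words

print("conv output:", (h, w, c), "flattened:", flat, "total params:", total)
```

Under these assumptions nearly all of the weights sit in the first dense layer, because the unpadded convolutions only trim the image from 24 x 60 to 16 x 52 before flattening.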


In [0]:
#Let's define a pretty straightforward conv neural network. It will have about 5.4M params, most of them in the first Dense layer, given the input image dimension of 24 x 60
import keras
from keras.layers import Flatten, Conv2D, Dense

num_words = len(six_letter_words)
img = keras.Input(shape=img_shape)

x = img
x = Conv2D(64, (3, 3), strides=(1,1), activation="relu")(x)
x = Conv2D(64, (3, 3), strides=(1,1), activation="relu")(x)
x = Conv2D(64, (3, 3), strides=(1,1), activation="relu")(x)
x = Conv2D(64, (3, 3), strides=(1,1), activation="relu")(x)
x = Flatten()(x)
x = Dense(100, activation='relu')(x)
x = Dense(num_words, activation='softmax')(x)

model = keras.Model(img, x)
model.compile(optimizer="adam", loss="categorical_crossentropy", metrics=["acc"])
model.summary()  #summary() prints the table itself; wrapping it in print() would also print "None"

###Now let's start the training:

In [0]:
# Here's a function that will generate numpy arrays of images and labels for our NN to train on
def generator(img_shape, batch_size, words):
  while True:
    X = []
    y = []
    for i in range(batch_size):
      img, label, word = get_data(words, img_shape)
      X.append(img)
      y.append(label)
    yield np.array(X), np.array(y)
    
    
#Now let's start the training.
batch_size = 64
steps_per_epoch = 128
epochs = 2

train_gen = generator(img_shape, batch_size, six_letter_words)

#Recent Keras versions removed fit_generator; model.fit accepts generators directly.
model.fit(train_gen,
          steps_per_epoch=steps_per_epoch,
          epochs=epochs,
          verbose=1)


In [0]:
#Now let's create a new image and see how it does.
#Run this code cell multiple times to get a feeling about how it handles different cases.

img, label, word = get_data(six_letter_words, img_shape)
res = model.predict(img[None, :, :, :])
print("raw prediction: ", res[0])
iPred = np.argmax(res[0])
print("Predicts:", six_letter_words[iPred], "confidence", res[0][iPred])
print("Answer:", word)

plt.axis('off')
plt.imshow(img)
plt.show()

#So What's Your Verdict?
###Do you think this learned to unscramble the letters?

I don't know why, but I thought this would be a challenge. The fact that I struggle to unscramble these words only underscores my amazement at how quickly and effortlessly it appears to have mastered this. 


###Note: 
This wasn't working until I normalized the image channels to the range 0-1.
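
In this notebook that normalization is the division by 255.0 at the end of `get_data`. A tiny illustration of its effect on 8-bit pixel values follows; the sample array is made up for the example.

```python
import numpy as np

# Hypothetical row of 8-bit pixel values, as PIL would produce for one channel
raw = np.array([[0, 64, 128, 255]], dtype=np.uint8)

# Dividing by 255.0 promotes to float and maps 0..255 onto 0.0..1.0
scaled = raw / 255.0

print(scaled.dtype, scaled.min(), scaled.max())
```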

#Further thoughts...

How would this do against a much larger dataset of words?
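
One thing a larger word list changes is that true anagrams start to appear, and scrambling erases the difference between them. A rough way to check a list is to use the sorted letters as a signature. This assumes lowercase words; the function name and the small example list are made up for illustration.

```python
from collections import defaultdict

def anagram_groups(words):
    """Group words that share the same multiset of letters."""
    groups = defaultdict(list)
    for w in words:
        signature = "".join(sorted(w))  # identical for every scramble of w
        groups[signature].append(w)
    # Only groups with more than one word are ambiguous after scrambling
    return [g for g in groups.values() if len(g) > 1]

example_words = ["listen", "silent", "enlist", "jumble", "puzzle"]
print(anagram_groups(example_words))
```

Any group this returns is a set of labels that cannot be told apart from the scrambled image alone.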

###What tests could you devise to disambiguate whether it had simply 'memorized' the scrambled words?
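
One input to that question is how many distinct scrambles exist at all. Here is a sketch that counts the orderings of a word with repeated letters collapsed; the helper name is illustrative.

```python
from collections import Counter
from math import factorial

def distinct_scrambles(word):
    """Number of distinct orderings of the letters in word."""
    count = factorial(len(word))
    for repeats in Counter(word).values():
        count //= factorial(repeats)  # swapping identical letters changes nothing
    return count

for w in ["jumble", "puzzle", "bobble"]:
    print(w, distinct_scrambles(w))
```

A six-letter word has at most 720 orderings, while the training run above draws 64 x 128 x 2 = 16,384 samples, so a sizeable share of all possible scrambles is likely to appear during training. Holding some orderings of each word out of training and evaluating only on those is one way to probe this.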

Could we restructure this to output a 6x27 array with one one-hot row for each letter? Would that make it easier or harder?
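
Here is a sketch of what that target could look like. It assumes the 27 classes are the 26 lowercase letters plus one spare slot, used here as padding for shorter words. The notebook does not say what the 27th class is, so that part is a guess, and all names are illustrative.

```python
ALPHABET = "abcdefghijklmnopqrstuvwxyz"
PAD = len(ALPHABET)  # index 26: assumed meaning of the 27th class

def encode_word(word, length=6):
    """Return a length x 27 nested list with one 1.0 per row."""
    rows = []
    for i in range(length):
        row = [0.0] * (len(ALPHABET) + 1)
        idx = ALPHABET.index(word[i]) if i < len(word) else PAD
        row[idx] = 1.0
        rows.append(row)
    return rows

def decode_word(rows):
    """Take the most active class in each row and drop padding."""
    letters = []
    for row in rows:
        idx = row.index(max(row))
        if idx != PAD:
            letters.append(ALPHABET[idx])
    return "".join(letters)

target = encode_word("jumble")
print(len(target), "x", len(target[0]), "->", decode_word(target))
```

With this layout the network would likely need a softmax per letter position rather than a single softmax over the word list.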

I decided to try. Take a look [here at part 2](https://colab.research.google.com/drive/1qjtyWfpHIn45yOrmDwNIC68p8SaqsP29) if you are curious.

#Episode Links:

[Will It Learn? - S01E01 Circle Count](https://drive.google.com/open?id=11EiFFa-imh5MNEPJZuqgqJAwLYHhP3gG)

[Will It Learn? - S01E02 Tic Tac Toe](https://drive.google.com/open?id=1PKosDR9wcgPaF2-BYMSZiu2nW03COxma)

[Will It Learn? - S01E03 : Shell Game](https://drive.google.com/open?id=163iv-LaidgxiU3tT_RcLCT_K1HOdagMu)

[Will It Learn? - S01E04 : Word Jumble](https://drive.google.com/open?id=19ENSHOC-TEyDqZ-_47QhSHHxhUAuDEoA)

[Will It Learn? - S01E05 : Mazes](https://drive.google.com/open?id=1qdYWNwrmYAtFsayzoxPuuGAE1RTKt1ia)

