# How I went from 0 to a business oriented neural network in 7 days #

My journey through my first tiny, proof-of-concept neural network, one that translates English “natural” language into an executable SQL query.

#### Objectives

1. For the business reader: get you curious about Machine Intelligence and show that the short-term benefits are tangible
2. For the IT/BI reader: encourage you to build a neural network for your organisation  


## Where it started ##

While learning more and more about machine learning, I finally looked into [TensorFlow](https://www.tensorflow.org), a state-of-the-art "open source software library for numerical computation using data flow graphs".  The tutorial on creating an English-to-French translation algorithm made me think that maybe I could apply this to another language I am fluent in: SQL.  SQL is arguably the most widely used language for querying relational databases.  
Even if you have never seen a simple SQL query, you could certainly understand it.  To get the *address* of a *store* named "*Downtown Montreal*", one could write:
>SELECT address FROM store WHERE name = "Downtown Montreal"

Easy, right?  It can get a lot more complicated, of course.  But you see why I, perhaps naively, think translation should be a piece of cake.
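
To make this concrete, here is a minimal sketch that runs the query against a throwaway in-memory SQLite database, using Python's built-in `sqlite3` module. The `store` table layout and the sample row are made up for illustration.

```python
import sqlite3

# Hypothetical one-table database, created in memory for illustration only
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE store (store_id INTEGER, name TEXT, address TEXT)")
conn.execute(
    "INSERT INTO store VALUES (?, ?, ?)",
    (1, "Downtown Montreal", "123 Example Street"),  # made-up row
)

# Same query as above; the store name is passed as a parameter
# instead of being written inside the SQL string
query = "SELECT address FROM store WHERE name = ?"
rows = conn.execute(query, ("Downtown Montreal",)).fetchall()
print(rows)
conn.close()
```

Each result row comes back as a tuple, so the address is the first element of the first row.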

**Need an intro to Neural Networks?** Look at this short [video](https://www.youtube.com/watch?v=P2HPcj8lRJE), from DeepLearning.TV.  Do not be afraid, learning stuff is a good thing.

## Where I was 7 days ago
* I was sitting on an old, but, as I found out recently, not completely forgotten bachelor’s degree in Mathematics
* I accumulated 20 years in Information Technology, mainly in the apparel industry
* I had about 6 months of data-science-oriented experience with Python and R
* I had read a couple of papers on neural networks and listened to a few lectures on *the Interwebs* in the past month  

So, I had a little basic understanding, but I was nowhere near a neural network expert!  (I am still nowhere near by the way!)


## Day 1 : On with the tutorial then!
>Every computer program starts with copy-paste.  

The idea is not to go over the tutorial step by step here, but to reassure my Business reader and my IT reader that, although this journey includes some detours, it has a solid destination. 

###### Which tutorial?
* I chose the [Sequence-to-Sequence Models](https://www.tensorflow.org/tutorials/seq2seq/) tutorial because I figured it was the closest to what I was trying to achieve. (See citation above.)

###### Installation and first run
* Bugs every step of the way, mainly due to version mismatches between Python, the packages and the scripts on GitHub.  Frustrating, but it is fairly easy to find support and answers in forums.  This is completely understandable as the project is under active development.  I would say that the TensorFlow team and community are doing a very good job of supporting this tutorial. 
* Don’t forget that even if the training starts, it might not be bug free yet!  Here is what I got:
![](resource/TrainEnFr.png)
* For reference, I am doing this on a high-end MacBook Pro and don’t expect to get a decent English-to-French translator with that kind of computing power… I pulled the proverbial plug on the training after 600 iterations (out of the recommended 340K!).  Just loading the 22M-observation dataset takes ages (though one can easily use a small subset for testing).
> At first glance, I was looking at around 1700 hours of training time, if I did not run out of memory first…
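
A quick back-of-the-envelope check of that estimate, assuming the 1700-hour figure covers all 340K recommended steps. The per-step time below is derived from those two numbers, not measured separately.

```python
TOTAL_STEPS = 340000      # recommended number of training steps
ESTIMATED_HOURS = 1700    # rough estimate quoted above
STEPS_DONE = 600          # steps completed before the training was stopped

# Time per step implied by the estimate (assumes a constant pace)
seconds_per_step = ESTIMATED_HOURS * 3600 / TOTAL_STEPS

# Time the completed steps would have taken at that pace
hours_spent = STEPS_DONE * seconds_per_step / 3600

print("seconds per step:", seconds_per_step)
print("hours for the first", STEPS_DONE, "steps:", hours_spent)
```

Under that assumption, each step takes on the order of tens of seconds on a laptop, which is why a smaller dataset or more computing power matters so much.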

**First goal achieved**: reproducing the neural network locally. Awesome.  Now let's try to customize it into a SQL translation machine.

## Day 4 : Now I have a crappy English-to-French translator
I don't mind; I am convinced that with more computer power I could get to a decent translation machine.  Now I need to teach this neural net some SQL.

###### Creating a dataset
My first dataset was only around 20 English sentences with corresponding SQL queries for training, plus 4 or 5 for testing.  
I used Excel for this.  
![](resource/ExcelData.png)

Columns A, B and C hold 3 different phrasings that produce the same SQL statement.  As you can see, I am not aiming for truly natural language; in a business environment, I figure efficiency will be appreciated more.
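
One way to turn such a sheet into training pairs is sketched below. It assumes the sheet was saved as tab-separated text, with the three phrasings in the first three columns and the SQL statement in the fourth. That layout, the function name `rows_to_pairs` and the sample row are assumptions for illustration.

```python
import csv
import io


def rows_to_pairs(tsv_text):
    """Return (english, sql) pairs, one per non-empty phrasing in each row."""
    pairs = []
    for row in csv.reader(io.StringIO(tsv_text), delimiter="\t"):
        if len(row) < 4:
            continue  # skip blank or incomplete rows
        phrasings, sql = row[:3], row[3].strip()
        for phrasing in phrasings:
            if phrasing.strip() and sql:
                pairs.append((phrasing.strip(), sql))
    return pairs


# Made-up row in the assumed layout: three phrasings, then the SQL
sample = (
    "address of store 19\tfull address of store 19\t"
    "street address of store 19\t"
    "SELECT address FROM store WHERE store_id = 19\n"
)
pairs = rows_to_pairs(sample)

# Parallel "files": line i of one is the translation of line i of the other
english_lines = "\n".join(english for english, _ in pairs)
sql_lines = "\n".join(sql for _, sql in pairs)
print(english_lines)
print(sql_lines)
```

Each phrasing becomes its own training line, so one spreadsheet row yields up to three sentence pairs that share the same SQL.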

##### From generic tutorial to personalised model
* We get into a different kind of problem at this step, this time due more to a lack of documentation.  The good people at TensorFlow give, perfectly understandably, no rodent’s posterior about my fiddling around with the Python scripts.
* Remember that this is the fun part.  I can assure you that the code is robust and you won’t have to fully understand it to get it to work; only a few parameter tweaks are required.
* The plain-text file exported from Excel was not in a compatible format: it was encoded as UTF-16, while the scripts needed UTF-8.  I wrote (well, mostly copied and pasted, remember?) a function in Python to fix that. Here it is:

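In [14]:

    # NOTE: grey boxes should be ignored by Business readers!
    import io

    def convertutf(filename):
        """Re-encode a file in place from UTF-16 (the Excel export) to UTF-8."""
        # Read the whole file as UTF-16 text
        with io.open(filename, 'r', encoding='utf16') as f:
            text = f.read()
        # Write the same text back to the same path as UTF-8
        with io.open(filename, 'w', encoding='utf8') as f:
            f.write(text)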

## Day 6 : ... Now I have a crappy English-to-SQL translator, sweet!
I got a result with my own data!  It was of the worst kind, but I saw at that point that I could get somewhere interesting with a few more keystrokes on my magic matte grey metal box with a half-eaten apple logo on it.

1. The model ignores digits by default, but it can convert numbers to text.  Not ideal, but it will do for now.
2. A bigger dataset
    + I am creating data inefficiently in Excel, so I only got to around 450 observations.  This should do for a proof of concept.
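
The number-to-text workaround from point 1 can be sketched as follows. This is only one possible scheme, assuming numbers are spelled out digit by digit before the sentence reaches the tokenizer; the helper name `spell_digits` is hypothetical.

```python
import re

DIGIT_WORDS = ["zero", "one", "two", "three", "four",
               "five", "six", "seven", "eight", "nine"]


def spell_digits(sentence):
    """Replace each run of digits with one English word per digit."""
    def replace_number(match):
        return " ".join(DIGIT_WORDS[int(d)] for d in match.group())
    return re.sub(r"[0-9]+", replace_number, sentence)


print(spell_digits("rent of store 14"))
```

Spelling digits one by one keeps the vocabulary small (ten words cover every number), at the cost of having to reassemble the number on the SQL side.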
    
###### Here is a sample of my training set

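In [2]:

    # Load the two parallel files: English sentences and their SQL translations
    with open("resource/giga-fren.release2.fixed.en") as tdata_en:
        alldat_en = tdata_en.read()
    with open("resource/giga-fren.release2.fixed.fr") as tdata_fr:
        alldat_fr = tdata_fr.read()
    # Print the start of the English file followed by the start of the SQL file
    print(alldat_en[0:444] + alldat_fr[0:970])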

    store number of store 1
    number of store 2
    name of store 3
    description of store 4
    type of store 5
    status of store 6
    city of store 7
    state of store 8
    province of store 9
    size of store 10
    square foot of store 11
    square footage of store 12
    footage of store 13
    rent of store 14
    monthly rent of store 15
    yearly rent of store 16
    opening date of store 17
    closing date of store 18
    address of store 19
    full address of store 20
    street address of store 21
    SELECT store_id FROM store WHERE store_id = 1
    SELECT store_id FROM store WHERE store_id = 2
    SELECT name FROM store WHERE store_id = 3
    SELECT name FROM store WHERE store_id = 4
    SELECT type FROM store WHERE store_id = 5
    SELECT active FROM store WHERE store_id = 6
    SELECT city FROM store WHERE store_id = 7
    SELECT state FROM store WHERE store_id = 8
    SELECT state FROM store WHERE store_id = 9
    SELECT sqr_foot FROM store WHERE store_id = 10
    SELECT sqr_foot FROM store WHERE store_id = 11
    SELECT sqr_foot FROM store WHERE store_id = 12
    SELECT sqr_foot FROM stor


## Day 7 : Ok, I hope this will work


##### Start the training 
![](resource/Training.png)
I aborted the process after 9000 steps.

##### Decode

![](resource/Decode.png)

Every line that starts with ">" is where I enter a new English sentence; the line below it is the translation from the neural network.  It is not good with typos and adjectives, but increasing the dataset should fix that.  And of course, when I improvise a new word in a question, it fails.

As you can see, it works pretty well considering the nano-size training set.  Not impressive yet, I must admit.  Making it impressive is my next task.

##### Next step
I am still struggling to come up with a clever way to generate a large amount of data.  My goal is to get into the 10,000-observation range, with multiple tables, *group by* clauses, etc.  I will publish new results as soon as my brain lets me.
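
One possible direction, sketched under assumptions: generate pairs by combining phrasing synonyms with column names and a range of store ids. The synonym table is lifted from the training sample shown earlier; the function name `generate_pairs` and the id range are hypothetical, and this only covers the single-table, single-column case.

```python
from itertools import product

# English phrasing -> column name, taken from the training sample
COLUMN_SYNONYMS = {
    "city": "city",
    "state": "state",
    "province": "state",
    "size": "sqr_foot",
    "square foot": "sqr_foot",
    "square footage": "sqr_foot",
    "footage": "sqr_foot",
}


def generate_pairs(store_ids):
    """Yield (english, sql) pairs for every phrasing and store id."""
    for (phrasing, column), store_id in product(COLUMN_SYNONYMS.items(), store_ids):
        english = "{} of store {}".format(phrasing, store_id)
        sql = "SELECT {} FROM store WHERE store_id = {}".format(column, store_id)
        yield english, sql


pairs = list(generate_pairs(range(1, 101)))  # hypothetical id range
print(len(pairs))
print(pairs[0])
```

The number of pairs is simply the number of phrasings times the number of ids, so adding sentence templates or more tables would multiply the dataset quickly. Whether such regular data trains a useful model is a separate question that this sketch does not answer.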


## So, 7 days later

From this experiment, here is what I envision in a business environment.

Every day at work, small snippets of info need to be looked up:
* How many stores do we have in Ontario?
* What is the phone extension of Jane Doe?
* What is the inventory value of the Fall 2016 collection?

All these questions can surely be answered fairly quickly by a report, the intranet, a colleague, etc.  But what if all these could be answered from a single Google-like search box?  This could significantly change the way employees access information.  And since neural networks are based on data, changing or upgrading a major piece of software will have minimal impact on getting this type of information.

Furthermore, neural networks could be built to seamlessly assist in:
* Translating product descriptions
* Describing products from an image or sketch
* Building a knowledge base from the Support Center email history
* Clustering stores
* Etc.

The use cases are countless!!

## Conclusion

The neural network tools available today are mature enough to allow organisations to use them, maybe in a humble manner, but certainly with significant business benefit.

------------
For reference, these are the Python packages I am working with:
![](resource/Packages.png)

January 19, 2017  
Simon Laurin

Oh! Did I mention speech recognition...