# T81-558: Applications of Deep Neural Networks
**Class 11: Natural Language Processing and Speech Recognition**
* Instructor: [Jeff Heaton](https://sites.wustl.edu/jeffheaton/), School of Engineering and Applied Science, [Washington University in St. Louis](https://engineering.wustl.edu/Programs/Pages/default.aspx)
* For more information visit the [class website](https://sites.wustl.edu/jeffheaton/t81-558/).

# Reused Functions

In [1]:
from sklearn import preprocessing
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import shutil
import os


# Encode text values to dummy variables(i.e. [1,0,0],[0,1,0],[0,0,1] for red,green,blue)
def encode_text_dummy(df, name):
    dummies = pd.get_dummies(df[name])
    for x in dummies.columns:
        dummy_name = "{}-{}".format(name, x)
        df[dummy_name] = dummies[x]
    df.drop(name, axis=1, inplace=True)


# Encode text values to a single dummy variable.  The new columns (which do not replace the old) will have a 1
# at every location where the original column (name) matches each of the target_values.  One column is added for
# each target value.
def encode_text_single_dummy(df, name, target_values):
    for tv in target_values:
        l = list(df[name].astype(str))
        l = [1 if str(x) == str(tv) else 0 for x in l]
        name2 = "{}-{}".format(name, tv)
        df[name2] = l


# Encode text values to indexes (i.e. [0],[1],[2] for blue,green,red; LabelEncoder sorts the classes).
def encode_text_index(df, name):
    le = preprocessing.LabelEncoder()
    df[name] = le.fit_transform(df[name])
    return le.classes_


# Encode a numeric column as zscores
def encode_numeric_zscore(df, name, mean=None, sd=None):
    if mean is None:
        mean = df[name].mean()

    if sd is None:
        sd = df[name].std()

    df[name] = (df[name] - mean) / sd


# Convert all missing values in the specified column to the median
def missing_median(df, name):
    med = df[name].median()
    df[name] = df[name].fillna(med)


# Convert all missing values in the specified column to the default
def missing_default(df, name, default_value):
    df[name] = df[name].fillna(default_value)


# Convert a Pandas dataframe to the x,y inputs that TensorFlow needs
def to_xy(df, target):
    result = []
    for x in df.columns:
        if x != target:
            result.append(x)

    # find out the type of the target column.  Is it really this hard? :(
    target_type = df[target].dtypes
    target_type = target_type[0] if hasattr(target_type, '__iter__') else target_type

    # Encode to int for classification, float otherwise. TensorFlow likes 32 bits.
    if target_type in (np.int64, np.int32):
        # Classification
        return df[result].values.astype(np.float32), df[[target]].values.astype(np.int32)
    else:
        # Regression
        return df[result].values.astype(np.float32), df[[target]].values.astype(np.float32)


# Nicely formatted time string
def hms_string(sec_elapsed):
    h = int(sec_elapsed / (60 * 60))
    m = int((sec_elapsed % (60 * 60)) / 60)
    s = sec_elapsed % 60
    return "{}:{:>02}:{:>05.2f}".format(h, m, s)


# Regression chart, we will see more of this chart in the next class.
def chart_regression(pred, y):
    t = pd.DataFrame({'pred': pred, 'y': y.flatten()})
    t.sort_values(by=['y'], inplace=True)
    a = plt.plot(t['y'].tolist(), label='expected')
    b = plt.plot(t['pred'].tolist(), label='prediction')
    plt.ylabel('output')
    plt.legend()
    plt.show()


# Get a new directory to hold checkpoints from a neural network.  This allows the neural network to be
# loaded later.  If the erase param is set to true, the contents of the directory will be cleared.
def get_model_dir(name, erase):
    base_path = os.path.join(".", "dnn")
    model_dir = os.path.join(base_path, name)
    os.makedirs(model_dir, exist_ok=True)
    if erase and len(model_dir) > 4 and os.path.isdir(model_dir):
        shutil.rmtree(model_dir, ignore_errors=True)  # be careful, this deletes everything below the specified path
    return model_dir


# Remove all rows where the specified column is at least sd standard deviations away from the mean
def remove_outliers(df, name, sd):
    drop_rows = df.index[(np.abs(df[name] - df[name].mean()) >= (sd * df[name].std()))]
    df.drop(drop_rows, axis=0, inplace=True)


# Encode a column to a range between normalized_low and normalized_high.
def encode_numeric_range(df, name, normalized_low=-1, normalized_high=1,
                         data_low=None, data_high=None):
    if data_low is None:
        data_low = min(df[name])
    if data_high is None:
        data_high = max(df[name])

    df[name] = ((df[name] - data_low) / (data_high - data_low)) \
               * (normalized_high - normalized_low) + normalized_low
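
The optional `mean` and `sd` arguments of `encode_numeric_zscore` allow statistics computed on one data set to be reused on another. The following is a minimal sketch of that idea: it restates the helper so that it runs on its own, and the two toy DataFrames and their values are made up for illustration.

```python
import pandas as pd


# Same logic as the encode_numeric_zscore helper above, restated so that
# this sketch is self-contained.
def encode_numeric_zscore(df, name, mean=None, sd=None):
    if mean is None:
        mean = df[name].mean()
    if sd is None:
        sd = df[name].std()
    df[name] = (df[name] - mean) / sd


# Toy data (hypothetical values, not from the class data sets).
df_train = pd.DataFrame({'len': [10.0, 20.0, 30.0]})
df_test = pd.DataFrame({'len': [20.0, 40.0]})

# Capture the training statistics BEFORE the column is overwritten in place.
train_mean = df_train['len'].mean()
train_sd = df_train['len'].std()

encode_numeric_zscore(df_train, 'len')
encode_numeric_zscore(df_test, 'len', mean=train_mean, sd=train_sd)

print(df_train['len'].tolist())
print(df_test['len'].tolist())
```

Reusing the training statistics keeps both data sets on the same scale, which is usually what a model trained on the first set expects.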

# NLP with LSTM and CNN Neural Networks

The material in this class session is based heavily upon the paper [Character-level Convolutional Networks for Text Classification](https://arxiv.org/abs/1509.01626).

The following page was also used as source material for this class session:

* [TensorFlow — Text Classification](https://medium.com/@ilblackdragon/tensorflow-text-classification-615198df9231#.i1r4ao3te)

TensorFlow implementations of the above paper:

* [Text Classification Using Recurrent Neural Networks on Words]()
* [Text Classification Using Convolutional Neural Networks on Words]()
* [Text Classification Using Recurrent Neural Networks on Characters]()
* [Text Classification Using Convolutional Neural Networks on Characters]()

# Data Sources: DBPedia

[DBPedia](http://wiki.dbpedia.org/) makes the data contained in [WikiPedia]() available in database form.  The data in DBPedia can be queried with an SQL-like language named SPARQL Protocol and RDF Query Language, or [SPARQL](https://en.wikipedia.org/wiki/SPARQL).

For the text examples in this class we will use a sample of the DBPedia articles classified into 14 high-level document classes:

* Company (1)
* EducationalInstitution (2)
* Artist (3)
* Athlete (4)
* OfficeHolder (5)
* MeanOfTransportation (6)
* Building (7)
* NaturalPlace (8)
* Village (9)
* Animal (10)
* Plant (11)
* Album (12)
* Film (13)
* WrittenWork (14)

The data files can be found at this [location](https://drive.google.com/drive/folders/0Bz8a_Dbh9Qhbfll6bVpmNUtUcFdjYmF2SEpmZUZUcVNiMUw1TWN6RDV3a0JHT3kxLVhVR2M).


TensorFlow makes available several operators designed for text classification.

* skflow.preprocessing.**ByteProcessor (doc_len)** - Turn a list of text strings into fixed length arrays (specified by doc_len) using integer ASCII values, for example "ABC" becomes [65, 66, 67, 0, 0] if the doc_len is 5.
* skflow.ops.**one_hot_matrix** - One hot is the same as dummy variables. Expands a matrix of integer codes into a cube, with dimensions [num_samples, input_size, num_classes].
* skflow.ops.**split_squeeze** - Splits input on given dimension and then squeezes that dimension.
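
The three operators above can be mimicked in plain NumPy to make the shapes easier to follow. This is only a sketch of the behavior as described in the list: the function names `byte_encode`, `one_hot` and `split_squeeze` below are hypothetical stand-ins, and the real skflow operators work on TensorFlow tensors rather than NumPy arrays.

```python
import numpy as np


def byte_encode(docs, doc_len):
    """Turn strings into fixed-length rows of byte values, padded with zeros."""
    out = np.zeros((len(docs), doc_len), dtype=np.uint8)
    for row, text in enumerate(docs):
        raw = text.encode('ascii', errors='replace')[:doc_len]
        out[row, :len(raw)] = list(raw)
    return out


def one_hot(codes, num_classes):
    """Expand [num_samples, doc_len] codes to [num_samples, doc_len, num_classes]."""
    return np.eye(num_classes, dtype=np.float32)[codes]


def split_squeeze(cube, axis):
    """Split along `axis` into size-1 slices and drop that axis from each slice."""
    return [np.squeeze(part, axis=axis)
            for part in np.split(cube, cube.shape[axis], axis=axis)]


codes = byte_encode(["This is a test", "ABC", "abc"], doc_len=5)
cube = one_hot(codes, 256)
steps = split_squeeze(cube, axis=1)

print(codes.tolist())
print(cube.shape)
print(len(steps), steps[0].shape)
```

Each element of `steps` holds one character position for every document, which is the layout a recurrent network consumes one time step at a time.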

In [3]:
import tensorflow.contrib.learn as learn

# Classifying Text Documents

data = [
    "This is a test",
    "ABC",
    "abc"
]

char_processor = learn.preprocessing.ByteProcessor(5)

z = list(char_processor.fit_transform(data))

print(z)



[array([ 84, 104, 105, 115,  32], dtype=uint8), array([65, 66, 67,  0,  0], dtype=uint8), array([97, 98, 99,  0,  0], dtype=uint8)]


In [9]:
temp = skflow.ops.one_hot_matrix(X_train, 256) 
print("1:{}".format(temp))
temp = skflow.ops.split_squeeze(1, MAX_DOCUMENT_LENGTH, temp)
print(len(temp))
print("2:{}".format(temp[0]))


1:Tensor("OneHot:0", shape=(560000, 100, 256), dtype=float32)
100
2:Tensor("Squeeze:0", shape=(560000, 256), dtype=float32)


# Word2Vec

Word2vec is a group of related models that are used to produce word embeddings. These models are shallow, two-layer neural networks that are trained to reconstruct linguistic contexts of words. Word2vec takes as its input a large corpus of text and produces a vector space, typically of several hundred dimensions, with each unique word in the corpus being assigned a corresponding vector in the space. Word vectors are positioned in the vector space such that words that share common contexts in the corpus are located in close proximity to one another in the space.

Mikolov, T., Chen, K., Corrado, G., & Dean, J. (2013). [Efficient estimation of word representations in vector space](https://arxiv.org/abs/1301.3781). arXiv preprint arXiv:1301.3781.
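
"Close proximity" is commonly measured with cosine similarity, the cosine of the angle between two vectors. The following sketch uses hand-made 3-dimensional vectors purely for illustration; they are assumptions of this example and are not taken from any trained model.

```python
import numpy as np

# Hand-made vectors (hypothetical values). Real Word2Vec vectors are learned
# from a corpus and typically have several hundred dimensions.
vectors = {
    'cat':   np.array([0.9, 0.8, 0.1]),
    'dog':   np.array([0.8, 0.9, 0.2]),
    'stock': np.array([0.1, 0.2, 0.9]),
}


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors (1.0 means same direction)."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))


sim_cat_dog = cosine_similarity(vectors['cat'], vectors['dog'])
sim_cat_stock = cosine_similarity(vectors['cat'], vectors['stock'])

print("cat vs dog:   {:.3f}".format(sim_cat_dog))
print("cat vs stock: {:.3f}".format(sim_cat_stock))
```

In this toy space "cat" and "dog" point in nearly the same direction, so their similarity is close to 1, while "cat" and "stock" are far apart.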

![Word2Vec](https://pbs.twimg.com/media/C7jJxIjWkAA8E_s.jpg)
[Trust Word2Vec](https://twitter.com/DanilBaibak/status/844647217885581312)

### Suggested Software for Word2Vec

* [GoogleNews Vectors](https://code.google.com/archive/p/word2vec/), [GitHub Mirror](https://github.com/mmihaltz/word2vec-GoogleNews-vectors)
* [Python Gensim](https://radimrehurek.com/gensim/)


In [1]:
import gensim

# Note that the path below refers to a location on my hard drive.
# You should download GoogleNews Vectors (see suggested software above)
model = gensim.models.KeyedVectors.load_word2vec_format(
    '/Users/jeff/data/language/GoogleNews-vectors-negative300.bin.gz', binary=True)

In [2]:
w = model['hello']

In [3]:
print(len(w))

300


In [4]:
print(w)

[-0.05419922  0.01708984 -0.00527954  0.33203125 -0.25       -0.01397705
 -0.15039062 -0.265625    0.01647949  0.3828125  -0.03295898 -0.09716797
 -0.16308594 -0.04443359  0.00946045  0.18457031  0.03637695  0.16601562
  0.36328125 -0.25585938  0.375       0.171875    0.21386719 -0.19921875
  0.13085938 -0.07275391 -0.02819824  0.11621094  0.15332031  0.09082031
  0.06787109 -0.0300293  -0.16894531 -0.20800781 -0.03710938 -0.22753906
  0.26367188  0.012146    0.18359375  0.31054688 -0.10791016 -0.19140625
  0.21582031  0.13183594 -0.03515625  0.18554688 -0.30859375  0.04785156
 -0.10986328  0.14355469 -0.43554688 -0.0378418   0.10839844  0.140625
 -0.10595703  0.26171875 -0.17089844  0.39453125  0.12597656 -0.27734375
 -0.28125     0.14746094 -0.20996094  0.02355957  0.18457031  0.00445557
 -0.27929688 -0.03637695 -0.29296875  0.19628906  0.20703125  0.2890625
 -0.20507812  0.06787109 -0.43164062 -0.10986328 -0.2578125  -0.02331543
  0.11328125  0.23144531 -0.04418945  0.10839844 -0.28

In [10]:
import numpy as np

w1 = model['cat']
w2 = model['dog']

dist = np.linalg.norm(w1 - w2)

print(dist)

2.08153


In [11]:
model.most_similar(positive=['woman', 'king'], negative=['man'])


[('queen', 0.7118192315101624),
 ('monarch', 0.6189674139022827),
 ('princess', 0.5902431011199951),
 ('crown_prince', 0.5499460697174072),
 ('prince', 0.5377321839332581),
 ('kings', 0.5236843824386597),
 ('Queen_Consort', 0.5235945582389832),
 ('queens', 0.5181134939193726),
 ('sultan', 0.5098593235015869),
 ('monarchy', 0.5087411999702454)]

In [16]:
model.doesnt_match("breakfast cereal dinner lunch".split())


'cereal'

In [14]:
model.similarity('woman', 'man')


0.76640122309953529
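
The analogy and odd-one-out queries above can be understood as simple vector arithmetic followed by a cosine-similarity ranking. The sketch below illustrates the idea on hand-made 2-dimensional vectors; the vectors and the helper names `analogy` and `odd_one_out` are assumptions of this example, and gensim's own implementation may differ in its details.

```python
import numpy as np

# Toy vectors (hypothetical values): axis 0 ~ "royalty", axis 1 ~ "gender".
vectors = {
    'king':   np.array([0.9, 0.9]),
    'queen':  np.array([0.9, -0.9]),
    'man':    np.array([0.1, 0.9]),
    'woman':  np.array([0.1, -0.9]),
    'prince': np.array([0.7, 0.8]),
    'apple':  np.array([-0.9, 0.1]),
}


def unit(v):
    """Scale a vector to length 1 so dot products become cosine similarities."""
    return v / np.linalg.norm(v)


def analogy(positive, negative):
    """Return the word closest to sum(positive) - sum(negative), excluding the inputs."""
    target = sum(unit(vectors[w]) for w in positive) \
        - sum(unit(vectors[w]) for w in negative)
    candidates = [w for w in vectors if w not in positive + negative]
    return max(candidates,
               key=lambda w: float(np.dot(unit(vectors[w]), unit(target))))


def odd_one_out(words):
    """Return the word least similar to the mean of the given words."""
    mean = np.mean([unit(vectors[w]) for w in words], axis=0)
    return min(words, key=lambda w: float(np.dot(unit(vectors[w]), mean)))


print(analogy(['woman', 'king'], ['man']))
print(odd_one_out(['king', 'queen', 'prince', 'apple']))
```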

# **Code below this point will not run in Data Scientist Workbench**

The code below interfaces with your computer's microphone and speakers.  It will not run in Data Scientist Workbench.


# Speech Recognition

A very common use of LSTMs and other RNNs is [speech recognition](https://en.wikipedia.org/wiki/Speech_recognition).

# Using Google Voice for Speech Recognition

Google speech recognition makes use of [LSTM and some other technologies](https://research.googleblog.com/2015/08/the-neural-networks-behind-google-voice.html).

See Google [Speech Recognition in action](https://www.google.com/intl/en/chrome/demos/speech.html).

```
pip install SpeechRecognition
```
For installation on a Mac, see the [following Stack Overflow question](http://stackoverflow.com/questions/33513522/when-installing-pyaudio-pip-cannot-find-portaudio-h-in-usr-local-include) about PyAudio and PortAudio.

In [20]:
# pip install SpeechRecognition
# see this for PyAudio
# pip install pyttsx

#!/usr/bin/env python3

# NOTE: this example requires PyAudio because it uses the Microphone class

import speech_recognition as sr
import os

# obtain audio from the microphone
r = sr.Recognizer()
with sr.Microphone() as source:
    print("Say something!")
    audio = r.listen(source)

# recognize speech using Google Speech Recognition
try:
    # for testing purposes, we're just using the default API key
    # to use another API key, use `r.recognize_google(audio, key="GOOGLE_SPEECH_RECOGNITION_API_KEY")`
    # instead of `r.recognize_google(audio)`
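    text = r.recognize_google(audio)  # avoid shadowing the built-in str
    print("You said: {}".format(text))
    # Strip apostrophes so the text cannot break out of the single-quoted shell argument
    os.system("say 'I believe you said: {}'".format(text.replace("'", "")))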
except sr.UnknownValueError:
    print("Google Speech Recognition could not understand audio")
except sr.RequestError as e:
    print("Could not request results from Google Speech Recognition service; {0}".format(e))


Say something!
You said: hello


# Simple Text to Speech

Challenges:

* [Background Conversation](https://www.youtube.com/watch?v=IKB3Qiglyro&t=119s)
* [Klingon](https://www.youtube.com/watch?v=ucO3heC-Ztw)

In [22]:
# The following code works on a Mac
import os

def say(s):
    s = s.replace("'","")
    os.system("say '{}'".format(s))
    
say("Shall we play a game?")
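
An alternative sketch passes the text to the speech program as an argument list through `subprocess`, so no shell quoting is involved and apostrophes do not need to be stripped. The helper name `build_say_command` is hypothetical, and the fallback to `espeak` on non-Mac platforms is an assumption; that program must be installed separately.

```python
import subprocess
import sys


def build_say_command(text, platform=sys.platform):
    """Return the argument list for a text-to-speech command.

    Assumptions: macOS provides `say`; on any other platform this sketch
    falls back to `espeak`, which may not be installed.
    """
    program = "say" if platform == "darwin" else "espeak"
    return [program, text]


def say(text):
    # A list of arguments bypasses the shell, so quotes in the text are safe.
    subprocess.run(build_say_command(text), check=False)


print(build_say_command("Shall we play a game?", platform="darwin"))
```

Calling `say("It's working")` would then speak the sentence with its apostrophe intact, provided the speech program exists on the machine.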

# Text to Speech and Speech Recognition


Text to speech and speech recognition often go hand in hand.


In [28]:
# pip install SpeechRecognition
# see this for PyAudio
# pip install pyttsx

#!/usr/bin/env python3

# NOTE: this example requires PyAudio because it uses the Microphone class

import speech_recognition as sr
import os

def say(s):
    s = s.replace("'","")
    os.system("say '{}'".format(s))

# obtain audio from the microphone
r = sr.Recognizer()
with sr.Microphone() as source:
    say("Hello there, please say something.")
    audio = r.listen(source)

# recognize speech using Google Speech Recognition
try:
    # for testing purposes, we're just using the default API key
    # to use another API key, use `r.recognize_google(audio, key="GOOGLE_SPEECH_RECOGNITION_API_KEY")`
    # instead of `r.recognize_google(audio)`
    text = r.recognize_google(audio)  # avoid shadowing the built-in str
    print("You said: {}".format(text))
    say("I think you said {}".format(text))
except sr.UnknownValueError:
    print("Google Speech Recognition could not understand audio")
except sr.RequestError as e:
    print("Could not request results from Google Speech Recognition service; {0}".format(e))


You said: hola que tal como se Yama


# Eliza Example

[ELIZA](https://en.wikipedia.org/wiki/ELIZA) is an early natural language processing computer program created from 1964 to 1966 at the MIT Artificial Intelligence Laboratory by Joseph Weizenbaum.  The following code is based on an [Eliza Python implementation by SmallSureThing](https://www.smallsurething.com/implementing-the-famous-eliza-chatbot-in-python/).

In [17]:
import re
import random
import speech_recognition as sr
import os

reflections = {
    "am": "are",
    "was": "were",
    "i": "you",
    "i'd": "you would",
    "i've": "you have",
    "i'll": "you will",
    "my": "your",
    "are": "am",
    "you've": "I have",
    "you'll": "I will",
    "your": "my",
    "yours": "mine",
    "you": "me",
    "me": "you"
}

psychobabble = [
    [r'i need (.*)',
     ["Why do you need {0}?",
      "Would it really help you to get {0}?",
      "Are you sure you need {0}?"]],

    [r'why don\'?t you ([^\?]*)\??',
     ["Do you really think I don't {0}?",
      "Perhaps eventually I will {0}.",
      "Do you really want me to {0}?"]],

    [r'why can\'?t I ([^\?]*)\??',
     ["Do you think you should be able to {0}?",
      "If you could {0}, what would you do?",
      "I don't know -- why can't you {0}?",
      "Have you really tried?"]],

    [r'i can\'?t (.*)',
     ["How do you know you can't {0}?",
      "Perhaps you could {0} if you tried.",
      "What would it take for you to {0}?"]],

    [r'i am (.*)',
     ["Did you come to me because you are {0}?",
      "How long have you been {0}?",
      "How do you feel about being {0}?"]],

    [r'i\'?m (.*)',
     ["How does being {0} make you feel?",
      "Do you enjoy being {0}?",
      "Why do you tell me you're {0}?",
      "Why do you think you're {0}?"]],

    [r'are you ([^\?]*)\??',
     ["Why does it matter whether I am {0}?",
      "Would you prefer it if I were not {0}?",
      "Perhaps you believe I am {0}.",
      "I may be {0} -- what do you think?"]],

    [r'what (.*)',
     ["Why do you ask?",
      "How would an answer to that help you?",
      "What do you think?"]],

    [r'how (.*)',
     ["How do you suppose?",
      "Perhaps you can answer your own question.",
      "What is it you're really asking?"]],

    [r'because (.*)',
     ["Is that the real reason?",
      "What other reasons come to mind?",
      "Does that reason apply to anything else?",
      "If {0}, what else must be true?"]],

    [r'(.*) sorry (.*)',
     ["There are many times when no apology is needed.",
      "What feelings do you have when you apologize?"]],

    [r'hello(.*)',
     ["Hello... I'm glad you could drop by today.",
      "Hi there... how are you today?",
      "Hello, how are you feeling today?"]],

    [r'i think (.*)',
     ["Do you doubt {0}?",
      "Do you really think so?",
      "But you're not sure {0}?"]],

    [r'(.*) friend (.*)',
     ["Tell me more about your friends.",
      "When you think of a friend, what comes to mind?",
      "Why don't you tell me about a childhood friend?"]],

    [r'yes',
     ["You seem quite sure.",
      "OK, but can you elaborate a bit?"]],

    [r'(.*) computer(.*)',
     ["Are you really talking about me?",
      "Does it seem strange to talk to a computer?",
      "How do computers make you feel?",
      "Do you feel threatened by computers?"]],

    [r'is it (.*)',
     ["Do you think it is {0}?",
      "Perhaps it's {0} -- what do you think?",
      "If it were {0}, what would you do?",
      "It could well be that {0}."]],

    [r'it is (.*)',
     ["You seem very certain.",
      "If I told you that it probably isn't {0}, what would you feel?"]],

    [r'can you ([^\?]*)\??',
     ["What makes you think I can't {0}?",
      "If I could {0}, then what?",
      "Why do you ask if I can {0}?"]],

    [r'can I ([^\?]*)\??',
     ["Perhaps you don't want to {0}.",
      "Do you want to be able to {0}?",
      "If you could {0}, would you?"]],

    [r'you are (.*)',
     ["Why do you think I am {0}?",
      "Does it please you to think that I'm {0}?",
      "Perhaps you would like me to be {0}.",
      "Perhaps you're really talking about yourself?"]],

    [r'you\'?re (.*)',
     ["Why do you say I am {0}?",
      "Why do you think I am {0}?",
      "Are we talking about you, or me?"]],

    [r'i don\'?t (.*)',
     ["Don't you really {0}?",
      "Why don't you {0}?",
      "Do you want to {0}?"]],

    [r'i feel (.*)',
     ["Good, tell me more about these feelings.",
      "Do you often feel {0}?",
      "When do you usually feel {0}?",
      "When you feel {0}, what do you do?"]],

    [r'i have (.*)',
     ["Why do you tell me that you've {0}?",
      "Have you really {0}?",
      "Now that you have {0}, what will you do next?"]],

    [r'i would (.*)',
     ["Could you explain why you would {0}?",
      "Why would you {0}?",
      "Who else knows that you would {0}?"]],

    [r'is there (.*)',
     ["Do you think there is {0}?",
      "It's likely that there is {0}.",
      "Would you like there to be {0}?"]],

    [r'my (.*)',
     ["I see, your {0}.",
      "Why do you say that your {0}?",
      "When your {0}, how do you feel?"]],

    [r'you (.*)',
     ["We should be discussing you, not me.",
      "Why do you say that about me?",
      "Why do you care whether I {0}?"]],

    [r'why (.*)',
     ["Why don't you tell me the reason why {0}?",
      "Why do you think {0}?"]],

    [r'i want (.*)',
     ["What would it mean to you if you got {0}?",
      "Why do you want {0}?",
      "What would you do if you got {0}?",
      "If you got {0}, then what would you do?"]],

    [r'(.*) mother(.*)',
     ["Tell me more about your mother.",
      "What was your relationship with your mother like?",
      "How do you feel about your mother?",
      "How does this relate to your feelings today?",
      "Good family relations are important."]],

    [r'(.*) father(.*)',
     ["Tell me more about your father.",
      "How did your father make you feel?",
      "How do you feel about your father?",
      "Does your relationship with your father relate to your feelings today?",
      "Do you have trouble showing affection with your family?"]],

    [r'(.*) child(.*)',
     ["Did you have close friends as a child?",
      "What is your favorite childhood memory?",
      "Do you remember any dreams or nightmares from childhood?",
      "Did the other children sometimes tease you?",
      "How do you think your childhood experiences relate to your feelings today?"]],

    [r'(.*)\?',
     ["Why do you ask that?",
      "Please consider whether you can answer your own question.",
      "Perhaps the answer lies within yourself?",
      "Why don't you tell me?"]],

    [r'quit',
     ["Thank you for talking with me.",
      "Good-bye.",
      "Thank you, that will be $150.  Have a good day!"]],

    [r'(.*)',
     ["Please tell me more.",
      "Let's change focus a bit... Tell me about your family.",
      "Can you elaborate on that?",
      "Why do you say that {0}?",
      "I see.",
      "Very interesting.",
      "{0}.",
      "I see.  And what does that tell you?",
      "How does that make you feel?",
      "How do you feel when you say that?"]]
]


def reflect(fragment):
    tokens = fragment.lower().split()
    for i, token in enumerate(tokens):
        if token in reflections:
            tokens[i] = reflections[token]
    return ' '.join(tokens)


def analyze(statement):
    for pattern, responses in psychobabble:
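        # Match case-insensitively: the recognizer returns capitalized text
        # (e.g. "I feel sad"), while most patterns are written in lowercase.
        match = re.match(pattern, statement.rstrip(".!"), re.IGNORECASE)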
        if match:
            response = random.choice(responses)
            return response.format(*[reflect(g) for g in match.groups()])

def say(s):
    s = s.replace("'","")
    os.system("say '{}'".format(s))

def main():
    say("Hello. How are you feeling today?")

    r = sr.Recognizer()
    with sr.Microphone() as source:
        done = False

        while not done:
            audio = r.listen(source)

            # recognize speech using Google Speech Recognition
            try:
                # for testing purposes, we're just using the default API key
                # to use another API key, use `r.recognize_google(audio, key="GOOGLE_SPEECH_RECOGNITION_API_KEY")`
                # instead of `r.recognize_google(audio)`
                statement = r.recognize_google(audio)
                print("Human: {}".format(statement))
                response = analyze(statement)

                if statement.lower() == 'quit':
                    done = True

                print("Eliza (computer): {}".format(response))
                say(response)
            except sr.UnknownValueError:
                print("No input, or could not understand audio.")
            except sr.RequestError as e:
                print("Error: Could not request results from Google Speech Recognition service; {0}".format(e))



if __name__ == "__main__":
    main()

Human: my mother hates me
Eliza (computer): When your mother hates you, how do you feel?
Human: I feel sad
Eliza (computer): you feel sad.
Human: yes I do do you feel sad
Eliza (computer): OK, but can you elaborate a bit?
No input, or could not understand audio.
Human: quick
Eliza (computer): Why do you say that quick?
Human: quick
Eliza (computer): quick.
Human: quit
Eliza (computer): Thank you for talking with me.
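
The core mechanism of the program above, matching a pattern and then "reflecting" the captured text back at the speaker, can be tried without a microphone. The following is a minimal text-only sketch. It uses a small subset of the rules with one fixed response each so that the result is deterministic, and it matches case-insensitively, which is an assumption made here so that capitalized input such as "I feel sad" reaches the intended rule.

```python
import re

# A small subset of the reflections table above.
reflections = {"my": "your", "me": "you", "i": "you", "am": "are"}

# One response per pattern keeps this sketch deterministic; the full program
# above picks randomly among several responses.
rules = [
    (r'my (.*)', "When your {0}, how do you feel?"),
    (r'i feel (.*)', "Do you often feel {0}?"),
    (r'(.*)', "Please tell me more."),
]


def reflect(fragment):
    """Swap first- and second-person words so the reply addresses the speaker."""
    return ' '.join(reflections.get(tok, tok) for tok in fragment.lower().split())


def respond(statement):
    """Return the response of the first rule whose pattern matches."""
    for pattern, template in rules:
        match = re.match(pattern, statement.rstrip(".!"), re.IGNORECASE)
        if match:
            return template.format(*[reflect(g) for g in match.groups()])


print(respond("my mother hates me"))
print(respond("I feel sad"))
print(respond("quick"))
```

Because the rules are tried in order, the catch-all pattern `(.*)` has to come last; anything placed after it would never be reached.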


# Chat Bots

Using the above code you can create your own primitive chat bots.  A somewhat famous YouTube video from Cornell University shows what happens [when two chat bots converse](https://www.youtube.com/watch?v=WnzlbyTZsQY).  Other interesting chat-bot-style technology:

* [CleverBot](http://www.cleverbot.com/)
* [Computer Science Paper Generator](https://pdos.csail.mit.edu/archive/scigen/)

# More on LSTM

* [The Unreasonable Effectiveness of Recurrent Neural Networks](http://karpathy.github.io/2015/05/21/rnn-effectiveness/)
* [LSTM Music](https://www.youtube.com/watch?v=0VTI1BBLydE)
* [Natural Language Processing from Scratch](https://arxiv.org/abs/1103.0398)

# Solution for Spring 2017 Kaggle

The following code trains the model.

In [6]:
import tensorflow as tf
from sklearn.model_selection import train_test_split
import tensorflow.contrib.learn as learn

# Set the desired TensorFlow output level for this example
tf.logging.set_verbosity(tf.logging.INFO)

path = "./data/"

filename_train = os.path.join(path,"train.csv")
filename_test = os.path.join(path,"test.csv")
filename_submit = os.path.join(path,"submit.csv")

df_train = pd.read_csv(filename_train,na_values=['NA','?'])

# Encode feature vector
encode_numeric_zscore(df_train,'len')
encode_numeric_zscore(df_train,'links')
df_train.drop('title', axis=1, inplace=True)
df_train.drop('id', axis=1, inplace=True)

num_classes = df_train['class'].nunique()

print("Number of classes: {}".format(num_classes))

# Create x & y for training

# Create the x-side (feature vectors) of the training
x, y = to_xy(df_train,'class')

# Split into train/validation
x_train, x_validate, y_train, y_validate = train_test_split(
    x, y, test_size=0.25, random_state=42)

# Get/clear a directory to store the neural network to
model_dir = get_model_dir('kaggle',True)

opt=tf.train.AdamOptimizer(learning_rate=1e-2)
#opt=tf.train.MomentumOptimizer(learning_rate=1e-5,momentum=0.9)

# Create a deep neural network with 3 hidden layers of 100, 50, 25
feature_columns = [tf.contrib.layers.real_valued_column("", dimension=x.shape[1])]
classifier = learn.DNNClassifier(
    optimizer= opt,
    model_dir= model_dir,
    config=tf.contrib.learn.RunConfig(save_checkpoints_secs=30),
    hidden_units=[100, 50, 25], n_classes=num_classes, feature_columns=feature_columns)

# Might be needed in future versions of "TensorFlow Learn"
#classifier = learn.SKCompat(classifier) # For Sklearn compatibility

# Early stopping
validation_monitor = tf.contrib.learn.monitors.ValidationMonitor(
    x_validate,
    y_validate,
    every_n_steps=500,
    early_stopping_metric="loss",
    early_stopping_metric_minimize=True,
    early_stopping_rounds=50)

# Fit/train neural network
classifier.fit(x_train, y_train,monitors=[validation_monitor],steps=10000)

Number of classes: 5
INFO:tensorflow:Using config: {'_keep_checkpoint_every_n_hours': 10000, '_evaluation_master': '', '_num_ps_replicas': 0, '_save_checkpoints_steps': None, '_task_type': None, '_environment': 'local', '_tf_config': gpu_options {
  per_process_gpu_memory_fraction: 1
}
, '_tf_random_seed': None, '_master': '', '_keep_checkpoint_max': 5, '_is_chief': True, '_cluster_spec': <tensorflow.python.training.server_lib.ClusterSpec object at 0x116e8bf28>, '_save_summary_steps': 100, '_task_id': 0, '_save_checkpoints_secs': 30}
Instructions for updating:
Monitors are deprecated. Please use tf.train.SessionRunHook.
Instructions for updating:
Estimator is decoupled from Scikit Learn interface by moving into
separate class SKCompat. Arguments x, y and batch_size are only
available in the SKCompat class, Estimator will only accept input_fn.
Example conversion:
  est = Estimator(...) -> est = SKCompat(Estimator(...))
Instructions for updating:
Please switch to tf.summary.scalar. Note that tf.summary.scalar uses the node name instead of the tag. This means that TensorFlow will automatically de-duplicate summary names based on the scope they are created in. Also, passing a tensor or list of tags to a scalar summary op is no longer supported.
INFO:tensorflow:Create CheckpointSaverHook.
INFO:tensorflow:Saving checkpoints for 1 into ./dnn/kaggle/model.ckpt.
INFO:tensorflow:step = 1, loss = 1.69504
INFO:tensorflow:global_step/sec: 84.4133
INFO:tensorflow:step = 101, loss = 0.683313
INFO:tensorflow:global_step/sec: 73.3376
INFO:tensorflow:step = 201, loss = 0.636839
INFO:tensorflow:global_step/sec: 65.5144
INFO:tensorflow:step = 301, loss = 0.614748
INFO:tensorflow:global_step/sec: 71.0559
INFO:tensorflow:step = 401, loss = 0.581737

DNNClassifier(params={'embedding_lr_multipliers': None, 'optimizer': <tensorflow.python.training.adam.AdamOptimizer object at 0x116e8bfd0>, 'dropout': None, 'activation_fn': <function relu at 0x113078158>, 'gradient_clip_norm': None, 'input_layer_min_slice_size': None, 'hidden_units': [100, 50, 25], 'head': <tensorflow.contrib.learn.python.learn.estimators.head._MultiClassHead object at 0x116e8bd30>, 'feature_columns': (_RealValuedColumn(column_name='', dimension=17, default_value=None, dtype=tf.float32, normalizer=None),)})
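
The early stopping used during training can be pictured as a patience rule: remember the best validation loss seen so far and stop once too many steps pass without an improvement. The sketch below is a generic illustration of that rule, not the `ValidationMonitor` implementation; the function name `early_stopping_step` and the loss values are hypothetical.

```python
def early_stopping_step(losses, check_every, patience):
    """Return the step at which training would stop, or None if it never stops.

    losses[i] is the validation loss measured at step (i + 1) * check_every.
    Training stops once `patience` steps pass without a new best (lowest) loss.
    """
    best_loss = float('inf')
    best_step = 0
    for i, loss in enumerate(losses):
        step = (i + 1) * check_every
        if loss < best_loss:
            best_loss, best_step = loss, step
        elif step - best_step >= patience:
            return step
    return None


# Hypothetical validation losses, checked every 500 steps.
val_losses = [0.70, 0.62, 0.60, 0.61, 0.63, 0.64]
print(early_stopping_step(val_losses, check_every=500, patience=1000))
```

Under this rule, a patience that is smaller than the checking interval stops training at the first check that fails to improve, so the two values are best chosen together.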

The following code builds the submission file.

In [7]:
# Generate Kaggle submit file

# Encode feature vector
df_test = pd.read_csv(filename_test,na_values=['NA','?'])

encode_numeric_zscore(df_test,'len')
encode_numeric_zscore(df_test,'links')
df_test.drop('title', axis=1, inplace=True)
ids = df_test['id']
df_test.drop('id', axis=1, inplace=True)

x = df_test.values.astype(np.float32)

# Generate predictions
pred = list(classifier.predict_proba(x, as_iterable=True))
#pred

# Create submission data set

df_submit = pd.DataFrame(pred)
df_submit.insert(0,'id',ids)
df_submit.columns = ['id','class-0','class-1','class-2','class-3','class-4']

df_submit.to_csv(filename_submit, index=False)

print(df_submit)

          id       class-0       class-1       class-2       class-3  \
0       6639  9.999897e-01  9.921896e-07  1.065219e-23  9.280530e-06   
1       9603  0.000000e+00  0.000000e+00  1.000000e+00  0.000000e+00   
2      12234  9.825366e-01  1.737450e-02  1.105266e-08  8.891991e-05   
3      16535  2.608014e-35  1.000000e+00  1.050461e-36  0.000000e+00   
4      18157  1.000000e+00  1.667878e-13  0.000000e+00  0.000000e+00   
5      33302  0.000000e+00  0.000000e+00  1.000000e+00  0.000000e+00   
6      37190  0.000000e+00  0.000000e+00  1.000000e+00  0.000000e+00   
7      43051  2.216956e-07  9.999998e-01  8.821426e-25  0.000000e+00   
8      43373  0.000000e+00  1.000000e+00  0.000000e+00  0.000000e+00   
9      51487  0.000000e+00  1.000000e+00  0.000000e+00  0.000000e+00   
10     54573  0.000000e+00  1.000000e+00  5.470123e-18  0.000000e+00   
11     87367  3.828980e-10  7.078284e-22  1.000000e+00  2.773656e-19   
12    134848  4.482552e-01  4.734279e-05  5.627338e-25  5.516837