# Wide and Deep Network
In this tutorial, we are going to implement a [Wide and Deep Network](https://arxiv.org/abs/1606.07792) to solve a classification problem. A Wide and Deep Network combines a linear model with a feed-forward neural net so that our predictions benefit from both memorization and generalization. This type of model can be used for classification and regression problems, and it allows for less feature engineering while still giving relatively accurate predictions, so we get the best of both worlds.

<br>


![alt text](jpg/wide_and_deep_model.jpg "model image")
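
To make the combination concrete, here is a minimal pure-Python sketch of how a wide (linear) logit and a deep (feed-forward) logit can be added together and squashed by a sigmoid, following the joint formulation in the paper. The function names, weights, and feature values are all made up for illustration; this is not how the TensorFlow estimator is implemented internally.

```python
import math


def sigmoid(z):
    """Squash a logit into a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def wide_logit(x_wide, w_wide):
    """Linear part: a weighted sum over (typically sparse, crossed) features."""
    return sum(x * w for x, w in zip(x_wide, w_wide))


def deep_logit(x_deep, hidden_w, out_w):
    """Deep part: one ReLU hidden layer followed by a linear output."""
    hidden = [max(0.0, sum(x * w for x, w in zip(x_deep, row))) for row in hidden_w]
    return sum(h * w for h, w in zip(hidden, out_w))


def wide_and_deep_probability(x_wide, x_deep, w_wide, hidden_w, out_w, bias):
    """Sum both logits and a shared bias, then apply the sigmoid."""
    return sigmoid(wide_logit(x_wide, w_wide) + deep_logit(x_deep, hidden_w, out_w) + bias)


# Hypothetical inputs and weights, chosen only to exercise the functions.
probability = wide_and_deep_probability(
    x_wide=[1.0, 0.0, 1.0],
    x_deep=[0.5, 2.0],
    w_wide=[0.5, -0.2, 0.1],
    hidden_w=[[1.0, -1.0], [0.5, 0.5]],
    out_w=[0.3, 0.4],
    bias=-0.1,
)
print(probability)
```

Because both parts feed a single output, they are trained jointly rather than as two separate models whose predictions are averaged.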

## The Data
We are going to use the Titanic Kaggle data to predict whether or not a passenger will survive based on attributes such as name, gender, the ticket they had, the fare they paid, and the cabin they stayed in. For more information on this data set, see [Kaggle](https://www.kaggle.com/c/titanic/data).


First off we’re going to define all of our columns as Continuous or Categorical.

<b>Continuous columns</b>: any numerical value in a continuous range, such as an amount of money or an age.

<b>Categorical columns</b>: values drawn from a finite set, such as male or female, or the country someone is from.

In [1]:
import tensorflow as tf
import pandas as pd

In [2]:
combiner = 'sum'  # try other combiners here

In [3]:
CATEGORICAL_COLUMNS = ["Name", "Gender", "Embarked", "Cabin"]
CONTINUOUS_COLUMNS = ["Age", "SibSp", "Parch", "Fare", "PassengerId", "Pclass"]

Since we are only looking to see if a person survived, this is a binary classification problem. We predict a 1 if that person survives and a 0 if they do not :( We then create a constant holding the name of our label column, Survived.

In [4]:
SURVIVED_COLUMN = "Survived"

## The Network
Now we can get to creating the columns and adding embedding layers. When we build our model we’re going to want to change our categorical columns into sparse columns. For our columns with a small set of categories, such as Gender or Embarked (C, Q or S), we will transform them into sparse columns with keys.

In [5]:
gender = tf.contrib.layers.sparse_column_with_keys(
    column_name='Gender', keys=['female', 'male'], combiner=combiner)
# Port of Embarkation (C: Cherbourg; Q: Queenstown; S: Southampton)
embarked = tf.contrib.layers.sparse_column_with_keys(
    column_name='Embarked', keys=['C', 'Q', 'S'], combiner=combiner)

The other categorical columns have many more possible values than we want to list as keys, and since we don’t have a vocab file to map all of the possible categories to integers, we will hash them.

In [6]:
cabin = tf.contrib.layers.sparse_column_with_hash_bucket('Cabin', hash_bucket_size=1000, combiner=combiner)
name = tf.contrib.layers.sparse_column_with_hash_bucket('Name', hash_bucket_size=1000, combiner=combiner)
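
As a rough illustration of what hashing into buckets means, the sketch below maps each string to an integer id in the range [0, hash_bucket_size). It uses MD5 from the standard library purely as a stand-in; TensorFlow uses its own fingerprint function, so the actual ids will differ. The helper name `hash_bucket` and the sample cabin values are assumptions made for this example.

```python
import hashlib


def hash_bucket(value, hash_bucket_size):
    """Map a string to an integer id in [0, hash_bucket_size).

    MD5 is only a stand-in hash here, not the function TensorFlow uses.
    """
    digest = hashlib.md5(value.encode("utf-8")).hexdigest()
    return int(digest, 16) % hash_bucket_size


# Hypothetical cabin values; the same string always lands in the same bucket.
cabins = ["C85", "E46", "C85", "B28"]
cabin_ids = [hash_bucket(c, 1000) for c in cabins]
print(cabin_ids)
```

The trade-off is that two different strings can collide in the same bucket. A larger hash_bucket_size makes collisions rarer at the cost of more parameters.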

For our continuous columns we want to use their real values. Passenger ID is treated as continuous rather than categorical because it is already an integer ID and not a string.

In [7]:
age = tf.contrib.layers.real_valued_column("Age")
passenger_id = tf.contrib.layers.real_valued_column("PassengerId")
sib_sp = tf.contrib.layers.real_valued_column("SibSp")  # Number of Siblings/Spouses Aboard
parch = tf.contrib.layers.real_valued_column("Parch")   # Number of Parents/Children Aboard
fare = tf.contrib.layers.real_valued_column("Fare")     # Passenger Fare
p_class = tf.contrib.layers.real_valued_column("Pclass") # Passenger Class (1: 1st; 2: 2nd; 3: 3rd)

We are going to bucket the ages. Bucketization lets the model learn how survival correlates with particular age groups rather than with age as a single linear value, which can improve accuracy.

In [8]:
age_buckets = tf.contrib.layers.bucketized_column(age, boundaries=[5, 18, 25, 30, 35, 40, 45, 50, 55, 65])
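
To see what bucketization does to a single value, here is a small standard-library sketch that turns an age into a bucket index using the same boundaries. It assumes each bucket includes its left boundary and excludes its right one, so ten boundaries give eleven buckets. The helper names `age_bucket` and `one_hot` are made up for this example.

```python
import bisect

AGE_BOUNDARIES = [5, 18, 25, 30, 35, 40, 45, 50, 55, 65]


def age_bucket(age, boundaries=AGE_BOUNDARIES):
    """Return the index of the bucket that contains `age`.

    Bucket 0 is (-inf, 5), bucket 1 is [5, 18), ..., bucket 10 is [65, +inf).
    """
    return bisect.bisect_right(boundaries, age)


def one_hot(index, size):
    """Encode a bucket index as a one-hot list, as a linear model would see it."""
    return [1 if i == index else 0 for i in range(size)]


num_buckets = len(AGE_BOUNDARIES) + 1
for age in [4, 5, 22, 70]:
    bucket = age_bucket(age)
    print(age, bucket, one_hot(bucket, num_buckets))
```

Each bucket gets its own weight in the wide part of the model, so a 4-year-old and a 70-year-old are no longer forced onto the same straight line.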

Almost done: we are going to define our wide columns and our deep columns. The wide columns effectively memorize interactions between our features, but they don’t generalize, which is why we also have the deep columns.

In [9]:
wide_columns = [gender, embarked, cabin, name, age_buckets,
                  tf.contrib.layers.crossed_column([age_buckets, gender], hash_bucket_size=int(1e6), combiner=combiner),
                  tf.contrib.layers.crossed_column([embarked, name], hash_bucket_size=int(1e4), combiner=combiner)]
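
A crossed column can be thought of as a new categorical feature built from a combination of existing ones. The sketch below joins the component values into one key and hashes that key into a bucket. The separator string, the MD5 stand-in hash, and the helper name `cross_bucket` are assumptions for illustration and do not reproduce TensorFlow's internal hashing.

```python
import hashlib


def cross_bucket(values, hash_bucket_size):
    """Hash a combination of feature values into [0, hash_bucket_size).

    The "_X_" separator and MD5 are illustrative choices only.
    """
    key = "_X_".join(str(v) for v in values)
    digest = hashlib.md5(key.encode("utf-8")).hexdigest()
    return int(digest, 16) % hash_bucket_size


# Hypothetical (age bucket index, gender) pairs.
young_female = cross_bucket([2, "female"], int(1e6))
young_male = cross_bucket([2, "male"], int(1e6))
print(young_female, young_male)
```

Because each combination gets its own id, and therefore its own weight, the linear model can memorize that a particular age group and gender together behave differently from what either feature suggests alone.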

In [10]:
#wide_columns = [gender, embarked, p_class, cabin, name, age_buckets,
#                  tf.contrib.layers.crossed_column([p_class, cabin], hash_bucket_size=int(1e4), combiner='sum'),  # using this line raises an error
#                  tf.contrib.layers.crossed_column([age_buckets, gender], hash_bucket_size=int(1e6), combiner='sum'),
#                  tf.contrib.layers.crossed_column([embarked, name], hash_bucket_size=int(1e4), combiner='sum')]

The benefit of the deep columns is that they take our sparse, high-dimensional features and reduce them to dense, low-dimensional embeddings.

In [11]:
deep_columns = [age, passenger_id, sib_sp, parch, fare, p_class,
                  tf.contrib.layers.embedding_column(gender, dimension=8, combiner=combiner),
                  tf.contrib.layers.embedding_column(embarked, dimension=8, combiner=combiner),
                  tf.contrib.layers.embedding_column(cabin, dimension=8, combiner=combiner),
                  tf.contrib.layers.embedding_column(name, dimension=8, combiner=combiner)]
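
Conceptually, an embedding column is a lookup table with one small dense vector per category id, and the combiner decides how several ids in one example are merged. The sketch below shows that idea in plain Python with randomly initialised vectors. The helper names and the initial value range are assumptions, and in the real model the table entries are learned during training.

```python
import random


def make_embedding_table(num_ids, dimension, seed=0):
    """Create one small random vector per category id (learned in a real model)."""
    rng = random.Random(seed)
    return [[rng.uniform(-0.1, 0.1) for _ in range(dimension)] for _ in range(num_ids)]


def embed(ids, table, combiner="sum"):
    """Look up the vectors for `ids` and merge them with the given combiner."""
    if combiner not in ("sum", "mean"):
        raise ValueError("this sketch only supports 'sum' and 'mean'")
    dimension = len(table[0])
    combined = [0.0] * dimension
    for i in ids:
        for d in range(dimension):
            combined[d] += table[i][d]
    if combiner == "mean" and ids:
        combined = [value / len(ids) for value in combined]
    return combined


# Three Embarked categories (C, Q, S) mapped to 8-dimensional vectors.
embarked_table = make_embedding_table(num_ids=3, dimension=8)
single = embed([2], embarked_table)        # one category per passenger
merged = embed([0, 2], embarked_table)     # several ids merged by summing
print(len(single), len(merged))
```

In this data set every passenger has exactly one value per categorical column, so "sum" and "mean" give the same vector; the choice only matters when an example carries several ids for one feature.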

We finish off by creating our classifier with our deep columns and wide columns.

In [12]:
model = tf.contrib.learn.DNNLinearCombinedClassifier(linear_feature_columns=wide_columns,
                                                          dnn_feature_columns=deep_columns,
                                                          dnn_hidden_units=[100, 50],
                                                          enable_centered_bias=False,
                                                          model_dir='model/')



## Train and Evaluate the Model

The last thing we have to do before running the network is create mappings for our continuous and categorical columns. What we are doing here, and this is standard throughout the TensorFlow learn code, is creating an input function for our dataframe. It converts the dataframe into something TensorFlow can manipulate. The benefit is that we can change and tweak how our tensors are created. If we wanted, we could pass the features into .fit, .evaluate and .predict as individually created tensors, but an input function is a much cleaner solution.

In [13]:
def input_fn(df, train=False):
    """Input builder function."""
    # Creates a dictionary mapping from each continuous feature column name (k) to
    # the values of that column stored in a constant Tensor.
    continuous_cols = {k: tf.constant(df[k].values) for k in CONTINUOUS_COLUMNS}
    # Creates a dictionary mapping from each categorical feature column name (k)
    # to the values of that column stored in a tf.SparseTensor.
    categorical_cols = {k: tf.SparseTensor(
        indices=[[i, 0] for i in range(df[k].size)],
        values=df[k].values,
        shape=[df[k].size, 1]) for k in CATEGORICAL_COLUMNS}
    # Merges the two dictionaries into one.
    feature_cols = dict(continuous_cols)
    feature_cols.update(categorical_cols)
    # Converts the label column into a constant Tensor.
    if train:
        label = tf.constant(df[SURVIVED_COLUMN].values)
        # Returns the feature columns and the label.
        return feature_cols, label
    else:
        return feature_cols
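
The sparse representation built for each categorical column can be pictured without TensorFlow. The sketch below produces the same three pieces (indices, values, shape) as a plain dictionary for a column with one string per row. The helper name `to_sparse_column` and the sample names are made up, and a dictionary is only a stand-in for tf.SparseTensor.

```python
def to_sparse_column(values):
    """Describe a one-value-per-row column as indices, values and shape.

    Row i holds its single value at position [i, 0], so the overall
    shape is [number_of_rows, 1].
    """
    values = list(values)
    return {
        "indices": [[i, 0] for i in range(len(values))],
        "values": values,
        "shape": [len(values), 1],
    }


# Hypothetical Name column with three passengers.
sparse_names = to_sparse_column(["Braund", "Cumings", "Heikkinen"])
print(sparse_names["indices"])
print(sparse_names["shape"])
```

With exactly one value per row this layout is not actually sparse, but the sparse column types defined earlier expect their input in this form.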

Now, after all this, we can write our training function.

In [14]:
def train_and_eval(model):
    """Train and evaluate the model."""
    df_train = pd.read_csv(tf.gfile.Open("./data/train.csv"), skipinitialspace=True, engine='python')
    df_test = pd.read_csv(tf.gfile.Open("./data/test.csv"), skipinitialspace=True, engine='python')
    model.fit(input_fn=lambda: input_fn(df_train, True), steps=300)
    print(model.predict(input_fn=lambda: input_fn(df_test), as_iterable=True))
    results = model.evaluate(input_fn=lambda: input_fn(df_train, True), steps=1)
    for key in sorted(results):
        print("%s: %s" % (key, results[key]))

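We read in our CSV files, which were preprocessed for simplicity’s sake (for example, missing values were imputed).
The dataframes are converted to tensors by wrapping our input_fn in a lambda. We fit our estimator, then print our predictions and our evaluation results. Because predict is called with as_iterable=True, the first printed line is a generator object; wrap the call in list(...) to see the predicted classes. Note also that the evaluation runs on the training data, so the reported metrics are training metrics rather than held-out ones.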

In [15]:
config = tf.ConfigProto()
config.gpu_options.allow_growth = True   
with tf.Session(config=config) as sess:
    train_and_eval(model)



<generator object <genexpr> at 0x7f32bc29aeb0>
accuracy: 0.866442
accuracy/baseline_target_mean: 0.383838
accuracy/threshold_0.500000_mean: 0.866442
auc: 0.952788
global_step: 600
labels/actual_target_mean: 0.383838
labels/prediction_mean: 0.391919
loss: 0.316167
precision/positive_threshold_0.500000_mean: 0.830861
recall/positive_threshold_0.500000_mean: 0.818713
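
For readers unfamiliar with the thresholded metrics in the output, the sketch below computes accuracy, precision and recall at a 0.5 threshold from labels and predicted probabilities. The labels and probabilities are invented for the example and are not taken from the Titanic run. The function name `classification_metrics` is hypothetical.

```python
def classification_metrics(labels, probabilities, threshold=0.5):
    """Compute accuracy, precision and recall for binary predictions."""
    predictions = [1 if p > threshold else 0 for p in probabilities]
    pairs = list(zip(labels, predictions))
    tp = sum(1 for y, p in pairs if y == 1 and p == 1)
    fp = sum(1 for y, p in pairs if y == 0 and p == 1)
    fn = sum(1 for y, p in pairs if y == 1 and p == 0)
    correct = sum(1 for y, p in pairs if y == p)
    return {
        "accuracy": correct / float(len(pairs)),
        # Guard against division by zero when nothing is predicted or labelled positive.
        "precision": tp / float(tp + fp) if (tp + fp) else 0.0,
        "recall": tp / float(tp + fn) if (tp + fn) else 0.0,
    }


# Made-up labels and predicted survival probabilities.
metrics = classification_metrics(
    labels=[1, 0, 1, 1, 0, 0],
    probabilities=[0.9, 0.2, 0.4, 0.7, 0.6, 0.1],
)
for key in sorted(metrics):
    print("%s: %s" % (key, metrics[key]))
```

Precision answers "of the passengers we predicted would survive, how many did?", while recall answers "of the passengers who survived, how many did we find?".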
