##### Copyright 2019 The TensorFlow Authors.


In [1]:
#@title Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# https://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

TODO:

- fix model creation (preprocessing breaks tensor? https://tensorflow.google.cn/tutorials/load_data/csv)

- name all layers

- save model for loading afterwards (currently broken, might require named layers)

- make sure to mark unaltered cells and annotations as from the original tutorial


Maybe:

- make a preprocessing model where the preprocessing is done when DiSENN is initialized
(for portability)


# Classify structured data using Keras Preprocessing Layers

<table class="tfo-notebook-buttons" align="left">
  <td>
    <a target="_blank" href="https://www.tensorflow.org/tutorials/structured_data/preprocessing_layers">
    <img src="https://www.tensorflow.org/images/tf_logo_32px.png" />
    View on TensorFlow.org</a>
  </td>
  <td>
    <a target="_blank" href="https://colab.research.google.com/github/tensorflow/docs/blob/master/site/en/tutorials/structured_data/preprocessing_layers.ipynb">
    <img src="https://www.tensorflow.org/images/colab_logo_32px.png" />
    Run in Google Colab</a>
  </td>
  <td>
    <a target="_blank" href="https://github.com/tensorflow/docs/blob/master/site/en/tutorials/structured_data/preprocessing_layers.ipynb">
    <img src="https://www.tensorflow.org/images/GitHub-Mark-32px.png" />
    View source on GitHub</a>
  </td>
  <td>
    <a href="https://storage.googleapis.com/tensorflow_docs/docs/site/en/tutorials/structured_data/preprocessing_layers.ipynb"><img src="https://www.tensorflow.org/images/download_logo_32px.png" />Download notebook</a>
  </td>
</table>

This tutorial demonstrates how to classify structured data (e.g. tabular data in a CSV). You will use [Keras](https://www.tensorflow.org/guide/keras) to define the model, and [preprocessing layers](https://keras.io/guides/preprocessing_layers/) as a bridge to map from columns in a CSV to features used to train the model. This tutorial contains complete code to:

* Load a CSV file using [Pandas](https://pandas.pydata.org/).
* Build an input pipeline to batch and shuffle the rows using [tf.data](https://www.tensorflow.org/guide/datasets).
* Map from columns in the CSV to features used to train the model using Keras Preprocessing layers.
* Build, train, and evaluate a model using Keras.

Note: This tutorial is similar to [Classify structured data with feature columns](https://www.tensorflow.org/tutorials/structured_data/feature_columns). This version uses new experimental Keras [Preprocessing Layers](https://www.tensorflow.org/api_docs/python/tf/keras/layers/experimental/preprocessing) instead of `tf.feature_column`. Keras Preprocessing Layers are more intuitive, and can be easily included inside your model to simplify deployment.

## The Dataset

You will use a simplified version of the PetFinder [dataset](https://www.kaggle.com/c/petfinder-adoption-prediction). There are several thousand rows in the CSV. Each row describes a pet, and each column describes an attribute. You will use this information to predict if the pet will be adopted.

Following is a description of this dataset. Notice there are both numeric and categorical columns. There is a free text column which you will not use in this tutorial.

Column | Description| Feature Type | Data Type
------------|--------------------|----------------------|-----------------
Type | Type of animal (Dog, Cat) | Categorical | string
Age |  Age of the pet | Numerical | integer
Breed1 | Primary breed of the pet | Categorical | string
Gender | Gender of the pet | Categorical | string
Color1 | Color 1 of pet | Categorical | string
Color2 | Color 2 of pet | Categorical | string
MaturitySize | Size at maturity | Categorical | string
FurLength | Fur length | Categorical | string
Vaccinated | Pet has been vaccinated | Categorical | string
Sterilized | Pet has been sterilized | Categorical | string
Health | Health Condition | Categorical | string
Fee | Adoption Fee | Numerical | integer
Description | Profile write-up for this pet | Text | string
PhotoAmt | Total uploaded photos for this pet | Numerical | integer
AdoptionSpeed | Speed of adoption | Classification | integer

## Install and Import necessary libraries


In [2]:
!pip install scikit-learn
!pip install numpy
!pip install pandas
!pip install tensorflow
!pip install pydot
!pip install pydotplus
!pip install graphviz
!pip install packaging
!pip install keras





To plot the model graph, Graphviz also has to be installed on the system: https://graphviz.gitlab.io/download/
This installation guide may help: https://bobswift.atlassian.net/wiki/spaces/GVIZ/pages/131924165/Graphviz+installation

In [3]:
import numpy as np
import pandas as pd
import tensorflow as tf
from tensorflow.keras import layers
from tensorflow import keras
from sklearn.model_selection import train_test_split
from tensorflow.keras.layers.experimental import preprocessing
from datetime import datetime
import tensorboard

## Reading in the data

The data is read into a pandas DataFrame.

As the real data is sensitive, large, and expensive to use,
for now I use a dummy dataset about adoption speed.

In [4]:
dataset_url = 'http://storage.googleapis.com/download.tensorflow.org/data/petfinder-mini.zip'
csv_file = 'datasets/petfinder-mini/petfinder-mini.csv'

tf.keras.utils.get_file('petfinder_mini.zip', dataset_url,
                        extract=True, cache_dir='.')
dataframe = pd.read_csv(csv_file)

In [5]:
dataframe.head()

Unnamed: 0,Type,Age,Breed1,Gender,Color1,Color2,MaturitySize,FurLength,Vaccinated,Sterilized,Health,Fee,Description,PhotoAmt,AdoptionSpeed
0,Cat,3,Tabby,Male,Black,White,Small,Short,No,No,Healthy,100,Nibble is a 3+ month old ball of cuteness. He ...,1,2
1,Cat,1,Domestic Medium Hair,Male,Black,Brown,Medium,Medium,Not Sure,Not Sure,Healthy,0,I just found it alone yesterday near my apartm...,2,0
2,Dog,1,Mixed Breed,Male,Brown,White,Medium,Medium,Yes,No,Healthy,0,Their pregnant mother was dumped by her irresp...,7,3
3,Dog,4,Mixed Breed,Female,Black,Brown,Medium,Short,Yes,No,Healthy,150,"Good guard dog, very alert, active, obedience ...",8,2
4,Dog,1,Mixed Breed,Male,Black,No Color,Medium,Short,No,No,Healthy,0,This handsome yet cute boy is up for adoption....,3,2


## Create target variable

I have to select the target variable to train for and drop the columns that are either unimportant or already contain that information.

Valid for the example data:
The task in the Kaggle competition is to predict the speed at which a pet will be adopted (e.g., in the first week, the first month, the first three months, and so on). Let's simplify this for our tutorial. Here, you will transform this into a binary classification problem, and simply predict whether the pet was adopted, or not.

After modifying the label column, 0 will indicate the pet was not adopted, and 1 will indicate it was.

In [6]:
# In the original dataset "4" indicates the pet was not adopted.
dataframe['target'] = np.where(dataframe['AdoptionSpeed']==4, 0, 1)

# Drop un-used columns.
dataframe = dataframe.drop(columns=['AdoptionSpeed', 'Description'])

## Split the dataframe into train, validation, and test

The loaded dataset was a single file. It has to be split into train, validation, and test sets.

In [7]:
train, test = train_test_split(dataframe, test_size=0.2)
train, val = train_test_split(train, test_size=0.2)
print(len(train), 'train examples')
print(len(val), 'validation examples')
print(len(test), 'test examples')

7383 train examples
1846 validation examples
2308 test examples
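
The two calls to `train_test_split` above first hold out 20% of the rows for testing and then 20% of the remainder for validation, which works out to roughly 64% / 16% / 20%. A minimal sketch of that arithmetic in plain Python, assuming the held-out size is rounded up (an assumption that reproduces the counts printed above); `split_sizes` is a hypothetical helper, not part of scikit-learn:

```python
import math

def split_sizes(n_rows, test_fraction=0.2, val_fraction=0.2):
    """Return (train, val, test) sizes for two successive hold-out splits."""
    # First split: hold out the test set from all rows.
    n_test = math.ceil(test_fraction * n_rows)
    remaining = n_rows - n_test
    # Second split: hold out the validation set from what is left.
    n_val = math.ceil(val_fraction * remaining)
    n_train = remaining - n_val
    return n_train, n_val, n_test

# Total number of rows = sum of the three counts printed above.
print(split_sizes(7383 + 1846 + 2308))
```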


## Create an input pipeline using tf.data

The dataframes are wrapped with [tf.data](https://www.tensorflow.org/guide/datasets).
This makes it easy to shuffle and batch the data.

If the RAM is not sufficient, tf.data could be used directly to read it from disk in batches.
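
A minimal sketch of reading rows in fixed-size batches with only the standard library, to illustrate the idea of streaming from disk instead of loading everything into a dataframe. This is not the `tf.data` API itself; `iter_csv_batches` is a hypothetical helper, and the in-memory `io.StringIO` stands in for a real file:

```python
import csv
import io
from itertools import islice

def iter_csv_batches(file_obj, batch_size):
    """Yield lists of row dicts, at most `batch_size` rows at a time."""
    reader = csv.DictReader(file_obj)
    while True:
        # Pull the next `batch_size` rows without reading the whole file.
        batch = list(islice(reader, batch_size))
        if not batch:
            return
        yield batch

# Stand-in for an open CSV file on disk.
demo_csv = io.StringIO("Type,Age\nCat,3\nCat,1\nDog,1\nDog,4\nDog,1\n")
batches = list(iter_csv_batches(demo_csv, batch_size=2))
print([len(b) for b in batches])
```

Note that `csv.DictReader` yields every value as a string, so numeric columns would still need to be converted before use.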

In [8]:
# A utility method to create a tf.data dataset from a Pandas Dataframe
def df_to_dataset(dataframe, shuffle=True, batch_size=32):
  dataframe = dataframe.copy()
  labels = dataframe.pop('target')
  ds = tf.data.Dataset.from_tensor_slices((dict(dataframe), labels))
  if shuffle:
    ds = ds.shuffle(buffer_size=len(dataframe))
  ds = ds.batch(batch_size)
  ds = ds.prefetch(batch_size)
  return ds

The general pipeline for input is finished here.
What does it look like?

In [9]:
batch_size = 5
train_ds = df_to_dataset(train, batch_size=batch_size)

In [10]:
[(train_features, label_batch)] = train_ds.take(1)
print('Every feature:', list(train_features.keys()))
print('A batch of ages:', train_features['Age'])
print('A batch of targets:', label_batch )

# TODO currently this is targeted towards the dummy set

Every feature: ['Type', 'Age', 'Breed1', 'Gender', 'Color1', 'Color2', 'MaturitySize', 'FurLength', 'Vaccinated', 'Sterilized', 'Health', 'Fee', 'PhotoAmt']
A batch of ages: tf.Tensor([32  6  3  2 36], shape=(5,), dtype=int64)
A batch of targets: tf.Tensor([1 1 1 1 1], shape=(5,), dtype=int32)


The dataset returns a dictionary of column names (from the dataframe) that map to column values from rows in the dataframe.
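
To make that structure concrete, here is a plain-Python sketch of what one batch looks like: a dictionary that maps each column name to the values of that column for the rows in the batch, paired with the labels. It imitates the shape of the data only, not `tf.data`; `make_batches` and the toy rows are made up for illustration:

```python
import random

def make_batches(rows, label_key, batch_size, shuffle=True, seed=0):
    """Yield (features, labels); features maps column name -> list of values."""
    rows = list(rows)
    if shuffle:
        random.Random(seed).shuffle(rows)
    for start in range(0, len(rows), batch_size):
        chunk = rows[start:start + batch_size]
        columns = [key for key in chunk[0] if key != label_key]
        features = {col: [row[col] for row in chunk] for col in columns}
        labels = [row[label_key] for row in chunk]
        yield features, labels

toy_rows = [
    {"Type": "Cat", "Age": 3, "target": 1},
    {"Type": "Dog", "Age": 1, "target": 1},
    {"Type": "Dog", "Age": 4, "target": 0},
]
features, labels = next(make_batches(toy_rows, "target", batch_size=2, shuffle=False))
print(features)  # column name -> batch of values
print(labels)
```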

## Demonstrate the use of preprocessing layers.

I will have to adapt the pipelines when I replace the dummy code, but afterwards I will also be able to feed in raw data (plain strings etc.) from new files.

Information about the preprocessing layers, for easy access later:

*   [`Normalization`](https://www.tensorflow.org/api_docs/python/tf/keras/layers/experimental/preprocessing/Normalization) - Feature-wise normalization of the data.
*   [`CategoryEncoding`](https://www.tensorflow.org/api_docs/python/tf/keras/layers/experimental/preprocessing/CategoryEncoding) - Category encoding layer.
*   [`StringLookup`](https://www.tensorflow.org/api_docs/python/tf/keras/layers/experimental/preprocessing/StringLookup) - Maps strings from a vocabulary to integer indices.
*   [`IntegerLookup`](https://www.tensorflow.org/api_docs/python/tf/keras/layers/experimental/preprocessing/IntegerLookup) - Maps integers from a vocabulary to integer indices.

A list of available preprocessing layers can be found [here](https://www.tensorflow.org/api_docs/python/tf/keras/layers/experimental/preprocessing).

### Numeric columns
A Normalization() layer ensures that each numeric feature has a mean of 0 and a standard deviation of 1.

The `get_normalization_layer` function returns a keras layer.
It applies featurewise normalization to numerical features.

In [11]:
def get_normalization_layer(name, dataset):
  # Create a Normalization layer for our feature.
  normalizer = preprocessing.Normalization()

  # Prepare a Dataset that only yields our feature.
  feature_ds = dataset.map(lambda x, y: x[name])

  # Learn the statistics of the data.
  normalizer.adapt(feature_ds)

  return normalizer

In [12]:
photo_count_col = train_features['PhotoAmt']
layer = get_normalization_layer('PhotoAmt', train_ds)
layer(photo_count_col)

<tf.Tensor: shape=(5, 1), dtype=float32, numpy=
array([[-0.50925547],
       [ 0.4294512 ],
       [ 1.0552557 ],
       [ 0.4294512 ],
       [-0.50925547]], dtype=float32)>

Note: If you have many numeric features (hundreds, or more), it is more efficient to concatenate them first and use a single [normalization](https://www.tensorflow.org/api_docs/python/tf/keras/layers/experimental/preprocessing/Normalization) layer.
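
As a rough illustration of what the normalization computes, the sketch below standardizes a column by subtracting the mean and dividing by the standard deviation. It assumes the population standard deviation and is a plain-Python stand-in, not the Keras implementation; the input is the `PhotoAmt` column of the five rows shown by `dataframe.head()`:

```python
import statistics

def standardize(values):
    """Shift to mean 0 and scale to standard deviation 1."""
    mean = statistics.mean(values)
    std = statistics.pstdev(values)  # population standard deviation (assumption)
    return [(v - mean) / std for v in values]

photo_counts = [1, 2, 7, 8, 3]
normalized = standardize(photo_counts)
print([round(v, 3) for v in normalized])
```

The layer in this notebook learns its mean and variance from the whole training set during `adapt`, so its outputs for these rows differ from the sketch, which only sees five values.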

### Categorical columns

In the dummy dataset, Type is represented as a string (e.g. 'Dog' or 'Cat'). Strings cannot be fed directly to a model, so the preprocessing layer takes care of representing them as one-hot vectors.

The `get_category_encoding_layer` function returns a layer that maps values from a vocabulary to integer indices and one-hot encodes the features.

In [13]:
def get_category_encoding_layer(name, dataset, dtype, max_tokens=None):
  # Create a StringLookup layer which will turn strings into integer indices
  if dtype == 'string':
    index = preprocessing.StringLookup(max_tokens=max_tokens)
  else:
    index = preprocessing.IntegerLookup(max_values=max_tokens)

  # Prepare a Dataset that only yields our feature
  feature_ds = dataset.map(lambda x, y: x[name])

  # Learn the set of possible values and assign them a fixed integer index.
  index.adapt(feature_ds)

  # Create a Discretization for our integer indices.
  encoder = preprocessing.CategoryEncoding(max_tokens=index.vocab_size())

  # Prepare a Dataset that only yields our feature.
  feature_ds = feature_ds.map(index)

  # Learn the space of possible indices.
  encoder.adapt(feature_ds)

  # Apply one-hot encoding to our indices. The lambda function captures the
  # layer so we can use them, or include them in the functional model later.
  return lambda feature: encoder(index(feature))

In [14]:
type_col = train_features['Type']
layer = get_category_encoding_layer('Type', train_ds, 'string')
layer(type_col)

<tf.Tensor: shape=(5, 4), dtype=float32, numpy=
array([[0., 0., 1., 0.],
       [0., 0., 0., 1.],
       [0., 0., 0., 1.],
       [0., 0., 0., 1.],
       [0., 0., 1., 0.]], dtype=float32)>

Often, you don't want to feed a number directly into the model, but instead use a one-hot encoding of those inputs. Consider raw data that represents a pet's age.

In [15]:
type_col = train_features['Age']
category_encoding_layer = get_category_encoding_layer('Age', train_ds,
                                                      'int64', 5)
category_encoding_layer(type_col)

<tf.Tensor: shape=(5, 5), dtype=float32, numpy=
array([[0., 1., 0., 0., 0.],
       [0., 1., 0., 0., 0.],
       [0., 0., 0., 1., 0.],
       [0., 0., 1., 0., 0.],
       [0., 1., 0., 0., 0.]], dtype=float32)>
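
The lookup-then-encode pattern used for `Type` and `Age` above can be sketched in plain Python. This is an illustration only, not the Keras layers. It assumes that the first two indices are reserved (for example for padding and out-of-vocabulary values), which is one way to arrive at a width of 4 for the two-valued `Type` column. `build_vocab` and `one_hot` are hypothetical helpers, and the index order here follows first appearance rather than whatever order the real layer chooses:

```python
def build_vocab(values, num_reserved=2):
    """Map each distinct value to an index, after `num_reserved` reserved slots."""
    vocab = {}
    for value in values:
        if value not in vocab:
            vocab[value] = num_reserved + len(vocab)
    return vocab

def one_hot(value, vocab, num_reserved=2, oov_index=1):
    """Return a one-hot list; unknown values fall into the OOV slot."""
    width = num_reserved + len(vocab)
    encoded = [0.0] * width
    encoded[vocab.get(value, oov_index)] = 1.0
    return encoded

vocab = build_vocab(["Dog", "Cat", "Cat", "Dog"])
print(one_hot("Dog", vocab))
print(one_hot("Cat", vocab))
print(one_hot("Rabbit", vocab))  # not in the vocabulary
```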

## Choose and prepare columns to use

While we can deal with all types of data, we have to make a list of the columns of each type.\
That way I can define how each column needs to be treated.

In [16]:
batch_size = 256
train_ds = df_to_dataset(train, batch_size=batch_size)
val_ds = df_to_dataset(val, shuffle=False, batch_size=batch_size)
test_ds = df_to_dataset(test, shuffle=False, batch_size=batch_size)

In [17]:
all_inputs = []
encoded_features = []

# Numeric features.
for header in ['PhotoAmt', 'Fee']:  # TODO use all headers in UMC set minus the ones I know are something else
  numeric_col = tf.keras.Input(shape=(1,), name=header)
  normalization_layer = get_normalization_layer(header, train_ds)
  encoded_numeric_col = normalization_layer(numeric_col)
  all_inputs.append(numeric_col)
  encoded_features.append(encoded_numeric_col)

In [18]:
# Categorical features encoded as integers.

# TODO in the UMC data this will be more common, as some tests have a categorical scale.
# However, most of them can be interpreted as normal numerical features, so I won't have to overdo it.
age_col = tf.keras.Input(shape=(1,), name='Age', dtype='int64')
encoding_layer = get_category_encoding_layer('Age', train_ds, dtype='int64',
                                             max_tokens=5)
encoded_age_col = encoding_layer(age_col)
all_inputs.append(age_col)
encoded_features.append(encoded_age_col)

In [19]:
# Categorical features encoded as string.
categorical_cols = ['Type', 'Color1', 'Color2', 'Gender', 'MaturitySize',
                    'FurLength', 'Vaccinated', 'Sterilized', 'Health', 'Breed1'] 
# TODO replace this by reading the headings from the dataframe

for header in categorical_cols:
  categorical_col = tf.keras.Input(shape=(1,), name=header, dtype='string')
  encoding_layer = get_category_encoding_layer(header, train_ds, dtype='string',
                                               max_tokens=5)  # TODO maybe this line has to be duplicated and slightly changed to accommodate different max_tokens
  encoded_categorical_col = encoding_layer(categorical_col)
  all_inputs.append(categorical_col)
  encoded_features.append(encoded_categorical_col)


In [20]:
# Currently I do not think the UMC data needs to be balanced.
# It will be evaluated on the same dataset (though a different part of it)
# We do not have a large number of samples that are underrepresented, probably causing large inaccura

#use:
#    https://www.tensorflow.org/tutorials/structured_data/imbalanced_data

## Create, compile, and train the model


In [38]:
# The first step towards a working model
# is our preprocessed input.
# As that is a relatively complex task, it is treated as its own model.

preprocessed_layers = layers.Concatenate()(encoded_features)
preprocesessing_model = tf.keras.Model(all_inputs, preprocessed_layers)
preprocesessing_model.summary()

Model: "functional_3"
__________________________________________________________________________________________________
Layer (type)                    Output Shape         Param #     Connected to                     
Age (InputLayer)                [(None, 1)]          0                                            
__________________________________________________________________________________________________
Type (InputLayer)               [(None, 1)]          0                                            
__________________________________________________________________________________________________
Color1 (InputLayer)             [(None, 1)]          0                                            
__________________________________________________________________________________________________
Color2 (InputLayer)             [(None, 1)]          0                                            
_______________________________________________________________________________________

In [22]:
# The main model will contain 2 more sub-models next to the preprocessing one.
# One of them represents different concepts, the other one how important a certain concept is,
# depending on the input.

# As both models create a separate output, to create a prediction, their outputs have to be combined.

# A simple way to aggregate the output is to just multiply the concepts with their weights

# While the name is *SUM*-aggregator, it does not actually sum right now #TODO rename it
# The summing is done in the main model, as the (in the usual sense premature) output of this function helps interpret the model

class SumAggregator:
    def __init__(self, num_classes, **kwargs):
        """Basic Aggregator that joins the concepts and relevances by summing their products.
        -> weights every concept with its relevance the output is the sum
        """
        super().__init__()
        self.num_classes = num_classes

    @staticmethod
    def forward( concepts, relevances):
        """Forward pass of Sum Aggregator.

        Aggregates concepts and relevances and returns the predictions for each class.

        Parameters  # TODO the shapes below are carried over from the PyTorch version and still need checking
        ----------
        concepts : tf.Tensor
            Contains the output of the conceptizer with shape (BATCH, NUM_CONCEPTS, DIM_CONCEPT=1).
        relevances : tf.Tensor
            Contains the output of the parameterizer with shape (BATCH, NUM_CONCEPTS, NUM_CLASSES).

        Returns
        -------
        class_predictions : tf.Tensor
            Predictions for each class. Shape - (BATCH, NUM_CLASSES)
            
        """
        #permuted = tf.transpose(relevances, perm=[0, 2])  # so that the number of concepts is at the end
        #batch_matrix_matrix_product = tf.matmul(permuted, concepts)  # multiply all relevance scores
        #       with their corresponding concepts activation
        #aggregated = tf.squeeze(batch_matrix_matrix_product)  # squeeze(-1)  # remove the number of concepts
        aggregated = tf.math.multiply(concepts, relevances)
        
        return tf.nn.log_softmax(aggregated)
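
For reference, a plain-Python sketch of the aggregation idea described in the comments above: each concept activation is weighted by its relevance, and the prediction is the sum of those products. It assumes one scalar relevance per concept and leaves out the `log_softmax` that the class applies; `weight_concepts` and the numbers are made up for illustration:

```python
def weight_concepts(concepts, relevances):
    """Element-wise product: the per-concept contribution to the prediction."""
    if len(concepts) != len(relevances):
        raise ValueError("concepts and relevances must have the same length")
    return [c * r for c, r in zip(concepts, relevances)]

concepts = [0.5, -1.0, 2.0]    # made-up concept activations
relevances = [2.0, 0.5, 0.25]  # made-up relevance scores
contributions = weight_concepts(concepts, relevances)
prediction = sum(contributions)
print(contributions)  # which concept pushed the prediction in which direction
print(prediction)
```

Keeping the per-concept products separate before summing is what makes the output interpretable: each entry shows how much a single concept contributed.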


In [23]:
# Within the model's structure, there are repetitive patterns.

# For readability, those layers are combined into custom layers:

#The first one is a typical combination found in the conceptizer
class ConceptizerLayer(layers.Layer):
    
    def __init__(self, out_shape):
        super(ConceptizerLayer, self).__init__()
        #self.inp = layers.Input(shape=(13,))
        self.lin = layers.Dense(out_shape, activation='linear')
        self.relu = layers.Dense(out_shape, activation='relu')
        
    def call(self, input_tensor,  training=False):
        #x = self.inp(input_tensor)
        x = self.lin(input_tensor)
        x = self.relu(x)
        return x


In [39]:
# The conceptizer is a submodel of our network.
# The idea is that it learns which combinations of inputs are relevant together.

class Conceptizer(layers.Layer):
    
    def __init__(self, start_dim):
        super(Conceptizer, self).__init__()
        self.con0 = ConceptizerLayer(start_dim)
        #self.flat = keras.layers.Flatten()
        # Integer division keeps the layer widths integral.
        self.con1 = ConceptizerLayer(start_dim // 2)
        self.con2 = ConceptizerLayer(start_dim // 4)
        self.lin = layers.Dense(start_dim // 4, activation='linear')

        
    
    def call(self, input_tensor,  training=False):
        x = self.con0(input_tensor)
        #x = self.flat(x)
        x = self.con1(x)
        x = self.con2(x)
        x = self.lin(x)
        return x


In [40]:
# A combination of layers, common in the parameterizer

class ParameterizerLayer(layers.Layer):
    
    def __init__(self, out_shape, dropout_rate):
        super(ParameterizerLayer, self).__init__()       

        self.para_lin = layers.Dense(out_shape, activation='linear')
        self.para_drop = layers.Dropout(dropout_rate)
        self.para_relu = layers.Dense(out_shape, activation='relu')
        
    
    def call(self, input_tensor,  training=False):
        x = self.para_lin(input_tensor)
        # Dropout is a no-op when training is False, so no explicit branch is needed.
        x = self.para_drop(x, training=training)
        x = self.para_relu(x)
        return x

In [41]:
# The purpose of this sub-model is to assign weights to the conceptizer's output.
# One could say that every value in its output is a measure of how important a certain concept is.
# Here a concept is one output node of the conceptizer;
# the parameterizer's output at position x relates to the conceptizer's output at that same index (x).

class Parameterizer(layers.Layer):
    
    def __init__(self, hidden_sizes, out_shape, dropout_rate):
        super(Parameterizer, self).__init__()
        self.para0 = ParameterizerLayer(hidden_sizes[0], dropout_rate)
        self.para1 = ParameterizerLayer(hidden_sizes[1], dropout_rate)
        self.para2 = ParameterizerLayer(hidden_sizes[2], dropout_rate)
        self.para3 = ParameterizerLayer(hidden_sizes[3], dropout_rate)
        self.para_lin = layers.Dense(out_shape, activation='linear')
        self.para_drop = layers.Dropout(dropout_rate)

        
    
    def call(self, input_tensor,  training=False):
        x = self.para0(input_tensor)
        x = self.para1(x)
        x = self.para2(x)
        x = self.para3(x)
        x = self.para_lin(x)
        # Dropout is a no-op when training is False, so no explicit branch is needed.
        x = self.para_drop(x, training=training)
        return x

In [43]:
# As this model is non-standard in many ways, a custom loss function is necessary.
# As we would like to track and reference those values outside of the model, they are defined here.
# TODO is that actually necessary?

#TODO make them the actually matching loss and metric!
loss_tracker = keras.metrics.Mean(name="loss")
mae_metric = keras.metrics.MeanAbsoluteError(name="mae")

In [44]:
# This class stores the network itself

# Basically: the input is preprocessed into normalized scalars and fed into
# a) the conceptizer, to learn which features go together to form underlying structures
# b) the parameterizer, to learn which concept is important in what situation.
# a and b are then combined by the aggregator.
# The aggregator's output is summed up to produce the prediction.

#TODO replace the single neuron at the end by a sum function.

class DiSENN(tf.keras.Model):

    def __init__(self, hidden_sizes, preprocessing_model, dropout_rate):
        
        #general (superclass) model constructor
        super(DiSENN, self).__init__()
        
        #preprocess the input
        self.input_layers = preprocessing_model
        
        # the main model that learns the concepts
        input_features = 512
        self.conceptizer = Conceptizer(input_features)  # out: input_features / 4

        
        # The model that gives a weight per feature (parameterizer)
        out_shape = input_features // 4  # integer division: Dense expects an integer number of units
        self.parameterizer = Parameterizer(hidden_sizes, out_shape, dropout_rate)  # out: out_shape

        
        # The way to combine the activation and weights (aggregator)
        num_classes = 13 #TODO update
        self.aggregator = SumAggregator(num_classes)
        
        self.finalizer = layers.Dense(1, activation='linear')
        
        
    def call(self, input_tensor, training=False):
        
        pre_processed = self.input_layers(input_tensor) 
        
        concepts = self.conceptizer(pre_processed)
        
        relevances = self.parameterizer(pre_processed)
        
        #aggregate the output
        aggr_output = self.aggregator.forward(concepts, relevances)
        
        finalized = self.finalizer(aggr_output)
        return finalized
    
    # customize the train function, base code from:
    # https://www.tensorflow.org/guide/keras/customizing_what_happens_in_fit
    
    def train_step(self, data):
        # Unpack the data. Its structure depends on your model and
        # on what you pass to `fit()`.
        if len(data) == 3:
            x, y, sample_weight = data
            print("Warning: sample weight is not currently supported!")
        else:
            x, y = data

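        with tf.GradientTape() as tape:
            y_pred = self(x, training=True)  # Forward pass
            # Compute the loss value. The loss is hard-coded here instead of
            # being configured in `compile()`. The final Dense layer is linear,
            # so `y_pred` holds logits and `from_logits=True` is required.
            y_true = tf.reshape(tf.cast(y, y_pred.dtype), tf.shape(y_pred))
            loss = tf.reduce_mean(
                keras.losses.binary_crossentropy(y_true, y_pred, from_logits=True))

        # Compute gradients
        trainable_vars = self.trainable_variables
        gradients = tape.gradient(loss, trainable_vars)

        # Update weights
        self.optimizer.apply_gradients(zip(gradients, trainable_vars))

        # Update metrics. The module-level trackers are the ones returned by
        # the `metrics` property, so they are updated explicitly here.
        loss_tracker.update_state(loss)
        mae_metric.update_state(y_true, tf.nn.sigmoid(y_pred))
        self.compiled_metrics.update_state(y, y_pred)  # TODO support sample_weight=sample_weight

        # Return a dict mapping metric names to current value
        return {m.name: m.result() for m in self.metrics}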
    
    @property
    def metrics(self):
        # We list our `Metric` objects here so that `reset_states()` can be
        # called automatically at the start of each epoch
        # or at the start of `evaluate()`.
        # If you don't implement this property, you have to call
        # `reset_states()` yourself at the time of your choosing.
        return [loss_tracker, mae_metric]

In [45]:
# The network object has to be initialized before use. The hidden_sizes determine how big the network is going to be.
# hidden_sizes must be exactly 4 elements long; each number gives the size of a hidden layer in the parameterizer.
hidden_sizes = [11, 5, 5, 22]  # TODO fix this to actual size
model = DiSENN(hidden_sizes, preprocesessing_model, dropout_rate=0.1)     

model.compile(optimizer='adam',
      #loss=tf.keras.losses.BinaryCrossentropy(from_logits=False),  # used to be from_logits=True
      metrics=["accuracy"])

Let's visualize our connectivity graph:


In [46]:
# Define the Keras TensorBoard callback, used for the animated, interactive TensorBoard visualization
logdir="logs/fit/" + datetime.now().strftime("%Y%m%d-%H%M%S")
tensorboard_callback = tf.keras.callbacks.TensorBoard(log_dir=logdir)

#This should plot the exhaustive graph, but is a bit unreliable
tf.keras.utils.plot_model(model, show_shapes=True, rankdir="LR")

('Failed to import pydot. You must `pip install pydot` and install graphviz (https://graphviz.gitlab.io/download/), ', 'for `pydotprint` to work.')


### Train the model


In [47]:
model.fit(train_ds, epochs=10, validation_data=val_ds, callbacks=[tensorboard_callback])

# For this to work properly, I need to define my own loss function like here:
# https://www.tensorflow.org/guide/keras/custom_layers_and_models

Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10


<tensorflow.python.keras.callbacks.History at 0x23f6eef97c0>

In [48]:
loss, accuracy = model.evaluate(test_ds)
print("Accuracy", accuracy)

Accuracy 0.0


In [50]:
#visualize model in an interactive way

%reload_ext tensorboard
# rankdir='LR' is used to make the graph horizontal.
#tf.keras.utils.plot_model(model, show_shapes=True, rankdir="LR")
%tensorboard --logdir logs

Reusing TensorBoard on port 6006 (pid 21468), started 6:24:42 ago. (Use '!kill 21468' to kill it.)

## Inference on new data

Because the model contains its own preprocessing, it should work on any file in the right format.


The model should be saved such that it can just be reloaded later.\
I will follow the tutorial [here](https://www.tensorflow.org/tutorials/keras/save_and_load)

In [51]:
model.save('my_pet_classifier')
reloaded_model = tf.keras.models.load_model('my_pet_classifier')

INFO:tensorflow:Assets written to: my_pet_classifier\assets


To get a prediction for a new sample, you can simply call `model.predict()`. There are just two things you need to do:

1.   Wrap scalars into a list so as to have a batch dimension (models only process batches of data, not single samples)
2.   Call `convert_to_tensor` on each feature
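
A small sketch of the first step, and of turning a raw model output into a probability, using only the standard library. It assumes the model outputs a logit (the cell below applies a sigmoid for that reason); `add_batch_dim` and `sigmoid` are hypothetical helpers, and the tensor conversion itself is left to `tf.convert_to_tensor`:

```python
import math

def add_batch_dim(sample):
    """Wrap every scalar in a list so each feature has a batch dimension of 1."""
    return {name: [value] for name, value in sample.items()}

def sigmoid(logit):
    """Map a logit to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-logit))

batched = add_batch_dim({"Type": "Cat", "Age": 3})
print(batched)
print(sigmoid(0.0))
```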

In [53]:
# TODO this does not work

sample = {
    'Type': 'Cat',
    'Age': 3,
    'Breed1': 'Tabby',
    'Gender': 'Male',
    'Color1': 'Black',
    'Color2': 'White',
    'MaturitySize': 'Small',
    'FurLength': 'Short',
    'Vaccinated': 'No',
    'Sterilized': 'No',
    'Health': 'Healthy',
    'Fee': 100,
    'PhotoAmt': 2,
}

input_dict = {name: tf.convert_to_tensor([value]) for name, value in sample.items()}
predictions = reloaded_model.predict(input_dict)
prob = tf.nn.sigmoid(predictions[0])

print(
    "This particular pet had a %.1f percent probability "
    "of getting adopted." % (100 * prob)
)

ValueError: in user code:

    c:\users\deisl\desktop\thesis\adaptation\senn\cudaenv\lib\site-packages\tensorflow\python\keras\engine\training.py:1462 predict_function  *
        return step_function(self, iterator)
    c:\users\deisl\desktop\thesis\adaptation\senn\cudaenv\lib\site-packages\tensorflow\python\keras\engine\training.py:1452 step_function  **
        outputs = model.distribute_strategy.run(run_step, args=(data,))
    c:\users\deisl\desktop\thesis\adaptation\senn\cudaenv\lib\site-packages\tensorflow\python\distribute\distribute_lib.py:1211 run
        return self._extended.call_for_each_replica(fn, args=args, kwargs=kwargs)
    c:\users\deisl\desktop\thesis\adaptation\senn\cudaenv\lib\site-packages\tensorflow\python\distribute\distribute_lib.py:2585 call_for_each_replica
        return self._call_for_each_replica(fn, args, kwargs)
    c:\users\deisl\desktop\thesis\adaptation\senn\cudaenv\lib\site-packages\tensorflow\python\distribute\distribute_lib.py:2945 _call_for_each_replica
        return fn(*args, **kwargs)
    c:\users\deisl\desktop\thesis\adaptation\senn\cudaenv\lib\site-packages\tensorflow\python\keras\engine\training.py:1445 run_step  **
        outputs = model.predict_step(data)
    c:\users\deisl\desktop\thesis\adaptation\senn\cudaenv\lib\site-packages\tensorflow\python\keras\engine\training.py:1418 predict_step
        return self(x, training=False)
    c:\users\deisl\desktop\thesis\adaptation\senn\cudaenv\lib\site-packages\tensorflow\python\keras\engine\base_layer.py:985 __call__
        outputs = call_fn(inputs, *args, **kwargs)
    c:\users\deisl\desktop\thesis\adaptation\senn\cudaenv\lib\site-packages\tensorflow\python\keras\saving\saved_model\utils.py:71 return_outputs_and_add_losses
        outputs, losses = fn(inputs, *args, **kwargs)
    c:\users\deisl\desktop\thesis\adaptation\senn\cudaenv\lib\site-packages\tensorflow\python\keras\saving\saved_model\utils.py:167 wrap_with_training_arg
        return tf_utils.smart_cond(
    c:\users\deisl\desktop\thesis\adaptation\senn\cudaenv\lib\site-packages\tensorflow\python\keras\utils\tf_utils.py:64 smart_cond
        return smart_module.smart_cond(
    c:\users\deisl\desktop\thesis\adaptation\senn\cudaenv\lib\site-packages\tensorflow\python\framework\smart_cond.py:56 smart_cond
        return false_fn()
    c:\users\deisl\desktop\thesis\adaptation\senn\cudaenv\lib\site-packages\tensorflow\python\keras\saving\saved_model\utils.py:170 <lambda>
        lambda: replace_training_and_call(False))
    c:\users\deisl\desktop\thesis\adaptation\senn\cudaenv\lib\site-packages\tensorflow\python\keras\saving\saved_model\utils.py:165 replace_training_and_call
        return wrapped_call(*args, **kwargs)
    c:\users\deisl\desktop\thesis\adaptation\senn\cudaenv\lib\site-packages\tensorflow\python\eager\def_function.py:780 __call__
        result = self._call(*args, **kwds)
    c:\users\deisl\desktop\thesis\adaptation\senn\cudaenv\lib\site-packages\tensorflow\python\eager\def_function.py:814 _call
        results = self._stateful_fn(*args, **kwds)
    c:\users\deisl\desktop\thesis\adaptation\senn\cudaenv\lib\site-packages\tensorflow\python\eager\function.py:2828 __call__
        graph_function, args, kwargs = self._maybe_define_function(args, kwargs)
    c:\users\deisl\desktop\thesis\adaptation\senn\cudaenv\lib\site-packages\tensorflow\python\eager\function.py:3213 _maybe_define_function
        graph_function = self._create_graph_function(args, kwargs)
    c:\users\deisl\desktop\thesis\adaptation\senn\cudaenv\lib\site-packages\tensorflow\python\eager\function.py:3065 _create_graph_function
        func_graph_module.func_graph_from_py_func(
    c:\users\deisl\desktop\thesis\adaptation\senn\cudaenv\lib\site-packages\tensorflow\python\framework\func_graph.py:986 func_graph_from_py_func
        func_outputs = python_func(*func_args, **func_kwargs)
    c:\users\deisl\desktop\thesis\adaptation\senn\cudaenv\lib\site-packages\tensorflow\python\eager\def_function.py:600 wrapped_fn
        return weak_wrapped_fn().__wrapped__(*args, **kwds)
    c:\users\deisl\desktop\thesis\adaptation\senn\cudaenv\lib\site-packages\tensorflow\python\saved_model\function_deserialization.py:251 restored_function_body
        raise ValueError(

    ValueError: Could not find matching function to call loaded from the SavedModel. Got:
      Positional arguments (2 total):
        * {'Type': <tf.Tensor 'input_tensor_11:0' shape=(None, 1) dtype=string>, 'Age': <tf.Tensor 'input_tensor:0' shape=(None, 1) dtype=int32>, 'Breed1': <tf.Tensor 'input_tensor_1:0' shape=(None, 1) dtype=string>, 'Gender': <tf.Tensor 'input_tensor_6:0' shape=(None, 1) dtype=string>, 'Color1': <tf.Tensor 'input_tensor_2:0' shape=(None, 1) dtype=string>, 'Color2': <tf.Tensor 'input_tensor_3:0' shape=(None, 1) dtype=string>, 'MaturitySize': <tf.Tensor 'input_tensor_8:0' shape=(None, 1) dtype=string>, 'FurLength': <tf.Tensor 'input_tensor_5:0' shape=(None, 1) dtype=string>, 'Vaccinated': <tf.Tensor 'input_tensor_12:0' shape=(None, 1) dtype=string>, 'Sterilized': <tf.Tensor 'input_tensor_10:0' shape=(None, 1) dtype=string>, 'Health': <tf.Tensor 'input_tensor_7:0' shape=(None, 1) dtype=string>, 'Fee': <tf.Tensor 'input_tensor_4:0' shape=(None, 1) dtype=int32>, 'PhotoAmt': <tf.Tensor 'input_tensor_9:0' shape=(None, 1) dtype=int32>}
        * False
      Keyword arguments: {}
    
    Expected these arguments to match one of the following 4 option(s):
    
    Option 1:
      Positional arguments (2 total):
        * {'Age': TensorSpec(shape=(None,), dtype=tf.int64, name='input_tensor/Age'), 'Breed1': TensorSpec(shape=(None,), dtype=tf.string, name='input_tensor/Breed1'), 'Color1': TensorSpec(shape=(None,), dtype=tf.string, name='input_tensor/Color1'), 'Color2': TensorSpec(shape=(None,), dtype=tf.string, name='input_tensor/Color2'), 'Fee': TensorSpec(shape=(None,), dtype=tf.int64, name='input_tensor/Fee'), 'FurLength': TensorSpec(shape=(None,), dtype=tf.string, name='input_tensor/FurLength'), 'Gender': TensorSpec(shape=(None,), dtype=tf.string, name='input_tensor/Gender'), 'Health': TensorSpec(shape=(None,), dtype=tf.string, name='input_tensor/Health'), 'MaturitySize': TensorSpec(shape=(None,), dtype=tf.string, name='input_tensor/MaturitySize'), 'PhotoAmt': TensorSpec(shape=(None,), dtype=tf.int64, name='input_tensor/PhotoAmt'), 'Sterilized': TensorSpec(shape=(None,), dtype=tf.string, name='input_tensor/Sterilized'), 'Type': TensorSpec(shape=(None,), dtype=tf.string, name='input_tensor/Type'), 'Vaccinated': TensorSpec(shape=(None,), dtype=tf.string, name='input_tensor/Vaccinated')}
        * True
      Keyword arguments: {}
    
    Option 2:
      Positional arguments (2 total):
        * {'Age': TensorSpec(shape=(None,), dtype=tf.int64, name='Age'), 'Breed1': TensorSpec(shape=(None,), dtype=tf.string, name='Breed1'), 'Color1': TensorSpec(shape=(None,), dtype=tf.string, name='Color1'), 'Color2': TensorSpec(shape=(None,), dtype=tf.string, name='Color2'), 'Fee': TensorSpec(shape=(None,), dtype=tf.int64, name='Fee'), 'FurLength': TensorSpec(shape=(None,), dtype=tf.string, name='FurLength'), 'Gender': TensorSpec(shape=(None,), dtype=tf.string, name='Gender'), 'Health': TensorSpec(shape=(None,), dtype=tf.string, name='Health'), 'MaturitySize': TensorSpec(shape=(None,), dtype=tf.string, name='MaturitySize'), 'PhotoAmt': TensorSpec(shape=(None,), dtype=tf.int64, name='PhotoAmt'), 'Sterilized': TensorSpec(shape=(None,), dtype=tf.string, name='Sterilized'), 'Type': TensorSpec(shape=(None,), dtype=tf.string, name='Type'), 'Vaccinated': TensorSpec(shape=(None,), dtype=tf.string, name='Vaccinated')}
        * True
      Keyword arguments: {}
    
    Option 3:
      Positional arguments (2 total):
        * {'Age': TensorSpec(shape=(None,), dtype=tf.int64, name='Age'), 'Breed1': TensorSpec(shape=(None,), dtype=tf.string, name='Breed1'), 'Color1': TensorSpec(shape=(None,), dtype=tf.string, name='Color1'), 'Color2': TensorSpec(shape=(None,), dtype=tf.string, name='Color2'), 'Fee': TensorSpec(shape=(None,), dtype=tf.int64, name='Fee'), 'FurLength': TensorSpec(shape=(None,), dtype=tf.string, name='FurLength'), 'Gender': TensorSpec(shape=(None,), dtype=tf.string, name='Gender'), 'Health': TensorSpec(shape=(None,), dtype=tf.string, name='Health'), 'MaturitySize': TensorSpec(shape=(None,), dtype=tf.string, name='MaturitySize'), 'PhotoAmt': TensorSpec(shape=(None,), dtype=tf.int64, name='PhotoAmt'), 'Sterilized': TensorSpec(shape=(None,), dtype=tf.string, name='Sterilized'), 'Type': TensorSpec(shape=(None,), dtype=tf.string, name='Type'), 'Vaccinated': TensorSpec(shape=(None,), dtype=tf.string, name='Vaccinated')}
        * False
      Keyword arguments: {}
    
    Option 4:
      Positional arguments (2 total):
        * {'Age': TensorSpec(shape=(None,), dtype=tf.int64, name='input_tensor/Age'), 'Breed1': TensorSpec(shape=(None,), dtype=tf.string, name='input_tensor/Breed1'), 'Color1': TensorSpec(shape=(None,), dtype=tf.string, name='input_tensor/Color1'), 'Color2': TensorSpec(shape=(None,), dtype=tf.string, name='input_tensor/Color2'), 'Fee': TensorSpec(shape=(None,), dtype=tf.int64, name='input_tensor/Fee'), 'FurLength': TensorSpec(shape=(None,), dtype=tf.string, name='input_tensor/FurLength'), 'Gender': TensorSpec(shape=(None,), dtype=tf.string, name='input_tensor/Gender'), 'Health': TensorSpec(shape=(None,), dtype=tf.string, name='input_tensor/Health'), 'MaturitySize': TensorSpec(shape=(None,), dtype=tf.string, name='input_tensor/MaturitySize'), 'PhotoAmt': TensorSpec(shape=(None,), dtype=tf.int64, name='input_tensor/PhotoAmt'), 'Sterilized': TensorSpec(shape=(None,), dtype=tf.string, name='input_tensor/Sterilized'), 'Type': TensorSpec(shape=(None,), dtype=tf.string, name='input_tensor/Type'), 'Vaccinated': TensorSpec(shape=(None,), dtype=tf.string, name='input_tensor/Vaccinated')}
        * False
      Keyword arguments: {}
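
The error above lists the tensors that were passed in and the signatures stored in the SavedModel, and the two differ in both shape and dtype. The sketch below is a plain-Python diagnostic, not part of the original tutorial: it copies a few of the specs from the message by hand as `(shape, dtype)` pairs and reports where they disagree. The helper name `diff_specs` and the reduced four-feature dictionaries are assumptions made for illustration.

```python
# Specs copied by hand from the error message, reduced to (shape, dtype) pairs.
# Only the three integer features and one string feature are listed for brevity.
got = {
    "Age": ((None, 1), "int32"),
    "Fee": ((None, 1), "int32"),
    "PhotoAmt": ((None, 1), "int32"),
    "Type": ((None, 1), "string"),
}
expected = {
    "Age": ((None,), "int64"),
    "Fee": ((None,), "int64"),
    "PhotoAmt": ((None,), "int64"),
    "Type": ((None,), "string"),
}


def diff_specs(got, expected):
    """Return {feature: [mismatch description, ...]} for features that differ.

    Hypothetical helper: compares shape and dtype per feature name.
    """
    problems = {}
    for name, (exp_shape, exp_dtype) in expected.items():
        if name not in got:
            problems[name] = ["missing from input"]
            continue
        shape, dtype = got[name]
        issues = []
        if shape != exp_shape:
            issues.append(f"shape {shape} != {exp_shape}")
        if dtype != exp_dtype:
            issues.append(f"dtype {dtype} != {exp_dtype}")
        if issues:
            problems[name] = issues
    return problems


mismatches = diff_specs(got, expected)
for name, issues in mismatches.items():
    print(f"{name}: {'; '.join(issues)}")
```

Every feature arrives with shape `(None, 1)` where the saved signatures expect `(None,)`, and the integer features arrive as `int32` where `int64` is expected. A possible fix, untested here, is to build the integer features with `tf.convert_to_tensor([value], dtype=tf.int64)` and to call `reloaded_model(input_dict, training=False)` directly, since `predict` appears to add the extra dimension before the model is called.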
