##### Copyright 2019 The TensorFlow Authors.


In [1]:
#@title Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# https://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

TODO:

- fix model creation (preprocessing breaks tensor? https://tensorflow.google.cn/tutorials/load_data/csv)

- name all layers

- save model for loading afterwards (currently broken, might require named layers, might clash with concatenate layer)

- make sure to mark unaltered cells and annotations as from the original tutorial

- make sure the GPU is used

- make model .fit and .evaluate work (if it does not)

- make callbacks work during fitting (in custom loop)


Maybe:

- make a preprocessing model where the preprocessing is done when SENN is initialized
(for portability)

- augment data by just adding random noise


# Classify structured data using Keras Preprocessing Layers

<table class="tfo-notebook-buttons" align="left">
  <td>
    <a target="_blank" href="https://www.tensorflow.org/tutorials/structured_data/preprocessing_layers">
    <img src="https://www.tensorflow.org/images/tf_logo_32px.png" />
    View on TensorFlow.org</a>
  </td>
  <td>
    <a target="_blank" href="https://colab.research.google.com/github/tensorflow/docs/blob/master/site/en/tutorials/structured_data/preprocessing_layers.ipynb">
    <img src="https://www.tensorflow.org/images/colab_logo_32px.png" />
    Run in Google Colab</a>
  </td>
  <td>
    <a target="_blank" href="https://github.com/tensorflow/docs/blob/master/site/en/tutorials/structured_data/preprocessing_layers.ipynb">
    <img src="https://www.tensorflow.org/images/GitHub-Mark-32px.png" />
    View source on GitHub</a>
  </td>
  <td>
    <a href="https://storage.googleapis.com/tensorflow_docs/docs/site/en/tutorials/structured_data/preprocessing_layers.ipynb"><img src="https://www.tensorflow.org/images/download_logo_32px.png" />Download notebook</a>
  </td>
</table>

This tutorial demonstrates how to classify structured data (e.g. tabular data in a CSV). You will use [Keras](https://www.tensorflow.org/guide/keras) to define the model, and [preprocessing layers](https://keras.io/guides/preprocessing_layers/) as a bridge to map from columns in a CSV to features used to train the model. This tutorial contains complete code to:

* Load a CSV file using [Pandas](https://pandas.pydata.org/).
* Build an input pipeline to batch and shuffle the rows using [tf.data](https://www.tensorflow.org/guide/datasets).
* Map from columns in the CSV to features used to train the model using Keras Preprocessing layers.
* Build, train, and evaluate a model using Keras.

Note: This tutorial is similar to [Classify structured data with feature columns](https://www.tensorflow.org/tutorials/structured_data/feature_columns). This version uses new experimental Keras [Preprocessing Layers](https://www.tensorflow.org/api_docs/python/tf/keras/layers/experimental/preprocessing) instead of `tf.feature_column`. Keras Preprocessing Layers are more intuitive, and can be easily included inside your model to simplify deployment.

## The Dataset

For designing the network I use a smaller and simpler dataset. 

It is a simplified version of the PetFinder [dataset](https://www.kaggle.com/c/petfinder-adoption-prediction). There are several thousand rows in the CSV. Each row describes a pet, and each column describes an attribute. 

The goal is to predict if the pet will be adopted.

Following is a description of this dataset. \
Notice there are both numeric and categorical columns. 

The free text column will be ignored.

Column | Description| Feature Type | Data Type
------------|--------------------|----------------------|-----------------
Type | Type of animal (Dog, Cat) | Categorical | string
Age |  Age of the pet | Numerical | integer
Breed1 | Primary breed of the pet | Categorical | string
Color1 | Color 1 of pet | Categorical | string
Color2 | Color 2 of pet | Categorical | string
MaturitySize | Size at maturity | Categorical | string
FurLength | Fur length | Categorical | string
Vaccinated | Pet has been vaccinated | Categorical | string
Sterilized | Pet has been sterilized | Categorical | string
Health | Health Condition | Categorical | string
Fee | Adoption Fee | Numerical | integer
Description | Profile write-up for this pet | Text | string
PhotoAmt | Total uploaded photos for this pet | Numerical | integer
AdoptionSpeed | Speed of adoption | Classification | integer

## Install and Import necessary libraries


In [2]:
!pip install sklearn
!pip install numpy
!pip install pandas
!pip install tensorflow
!pip install pydot
!pip install pydotplus
!pip install graphviz
!pip install datetime
!pip install packaging
!pip install keras





Install for graph: https://graphviz.gitlab.io/download/
maybe follow: https://bobswift.atlassian.net/wiki/spaces/GVIZ/pages/131924165/Graphviz+installation

In [1]:
import numpy as np
import pandas as pd
import tensorflow as tf
from tensorflow.keras.constraints import max_norm
from tensorflow.keras import layers
import keras
from sklearn.model_selection import train_test_split
from tensorflow.keras.layers.experimental import preprocessing
from datetime import datetime
import tensorboard

In [2]:
print("Num GPUs Available: ", len(tf.config.experimental.list_physical_devices('GPU')))

Num GPUs Available:  0


## Reading in the data

The data is read into a pandas dataframe

Again:\
As the real data is sensitive, large and expensive to use,
for now I use a dummy dataset about adoption-speed

In [3]:
dataset_url = 'http://storage.googleapis.com/download.tensorflow.org/data/petfinder-mini.zip'
csv_file = 'datasets/petfinder-mini/petfinder-mini.csv'

tf.keras.utils.get_file('petfinder_mini.zip', dataset_url,
                        extract=True, cache_dir='.')
dataframe = pd.read_csv(csv_file)

In [4]:
dataframe.head()

Unnamed: 0,Type,Age,Breed1,Gender,Color1,Color2,MaturitySize,FurLength,Vaccinated,Sterilized,Health,Fee,Description,PhotoAmt,AdoptionSpeed
0,Cat,3,Tabby,Male,Black,White,Small,Short,No,No,Healthy,100,Nibble is a 3+ month old ball of cuteness. He ...,1,2
1,Cat,1,Domestic Medium Hair,Male,Black,Brown,Medium,Medium,Not Sure,Not Sure,Healthy,0,I just found it alone yesterday near my apartm...,2,0
2,Dog,1,Mixed Breed,Male,Brown,White,Medium,Medium,Yes,No,Healthy,0,Their pregnant mother was dumped by her irresp...,7,3
3,Dog,4,Mixed Breed,Female,Black,Brown,Medium,Short,Yes,No,Healthy,150,"Good guard dog, very alert, active, obedience ...",8,2
4,Dog,1,Mixed Breed,Male,Black,No Color,Medium,Short,No,No,Healthy,0,This handsome yet cute boy is up for adoption....,3,2


## Creating the target variable

I have to select the variable I want to train for and drop the columns that are not important or contain that information from the normal dataset.

Valid for the example data:
The task in the Kaggle competition was to predict the speed at which a pet will be adopted (e.g., in the first week, the first month, the first three months, and so on). Let's simplify this for our purposes. It is transformed into a binary classification problem:
I simply predict whether the pet was adopted, or not.

After modifying the label column, 0 will indicate the pet was not adopted, and 1 will indicate it was.

In [5]:
# In the original dataset "4" indicates the pet was not adopted.
dataframe['target'] = np.where(dataframe['AdoptionSpeed']==4, 0, 1)

# Drop un-used columns. (including our now target which can not be used for training)
dataframe = dataframe.drop(columns=['AdoptionSpeed', 'Description'])

In [6]:
#dataframe = dataframe.drop(columns=['Fee', 'PhotoAmt','Type', 'Color1', 'Color2', 'Gender', 'MaturitySize',
#                    'FurLength', 'Vaccinated', 'Sterilized', 'Health', 'Breed1'])

#for testing

In [7]:
dataframe.head()

Unnamed: 0,Type,Age,Breed1,Gender,Color1,Color2,MaturitySize,FurLength,Vaccinated,Sterilized,Health,Fee,PhotoAmt,target
0,Cat,3,Tabby,Male,Black,White,Small,Short,No,No,Healthy,100,1,1
1,Cat,1,Domestic Medium Hair,Male,Black,Brown,Medium,Medium,Not Sure,Not Sure,Healthy,0,2,1
2,Dog,1,Mixed Breed,Male,Brown,White,Medium,Medium,Yes,No,Healthy,0,7,1
3,Dog,4,Mixed Breed,Female,Black,Brown,Medium,Short,Yes,No,Healthy,150,8,1
4,Dog,1,Mixed Breed,Male,Black,No Color,Medium,Short,No,No,Healthy,0,3,1


## Spliting the dataframe into train, validation, and test

The loaded dataset was a single file. It has to be split into train, validation, and test sets.

In [8]:
train, test = train_test_split(dataframe, test_size=0.2)
train, val = train_test_split(train, test_size=0.2)
print(len(train), 'train examples')
print(len(val), 'validation examples')
print(len(test), 'test examples')

7383 train examples
1846 validation examples
2308 test examples


## Input pipeline

The dataframe is wrapped with [tf.data](https://www.tensorflow.org/guide/datasets).
This is done to easily shuffle and batch the data. 

If the RAM is not sufficient, tf.data could be used directly to read it from disk in batches.

In [9]:
# A utility method to create a tf.data dataset from a Pandas Dataframe
def df_to_dataset(dataframe, shuffle=True, batch_size=1):
  dataframe = dataframe.copy()
  labels = dataframe.pop('target')
  ds = tf.data.Dataset.from_tensor_slices((dict(dataframe), labels))
  if shuffle:
    ds = ds.shuffle(buffer_size=len(dataframe))
  ds = ds.batch(batch_size)
  ds = ds.prefetch(batch_size)
  return ds

The general pipeline for input is finished here.
What does it look like?

In [10]:
batch_size = 5
train_ds = df_to_dataset(train, batch_size=batch_size)

In [11]:
[(train_features, label_batch)] = train_ds.take(1)
print('Every feature:', list(train_features.keys()))
print('A batch of ages:', train_features['Age'])
print('A batch of targets:', label_batch )

# TODO currently this is targeted towards the dummy -set

Every feature: ['Type', 'Age', 'Breed1', 'Gender', 'Color1', 'Color2', 'MaturitySize', 'FurLength', 'Vaccinated', 'Sterilized', 'Health', 'Fee', 'PhotoAmt']
A batch of ages: tf.Tensor([ 2  1  4  2 12], shape=(5,), dtype=int64)
A batch of targets: tf.Tensor([0 1 1 1 0], shape=(5,), dtype=int32)


The dataset returns a dictionary of column names (from the dataframe) that map to column values from rows in the dataframe.

## Preprocessing layers

I will have to adapt the pipelines when I replace the dummy-code, but afterwards I will be able to input plain string data etc from new data as well.

Information about the pre-processing layers for easy access when I am there:

*   [`Normalization`](https://www.tensorflow.org/api_docs/python/tf/keras/layers/experimental/preprocessing/Normalization) - Feature-wise normalization of the data.
*   [`CategoryEncoding`](https://www.tensorflow.org/api_docs/python/tf/keras/layers/experimental/preprocessing/CategoryEncoding) - Category encoding layer.
*   [`StringLookup`](https://www.tensorflow.org/api_docs/python/tf/keras/layers/experimental/preprocessing/StringLookup) - Maps strings from a vocabulary to integer indices.
*   [`IntegerLookup`](https://www.tensorflow.org/api_docs/python/tf/keras/layers/experimental/preprocessing/IntegerLookup) - Maps integers from a vocabulary to integer indices.

A list of available preprocessing layers can be found [here](https://www.tensorflow.org/api_docs/python/tf/keras/layers/experimental/preprocessing).

### Numeric columns
A Normalization() layer ensures that each numeric feature has a mean of 0 and a standard deviation of 1.

The `get_normalization_layer` function returns a keras layer.
It applies featurewise normalization to numerical features.

In [12]:
def get_normalization_layer(name, dataset):
  # Create a Normalization layer for our feature.
  normalizer = preprocessing.Normalization()

  # Prepare a Dataset that only yields our feature.
  feature_ds = dataset.map(lambda x, y: x[name])

  # Learn the statistics of the data.
  normalizer.adapt(feature_ds)

  return normalizer

In [13]:
photo_count_col = train_features['PhotoAmt']
layer = get_normalization_layer('PhotoAmt', train_ds)
layer(photo_count_col)

<tf.Tensor: shape=(5, 1), dtype=float32, numpy=
array([[-0.1901024 ],
       [ 0.12953459],
       [ 0.44917157],
       [ 0.44917157],
       [-0.1901024 ]], dtype=float32)>

TODO: If I will indeed have many numeric features (hundreds, or more), it would be more efficient to concatenate them first and use a single [normalization](https://www.tensorflow.org/api_docs/python/tf/keras/layers/experimental/preprocessing/Normalization) layer.

### Categorical columns

In the dummy dataset, Type is represented as a string (e.g. 'Dog', or 'Cat'). Sadly, one can not feed strings directly to a model. The preprocessing layer takes care of representing strings as a one-hot vector.

The `get_category_encoding_layer` function returns a layer, mapping values from a vocabulary to integer indices and one-hot encodes the features.

In [14]:
def get_category_encoding_layer(name, dataset, dtype, max_tokens=None):
  # Create a StringLookup layer which will turn strings into integer indices
  if dtype == 'string':
    index = preprocessing.StringLookup(max_tokens=max_tokens)
  else:
    index = preprocessing.IntegerLookup(max_values=max_tokens)

  # Prepare a Dataset that only yields our feature
  feature_ds = dataset.map(lambda x, y: x[name])

  # Learn the set of possible values and assign them a fixed integer index.
  index.adapt(feature_ds)

  # Create a Discretization for our integer indices.
  encoder = preprocessing.CategoryEncoding(max_tokens=index.vocab_size())

  # Prepare a Dataset that only yields our feature.
  feature_ds = feature_ds.map(index)

  # Learn the space of possible indices.
  encoder.adapt(feature_ds)

  # Apply one-hot encoding to our indices. The lambda function captures the
  # layer so we can use them, or include them in the functional model later.
  return lambda feature: encoder(index(feature))

In [15]:
type_col = train_features['Type']
layer = get_category_encoding_layer('Type', train_ds, 'string')
layer(type_col)

<tf.Tensor: shape=(5, 4), dtype=float32, numpy=
array([[0., 0., 1., 0.],
       [0., 0., 0., 1.],
       [0., 0., 0., 1.],
       [0., 0., 1., 0.],
       [0., 0., 1., 0.]], dtype=float32)>

Often, you don't want to feed a number directly into the model, but instead use a one-hot encoding of those inputs. Consider raw data that represents a pet's age.

In [16]:
type_col = train_features['Age']
category_encoding_layer = get_category_encoding_layer('Age', train_ds,
                                                      'int64', 5)
category_encoding_layer(type_col)

<tf.Tensor: shape=(5, 5), dtype=float32, numpy=
array([[0., 0., 1., 0., 0.],
       [0., 0., 0., 0., 1.],
       [0., 1., 0., 0., 0.],
       [0., 0., 1., 0., 0.],
       [0., 1., 0., 0., 0.]], dtype=float32)>

## Choosing and preparing columns to use

While we can deal with all types of data, we have to make a list of all columns for each type.\
That way I am able to define which layer needs to be treated how\

In [17]:
batch_size = 1
train_ds = df_to_dataset(train, batch_size=batch_size)
val_ds = df_to_dataset(val, shuffle=False, batch_size=batch_size)
test_ds = df_to_dataset(test, shuffle=False, batch_size=batch_size)

In [18]:
all_inputs = []
encoded_features = []

# Numeric features.
for header in ['PhotoAmt', 'Fee']:  # TODO use all headers in UMC set minus the ones I know are something else
  numeric_col = tf.keras.Input(shape=(1,), name=header)
  normalization_layer = get_normalization_layer(header, train_ds)
  encoded_numeric_col = normalization_layer(numeric_col)
  all_inputs.append(numeric_col)
  encoded_features.append(encoded_numeric_col)

In [19]:
# Categorical features encoded as integers.

# TODO at the UMC data, this will be more common, some tests have a categorical scale 
# However, most of them can just be interpreted as normal numerical feature, so I won't have to overdo it
age_col = tf.keras.Input(shape=(1,), name='Age', dtype='int64')
encoding_layer = get_category_encoding_layer('Age', train_ds, dtype='int64',
                                             max_tokens=5)
encoded_age_col = encoding_layer(age_col)
all_inputs.append(age_col)
encoded_features.append(encoded_age_col)

In [20]:
# Categorical features encoded as string.
categorical_cols = ['Type', 'Color1', 'Color2', 'Gender', 'MaturitySize',
                    'FurLength', 'Vaccinated', 'Sterilized', 'Health', 'Breed1'] 
# TODO replace this by reading the headings from the dataframe and substracting the hardcoded headings (that I know are something else))

for header in categorical_cols:
  categorical_col = tf.keras.Input(shape=(1,), name=header, dtype='string')
  encoding_layer = get_category_encoding_layer(header, train_ds, dtype='string',
                                               max_tokens=5) # TODO maybe, this line has to be duplicated and slightly changed to accomodate for different max_tokens
  encoded_categorical_col = encoding_layer(categorical_col)
  all_inputs.append(categorical_col)
  encoded_features.append(encoded_categorical_col)


In [21]:
# Currently I do not think the UMC data needs to be balanced.
# It will be evaluated on the same dataset (though a different part of it)
# We do not have a large number of samples that are underrepresented, probably causing large inaccura

#use:
#    https://www.tensorflow.org/tutorials/structured_data/imbalanced_data

## The model


In [22]:
# The first step towards a working model
# is our preprocessed input.
# As that is a relative complex task, that is regarded it's owy model.

preprocessed_layers = layers.Concatenate()(encoded_features) #encoded_features
preprocesessing_model = tf.keras.Model(all_inputs, preprocessed_layers)
preprocesessing_model.summary()

Model: "functional_1"
__________________________________________________________________________________________________
Layer (type)                    Output Shape         Param #     Connected to                     
Age (InputLayer)                [(None, 1)]          0                                            
__________________________________________________________________________________________________
Type (InputLayer)               [(None, 1)]          0                                            
__________________________________________________________________________________________________
Color1 (InputLayer)             [(None, 1)]          0                                            
__________________________________________________________________________________________________
Color2 (InputLayer)             [(None, 1)]          0                                            
_______________________________________________________________________________________

In [23]:
# The main model will contain 2 more sub-models next to the preprocessing one.
# One of them represents different concepts, the other one how important a certain concept is,
# depending on the input.

# As both models create a separate output, to create a prediction, their outputs have to be combined.

# A simple way to aggregate the output is to just multiply the concepts with their weights

# While the name is *SUM*-aggregator, it is not actually summed RN #TODO rename it
# the summing is done in the main model, as the (in the usual sense prmature) output of this function helps interpret the model

class SumAggregator():
    def __init__(self, num_classes, **kwargs):
        """Basic Aggregator that joins the concepts and relevances by summing their products.
        -> weights every concept with its relevance the output is the sum
        """
        super().__init__()
        self.num_classes = num_classes

    @staticmethod
    def forward( concepts, relevances):
        """Forward pass of Sum Aggregator.

        Aggregates concepts and relevances and returns the predictions for each class.

        Parameters # TODO change this to appropriate TF variants
        ----------
        concepts : torch.Tensor
            Contains the output of the conceptizer with shape (BATCH, NUM_CONCEPTS, DIM_CONCEPT=1).
        relevances : torch.Tensor
            Contains the output of the parameterizer with shape (BATCH, NUM_CONCEPTS, NUM_CLASSES).

        Returns
        -------
        class_predictions : torch.Tensor
            Predictions for each class. Shape - (BATCH, NUM_CLASSES)
            
        """
        #permuted = tf.transpose(relevances, perm=[0, 2])  # so that the number of concepts is at the end
        #batch_matrix_matrix_product = tf.matmul(permuted, concepts)  # multiply all relevance scores
        #       with their corresponding concepts activation
        #aggregated = tf.squeeze(batch_matrix_matrix_product)  # squeeze(-1)  # remove the number of concepts
        aggregated = tf.math.multiply(concepts, relevances)
        return tf.nn.log_softmax(aggregated)


Within the models structure, there are repetetive patterns.

For readability those layers are combined into custom layers and models:

In [28]:
# A combination of layers, common in the parameterizer

class ParameterizerLayer(layers.Layer):
    
    def __init__(self, out_shape, dropout_rate):
        super(ParameterizerLayer, self).__init__()
        self.para_lin = layers.Dense(out_shape, activation='linear')
        self.para_drop = layers.Dropout(dropout_rate)
        self.para_relu = layers.Dense(out_shape, activation=tf.keras.layers.LeakyReLU(alpha=0.01))
        
    
    def call(self, input_tensor,  training=False):
        x = self.para_lin(input_tensor)
        if training:
            x = self.para_drop(x, training=training)
        x = self.para_relu(x)        
        return x
    
# should minimize robustness loss

In [29]:
# The purpose of this sub-model to the network is to assign weights to the conceptizers output.
# One could say that every value in its output is a measure of how important a certain concept is.
# Here a concept is one output node of the conceptizer,
# the Parameterizers output at position x relates to the conceptizers output at that same index (x) 

class Parameterizer(tf.keras.Model):
    
    def __init__(self, hidden_sizes, out_shape, dropout_rate):
        super(Parameterizer, self).__init__()
        self.para0 = ParameterizerLayer(hidden_sizes[0], dropout_rate)
        self.para1 = ParameterizerLayer(hidden_sizes[1], dropout_rate)
        self.para2 = ParameterizerLayer(hidden_sizes[2], dropout_rate)
        self.para3 = ParameterizerLayer(hidden_sizes[3], dropout_rate)
        self.para_lin = layers.Dense(out_shape, activation='linear')
        self.para_drop = layers.Dropout(dropout_rate)

        
    
    def call(self, input_tensor,  training=False):
        x = self.para0(input_tensor)
        x = self.para1(x)
        x = self.para2(x)
        x = self.para3(x)
        x = self.para_lin(x)
        if training:
            x = self.para_drop(x, training=training)
        return x
    


In [30]:
# As this model in non-standard in many ways, a custom loss function is necessary
# As we would like to track and reference those values outside of the model, they are defined here 
# TODO is that actually necessary?

#TODO make them the actually matching loss and metric!
loss_tracker = keras.metrics.BinaryCrossentropy(name="bcl") #keras.metrics.Mean(name="loss")
accuracy_metric = keras.metrics.Accuracy(name="accuracy")
bcl_metric = keras.metrics.BinaryCrossentropy(name="bcl")

In [50]:
# This class stores the network itself

# Basically: the input is preprocessed into normalized scalars and fed into 
# a) The conceptizer to learn which features go together to form underlying structures
#    Here all concepts are = inputs so there is no explicit conceptizer necessary
# b) The parameterizer to learn which concept is important in what situation.
# a and b are then combined by the aggregator
# the aggregators output is summed up to produce the prediction

#TODO replace the singe neuron at the end by a sum function.

class SENN(tf.keras.Model):

    def __init__(self, hidden_sizes, preprocessing_model, dropout_rate):
        
        #general (superclass) model constructor
        super(SENN, self).__init__()
        
        #preprocess the input
        self.input_layers = preprocessing_model    

        input_shape = 55 #TODO update - 55 is the number of parameters after encoding (preprocessing_layers)
        
        # The model that gives a weight per feature (parameterizer)
        out_shape = 55 # this needs to be devided by 4 for using the complex conceptizer
        self.parameterizer = Parameterizer(hidden_sizes, out_shape, dropout_rate)  # out: out_shape
        self.parameterizer.build((None, input_shape))
        
        # The way to combine the activation and weights (aggregator)
        self.aggregator = SumAggregator(input_shape)
        
        #The output should be a probability between 0 and 1
        self.sigmoid = layers.Activation(tf.keras.activations.sigmoid)
        
        
    def call(self, input_tensor, training=False):
        print("input_tensor:", input_tensor)
        pre_processed = self.input_layers(input_tensor) 
        
        concepts = pre_processed
        
        print("Pre_processed:", pre_processed)
        relevances = self.parameterizer(pre_processed, training)
        
        #aggregate the output
        aggr_output = self.aggregator.forward(concepts, relevances)
        
        summed = tf.keras.backend.sum(aggr_output, keepdims=True)
        
        probability = self.sigmoid(summed)
        
        normalized = probability # tf.keras.backend.round(probability)
        
        tf.print("normalized:", normalized)
        return normalized[0], concepts, relevances
    
    @property
    def metrics(self):
        # We list our `Metric` objects here so that `reset_states()` can be
        # called automatically at the start of each epoch
        # or at the start of `evaluate()`.
        # If you don't implement this property, you have to call
        # `reset_states()` yourself at the time of your choosing.
        return [loss_tracker, accuracy_metric, bcl_metric]
    
    # customize the train function, base code from:
    # https://www.tensorflow.org/guide/keras/customizing_what_happens_in_fit

In [51]:
# The network as object has to be initialized before use, the hidden_sizes are important to determine hoe big the network is going to be.
# hidden_sizes must be exactly 4 elements long, each number gives the size of a hidden layer in the parameterizer.
hidden_sizes = [100, 100, 50, 100]  # TODO fix this to actual size
model = SENN(hidden_sizes, preprocesessing_model, dropout_rate=0.1)     

model.compile(optimizer='adam', loss="mse", metrics=["mae"])

In [52]:
#One step in training

@tf.function
def train_step(data):
    # Unpack the data. Its structure depends on your model and
    # on what you pass to `fit()`.

    if len(data) == 3:
        x, y, sample_weight = data
        print("Warning: sample weight is not currently supported and will be ignored!")
    else:
        x, y = data


    with tf.GradientTape() as tape:
        y_pred = model(x, training=True)  # Forward pass

        """
        concepts = x

        relevances = model.parameterizer(x, training=True)

        #aggregate the output
        aggr_output = model.aggregator.forward(concepts, relevances)

        summed = tf.keras.backend.sum(aggr_output, keepdims=True)

        probability = model.sigmoid(summed)

        normalized = tf.keras.backend.round(probability)
        y_pred = normalized[0]
        tf.print("step y:", y)
        tf.print("step y_pred:", y_pred)
        """
        
        loss = keras.losses.binary_crossentropy(y, y_pred, from_logits=True)# TODO should be true, but gives shape error

    tf.print("y:", y)
    tf.print("y_pred:", y_pred)
    tf.print("loss:", loss)
    # Compute gradients
    trainable_vars = model.trainable_variables 
    
    gradients = tape.gradient(loss, trainable_vars)

    # Update weights
    model.optimizer.apply_gradients(
        zip(gradients, trainable_vars)
    )

    # Update metrics (includes the metric that tracks the loss)
    accuracy_metric.update_state(y, y_pred) # TODO support sample_weight=sample_weight

    # Compute metrics
    loss_tracker.update_state(y, y_pred) #loss)
    accuracy_metric.update_state(y, y_pred)
    bcl_metric.update_state(y, y_pred)

    return {
        "loss": loss_tracker.result(), 
        "accuracy": accuracy_metric.result(), 
        "bcl": bcl_metric.result()
    }

In [53]:
# Instantiate an optimizer.
optimizer = keras.optimizers.SGD(learning_rate=1e-3)

epochs = 2
for epoch in range(epochs):
    print("\nStart of epoch %d" % (epoch,))

    # Iterate over the batches of the dataset.
    for step, (x_batch_train, y_batch_train) in enumerate(train_ds):

        # Open a GradientTape to record the operations run
        # during the forward pass, which enables auto-differentiation.
        with tf.GradientTape() as tape:

            # Run the forward pass of the layer.
            # The operations that the layer applies
            # to its inputs are going to be recorded
            # on the GradientTape.
            aggregates, concepts, relevances = model(x_batch_train, training=True)  # Logits for this minibatch

            # Compute the loss value for this minibatch.
            # normed = keras.losses.binary_crossentropy(y, aggregates, from_logits=True)
            
            #custom loss:
            batch_size = 256 #TODO make this no longer static
            num_classes = 1 #TODO make this no longer static
            tf.print("aggregates:", aggregates)
            tf.print("relevances:", relevances)
            J_yx = tape.gradient(aggregates, x_batch_train)
            tf.print("J_yx", J_yx)
            robustness_loss = J_yx - relevances
            tf.print("loss:", robustness_loss)
            normed = tf.norm(robustness_loss) # ord='fro') TODO: want frobenius form but not supported?
            
            
        # Use the gradient tape to automatically retrieve
        # the gradients of the trainable variables with respect to the loss.
        #grads = tape.gradient(loss_value, model.trainable_weights)
        grads = normed

        # Run one step of gradient descent by updating
        # the value of the variables to minimize the loss.
        optimizer.apply_gradients(zip(grads, model.trainable_weights))

        # Log every 200 batches.
        if step % 200 == 0:
            print(
                "Training loss (for one batch) at step %d: %.4f"
                % (step, float(loss_value))
            )
            print("Seen so far: %s samples" % ((step + 1) * 64))


Start of epoch 0
input_tensor: {'Type': <tf.Tensor: shape=(1,), dtype=string, numpy=array([b'Cat'], dtype=object)>, 'Age': <tf.Tensor: shape=(1,), dtype=int64, numpy=array([9], dtype=int64)>, 'Breed1': <tf.Tensor: shape=(1,), dtype=string, numpy=array([b'Burmilla'], dtype=object)>, 'Gender': <tf.Tensor: shape=(1,), dtype=string, numpy=array([b'Male'], dtype=object)>, 'Color1': <tf.Tensor: shape=(1,), dtype=string, numpy=array([b'Gray'], dtype=object)>, 'Color2': <tf.Tensor: shape=(1,), dtype=string, numpy=array([b'No Color'], dtype=object)>, 'MaturitySize': <tf.Tensor: shape=(1,), dtype=string, numpy=array([b'Medium'], dtype=object)>, 'FurLength': <tf.Tensor: shape=(1,), dtype=string, numpy=array([b'Medium'], dtype=object)>, 'Vaccinated': <tf.Tensor: shape=(1,), dtype=string, numpy=array([b'Not Sure'], dtype=object)>, 'Sterilized': <tf.Tensor: shape=(1,), dtype=string, numpy=array([b'Not Sure'], dtype=object)>, 'Health': <tf.Tensor: shape=(1,), dtype=string, numpy=array([b'Healthy'], 

ValueError: Attempt to convert a value ({'Type': None, 'Age': None, 'Breed1': None, 'Gender': None, 'Color1': None, 'Color2': None, 'MaturitySize': None, 'FurLength': None, 'Vaccinated': None, 'Sterilized': None, 'Health': None, 'Fee': None, 'PhotoAmt': None}) with an unsupported type (<class 'dict'>) to a Tensor.

Let's visualize our connectivity graph:


In [37]:
# Define the Keras TensorBoard callback, used for the animated, interactive tensorboard visualizatioon
logdir="logs/fit/" + datetime.now().strftime("%Y%m%d-%H%M%S")
tensorboard_callback = tf.keras.callbacks.TensorBoard(log_dir=logdir)

#This should plot the exhaustive graph, but is a bit unreliable
tf.keras.utils.plot_model(model, show_shapes=True, rankdir="LR")

('Failed to import pydot. You must `pip install pydot` and install graphviz (https://graphviz.gitlab.io/download/), ', 'for `pydotprint` to work.')


### Train the model


In [38]:
model.fit(train_ds, epochs=1, validation_data=val_ds, callbacks=[tensorboard_callback])

input_tensor: {'Type': <tf.Tensor 'ExpandDims_11:0' shape=(None, 1) dtype=string>, 'Age': <tf.Tensor 'ExpandDims:0' shape=(None, 1) dtype=int64>, 'Breed1': <tf.Tensor 'ExpandDims_1:0' shape=(None, 1) dtype=string>, 'Gender': <tf.Tensor 'ExpandDims_6:0' shape=(None, 1) dtype=string>, 'Color1': <tf.Tensor 'ExpandDims_2:0' shape=(None, 1) dtype=string>, 'Color2': <tf.Tensor 'ExpandDims_3:0' shape=(None, 1) dtype=string>, 'MaturitySize': <tf.Tensor 'ExpandDims_8:0' shape=(None, 1) dtype=string>, 'FurLength': <tf.Tensor 'ExpandDims_5:0' shape=(None, 1) dtype=string>, 'Vaccinated': <tf.Tensor 'ExpandDims_12:0' shape=(None, 1) dtype=string>, 'Sterilized': <tf.Tensor 'ExpandDims_10:0' shape=(None, 1) dtype=string>, 'Health': <tf.Tensor 'ExpandDims_7:0' shape=(None, 1) dtype=string>, 'Fee': <tf.Tensor 'ExpandDims_4:0' shape=(None, 1) dtype=int64>, 'PhotoAmt': <tf.Tensor 'ExpandDims_9:0' shape=(None, 1) dtype=int64>}
Pre_processed: Tensor("senn/strided_slice:0", shape=(), dtype=int32)
input_tens

type normalized: <class 'tensorflow.python.framework.ops.Tensor'>
normalized: [[0]]
type normalized: <class 'tensorflow.python.framework.ops.Tensor'>
normalized: [[0]]
type normalized: <class 'tensorflow.python.framework.ops.Tensor'>
normalized: [[0]]
type normalized: <class 'tensorflow.python.framework.ops.Tensor'>
  62/7383 [..............................] - ETA: 57s - bcl: 0.0000e+00 - accuracy: 0.0000e+00normalized: [[0]]
type normalized: <class 'tensorflow.python.framework.ops.Tensor'>
normalized: [[0]]
type normalized: <class 'tensorflow.python.framework.ops.Tensor'>
normalized: [[0]]
type normalized: <class 'tensorflow.python.framework.ops.Tensor'>
normalized: [[0]]
type normalized: <class 'tensorflow.python.framework.ops.Tensor'>
normalized: [[0]]
type normalized: <class 'tensorflow.python.framework.ops.Tensor'>
normalized: [[0]]
type normalized: <class 'tensorflow.python.framework.ops.Tensor'>
  68/7383 [..............................] - ETA: 57s - bcl: 0.0000e+00 - accuracy: 

 144/7383 [..............................] - ETA: 53s - bcl: 0.0000e+00 - accuracy: 0.0000e+00normalized: [[0]]
type normalized: <class 'tensorflow.python.framework.ops.Tensor'>
normalized: [[0]]
type normalized: <class 'tensorflow.python.framework.ops.Tensor'>
normalized: [[0]]
type normalized: <class 'tensorflow.python.framework.ops.Tensor'>
normalized: [[0]]
type normalized: <class 'tensorflow.python.framework.ops.Tensor'>
normalized: [[0]]
type normalized: <class 'tensorflow.python.framework.ops.Tensor'>
normalized: [[0]]
type normalized: <class 'tensorflow.python.framework.ops.Tensor'>
normalized: [[0]]
type normalized: <class 'tensorflow.python.framework.ops.Tensor'>
 151/7383 [..............................] - ETA: 53s - bcl: 0.0000e+00 - accuracy: 0.0000e+00normalized: [[0]]
type normalized: <class 'tensorflow.python.framework.ops.Tensor'>
normalized: [[0]]
type normalized: <class 'tensorflow.python.framework.ops.Tensor'>
normalized: [[0]]
type normalized: <class 'tensorflow.py

 228/7383 [..............................] - ETA: 53s - bcl: 0.0000e+00 - accuracy: 0.0000e+00normalized: [[0]]
type normalized: <class 'tensorflow.python.framework.ops.Tensor'>
normalized: [[0]]
type normalized: <class 'tensorflow.python.framework.ops.Tensor'>
normalized: [[0]]
type normalized: <class 'tensorflow.python.framework.ops.Tensor'>
normalized: [[0]]
type normalized: <class 'tensorflow.python.framework.ops.Tensor'>
normalized: [[0]]
type normalized: <class 'tensorflow.python.framework.ops.Tensor'>
normalized: [[0]]
type normalized: <class 'tensorflow.python.framework.ops.Tensor'>
 234/7383 [..............................] - ETA: 53s - bcl: 0.0000e+00 - accuracy: 0.0000e+00normalized: [[0]]
type normalized: <class 'tensorflow.python.framework.ops.Tensor'>
normalized: [[0]]
type normalized: <class 'tensorflow.python.framework.ops.Tensor'>
normalized: [[0]]
type normalized: <class 'tensorflow.python.framework.ops.Tensor'>
normalized: [[0]]
type normalized: <class 'tensorflow.py

 312/7383 [>.............................] - ETA: 53s - bcl: 0.0000e+00 - accuracy: 0.0000e+00normalized: [[0]]
type normalized: <class 'tensorflow.python.framework.ops.Tensor'>
normalized: [[0]]
type normalized: <class 'tensorflow.python.framework.ops.Tensor'>
normalized: [[0]]
type normalized: <class 'tensorflow.python.framework.ops.Tensor'>
normalized: [[0]]
type normalized: <class 'tensorflow.python.framework.ops.Tensor'>
normalized: [[0]]
type normalized: <class 'tensorflow.python.framework.ops.Tensor'>
normalized: [[0]]
type normalized: <class 'tensorflow.python.framework.ops.Tensor'>
 318/7383 [>.............................] - ETA: 53s - bcl: 0.0000e+00 - accuracy: 0.0000e+00normalized: [[0]]
type normalized: <class 'tensorflow.python.framework.ops.Tensor'>
normalized: [[0]]
type normalized: <class 'tensorflow.python.framework.ops.Tensor'>
normalized: [[0]]
type normalized: <class 'tensorflow.python.framework.ops.Tensor'>
normalized: [[0]]
type normalized: <class 'tensorflow.py

type normalized: <class 'tensorflow.python.framework.ops.Tensor'>
normalized: [[0]]
type normalized: <class 'tensorflow.python.framework.ops.Tensor'>
normalized: [[0]]
type normalized: <class 'tensorflow.python.framework.ops.Tensor'>
 399/7383 [>.............................] - ETA: 52s - bcl: 0.0000e+00 - accuracy: 0.0000e+00normalized: [[0]]
type normalized: <class 'tensorflow.python.framework.ops.Tensor'>
normalized: [[0]]
type normalized: <class 'tensorflow.python.framework.ops.Tensor'>
normalized: [[0]]
type normalized: <class 'tensorflow.python.framework.ops.Tensor'>
normalized: [[0]]
type normalized: <class 'tensorflow.python.framework.ops.Tensor'>
normalized: [[0]]
type normalized: <class 'tensorflow.python.framework.ops.Tensor'>
normalized: [[0]]
type normalized: <class 'tensorflow.python.framework.ops.Tensor'>
normalized: [[0]]
type normalized: <class 'tensorflow.python.framework.ops.Tensor'>
normalized: [[0]]
type normalized: <class 'tensorflow.python.framework.ops.Tensor'>


normalized: [[0]]
type normalized: <class 'tensorflow.python.framework.ops.Tensor'>
normalized: [[0]]
type normalized: <class 'tensorflow.python.framework.ops.Tensor'>
normalized: [[0]]
type normalized: <class 'tensorflow.python.framework.ops.Tensor'>
 485/7383 [>.............................] - ETA: 50s - bcl: 0.0000e+00 - accuracy: 0.0000e+00normalized: [[0]]
type normalized: <class 'tensorflow.python.framework.ops.Tensor'>
normalized: [[0]]
type normalized: <class 'tensorflow.python.framework.ops.Tensor'>
normalized: [[0]]
type normalized: <class 'tensorflow.python.framework.ops.Tensor'>
normalized: [[0]]
type normalized: <class 'tensorflow.python.framework.ops.Tensor'>
normalized: [[0]]
type normalized: <class 'tensorflow.python.framework.ops.Tensor'>
normalized: [[0]]
type normalized: <class 'tensorflow.python.framework.ops.Tensor'>
normalized: [[0]]
type normalized: <class 'tensorflow.python.framework.ops.Tensor'>
 492/7383 [>.............................] - ETA: 50s - bcl: 0.000

 568/7383 [=>............................] - ETA: 49s - bcl: 0.0000e+00 - accuracy: 0.0000e+00normalized: [[0]]
type normalized: <class 'tensorflow.python.framework.ops.Tensor'>
normalized: [[0]]
type normalized: <class 'tensorflow.python.framework.ops.Tensor'>
normalized: [[0]]
type normalized: <class 'tensorflow.python.framework.ops.Tensor'>
normalized: [[0]]
type normalized: <class 'tensorflow.python.framework.ops.Tensor'>
normalized: [[0]]
type normalized: <class 'tensorflow.python.framework.ops.Tensor'>
normalized: [[0]]
type normalized: <class 'tensorflow.python.framework.ops.Tensor'>
normalized: [[0]]
type normalized: <class 'tensorflow.python.framework.ops.Tensor'>
normalized: [[0]]
type normalized: <class 'tensorflow.python.framework.ops.Tensor'>
 576/7383 [=>............................] - ETA: 49s - bcl: 0.0000e+00 - accuracy: 0.0000e+00normalized: [[0]]
type normalized: <class 'tensorflow.python.framework.ops.Tensor'>
normalized: [[0]]
type normalized: <class 'tensorflow.py

 655/7383 [=>............................] - ETA: 47s - bcl: 0.0000e+00 - accuracy: 0.0000e+00normalized: [[0]]
type normalized: <class 'tensorflow.python.framework.ops.Tensor'>
normalized: [[0]]
type normalized: <class 'tensorflow.python.framework.ops.Tensor'>
normalized: [[0]]
type normalized: <class 'tensorflow.python.framework.ops.Tensor'>
normalized: [[0]]
type normalized: <class 'tensorflow.python.framework.ops.Tensor'>
normalized: [[0]]
type normalized: <class 'tensorflow.python.framework.ops.Tensor'>
normalized: [[0]]
type normalized: <class 'tensorflow.python.framework.ops.Tensor'>
normalized: [[0]]
type normalized: <class 'tensorflow.python.framework.ops.Tensor'>
normalized: [[0]]
type normalized: <class 'tensorflow.python.framework.ops.Tensor'>
 663/7383 [=>............................] - ETA: 47s - bcl: 0.0000e+00 - accuracy: 0.0000e+00normalized: [[0]]
type normalized: <class 'tensorflow.python.framework.ops.Tensor'>
normalized: [[0]]
type normalized: <class 'tensorflow.py

normalized: [[0]]
type normalized: <class 'tensorflow.python.framework.ops.Tensor'>
normalized: [[0]]
type normalized: <class 'tensorflow.python.framework.ops.Tensor'>
 744/7383 [==>...........................] - ETA: 46s - bcl: 0.0000e+00 - accuracy: 0.0000e+00normalized: [[0]]
type normalized: <class 'tensorflow.python.framework.ops.Tensor'>
normalized: [[0]]
type normalized: <class 'tensorflow.python.framework.ops.Tensor'>
normalized: [[0]]
type normalized: <class 'tensorflow.python.framework.ops.Tensor'>
normalized: [[0]]
type normalized: <class 'tensorflow.python.framework.ops.Tensor'>
normalized: [[0]]
type normalized: <class 'tensorflow.python.framework.ops.Tensor'>
normalized: [[0]]
type normalized: <class 'tensorflow.python.framework.ops.Tensor'>
normalized: [[0]]
type normalized: <class 'tensorflow.python.framework.ops.Tensor'>
normalized: [[0]]
type normalized: <class 'tensorflow.python.framework.ops.Tensor'>
 752/7383 [==>...........................] - ETA: 46s - bcl: 0.000

normalized: [[0]]
type normalized: <class 'tensorflow.python.framework.ops.Tensor'>
normalized: [[0]]
type normalized: <class 'tensorflow.python.framework.ops.Tensor'>
normalized: [[0]]
type normalized: <class 'tensorflow.python.framework.ops.Tensor'>
normalized: [[0]]
type normalized: <class 'tensorflow.python.framework.ops.Tensor'>
normalized: [[0]]
type normalized: <class 'tensorflow.python.framework.ops.Tensor'>
 833/7383 [==>...........................] - ETA: 45s - bcl: 0.0000e+00 - accuracy: 0.0000e+00normalized: [[0]]
type normalized: <class 'tensorflow.python.framework.ops.Tensor'>
normalized: [[0]]
type normalized: <class 'tensorflow.python.framework.ops.Tensor'>
normalized: [[0]]
type normalized: <class 'tensorflow.python.framework.ops.Tensor'>
normalized: [[0]]
type normalized: <class 'tensorflow.python.framework.ops.Tensor'>
normalized: [[0]]
type normalized: <class 'tensorflow.python.framework.ops.Tensor'>
 838/7383 [==>...........................] - ETA: 45s - bcl: 0.000

normalized: [[0]]
type normalized: <class 'tensorflow.python.framework.ops.Tensor'>
 915/7383 [==>...........................] - ETA: 44s - bcl: 0.0000e+00 - accuracy: 0.0000e+00normalized: [[0]]
type normalized: <class 'tensorflow.python.framework.ops.Tensor'>
normalized: [[0]]
type normalized: <class 'tensorflow.python.framework.ops.Tensor'>
normalized: [[0]]
type normalized: <class 'tensorflow.python.framework.ops.Tensor'>
normalized: [[0]]
type normalized: <class 'tensorflow.python.framework.ops.Tensor'>
normalized: [[0]]
type normalized: <class 'tensorflow.python.framework.ops.Tensor'>
normalized: [[0]]
type normalized: <class 'tensorflow.python.framework.ops.Tensor'>
normalized: [[0]]
type normalized: <class 'tensorflow.python.framework.ops.Tensor'>
normalized: [[0]]
type normalized: <class 'tensorflow.python.framework.ops.Tensor'>
 923/7383 [==>...........................] - ETA: 44s - bcl: 0.0000e+00 - accuracy: 0.0000e+00normalized: [[0]]
type normalized: <class 'tensorflow.py

1000/7383 [===>..........................] - ETA: 44s - bcl: 0.0000e+00 - accuracy: 0.0000e+00normalized: [[0]]
type normalized: <class 'tensorflow.python.framework.ops.Tensor'>
normalized: [[0]]
type normalized: <class 'tensorflow.python.framework.ops.Tensor'>
normalized: [[0]]
type normalized: <class 'tensorflow.python.framework.ops.Tensor'>
normalized: [[0]]
type normalized: <class 'tensorflow.python.framework.ops.Tensor'>
normalized: [[0]]
type normalized: <class 'tensorflow.python.framework.ops.Tensor'>
normalized: [[0]]
type normalized: <class 'tensorflow.python.framework.ops.Tensor'>
1006/7383 [===>..........................] - ETA: 44s - bcl: 0.0000e+00 - accuracy: 0.0000e+00normalized: [[0]]
type normalized: <class 'tensorflow.python.framework.ops.Tensor'>
normalized: [[0]]
type normalized: <class 'tensorflow.python.framework.ops.Tensor'>
normalized: [[0]]
type normalized: <class 'tensorflow.python.framework.ops.Tensor'>
normalized: [[0]]
type normalized: <class 'tensorflow.py

type normalized: <class 'tensorflow.python.framework.ops.Tensor'>
normalized: [[0]]
type normalized: <class 'tensorflow.python.framework.ops.Tensor'>
normalized: [[0]]
type normalized: <class 'tensorflow.python.framework.ops.Tensor'>
normalized: [[0]]
type normalized: <class 'tensorflow.python.framework.ops.Tensor'>
normalized: [[0]]
type normalized: <class 'tensorflow.python.framework.ops.Tensor'>
normalized: [[0]]
type normalized: <class 'tensorflow.python.framework.ops.Tensor'>
1090/7383 [===>..........................] - ETA: 43s - bcl: 0.0000e+00 - accuracy: 0.0000e+00normalized: [[0]]
type normalized: <class 'tensorflow.python.framework.ops.Tensor'>
normalized: [[0]]
type normalized: <class 'tensorflow.python.framework.ops.Tensor'>
normalized: [[0]]
type normalized: <class 'tensorflow.python.framework.ops.Tensor'>
normalized: [[0]]
type normalized: <class 'tensorflow.python.framework.ops.Tensor'>
normalized: [[0]]
type normalized: <class 'tensorflow.python.framework.ops.Tensor'>


1172/7383 [===>..........................] - ETA: 42s - bcl: 0.0000e+00 - accuracy: 0.0000e+00normalized: [[0]]
type normalized: <class 'tensorflow.python.framework.ops.Tensor'>
normalized: [[0]]
type normalized: <class 'tensorflow.python.framework.ops.Tensor'>
normalized: [[0]]
type normalized: <class 'tensorflow.python.framework.ops.Tensor'>
normalized: [[0]]
type normalized: <class 'tensorflow.python.framework.ops.Tensor'>
normalized: [[0]]
type normalized: <class 'tensorflow.python.framework.ops.Tensor'>
normalized: [[0]]
type normalized: <class 'tensorflow.python.framework.ops.Tensor'>
normalized: [[0]]
type normalized: <class 'tensorflow.python.framework.ops.Tensor'>
normalized: [[0]]
type normalized: <class 'tensorflow.python.framework.ops.Tensor'>
1180/7383 [===>..........................] - ETA: 42s - bcl: 0.0000e+00 - accuracy: 0.0000e+00normalized: [[0]]
type normalized: <class 'tensorflow.python.framework.ops.Tensor'>
normalized: [[0]]
type normalized: <class 'tensorflow.py

normalized: [[0]]
type normalized: <class 'tensorflow.python.framework.ops.Tensor'>
normalized: [[0]]
type normalized: <class 'tensorflow.python.framework.ops.Tensor'>
normalized: [[0]]
type normalized: <class 'tensorflow.python.framework.ops.Tensor'>
normalized: [[0]]
type normalized: <class 'tensorflow.python.framework.ops.Tensor'>
1259/7383 [====>.........................] - ETA: 42s - bcl: 0.0000e+00 - accuracy: 0.0000e+00normalized: [[0]]
type normalized: <class 'tensorflow.python.framework.ops.Tensor'>
normalized: [[0]]
type normalized: <class 'tensorflow.python.framework.ops.Tensor'>
normalized: [[0]]
type normalized: <class 'tensorflow.python.framework.ops.Tensor'>
normalized: [[0]]
type normalized: <class 'tensorflow.python.framework.ops.Tensor'>
normalized: [[0]]
type normalized: <class 'tensorflow.python.framework.ops.Tensor'>
normalized: [[0]]
type normalized: <class 'tensorflow.python.framework.ops.Tensor'>
normalized: [[0]]
type normalized: <class 'tensorflow.python.frame

1341/7383 [====>.........................] - ETA: 42s - bcl: 0.0000e+00 - accuracy: 0.0000e+00normalized: [[0]]
type normalized: <class 'tensorflow.python.framework.ops.Tensor'>
normalized: [[0]]
type normalized: <class 'tensorflow.python.framework.ops.Tensor'>
normalized: [[0]]
type normalized: <class 'tensorflow.python.framework.ops.Tensor'>
normalized: [[0]]
type normalized: <class 'tensorflow.python.framework.ops.Tensor'>
normalized: [[0]]
type normalized: <class 'tensorflow.python.framework.ops.Tensor'>
normalized: [[0]]
type normalized: <class 'tensorflow.python.framework.ops.Tensor'>
normalized: [[0]]
type normalized: <class 'tensorflow.python.framework.ops.Tensor'>
normalized: [[0]]
type normalized: <class 'tensorflow.python.framework.ops.Tensor'>
1349/7383 [====>.........................] - ETA: 42s - bcl: 0.0000e+00 - accuracy: 0.0000e+00normalized: [[0]]
type normalized: <class 'tensorflow.python.framework.ops.Tensor'>
normalized: [[0]]
type normalized: <class 'tensorflow.py

normalized: [[0]]
type normalized: <class 'tensorflow.python.framework.ops.Tensor'>
1428/7383 [====>.........................] - ETA: 41s - bcl: 0.0000e+00 - accuracy: 0.0000e+00normalized: [[0]]
type normalized: <class 'tensorflow.python.framework.ops.Tensor'>
normalized: [[0]]
type normalized: <class 'tensorflow.python.framework.ops.Tensor'>
normalized: [[0]]
type normalized: <class 'tensorflow.python.framework.ops.Tensor'>
normalized: [[0]]
type normalized: <class 'tensorflow.python.framework.ops.Tensor'>
normalized: [[0]]
type normalized: <class 'tensorflow.python.framework.ops.Tensor'>
normalized: [[0]]
type normalized: <class 'tensorflow.python.framework.ops.Tensor'>
normalized: [[0]]
type normalized: <class 'tensorflow.python.framework.ops.Tensor'>
normalized: [[0]]
type normalized: <class 'tensorflow.python.framework.ops.Tensor'>
normalized: [[0]]
type normalized: <class 'tensorflow.python.framework.ops.Tensor'>
1437/7383 [====>.........................] - ETA: 41s - bcl: 0.000

normalized: [[0]]
type normalized: <class 'tensorflow.python.framework.ops.Tensor'>
normalized: [[0]]
type normalized: <class 'tensorflow.python.framework.ops.Tensor'>
normalized: [[0]]
type normalized: <class 'tensorflow.python.framework.ops.Tensor'>
normalized: [[0]]
type normalized: <class 'tensorflow.python.framework.ops.Tensor'>
normalized: [[0]]
type normalized: <class 'tensorflow.python.framework.ops.Tensor'>
1518/7383 [=====>........................] - ETA: 40s - bcl: 0.0000e+00 - accuracy: 0.0000e+00normalized: [[0]]
type normalized: <class 'tensorflow.python.framework.ops.Tensor'>
normalized: [[0]]
type normalized: <class 'tensorflow.python.framework.ops.Tensor'>
normalized: [[0]]
type normalized: <class 'tensorflow.python.framework.ops.Tensor'>
normalized: [[0]]
type normalized: <class 'tensorflow.python.framework.ops.Tensor'>
normalized: [[0]]
type normalized: <class 'tensorflow.python.framework.ops.Tensor'>
normalized: [[0]]
type normalized: <class 'tensorflow.python.frame

1600/7383 [=====>........................] - ETA: 39s - bcl: 0.0000e+00 - accuracy: 0.0000e+00normalized: [[0]]
type normalized: <class 'tensorflow.python.framework.ops.Tensor'>
normalized: [[0]]
type normalized: <class 'tensorflow.python.framework.ops.Tensor'>
normalized: [[0]]
type normalized: <class 'tensorflow.python.framework.ops.Tensor'>
normalized: [[0]]
type normalized: <class 'tensorflow.python.framework.ops.Tensor'>
normalized: [[0]]
type normalized: <class 'tensorflow.python.framework.ops.Tensor'>
normalized: [[0]]
type normalized: <class 'tensorflow.python.framework.ops.Tensor'>
normalized: [[0]]
type normalized: <class 'tensorflow.python.framework.ops.Tensor'>
normalized: [[0]]
type normalized: <class 'tensorflow.python.framework.ops.Tensor'>
normalized: [[0]]
type normalized: <class 'tensorflow.python.framework.ops.Tensor'>
1609/7383 [=====>........................] - ETA: 39s - bcl: 0.0000e+00 - accuracy: 0.0000e+00normalized: [[0]]
type normalized: <class 'tensorflow.py

normalized: [[0]]
type normalized: <class 'tensorflow.python.framework.ops.Tensor'>
normalized: [[0]]
type normalized: <class 'tensorflow.python.framework.ops.Tensor'>
normalized: [[0]]
type normalized: <class 'tensorflow.python.framework.ops.Tensor'>
normalized: [[0]]
type normalized: <class 'tensorflow.python.framework.ops.Tensor'>
normalized: [[0]]
type normalized: <class 'tensorflow.python.framework.ops.Tensor'>
normalized: [[0]]
type normalized: <class 'tensorflow.python.framework.ops.Tensor'>
normalized: [[0]]
type normalized: <class 'tensorflow.python.framework.ops.Tensor'>
normalized: [[0]]
type normalized: <class 'tensorflow.python.framework.ops.Tensor'>
1695/7383 [=====>........................] - ETA: 38s - bcl: 0.0000e+00 - accuracy: 0.0000e+00normalized: [[0]]
type normalized: <class 'tensorflow.python.framework.ops.Tensor'>
normalized: [[0]]
type normalized: <class 'tensorflow.python.framework.ops.Tensor'>
normalized: [[0]]
type normalized: <class 'tensorflow.python.frame

normalized: [[0]]
type normalized: <class 'tensorflow.python.framework.ops.Tensor'>
normalized: [[0]]
type normalized: <class 'tensorflow.python.framework.ops.Tensor'>
normalized: [[0]]
type normalized: <class 'tensorflow.python.framework.ops.Tensor'>
normalized: [[0]]
type normalized: <class 'tensorflow.python.framework.ops.Tensor'>
normalized: [[0]]
type normalized: <class 'tensorflow.python.framework.ops.Tensor'>
type normalized: <class 'tensorflow.python.framework.ops.Tensor'>
normalized: [[0]]
type normalized: <class 'tensorflow.python.framework.ops.Tensor'>
normalized: [[0]]
type normalized: <class 'tensorflow.python.framework.ops.Tensor'>
normalized: [[0]]
type normalized: <class 'tensorflow.python.framework.ops.Tensor'>
normalized: [[0]]
type normalized: <class 'tensorflow.python.framework.ops.Tensor'>
normalized: [[0]]
type normalized: <class 'tensorflow.python.framework.ops.Tensor'>
normalized: [[0]]
type normalized: <class 'tensorflow.python.framework.ops.Tensor'>
normalized

normalized: [[0]]
type normalized: <class 'tensorflow.python.framework.ops.Tensor'>
type normalized: <class 'tensorflow.python.framework.ops.Tensor'>
normalized: [[0]]
type normalized: <class 'tensorflow.python.framework.ops.Tensor'>
normalized: [[0]]
type normalized: <class 'tensorflow.python.framework.ops.Tensor'>
normalized: [[0]]
type normalized: <class 'tensorflow.python.framework.ops.Tensor'>
normalized: [[0]]
type normalized: <class 'tensorflow.python.framework.ops.Tensor'>
normalized: [[0]]
type normalized: <class 'tensorflow.python.framework.ops.Tensor'>
normalized: [[0]]
type normalized: <class 'tensorflow.python.framework.ops.Tensor'>
normalized: [[0]]
type normalized: <class 'tensorflow.python.framework.ops.Tensor'>
type normalized: <class 'tensorflow.python.framework.ops.Tensor'>
normalized: [[0]]
type normalized: <class 'tensorflow.python.framework.ops.Tensor'>
normalized: [[0]]
type normalized: <class 'tensorflow.python.framework.ops.Tensor'>
normalized: [[0]]
type norma

normalized: [[0]]
type normalized: <class 'tensorflow.python.framework.ops.Tensor'>
normalized: [[0]]
type normalized: <class 'tensorflow.python.framework.ops.Tensor'>
normalized: [[0]]
type normalized: <class 'tensorflow.python.framework.ops.Tensor'>
normalized: [[0]]
type normalized: <class 'tensorflow.python.framework.ops.Tensor'>
type normalized: <class 'tensorflow.python.framework.ops.Tensor'>
normalized: [[0]]
type normalized: <class 'tensorflow.python.framework.ops.Tensor'>
normalized: [[0]]
type normalized: <class 'tensorflow.python.framework.ops.Tensor'>
normalized: [[0]]
type normalized: <class 'tensorflow.python.framework.ops.Tensor'>
normalized: [[0]]
type normalized: <class 'tensorflow.python.framework.ops.Tensor'>
normalized: [[0]]
type normalized: <class 'tensorflow.python.framework.ops.Tensor'>
normalized: [[0]]
type normalized: <class 'tensorflow.python.framework.ops.Tensor'>
normalized: [[0]]
type normalized: <class 'tensorflow.python.framework.ops.Tensor'>
type norma

normalized: [[0]]
type normalized: <class 'tensorflow.python.framework.ops.Tensor'>
type normalized: <class 'tensorflow.python.framework.ops.Tensor'>
normalized: [[0]]
type normalized: <class 'tensorflow.python.framework.ops.Tensor'>
normalized: [[0]]
type normalized: <class 'tensorflow.python.framework.ops.Tensor'>
normalized: [[0]]
type normalized: <class 'tensorflow.python.framework.ops.Tensor'>
normalized: [[0]]
type normalized: <class 'tensorflow.python.framework.ops.Tensor'>
normalized: [[0]]
type normalized: <class 'tensorflow.python.framework.ops.Tensor'>
normalized: [[0]]
type normalized: <class 'tensorflow.python.framework.ops.Tensor'>
normalized: [[0]]
type normalized: <class 'tensorflow.python.framework.ops.Tensor'>
type normalized: <class 'tensorflow.python.framework.ops.Tensor'>
normalized: [[0]]
type normalized: <class 'tensorflow.python.framework.ops.Tensor'>
normalized: [[0]]
type normalized: <class 'tensorflow.python.framework.ops.Tensor'>
normalized: [[0]]
type norma

normalized: [[0]]
type normalized: <class 'tensorflow.python.framework.ops.Tensor'>
normalized: [[0]]
type normalized: <class 'tensorflow.python.framework.ops.Tensor'>
normalized: [[0]]
type normalized: <class 'tensorflow.python.framework.ops.Tensor'>
normalized: [[0]]
type normalized: <class 'tensorflow.python.framework.ops.Tensor'>
type normalized: <class 'tensorflow.python.framework.ops.Tensor'>
normalized: [[0]]
type normalized: <class 'tensorflow.python.framework.ops.Tensor'>
normalized: [[0]]
type normalized: <class 'tensorflow.python.framework.ops.Tensor'>
normalized: [[0]]
type normalized: <class 'tensorflow.python.framework.ops.Tensor'>
normalized: [[0]]
type normalized: <class 'tensorflow.python.framework.ops.Tensor'>
normalized: [[0]]
type normalized: <class 'tensorflow.python.framework.ops.Tensor'>
normalized: [[0]]
type normalized: <class 'tensorflow.python.framework.ops.Tensor'>
normalized: [[0]]
type normalized: <class 'tensorflow.python.framework.ops.Tensor'>
type norma

type normalized: <class 'tensorflow.python.framework.ops.Tensor'>
normalized: [[0]]
type normalized: <class 'tensorflow.python.framework.ops.Tensor'>
normalized: [[0]]
type normalized: <class 'tensorflow.python.framework.ops.Tensor'>
normalized: [[0]]
type normalized: <class 'tensorflow.python.framework.ops.Tensor'>
normalized: [[0]]
type normalized: <class 'tensorflow.python.framework.ops.Tensor'>
normalized: [[0]]
type normalized: <class 'tensorflow.python.framework.ops.Tensor'>
normalized: [[0]]
type normalized: <class 'tensorflow.python.framework.ops.Tensor'>
normalized: [[0]]
type normalized: <class 'tensorflow.python.framework.ops.Tensor'>
type normalized: <class 'tensorflow.python.framework.ops.Tensor'>
normalized: [[0]]
type normalized: <class 'tensorflow.python.framework.ops.Tensor'>
normalized: [[0]]
type normalized: <class 'tensorflow.python.framework.ops.Tensor'>
normalized: [[0]]
type normalized: <class 'tensorflow.python.framework.ops.Tensor'>
normalized: [[0]]
type norma

type normalized: <class 'tensorflow.python.framework.ops.Tensor'>
normalized: [[0]]
type normalized: <class 'tensorflow.python.framework.ops.Tensor'>
normalized: [[0]]
type normalized: <class 'tensorflow.python.framework.ops.Tensor'>
normalized: [[0]]
type normalized: <class 'tensorflow.python.framework.ops.Tensor'>
normalized: [[0]]
type normalized: <class 'tensorflow.python.framework.ops.Tensor'>
type normalized: <class 'tensorflow.python.framework.ops.Tensor'>
normalized: [[0]]
type normalized: <class 'tensorflow.python.framework.ops.Tensor'>
normalized: [[0]]
type normalized: <class 'tensorflow.python.framework.ops.Tensor'>
normalized: [[0]]
type normalized: <class 'tensorflow.python.framework.ops.Tensor'>
normalized: [[0]]
type normalized: <class 'tensorflow.python.framework.ops.Tensor'>
normalized: [[0]]
type normalized: <class 'tensorflow.python.framework.ops.Tensor'>
normalized: [[0]]
type normalized: <class 'tensorflow.python.framework.ops.Tensor'>
type normalized: <class 'ten

type normalized: <class 'tensorflow.python.framework.ops.Tensor'>
normalized: [[0]]
type normalized: <class 'tensorflow.python.framework.ops.Tensor'>
normalized: [[0]]
type normalized: <class 'tensorflow.python.framework.ops.Tensor'>
normalized: [[0]]
type normalized: <class 'tensorflow.python.framework.ops.Tensor'>
type normalized: <class 'tensorflow.python.framework.ops.Tensor'>
normalized: [[0]]
type normalized: <class 'tensorflow.python.framework.ops.Tensor'>
normalized: [[0]]
type normalized: <class 'tensorflow.python.framework.ops.Tensor'>
normalized: [[0]]
type normalized: <class 'tensorflow.python.framework.ops.Tensor'>
normalized: [[0]]
type normalized: <class 'tensorflow.python.framework.ops.Tensor'>
normalized: [[0]]
type normalized: <class 'tensorflow.python.framework.ops.Tensor'>
normalized: [[0]]
type normalized: <class 'tensorflow.python.framework.ops.Tensor'>
type normalized: <class 'tensorflow.python.framework.ops.Tensor'>
normalized: [[0]]
type normalized: <class 'ten

KeyboardInterrupt: 

In [None]:
accuracy = model.evaluate(test_ds)
print("Accuracy", accuracy)

In [39]:
model.summary()

Model: "senn"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
functional_1 (Functional)    (None, 55)                6         
_________________________________________________________________
parameterizer (Parameterizer multiple                  64255     
_________________________________________________________________
activation (Activation)      multiple                  0         
Total params: 64,261
Trainable params: 64,255
Non-trainable params: 6
_________________________________________________________________


In [42]:
#visualize model in an interactive way
#sadly only works until the preprocessing layers are over
# tensorboard sometimes thinks there still is an instance running when it is not
# fix that by deleting the contents of this folder or your equivalent of it
# C:\Users\deisl\AppData\Local\Temp\.tensorboard-info

%reload_ext tensorboard
# rankdir='LR' is used to make the graph horizontal.
#tf.keras.utils.plot_model(model, show_shapes=True, rankdir="LR")
%tensorboard --logdir logs

Reusing TensorBoard on port 6006 (pid 2060), started 0:01:26 ago. (Use '!kill 2060' to kill it.)

## Inference on new data

As the model contains all important parts, it should be able to work on any file of the right format


The model should be saved such that it can just be reloaded later.\
I will follow the tutorial [here](https://www.tensorflow.org/tutorials/keras/save_and_load)

In [43]:
model.save('my_pet_classifier')
reloaded_model = tf.keras.models.load_model('my_pet_classifier')

input_tensor: {'Type': <tf.Tensor 'Type:0' shape=(None,) dtype=string>, 'Age': <tf.Tensor 'Age:0' shape=(None,) dtype=int64>, 'Breed1': <tf.Tensor 'Breed1:0' shape=(None,) dtype=string>, 'Gender': <tf.Tensor 'Gender:0' shape=(None,) dtype=string>, 'Color1': <tf.Tensor 'Color1:0' shape=(None,) dtype=string>, 'Color2': <tf.Tensor 'Color2:0' shape=(None,) dtype=string>, 'MaturitySize': <tf.Tensor 'MaturitySize:0' shape=(None,) dtype=string>, 'FurLength': <tf.Tensor 'FurLength:0' shape=(None,) dtype=string>, 'Vaccinated': <tf.Tensor 'Vaccinated:0' shape=(None,) dtype=string>, 'Sterilized': <tf.Tensor 'Sterilized:0' shape=(None,) dtype=string>, 'Health': <tf.Tensor 'Health:0' shape=(None,) dtype=string>, 'Fee': <tf.Tensor 'Fee:0' shape=(None,) dtype=int64>, 'PhotoAmt': <tf.Tensor 'PhotoAmt:0' shape=(None,) dtype=int64>}


TypeError: len is not well defined for symbolic Tensors. (senn/functional_1/concatenate/concat:0) Please call `x.shape` rather than `len(x)` for shape information.

To get a prediction for a new sample, you can simply call `model.predict()`. There are just two things you need to do:

1.   Wrap scalars into a list so as to have a batch dimension (models only process batches of data, not single samples)
2.   Call `convert_to_tensor` on each feature

In [44]:
# TODO this does not work

sample = {
    'Type': 'Cat',
    'Age': 3,
    'Breed1': 'Tabby',
    'Gender': 'Male',
    'Color1': 'Black',
    'Color2': 'White',
    'MaturitySize': 'Small',
    'FurLength': 'Short',
    'Vaccinated': 'No',
    'Sterilized': 'No',
    'Health': 'Healthy',
    'Fee': 100,
    'PhotoAmt': 2,
}

input_dict = {name: tf.convert_to_tensor([value]) for name, value in sample.items()}
predictions = reloaded_model.predict(input_dict)
prob = tf.nn.sigmoid(predictions[0])

print(
    "This particular pet had a %.1f percent probability "
    "of getting adopted." % (100 * prob)
)

NameError: name 'reloaded_model' is not defined

Old code for reference:

In [46]:
#a complex conceptizer would include:
class ConceptizerLayer(layers.Layer):
    
    def __init__(self, out_shape):
        super(ConceptizerLayer, self).__init__()
        self.lin = layers.Dense(out_shape, activation='linear')
        self.relu = layers.Dense(out_shape, activation=tf.keras.layers.LeakyReLU(alpha=0.01))
        
    def call(self, input_tensor,  training=False):
        #x = self.inp(input_tensor)
        x = self.lin(input_tensor)
        x = self.relu(x)
        return x


In [None]:
# The conceptizer is a submodel of our network.
# The idea is, that it learns which combination of inputs are relevant together 

#For categorical data an identity mapping should be more interpretable,hence this class is no longer used
class Conceptizer(layers.Layer):
    
    def __init__(self, start_dim):
        super(Conceptizer, self).__init__()
        self.con0 = ConceptizerLayer(start_dim)
        self.con1 = ConceptizerLayer(start_dim/2)
        self.con2 = ConceptizerLayer(start_dim/4)
        self.lin = layers.Dense(start_dim/4, activation='linear')

        
    
    def call(self, input_tensor,  training=False):
        x = self.con0(input_tensor)
        x = self.con1(x)
        x = self.con2(x)
        x = self.lin(x)
        return x
# should minimize reconstruction loss, but as this class it is not used any more, that was never implemented

In [47]:
# While more compley concepts might work better, it will be most explainable if every factor is its own concept
class IdentityConceptizer(layers.Layer):
    
    def __init__(self, start_dim):
        super(IdentityConceptizer, self).__init__()
        self.identity = layers.Layer(start_dim) # standard implementation of call for Layer is identity
        
    def call(self, input_tensor, training=False):
        return self.identity(input_tensor)
    
    #This is the layer implementation of the identityConceptizer 
    # As I switched to sequential models - this function is out of date

In [49]:
# the most basic conceptizer (buggy, just for concept)
def get_identity_conceptizer(input_features):
    conceptizer = keras.Sequential(
        [
            layers.Layer(input_features)
        ]
    )
    conceptizer.compile(optimizer='adam', loss=tf.keras.losses.mean_squared_error) 
    # should minimize minimize reconstruction loss, but as it is non-trainable identity that is perfect already
    return conceptizer

Just a note about how the trainstep inside the model would look:

#One step in training

@tf.function
def train_step(data):
    # Unpack the data. Its structure depends on your model and
    # on what you pass to `fit()`.

    if len(data) == 3:
        x, y, sample_weight = data
        print("Warning: sample weight is not currently supported and will be ignored!")
    else:
        x, y = data


    with tf.GradientTape() as tape:
        y_pred = model(x, training=True)  # Forward pass

        """
        concepts = x

        relevances = model.parameterizer(x, training=True)

        #aggregate the output
        aggr_output = model.aggregator.forward(concepts, relevances)

        summed = tf.keras.backend.sum(aggr_output, keepdims=True)

        probability = model.sigmoid(summed)

        normalized = tf.keras.backend.round(probability)
        y_pred = normalized[0]
        tf.print("step y:", y)
        tf.print("step y_pred:", y_pred)
        """
        
        loss = keras.losses.binary_crossentropy(y, y_pred, from_logits=True)# TODO should be true, but gives shape error

    tf.print("y:", y)
    tf.print("y_pred:", y_pred)
    tf.print("loss:", loss)
    # Compute gradients
    trainable_vars = model.trainable_variables 
    
    gradients = tape.gradient(loss, trainable_vars)

    # Update weights
    model.optimizer.apply_gradients(
        zip(gradients, trainable_vars)
    )

    # Update metrics (includes the metric that tracks the loss)
    accuracy_metric.update_state(y, y_pred) # TODO support sample_weight=sample_weight

    # Compute metrics
    loss_tracker.update_state(y, y_pred) #loss)
    accuracy_metric.update_state(y, y_pred)
    bcl_metric.update_state(y, y_pred)

    return {
        "loss": loss_tracker.result(), 
        "accuracy": accuracy_metric.result(), 
        "bcl": bcl_metric.result()
    }

# TODO overwrite test_step(self, data) to support model.evaluate()
# gonna look something like this:
"""
def test_step(self, data):
    # Unpack the data
    x, y = data
    # Compute predictions
    y_pred = self(x, training=False)
    # Updates the metrics tracking the loss
    self.compiled_loss(y, y_pred, regularization_losses=self.losses)
    # Update the metrics.
    self.compiled_metrics.update_state(y, y_pred)
    # Return a dict mapping metric names to current value.
    # Note that it will include the loss (tracked in self.metrics).
    return {m.name: m.result() for m in self.metrics}
"""