# Imports

In [1]:
import pandas as pd
import numpy as np

The following cell adds the parent directory to `sys.path` so that the local `phcnn` package (e.g. `phcnn.layers`) can be imported.

In [2]:
import os
import sys

# Add the parent directory to sys.path so the local phcnn package can be imported
module_path = os.path.abspath(os.path.join('..'))
if module_path not in sys.path:
    sys.path.append(module_path)

In [3]:
from keras.layers import (Lambda, MaxPooling1D, Flatten,
                          Dropout, Dense, Input)
from keras.models import Model
from keras.backend import floatx

Using TensorFlow backend.


In [4]:
from phcnn.layers import PhyloConv1D, euclidean_distances
from keras.utils import to_categorical

Using cuDNN version 5110 on context None
Mapped name None to device cuda0: Tesla K80 (0BFE:00:00.0)


# Parameters

Parameters of the convolutional layer: `nb_neighbors` is the number of neighbors to be convolved together, and `nb_filters` is the number of convolutional filters.

In [5]:
nb_neighbors = 4
nb_filters = 4

# Import of data

We need to expand Xs to the shape (nb_samples, nb_features, 1), so we apply np.expand_dims to add a trailing axis signalling that there is a single input channel. Furthermore, we need y in categorical (one-hot) form, so we apply to_categorical.

In [6]:
# .values replaces .as_matrix(), which was removed in pandas 1.0
Xs = pd.read_csv('../datasets/ibd_dataset/HS_CDf/Sokol_16S_taxa_HS_CDf_commsamp_training.txt',
                 sep='\t', header=0, index_col=0).values
nb_features = Xs.shape[1]
Xs = np.expand_dims(Xs, axis=-1)
# the np.int alias was removed from recent NumPy releases; the builtin int is equivalent
y = np.loadtxt('../datasets/ibd_dataset/HS_CDf/Sokol_16S_taxa_HS_CDf_commsamp_training_lab.txt', dtype=int)
Y = to_categorical(y)
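
The reshaping in the cell above can be checked on toy data. This is a minimal sketch, not part of the original notebook: the arrays `X_toy` and `y_toy` are made-up values, and `np.eye(...)[y]` is used as a plain-NumPy stand-in for `to_categorical` so that the example runs without Keras.

```python
import numpy as np

# Toy stand-in for the abundance table: 3 samples, 5 features (made-up values).
X_toy = np.arange(15, dtype=np.float64).reshape(3, 5)
X_toy = np.expand_dims(X_toy, axis=-1)   # (3, 5) -> (3, 5, 1)

# Toy binary labels; indexing an identity matrix one-hot encodes them,
# which is used here in place of Keras' to_categorical.
y_toy = np.array([0, 1, 1])
Y_toy = np.eye(2)[y_toy]                 # (3,) -> (3, 2)

print(X_toy.shape, Y_toy.shape)
```

The trailing axis of size 1 matches the `(nb_features, 1)` shape declared for the `data` input of the network.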

Furthermore, we need to import the MDS coordinates for our features.

Pre-computed coordinates have been made available in the repository in the `datasets/coordinates` directory, for each of the diseases included in the **IBD** dataset.

Keras uses fast symbolic mathematical libraries as a backend, such as TensorFlow and Theano.

A downside of using these libraries is that the shape and size of your data must be defined once up front and held constant regardless of whether you are training your network or making predictions.

To fit the coordinate matrix as a valid Keras Tensor, we need to pre-process the numpy array so that its dimensions match. In more detail, the first dimension of a tensor object must correspond to `n_samples`, that is the number of samples (_per batch_).

Thus we need to replicate the **feature** coordinates for all the samples, which acts as padding on the loaded numpy array of coordinates. We chose to do it in the most straightforward way possible: we simply duplicate the coordinate matrix for every sample. We will drop this padding after the matrix is loaded in the network.

In [8]:
# .values replaces .as_matrix(), which was removed in pandas 1.0
C = pd.read_csv('../datasets/coordinates/coordinates_cdf.txt',
                sep='\t', header=0, index_col=0).values
nb_coordinates = C.shape[0]
Coords = np.empty((Xs.shape[0],) + C.shape, dtype=np.float64)
for i in range(Xs.shape[0]):
    Coords[i] = C
# add last dimension, i.e. channel, necessary for the Convolution
# operator to work.
Coords = np.expand_dims(Coords, axis=-1)
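
The replication step can also be written without an explicit loop. The sketch below uses a made-up coordinate matrix `C_toy` and a hypothetical `nb_samples` to show that `np.tile` produces the same array as the loop in the cell above; it is an illustration only, not a change to the notebook's pipeline.

```python
import numpy as np

nb_samples = 4  # hypothetical number of samples
# Toy coordinate matrix of shape (nb_coordinates, nb_features) = (2, 3).
C_toy = np.arange(6, dtype=np.float64).reshape(2, 3)

# Loop-based replication, mirroring the cell above.
looped = np.empty((nb_samples,) + C_toy.shape, dtype=np.float64)
for i in range(nb_samples):
    looped[i] = C_toy

# Vectorized alternative: repeat the matrix along a new leading axis.
tiled = np.tile(C_toy, (nb_samples, 1, 1))

# Add the trailing channel axis, as done for Coords.
coords_toy = np.expand_dims(tiled, axis=-1)
print(np.array_equal(looped, tiled), coords_toy.shape)
```

Either way, every slice along the first axis is an identical copy of the coordinate matrix, which is why the network can later keep only the first one.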

# Network

In [9]:
data = Input(shape=(nb_features, 1), name="data", dtype=floatx())
coordinates = Input(shape=(nb_coordinates, nb_features, 1),
                    name="coordinates", dtype=floatx())

conv_layer = data
# We remove the padding that we added to work around keras limitations
conv_crd = Lambda(lambda c: c[0], output_shape=lambda s: (s[1:]))(coordinates)

distances = euclidean_distances(conv_crd)
conv_layer, conv_crd = PhyloConv1D(distances, nb_neighbors,
                                   nb_filters, activation='relu')([conv_layer, conv_crd])

# named `pooled` rather than `max` to avoid shadowing the Python builtin
pooled = MaxPooling1D(pool_size=2, padding="valid")(conv_layer)
flatt = Flatten()(pooled)
drop = Dropout(0.25)(Dense(units=64, activation='relu')(flatt))
output = Dense(units=2, kernel_initializer="he_normal",
               activation="softmax", name='output')(drop)

model = Model(inputs=[data, coordinates], outputs=output)
model.compile(optimizer='Adam', loss='categorical_crossentropy')
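
To give an intuition for the role of `nb_neighbors`, the sketch below picks, for each feature, its closest features in coordinate space. It is not the `phcnn` implementation: it assumes that neighbors are selected by Euclidean distance between the coordinate columns (which the use of `euclidean_distances` suggests) and that a feature counts as its own nearest neighbor. The matrix `C_toy` and the value of `k` are made up for illustration.

```python
import numpy as np

# Toy coordinates: 2 MDS dimensions (rows) for 5 features (columns).
C_toy = np.array([[0.0, 1.0, 2.0, 10.0, 11.0],
                  [0.0, 0.0, 0.0,  0.0,  0.0]])
k = 2  # plays the role of nb_neighbors (assumed semantics)

# Pairwise Euclidean distances between feature columns, shape (5, 5).
diff = C_toy[:, :, None] - C_toy[:, None, :]
dist = np.sqrt((diff ** 2).sum(axis=0))

# For each feature, the indices of its k closest features (itself included).
# A stable sort breaks distance ties by feature index.
neighbors = np.argsort(dist, axis=1, kind="stable")[:, :k]
print(neighbors)
```

In this toy layout, features 0 to 2 are grouped with each other and features 3 and 4 with each other, so a filter applied over each neighborhood would combine features that lie close together in coordinate space.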

In [10]:
model.fit(x=[Xs, Coords], y=Y)

Epoch 1/1


<keras.callbacks.History at 0x7f61b02d3978>