# Extracting latent context vectors #62
Original post by @beasteers:

I'm trying to extract context vectors from the latent TCN space. Basically, I want to extract all of the highlighted nodes from the hidden layers of the usual dilated-TCN diagram (the nodes with arrows) into a list of activations, one for each residual block.

The way I see it, I need to get the activation layers at the end of each residual block, take their output tensors, and slice them with a stride based on the dilation rate (see the sketch below). I'm just not sure whether I need to do anything special to handle the dilation rates of stacked TCN blocks. I think it would be helpful to add a helper method/property that makes it easier to extract this context.

If you're wondering about my use case: I'm building a time-series autoencoder-style network for multi-scale anomaly detection, and I want to model the learned latent space at multiple temporal scales with some distribution (probably a Gaussian mixture model at the moment).
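Roughly, this is the kind of slicing I mean (a sketch only: `strided_context` is a hypothetical helper, and I'm assuming each residual block's activations come back as a `(batch, time, filters)` array):

```python
def strided_context(block_activations, dilation):
    """Keep every `dilation`-th timestep of a residual block's output,
    anchored so the last timestep is always included.
    `block_activations` has shape (batch, time, filters)."""
    t = block_activations.shape[1]
    start = (t - 1) % dilation  # ensures index t-1 survives the stride
    return block_activations[:, start::dilation, :]

# Hypothetical usage: one activations array per residual block, paired
# with that block's dilation rate (e.g. obtained via keract).
# contexts = [strided_context(acts, d) for acts, d in block_outputs]
#
# Each temporal scale could then be modelled separately, e.g.:
# from sklearn.mixture import GaussianMixture
# vectors = contexts[k].reshape(-1, contexts[k].shape[-1])
# gmm = GaussianMixture(n_components=8).fit(vectors)
```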
---

@beasteers hey, sorry for the (very) late reply! You can use keract to get the activations of each layer in a Keras model. The only problem here is that we merge the whole TCN into one single layer, so at the moment it isn't possible to get the activations of each sub-layer inside. I'll try to find a way around it.

In the meantime, you can get exactly what you want with the previous version of keras-tcn, 2.8.3. So consider doing something like this:
"""
#Trains a TCN on the IMDB sentiment classification task.
Output after 1 epochs on CPU: ~0.8611
Time per epoch on CPU (Core i7): ~64s.
Based on: https://github.com/keras-team/keras/blob/master/examples/imdb_bidirectional_lstm.py
"""
import numpy as np
from keras import Model, Input
from keras.datasets import imdb
from keras.layers import Dense, Dropout, Embedding
from keras.preprocessing import sequence
import keract
from tcn import TCN
max_features = 20000
# cut texts after this number of words
# (among top max_features most common words)
maxlen = 100
batch_size = 32
print('Loading data...')
(x_train, y_train), (x_test, y_test) = imdb.load_data(num_words=max_features)
print(len(x_train), 'train sequences')
print(len(x_test), 'test sequences')
print('Pad sequences (samples x time)')
x_train = sequence.pad_sequences(x_train, maxlen=maxlen)
x_test = sequence.pad_sequences(x_test, maxlen=maxlen)
print('x_train shape:', x_train.shape)
print('x_test shape:', x_test.shape)
y_train = np.array(y_train)
y_test = np.array(y_test)
i = Input(shape=(maxlen,))
x = Embedding(max_features, 128)(i)
x = TCN(nb_filters=64,
        kernel_size=6,
        dilations=[1, 2, 4, 8, 16, 32, 64])(x)
x = Dropout(0.5)(x)
x = Dense(1, activation='sigmoid')(x)
model = Model(inputs=[i], outputs=[x])
model.summary()
# try using different optimizers and different optimizer configs
model.compile('adam', 'binary_crossentropy', metrics=['accuracy'])
print('Train...')
model.fit(x_train, y_train,
          batch_size=batch_size,
          epochs=1,
          validation_data=(x_test, y_test))

# Extract the activations of every layer for the first training sample.
activations = keract.get_activations(model, x_train[0:1])
```

I added a `keract.get_activations` call at the end; it returns a dict mapping each layer's name to its activation array. Let me know if it helps :)
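For reference, here is a minimal sketch of inspecting what comes back (assuming the script above has run; the exact layer names depend on your model):

```python
# Print each layer's name next to the shape of its activations.
for name, act in activations.items():
    print(name, act.shape)

# The TCN layer's entry is the latent context for the sample; with the
# model above it should be a single 64-dim vector per input sequence.
```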
---

Issue resolved. I think there’s an example with keract in the examples folder.