
Seq2seq learning: how to pad the features? #3023

Closed
ipoletaev opened this issue Jun 19, 2016 · 1 comment

Comments


ipoletaev commented Jun 19, 2016

According to #395, we should pad all of the training sequences with a special padding symbol (0, for example) in order to make the feature vectors equal in length. But I don't understand how to do this from the code point of view.
Let's imagine we have the task of classifying the words in sentences. So, for example, we have an input vector with shape_in=(nb_samples,30), where 30 is the maximum number of words in a sentence. The output of the neural network has shape_out=(nb_samples,30,20), where 20 is the number of classes, for example.
Then, for the input "What are you doing right now?", the features and output will look as follows:

input_vector = [1243, 34, 5776, 54, 45, 878, 0, ..., 0]; len(input_vector) = 30. Each number corresponds to a word in the sentence, and 0 is the padding value used to reach the required length.
output_matrix = [
[1,0,...,0],   - an output vector of length 20 for "What", meaning this word belongs to class 1
[0,1,0,...,0], - "are" belongs to the second class, for example, and so on...
...            - rows for the remaining words, then all-zero rows for the padded positions
[0,0,...,0]    - 30th row: all zeros, for the last padding symbol
]
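
Here is a minimal sketch of how such padded arrays could be built (the word indices, labels, maxlen=30 and nb_classes=20 are just the illustrative numbers from this example):

import numpy as np
from keras.preprocessing.sequence import pad_sequences

maxlen = 30      # maximum number of words per sentence
nb_classes = 20  # number of word classes

# Hypothetical word-index sequences; index 0 is reserved for padding.
sentences = [[1243, 34, 5776, 54, 45, 878],
             [17, 908, 2]]

# Pad every sentence with trailing zeros up to maxlen -> shape (nb_samples, 30).
X = pad_sequences(sentences, maxlen=maxlen, padding='post', value=0)

# Hypothetical per-word class labels for each sentence.
labels = [[0, 1, 4, 4, 2, 7],
          [3, 3, 0]]

# One-hot targets -> shape (nb_samples, 30, 20); padded timesteps stay all zeros.
Y = np.zeros((len(sentences), maxlen, nb_classes))
for i, sentence_labels in enumerate(labels):
    for t, cls in enumerate(sentence_labels):
        Y[i, t, cls] = 1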

I have a few questions. The first one is a little off topic.

  • Do I understand correctly that I should end each sentence with a special additional symbol that has a fixed numerical (one-hot) representation, the same for every sentence? Maybe such a symbol is also needed at the beginning of every sentence?
  • As may already be clear, the first layer of the network I use is an Embedding layer, followed by a BiLSTM layer and a TimeDistributedDense with a softmax at the end (see the sketch after this list). And the actual question: besides setting the mask_zero=True flag in the Embedding layer, how should I specify output masking, so that when the network encounters an all-zero output vector for some word, it simply skips that timestep and moves on to the next example?
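
Here is roughly what I mean by the architecture (the vocabulary size and layer widths are just illustrative, and I'm assuming the Bidirectional/TimeDistributed wrappers available in recent Keras versions):

from keras.models import Sequential
from keras.layers import Embedding, LSTM, Bidirectional, TimeDistributed, Dense

vocab_size = 10000  # illustrative vocabulary size
maxlen = 30         # maximum sentence length, as above
nb_classes = 20     # number of word classes, as above

model = Sequential()
# mask_zero=True makes downstream layers skip the padded (index 0) timesteps.
model.add(Embedding(vocab_size, 128, input_length=maxlen, mask_zero=True))
model.add(Bidirectional(LSTM(64, return_sequences=True)))
# One softmax over the nb_classes classes at every timestep.
model.add(TimeDistributed(Dense(nb_classes, activation='softmax')))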

Many thanks for the detailed comments.

ipoletaev (Author) commented:

So, after some experiments I came to the following conclusions (a minimal sketch follows this list):

  1. You shouldn't use the mask_zero flag;
  2. For correct results you just need to pass a sample_weight numpy array of shape (nb_samples, timesteps), filled with ones and zeros depending on whether you want a given timestep to count towards the loss or not;
  3. When you compile the model, it's necessary to use the flag sample_weight_mode='temporal'.
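
A minimal sketch of points 2 and 3, assuming the padded X and one-hot Y arrays from the example above (the optimizer, loss and batch size are only illustrative):

# Point 3: compile with per-timestep sample weighting enabled.
model.compile(optimizer='adam',
              loss='categorical_crossentropy',
              sample_weight_mode='temporal')

# Point 2: a 2D weight array of shape (nb_samples, timesteps);
# 1 for real words, 0 for padded positions, so padding does not contribute to the loss.
sample_weight = (X != 0).astype('float32')

model.fit(X, Y, batch_size=32, sample_weight=sample_weight)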
