LearnedSelfAttentionLayer requires fixed batch size for output()/fit() or else fails #7777
Trying to build a sequence classifier using a ComputationGraphConfiguration: word_input is a sequence of word embeddings (embedding dimension 300), and the maximum sequence length is 50.
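The graph is set up roughly like the following (a simplified sketch, not the exact configuration; the attention hyperparameters, pooling layer, and output layer here are illustrative):

```java
import org.deeplearning4j.nn.conf.ComputationGraphConfiguration;
import org.deeplearning4j.nn.conf.NeuralNetConfiguration;
import org.deeplearning4j.nn.conf.inputs.InputType;
import org.deeplearning4j.nn.conf.layers.GlobalPoolingLayer;
import org.deeplearning4j.nn.conf.layers.LearnedSelfAttentionLayer;
import org.deeplearning4j.nn.conf.layers.OutputLayer;
import org.deeplearning4j.nn.conf.layers.PoolingType;
import org.deeplearning4j.nn.graph.ComputationGraph;
import org.nd4j.linalg.activations.Activation;
import org.nd4j.linalg.learning.config.Adam;
import org.nd4j.linalg.lossfunctions.LossFunctions;

ComputationGraphConfiguration conf = new NeuralNetConfiguration.Builder()
        .updater(new Adam(1e-3))
        .graphBuilder()
        .addInputs("word_input")
        // sequences of 300-dimensional word embeddings, up to 50 time steps
        .setInputTypes(InputType.recurrent(300, 50))
        .addLayer("attention", new LearnedSelfAttentionLayer.Builder()
                .nIn(300).nOut(300)
                .nHeads(4).nQueries(10)
                .projectInput(true)
                .build(), "word_input")
        // pool the attention output down to a fixed-size vector per example
        .addLayer("pool", new GlobalPoolingLayer.Builder(PoolingType.MAX).build(), "attention")
        .addLayer("out", new OutputLayer.Builder(LossFunctions.LossFunction.MCXENT)
                .activation(Activation.SOFTMAX)
                .nIn(300).nOut(2)
                .build(), "pool")
        .setOutputs("out")
        .build();

ComputationGraph model = new ComputationGraph(conf);
model.init();
```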
After fitting some data using:
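(Sketch; trainIter and nEpochs stand in for my actual training iterator and epoch count.)

```java
// trainIter is a DataSetIterator over the training data, yielding
// minibatches of 16 sequences (features shape [16, 300, 50])
for (int epoch = 0; epoch < nEpochs; epoch++) {
    model.fit(trainIter);
}
```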
Using the SelfAttentionLayer works, but when I use the LearnedSelfAttentionLayer and call an output function like:
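(Illustrative call on the model sketched above, passing a single sequence instead of a full minibatch.)

```java
import org.nd4j.linalg.api.ndarray.INDArray;
import org.nd4j.linalg.factory.Nd4j;

// a single example: shape [minibatch = 1, embedding size = 300, sequence length = 50]
INDArray features = Nd4j.rand(new int[]{1, 300, 50});
INDArray[] predictions = model.output(features);   // fails here with LearnedSelfAttentionLayer
```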
it fails with the following stack trace:
It looks like the output function is expecting a batch of 16 inputs of shape [223, 50] instead of just the one.