Problem with the “full context embeddings” implementation

First, thank you for your code.
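But I am puzzled by your “full context embeddings” implementation. In the paper, g' and f' are processed differently: the support embeddings g' go through a bidirectional LSTM, while the query embedding f' goes through an attention LSTM that reads over the embedded support set. In your code, however, g' and f' are simply stacked and fed into a single LSTM.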
Could this be why the accuracy on miniImagenet is low?
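
To make the difference concrete, here is roughly how I read the two embeddings in the paper's appendix. This is only a minimal NumPy sketch under my own assumptions: the helper names (`lstm_cell`, `init_lstm`, `embed_support`, `embed_query`), the zero initial states, the random weights, and the sizes are all mine for illustration, and none of it is taken from this repository.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def init_lstm(rng, in_dim, state_dim, d):
    # Hypothetical helper: random weights for a cell with d units.
    W = rng.normal(scale=0.1, size=(4 * d, in_dim + state_dim))
    return W, np.zeros(4 * d)

def lstm_cell(x, h, c, W, b):
    # One LSTM step; the cell size d is taken from c.
    d = c.shape[0]
    z = W @ np.concatenate([x, h]) + b
    i, f, o = sigmoid(z[:d]), sigmoid(z[d:2 * d]), sigmoid(z[2 * d:3 * d])
    c_new = f * c + i * np.tanh(z[3 * d:])
    return o * np.tanh(c_new), c_new

def embed_support(g_prime, fwd_params, bwd_params):
    # g(x_i, S) = h_fwd_i + h_bwd_i + g'(x_i): bidirectional LSTM over S.
    n, d = g_prime.shape
    fwd, bwd = np.zeros((n, d)), np.zeros((n, d))
    h, c = np.zeros(d), np.zeros(d)
    for i in range(n):
        h, c = lstm_cell(g_prime[i], h, c, *fwd_params)
        fwd[i] = h
    h, c = np.zeros(d), np.zeros(d)
    for i in reversed(range(n)):
        h, c = lstm_cell(g_prime[i], h, c, *bwd_params)
        bwd[i] = h
    return fwd + bwd + g_prime

def embed_query(f_prime, g_S, params, K):
    # f(x_hat, S): attention LSTM run for K steps on the same input f'(x_hat).
    d = f_prime.shape[0]
    h, c = np.zeros(d), np.zeros(d)
    for _ in range(K):
        # Read-out r: softmax attention of h over the embedded support set.
        logits = g_S @ h
        a = np.exp(logits - logits.max())
        a /= a.sum()
        r = a @ g_S
        # LSTM step with [h, r] as the hidden input, then skip connection.
        h_hat, c = lstm_cell(f_prime, np.concatenate([h, r]), c, *params)
        h = h_hat + f_prime
    return h

rng = np.random.default_rng(0)
n, d, K = 5, 8, 3                      # illustrative sizes only
g_prime = rng.normal(size=(n, d))      # stand-in for CNN features of S
f_prime = rng.normal(size=d)           # stand-in for CNN features of the query
g_S = embed_support(g_prime, init_lstm(rng, d, d, d), init_lstm(rng, d, d, d))
f_q = embed_query(f_prime, g_S, init_lstm(rng, d, 2 * d, d), K)
```

As I understand it, the key point is that the query never enters the bidirectional LSTM; it only attends over the already embedded support set, which is not what stacking g' and f' into one LSTM does.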