- paper: Multiway Attention Networks for Modeling Sentence Pairs
- add QANet attention (trilinear attention)
- add SLQA attention (bilinear attention)
- use softmax to give each attention a weight
- add char embedding and align it with the word embedding
- QANet
- DGCNN: encoder with dilated convolutions
- multi-cast attention
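The QANet-style trilinear attention in the list above can be sketched in NumPy. The similarity is f(c, q) = w · [c; q; c ∘ q]; splitting w into three parts avoids materializing the full (n, m, 3d) tensor. All shapes and names here are illustrative, not the repo's actual code:

```python
import numpy as np

def trilinear_attention(C, Q, w):
    """QANet trilinear similarity: S[i, j] = w . [c_i; q_j; c_i * q_j].

    C: (n, d) context vectors, Q: (m, d) question vectors,
    w: (3d,) learned weight vector. Returns an (n, m) similarity matrix.
    """
    d = C.shape[1]
    w1, w2, w3 = w[:d], w[d:2 * d], w[2 * d:]
    # Decompose w.[c; q; c*q] = w1.c + w2.q + (w3*c).q so the whole
    # matrix is three cheap matrix products instead of an n*m*3d tensor.
    return (C @ w1)[:, None] + (Q @ w2)[None, :] + (C * w3) @ Q.T
```

Row-wise softmax over the resulting matrix then gives context-to-question attention weights, as in QANet.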
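The SLQA-style bilinear attention can be sketched the same way: the similarity is a learned bilinear form S[i, j] = p_i · W · q_j, followed by a row-wise softmax. This is a minimal sketch with illustrative shapes, not the repo's implementation:

```python
import numpy as np

def bilinear_attention(P, Q, W):
    """Bilinear similarity: S[i, j] = p_i . W . q_j.
    P: (n, d), Q: (m, d), W: (d, d) learned matrix. Returns (n, m)."""
    return P @ W @ Q.T

def attend(P, Q, W):
    """Each row of P attends over the rows of Q via softmaxed bilinear scores."""
    S = bilinear_attention(P, Q, W)
    A = np.exp(S - S.max(axis=1, keepdims=True))  # numerically stable softmax
    A /= A.sum(axis=1, keepdims=True)
    return A @ Q
```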
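"Use softmax to give each attention a weight" presumably means mixing the outputs of the different attention variants with learned scalar gates normalized by a softmax. A minimal sketch of that mixing step, with hypothetical names:

```python
import numpy as np

def weighted_attention_mix(outputs, logits):
    """Combine K attention outputs with softmax weights.

    outputs: list of K arrays, each (n, d) — one per attention variant.
    logits: (K,) learned scalars; softmax turns them into mixture weights.
    """
    w = np.exp(logits - logits.max())
    w /= w.sum()
    return sum(wi * o for wi, o in zip(w, outputs))
```

With equal logits this reduces to a plain average of the attention outputs; training moves the logits so the more useful attention dominates.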
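For the char-embedding bullet, one common way to align character-level and word-level representations (used in models like QANet, usually with a char CNN) is to pool over a word's character embeddings and concatenate the result onto the word vector. This sketch uses plain max-pooling as a simplification; shapes are illustrative:

```python
import numpy as np

def word_repr(word_emb, char_embs):
    """Concatenate a word embedding with a pooled char-level feature.

    word_emb: (dw,) word vector; char_embs: (num_chars, dc) per-character
    vectors. Max-pooling over characters is a stand-in for a char CNN.
    Returns a (dw + dc,) vector.
    """
    char_feat = char_embs.max(axis=0)
    return np.concatenate([word_emb, char_feat])
```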
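The DGCNN bullet relies on dilated convolutions, which widen the receptive field without extra parameters by skipping `dilation - 1` positions between kernel taps. A minimal 1-D reference implementation in NumPy (a real encoder would use a framework conv op; names and shapes here are illustrative):

```python
import numpy as np

def dilated_conv1d(x, kernel, dilation):
    """1-D dilated convolution with 'same' zero padding.

    x: (T, d_in) sequence, kernel: (k, d_in, d_out), dilation: gap + 1
    between taps. Receptive field is k + (k - 1) * (dilation - 1).
    """
    k, d_in, d_out = kernel.shape
    pad = (k - 1) * dilation // 2  # centers the window for odd k
    xp = np.pad(x, ((pad, pad), (0, 0)))
    T = x.shape[0]
    out = np.zeros((T, d_out))
    for t in range(T):
        for i in range(k):
            # tap i looks dilation * i steps ahead in the padded input
            out[t] += xp[t + i * dilation] @ kernel[i]
    return out
```

Stacking such layers with dilations 1, 2, 4, ... grows the receptive field exponentially, which is the point of the DGCNN encoder.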
BERT:
- extract Chinese character embeddings from the official TensorFlow checkpoint