In your paper, you use a unidirectional ON-LSTM to train a language model, and then parse sentences using the syntactic distances output by the pretrained language model. Since the model is unidirectional, how can we justify that the level assigned to the first token is independent of the future tokens? Is there any bidirectional way to do it?
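For reference, here is my understanding of the decoding step: a greedy top-down split on the predicted distances, roughly like the minimal sketch below (the function name and the toy distance values are mine, not from the released code):

```python
def distances_to_tree(words, distances):
    """Greedy top-down parse: recursively split at the largest distance.

    words: list of n tokens
    distances: list of n-1 scores, distances[i] between words[i] and words[i+1]
    """
    if len(words) == 1:
        return words[0]
    # The largest syntactic distance marks the weakest link between
    # adjacent tokens, so the top-level split goes there.
    split = max(range(len(distances)), key=lambda i: distances[i])
    left = distances_to_tree(words[:split + 1], distances[:split])
    right = distances_to_tree(words[split + 1:], distances[split + 1:])
    return (left, right)


# Toy example with made-up distances, not from a trained model:
print(distances_to_tree(["the", "cat", "sat"], [0.2, 0.9]))
# -> (('the', 'cat'), 'sat')
```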
I think you can try a bidirectional language model (e.g. ELMo) or a masked language model (e.g. BERT). But the perplexity won't be comparable to previous language models.