
One-hot encoded input? #7

Closed

matthew-jurewicz opened this issue Jul 14, 2020 · 4 comments
@matthew-jurewicz

I'm looking through the code and I don't see the token IDs being converted to one-hot encoded vectors. Is the input to the language model (with the autoregressive wrapper) just the token IDs?

@lucidrains
Owner

@matthew-jurewicz Hi Matthew! Yup, you just pass the token ids, and make sure you instantiate the language model with num_tokens set to at least the maximum id plus one! The token embeddings are fetched from the embedding table here: https://github.com/lucidrains/routing-transformer/blob/master/routing_transformer/routing_transformer.py#L523

@matthew-jurewicz
Author

Excellent! I'm a big fan of your work!

@lucidrains
Owner

This was an implementation of someone else's research (https://openreview.net/forum?id=B1gjs6EtDr). Hope you find it useful!

@matthew-jurewicz
Author

What I mean is that, as far as I know, no one else has written code for a fully-functional sparse transformer, much less this one.
