TLDR; The authors train a DQN on text-based games. The main difference is that their Q-value function embeds the state (textual context) and each action (text-based choice) separately and then takes the dot product between the two embeddings. The authors call this architecture a Deep Reinforcement Relevance Network (DRRN); in essence, it is just a different Q-function implementation. Empirically, the authors show that their network learns to solve the "Saving John" and "Machine of Death" text games.
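
A minimal sketch of the idea, assuming bag-of-words text inputs and two-layer MLP encoders (layer sizes and input representation here are illustrative, not taken from the paper): the state text and each candidate action text are encoded separately, and the Q-value is the dot product of the two embeddings.

```python
import torch
import torch.nn as nn

class DRRN(nn.Module):
    """Sketch of a DRRN-style Q function: separate encoders for state and
    action text, Q-value = dot product of the resulting embeddings."""

    def __init__(self, vocab_size, hidden_dim=128, embed_dim=64):
        super().__init__()
        # Separate networks for the state text and the action text.
        self.state_net = nn.Sequential(
            nn.Linear(vocab_size, hidden_dim), nn.ReLU(),
            nn.Linear(hidden_dim, embed_dim),
        )
        self.action_net = nn.Sequential(
            nn.Linear(vocab_size, hidden_dim), nn.ReLU(),
            nn.Linear(hidden_dim, embed_dim),
        )

    def forward(self, state_bow, action_bows):
        # state_bow: (vocab_size,), action_bows: (num_actions, vocab_size)
        s = self.state_net(state_bow)     # (embed_dim,)
        a = self.action_net(action_bows)  # (num_actions, embed_dim)
        return a @ s                      # one Q-value per candidate action


# Usage: greedily pick the text action with the highest Q-value.
vocab_size, num_actions = 1000, 4
q_net = DRRN(vocab_size)
state = torch.rand(vocab_size)
actions = torch.rand(num_actions, vocab_size)
q_values = q_net(state, actions)
best_action = q_values.argmax().item()
```

Because actions are embedded rather than enumerated as fixed output units, the same Q function can score a variable number of natural-language choices per state, which is what makes it suitable for text games.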