Dual Learning for Machine Translation #13

flrngel opened this issue Mar 11, 2018 · 0 comments

https://arxiv.org/abs/1611.00179
Paper from USTC, PKU, and Microsoft Research (NIPS 2016)

Summary

Model

  1. Prepare two language models LM_A and LM_B, each trained on monolingual data for its language (En and Fr from WMT'14); each outputs the log probability of a sentence
  2. Two translation models P(·|s; Θ_AB) and P(·|s; Θ_BA) are needed
  3. Feed each translation model's sampled output to the other language's LM to get a reward, and train with policy gradient (see the sketch after this list)
  4. Swap the two languages' roles and continue training until the models converge
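
A minimal sketch of one game round, assuming a hypothetical `sample`/`log_prob` model interface (not the authors' code). The combined reward r = α·r1 + (1−α)·r2 follows the paper, but `alpha=0.5` below is an arbitrary placeholder; see the paper for the value they used.

```python
import random

class ToyModel:
    """Stand-in for a trained translation model or language model.
    The sample/log_prob interface is hypothetical, not the paper's code."""

    def sample(self, s):
        # Return a sampled "translation" and its log probability.
        return s[::-1], random.uniform(-10.0, 0.0)

    def log_prob(self, s, given=None):
        # Log probability of sentence s, optionally conditioned on `given`.
        return random.uniform(-10.0, 0.0)


def dual_game_round(s_a, translate_ab, translate_ba, lm_b, alpha=0.5, k=2):
    """One game round starting from sentence s_a in language A.

    Samples k mid-translations, scores each with the combined reward
    r = alpha * r1 + (1 - alpha) * r2 from the paper, and returns what a
    REINFORCE-style update would need.
    """
    samples = []
    for _ in range(k):
        s_mid, logp_ab = translate_ab.sample(s_a)     # step A -> B
        r1 = lm_b.log_prob(s_mid)                     # fluency reward from LM_B
        r2 = translate_ba.log_prob(s_a, given=s_mid)  # reconstruction reward
        r = alpha * r1 + (1 - alpha) * r2             # combined reward
        samples.append((s_mid, logp_ab, r))
    # Policy gradient (not shown): scale grad(logp_ab) by r to update
    # Theta_AB, use the gradient of r2 for Theta_BA; then restart from B.
    return samples


if __name__ == "__main__":
    m = ToyModel()
    print(dual_game_round("le chat est noir", m, m, m))
```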

Abstract

  • dual-NMT trains the two translation models through a reinforcement-learning process driven by monolingual data
  • the authors report that it works very well on English↔French translation

1. Introduction

  • Parallel data is costly to obtain for Machine Translation (MT)
  • Two methods using monolingual data had been proposed before this paper:
    1. train a language model on monolingual data, then integrate it with a model trained on parallel bilingual data
    2. generate pseudo sentence pairs (an unreliable method, since the pairs are synthetic; a small sketch follows this list):
      1. train a model on aligned parallel corpora
      2. use it to generate pseudo bilingual sentence pairs
      3. feed those pairs into a subsequent learning process
  • Dual learning mechanism
    • a two-agent communication game
    • feedback-based: each agent's output is evaluated by the other
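
A minimal sketch of the pseudo-pair method from the list above; `translate` is a hypothetical callable standing in for the model trained on aligned parallel corpora (step 1):

```python
def make_pseudo_pairs(monolingual_sentences, translate):
    """`translate` is a hypothetical callable: the model trained on the
    aligned parallel corpora (step 1 in the list above)."""
    pairs = []
    for src in monolingual_sentences:
        tgt = translate(src)      # synthetic translation
        pairs.append((src, tgt))  # noisy pseudo bilingual pair
    return pairs

# Step 3: mix these pairs into the real bilingual data and keep training.
print(make_pseudo_pairs(["le chat est noir"], str.upper))
```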

2. Background: Neural Machine Translation

  • reviews the typical attention-based NMT model (encoder-decoder with attention)
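
The standard formulation the section refers to (reconstructed here; the note does not spell it out): the decoder factorizes the target sentence token by token, and training maximizes the log likelihood over the bilingual corpus D:

```latex
P(y \mid x; \Theta) = \prod_{t=1}^{T_y} P(y_t \mid y_{<t}, x; \Theta),
\qquad
\Theta^{*} = \arg\max_{\Theta} \sum_{(x,y) \in D} \log P(y \mid x; \Theta)
```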

3. Dual Learning for Neural Machine Translation

[image from the paper]
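
Only an image survived in the original note; the core quantities of the paper's Section 3, reconstructed from the paper (check it for exact notation): for a game starting from sentence s in language A with sampled mid-translation s_mid,

```latex
r_1 = \mathrm{LM}_B(s_{\mathrm{mid}}), \qquad
r_2 = \log P(s \mid s_{\mathrm{mid}}; \Theta_{BA}), \qquad
r = \alpha\, r_1 + (1 - \alpha)\, r_2
```

Both Θ_AB and Θ_BA are updated by policy gradient (REINFORCE-style) using this reward, and a symmetric game is played starting from language B.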

5. Discussions

  • Dual learning is generally applicable to (and already appears in) many dual-task pairs:
    • Speech recognition vs. text-to-speech
    • Image captioning vs. image generation
    • Question answering vs. question generation
    • Search vs. keyword extraction
    • etc.
  • Not restricted to two tasks (can be generalized as closed-loop learning)
  • Not limited to a pair of languages: it can also use a tuple of three or more languages' monolingual data