
Transformer_English_Chinese_Translation

Transformer

If you had to name the model that has made the biggest contribution to NLP (natural language processing) in recent years,
the Transformer would definitely be the first choice.

When the paper "Attention Is All You Need" was published in 2017, it caused a great sensation.

This is because the Transformer builds its model from the attention mechanism alone, rather than using RNNs or CNNs like traditional seq2seq models. Through attention it can capture the semantics of a sentence well, and, most importantly, it can run in parallel on a GPU.
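
As a minimal sketch of that mechanism (following the scaled dot-product attention formula from the paper, not code taken from this repository):

```python
import tensorflow as tf

def scaled_dot_product_attention(q, k, v, mask=None):
    """Attention(Q, K, V) = softmax(QK^T / sqrt(d_k)) V, as in the paper."""
    matmul_qk = tf.matmul(q, k, transpose_b=True)     # (..., seq_q, seq_k)
    dk = tf.cast(tf.shape(k)[-1], tf.float32)
    logits = matmul_qk / tf.math.sqrt(dk)             # scale by sqrt(d_k)
    if mask is not None:
        logits += (mask * -1e9)                       # hide padding / future tokens
    weights = tf.nn.softmax(logits, axis=-1)          # attention weights
    return tf.matmul(weights, v), weights             # weighted sum of the values
```

Because every position's weights come from matrix multiplications rather than a sequential recurrence, all positions can be processed at once on a GPU.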

It is also thanks to the emergence of the Transformer that many better-performing models have appeared,
such as BERT and GPT-3.

Network Structure

[Figure: Transformer network architecture]

Hyperparameters

  • BATCH_SIZE = 64
  • num_layers = 4
  • d_model = 512
  • dff = 2048
  • num_heads = 8
  • dropout_rate = 0.2
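
As a sketch of where these hyperparameters go (built with standard Keras layers for illustration; this is not the repository's own model code):

```python
import tensorflow as tf

num_layers, d_model, dff, num_heads, dropout_rate = 4, 512, 2048, 8, 0.2

class EncoderLayer(tf.keras.layers.Layer):
    """One Transformer encoder layer: self-attention + feed-forward."""
    def __init__(self):
        super().__init__()
        # d_model is split evenly across the num_heads attention heads
        self.mha = tf.keras.layers.MultiHeadAttention(
            num_heads=num_heads, key_dim=d_model // num_heads)
        # position-wise feed-forward network: expand to dff, project back
        self.ffn = tf.keras.Sequential([
            tf.keras.layers.Dense(dff, activation='relu'),
            tf.keras.layers.Dense(d_model),
        ])
        self.norm1 = tf.keras.layers.LayerNormalization()
        self.norm2 = tf.keras.layers.LayerNormalization()
        self.drop = tf.keras.layers.Dropout(dropout_rate)

    def call(self, x, training=False):
        attn = self.mha(x, x, x)  # self-attention over the sequence
        x = self.norm1(x + self.drop(attn, training=training))
        x = self.norm2(x + self.drop(self.ffn(x), training=training))
        return x

# The encoder (and, symmetrically, the decoder) stacks num_layers of these.
encoder = tf.keras.Sequential([EncoderLayer() for _ in range(num_layers)])
```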

Data

This implementation uses the "wmt_translate" dataset from TensorFlow Datasets,
keeping only sentence pairs with length < 35.
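
A sketch of that filtering step (the exact TFDS dataset name and config are assumptions, since the README only says "wmt_translate"):

```python
import tensorflow as tf
import tensorflow_datasets as tfds

MAX_LEN = 35  # keep only sentence pairs shorter than 35 tokens

# Assumed dataset name; TFDS exposes zh-en pairs under names such as
# 'wmt19_translate/zh-en'.
ds = tfds.load('wmt19_translate/zh-en', split='train', as_supervised=True)

def short_enough(zh, en):
    # as_supervised yields (source, target) string pairs; here we filter
    # on the whitespace token count of the English side. A real pipeline
    # would more likely filter on subword-tokenized length.
    return tf.size(tf.strings.split(en)) < MAX_LEN

train_ds = ds.filter(short_enough).shuffle(20_000).batch(64)
```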

Loss and Accuracy

[Figure: training loss and accuracy curves]

Result

[Figure: attention-weight heatmaps for each head]

As the attention heatmaps show, the model surprisingly concentrates on certain English words when generating the corresponding Chinese words: for example, "years" corresponds to "年" in the first subgraph, and "economy" corresponds to "經濟" in head 2 and head 5.
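
Such heatmaps can be reproduced from the per-head attention weights the decoder returns (a sketch; the repository's own plotting code may differ, and matplotlib needs a CJK-capable font to render the Chinese labels):

```python
import matplotlib.pyplot as plt

def plot_attention_head(attention, in_tokens, out_tokens):
    """attention: array of shape (len(out_tokens), len(in_tokens)) holding
    one head's weights; rows are generated Chinese tokens, columns are
    English source tokens, and bright cells mark where the model attended."""
    fig, ax = plt.subplots()
    ax.matshow(attention)
    ax.set_xticks(range(len(in_tokens)))
    ax.set_yticks(range(len(out_tokens)))
    ax.set_xticklabels(in_tokens, rotation=90)
    ax.set_yticklabels(out_tokens)
    plt.show()
```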

References

  1. Attention Is All You Need [arXiv]
  2. Transformer model for language understanding [https://www.tensorflow.org/text/tutorials/transformer]
  3. LeeMeng, Neural machine translation with Transformer and TensorFlow 2 [https://leemeng.tw/neural-machine-translation-with-transformer-and-tensorflow2.html]
