SeanvonB/language-translator

Language Translator

LIVE DEVELOPMENT NOTEBOOK

This project was part of my Natural Language Processing Nanodegree, which I completed in late 2020. This particular Nanodegree, and in fact this particular project, had been my goal throughout my studies of machine learning. I was excited to work on it back then, and I'm still excited to share the work with you now. Machine translation has a long and fascinating history that involved many different approaches before the widespread commercial adoption of Neural Machine Translation (NMT) around 2016. The following NMT pipeline, which I created with TensorFlow via Keras, reflects state-of-the-art practices from that period, but it was already somewhat outdated when I built it in 2020, thanks largely to Google Brain's attention-based Transformer model.

The development notebook examines and preprocesses the data, tests a couple of RNN architecture features, and finally assembles a bidirectional RNN model with an embedding layer for English-to-French translation. The final model reaches a validation accuracy of 95% after just 10 training epochs.
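To make the preprocessing step concrete, here is a minimal plain-Python sketch of turning sentences into padded word-ID sequences. It is not the notebook's code: the helper names `build_vocab`, `encode`, and `pad`, the whitespace tokenization, and the two sample sentences are all assumptions made for illustration.

```python
# Illustrative stand-in for the preprocessing step; helper names and
# sample sentences are hypothetical, not taken from the notebook.

def build_vocab(sentences):
    """Map each distinct lowercase token to an integer ID (0 is reserved for padding)."""
    vocab = {}
    for sentence in sentences:
        for token in sentence.lower().split():
            if token not in vocab:
                vocab[token] = len(vocab) + 1
    return vocab


def encode(sentences, vocab):
    """Replace every token with its word ID."""
    return [[vocab[token] for token in s.lower().split()] for s in sentences]


def pad(sequences, length=None, pad_id=0):
    """Right-pad (or truncate) every sequence to the same length."""
    length = length or max(len(seq) for seq in sequences)
    return [seq[:length] + [pad_id] * (length - len(seq)) for seq in sequences]


english = ["the cat sleeps", "the dog sleeps quietly"]
vocab = build_vocab(english)
padded = pad(encode(english, vocab))
print(padded)  # [[1, 2, 3, 0], [1, 4, 3, 5]]
```

Padding to a fixed length is what lets a batch of sentences be stacked into a single integer array before it is fed to a model.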

Features

  • Translate English text into French text
  • Create a pipeline that could be modified to translate between any languages
  • Test the performance difference between word IDs and embeddings
  • Test the performance difference and training needs of simple and bidirectional RNNs
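
As a rough illustration of the kind of model these features refer to, the sketch below builds a bidirectional RNN with an embedding layer in Keras. It is a sketch under stated assumptions, not the notebook's architecture: the function name `build_model`, the choice of GRU cells, the layer sizes, and the vocabulary sizes are all hypothetical, and it assumes source sentences are padded to the same length as the target sentences so that one French token is predicted per time step.

```python
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers


def build_model(seq_len, src_vocab_size, tgt_vocab_size, embed_dim=64, units=128):
    """Hypothetical embedding + bidirectional GRU model (sizes are assumptions)."""
    model = keras.Sequential([
        keras.Input(shape=(seq_len,)),
        # Learn a dense vector per word ID instead of feeding raw IDs.
        layers.Embedding(src_vocab_size, embed_dim),
        # Read the sentence in both directions, keeping one output per time step.
        layers.Bidirectional(layers.GRU(units, return_sequences=True)),
        # Predict a distribution over the target vocabulary at every time step.
        layers.TimeDistributed(layers.Dense(tgt_vocab_size, activation="softmax")),
    ])
    model.compile(
        loss="sparse_categorical_crossentropy",
        optimizer="adam",
        metrics=["accuracy"],
    )
    return model


model = build_model(seq_len=20, src_vocab_size=200, tgt_vocab_size=350)
dummy_batch = np.zeros((2, 20), dtype="int32")  # two padded sentences of word IDs
probs = model.predict(dummy_batch, verbose=0)
print(probs.shape)  # (2, 20, 350)
```

Dropping the `Embedding` layer (and feeding word IDs reshaped to one feature per step) or removing the `Bidirectional` wrapper gives the simpler variants that the comparisons above are measured against.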

Credits

License

Copyright © 2020-2022 Sean von Bayern
Licensed under the MIT License