
llravelo/mt-coe197


Machine Translation Using Sequence to Sequence Models

Members:

  1. Leiko Ravelo
  2. Paolo Valdez
  3. Darwin Bautista

Presentation Materials

  1. Google slides
  2. LaTeX project is in Slack chat

Development Environment

Preferably, work in a virtual environment (Python 3). See this guide for installing virtual environments.

I also thought it would be a good idea to use Jupyter notebooks for this.
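A typical setup on Linux/macOS could look like this (the environment name mt-env is just an example):

```shell
# Create and activate a Python 3 virtual environment, then install the packages.
# "mt-env" is an arbitrary example name.
python3 -m venv mt-env
source mt-env/bin/activate
pip install keras tensorflow jupyter ipython
```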

Packages (so far):

  • TensorFlow
  • Keras
  • Jupyter
  • IPython

pip install keras tensorflow jupyter ipython

The following command opens the Jupyter environment in a browser.

jupyter notebook

It's also possible to run a Jupyter notebook remotely. See running a notebook server. (Can access through VPN with Opera.)

Useful Resources

Add here some resources you think might be useful for the project.

  • Hyperparameter optimization: hyperopt
  • Save and Load Keras Models: link
  • Stanford CS224n NLP with DL: link
  • Hyperparameter optimization guide: link
  • Machine Translation Best Practices "mini guide": link

Keras Tutorials:

  • Francois Chollet DL with Python: link
  • ML-AI experiments: link
  • NMT-Keras: link

Research Papers

Add relevant research papers here.

Tasks

Dataset Creation

Once we have the dataset available, curate it and format it properly (a tab-separated text file, one source/target pair per line).
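As a sketch, loading such a file could look like this (the sample sentence pairs below are only illustrative):

```python
# Parse a tab-separated parallel corpus: one "source<TAB>target" pair per line.
def load_pairs(path):
    pairs = []
    with open(path, encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if not line:
                continue  # skip blank lines
            source, target = line.split("\t", 1)
            pairs.append((source, target))
    return pairs

# Illustrative sample file; the real dataset would follow the same layout.
with open("pairs.txt", "w", encoding="utf-8") as f:
    f.write("Good morning.\tMagandang umaga.\nThank you.\tSalamat.\n")

pairs = load_pairs("pairs.txt")
```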

Data Preprocessing

Convert the text data from the file into an acceptable representation. We may also need to remove punctuation (commas, periods); this must be settled early on. There's one-hot representation, which is pretty easy to implement. word2vec and GloVe are also available but may take some time to implement.
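A minimal pure-Python sketch of the one-hot idea (a real pipeline would use NumPy arrays or Keras's text utilities instead):

```python
# Build a word index from a tokenized corpus, then one-hot encode a sentence.
def build_vocab(sentences):
    vocab = sorted({tok for s in sentences for tok in s})
    return {tok: i for i, tok in enumerate(vocab)}

def one_hot(sentence, index):
    vectors = []
    for tok in sentence:
        vec = [0] * len(index)
        vec[index[tok]] = 1  # exactly one position set per token
        vectors.append(vec)
    return vectors

sentences = [["i", "am", "here"], ["you", "are", "here"]]
index = build_vocab(sentences)          # {'am': 0, 'are': 1, 'here': 2, 'i': 3, 'you': 4}
encoded = one_hot(sentences[0], index)
```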

Relevant Resources:

Model Design and Metric checking

The objective of the project is to create an optimal working machine translator. Define objectives and a success metric. There's the BLEU score, but let's wait for sir's definition. The standard RNN architecture ('vanilla model') used for machine translation can be seen in the Stanford lectures.
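For reference, a simplified sentence-level BLEU (modified n-gram precision up to bigrams, with a brevity penalty) can be computed in plain Python; actual evaluation would use a standard implementation such as NLTK's:

```python
import math
from collections import Counter

def ngrams(tokens, n):
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def bleu(candidate, reference, max_n=2):
    """Geometric mean of modified n-gram precisions times a brevity penalty."""
    precisions = []
    for n in range(1, max_n + 1):
        cand_counts = Counter(ngrams(candidate, n))
        ref_counts = Counter(ngrams(reference, n))
        # Clip each candidate n-gram count by its count in the reference.
        overlap = sum(min(c, ref_counts[g]) for g, c in cand_counts.items())
        precisions.append(overlap / max(sum(cand_counts.values()), 1))
    if min(precisions) == 0:
        return 0.0
    # Penalize candidates shorter than the reference.
    bp = 1.0 if len(candidate) >= len(reference) else math.exp(1 - len(reference) / len(candidate))
    return bp * math.exp(sum(math.log(p) for p in precisions) / max_n)

score = bleu("the cat is on the mat".split(), "the cat sat on the mat".split())
```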

I think it's also important to decouple training and actual translation. We need a way to save the model and load it in a different Python file. -Leiko

Relevant Resources:

Hyperparameter Tuning

The task is to identify which hyperparameters are important and to find generally accepted ranges for them. One solution is to do a random search through the hyperparameter space to find the model that gives optimal results.

I have found hyperopt although I'm not sure how it works yet.

Possible hyperparameters include:

  1. Learning rate
  2. Gradient Descent Optimizer (SGD, Adam, RMSprop)
  3. Minibatch size
  4. Epochs
  5. RNN layers (1-4)
  6. Choice of RNN architecture (GRU, RNN, LSTM)
  7. Misc (attention model, bidirectional LSTM, deep LSTM)

I'm not sure yet which of the above are important. -Leiko
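A random search over the space above can be sketched in plain Python. The ranges are illustrative assumptions, and evaluate(config) is a placeholder that would train the model with the given settings and return a validation score (e.g. BLEU):

```python
import random

# Illustrative search space; real ranges should come from the literature.
SEARCH_SPACE = {
    "learning_rate": lambda: 10 ** random.uniform(-4, -2),  # log-uniform
    "optimizer": lambda: random.choice(["sgd", "adam", "rmsprop"]),
    "batch_size": lambda: random.choice([32, 64, 128]),
    "epochs": lambda: random.randint(5, 30),
    "rnn_layers": lambda: random.randint(1, 4),
    "cell": lambda: random.choice(["gru", "rnn", "lstm"]),
}

def sample_config():
    return {name: draw() for name, draw in SEARCH_SPACE.items()}

def random_search(evaluate, trials=20):
    """Keep the configuration with the highest score from `evaluate`."""
    best_config, best_score = None, float("-inf")
    for _ in range(trials):
        config = sample_config()
        score = evaluate(config)
        if score > best_score:
            best_config, best_score = config, score
    return best_config, best_score
```

hyperopt does something smarter than this (tree-structured Parzen estimators), but plain random search is a reasonable baseline to start with.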
