
llravelo/mt-coe197


Machine Translation Using Sequence to Sequence Models

Members:

  1. Leiko Ravelo
  2. Paolo Valdez
  3. Darwin Bautista

Presentation Materials

  1. Google slides
  2. LaTeX project is in Slack chat

Development Environment

Preferably, work in a virtual environment (Python 3). See this guide for installing virtual environments.

I also thought it would be a good idea to use Jupyter notebooks for this.
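A typical setup on Linux/macOS could look like this (the environment name mt-env is just an example):

```shell
# Create and activate a Python 3 virtual environment, then install the packages.
# "mt-env" is an arbitrary example name.
python3 -m venv mt-env
source mt-env/bin/activate
pip install keras tensorflow jupyter ipython
```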

Packages (so far):

  • TensorFlow
  • Keras
  • Jupyter
  • IPython

pip install keras tensorflow jupyter ipython

The following command opens the Jupyter environment in a browser.

jupyter notebook

It's also possible to run a Jupyter notebook remotely. See running a notebook server. (Can access through VPN with Opera.)

Useful Resources

Add here some resources you think might be useful for the project.

  • Hyperparameter optimization: hyperopt
  • Save and Load Keras Models: link
  • Stanford CS224n NLP with DL: link
  • Hyperparameter optimization guide: link
  • Machine Translation Best Practices "mini guide": link

Keras Tutorials:

  • Francois Chollet DL with Python: link
  • ML-AI experiments: link
  • NMT-Keras: link

Research Papers

Add relevant research papers here.

Tasks

Dataset Creation

Once we have the dataset available, curate it and format it properly (a tab-separated text file, one source/target pair per line).
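As a sketch, loading such a file could look like this (the sample sentence pairs below are only illustrative):

```python
# Parse a tab-separated parallel corpus: one "source<TAB>target" pair per line.
def load_pairs(path):
    pairs = []
    with open(path, encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if not line:
                continue  # skip blank lines
            source, target = line.split("\t", 1)
            pairs.append((source, target))
    return pairs

# Illustrative sample file; the real dataset would follow the same layout.
with open("pairs.txt", "w", encoding="utf-8") as f:
    f.write("Good morning.\tMagandang umaga.\nThank you.\tSalamat.\n")

pairs = load_pairs("pairs.txt")
```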

Data Preprocessing

Convert the text data from the file into an acceptable representation. We may also need to remove punctuation (commas, periods); this must be settled early on. There's one-hot representation, which is pretty easy to implement. word2vec and GloVe are also available but may take some time to implement.
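A minimal pure-Python sketch of the one-hot idea (a real pipeline would use NumPy arrays or Keras's text utilities instead):

```python
# Build a word index from a tokenized corpus, then one-hot encode a sentence.
def build_vocab(sentences):
    vocab = sorted({tok for s in sentences for tok in s})
    return {tok: i for i, tok in enumerate(vocab)}

def one_hot(sentence, index):
    vectors = []
    for tok in sentence:
        vec = [0] * len(index)
        vec[index[tok]] = 1  # exactly one position set per token
        vectors.append(vec)
    return vectors

sentences = [["i", "am", "here"], ["you", "are", "here"]]
index = build_vocab(sentences)          # {'am': 0, 'are': 1, 'here': 2, 'i': 3, 'you': 4}
encoded = one_hot(sentences[0], index)
```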

Relevant Resources:

Model Design and Metric checking

The objective of the project is to create an optimal working machine translator. Define objectives and a success metric. There's the BLEU score, but let's wait for sir's definition. The standard RNN architecture ('vanilla model') used for machine translation can be seen in the Stanford lectures.
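For reference, a simplified sentence-level BLEU (modified n-gram precision up to bigrams, with a brevity penalty) can be computed in plain Python; actual evaluation would use a standard implementation such as NLTK's:

```python
import math
from collections import Counter

def ngrams(tokens, n):
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def bleu(candidate, reference, max_n=2):
    """Geometric mean of modified n-gram precisions times a brevity penalty."""
    precisions = []
    for n in range(1, max_n + 1):
        cand_counts = Counter(ngrams(candidate, n))
        ref_counts = Counter(ngrams(reference, n))
        # Clip each candidate n-gram count by its count in the reference.
        overlap = sum(min(c, ref_counts[g]) for g, c in cand_counts.items())
        precisions.append(overlap / max(sum(cand_counts.values()), 1))
    if min(precisions) == 0:
        return 0.0
    # Penalize candidates shorter than the reference.
    bp = 1.0 if len(candidate) >= len(reference) else math.exp(1 - len(reference) / len(candidate))
    return bp * math.exp(sum(math.log(p) for p in precisions) / max_n)

score = bleu("the cat is on the mat".split(), "the cat sat on the mat".split())
```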

I think it's also important to decouple training and actual translation. We need a way to save the model and load it in a different Python file. -Leiko

Relevant Resources:

Hyperparameter Tuning

The task is to identify which hyperparameters are important and to find generally accepted ranges for them. One solution is to do a random search through the hyperparameter space to find the model that gives optimal results.

I have found hyperopt although I'm not sure how it works yet.

Possible hyperparameters include:

  1. Learning rate
  2. Gradient Descent Optimizer (SGD, Adam, RMSprop)
  3. Minibatch size
  4. Epochs
  5. RNN layers (1-4)
  6. Choice of RNN architecture (GRU, RNN, LSTM)
  7. Misc (attention model, bidirectional LSTM, deep LSTM)

I'm not sure yet which of the above are important. -Leiko
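A random search over the space above can be sketched in plain Python. The ranges are illustrative assumptions, and evaluate(config) is a placeholder that would train the model with the given settings and return a validation score (e.g. BLEU):

```python
import random

# Illustrative search space; real ranges should come from the literature.
SEARCH_SPACE = {
    "learning_rate": lambda: 10 ** random.uniform(-4, -2),  # log-uniform
    "optimizer": lambda: random.choice(["sgd", "adam", "rmsprop"]),
    "batch_size": lambda: random.choice([32, 64, 128]),
    "epochs": lambda: random.randint(5, 30),
    "rnn_layers": lambda: random.randint(1, 4),
    "cell": lambda: random.choice(["gru", "rnn", "lstm"]),
}

def sample_config():
    return {name: draw() for name, draw in SEARCH_SPACE.items()}

def random_search(evaluate, trials=20):
    """Keep the configuration with the highest score from `evaluate`."""
    best_config, best_score = None, float("-inf")
    for _ in range(trials):
        config = sample_config()
        score = evaluate(config)
        if score > best_score:
            best_config, best_score = config, score
    return best_config, best_score
```

hyperopt does something smarter than this (tree-structured Parzen estimators), but plain random search is a reasonable baseline to start with.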
