Batak Toba-Indonesian machine translation with transfer learning using No Language Left Behind (NLLB)
This repository contains the implementation of a machine translation model based on the research paper [COMING SOON🤞]. The model architecture and training procedure are described in detail in the provided Jupyter Notebook file.
- Transformers 4.34.1
- PyTorch 2.1.0+cu118
- Datasets 2.13.1
- Tokenizers 0.14.1
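To confirm that your environment matches the versions listed above, a small check along these lines may help. It is a minimal sketch rather than part of the repository: the helper name `check_versions` is hypothetical, and it assumes the packages were installed under their usual PyPI names (`transformers`, `torch`, `datasets`, `tokenizers`).

```python
from importlib.metadata import PackageNotFoundError, version

# Versions listed in this README, keyed by PyPI distribution name (assumed names).
PINNED = {
    "transformers": "4.34.1",
    "torch": "2.1.0+cu118",
    "datasets": "2.13.1",
    "tokenizers": "0.14.1",
}


def check_versions(pinned=PINNED, get_version=version):
    """Return {package: (expected, installed)} for each mismatch or missing package."""
    problems = {}
    for name, expected in pinned.items():
        try:
            installed = get_version(name)
        except PackageNotFoundError:
            installed = None  # package is not installed at all
        if installed != expected:
            problems[name] = (expected, installed)
    return problems


if __name__ == "__main__":
    for name, (expected, installed) in check_versions().items():
        print(f"{name}: expected {expected}, found {installed or 'not installed'}")
```

An empty result means every listed package is installed at the listed version. A mismatch is not necessarily a problem, but it is worth knowing about when results differ.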
- Clone this repository to your local machine:
git clone https://github.com/caffeineeee/batak_toba_indonesian_nmt.git
cd batak_toba_indonesian_nmt
- Open the Jupyter Notebook file batak_toba_indonesian_nmt.ipynb using Jupyter Notebook:
jupyter notebook batak_toba_indonesian_nmt.ipynb
Alternatively, run it in Google Colab. For faster training, change the runtime type to a T4 GPU or better.
- Follow the instructions in the notebook to train the machine translation model, evaluate its performance, and experiment with different parameters.
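Once a model has been trained, translating new sentences could look roughly like the sketch below. This is an illustration under assumptions, not the notebook's own code: the helper names `translate` and `load` are hypothetical, the checkpoint path is whatever directory you saved the fine-tuned model to, and the source language code depends on how the notebook configured the tokenizer. `ind_Latn` is the NLLB code for Indonesian.

```python
def translate(texts, tokenizer, model, tgt_lang="ind_Latn", max_length=128):
    """Translate a list of sentences with an NLLB-style seq2seq model."""
    inputs = tokenizer(texts, return_tensors="pt", padding=True, truncation=True)
    generated = model.generate(
        **inputs,
        # NLLB selects the output language through the first generated token.
        forced_bos_token_id=tokenizer.convert_tokens_to_ids(tgt_lang),
        max_length=max_length,
    )
    return tokenizer.batch_decode(generated, skip_special_tokens=True)


def load(checkpoint, src_lang):
    """Load a saved checkpoint (hypothetical path) with the given source language code."""
    # Imported here so that translate() can be used without importing transformers.
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(checkpoint, src_lang=src_lang)
    model = AutoModelForSeq2SeqLM.from_pretrained(checkpoint)
    return tokenizer, model


# Example usage (both arguments are placeholders to fill in yourself):
# tokenizer, model = load("<path to your saved checkpoint>", "<source language code>")
# print(translate(["<a Batak Toba sentence>"], tokenizer, model))
```

Passing the tokenizer and model in as arguments keeps `translate` independent of how the checkpoint was loaded, so the same function can be reused inside the notebook with objects that are already in memory.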
The code is licensed under the MIT License.