This project implements a neural machine translation system for translating English text to Hindi using a sequence-to-sequence (Seq2Seq) architecture with an attention mechanism. The model is built with PyTorch and features a GRU-based encoder-decoder network.
- Seq2Seq architecture with attention mechanism
- GRU-based encoder and decoder
- Customizable model parameters (hidden size, embedding dimensions, etc.)
- Interactive translation through a command-line interface
- Web interface using Gradio
- Support for vocabulary management and data preprocessing
- Teacher forcing during training
├── app.py # Gradio web interface for translation
├── inference.py # Functions for model inference
├── train.py # Model training script
├── models/ # Neural network architecture components
│ ├── encoder.py # Encoder implementation
│ ├── decoder.py # Decoder implementation
│ ├── attention.py # Attention mechanism
│ └── seq2seq.py # Seq2Seq model
├── utils/ # Utility functions
│ ├── config.py # Configuration parameters
│ ├── data_loader.py # Data loading utilities
│ └── preprocessing.py # Text preprocessing functions
├── data/ # Data directory
│ └── hindi_english_parallel.csv # Parallel corpus
└── bin/ # Model checkpoints and vocabularies
├── seq2seq.pth # Trained model weights
├── eng_vocab.pkl # English vocabulary
└── hin_vocab.pkl # Hindi vocabulary
- Clone the repository:

```
git clone https://github.com/yourusername/english-hindi-translator.git
cd english-hindi-translator
```

- Install the required packages:

```
pip install -r requirements.txt
```

To train the translation model:

```
python train.py
```

This will:
- Load and preprocess the Hindi-English parallel corpus
- Build vocabularies for both languages
- Initialize and train the Seq2Seq model
- Save model checkpoints after each epoch
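The README doesn't show how the vocabularies are built, but a minimal sketch might look like the following. The Vocab class and its method names are illustrative assumptions, not the project's actual API:

```python
# Illustrative vocabulary builder, assuming frequency-capped vocabularies
# like the ones pickled to bin/eng_vocab.pkl and bin/hin_vocab.pkl.
from collections import Counter

class Vocab:
    def __init__(self, max_size):
        self.max_size = max_size
        self.counter = Counter()
        # Reserve the usual special tokens first.
        self.word2index = {"<pad>": 0, "<sos>": 1, "<eos>": 2, "<unk>": 3}

    def add_sentence(self, sentence):
        # Whitespace tokenization; the repo's preprocessing.py may do more.
        self.counter.update(sentence.split())

    def build(self):
        # Keep only the most frequent words, up to max_size entries total.
        budget = self.max_size - len(self.word2index)
        for word, _ in self.counter.most_common(budget):
            self.word2index.setdefault(word, len(self.word2index))

    def encode(self, sentence):
        unk = self.word2index["<unk>"]
        return [self.word2index.get(w, unk) for w in sentence.split()]
```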
For interactive translation through the command line:

```
python inference.py
```

To launch the web interface for translation:

```
python app.py
```

This will start a Gradio interface that you can access in your web browser.
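As a rough sketch of what app.py likely wires together (the `translate` helper and its import path are assumptions, not confirmed by this README):

```python
# Hypothetical app.py sketch: expose the translator through Gradio.
import gradio as gr

from inference import translate  # assumed: translate(text: str) -> str

demo = gr.Interface(
    fn=translate,                        # called on each submitted sentence
    inputs=gr.Textbox(label="English"),
    outputs=gr.Textbox(label="Hindi"),
    title="English → Hindi Translator",
)

demo.launch()  # serves on http://127.0.0.1:7860 by default
```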
- Encoder: GRU-based with configurable layers and embedding dimensions
- Decoder: GRU with attention mechanism
- Attention: Calculates attention scores between encoder outputs and decoder hidden states
- Training: Uses teacher forcing and cross-entropy loss
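For concreteness, here is a minimal additive (Bahdanau-style) attention module. It is one common way to compute the scores described above; the repo's attention.py may differ in its details:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class Attention(nn.Module):
    """Additive (Bahdanau-style) attention over encoder outputs."""
    def __init__(self, hidden_size):
        super().__init__()
        self.W = nn.Linear(hidden_size * 2, hidden_size)
        self.v = nn.Linear(hidden_size, 1, bias=False)

    def forward(self, decoder_hidden, encoder_outputs):
        # decoder_hidden:  (batch, hidden)
        # encoder_outputs: (batch, src_len, hidden)
        src_len = encoder_outputs.size(1)
        hidden = decoder_hidden.unsqueeze(1).expand(-1, src_len, -1)
        energy = torch.tanh(self.W(torch.cat((hidden, encoder_outputs), dim=2)))
        scores = self.v(energy).squeeze(2)                 # (batch, src_len)
        weights = F.softmax(scores, dim=1)                 # attention distribution
        context = torch.bmm(weights.unsqueeze(1), encoder_outputs)  # (batch, 1, hidden)
        return context, weights
```

Teacher forcing is then typically a per-step coin flip inside the decoding loop: with probability teacher_forcing_ratio the decoder is fed the ground-truth target token, and otherwise its own previous prediction.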
Model parameters can be adjusted in the config.py file:
- embedding_dim: Size of word embeddings
- hidden_size: Size of hidden layers
- num_layers: Number of RNN layers
- dropout: Dropout rate
- batch_size: Training batch size
- learning_rate: Learning rate for optimizer
- epochs: Number of training epochs
- teacher_forcing_ratio: Ratio of teacher forcing during training
- max_vocab_english: Maximum size of English vocabulary
- max_vocab_hindi: Maximum size of Hindi vocabulary
- max_length: Maximum sentence length
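For reference, config.py might look roughly like this; the values shown are illustrative placeholders, not the repo's actual defaults:

```python
# utils/config.py (illustrative values only)
embedding_dim = 256          # size of word embeddings
hidden_size = 512            # size of GRU hidden layers
num_layers = 2               # number of stacked GRU layers
dropout = 0.1                # dropout rate
batch_size = 64              # training batch size
learning_rate = 1e-3         # optimizer learning rate
epochs = 10                  # number of training epochs
teacher_forcing_ratio = 0.5  # probability of teacher forcing per step
max_vocab_english = 30000    # maximum English vocabulary size
max_vocab_hindi = 30000      # maximum Hindi vocabulary size
max_length = 50              # maximum sentence length (tokens)
```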
- torch
- pandas
- numpy
- tqdm
- gradio
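Given the `pip install -r requirements.txt` step above, the corresponding requirements.txt would contain at minimum:

```
torch
pandas
numpy
tqdm
gradio
```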