This project implements a Character-Level Recurrent Neural Network (RNN) from scratch using PyTorch to classify names by their language of origin.
Example:
- Input: Albert
- Output: English
- Custom RNN implementation (no nn.RNN)
- Character-level one-hot encoding
- Multi-class classification (languages)
- Training + prediction pipeline
- Loss visualization using Matplotlib
├── rnn.py      # Main model, training loop, prediction
├── utils.py    # Data processing + helper functions
├── data/
│   └── names/  # Dataset (language-wise name files)
└── README.md
Install dependencies:
pip install torch matplotlib
📥 Dataset
Download the dataset from the PyTorch tutorial:
👉 https://download.pytorch.org/tutorial/data.zip
Extract it so that the project folder looks like this:
project-folder/
├── data/
│   └── names/
│       ├── English.txt
│       ├── French.txt
│       ├── Italian.txt
│       └── ...
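The layout above suggests one file per language, with the file name acting as the label. A minimal loading sketch under that assumption is shown here; `load_names` is a hypothetical helper (not necessarily what utils.py does), and it assumes each file holds one name per line.

```python
import glob
import os


def load_names(data_dir="data/names"):
    """Map each language (taken from the file name) to its list of names."""
    names_by_language = {}
    for path in sorted(glob.glob(os.path.join(data_dir, "*.txt"))):
        # "data/names/English.txt" -> "English"
        language = os.path.splitext(os.path.basename(path))[0]
        with open(path, encoding="utf-8") as handle:
            # Assumption: one name per line; blank lines are skipped.
            names = [line.strip() for line in handle if line.strip()]
        names_by_language[language] = names
    return names_by_language
```

Calling `load_names()` from the project folder would return a dictionary keyed by language, and an empty dictionary if the dataset has not been extracted yet.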
# How to Run
1. Navigate to the project directory
cd your-project-folder
2. Run training
python rnn.py
3. During training you’ll see output like:
5000 5% 2.3456 Albert / English CORRECT
10000 10% 1.9876 Pierre / French CORRECT
4. After training, test manually by typing names at the prompt (enter quit to exit)
Input: Kumar
Input: Ahmed
Input: quit
🧠 How It Works
🔤 Encoding
Each letter → one-hot vector
Word → sequence of vectors
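The encoding can be sketched without any framework. The alphabet below (ASCII letters plus a few punctuation marks) is an assumption for illustration, and `letter_to_one_hot` and `word_to_sequence` are hypothetical names; the project itself would build tensors rather than plain lists.

```python
import string

# Assumed alphabet: ASCII letters plus a few punctuation marks seen in names.
ALL_LETTERS = string.ascii_letters + " .,;'"
N_LETTERS = len(ALL_LETTERS)


def letter_to_one_hot(letter):
    """Return a vector of zeros with a single 1.0 at the letter's index."""
    vector = [0.0] * N_LETTERS
    # Raises ValueError for characters outside the assumed alphabet.
    vector[ALL_LETTERS.index(letter)] = 1.0
    return vector


def word_to_sequence(word):
    """Encode a word as a list of one-hot vectors, one per letter."""
    return [letter_to_one_hot(letter) for letter in word]


encoded = word_to_sequence("Albert")  # 6 vectors, each of length N_LETTERS
```

Names containing accented characters would need to be normalized to this alphabet first, or the alphabet extended.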
🔁 RNN Flow
For each letter:
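hidden = f(W_h · [input, hidden])
output = log_softmax(W_o · [input, hidden])
W_h and W_o are separate weight matrices. The output is a log-softmax because nn.NLLLoss expects log-probabilities.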
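The per-letter update can be written out in plain Python so the arithmetic is visible. This is a sketch under stated assumptions, not the code in rnn.py: it uses two separate weight matrices (`w_hidden`, `w_output`), takes f to be tanh, omits biases, and emits log-probabilities (a log-softmax), which is the form nn.NLLLoss expects. All function names are hypothetical.

```python
import math
import random


def make_matrix(rows, cols, rng):
    """Small random weights; a stand-in for a learned weight matrix."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def log_softmax(scores):
    # Subtract the max before exponentiating for numerical stability.
    peak = max(scores)
    log_total = peak + math.log(sum(math.exp(s - peak) for s in scores))
    return [s - log_total for s in scores]


def rnn_step(letter_vector, hidden, w_hidden, w_output):
    """One step: both outputs are computed from [input, hidden]."""
    combined = letter_vector + hidden  # list concatenation
    new_hidden = [math.tanh(v) for v in matvec(w_hidden, combined)]  # assumed f = tanh
    output = log_softmax(matvec(w_output, combined))
    return output, new_hidden


def classify(sequence, hidden_size, w_hidden, w_output):
    """Feed the letters in order; the output after the last letter is the prediction."""
    hidden = [0.0] * hidden_size  # start from an all-zero hidden state
    output = None
    for letter_vector in sequence:
        output, hidden = rnn_step(letter_vector, hidden, w_hidden, w_output)
    return output
```

With an alphabet of size `n` and a hidden size `h`, `w_hidden` has shape `h × (n + h)` and `w_output` has shape `classes × (n + h)`. The index of the largest entry in the final output is the predicted language.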
📉 Loss Function
Negative Log Likelihood Loss (nn.NLLLoss)
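For a single sample, negative log likelihood is the negated log-probability that the model assigns to the correct class. The worked example below uses three hypothetical languages with made-up probabilities to show how the loss grows as the correct class becomes less likely; `nll_loss` is an illustrative helper, not the PyTorch function.

```python
import math


def nll_loss(log_probs, target_index):
    """Negative log likelihood for one sample, given log-probabilities."""
    return -log_probs[target_index]


# Hypothetical predicted probabilities for three languages: 0.7, 0.2, 0.1.
log_probs = [math.log(0.7), math.log(0.2), math.log(0.1)]

loss_if_correct_is_first = nll_loss(log_probs, 0)  # -log(0.7), about 0.357
loss_if_correct_is_last = nll_loss(log_probs, 2)   # -log(0.1), about 2.303
```

A confident correct prediction gives a loss near zero, while assigning low probability to the true language is penalized heavily.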
📊 Output
Prints the iteration, progress, loss, and a sample prediction (marked correct or not) during training
Shows loss curve using Matplotlib
⚙️ Hyperparameters
| Parameter | Value |
| --- | --- |
| Hidden Size | 128 |
| Learning Rate | 0.005 |
| Iterations | 100000 |
# Improvements (Future Work)
- Replace RNN with LSTM / GRU
- Add dropout for regularization
- Save & load trained model
- Build web app using Flask
- Deploy as API
💡 Example Predictions
> Albert
English
> Pierre
French
> Rossi
Italian
# Commands Summary
# Install dependencies
pip install torch matplotlib
# Run project
python rnn.py
## Author
Kareeb Sadab
CSE Student | AI & Blockchain Enthusiast