
End-to-end Twitter emotion classification with NLTK preprocessing, stratified splits, Keras and PyTorch tokenization, and three models (PyTorch MLP/CNN, Keras Bi-LSTM) with standard training and evaluation.


Morph23/Deep-Learning-Assignment


UCD Deep Learning Project - Twitter Emotion Classification

This notebook implements an end-to-end pipeline for Twitter emotion classification:

- Data and cleaning: loads data from /content/drive/MyDrive/dataset(clean).csv, then cleans each tweet (removes URLs and non-letter characters, lowercases, tokenizes with NLTK, removes English stopwords).
- EDA: emotion frequency counts, a tweet-length histogram, and an overall WordCloud.
- Splits: stratified train/validation/test splits (60/20/20, random_state=2025).
- Vectorization: two paths, a Keras Tokenizer with post-padding for the TensorFlow model, and a PyTorch vocabulary with special padding and unknown tokens plus fixed-length sequences for the torch models.
- Models: a PyTorch MLP baseline (Embedding + feed-forward, optimized with SGD), a PyTorch CNN (Embedding + Conv1d + max-pooling, Adam), and a Keras Bi-LSTM (Embedding + Bidirectional LSTM + dense layers, Adam with EarlyStopping); each is trained for about 5 epochs with standard batch sizes.
- Evaluation: accuracy, classification_report, and confusion matrices. Device selection is automatic (CUDA if available, else CPU), and model checkpoints are saved during training.
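The cleaning step above can be sketched as follows. This is a minimal standard-library version: the notebook itself uses NLTK's word_tokenize and its full English stopword list, so the tokenizer and stopword set here are simplified stand-ins for illustration.

```python
import re

# Illustrative subset of NLTK's English stopword list (assumption: the
# notebook loads the full list via nltk.corpus.stopwords).
STOPWORDS = {"the", "a", "an", "and", "or", "is", "are", "to", "of", "in", "it", "i"}

def clean_tweet(text: str) -> list[str]:
    """Remove URLs and non-letters, lowercase, tokenize, drop stopwords."""
    text = re.sub(r"https?://\S+|www\.\S+", " ", text)  # strip URLs
    text = re.sub(r"[^a-zA-Z]", " ", text)              # keep letters only
    tokens = text.lower().split()                        # whitespace tokenization
    return [t for t in tokens if t not in STOPWORDS]

print(clean_tweet("I love this!!! https://t.co/xyz so happy :)"))
```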
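The 60/20/20 stratified split can be reproduced with two chained calls to scikit-learn's train_test_split; the toy texts and labels below are placeholders for the notebook's actual columns.

```python
from sklearn.model_selection import train_test_split

texts = [f"tweet {i}" for i in range(100)]
labels = [i % 2 for i in range(100)]  # two toy emotion classes

# First peel off 40% for val+test, then split that 40% in half,
# stratifying at each step so class proportions are preserved.
X_train, X_tmp, y_train, y_tmp = train_test_split(
    texts, labels, test_size=0.4, stratify=labels, random_state=2025)
X_val, X_test, y_val, y_test = train_test_split(
    X_tmp, y_tmp, test_size=0.5, stratify=y_tmp, random_state=2025)

print(len(X_train), len(X_val), len(X_test))  # 60 20 20
```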
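The PyTorch-side vectorization path (vocabulary plus fixed-length sequences) can be sketched in plain Python. The special-token names and the max_len value are assumptions for illustration, not taken from the notebook.

```python
from collections import Counter

PAD, UNK = "<pad>", "<unk>"  # assumed special-token names

def build_vocab(token_lists, min_freq=1):
    """Map tokens to integer ids, reserving 0/1 for the pad/unk tokens."""
    counts = Counter(t for toks in token_lists for t in toks)
    vocab = {PAD: 0, UNK: 1}
    for tok, freq in counts.most_common():
        if freq >= min_freq:
            vocab[tok] = len(vocab)
    return vocab

def encode(tokens, vocab, max_len=5):
    """Convert tokens to ids, truncating or right-padding to max_len."""
    ids = [vocab.get(t, vocab[UNK]) for t in tokens[:max_len]]
    return ids + [vocab[PAD]] * (max_len - len(ids))

vocab = build_vocab([["so", "happy"], ["so", "sad"]])
print(encode(["so", "happy", "today"], vocab))  # "today" maps to <unk>
```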
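The CNN classifier and the automatic device selection can be sketched as below. All hyperparameters (embedding dimension, filter count, kernel size, number of emotion classes) are illustrative assumptions, not the notebook's actual values.

```python
import torch
import torch.nn as nn

class TextCNN(nn.Module):
    """Embedding -> Conv1d -> global max-pool -> linear classifier."""
    def __init__(self, vocab_size=5000, embed_dim=100, n_filters=64,
                 kernel_size=3, n_classes=6):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, embed_dim, padding_idx=0)
        self.conv = nn.Conv1d(embed_dim, n_filters, kernel_size)
        self.fc = nn.Linear(n_filters, n_classes)

    def forward(self, x):                      # x: (batch, seq_len) token ids
        e = self.embedding(x).transpose(1, 2)  # (batch, embed_dim, seq_len)
        h = torch.relu(self.conv(e))           # (batch, n_filters, seq_len - k + 1)
        pooled = h.max(dim=2).values           # global max-pool over time
        return self.fc(pooled)                 # (batch, n_classes) logits

# Automatic device selection, as described above.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model = TextCNN().to(device)
batch = torch.randint(0, 5000, (2, 20), device=device)  # 2 fake tweets, length 20
logits = model(batch)
print(logits.shape)
```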

