sunyam/generalization-classification

generalization-classification

Deep learning models for text classification using AllenNLP

This project tackles a binary text classification problem: 3,456 sentences are annotated for whether or not they encode a generalization. This repository contains the annotated data and the code for several deep learning models, each with the option of using pre-trained GloVe or ELMo embeddings (see /src/compare-models/).

The best-performing model is a CNN with ELMo embeddings. It is retrained on the entire training set of 3,456 instances (see the .py files in /src/) and then used to predict labels for a test set of 16,816 sentences. The test set is built by extracting sentences from the beginning and end of a set of 230 .txt files (see /src/Predict-TestSet.ipynb).
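The test-set construction can be illustrated with a minimal sketch. This is not the code from /src/Predict-TestSet.ipynb: the regex-based sentence splitter, the `N_PER_END` value, and the function names are all hypothetical and chosen only to show the idea of keeping sentences from the start and end of each file.

```python
import re
from pathlib import Path
from typing import List

# Hypothetical setting: how many sentences to keep from each end of a file.
# The value actually used in the notebook is not stated in this README.
N_PER_END = 5


def split_sentences(text: str) -> List[str]:
    """Naive splitter (an assumption): break after '.', '!' or '?' plus whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def head_and_tail(sentences: List[str], n: int = N_PER_END) -> List[str]:
    """Return the first n and last n sentences without duplicating any."""
    if n <= 0:
        return []
    if len(sentences) <= 2 * n:
        # Short file: head and tail would overlap, so keep everything once.
        return list(sentences)
    return sentences[:n] + sentences[-n:]


def build_test_set(folder: str, n: int = N_PER_END) -> List[str]:
    """Collect head/tail sentences from every .txt file in `folder`."""
    collected: List[str] = []
    for path in sorted(Path(folder).glob("*.txt")):
        text = path.read_text(encoding="utf-8", errors="ignore")
        collected.extend(head_and_tail(split_sentences(text), n))
    return collected
```

Calling `build_test_set("path/to/txt-files")` would return a flat list of sentences ready to be passed to the trained classifier. A real pipeline would likely swap the regex for a proper sentence tokenizer.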

If the notebooks don't load on GitHub, you can view them with Jupyter's nbviewer. All implementations are in Python 3.7 and use AllenNLP (0.8.4), an NLP library built on PyTorch, together with PyTorch (1.1.0). Thanks to Keita for this lovely tutorial on AllenNLP.
