Identify the type of news based on headlines and short descriptions

rootally/News-Category-Classification-with-BERT

News Category Classification with BERT

Identify the type of news based on headlines and short descriptions

Dataset

The dataset contains around 200k news headlines published by HuffPost between 2012 and 2018. A model trained on it could be used to tag untracked news articles or to identify the type of language used in different kinds of news articles. Source: Kaggle
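
Since the task is to classify from headlines and short descriptions, the two fields might be joined into a single input text per article. The snippet below is a minimal sketch of that step. It assumes the Kaggle file is in JSON Lines format with `headline`, `short_description`, and `category` fields; those field names and the `load_examples` helper are assumptions for illustration, not taken from this repository.

```python
import json
from collections import Counter


def load_examples(lines):
    """Parse JSON Lines records into (text, category) pairs.

    Assumes each record has 'headline', 'short_description' and
    'category' keys (an assumption about the Kaggle file layout).
    """
    examples = []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        record = json.loads(line)
        # Join the non-empty fields into a single input string.
        text = " ".join(
            part.strip()
            for part in (record.get("headline", ""), record.get("short_description", ""))
            if part and part.strip()
        )
        if text:
            examples.append((text, record["category"]))
    return examples


# Tiny in-memory sample standing in for the real file.
sample = [
    '{"category": "POLITICS", "headline": "Senate Passes Bill", "short_description": "The vote was close."}',
    '{"category": "SPORTS", "headline": "Team Wins Final", "short_description": ""}',
]
examples = load_examples(sample)
label_counts = Counter(category for _, category in examples)
```

With the real download, passing an open file handle (for example `open(path, encoding="utf-8")`, where `path` is wherever the file was saved) in place of `sample` should work the same way, and `label_counts` gives a quick view of the class balance.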

Implementations

  • BERT (Fine-Tuning)
  • Bi-GRU + CONV
  • LSTM + Attention
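
The README does not describe the attention layer used in the LSTM model, so the following is only a sketch of one common form: a softmax-weighted sum over the recurrent layer's per-token outputs, which turns a variable-length sequence into one fixed-size vector for the classifier. The scoring vector `w`, the shapes, and the `attention_pool` name are hypothetical, and random values stand in for real LSTM outputs.

```python
import numpy as np


def attention_pool(hidden_states, w):
    """Collapse per-token states (seq_len, hidden) into one vector.

    w is a hypothetical learned scoring vector of shape (hidden,).
    """
    scores = hidden_states @ w            # one score per token, shape (seq_len,)
    scores = scores - scores.max()        # shift for a numerically stable softmax
    weights = np.exp(scores) / np.exp(scores).sum()
    context = weights @ hidden_states     # weighted sum of states, shape (hidden,)
    return context, weights


rng = np.random.default_rng(0)
states = rng.normal(size=(5, 8))  # stand-in for LSTM outputs: 5 tokens, 8 units
w = rng.normal(size=8)
context, weights = attention_pool(states, w)
```

The returned `weights` sum to 1 across tokens, so they can also be inspected to see which words the model attended to for a given headline.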

Try it on Colab Notebook

TL;DR

  • glove.840B.300d (840B tokens, 2.2M vocab, cased, 300d vectors, 2.03 GB download) was used as the embedding layer for the Bi-GRU and LSTM models.
  • The pre-trained bert-base-uncased model (12-layer, 768-hidden, 12-heads, 110M parameters) was used for BERT fine-tuning.
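
Using GloVe as an embedding layer usually means copying the pre-trained vectors into a matrix indexed by the tokenizer's word ids. The sketch below shows that step under stated assumptions: `word_index` is a hypothetical word-to-id mapping with id 0 reserved for padding, and the three sample lines stand in for the real `glove.840B.300d` text file (one token followed by its vector per line).

```python
import numpy as np


def build_embedding_matrix(glove_lines, word_index, dim):
    """Return a (len(word_index) + 1, dim) matrix; row 0 is padding."""
    matrix = np.zeros((len(word_index) + 1, dim), dtype=np.float32)
    found = 0
    for line in glove_lines:
        # Split from the right: a few glove.840B tokens contain spaces.
        parts = line.rstrip().rsplit(" ", dim)
        if len(parts) != dim + 1:
            continue  # skip malformed lines
        idx = word_index.get(parts[0])
        if idx is not None:
            matrix[idx] = np.asarray(parts[1:], dtype=np.float32)
            found += 1
    return matrix, found


# Toy 3-dimensional vectors in the GloVe text format.
glove_sample = ["news 0.1 0.2 0.3", "sports -0.4 0.5 0.6", ". . . 0.7 0.8 0.9"]
word_index = {"news": 1, "sports": 2, "unseen": 3}
matrix, found = build_embedding_matrix(glove_sample, word_index, dim=3)
```

Words missing from GloVe keep an all-zero row here; another initialization could be swapped in. Because the 840B vectors are cased, lowercasing the text before lookup would miss the capitalized entries.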

Results

  • BERT - test_accuracy: 0.72, test_loss: 0.0015671474330127238
  • Bidirectional GRU + Conv - test_accuracy: 0.6545
  • LSTM with Attention - test_accuracy: 0.67144

Requirements
