Deep learning models to identify clickbaits taking content into consideration

Clickbaits Revisited

This repository provides the code used for :

Data Collection

To run the code you must first collect the data:

Data Pre-Processing

After the data has been collected, you need to run the following files to obtain training and test data. The order is important!

- $ cd data_processing
- $ python
- $ python
- $ python
- $ python
- $ python

After the steps above, you will have train.csv and test.csv in data/.

Please note that the steps above require a lot of memory. If your machine has less than 64GB, modify the code to fit your needs.
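If memory is the limiting factor, one option is to stream a large CSV in fixed-size chunks instead of loading it whole. The snippet below is only a minimal sketch of that idea using the standard library. The helper name `iter_csv_chunks`, the chunk size, and the file path in the usage note are assumptions made for illustration and are not taken from the scripts in data_processing.

```python
import csv
from itertools import islice


def iter_csv_chunks(path, chunk_size=10000):
    """Yield lists of up to chunk_size rows (as dicts) from a CSV file.

    Hypothetical helper: only one chunk is held in memory at a time.
    """
    with open(path, newline="", encoding="utf-8") as handle:
        reader = csv.DictReader(handle)
        while True:
            # islice pulls at most chunk_size rows from the reader
            chunk = list(islice(reader, chunk_size))
            if not chunk:
                break
            yield chunk
```

A loop such as `for chunk in iter_csv_chunks("data/train.csv"): ...` would then process the rows piece by piece. Whether this fits depends on how each processing step uses the data.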

GloVe embeddings

Obtain GloVe embeddings from the following URL:

Extract the zip and place the CSV in data/
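For orientation, here is a minimal sketch of reading an embeddings file into a dictionary. It assumes the extracted file uses the layout GloVe is usually distributed in: one token per line, followed by its space-separated vector components. The function name `load_glove` is hypothetical, and the file name is left to the caller because it depends on which archive you download.

```python
def load_glove(path):
    """Map each word to its embedding vector (a list of floats).

    Assumes one token per line followed by space-separated floats.
    """
    embeddings = {}
    with open(path, encoding="utf-8") as handle:
        for line in handle:
            parts = line.rstrip().split(" ")
            if len(parts) < 2:
                # skip blank or malformed lines
                continue
            word, values = parts[0], parts[1:]
            embeddings[word] = [float(v) for v in values]
    return embeddings
```

Plain lists keep the sketch dependency-free. For training you would typically convert the vectors into an array type understood by your deep learning framework.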


After completing all the steps above, you are ready to experiment with the deep neural networks that classify clickbaits.

Change directory to deepnets/

cd deepnets/

The deepnets are as follows:

- LSTM on title text without GloVe embeddings
- LSTM on title text and content text without GloVe embeddings
- LSTM on title and content text with GloVe embeddings
- Time distributed dense on title and content text with GloVe embeddings
- LSTM on title + content text with GloVe embeddings & dense net for numerical features
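Sequence models such as the LSTMs listed above generally take fixed-length sequences of integer word indices as input. The following is a minimal sketch of one common way to prepare title or content text in that form. It is not necessarily how the scripts in deepnets/ do it, and the names `build_vocab` and `encode` as well as the reserved indices 0 (padding) and 1 (unknown word) are assumptions made for illustration.

```python
from collections import Counter


def build_vocab(texts, max_words=None):
    """Assign an integer index to each word, most frequent first.

    Index 0 is reserved for padding and index 1 for unknown words.
    """
    counts = Counter(word for text in texts for word in text.lower().split())
    return {word: i + 2 for i, (word, _) in enumerate(counts.most_common(max_words))}


def encode(text, vocab, max_len):
    """Convert text to exactly max_len indices, truncating or zero-padding."""
    ids = [vocab.get(word, 1) for word in text.lower().split()][:max_len]
    return ids + [0] * (max_len - len(ids))
```

With GloVe embeddings, the same vocabulary could be used to look up the pretrained vector for each index. Without them, the embedding would be learned from scratch.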


The network with LSTM on title and content text with GloVe embeddings with numerical features achieves an accuracy of 0.996 during validation and 0.992 on the test set.
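Accuracy here is read as the fraction of examples whose predicted label matches the true label. As a minimal sketch of that metric (the function name `accuracy` and the toy labels in the usage note are illustrative, not results from this repository):

```python
def accuracy(y_true, y_pred):
    """Return the fraction of predictions that match the true labels."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("cannot compute accuracy on empty input")
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)
```

For example, `accuracy([1, 0, 1, 1], [1, 0, 0, 1])` returns 0.75, since three of the four predictions are correct.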

All models were trained on an Ubuntu 16.04 system with an NVIDIA TitanX GPU and 64GB of memory.
