RECENT:
CLICKBAIT 2.0: Let's tackle text first! It will help us generate good labels, since... drum roll... the clickbaitiness of the title and the thumbnail are highly correlated!! (probably)
HISTORY:
So far I got up to the data labeling part. The project stagnated because:
- Labeling thousands of images is boring. (1000/6000)
- The clickbaitiness of thumbnails is somewhat difficult to assign because the topics vary so much.
- Clickbaitiness evolves with time. The dataset used here is from 2017, and clickbait from 2017 looks different from clickbait in 2020.
- Same problems, but I have now built a model that slightly outperforms naive approaches.
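
For reference, one way to check "slightly outperforms naive approaches" is to compare against a majority-class baseline. This is only a minimal sketch: the function names and the toy labels below are made up for illustration (1 = clickbait, 0 = not), and the notes do not say which naive approaches or which metric were actually used.

```python
from collections import Counter


def majority_baseline_accuracy(train_labels, test_labels):
    """Accuracy of always predicting the most common training label."""
    majority_label = Counter(train_labels).most_common(1)[0][0]
    hits = sum(1 for y in test_labels if y == majority_label)
    return hits / len(test_labels)


def accuracy(predictions, labels):
    """Fraction of predictions that match the true labels."""
    hits = sum(1 for p, y in zip(predictions, labels) if p == y)
    return hits / len(labels)


# Toy labels, purely illustrative (not from the real dataset).
train_labels = [0, 0, 0, 1, 1]
test_labels = [0, 1, 0, 0]
model_predictions = [0, 1, 0, 0]  # hypothetical model output

baseline = majority_baseline_accuracy(train_labels, test_labels)
model = accuracy(model_predictions, test_labels)
print(f"baseline={baseline:.2f} model={model:.2f}")
```

If the model's score is only marginally above the baseline, the gap may be within the noise of a small labeled set, so it is worth repeating the comparison on several train/test splits.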