
Analysis of labeling strategies aimed at identifying depression phenomena among users’ tweets.


helemanc/COVID19-twitter-depression


Automatic Detection of Depression on COVID-19 Tweets

Project carried out by Eleonora Mancini and Eleonora Misino as part of the Natural Language Processing exam of the Master's degree in Artificial Intelligence @ University of Bologna (A.A. 2019-2020).

The purpose of this project is to analyze labeling strategies aimed at identifying depression phenomena in users’ tweets. The tweets used in this analysis come from a specific period of the COVID-19 pandemic. In particular, the objective is to understand whether the strategies studied make it possible to identify evident phenomena of depression among users during the pandemic. Four different strategies were developed and analyzed. We did not arrive at a robust solution, but the project highlights some interesting aspects that could be the starting point for a more in-depth analysis.

Data

Project Workflow

  1. COVID19 Tweets
  • Exploratory Data Analysis
  • Preprocessing
  • Topic Modeling: Latent Dirichlet Allocation
  • Tweet Labelling through 3 strategies that we call TWINT, VADER, and NRCLex
  • Labelling Comparison
  • Unsupervised Analysis (LSA and Clustering)
  2. CLPsych Dataset
  • Exploratory Data Analysis
  • Features Extraction
  • Tweets Classification
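
As an illustration of the "Labelling Comparison" step, the sketch below measures how much two labelling strategies agree using Cohen's kappa. This is only one possible way to compare labels and is not necessarily the method used in the notebooks. The function names (`cohen_kappa`, `pairwise_agreement`) and the toy labels are hypothetical, with 1 standing for "depressed" and 0 for "not depressed".

```python
from collections import Counter
from itertools import combinations


def cohen_kappa(a, b):
    """Cohen's kappa between two equal-length label sequences."""
    if len(a) != len(b) or not a:
        raise ValueError("label sequences must be non-empty and of equal length")
    n = len(a)
    # Fraction of tweets on which the two strategies assign the same label.
    observed = sum(x == y for x, y in zip(a, b)) / n
    # Agreement expected by chance, from each strategy's label frequencies.
    count_a, count_b = Counter(a), Counter(b)
    labels = count_a.keys() | count_b.keys()
    expected = sum(count_a[k] * count_b[k] for k in labels) / (n * n)
    if expected == 1.0:
        # Both strategies always emit the same single label.
        return 1.0
    return (observed - expected) / (1.0 - expected)


def pairwise_agreement(labels_by_strategy):
    """Kappa for every pair of strategies, keyed by (name, name)."""
    names = sorted(labels_by_strategy)
    return {
        (s, t): cohen_kappa(labels_by_strategy[s], labels_by_strategy[t])
        for s, t in combinations(names, 2)
    }


# Made-up labels for six tweets (illustrative only, not project data).
toy = {
    "TWINT": [1, 0, 1, 0, 1, 0],
    "VADER": [1, 0, 1, 0, 0, 0],
    "NRCLex": [0, 1, 0, 1, 0, 1],
}

for pair, kappa in pairwise_agreement(toy).items():
    print(pair, round(kappa, 3))
```

A kappa near 1 means two strategies label tweets almost identically, a value near 0 means they agree no more often than chance, and negative values indicate systematic disagreement.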

Please refer to the notebooks folder for a more detailed description.

Running the code

To reproduce our results:

  • Download the data (please note that the CLPsych Dataset is not publicly available)
  • Download the notebooks from here
  • Run the NLP_Project.ipynb notebook first, then the CLPsych.ipynb notebook

Results

A detailed analysis of the results can be found here.

Authors

Eleonora Mancini, Eleonora Misino

License

This project is licensed under the MIT License - see the LICENSE file for details.
