
Natural language processing applied to social media, looking for new trends in health related posts


NIHRIO/SocialMediaScanning

 
 


Mining social media for citizen insight

Social media sites such as Twitter, Reddit, and Mumsnet can be an important source of information for health research. They provide an archive of the thoughts, feelings, and concerns of large parts of the population on a wide range of topics. This archive can be used to explore citizen sentiment towards a topic, track how it changes over time, and reveal emerging areas of concern that traditional research methods may miss.

We can scrape data from websites such as Mumsnet, or use the Twitter Academic API to search for tweets relevant to our research question. Natural language processing methods such as Latent Dirichlet Allocation (LDA) can then be used to group the collected tweets or posts into topics. Qualitative methods can be used to interpret the topics found, or applied independently to build an understanding of a small number of posts or tweets.

Steps:

  • Identify sources of interest, e.g. the forum or social media you wish to search.
  • Formulate a search strategy: search terms and a method for applying them.
  • Collate your posts or tweets.
  • Clean and pre-process your posts or tweets.
  • Fit an LDA topic model.
  • Interpret the topics found.
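
The cleaning, model-fitting, and interpretation steps above can be sketched as follows. This is a minimal illustration under stated assumptions, not the code used in this repository: the example posts are invented, the function names (`clean`, `fit_lda`, `top_words`) and the stop-word list are hypothetical, and a small collapsed Gibbs sampler written with the standard library stands in for the LDA implementation of a dedicated library such as gensim or scikit-learn, which would be the more likely choice in practice.

```python
import random
import re
from collections import Counter

# Tiny illustrative stop-word list (an assumption); real projects use a fuller one.
STOP_WORDS = {"a", "an", "and", "are", "at", "for", "i", "in", "is", "it", "my",
              "of", "on", "our", "so", "the", "to", "was", "we", "with"}


def clean(post):
    """Lower-case a post, strip URLs and @mentions, and return its tokens."""
    text = re.sub(r"https?://\S+|@\w+", " ", post.lower())
    tokens = re.findall(r"[a-z']+", text)
    return [t for t in tokens if t not in STOP_WORDS and len(t) > 2]


def fit_lda(docs, n_topics=2, n_iter=200, alpha=0.1, beta=0.01, seed=0):
    """Fit LDA by collapsed Gibbs sampling; return per-topic word counts."""
    rng = random.Random(seed)
    vocab_size = len({w for doc in docs for w in doc})
    topic_word = [Counter() for _ in range(n_topics)]  # count of word w in topic k
    topic_total = [0] * n_topics                       # tokens assigned to topic k
    doc_topic = [[0] * n_topics for _ in docs]         # tokens of doc d in topic k
    assignments = []
    # Start from a random topic for every token.
    for d, doc in enumerate(docs):
        zs = []
        for w in doc:
            k = rng.randrange(n_topics)
            zs.append(k)
            topic_word[k][w] += 1
            topic_total[k] += 1
            doc_topic[d][k] += 1
        assignments.append(zs)
    for _ in range(n_iter):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                k = assignments[d][i]
                # Remove the token's current assignment from the counts.
                topic_word[k][w] -= 1
                topic_total[k] -= 1
                doc_topic[d][k] -= 1
                # Resample its topic given all other assignments.
                weights = [
                    (doc_topic[d][j] + alpha)
                    * (topic_word[j][w] + beta)
                    / (topic_total[j] + beta * vocab_size)
                    for j in range(n_topics)
                ]
                k = rng.choices(range(n_topics), weights=weights)[0]
                assignments[d][i] = k
                topic_word[k][w] += 1
                topic_total[k] += 1
                doc_topic[d][k] += 1
    return topic_word


def top_words(topic_word, n=5):
    """Most frequent words per topic, as a starting point for interpretation."""
    return [[w for w, c in counts.most_common(n) if c > 0] for counts in topic_word]


# Invented example posts, for illustration only.
posts = [
    "Baby won't sleep through the night, any sleep routine tips? https://example.com",
    "@friend our toddler finally has a bedtime routine and sleeps better",
    "Worried about vaccine side effects after the second dose",
    "Booked the flu vaccine appointment, nurse explained the side effects",
]
docs = [clean(p) for p in posts]
word_counts = fit_lda(docs)
topics = top_words(word_counts)
for k, words in enumerate(topics):
    print(f"Topic {k}: {', '.join(words)}")
```

The printed word lists are what a researcher would then read and label by hand. With only four short posts the topics are not meaningful; the number of topics, the priors `alpha` and `beta`, and the number of iterations are all illustrative defaults that would need tuning on a real collection.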

Some relevant papers:

  • “Mommy Blogs” and the Vaccination Exemption Narrative: Results From A Machine-Learning Approach for Story Aggregation on Parenting Social Media Sites. https://publichealth.jmir.org/2016/2/e166/
  • Using social media Reddit data to examine foster families' concerns and needs during COVID-19. https://www.sciencedirect.com/science/article/pii/S0145213421003355
  • Text mining of Reddit posts: Using latent Dirichlet allocation to identify common parenting issues. https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0262529
