No description, website, or topics provided.
Switch branches/tags
Nothing to show
Clone or download
Fetching latest commit…
Cannot retrieve the latest commit at this time.
Permalink
Failed to load latest commit information.
.DS_Store
Daddit_Final.ipynb
LDA heatmap_Final.ipynb
Mommit_Final.ipynb
README.md
RedditFinalClean.ipynb

README.md

Pseudonymous Parents: Comparing Parenting Roles and Identities on the Mommit and Daddit Subreddits

This repository contains the code used in our analysis of Parenting subreddits. In our CHI paper, we used unsupervised machine learning techniques - namely, Latent Dirichlet Analysis (LDA) and Word2Vec to explore the topics parents discuss on Reddit and the differences between mother-centric and father-centric subreddits.

The following figure shows how we trained an LDA model for the aggregated model of r/Parenting, r/Mommit and r/Daddit as well as independent LDA models for r/Daddit and r/Mommit. In addition, independent Word2Vec models for r/Mommit and r/Daddit were trained to differentiate similar topics discussed by users of the two subreddits.

alt text

This repository contains four Python notebooks: (1) Reddit_Final; (2) Daddit_Final; (3) Mommit_Final and (4) LDA heatmap_Final. The first three show how we carried out the LDA analysis for all three subreddits, then independent LDA models for Daddit and Mommit independently. Each of the Daddit and Mommit notebooks also show how we trained independent Word2Vec models for each of the corpora.

The code also shows how jaccard similarity was calculated for each of the subreddit comments. Following is the definition of how the score for each of the comments was calculated.

alt text

LDA heatmap_Final shows how we created heatmaps of select topic scores throughout all three subreddits. Following is an example:

alt text

If you'd like to know more about our research, you can find our paper here.