An analysis of the Reddit "Relationships" subreddit

I use exploratory data analysis and machine learning to understand what Redditors talk about when they talk about love!

List of files in repo:

  • reddit_scraper.py scrapes the top N posts of subreddit S
    • It returns a dataframe with the following fields for each post:
      • author
      • flair
      • post_id
      • score
      • self_txt
      • timestamp
      • title
      • upvote_ratio
  • relationships_data.csv
    • 888 top posts from r/relationships
  • reddit_analysis.ipynb
    • A Python notebook containing the analysis of the data
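
To give a feel for the shape of the data, here is a minimal sketch of how one post might be turned into a row with the fields listed above. It is not the code in reddit_scraper.py. It assumes the raw post is a dict using the field names from Reddit's JSON listings (`link_flair_text`, `id`, `selftext`, `created_utc`), and the helper names `post_to_row` and `rows_to_csv` and the sample post are made up for illustration. It uses only the standard library, so it writes CSV text instead of building a dataframe.

```python
import csv
import io
from datetime import datetime, timezone

# Column names as listed in this README.
COLUMNS = [
    "author", "flair", "post_id", "score",
    "self_txt", "timestamp", "title", "upvote_ratio",
]

# Assumed mapping from README column -> field name in Reddit's JSON for a post.
FIELD_MAP = {
    "author": "author",
    "flair": "link_flair_text",
    "post_id": "id",
    "score": "score",
    "self_txt": "selftext",
    "timestamp": "created_utc",
    "title": "title",
    "upvote_ratio": "upvote_ratio",
}


def post_to_row(post):
    """Pick out the README's fields from one raw post dict (hypothetical helper)."""
    row = {col: post.get(FIELD_MAP[col]) for col in COLUMNS}
    # created_utc is a Unix epoch in seconds; store it as an ISO 8601 UTC string.
    if row["timestamp"] is not None:
        row["timestamp"] = datetime.fromtimestamp(
            row["timestamp"], tz=timezone.utc
        ).isoformat()
    return row


def rows_to_csv(rows):
    """Serialize rows to CSV text with the README's column order (hypothetical helper)."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=COLUMNS)
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()


# Made-up sample post, for illustration only (not taken from relationships_data.csv).
sample_post = {
    "author": "example_user",
    "link_flair_text": "Relationships",
    "id": "abc123",
    "score": 42,
    "selftext": "Example body text.",
    "created_utc": 1500000000.0,
    "title": "Example title",
    "upvote_ratio": 0.97,
}

if __name__ == "__main__":
    print(rows_to_csv([post_to_row(sample_post)]))
```

A list of such rows can also be passed straight to `pandas.DataFrame` if a dataframe is preferred over CSV text.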

Read my blog for a write-up and look at the ipynb notebook for analysis!

Blog entry: https://www.joshash.space/data-science/reddit-relationships

         .        .  
  * _  __|_  _. __|_ 
  |(_)_) [ )(_]_) [ )
._|