cheekeet86/project_3
Project 3: Web APIs & Classification

Problem Statement

  • Use the Reddit API to scrape posts automatically from 2 Subreddits.
  • Create and compare different classification models that predict which Subreddit a given post came from.
  • Perform sentiment analysis on post contents and provide advertising strategies for customers.
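
As a rough illustration of the classification goal above, the sketch below trains a tiny multinomial Naive Bayes model, using only the standard library, to guess which Subreddit a post came from. This README does not say which models the project compares, so Naive Bayes is only a stand-in; the names `tokenize`, `NaiveBayesClassifier`, and `accuracy`, and the sample posts, are all hypothetical.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


class NaiveBayesClassifier:
    """Multinomial Naive Bayes with add-one smoothing (illustrative stand-in)."""

    def fit(self, texts, labels):
        self.class_counts = Counter(labels)      # number of posts per subreddit
        self.word_counts = defaultdict(Counter)  # word frequencies per subreddit
        for text, label in zip(texts, labels):
            self.word_counts[label].update(tokenize(text))
        self.vocab = {w for counts in self.word_counts.values() for w in counts}
        return self

    def predict(self, text):
        total_docs = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, n_docs in self.class_counts.items():
            counts = self.word_counts[label]
            denom = sum(counts.values()) + len(self.vocab)
            score = math.log(n_docs / total_docs)  # log prior
            for word in tokenize(text):
                if word in self.vocab:             # skip words never seen in training
                    score += math.log((counts[word] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


def accuracy(model, texts, labels):
    """Fraction of posts whose predicted subreddit matches the true one."""
    hits = sum(model.predict(t) == y for t, y in zip(texts, labels))
    return hits / len(labels)


# Hypothetical sample posts, for illustration only.
train_texts = [
    "Looking for a cooperative board game with dice and cards",
    "Best tabletop strategy game for four players",
    "Which android game has the least ads",
    "New iOS puzzle game released on the app store",
]
train_labels = ["boardgames", "boardgames", "mobilegames", "mobilegames"]
model = NaiveBayesClassifier().fit(train_texts, train_labels)
```

Comparing models would then come down to calling `accuracy` for each candidate on the same held-out set of posts.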

Data Collection

  • The posts are scraped using the Reddit API.
  • The posts are scraped from 2 popular Subreddits, Board Games and Mobile Games, with 2.1 million and 2.8k members respectively.
  • The Reddit API returns the posts in JSON format, and the posts are stored as JSON files for future analysis.
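
The collection steps above could be sketched roughly as follows. This is a minimal sketch, not the project's actual code: it assumes the public `<subreddit>.json` listing endpoint, paginated with the `after` token, and the third-party `requests` package. The function names (`fetch_page`, `scrape_posts`, `save_posts`) and the one-second pause are hypothetical.

```python
import json
import time

BASE_URL = "https://www.reddit.com/r/"
HEADERS = {"User-agent": "Bleep blorp bot 0.1"}


def fetch_page(subreddit, after=None):
    """Fetch one listing page (assumed endpoint: <BASE_URL><subreddit>.json)."""
    import requests  # third-party; imported here so the rest works without it

    params = {"after": after} if after else {}
    res = requests.get(
        f"{BASE_URL}{subreddit}.json", headers=HEADERS, params=params, timeout=10
    )
    res.raise_for_status()
    return res.json()


def scrape_posts(subreddit, num_requests, fetch=fetch_page, pause=1.0):
    """Collect posts over several requests, following the 'after' token."""
    posts = []
    after = None
    for _ in range(num_requests):
        page = fetch(subreddit, after)
        posts.extend(child["data"] for child in page["data"]["children"])
        after = page["data"]["after"]
        if after is None:  # no further pages available
            break
        time.sleep(pause)  # be polite to the API between requests
    return posts


def save_posts(posts, path):
    """Store the scraped posts as a JSON file for later analysis."""
    with open(path, "w", encoding="utf-8") as f:
        json.dump(posts, f)
```

Passing `fetch` as a parameter keeps the pagination logic testable without network access. For example, `scrape_posts("boardgames", 50)` would issue up to 50 requests.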

User Configurations

| Variable Name | Default Value | Description |
|---|---|---|
| scape_data | False | True: Scrape data from Reddit and save as JSON files. False: Load JSON files from input folder. |
| scape_index | 0 | If scape_data=True: 0: Scrape data from subreddits[0]. 1: Scrape data from subreddits[1]. |
| num_requests | 50 | Number of Reddit API requests. Note: 25 posts are scraped per request. |
| posts_limit | 900 | Number of posts used (per Subreddit) to build models |
| subreddits | [ boardgames , mobilegames ] | Subreddits list |
| url | https://www.reddit.com/r/ | Base URL for scraping |
| headers | User-agent:Bleep blorp bot 0.1 | User agent settings |
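
For orientation, the defaults above might look like this as Python variables. This is a sketch under assumptions: the exact types (a list of strings for `subreddits`, a dict for `headers`) are not confirmed by this README, and `POSTS_PER_REQUEST` and `max_posts_scraped` are hypothetical names added to show how `num_requests` relates to `posts_limit`.

```python
# User configuration (values taken from the table above; types are assumed)
scape_data = False          # True: scrape from Reddit; False: load saved JSON files
scape_index = 0             # which entry of `subreddits` to scrape when scape_data=True
num_requests = 50           # number of Reddit API requests
posts_limit = 900           # posts used per Subreddit to build models
subreddits = ["boardgames", "mobilegames"]
url = "https://www.reddit.com/r/"
headers = {"User-agent": "Bleep blorp bot 0.1"}

POSTS_PER_REQUEST = 25      # per the note in the table (hypothetical constant name)


def max_posts_scraped(n_requests):
    """Upper bound on posts collected for one Subreddit."""
    return n_requests * POSTS_PER_REQUEST


target = subreddits[scape_index]                 # Subreddit selected for scraping
upper_bound = max_posts_scraped(num_requests)    # 50 * 25 posts at most
```

With the defaults, at most 50 × 25 = 1250 posts are scraped per Subreddit, which leaves headroom above the 900 posts used for modelling.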

Executive Summary

Click Here

References

Reddit API
Board Games Subreddit
Mobile Games Subreddit

About

Games Reddits Classifier (General Assembly SG Data Science Immersive Batch 9)
