<img width="8%" alt="Reddit.png" src="https://raw.githubusercontent.com/jupyter-naas/awesome-notebooks/master/.github/assets/logos/Reddit.png" style="border-radius: 15%">

# Reddit - Get Hot Posts From Subreddit
<a href="https://bit.ly/3JyWIk6">Give Feedback</a> | <a href="https://github.com/jupyter-naas/awesome-notebooks/issues/new?assignees=&labels=bug&template=bug_report.md&title=Reddit+-+Get+Hot+Posts+From+Subreddit:+Error+short+description">Bug report</a>

**Tags:** #reddit #subreddit #data #hottopics #rss #information #opendata #snippet #dataframe

**Author:** [Yaswanthkumar GOTHIREDDY](https://www.linkedin.com/in/yaswanthkumargothireddy/)

**Last update:** 2023-04-12 (Created: 2021-08-16)

**Description:** This notebook allows users to retrieve the hottest posts from a specified subreddit on Reddit.

## Input

### Install packages

In [None]:
!pip install praw

In [4]:
import praw
import pandas as pd
import numpy as np
from datetime import datetime

### Choose Subreddit topic 

In [2]:
SUBREDDIT = "Python"  # example: "CryptoCurrency"

### Setup App to connect to Reddit API

* To get data from Reddit, you need to [create a Reddit app](https://www.reddit.com/prefs/apps) that queries the Reddit API.
* Select “script” as the type of app.
* Name your app and give it a description.
* Set the redirect URI to http://localhost:8080.
* Once you click on “create app”, you will see a box showing your "client_id" and "client_secret".
* "user_agent" is a short string that identifies your app to Reddit, for example the app name and your Reddit username.

If you need help setting up your app and getting your API credentials, see [Get Reddit API Credentials](https://www.jcchouinard.com/get-reddit-api-credentials-with-praw/).

In [1]:
MY_CLIENT_ID = "EtAr0o-oKbVuEnPOFbrRqQ"
MY_CLIENT_SECRET = "LmNpsZuFM-WXyZULAayVyNsOhMd_ug"
MY_USER_AGENT = "script by u/naas"
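
Hard-coded credentials are visible to anyone who can read the notebook. As a minimal sketch, assuming you have exported environment variables named `REDDIT_CLIENT_ID`, `REDDIT_CLIENT_SECRET`, and `REDDIT_USER_AGENT` (hypothetical names, not part of the original notebook), the same three settings could be read at run time instead:

```python
import os


def load_reddit_credentials(env=os.environ):
    """Read the three PRAW settings from environment variables.

    The variable names are illustrative assumptions; use whatever names you export.
    """
    required = ("REDDIT_CLIENT_ID", "REDDIT_CLIENT_SECRET", "REDDIT_USER_AGENT")
    # Collect every missing or empty variable so the error lists them all at once
    missing = [name for name in required if not env.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return {
        "client_id": env["REDDIT_CLIENT_ID"],
        "client_secret": env["REDDIT_CLIENT_SECRET"],
        "user_agent": env["REDDIT_USER_AGENT"],
    }
```

The returned dictionary uses the same keyword names as the connection cell, so it can be unpacked directly, for example `praw.Reddit(**load_reddit_credentials())`.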

## Model

#### Connect to the Reddit API

In [5]:
reddit = praw.Reddit(
    client_id=MY_CLIENT_ID, client_secret=MY_CLIENT_SECRET, user_agent=MY_USER_AGENT
)

#### Get the subreddit-level data

In [6]:
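posts = []
# Iterate over the 50 hottest submissions in the chosen subreddit
for post in reddit.subreddit(SUBREDDIT).hot(limit=50):
    posts.append(
        [
            post.title,
            post.score,
            post.id,
            post.subreddit.display_name,  # plain string instead of a Subreddit object
            post.url,
            post.num_comments,
            post.selftext,
            post.created_utc,  # Unix timestamp in UTC
        ]
    )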
posts = pd.DataFrame(
    posts,
    columns=[
        "title",
        "score",
        "id",
        "subreddit",
        "url",
        "num_comments",
        "body",
        "created",
    ],
)

* If you need more fields, use Python's built-in `vars()` function.
* Usage: `vars(post)` returns the attributes available on a post.
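
As a small illustration of what `vars()` returns, the sketch below uses a stand-in object (`fake_post`, a hypothetical name) rather than a real PRAW submission, so it runs without API credentials. A real post exposes many more attributes.

```python
from types import SimpleNamespace

# Stand-in for a PRAW submission; a real `post` comes from reddit.subreddit(...).hot()
fake_post = SimpleNamespace(title="Example title", score=42, num_comments=7)

# vars() returns the object's attribute dictionary; sorting the keys makes it easier to scan
available_fields = sorted(vars(fake_post))
print(available_fields)
```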

#### Convert Unix timestamp to a readable datetime

In [7]:
posts["created"] = pd.to_datetime(posts["created"], unit="s")

## Output

In [8]:
posts.head()

Hint: To keep only hot posts from the past 24 hours, filter the DataFrame on the "created" column.
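
One way to apply this hint is sketched below. It assumes the "created" column already holds timezone-naive UTC datetimes, as produced by `pd.to_datetime(..., unit="s")`. The helper name `filter_last_24_hours` and the sample rows are made up for illustration.

```python
import pandas as pd


def filter_last_24_hours(df, now=None, column="created"):
    """Return the rows whose timestamp in `column` is at most 24 hours older than `now`."""
    if now is None:
        # Naive UTC timestamp, matching the naive datetimes from pd.to_datetime(..., unit="s")
        now = pd.Timestamp.now(tz="UTC").tz_localize(None)
    cutoff = now - pd.Timedelta(hours=24)
    return df[df[column] >= cutoff]


# Tiny stand-in for the `posts` DataFrame (made-up rows, not real Reddit data)
sample = pd.DataFrame(
    {
        "title": ["recent post", "old post"],
        "created": pd.to_datetime([1_700_000_000, 1_699_800_000], unit="s"),
    }
)
reference_time = pd.Timestamp(1_700_003_600, unit="s")  # one hour after the first row
recent = filter_last_24_hours(sample, now=reference_time)
```

With the real data, `filter_last_24_hours(posts)` would compare against the current time instead of a fixed reference.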

## Additional Resources
- More info on the PRAW package used: https://praw.readthedocs.io/en/stable/