<a href="https://colab.research.google.com/github/daniel-sjkdm/ConsumingAPIs/blob/master/Reddit/ConsumingReddit.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Reddit API


To make HTTP requests to the Reddit API, an app must first be created at https://www.reddit.com/prefs/apps. There are three types of apps:

+ web app
+ installed app: for mobile phones
+ script: for personal use; it only has access to the accounts of the app's developers

For this notebook I'll create a script app so I can play a little with the api.

Creating the app generates its credentials, a _client id_ and a _client secret_, which must be provided with the HTTP requests.


With the app's credentials, an access token is then generated by making a POST request:

```sh
$ http -f post https://www.reddit.com/api/v1/access_token grant_type="password" username="username" password="password" --user "client_id:client_secret"
```

... but there's an easier way to do this: the __praw__ Python library, which is a wrapper for the Reddit API.
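
For comparison, the raw token request shown above with httpie can be sketched in Python using only the standard library. This is a minimal sketch rather than a full client: `build_token_request` is a hypothetical helper, the credential strings and the user agent are placeholders, and the request is only built, not sent.

```python
import base64
import urllib.parse
import urllib.request

TOKEN_URL = "https://www.reddit.com/api/v1/access_token"


def build_token_request(client_id, client_secret, username, password, user_agent):
    """Build (but do not send) the POST request for a password-grant token."""
    # HTTP Basic auth carries the app credentials, like httpie's --user flag
    basic = base64.b64encode(f"{client_id}:{client_secret}".encode()).decode()
    # Form-encoded body, like httpie's -f flag
    body = urllib.parse.urlencode({
        "grant_type": "password",
        "username": username,
        "password": password,
    }).encode()
    return urllib.request.Request(
        TOKEN_URL,
        data=body,
        headers={"Authorization": f"Basic {basic}", "User-Agent": user_agent},
        method="POST",
    )


# Placeholder values only; replace them with the app's real credentials
request = build_token_request(
    "client_id", "client_secret", "username", "password",
    "script:example:v0.1 (by u/username)",
)
print(request.get_method(), request.full_url)
```

Sending the request (for example with `urllib.request.urlopen(request)`) needs real credentials and network access, which is the step praw takes care of.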

## Applications

This API can be used in many applications, for example:

+ Natural Language Processing
  + Sentiment Analysis
  + Topic modeling



## Objective

1. Parse comment objects and store them in JSON format for the following subreddit:
+ Python

2. Parse redditor objects (the most popular ones) and store them in JSON format



In [152]:
!pip install praw python-dotenv -q

In [164]:
from praw.models import MoreComments
from collections import OrderedDict
from google.colab import drive
from dotenv import load_dotenv
from pprint import pprint
import datetime
import json
import praw
import os

In [154]:
drive.mount("/content/drive")

Drive already mounted at /content/drive; to attempt to forcibly remount, call drive.mount("/content/drive", force_remount=True).


In [173]:
path = "/content/drive/My Drive/Colab Notebooks/ConsumingAPIs/ConsumingReddit"
load_dotenv(dotenv_path=path + "/" + ".env")

True

In [156]:
client_id = os.getenv("CLIENT_ID")
client_secret = os.getenv("CLIENT_SECRET")
user_agent = os.getenv("USERAGENT")
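
Since `os.getenv` returns `None` for a variable that is not set, an incomplete `.env` file only shows up later as an authentication error. A small guard can catch it earlier. This is a minimal sketch: `require_env` is a hypothetical helper (it is not part of python-dotenv or praw), and the demo dictionary stands in for the real environment.

```python
import os


def require_env(names, env=os.environ):
    """Return the values for `names`, raising if any of them is missing or empty."""
    missing = [name for name in names if not env.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return [env[name] for name in names]


# Demo mapping with placeholder values; USERAGENT is left out on purpose
demo_env = {"CLIENT_ID": "placeholder-id", "CLIENT_SECRET": "placeholder-secret"}

try:
    require_env(["CLIENT_ID", "CLIENT_SECRET", "USERAGENT"], env=demo_env)
except RuntimeError as error:
    message = str(error)
    print(message)
```

Called without the `env` argument, the helper reads from the real environment populated by `load_dotenv`.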

In [157]:
# Useful functions

# The reddit api returns creation times (created_utc) as Unix
# timestamps in UTC, so they must be turned into a readable string
def get_readable_datetime(timestamp):
  return datetime.datetime.fromtimestamp(timestamp, tz=datetime.timezone.utc).strftime("%d/%m/%Y %H:%M:%S")
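
The conversion can be checked on its own with known epoch values. The sketch below makes the timezone explicit, because `fromtimestamp` without a `tz` argument uses the machine's local timezone. The name `to_readable_utc` and the sample timestamps are chosen for illustration.

```python
import datetime


def to_readable_utc(timestamp):
    """Format a Unix timestamp as dd/mm/YYYY HH:MM:SS in UTC."""
    moment = datetime.datetime.fromtimestamp(timestamp, tz=datetime.timezone.utc)
    return moment.strftime("%d/%m/%Y %H:%M:%S")


print(to_readable_utc(0))           # the Unix epoch
print(to_readable_utc(1558083020))  # a timestamp from May 2019
```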

In [158]:
# Creating a read only instance

reddit = praw.Reddit(
    client_id = client_id,
    client_secret = client_secret,
    user_agent = user_agent,
)

reddit.read_only

True

In [176]:
# Searching for posts in subreddits

def subreddit_posts_data_to_json(subreddit):

  subreddit = reddit.subreddit(subreddit)

  posts = {}

  for post in subreddit.hot(limit=100):

    comments = []

    for comment in post.comments:
      if isinstance(comment, MoreComments):
        continue
      comments.append(comment.body)

    posts[post.id] = {
      # post.author is None when the account was deleted
      "author": post.author.name if post.author else None,
      "subreddit": post.subreddit.display_name,
      "created": get_readable_datetime(post.created_utc),
      "distinguished": post.distinguished,
      "fullname": post.fullname,
      "downs": post.downs,
      "num_comments": post.num_comments,
      "title": post.title,
      "view_count": post.view_count,
      "score": post.score,
      "url": post.url,
      "comments": comments,
    }

  return posts


posts = subreddit_posts_data_to_json("Python")

pprint(posts['iyo4ij'])


with open(path + "/" + "posts.json", "w") as f:
  json.dump(posts, f, indent=4)

{'author': 'Tomas_83',
 'comments': ['Libraries are a double-edged sword. They are immensely helpful '
              'and allow you accomplish tasks easily but the also can abstract '
              'away a lot of the internal workings of whatever they do. You '
              "don't need to understand their implementations exactly but I'd "
              'encourage you to make sure you at least learn the concepts '
              "behind what they're doing.",
              'I rarely ever look at a libraries code, unless I am getting an '
              "error that a simple google search can't solve.  \n"
              '\n'
              '\n'
              "Then occasionally I'll crack open a library to see what I may "
              'be doing wrong.',
              'Use the library’s to help you accomplish the tasks necessary to '
              'build whatever applications you’re building… But there’s no '
              'need to be completely versed in the library if you’re learning',
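
Once the posts are stored as JSON, they can be loaded back and summarized without calling the API again. The sketch below uses a hypothetical two-post sample with the same shape as the `posts` dictionary, and a temporary directory instead of Google Drive so it runs anywhere.

```python
import json
import os
import tempfile

# Hypothetical sample with the same shape as the `posts` dictionary
sample_posts = {
    "abc123": {"title": "First post", "score": 10, "comments": ["nice", "thanks"]},
    "def456": {"title": "Second post", "score": 42, "comments": []},
}

# Round trip: dump to a file, then load it back
with tempfile.TemporaryDirectory() as tmp_dir:
    file_path = os.path.join(tmp_dir, "posts.json")
    with open(file_path, "w") as f:
        json.dump(sample_posts, f, indent=4)
    with open(file_path) as f:
        loaded_posts = json.load(f)

# Rank post ids by score, highest first, and count the stored comments
top_ids = sorted(loaded_posts, key=lambda post_id: loaded_posts[post_id]["score"], reverse=True)
total_comments = sum(len(post["comments"]) for post in loaded_posts.values())
print(top_ids, total_comments)
```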
   

In [177]:
# Getting the 100 most popular redditors
# (praw yields them as user profile subreddits, hence names like "u_...")

def redditors_to_json():
  redditors = {}
  for redditor in reddit.redditors.popular(limit=100):
    redditors[redditor.id] = {
      "name": redditor.display_name,
      "created": get_readable_datetime(redditor.created_utc),
      "description": redditor.description,
      "subscribers": redditor.subscribers,
    }
  return redditors

redditors = redditors_to_json()

pprint(redditors["11jw7w"])

with open(path + "/" + "redditors.json", "w") as f:
  json.dump(redditors, f, indent=4)

{'created': '17/05/2019 08:50:20',
 'description': '',
 'name': 'u_Wesley_Ford',
 'subscribers': 0}
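
With the redditors stored in a dictionary, ranking them by subscriber count is a short sort. This is a minimal sketch on hypothetical sample data: `top_by_subscribers` is an illustrative helper, and its `key` argument must match the key name used when the dictionary was built.

```python
def top_by_subscribers(redditors, count=3, key="subscribers"):
    """Return the names of the `count` entries with the most subscribers."""
    ranked = sorted(
        redditors.values(),
        key=lambda entry: entry.get(key) or 0,  # treat missing or None as 0
        reverse=True,
    )
    return [entry["name"] for entry in ranked[:count]]


# Hypothetical sample with the same shape as the `redditors` dictionary
sample_redditors = {
    "id1": {"name": "u_alpha", "subscribers": 5},
    "id2": {"name": "u_beta", "subscribers": 120},
    "id3": {"name": "u_gamma", "subscribers": 0},
}

top_names = top_by_subscribers(sample_redditors, count=2)
print(top_names)
```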
