# PRAW

## Introduction to PRAW

`PRAW` is an API wrapper for Reddit. It lets you access Reddit content without worrying about violating Reddit's API rules. 

## Installing PRAW

Within the psych750 environment, install praw using `pip`:

```
pip install praw
```

## Authenticating PRAW to access Reddit

Before using PRAW, you need to authenticate with your Reddit account. Go [here](https://www.reddit.com/prefs/apps/) and click on the `are you a developer? create an app...` button to create an application. Name it `psych750` and choose `script`. Add in a short description and use `http://localhost:8080` as the redirect uri.

## Working with PRAW
Now that everything is ready to go, we need to create an instance first:

In [1]:
import praw

r = praw.Reddit(
    client_id = 'BHpXy52FE-8za63YtAqvOQ',
    client_secret = 'GWEUpuB6M4q9YB2q9SOvY6OuGJ20JQ',
    password='psych750!Tutorial',
    user_agent = 'testscript by u/tutorial_for_praw',
    username='tutorial_for_praw'
)
print(r.user.me()) # Checking if it's working, it should just show the user name

tutorial_for_praw


Now we can look at things that are potentially interesting. Let's take a look at the hottest 10 posts on Reddit (as of the time of writing this tutorial).

In [2]:
for cur_submission in r.front.hot(limit = 10):
    # print(cur_submission)
    submission = r.submission(cur_submission)
    print(submission.title)

Inhaled Cannabis Reduces Pain and Anxiety, Improves Health-Related Quality of Life in Patients with Treatment-Resistant Conditions
Magic Pineapples🎤 (Carmen Lagala)
Nancy Pelosi marching on Washington for Gay Rights in the 80s
'Tried & Failed': Vladimir Putin Suffers Another Devastating Loss After 'Doomsday' Nuclear-Powered Torpedo Fails To Launch
A church steeple blew off during a storm last night.
TIL blacking out from alcohol doesn’t cause you to lose memories, but rather your brain temporarily loses the ability to create new memories during the blackout (so the memories never exist)
Colorado voters approve free school meals for K-12 students
Lorde: “Touring Has Become a Demented Struggle to Break Even or Face Debt”
Can somebody ID this turtle? Found this critter boxed up on my driveway
How to impress a girl by hiding a card behind your hand


Let's look at one of them. [This submission](https://www.reddit.com/r/todayilearned/comments/ysm4ys/til_blacking_out_from_alcohol_doesnt_cause_you_to/) talks about how your brain temporarily loses the ability to create new memories when you experience a blackout from alcohol. It has more than 1000 comments and we can use PRAW to extract all the comments and see what people's opinions are. We can first take a look at the top 5 comments

In [2]:
url = 'https://www.reddit.com/r/todayilearned/comments/ysm4ys/til_blacking_out_from_alcohol_doesnt_cause_you_to/'

submission = r.submission(url = url)

for top_level_comment in submission.comments[:5]:
    print(top_level_comment.body)

Anterograde amnesia: the inability to form new memories. Alcohol blackout is *temporary* anterograde amnesia; it goes away when the alcohol intoxication goes away. Permanent anterograde amnesia is a thing; my boss suffered a heart attack years ago, and it left him with permanent anterograde amnesia. When you cannot learn new information at all, it's pretty hard to work in I.T., where you have to learn new stuff all the time. Management retired him pretty quick.
Pro tip: I'd you ever want to know if someone is "Blacked Out" ask them the same question a few times within a few minutes, they usually won't remember.
I like to call this state Read Only Mode
[Anterograde amnesia](https://en.wikipedia.org/wiki/Anterograde_amnesia).
Some drugs which have the same effect are sometimes used for potentially mentally traumatic medical or dental procedures.


With these comments, we can use things like `nltk` to extract information that we are interested in.

In [17]:
import nltk

all_comments = ''
submission.comments.replace_more(limit=None)
for top_level_comment in submission.comments:
    all_comments += top_level_comment.body

all_comments_tokens = nltk.word_tokenize(all_comments)
all_comments_pos = nltk.pos_tag(all_comments_tokens)
all_comments_nouns = [word[0].lower() for word in all_comments_pos if word[1][:2] == 'NN']

print(nltk.FreqDist(all_comments_nouns).most_common(15))

[('memories', 61), ('memory', 39), ('time', 33), ('night', 32), ('drunk', 32), ('*', 25), ('brain', 24), ('someone', 22), ('people', 21), ('’', 21), ('day', 21), ('alcohol', 17), ('thing', 16), ('times', 16), ('t', 16)]


Let's find out how many upvotes it received:

In [9]:
print('This submission has: ' + str(submission.score) + ' upvotes!')

This submission has: 53529 upvotes!
This submission has: 3747 downvotes!


What about downvotes? PRAW does not have a downvote attribute you can look up. It only has a downvote method that lets you actually downvote the submission. So don't do that unless you want to downvote it! To find the number of downvotes, we need to calculate it based on the score (number of upvotes) and the upvote ratio:

In [None]:
print(f'This submission has: {str(round(submission.score * (1-submission.upvote_ratio)))} downvotes!')

Let's try something else. Let's take a look at what people in Wisconsin are discussing these days.

In [15]:
all_hot_submissions = ''
for cur_submission in r.subreddit("wisconsin").hot():
    submission = r.submission(cur_submission)
    all_hot_submissions += submission.title

all_hot_submissions_tokens = nltk.word_tokenize(all_hot_submissions)
all_hot_submissions_pos = nltk.pos_tag(all_hot_submissions_tokens)
all_hot_submissions_nouns = [word[0].lower() for word in all_hot_submissions_pos if word[1][:2] == 'NN']

print(nltk.FreqDist(all_hot_submissions_nouns).most_common(15))

[('wisconsin', 26), ('county', 8), ('election', 7), ('johnson', 6), ('milwaukee', 5), ('’', 4), ('voters', 4), ('ballot', 4), ('state', 4), ('wi', 4), ('michels', 4), ('%', 4), ('ron', 4), ('department', 3), ('food', 3)]


Not a big surprise, since election is coming up, people are mostly talking about voting and Ron Johnson.

Now let's find out who actually has the hottest post in `wisconsin`.

In [23]:
for cur_submission in r.subreddit("wisconsin").hot(limit = 1):
    hottest_submission_in_wisconsin = r.submission(cur_submission)

print(f'The hottest submission in r/wisconsin is: {hottest_submission_in_wisconsin.title} by {hottest_submission_in_wisconsin.author}')


The hottest submission in r/wisconsin is: Yay…more lanes authored by Nimzay98


Can we find out more about this user? What's this person's karma? Where else has this user posted within the last month?

In [43]:
import time

user_instance = hottest_submission_in_wisconsin.author
print('this user has ' + str(user_instance.comment_karma) + ' karmas!')

all_subreddits_user_posted = []
for cur_submission in user_instance.submissions.new():
    submission = r.submission(cur_submission)
    all_subreddits_user_posted.append(submission.subreddit)
    if time.time() - submission.created_utc > 60*60*24*30:
        break

unique_subreddits = set(all_subreddits_user_posted)
for cur_subreddit in unique_subreddits:
    print('This user has posted in ' + str(cur_subreddit.display_name) + ' in the past month.')

this user has 73472 karmas!
This user has posted in MkeBucks in the past month.
This user has posted in milwaukee in the past month.
This user has posted in RealTwitterAccounts in the past month.
This user has posted in wisconsin in the past month.


We can tell from the data that this user is probably a very active reddit user. They live in Milwaukee, is a Bucks fan, and probably is also very active on Twitter.

And a joke to end the activity:

In [21]:
submission = r.submission('yshook') # You can also use the submission's ID which comes after comments/ in the URL
print(submission.title)
print(submission.selftext)

What's the difference between grey and gray?
One is a color, and the other is a colour.
