# Data Retrieval Example

## requirements:
- instal praw using the following command `!pip install praw`. Don't use conda for this specific model.
- instal requests if you don't have in your development environment 

## References:
- https://towardsdatascience.com/how-to-use-the-reddit-api-in-python-5e05ddfd1e5c
- https://www.reddit.com/dev/api
- https://praw.readthedocs.io/en/stable/index.html


In [2]:
import praw
import pandas as pd
import requests

## Connecting to the reddit API using `requests`

### Steps:
- create an app on reddit to acquire the access credentials:
    - `PASSWORD` password of the reddit account used to create the app
    - `USERNAME` username of the reddit account used to create the app
    - `CLIENT_ID` acquired after creating the app
    - `SECRET_TOKEN` acquired after creating the app


In [3]:

PASSWORD = 'j=HUZ`6S8B'
USERNAME = 'CMPS287_project'
CLIENT_ID = 'd5w9jc7mmyeLEL2DG1wtxg'
SECRET_TOKEN = 'HIOuTew4HunOVSJeFT47Yi4sCkdBCA' 
# note that CLIENT_ID refers to 'personal use script' and SECRET_TOKEN to 'token'
auth = requests.auth.HTTPBasicAuth(CLIENT_ID, SECRET_TOKEN)

# here we pass our login method (password), username, and password
data = {'grant_type': 'password',
        'username': USERNAME,
        'password': PASSWORD }

# setup our header info, which gives reddit a brief description of our app
headers = {'User-Agent': 'MyAPI/0.0.1'}

# send our request for an OAuth token
res = requests.post('https://www.reddit.com/api/v1/access_token',
                    auth=auth, data=data, headers=headers)

# convert response to JSON and pull access_token value
TOKEN = res.json()['access_token']

# add authorization to our headers dictionary
headers = {**headers, **{'Authorization': f"bearer {TOKEN}"}}

# while the token is valid (~2 hours) we just add headers=headers to our requests
requests.get('https://oauth.reddit.com/api/v1/me', headers=headers).json()

{'is_employee': False,
 'seen_layout_switch': False,
 'has_visited_new_profile': False,
 'pref_no_profanity': True,
 'has_external_account': False,
 'pref_geopopular': '',
 'seen_redesign_modal': False,
 'pref_show_trending': True,
 'subreddit': {'default_set': True,
  'user_is_contributor': False,
  'banner_img': '',
  'restrict_posting': True,
  'user_is_banned': False,
  'free_form_reports': True,
  'community_icon': None,
  'show_media': True,
  'icon_color': '#E4ABFF',
  'user_is_muted': False,
  'display_name': 'u_CMPS287_project',
  'header_img': None,
  'title': '',
  'coins': 0,
  'previous_names': [],
  'over_18': False,
  'icon_size': [256, 256],
  'primary_color': '',
  'icon_img': 'https://www.redditstatic.com/avatars/defaults/v2/avatar_default_7.png',
  'description': '',
  'submit_link_label': '',
  'header_size': None,
  'restrict_commenting': False,
  'subscribers': 0,
  'submit_text_label': '',
  'is_default_icon': True,
  'link_flair_position': '',
  'display_name_pr

In [4]:
hot = "https://oauth.reddit.com/r/python/hot"


# make a request for the trending posts in /r/Python
res = requests.get("https://oauth.reddit.com/r/python/hot",
                   headers=headers)

df = pd.DataFrame()  # initialize dataframe

# loop through each post retrieved from GET request
for post in res.json()['data']['children']:
    # append relevant data to dataframe
    df = df.append({
        'subreddit': post['data']['subreddit'],
        'title': post['data']['title'],
        'selftext': post['data']['selftext'],
        'upvote_ratio': post['data']['upvote_ratio'],
        'ups': post['data']['ups'],
        'downs': post['data']['downs'],
        'score': post['data']['score']
    }, ignore_index=True)


In [5]:
df.head()

Unnamed: 0,downs,score,selftext,subreddit,title,ups,upvote_ratio
0,0.0,4.0,Tell /r/python what you're working on this wee...,Python,Sunday Daily Thread: What's everyone working o...,4.0,0.83
1,0.0,2.0,Found a neat resource related to Python over t...,Python,Saturday Daily Thread: Resource Request and Sh...,2.0,0.75
2,0.0,118.0,,Python,Python open-source OpenBB Terminal against Blo...,118.0,0.94
3,0.0,27.0,,Python,How is PyPy Tested?,27.0,0.82
4,0.0,336.0,I don't get it. Help!\n\nhttps://preview.redd....,Python,"A commit from my lead dev: ""Improve readability"".",336.0,0.82


In [10]:
# getting usernames of the hot posts
authors_contributors = "https://oauth.reddit.com/r/explainlikeimfive/hot"

# make a request for the trending posts in /r/Python
res = requests.get(authors_contributors, headers=headers)

respons_json = res.json()['data']['children']

for i in range(len(respons_json)):
    print(respons_json[i]['data']['author'])

theconcorde
toasterstrewdal
RedNozomi
Troldilocks
1mrofflineoctave1
ottpro
rabid_erica
badgerprof
ramanujam
moosepooo
micuki
bebetterinsomething
foggiermeadows
i_was_way_off
moonieass13
Ok_Restaurant233
DirtyProjector
aTrolley
SerenityMcC
vingt_huit
rebeccawinston123
pairustwo
Questions7292
Spartan448
Mohkh84


## Example using praw library
- praw library is the python Reddit API wrapper
- We use the same access credentials we used in the previous example

In [11]:
reddit = praw.Reddit(client_id     = CLIENT_ID,
                     client_secret = SECRET_TOKEN,
                     user_agent    = 'MyAPI/0.0.1')

# To test if your instance is working use:
print(reddit.read_only) # Output: True

for submission in reddit.subreddit("learnpython").hot(limit=10):
    print(submission.title)

True
Ask Anything Monday - Weekly Thread
Seems easy, but can't find anywhere. How do I add a row to a pandas dataframe that includes a date?
Why does the program end before my loop is finished?
Explain this lambda
Question regarding Angela Yu's 100 days of python
can you explain a little about database ?
Overwhelmed
base 64 decoded not the same as original encoded value - been on this for hours
Twitter read and analysis
How to apply a function to a list in constant space without using a for loop.


In [82]:
#  getting the wikipages of the r/autowikibot subreddit
#  link to the subreddit http://www.reddit.com/user/autowikibot

for wikipage in reddit.subreddit("autowikibot").wiki:
    print(wikipage)

autowikibot/botlist
autowikibot/commandlist
autowikibot/config/description
autowikibot/config/sidebar
autowikibot/config/stylesheet
autowikibot/config/submit_text
autowikibot/css
autowikibot/excludedsubs
autowikibot/index
autowikibot/livelists
autowikibot/modfaqs
autowikibot/nsfwtag
autowikibot/planned
autowikibot/redditbots
autowikibot/rootonlysubs
autowikibot/statistics
autowikibot/summon
autowikibot/summononlysubs
autowikibot/userblacklist


In [83]:
# getting the content of a specific wikipage in the r/autowikibot subreddit
# link to the wikipage we are requesting https://www.reddit.com/r/autowikibot/wiki/redditbots
wikipage = reddit.subreddit("autowikibot").wiki["redditbots"]
print(wikipage.content_md)

Note: New details will be added every week.

* **Active**: Last post < 3 days
* Dormant: Last post <1 month
* Inactive: Last post >6 months

|Username|Trigger|Function|Status - comment freq - submission freq.|
|:-|:-|:-|:-||
|/u/A858DE45F56D9BC9|*Unknown*|Unknown|**Active** - Nil - 3 hrs|
|/u/AAbot|*submission in /r/asianamerican* |info about artist|Inactive - 0 mins - 5 mins|
|/u/ADHDbot|-|moderator @ /r/ADHD |**Active** - 30 hrs - 1198 hrs|
|/u/ALTcointip|`+/u/altcointip <currency numbers>`|tip bot|**Active** - 1 hrs - 6 mins|
|/u/AVR_Modbot|-|moderator @ /r/aussievapers|Dormant - 2 hrs - Nil|
|/u/A_random_gif|`random gif`|posts random gif from /r/gifs |Dormant - 53 hrs - Nil|
|/u/AltCodeBot|*Alt-code characters in comment*|Keyboard shortcuts for them|Dormant - 28 mins - Nil|
|/u/Antiracism_Bot|*racial slur in comment*|counts and mentions|Dormant - 58 mins - Nil|
|/u/ApiContraption|*submission in /r/photoshopbattles* |collects non-photoshop comments|**Active** - 18 mins - Nil|
|/u/As