# Scraping Social Media (e.g., Twitter, Reddit) with APIs
### Last Updated: 09/17/18

## What is an API?!

API stands for Application Programming Interface. For social media sites, it is essentially a defined set of interfaces for communicating and interacting with the data those sites collect.

APIs are normally intended for app developers who want to integrate their services with social media sites, but we are using them to collect data for academic purposes... FOR SCIENCE!
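
To make this concrete, here is a minimal sketch of what "interacting with data" through an API usually looks like on the receiving end: the site sends back text in JSON format, and we decode it into ordinary Python lists and dictionaries. The `raw_response` string below is a hand-written stand-in (not real API output) whose field names mirror the tweet objects shown later in this notebook.

```python
import json

# Hand-written stand-in for an API response body (not real API output).
raw_response = (
    '[{"id": 1041383066227302400, '
    '"created_at": "Sun Sep 16 17:47:25 +0000 2018", '
    '"text": "Hello, World!"}]'
)

# json.loads turns the JSON text into ordinary Python lists and dicts.
tweets = json.loads(raw_response)

for tweet in tweets:
    print(tweet["id"], tweet["text"])
```

The API wrappers used below do this decoding for you, which is why their results can be indexed like dictionaries.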

## Scraping Twitter

__Important note__: These days it is nearly impossible to scrape complete samples from Twitter without paying for access. What follows is just a taste of the kind of code you could use once you are granted access to the firehose (after paying for the API).

For more information: https://developer.twitter.com/en/pricing.html

### Creating a Twitter Application account

https://developer.twitter.com/en/apply/account
https://developer.twitter.com/en/account/get-started


### Importing required packages...

If the packages are not already installed, you can use either `pip` or `conda` to install them.

Note: There are many Twitter API wrappers for Python; `twitter` is just the one I have used before and that has worked. Others (e.g., `tweepy`) may work better or be more suitable for your needs.

For more information on the `twitter` package: https://pypi.org/project/twitter/

In [95]:
from twitter import Twitter, OAuth
import pandas as pd

In [96]:
con_key = "XXXXXXXXXXXXXXX"
con_sec = "XXXXXXXXXXXXXXX"
access_token = "XXXXXXXXXXXXXXX"
access_token_sec = "XXXXXXXXXXXXXXX"

t = Twitter(auth=OAuth(access_token, access_token_sec,
                       con_key, con_sec))
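
Rather than pasting keys directly into a notebook that may be shared, one option is to read them from environment variables. The sketch below assumes four hypothetical variable names (`TWITTER_CON_KEY` and so on); they are not defined by the `twitter` package, so use whatever names you like.

```python
import os

# Hypothetical environment variable names; nothing requires these exact names.
REQUIRED_VARS = (
    "TWITTER_CON_KEY",
    "TWITTER_CON_SEC",
    "TWITTER_ACCESS_TOKEN",
    "TWITTER_ACCESS_TOKEN_SEC",
)

def load_credentials(env=os.environ):
    """Return the four credentials as a dict, or raise if any is missing."""
    missing = [name for name in REQUIRED_VARS if not env.get(name)]
    if missing:
        raise KeyError("Missing environment variables: " + ", ".join(missing))
    return {name: env[name] for name in REQUIRED_VARS}
```

After `creds = load_credentials()`, the values can be passed to `OAuth` in the same order as in the cell above (access token, access token secret, consumer key, consumer secret).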

### Searching for historical tweets on Twitter
For more information on how the Twitter API works: https://developer.twitter.com/en/docs/tweets/search/overview

Very important note: the free search is not as powerful as it was in the past. The standard tier only reaches back about a week; going further requires paid access.

#### Kevin exercise

Accessing/collecting Tweets from Kevin's timeline:
https://twitter.com/kjs253


In [97]:
kjs_test = t.statuses.user_timeline(screen_name="kjs253",
                                    exclude_replies = True,
                                    include_rts = 1)

TwitterHTTPError: Twitter sent status 401 for URL: 1.1/statuses/user_timeline.json using parameters: (exclude_replies=True&include_rts=1&oauth_consumer_key=[REDACTED]&oauth_nonce=7205114650330413618&oauth_signature_method=HMAC-SHA1&oauth_timestamp=1537310628&oauth_token=[REDACTED]&oauth_version=1.0&screen_name=kjs253&oauth_signature=[REDACTED])
details: {'errors': [{'code': 89, 'message': 'Invalid or expired token.'}]}

In [50]:
kjs_test

[{'contributors': None,
  'coordinates': None,
  'created_at': 'Sun Sep 16 17:47:25 +0000 2018',
  'entities': {'hashtags': [], 'symbols': [], 'urls': [], 'user_mentions': []},
  'favorite_count': 1,
  'favorited': False,
  'geo': None,
  'id': 1041383066227302400,
  'id_str': '1041383066227302400',
  'in_reply_to_screen_name': None,
  'in_reply_to_status_id': None,
  'in_reply_to_status_id_str': None,
  'in_reply_to_user_id': None,
  'in_reply_to_user_id_str': None,
  'is_quote_status': False,
  'lang': 'en',
  'place': None,
  'retweet_count': 0,
  'retweeted': False,
  'source': '<a href="http://twitter.com/download/iphone" rel="nofollow">Twitter for iPhone</a>',
  'text': 'Hello, World!',
  'truncated': False,
  'user': {'contributors_enabled': False,
   'created_at': 'Wed Jul 11 04:16:36 +0000 2018',
   'default_profile': True,
   'default_profile_image': True,
   'description': '',
   'entities': {'description': {'urls': []}},
   'favourites_count': 0,
   'follow_request_sent': F

In [51]:
kevin = []

for tweet in kjs_test:
    single = {}
    single["id"] = tweet['id']
    single["created_at"] = tweet['created_at']
    single["text"] = tweet['text']
    kevin.append(single)

In [52]:
kevin

[{'created_at': 'Sun Sep 16 17:47:25 +0000 2018',
  'id': 1041383066227302400,
  'text': 'Hello, World!'}]

In [53]:
kevin_data = pd.DataFrame(kevin)

In [54]:
kevin_data

|   | created_at | id | text |
|---|---|---|---|
| 0 | Sun Sep 16 17:47:25 +0000 2018 | 1041383066227302400 | Hello, World! |
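
The same "pick a few fields from each tweet" loop comes up repeatedly in this notebook, so it may be worth wrapping in a small helper. The function name `tweets_to_rows` and the `sample` data are illustrative, not part of any library; the sample mirrors the tweet shown above.

```python
def tweets_to_rows(tweets, fields=("id", "created_at", "text")):
    """Keep only the requested top-level fields from each tweet dict."""
    # dict.get returns None when a field is absent instead of raising KeyError.
    return [{field: tweet.get(field) for field in fields} for tweet in tweets]

# Illustrative stand-in for an API result.
sample = [{
    "id": 1041383066227302400,
    "created_at": "Sun Sep 16 17:47:25 +0000 2018",
    "text": "Hello, World!",
    "lang": "en",
}]

rows = tweets_to_rows(sample)
print(rows)
```

The result is a list of dictionaries, so it can be handed straight to `pd.DataFrame(...)`, e.g. `pd.DataFrame(tweets_to_rows(kjs_test))`.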


#### Zach Lowe exercise
Accessing and collecting tweets from Zach Lowe's timeline: https://twitter.com/ZachLowe_NBA

In [59]:
lowe = []
# The API caps `count` at 200 per request, so an oversized value does not return more tweets.
zl_nba = t.statuses.user_timeline(screen_name="ZachLowe_NBA",
                                  exclude_replies=True,
                                  count=1000000000000000,
                                  include_rts=1)

In [60]:
len(zl_nba)

19

### Activity 1
How would you capture the top 1000 tweets from Zach Lowe's timeline in a pandas table? Please include the `id`, `created_at`, and `text` of each tweet. [10 min]

In [87]:
# Insert answer to Activity 1 here...








### Search for Tweets

In [61]:
# `count` is capped at 100 per request for this endpoint
search = t.search.tweets(q="#NBA",
                         count=1000)

The search object contains two larger objects: 1) `search_metadata`, which tells you how long the search took, how many tweets were requested, the id strings, and so on; and 2) `statuses`, which is the meat of what you want from a query like this.

In [62]:
search

{'search_metadata': {'completed_in': 0.078,
  'count': 100,
  'max_id': 1042174059902115840,
  'max_id_str': '1042174059902115840',
  'next_results': '?max_id=1042167306045071361&q=%23NBA&count=100&include_entities=1',
  'query': '%23NBA',
  'refresh_url': '?since_id=1042174059902115840&q=%23NBA&include_entities=1',
  'since_id': 0,
  'since_id_str': '0'},
 'statuses': [{'contributors': None,
   'coordinates': None,
   'created_at': 'Tue Sep 18 22:10:33 +0000 2018',
   'entities': {'hashtags': [{'indices': [62, 74], 'text': 'DionWaiters'},
     {'indices': [75, 80], 'text': 'Heat'},
     {'indices': [81, 85], 'text': 'NBA'}],
    'symbols': [],
    'urls': [{'display_url': 'goo.gl/qwBipR',
      'expanded_url': 'https://goo.gl/qwBipR',
      'indices': [88, 111],
      'url': 'https://t.co/8a4JuStMSh'}],
    'user_mentions': [{'id': 1004864958100901888,
      'id_str': '1004864958100901888',
      'indices': [3, 16],
      'name': 'Timeout',
      'screen_name': 'basketimeout'}]},
   '

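Even though we asked for 1000, the search endpoint caps `count` at 100 per request (note `'count': 100` in `search_metadata`), so a single call returns at most the 100 most recent tweets that contain `#NBA`. The `next_results` entry in `search_metadata` holds the query string for the next page of older tweets.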
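
The `next_results` value in the `search_metadata` output above is an ordinary URL query string, so the standard library can unpack it. This is a minimal sketch; the helper name `next_page_params` is made up for illustration, and it assumes that `next_results` is simply absent when there are no further pages.

```python
from urllib.parse import parse_qs

def next_page_params(search_metadata):
    """Return the query parameters for the next page, or None if there is none."""
    next_results = search_metadata.get("next_results")
    if not next_results:
        return None
    # parse_qs maps each key to a list of values; keep the first of each.
    parsed = parse_qs(next_results.lstrip("?"))
    return {key: values[0] for key, values in parsed.items()}

# The next_results string copied from the output above.
metadata = {
    "next_results": "?max_id=1042167306045071361&q=%23NBA&count=100&include_entities=1"
}
params = next_page_params(metadata)
print(params)
```

Note that `parse_qs` decodes `%23NBA` back to `#NBA`. In principle the resulting dictionary could be passed back into the search call as keyword arguments (e.g., `t.search.tweets(**params)`), though that step is untested here.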

In [63]:
len(search['statuses'])

100

In [64]:
search_ls = []
for tweet in search['statuses']:
    single = {}
    single["id"] = tweet['id']
    single["created_at"] = tweet['created_at']
    single["text"] = tweet['text']
    search_ls.append(single)

In [65]:
search_pandas = pd.DataFrame(search_ls)

In [66]:
search_pandas

|   | created_at | id | text |
|---|---|---|---|
| 0 | Tue Sep 18 22:10:33 +0000 2018 | 1042174059902115840 | RT @basketimeout: Más notícias para os fãs dos... |
| 1 | Tue Sep 18 22:10:23 +0000 2018 | 1042174020962213889 | RT @NOW_insports: #NBA │ OFFICIAL: The @MiamiH... |
| 2 | Tue Sep 18 22:10:12 +0000 2018 | 1042173972111126528 | Ginóbili y una clase de valores: "Lo más impor... |
| 3 | Tue Sep 18 22:09:16 +0000 2018 | 1042173737951473670 | RT @Evo1S: The BEST Raffle on Twitter brought ... |
| 4 | Tue Sep 18 22:09:02 +0000 2018 | 1042173679050936320 | ICYMI: Beats by Dr. Dre officially announced a... |
| 5 | Tue Sep 18 22:08:42 +0000 2018 | 1042173596473286657 | RT @AmicoHoops: #Heat make Dwyane Wade return ... |
| 6 | Tue Sep 18 22:08:41 +0000 2018 | 1042173590764965889 | Más notícias para os fãs dos Miami Heat 🏀🔥\n\n... |
| 7 | Tue Sep 18 22:08:08 +0000 2018 | 1042173453405708288 | 30 Days to that ring ceremony!!!!! #nba #dubna... |
| 8 | Tue Sep 18 22:08:01 +0000 2018 | 1042173423634591745 | RT @AmicoHoops: #Spurs sign former UConn cente... |
| 9 | Tue Sep 18 22:07:43 +0000 2018 | 1042173347243720704 | Biggest NBA Busts - OH MY! https://t.co/Vwb9gF... |
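
The `created_at` values above are plain strings, which makes sorting or filtering by time awkward. A minimal sketch of converting them to timezone-aware `datetime` objects with the standard library follows; the format string is inferred from the timestamps shown in this notebook, and `%a`/`%b` assume English day and month names (the default "C" locale).

```python
from datetime import datetime

# Format inferred from strings like "Tue Sep 18 22:10:33 +0000 2018".
TWITTER_TIME_FORMAT = "%a %b %d %H:%M:%S %z %Y"

def parse_created_at(value):
    """Convert a Twitter-style created_at string to an aware datetime."""
    return datetime.strptime(value, TWITTER_TIME_FORMAT)

parsed = parse_created_at("Tue Sep 18 22:10:33 +0000 2018")
print(parsed.isoformat())
```

Once the column is in a DataFrame, `pd.to_datetime` with the same `format` string should achieve the same thing for the whole column at once.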


### Stream public tweets that are happening in real-time
We will use Twython to do this (for no particular reason other than that Kevin had already learned to use it).

For more info on Twython: https://twython.readthedocs.io/en/latest/

In [67]:
from twython import TwythonStreamer

streamed_tweets = []

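class MyStreamer(TwythonStreamer):

    def on_success(self, data):
        # Keep only English-language tweets; .get() skips messages without a 'lang' key
        if data.get('lang') == 'en':
            streamed_tweets.append(data)
            print("received tweet", len(streamed_tweets))

        # Stop once five tweets have been collected
        if len(streamed_tweets) >= 5:
            self.disconnect()

    def on_error(self, status_code, data):
        print(status_code, data)
        # Disconnect on error instead of retrying indefinitely
        self.disconnect()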

In [None]:
stream = MyStreamer(con_key, con_sec, access_token, access_token_sec)
stream.statuses.filter(track='iphone')

401 b'<html>\\n<head>\\n<meta http-equiv="Content-Type" content="text/html; charset=utf-8"/>\\n<title>Error 401 Unauthorized</title>\n</head>\n<body>\n<h2>HTTP ERROR: 401</h2>\n<p>Problem accessing \'/1.1/statuses/filter.json\'. Reason:\n<pre>    Unauthorized</pre>\n</body>\n</html>\n'
401 Unable to decode response,                                       not valid JSON.

In [71]:
streamed_tweets[1]

{'contributors': None,
 'coordinates': None,
 'created_at': 'Tue Sep 18 22:18:21 +0000 2018',
 'entities': {'hashtags': [],
  'symbols': [],
  'urls': [{'display_url': 'twitter.com/i/web/status/1…',
    'expanded_url': 'https://twitter.com/i/web/status/1042176024925097991',
    'indices': [111, 134],
    'url': 'https://t.co/zkXeRCGO04'}],
  'user_mentions': []},
 'extended_tweet': {'display_text_range': [0, 158],
  'entities': {'hashtags': [],
   'media': [{'display_url': 'pic.twitter.com/dBbY3HXlBh',
     'expanded_url': 'https://twitter.com/TheHackersNews/status/1041621081440825344/photo/1',
     'id': 1041620951585222656,
     'id_str': '1041620951585222656',
     'indices': [135, 158],
     'media_url': 'http://pbs.twimg.com/media/DnSUvJnXcAAQsBQ.jpg',
     'media_url_https': 'https://pbs.twimg.com/media/DnSUvJnXcAAQsBQ.jpg',
     'sizes': {'large': {'h': 380, 'resize': 'fit', 'w': 728},
      'medium': {'h': 380, 'resize': 'fit', 'w': 728},
      'small': {'h': 355, 'resize': 'fi

In [72]:
stream_ls = []
for tweet in streamed_tweets:
    single = {}
    single["id"] = tweet['id']
    single["created_at"] = tweet['created_at']
    single["text"] = tweet['text']
    stream_ls.append(single)

In [73]:
stream_pandas = pd.DataFrame(stream_ls)

In [74]:
stream_pandas

|   | created_at | id | text |
|---|---|---|---|
| 0 | Tue Sep 18 22:18:19 +0000 2018 | 1042176013697015811 | RT @basedsavage3600: Thought you was gone kill... |
| 1 | Tue Sep 18 22:18:21 +0000 2018 | 1042176024925097991 | RT TheHackersNews: Watch Out! A CSS-Based New ... |
| 2 | Tue Sep 18 22:18:21 +0000 2018 | 1042176025331998720 | (iOS 12 accidentally removes select Apple Watc... |
| 3 | Tue Sep 18 22:18:22 +0000 2018 | 1042176026359554048 | Clickbank University, Our Clients Have Earned ... |
| 4 | Tue Sep 18 22:18:22 +0000 2018 | 1042176027496083456 | RT @Apple: Welcome to the big screens. iPhone ... |
| 5 | Tue Sep 18 22:19:09 +0000 2018 | 1042176224716627968 | RT @basedsavage3600: Thought you was gone kill... |
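
One caveat with the streamed data: the top-level `text` field can be a shortened version of a long tweet, with the complete text stored under `extended_tweet` (as in the `streamed_tweets[1]` output above). The helper below is a sketch of how one might recover the longer text. It assumes, based on my understanding of the v1.1 streaming payload rather than anything shown in full here, that the complete text lives in `extended_tweet['full_text']` and that retweets carry the original tweet under `retweeted_status`. The sample dictionaries are invented for illustration.

```python
def full_text(tweet):
    """Best-effort untruncated text for a streamed tweet dict."""
    # For retweets, look at the original tweet if it is included.
    source = tweet.get("retweeted_status", tweet)
    extended = source.get("extended_tweet")
    if extended and "full_text" in extended:
        return extended["full_text"]
    # Fall back to the ordinary (possibly shortened) text.
    return source.get("text", "")

# Invented examples: one long tweet and one short tweet.
long_tweet = {
    "text": "A long tweet that got cut...",
    "extended_tweet": {"full_text": "A long tweet that got cut off by the API."},
}
short_tweet = {"text": "Short and sweet."}

print(full_text(long_tweet))
print(full_text(short_tweet))
```

For a retweet this returns the original author's text without the leading `RT @user:` prefix, which may or may not be what you want for your analysis.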


### Activity 2

Please stream 100 tweets that feature the word "sociology." Present this data in a pandas dataframe with the following four columns: `created_at`, `id`, `text`, `screen_name`

In [None]:
# Please input your answers for activity 2 here










## Scraping Reddit

__Last Updated: 09/17/18__

This file implements the tutorial found at http://www.storybench.org/how-to-scrape-reddit-with-python/ and the accompanying script at https://github.com/aleszu/reddit-sentiment-analysis/blob/master/r_subreddit.py

For more information regarding `PRAW` please consult: https://praw.readthedocs.io/en/latest/getting_started/quick_start.html 

Thank you internet!

### Load the required packages

In [75]:
import praw
import pandas as pd

### Reddit API credentials
To scrape Reddit successfully, you first need to create a Reddit account and register an "app" with Reddit:

https://www.reddit.com/prefs/apps

In [76]:
reddit = praw.Reddit(client_id='XXXXXXXXXXXXXXX',
                     client_secret='XXXXXXXXXXXXXXX',
                     user_agent='XXXXXXXXXXXXXXX',
                     username='XXXXXXXXXXXXXXX',
                     password='XXXXXXXXXXXXXXX')

### Scraping NBA Reddit

https://www.reddit.com/r/nba/

In [77]:
subreddit = reddit.subreddit('nba') # Set the subreddit of interest

In [78]:
for submission in subreddit.hot(limit=5):
    print(submission.title, submission.id)

Daily Locker Room and Free Talk + Game Threads Index (2018.09.18) 9guplr
[Announcement] More AMA's! 9gmsy2
[Rovell] JUST IN: @kevinlove launches the Kevin Love Fund, with partners including Headspace, to focus on prioritizing mental health. Pushes the momentum forward. Proud to be a part of that work as well with @AllALittleCrazy programs & tour. 9guj99
[OC] The Shaq Rule is over. Who replaces him? 9guxyz
[Miami Heat] OFFICIAL: The Miami HEAT have re-signed guard @DwyaneWade. 9gxwtu


In [79]:
topics_dict = {"title": [],
               "score": [],
               "id": [],
               "url": [],
               "comms_num": [],
               "created": [],
               "body": []}

In [80]:
for submission in subreddit.hot(limit = 10):
    topics_dict["title"].append(submission.title)
    topics_dict["score"].append(submission.score)
    topics_dict["id"].append(submission.id)
    topics_dict["url"].append(submission.url)
    topics_dict["comms_num"].append(submission.num_comments)
    topics_dict["created"].append(submission.created)
    topics_dict["body"].append(submission.selftext)

In [81]:
len(topics_dict["id"])

10

In [82]:
topics_data = pd.DataFrame(topics_dict)

In [83]:
topics_data

|   | body | comms_num | created | id | score | title | url |
|---|---|---|---|---|---|---|---|
| 0 | `#[/r/NBA Rules](https://www.reddit.com/r/nba/w...` | 59 | 1537305000.0 | 9guplr | 19 | Daily Locker Room and Free Talk + Game Threads... | https://www.reddit.com/r/nba/comments/9guplr/d... |
| 1 | On Thursday, September 20th at 2 ET, we will b... | 39 | 1537237000.0 | 9gmsy2 | 175 | [Announcement] More AMA's! | https://www.reddit.com/r/nba/comments/9gmsy2/a... |
| 2 |  | 104 | 1537304000.0 | 9guj99 | 5347 | [Rovell] JUST IN: @kevinlove launches the Kevi... | https://twitter.com/darrenrovell/status/104203... |
| 3 | When the Warriors won the 2018 NBA Finals, the... | 125 | 1537307000.0 | 9guxyz | 2046 | [OC] The Shaq Rule is over. Who replaces him? | https://www.reddit.com/r/nba/comments/9guxyz/o... |
| 4 |  | 51 | 1537328000.0 | 9gxwtu | 524 | [Miami Heat] OFFICIAL: The Miami HEAT have re-... | https://twitter.com/MiamiHEAT/status/104213238... |
| 5 |  | 301 | 1537318000.0 | 9gwhww | 725 | [Wojnarowski] Minnesota's Tom Thibodeau is tra... | https://twitter.com/wojespn/status/10420922186... |
| 6 |  | 162 | 1537317000.0 | 9gwdkm | 613 | [Charania] Charlotte Hornets owner Michael Jor... | https://twitter.com/ShamsCharania/status/10420... |
| 7 |  | 86 | 1537330000.0 | 9gyaer | 240 | [Young] Russell Westbrook and his wife Nina an... | https://twitter.com/royceyoung/status/10421431... |
| 8 |  | 124 | 1537304000.0 | 9gum0q | 746 | Richard Jefferson will join the Nets broadcast... | https://www.netsdaily.com/2018/9/17/17872798/i... |
| 9 |  | 159 | 1537310000.0 | 9gvbdw | 514 | [Blake Murphy] Cliff notes: Kawhi has been to ... | https://twitter.com/blakemurphyodc/status/1042... |
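
The `created` column holds timestamps as seconds since the Unix epoch, which are hard to read at a glance. Here is a minimal sketch of converting one to a readable date, assuming the value is in UTC seconds (PRAW also exposes a `created_utc` attribute, which is the safer choice if you need guaranteed UTC). The input is the rounded value displayed in row 0 of the table above.

```python
from datetime import datetime, timezone

def epoch_to_utc(seconds):
    """Convert seconds since the Unix epoch to a timezone-aware UTC datetime."""
    return datetime.fromtimestamp(seconds, tz=timezone.utc)

# Rounded value as displayed in row 0 of the table above.
converted = epoch_to_utc(1537305000.0)
print(converted.isoformat())
```

For a whole DataFrame column, `pd.to_datetime(topics_data["created"], unit="s")` should do the equivalent conversion in one step.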


In [34]:
topics_data.to_csv('x.csv')

#### Scraping the top comments in thread "9gk8f5"

https://www.reddit.com/r/nba/comments/9gk8f5/daily_locker_room_and_free_talk_game_threads/

In [84]:
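submission = reddit.submission(id="9gk8f5")

# Drop "load more comments" placeholders, which have no `body` attribute
submission.comments.replace_more(limit=0)

for top_level_comment in submission.comments:
    print(top_level_comment.body, top_level_comment.created)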

Gotta love school bookstores. They sell you an IClicker with non-functioning batteries and tell you its your damn fault that they're not working. Bet they wouldn't say the same shit if one of their books had missing pages. "Too fucking bad, that's what happens when you rent a used book." Fuck off, and fuck that swarmy prick sitting at the desk. 1537227634.0
This is going to be the longest month of my life holy shit... 1537219340.0
idk if it's a sunday afternoon thing, but i cannot watch a NFL game without falling asleep. 1537225971.0
That Blake Bortles fuckin boomed me 1537226941.0
I can't believe Nick Young hasn't signed anywhere yet. Where do y'all think he's going?  1537231250.0
How we doing this morning gentlemen  1537223771.0
Is 30 teams in 30 days happening this year? 1537233058.0
I can't believe the browns traded Josh Gordon for a fifth round pick, unbelievable. 1537246700.0
Instead of studying for university (tons of math), I just reorganize my living space, run some family err

In [85]:
top_comms_dict = {"topic": [],
                  "body": [],
                  "comm_id": [],
                  "created": []}

In [86]:
for top_level_comment in submission.comments:
    top_comms_dict["topic"].append("9gk8f5")
    top_comms_dict["body"].append(top_level_comment.body)
    top_comms_dict["comm_id"].append(top_level_comment.id)
    top_comms_dict["created"].append(top_level_comment.created)

In [87]:
len(top_comms_dict["topic"])

28

In [88]:
top_comms_data = pd.DataFrame(top_comms_dict)

In [89]:
top_comms_data

|   | body | comm_id | created | topic |
|---|---|---|---|---|
| 0 | Gotta love school bookstores. They sell you an... | e64y0wh | 1537228000.0 | 9gk8f5 |
| 1 | This is going to be the longest month of my li... | e64p9b6 | 1537219000.0 | 9gk8f5 |
| 2 | idk if it's a sunday afternoon thing, but i ca... | e64w3xe | 1537226000.0 | 9gk8f5 |
| 3 | That Blake Bortles fuckin boomed me | e64x7qr | 1537227000.0 | 9gk8f5 |
| 4 | I can't believe Nick Young hasn't signed anywh... | e652acg | 1537231000.0 | 9gk8f5 |
| 5 | How we doing this morning gentlemen | e64to5w | 1537224000.0 | 9gk8f5 |
| 6 | Is 30 teams in 30 days happening this year? | e654h4r | 1537233000.0 | 9gk8f5 |
| 7 | I can't believe the browns traded Josh Gordon ... | e65l6ts | 1537247000.0 | 9gk8f5 |
| 8 | Instead of studying for university (tons of ma... | e65inc9 | 1537245000.0 | 9gk8f5 |
| 9 | Anyone else ever find the first tests of the y... | e667q8s | 1537267000.0 | 9gk8f5 |


#### Scraping all comments in thread "9gk8f5"

In [90]:
all_comms_dict = {"topic": [],
                  "body": [],
                  "comm_id": [],
                  "created": []}

In [91]:
for all_level_comment in submission.comments.list():
    all_comms_dict["topic"].append("9gk8f5")
    all_comms_dict["body"].append(all_level_comment.body)
    all_comms_dict["comm_id"].append(all_level_comment.id)
    all_comms_dict["created"].append(all_level_comment.created)

In [92]:
len(all_comms_dict["topic"])

53

In [93]:
comments_data = pd.DataFrame(all_comms_dict)

In [94]:
comments_data

|   | body | comm_id | created | topic |
|---|---|---|---|---|
| 0 | Gotta love school bookstores. They sell you an... | e64y0wh | 1537228000.0 | 9gk8f5 |
| 1 | This is going to be the longest month of my li... | e64p9b6 | 1537219000.0 | 9gk8f5 |
| 2 | idk if it's a sunday afternoon thing, but i ca... | e64w3xe | 1537226000.0 | 9gk8f5 |
| 3 | That Blake Bortles fuckin boomed me | e64x7qr | 1537227000.0 | 9gk8f5 |
| 4 | I can't believe Nick Young hasn't signed anywh... | e652acg | 1537231000.0 | 9gk8f5 |
| 5 | How we doing this morning gentlemen | e64to5w | 1537224000.0 | 9gk8f5 |
| 6 | Is 30 teams in 30 days happening this year? | e654h4r | 1537233000.0 | 9gk8f5 |
| 7 | I can't believe the browns traded Josh Gordon ... | e65l6ts | 1537247000.0 | 9gk8f5 |
| 8 | Instead of studying for university (tons of ma... | e65inc9 | 1537245000.0 | 9gk8f5 |
| 9 | Anyone else ever find the first tests of the y... | e667q8s | 1537267000.0 | 9gk8f5 |
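
The jump from 28 top-level comments to 53 comments overall comes from replies nested under other comments. The sketch below is a toy model of that idea using plain dictionaries, not PRAW's actual implementation: each comment has a hypothetical `replies` list, and the function walks the whole tree breadth-first to produce one flat list.

```python
from collections import deque

def flatten_comments(top_level):
    """Return every comment in a nested tree as one flat list, breadth-first."""
    flat = []
    queue = deque(top_level)
    while queue:
        comment = queue.popleft()
        flat.append(comment)
        # Queue up this comment's replies so they are visited later.
        queue.extend(comment.get("replies", []))
    return flat

# Invented thread: two top-level comments, one of which has nested replies.
thread = [
    {"id": "a", "replies": [
        {"id": "a1", "replies": [{"id": "a1x", "replies": []}]},
    ]},
    {"id": "b", "replies": []},
]

ids = [comment["id"] for comment in flatten_comments(thread)]
print(len(thread), "top-level comments;", len(ids), "comments in total")
```

Iterating over `thread` directly corresponds to looping over top-level comments only, while the flattened list corresponds to what `submission.comments.list()` provides.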


### Activity 3

Please represent the "hottest" 25 topics from the "Sociology" subreddit (https://www.reddit.com/r/sociology) in a pandas table with the following columns: "Body", "Number of comments", "date created", "Title", and "URL"

In [25]:
# Please insert your answers for activity 3 here....







### Activity 4 (If time allows)

Please scrape all the comments in this thread: https://www.reddit.com/r/sociology/comments/9fba2z/what_important_sociological_ideas_are_in/

Represent this data in the form of a pandas table, with the following columns: body, comment id, and date created

In [33]:
# Please insert your answers for activity 4 here....




