In [1]:
import requests
import pandas as pd
import time
import random

In [2]:
url = 'https://www.reddit.com/r/boardgames.json'

In [3]:
res = requests.get(url)

In [4]:
res.status_code

429

A 429 status code means "Too Many Requests": the request was rejected. When you open https://www.reddit.com/r/boardgames.json in a browser, Reddit sees a browser user agent (for example, Chrome on a Mac). Python's `requests` library, however, sends its own default user agent. Because so many scripts already hit Reddit's API with that default, Reddit effectively blocks requests that use it.

We will change our request slightly so that it does not use the default user agent.

In [5]:
res = requests.get(url, headers={'User-agent': 'Pony Inc 1.0'})

In [6]:
res.status_code

200
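
To see what the rejected request looked like, you can inspect the user agent that `requests` sends by default. The snippet below is a minimal sketch that makes no network calls; the variable names are illustrative only, and using a `Session` is one possible way to reuse the custom header, not something the cells above require.

```python
import requests

# The user agent that requests sends when none is supplied,
# of the form 'python-requests/<version>'.
default_ua = requests.utils.default_user_agent()
print(default_ua)

# A session stores the custom header once so that every later
# request made through it reuses the same user agent.
session = requests.Session()
session.headers.update({'User-agent': 'Pony Inc 1.0'})
print(session.headers['User-agent'])
```

With this in place, `session.get(url)` would send the custom user agent without passing `headers=` each time.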

In [7]:
reddit_dict = res.json()

In [8]:
print(reddit_dict)

{'kind': 'Listing', 'data': {'after': 't3_146w0h6', 'dist': 27, 'modhash': '', 'geo_filter': None, 'children': [{'kind': 't3', 'data': {'approved_at_utc': None, 'subreddit': 'boardgames', 'selftext': '**Welcome to /r/boardgames\'s Daily Game Recommendations**\n\nThis is a place where you can ask any and all questions relating to the board gaming world including but not limited to[:](https://en.wiktionary.org/wiki/meeple#/media/File:Carcassonne_Miples.jpg)\n\n* general or specific game recommendations\n* help identifying a game or game piece\n* advice regarding situation limited to you (e.g, questions about a specific FLGS)\n* rule clarifications\n* and other quick questions that might not warrant their own post\n\n## Asking for Recommendations\n\nYou\'re much more likely to get good and personalized recommendations if you take the time to format a well-written ask. We **highly recommend** using [this template](/r/boardgames/wiki/personalized-game-recommendation-template-no-explainer) a

In [9]:
reddit_dict.keys()

dict_keys(['kind', 'data'])

In [10]:
reddit_dict['kind']

'Listing'

In [11]:
reddit_dict['data']

{'after': 't3_146w0h6',
 'dist': 27,
 'modhash': '',
 'geo_filter': None,
 'children': [{'kind': 't3',
   'data': {'approved_at_utc': None,
    'subreddit': 'boardgames',
    'selftext': '**Welcome to /r/boardgames\'s Daily Game Recommendations**\n\nThis is a place where you can ask any and all questions relating to the board gaming world including but not limited to[:](https://en.wiktionary.org/wiki/meeple#/media/File:Carcassonne_Miples.jpg)\n\n* general or specific game recommendations\n* help identifying a game or game piece\n* advice regarding situation limited to you (e.g, questions about a specific FLGS)\n* rule clarifications\n* and other quick questions that might not warrant their own post\n\n## Asking for Recommendations\n\nYou\'re much more likely to get good and personalized recommendations if you take the time to format a well-written ask. We **highly recommend** using [this template](/r/boardgames/wiki/personalized-game-recommendation-template-no-explainer) as a guide. [H

In [12]:
reddit_dict['data'].keys()

dict_keys(['after', 'dist', 'modhash', 'geo_filter', 'children', 'before'])

The most important keys are `children` and `after`.
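
The roles of these two keys can be sketched on a trimmed-down, hand-written stand-in for the response. The IDs and titles below are made up, and `listing_summary` is a hypothetical helper, not part of any library.

```python
# A hand-written stand-in for the JSON returned by Reddit (values are made up).
sample_listing = {
    'kind': 'Listing',
    'data': {
        'after': 't3_bbb',  # pagination token: the name of the last post
        'children': [
            {'kind': 't3', 'data': {'name': 't3_aaa', 'title': 'First post'}},
            {'kind': 't3', 'data': {'name': 't3_bbb', 'title': 'Second post'}},
        ],
    },
}


def listing_summary(listing):
    """Return the number of posts and the pagination token of a listing."""
    data = listing['data']
    return len(data['children']), data['after']


count, after = listing_summary(sample_listing)
print(count, after)
```

`children` holds the posts themselves, while `after` tells you where the next page starts.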

In [13]:
reddit_dict['data']['children']

[{'kind': 't3',
  'data': {'approved_at_utc': None,
   'subreddit': 'boardgames',
   'selftext': '**Welcome to /r/boardgames\'s Daily Game Recommendations**\n\nThis is a place where you can ask any and all questions relating to the board gaming world including but not limited to[:](https://en.wiktionary.org/wiki/meeple#/media/File:Carcassonne_Miples.jpg)\n\n* general or specific game recommendations\n* help identifying a game or game piece\n* advice regarding situation limited to you (e.g, questions about a specific FLGS)\n* rule clarifications\n* and other quick questions that might not warrant their own post\n\n## Asking for Recommendations\n\nYou\'re much more likely to get good and personalized recommendations if you take the time to format a well-written ask. We **highly recommend** using [this template](/r/boardgames/wiki/personalized-game-recommendation-template-no-explainer) as a guide. [Here is a version](/r/boardgames/wiki/personalized-game-recommendation-template) with addit

In [14]:
len(reddit_dict['data']['children'])

27

In [15]:
reddit_dict['data']['children'][1]

{'kind': 't3',
 'data': {'approved_at_utc': None,
  'subreddit': 'boardgames',
  'selftext': "# What's going on?\n\nA recent Reddit policy change threatens to kill many beloved third-party mobile apps, making a great many quality-of-life features not seen in the official mobile app **permanently inaccessible** to users.\n\nOn May 31, 2023, Reddit announced they were raising the price to make calls to their API from being free to a level that will kill every third party app on Reddit, from Apollo to Reddit is Fun to Narwhal to BaconReader.\n\nEven if you're not a mobile user and don't use any of those apps, this is a step toward killing other ways of customizing Reddit, such as Reddit Enhancement Suite or the use of the old.reddit.com desktop interface.\n\nThis isn't only a problem on the user level: many subreddit moderators depend on tools only available outside the official app to keep their communities on-topic and spam-free.\n\n# What's the plan?\n\nOn June 12th, [many subreddits](

In [16]:
reddit_dict['data']['children'][0].keys()

dict_keys(['kind', 'data'])

In [17]:
reddit_dict['data']['children'][0]['kind']

't3'

In [18]:
reddit_dict['data']['children'][0]['data']

{'approved_at_utc': None,
 'subreddit': 'boardgames',
 'selftext': '**Welcome to /r/boardgames\'s Daily Game Recommendations**\n\nThis is a place where you can ask any and all questions relating to the board gaming world including but not limited to[:](https://en.wiktionary.org/wiki/meeple#/media/File:Carcassonne_Miples.jpg)\n\n* general or specific game recommendations\n* help identifying a game or game piece\n* advice regarding situation limited to you (e.g, questions about a specific FLGS)\n* rule clarifications\n* and other quick questions that might not warrant their own post\n\n## Asking for Recommendations\n\nYou\'re much more likely to get good and personalized recommendations if you take the time to format a well-written ask. We **highly recommend** using [this template](/r/boardgames/wiki/personalized-game-recommendation-template-no-explainer) as a guide. [Here is a version](/r/boardgames/wiki/personalized-game-recommendation-template) with additional explanations in case the

In [19]:
reddit_dict['data']['children'][0]['data']['subreddit']

'boardgames'

The cell directly above returns the subreddit name, which serves as the class label (your target).

In [20]:
reddit_dict['data']['children'][0]['data']['title']

'Daily Game Recommendations Thread (June 11, 2023)'

This is the title of the first post.

In [21]:
reddit_dict['data']['children'][0]['data']['selftext']

'**Welcome to /r/boardgames\'s Daily Game Recommendations**\n\nThis is a place where you can ask any and all questions relating to the board gaming world including but not limited to[:](https://en.wiktionary.org/wiki/meeple#/media/File:Carcassonne_Miples.jpg)\n\n* general or specific game recommendations\n* help identifying a game or game piece\n* advice regarding situation limited to you (e.g, questions about a specific FLGS)\n* rule clarifications\n* and other quick questions that might not warrant their own post\n\n## Asking for Recommendations\n\nYou\'re much more likely to get good and personalized recommendations if you take the time to format a well-written ask. We **highly recommend** using [this template](/r/boardgames/wiki/personalized-game-recommendation-template-no-explainer) as a guide. [Here is a version](/r/boardgames/wiki/personalized-game-recommendation-template) with additional explanations in case the template isn\'t enough.\n\n## Bold Your Games\n\nHelp people ident
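
The three fields inspected above (`subreddit`, `title`, `selftext`) can be paired into a text and a label for each post. The helper below is a hypothetical sketch: `post_to_example` and the choice to join title and body with a newline are assumptions made for illustration.

```python
def post_to_example(post_data):
    """Return (text, label) for one post's 'data' dictionary."""
    title = post_data.get('title', '')
    body = post_data.get('selftext', '')
    # Join title and body; strip() removes the trailing newline
    # left behind when a post has no body text.
    text = f"{title}\n{body}".strip()
    return text, post_data['subreddit']


sample = {'subreddit': 'boardgames', 'title': 'What game is this?', 'selftext': ''}
text, label = post_to_example(sample)
print(label, '|', text)
```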

We want to get all of these posts into a pandas DataFrame, which we can then save to a CSV file.

In [22]:
posts = [p['data'] for p in reddit_dict['data']['children']]

In [23]:
pd.DataFrame(posts)

Unnamed: 0,approved_at_utc,subreddit,selftext,author_fullname,saved,mod_reason_title,gilded,clicked,title,link_flair_richtext,...,created_utc,num_crossposts,media,is_video,is_gallery,media_metadata,gallery_data,url_overridden_by_dest,crosspost_parent_list,crosspost_parent
0,,boardgames,**Welcome to /r/boardgames's Daily Game Recomm...,t2_6l4z3,False,,0,False,"Daily Game Recommendations Thread (June 11, 2023)","[{'e': 'text', 't': 'Daily Game Recs'}]",...,1686460000.0,1,,False,,,,,,
1,,boardgames,# What's going on?\n\nA recent Reddit policy c...,t2_16wy8q,False,,1,False,r/BoardGames will be going dark from June 12-1...,"[{'e': 'text', 't': 'Announcement'}]",...,1686037000.0,1,,False,,,,,,
2,,boardgames,,t2_xe4qz,False,,0,False,Painting Massive Darkness 2,[],...,1686526000.0,0,,False,True,"{'6lqlzfkj4h5b1': {'status': 'valid', 'e': 'Im...",{'items': [{'caption': 'Starting to paint it u...,https://www.reddit.com/gallery/1478wym,,
3,,boardgames,Hello internet! My husband and I are wondering...,t2_412ktl6k,False,,0,False,What game is this?,"[{'e': 'text', 't': 'Game/Piece ID'}]",...,1686505000.0,0,,False,True,"{'xd6wk25aef5b1': {'status': 'valid', 'e': 'Im...","{'items': [{'media_id': 'xd6wk25aef5b1', 'id':...",https://www.reddit.com/gallery/14704i0,,
4,,boardgames,Target is running their Buy Two Get One free s...,t2_928nh,False,,0,False,Target B2G1 (B3C1) Curated List,[],...,1686506000.0,0,,False,,,,,,
5,,boardgames,https://imgur.com/gallery/ml6fXoU\n\nA homemad...,t2_38ccrnim,False,,0,False,Turncoats homemade embroidery copy,"[{'e': 'text', 't': 'How-To/DIY'}]",...,1686488000.0,0,,False,,,,,,
6,,boardgames,Kickstarter &amp; Gamefound Campaigns Launchin...,t2_5vfmjmel,False,,0,False,📅 Crowdfunded Games Launching This Week [Jun 1...,"[{'e': 'text', 't': 'Crowdfunding'}]",...,1686495000.0,0,,False,,,,,,
7,,boardgames,I don't actually intend to do that I just thin...,t2_cmt2fdsfl,False,,0,False,"Say I want to stop being friends with people, ...","[{'e': 'text', 't': 'Question'}]",...,1686509000.0,0,,False,,,,,,
8,,boardgames,I tried to search earlier posts regarding this...,t2_k15ez,False,,0,False,Games where you can build your character(s) di...,"[{'e': 'text', 't': 'Question'}]",...,1686520000.0,0,,False,,,,,,
9,,boardgames,##What is this?\nThis is a weekly crowdfunding...,t2_36p8x,False,,0,False,Weekly Crowdfunding Roundup: June 11 2023 | 20...,"[{'e': 'text', 't': 'KS Roundup'}]",...,1686480000.0,0,,False,,,,,,


In [24]:
pd.DataFrame(posts).to_csv('posts.csv')
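
Each post carries many columns, most of which may not be needed. As a hedged sketch, the example below keeps a small subset and writes it without the row index; the sample rows and the `wanted` column list are assumptions for illustration, and an in-memory buffer stands in for a file on disk.

```python
import io

import pandas as pd

# Hand-written stand-ins for the dictionaries in `posts`.
sample_posts = [
    {'name': 't3_aaa', 'subreddit': 'boardgames', 'title': 'First',
     'selftext': 'Hello', 'gilded': 0},
    {'name': 't3_bbb', 'subreddit': 'boardgames', 'title': 'Second',
     'selftext': '', 'gilded': 1},
]

# Keep only the columns of interest, in a fixed order.
wanted = ['name', 'subreddit', 'title', 'selftext']
df = pd.DataFrame(sample_posts).reindex(columns=wanted)

# index=False keeps the row index out of the file, so reading it back
# does not produce an extra 'Unnamed: 0' column.
buffer = io.StringIO()
df.to_csv(buffer, index=False)
buffer.seek(0)
round_trip = pd.read_csv(buffer)
print(list(round_trip.columns))
```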

In [25]:
reddit_dict['data']['after']

't3_146w0h6'

This is the `name` of the last post on the page, which the next cell confirms.

In [26]:
pd.DataFrame(posts)['name']

0     t3_146kz2v
1     t3_1427tjw
2     t3_1478wym
3     t3_14704i0
4     t3_1470k3p
5     t3_146t7cn
6     t3_146w171
7     t3_1471qt2
8     t3_14767ck
9     t3_146qt6d
10    t3_147asbx
11    t3_1479jst
12    t3_1477jvo
13    t3_146ppuh
14    t3_146z78q
15    t3_147buy0
16    t3_1476yio
17    t3_1479fay
18    t3_147ceto
19    t3_146td2y
20    t3_1471iie
21    t3_1472kn4
22    t3_146sl8u
23    t3_1473vei
24    t3_14796bt
25    t3_1475n1w
26    t3_146w0h6
Name: name, dtype: object

In [27]:
reddit_dict['data']['after']

't3_146w0h6'

Appending this value as the `after` query parameter gives the URL for the next 25 posts.

In [28]:
url + '?after=' + reddit_dict['data']['after']

'https://www.reddit.com/r/boardgames.json?after=t3_146w0h6'
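
String concatenation works here, but building the query string with the standard library avoids mistakes once more parameters are involved. The function below is a sketch under that assumption; `next_page_url` and its `extra_params` argument are hypothetical names.

```python
from urllib.parse import urlencode


def next_page_url(base_url, after=None, **extra_params):
    """Build the URL for a page, adding `after` only when it is set."""
    params = dict(extra_params)
    if after:
        params['after'] = after
    if not params:
        return base_url  # first page: no query string needed
    return f"{base_url}?{urlencode(params)}"


base = 'https://www.reddit.com/r/boardgames.json'
print(next_page_url(base))
print(next_page_url(base, 't3_146w0h6'))
```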

## Looping through the posts, 25 posts at a time

In [29]:
url = 'https://www.reddit.com/r/boardgames.json'
headers = {'User-agent': 'Pony Inc 1.0'}
posts = []
after = None

for a in range(4):
    # After the first page, request the page that follows the last post seen
    if after:
        current_url = f"{url}?after={after}"
    else:
        current_url = url
    print(current_url)

    res = requests.get(current_url, headers=headers)

    if res.status_code != 200:
        print('Status error', res.status_code)
        break

    current_dict = res.json()
    current_posts = [p['data'] for p in current_dict['data']['children']]
    posts.extend(current_posts)
    after = current_dict['data']['after']

    if a > 0:
        # Append only this page's posts; the CSV already holds the earlier pages
        prev_posts = pd.read_csv('boardgames.csv')
        current_df = pd.DataFrame(current_posts)
        combined_df = pd.concat([prev_posts, current_df], ignore_index=True)
        combined_df.to_csv('boardgames.csv', index=False)
    else:
        pd.DataFrame(posts).to_csv('boardgames.csv', index=False)

    # Stop when Reddit reports that there is no further page
    if after is None:
        break

    # generate a random sleep duration to look more 'natural'
    sleep_duration = random.randint(2, 6)
    print(sleep_duration)
    time.sleep(sleep_duration)

https://www.reddit.com/r/boardgames.json
3
https://www.reddit.com/r/boardgames.json?after=t3_146w0h6
4
https://www.reddit.com/r/boardgames.json?after=t3_1463bwu
4
https://www.reddit.com/r/boardgames.json?after=t3_145i9x5
4
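
An alternative to rewriting the CSV on every pass is to collect everything in memory, skip posts that were already seen, and write once at the end. The sketch below shows only the collection step and runs against fake pages instead of Reddit; `collect_posts`, `fetch_page`, and `fake_pages` are hypothetical names introduced for illustration.

```python
def collect_posts(fetch_page, max_pages=4):
    """Collect posts page by page, skipping any post already seen.

    `fetch_page(after)` is a hypothetical callable that returns a
    Listing dictionary shaped like Reddit's JSON response.
    """
    posts, seen, after = [], set(), None
    for _ in range(max_pages):
        listing = fetch_page(after)
        for child in listing['data']['children']:
            post = child['data']
            if post['name'] not in seen:  # de-duplicate on the post name
                seen.add(post['name'])
                posts.append(post)
        after = listing['data']['after']
        if after is None:  # no further page to request
            break
    return posts


# Two fake pages that overlap on 't3_b', keyed by the `after` token.
fake_pages = {
    None: {'data': {'after': 't3_b', 'children': [
        {'data': {'name': 't3_a'}}, {'data': {'name': 't3_b'}}]}},
    't3_b': {'data': {'after': None, 'children': [
        {'data': {'name': 't3_b'}}, {'data': {'name': 't3_c'}}]}},
}

collected = collect_posts(lambda after: fake_pages[after])
print([p['name'] for p in collected])
```

Against the real API, `fetch_page` could wrap the `requests.get(...).json()` call from the loop above, including the status check and the sleep between requests, and the result could be saved with a single `pd.DataFrame(collected).to_csv('boardgames.csv', index=False)`.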
