## Failed Approach (Using API)

The following blocks try the API-wrapper approach (psaw), which failed spectacularly. I suggest skipping to the next section, which works well.

You can also use `PushshiftAPI()` from `pushshift_py` directly, without psaw.

```Python
from pushshift_py import PushshiftAPI
import datetime as dt
import psaw
import pandas as pd
import requests
import json
import csv
import time
api = psaw.PushshiftAPI()

startEpoch = int(dt.datetime(2020,1,1).timestamp())
```
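A side note on the timestamp: `dt.datetime(2020,1,1).timestamp()` interprets the naive datetime in the machine's local time zone, so the resulting epoch depends on where the code runs. The sketch below is an addition, not part of the original notebook, and shows a variant with the time zone made explicit; the helper name `utc_epoch` is made up for illustration.

```python
import datetime as dt

def utc_epoch(year, month, day):
    """Return the Unix epoch (whole seconds) for midnight UTC on the given date."""
    moment = dt.datetime(year, month, day, tzinfo=dt.timezone.utc)
    return int(moment.timestamp())

start_epoch_utc = utc_epoch(2020, 1, 1)
print(start_epoch_utc)  # 1577836800
```

Whether local time or UTC is the better choice depends on how the stored times are read back later; the point is only to pick one deliberately.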
    
The following block shows how to fetch data with Pushshift by listing the fields we want. `search_submissions` returns a generator of submission objects, which can be turned into a list if needed.

```Python
features = ['url','author', 'title', 'subreddit', 'id', 'created', 'score']
subreddit = 'NBA'

data = api.search_submissions(after=startEpoch,
                            subreddit=subreddit,
                            filter= features,
                            limit=10)

for datum in data:
    print(datum.id, datum.subreddit, datum.title, datum.author, datum.url, datum.created, datum.score)

import praw

# Fill in the credentials of your own Reddit app; never publish real ones.
reddit = praw.Reddit(
    client_id="YOUR_CLIENT_ID",
    client_secret="YOUR_CLIENT_SECRET",
    password="YOUR_PASSWORD",
    user_agent="testscript by u/YOUR_USERNAME",
    username="YOUR_USERNAME",
)
```

With praw we can fetch a post's body (its `selftext`) by submission id:
```Python
reddit.submission(id='eiev5d').selftext
```



In the following block, we create a table and store the results in it. For some reason, though, the API often stalls and freezes while we loop through the data.
```Python
import sqlite3

conn = sqlite3.connect('redditPosts.sqlite')
cur = conn.cursor()

cur.execute('''CREATE TABLE IF NOT EXISTS Posts(
                id TEXT PRIMARY KEY,
                subreddit TEXT,
                title TEXT,
                author TEXT,
                url TEXT,
                created int)
                ''')

features = ['url','author', 'title', 'subreddit', 'id', 'created']
subreddit = 'stocks'
latest = dt.datetime(2021,5,8).timestamp()
earliest = dt.datetime(2020,1,1).timestamp()

startEpoch = earliest

while startEpoch <= latest:
    data = api.search_submissions(after=int(startEpoch),
                            subreddit=subreddit,
                            filter=features,
                            limit=100)

    # Stays equal to startEpoch if the batch turns out to be empty.
    currentTime = startEpoch
    for datum in data:
        cur.execute('''INSERT OR IGNORE INTO Posts VALUES (?,?,?,?,?,?)''',
                    (datum.id, datum.subreddit, datum.title, datum.author, datum.url, datum.created))
        currentTime = datum.created

    conn.commit()
    # No newer post came back, so stop instead of looping forever.
    if currentTime == startEpoch:
        break
    startEpoch = currentTime + 1
    print(dt.datetime.fromtimestamp(startEpoch))

```
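
The `INSERT OR IGNORE` statement is what makes re-running the loop safe: because `id` is the primary key, a row whose id is already stored is skipped instead of raising an error. A minimal sketch of that behavior, using an in-memory database and made-up rows (assumptions for illustration, not data from the run above):

```python
import sqlite3

conn_demo = sqlite3.connect(':memory:')
cur_demo = conn_demo.cursor()
cur_demo.execute('''CREATE TABLE IF NOT EXISTS Posts(
                id TEXT PRIMARY KEY,
                subreddit TEXT,
                title TEXT,
                author TEXT,
                url TEXT,
                created int)''')

# The second row reuses the id of the first one on purpose.
rows = [
    ('abc123', 'stocks', 'First post', 'user_a', 'https://example.com/1', 1577836800),
    ('abc123', 'stocks', 'Duplicate id', 'user_a', 'https://example.com/1', 1577836800),
    ('def456', 'stocks', 'Second post', 'user_b', 'https://example.com/2', 1577836900),
]
for row in rows:
    cur_demo.execute('INSERT OR IGNORE INTO Posts VALUES (?,?,?,?,?,?)', row)
conn_demo.commit()

stored = cur_demo.execute('SELECT id, title FROM Posts ORDER BY created').fetchall()
print(stored)
```

Only two rows end up in the table; the row with the repeated id is dropped and the first version is kept.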

## Another Approach (Getting JSON)

The method above is shaky at best: the API wrapper often just freezes. Calling the Pushshift endpoint directly with requests turned out to be much easier. The following code blocks contain everything needed to store the Reddit data.

```Python
import requests
import datetime as dt
import sqlite3
import json
import time
```

```Python
def getPushShiftData(after, before, sub):
    # Request up to 100 submissions from `sub` created between `after` and `before` (epoch seconds).
    url = 'https://api.pushshift.io/reddit/search/submission/'
    params = {'size': 100, 'after': int(after), 'before': int(before), 'subreddit': sub}
    r = requests.get(url, params=params, timeout=30)
    r.raise_for_status()  # let HTTP errors (e.g. rate limiting) reach the caller's retry loop
    return r.json()['data']

def extractInfo(datum, features):
    # Keep only the requested fields of one submission.
    info = {}

    for feature in features:
        info[feature] = datum[feature]

    return info

def getLatestTime(data):
    # Creation time of the last submission in the batch.
    return data[-1]['created_utc']

def dataStoragePipeline(after, before, sub, conn):
    cursor = conn.cursor()
    while after < before:
        data = getPushShiftData(after, before, sub)
        if not data:
            break
        for datum in data:
            cursor.execute('''INSERT OR IGNORE INTO Posts
                                VALUES (?,?,?,?,?,?)''',
                           (datum['id'], datum['subreddit'], datum['title'], datum['author'], datum['full_link'], datum['created_utc']))

        # Move the window start past the last post of this batch.
        after = getLatestTime(data) + 1
        conn.commit()
        print("The latest post is submitted at", dt.datetime.fromtimestamp(after-1))
        time.sleep(0.1)
```
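
The pagination idea in `dataStoragePipeline` can be checked without touching the network. The sketch below is an illustration rather than the notebook's code: `fake_fetch` and `collect_all` are hypothetical names, and the fake assumes that results come back oldest first and that `after` is an inclusive lower bound. Both are assumptions of this toy, not confirmed properties of the Pushshift API.

```python
def fake_fetch(after, before, posts, size=3):
    """Stand-in for the HTTP call: up to `size` posts with after <= created_utc < before, oldest first."""
    matching = [p for p in posts if after <= p['created_utc'] < before]
    matching.sort(key=lambda p: p['created_utc'])
    return matching[:size]

def collect_all(after, before, posts):
    """Walk the time window batch by batch, moving `after` past the last post seen."""
    collected = []
    while after < before:
        batch = fake_fetch(after, before, posts)
        if not batch:
            break
        collected.extend(batch)
        # Same step as in dataStoragePipeline: restart one second after the last post.
        after = batch[-1]['created_utc'] + 1
    return collected

# Eight made-up posts, ten seconds apart.
posts = [{'id': str(i), 'created_utc': 1000 + 10 * i} for i in range(8)]
result = collect_all(after=1000, before=2000, posts=posts)
print(len(result))
```

One caveat of stepping by `+ 1`: if several posts share the same second and a batch ends in the middle of them, the remaining posts of that second would be skipped.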
        
        

```Python
conn = sqlite3.connect('redditPosts.sqlite')
cur = conn.cursor()
subreddit = 'wallstreetbets'
end = int(time.time())
start = dt.datetime(2021,1,1).timestamp()
cur.execute('''SELECT MIN(created), MAX(created) FROM Posts
                WHERE subreddit = ?''', (subreddit,))
dataEarly, dataLate = cur.fetchone()

# MIN/MAX come back as None when nothing is stored yet, so check before comparing.
if dataEarly is not None:
    if end < dataEarly:
        end = dataEarly
    elif start < dataLate:
        start = dataLate
```
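
One detail of the resume logic is easy to miss: an aggregate query such as `SELECT MIN(...), MAX(...)` always returns one row, and when nothing matches that row is `(None, None)`, which is a truthy tuple. The sketch below shows a check that handles both cases; it uses an in-memory database with made-up values, and the helper name `stored_range` is hypothetical.

```python
import sqlite3

def stored_range(conn, subreddit):
    """Return (earliest, latest) created times stored for `subreddit`, or None if nothing is stored."""
    row = conn.execute('SELECT MIN(created), MAX(created) FROM Posts WHERE subreddit = ?',
                       (subreddit,)).fetchone()
    if row is None or row[0] is None:
        return None
    return row

demo = sqlite3.connect(':memory:')
demo.execute('CREATE TABLE Posts(id TEXT PRIMARY KEY, subreddit TEXT, created int)')
empty_result = stored_range(demo, 'wallstreetbets')
print(empty_result)  # None

demo.execute("INSERT INTO Posts VALUES ('a1', 'wallstreetbets', 1609459200)")
demo.execute("INSERT INTO Posts VALUES ('a2', 'wallstreetbets', 1609462800)")
filled_result = stored_range(demo, 'wallstreetbets')
print(filled_result)  # (1609459200, 1609462800)
```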

```Python
while start < end:
    try:
        dataStoragePipeline(after = start, before = end, sub = subreddit, conn = conn)
        break  # the pipeline finished without errors, so we are done
    except KeyboardInterrupt:
        print("Interrupted by keyboard. Stopping.")
        break

    except Exception:
        print("Error occurred. Probably due to frequent requests. Will resume working in 1 seconds.")
        time.sleep(1)
        # Resume from what is already stored in the database.
        cur.execute('''SELECT MIN(created), MAX(created) FROM Posts
                        WHERE subreddit = ?''', (subreddit,))
        dataEarly, dataLate = cur.fetchone()

        if dataEarly is not None:
            if end < dataEarly:
                end = dataEarly
            elif start < dataLate:
                start = dataLate
```
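
The loop above waits a fixed second after every failure. A common alternative is to lengthen the wait after each consecutive failure (exponential backoff). The following is a sketch of that swapped-in technique, not what the notebook does; `retry_with_backoff` and `flaky` are hypothetical names, and the delays are kept tiny so the example runs quickly.

```python
import time

def retry_with_backoff(task, max_attempts=5, base_delay=0.01):
    """Call `task` until it succeeds, doubling the wait after each failure."""
    for attempt in range(max_attempts):
        try:
            return task()
        except Exception as exc:
            if attempt == max_attempts - 1:
                raise  # out of attempts: let the last error propagate
            delay = base_delay * (2 ** attempt)
            print(f"Attempt {attempt + 1} failed ({exc}); retrying in {delay:.2f}s")
            time.sleep(delay)

calls = {'count': 0}

def flaky():
    """Simulated request that fails twice, then succeeds."""
    calls['count'] += 1
    if calls['count'] < 3:
        raise RuntimeError('simulated request failure')
    return 'ok'

outcome = retry_with_backoff(flaky)
print(outcome, calls['count'])
```

In a real run, `task` could be a small function that calls `dataStoragePipeline` after recomputing `start` from the database, with a base delay closer to a second.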
        

Output (abridged):

```
The latest post is submitted at 2021-01-27 10:57:48
The latest post is submitted at 2021-01-27 10:58:30
The latest post is submitted at 2021-01-27 10:59:12
...
The latest post is submitted at 2021-01-28 14:11:56
The latest post is submitted at 2021-01-28 14:12:29
Error occurred. Probably due to frequent requests. Will resume working in 1 seconds.
The latest post is submitted at 2021-01-28 14:13:05
The latest post is submitted at 2021-01-28 14:13:37
...
The latest post is submitted at 2021-01-29 09:07:17
The latest post is submitted at 2021-01-29 09:07:55
```

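The pipeline can also be run once directly, without the retry loop. Note that the upper bound in this section is `end`; `latest` was only defined in the failed approach above.

```Python
dataStoragePipeline(after = start, before = end, sub = subreddit, conn = conn)
```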