# SAWP: Sunbelt API Wrapper for Python
Author: Jacob Bayer

##### Introduction
Sunbelt is a database that stores information mined from Reddit. Other services capture a single snapshot or a live view: Pushshift stores posts and comments immediately after they are posted, and Reveddit gives users a new way to see live data on Reddit. Sunbelt instead stores information about how posts, comments, redditors, and subreddits have changed over time.

**Sunbelt is the only service that does this** *(as far as I know)*, but it is still in a very early stage of development and does not have data at the same scale as Pushshift.

To start using Sunbelt, install the [Sunbelt API Wrapper for Python (SAWP)](https://pypi.org/project/sawp/) by running:

` pip install sawp `

Then import and instantiate the SunbeltClient from SAWP as follows.

In [4]:
from sawp import SunbeltClient
sunbelt = SunbeltClient()

SAWP enables a user to query the Sunbelt database using a GraphQL API. In this example, I select the first post in the Sunbelt database.

Posts stored in the Sunbelt database are called "SunPosts" to differentiate them from other Reddit objects you may be analyzing (for example, PRAW Submissions).

In [28]:
post = sunbelt.posts.first()
post

SunPost(1)

The SunPost object can be used to access attributes of the post.


In [29]:
post.permalink

'/r/AskReddit/comments/10kzboh/happy_birthday_askreddit/'

In [30]:
post.title

'Happy Birthday AskReddit!'

We can list the comments for this post using the post.comments attribute.

In [8]:
post.comments

[SunComment(1),
 SunComment(2),
 SunComment(3),
 SunComment(4),
 SunComment(5),
 SunComment(6),
 SunComment(7),
 SunComment(8),
 SunComment(9),
 SunComment(10),
 SunComment(11),
 SunComment(12),
 SunComment(13),
 SunComment(14),
 SunComment(15),
 SunComment(16),
 SunComment(17),
 SunComment(18),
 SunComment(19),
 SunComment(20),
 SunComment(21),
 SunComment(22),
 SunComment(23),
 SunComment(24),
 SunComment(25),
 SunComment(26),
 SunComment(27),
 SunComment(28),
 SunComment(29),
 SunComment(30),
 SunComment(31),
 SunComment(32),
 SunComment(33),
 SunComment(34),
 SunComment(35),
 SunComment(36),
 SunComment(37),
 SunComment(38),
 SunComment(40),
 SunComment(46),
 SunComment(47),
 SunComment(39),
 SunComment(41),
 SunComment(42),
 SunComment(43),
 SunComment(44),
 SunComment(45),
 SunComment(48),
 SunComment(49),
 SunComment(50)]

Sunbelt stores multiple versions of data for any given object, representing different times that the SunCrawler saw the entity on Reddit. These versions describe the non-permanent attributes of an object such as upvotes, karma, or subreddit subscribers.

Let's take a look at how many versions we have for a comment on SunPost(2).

In [14]:
post = sunbelt.posts.get(2)
comment = post.comments[3]
comment

SunComment(59)

In [15]:
comment.versions

[CommentVersion(SunComment = 59 , SunVersion = 1),
 CommentVersion(SunComment = 59 , SunVersion = 2)]

Let's look at some of the version data.

In [16]:
print('\n Upvotes over time for Comment:', comment.reddit_comment_id, '\n')
for v in comment.versions:
    print(v.ups, 'upvotes at', v.sun_created_at)


 Upvotes over time for Comment: t1_j5t0ysc 

22389 upvotes at 25-01-2023 18:06:12
24792 upvotes at 26-01-2023 14:45:14
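Once versions are available, the change between them can be computed directly. The following is a minimal sketch rather than part of SAWP: the helper name `upvote_changes` is hypothetical, the values are copied from the output above, and it assumes the timestamps are day-month-year strings exactly as printed.

```python
from datetime import datetime

# Values copied from the output above. In practice they would come from
# v.ups and v.sun_created_at for each v in comment.versions.
versions = [
    (22389, "25-01-2023 18:06:12"),
    (24792, "26-01-2023 14:45:14"),
]

# Assumption: timestamps are day-month-year strings, as printed above.
TIMESTAMP_FORMAT = "%d-%m-%Y %H:%M:%S"


def upvote_changes(versions):
    """Return (upvote_delta, hours_elapsed) for each consecutive pair of versions."""
    changes = []
    for (ups_a, seen_a), (ups_b, seen_b) in zip(versions, versions[1:]):
        start = datetime.strptime(seen_a, TIMESTAMP_FORMAT)
        end = datetime.strptime(seen_b, TIMESTAMP_FORMAT)
        hours = (end - start).total_seconds() / 3600
        changes.append((ups_b - ups_a, hours))
    return changes


for delta, hours in upvote_changes(versions):
    print(f"{delta:+d} upvotes over {hours:.1f} hours")
```

With the two versions shown, this reports a gain of 2403 upvotes between crawls.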


By looking at the comment body text of each version, we can see that this comment has been deleted by the author.

In [19]:
[x.body for x in comment.versions]

['Being a YouTube "prankster"', '[deleted]']
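The same check can be automated across many comments. This is a small sketch under the assumption that a deleted comment's body is stored as the literal string `[deleted]`, as in the output above; `deleted_at_version` is a hypothetical helper, not a SAWP function.

```python
DELETED_MARKER = "[deleted]"


def deleted_at_version(bodies):
    """Return the 1-based position of the first deleted body, or None if never deleted."""
    for number, body in enumerate(bodies, start=1):
        if body == DELETED_MARKER:
            return number
    return None


# Bodies copied from the output above, i.e. [x.body for x in comment.versions].
bodies = ['Being a YouTube "prankster"', '[deleted]']
print(deleted_at_version(bodies))
```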

The details from the most recent version of any object are also stored as attributes with the "most_recent_" prefix.

In [22]:
print(comment.most_recent_ups)
print(comment.most_recent_body)

24792
[deleted]


#### Sunbelt to Pandas

Sunbelt uses a GraphQL API to query only the data specifically requested by the user. When a Sun object is first initialized by SAWP, it contains only the bare minimum of information needed to initialize the object, unless the user specifically requests more. When an attribute is requested, a new API call is made to obtain that attribute from the database. A batch request for many attributes can be made by passing the requested attributes as arguments.

In [33]:
all_comments = sunbelt.comments.all(# Requested fields can be passed as args
                                 'sun_post_id',
                                 'sun_comment_id',
                                 'reddit_post_id',
                                 'reddit_comment_id',
                                 'reddit_parent_id',
                                 'most_recent_body',
                                 'most_recent_ups',
                                 'most_recent_downs',
                                 'created_utc',
                                 'most_recent_edited',
                                 'most_recent_gilded',
                                 'depth')
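To see why batching matters, the sketch below imitates the fetch-on-access behavior described above with a toy class. It is not SAWP's implementation: `LazyRecord`, `fake_fetch`, and `FAKE_DB` are hypothetical names, and the "database" is an in-memory dict so the number of simulated API calls can be counted.

```python
class LazyRecord:
    """Toy stand-in for an object whose attributes are fetched on first access."""

    def __init__(self, uid, fetch, *prefetched):
        self.uid = uid
        self._fetch = fetch  # callable: (uid, field names) -> dict of values
        self._cache = {}
        if prefetched:
            # One batched call covers every field requested up front.
            self._cache.update(fetch(uid, prefetched))

    def __getattr__(self, name):
        # Python calls this only when normal attribute lookup fails.
        if name.startswith("_"):
            raise AttributeError(name)
        if name not in self._cache:
            # One call per attribute that has not been seen yet.
            self._cache.update(self._fetch(self.uid, (name,)))
        return self._cache[name]


# Hypothetical in-memory "database" with values taken from the tutorial output.
FAKE_DB = {1: {"most_recent_ups": 28, "depth": "0"}}
calls = []  # records the fields requested by each simulated API call


def fake_fetch(uid, fields):
    calls.append(tuple(fields))
    return {field: FAKE_DB[uid][field] for field in fields}


lazy = LazyRecord(1, fake_fetch)
lazy.most_recent_ups  # first simulated call
lazy.depth            # second simulated call

eager = LazyRecord(1, fake_fetch, "most_recent_ups", "depth")  # one batched call
eager.most_recent_ups  # served from the cache, no new call

print(len(calls))
```

In this toy model, reading two attributes lazily costs two calls, while requesting both up front costs one.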

Sunbelt objects have a `to_dict` method, which can be used to create a pandas DataFrame.

In [25]:
comment = all_comments[0]
comment.to_dict()

{'kind': 'comment',
 'uid': 1,
 'created_utc': 1674655786.0,
 'depth': '0',
 'most_recent_body': "Happy birthday to the world's internet town square.",
 'most_recent_downs': 0,
 'most_recent_edited': 0,
 'most_recent_gilded': '0',
 'most_recent_ups': 28,
 'reddit_comment_id': 't1_j5tm5b1',
 'reddit_parent_id': None,
 'reddit_post_id': 't3_10kzboh',
 'sun_comment_id': '1',
 'sun_post_id': 1,
 'sun_unique_id': 1}

In [35]:
import pandas as pd
comments_df = pd.DataFrame(x.to_dict() for x in all_comments)
comments_df

| | kind | uid | created_utc | depth | most_recent_body | most_recent_downs | most_recent_edited | most_recent_gilded | most_recent_ups | reddit_comment_id | reddit_parent_id | reddit_post_id | sun_comment_id | sun_post_id | sun_unique_id |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | comment | 1 | 1.674656e+09 | 0 | Happy birthday to the world's internet town sq... | 0 | 0 | 0 | 28 | t1_j5tm5b1 | | t3_10kzboh | 1 | 1 | 1 |
| 1 | comment | 2 | 1.674656e+09 | 0 | ask reddit is aquarius | 0 | 0 | 0 | 8 | t1_j5tlz13 | | t3_10kzboh | 2 | 1 | 2 |
| 2 | comment | 3 | 1.674655e+09 | 0 | Cool | 0 | 0 | 0 | 7 | t1_j5tlfri | | t3_10kzboh | 3 | 1 | 3 |
| 3 | comment | 4 | 1.674658e+09 | 0 | Thanks for being there for 15 years so we coul... | 0 | 0 | 0 | 8 | t1_j5tq7nj | | t3_10kzboh | 4 | 1 | 4 |
| 4 | comment | 5 | 1.674656e+09 | 0 | happy birthday reddits most disturbing comment... | 0 | 0 | 0 | 8 | t1_j5tmm97 | | t3_10kzboh | 5 | 1 | 5 |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 655 | comment | 648 | 1.674667e+09 | 0 | What a fucking asshole. | 0 | 0 | 0 | 1 | t1_j5udej5 | | t3_10kzjx3 | 648 | 15 | 648 |
| 656 | comment | 649 | 1.674667e+09 | 0 | Yep, like a cancer. | 0 | 0 | 0 | 1 | t1_j5uf640 | | t3_10kzjx3 | 649 | 15 | 649 |
| 657 | comment | 650 | 1.674658e+09 | 1 | I mean if you're a leading religious figure in... | 0 | 0 | 0 | 39 | t1_j5tselp | | t3_10kzjx3 | 650 | 15 | 650 |
| 658 | comment | 655 | 1.674662e+09 | 2 | Except the ones involving invading your country | 0 | 0 | 0 | 11 | t1_j5u073l | | t3_10kzjx3 | 655 | 15 | 655 |
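The `to_dict` output shown earlier returns some numeric fields as strings (`depth`, `most_recent_gilded`, `sun_comment_id`) and `created_utc` as a Unix timestamp, so a little cleanup may help before analysis. The sketch below uses a single row copied from that output instead of a live query. The column choices and the new `created_at` column name are my own, and it assumes `created_utc` is in seconds.

```python
import pandas as pd

# One row shaped like the to_dict() output above (values copied from the tutorial).
rows = [
    {
        "sun_comment_id": "1",
        "created_utc": 1674655786.0,
        "depth": "0",
        "most_recent_gilded": "0",
        "most_recent_ups": 28,
    },
]
df = pd.DataFrame(rows)

# Assumption: created_utc is a Unix timestamp in seconds.
df["created_at"] = pd.to_datetime(df["created_utc"], unit="s", utc=True)

# Cast the string-typed numeric columns to integers.
for column in ["sun_comment_id", "depth", "most_recent_gilded"]:
    df[column] = df[column].astype(int)

print(df.dtypes)
```

The same steps apply unchanged to `comments_df`, provided every value in the cast columns is a valid integer string.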
