In [17]:
%%capture
!pip install -r requirements.txt

## Attributes Exploration

### Imports

In [18]:
import os
import praw
import pandas as pd

from dotenv import load_dotenv

load_dotenv()

True

In [19]:
reddit = praw.Reddit(
    client_id=os.getenv('REDDIT_CLIENT_ID'),
    client_secret=os.getenv('REDDIT_CLIENT_SECRET'),
    user_agent=os.getenv('REDDIT_USER_AGENT')
)

print(f"Read-only mode: {reddit.read_only}")
print(f"User agent: {reddit.config.user_agent}")

Read-only mode: True
User agent: Miserable_Seaweed691


### Subreddit

In [20]:
subreddit = reddit.subreddit('movies')

print(f"Subreddit: {subreddit.display_name}")
print(f"Subscribers: {subreddit.subscribers:,}")
print(f"Is NSFW: {subreddit.over18}")

Subreddit: movies
Subscribers: 37,045,593
Is NSFW: False


In [21]:
print("=== HOT POSTS FROM movies ===\n")
# hot() returns a lazy generator; nothing is fetched until it is consumed in a later cell
hot_posts = subreddit.hot(limit=100)

search_query = "Dune"

print(f"=== SEARCH RESULTS FOR '{search_query}' in movies ===\n")

for i, submission in enumerate(subreddit.search(search_query, limit=3), 1):
    print(f"{i}. {submission.title}")
    print(f"   Score: {submission.score}, Comments: {submission.num_comments}")
    print(f"   Created: {submission.created_utc}")

=== HOT POSTS FROM movies ===

=== SEARCH RESULTS FOR 'Dune' in movies ===

1. I finally watched Dune (2021) and am shocked at how bad it is
   Score: 270, Comments: 323
   Created: 1691999408.0
2. Robert Pattinson finally confirms ‘Dune: Part Three’ casting and reflects on filming the sequel in the desert
   Score: 10374, Comments: 586
   Created: 1762255424.0
3. ‘Dune: Part Three’ Wraps Filming
   Score: 6260, Comments: 731
   Created: 1762901415.0
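
The `created_utc` values above are Unix timestamps (seconds since the epoch, in UTC). A minimal sketch of turning them into readable dates, using only the standard library; the helper name `utc_timestamp_to_iso` is made up for this example:

```python
from datetime import datetime, timezone


def utc_timestamp_to_iso(created_utc: float) -> str:
    """Convert a Unix timestamp (seconds, UTC) to an ISO 8601 string."""
    return datetime.fromtimestamp(created_utc, tz=timezone.utc).isoformat()


# Timestamps copied from the search output above
for ts in (1691999408.0, 1762255424.0, 1762901415.0):
    print(utc_timestamp_to_iso(ts))
```

Passing `tz=timezone.utc` avoids an implicit conversion to the local time zone of the machine running the notebook.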


### Post details and content

In [22]:
sample_post = next(hot_posts)

print(f"Title: {sample_post.title}")
print(f"Author: {sample_post.author}")
print(f"Author flair: {sample_post.author_flair_text}")
print(f"Selftext (body): {sample_post.selftext[:50]}")
print(f"Score: {sample_post.score}")
print(f"Upvote ratio: {sample_post.upvote_ratio}")
print(f"Number of comments: {sample_post.num_comments}")
print(f"Post flair: {sample_post.link_flair_text}")
print(f"Is stickied: {sample_post.stickied}")
print(f"Is locked: {sample_post.locked}")
print(f"Is spoiler: {sample_post.spoiler}")
print(f"Is NSFW: {sample_post.over_18}")
print(f"Permalink: https://reddit.com{sample_post.permalink}")
print(f"Post URL: {sample_post.url}")
print(f"Gilded count: {sample_post.gilded}")
print(f"Total awards: {sample_post.total_awards_received}")

Title: Hi reddit! I'm Hikari, writer, director and producer of RENTAL FAMILY, a film set in Japan starring Brendan Fraser. It's out in theaters nationwide on November 21 via Searchlight Pictures. Ask me anything!
Author: HikariAMA
Author flair: Hikari, Director of 'Rental Family'
Selftext (body): Hi reddit! I'm Hikari, I wrote, directed, and prod
Score: 1349
Upvote ratio: 0.92
Number of comments: 136
Post flair: AMA
Is stickied: True
Is locked: False
Is spoiler: False
Is NSFW: False
Permalink: https://reddit.com/r/movies/comments/1p16nd4/hi_reddit_im_hikari_writer_director_and_producer/
Post URL: https://i.redd.it/0zaylvmjl72g1.png
Gilded count: 0
Total awards: 0
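
For later analysis it may help to flatten each post into a plain dict, one row per post. The sketch below is an assumption about how that could look, not part of the original exploration: `submission_to_record` is a hypothetical helper, and it is exercised with a stand-in object so it runs without network access. It only reads attributes shown in the cell above.

```python
from types import SimpleNamespace


def submission_to_record(submission) -> dict:
    """Flatten a submission-like object into a plain dict (one table row)."""
    author = getattr(submission, "author", None)
    return {
        "id": submission.id,
        "title": submission.title,
        # author is None when the account has been deleted
        "author": author.name if author is not None else None,
        "score": submission.score,
        "upvote_ratio": submission.upvote_ratio,
        "num_comments": submission.num_comments,
        "created_utc": submission.created_utc,
        "flair": submission.link_flair_text,
    }


# Stand-in object (hypothetical values) in place of a real PRAW submission
fake_post = SimpleNamespace(
    id="abc123",
    title="Example post",
    author=None,
    score=10,
    upvote_ratio=0.9,
    num_comments=2,
    created_utc=1691999408.0,
    link_flair_text=None,
)
print(submission_to_record(fake_post))
```

With real data, a list of such records could be passed to `pd.DataFrame(...)`, since `pandas` is already imported above.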


### Comments

In [23]:
print("=== COMMENT TREE STRUCTURE ===\n")

if len(sample_post.comments) > 0:
    first_comment = sample_post.comments[0]
    print(f"Top-level comment by {first_comment.author}:")
    print(f"Body: {first_comment.body[:100]}...")
    print(f"Direct replies: {len(first_comment.replies)}\n")

    if len(first_comment.replies) > 0:
        for i, reply in enumerate(first_comment.replies[:3], 1):
            print(f"  Reply {i} by {reply.author}: {reply.body[:80]}...")
            print(f"    Score: {reply.score}, Further replies: {len(reply.replies)}")
else:
    print("No comments available on this post.")

=== COMMENT TREE STRUCTURE ===

Top-level comment by BunyipPouch:
Body: This AMA has been verified and approved by the mods. Hikari will be back at 5 PM ET today (Wednesday...
Direct replies: 0
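
The cell above only looks one or two levels deep. A rough sketch of walking the whole tree depth-first follows; `walk_comments` is a hypothetical helper that works on any objects exposing `replies`, and it skips nodes without a `body`, which is assumed here to cover PRAW's "load more comments" placeholders. It is run on a stand-in tree rather than live data.

```python
from types import SimpleNamespace


def walk_comments(comments, depth=0):
    """Yield (depth, comment) pairs in depth-first order.

    Works on any iterable of objects that expose a `replies` iterable.
    Nodes without a `body` attribute are skipped.
    """
    for comment in comments:
        if not hasattr(comment, "body"):
            continue  # e.g. an unexpanded placeholder node
        yield depth, comment
        yield from walk_comments(comment.replies, depth + 1)


# Stand-in tree (hypothetical data) in place of sample_post.comments
leaf = SimpleNamespace(body="reply to reply", replies=[])
reply = SimpleNamespace(body="first reply", replies=[leaf])
top = SimpleNamespace(body="top-level", replies=[reply])
placeholder = SimpleNamespace(replies=[])  # no body attribute

for depth, comment in walk_comments([top, placeholder]):
    print("  " * depth + comment.body)
```

On a real submission, calling `sample_post.comments.replace_more(limit=0)` first drops the placeholders, and `sample_post.comments.list()` gives a flattened view if the nesting itself is not needed.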



### Author

In [24]:
redditor = sample_post.author

print(f"Username: {redditor.name}")
print(f"User ID: {redditor.id}")
print(f"Fullname: {redditor.fullname}")
print(f"Link karma: {redditor.link_karma:,}")
print(f"Comment karma: {redditor.comment_karma:,}")
print(f"Total karma: {redditor.total_karma:,}")
print(f"Account created: {redditor.created_utc}")
print(f"Is verified: {redditor.has_verified_email}")
print(f"Is gold: {redditor.is_gold}")
print(f"Is mod: {redditor.is_mod}")
print(f"Is employee: {redditor.is_employee}")

print(f"=== Recent submissions ===")
for i, post in enumerate(redditor.submissions.new(limit=3), 1):
    print(f"{i}. {post.title} ({post.subreddit.display_name})")

print(f"\n=== Recent comments ===")
for i, comment in enumerate(redditor.comments.new(limit=3), 1):
    print(f"{i}. {comment.body[:80]}... ({comment.subreddit.display_name})")

Username: HikariAMA
User ID: 229k6tgb3v
Fullname: t2_229k6tgb3v
Link karma: 584
Comment karma: 121
Total karma: 705
Account created: 1763511740.0
Is verified: False
Is gold: False
Is mod: True
Is employee: False
=== Recent submissions ===
1. Hi reddit! I'm Hikari, writer, director and producer of RENTAL FAMILY, a film set in Japan starring Brendan Fraser. It's out in theaters nationwide on November 21 via Searchlight Pictures. Ask me anything! (movies)

=== Recent comments ===
1. Nausicaa of the Valley of the Wind is my favorite!  All of Hayao Miyazaki's film... (movies)
2. A cell phone!  It would reveal that we are always connected.... (movies)
3. Working with Brendan was pure joy!  I was already working on RENTAL FAMILY so I ... (movies)

### Summary

#### Subreddit Attributes

| Attribute | Type | Description | Example |
|-----------|------|-------------|---------|
| display_name | str | Subreddit name without r/ prefix | movies |
| id | str | Unique subreddit ID (without the t5_ prefix) | 2qh3s |
| title | str | Full title of the subreddit | Movie News and Discussion |
| description | str | Detailed description (sidebar) | Long sidebar text... |
| public_description | str | Short public description | News & Discussion about... |
| subscribers | int | Total number of subscribers | 37,045,593 |
| created_utc | float | Creation timestamp (Unix) | 1201242535.0 |
| over18 | bool | NSFW status | False |
| subreddit_type | str | Type: public/private/restricted | public |
| submission_type | str | Allowed submission types | any |
| allow_images | bool | Whether images are allowed | True |
| allow_videos | bool | Whether videos are allowed | True |
| allow_videogifs | bool | Whether video gifs are allowed | True |
| spoilers_enabled | bool | Spoiler tags enabled | True |
| original_content_tag_enabled | bool | OC tag enabled | False |
| wiki_enabled | bool | Wiki feature enabled | True |
| accounts_active | int | Currently active users (unreliable) | 15,000 |
| comment_score_hide_mins | int | Comment score hide duration (minutes) | 0 |
| can_assign_user_flair | bool | Users can set their flair | True |
| can_assign_link_flair | bool | Link flair can be assigned | True |
| all_original_content | bool | All content must be OC | False |
| key_color | str | Theme color key | #000000 |
| display_name_prefixed | str | Full name with r/ prefix | r/movies |
| community_icon | str | Community icon URL | https://...icon.png |
| banner_img | str | Banner image URL | https://...banner.jpg |
| header_img | str | Header image URL | https://...header.jpg |
| header_title | str | Header title text | Movies |
| quarantine | bool | Quarantine status | False |
| emojis_enabled | bool | Custom emojis enabled | True |
| advertiser_category | str | Advertising category | Entertainment |
| public_traffic | bool | Traffic stats public | False |
| language | str | Primary language | en |
| whitelist_status | str | Whitelist status | all_ads |

#### Post (Submission) Attributes

| Attribute | Type | Description |
|-----------|------|-------------|
| id | str | Unique post ID |
| title | str | Post title |
| selftext | str | Self-post text body |
| author | Redditor | Author object |
| subreddit | Subreddit | Subreddit object |
| score | int | Net upvotes (ups - downs) |
| upvote_ratio | float | Fraction of votes that are upvotes (0.0 to 1.0) |
| num_comments | int | Total comment count |
| created_utc | float | Creation timestamp |
| edited | bool/float | Edit timestamp or False |
| is_self | bool | Is text post |
| is_video | bool | Is video post |
| url | str | Post URL |
| permalink | str | Relative URL path |
| domain | str | Domain of link |
| link_flair_text | str | Post flair text |
| link_flair_css_class | str | Post flair CSS class |
| author_flair_text | str | Author flair text |
| author_flair_css_class | str | Author flair CSS |
| stickied | bool | Stickied by mod |
| locked | bool | Comments locked |
| spoiler | bool | Marked as spoiler |
| over_18 | bool | NSFW content |
| distinguished | str/None | Mod/admin distinction |
| gilded | int | Gold awards count |
| total_awards_received | int | Total awards received |
| all_awardings | list | Detailed award info |
| thumbnail | str | Thumbnail URL |
| preview | dict | Preview images dict |
| media | dict | Media metadata |
| secure_media | dict | Secure media metadata |
| is_original_content | bool | Marked as OC |
| is_reddit_media_domain | bool | Hosted on Reddit |
| is_meta | bool | Meta post flag |
| pinned | bool | Pinned to profile |
| archived | bool | Archived (6mo+) |
| hidden | bool | Hidden by user |
| saved | bool | Saved by user |
| clicked | bool | Clicked by user |
| visited | bool | Visited by user |
| num_crossposts | int | Crosspost count |
| can_mod_post | bool | Can moderate |
| fullname | str | Full Reddit ID (t3_id) |
| name | str | Same as fullname |
| post_hint | str | Type hint (image/video) |
| suggested_sort | str | Suggested sort order |
| view_count | int/None | View count |
| hide_score | bool | Score hidden |
| removed_by_category | str/None | Removal category |
| approved_by | str/None | Approved by |

#### Comment Attributes

| Attribute | Type | Description |
|-----------|------|-------------|
| id | str | Unique comment ID |
| body | str | Comment text |
| author | Redditor | Author object |
| subreddit | Subreddit | Subreddit object |
| submission | Submission | Parent submission |
| parent_id | str | Fullname of the parent comment (t1_id) or post (t3_id) |
| score | int | Net score |
| created_utc | float | Creation timestamp |
| edited | bool/float | Edit timestamp or False |
| is_submitter | bool | Is OP of post |
| stickied | bool | Stickied by mod |
| distinguished | str/None | Mod/admin distinction |
| controversiality | int | Controversy flag (0 or 1) |
| gilded | int | Gold awards count |
| total_awards_received | int | Total awards |
| all_awardings | list | Award details |
| replies | CommentForest | Nested replies |
| depth | int | Nesting level |
| permalink | str | Relative URL path |
| link_id | str | Fullname of the parent submission (t3_id) |
| author_flair_text | str | Author flair text |
| author_flair_css_class | str | Author flair CSS |
| collapsed | bool | Auto-collapsed |
| collapsed_reason | str/None | Collapse reason |
| score_hidden | bool | Score hidden |
| locked | bool | Locked by mod |
| saved | bool | Saved by user |
| archived | bool | Archived |
| can_mod_post | bool | Can moderate |
| fullname | str | Full ID (t1_id) |
| name | str | Same as fullname |
| body_html | str | HTML formatted body |
| removed | bool | Removed status |
| approved_by | str/None | Approved by |
| banned_by | str/None | Banned by |
| num_reports | int/None | Report count |
| mod_reports | list | Mod reports |
| user_reports | list | User reports |

#### Author Attributes

| Attribute | Type | Description |
|-----------|------|-------------|
| id | str | Unique user ID |
| name | str | Username |
| fullname | str | Full ID (t2_id) |
| created_utc | float | Account creation timestamp |
| link_karma | int | Post karma |
| comment_karma | int | Comment karma |
| total_karma | int | Combined karma |
| awardee_karma | int | Karma from receiving awards |
| awarder_karma | int | Karma from giving awards |
| has_verified_email | bool | Email verified |
| is_gold | bool | Reddit premium |
| is_mod | bool | Is moderator anywhere |
| is_employee | bool | Reddit employee |
| verified | bool | Verified account |
| has_subscribed | bool | Has subscriptions |
| hide_from_robots | bool | Hide from search engines |
| icon_img | str | Profile icon URL |
| subreddit | UserSubreddit | User profile subreddit (a dict in older PRAW versions) |
| accept_followers | bool | Allows followers |
| pref_show_snoovatar | bool | Shows Snoo avatar |
| is_blocked | bool | Blocked by you |
| is_friend | bool | Is your friend |
| is_suspended | bool | Account suspended |

#### Limitations

The following data is not available through the Reddit API as used here (read-only PRAW):

**Subreddit Level:**
- List of all subscribers (privacy protection)
- Individual user subscription lists
- Detailed traffic statistics (unless you're a moderator)
- Historical subscriber count over time
- Private subreddit content (without access)
- Detailed moderation logs (moderator only)
- Actual real-time active user count (unreliable metric)

**Post Level:**
- Individual upvoter/downvoter identities (privacy protection)
- Exact upvote and downvote counts (only ratio and net score)
- Complete edit history of posts
- Deleted content (unless cached externally)
- IP addresses or location data
- Device information used to post
- Draft versions before posting
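
Although exact vote counts are not exposed, `score` and `upvote_ratio` together allow a rough estimate. The sketch below solves `score = ups - downs` and `upvote_ratio = ups / (ups + downs)` for the two unknowns. Treat the result as approximate: the ratio is rounded and Reddit may fuzz scores. `estimate_votes` is a hypothetical helper, not a PRAW function.

```python
def estimate_votes(score: int, upvote_ratio: float):
    """Estimate (upvotes, downvotes) from net score and upvote ratio.

    Solves score = ups - downs and upvote_ratio = ups / (ups + downs),
    which gives ups = upvote_ratio * score / (2 * upvote_ratio - 1).
    Returns None when the ratio is 0.5, where there is no unique solution.
    """
    denominator = 2 * upvote_ratio - 1
    if abs(denominator) < 1e-9:
        return None
    ups = upvote_ratio * score / denominator
    downs = ups - score
    return ups, downs


# Values from the sample post shown earlier (score 1349, ratio 0.92)
ups, downs = estimate_votes(1349, 0.92)
print(f"Estimated upvotes: {ups:.0f}, downvotes: {downs:.0f}")
```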

**Comment Level:**
- Individual upvoter/downvoter identities
- Exact upvote and downvote counts
- Complete edit history
- Deleted comments (unless cached)
- Private messages or direct messages
- Read/view status by other users

**Author Level:**
- Email addresses (privacy protection)
- List of subscribed subreddits (privacy protection)
- Private messages sent/received
- Voting history (upvotes/downvotes given)
- Saved posts/comments (private)
- Hidden posts (private)
- Browsing history
- IP addresses or location data
- Real identity information
- Account password or authentication tokens
- Multi-account connections
- Other users' blocked lists (only your own is visible)

**Rate Limiting & Access:**
- More than roughly 1,000 items per listing (a Reddit-side cap that pagination does not lift)
- Historical data beyond Reddit's retention period
- Real-time streaming of all Reddit activity
- Quarantined content (restricted access)
- Shadowbanned user content (invisible to API)
- Content from banned/suspended subreddits

#### Ideas for Social Network Analysis

**Approach 1: Interaction Networks**
- Extract posts and comments over time periods
- Map author → post, author → comment, commenter → parent_author relationships
- Build directed graphs based on reply patterns
- Weight edges by interaction frequency
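
A minimal sketch of the edge-weighting step, assuming the reply relationships have already been extracted as `(replier, parent_author)` username pairs. `build_reply_edges` and the sample data are hypothetical.

```python
from collections import Counter


def build_reply_edges(interactions):
    """Count directed (replier -> parent_author) edges.

    `interactions` is an iterable of (replier, parent_author) username pairs.
    Pairs with a missing side (deleted accounts) and self-replies are dropped.
    """
    edges = Counter()
    for replier, parent_author in interactions:
        if replier is None or parent_author is None or replier == parent_author:
            continue
        edges[(replier, parent_author)] += 1
    return edges


# Hypothetical interaction pairs
sample = [
    ("alice", "bob"),
    ("alice", "bob"),
    ("carol", "alice"),
    ("bob", "bob"),    # self-reply, ignored
    (None, "alice"),   # deleted replier, ignored
]
for (src, dst), weight in build_reply_edges(sample).most_common():
    print(f"{src} -> {dst} (weight {weight})")
```

The resulting `Counter` can be fed to a graph library as a weighted edge list if one is available.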

**Approach 2: Co-participation Networks**
- Track which users comment on the same posts
- Identify users active in similar subreddits
- Create bipartite graphs (users ↔ posts)
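
One way to get from the bipartite users ↔ posts graph to a user-to-user network is to project it: two users are linked once for every post they both commented on. The sketch below assumes `(username, post_id)` pairs as input; `co_participation` and the sample data are hypothetical.

```python
from collections import Counter, defaultdict
from itertools import combinations


def co_participation(comment_pairs):
    """Project a user-post bipartite graph onto users.

    `comment_pairs` is an iterable of (username, post_id) pairs.
    Returns a Counter mapping each alphabetically sorted user pair to the
    number of posts both users commented on.
    """
    users_by_post = defaultdict(set)
    for user, post_id in comment_pairs:
        users_by_post[post_id].add(user)

    shared = Counter()
    for users in users_by_post.values():
        for pair in combinations(sorted(users), 2):
            shared[pair] += 1
    return shared


# Hypothetical (username, post_id) pairs; repeat comments count once per post
pairs = [
    ("alice", "p1"), ("bob", "p1"), ("carol", "p1"),
    ("alice", "p2"), ("bob", "p2"), ("alice", "p2"),
]
for (a, b), count in co_participation(pairs).most_common():
    print(f"{a} and {b}: {count} shared post(s)")
```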

**Approach 3: Temporal Analysis**
- Analyze activity patterns over time
- Track user engagement evolution
- Identify influential users by engagement metrics
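
As a starting point for activity patterns, `created_utc` values can be bucketed by calendar day. This is only a sketch with a hypothetical helper name; a real analysis might prefer hourly or weekly buckets.

```python
from collections import Counter
from datetime import datetime, timezone


def activity_per_day(timestamps):
    """Count events per UTC calendar day from Unix timestamps."""
    days = Counter()
    for ts in timestamps:
        day = datetime.fromtimestamp(ts, tz=timezone.utc).date().isoformat()
        days[day] += 1
    # Sort by date so the result reads chronologically
    return dict(sorted(days.items()))


# created_utc values that appear in the outputs earlier in this notebook
observed = [1691999408.0, 1762255424.0, 1762901415.0, 1763511740.0]
for day, count in activity_per_day(observed).items():
    print(day, count)
```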

**Key Identifiers for Network Analysis:**
- User ID (`redditor.id`) - unique, permanent
- Post ID (`submission.id`) - unique, permanent
- Comment ID (`comment.id`) - unique, permanent
- Subreddit ID (`subreddit.id`) - unique, permanent
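
Fields such as `parent_id`, `link_id`, and `fullname` carry these IDs with a type prefix (`t1_` comment, `t2_` user, `t3_` post, `t5_` subreddit). A small sketch for splitting them when building edges; `split_fullname` and `KIND_BY_PREFIX` are hypothetical names.

```python
# Type prefixes used in Reddit fullnames (only the kinds covered in this notebook)
KIND_BY_PREFIX = {
    "t1": "comment",
    "t2": "redditor",
    "t3": "submission",
    "t5": "subreddit",
}


def split_fullname(fullname: str):
    """Split a fullname such as 't2_229k6tgb3v' into (kind, id)."""
    prefix, sep, item_id = fullname.partition("_")
    if not sep or prefix not in KIND_BY_PREFIX:
        raise ValueError(f"Unrecognized fullname: {fullname!r}")
    return KIND_BY_PREFIX[prefix], item_id


# Fullname from the Author output, and the post ID from the permalink shown earlier
print(split_fullname("t2_229k6tgb3v"))
print(split_fullname("t3_1p16nd4"))
```

For a comment, the kind returned for `parent_id` tells whether it replies to another comment or directly to the post.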