# Emotional and Linguistic Framing of Digital Detox

### Notebook 1: Reddit Data Collection

This notebook collects Reddit posts from selected subreddits using the PRAW API:

- **Detox group**: Subreddits and keywords related to digital detox
- **Control group**: General subreddits without detox-related content

The combined dataset is saved for further analysis.


#### Project Introduction and Goals
This dataset serves as the foundation for exploring how individuals discuss digital detoxing online and how their emotional and intentional language differs from general Reddit discourse.

Research Question:
**How does emotional and linguistic framing differ between Reddit posts about digital detox experiences and posts from general, non-detox-related subreddits?**

Hypothesis 1:
There will be statistically significant differences in emotional language (e.g., Valence, Arousal, Dominance scores) between detox-related posts and control posts.

Hypothesis 2:
Detox-related posts will contain higher frequencies of self-reflective and wellness-oriented keywords (e.g., "stress relief", "mental clarity", "dopamine detox") compared to control posts.
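
One way Hypothesis 2 could be operationalised is as a per-group keyword rate, i.e. the share of posts that contain at least one wellness-oriented keyword. The sketch below is illustrative only: the helper `keyword_rate` and the sample posts are hypothetical and are not part of this notebook's pipeline.

```python
def keyword_rate(texts, keywords):
    """Return the share of texts containing at least one keyword (case-insensitive substring match)."""
    if not texts:
        return 0.0
    hits = sum(
        any(kw in text.lower() for kw in keywords)
        for text in texts
    )
    return hits / len(texts)


# Hypothetical sample data, for illustration only
wellness_keywords = ["stress relief", "mental clarity", "dopamine detox"]
detox_sample = ["My dopamine detox gave me real mental clarity.", "Day 3 offline."]
control_sample = ["What is your favourite film?", "TIL about octopuses."]

detox_rate = keyword_rate(detox_sample, wellness_keywords)
control_rate = keyword_rate(control_sample, wellness_keywords)
print(detox_rate, control_rate)
```

Comparing the two rates gives a first descriptive check; a formal test of the hypothesis would still be needed on the full dataset.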

### Package Imports and Setup

In [1]:
# required packages 
!pip install --upgrade pip
!pip install praw pandas tqdm

# core Libraries
import pandas as pd
import numpy as np
import datetime
import time
import warnings

# reddit api Wrapper
import praw

# progress bar
from tqdm import tqdm

# file I/O and formatting
import re
import json

# suppress warnings for cleaner output
warnings.filterwarnings("ignore")

Collecting pip
  Using cached pip-25.1.1-py3-none-any.whl.metadata (3.6 kB)
Using cached pip-25.1.1-py3-none-any.whl (1.8 MB)
Installing collected packages: pip
  Attempting uninstall: pip
    Found existing installation: pip 24.3.1
    Uninstalling pip-24.3.1:
      Successfully uninstalled pip-24.3.1
Successfully installed pip-25.1.1
Collecting praw
  Using cached praw-7.8.1-py3-none-any.whl.metadata (9.4 kB)
Collecting prawcore<3,>=2.4 (from praw)
  Using cached prawcore-2.4.0-py3-none-any.whl.metadata (5.0 kB)
Collecting update_checker>=0.18 (from praw)
  Using cached update_checker-0.18.0-py3-none-any.whl.metadata (2.3 kB)
Using cached praw-7.8.1-py3-none-any.whl (189 kB)
Using cached prawcore-2.4.0-py3-none-any.whl (17 kB)
Using cached update_checker-0.18.0-py3-none-any.whl (7.0 kB)
Installing collected packages: update_checker, prawcore, praw
Successfully installed praw-7.8.1 p

### Data Collection from Reddit

In [2]:
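# load in data - code adapted from notebook "Tokenisation and APIs"
# credentials are read from environment variables so that secrets are never stored in the notebook
# (the variable names below are assumptions; set them to match your own environment)
import os

reddit = praw.Reddit(
    client_id=os.environ["REDDIT_CLIENT_ID"],
    client_secret=os.environ["REDDIT_CLIENT_SECRET"],
    user_agent=os.environ["REDDIT_USER_AGENT"],
    username=os.environ["REDDIT_USERNAME"],
    password=os.environ["REDDIT_PASSWORD"],
)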

In [3]:
# code adapted from notebook "Tokenisation and APIs"
# subreddits
detox_subreddits = [
    'digitalminimalism', 'dopaminedetoxing', 'digitaldetox', 
    'nophones', 'nosurf', 'mindfulness'
]

control_subreddits = [
    'askreddit', 'offmychest', 'casualconversation', 
    'nostupidquestions', 'todayilearned', 'advice'
]

In [4]:
# keywords
keywords_list = [
    'digital detox', 'unplugged', 'tech break', 'unplug',
    'well-being', 'stress relief', 'mental health', 'dopamine detox'
]

In [5]:
# detox phases
phase_keywords = {
    'pre_detox': [
        "thinking about detox", "planning to quit", 
        "considering detox", "want to quit social media"
    ],
    'during_detox': [
        "i'm offline", "detox week 1", 
        "offline this week", "day 1 detox"
    ],
    'post_detox': [
        "after detox", "how it went", 
        "reflections on detox", "detox experience", 
        "post detox thoughts"
    ]
}
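
In this notebook a post's phase label comes from the search query that returned it, and search results may not contain the query phrase verbatim. As a hedged cross-check, the sketch below assigns a phase by plain substring matching on the text itself. The helper `match_phase` and the shortened `example_phases` dictionary are hypothetical and are not used elsewhere in the notebook.

```python
def match_phase(text, phases):
    """Return the first phase whose keyword list has a substring match in text, else None."""
    lowered = text.lower()
    for phase, keywords in phases.items():
        if any(keyword in lowered for keyword in keywords):
            return phase
    return None


# Shortened, hypothetical keyword dictionary for illustration
example_phases = {
    'pre_detox': ["planning to quit"],
    'during_detox': ["day 1 detox"],
    'post_detox': ["after detox"],
}

print(match_phase("Day 1 detox: already restless", example_phases))
print(match_phase("What should I cook tonight?", example_phases))
```

Posts for which the text-based phase disagrees with the query-based label could then be reviewed manually.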

In [21]:
# function to collect detox-related Reddit posts from a subreddit
# searches using keywords grouped by detox phases and returns a list of dictionaries

def collect_detox_posts(subreddit_name, limit=100):
    import datetime
    results = []
    subreddit = reddit.subreddit(subreddit_name)

    cutoff_date = datetime.datetime(2021, 1, 1)

    # iterate over each detox phase and associated keywords
    for phase, keywords in phase_keywords.items():
        for keyword in keywords:
            for post in subreddit.search(keyword, sort='relevance', time_filter='all', limit=limit):
                submission_time = datetime.datetime.utcfromtimestamp(post.created_utc)

                # only include posts after the cutoff date
                if submission_time >= cutoff_date:
                    results.append({
                        'title': post.title,
                        'body': post.selftext,
                        'keyword': keyword,
                        'subreddit': subreddit_name,
                        'timestamp': submission_time,
                        'detox_phase': phase,
                        'group': 'detox'
                    })

    return results
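
The function above converts `created_utc` with `datetime.datetime.utcfromtimestamp`, which returns a naive datetime and is deprecated as of Python 3.12. A minimal sketch of a timezone-aware alternative follows; the names `CUTOFF`, `to_utc_datetime` and `is_after_cutoff` are hypothetical helpers, not part of the original code.

```python
from datetime import datetime, timezone

# Same cutoff as in the notebook, but explicitly marked as UTC
CUTOFF = datetime(2021, 1, 1, tzinfo=timezone.utc)


def to_utc_datetime(created_utc):
    """Convert a Unix timestamp (seconds) into a timezone-aware UTC datetime."""
    return datetime.fromtimestamp(created_utc, tz=timezone.utc)


def is_after_cutoff(created_utc, cutoff=CUTOFF):
    """Return True if the timestamp falls on or after the cutoff."""
    return to_utc_datetime(created_utc) >= cutoff


print(is_after_cutoff(1609459200))  # timestamp for 2021-01-01 00:00:00 UTC
print(is_after_cutoff(1609459199))  # one second earlier
```

Aware and naive datetimes cannot be compared with each other, so the cutoff and the converted timestamps need to use the same convention.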

In [22]:
# function to collect control-group Reddit posts from a subreddit
# searches using generic terms, drops posts that mention any detox keyword, and returns a list of dictionaries

def collect_control_posts(subreddit_name, detox_keywords, search_terms=None, limit=100):
    import datetime
    results = []
    subreddit = reddit.subreddit(subreddit_name)

    # Default to generic, high-frequency non-detox terms if none provided
    if search_terms is None:
        search_terms = ["the", "life", "people", "question", "story"]

    # Define cutoff date (post-COVID time frame)
    cutoff_date = datetime.datetime(2021, 1, 1)

    for term in search_terms:
        for post in subreddit.search(term, sort='relevance', time_filter='all', limit=limit):
            post_time = datetime.datetime.utcfromtimestamp(post.created_utc)

            # Filter by time and exclude detox-related content
            if post_time >= cutoff_date:
                combined_text = (post.title + " " + post.selftext).lower()
                if not any(keyword in combined_text for keyword in detox_keywords):
                    results.append({
                        'title': post.title,
                        'body': post.selftext,
                        'subreddit': subreddit_name,
                        'timestamp': post_time,
                        'group': 'control'
                    })

    return results
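
The exclusion step in the function above is a case-insensitive substring test, so a short keyword such as 'unplug' also matches longer words that contain it. The sketch below isolates that predicate to make the behaviour visible; `mentions_detox` and `sample_keywords` are hypothetical names introduced here for illustration.

```python
def mentions_detox(title, body, detox_keywords):
    """Return True if any keyword occurs as a substring of the lower-cased title and body."""
    combined_text = f"{title} {body}".lower()
    return any(keyword.lower() in combined_text for keyword in detox_keywords)


# Hypothetical subset of the keyword list
sample_keywords = ['digital detox', 'unplug', 'mental health']

print(mentions_detox("I unplugged the toaster", "", sample_keywords))
print(mentions_detox("Best pizza in town?", "Asking for a friend.", sample_keywords))
```

Because the match is on substrings, some control posts may be excluded for reasons unrelated to digital detox, which is worth keeping in mind when interpreting group differences.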


In [23]:
all_posts = []

# detox posts
for subreddit in detox_subreddits:
    print(f"loading detox posts r/{subreddit}")
    all_posts.extend(collect_detox_posts(subreddit, limit=200))

# control posts
for subreddit in control_subreddits:
    print(f"loading control posts r/{subreddit}")
    all_posts.extend(collect_control_posts(subreddit, keywords_list, limit=200))

loading detox posts r/digitalminimalism
loading detox posts r/dopaminedetoxing
loading detox posts r/digitaldetox
loading detox posts r/nophones
loading detox posts r/nosurf
loading detox posts r/mindfulness
loading control posts r/askreddit
loading control posts r/offmychest
loading control posts r/casualconversation
loading control posts r/nostupidquestions
loading control posts r/todayilearned
loading control posts r/advice
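
Because each subreddit is searched once per keyword or search term, the same post can be returned by several queries and so appear more than once in `all_posts`. The sketch below shows one possible way to drop such repeats. It assumes that subreddit, title and body together identify a post, since the collected dictionaries do not store a post ID; `drop_duplicate_posts` and the sample records are hypothetical.

```python
def drop_duplicate_posts(posts):
    """Keep the first occurrence of each (subreddit, title, body) combination."""
    seen = set()
    unique = []
    for post in posts:
        key = (post['subreddit'], post['title'], post['body'])
        if key not in seen:
            seen.add(key)
            unique.append(post)
    return unique


# Hypothetical records: the same post returned by two different keywords
sample_posts = [
    {'subreddit': 'nosurf', 'title': 'Week one', 'body': 'Going well.', 'keyword': 'detox week 1'},
    {'subreddit': 'nosurf', 'title': 'Week one', 'body': 'Going well.', 'keyword': 'after detox'},
    {'subreddit': 'nosurf', 'title': 'Week two', 'body': 'Still going.', 'keyword': 'after detox'},
]

unique_posts = drop_duplicate_posts(sample_posts)
print(len(sample_posts), len(unique_posts))
```

This keeps the keyword and phase label of the first query that returned the post, so the order of the keyword lists affects which label survives.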


In [9]:
# create df
combined_df = pd.DataFrame(all_posts)

In [11]:
# remove empty posts
combined_df = combined_df[combined_df['body'].str.strip() != ''] 
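
The filter above removes posts whose body is empty or whitespace only. Posts that were removed or deleted may instead carry a placeholder body such as `[removed]` or `[deleted]`; whether such rows occur in this dataset is an assumption that should be checked. A minimal sketch of a stricter check, using the hypothetical helper `has_usable_body`:

```python
# Placeholder values treated as "no content"; this set is an assumption
PLACEHOLDER_BODIES = {'', '[removed]', '[deleted]'}


def has_usable_body(body):
    """Return True if body is a non-empty string that is not a known placeholder."""
    if not isinstance(body, str):
        return False
    return body.strip().lower() not in PLACEHOLDER_BODIES


# Hypothetical sample values
sample_bodies = ["A month ago, I decided to quit.", "   ", "[removed]", None]
print([has_usable_body(b) for b in sample_bodies])
```

If this check were adopted, it could be applied to the DataFrame column with `combined_df['body'].apply(has_usable_body)` as a boolean mask.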

In [12]:
combined_df

| Unnamed: 0 | title | body | keyword | subreddit | timestamp | detox_phase | group |
|---|---|---|---|---|---|---|---|
| 0 | I quit social media for a month, and now I see... | A month ago, I decided to quit all social medi... | thinking about detox | digitalminimalism | 2025-02-08 01:19:25 | pre_detox | detox |
| 1 | 3 years without social media - my experience | Some of you may remember me posting here after... | thinking about detox | digitalminimalism | 2024-09-12 13:13:24 | pre_detox | detox |
| 2 | digital minimalism journey as a 36yr old mom | (this is going to be long, but i'm hoping it r... | thinking about detox | digitalminimalism | 2025-03-28 19:22:32 | pre_detox | detox |
| 3 | I did a 30-day digital detox and realised I've... | Inspired by Cal Newport's digital minimalism, ... | thinking about detox | digitalminimalism | 2025-05-27 08:10:33 | pre_detox | detox |
| 4 | If you find "screen time" apps don't work to g... | TLDR: I made a fully free app that keeps you o... | thinking about detox | digitalminimalism | 2025-02-24 14:07:55 | pre_detox | detox |
| ... | ... | ... | ... | ... | ... | ... | ... |
| 11609 | Very very very long story. | Hello all. I’m not gonna spend too much on an ... | | advice | 2025-03-16 03:43:52 | | control |
| 11610 | My ex gf got with the guy she told me not to w... | Hi guys first post I just needed somewhere to ... | | advice | 2025-01-02 23:00:13 | | control |
| 11613 | Soon to be ex-wife is trying to ruin my life (... | I (23M) got married to my soon to be ex (28F)a... | | advice | 2025-03-24 03:59:37 | | control |
| 11614 | My sister thinks I'm being selfish because I'm... | So just over 4 months ago I gave birth to my 4... | | advice | 2020-09-12 05:55:00 | | control |


In [13]:
# save to a CSV
combined_df.to_csv('/home/jovyan/XXX/Back up/XXX/combined_reddit_digital_detox_study_dataframe.csv')