<a href="https://colab.research.google.com/github/FaarisIq/Persuasion-Analysis-Engine/blob/main/Persuasion_Analysis_Engine.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

Persuasion Analysis Engine - Faaris Iqbal

In [None]:
!python -m spacy download en_core_web_sm
!pip install praw pandas spacy vaderSentiment

Collecting en-core-web-sm==3.8.0
  Downloading https://github.com/explosion/spacy-models/releases/download/en_core_web_sm-3.8.0/en_core_web_sm-3.8.0-py3-none-any.whl (12.8 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 12.8/12.8 MB 64.1 MB/s eta 0:00:00
✔ Download and installation successful
You can now load the package via spacy.load('en_core_web_sm')
⚠ Restart to reload dependencies
If you are in a Jupyter or Colab notebook, you may need to restart Python in
order to load all the package's dependencies. You can do this by selecting the
'Restart kernel' or 'Restart runtime' option.
Collecting praw
  Downloading praw-7.8.1-py3-none-any.whl.metadata (9.4 kB)
Collecting vaderSentiment
  Downloading vaderSentiment-3.3.2-py2.py3-none-any.whl.metadata (572 bytes)
Collecting prawcore<3,>=2.4 (from praw)
  Downloading prawcore-2.4.0-py3-none-any.whl.metadata (5.0 kB)
Collecting update_checker>=0.18 (from praw)
  Downlo
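
The engine in the next cell scores each post as a weighted sum of six component scores, each between 0 and 1: length (0.25), evidence (0.20), sophistication (0.20), delta (0.15), engagement (0.10), and emotion (0.10). As a rough sketch of that arithmetic, the snippet below applies those weights to made-up component values. The `weighted_score` helper and the `example_components` numbers are hypothetical and exist only for illustration; the engine's own function is `compute_enhanced_persuasion_score`.

```python
# Weights copied from the engine's SCORING_WEIGHTS table.
WEIGHTS = {
    "length": 0.25,
    "evidence": 0.20,
    "sophistication": 0.20,
    "delta": 0.15,
    "engagement": 0.10,
    "emotion": 0.10,
}

def weighted_score(components, weights=WEIGHTS):
    """Weighted sum of component scores, capped at 1.0 and rounded to 3 places."""
    total = sum(weights[name] * components.get(name, 0.0) for name in weights)
    return round(min(total, 1.0), 3)

# Made-up component scores, for illustration only.
example_components = {
    "length": 0.8,
    "evidence": 0.5,
    "sophistication": 0.4,
    "delta": 0.5,
    "engagement": 0.6,
    "emotion": 0.2,
}

print(weighted_score(example_components))  # prints 0.535
```

Because the weights sum to 1.0 and every component is at most 1.0, the final cap acts only as a guard against floating-point overshoot.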

In [None]:
import praw
import pandas as pd
import re
import spacy
import json
from vaderSentiment.vaderSentiment import SentimentIntensityAnalyzer
from IPython.display import display, HTML
import time
import numpy as np
from datetime import datetime
import random

"""
Persuasion Analysis Engine - Faaris Iqbal

This engine analyzes persuasiveness in the subreddit r/ChangeMyView by
collecting posts and scoring them on several weighted factors.

RATE LIMITING FEATURES:
- Exponential backoff on errors
- Progressive delays between requests
- Automatic retry with longer waits
- Error handling for API limits
- Safe handling of large comment threads

Main Features Include:
- Data collection of top posts and ALL their comments using Reddit API
- Analyzing the argument's structure, evidence quality, and persuasive techniques
- Detects deltas awarded (a delta is given to the commenter who changed the awarder's view)
- Scores persuasiveness from 0-1 using weights
- Exports clean datasets for further research

The persuasion factors analyzed are:
1. Argument Length/Depth (25% weight) - Detailed vs surface level arguments
2. Evidence quality (20% weight) - Academic sources vs blogs
3. Argument sophistication (20% weight) - Logical structure
4. Delta Awards (15% weight) - Proof of persuasion
5. Comment Engagement (10% weight) - Quality of responses and discussion
6. Emotional appeal (10% weight) - Emotional connection and language used

The persuasive techniques detected are:
- Analogies
- Rhetorical questions and direct questions
- Stats and data
- Personal experience
- Moral/ethical appeals
- Authority citations
- Concessions
- Counterargument acknowledgement
- Logical connectors

It outputs:
- 'cmv_posts_analysis.csv' - Main post data with all metrics
- 'cmv_comments_analysis.csv' - Individual comment analysis
- 'cmv_summary_stats.csv' - Aggregated statistics
- 'cmv_top_performers.csv' - Up to 20 highest-scoring posts, ranked
- Summary stats of most/least persuasive posts
"""

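# reddit API setup
# Credentials are read from environment variables so they are not published
# with the notebook. The variable names REDDIT_CLIENT_ID and
# REDDIT_CLIENT_SECRET are an assumption; set them before running this cell.
import os

reddit = praw.Reddit(
    client_id=os.environ["REDDIT_CLIENT_ID"],
    client_secret=os.environ["REDDIT_CLIENT_SECRET"],
    user_agent="changemyview_data_collector_v2"
)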

# nlp setup
nlp = spacy.load("en_core_web_sm")
analyzer = SentimentIntensityAnalyzer()

# scoring weights
SCORING_WEIGHTS = {
    'length': 0.25,
    'evidence': 0.20,
    'sophistication': 0.20,
    'delta': 0.15,
    'engagement': 0.10,
    'emotion': 0.10
}

# Rate limiting configuration
RATE_LIMIT_CONFIG = {
    'base_delay': 2.0,          # Base delay between requests
    'max_delay': 120.0,         # Maximum delay on retries
    'retry_attempts': 5,        # Number of retry attempts
    'batch_delay': 30.0,        # Extra delay every N posts
    'batch_size': 10,           # Size of batch before extra delay
    'comment_batch_delay': 5.0, # Delay after processing comments
}

def safe_sleep(duration, message=""):
    """Sleep with progress indication"""
    if message:
        print(f"Rate limiting: {message} (waiting {duration:.1f}s)")

    # For long waits, show progress
    if duration > 10:
        intervals = int(duration / 5)
        for i in range(intervals):
            time.sleep(5)
            print(f"  ... {(i+1)*5}s / {duration}s")
        remaining = duration - (intervals * 5)
        if remaining > 0:
            time.sleep(remaining)
    else:
        time.sleep(duration)

def retry_with_backoff(func, max_retries=5, base_delay=5.0):
    """Retry function with exponential backoff"""
    for attempt in range(max_retries):
        try:
            return func()
        except Exception as e:
            if "429" in str(e) or "TooManyRequests" in str(e):
                if attempt < max_retries - 1:
                    delay = base_delay * (2 ** attempt) + random.uniform(0, 5)
                    delay = min(delay, RATE_LIMIT_CONFIG['max_delay'])
                    safe_sleep(delay, f"Rate limit hit, retry {attempt + 1}/{max_retries}")
                else:
                    print(f"Max retries reached for rate limiting. Skipping.")
                    raise
            else:
                print(f"Non-rate-limit error: {e}")
                raise
    return None
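
# Illustration only: a hypothetical helper that is not called by the pipeline.
# It lists the waits retry_with_backoff applies before each retry, leaving out
# the random 0-5 s jitter. The defaults mirror base_delay=5.0, max_retries=5
# and RATE_LIMIT_CONFIG['max_delay']=120.0; no wait follows the final attempt.
def backoff_schedule(base_delay=5.0, max_delay=120.0, max_retries=5):
    """Return the capped exponential delays for attempts 0..max_retries-2."""
    return [min(base_delay * (2 ** attempt), max_delay)
            for attempt in range(max_retries - 1)]

# Example: backoff_schedule() gives [5.0, 10.0, 20.0, 40.0]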

# text preprocessing
def clean_text(text):
    """Enhanced text cleaning for Reddit content"""
    if pd.isna(text) or not isinstance(text, str):
        return ""

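    # Remove Reddit markdown
    text = re.sub(r'\*\*(.*?)\*\*', r'\1', text)           # **bold**
    text = re.sub(r'\*(.*?)\*', r'\1', text)               # *italic*
    text = re.sub(r'\[([^\]]+)\]\([^)]+\)', r'\1', text)   # [label](url) -> label
    # Drop quoted lines; PRAW normally returns a raw '>', but '&gt;' is handled too
    text = re.sub(r'(?m)^\s*(?:&gt;|>).*$', '', text)
    text = re.sub(r'http\S+', '', text)                    # bare URLs
    text = re.sub(r'\n+', ' ', text)                       # collapse newlines
    # Keep apostrophes, '%' and '∆' so later patterns ("i've", "50%", delta marks) can match
    text = re.sub(r"[^\w\s.,!?;:'%∆-]", '', text)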

    return text.strip()

def spacy_sent_tokenize(text):
    """sentence tokenization"""
    if not isinstance(text, str) or not text.strip():
        return []

    try:
        doc = nlp(text)
        return [sent.text.strip() for sent in doc.sents if sent.text.strip()]
    except Exception:
        # Fallback to simple splitting if spacy fails
        return [s.strip() for s in text.split('.') if s.strip()]

# delta detection (a delta is awarded to a commenter who changed the awarder's view)
def extract_actual_deltas(post_comments):
    """Count comments that appear to award a delta (heuristic: at most one per comment)"""
    delta_count = 0
    delta_patterns = [
        r'!delta\b',
        r'∆',
        r'Δ',
        r'awarded.*?delta',
        r'changed my view',
        r'view.*?changed',
        r'cmv.*?successful'
    ]

    for comment in post_comments:
        body = comment.get('body', '').lower()

        # Skip bot/moderator comments
        if any(bot in str(comment.get('author', '')).lower()
               for bot in ['deltabot', 'automoderator', '[deleted]']):
            continue

        # Check for delta indicators
        for pattern in delta_patterns:
            if re.search(pattern, body, re.I):
                delta_count += 1
                break  # Max one delta per comment

    return delta_count

def is_delta_metadata_comment(text):
    """Filter out system delta messages"""
    if not isinstance(text, str):
        return False

    metadata_phrases = [
        "all comments that earned deltas",
        "delta system explained",
        "/r/deltalog",
        "change of view doesn't necessarily mean a reversal",
        "awarded a delta",
        "confirmation that a delta has been awarded"
    ]

    lowered = text.lower()
    return any(phrase in lowered for phrase in metadata_phrases)

# evidence quality analysis
def analyze_source_credibility(text):
    if not text or not isinstance(text, str):
        return 0.0

    urls = re.findall(r'https?://[^\s<>\[\]]+', text)
    print(f"DEBUG: Found {len(urls)} URLs in text of length {len(text)}")
    if urls:
        print(f"DEBUG: URLs found: {urls[:3]}...")  # Show first 3 URLs

    if not urls:
        return 0.0

    credibility_score = 0.0

    # High credibility sources
    high_cred_domains = [
        '.edu', '.gov', 'scholar.google', 'jstor', 'pubmed',
        'nature.com', 'science.org', 'cell.com', 'nejm.org',
        'stanford.edu', 'harvard.edu', 'mit.edu'
    ]

    # Medium credibility sources
    medium_cred_domains = [
        'reuters.com', 'ap.org', 'bbc.com', 'npr.org',
        'economist.com', 'wsj.com', 'nytimes.com',
        'washingtonpost.com', 'theguardian.com'
    ]

    # Low credibility indicators
    low_cred_indicators = [
        'blog', 'wordpress', 'medium.com', 'reddit.com',
        'youtube.com', 'twitter.com', 'facebook.com'
    ]

    for url in urls:
        url_lower = url.lower()

        if any(domain in url_lower for domain in high_cred_domains):
            credibility_score += 1.0
            print(f"DEBUG: High credibility URL: {url}")
        elif any(domain in url_lower for domain in medium_cred_domains):
            credibility_score += 0.7
            print(f"DEBUG: Medium credibility URL: {url}")
        elif any(indicator in url_lower for indicator in low_cred_indicators):
            credibility_score += 0.2
            print(f"DEBUG: Low credibility URL: {url}")
        else:
            credibility_score += 0.4  # Generic web source
            print(f"DEBUG: Generic URL: {url}")

    final_score = min(credibility_score / len(urls), 1.0)
    print(f"DEBUG: Final evidence score: {final_score}")
    return final_score
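
# Sketch of a stricter alternative (an assumption, not the method used above):
# the substring checks in analyze_source_credibility also fire on text in a
# URL's path or on look-alike hosts ('ap.org' is a substring of 'leap.org').
# Comparing against the parsed hostname avoids that. host_matches is a
# hypothetical helper, and its domains are given without a leading dot.
from urllib.parse import urlparse

def host_matches(url, domains):
    """True if the URL's hostname equals, or is a subdomain of, any entry in domains."""
    host = (urlparse(url).hostname or "").lower()
    return any(host == d or host.endswith("." + d) for d in domains)

# Example: host_matches("https://www.harvard.edu/study", ["edu"]) is True,
# while host_matches("https://leap.org/news", ["ap.org"]) is False.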

# argument sophistication analysis
def detect_argument_sophistication(text):
    """Measure argument quality and sophistication"""
    if not text:
        return 0

    text_lower = text.lower()
    sophistication_score = 0

    # Counterargument acknowledgment (high value)
    counter_patterns = [
        r"some might argue", r"critics say", r"on the other hand",
        r"however", r"nevertheless", r"although", r"admittedly",
        r"while.*?true", r"granted", r"i understand.*?but",
        r"fair point.*?but", r"you could argue"
    ]
    counter_count = sum(1 for p in counter_patterns if re.search(rf"\b{p}\b", text_lower))
    sophistication_score += min(counter_count * 0.15, 0.4)

    # Qualification/nuance
    nuance_patterns = [
        r"in some cases", r"generally", r"tends to", r"often",
        r"usually", r"primarily", r"largely", r"typically",
        r"in most cases", r"under certain conditions"
    ]
    nuance_count = sum(1 for p in nuance_patterns if re.search(rf"\b{p}\b", text_lower))
    sophistication_score += min(nuance_count * 0.08, 0.3)

    # Evidence integration
    evidence_patterns = [
        r"according to", r"research shows", r"studies indicate",
        r"data suggests", r"evidence shows", r"surveys found",
        r"analysis reveals", r"statistics show"
    ]
    evidence_count = sum(1 for p in evidence_patterns if re.search(rf"\b{p}\b", text_lower))
    sophistication_score += min(evidence_count * 0.1, 0.3)

    # Logical structure indicators
    logic_patterns = [
        r"first", r"second", r"third", r"finally",
        r"therefore", r"thus", r"consequently", r"as a result",
        r"this leads to", r"it follows that"
    ]
    logic_count = sum(1 for p in logic_patterns if re.search(rf"\b{p}\b", text_lower))
    sophistication_score += min(logic_count * 0.05, 0.2)

    return min(sophistication_score, 1.0)

# persuasive Features
def analyze_persuasive_features(text):
    """Comprehensive persuasive element detection"""
    if not text:
        return {}

    text_lower = text.lower()

    features = {
        'analogies': len(re.findall(r'\b(like|as if|similar to|just as|imagine if)\b', text_lower)),
        'questions': len(re.findall(r'\?', text)),
        'statistics': len(re.findall(r'\b\d+(\.\d+)?%?|\bpercent\b|\bratio\b|\btimes\b', text_lower)),
        'hedging': len(re.findall(r'\b(i think|maybe|possibly|could be|might be|seems like)\b', text_lower)),
        'personal_experience': len(re.findall(r'\b(i have|i\'ve|my experience|personally|i witnessed)\b', text_lower)),
        'moral_appeals': len(re.findall(r'\b(moral|ethics|right|wrong|should|ought|duty|responsibility)\b', text_lower)),
        'emotional_appeals': len(re.findall(r'\b(feel|emotion|heart|compassion|empathy|sympathy)\b', text_lower)),
        'authority_appeals': len(re.findall(r'\b(expert|professor|doctor|researcher|authority|official)\b', text_lower)),
        'consensus_appeals': len(re.findall(r'\b(everyone|most people|society|generally accepted|common sense)\b', text_lower)),
        'concessions': len(re.findall(r'\b(i admit|you\'re right|fair point|i concede|granted)\b', text_lower))
    }

    return features

def score_persuasive_features(features):
    """Convert feature counts to normalized score"""
    if not features:
        return 0

    weights = {
        'analogies': 0.15,
        'questions': 0.10,
        'statistics': 0.20,
        'hedging': 0.05,
        'personal_experience': 0.12,
        'moral_appeals': 0.10,
        'emotional_appeals': 0.08,
        'authority_appeals': 0.15,
        'consensus_appeals': 0.10,
        'concessions': 0.15
    }

    score = 0
    for feature, count in features.items():
        if feature in weights:
            normalized_count = min(count / 3, 1.0)
            score += weights[feature] * normalized_count

    return min(score, 1.0)

# comment engagement analysis
def analyze_comment_engagement(comments_data):
    """Analyze quality of community response"""
    if not comments_data:
        return 0

    engagement_score = 0

    # Average comment length (longer = more thoughtful)
    avg_length = np.mean([len(c.get('body', '')) for c in comments_data])
    length_score = min(avg_length / 500, 0.4)

    # Comment depth (replies to replies = deeper engagement)
    root_comments = [c for c in comments_data if c.get('is_root', False)]
    non_root_comments = [c for c in comments_data if not c.get('is_root', False)]
    depth_score = min(len(non_root_comments) / max(len(root_comments), 1) * 0.2, 0.3)

    # Quality indicators in comments
    quality_indicators = 0
    for comment in comments_data[:10]:  # Check top 10 comments
        body = comment.get('body', '').lower()

        # Positive engagement
        if any(word in body for word in ['interesting', 'good point', 'valid', 'thoughtful']):
            quality_indicators += 0.1

        # Constructive disagreement
        if any(word in body for word in ['however', 'but consider', 'what about', 'counterpoint']):
            quality_indicators += 0.15

        # Evidence sharing
        if any(word in body for word in ['source', 'link', 'study', 'research']):
            quality_indicators += 0.1

    engagement_score = length_score + depth_score + min(quality_indicators, 0.3)
    return min(engagement_score, 1.0)

# Sentiment Analysis
def get_enhanced_emotion_scores(text):
    """Combine VADER with emotional analysis that is domain-based"""
    vader_scores = analyzer.polarity_scores(text)

    # CMV-specific emotional indicators
    persuasive_emotions = [
        'understand', 'realize', 'believe', 'feel', 'think',
        'important', 'significant', 'crucial', 'essential'
    ]

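    # Count how many of the 9 indicator words appear (substring match), then scale;
    # 8 or more hits reach the 0.5 cap
    emotional_intensity = sum(1 for word in persuasive_emotions if word in text.lower())
    emotional_score = min(emotional_intensity / 15, 0.5)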

    return {
        'vader_compound': vader_scores['compound'],
        'vader_pos': vader_scores['pos'],
        'vader_neg': vader_scores['neg'],
        'vader_neu': vader_scores['neu'],
        'persuasive_emotion': emotional_score
    }

# persuasion scoring function
def compute_enhanced_persuasion_score(post_text, comments_data, actual_deltas=0, evidence_score=None):
    """Comprehensive persuasion scoring with improved methodology"""

    if not post_text:
        return 0

    # 1. Length/Depth Analysis
    sentences = spacy_sent_tokenize(post_text)
    length_score = min(len(sentences) / 25, 1)  # Saturates at 25 sentences; longer posts score no higher

    # 2. Evidence Quality (use passed score or calculate from text)
    if evidence_score is not None:
        evidence_score_final = evidence_score
    else:
        evidence_score_final = analyze_source_credibility(post_text)

    # 3. Argument Sophistication
    sophistication_score = detect_argument_sophistication(post_text)

    # 4. Delta Integration (ground truth)
    delta_score = min(actual_deltas * 0.25, 1.0)  # Each delta worth 0.25, cap at 1.0

    # 5. Comment Engagement Quality
    engagement_score = analyze_comment_engagement(comments_data)

    # 6. Emotional Connection
    emotion_data = get_enhanced_emotion_scores(post_text)
    emotion_score = emotion_data['persuasive_emotion']

    # Weighted final score
    total_score = (
        SCORING_WEIGHTS['length'] * length_score +
        SCORING_WEIGHTS['evidence'] * evidence_score_final +
        SCORING_WEIGHTS['sophistication'] * sophistication_score +
        SCORING_WEIGHTS['delta'] * delta_score +
        SCORING_WEIGHTS['engagement'] * engagement_score +
        SCORING_WEIGHTS['emotion'] * emotion_score
    )

    return round(min(total_score, 1.0), 3)

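from datetime import timezone  # needed for the explicit UTC conversion below

def format_timestamp(unix_timestamp):
    """Convert a Unix timestamp (e.g. created_utc) to a readable UTC string"""
    try:
        return datetime.fromtimestamp(unix_timestamp, tz=timezone.utc).strftime('%Y-%m-%d %H:%M:%S')
    except (TypeError, ValueError, OverflowError, OSError):
        return ""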

def safe_get_comments(post):
    """Safely retrieve ALL comments with rate limiting"""
    def get_comments():
        post.comment_sort = 'top'
        # This gets ALL comments, including nested ones
        post.comments.replace_more(limit=None)
        return post.comments.list()

    try:
        comments = retry_with_backoff(get_comments, max_retries=3, base_delay=10.0)
        safe_sleep(RATE_LIMIT_CONFIG['comment_batch_delay'], "Post comment processing complete")
        return comments
    except Exception as e:
        print(f"Failed to get all comments: {e}")
        # Fallback: get limited comments
        try:
            post.comments.replace_more(limit=10)
            return post.comments.list()
        except Exception:
            print("Fallback comment retrieval failed")
            return []

# data collection with comprehensive rate limiting
def collect_cmv_data(limit=50, time_filter="year"):
    print(f"Starting rate-limited CMV data collection (limit: {limit})")
    print(f"Configuration: Base delay={RATE_LIMIT_CONFIG['base_delay']}s, "
          f"Batch delay every {RATE_LIMIT_CONFIG['batch_size']} posts")

    subreddit = reddit.subreddit("changemyview")
    posts_data = []
    comments_data = []

    # Get posts with rate limiting
    def get_posts():
        return list(subreddit.top(time_filter=time_filter, limit=limit))

    try:
        posts = retry_with_backoff(get_posts, max_retries=3, base_delay=5.0)
    except Exception as e:
        print(f"Failed to get posts: {e}")
        return [], []

    for i, post in enumerate(posts):
        print(f"Processing post {i+1}/{len(posts)}: {post.title[:50]}...")

        # Batch delay for rate limiting
        if i > 0 and i % RATE_LIMIT_CONFIG['batch_size'] == 0:
            safe_sleep(RATE_LIMIT_CONFIG['batch_delay'],
                      f"Batch checkpoint ({i} posts processed)")

        try:
            # Get ALL comments with rate limiting
            all_comments = safe_get_comments(post)
            post_comments = []

            print(f"  Processing {len(all_comments)} comments...")

            for comment_idx, comment in enumerate(all_comments):
                try:
                    if is_delta_metadata_comment(comment.body):
                        continue

                    cleaned_body = clean_text(comment.body)
                    if len(cleaned_body) < 10:  # Skip very short comments
                        continue

                    # Analyze individual comment
                    comment_features = analyze_persuasive_features(cleaned_body)
                    comment_emotion = get_enhanced_emotion_scores(cleaned_body)
                    comment_args = spacy_sent_tokenize(cleaned_body)

                    comment_data = {
                        'post_id': post.id,
                        'comment_id': comment.id,
                        'author': str(comment.author),
                        'comment_text': cleaned_body,
                        'comment_score': comment.score,
                        'created_timestamp': format_timestamp(comment.created_utc),
                        'created_utc': comment.created_utc,
                        # parent_id is a fullname ("t3_<id>"), so it never equals post.id
                        'is_root_comment': comment.is_root,
                        'comment_word_count': len(cleaned_body.split()),
                        'comment_sentence_count': len(comment_args),

                        # Persuasive features (flattened)
                        'analogies_count': comment_features.get('analogies', 0),
                        'questions_count': comment_features.get('questions', 0),
                        'statistics_count': comment_features.get('statistics', 0),
                        'hedging_count': comment_features.get('hedging', 0),
                        'personal_experience_count': comment_features.get('personal_experience', 0),
                        'moral_appeals_count': comment_features.get('moral_appeals', 0),
                        'emotional_appeals_count': comment_features.get('emotional_appeals', 0),
                        'authority_appeals_count': comment_features.get('authority_appeals', 0),
                        'consensus_appeals_count': comment_features.get('consensus_appeals', 0),
                        'concessions_count': comment_features.get('concessions', 0),

                        # Sentiment scores (flattened)
                        'vader_compound': comment_emotion['vader_compound'],
                        'vader_positive': comment_emotion['vader_pos'],
                        'vader_negative': comment_emotion['vader_neg'],
                        'vader_neutral': comment_emotion['vader_neu'],
                        'persuasive_emotion_score': comment_emotion['persuasive_emotion']
                    }

                    comments_data.append(comment_data)
                    post_comments.append({
                        'body': cleaned_body,
                        'author': str(comment.author),
                        'score': comment.score,
                        'created_utc': comment.created_utc,
                        'parent_id': comment.parent_id,
                        'comment_id': comment.id,
                        # parent_id is a fullname ("t3_<id>"), so it never equals post.id
                        'is_root': comment.is_root,
                        'features': comment_features,
                        'emotion': comment_emotion,
                        'arg_units': comment_args
                    })

                except Exception as comment_error:
                    print(f"    Error processing comment {comment_idx}: {comment_error}")
                    continue

            # Analyze main post - IMPORTANT: analyze evidence BEFORE cleaning text
            post_original = post.selftext  # Keep original text with URLs
            evidence_quality = analyze_source_credibility(post_original)  # Analyze URLs first

            # Now clean text for other analyses
            post_clean = clean_text(post_original)
            post_features = analyze_persuasive_features(post_clean)
            post_emotion = get_enhanced_emotion_scores(post_clean)
            post_args = spacy_sent_tokenize(post_clean)
            actual_deltas = extract_actual_deltas(post_comments)

            # Additional metrics
            sophistication_score = detect_argument_sophistication(post_clean)
            engagement_score = analyze_comment_engagement(post_comments)

            # Compute persuasion score with evidence quality passed separately
            persuasion_score = compute_enhanced_persuasion_score(
                post_clean, post_comments, actual_deltas, evidence_quality
            )

            # Create clean post record
            post_data = {
                # Basic post info
                'post_id': post.id,
                'title': post.title,
                'author': str(post.author),
                'post_text': post_clean,
                'original_post_text': post.selftext,
                'created_timestamp': format_timestamp(post.created_utc),
                'created_utc': post.created_utc,
                'post_score': post.score,
                'num_comments': post.num_comments,
                'post_url': f"https://reddit.com{post.permalink}",

                # Text metrics
                'post_word_count': len(post_clean.split()),
                'post_sentence_count': len(post_args),
                'post_character_count': len(post_clean),

                # Persuasion scores
                'persuasion_score': persuasion_score,
                'argument_sophistication_score': sophistication_score,
                'evidence_quality_score': evidence_quality,
                'comment_engagement_score': engagement_score,
                'delta_count': actual_deltas,
                'has_deltas': actual_deltas > 0,

                # Persuasive features (flattened from post_features)
                'analogies_count': post_features.get('analogies', 0),
                'questions_count': post_features.get('questions', 0),
                'statistics_count': post_features.get('statistics', 0),
                'hedging_count': post_features.get('hedging', 0),
                'personal_experience_count': post_features.get('personal_experience', 0),
                'moral_appeals_count': post_features.get('moral_appeals', 0),
                'emotional_appeals_count': post_features.get('emotional_appeals', 0),
                'authority_appeals_count': post_features.get('authority_appeals', 0),
                'consensus_appeals_count': post_features.get('consensus_appeals', 0),
                'concessions_count': post_features.get('concessions', 0),

                # Boolean flags for easy filtering
                'has_analogies': post_features.get('analogies', 0) > 0,
                'has_statistics': post_features.get('statistics', 0) > 0,
                'has_personal_experience': post_features.get('personal_experience', 0) > 0,
                'has_moral_appeals': post_features.get('moral_appeals', 0) > 0,
                'has_authority_appeals': post_features.get('authority_appeals', 0) > 0,

                # Sentiment scores (flattened from post_emotion)
                'vader_compound': post_emotion['vader_compound'],
                'vader_positive': post_emotion['vader_pos'],
                'vader_negative': post_emotion['vader_neg'],
                'vader_neutral': post_emotion['vader_neu'],
                'persuasive_emotion_score': post_emotion['persuasive_emotion'],

                # Engagement metrics
                'root_comments_count': len([c for c in post_comments if c.get('is_root', False)]),
                'reply_comments_count': len([c for c in post_comments if not c.get('is_root', False)]),
                'avg_comment_length': np.mean([len(c.get('body', '')) for c in post_comments]) if post_comments else 0,
                'total_comment_words': sum([len(c.get('body', '').split()) for c in post_comments])
            }

            posts_data.append(post_data)
            print(f"  Completed: {len(post_comments)} comments processed, {actual_deltas} deltas found")

        except Exception as post_error:
            print(f"Error processing post {i+1}: {post_error}")
            safe_sleep(30.0, "Error recovery")
            continue

        # Rate limiting between posts
        safe_sleep(RATE_LIMIT_CONFIG['base_delay'], "Inter-post delay")

    print(f"Data collection complete. Collected {len(posts_data)} posts and {len(comments_data)} comments")
    return posts_data, comments_data

# Enhanced analysis + export functions
def save_clean_datasets(posts_data, comments_data):
    """Save multiple clean CSV files for different analysis needs"""

    # 1. Main posts dataset
    posts_df = pd.DataFrame(posts_data)
    posts_df.to_csv('cmv_posts_analysis.csv', index=False)
    print(f"Posts data saved to 'cmv_posts_analysis.csv' ({len(posts_df)} rows)")

    # 2. Comments dataset
    if comments_data:
        comments_df = pd.DataFrame(comments_data)
        comments_df.to_csv('cmv_comments_analysis.csv', index=False)
        print(f"Comments data saved to 'cmv_comments_analysis.csv' ({len(comments_df)} rows)")
    else:
        comments_df = None

    # 3. Summary statistics dataset
    summary_stats = create_summary_statistics(posts_df)
    summary_df = pd.DataFrame([summary_stats])
    summary_df.to_csv('cmv_summary_stats.csv', index=False)
    print(f"Summary statistics saved to 'cmv_summary_stats.csv'")

    # 4. Top performers dataset
    if len(posts_df) > 0:
        top_count = min(20, len(posts_df))
        top_performers = posts_df.nlargest(top_count, 'persuasion_score')[
            ['post_id', 'title', 'persuasion_score', 'delta_count', 'post_score',
             'post_word_count', 'has_deltas', 'created_timestamp']
        ].copy()
        top_performers['rank'] = range(1, len(top_performers) + 1)
        top_performers.to_csv('cmv_top_performers.csv', index=False)
        print(f"Top performers saved to 'cmv_top_performers.csv' ({len(top_performers)} rows)")

    return posts_df, comments_df, summary_df

def create_summary_statistics(df):
    """Create summary stats"""
    if len(df) == 0:
        return {}

    return {
        'analysis_date': datetime.now().strftime('%Y-%m-%d %H:%M:%S'),
        'total_posts_analyzed': len(df),
        'avg_persuasion_score': round(df['persuasion_score'].mean(), 3),
        'median_persuasion_score': round(df['persuasion_score'].median(), 3),
        'std_persuasion_score': round(df['persuasion_score'].std(), 3),
        'max_persuasion_score': round(df['persuasion_score'].max(), 3),
        'min_persuasion_score': round(df['persuasion_score'].min(), 3),

        'posts_with_deltas': int((df['delta_count'] > 0).sum()),
        'posts_with_deltas_percent': round((df['delta_count'] > 0).mean() * 100, 1),
        'total_deltas_awarded': int(df['delta_count'].sum()),
        'avg_deltas_per_post': round(df['delta_count'].mean(), 2),
        'max_deltas_single_post': int(df['delta_count'].max()),

        'avg_post_word_count': round(df['post_word_count'].mean(), 0),
        'avg_post_sentence_count': round(df['post_sentence_count'].mean(), 1),
        'avg_reddit_score': round(df['post_score'].mean(), 1),
        'avg_comment_count': round(df['num_comments'].mean(), 1),

        # Persuasive technique prevalence
        'posts_with_analogies_percent': round((df['has_analogies']).mean() * 100, 1),
        'posts_with_statistics_percent': round((df['has_statistics']).mean() * 100, 1),
        'posts_with_personal_experience_percent': round((df['has_personal_experience']).mean() * 100, 1),
        'posts_with_moral_appeals_percent': round((df['has_moral_appeals']).mean() * 100, 1),
        'posts_with_authority_appeals_percent': round((df['has_authority_appeals']).mean() * 100, 1),

        # Quality metrics
        'avg_argument_sophistication': round(df['argument_sophistication_score'].mean(), 3),
        'avg_evidence_quality': round(df['evidence_quality_score'].mean(), 3),
        'avg_comment_engagement': round(df['comment_engagement_score'].mean(), 3),

        # Sentiment distribution
        'avg_vader_compound': round(df['vader_compound'].mean(), 3),
        'positive_sentiment_posts_percent': round((df['vader_compound'] > 0.1).mean() * 100, 1),
        'negative_sentiment_posts_percent': round((df['vader_compound'] < -0.1).mean() * 100, 1),
        'neutral_sentiment_posts_percent': round((abs(df['vader_compound']) <= 0.1).mean() * 100, 1)
    }

def display_enhanced_analysis_summary(posts_df, comments_df=None):
    """Enhanced summary with actionable insights"""
    print("\n" + "="*60)
    print("PERSUASION ANALYSIS SUMMARY")
    print("="*60)

    # Basic stats
    print("Dataset Overview:")
    print(f"   • Total posts analyzed: {len(posts_df):,}")
    if comments_df is not None:
        print(f"   • Total comments analyzed: {len(comments_df):,}")
    print(f"   • Analysis completed: {datetime.now().strftime('%Y-%m-%d %H:%M:%S')}")

    # Persuasion metrics
    print("\nPersuasion Metrics:")
    print(f"   • Average persuasion score: {posts_df['persuasion_score'].mean():.3f}")
    print(f"   • Highest persuasion score: {posts_df['persuasion_score'].max():.3f}")
    print(f"   • Posts with deltas: {(posts_df['delta_count'] > 0).sum()} ({(posts_df['delta_count'] > 0).mean()*100:.1f}%)")
    print(f"   • Total deltas awarded: {posts_df['delta_count'].sum()}")
    print(f"   • Average deltas per post: {posts_df['delta_count'].mean():.2f}")

    # Content analysis
    print("\nContent Analysis:")
    print(f"   • Average word count: {posts_df['post_word_count'].mean():.0f}")
    print(f"   • Average sentence count: {posts_df['post_sentence_count'].mean():.1f}")
    print(f"   • Posts with statistics: {posts_df['has_statistics'].mean()*100:.1f}%")
    print(f"   • Posts with personal experience: {posts_df['has_personal_experience'].mean()*100:.1f}%")
    print(f"   • Posts with moral appeals: {posts_df['has_moral_appeals'].mean()*100:.1f}%")

    # Quality indicators
    print("\nQuality Indicators:")
    print(f"   • Average argument sophistication: {posts_df['argument_sophistication_score'].mean():.3f}")
    print(f"   • Average evidence quality: {posts_df['evidence_quality_score'].mean():.3f}")
    print(f"   • Average comment engagement: {posts_df['comment_engagement_score'].mean():.3f}")

    # Top performing posts
    print("\nMost Persuasive Posts:")
    top_posts = posts_df.nlargest(5, 'persuasion_score')[
        ['title', 'persuasion_score', 'delta_count', 'post_score']
    ]

    for i, (_, row) in enumerate(top_posts.iterrows(), 1):
        title_truncated = row['title'][:55] + "..." if len(row['title']) > 55 else row['title']
        print(f"   {i}. [{row['persuasion_score']:.3f}] {title_truncated}")
        print(f"      Deltas: {row['delta_count']} | Reddit Score: {row['post_score']}")

    # Correlation insights
    print("\nKey Correlations:")
    corr_deltas = posts_df['persuasion_score'].corr(posts_df['delta_count'])
    corr_reddit_score = posts_df['persuasion_score'].corr(posts_df['post_score'])
    corr_word_count = posts_df['persuasion_score'].corr(posts_df['post_word_count'])

    print(f"   • Persuasion score ↔ Delta count: {corr_deltas:.3f}")
    print(f"   • Persuasion score ↔ Reddit score: {corr_reddit_score:.3f}")
    print(f"   • Persuasion score ↔ Word count: {corr_word_count:.3f}")

    print("\nFiles Generated:")
    print("   • cmv_posts_analysis.csv - Main dataset with all post metrics")
    if comments_df is not None:
        print("   • cmv_comments_analysis.csv - Individual comment analysis")
    print("   • cmv_summary_stats.csv - Aggregated statistics")
    print("   • cmv_top_performers.csv - Top 20 most persuasive posts")

def create_data_dictionary():
    """Generate a data dictionary for the CSV files"""

    posts_dictionary = {
        'Column Name': [
            'post_id', 'title', 'author', 'post_text', 'original_post_text',
            'created_timestamp', 'created_utc', 'post_score', 'num_comments',
            'post_url', 'post_word_count', 'post_sentence_count', 'post_character_count',
            'persuasion_score', 'argument_sophistication_score', 'evidence_quality_score',
            'comment_engagement_score', 'delta_count', 'has_deltas',
            'analogies_count', 'questions_count', 'statistics_count', 'hedging_count',
            'personal_experience_count', 'moral_appeals_count', 'emotional_appeals_count',
            'authority_appeals_count', 'consensus_appeals_count', 'concessions_count',
            'has_analogies', 'has_statistics', 'has_personal_experience',
            'has_moral_appeals', 'has_authority_appeals',
            'vader_compound', 'vader_positive', 'vader_negative', 'vader_neutral',
            'persuasive_emotion_score', 'root_comments_count', 'reply_comments_count',
            'avg_comment_length', 'total_comment_words'
        ],
        'Description': [
            'Unique Reddit post identifier',
            'Post title text',
            'Reddit username of post author',
            'Cleaned post content text',
            'Original unprocessed post text',
            'Human-readable creation timestamp',
            'Unix timestamp of post creation',
            'Reddit upvote score',
            'Total number of comments',
            'Direct URL to Reddit post',
            'Number of words in post',
            'Number of sentences in post',
            'Total character count',
            'Overall persuasion score (0-1)',
            'Argument sophistication score (0-1)',
            'Quality of cited evidence (0-1)',
            'Comment engagement quality (0-1)',
            'Number of delta awards received',
            'Boolean: post received any deltas',
            'Count of analogies used',
            'Count of questions asked',
            'Count of statistics/numbers cited',
            'Count of hedging language used',
            'Count of personal experience references',
            'Count of moral/ethical appeals',
            'Count of emotional language',
            'Count of authority citations',
            'Count of consensus appeals',
            'Count of concessions made',
            'Boolean: contains analogies',
            'Boolean: contains statistics',
            'Boolean: contains personal experience',
            'Boolean: contains moral appeals',
            'Boolean: contains authority appeals',
            'VADER sentiment compound score (-1 to 1)',
            'VADER positive sentiment (0-1)',
            'VADER negative sentiment (0-1)',
            'VADER neutral sentiment (0-1)',
            'Domain-specific persuasive emotion score',
            'Number of top-level comments',
            'Number of reply comments',
            'Average length of comments',
            'Total words across all comments'
        ],
        'Data Type': [
            'string', 'string', 'string', 'string', 'string',
            'datetime', 'integer', 'integer', 'integer', 'string',
            'integer', 'integer', 'integer', 'float', 'float', 'float',
            'float', 'integer', 'boolean', 'integer', 'integer', 'integer',
            'integer', 'integer', 'integer', 'integer', 'integer', 'integer',
            'integer', 'boolean', 'boolean', 'boolean', 'boolean', 'boolean',
            'float', 'float', 'float', 'float', 'float', 'integer', 'integer',
            'float', 'integer'
        ]
    }

    dictionary_df = pd.DataFrame(posts_dictionary)
    dictionary_df.to_csv('cmv_data_dictionary.csv', index=False)
    print("Data dictionary saved to 'cmv_data_dictionary.csv'")

    return dictionary_df

# Run the rate-limited pipeline for 10 posts + ALL comments
if __name__ == "__main__":
    print("Starting Rate-Limited CMV Persuasion Analysis")
    print("=" * 50)
    print("Target: 10 posts with ALL comments")
    print("Expected runtime: 5-10 minutes depending on comment volume")
    print("=" * 50)

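    # Collect data with comprehensive rate limiting.
    # Note: inside Jupyter/Colab, PRAW logs an "Async PRAW" warning on each request;
    # PRAW's documented check_for_async=False setting on praw.Reddit(...) should silence it.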
    posts_data, comments_data = collect_cmv_data(limit=10, time_filter="year")

    # Save clean datasets
    posts_df, comments_df, summary_df = save_clean_datasets(posts_data, comments_data)

    # Create data dictionary
    dictionary_df = create_data_dictionary()

    # Display comprehensive summary
    display_enhanced_analysis_summary(posts_df, comments_df)

    print("\nRate-limited analysis complete!")
    print("Clean, analyzable datasets ready for research.")
    print("Check 'cmv_data_dictionary.csv' for column explanations.")
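
The three parallel lists in `create_data_dictionary` have to stay the same length and in the same order; if they drift apart, `pd.DataFrame` raises a `ValueError`, and a reordering in only one list would silently mislabel columns. A minimal sketch of one alternative keeps each column's name, description, and type together in a single record. The `ColumnSpec` type, the `write_data_dictionary` helper, and the shortened column list are illustrative assumptions, not part of the notebook:

```python
import csv
import io
from typing import NamedTuple


class ColumnSpec(NamedTuple):
    """One row of the data dictionary (hypothetical helper type)."""
    name: str
    description: str
    dtype: str


# Shortened, illustrative subset of the posts columns.
POST_COLUMNS = [
    ColumnSpec("post_id", "Unique Reddit post identifier", "string"),
    ColumnSpec("persuasion_score", "Overall persuasion score (0-1)", "float"),
    ColumnSpec("delta_count", "Number of delta awards received", "integer"),
    ColumnSpec("has_deltas", "Boolean: post received any deltas", "boolean"),
]


def write_data_dictionary(columns, handle):
    """Write the specs as CSV using the same headers as the notebook."""
    writer = csv.writer(handle)
    writer.writerow(["Column Name", "Description", "Data Type"])
    for spec in columns:
        writer.writerow(spec)  # a NamedTuple is written field by field
    return len(columns)


# Write to an in-memory buffer here; a real file handle works the same way.
buffer = io.StringIO()
rows_written = write_data_dictionary(POST_COLUMNS, buffer)
dictionary_csv = buffer.getvalue()
print(dictionary_csv)
```

Because every record carries all three fields, adding or removing a column is a one-line change and the lists cannot fall out of step.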

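The "Key Correlations" figures come from `Series.corr`, which defaults to the Pearson coefficient. With only 10 posts these values are noisy, and the coefficient is undefined when one column is constant (pandas reports `NaN` in that case). A minimal pure-Python sketch of the same calculation, assuming a hypothetical `pearson_r` helper and made-up input values that are not taken from the notebook's run:

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation of two equal-length sequences; None if undefined."""
    if len(xs) != len(ys) or len(xs) < 2:
        return None
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return None  # a constant column has no defined correlation
    return cov / math.sqrt(var_x * var_y)


# Made-up values for illustration only.
scores = [0.20, 0.30, 0.40, 0.50]
deltas = [1, 2, 3, 4]
r_linear = pearson_r(scores, deltas)           # perfectly linear relationship
r_constant = pearson_r(scores, [0, 0, 0, 0])   # no post received a delta
print(r_linear, r_constant)
```

Returning `None` for the constant case makes the undefined result explicit, which is easier to handle in a printed summary than a silent `nan`.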
Starting Rate-Limited CMV Persuasion Analysis
Target: 10 posts with ALL comments
Expected runtime: 5-10 minutes depending on comment volume
Starting rate-limited CMV data collection (limit: 10)
Configuration: Base delay=2.0s, Batch delay every 10 posts


It is strongly recommended to use Async PRAW: https://asyncpraw.readthedocs.io.
See https://praw.readthedocs.io/en/latest/getting_started/multiple_instances.html#discord-bots-and-asynchronous-environments for more info.



Processing post 1/10: CMV: Voting for Donald Trump in the 2024 election ...
Rate limiting: Post comment processing complete (waiting 5.0s)
  Processing 4623 comments...
DEBUG: Found 0 URLs in text of length 1733
  Completed: 4309 comments processed, 12 deltas found
Rate limiting: Inter-post delay (waiting 2.0s)

Processing post 2/10: CMV: Hijabs are sexist...
Rate limiting: Post comment processing complete (waiting 5.0s)
  Processing 2502 comments...
DEBUG: Found 0 URLs in text of length 1530
  Completed: 2333 comments processed, 4 deltas found
Rate limiting: Inter-post delay (waiting 2.0s)

Processing post 3/10: CMV: Broadway would never allow a “Book of Mormon”...
Rate limiting: Post comment processing complete (waiting 5.0s)
  Processing 939 comments...
DEBUG: Found 0 URLs in text of length 1701
  Completed: 900 comments processed, 7 deltas found
Rate limiting: Inter-post delay (waiting 2.0s)

Processing post 4/10: CMV: Israel Should Be Sanctioned for Killing an Am...
Rate limiting: Post comment processing complete (waiting 5.0s)
  Processing 1627 comments...
DEBUG: Found 0 URLs in text of length 1165
  Completed: 1476 comments processed, 3 deltas found
Rate limiting: Inter-post delay (waiting 2.0s)

Processing post 5/10: CMV: Voter ID is a totally sensible policy....
Rate limiting: Post comment processing complete (waiting 5.0s)
  Processing 1930 comments...
DEBUG: Found 0 URLs in text of length 846
  Completed: 1813 comments processed, 8 deltas found
Rate limiting: Inter-post delay (waiting 2.0s)

Processing post 6/10: CMV: Universal Healthcare is less expensive and mo...
Rate limiting: Post comment processing complete (waiting 5.0s)
  Processing 1064 comments...
DEBUG: Found 0 URLs in text of length 2850
  Completed: 1025 comments processed, 0 deltas found
Rate limiting: Inter-post delay (waiting 2.0s)

Processing post 7/10: CMV: We don't need the old Republican party back...
Rate limiting: Post comment processing complete (waiting 5.0s)
  Processing 1424 comments...
DEBUG: Found 0 URLs in text of length 1138
  Completed: 1357 comments processed, 5 deltas found
Rate limiting: Inter-post delay (waiting 2.0s)

Processing post 8/10: CMV: if nuclear war breaks out, there's no point i...
Rate limiting: Post comment processing complete (waiting 5.0s)
  Processing 903 comments...
DEBUG: Found 0 URLs in text of length 1334
  Completed: 859 comments processed, 3 deltas found
Rate limiting: Inter-post delay (waiting 2.0s)

Processing post 9/10: CMV: Touch screens in cars are incredibly stupid a...
Rate limiting: Post comment processing complete (waiting 5.0s)
  Processing 156 comments...
DEBUG: Found 0 URLs in text of length 569
  Completed: 148 comments processed, 0 deltas found
Rate limiting: Inter-post delay (waiting 2.0s)

Processing post 10/10: CMV: There is no reason to be against homosexualit...
Rate limiting: Post comment processing complete (waiting 5.0s)
  Processing 1116 comments...
DEBUG: Found 0 URLs in text of length 766
  Completed: 1040 comments processed, 7 deltas found
Rate limiting: Inter-post delay (waiting 2.0s)
Data collection complete. Collected 10 posts and 15260 comments
Posts data saved to 'cmv_posts_analysis.csv' (10 rows)
Comments data saved to 'cmv_comments_analysis.csv' (15260 rows)
Summary statistics saved to 'cmv_summary_stats.csv'
Top performers saved to 'cmv_top_performers.csv' (10 rows)
Data dictionary saved to 'cmv_data_dictionary.csv'

PERSUASION ANALYSIS SUMMARY
Dataset Overview:
   • Total posts analyzed: 10
   • Total comments analyzed: 15,260
   • Analysis completed: 2025-09-21 01:50:02

Persuasion Metrics:
   • Average persuasion score: 0.330
   • Highest persuasion score: 0.427
   • Posts with deltas: 8 (80.0%)
   • Total deltas awarded: 49
   • Average deltas per post: 4.90

Content Analysis:
   • Average word count: 236
   • Average senten