# Deep Rule Analysis - Extracting Transferable Violation Patterns

This notebook performs an in-depth analysis of the 2 training rules to extract patterns that can generalize to unseen rules in the test set.

## Goals:
1. **Rule Characterization**: Deep dive into each rule's violation patterns
2. **Cross-Rule Pattern Discovery**: Find common violation indicators
3. **Transferable Feature Extraction**: Identify rule-agnostic features
4. **Semantic Analysis**: Understand the underlying violation concepts
5. **Generalization Strategy**: Design features for unseen rules

In [2]:
# Core libraries
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
import re
import warnings
from collections import Counter

# NLP libraries
from sklearn.feature_extraction.text import TfidfVectorizer
from sentence_transformers import SentenceTransformer
import nltk
from nltk.corpus import stopwords
from nltk.sentiment import SentimentIntensityAnalyzer

# Visualization
from wordcloud import WordCloud

warnings.filterwarnings('ignore')
%matplotlib inline

# Download required NLTK data
nltk.download('punkt', quiet=True)
nltk.download('stopwords', quiet=True)
nltk.download('vader_lexicon', quiet=True)

True

## 1. Load and Prepare Data

In [3]:
# Load the training data
train_df = pd.read_csv('Data/train.csv')

print(f"Dataset shape: {train_df.shape}")
print(f"Columns: {list(train_df.columns)}")

# Display the two rules
rules = train_df['rule'].unique()
print(f"\nTraining Rules ({len(rules)}):")
for i, rule in enumerate(rules, 1):
    print(f"{i}. {rule}")

# Basic statistics
print(f"\nBasic Statistics:")
for rule in rules:
    rule_data = train_df[train_df['rule'] == rule]
    violation_rate = rule_data['rule_violation'].mean()
    print(f"{rule[:30]}...: {len(rule_data)} samples, {violation_rate:.1%} violations")

Dataset shape: (2029, 9)
Columns: ['row_id', 'body', 'rule', 'subreddit', 'positive_example_1', 'positive_example_2', 'negative_example_1', 'negative_example_2', 'rule_violation']

Training Rules (2):
1. No Advertising: Spam, referral links, unsolicited advertising, and promotional content are not allowed.
2. No legal advice: Do not offer or request legal advice.

Basic Statistics:
No Advertising: Spam, referral...: 1012 samples, 43.3% violations
No legal advice: Do not offer ...: 1017 samples, 58.3% violations


## 2. Rule-Specific Analysis

In [4]:
def analyze_rule_patterns(rule_name, rule_data):
    """Comprehensive analysis of a specific rule's violation patterns"""
    
    print(f"\n{'='*60}")
    print(f"ANALYSIS: {rule_name[:40]}...")
    print(f"{'='*60}")
    
    violations = rule_data[rule_data['rule_violation'] == 1]
    non_violations = rule_data[rule_data['rule_violation'] == 0]
    
    print(f"Violations: {len(violations)}, Non-violations: {len(non_violations)}")
    
    # Text length analysis
    viol_lengths = violations['body'].str.len()
    non_viol_lengths = non_violations['body'].str.len()
    
    print(f"\nText Length Patterns:")
    print(f"  Violation avg: {viol_lengths.mean():.1f} chars")
    print(f"  Non-violation avg: {non_viol_lengths.mean():.1f} chars")
    print(f"  Difference: {viol_lengths.mean() - non_viol_lengths.mean():.1f} chars")
    
    # Word patterns
    viol_words = violations['body'].str.split().str.len()
    non_viol_words = non_violations['body'].str.split().str.len()
    
    print(f"\nWord Count Patterns:")
    print(f"  Violation avg: {viol_words.mean():.1f} words")
    print(f"  Non-violation avg: {non_viol_words.mean():.1f} words")
    
    # Sentiment analysis
    sia = SentimentIntensityAnalyzer()
    
    viol_sentiment = [sia.polarity_scores(str(text))['compound'] 
                     for text in violations['body'] if pd.notna(text)]
    non_viol_sentiment = [sia.polarity_scores(str(text))['compound'] 
                         for text in non_violations['body'] if pd.notna(text)]
    
    print(f"\nSentiment Patterns:")
    print(f"  Violation sentiment: {np.mean(viol_sentiment):.3f}")
    print(f"  Non-violation sentiment: {np.mean(non_viol_sentiment):.3f}")
    
    return violations, non_violations

# Analyze each rule
rule_analyses = {}
for rule in rules:
    rule_data = train_df[train_df['rule'] == rule]
    violations, non_violations = analyze_rule_patterns(rule, rule_data)
    rule_analyses[rule] = {'violations': violations, 'non_violations': non_violations}


ANALYSIS: No Advertising: Spam, referral links, un...
Violations: 438, Non-violations: 574

Text Length Patterns:
  Violation avg: 155.5 chars
  Non-violation avg: 139.3 chars
  Difference: 16.2 chars

Word Count Patterns:
  Violation avg: 21.1 words
  Non-violation avg: 15.4 words

Sentiment Patterns:
  Violation sentiment: 0.383
  Non-violation sentiment: 0.221

ANALYSIS: No legal advice: Do not offer or request...
Violations: 593, Non-violations: 424

Text Length Patterns:
  Violation avg: 225.0 chars
  Non-violation avg: 182.3 chars
  Difference: 42.7 chars

Word Count Patterns:
  Violation avg: 41.2 words
  Non-violation avg: 33.6 words

Sentiment Patterns:
  Violation sentiment: -0.172
  Non-violation sentiment: -0.196


## 3. Feature Engineering for Transferability

In [10]:
def extract_transferable_features(text):
    """Extract features that should transfer across different rule types"""
    
    if pd.isna(text):
        text = ""
    
    text = str(text).lower()
    
    features = {
        # Basic text statistics
        'char_count': len(text),
        'word_count': len(text.split()),
        'sentence_count': len(re.findall(r'[.!?]+', text)),
        
        # Punctuation and formatting
        'exclamation_count': text.count('!'),
        'question_count': text.count('?'),
        #'caps_ratio': sum(1 for c in text if c.isupper()) / len(text) if text else 0,
        # Fixed caps_ratio calculation:
        'caps_ratio': sum(1 for c in str(text) if c.isupper()) / len(str(text)) if text else 0,
        
        # URLs and links
        'url_count': len(re.findall(r'http[s]?://\S+', text)),
        'link_words': int(any(word in text for word in ['click', 'link', 'here', 'visit'])),
        
        # Commercial indicators
        'commercial_words': int(any(word in text for word in ['buy', 'sell', 'price', 'cost', 'money', 'pay', 'free', 'discount', 'sale'])),
        'promotional_words': int(any(word in text for word in ['offer', 'deal', 'special', 'limited', 'now', 'today'])),
        
        # Advice and instruction indicators
        'advice_words': int(any(word in text for word in ['should', 'must', 'need', 'have to', 'recommend', 'suggest', 'advice'])),
        'instruction_words': int(any(word in text for word in ['how to', 'step', 'guide', 'tutorial', 'instructions'])),
        
        # Legal and professional terms
        'legal_words': int(any(word in text for word in ['lawyer', 'legal', 'court', 'law', 'sue', 'lawsuit', 'attorney'])),
        'professional_words': int(any(word in text for word in ['professional', 'expert', 'consultant', 'service'])),
        
        # Spam indicators
        'spam_words': int(any(word in text for word in ['spam', 'scam', 'fake', 'bot', 'automated'])),
        'urgency_words': int(any(word in text for word in ['urgent', 'immediate', 'asap', 'quickly', 'hurry'])),
        
        # Special characters
        'dollar_count': text.count('$'),
        'percent_count': text.count('%'),
        'number_count': len(re.findall(r'\d+', text)),
    }
    
    return features

# Extract features for all comments
print("Extracting transferable features...")
feature_list = []
labels = []
rules_list = []

for _, row in train_df.iterrows():
    features = extract_transferable_features(row['body'])
    feature_list.append(features)
    labels.append(row['rule_violation'])
    rules_list.append(row['rule'])

features_df = pd.DataFrame(feature_list)
features_df['rule_violation'] = labels
features_df['rule'] = rules_list

print(f"Extracted {len(features_df.columns)-2} transferable features")

Extracting transferable features...
Extracted 19 transferable features


In [12]:
# Print preview of new features
print("TRANSFERABLE FEATURES PREVIEW:")
print("=" * 50)

print(f"\nDataset shape: {features_df.shape}")
print(f"Feature columns: {[col for col in features_df.columns if col not in ['rule_violation', 'rule']]}")

print(f"\nFirst 5 rows of extracted features:")
display(features_df.head())

print(f"\nFeature statistics:")
feature_cols = [col for col in features_df.columns if col not in ['rule_violation', 'rule']]
display(features_df[feature_cols].describe())

print(f"\nFeature correlation with violations:")
correlations = []
for feature in feature_cols:
    corr = features_df[feature].corr(features_df['rule_violation'])
    correlations.append((feature, corr))

# Sort by absolute correlation
correlations.sort(key=lambda x: abs(x[1]), reverse=True)

print("\nTop 10 features correlated with violations:")
for i, (feature, corr) in enumerate(correlations[:10]):
    direction = "↑" if corr > 0 else "↓"
    print(f"{i+1:2d}. {feature:20s}: {corr:6.3f} {direction}")

print(f"\nSample feature values for violations vs non-violations:")
violations = features_df[features_df['rule_violation'] == 1]
non_violations = features_df[features_df['rule_violation'] == 0]

print(f"\nViolations (n={len(violations)}):")
print(violations[feature_cols].mean().round(3))

print(f"\nNon-violations (n={len(non_violations)}):")
print(non_violations[feature_cols].mean().round(3))

TRANSFERABLE FEATURES PREVIEW:

Dataset shape: (2029, 21)
Feature columns: ['char_count', 'word_count', 'sentence_count', 'exclamation_count', 'question_count', 'caps_ratio', 'url_count', 'link_words', 'commercial_words', 'promotional_words', 'advice_words', 'instruction_words', 'legal_words', 'professional_words', 'spam_words', 'urgency_words', 'dollar_count', 'percent_count', 'number_count']

First 5 rows of extracted features:


Unnamed: 0,char_count,word_count,sentence_count,exclamation_count,question_count,caps_ratio,url_count,link_words,commercial_words,promotional_words,...,instruction_words,legal_words,professional_words,spam_words,urgency_words,dollar_count,percent_count,number_count,rule_violation,rule
0,59,12,2,2,0,0.0,0,1,0,1,...,0,0,0,0,0,0,0,0,0,"No Advertising: Spam, referral links, unsolici..."
1,91,7,2,0,0,0.0,1,1,0,0,...,0,0,0,0,0,0,0,2,0,"No Advertising: Spam, referral links, unsolici..."
2,57,12,2,0,0,0.0,0,0,0,0,...,0,0,0,0,0,0,0,0,1,No legal advice: Do not offer or request legal...
3,75,12,2,0,0,0.0,1,0,0,0,...,0,0,0,0,0,0,0,1,1,"No Advertising: Spam, referral links, unsolici..."
4,313,23,10,0,2,0.0,3,1,1,0,...,0,0,0,0,0,1,0,10,1,"No Advertising: Spam, referral links, unsolici..."



Feature statistics:


Unnamed: 0,char_count,word_count,sentence_count,exclamation_count,question_count,caps_ratio,url_count,link_words,commercial_words,promotional_words,advice_words,instruction_words,legal_words,professional_words,spam_words,urgency_words,dollar_count,percent_count,number_count
count,2029.0,2029.0,2029.0,2029.0,2029.0,2029.0,2029.0,2029.0,2029.0,2029.0,2029.0,2029.0,2029.0,2029.0,2029.0,2029.0,2029.0,2029.0,2029.0
mean,176.843765,27.963036,3.111878,0.255298,0.282405,0.0,0.473632,0.209463,0.194184,0.17792,0.168063,0.021193,0.199606,0.018236,0.033021,0.009364,0.040907,0.037457,1.186299
std,113.625378,21.230214,2.193275,0.765296,0.619848,0.0,0.722913,0.407026,0.395669,0.38254,0.374014,0.144062,0.399803,0.133835,0.178736,0.096338,0.242851,0.356092,2.535311
min,51.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
25%,87.0,11.0,2.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
50%,138.0,22.0,3.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
75%,238.0,39.0,4.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,2.0
max,499.0,97.0,30.0,9.0,9.0,0.0,8.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,4.0,10.0,34.0



Feature correlation with violations:

Top 10 features correlated with violations:
 1. caps_ratio          :    nan ↓
 2. legal_words         :  0.346 ↑
 3. word_count          :  0.224 ↑
 4. char_count          :  0.167 ↑
 5. url_count           : -0.163 ↓
 6. commercial_words    :  0.119 ↑
 7. dollar_count        :  0.109 ↑
 8. promotional_words   :  0.089 ↑
 9. sentence_count      :  0.058 ↑
10. spam_words          :  0.055 ↑

Sample feature values for violations vs non-violations:

Violations (n=1031):
char_count            195.483
word_count             32.637
sentence_count          3.238
exclamation_count       0.281
question_count          0.273
caps_ratio              0.000
url_count               0.358
link_words              0.231
commercial_words        0.241
promotional_words       0.211
advice_words            0.185
instruction_words       0.024
legal_words             0.336
professional_words      0.025
spam_words              0.043
urgency_words           0.013
dollar_c

## 4. Cross-Rule Pattern Analysis

In [7]:
# Analyze which features are most discriminative across both rules
print("\nCROSS-RULE DISCRIMINATIVE FEATURES:")
print("="*50)

violations = features_df[features_df['rule_violation'] == 1]
non_violations = features_df[features_df['rule_violation'] == 0]

feature_importance = []

for feature in features_df.columns:
    if feature not in ['rule_violation', 'rule']:
        viol_mean = violations[feature].mean()
        non_viol_mean = non_violations[feature].mean()
        difference = abs(viol_mean - non_viol_mean)
        
        feature_importance.append({
            'feature': feature,
            'violation_mean': viol_mean,
            'non_violation_mean': non_viol_mean,
            'difference': difference,
            'direction': 'Higher' if viol_mean > non_viol_mean else 'Lower'
        })

# Sort by importance
feature_importance.sort(key=lambda x: x['difference'], reverse=True)

print("\nTop 10 most discriminative features:")
for i, feat in enumerate(feature_importance[:10]):
    print(f"{i+1:2d}. {feat['feature']:20s}: {feat['direction']:6s} in violations (diff: {feat['difference']:.3f})")

# Analyze by rule
print("\n\nRULE-SPECIFIC FEATURE ANALYSIS:")
print("="*50)

for rule in rules:
    print(f"\n{rule[:40]}...")
    
    rule_data = features_df[features_df['rule'] == rule]
    rule_violations = rule_data[rule_data['rule_violation'] == 1]
    rule_non_violations = rule_data[rule_data['rule_violation'] == 0]
    
    print(f"  Samples: {len(rule_data)} ({len(rule_violations)} violations)")
    
    # Top features for this rule
    rule_feature_importance = []
    for feature in features_df.columns:
        if feature not in ['rule_violation', 'rule']:
            viol_mean = rule_violations[feature].mean()
            non_viol_mean = rule_non_violations[feature].mean()
            difference = abs(viol_mean - non_viol_mean)
            rule_feature_importance.append((feature, difference, viol_mean, non_viol_mean))
    
    rule_feature_importance.sort(key=lambda x: x[1], reverse=True)
    
    print("  Top 5 discriminative features:")
    for feature, diff, v_mean, nv_mean in rule_feature_importance[:5]:
        direction = "Higher" if v_mean > nv_mean else "Lower"
        print(f"    {feature:20s}: {direction:6s} ({diff:.3f})")


CROSS-RULE DISCRIMINATIVE FEATURES:

Top 10 most discriminative features:
 1. char_count          : Higher in violations (diff: 37.895)
 2. word_count          : Higher in violations (diff: 9.503)
 3. legal_words         : Higher in violations (diff: 0.276)
 4. sentence_count      : Higher in violations (diff: 0.256)
 5. url_count           : Lower  in violations (diff: 0.235)
 6. number_count        : Lower  in violations (diff: 0.180)
 7. commercial_words    : Higher in violations (diff: 0.094)
 8. promotional_words   : Higher in violations (diff: 0.068)
 9. dollar_count        : Higher in violations (diff: 0.053)
10. exclamation_count   : Higher in violations (diff: 0.053)


RULE-SPECIFIC FEATURE ANALYSIS:

No Advertising: Spam, referral links, un...
  Samples: 1012 (438 violations)
  Top 5 discriminative features:
    char_count          : Higher (16.203)
    word_count          : Higher (5.668)
    exclamation_count   : Higher (0.226)
    commercial_words    : Higher (0.213)
    

## 5. Rule Examples Deep Dive

In [8]:
# Analyze the provided rule examples
print("\nRULE EXAMPLES ANALYSIS:")
print("="*50)

unique_rules = train_df.drop_duplicates(subset=['rule'])

for _, row in unique_rules.iterrows():
    rule = row['rule']
    print(f"\nRule: {rule}")
    print(f"Subreddit: {row['subreddit']}")
    
    examples = {
        'Positive Example 1': row['positive_example_1'],
        'Positive Example 2': row['positive_example_2'],
        'Negative Example 1': row['negative_example_1'],
        'Negative Example 2': row['negative_example_2']
    }
    
    for example_type, example_text in examples.items():
        if pd.notna(example_text):
            print(f"\n  {example_type}:")
            print(f"    Text: {str(example_text)[:100]}...")
            
            # Extract features from example
            features = extract_transferable_features(example_text)
            
            # Show key features
            key_features = ['commercial_words', 'promotional_words', 'advice_words', 'legal_words', 'url_count']
            feature_summary = []
            for feat in key_features:
                if features[feat] > 0:
                    feature_summary.append(feat)
            
            if feature_summary:
                print(f"    Key features: {', '.join(feature_summary)}")
            else:
                print(f"    Key features: None detected")


RULE EXAMPLES ANALYSIS:

Rule: No Advertising: Spam, referral links, unsolicited advertising, and promotional content are not allowed.
Subreddit: Futurology

  Positive Example 1:
    Text: If you could tell your younger self something different about sex, what would that be?

i AM IN A CO...
    Key features: promotional_words, url_count

  Positive Example 2:
    Text: hunt for lady for jack off in neighbourhood http://url.inmusi.com/gakq...
    Key features: url_count

  Negative Example 1:
    Text: Watch Golden Globe Awards 2017 Live Online in HD Coverage without ADS (VIP STREAMS)
=

HD STREAM QUA...
    Key features: url_count

  Negative Example 2:
    Text: DOUBLE CEE x BANDS EPPS - "BIRDS"

DOWNLOAD/STREAM:

http://music.theblacksmithed.com/download/birds...
    Key features: url_count

Rule: No legal advice: Do not offer or request legal advice.
Subreddit: pcmasterrace

  Positive Example 1:
    Text: Don't break up with him or call the cops.  If you are willing to get beat 

## 6. Transferable Patterns Summary

In [9]:
print("\nTRANSFERABLE PATTERNS SUMMARY:")
print("="*60)

print("\n1. MOST IMPORTANT CROSS-RULE FEATURES:")
top_features = feature_importance[:8]
for i, feat in enumerate(top_features, 1):
    print(f"   {i}. {feat['feature']:20s} - {feat['direction']:6s} in violations")

print("\n2. VIOLATION INDICATORS THAT TRANSFER:")
transfer_indicators = []
for feat in feature_importance[:15]:
    if feat['difference'] > 0.05:  # Significant difference
        transfer_indicators.append(feat)

for indicator in transfer_indicators:
    print(f"   • {indicator['feature']}: {indicator['direction']} in violations")

print("\n3. RECOMMENDED FEATURES FOR UNSEEN RULES:")
recommended_features = [
    'commercial_words', 'promotional_words', 'advice_words', 'legal_words',
    'url_count', 'link_words', 'urgency_words', 'char_count', 'caps_ratio'
]

for feature in recommended_features:
    feat_info = next((f for f in feature_importance if f['feature'] == feature), None)
    if feat_info:
        print(f"   • {feature:20s}: {feat_info['direction']:6s} (importance: {feat_info['difference']:.3f})")

print("\n4. GENERALIZATION STRATEGY:")
print("   • Focus on semantic content rather than rule-specific keywords")
print("   • Use commercial/promotional language as violation indicators")
print("   • Leverage text formatting patterns (caps, punctuation)")
print("   • Consider advice-giving language patterns")
print("   • Build rule embeddings from provided examples")

print("\n" + "="*60)
print("ANALYSIS COMPLETE - Ready for model building!")
print("="*60)


TRANSFERABLE PATTERNS SUMMARY:

1. MOST IMPORTANT CROSS-RULE FEATURES:
   1. char_count           - Higher in violations
   2. word_count           - Higher in violations
   3. legal_words          - Higher in violations
   4. sentence_count       - Higher in violations
   5. url_count            - Lower  in violations
   6. number_count         - Lower  in violations
   7. commercial_words     - Higher in violations
   8. promotional_words    - Higher in violations

2. VIOLATION INDICATORS THAT TRANSFER:
   • char_count: Higher in violations
   • word_count: Higher in violations
   • legal_words: Higher in violations
   • sentence_count: Higher in violations
   • url_count: Lower in violations
   • number_count: Lower in violations
   • commercial_words: Higher in violations
   • promotional_words: Higher in violations
   • dollar_count: Higher in violations
   • exclamation_count: Higher in violations

3. RECOMMENDED FEATURES FOR UNSEEN RULES:
   • commercial_words    : Higher (impo