# Emotion Mapping: 27 GoEmotions → 13 Target Emotions
## Defining Emotions for Crisis Detection

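This notebook:
1. Lists the 27 GoEmotions emotion labels plus `neutral` (28 labels, indices 0-27)
2. Defines 13 target emotions chosen for crisis detection
3. Creates the mapping dictionary (27→13)
4. Validates the mapping
5. Tests the mapping on sample data
6. Saves the mapping configuration to JSON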

In [6]:
import pandas as pd
import numpy as np
from collections import Counter

pd.set_option('display.max_columns', None)

## 1. GoEmotions: All 27 Original Emotions

In [7]:
# GoEmotions labels: 27 emotions plus 'neutral' (index: emotion), 28 entries in total
GOEMOTIONS_27 = {
    0: 'admiration',
    1: 'amusement',
    2: 'anger',
    3: 'annoyance',
    4: 'approval',
    5: 'caring',
    6: 'confusion',
    7: 'curiosity',
    8: 'desire',
    9: 'disappointment',
    10: 'disapproval',
    11: 'disgust',
    12: 'embarrassment',
    13: 'excitement',
    14: 'fear',
    15: 'gratitude',
    16: 'grief',
    17: 'joy',
    18: 'love',
    19: 'nervousness',
    20: 'optimism',
    21: 'pride',
    22: 'realization',
    23: 'relief',
    24: 'remorse',
    25: 'sadness',
    26: 'surprise',
    27: 'neutral'
}

print("GoEmotions 27 Original Emotions:")
print("=" * 60)
for idx, emotion in GOEMOTIONS_27.items():
    print(f"{idx:2d}: {emotion}")

GoEmotions 27 Original Emotions:
 0: admiration
 1: amusement
 2: anger
 3: annoyance
 4: approval
 5: caring
 6: confusion
 7: curiosity
 8: desire
 9: disappointment
10: disapproval
11: disgust
12: embarrassment
13: excitement
14: fear
15: gratitude
16: grief
17: joy
18: love
19: nervousness
20: optimism
21: pride
22: realization
23: relief
24: remorse
25: sadness
26: surprise
27: neutral


## 2. Define 13 Target Emotions for Crisis Detection

### Selection Criteria:
1. **Crisis-relevant emotions** (fear, anxiety, anger, and sadness, which absorbs grief)
2. **Contrast emotions** (joy, excitement) to distinguish non-crisis text
3. **Information-seeking** (confusion, which absorbs curiosity) during crises
4. **Social emotions** (caring, gratitude) common in crisis response
5. **General sentiment** (surprise, disgust, disappointment, neutral)

In [8]:
# 13 Target Emotions (label: emotion name)
TARGET_EMOTIONS_13 = {
    1: 'fear',           # CRITICAL for crisis detection
    2: 'anger',          # Common in crises (frustration, blame)
    3: 'sadness',        # Grief, loss, despair
    4: 'anxiety',        # Nervousness, worry, uncertainty
    5: 'confusion',      # Information seeking, uncertainty
    6: 'surprise',       # Shock, unexpected events
    7: 'disgust',        # Revulsion, strong negative reaction
    8: 'caring',         # Empathy, support, helping behavior
    9: 'joy',            # Positive contrast (non-crisis)
    10: 'excitement',    # High energy positive (non-crisis contrast)
    11: 'gratitude',     # Thankfulness (often post-crisis or rescue)
    12: 'disappointment', # Unmet expectations, letdown
    13: 'neutral'        # Informational, factual reporting
}

print("\nTarget 13 Emotions for Crisis Detection:")
print("=" * 60)
for label, emotion in TARGET_EMOTIONS_13.items():
    print(f"{label:2d}: {emotion}")


Target 13 Emotions for Crisis Detection:
 1: fear
 2: anger
 3: sadness
 4: anxiety
 5: confusion
 6: surprise
 7: disgust
 8: caring
 9: joy
10: excitement
11: gratitude
12: disappointment
13: neutral


## 3. Create Mapping Dictionary (27 → 13)

### Mapping Logic:
- **Combine similar emotions**: nervousness → anxiety, grief → sadness
- **Group positive emotions**: admiration, amusement, love → joy
- **Merge minor emotions**: embarrassment → disgust, remorse → disappointment
- **Preserve crisis-critical**: fear, anger, and confusion each keep their own target label (anger also absorbs annoyance, confusion also absorbs curiosity)

In [9]:
# Mapping: GoEmotions (27) → Target (13)
# Key: GoEmotions emotion name, Value: Target emotion label (1-13)

EMOTION_MAPPING_27_TO_13 = {
    # CRISIS-CRITICAL EMOTIONS (keep separate)
    'fear': 1,              # fear → fear
    'anger': 2,             # anger → anger
    'sadness': 3,           # sadness → sadness
    'nervousness': 4,       # nervousness → anxiety
    'confusion': 5,         # confusion → confusion
    'surprise': 6,          # surprise → surprise
    'disgust': 7,           # disgust → disgust
    'caring': 8,            # caring → caring
    
    # POSITIVE EMOTIONS (contrast for non-crisis)
    'joy': 9,               # joy → joy
    'amusement': 9,         # amusement → joy (similar)
    'love': 9,              # love → joy (positive)
    'admiration': 9,        # admiration → joy (positive)
    'excitement': 10,       # excitement → excitement
    'gratitude': 11,        # gratitude → gratitude
    'optimism': 11,         # optimism → gratitude (positive outlook)
    'pride': 11,            # pride → gratitude (positive achievement)
    'relief': 11,           # relief → gratitude (positive after stress)
    
    # NEGATIVE EMOTIONS (merge similar)
    'grief': 3,             # grief → sadness (deep sadness)
    'disappointment': 12,   # disappointment → disappointment
    'disapproval': 12,      # disapproval → disappointment (negative)
    'annoyance': 2,         # annoyance → anger (mild anger)
    'embarrassment': 7,     # embarrassment → disgust (self-disgust)
    'remorse': 12,          # remorse → disappointment (regret)
    
    # NEUTRAL / INFORMATIONAL / REMAINING EMOTIONS
    'neutral': 13,          # neutral → neutral
    'approval': 13,         # approval → neutral (agreement)
    'curiosity': 5,         # curiosity → confusion (information seeking)
    'desire': 10,           # desire → excitement (wanting something)
    'realization': 6,       # realization → surprise (sudden understanding)
}

print("\nEmotion Mapping (27 → 13):")
print("=" * 80)
print(f"{'GoEmotion (27)':<20} → {'Target Label':<15} {'Target Emotion (13)'}")
print("=" * 80)

# Group by target emotion for better visualization
by_target = {}
for source, target_label in EMOTION_MAPPING_27_TO_13.items():
    target_name = TARGET_EMOTIONS_13[target_label]
    if target_name not in by_target:
        by_target[target_name] = []
    by_target[target_name].append(source)

for target_label, target_name in sorted(TARGET_EMOTIONS_13.items()):
    sources = by_target.get(target_name, [])
    print(f"\n[{target_label:2d}] {target_name.upper()}:")
    for source in sources:
        print(f"   {source:<20} → {target_label} ({target_name})")


Emotion Mapping (27 → 13):
GoEmotion (27)       → Target Label    Target Emotion (13)

[ 1] FEAR:
   fear                 → 1 (fear)

[ 2] ANGER:
   anger                → 2 (anger)
   annoyance            → 2 (anger)

[ 3] SADNESS:
   sadness              → 3 (sadness)
   grief                → 3 (sadness)

[ 4] ANXIETY:
   nervousness          → 4 (anxiety)

[ 5] CONFUSION:
   confusion            → 5 (confusion)
   curiosity            → 5 (confusion)

[ 6] SURPRISE:
   surprise             → 6 (surprise)
   realization          → 6 (surprise)

[ 7] DISGUST:
   disgust              → 7 (disgust)
   embarrassment        → 7 (disgust)

[ 8] CARING:
   caring               → 8 (caring)

[ 9] JOY:
   joy                  → 9 (joy)
   amusement            → 9 (joy)
   love                 → 9 (joy)
   admiration           → 9 (joy)

[10] EXCITEMENT:
   excitement           → 10 (excitement)
   desire               → 10 (excitement)

[11] GRATITUDE:
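   gratitude            → 11 (gratitude)
   optimism             → 11 (gratitude)
   pride                → 11 (gratitude)
   relief               → 11 (gratitude)

[12] DISAPPOINTMENT:
   disappointment       → 12 (disappointment)
   disapproval          → 12 (disappointment)
   remorse              → 12 (disappointment)

[13] NEUTRAL:
   neutral              → 13 (neutral)
   approval             → 13 (neutral)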
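
The mapping is keyed by emotion name, while the dataset stores integer indices, so it can help to compose the two dictionaries into a direct index-to-label table. The following is a minimal sketch that uses abbreviated copies of `GOEMOTIONS_27` and `EMOTION_MAPPING_27_TO_13` so that it runs on its own; `build_index_lookup` is a hypothetical helper name, not something this notebook defines.

```python
# Abbreviated copies of the notebook's dictionaries (illustration only).
goemotions_subset = {2: 'anger', 3: 'annoyance', 14: 'fear', 19: 'nervousness', 27: 'neutral'}
mapping_subset = {'anger': 2, 'annoyance': 2, 'fear': 1, 'nervousness': 4, 'neutral': 13}


def build_index_lookup(index_to_name, name_to_target):
    """Compose index -> name and name -> target label into index -> target label."""
    # A KeyError here means a source emotion has no mapping, which surfaces
    # the gap instead of silently falling back to a default label.
    return {idx: name_to_target[name] for idx, name in index_to_name.items()}


index_lookup = build_index_lookup(goemotions_subset, mapping_subset)
print(index_lookup)  # {2: 2, 3: 2, 14: 1, 19: 4, 27: 13}
```

With the full dictionaries the same call would produce a 28-entry table that can be used for lookups without going through emotion names.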

## 4. Validate Mapping

Ensure:
- Every GoEmotions label (27 emotions plus `neutral`) is mapped
- No source emotion appears twice
- All target labels (1-13) are used

In [10]:
print("\nVALIDATION:")
print("=" * 60)

# Check all 27 emotions are mapped
goemotions_list = list(GOEMOTIONS_27.values())
mapped_emotions = list(EMOTION_MAPPING_27_TO_13.keys())

missing = set(goemotions_list) - set(mapped_emotions)
if missing:
    print(f"❌ Missing mappings: {missing}")
else:
    print(f"✅ All 27 GoEmotions are mapped")

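# Check for duplicates in source.
# Dict keys are unique by construction, so this check always passes; a key
# repeated in the dict literal would silently overwrite the earlier entry.
if len(mapped_emotions) == len(set(mapped_emotions)):
    print("✅ No duplicate source emotions")
else:
    print("❌ Duplicate source emotions found")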

# Check all target labels 1-13 are used
target_labels_used = set(EMOTION_MAPPING_27_TO_13.values())
expected_labels = set(range(1, 14))

if target_labels_used == expected_labels:
    print(f"✅ All target labels (1-13) are used")
else:
    unused = expected_labels - target_labels_used
    print(f"⚠️  Unused target labels: {unused}")

# Show distribution
print(f"\nDistribution of target labels:")
label_counts = Counter(EMOTION_MAPPING_27_TO_13.values())
for label in sorted(label_counts.keys()):
    count = label_counts[label]
    name = TARGET_EMOTIONS_13[label]
    print(f"  [{label:2d}] {name:<15}: {count} source emotions")


VALIDATION:
✅ All 27 GoEmotions are mapped
✅ No duplicate source emotions
✅ All target labels (1-13) are used

Distribution of target labels:
  [ 1] fear           : 1 source emotions
  [ 2] anger          : 2 source emotions
  [ 3] sadness        : 2 source emotions
  [ 4] anxiety        : 1 source emotions
  [ 5] confusion      : 2 source emotions
  [ 6] surprise       : 2 source emotions
  [ 7] disgust        : 2 source emotions
  [ 8] caring         : 1 source emotions
  [ 9] joy            : 4 source emotions
  [10] excitement     : 2 source emotions
  [11] gratitude      : 4 source emotions
  [12] disappointment : 3 source emotions
  [13] neutral        : 2 source emotions
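
The checks above do not flag mapping keys that are not GoEmotions names (for example a typo), and a value outside 1-13 would not be reported as such. A stricter variant could look like the sketch below. `validate_mapping` is a hypothetical helper, and the toy inputs stand in for the notebook's dictionaries so the block runs on its own.

```python
def validate_mapping(source_names, target_labels, mapping):
    """Return a list of problems; an empty list means the mapping is consistent."""
    source_set = set(source_names)
    target_set = set(target_labels)
    used = set(mapping.values())

    problems = []
    missing = source_set - set(mapping)   # source emotions with no mapping
    extra = set(mapping) - source_set     # keys that are not source emotions
    invalid = used - target_set           # values outside the target label set
    unused = target_set - used            # target labels nothing maps to

    if missing:
        problems.append(f"missing mappings: {sorted(missing)}")
    if extra:
        problems.append(f"unknown source emotions: {sorted(extra)}")
    if invalid:
        problems.append(f"invalid target labels: {sorted(invalid)}")
    if unused:
        problems.append(f"unused target labels: {sorted(unused)}")
    return problems


# Toy example: 'greif' is a deliberate typo for 'grief'.
toy_sources = ['fear', 'anger', 'grief']
toy_targets = {1, 2, 3}
toy_mapping = {'fear': 1, 'anger': 2, 'greif': 3}

for problem in validate_mapping(toy_sources, toy_targets, toy_mapping):
    print(problem)
```

In the notebook the call would presumably be `validate_mapping(GOEMOTIONS_27.values(), TARGET_EMOTIONS_13.keys(), EMOTION_MAPPING_27_TO_13)`, which should return an empty list for the mapping defined above.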


## 5. Test Mapping on GoEmotions Data

In [11]:
# Load GoEmotions sample
goemotions_df = pd.read_csv('goemotion_data/goemotions.csv', nrows=1000)

print("\nGoEmotions Sample:")
print("=" * 60)
print(f"Total rows loaded: {len(goemotions_df):,}")
print(f"\nColumns: {goemotions_df.columns.tolist()}")
print(f"\nFirst 3 rows:")
display(goemotions_df.head(3))


GoEmotions Sample:
Total rows loaded: 1,000

Columns: ['text', 'labels', 'id', 'Unnamed: 3', '[27] = neutral [0] = admiration [1] = amusement [2] = anger [3] = annoyance [4] = approval [5] = caring [6] = confusion [7] = curiosity [8] = desire [9] = disappointment [10] = disapproval [11] = disgust [12] = embarrassment [13] = excitement [14] = fear [15] = gratitude [16] = grief [17] = joy [18] = love [19] = nervousness [20] = optimism [21] = pride [22] = realization [23] = relief [24] = remorse [25] = sadness [26] = surprise [27] = neutral']

First 3 rows:


Unnamed: 0,text,labels,id,Unnamed: 3,[27] = neutral [0] = admiration [1] = amusement [2] = anger [3] = annoyance [4] = approval [5] = caring [6] = confusion [7] = curiosity [8] = desire [9] = disappointment [10] = disapproval [11] = disgust [12] = embarrassment [13] = excitement [14] = fear [15] = gratitude [16] = grief [17] = joy [18] = love [19] = nervousness [20] = optimism [21] = pride [22] = realization [23] = relief [24] = remorse [25] = sadness [26] = surprise [27] = neutral
0,My favourite food is anything I didn't have to...,[27],eebbqej,,
1,"Now if he does off himself, everyone will thin...",[27],ed00q6i,,
2,WHY THE FUCK IS BAYLESS ISOING,[2],eezlygj,,


In [12]:
# Create mapping function
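def map_emotion_27_to_13(goemotions_label):
    """
    Map a GoEmotions label index (0-27) to a target emotion label (1-13).
    
    Args:
        goemotions_label: int, or list of ints for multi-label rows.
            Only the first label of a list is used. String-encoded labels
            such as '[27]' (as read from the CSV) must be parsed into a
            list first; otherwise they silently map to 13 (neutral).
    
    Returns:
        int: Target emotion label (1-13). An empty list or an unknown
            index falls back to 13 (neutral).
    """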
    # Handle list of labels (take first one for now)
    if isinstance(goemotions_label, list):
        if len(goemotions_label) == 0:
            return 13  # Default to neutral
        goemotions_label = goemotions_label[0]
    
    # Convert int to emotion name
    emotion_name = GOEMOTIONS_27.get(goemotions_label, 'neutral')
    
    # Map to target label
    target_label = EMOTION_MAPPING_27_TO_13.get(emotion_name, 13)
    
    return target_label

# Test the function
print("\nTesting Mapping Function:")
print("=" * 60)
print(f"{'Original':<20} → {'New Label':<10} {'New Emotion Name'}")
print("-" * 60)
test_cases = [0, 2, 14, 19, 25, 27]  # admiration, anger, fear, nervousness, sadness, neutral

for label in test_cases:
    source_name = GOEMOTIONS_27[label]
    target_label = map_emotion_27_to_13(label)
    target_name = TARGET_EMOTIONS_13[target_label]
    print(f"[{label:2d}] {source_name:<15} → [{target_label:2d}]        {target_name}")


Testing Mapping Function:
Original             → New Label  New Emotion Name
------------------------------------------------------------
[ 0] admiration      → [ 9]        joy
[ 2] anger           → [ 2]        anger
[14] fear            → [ 1]        fear
[19] nervousness     → [ 4]        anxiety
[25] sadness         → [ 3]        sadness
[27] neutral         → [13]        neutral
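
The `labels` column in the sample above is read from the CSV as strings such as `'[27]'`, so it has to be parsed before the mapping can be applied to the DataFrame. The sketch below parses those strings with `ast.literal_eval` and maps every label in a row instead of only the first. It assumes multi-label rows use the same bracketed list syntax. `parse_labels`, `map_all_labels`, and the abbreviated `INDEX_TO_TARGET` table are hypothetical names introduced for illustration.

```python
import ast

# Abbreviated GoEmotions index -> target label table (illustration only).
INDEX_TO_TARGET = {2: 2, 3: 2, 14: 1, 25: 3, 27: 13}


def parse_labels(raw):
    """Turn a string such as '[2, 14]' into a list of ints; sequences pass through."""
    if isinstance(raw, str):
        parsed = ast.literal_eval(raw)
        if isinstance(parsed, (list, tuple)):
            return [int(x) for x in parsed]
        return [int(parsed)]
    return list(raw)


def map_all_labels(raw, lookup, default=13):
    """Map every GoEmotions index in a row to its target label.

    Duplicates are dropped while the original order is kept; an empty row
    or an unknown index falls back to `default` (13 = neutral).
    """
    targets = []
    for idx in parse_labels(raw):
        target = lookup.get(idx, default)
        if target not in targets:
            targets.append(target)
    return targets or [default]


for raw in ['[27]', '[2]', '[3, 2, 14]', '[]']:
    print(raw, '->', map_all_labels(raw, INDEX_TO_TARGET))
```

In the notebook, something like `goemotions_df['labels'].apply(parse_labels)` would convert the whole column. Whether to keep all mapped labels or reduce them to a single one is a design choice that this notebook leaves open.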


## 6. Save Mapping Configuration

In [13]:
import json

# Save mapping to JSON for use in other scripts
mapping_config = {
    'target_emotions': TARGET_EMOTIONS_13,
    'goemotions_27': GOEMOTIONS_27,
    'mapping_27_to_13': EMOTION_MAPPING_27_TO_13,
    'description': 'Emotion mapping for TEMPO crisis detection project',
    'created': '2026-01-27'
}

# Save to file
with open('emotion_mapping_config.json', 'w') as f:
    json.dump(mapping_config, f, indent=2)

print("✅ Saved emotion mapping to: emotion_mapping_config.json")

✅ Saved emotion mapping to: emotion_mapping_config.json
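
One thing to watch when reading this file back: JSON object keys are always strings, so the integer keys of `TARGET_EMOTIONS_13` and `GOEMOTIONS_27` come back as `'1'`, `'2'`, and so on. The sketch below shows the round trip on an abbreviated config held in memory. The variable names are illustrative, not part of the notebook.

```python
import json

# Abbreviated version of mapping_config (illustration only).
config = {
    'target_emotions': {1: 'fear', 13: 'neutral'},
    'mapping_27_to_13': {'fear': 1, 'neutral': 13},
}

# Round trip through JSON text, as json.dump / json.load would do via a file.
loaded = json.loads(json.dumps(config))
print(list(loaded['target_emotions']))  # ['1', '13'] (keys are now strings)

# Restore integer keys for the index-keyed dictionaries.
target_emotions = {int(key): name for key, name in loaded['target_emotions'].items()}
print(target_emotions)  # {1: 'fear', 13: 'neutral'}

# Name-keyed dictionaries need no conversion; their integer values survive.
print(loaded['mapping_27_to_13'])  # {'fear': 1, 'neutral': 13}
```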


## Summary

### 13 Target Emotions:
1. **fear** - Critical for crisis detection
2. **anger** - Frustration, blame
3. **sadness** - Grief, loss, despair
4. **anxiety** - Nervousness, worry
5. **confusion** - Information seeking
6. **surprise** - Shock, unexpected
7. **disgust** - Strong negative reaction
8. **caring** - Empathy, support
9. **joy** - Positive emotions (non-crisis contrast)
10. **excitement** - High energy positive
11. **gratitude** - Thankfulness, relief
12. **disappointment** - Unmet expectations
13. **neutral** - Informational, factual

### Next Steps:
1. ✅ Mapping created and validated
2. **Apply mapping to all datasets** (Phase 3 scripts)
3. **Add emotion_label column** to standardized data
4. **Re-run Phase 4** to create master training file
5. **Train BERT** on 13-emotion classification
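
For the training step, note that classification losses such as PyTorch's `CrossEntropyLoss` expect class indices from 0 to C-1, while the target labels here run from 1 to 13. A minimal sketch of the conversion follows. The `label2id` / `id2label` names follow a common convention and are an assumption here, as is the choice to shift labels by one. The training script may handle this differently.

```python
TARGET_EMOTIONS_13 = {
    1: 'fear', 2: 'anger', 3: 'sadness', 4: 'anxiety', 5: 'confusion',
    6: 'surprise', 7: 'disgust', 8: 'caring', 9: 'joy', 10: 'excitement',
    11: 'gratitude', 12: 'disappointment', 13: 'neutral',
}

# Shift the 1-based target labels to 0-based class indices.
label2id = {name: label - 1 for label, name in TARGET_EMOTIONS_13.items()}
id2label = {class_id: name for name, class_id in label2id.items()}


def to_class_index(target_label):
    """Convert a target label (1-13) to a 0-based class index (0-12)."""
    if target_label not in TARGET_EMOTIONS_13:
        raise ValueError(f"unknown target label: {target_label}")
    return target_label - 1


print(label2id['fear'], label2id['neutral'])  # 0 12
print(id2label[3])                            # anxiety
```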