# 02 - Labeling Intent

This notebook is where we create our **Golden Dataset**: a high-quality, manually labeled sample of customer tweets, annotated by intent. The cells below assign a first-pass label with simple keyword rules and then spot-check the result.

---

### Intent Categories

| Intent Label        | Description                                      |
|---------------------|--------------------------------------------------|
| `cancel_service`    | Customer wants to cancel account or switch plan  |
| `billing_issue`     | Complaints or questions about billing            |
| `technical_issue`   | Problems with device, network, or service        |
| `account_help`      | Issues with account management or login          |
| `upgrade_request`   | Requests to upgrade device or plan               |
| `general_question`  | General questions or inquiries                   |
| `positive_feedback` | Praise or compliments                            |
| `complaint`         | Negative feedback not related to billing or tech |
| `other`             | Anything else or unclear                         |
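
A minimal sketch of holding the label set above in code so that assigned labels can be checked against it. The names `INTENT_LABELS` and `unknown_labels` are illustrative and not part of the notebook. Note that the labeling cells further down use a coarser set of display-style names (for example `"Billing"` and `"Other"`), so a check like this would flag them unless they are first mapped onto the taxonomy.

```python
# Intent taxonomy copied from the table above.
INTENT_LABELS = [
    "cancel_service",
    "billing_issue",
    "technical_issue",
    "account_help",
    "upgrade_request",
    "general_question",
    "positive_feedback",
    "complaint",
    "other",
]


def unknown_labels(labels):
    """Return the sorted list of labels that are not in INTENT_LABELS."""
    return sorted(set(labels) - set(INTENT_LABELS))


# Hypothetical input mixing taxonomy names with display-style names.
example = ["billing_issue", "other", "Billing", "Other"]
print(unknown_labels(example))  # ['Billing', 'Other']
```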

# Labeling and Logic of Categorization

In [4]:
import pandas as pd

# Loads Data from Cleaned CSV
df = pd.read_csv("../data/processed/cleaned_tweets.csv")

# Sample 250 rows to label
golden_df = df.sample(250, random_state=42).copy()

# Creates Categories to put into column later
intent_categories = [
    "Billing",
    "Technical Support",
    "Account Management",
    "Complaint",
    "Praise/Thank You",
    "Other"
]

keywords = {
    "Billing": ["bill", "charge", "payment", "refund"],
    "Technical Support": ["error", "issue", "problem", "disconnect", "slow"],
    "Account Management": ["password", "login", "account", "reset"],
    "Complaint": ["bad", "terrible", "worst", "disappointed", "angry"],
    "Praise/Thank You": ["thank", "great", "love", "awesome", "appreciate"]
}

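def assign_intent(text):
    """Return the first intent whose keyword occurs in text, else "Other".

    Intents are checked in the order they appear in `keywords`, and
    matching is a plain, case-sensitive substring test.
    """
    # Missing or non-string values (e.g. NaN) cannot be matched
    if not isinstance(text, str):
        return "Other"

    for intent, kws in keywords.items():
        for kw in kws:
            if kw in text:
                return intent
    return "Other"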
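
Because `kw in text` is a substring test, a keyword such as `"love"` also matches inside an unrelated word such as `"gloves"`. One hedged alternative, not what this notebook runs, is to require the keyword to start at a word boundary with a regular expression, which still catches inflected forms like `"thanks"` or `"charged"`. The sketch below is self-contained: the `keywords` subset is repeated from the cell above, `compile_patterns` and `assign_intent_wb` are illustrative names, and the example sentences are made up.

```python
import re

# Subset of the notebook's keyword table, repeated so the sketch runs alone.
keywords = {
    "Billing": ["bill", "charge", "payment", "refund"],
    "Praise/Thank You": ["thank", "great", "love", "awesome", "appreciate"],
}


def compile_patterns(keyword_map):
    """Build one regex per intent that matches a keyword at the start of a word."""
    return {
        intent: re.compile(r"\b(?:" + "|".join(map(re.escape, kws)) + r")")
        for intent, kws in keyword_map.items()
    }


patterns = compile_patterns(keywords)


def assign_intent_wb(text):
    """Return the first intent whose pattern matches, else "Other"."""
    if not isinstance(text, str):
        return "Other"
    for intent, pattern in patterns.items():
        if pattern.search(text):
            return intent
    return "Other"


print(assign_intent_wb("i lost my gloves on the flight"))  # Other
print(assign_intent_wb("thanks for the quick reply"))      # Praise/Thank You
print(assign_intent_wb("i was charged twice"))             # Billing
```

The intent order still decides ties, as in the original function: a tweet that mentions both a refund and a thank-you is labeled `Billing` because that intent is checked first.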

# Apply Categorization Function to DataFrame and Write to CSV file

In [15]:
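# Label the full DataFrame so the rules can be spot-checked on any row
df['intent'] = df['cleaned_text'].apply(assign_intent)
print(df[['cleaned_text', 'intent']].sample(10))

# Label the 250-row sample and write it out as the golden dataset
golden_df['intent'] = golden_df['cleaned_text'].apply(assign_intent)
golden_df.to_csv("../data/processed/golden_intent_labeled.csv", index=False)
print("Golden dataset saved")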

                                           cleaned_text             intent
5148   thanks to and for swapping out damaged item f...   Praise/Thank You
6929   hi there please write in your suggestion to a...              Other
347    flt has been sitting ft from gate for min wha...  Technical Support
8935   im always organised this is late for me this ...   Praise/Thank You
1134   hi i need an address to your residential acco...            Billing
8926   if you follow and dm us we can take a look fo...              Other
5827   hello there we are excited that your are thin...              Other
6668   its not an exact match but the nearest possib...              Other
9701   how adorable kevin looks very cozy there kevi...              Other
5267  ok i need yall again can yall renew my month f...              Other
Golden dataset saved


# Quick Evaluation
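
In the ten-row sample printed above, six rows are labeled `Other`, which hints that the keyword rules leave many tweets unclassified. A quick way to quantify this is the share of each label. The sketch below uses those ten printed labels as stand-in data and `collections.Counter` so it runs on its own; `label_shares` is an illustrative name. On the real DataFrame, `df['intent'].value_counts(normalize=True)` should give the same kind of summary.

```python
from collections import Counter

# The ten labels from the sample output above, used as stand-in data.
labels = [
    "Praise/Thank You", "Other", "Technical Support", "Praise/Thank You",
    "Billing", "Other", "Other", "Other", "Other", "Other",
]


def label_shares(values):
    """Return {label: fraction of rows}, most frequent label first."""
    counts = Counter(values)
    total = sum(counts.values())
    return {label: n / total for label, n in counts.most_common()}


shares = label_shares(labels)
for label, share in shares.items():
    print(f"{label:<20} {share:.0%}")
```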

In [23]:
# Show up to three example tweets for each intent
for label in df['intent'].unique():
    print(f"Intent: {label}")
    examples = df.loc[df['intent'] == label, ['cleaned_text']].dropna()
    # Cap the sample size so intents with fewer than 3 rows do not raise
    sample = examples.sample(min(3, len(examples)), random_state=1)
    for i, row in enumerate(sample.itertuples(index=False), 1):
        print(f"\n  {i}. {row.cleaned_text}")

Intent: Account Management

  1.  hey that doesnt sound good did the app crash at any point during the minutes that can cause the adfree time to reset nq

  2.  no you are not i log into my account this morning and its still incorrect

  3.  hey there can you please dm us the phone number associated with your account 
Intent: Technical Support

  1.  that didnt help on either browser i click one song to play but i see dozens from the playlist scroll by then i get that error message

  2.  is there a way to find out what company delivers your packages we have a problem with one driver and it needs sorting

  3.  i did the tmobile tuesday today for the free movie from vudu but it doesnt work i get the following error 
Intent: Other

  1.  i did but im not getting any answers

  2. are stores open tomorrow

  3.  your own courier is shite attempted deliveries that were never attempted as we were home all day yesterday epicfail lazies
Intent: Praise/Thank You

  1.  okay thanks for the upd