## Support Tickets Simulation

| Column Name      | Type     | Description                                              |
| ---------------- | -------- | -------------------------------------------------------- |
| ticket\_id       | UUID     | Unique ID for each support ticket (stored as a string)   |
| customer\_id     | TEXT     | ID of the customer, drawn from `customers.csv`           |
| ticket\_date     | DATETIME | When the customer contacted support                      |
| issue\_type      | TEXT     | One of "App Bug", "Login Issue", "Fraud Report", "Other" |
| resolution\_time | FLOAT    | Time taken to close the ticket, in hours                 |
| resolved         | INT      | 1 = resolved, 0 = unresolved                             |
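
The schema above can be turned into a quick sanity check. The sketch below is one possible way to do it, not part of the original notebook: the function name `check_support_schema` and the sample rows are hypothetical, and the rules are read directly off the table.

```python
import pandas as pd

# Allowed values and column names, taken from the schema table
ISSUE_TYPES = {"App Bug", "Login Issue", "Fraud Report", "Other"}
EXPECTED_COLUMNS = [
    "ticket_id", "customer_id", "ticket_date",
    "issue_type", "resolution_time", "resolved",
]

def check_support_schema(df: pd.DataFrame) -> list:
    """Return a list of schema problems; an empty list means the frame passes."""
    missing = [c for c in EXPECTED_COLUMNS if c not in df.columns]
    if missing:
        # Stop early: the remaining checks need every column to exist
        return [f"missing columns: {missing}"]
    problems = []
    if df["ticket_id"].duplicated().any():
        problems.append("ticket_id values are not unique")
    if not df["issue_type"].isin(ISSUE_TYPES).all():
        problems.append("unexpected issue_type values")
    if (df["resolution_time"] < 0).any():
        problems.append("negative resolution_time")
    if not df["resolved"].isin([0, 1]).all():
        problems.append("resolved must be 0 or 1")
    return problems

# Tiny hand-made frame (illustrative values, not real output)
sample = pd.DataFrame({
    "ticket_id": ["t-001", "t-002"],
    "customer_id": ["c-10", "c-11"],
    "ticket_date": pd.to_datetime(["2024-03-01 09:15:00", "2024-03-02 18:40:00"]),
    "issue_type": ["App Bug", "Fraud Report"],
    "resolution_time": [4.5, 30.25],
    "resolved": [1, 0],
})
print(check_support_schema(sample))
```

Calling the same function on `support_df` after the simulation is a cheap way to confirm that the generated data still matches the table.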


## Code to Simulate Support Tickets
##### Adjust `n_tickets` to generate anywhere from 100k to 300k rows

In [3]:
import pandas as pd
import numpy as np
from faker import Faker
import uuid
import random

faker = Faker()

# Load the customers data (must contain a customer_id column)
df_customers = pd.read_csv("customers.csv")

In [4]:

# Parameters
n_tickets = 300_000
issue_types = ['App Bug', 'Login Issue', 'Fraud Report', 'Other']

# Simulate support tickets
support_df = pd.DataFrame({
    'ticket_id': [str(uuid.uuid4()) for _ in range(n_tickets)],
    # Sampled with replacement, so one customer can have several tickets
    'customer_id': np.random.choice(df_customers['customer_id'], size=n_tickets),
    # Random timestamps from the last 12 months
    'ticket_date': [faker.date_time_between(start_date='-12M', end_date='now') for _ in range(n_tickets)],
    # Weights follow the order of issue_types: 40% / 30% / 20% / 10%
    'issue_type': np.random.choice(issue_types, size=n_tickets, p=[0.4, 0.3, 0.2, 0.1]),
    'resolution_time': np.round(np.random.exponential(scale=12, size=n_tickets), 2),  # Exponential, mean ~12 hrs
    # About 90% of tickets are marked resolved
    'resolved': np.random.choice([1, 0], size=n_tickets, p=[0.9, 0.1])
})

print("Completed ✔")

Completed ✔
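
The cell above draws from unseeded generators, so every run produces different tickets, and the per-row Faker call can be slow at 300k rows. As a hedged alternative, the sketch below draws all timestamps in one vectorized step from a seeded NumPy generator. The helper name `random_ticket_dates` and its parameters are hypothetical, and it approximates "the last 12 months" as 365 days.

```python
import numpy as np
import pandas as pd

def random_ticket_dates(n, end=None, days_back=365, seed=42):
    """Draw n timestamps uniformly from the `days_back` days before `end`."""
    rng = np.random.default_rng(seed)  # seeded, so repeated calls give the same dates
    end = pd.Timestamp.now().floor("s") if end is None else pd.Timestamp(end)
    # Random offsets in whole seconds, measured backwards from `end`
    offsets = rng.integers(0, days_back * 24 * 3600, size=n)
    return pd.Series(end - pd.to_timedelta(offsets, unit="s"), name="ticket_date")

# Fixed end date so the example is fully reproducible
dates = random_ticket_dates(5, end="2024-06-30", seed=0)
print(dates)
```

To use it in the simulation, the `'ticket_date'` entry could be replaced with `random_ticket_dates(n_tickets)`. The other columns would still need `np.random.seed(...)` or their own seeded generator to be reproducible.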


In [5]:
support_df.to_csv("support_tickets.csv", index=False)
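
A CSV file stores `ticket_date` as plain text, so reading the file back with default settings does not restore the datetime dtype. The sketch below shows the difference on a small in-memory example. The sample rows are made up for illustration, and `io.StringIO` stands in for the real file.

```python
import io
import pandas as pd

# Small stand-in for support_df (illustrative values only)
sample = pd.DataFrame({
    "ticket_id": ["a1", "b2"],
    "ticket_date": pd.to_datetime(["2024-01-05 10:30:00", "2024-02-11 16:45:00"]),
    "resolved": [1, 0],
})

# Write to an in-memory buffer instead of a file on disk
buffer = io.StringIO()
sample.to_csv(buffer, index=False)

# Default read: ticket_date comes back as text
buffer.seek(0)
plain = pd.read_csv(buffer)

# parse_dates converts the column back to datetimes while reading
buffer.seek(0)
parsed = pd.read_csv(buffer, parse_dates=["ticket_date"])

print(plain["ticket_date"].dtype)
print(parsed["ticket_date"].dtype)
```

For the real file, the equivalent call would be `pd.read_csv("support_tickets.csv", parse_dates=["ticket_date"])`.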

In [6]:
support_df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 300000 entries, 0 to 299999
Data columns (total 6 columns):
 #   Column           Non-Null Count   Dtype         
---  ------           --------------   -----         
 0   ticket_id        300000 non-null  object        
 1   customer_id      300000 non-null  object        
 2   ticket_date      300000 non-null  datetime64[ns]
 3   issue_type       300000 non-null  object        
 4   resolution_time  300000 non-null  float64       
 5   resolved         300000 non-null  int32         
dtypes: datetime64[ns](1), float64(1), int32(1), object(3)
memory usage: 12.6+ MB
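
The `info()` output reports 300,000 non-null `resolution_time` values, which means the roughly 10% of tickets drawn as unresolved still carry a time to close. Whether that is intended is not stated here. If unresolved tickets should have no resolution time, one possible fix is sketched below. The helper name `blank_unresolved_times` and the demo rows are hypothetical.

```python
import pandas as pd

def blank_unresolved_times(df):
    """Return a copy in which unresolved tickets have no resolution_time."""
    out = df.copy()
    # Series.where keeps values where the condition holds and puts NaN elsewhere
    out["resolution_time"] = out["resolution_time"].where(out["resolved"] == 1)
    return out

# Illustrative rows: the middle ticket is unresolved
demo = pd.DataFrame({
    "resolution_time": [3.5, 20.1, 7.25],
    "resolved": [1, 0, 1],
})
cleaned = blank_unresolved_times(demo)
print(cleaned)
```

Applied as `support_df = blank_unresolved_times(support_df)` before the export, this leaves missing values in the CSV for unresolved tickets, and averages over `resolution_time` then skip them by default.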
