In [2]:
import pandas as pd

# Load the dataset
file_path = "customer_support_tickets.csv"  # Update if your file is in a different location
df = pd.read_csv(file_path)

# Step 1: Convert relevant columns to datetime
df['Date of Purchase'] = pd.to_datetime(df['Date of Purchase'], errors='coerce')
df['First Response Time'] = pd.to_datetime(df['First Response Time'], errors='coerce')
df['Time to Resolution'] = pd.to_datetime(df['Time to Resolution'], errors='coerce')

# Step 2: Clean Ticket Description by removing template placeholders
df['Ticket Description'] = df['Ticket Description'].str.replace(r'\{.*?\}', '', regex=True).str.strip()

# Step 3: Calculate Response Delay and Resolution Time in hours
df['Response Delay (hrs)'] = (df['First Response Time'] - df['Date of Purchase']).dt.total_seconds() / 3600
df['Resolution Time (hrs)'] = (df['Time to Resolution'] - df['First Response Time']).dt.total_seconds() / 3600

# Step 4: Handle missing values
# Fill missing numeric values with median
df['Response Delay (hrs)'].fillna(df['Response Delay (hrs)'].median(), inplace=True)
df['Resolution Time (hrs)'].fillna(df['Resolution Time (hrs)'].median(), inplace=True)

# Fill missing text fields
df['Resolution'].fillna("Unresolved", inplace=True)

# Fill missing ratings with median rating
df['Customer Satisfaction Rating'].fillna(df['Customer Satisfaction Rating'].median(), inplace=True)

# Step 5: Save the cleaned dataset
output_path = "cleaned_customer_support_tickets.csv"
df.to_csv(output_path, index=False)

print("Cleaned dataset saved to:", output_path)


Cleaned dataset saved to: cleaned_customer_support_tickets.csv


The behavior will change in pandas 3.0. This inplace method will never work because the intermediate object on which we are setting values always behaves as a copy.

For example, when doing 'df[col].method(value, inplace=True)', try using 'df.method({col: value}, inplace=True)' or df[col] = df[col].method(value) instead, to perform the operation inplace on the original object.


  df['Response Delay (hrs)'].fillna(df['Response Delay (hrs)'].median(), inplace=True)
The behavior will change in pandas 3.0. This inplace method will never work because the intermediate object on which we are setting values always behaves as a copy.

For example, when doing 'df[col].method(value, inplace=True)', try using 'df.method({col: value}, inplace=True)' or df[col] = df[col].method(value) instead, to perform the operation inplace on the original object.


  df['Resolution Time (hrs)'].fillna(df['Resolution Time (hrs)'].median(), inplace=True)
The behavior will change in pandas 3.0. This inplace method 