Find the rate of processed tickets for each type

In [1]:
import pandas as pd
import numpy as np

In [3]:
facebook_complaints = pd.read_excel("../CSV/facebook_complaints.xlsx", header=1)
facebook_complaints

Unnamed: 0,complaint_id,type,processed
0,0,0,True
1,1,0,True
2,2,0,False
3,3,1,True
4,4,1,True
5,5,1,False


In [4]:
facebook_complaints['processed'] = facebook_complaints['processed'].astype(int)
facebook_complaints

Unnamed: 0,complaint_id,type,processed
0,0,0,1
1,1,0,1
2,2,0,0
3,3,1,1
4,4,1,1
5,5,1,0


In [5]:
grouped = facebook_complaints.groupby(['type']).agg({'processed':'sum','complaint_id':'size'}).reset_index()
grouped

Unnamed: 0,type,processed,complaint_id
0,0,2,3
1,1,2,3


In [6]:
grouped['processed_rate'] = grouped['processed'] / grouped['complaint_id']
grouped

Unnamed: 0,type,processed,complaint_id,processed_rate
0,0,2,3,0.666667
1,1,2,3,0.666667


In [7]:
result = grouped[['type','processed_rate']]
result

Unnamed: 0,type,processed_rate
0,0,0.666667
1,1,0.666667


Solution Walkthrough
This walkthrough goes through the notebook code above step by step and explains how it finds the rate of processed tickets for each complaint type using pandas. The notebook also imports numpy, but the calculation itself does not use it.

Understanding The Data
The code works on a dataframe named facebook_complaints, which holds one row per complaint filed on Facebook. It has three columns: complaint_id, type, and processed. The complaint_id column is a unique identifier for each complaint, the type column holds the complaint type (0 or 1 in the sample data), and the processed column holds a boolean value (True or False) indicating whether the complaint has been processed.
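To experiment without the Excel file, the six rows shown in the notebook output can be rebuilt in memory. This is a minimal sketch: the values are copied from the output above, and the column order is an assumption.

```python
import pandas as pd

# Sample rows copied from the notebook output above.
facebook_complaints = pd.DataFrame(
    {
        "complaint_id": [0, 1, 2, 3, 4, 5],
        "type": [0, 0, 0, 1, 1, 1],
        "processed": [True, True, False, True, True, False],
    }
)

# Inspect the column types; 'processed' starts out as a boolean column.
print(facebook_complaints.dtypes)
```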

The Problem Statement
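The task is to calculate the rate of processed tickets for each type of complaint. For a given type, the rate is the number of processed tickets divided by the total number of tickets of that type. The result is a fraction between 0 and 1 (0.666667 in the sample output), not a percentage; multiply by 100 to express it as one.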
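The arithmetic for a single type can be checked by hand before involving pandas. The sketch below uses the three type 0 rows from the sample output; the variable names are illustrative only.

```python
# Processed flags for the three type 0 complaints in the sample output.
type_0_flags = [True, True, False]

processed_count = sum(type_0_flags)  # True counts as 1, so this is 2
total_count = len(type_0_flags)      # 3 tickets of this type
rate = processed_count / total_count

print(round(rate, 6))  # 0.666667
print(f"{rate:.1%}")   # 66.7%
```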

Breaking Down The Code
Let's analyze the code snippet step by step:

import pandas as pd
import numpy as np

facebook_complaints["processed"] = facebook_complaints[
    "processed"
].astype(int)
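This code snippet imports the pandas and numpy libraries and converts the 'processed' column of the facebook_complaints dataframe from boolean to integer, so that True becomes 1 and False becomes 0. pandas can sum a boolean column directly, so the conversion is not strictly required, but it makes the counting in the next step explicit.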
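As a quick check of how the cast behaves, the sketch below sums a small boolean Series before and after astype(int). The Series is made-up sample data; pandas treats True as 1 when summing, so the totals match either way.

```python
import pandas as pd

# Hypothetical flags mirroring the type 0 rows of the sample data.
flags = pd.Series([True, True, False])

# Summing a boolean Series counts the True values.
bool_total = flags.sum()

# After the cast the sum is unchanged; only the dtype and display differ.
as_int = flags.astype(int)
int_total = as_int.sum()

print(as_int.tolist())  # [1, 1, 0]
```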

grouped = (
    facebook_complaints.groupby(["type"])
    .agg({"processed": "sum", "complaint_id": "size"})
    .reset_index()
)
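Next, the groupby() method groups the facebook_complaints dataframe by the 'type' column. The agg() method then computes two values per group: the sum of the 'processed' column, which is the number of processed tickets, and the size of the 'complaint_id' column, which is the number of rows in the group. Note that 'size' counts every row, including rows with a missing complaint_id, whereas 'count' would skip missing values. The reset_index() call turns 'type' back into a regular column, and the result is stored in the grouped dataframe.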
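One drawback of the dictionary form is that the row count ends up in a column called 'complaint_id', which is easy to misread. A possible alternative, not used in the notebook, is pandas named aggregation, which lets each output column get a descriptive name. The names processed_count and total below are illustrative choices.

```python
import pandas as pd

# Sample rows copied from the notebook output.
facebook_complaints = pd.DataFrame(
    {
        "complaint_id": [0, 1, 2, 3, 4, 5],
        "type": [0, 0, 0, 1, 1, 1],
        "processed": [1, 1, 0, 1, 1, 0],
    }
)

# Named aggregation: output_name=(input_column, aggregation).
grouped = (
    facebook_complaints.groupby("type")
    .agg(
        processed_count=("processed", "sum"),
        total=("complaint_id", "size"),
    )
    .reset_index()
)
print(grouped)
```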

grouped["processed_rate"] = (
    grouped["processed"] / grouped["complaint_id"]
)
After grouping the data, a new column named 'processed_rate' is added to the grouped dataframe. This column represents the rate of processed tickets for each type, calculated by dividing the sum of processed tickets by the total number of tickets for each type.
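Because the processed flag is 0 or 1, dividing its sum by the row count is the same as taking its mean. The sketch below shows this shortcut as an alternative to the notebook's approach, assuming the same sample rows; it is not the method used above.

```python
import pandas as pd

# Sample rows copied from the notebook output.
facebook_complaints = pd.DataFrame(
    {
        "complaint_id": [0, 1, 2, 3, 4, 5],
        "type": [0, 0, 0, 1, 1, 1],
        "processed": [True, True, False, True, True, False],
    }
)

# The mean of a 0/1 (or boolean) column is the fraction of rows that are 1.
rate_by_type = (
    facebook_complaints.groupby("type")["processed"]
    .mean()
    .reset_index(name="processed_rate")
)
print(rate_by_type)
```

The sum and size version keeps the intermediate counts visible, which can be useful when the counts themselves need to be reported.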

result = grouped[["type", "processed_rate"]]
Finally, the result dataframe is created by selecting only the 'type' and 'processed_rate' columns from the grouped dataframe. This dataframe contains the final result, showing the rate of processed tickets for each type of complaint.

Bringing It All Together
The complete code snippet is as follows:

import pandas as pd
import numpy as np

facebook_complaints["processed"] = facebook_complaints[
    "processed"
].astype(int)
grouped = (
    facebook_complaints.groupby(["type"])
    .agg({"processed": "sum", "complaint_id": "size"})
    .reset_index()
)
grouped["processed_rate"] = (
    grouped["processed"] / grouped["complaint_id"]
)
result = grouped[["type", "processed_rate"]]
This code imports the necessary libraries, converts the 'processed' column to integers, groups the data by type, calculates the rates of processed tickets for each type, and stores the result in the 'result' dataframe.
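For reuse, the same steps can be wrapped in a function. This is a sketch under the assumption that the input has the three columns described earlier; the function name processed_rate_by_type is hypothetical. It uses assign() so the caller's dataframe is left unchanged, unlike the in-place cast in the notebook.

```python
import pandas as pd


def processed_rate_by_type(complaints: pd.DataFrame) -> pd.DataFrame:
    """Return one row per complaint type with its processed rate."""
    grouped = (
        # assign() works on a copy, so the input frame is not modified.
        complaints.assign(processed=complaints["processed"].astype(int))
        .groupby("type")
        .agg({"processed": "sum", "complaint_id": "size"})
        .reset_index()
    )
    grouped["processed_rate"] = grouped["processed"] / grouped["complaint_id"]
    return grouped[["type", "processed_rate"]]


# Sample rows copied from the notebook output.
sample = pd.DataFrame(
    {
        "complaint_id": [0, 1, 2, 3, 4, 5],
        "type": [0, 0, 0, 1, 1, 1],
        "processed": [True, True, False, True, True, False],
    }
)
result = processed_rate_by_type(sample)
print(result)
```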

Conclusion
By following these steps, you can calculate the rate of processed tickets for each type of complaint with pandas: cast the processed flag to an integer, sum it and count the rows per type, and divide the two.