DESCRIPTION

Comcast is an American global telecommunications company. The firm has a record of poor customer service and continues to fall short despite repeated promises to improve. In October 2016, the authority fined the company $2.3 million after receiving over 1,000 consumer complaints.
The existing database will serve as a repository of public customer complaints filed against Comcast.
It will help to pin down what is wrong with Comcast's customer service.

Data Dictionary

Ticket #: Ticket number assigned to each complaint
Customer Complaint: Description of complaint
Date: Date of complaint
Time: Time of complaint
Received Via: Mode of communication of the complaint
City: Customer city
State: Customer state
Zipcode: Customer zip
Status: Status of complaint
Filing on Behalf of Someone: Whether the complaint was filed on behalf of someone else (Yes/No)

Analysis Task

To perform these tasks, you can use any of the different Python libraries such as NumPy, SciPy, Pandas, scikit-learn, matplotlib, and BeautifulSoup.

- Import data into Python environment.
- Provide the trend chart for the number of complaints at monthly and daily granularity levels.
- Provide a table with the frequency of complaint types.
  - Which complaint types are maximum, i.e., around internet, network issues, or across any other domains?
- Create a new categorical variable with the values Open and Closed. Open and Pending are to be categorized as Open; Closed and Solved are to be categorized as Closed.
- Provide the state-wise status of complaints in a stacked bar chart. Use the categorical variable created in the previous task. Provide insights on:
  - Which state has the maximum complaints
  - Which state has the highest percentage of unresolved complaints
- Provide the percentage of complaints resolved to date that were received through the Internet and customer care calls.

The analysis results are to be provided with insights wherever applicable.

# Import data into Python environment.

In [1]:
import pandas as pd

In [2]:
# Raw string so the backslashes in the Windows path are not read as escape sequences.
comcast = pd.read_csv(r"D:\SimpliLearn\Python\Data Science with python\Py_comcast\Comcast_telecom_complaints_data.csv")

In [3]:
comcast.head()

Unnamed: 0,Ticket #,Customer Complaint,Date,Date_month_year,Time,Received Via,City,State,Zip code,Status,Filing on Behalf of Someone
0,250635,Comcast Cable Internet Speeds,22-04-15,22-Apr-15,3:53:50 PM,Customer Care Call,Abingdon,Maryland,21009,Closed,No
1,223441,Payment disappear - service got disconnected,04-08-15,04-Aug-15,10:22:56 AM,Internet,Acworth,Georgia,30102,Closed,No
2,242732,Speed and Service,18-04-15,18-Apr-15,9:55:47 AM,Internet,Acworth,Georgia,30101,Closed,Yes
3,277946,Comcast Imposed a New Usage Cap of 300GB that ...,05-07-15,05-Jul-15,11:59:35 AM,Internet,Acworth,Georgia,30101,Open,Yes
4,307175,Comcast not working and no service to boot,26-05-15,26-May-15,1:25:26 PM,Internet,Acworth,Georgia,30101,Solved,No


In [4]:
comcast.isnull().sum()

Ticket #                       0
Customer Complaint             0
Date                           0
Date_month_year                0
Time                           0
Received Via                   0
City                           0
State                          0
Zip code                       0
Status                         0
Filing on Behalf of Someone    0
dtype: int64

In [5]:
comcast.shape

(2224, 11)

# Provide the trend chart for the number of complaints at monthly and daily granularity levels.

In [321]:
comcast.groupby(["Date","Customer Complaint"]).sum()

Unnamed: 0_level_0,Unnamed: 1_level_0,Zip code
Date,Customer Complaint,Unnamed: 2_level_1
04-01-15,Comcast,15642
04-01-15,Comcast Cable,37067
04-01-15,Comcast Customer Service; Theft; Inconsistency,19121
04-01-15,Comcast Lied About Pricing And Installation,94560
04-01-15,Comcast harassment,60193
...,...,...
31-05-15,Complaint against Comcast for incredibly bad service,98372
31-05-15,Hidden Product Installation Fee,17011
31-05-15,Poor Service from Comcast,30096
31-05-15,Questionable internet slowdown,1960
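
The grouped table above lists complaints per date and title, which is not yet a trend. A minimal sketch of one way to get daily and monthly counts follows, assuming the Date column is day-month-year (the Date_month_year column suggests so). The five dates are copied from the comcast.head() output; the names sample, daily_counts, monthly_counts, and plot_trends are illustrative and not part of the notebook.

```python
import pandas as pd

# Five rows copied from the comcast.head() output; the real frame has 2224 rows.
sample = pd.DataFrame({
    "Ticket #": [250635, 223441, 242732, 277946, 307175],
    "Date": ["22-04-15", "04-08-15", "18-04-15", "05-07-15", "26-05-15"],
})

# Parse day-month-year strings into real timestamps.
sample["Date"] = pd.to_datetime(sample["Date"], format="%d-%m-%y")

# Number of complaints per calendar day and per calendar month.
daily_counts = sample.groupby("Date").size()
monthly_counts = sample.groupby(sample["Date"].dt.to_period("M")).size()


def plot_trends(daily, monthly):
    """Draw the daily trend as a line and the monthly trend as bars."""
    # Imported here so the counting code above runs without matplotlib.
    import matplotlib.pyplot as plt

    fig, (ax_daily, ax_monthly) = plt.subplots(2, 1, figsize=(10, 8))
    daily.plot(ax=ax_daily, title="Complaints per day")
    monthly.plot(ax=ax_monthly, kind="bar", title="Complaints per month")
    fig.tight_layout()
    return fig
```

On the full data, the same two groupby calls would apply to comcast after converting comcast["Date"], and plot_trends(daily_counts, monthly_counts) would draw both charts.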



# Which complaint types are maximum i.e., around internet, network issues, or across any other domains.

In [322]:
comcast["Received Via"].value_counts()

Customer Care Call    1119
Internet              1105
Name: Received Via, dtype: int64

# Provide a table with the frequency of complaint types.

In [323]:
comcast["Status"].value_counts()

Solved     973
Closed     734
Open       363
Pending    154
Name: Status, dtype: int64
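
The Status counts describe how far each complaint has progressed, not what it is about (they do sum to 973 + 734 + 363 + 154 = 2224, matching the row count). One possible way to build a complaint-type frequency table is to tag each Customer Complaint title by keyword. The keyword lists and the names KEYWORDS and complaint_type below are assumptions chosen for illustration; the five titles are copied from the comcast.head() output, including the truncated one.

```python
import pandas as pd

# Hypothetical keyword lists; adjust them after reading the real titles.
KEYWORDS = {
    "Internet": ("internet", "speed"),
    "Billing": ("bill", "payment", "charge", "fee", "pricing"),
    "Data cap": ("data cap", "usage cap"),
}


def complaint_type(title):
    """Return the first category whose keyword appears in the title."""
    lowered = title.lower()
    for label, words in KEYWORDS.items():
        if any(word in lowered for word in words):
            return label
    return "Other"


# Titles copied from the comcast.head() output.
titles = pd.Series(
    [
        "Comcast Cable Internet Speeds",
        "Payment disappear - service got disconnected",
        "Speed and Service",
        "Comcast Imposed a New Usage Cap of 300GB that ...",
        "Comcast not working and no service to boot",
    ],
    name="Customer Complaint",
)

# Frequency table of the derived complaint types.
type_counts = titles.map(complaint_type).value_counts()
```

A title matches the first category in dictionary order, so the order of KEYWORDS decides ties; a large "Other" bucket on the full data would be a sign that the keyword lists need extending.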

# Create a new categorical variable with value as Open and Closed. Open & Pending is to be categorized as Open and Closed & Solved is to be categorized as Closed.

In [324]:
comcast["New_Status"] = comcast["Status"]

In [325]:
comcast.loc[comcast.Status.isin(["Open","Pending"]),"New_Status"] = "open"

In [326]:
comcast.loc[comcast.Status.isin(["Closed","Solved"]),"New_Status"] = "close"

In [327]:
comcast.head()

Unnamed: 0,Ticket #,Customer Complaint,Date,Date_month_year,Time,Received Via,City,State,Zip code,Status,Filing on Behalf of Someone,New_Status
0,250635,Comcast Cable Internet Speeds,22-04-15,22-Apr-15,3:53:50 PM,Customer Care Call,Abingdon,Maryland,21009,Closed,No,close
1,223441,Payment disappear - service got disconnected,04-08-15,04-Aug-15,10:22:56 AM,Internet,Acworth,Georgia,30102,Closed,No,close
2,242732,Speed and Service,18-04-15,18-Apr-15,9:55:47 AM,Internet,Acworth,Georgia,30101,Closed,Yes,close
3,277946,Comcast Imposed a New Usage Cap of 300GB that ...,05-07-15,05-Jul-15,11:59:35 AM,Internet,Acworth,Georgia,30101,Open,Yes,open
4,307175,Comcast not working and no service to boot,26-05-15,26-May-15,1:25:26 PM,Internet,Acworth,Georgia,30101,Solved,No,close
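
The same recoding can be written as a single mapping, sketched here on the five Status values from the comcast.head() output. STATUS_MAP and new_status are illustrative names; the lowercase labels follow the ones the notebook assigns rather than the Open/Closed spelling in the task.

```python
import pandas as pd

# Status values copied from the comcast.head() output.
status = pd.Series(["Closed", "Closed", "Closed", "Open", "Solved"], name="Status")

# Each original status points to its two-level label.
STATUS_MAP = {"Open": "open", "Pending": "open", "Closed": "close", "Solved": "close"}

# map() returns NaN for any status missing from the dictionary,
# which makes unexpected values easy to spot.
new_status = status.map(STATUS_MAP).astype("category")
```

Given the Status counts reported earlier, the full data should contain 363 + 154 = 517 open and 973 + 734 = 1707 close complaints.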


# Provide state wise status of complaints in a stacked bar chart. Use the categorized variable from Q3. Provide insights on:

In [328]:
comcast.groupby(["State","New_Status"])["Zip code"].count()

State          New_Status
Alabama        close         17
               open           9
Arizona        close         14
               open           6
Arkansas       close          6
                             ..
Virginia       open          11
Washington     close         75
               open          23
West Virginia  close          8
               open           3
Name: Zip code, Length: 77, dtype: int64
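
The grouped counts above hold the numbers a stacked bar chart needs, but in long form. A possible sketch of the pivot and the plot follows, using the State and New_Status values from the second comcast.head() output; state_status and plot_state_status are hypothetical names.

```python
import pandas as pd

# State and New_Status values copied from the second comcast.head() output.
sample = pd.DataFrame({
    "State": ["Maryland", "Georgia", "Georgia", "Georgia", "Georgia"],
    "New_Status": ["close", "close", "close", "open", "close"],
})

# One row per state, one column per status; missing combinations become 0.
state_status = pd.crosstab(sample["State"], sample["New_Status"])


def plot_state_status(table):
    """Draw one stacked bar per state, split into close and open."""
    # Imported here so the table above can be built without matplotlib.
    import matplotlib.pyplot as plt

    ax = table.plot(kind="bar", stacked=True, figsize=(12, 6))
    ax.set_xlabel("State")
    ax.set_ylabel("Number of complaints")
    ax.set_title("Complaints by state and status")
    plt.tight_layout()
    return ax
```

On the full frame, pd.crosstab(comcast["State"], comcast["New_Status"]) would give the wide table, and plot_state_status would chart it.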

# Which state has the maximum complaints  

In [329]:
comcast.State[comcast.New_Status == "open"].value_counts() 

Georgia                 80
California              61
Tennessee               47
Florida                 39
Illinois                29
Michigan                23
Washington              23
Texas                   22
Colorado                22
Pennsylvania            20
New Jersey              19
Mississippi             16
Maryland                15
Oregon                  13
Virginia                11
Massachusetts           11
Alabama                  9
Indiana                  9
Arizona                  6
Utah                     6
New Mexico               4
Delaware                 4
Minnesota                4
New Hampshire            4
South Carolina           3
Connecticut              3
Kentucky                 3
West Virginia            3
District Of Columbia     2
Maine                    2
Kansas                   1
Vermont                  1
Louisiana                1
Missouri                 1
Name: State, dtype: int64
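
The counts above cover open complaints only. Which state has the most complaints overall depends on the count across all statuses, which can differ from the open-only ranking. A small sketch of the distinction, again on the five rows from the second comcast.head() output (total_by_state, open_by_state, and top_state are illustrative names):

```python
import pandas as pd

# State and New_Status values copied from the second comcast.head() output.
sample = pd.DataFrame({
    "State": ["Maryland", "Georgia", "Georgia", "Georgia", "Georgia"],
    "New_Status": ["close", "close", "close", "open", "close"],
})

# All complaints per state, regardless of status.
total_by_state = sample["State"].value_counts()

# Open complaints per state, as in the cell above.
open_by_state = sample.loc[sample["New_Status"] == "open", "State"].value_counts()

# State with the largest overall count in this sample.
top_state = total_by_state.idxmax()
```

The result for five rows says nothing about the full data; the same calls on comcast would be needed to name the state with the most complaints.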

# Which state has the highest percentage of unresolved complaints

In [330]:
len(comcast.New_Status)

2224

In [9]:
len(comcast["Customer Complaint"])

2224

In [336]:
(comcast.State[comcast.New_Status == "open"].value_counts()/len(comcast.New_Status))*100

Georgia                 3.597122
California              2.742806
Tennessee               2.113309
Florida                 1.753597
Illinois                1.303957
Michigan                1.034173
Washington              1.034173
Texas                   0.989209
Colorado                0.989209
Pennsylvania            0.899281
New Jersey              0.854317
Mississippi             0.719424
Maryland                0.674460
Oregon                  0.584532
Virginia                0.494604
Massachusetts           0.494604
Alabama                 0.404676
Indiana                 0.404676
Arizona                 0.269784
Utah                    0.269784
New Mexico              0.179856
Delaware                0.179856
Minnesota               0.179856
New Hampshire           0.179856
South Carolina          0.134892
Connecticut             0.134892
Kentucky                0.134892
West Virginia           0.134892
District Of Columbia    0.089928
Maine                   0.089928
Kansas    
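
The percentages above divide each state's open count by the total number of complaints (for example, Georgia: 80 / 2224 = 3.597%), so they give each state's share of all complaints. The percentage of unresolved complaints within a state divides by that state's own total instead. A sketch under that reading, using only the four states whose close and open counts both appear in the groupby output (state_status, unresolved_pct, and worst_state are illustrative names):

```python
import pandas as pd

# Counts copied from the groupby(["State", "New_Status"]) output;
# only these four states show both rows there.
state_status = pd.DataFrame(
    {"close": [17, 14, 75, 8], "open": [9, 6, 23, 3]},
    index=["Alabama", "Arizona", "Washington", "West Virginia"],
)

# Open complaints as a percentage of each state's own total.
unresolved_pct = state_status["open"] / state_status.sum(axis=1) * 100

# Highest unresolved percentage among these four states only.
worst_state = unresolved_pct.idxmax()
```

Among these four states the highest rate is Alabama's, at 9 of 26 complaints (about 34.6%). The ranking over all states requires the complete table and may well differ.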

# Provide the percentage of complaints resolved till date, which were received through the Internet and customer care calls.

In [10]:
# Bracket notation is required because the column name contains a space;
# the resolved label is "close", as assigned to New_Status earlier.
(comcast["Received Via"][comcast.New_Status == "close"].value_counts() / len(comcast.New_Status)) * 100
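
From the Status counts reported earlier, 973 + 734 = 1707 of the 2224 complaints are resolved, about 76.75% overall; the split by channel is not shown in the outputs. A minimal sketch of two ways to compute that split follows, on the five rows from the second comcast.head() output. The names resolved_pct and rate_by_channel are illustrative, and which of the two readings the task intends is an assumption.

```python
import pandas as pd

# Received Via and New_Status values copied from the second comcast.head() output.
sample = pd.DataFrame({
    "Received Via": ["Customer Care Call", "Internet", "Internet", "Internet", "Internet"],
    "New_Status": ["close", "close", "close", "open", "close"],
})

is_resolved = sample["New_Status"] == "close"

# Reading 1: resolved complaints per channel as a percentage of all complaints.
resolved_pct = sample.loc[is_resolved, "Received Via"].value_counts() / len(sample) * 100

# Reading 2: percentage of each channel's own complaints that are resolved.
rate_by_channel = is_resolved.groupby(sample["Received Via"]).mean() * 100
```

The first reading sums to the overall resolved percentage across channels; the second compares how well each channel's complaints get resolved.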