# Customer complaints analysis

In [3]:
# Importing the Python libraries used in the analysis


import pandas as pd
import spacy
import gensim
import pyLDAvis
import numpy
import matplotlib
from cleantext import clean
!python -m spacy download en_core_web_sm -qq

[+] Download and installation successful
You can now load the package via spacy.load('en_core_web_sm')


### Importing the customer complaints CSV file

In [4]:
customer_complaints = pd.read_csv("comcast_consumeraffairs_complaints.csv")

In [5]:
# Displaying the first five rows of the imported CSV file

customer_complaints.head()

| Unnamed: 0 | author | posted_on | rating | text |
|---|---|---|---|---|
| 0 | Alantae of Chesterfeild, MI | Nov. 22, 2016 | 1 | I used to love Comcast. Until all these consta... |
| 1 | Vera of Philadelphia, PA | Nov. 19, 2016 | 1 | I'm so over Comcast! The worst internet provid... |
| 2 | Sarah of Rancho Cordova, CA | Nov. 17, 2016 | 1 | If I could give them a negative star or no sta... |
| 3 | Dennis of Manchester, NH | Nov. 16, 2016 | 1 | I've had the worst experiences so far since in... |
| 4 | Ryan of Bellevue, WA | Nov. 14, 2016 | 1 | Check your contract when you sign up for Comca... |


In [6]:
# Keeping only the rows whose 'text' column holds a complaint (rows with a null 'text' are dropped)

customer_complaints = customer_complaints[pd.notnull(customer_complaints['text'])]


# Converting the 'text' column (containing text for customer complaints) to a list 

complaints_text_list = customer_complaints.text.values.tolist()


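# Printing each customer's complaint, separated by blank lines

for individual_complaint in complaints_text_list:

    # clean() returns a new string, so its result has to be kept; discarding it is why the emojis were not removed.
    # lower=False keeps the original casing of the complaint.
    individual_complaint = clean(str(individual_complaint), no_emoji=True, lower=False)
    print(individual_complaint)
    print(" ")
    print(" ")
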

I used to love Comcast. Until all these constant updates. My internet and cable crash a lot at night, and sometimes during the day, some channels don't even work and on demand sometimes don't play either. I wish they will do something about it. Because just a few mins ago, the internet have crashed for about 20 mins for no reason. I'm tired of it and thinking about switching to Wow or something. Please do not get Xfinity.
 
 
I'm so over Comcast! The worst internet provider. I'm taking online classes and multiple times was late with my assignments because of the power interruptions in my area that lead to poor quality internet service. Definitely switching to Verizon. I'd rather pay $10 extra then dealing w/ Comcast and non stopping internet problems.
 
 
If I could give them a negative star or no stars on this review I would. I have never worked with any industry with as bad of customer service as Comcast. It is not a matter of money because I make well enough above and beyond to affo

 
 
In May 2015 I agreed to upgrade to the X1 platform at the rate of 159.49, with no contract. Since the upgrade my bill has never been the correct amount. It has gone up. Also since installation the service has never fully worked. It will freeze up and black out. Some of these have been due to Comcast working in the area and others there has been no explanation. I have had to switch my modem out once already which has not fixed the problem of it not assigning the correct IP. I have tried several times to talk to someone using the 800 Comcast number but get bounced around departments or told that there is nothing they can do. I am at the point to disconnect all my services and find another option due to the utter lack of ownership of the problems I have had. I tried to have a technician come out and check the original install of the X1 and was told that they cannot schedule techs during an outage and that I would have to call back later. This is ridiculous. I would like to speak to s




Comcast offered a great deal to switch my TV programming from DirecTV to Comcast Digital Cable. When they signed me up, we chose Digital Starter pack with 40 HD channels that included CNN HD and PLD HD, among others. Little that we realize that their service and equipment would be extremely pathetic. They set us up with an HD box with DVR that would literally hang at least 4-5 times a day and would need cold boot every time. Today, 6/29/09, we decided to lug the bulky box to the local Comcast service station in Fremont and got a replacement. When we set up the replacement box today, we no longer received the HD programming line-up that was promised to us as part of the Digital Starter pack (which we locked for one year through a contract, by the way). When I got on the phone and waited for 20 minutes, I gave up and got on a chat session instead. The chat session support person named Ronald told us that I will have to upgrade my package in order to get the HD programs that I until a cou
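
Two things stand out in the cell above: `clean()` returns a new string rather than changing its argument, so the result has to be kept, and printing every complaint in full produces a very large cell output. The sketch below illustrates both points using only the standard library. `strip_emoji`, `preview_complaints`, and `EMOJI_PATTERN` are hypothetical names introduced here, and the regex is only a rough stand-in for cleantext's `no_emoji=True`, not its actual implementation.

```python
import re
import textwrap

# Hypothetical, simplified emoji pattern covering a few common Unicode blocks;
# it is an assumption for illustration, not a complete emoji definition.
EMOJI_PATTERN = re.compile("[\U0001F300-\U0001FAFF\u2600-\u27BF\uFE0F]")


def strip_emoji(text):
    """Return a copy of `text` with characters matching EMOJI_PATTERN removed."""
    return EMOJI_PATTERN.sub("", text)


def preview_complaints(texts, limit=3, width=80):
    """Return at most `limit` cleaned complaints, each shortened to `width` characters."""
    previews = []
    for text in texts[:limit]:
        cleaned = strip_emoji(str(text))   # keep the returned string; the input is unchanged
        previews.append(textwrap.shorten(cleaned, width=width, placeholder="..."))
    return previews


sample_complaints = [
    "I used to love Comcast. 😡 Until all these constant updates.",
    "Please do not get Xfinity.",
]

for line in preview_complaints(sample_complaints):
    print(line)
```

Previewing a handful of shortened complaints keeps the output small; the full text stays available in `complaints_text_list` for later processing.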

### Preprocessing complaints text

In [None]:
# Loading nlp pipeline
nlp = spacy.load('en_core_web_sm')

# Iterating through complaints_text_list; enumerate supplies the complaint number, starting at 1

for index, individual_complaint in enumerate(complaints_text_list, start=1):

    individual_complaint = individual_complaint.lower()   # To convert all text to lowercase

    document = nlp(individual_complaint)   # Running the complaint through the nlp pipeline to tokenize it

    doc_no_punct = [token for token in document if not token.is_punct]   # Tokens of 'document' without any punctuation
    doc_no_stopwords = [token for token in doc_no_punct if not token.is_stop]   # Tokens of 'doc_no_punct' without any stopwords
    lemmatized_doc = [token.lemma_ for token in doc_no_stopwords]   # Lemmas for text like walking, resetting etc.; collected here but not printed below

    print("Tokens_of_complaint_", index, ":    ", [token.text for token in doc_no_stopwords])   # Printing the tokens of each complaint as a list
    print(" ")
    print(" ")


Tokens_of_complaint_ 1 :     ['love', 'comcast', 'constant', 'updates', 'internet', 'cable', 'crash', 'lot', 'night', 'day', 'channels', 'work', 'demand', 'play', 'wish', 'mins', 'ago', 'internet', 'crashed', '20', 'mins', 'reason', 'tired', 'thinking', 'switching', 'wow', 'xfinity']
 
 
Tokens_of_complaint_ 2 :     ['comcast', 'worst', 'internet', 'provider', 'taking', 'online', 'classes', 'multiple', 'times', 'late', 'assignments', 'power', 'interruptions', 'area', 'lead', 'poor', 'quality', 'internet', 'service', 'definitely', 'switching', 'verizon', 'pay', '$', '10', 'extra', 'dealing', 'w/', 'comcast', 'non', 'stopping', 'internet', 'problems']
 
 
Tokens_of_complaint_ 3 :     ['negative', 'star', 'stars', 'review', 'worked', 'industry', 'bad', 'customer', 'service', 'comcast', 'matter', 'money', 'afford', 'services', 'legitimate', 'ripoff', 'think', 'biggest', 'scam', 'mortgage', 'industry', 'major', 'meltdown', 'hope', 'comcast', 'exist', 'disregard', 'want', 'help', 'right', 't
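
The filtering steps used in this section (lowercasing, dropping punctuation, dropping stop words) can be sketched without spaCy as follows. This is a simplified stand-in, not the spaCy pipeline: `filter_tokens` and the tiny `STOP_WORDS` set are assumptions made for illustration, and a regex replaces spaCy's tokenizer. `enumerate(..., start=1)` numbers the complaints without a manually incremented index.

```python
import re

# Tiny hypothetical stop-word set for illustration only;
# spaCy's English stop-word list is much larger.
STOP_WORDS = {"i", "to", "the", "and", "a", "my", "at", "all", "these", "until", "used"}


def filter_tokens(text):
    """Lowercase `text`, split it into word-like tokens, and drop stop words."""
    # Punctuation never matches the pattern, so it is dropped during tokenizing.
    tokens = re.findall(r"[a-z0-9]+(?:'[a-z]+)?", text.lower())
    return [token for token in tokens if token not in STOP_WORDS]


complaints = ["I used to love Comcast. Until all these constant updates."]

for number, complaint in enumerate(complaints, start=1):
    print("Tokens_of_complaint_", number, ":    ", filter_tokens(complaint))
```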