# *Análise de sentimentos de marcas de carros*
Será realizado uma análise de sentimentos nas avalicoes de 3 marcas comumente encontradas no Brasil, neste estudo chamadas de **Marca A, Marca B e Marca C**. 


 Utilizado a biblioteca **TextBlob** para o processamento de dados textuais. Que segundo a documentacao fornece uma API consistente para executar tarefas de processamento de linguagem natural comum (NLP), tais como marcação de parte da fala, extração de substantivos, análise de sentimentos, e muito mais.

É importante ressaltar que o TextBlob só fornece suporte para a língua inglesa.


#### **O que é análise de sentimentos**

A análise dos sentimentos é o processo de determinar a atitude ou a emoção de um escritor em um texto, ou seja, se é positivo ou negativo ou neutro. 

1. **Rótulos de sentimento:** Cada palavra em um corpus é rotulada em termos de polaridade e subjetividade. O sentimento de um corpus é a média destes.

   * *Polaridade*: Quão positiva ou negativa é uma palavra. -1 é muito negativa. +1 é muito positivo.
   * *Subjetividade*: Quão subjetiva, ou opinativa, é uma palavra. retorna um valor dentro da faixa [0,0, 1,0] onde 0,0 é muito objetivo e 1,0 é muito subjetivo.



In [29]:
import pandas as pd
import pickle

# Alterando configuracao de tamanho das colunas e linhas
pd.set_option('display.max_columns', 200)
pd.set_option('display.max_colwidth', 200)

from textblob import TextBlob
import cufflinks as cf


In [17]:
sentimento_df = pd.read_pickle('webinar_marca_com_ano.pkl')
sentimento_df.head()

Unnamed: 0,Rating,review,year,marca,date,review_year,month,day,week_day_name
0,5,2006 Mustang GT Doesn’t disappoint,2006,Marca_A,2018-06-06,2018,6,6,Wednesday
1,3,"DREAM CAR I bought mine 4/17 with 98K. Have been wanting a V6 5-sp, '05-'09 vintage for years. The engine is fine. Sounds good. Great mileage. Good power. I pride myself on smooth take-off and gea...",2006,Marca_A,2017-08-12,2017,8,12,Saturday
2,5,Great Ride There will always be a 05-09 mustang for sale and their fairly reasonable. Purchased mine as second car and I believe it was a great investment,2006,Marca_A,2017-06-15,2017,6,15,Thursday
3,5,"I have wanted a Mustang for 40 years. I bought my car from an auction I work at ( Adesa Sacramento ) and I love it!! It is a v6 with an air aid cold air injector, throttle body spacer and a Flowma...",2006,Marca_A,2017-05-18,2017,5,18,Thursday
4,5,One owner I bought this car spankin new and i still am In love with this car. This car hugs the road and does whatever you ask at a moments notice. The only thing I have had to fix is the alter...,2006,Marca_A,2016-01-03,2016,1,3,Sunday


In [18]:
# Para investigar a polaridade e subjetividade da revisão usando TextBlob
# Utilizei uma função lambda para criar a polaridade e subjetividade de cada revisão

rev_polaridade     = []
rev_subjetividade = []

def review_polaridade_subjetividade(x):
    review_analise = TextBlob(x)
    rev_polaridade.append(round(review_analise.polarity, 2))
    rev_subjetividade.append(round(review_analise.subjectivity, 2))
    
_ = sentimento_df.review.apply(lambda x: review_polaridade_subjetividade(x))
del _

# Creating 
sentimento_df['review_polaridade']     = rev_polaridade
sentimento_df['review_subjetividade'] = rev_subjetividade


In [19]:
# Apenas testando o TextBlob
TextBlob("Great Ride There will always be for sale and their fairly reasonable. Purchased mine as second car and I believe it was a great investment").sentiment

Sentiment(polarity=0.45, subjectivity=0.525)

In [21]:
sentimento_df.tail()

Unnamed: 0,Rating,review,year,marca,date,review_year,month,day,week_day_name,review_polaridade,review_subjetividade
3295,4,Greatest Car in the Whole Darn Universe! I bought my Beetle Baby used!! I loved it from day 1 that I first saw it. The color is bright green with a sunroof. After I had it for about 3 months I ...,2000,Marca_C,2008-05-11,2008,5,11,Sunday,0.34,0.5
3296,3,Pain in the back I have had more trouble with this car than I care to admit. I got more frustrated because I like the way the car drives but I have had to replace so many parts in this car from a...,2000,Marca_C,2008-04-12,2008,4,12,Saturday,0.01,0.41
3297,2,Too Bad This was a fun car. But its low level of quality really stinks. Something was always broken. The bulbs for the headlight are impossible to change if the plastic mounting/locking device for...,2000,Marca_C,2008-03-13,2008,3,13,Thursday,-0.15,0.51
3298,4,"Good value, cool snug bug, but.. Nimble handling, chic style and VW's engineering has not let us down in the 2 years we've driven it. Warning! Replacement of body panels may be extremely delayed....",2000,Marca_C,2007-07-09,2007,7,9,Monday,0.18,0.57
3299,2,"BAD CAR! Do not buy a VW. I cannot wait to get rid of this lemon. I bought this brand new back in 2000 and have had nothing but problems. Although they are not too major, they are annoying and cau...",2000,Marca_C,2007-07-02,2007,7,2,Monday,-0.14,0.45


In [22]:
# Neutro/Positivo/Negativo 
def polaridade_status(x):

    if x == 0:
        return 'Neutro'
    elif x > 0.00:
        return 'Positivo'
    elif x < 0.00:
        return 'Negativo'

# Muito Objetivo / Objetivo / Subjetivo / Muito Subjetivo
def subjetividade_status(x):
    if x == 0:
        return 'Muito Objetivo'
    elif x > 0.00 and x < 0.40:
        return 'Objetivo'
    elif x >= 0.40 and x < 0.70:
        return 'Subjetivo'
    elif x >= 0.70:
        return 'Muito Subjetivo'

# Colocando a polaridade e a subjetividade no DataFrame
sentimento_df['polaridade_status'] = sentimento_df.review_polaridade.apply(lambda x: polaridade_status(x))
sentimento_df['subjetividade_status'] = sentimento_df.review_subjetividade.apply(lambda x: subjetividade_status(x))

In [23]:
# Verificação do Datafram com o status de polaridade e subjetividade
sentimento_df.head()

Unnamed: 0,Rating,review,year,marca,date,review_year,month,day,week_day_name,review_polaridade,review_subjetividade,polaridade_status,subjetividade_status
0,5,2006 Mustang GT Doesn’t disappoint,2006,Marca_A,2018-06-06,2018,6,6,Wednesday,0.0,0.0,Neutro,Muito Objetivo
1,3,"DREAM CAR I bought mine 4/17 with 98K. Have been wanting a V6 5-sp, '05-'09 vintage for years. The engine is fine. Sounds good. Great mileage. Good power. I pride myself on smooth take-off and gea...",2006,Marca_A,2017-08-12,2017,8,12,Saturday,0.31,0.61,Positivo,Subjetivo
2,5,Great Ride There will always be a 05-09 mustang for sale and their fairly reasonable. Purchased mine as second car and I believe it was a great investment,2006,Marca_A,2017-06-15,2017,6,15,Thursday,0.45,0.53,Positivo,Subjetivo
3,5,"I have wanted a Mustang for 40 years. I bought my car from an auction I work at ( Adesa Sacramento ) and I love it!! It is a v6 with an air aid cold air injector, throttle body spacer and a Flowma...",2006,Marca_A,2017-05-18,2017,5,18,Thursday,-0.25,0.69,Negativo,Subjetivo
4,5,One owner I bought this car spankin new and i still am In love with this car. This car hugs the road and does whatever you ask at a moments notice. The only thing I have had to fix is the alter...,2006,Marca_A,2016-01-03,2016,1,3,Sunday,0.43,0.66,Positivo,Subjetivo


In [24]:
sentimento_df.loc[(sentimento_df.polaridade_status == 'Negativo')]

Unnamed: 0,Rating,review,year,marca,date,review_year,month,day,week_day_name,review_polaridade,review_subjetividade,polaridade_status,subjetividade_status
3,5,"I have wanted a Mustang for 40 years. I bought my car from an auction I work at ( Adesa Sacramento ) and I love it!! It is a v6 with an air aid cold air injector, throttle body spacer and a Flowma...",2006,Marca_A,2017-05-18,2017,5,18,Thursday,-0.25,0.69,Negativo,Subjetivo
5,3,"Poor Lots of problems with Ford these days, sensors issues, cam phasers, and got solenoid problems.",2006,Marca_A,2015-10-24,2015,10,24,Saturday,-0.40,0.60,Negativo,Subjetivo
18,4,So so.... I was exceptionally impressed with the down and dirty V6. The Chevy Monte Carlo LT and SS Intimidator have nothing on the 4.0 stock!! I bought an 06 5-speed Pony Edition before I went t...,2006,Marca_A,2010-03-01,2010,3,1,Monday,-0.01,0.47,Negativo,Subjetivo
20,3,"Worst Mustang ever, Don't buy Ford !!! 09/23/06 I financed a '06 Mustang GT, my 1st new car. Previously my family and I have had about 10 Fords. after 6 mos. the car stalled out, I didnt think any...",2006,Marca_A,2010-01-18,2010,1,18,Monday,-0.17,0.54,Negativo,Subjetivo
29,4,"Like this car a lot I currently have 31,000 miles on this car and it has never been back to the dealer. I have been doing oil changes and routine maintenance. The only complaint I have is the Road...",2006,Marca_A,2009-07-11,2009,7,11,Saturday,-0.05,0.44,Negativo,Subjetivo
...,...,...,...,...,...,...,...,...,...,...,...,...,...
3286,2,Dangerous to drive The first few months headlights went out on my way home from work in the dark. Both went out at the same time. Happened again a year later. No one enjoys replacing the headlight...,2000,Marca_C,2008-11-22,2008,11,22,Saturday,-0.15,0.46,Negativo,Subjetivo
3287,2,"I hate Volkswagen! This car is my nightmare. I hate it so much. I have it since January and I already spent $3,000 fixing it! I'm so tired of all that happened with it. And I'm broke obviously. Do...",2000,Marca_C,2008-11-12,2008,11,12,Wednesday,-0.08,0.54,Negativo,Subjetivo
3292,2,"A BIG lemon! My parents bought my bug for me when I turned 16. I was so excited when I got it! But soon I started to hate it! So many things have gone wrong such as I have had to get a new clutch,...",2000,Marca_C,2008-07-26,2008,7,26,Saturday,-0.01,0.55,Negativo,Subjetivo
3297,2,Too Bad This was a fun car. But its low level of quality really stinks. Something was always broken. The bulbs for the headlight are impossible to change if the plastic mounting/locking device for...,2000,Marca_C,2008-03-13,2008,3,13,Thursday,-0.15,0.51,Negativo,Subjetivo


# Mudando o status de polaridade com one hot encoding

|  Negativo |

|  Positivo  |

|  Neutro    |



Negativo | Positivo | Neutro

    1      |    1   |    1


In [25]:
# mudar a polaridade e o estado de sujeira para valores numéricos
neutro_lista  = []
positivo_lista = []
negativo_lista= []

def polaridade_status(x):
    if x == 0.00:
        neutro_lista.append(1)
        positivo_lista.append(0)
        negativo_lista.append(0)
    elif x > 0.00:
        positivo_lista.append(1)
        neutro_lista.append(0)
        negativo_lista.append(0)
    elif x < 0.00:
        positivo_lista.append(0)
        neutro_lista.append(0)
        negativo_lista.append(1)
    
_ = sentimento_df.review_polaridade.apply(lambda x: polaridade_status(x))
del _

sentimento_df['neutro_review']  = neutro_lista
sentimento_df['positivo_review'] = positivo_lista
sentimento_df['negativo_review'] = negativo_lista

In [26]:
sentimento_df.head()

Unnamed: 0,Rating,review,year,marca,date,review_year,month,day,week_day_name,review_polaridade,review_subjetividade,polaridade_status,subjetividade_status,neutro_review,positivo_review,negativo_review
0,5,2006 Mustang GT Doesn’t disappoint,2006,Marca_A,2018-06-06,2018,6,6,Wednesday,0.0,0.0,Neutro,Muito Objetivo,1,0,0
1,3,"DREAM CAR I bought mine 4/17 with 98K. Have been wanting a V6 5-sp, '05-'09 vintage for years. The engine is fine. Sounds good. Great mileage. Good power. I pride myself on smooth take-off and gea...",2006,Marca_A,2017-08-12,2017,8,12,Saturday,0.31,0.61,Positivo,Subjetivo,0,1,0
2,5,Great Ride There will always be a 05-09 mustang for sale and their fairly reasonable. Purchased mine as second car and I believe it was a great investment,2006,Marca_A,2017-06-15,2017,6,15,Thursday,0.45,0.53,Positivo,Subjetivo,0,1,0
3,5,"I have wanted a Mustang for 40 years. I bought my car from an auction I work at ( Adesa Sacramento ) and I love it!! It is a v6 with an air aid cold air injector, throttle body spacer and a Flowma...",2006,Marca_A,2017-05-18,2017,5,18,Thursday,-0.25,0.69,Negativo,Subjetivo,0,0,1
4,5,One owner I bought this car spankin new and i still am In love with this car. This car hugs the road and does whatever you ask at a moments notice. The only thing I have had to fix is the alter...,2006,Marca_A,2016-01-03,2016,1,3,Sunday,0.43,0.66,Positivo,Subjetivo,0,1,0


In [30]:
# cufflinks para visualizar a polaridade dos sentimentos

cf.go_offline()
cf.set_config_file(offline=False, world_readable=True)

In [31]:
#Decrição das colunas numéricas e categóricas 
sentimento_df.describe(include=['O']).T

Unnamed: 0,count,unique,top,freq
review,2264,2261,"My dream car is a lemon :( I always wanted a bug, and made it a \rgoal to purchase one. I bought this \rcar brand new off the lot with 4 miles \ron it. After 4 months of bliss, this \rcar became...",2
year,2264,16,2004,580
marca,2264,3,Marca_A,783
week_day_name,2264,7,Wednesday,381
polaridade_status,2264,3,Positivo,2014
subjetividade_status,2264,4,Subjetivo,1643


In [43]:
sentimento_review = sentimento_df[['marca','review','Rating','positivo_review','negativo_review','neutro_review','subjetividade_status']]
sentimento_review.tail()

Unnamed: 0,marca,review,Rating,positivo_review,negativo_review,neutro_review,subjetividade_status
3295,Marca_C,Greatest Car in the Whole Darn Universe! I bought my Beetle Baby used!! I loved it from day 1 that I first saw it. The color is bright green with a sunroof. After I had it for about 3 months I ...,4,1,0,0,Subjetivo
3296,Marca_C,Pain in the back I have had more trouble with this car than I care to admit. I got more frustrated because I like the way the car drives but I have had to replace so many parts in this car from a...,3,1,0,0,Subjetivo
3297,Marca_C,Too Bad This was a fun car. But its low level of quality really stinks. Something was always broken. The bulbs for the headlight are impossible to change if the plastic mounting/locking device for...,2,0,1,0,Subjetivo
3298,Marca_C,"Good value, cool snug bug, but.. Nimble handling, chic style and VW's engineering has not let us down in the 2 years we've driven it. Warning! Replacement of body panels may be extremely delayed....",4,1,0,0,Subjetivo
3299,Marca_C,"BAD CAR! Do not buy a VW. I cannot wait to get rid of this lemon. I bought this brand new back in 2000 and have had nothing but problems. Although they are not too major, they are annoying and cau...",2,0,1,0,Subjetivo


In [36]:
# Polaridade agregada das marcas
sentimento_marca = sentimento_df.groupby(['marca']).agg({ 'positivo_review': 'sum',       
                                           'negativo_review': 'sum','neutro_review': 'sum'
                                           }).reset_index()

                                    
                            
# Separar dataframe por marca 
sentimento_marca_chevrolet_Marca_B = sentimento_marca[sentimento_marca.marca == 'Marca_B'].drop('marca', axis = 1).copy()
sentimento_marca_Marca_A= sentimento_marca[sentimento_marca.marca == 'Marca_A'].drop('marca', axis = 1).copy()
sentimento_marca_Marca_C = sentimento_marca[sentimento_marca.marca == 'Marca_C'].drop('marca', axis = 1).copy()

In [37]:
# Obtendo a polaridade percentual

sentimento_marca['pct_positivo'] = (sentimento_marca['positivo_review'] / 
                                     (sentimento_marca['positivo_review']+ sentimento_marca['negativo_review']+
                                     sentimento_marca['neutro_review'])*100)

sentimento_marca['pct_negativo'] = (sentimento_marca['negativo_review'] / 
                                     (sentimento_marca['positivo_review']+ sentimento_marca['negativo_review']+
                                     sentimento_marca['neutro_review'])*100)

sentimento_marca['pct_neutro'] = (sentimento_marca['neutro_review'] / 
                                     (sentimento_marca['positivo_review']+ sentimento_marca['negativo_review']+
                                     sentimento_marca['neutro_review'])*100)

# Adicionando a coluna ao DataFrame
sentimento_marca['pct_positivo']= sentimento_marca['pct_positivo'].apply(lambda x:round(x,2))
sentimento_marca['pct_negativo']= sentimento_marca['pct_negativo'].apply(lambda x:round(x,2))
sentimento_marca['pct_neutro'] = sentimento_marca['pct_neutro'].apply(lambda x:round(x,2))

# Removendo algumas colunas
sentimento_marca.drop(['positivo_review', 'negativo_review', 'neutro_review'],axis=1,inplace=True)

In [38]:
# Salvando o dataset
sentimento_marca.to_csv('sentimento_marca_data.csv')

# Sentimento nas reviews das marcas Ford, Volkwagen e Chevrolet #

**INSIGHTS**


Dentre as 3 marcas, A Ford apresenta ter o maior percertual de comentários positivos e a Volkswagen de comentários negativos feitos pelos consumidores.

In [39]:
brands = sentimento_marca['marca'].tolist()
pct_sent = sentimento_marca.drop(columns='marca').T.reset_index()
pct_sent.rename(columns={"index": "sentiment", 0: "Marca_B", 1: 'Marca_A', 2: 'Marca_C'}, inplace=True)
pct_sent

Unnamed: 0,sentiment,Marca_B,Marca_A,Marca_C
0,pct_positivo,90.8,90.83,85.44
1,pct_negativo,5.75,5.44,11.88
2,pct_neutro,3.45,3.72,2.68


In [40]:
import plotly.graph_objs as go

colors = ['#1DE9B6', '#03A9F4', '#FF5252']

trace1 = go.Pie(
     values=pct_sent.Marca_B,
     labels=pct_sent.sentiment,
     domain=dict(x=[0, 0.333]),
     name="Marca_B",
     hoverinfo="label+percent+name",
     title='Marca_B'
)
trace2 = go.Pie(
     values=pct_sent.Marca_A,
     labels=pct_sent.sentiment,
     domain=dict(x=[0.333, 0.666]),
     name="Marca_A",
     hoverinfo="label+percent+name",
     title='Marca_A'
)
trace3 = go.Pie(
     values=pct_sent.Marca_C,
     labels=pct_sent.sentiment,
     domain=dict(x=[0.666, 0.999]),
     name="Marca_C",
     hoverinfo="label+percent+name",
     title='Marca_C'
)
layout = go.Layout(title="Sentimento dos reviews por marca",)
data = [trace1, trace2, trace3]
fig = go.Figure(data=data, layout=layout)

fig.update_traces(titlefont_size=20, textfont_size=15,
                  marker=dict(colors=colors))


fig.show()

In [41]:
import plotly.express as px
import numpy as np

df_line = sentimento_df[["review_year","positivo_review","marca"]].groupby(['review_year','marca']).sum().reset_index()

fig = px.line(df_line, x="review_year", y="positivo_review", color='marca',
                title="Reviews positivos por marca ao longo dos anos", markers=True)



fig.show()

In [42]:
df_line = sentimento_df[["review_year","negativo_review","marca"]].groupby(['review_year','marca']).sum().reset_index()

fig = px.line(df_line, x="review_year", y="negativo_review", color='marca',
                title="Reviews negativos por marca ao longo dos anos", markers=True)


fig.show()