### Sentiment Analysis - Part 1

Install the Google Play Scraper and import the required packages.

In [1]:
#Install Google play scraper: https://github.com/JoMingyu/google-play-scraper
!pip install google_play_scraper

Collecting google_play_scraper
  Using cached google_play_scraper-1.1.0-py3-none-any.whl
Installing collected packages: google-play-scraper
Successfully installed google-play-scraper-1.1.0


In [2]:
import json
import pandas as pd
from tqdm import tqdm

import seaborn as sns
import matplotlib.pyplot as plt

from pygments import highlight
from pygments.lexers import JsonLexer
from pygments.formatters import TerminalFormatter

from google_play_scraper import Sort, reviews, app

%matplotlib inline
%config InlineBackend.figure_format='retina'

sns.set(style='whitegrid', palette='muted', font_scale=1.2)

#### Top 10 Food and Drinks Apps in Portugal

1. Burger King - Portugal ⇒ com.bk.pt

2. Uber Eats: Entrega de comida ⇒ com.ubercab.eats

3. McDonald's ⇒ com.mcdonalds.mobileapp

4. Too Good To Go ⇒ com.app.tgtg

5. TheFork - Restaurantes ⇒ com.lafourchette.lafourchette

6. Bolt Food ⇒ com.bolt.deliveryclient

7. Zomato Portugal ⇒ com.outdarelab.zomato

8. Moulinex, receitas e mais... ⇒ com.groupeseb.moulinex.food

9. H3 ⇒ pt.yunit.mobile.android.h3

10. Telepizza Refeições ao Domicílio ⇒ com.telepizza

Source: https://www.mobileaction.co/

Define the IDs of the apps to be analyzed and extract basic information about each one.

In [3]:
apps_ids = [
    'com.bk.pt',
    'com.ubercab.eats',
    'com.mcdonalds.mobileapp',
    'com.app.tgtg',
    'com.lafourchette.lafourchette',
    'com.bolt.deliveryclient',
    'com.outdarelab.zomato',
    'com.groupeseb.moulinex.food',
    'pt.yunit.mobile.android.h3',
    'com.telepizza',
]

Scrape the store data for each app.

In [4]:
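app_infos = []

for ap in tqdm(apps_ids):
    # Fetch the store listing metadata (English listing, US store)
    info = app(ap, lang='en', country='us')
    # Drop the embedded sample comments if present; reviews are collected separately below
    info.pop('comments', None)
    app_infos.append(info)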

100%|██████████| 10/10 [00:09<00:00,  1.01it/s]
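Each element of `app_infos` is a plain dictionary, so it can be inspected as indented JSON before it goes into a DataFrame. The following is a minimal sketch using only the standard library; `format_json` and `sample_info` are hypothetical names, and the sample values are a trimmed stand-in for a real scraped record.

```python
import json

def format_json(obj):
    """Serialize obj as indented JSON.

    default=str is a safeguard for values json cannot encode natively
    (for example datetime objects); ensure_ascii=False keeps accented
    characters readable.
    """
    return json.dumps(obj, indent=2, sort_keys=True, default=str, ensure_ascii=False)

# Hypothetical, trimmed stand-in for one scraped app record.
sample_info = {
    'appId': 'com.bk.pt',
    'title': 'Burger King - Portugal',
    'installs': '500,000+',
    'minInstalls': 500000,
}

print(format_json(sample_info))
```

The returned string can also be passed to `highlight(text, JsonLexer(), TerminalFormatter())` from the Pygments imports above to get colored output.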


In [5]:
app_infos_df = pd.DataFrame(app_infos)
app_infos_df.head()

Unnamed: 0,title,description,descriptionHTML,summary,installs,minInstalls,score,ratings,reviews,histogram,...,contentRatingDescription,adSupported,containsAds,released,updated,version,recentChanges,recentChangesHTML,appId,url
0,Burger King - Portugal,Join the exclusive savings with the official B...,Join the exclusive savings with the official B...,Coupons for in-store use only,"500,000+",500000,0.0,0,0,"[0, 0, 0, 0, 0]",...,,False,False,"Mar 20, 2018",1654869205,4.5.0,We have news! Update the APP and you will be a...,We have news! Update the APP and you will be a...,com.bk.pt,https://play.google.com/store/apps/details?id=...
1,Uber Eats: Food Delivery,Get food delivery to your doorstep from thousa...,Get food delivery to your doorstep from thousa...,"Food & Grocery Delivery App. Order Pizza, Sush...","100,000,000+",100000000,4.44534,4622030,257264,"[381795, 56572, 139487, 587750, 3456411]",...,,True,True,"Feb 29, 2016",1655134904,6.116.10002,We update the Uber Eats app as often as possib...,We update the Uber Eats app as often as possib...,com.ubercab.eats,https://play.google.com/store/apps/details?id=...
2,McDonald's,Download the McDonald's™ App for unique offers...,Download the McDonald&#39;s™ App for unique of...,Download the McDonald’s App to have all the of...,"50,000,000+",50000000,3.65493,471378,1611,"[128305, 15427, 20977, 32076, 274369]",...,,True,True,"Feb 26, 2018",1654678119,2.43.0,,,com.mcdonalds.mobileapp,https://play.google.com/store/apps/details?id=...
3,Too Good To Go: End Food Waste,Join millions of food waste warriors by downlo...,Join millions of food waste warriors by downlo...,Eat delicious food for next to nothing and fig...,"10,000,000+",10000000,4.786268,879305,1729,"[28220, 2919, 12650, 40870, 794546]",...,,False,False,"Jan 14, 2016",1655224910,22.5.10,Thanks for helping fight food waste with us! T...,Thanks for helping fight food waste with us! T...,com.app.tgtg,https://play.google.com/store/apps/details?id=...
4,TheFork - Restaurant bookings,<b>10€ discount on your next meal!</b>\r\nMake...,<b>10€ discount on your next meal!</b><br>Make...,Download TheFork to book your next table with ...,"10,000,000+",10000000,4.873874,213400,208,"[0, 0, 3702, 19098, 190208]",...,,False,False,"Nov 3, 2011",1655134882,20.15.0,We constantly update TheFork app to provide yo...,We constantly update TheFork app to provide yo...,com.lafourchette.lafourchette,https://play.google.com/store/apps/details?id=...


#### Scraping App Reviews

Collect the user reviews for each app. The texts will later be split into three classes (positive, negative, or neutral), and the goal is a dataset that is as balanced as possible, so for each app we request up to 400 reviews with a score of 3 and up to 200 for each of the other scores.

We want:
* A balanced dataset: roughly the same number of reviews for each sentiment class, which is why score 3 is sampled twice as heavily as each of the scores 1, 2, 4, and 5
* A representative sample of the reviews of each app

We can meet the first requirement with the scraping package's option to filter reviews by score. For the second, we sort the reviews by relevance, which surfaces the ones Google Play considers most important, and we also fetch the newest ones.
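The sampling plan can be checked with a little arithmetic. The sketch below assumes that scores 1-2 will be labeled negative, 3 neutral, and 4-5 positive; the notebook does not state this mapping at this point, so `to_sentiment` is a hypothetical helper used only for illustration.

```python
from collections import Counter

# Reviews requested per star rating and per app, summed over the two
# sort orders (2 x 200 for score 3, 2 x 100 for every other score).
REQUESTED_PER_SCORE = {1: 200, 2: 200, 3: 400, 4: 200, 5: 200}

def to_sentiment(score):
    """Hypothetical mapping from a 1-5 star score to a sentiment class."""
    if score <= 2:
        return 'negative'
    if score == 3:
        return 'neutral'
    return 'positive'

requested_per_class = Counter()
for score, n in REQUESTED_PER_SCORE.items():
    requested_per_class[to_sentiment(score)] += n

print(dict(requested_per_class))
# {'negative': 400, 'neutral': 400, 'positive': 400}
```

Under that assumption each class has the same upper bound of 400 reviews per app. The counts actually collected can be lower when an app has few reviews for a given score.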

In [6]:
app_reviews = []

for ap in tqdm(apps_ids):
    for score in range(1, 6):
        for sort_order in [Sort.MOST_RELEVANT, Sort.NEWEST]:
            # Request twice as many 3-star reviews as for any other score
            rvs, _ = reviews(
                ap,
                lang='pt',
                country='br',
                sort=sort_order,
                count=200 if score == 3 else 100,
                filter_score_with=score
            )
            # Tag each review with the sort order and the app it came from
            for r in rvs:
                r['sortOrder'] = 'most_relevant' if sort_order == Sort.MOST_RELEVANT else 'newest'
                r['appId'] = ap
            app_reviews.extend(rvs)

100%|██████████| 10/10 [00:41<00:00,  4.17s/it]


In [7]:
len(app_reviews)

7126
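The total is below the theoretical maximum of 10 apps × 1,200 requested reviews = 12,000, presumably because some apps have fewer reviews than requested for a given score. The same review may also be returned under both sort orders, so it can be worth checking for duplicates before modeling. The sketch below shows one way to do that, assuming `reviewId` uniquely identifies a review; `dedupe_reviews` and `sample_reviews` are hypothetical names, and the sample rows only mimic the shape of the scraped dictionaries.

```python
from collections import Counter

def dedupe_reviews(reviews_list):
    """Keep the first occurrence of each reviewId, preserving order."""
    seen = set()
    unique = []
    for r in reviews_list:
        if r['reviewId'] not in seen:
            seen.add(r['reviewId'])
            unique.append(r)
    return unique

# Hypothetical sample in the same shape as the scraped review dicts.
sample_reviews = [
    {'reviewId': 'a', 'appId': 'com.bk.pt', 'score': 1, 'sortOrder': 'most_relevant'},
    {'reviewId': 'b', 'appId': 'com.bk.pt', 'score': 3, 'sortOrder': 'most_relevant'},
    {'reviewId': 'a', 'appId': 'com.bk.pt', 'score': 1, 'sortOrder': 'newest'},
    {'reviewId': 'c', 'appId': 'com.telepizza', 'score': 5, 'sortOrder': 'newest'},
]

unique_reviews = dedupe_reviews(sample_reviews)

# Count the remaining reviews per (app, score) pair to see how balanced they are.
per_app_score = Counter((r['appId'], r['score']) for r in unique_reviews)

print(len(sample_reviews), len(unique_reviews))
print(per_app_score)
```

Applied to the real data, the call would be `dedupe_reviews(app_reviews)`; keeping the first occurrence means a review found under both orders retains its `most_relevant` tag.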

Load the reviews into a DataFrame and save them to a CSV file.

In [8]:
app_reviews_df = pd.DataFrame(app_reviews)
app_reviews_df.head()

Unnamed: 0,reviewId,userName,userImage,content,score,thumbsUpCount,reviewCreatedVersion,at,replyContent,repliedAt,sortOrder,appId
0,6e4e5f91-94d4-4af7-a2c5-b859aa762f54,Carlos V. Gonzalez,https://play-lh.googleusercontent.com/a-/AOh14...,"BurgerKing Portugal deixa muito a desejar, não...",1,4,4.4.9,2022-06-09 13:24:17,,NaT,most_relevant,com.bk.pt
1,b407ee48-f772-4d1c-abad-dda15e47571f,Vitor Ferraz,https://play-lh.googleusercontent.com/a-/AOh14...,Após bastante tempo passado decidi voltar a pe...,1,40,4.4.7,2022-05-06 13:21:44,Lamentamos o incidente. Iremos reportá-lo e co...,2019-10-10 15:20:20,most_relevant,com.bk.pt
2,397d3395-94ac-45a3-a03a-e5acd5320a30,Diogo Mendes DM116,https://play-lh.googleusercontent.com/a-/AOh14...,Ja teve 5estrelas. Agora esta mesmo fraca.daqu...,1,1,4.4.9,2022-06-06 12:31:04,,NaT,most_relevant,com.bk.pt
3,fede826c-bd11-4c4b-8046-fb309469886e,Ricardo Lemos,https://play-lh.googleusercontent.com/a/AATXAJ...,O burguer King mais proximo de minha casa fica...,1,24,4.4.5,2022-04-07 15:50:14,,NaT,most_relevant,com.bk.pt
4,e7d38cc6-ff0f-4394-8f16-da16a563f271,Rui Moreira,https://play-lh.googleusercontent.com/a/AATXAJ...,Não funciona. Diz que a ligação à internet não...,1,2,4.4.8,2022-05-28 17:03:29,,NaT,most_relevant,com.bk.pt


In [9]:
app_reviews_df.to_csv('reviews.csv', index=False, header=True)