# **Scraping Gojek App Reviews from the Google Play Store**

This notebook **scrapes user reviews of the Gojek app** from the Google Play Store with the `google-play-scraper` library. It starts by importing the required libraries, then requests reviews written in Indonesian (`lang='id'`) from the Indonesian store (`country='id'`), sorted by relevance, with `count=10000`. Note that the output below reports 90,000 rows rather than 10,000: `reviews_all` is designed to fetch every available review, so the `count` argument appears to have no effect here. The collected data is then saved to CSV files.

- **Name**: Muhammad Azhar Putra Nadian
- **Email**: azharnadian@student.ub.ac.id
- **Dicoding ID**: azharnadian
- **Cohort ID**: MC006D5Y1335
- **Coding Camp Email**: mc006d5y1335@student.devacademy.id

## Import Libraries

In [18]:
!pip install google-play-scraper



In [19]:
from google_play_scraper import app, reviews_all, Sort
import pandas as pd
pd.options.mode.chained_assignment = None
import numpy as np
import csv

In [20]:
seed = 0
np.random.seed(seed)

## Scraping the Gojek App Review Dataset from the Play Store

In [25]:
scrapreview = reviews_all(
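    'com.gojek.app',  # Gojek app ID on the Play Store
    lang='id',        # review language: Indonesian
    country='id',     # store country: Indonesia
    sort=Sort.MOST_RELEVANT,
    count=10000       # appears to be ignored by reviews_all (the DataFrame below has 90,000 rows)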
)
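If the intent was to stop at roughly 10,000 reviews, one option is to page through `reviews` (which returns a `(result, continuation_token)` pair) and stop once enough rows are collected. The following is a minimal sketch under stated assumptions, not the notebook's actual method: `collect_reviews` and `gojek_page` are hypothetical helper names, and the end-of-data check assumes an exhausted token is either `None` or exposes `.token is None`.

```python
def collect_reviews(fetch_page, limit, page_size=200):
    """Collect at most `limit` reviews by paging through `fetch_page`.

    `fetch_page(count, token)` is a hypothetical callable returning
    `(batch, next_token)`, mirroring the `(result, continuation_token)`
    pair returned by `google_play_scraper.reviews`.
    """
    collected = []
    token = None
    while len(collected) < limit:
        batch, token = fetch_page(page_size, token)
        if not batch:
            break  # no more reviews to read
        collected.extend(batch)
        # Assumption: an exhausted token is None or has `.token is None`.
        if token is None or getattr(token, "token", None) is None:
            break
    return collected[:limit]  # trim any overshoot from the last page


def gojek_page(count, token):
    """Fetch one page of Gojek reviews (requires network access)."""
    # Imported lazily so collect_reviews can be used without the library.
    from google_play_scraper import Sort, reviews
    return reviews(
        'com.gojek.app',
        lang='id',
        country='id',
        sort=Sort.MOST_RELEVANT,
        count=count,
        continuation_token=token,
    )
```

A call such as `collect_reviews(gojek_page, limit=10000)` would then return a list of at most 10,000 review dictionaries. Passing the page fetcher as an argument keeps the paging logic testable without network access.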

## Load Dataset

In [26]:
with open('ulasan_gojek.csv', mode='w', newline='', encoding='utf-8') as file:
    writer = csv.writer(file)
    writer.writerow(['Review'])  # Column header
    for review in scrapreview:
        writer.writerow([review['content']])
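The `info()` output later in this notebook reports 89,999 non-null values in `content`, so one scraped review has no text. The sketch below is one possible variant of the CSV export that skips such rows. The helper name `write_review_texts` and the sample rows are made up for illustration, and an in-memory buffer stands in for `ulasan_gojek.csv`.

```python
import csv
import io


def write_review_texts(review_rows, fileobj):
    """Write each non-empty review `content` as one CSV row.

    Returns the number of data rows written (header excluded).
    """
    writer = csv.writer(fileobj)
    writer.writerow(['Review'])  # column header
    written = 0
    for row in review_rows:
        text = row.get('content')
        if not text or not text.strip():
            continue  # skip missing or blank review text
        writer.writerow([text])
        written += 1
    return written


# Illustrative rows only; real rows come from the scraper.
sample_rows = [
    {'content': 'great app, friendly driver'},
    {'content': None},
    {'content': 'first line\nsecond line'},
]

buffer = io.StringIO(newline='')  # stands in for the CSV file
count_written = write_review_texts(sample_rows, buffer)

# Read the data back to confirm commas and line breaks survive quoting.
buffer.seek(0)
read_back = list(csv.reader(buffer))
```

The `csv` module quotes fields that contain commas or line breaks, so multi-line reviews still come back as a single field when the file is read with `csv.reader`.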

In [27]:
gojek_reviews_df = pd.DataFrame(scrapreview)

print(gojek_reviews_df.shape)
gojek_reviews_df.head()

(90000, 11)


| | reviewId | userName | userImage | content | score | thumbsUpCount | reviewCreatedVersion | at | replyContent | repliedAt | appVersion |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 595da86c-acc1-4a64-ae43-90ff85eaf53d | Pengguna Google | https://play-lh.googleusercontent.com/EGemoI2N... | terlalu terlalu terlalu... apk yg tidak bisa d... | 1 | 1 | 4.31.1 | 2022-04-21 20:37:07 | Hai, mohon maaf atas kendala yang kamu alami. ... | 2022-04-22 08:33:31 | 4.31.1 |
| 1 | 7874a624-ec35-4b2f-8b1b-d34e160c5180 | Pengguna Google | https://play-lh.googleusercontent.com/EGemoI2N... | pesan go food dengan estimasi awal 30-40 menit... | 1 | 22 | 5.16.3 | 2025-04-11 20:18:06 | Hai Kak Afiifah, mohon maaf atas ketidaknyaman... | 2025-04-11 21:39:05 | 5.16.3 |
| 2 | 606f946d-2bab-4c43-9b20-429a679a8fe0 | Pengguna Google | https://play-lh.googleusercontent.com/EGemoI2N... | aplikasi ga jelas, sering banget gofood udh nu... | 1 | 62 | 5.15.1 | 2025-04-03 22:04:48 | Hai Kak Andreas, mohon maaf atas ketidaknyaman... | 2025-04-04 09:40:35 | 5.15.1 |
| 3 | 46c3b900-0136-4fbf-91a0-c3f73d661fac | Pengguna Google | https://play-lh.googleusercontent.com/EGemoI2N... | sebagai pengguna lama baru kali ini saya kecew... | 1 | 36 | 5.14.2 | 2025-04-07 14:25:32 | Hai Kak Harliani, mohon maaf atas ketidaknyama... | 2025-04-07 15:59:35 | 5.14.2 |
| 4 | 825e4595-c073-4388-8229-415e372b6c01 | Pengguna Google | https://play-lh.googleusercontent.com/EGemoI2N... | tinggal 2menit lg driver sampe di lokasi tiba-... | 1 | 175 | 5.14.2 | 2025-03-26 07:35:56 | Hai Kak Wahyu, mohon maaf atas ketidaknyamanan... | 2025-03-26 11:26:47 | 5.14.2 |


## Saving the DataFrame to CSV

In [39]:
gojek_reviews_df.to_csv('ulasan_gojek_full.csv', index=False)

In [37]:
gojek_reviews_df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 90000 entries, 0 to 89999
Data columns (total 11 columns):
 #   Column                Non-Null Count  Dtype         
---  ------                --------------  -----         
 0   reviewId              90000 non-null  object        
 1   userName              90000 non-null  object        
 2   userImage             90000 non-null  object        
 3   content               89999 non-null  object        
 4   score                 90000 non-null  int64         
 5   thumbsUpCount         90000 non-null  int64         
 6   reviewCreatedVersion  72303 non-null  object        
 7   at                    90000 non-null  datetime64[ns]
 8   replyContent          33205 non-null  object        
 9   repliedAt             33205 non-null  datetime64[ns]
 10  appVersion            72303 non-null  object        
dtypes: datetime64[ns](2), int64(2), object(7)
memory usage: 7.6+ MB
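The `info()` output above lists `at` and `repliedAt` as `datetime64[ns]`. CSV stores them as plain text, so when `ulasan_gojek_full.csv` is loaded again those columns need `parse_dates` to come back as datetimes. The sketch below illustrates this on a tiny made-up frame written to an in-memory buffer instead of the real file; the sample values are illustrative only.

```python
import io

import pandas as pd

# Tiny illustrative frame with the same kind of columns as the scraped data.
sample = pd.DataFrame({
    'content': ['sample review text', None],
    'score': [1, 5],
    'at': pd.to_datetime(['2025-04-11 20:18:06', '2022-04-21 20:37:07']),
})

# Write to an in-memory buffer instead of ulasan_gojek_full.csv.
buffer = io.StringIO()
sample.to_csv(buffer, index=False)

# Without parse_dates the timestamp column is read back as text.
buffer.seek(0)
plain = pd.read_csv(buffer)

# With parse_dates the column is restored as a datetime column.
buffer.seek(0)
parsed = pd.read_csv(buffer, parse_dates=['at'])

print(plain['at'].dtype, parsed['at'].dtype)
print(parsed['content'].isna().sum())  # missing text is read back as NaN
```

For the real file, the equivalent call would be `pd.read_csv('ulasan_gojek_full.csv', parse_dates=['at', 'repliedAt'])`.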
