# **Install the required libraries**

In [1]:
!pip install google-play-scraper

Collecting google-play-scraper
  Downloading google_play_scraper-1.2.6-py3-none-any.whl (28 kB)
Installing collected packages: google-play-scraper
Successfully installed google-play-scraper-1.2.6


The cell above uses Python's `pip` package manager to install a library called `google-play-scraper`, which lets you scrape data from the Google Play Store.

Here’s a breakdown of the code:

* `!`: This exclamation mark is typically used in Jupyter notebooks to indicate that the following line is a shell command.
* `pip`: This refers to the Python package installer.
* `install`: This tells pip to install a package.
* `google-play-scraper`: This is the name of the library you want to install.

After running this line of code, you’ll be able to use the `google-play-scraper` library in your Python scripts to collect information from the Google Play Store.  
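To confirm that the installation succeeded, you can ask Python which version of the package it sees. The following is a minimal sketch using only the standard library; `installed_version` is a hypothetical helper name, not part of `google-play-scraper`.

```python
from importlib import metadata


def installed_version(package_name):
    """Return the installed version of a package, or None if it is missing."""
    try:
        return metadata.version(package_name)
    except metadata.PackageNotFoundError:
        return None


# Prints a version string if the install worked, otherwise None
print(installed_version("google-play-scraper"))
```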

It’s important to note that scraping data from websites can sometimes violate their terms of service, so it’s a good idea to check the Google Play Store’s terms before using this library.

# **Setting up a Python Environment to Analyze Data**

In [2]:
from google_play_scraper import app
import pandas as pd
import numpy as np

Breakdown of the code:

* `from google_play_scraper import app`: This line imports the `app` function from the `google_play_scraper` library. `app` fetches the details of a single app (such as its title and rating) from the Google Play Store. It is imported here but not called in the cells below.
* `import pandas as pd`: This line imports the pandas library and assigns it the alias `pd`. Pandas is a popular library for data manipulation and analysis in Python.
* `import numpy as np`: This line imports the NumPy library and assigns it the alias `np`. NumPy is another popular library for numerical computing in Python.
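As a rough illustration of what `app` can be used for, the sketch below picks a few fields out of the dictionary it returns. The helper names `summarize_app` and `fetch_app_summary` are hypothetical, and the keys `title`, `score`, and `installs` are assumed to be present in the returned dictionary; check the library's documentation for the full list of fields.

```python
def summarize_app(details):
    """Pick a few fields from the dictionary returned by app()."""
    wanted = ("title", "score", "installs")  # assumed key names
    # .get() returns None instead of failing if a field is absent
    return {key: details.get(key) for key in wanted}


def fetch_app_summary(app_id, lang="en", country="us"):
    """Fetch an app's details from Google Play and summarize them (needs network access)."""
    from google_play_scraper import app  # imported here so summarize_app works offline

    return summarize_app(app(app_id, lang=lang, country=country))
```

Calling `fetch_app_summary('com.discord', lang='id', country='id')` would then return a small dictionary for the app scraped in this notebook.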

In [None]:
from google_play_scraper import Sort, reviews_all

result = reviews_all(
    'com.discord',
    sleep_milliseconds=0, # defaults to 0
    lang='id', # defaults to 'en'
    country='id', # defaults to 'us'
    sort=Sort.NEWEST, # defaults to Sort.MOST_RELEVANT
)

Let's break down the code line by line:

  * `from google_play_scraper import Sort, reviews_all`: This line imports two names, `Sort` and `reviews_all`, from the `google_play_scraper` library.
      * `Sort`: An enumeration (not a function) whose members, such as `Sort.NEWEST` and `Sort.MOST_RELEVANT`, specify how the reviews should be sorted.
      * `reviews_all`: A function that retrieves all reviews for a given app.
  * `result = reviews_all(...)`: This line calls the `reviews_all` function and stores the returned reviews in `result`. The arguments are:
      * `'com.discord'`: The ID of the app you want to retrieve reviews for. Replace it with the ID of the app you're interested in.
      * `sleep_milliseconds=0`: Optional. The number of milliseconds to sleep between requests to the Google Play Store. The default value is 0, which means there is no delay between requests. It's good practice to add a small delay to avoid overwhelming the server.
      * `lang='id'`: Optional. The language of the reviews you want to retrieve. Here it's set to 'id' for Indonesian.
      * `country='id'`: Optional. The country where the reviews come from. Here it's set to 'id' for Indonesia.
      * `sort=Sort.NEWEST`: Optional. How the reviews are sorted. `Sort.NEWEST` returns the reviews from newest to oldest.

In [3]:
from google_play_scraper import Sort, reviews

result, continuation_token = reviews(
    'com.discord',
    lang='id', # defaults to 'en'
    country='id', # defaults to 'us'
    sort=Sort.NEWEST, # defaults to Sort.MOST_RELEVANT
    count=5000,
    filter_score_with=None
)

Here's a breakdown of the code:

* `from google_play_scraper import Sort, reviews`: This line imports two names, `Sort` and `reviews`, from the `google_play_scraper` library.
    * `Sort`: An enumeration (not a function) whose members, such as `Sort.NEWEST` and `Sort.MOST_RELEVANT`, specify how the reviews should be sorted.
    * `reviews`: A function that retrieves a limited number of reviews for a given app.

* `result, continuation_token = reviews(...)`: This line calls the `reviews` function. The arguments are:
    * `'com.discord'`: This is the ID of the app you want to retrieve reviews for. You'll need to replace this with the ID of the specific app you're interested in.
    * `lang='id'`: This argument is optional and specifies the language of the reviews you want to retrieve. In this case, it's set to 'id' for Indonesian.
    * `country='id'`: This argument is optional and specifies the country where the reviews come from. Here, it's set to 'id' for Indonesia.
    * `sort=Sort.NEWEST`: This argument is optional and specifies how you want the reviews to be sorted. Here, it's set to `Sort.NEWEST`, which means the reviews will be returned from newest to oldest.
    * `count=5000`: This argument is optional and specifies the maximum number of reviews to retrieve. Here, it's set to 5000.
    * `filter_score_with=None`: This argument is optional and allows you to filter reviews based on a score. However, it's set to `None` in this case, which means all reviews will be included.

* The line assigns the results of the function call to two variables: `result` and `continuation_token`.
    * `result`: A list of dictionaries, where each dictionary represents one review.
    * `continuation_token`: A marker for where this batch of reviews ended. Passing it back to `reviews` through the `continuation_token` argument fetches the next batch instead of starting from the beginning.
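The role of `continuation_token` can be sketched as a simple paging loop. This is an illustrative sketch, not the library's own approach: `collect_reviews` and `discord_page` are hypothetical helpers, the page size of 200 is arbitrary, and treating an empty batch as the end of the reviews is an assumption.

```python
def collect_reviews(fetch_page, max_pages=10):
    """Call fetch_page(token) repeatedly, passing each returned token to the next call."""
    collected = []
    token = None  # None asks for the first page
    for _ in range(max_pages):
        batch, token = fetch_page(token)
        if not batch:  # assumption: an empty batch means there are no more reviews
            break
        collected.extend(batch)
    return collected


def discord_page(token):
    """Fetch one page of Indonesian Discord reviews (needs network access)."""
    from google_play_scraper import Sort, reviews

    return reviews(
        'com.discord',
        lang='id',
        country='id',
        sort=Sort.NEWEST,
        count=200,  # arbitrary page size for this sketch
        continuation_token=token,
    )
```

With these in place, `collect_reviews(discord_page, max_pages=5)` would gather up to five pages of reviews. Keeping the paging logic separate from the network call makes it easy to try the loop with a fake `fetch_page` first.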

# **Convert a List of Dictionaries into a DataFrame**

In [9]:
df_busu = pd.DataFrame(np.array(result),columns=['review'])

df_busu = df_busu.join(pd.DataFrame(df_busu.pop('review').tolist()))

df_busu.head(1500)

Unnamed: 0,reviewId,userName,userImage,content,score,thumbsUpCount,reviewCreatedVersion,at,replyContent,repliedAt,appVersion
0,c19e9709-7cc3-431e-9783-04186dcc1895,Mr MOG,https://play-lh.googleusercontent.com/a-/ALV-U...,Please decrease the size of this app.. don't t...,3,0,10.0.5,2024-03-30 16:00:05,,NaT,10.0.5
1,361826f0-5090-477e-b311-9959b41b5fae,Maliq Drajat,https://play-lh.googleusercontent.com/a-/ALV-U...,Register susah semua udh bener tapi di bilang ...,1,0,221.16 - Stable,2024-03-30 15:22:33,,NaT,221.16 - Stable
2,1228cd94-57c1-46c9-acc2-12d2404aa941,Ryan Gizan,https://play-lh.googleusercontent.com/a-/ALV-U...,Kenapa yh setiap saya buka aplikasi lain discr...,1,0,,2024-03-30 15:08:05,,NaT,
3,df186039-67f8-4750-a812-37ffc9f7f916,Kemon Squigrrel,https://play-lh.googleusercontent.com/a/ACg8oc...,Why was the search button removed?,1,0,,2024-03-30 15:02:32,,NaT,
4,d270ee21-a60f-46ed-bc10-eb53cc26ab18,Akbar Maulana,https://play-lh.googleusercontent.com/a-/ALV-U...,Aneh padahal gw udah klik forgot password. Eh ...,1,0,,2024-03-30 14:49:58,,NaT,
...,...,...,...,...,...,...,...,...,...,...,...
194,554f6039-aac6-4010-8e25-70fec24c199d,adnan Yusufi,https://play-lh.googleusercontent.com/a/ACg8oc...,"if you really care about ur app, just revert b...",1,0,219.21 - Stable,2024-03-21 08:23:52,Thanks for writing in! I understand that you w...,2024-03-24 23:38:03,219.21 - Stable
195,1401e3a2-3b80-4eb5-af6f-b89bfcd14238,analim,https://play-lh.googleusercontent.com/a-/ALV-U...,"Makin hari, maki ngebug aja Kalian ini. Harus ...",2,0,222.8 - Beta,2024-03-21 07:46:43,We appreciate the feedback. We'd be happy to h...,2024-03-24 23:26:03,222.8 - Beta
196,fdf9ceb9-ca14-485b-8e39-d2711ef84039,ゼインダ,https://play-lh.googleusercontent.com/a-/ALV-U...,"can't verification my phone number, please fix...",1,0,,2024-03-21 06:22:30,I understand how frustrating verification issu...,2024-03-21 06:35:06,
197,97c1b275-4c95-473d-a73f-566fa7822484,Diki iD,https://play-lh.googleusercontent.com/a-/ALV-U...,aplikasi ga jelas saya mau masukan kode verifi...,1,0,221.16 - Stable,2024-03-21 05:27:37,"For verification issues, please note that we'r...",2024-03-24 23:11:01,221.16 - Stable


Here's a breakdown of the code:
   - `df_busu = pd.DataFrame(...)`: This line creates a Pandas DataFrame object and assigns it to the variable `df_busu`.
   - `pd.DataFrame`: This part of the code calls the `DataFrame` constructor from the Pandas library.
   - `np.array(result)`: This argument converts `result` to a NumPy array. `result` is the list of review dictionaries returned by the `reviews` call in the earlier cell.
   - `columns=['review']`: This argument names the DataFrame's single column "review". At this point each row holds one complete review dictionary, not just the review text.
   - `df_busu.pop('review').tolist()`: This removes the "review" column and turns it back into a list of dictionaries. Wrapping that list in `pd.DataFrame(...)` expands each dictionary key (`reviewId`, `userName`, `content`, `score`, and so on) into its own column, and `df_busu.join(...)` attaches those columns to `df_busu`.
   - `df_busu.head(1500)`: This displays up to the first 1,500 rows of the resulting DataFrame.
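Because `result` is a list of dictionaries, the DataFrame constructor can also take it directly, which should produce the same columns in one step. The sketch below uses a small hypothetical sample in place of the real scraped reviews.

```python
import pandas as pd

# Hypothetical sample shaped like the scraper's output: a list of dictionaries
sample_reviews = [
    {"userName": "User A", "score": 3, "content": "Please decrease the size of this app"},
    {"userName": "User B", "score": 1, "content": "Why was the search button removed?"},
]

# Each dictionary becomes a row and each key becomes a column
df_sample = pd.DataFrame(sample_reviews)
print(df_sample.columns.tolist())
```

In the notebook, the equivalent call would be `pd.DataFrame(result)`.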

In [5]:
df_busu[['userName', 'score', 'at', 'content']].head()

Unnamed: 0,userName,score,at,content
0,Mr MOG,3,2024-03-30 16:00:05,Please decrease the size of this app.. don't t...
1,Maliq Drajat,1,2024-03-30 15:22:33,Register susah semua udh bener tapi di bilang ...
2,Ryan Gizan,1,2024-03-30 15:08:05,Kenapa yh setiap saya buka aplikasi lain discr...
3,Kemon Squigrrel,1,2024-03-30 15:02:32,Why was the search button removed?
4,Akbar Maulana,1,2024-03-30 14:49:58,Aneh padahal gw udah klik forgot password. Eh ...


This line of code selects four columns (`userName`, `score`, `at`, and `content`) from the DataFrame `df_busu` and displays the first five rows of the result.

In [6]:
new_df = df_busu[['userName','score','at','content']]
sorted_df = new_df.sort_values(by = 'at', ascending = False)
sorted_df.head()

Unnamed: 0,userName,score,at,content
0,Mr MOG,3,2024-03-30 16:00:05,Please decrease the size of this app.. don't t...
1,Maliq Drajat,1,2024-03-30 15:22:33,Register susah semua udh bener tapi di bilang ...
2,Ryan Gizan,1,2024-03-30 15:08:05,Kenapa yh setiap saya buka aplikasi lain discr...
3,Kemon Squigrrel,1,2024-03-30 15:02:32,Why was the search button removed?
4,Akbar Maulana,1,2024-03-30 14:49:58,Aneh padahal gw udah klik forgot password. Eh ...


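This code snippet first stores the four selected columns in a new DataFrame named `new_df`, then sorts it by the `at` column (the review timestamp) in descending order and stores the result in `sorted_df`. Because the reviews were already fetched with `Sort.NEWEST`, the output looks the same as before.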
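To see the sort change the row order, it helps to start from data that is not already newest-first. The sketch below uses a small hypothetical sample; it also calls `reset_index(drop=True)`, which is not in the notebook, so that the row labels follow the new order.

```python
import pandas as pd

# Hypothetical sample with timestamps in mixed order
sample = pd.DataFrame({
    "userName": ["User A", "User B", "User C"],
    "at": pd.to_datetime([
        "2024-03-21 05:27:37",
        "2024-03-30 16:00:05",
        "2024-03-24 23:11:01",
    ]),
})

# Sort newest first, then renumber the rows from 0
newest_first = sample.sort_values(by="at", ascending=False).reset_index(drop=True)
print(newest_first)
```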

In [7]:
my_df = sorted_df[['userName','score','at','content']]
my_df.head()

Unnamed: 0,userName,score,at,content
0,Mr MOG,3,2024-03-30 16:00:05,Please decrease the size of this app.. don't t...
1,Maliq Drajat,1,2024-03-30 15:22:33,Register susah semua udh bener tapi di bilang ...
2,Ryan Gizan,1,2024-03-30 15:08:05,Kenapa yh setiap saya buka aplikasi lain discr...
3,Kemon Squigrrel,1,2024-03-30 15:02:32,Why was the search button removed?
4,Akbar Maulana,1,2024-03-30 14:49:58,Aneh padahal gw udah klik forgot password. Eh ...


# **Save the DataFrame to a CSV File**

In [8]:
my_df.to_csv("scrapped_data_discord_4.csv", index = False)

This line of code saves the DataFrame `my_df` to a CSV file named "scrapped_data_discord_4.csv". Setting `index = False` leaves the row index out of the file.
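When the CSV is read back later, the `at` column comes back as plain text unless it is parsed as a date. The sketch below shows one way to handle this with a hypothetical one-row sample and an in-memory buffer standing in for the file on disk.

```python
import io

import pandas as pd

# Hypothetical one-row sample with the same four columns as my_df
sample = pd.DataFrame({
    "userName": ["User A"],
    "score": [3],
    "at": pd.to_datetime(["2024-03-30 16:00:05"]),
    "content": ["Please decrease the size of this app"],
})

# Write to an in-memory buffer instead of a file, without the row index
buffer = io.StringIO()
sample.to_csv(buffer, index=False)
buffer.seek(0)

# parse_dates turns the 'at' column back into timestamps
restored = pd.read_csv(buffer, parse_dates=["at"])
print(restored.dtypes)
```

For the file written in this notebook, the same idea would be `pd.read_csv("scrapped_data_discord_4.csv", parse_dates=["at"])`.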