In [None]:
'''
=================================================

Name    : Mirza Rendra Sjarief
Project : ETL - Web Scraping - Data Cleaning - Load SQL

This program was developed to automate data retrieval from a website 
(web scraping) by controlling the browser through automated scripts. 
The data collected includes product information from the Bukalapak.com website, 
such as Product Name, Price, Total Sales, Rating, Store Name, and Store Location.

All the data is then converted into a DataFrame and undergoes a data preparation stage 
to facilitate data exploration and reporting. Subsequently, the processed DataFrame is 
exported into a CSV file. The CSV data is then used to create a PostgreSQL database, 
with the columns and data types aligned with the processed DataFrame.
=================================================
'''

In [41]:
from selenium import webdriver
from bs4 import BeautifulSoup
import pandas as pd
import time

from selenium import webdriver:

Imports the webdriver module from Selenium into the Python code. Selenium is a library that automates interactions with web browsers, and webdriver is one of its core components used to control the browser through automated scripts.
from bs4 import BeautifulSoup:

Imports BeautifulSoup from the bs4 module into the Python code. BeautifulSoup is a Python library used for parsing HTML or XML documents and extracting data, making it very useful for web scraping.
import pandas as pd:

Imports the pandas library under the alias pd. Pandas is a third-party library widely used for analyzing and manipulating tabular data, such as the contents of CSV or Excel files.
import time:

Imports the time module, which provides various time-related functions. For example, the time.sleep() function can be used to pause the execution of the program for a specified amount of time.


In [42]:
nama_brand = []
harga_hp = []
total_terjual = []
toko_penjual = []
lokasi_penjual = []
rating_brand = []


These empty lists temporarily hold the values extracted while iterating over the container_produk variable; they are later used to populate the columns and rows of the DataFrame.
In Python, a list is a mutable, ordered sequence of elements written within square brackets. An empty list, [], is a list with no elements inside it.
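
A minimal sketch of why the lists are filled in lockstep: each product contributes exactly one entry to every list, and a missing field is recorded as None rather than skipped, so the lists stay the same length. The two-item sample and the names products, names, and ratings are hypothetical and not part of the original notebook.

```python
import pandas as pd

# Hypothetical scraped records; the second product has no rating.
products = [
    {"name": "Phone A", "rating": "4.9"},
    {"name": "Phone B", "rating": None},
]

names = []
ratings = []

for product in products:
    # Append exactly one value per product to every list,
    # using None as a placeholder so the lists stay aligned.
    names.append(product["name"])
    ratings.append(product["rating"])

frame = pd.DataFrame({"name": names, "rating": ratings})
print(len(names), len(ratings))  # both lists have the same length
print(frame)
```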

In [43]:
driver = webdriver.Safari()

for i in range (1,11):

    driver.get(f'https://www.bukalapak.com/c/handphone/hp-smartphone/hp-android?page={i}')
    time.sleep(1)  # wait briefly so the page can finish rendering
    html = driver.page_source

    soup = BeautifulSoup(html,'html.parser')

    container_produk = soup.find_all('div',{"class":"bl-flex-item mb-8"})

    for container in container_produk : 
        nama = container.find("p",{"class":"bl-text bl-text--body-14 bl-text--secondary bl-text--ellipsis__2"})
        if nama != None : 
            nama_brand.append(nama.text) 
        else:
            nama_brand.append(nama) 


        harga = container.find("p",{"class":"bl-text bl-text--semi-bold bl-text--ellipsis__1 bl-product-card-new__price"})
        if harga != None :
            harga_hp.append(harga.text)
        else:
            harga_hp.append(harga)


        lokasi = container.find("p",{"class":"bl-text bl-text--caption-12 bl-text--secondary bl-text--ellipsis__1 bl-product-card-new__store-location"})
        if lokasi != None :
            lokasi_penjual.append(lokasi.text)
        else:
            lokasi_penjual.append(lokasi)


        terjual = container.find("p",{"class":"bl-text bl-text--caption-12 bl-text--secondary bl-product-card-new__sold-count"})
        if   terjual != None :
            total_terjual.append(terjual.text)
        else:
            total_terjual.append(terjual)

        
        toko = container.find("p",{"class":"bl-text bl-text--caption-12 bl-text--secondary bl-text--ellipsis__1 bl-product-card-new__store-name"})
        if   toko != None :
            toko_penjual.append(toko.text)
        else:
            toko_penjual.append(toko)


        rating = container.find("p",{"class":"bl-text bl-text--caption-12 bl-text--bold"})
        if rating != None :
            rating_brand.append(rating.text)
        else:
            rating_brand.append(rating)




- driver = webdriver.Safari():

A Python command to initialize the Safari WebDriver with Selenium. It calls the specific web driver for the Safari browser, allowing it to perform functions such as opening browser tabs, visiting web pages, filling out forms, and clicking buttons automatically.

- for i in range(1, 11)::

This creates a variable i whose value changes on each iteration. range(1, 11) generates the numbers 1 through 10, because Python treats 11 as the upper boundary and excludes it, so the loop visits web pages 1 to 10.

- driver.get(f'https://www.bukalapak.com/c/handphone/hp-smartphone/hp-android?page={i}'):

The driver.get() function opens a URL in the browser controlled by the Selenium WebDriver. The page={i} parameter uses the i variable to step through the result pages in the Safari web browser.

- driver.page_source:

A command used to retrieve the HTML source code of the web page currently loaded by the WebDriver.

- soup = BeautifulSoup(html, 'html.parser'):

BeautifulSoup is used to parse (analyze) the HTML source code, and the parsed result is stored in the variable soup.

- container_produk = soup.find_all('div', {"class": "bl-flex-item mb-8"}):

container_produk is a list-like variable that holds all "product" elements (e.g., product name, price, etc.) found on the web page. These elements are containers in div tags with the class "bl-flex-item mb-8", representing full product data on one page of Bukalapak.com.

- for container in container_produk::

This loop assigns each element of container_produk in turn to the variable container. container_produk holds the product elements from the web page, that is, the div tags with the class "bl-flex-item mb-8".

- nama = container.find("p", {"class": "bl-text bl-text--body-14 bl-text--secondary bl-text--ellipsis__2"}):

This calls find() on the container to return the first p element whose class is 'bl-text bl-text--body-14 bl-text--secondary bl-text--ellipsis__2'. This element holds the product's name, and the result is assigned to the variable nama.

- if nama != None::

A conditional statement that checks whether the search found an element. find() returns None when nothing matches, so this branch runs only when nama holds an element.

- nama_brand.append(nama.text):

If an element is found and has a value, it is extracted as plain text (removing HTML tags) and appended to the nama_brand list.

- else: nama_brand.append(nama):

If no element is found, None is appended to the nama_brand list instead, so that every list keeps exactly one entry per product.

The same pattern is applied to fill the other lists, such as harga_hp (price) and rating_brand (ratings).
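
The steps described above can be sketched without a browser by parsing a small inline HTML string. This is an illustration only: the markup, the short class names "card", "name", and "price", and the helper text_or_none are assumptions made for the example, while the URL pattern is the one used in the scraping cell.

```python
from bs4 import BeautifulSoup

# range(1, 11) stops before 11, so this builds the URLs for pages 1 to 10.
base = "https://www.bukalapak.com/c/handphone/hp-smartphone/hp-android?page="
page_urls = [f"{base}{i}" for i in range(1, 11)]

# Hypothetical markup with simplified class names; the real page uses
# the longer Bukalapak class strings shown in the scraping cell.
html = """
<div class="card"><p class="name"> Phone A </p><p class="price">1.000.000</p></div>
<div class="card"><p class="name"> Phone B </p></div>
"""

def text_or_none(parent, tag, class_name):
    """Return the text of the first matching child, or None if absent."""
    element = parent.find(tag, {"class": class_name})
    return element.text if element is not None else None

soup = BeautifulSoup(html, "html.parser")
names, prices = [], []
for card in soup.find_all("div", {"class": "card"}):
    # One entry per card in every list, even when a field is missing.
    names.append(text_or_none(card, "p", "name"))
    prices.append(text_or_none(card, "p", "price"))

print(len(page_urls))
print(names)
print(prices)
```

Wrapping the find-then-check step in one helper is a design choice for brevity; it behaves the same as the repeated if/else blocks in the notebook.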

In [44]:
data = pd.DataFrame({"Nama_Produk":nama_brand,
                    "Harga":harga_hp,
                    "Total_Terjual":total_terjual,
                     "Rating":rating_brand,
                     "Nama_Toko":toko_penjual,
                     "Kota_Toko":lokasi_penjual})
data

Unnamed: 0,Nama_Produk,Harga,Total_Terjual,Rating,Nama_Toko,Kota_Toko
0,\n Samsung Galaxy A12 4 128 GB Ga...,\n 2.476.000\n,\n Terjual 20612\n,\n 4.9\n,\n Rdj_cell\n ...,\n Jakarta Pusat\n ...
1,\n Xiaomi Redmi 9A 2 32 GB Garans...,\n 1.392.000\n,\n Terjual 16008\n,\n 5\n,\n Rdj_cell\n ...,\n Jakarta Pusat\n ...
2,\n Xiaomi Redmi 9c 4 64 GB Garans...,\n 1.941.000\n,\n Terjual 11334\n,\n 5\n,\n Rdj_cell\n ...,\n Jakarta Pusat\n ...
3,\n Infinix Hot 11 Play 4 64 GB Ga...,\n 1.834.000\n,\n Terjual 7490\n,\n 5\n,\n Rdj_cell\n ...,\n Jakarta Pusat\n ...
4,\n Infinix Hot 11s NFC 6 128 GB G...,\n 2.469.000\n,\n Terjual 6764\n,\n 5\n,\n Rdj_cell\n ...,\n Jakarta Pusat\n ...
...,...,...,...,...,...,...
95,\n iPhone 11 256GB 128GB 64GB Ful...,\n 4.361.000\n,\n Terjual 140\n,\n 5\n,\n Mimin store\n ...,\n Jakarta Barat\n ...
96,\n Softcase blackmatte Vivo V9 17...,\n 7.500\n,\n Terjual 139\n,\n 4.8\n,\n Farah_collection\n ...,\n Kediri\n
97,\n iPhone XS MAX 512GB 256GB 64GB...,\n 5.500.000\n,\n Terjual 138\n,\n 4.9\n,\n Mimin store\n ...,\n Jakarta Barat\n ...
98,\n Softcase auto fokus Vivo Y17 Y...,\n 9.500\n,\n Terjual 136\n,\n 4.9\n,\n Farah_collection\n ...,\n Kediri\n


- data = pd.DataFrame:

- This step involves converting the objects obtained from web scraping the Bukalapak.com website into a complete table format consisting of an index, rows, and columns, commonly known as a DataFrame.

- The DataFrame above contains data with an index ranging from 0 to 99, 100 rows, and 6 columns.
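
As a rough illustration of the conversion, the sketch below builds a DataFrame from two short lists and inspects its shape and index. The values are hypothetical stand-ins for the scraped lists. It also shows that pandas rejects lists of different lengths, which is why the scraping loop appends None for missing fields.

```python
import pandas as pd

# Hypothetical values standing in for the scraped lists.
nama = ["Phone A", "Phone B", "Phone C"]
harga = ["1.000.000", "2.500.000", None]

sample = pd.DataFrame({"Nama_Produk": nama, "Harga": harga})

print(sample.shape)        # (rows, columns)
print(list(sample.index))  # default integer index starting at 0

# Lists of unequal length cannot form the columns of one DataFrame.
try:
    pd.DataFrame({"Nama_Produk": nama, "Harga": harga[:2]})
    mismatch_rejected = False
except ValueError:
    mismatch_rejected = True

print(mismatch_rejected)
```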

In [47]:
# Each pair is (text to find, text to substitute).
replacement = [(".",""),("\n",""),(" ","")]
for old, new in replacement:
    data['Harga'] = data['Harga'].str.replace(old, new)

replacement = [("\n","")]
for old, new in replacement:
    data['Nama_Produk'] = data['Nama_Produk'].str.replace(old, new)

replacement = [("\n",""),("_"," ")]
for old, new in replacement:
    data['Nama_Toko'] = data['Nama_Toko'].str.replace(old, new)

replacement = [("\n","")]
for old, new in replacement:
    data['Kota_Toko'] = data['Kota_Toko'].str.replace(old, new)

replacement = [("\n",""),("Terjual","")]
for old, new in replacement:
    data['Total_Terjual'] = data['Total_Terjual'].str.replace(old, new)

replacement = [("\n","")]
for old, new in replacement:
    data['Rating'] = data['Rating'].str.replace(old, new)


   
data

Unnamed: 0,Nama_Produk,Harga,Total_Terjual,Rating,Nama_Toko,Kota_Toko
0,Samsung Galaxy A12 4 128 GB Gara...,2476000,20612,4.9,Rdj cell,Jakarta Pusat
1,Xiaomi Redmi 9A 2 32 GB Garansi ...,1392000,16008,5,Rdj cell,Jakarta Pusat
2,Xiaomi Redmi 9c 4 64 GB Garansi ...,1941000,11334,5,Rdj cell,Jakarta Pusat
3,Infinix Hot 11 Play 4 64 GB Gara...,1834000,7490,5,Rdj cell,Jakarta Pusat
4,Infinix Hot 11s NFC 6 128 GB Gar...,2469000,6764,5,Rdj cell,Jakarta Pusat
...,...,...,...,...,...,...
95,iPhone 11 256GB 128GB 64GB Fulls...,4361000,140,5,Mimin store,Jakarta Barat
96,Softcase blackmatte Vivo V9 1727...,7500,139,4.8,Farah collection ...,Kediri
97,iPhone XS MAX 512GB 256GB 64GB f...,5500000,138,4.9,Mimin store,Jakarta Barat
98,Softcase auto fokus Vivo Y17 Y15...,9500,136,4.9,Farah collection ...,Kediri


- .str.replace():

- Definition: A Pandas method that replaces specific substrings within the values of a string-type column. It is useful for modifying part or all of the text in each element.

- It can be used to remove or substitute characters such as symbols and commas, or to insert text, in string-type data.

- replacement = [(".", ""), ("\n", ""), (" ", "")]:

- Starts by creating a variable replacement containing elements to replace the content of a column's rows. In this case, it is used to remove the symbols ".", "\n", and spaces " ".

- for old, new in replacement:

- An iteration command that unpacks each pair in the replacement variable into old (the text to find) and new (the text to substitute).

- data['Harga'] = data['Harga'].str.replace(old, new):

- Updates the content of the 'Harga' column by replacing every occurrence of old with new.
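
A small sketch of the same replacement loop on hypothetical raw values shaped like the scraped 'Harga' column. Passing regex=False is an addition not present in the notebook; it makes explicit that each pattern is literal text, which matters for "." because a dot means "any character" in a regular expression.

```python
import pandas as pd

# Hypothetical raw values shaped like the scraped 'Harga' column.
raw = pd.Series(["\n  2.476.000\n", "\n  7.500\n", None])

cleaned = raw
for old, new in [(".", ""), ("\n", ""), (" ", "")]:
    # regex=False treats each pattern as literal text, so "." matches
    # only a dot and not "any character".
    cleaned = cleaned.str.replace(old, new, regex=False)

print(cleaned.tolist())
```

Missing entries stay missing after the replacements, so they can still be detected and dropped later.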


In [58]:
data['Nama_Produk']=data['Nama_Produk'].str.title()
data['Nama_Toko']=data['Nama_Toko'].str.title()
data

Unnamed: 0,Nama_Produk,Harga,Total_Terjual,Rating,Nama_Toko,Kota_Toko
0,Samsung Galaxy A12 4 128 Gb Gara...,2476000,20612,4.9,Rdj Cell,Jakarta Pusat
1,Xiaomi Redmi 9A 2 32 Gb Garansi ...,1392000,16008,5.0,Rdj Cell,Jakarta Pusat
2,Xiaomi Redmi 9C 4 64 Gb Garansi ...,1941000,11334,5.0,Rdj Cell,Jakarta Pusat
3,Infinix Hot 11 Play 4 64 Gb Gara...,1834000,7490,5.0,Rdj Cell,Jakarta Pusat
4,Infinix Hot 11S Nfc 6 128 Gb Gar...,2469000,6764,5.0,Rdj Cell,Jakarta Pusat
...,...,...,...,...,...,...
95,Iphone 11 256Gb 128Gb 64Gb Fulls...,4361000,140,5.0,Mimin Store,Jakarta Barat
96,Softcase Blackmatte Vivo V9 1727...,7500,139,4.8,Farah Collection ...,Kediri
97,Iphone Xs Max 512Gb 256Gb 64Gb F...,5500000,138,4.9,Mimin Store,Jakarta Barat
98,Softcase Auto Fokus Vivo Y17 Y15...,9500,136,4.9,Farah Collection ...,Kediri


- The above demonstrates the use of the .str.title() method, which converts the first letter of each word in a string column to uppercase and the remaining letters to lowercase. This is why "GB" becomes "Gb" and "iPhone" becomes "Iphone" in the output.
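
A short sketch of the effect on hypothetical product names. Note that a letter following a digit also counts as the start of a word, so "9c" becomes "9C".

```python
import pandas as pd

names = pd.Series(["iPhone 11 128GB", "Xiaomi Redmi 9c", "infinix hot 11s NFC"])

# Title-case every value: first letter of each word upper, the rest lower.
titled = names.str.title()

print(titled.tolist())
# ['Iphone 11 128Gb', 'Xiaomi Redmi 9C', 'Infinix Hot 11S Nfc']
```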

In [48]:
data['Harga'] = data['Harga'].astype(int)

data['Total_Terjual'] = data['Total_Terjual'].astype(int)

data['Rating'] = data['Rating'].astype(float)



- The above demonstrates the .astype() method, which changes the data type of a DataFrame column. Here it converts the object (string) columns "Harga" and "Total_Terjual" to int, and "Rating" to float.
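
One caveat worth sketching: astype(int) raises an error if a column still contains missing values. A possible alternative, shown here on a hypothetical series and not taken from the notebook, is pd.to_numeric with errors="coerce" followed by the nullable "Int64" dtype.

```python
import pandas as pd

# Hypothetical cleaned price strings, one of them missing.
prices = pd.Series(["2476000", "7500", None], dtype=object)

# errors="coerce" turns unparseable or missing entries into NaN
# instead of raising, which plain astype(int) would do here.
numeric = pd.to_numeric(prices, errors="coerce")

# The nullable "Int64" dtype keeps whole numbers while allowing <NA>.
nullable_int = numeric.astype("Int64")

print(numeric.dtype)
print(nullable_int.tolist())
```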


In [66]:
datadrop = data.dropna()
datadrop

Unnamed: 0,Nama_Produk,Harga,Total_Terjual,Rating,Nama_Toko,Kota_Toko
0,Samsung Galaxy A12 4 128 Gb Gara...,2476000,20612,4.9,Rdj Cell,Jakarta Pusat
1,Xiaomi Redmi 9A 2 32 Gb Garansi ...,1392000,16008,5.0,Rdj Cell,Jakarta Pusat
2,Xiaomi Redmi 9C 4 64 Gb Garansi ...,1941000,11334,5.0,Rdj Cell,Jakarta Pusat
3,Infinix Hot 11 Play 4 64 Gb Gara...,1834000,7490,5.0,Rdj Cell,Jakarta Pusat
4,Infinix Hot 11S Nfc 6 128 Gb Gar...,2469000,6764,5.0,Rdj Cell,Jakarta Pusat
...,...,...,...,...,...,...
95,Iphone 11 256Gb 128Gb 64Gb Fulls...,4361000,140,5.0,Mimin Store,Jakarta Barat
96,Softcase Blackmatte Vivo V9 1727...,7500,139,4.8,Farah Collection ...,Kediri
97,Iphone Xs Max 512Gb 256Gb 64Gb F...,5500000,138,4.9,Mimin Store,Jakarta Barat
98,Softcase Auto Fokus Vivo Y17 Y15...,9500,136,4.9,Farah Collection ...,Kediri


- The above demonstrates the use of the .dropna() method, which removes rows containing missing (NULL/None) values. Here it drops products with at least one missing field so that every column has the same number of valid values for further processing. The result is stored in the variable datadrop.

- It can be concluded that this DataFrame consists of 97 rows of valid product data.

In [68]:
datadrop.info()

<class 'pandas.core.frame.DataFrame'>
Index: 97 entries, 0 to 99
Data columns (total 6 columns):
 #   Column         Non-Null Count  Dtype  
---  ------         --------------  -----  
 0   Nama_Produk    97 non-null     object 
 1   Harga          97 non-null     int64  
 2   Total_Terjual  97 non-null     int64  
 3   Rating         97 non-null     float64
 4   Nama_Toko      97 non-null     object 
 5   Kota_Toko      97 non-null     object 
dtypes: float64(1), int64(2), object(3)
memory usage: 5.3+ KB


- The above demonstrates the use of the .info() method to view the structure of the DataFrame. The output confirms that every column now holds 97 non-null values across 6 columns. The index labels still run from 0 to 99 because .dropna() removes the incomplete rows from the original 100 without renumbering the ones that remain.

- After processing with the astype() method, the "Harga" and "Total_Terjual" columns, previously of the object type, are now int64, and "Rating" is now float64.
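
A minimal sketch of the index behaviour on a hypothetical four-row frame: dropna() keeps the original labels of the surviving rows, and reset_index(drop=True) is one way to renumber them from zero if a contiguous index is preferred. The notebook itself does not reset the index.

```python
import pandas as pd

sample = pd.DataFrame(
    {"Nama_Produk": ["A", "B", "C", "D"], "Rating": [4.9, None, 5.0, None]}
)

# Rows 1 and 3 have a missing rating and are removed.
dropped = sample.dropna()
print(list(dropped.index))     # original labels survive: [0, 2]

# Optional: relabel the remaining rows from zero.
renumbered = dropped.reset_index(drop=True)
print(list(renumbered.index))  # [0, 1]
```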

In [69]:
datadrop.head()

Unnamed: 0,Nama_Produk,Harga,Total_Terjual,Rating,Nama_Toko,Kota_Toko
0,Samsung Galaxy A12 4 128 Gb Gara...,2476000,20612,4.9,Rdj Cell,Jakarta Pusat
1,Xiaomi Redmi 9A 2 32 Gb Garansi ...,1392000,16008,5.0,Rdj Cell,Jakarta Pusat
2,Xiaomi Redmi 9C 4 64 Gb Garansi ...,1941000,11334,5.0,Rdj Cell,Jakarta Pusat
3,Infinix Hot 11 Play 4 64 Gb Gara...,1834000,7490,5.0,Rdj Cell,Jakarta Pusat
4,Infinix Hot 11S Nfc 6 128 Gb Gar...,2469000,6764,5.0,Rdj Cell,Jakarta Pusat


- The above demonstrates the use of the .head() method in Pandas, which is used to display the first 5 rows of a DataFrame.







In [70]:
datadrop.tail()

Unnamed: 0,Nama_Produk,Harga,Total_Terjual,Rating,Nama_Toko,Kota_Toko
95,Iphone 11 256Gb 128Gb 64Gb Fulls...,4361000,140,5.0,Mimin Store,Jakarta Barat
96,Softcase Blackmatte Vivo V9 1727...,7500,139,4.8,Farah Collection ...,Kediri
97,Iphone Xs Max 512Gb 256Gb 64Gb F...,5500000,138,4.9,Mimin Store,Jakarta Barat
98,Softcase Auto Fokus Vivo Y17 Y15...,9500,136,4.9,Farah Collection ...,Kediri
99,Case Koper Samsung Galaxy A71 A7...,12500,136,4.9,Farah Collection ...,Kediri


- The above demonstrates the use of the .tail() method in Pandas, which is used to display the last 5 rows of a DataFrame.







In [139]:
tabelutama = datadrop.copy()  # explicit copy avoids pandas' SettingWithCopyWarning

tabelutama["Nama_Produk"] = tabelutama["Nama_Produk"].str.strip()
tabelutama["Nama_Toko"] = tabelutama["Nama_Toko"].str.strip()
tabelutama["Kota_Toko"] = tabelutama["Kota_Toko"].str.strip()

tabelutama.head()


Unnamed: 0,Nama_Produk,Harga,Total_Terjual,Rating,Nama_Toko,Kota_Toko
0,Samsung Galaxy A12 4 128 Gb Garansi Resmi Sein,2476000,20612,4.9,Rdj Cell,Jakarta Pusat
1,Xiaomi Redmi 9A 2 32 Gb Garansi Resmi,1392000,16008,5.0,Rdj Cell,Jakarta Pusat
2,Xiaomi Redmi 9C 4 64 Gb Garansi Resmi,1941000,11334,5.0,Rdj Cell,Jakarta Pusat
3,Infinix Hot 11 Play 4 64 Gb Garansi Resmi,1834000,7490,5.0,Rdj Cell,Jakarta Pusat
4,Infinix Hot 11S Nfc 6 128 Gb Garansi Resmi,2469000,6764,5.0,Rdj Cell,Jakarta Pusat


- Finally, the use of the .str.strip() method is demonstrated. It removes leading and trailing whitespace from string data without affecting spaces inside the text. The result is stored in a new variable, "tabelutama", which is used for some further simple data processing before the data is exported to CSV.
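
A sketch of the same cleanup on a hypothetical frame. Assigning columns on a DataFrame derived from another one can make pandas emit a SettingWithCopyWarning; taking an explicit .copy() first is one common way to make the intent clear. The names source and table are assumptions for this example.

```python
import pandas as pd

source = pd.DataFrame(
    {"Nama_Toko": ["  Rdj Cell  ", "Mimin Store ", None], "Rating": [4.9, 5.0, None]}
)

# copy() makes the result an independent DataFrame, so the column
# assignment below is unambiguous to pandas.
table = source.dropna().copy()

# strip() trims only the edges; the space inside "Rdj Cell" is kept.
table["Nama_Toko"] = table["Nama_Toko"].str.strip()

print(table["Nama_Toko"].tolist())
print(repr(source["Nama_Toko"].iloc[0]))  # the source frame is unchanged
```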







In [101]:
tabelutama[(tabelutama['Harga']>=3000000) & (tabelutama['Rating']> 4.5)]

Unnamed: 0,Nama_Produk,Harga,Total_Terjual,Rating,Nama_Toko,Kota_Toko
5,Xiaomi Redmi Note 10S 8 128 Gb Garasni Resmi,3794000,5656,5.0,Rdj Cell,Jakarta Pusat
16,Xiaomi Redmi Note 10S 6 128 Gb Garansi Resmi,3104000,1442,4.9,Rdj Cell,Jakarta Pusat
19,Samsung Galaxy A52 8 128 Gb Garansi Resmi Sein,5128000,1356,5.0,Rdj Cell,Jakarta Pusat
50,Xiaomi Redmi Note 11 Pro 5G 8 128 Gb Garansi R...,4518000,290,5.0,Rdj Cell,Jakarta Pusat
51,Xiaomi Redmi Note 10 5G 8 128 Gb - Garansi Res...,3495000,290,5.0,Akbarstore,Jakarta Utara
56,Samsung Galaxy S20 Fe 8 128Gb Grs Resmi Sein,6999000,285,5.0,Pluto Sellular,Jakarta Timur
59,Xiaomi Redmi Note 10S 6 64Gb 8 128Gb - Grs Res...,3029000,278,5.0,Pluto Sellular,Jakarta Timur
64,Xiaomi Poco M3 Pro 5G 6 128 Gb - Garansi Resmi...,3283000,229,5.0,Akbarstore,Jakarta Utara
67,Xiaomi Redmi Note 10 Pro 6 128 Gb Garansi Resmi,3839000,225,5.0,Rdj Cell,Jakarta Pusat
78,Samsung Galaxy A22 5G 6 128 Gb - Garansi Resmi...,3594000,182,4.9,Akbarstore,Jakarta Utara


- The above demonstrates the use of multiple conditions to find product data that has a price greater than or equal to three million and a rating above 4.5.
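
The filter can be read as two boolean Series combined row by row. The sketch below uses a hypothetical four-row frame with the same column names and thresholds.

```python
import pandas as pd

sample = pd.DataFrame(
    {
        "Nama_Produk": ["A", "B", "C", "D"],
        "Harga": [3794000, 2476000, 5128000, 3000000],
        "Rating": [5.0, 4.9, 4.4, 4.6],
    }
)

# Each comparison yields a boolean Series; & combines them row by row.
# The parentheses are required because & binds tighter than >= and >.
mask = (sample["Harga"] >= 3_000_000) & (sample["Rating"] > 4.5)
selected = sample[mask]

print(selected["Nama_Produk"].tolist())  # ['A', 'D']
```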







In [104]:
tabelutama[tabelutama['Nama_Produk'].str.contains('Samsung')]


Unnamed: 0,Nama_Produk,Harga,Total_Terjual,Rating,Nama_Toko,Kota_Toko
0,Samsung Galaxy A12 4 128 Gb Garansi Resmi Sein,2476000,20612,4.9,Rdj Cell,Jakarta Pusat
6,Samsung M12 4 64 Gb Garasi Resmi Sein,2148000,5655,4.9,Rdj Cell,Jakarta Pusat
10,Samsung Galaxy A03 3 32 Gb Garansi Resmi Sein,1639000,1616,5.0,Rdj Cell,Jakarta Pusat
17,Samsung Galaxy A11 3Gb,2000000,1414,5.0,Arnold,Jakarta Barat
19,Samsung Galaxy A52 8 128 Gb Garansi Resmi Sein,5128000,1356,5.0,Rdj Cell,Jakarta Pusat
20,Samsung Galaxy A01 2Gb,1500000,873,5.0,Arnold,Jakarta Barat
23,Samsung Galaxy A13 4 128 Gb Garansi Resmi,2692000,812,5.0,Rdj Cell,Jakarta Pusat
26,Anti Crack Akrilik Samsung Grand 2 Duos 2016 A...,13000,778,4.9,Farah Collection,Kediri
29,Anti Crack Samsung J3 J7 Pro A3 2015 A12 Note ...,8000,737,4.9,Farah Collection,Kediri
34,Softcase Permen Samsung Galaxy A20 A30 M10 A20...,10500,539,4.9,Farah Collection,Kediri


- The above is an example of processing string-type data using the .str.contains method, which in this case is used to search and display the rows in the 'Nama_Produk' column that contain the word 'Samsung'.
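
A possible variation, sketched on hypothetical names: case=False makes the match case-insensitive, and na=False treats missing names as non-matches so the result can be used directly as a boolean mask. Neither argument is used in the notebook cell.

```python
import pandas as pd

names = pd.Series(
    ["Samsung Galaxy A12", "Softcase SAMSUNG J3", "Xiaomi Redmi 9A", None]
)

# case=False ignores letter case; na=False maps missing values to False.
mask = names.str.contains("samsung", case=False, na=False)

print(names[mask].tolist())
```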

In [128]:
tabelutama.sort_values(['Total_Terjual', 'Nama_Produk'],ascending=False) 

Unnamed: 0,Nama_Produk,Harga,Total_Terjual,Rating,Nama_Toko,Kota_Toko
0,Samsung Galaxy A12 4 128 Gb Garansi Resmi Sein,2476000,20612,4.9,Rdj Cell,Jakarta Pusat
1,Xiaomi Redmi 9A 2 32 Gb Garansi Resmi,1392000,16008,5.0,Rdj Cell,Jakarta Pusat
2,Xiaomi Redmi 9C 4 64 Gb Garansi Resmi,1941000,11334,5.0,Rdj Cell,Jakarta Pusat
3,Infinix Hot 11 Play 4 64 Gb Garansi Resmi,1834000,7490,5.0,Rdj Cell,Jakarta Pusat
4,Infinix Hot 11S Nfc 6 128 Gb Garansi Resmi,2469000,6764,5.0,Rdj Cell,Jakarta Pusat
...,...,...,...,...,...,...
95,Iphone 11 256Gb 128Gb 64Gb Fullset Second Norm...,4361000,140,5.0,Mimin Store,Jakarta Barat
96,Softcase Blackmatte Vivo V9 1727 Y17 Y15 Y12 2...,7500,139,4.8,Farah Collection,Kediri
97,Iphone Xs Max 512Gb 256Gb 64Gb Fullset Second ...,5500000,138,4.9,Mimin Store,Jakarta Barat
98,Softcase Auto Fokus Vivo Y17 Y15 Y12 2019 Y11 ...,9500,136,4.9,Farah Collection,Kediri


- The above displays the products ordered from highest to lowest total sales, using the .sort_values method on the "Total_Terjual" column, with "Nama_Produk" (also in descending order) as the tie-breaker.
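
When sorting by several columns, ascending also accepts one flag per column. The sketch below, on hypothetical data, sorts sales from highest to lowest but breaks ties alphabetically; this differs from the notebook cell, which sorts both columns in descending order.

```python
import pandas as pd

sample = pd.DataFrame(
    {
        "Nama_Produk": ["B phone", "A phone", "C phone"],
        "Total_Terjual": [136, 136, 20612],
    }
)

# Sort by sales from highest to lowest; break ties alphabetically by name.
ranked = sample.sort_values(
    ["Total_Terjual", "Nama_Produk"], ascending=[False, True]
)

print(ranked["Nama_Produk"].tolist())  # ['C phone', 'A phone', 'B phone']
```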







In [131]:
tabelutama.to_csv('coda_P0M1_Mirza_Sjarief.csv', index=False)

- The above shows how to export the DataFrame to a CSV file. The index=False parameter prevents the DataFrame's row index from being written to the file.
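
A small round-trip sketch of the effect of index=False, written to an in-memory buffer instead of a file so that it is self-contained. The two-row frame is hypothetical.

```python
import io
import pandas as pd

sample = pd.DataFrame({"Nama_Produk": ["A", "B"], "Harga": [7500, 9500]})

# Write to an in-memory buffer instead of a file on disk.
buffer = io.StringIO()
sample.to_csv(buffer, index=False)
csv_text = buffer.getvalue()
print(csv_text)

# Reading the text back yields the same columns, with no extra index column.
restored = pd.read_csv(io.StringIO(csv_text))
print(list(restored.columns))
```

Without index=False, the row labels would be written as an unnamed first column, which pandas typically reads back as a column called "Unnamed: 0".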





