# **i. Introduction**

This program helps sellers use a marketplace. When a seller lists a product, the program suggests a suitable product type. It does so with a deep learning classification model built on product data from January 2020 to January 2024. The model predicts the product type from the description the seller writes, and the seller can use that prediction as input when deciding which product type fits their product.

# **ii. Model Inference**

## **A. Import Libraries**

In [55]:
import re
import nltk
import spacy
import numpy as np
import pandas as pd
import tensorflow as tf
from spacy.cli import download
from nltk.corpus import stopwords

## **B. Creating the inference data**

In [None]:
# Create the inference data
Clothing_product_description = '''Fabric:68% Polyurethane+27% Polyester+5% Cotton,High quality faux Leather
Design:Oversized,Plus,Long Sleeve,Faux Leather Jackets,Zip Up Motorcycle Jacket,Fall Outfits,Y2k Fashion,Trendy Clothes,Punk Coat,Moto Biker Outwear,Bomber jacket women.
Match:This Jacket is suit for spring, autumn and winter. Just wear a basic T-shirt with jeans for a casual look or wear a dress shirt under it for formal occasions. This all-match style leather jacket must be an indispensable outerwear in your wardrobe.
'''

book_product_description = '''Today is supposed to be the happiest day of my life.

I'm engaged to the man of my dreams, and in a few short hours, I'm going to stand before a judge, who will declare us husband and wife, till death does us part. Despite some bumps in the road, this day is everything I dreamed it would be.

There's only one problem:

Someone out there doesn't want me to live long enough to say my vows.

And if I'm not careful, they may very well get their wish.

'''

# Convert the inference data into a DataFrame
data={'descriptions': [Clothing_product_description, book_product_description]}

inference_data = pd.DataFrame(data)
inference_data

                                        descriptions
0  Fabric:68% Polyurethane+27% Polyester+5% Cotto...
1  Today is supposed to be the happiest day of my...


Source of Clothing_product_description: https://www.amazon.com/Leggings-Waisted-Control-See-Through-Workout/dp/B09NJHVJ6W/ref=sr_1_3?crid=22TEZP9CVP5YQ&dib=eyJ2IjoiMSJ9.W8LR5TuvYTbCsmq5NPBKcaQzCj32HXVh8iP5W-nZ__tiuABappvAwl1LCW36ufu9QW9kMG9doOxXe4voJGVdHcW98myLd61se7f2dIHABQteS1-3qgsdWjZwLN-JwJYOQt9WXEw5H867V7ec2UpGaTIPHn8IcZ7sFAeG-GfE5fZblUALl8NwjRAjBjBnNLS8fgflNGZgMTWuMm0z9B75I1y5v5sKHCQQa9RBueJbXJruWgotDAMAy2TB_nJ8-cP0di4UA7E4yscMdgYM_LyzI_UxW06-nSpi8LR_rFdKBDk.uVmEucVP4T4RPPsiMkZLVR_B3J_6GA6slEHnjJXo2VE&dib_tag=se&keywords=clothes&qid=1730823107&sprefix=cloth%2Caps%2C386&sr=8-3

Source of book_product_description: https://www.amazon.com/Housemaids-Wedding-Short-Story/dp/B0DLHLBK74/ref=zg_d_sccl_2/141-2764111-1729026?pd_rd_w=19y6q&content-id=amzn1.sym.7f37c16c-1aa6-48d9-bd2d-34f2cb3ae9e0&pf_rd_p=7f37c16c-1aa6-48d9-bd2d-34f2cb3ae9e0&pf_rd_r=BBV4RXXZZXHJ865QE8TR&pd_rd_wg=nF6D0&pd_rd_r=711abdef-da24-4f20-b076-2d3553c82b13&pd_rd_i=B0DLHLBK74&psc=1

## **C. Pre-processing inference data**

In [56]:
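# Function to strip characters that carry no meaning
def f_menghilangkan_karakter_tidak_bermakna(text):
  # Convert to lowercase
  text = text.lower()

  # Delete special characters and digits (they are removed, not replaced by a space)
  text = re.sub(r'[^A-Za-z\s]', '', text)

  # Replace literal "\n" sequences (a backslash followed by "n") with a space.
  # Real newline characters are not matched here; they are collapsed later,
  # when the tokens are re-joined in the stopword step.
  text = re.sub(r'\\n', ' ',text)

  # Trim leading and trailing whitespace
  text = text.strip()

  # Remove website links. Punctuation is already gone at this point, so a URL
  # has been reduced to a single letters-only token that these patterns drop.
  text = re.sub(r"http\S+", " ", text)
  text = re.sub(r"www.\S+", " ", text)

  return text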
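
To see what the cleaning step does in practice, the sketch below applies the same regular expressions to a short made-up string. The helper name `clean_text` and the sample text are assumptions for illustration and are not part of the notebook. It highlights two side effects: punctuation is deleted rather than replaced by a space, so words separated only by a comma are fused (which is why `cottonhigh` shows up in the pre-processed output), and real newline characters pass through this step unchanged.

```python
import re

def clean_text(text):
    # Hypothetical stand-in that mirrors the regex steps of the notebook function
    text = text.lower()
    text = re.sub(r'[^A-Za-z\s]', '', text)   # delete digits and punctuation
    text = re.sub(r'\\n', ' ', text)          # matches a literal backslash + "n" only
    text = text.strip()                       # trim both ends
    text = re.sub(r"http\S+", " ", text)
    text = re.sub(r"www.\S+", " ", text)
    return text

sample = "Fabric:5% Cotton,High quality\nfaux Leather "
cleaned = clean_text(sample)
print(repr(cleaned))  # 'fabric cottonhigh quality\nfaux leather'
```

If fused words are undesirable, substituting a space instead of an empty string would keep them apart, but any such change would presumably need to match the pre-processing used when the model was trained.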

In [57]:
# Download the English stopword vocabulary from nltk
nltk.download('stopwords')
stpwds_en = list(set(stopwords.words('english')))

[nltk_data] Downloading package stopwords to
[nltk_data]     C:\Users\mnuzu\AppData\Roaming\nltk_data...
[nltk_data]   Package stopwords is already up-to-date!


In [58]:
# Function to remove words that carry no meaning (stopwords)
def f_menghilangkan_kata_tidak_bermakna(text):
  # Split the text into word tokens and single punctuation marks
  tokens = re.findall(r'\w+|[^\w\s]', text)

  # Drop the stopwords
  tokens = [word for word in tokens if word not in stpwds_en]

  # Join the remaining tokens back into one string, separated by single spaces
  text = ' '.join(tokens)

  return text
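
As a rough illustration of the stopword step, the sketch below runs the same tokenize, filter, and join sequence on a made-up sentence. The function name `remove_stopwords` and the tiny `demo_stopwords` set are assumptions chosen so the example runs without downloading anything; the notebook itself uses the full English list from nltk. Because the tokens are re-joined with single spaces, newlines and repeated spaces in the input are normalized as a side effect.

```python
import re

# Small hand-picked stopword set used only for this illustration;
# the notebook uses the full English list from nltk instead.
demo_stopwords = {"is", "to", "be", "the", "of", "my"}

def remove_stopwords(text, stopword_set):
    # \w+ captures runs of word characters; [^\w\s] captures single punctuation marks
    tokens = re.findall(r'\w+|[^\w\s]', text)
    kept = [word for word in tokens if word not in stopword_set]
    return ' '.join(kept)

result = remove_stopwords("today is supposed to be\nthe happiest   day of my life", demo_stopwords)
print(result)  # today supposed happiest day life
```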

In [59]:
# Download en_core_web_sm from spacy
download('en_core_web_sm')
nlp = spacy.load('en_core_web_sm')

✔ Download and installation successful
You can now load the package via spacy.load('en_core_web_sm')
⚠ Restart to reload dependencies
If you are in a Jupyter or Colab notebook, you may need to restart Python in
order to load all the package's dependencies. You can do this by selecting the
'Restart kernel' or 'Restart runtime' option.


In [60]:
# Function to reduce words to their base form, so inflected variants count as the same word
def f_menghilangkan_kata_bermakna_sama(text):
  # Apply lemmatization
  tokens = [token.lemma_ for token in nlp(text)]
  text = ' '.join(tokens)

  return text

In [61]:
# Strip characters that carry no meaning
df_temp = inference_data['descriptions'].apply(f_menghilangkan_karakter_tidak_bermakna)
# Remove stopwords
df_temp = df_temp.apply(f_menghilangkan_kata_tidak_bermakna)
# Reduce words to their base form with lemmatization
inference_data_pre_processed = df_temp.apply(f_menghilangkan_kata_bermakna_sama)
inference_data_pre_processed

0    fabric polyurethane polyester cottonhigh quali...
1    today suppose happy day life I m engaged man d...
Name: descriptions, dtype: object

## **D. Model Prediction with inference data**

In [63]:
# Load the saved model
loaded_model = tf.keras.models.load_model("model_2")

In [51]:
# Predict on the inference data
predictions = loaded_model.predict(inference_data_pre_processed)

# Find the class index with the highest predicted score for each row
vector_predicted = np.argmax(predictions, axis=1)

# Mapping from class index to product type
mapping_dict = {0: 'Household', 1: 'Books', 2: 'Clothing & Accessories', 3: 'Electronics'}

# Convert the predicted indices into product types
tipe_produk_predicted = np.vectorize(mapping_dict.get)(vector_predicted)

print("The predicted product types are:", tipe_produk_predicted)

The predicted product types are: ['Clothing & Accessories' 'Books']
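
The step from raw model output to a product type can be sketched without TensorFlow or the saved model. The example below assumes the model returns one row of four class scores per description; the numbers in `fake_predictions` are made up for illustration and are not actual model output, and the helper `top_class` is a hypothetical plain-Python equivalent of `np.argmax` on a single row.

```python
# Made-up class scores for two inputs (illustrative values only,
# not actual model output); one row per description, one column per class.
fake_predictions = [
    [0.05, 0.10, 0.80, 0.05],
    [0.02, 0.90, 0.03, 0.05],
]

# Same index-to-label mapping as in the notebook
mapping_dict = {0: 'Household', 1: 'Books', 2: 'Clothing & Accessories', 3: 'Electronics'}

def top_class(row):
    # Index of the largest value in the row, like np.argmax on a 1-D array
    return max(range(len(row)), key=row.__getitem__)

labels = [mapping_dict[top_class(row)] for row in fake_predictions]
print(labels)  # ['Clothing & Accessories', 'Books']
```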
