## **🌍 Algorithm Selection for Location-Based Recommendation**

🔍 **Strategy**: User input combines a **numeric feature (average temperature)** with a **categorical feature (climate)**. The model needs to handle **non-linear relationships** and be able to **adapt** as more features are added while the system is integrated.
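
As a rough illustration of how a numeric and a categorical feature can be combined into one input matrix, the sketch below one-hot encodes the climate and standardizes the temperature. The toy rows and the `build_features` helper are assumptions made for illustration; the notebook itself feeds integer category codes and unscaled temperatures to the model.

```python
import numpy as np
import pandas as pd

def build_features(frame: pd.DataFrame) -> np.ndarray:
    """One-hot encode `climate` and standardize `avg_temp` (hypothetical helper)."""
    # One column per distinct climate value
    climate_onehot = pd.get_dummies(frame["climate"], dtype=float)
    # Zero mean, unit variance for the numeric feature
    temp = frame["avg_temp"].astype(float)
    temp_scaled = (temp - temp.mean()) / temp.std(ddof=0)
    return np.column_stack([climate_onehot.to_numpy(), temp_scaled.to_numpy()])

# Toy rows, made up for illustration
toy = pd.DataFrame({
    "climate": ["Tropical", "Subtropical", "Tropical humid", "Tropical"],
    "avg_temp": [23.0, 21.0, 21.0, 20.0],
})
features = build_features(toy)
print(features.shape)  # (4, 4): three climate columns plus one temperature column
```

One-hot columns avoid implying an order between climates, at the cost of a wider input layer.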

🧠 **Decision**: A **Neural Network (MLP via TensorFlow)** was chosen because it:
- Supports **complex representations** of mixed features.
- Suits **long-term development**, such as API integration and deployment.
- Can be tuned with **early stopping, dropout**, and other regularization techniques.
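
The regularization options listed above could be wired into a Keras model roughly as follows. This is a minimal sketch, not the notebook's training code: the helper name `build_regularized_mlp`, the dropout rate of 0.2, the patience of 5 epochs, and the random toy data are all assumptions chosen for illustration.

```python
import numpy as np
import tensorflow as tf

def build_regularized_mlp(n_features: int, n_classes: int) -> tf.keras.Model:
    """Small MLP with dropout after each hidden layer (hypothetical helper)."""
    model = tf.keras.Sequential([
        tf.keras.layers.Input(shape=(n_features,)),
        tf.keras.layers.Dense(16, activation="relu"),
        tf.keras.layers.Dropout(0.2),  # randomly zero 20% of activations during training
        tf.keras.layers.Dense(32, activation="relu"),
        tf.keras.layers.Dropout(0.2),
        tf.keras.layers.Dense(n_classes, activation="softmax"),
    ])
    model.compile(optimizer="adam",
                  loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])
    return model

# Stop once validation loss has not improved for 5 epochs and keep the best weights
early_stop = tf.keras.callbacks.EarlyStopping(
    monitor="val_loss", patience=5, restore_best_weights=True)

# Random toy data, made up for illustration only
rng = np.random.default_rng(0)
X_toy = rng.normal(size=(60, 2)).astype("float32")
y_toy = rng.integers(0, 3, size=60)

mlp = build_regularized_mlp(n_features=2, n_classes=3)
history = mlp.fit(X_toy, y_toy, epochs=20, validation_split=0.2,
                  callbacks=[early_stop], verbose=0)
print(len(history.history["val_loss"]))  # epochs actually run, at most 20
```

Early stopping needs a validation signal, which is why `validation_split` is passed to `fit` here.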

📊 **Alternatives**: SVM offers strong margin-based classification, but it is hard to _scale_ and slow at _inference_ on large datasets. XGBoost performs very well on tabular data, but its architecture is less flexible than modern neural models when it comes to real-time integration and fine-tuning.

---

### 📋 Algorithm Comparison Table

| Evaluation Aspect                    | ✅ MLP (TensorFlow)          | 🟡 SVM (Support Vector Machine) | 🔴 XGBoost                    |
|--------------------------------------|------------------------------|----------------------------------|-------------------------------|
| Mixed Feature Handling               | ✔️ Very good                 | ⚠️ Needs preprocessing           | ✔️ Good                       |
| Non-Linearity Support                | ✔️ Very flexible             | ✔️ Strong via kernels            | ✔️ Via boosted trees          |
| Model Scalability                    | ✔️ Very high                 | ❌ Poor                          | ⚠️ Moderate                   |
| Deployment & API Integration         | ✔️ Easy via TensorFlow       | ❌ Complex                       | ⚠️ Requires model conversion  |
| Regularization & Overfitting Control | ✔️ Dropout, early stopping   | ⚠️ Limited                       | ✔️ Built-in                   |
| Interpretability                     | ⚠️ Low                       | ⚠️ Difficult                     | ✔️ Fairly clear               |

---

📌 Conclusion: **MLP TensorFlow** stands out for its flexibility and its ability to generalize, which suits web integration and long-term system development. This makes it well suited to location-based plant recommendation with mixed data.

## **1️⃣ Install & Import Libraries**

In [None]:
import pandas as pd
import numpy as np
import gdown
import tensorflow as tf
from sklearn.preprocessing import LabelEncoder
from sklearn.model_selection import train_test_split

## **2️⃣ Download & Load the Houseplants Dataset**

In [None]:
url = "https://drive.google.com/uc?id=1tE628fdnq32SR8_LrZxJGoTAZ8bTCFvu"
gdown.download(url, "Houseplants.csv", quiet=False)
df = pd.read_csv("Houseplants.csv")
df.head(5)

Downloading...
From: https://drive.google.com/uc?id=1tE628fdnq32SR8_LrZxJGoTAZ8bTCFvu
To: /content/Houseplants.csv
100%|██████████| 72.2k/72.2k [00:00<00:00, 28.0MB/s]


| | latin | family | category | climate | ideallight | toleratedlight | watering | insects | use | tempmax_celsius | tempmin_celsius | temp_avg | combined |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | Aeschynanthus lobianus | Gesneriaceae | Hanging | Tropical | Bright light | Direct sunlight | Keep moist between watering. Can be a bit dry ... | Mealy bug, Aphid, Thrips | Hanging, Flower, Tertiary | 32 | 14 | 23.0 | Tropical Bright light Direct sunlight Keep moi... |
| 1 | Adiantum raddianum | Polypodiaceae | Fern | Tropical | Bright light | Diffused | Keep moist between watering. Must not be dry b... | Mealy bug, Aphid, Snail | Potted plant, Ground cover, Table top | 30 | 12 | 21.0 | Tropical Bright light Diffused Keep moist betw... |
| 2 | Aechmea fatsiata | Bromeliaceae | Bromeliad | Tropical humid | Bright light | Diffused | Water when soil is half dry. Change water in t... | | Flower, Table top, Tertiary | 30 | 12 | 21.0 | Tropical humid Bright light Diffused Water whe... |
| 3 | Agave angustilolia Marginata | Amaryllidaceae | Cactus And Succulent | Tropical | 6 or more hours of direct sunlight per day. | Direct sunlight. | Water only when the soil is dry. Must be dry b... | Scale, Mealy bug | Potted plant, Primary, Secondary | 35 | 5 | 20.0 | Tropical 6 or more hours of direct sunlight pe... |
| 4 | Aechmea ramosa | Bromeliaceae | Bromeliad | Subtropical | Bright light | Diffused | Water when soil is half dry. Change water in t... | | Flower, Table top, Primary | 30 | 12 | 21.0 | Subtropical Bright light Diffused Water when s... |


## **3️⃣ Preprocessing Data**

In [None]:
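# Drop rows missing any field used for training or for the final lookup
df = df.dropna(subset=["climate", "tempmin_celsius", "tempmax_celsius", "combined", "latin"])
# Midpoint of each plant's min/max temperature range, in °C
df['avg_temp'] = (df['tempmin_celsius'] + df['tempmax_celsius']) / 2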

## **4️⃣ Encode Categorical Feature**

In [None]:
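# Map each climate label to an integer code (codes follow the sorted category order)
df['climate_cat'] = df['climate'].astype('category').cat.codes
# Lookup table from climate label to its code
climate_encoder = dict(zip(df['climate'], df['climate_cat']))

# Features: climate code and average temperature
X = df[['climate_cat', 'avg_temp']].values

# Target: one class per distinct latin plant name
label_encoder = LabelEncoder()
y = label_encoder.fit_transform(df['latin'])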
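
To make the encoding step concrete, the sketch below shows what `cat.codes` produces on a small made-up climate column and how the label-to-code dictionary can be inverted. The toy values and the `decoder` name are assumptions for illustration only.

```python
import pandas as pd

# Toy climate column, made up for illustration
climate = pd.Series(["Tropical", "Subtropical", "Tropical humid", "Tropical"])

# Categories are sorted, so codes follow the alphabetical order of the labels
codes = climate.astype("category").cat.codes
encoder = dict(zip(climate, codes))
print(encoder)

# Inverse mapping: code -> label
decoder = {int(code): label for label, code in encoder.items()}
print(decoder[1])  # Tropical
```

Because the codes are plain integers, the network sees them as an ordered quantity, so "Subtropical" (0) looks numerically closer to "Tropical" (1) than to "Tropical humid" (2) regardless of how similar those climates actually are.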

## **5️⃣ Train Neural Network Classifier (TensorFlow)**

In [None]:
# Split data
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Build model
model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(2,)),
    tf.keras.layers.Dense(16, activation='relu'),
    tf.keras.layers.Dense(32, activation='relu'),
    tf.keras.layers.Dense(len(np.unique(y)), activation='softmax')
])

model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])

# Train model
model.fit(X_train, y_train, epochs=50, verbose=0)

<keras.src.callbacks.history.History at 0x7824626750d0>
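
The split above sets aside `X_test` and `y_test`, but the cell only fits on the training portion. One way to use the held-out rows is top-k accuracy, since the recommender returns several plants per query. The sketch below is a plain NumPy helper (`top_k_accuracy` is a hypothetical name) run on made-up probabilities; in the notebook the inputs would presumably be `model.predict(X_test, verbose=0)` and `y_test`.

```python
import numpy as np

def top_k_accuracy(probs: np.ndarray, y_true: np.ndarray, k: int = 3) -> float:
    """Fraction of rows whose true class is among the k highest-probability classes."""
    # Indices of the k largest probabilities per row (order within the k is irrelevant)
    top_k = np.argpartition(probs, -k, axis=1)[:, -k:]
    hits = (top_k == y_true[:, None]).any(axis=1)
    return float(hits.mean())

# Toy probabilities for 3 samples over 4 classes, made up for illustration
probs = np.array([
    [0.10, 0.60, 0.20, 0.10],
    [0.40, 0.30, 0.20, 0.10],
    [0.05, 0.05, 0.10, 0.80],
])
y_true = np.array([2, 3, 3])
print(top_k_accuracy(probs, y_true, k=2))  # 2 of the 3 rows contain the true class
```

If most latin names occur only once in the dataset, many held-out classes will never have been seen during training, so any such score should be read with care.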

## **6️⃣ Location-Based Prediction Function (Content-Based)**

In [None]:
recommendation_history_location = set()

def recommend_plants_by_location(climate_input, temp_input, top_n=3, reset=False):
    global recommendation_history_location

    if reset:
        recommendation_history_location = set()

    # Encode the climate (case-insensitive, literal substring match)
    matching_climates = df[df['climate'].str.contains(climate_input, case=False, na=False, regex=False)]
    if matching_climates.empty:
        return ["Climate not recognized"]

    climate_code = matching_climates['climate_cat'].mode().iloc[0]
    user_input = np.array([[climate_code, temp_input]])

    preds = model.predict(user_input, verbose=0)[0]
    prob_df = pd.DataFrame({
        'latin': label_encoder.inverse_transform(np.arange(len(preds))),
        'prob': preds
    })
    prob_df = prob_df[~prob_df['latin'].isin(recommendation_history_location)]

    top_latin = prob_df.sort_values(by='prob', ascending=False)['latin'].head(top_n).tolist()
    recommendation_history_location.update(top_latin)

    return df[df['latin'].isin(top_latin)][['latin', 'climate', 'avg_temp']].drop_duplicates('latin')
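
The function keeps a module-level history so that repeated calls return plants that have not been recommended yet, until `reset=True` clears it. That filtering step can be isolated as in the sketch below; `pick_top_unseen` and the toy names and probabilities are assumptions for illustration, not part of the notebook.

```python
import numpy as np
import pandas as pd

def pick_top_unseen(names, probs, seen, top_n=3):
    """Return the top_n most probable names not yet in `seen`, and record them."""
    table = pd.DataFrame({"latin": names, "prob": probs})
    # Drop anything recommended in an earlier call
    table = table[~table["latin"].isin(seen)]
    picked = table.sort_values(by="prob", ascending=False)["latin"].head(top_n).tolist()
    seen.update(picked)
    return picked

# Toy class names and probabilities, made up for illustration
names = ["A", "B", "C", "D", "E"]
probs = np.array([0.05, 0.40, 0.10, 0.30, 0.15])
seen = set()

first = pick_top_unseen(names, probs, seen, top_n=2)
second = pick_top_unseen(names, probs, seen, top_n=2)
print(first)   # ['B', 'D']
print(second)  # ['E', 'C']
```

Passing the history in as an argument, rather than using a global, makes the behavior easier to test and avoids sharing state between users if the function is later served from an API.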

## **7️⃣ Example Usage**

In [None]:
recommend_plants_by_location(climate_input="Tropical", temp_input=25, reset=False)

| | latin | climate | avg_temp |
|---|---|---|---|
| 9 | Aglaonema | Tropical | 23.0 |
| 69 | Dracaena deremensis | Tropical | 20.0 |
| 130 | Philodendron | Tropical | 23.0 |


## **8️⃣ Save the TensorFlow Model to an .h5 File**

In [None]:
# model.save("RekomendasibyLokasi_model.h5")
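
A saved model file holds the network but not the `climate_encoder` dictionary or the `label_encoder` classes, both of which the prediction function relies on. One possible way to keep them alongside the model is a small JSON file, sketched below with standard-library calls only. The helper names, the `encoders.json` file name, and the toy mappings are assumptions for illustration.

```python
import json
import os
import tempfile

# Toy stand-ins for `climate_encoder` and `label_encoder.classes_`, made up for illustration
climate_encoder_toy = {"Subtropical": 0, "Tropical": 1, "Tropical humid": 2}
classes_toy = ["Adiantum raddianum", "Aechmea ramosa", "Aglaonema"]

def save_encoders(path, climate_map, class_names):
    """Write both mappings to one JSON file (hypothetical helper)."""
    payload = {
        # Cast defensively so NumPy scalar types do not reach the JSON encoder
        "climate_encoder": {label: int(code) for label, code in climate_map.items()},
        "classes": [str(name) for name in class_names],
    }
    with open(path, "w", encoding="utf-8") as fh:
        json.dump(payload, fh, ensure_ascii=False, indent=2)

def load_encoders(path):
    """Read the mappings back as (climate_map, class_names)."""
    with open(path, encoding="utf-8") as fh:
        payload = json.load(fh)
    return payload["climate_encoder"], payload["classes"]

encoders_path = os.path.join(tempfile.mkdtemp(), "encoders.json")
save_encoders(encoders_path, climate_encoder_toy, classes_toy)
climate_map, class_names = load_encoders(encoders_path)
print(climate_map["Tropical"], class_names[2])  # 1 Aglaonema
```

The position of each name in `classes` matches the model's output index, so the list order must be preserved when reloading.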

