# Iris Flower Classification (Multinomial Naive Bayes Model)



## Modules and Packages

In [None]:
# make sure the scikit-learn version is 1.2.2
# if it does not match, install the matching scikit-learn version
!pip show scikit-learn

Name: scikit-learn
Version: 1.2.2
Summary: A set of python modules for machine learning and data mining
Home-page: http://scikit-learn.org
Author: 
Author-email: 
License: new BSD
Location: /usr/local/lib/python3.10/dist-packages
Requires: joblib, numpy, scipy, threadpoolctl
Required-by: fastai, imbalanced-learn, librosa, lightgbm, mlxtend, qudida, sklearn-pandas, yellowbrick
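
The same version check can also be done from Python instead of reading the `pip show` output by eye. The following is a minimal sketch using only the standard library; the helper names `installed_version` and `check_pin` are hypothetical and not part of the original notebook.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional

REQUIRED_VERSION = "1.2.2"  # version named in the notebook comment


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of `package`, or None if it is missing."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


def check_pin(package: str, required: str) -> str:
    """Build a short status message comparing installed and required versions."""
    found = installed_version(package)
    if found is None:
        return f"{package} is not installed; run: pip install {package}=={required}"
    if found != required:
        return f"{package} {found} found; run: pip install {package}=={required}"
    return f"{package} {found} matches the required version"


print(check_pin("scikit-learn", REQUIRED_VERSION))
```
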


In [None]:
import pandas as pd
import numpy as np
from sklearn.naive_bayes import MultinomialNB
from sklearn.model_selection import train_test_split
import joblib

## Import Data

In [None]:
!wget https://raw.githubusercontent.com/hilya09/deployment-ds/master/train-model/iris.csv

--2023-04-29 08:11:07--  https://raw.githubusercontent.com/hilya09/deployment-ds/master/train-model/iris.csv
Resolving raw.githubusercontent.com (raw.githubusercontent.com)... 185.199.108.133, 185.199.109.133, 185.199.110.133, ...
Connecting to raw.githubusercontent.com (raw.githubusercontent.com)|185.199.108.133|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 5107 (5.0K) [text/plain]
Saving to: ‘iris.csv’


2023-04-29 08:11:07 (60.5 MB/s) - ‘iris.csv’ saved [5107/5107]



In [None]:
df = pd.read_csv("iris.csv")
df.drop('Id', axis=1, inplace=True)

## Exploratory Data Analysis (EDA)

In [None]:
df.head()

,SepalLengthCm,SepalWidthCm,PetalLengthCm,PetalWidthCm,Species
0,5.1,3.5,1.4,0.2,Iris-setosa
1,4.9,3.0,1.4,0.2,Iris-setosa
2,4.7,3.2,1.3,0.2,Iris-setosa
3,4.6,3.1,1.5,0.2,Iris-setosa
4,5.0,3.6,1.4,0.2,Iris-setosa


In [None]:
df['Species'].value_counts()

Iris-setosa        50
Iris-versicolor    50
Iris-virginica     50
Name: Species, dtype: int64

In [None]:
df.describe()

,SepalLengthCm,SepalWidthCm,PetalLengthCm,PetalWidthCm
count,150.0,150.0,150.0,150.0
mean,5.843333,3.054,3.758667,1.198667
std,0.828066,0.433594,1.76442,0.763161
min,4.3,2.0,1.0,0.1
25%,5.1,2.8,1.6,0.3
50%,5.8,3.0,4.35,1.3
75%,6.4,3.3,5.1,1.8
max,7.9,4.4,6.9,2.5
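
`MultinomialNB` in scikit-learn rejects negative feature values, so it is worth confirming that the features are non-negative and complete before training. The summary above already shows positive minima for all four columns; the sketch below performs the same check in code. It uses only the five rows printed by `df.head()` so that it runs on its own; in the notebook the check would be applied to `df` itself.

```python
import pandas as pd

# The five rows shown by df.head(); stand-in for the full DataFrame.
sample = pd.DataFrame({
    "SepalLengthCm": [5.1, 4.9, 4.7, 4.6, 5.0],
    "SepalWidthCm": [3.5, 3.0, 3.2, 3.1, 3.6],
    "PetalLengthCm": [1.4, 1.4, 1.3, 1.5, 1.4],
    "PetalWidthCm": [0.2, 0.2, 0.2, 0.2, 0.2],
    "Species": ["Iris-setosa"] * 5,
})

features = sample.drop(columns="Species")

# True if any cell is NaN / if any cell is below zero.
has_missing = bool(features.isna().any().any())
has_negative = bool((features < 0).any().any())

print(f"missing values: {has_missing}, negative values: {has_negative}")
```
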


## Training Model

In [None]:
X = df.iloc[:, :-1]
y = df.iloc[:, -1]

In [None]:
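# Default split: 75% train, 25% test (112 and 38 rows for 150 samples).
# No random_state is set, so the split and the resulting accuracies vary between runs.
X_train, X_test, y_train, y_test = train_test_split(X, y)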
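
For a repeatable split, `train_test_split` accepts `random_state`, and `stratify` keeps the class proportions similar in both parts. The sketch below is one possible variant, not the split used in this notebook: the seed `42` is an arbitrary choice, and the bundled `load_iris` data stands in for `iris.csv` so the example runs on its own.

```python
from collections import Counter

from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split

# Bundled copy of the Iris data, used so the sketch does not need iris.csv.
X, y = load_iris(return_X_y=True)

# random_state=42 is an arbitrary seed chosen for illustration.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=42
)

print(X_train.shape, X_test.shape)
# Per-class counts in the test part; stratify keeps them nearly equal.
test_counts = Counter(y_test.tolist())
print(test_counts)
```
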

In [None]:
# Fit a Multinomial Naive Bayes classifier on the training split
model = MultinomialNB()
model.fit(X_train, y_train)
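
`MultinomialNB` models features as counts, while the Iris measurements are continuous lengths in centimetres. `GaussianNB` is a common alternative for continuous features. The sketch below compares the two with cross-validation so the choice can be checked rather than assumed; it is a side experiment, not part of the original notebook, and `cv=5` is an illustrative setting.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import cross_val_score
from sklearn.naive_bayes import GaussianNB, MultinomialNB

# Bundled copy of the Iris data, used so the sketch does not need iris.csv.
X, y = load_iris(return_X_y=True)

candidates = [("MultinomialNB", MultinomialNB()), ("GaussianNB", GaussianNB())]

scores = {}
for name, estimator in candidates:
    # Mean accuracy over 5 cross-validation folds.
    fold_scores = cross_val_score(estimator, X, y, cv=5)
    scores[name] = float(fold_scores.mean())
    print(f"{name}: {scores[name]:.3f}")
```
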

## Model Evaluation

In [None]:
accuracy_train = model.score(X_train, y_train)
accuracy_test  = model.score(X_test, y_test)

In [None]:
print(f"Model Accuracy (Train) : {np.round(accuracy_train * 100, 2)} %")
print(f"Model Accuracy (Test)  : {np.round(accuracy_test * 100, 2)} %")

Model Accuracy (Train) : 76.79 %
Model Accuracy (Test)  : 65.79 %
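
A single accuracy figure does not show which species are being confused with each other. A confusion matrix and a per-class report give that detail. The sketch below is self-contained: it retrains on the bundled `load_iris` data with an arbitrary seed (`random_state=0`), so its numbers will not match the accuracies printed above.

```python
from sklearn.datasets import load_iris
from sklearn.metrics import classification_report, confusion_matrix
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import MultinomialNB

# Bundled copy of the Iris data; integer labels 0, 1, 2 stand for the three species.
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

model = MultinomialNB().fit(X_train, y_train)
y_pred = model.predict(X_test)

# Rows are true classes, columns are predicted classes.
cm = confusion_matrix(y_test, y_pred, labels=[0, 1, 2])
print(cm)

# Precision, recall and F1 per class; zero_division=0 avoids warnings
# if a class is never predicted.
print(classification_report(y_test, y_pred, zero_division=0))
```
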


## Saving the Model

In [None]:
joblib.dump(model, "model_iris_mnb.model")

['model_iris_mnb.model']
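
A saved model is read back with `joblib.load`. The sketch below checks that a restored model gives the same predictions as the original. It trains a fresh model on the bundled `load_iris` data and writes to a temporary directory, both of which are assumptions made to keep the example self-contained. Because joblib files are pickle-based, they should be loaded with the same scikit-learn version that wrote them and only from trusted sources.

```python
import os
import tempfile

import joblib
import numpy as np
from sklearn.datasets import load_iris
from sklearn.naive_bayes import MultinomialNB

# Stand-in model trained on the bundled Iris data.
X, y = load_iris(return_X_y=True)
model = MultinomialNB().fit(X, y)

# Write to a temporary directory so the sketch leaves no files behind.
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "model_iris_mnb.model")
    joblib.dump(model, path)
    restored = joblib.load(path)

# The restored model should reproduce the original predictions exactly.
same_predictions = bool(np.array_equal(model.predict(X), restored.predict(X)))
print(same_predictions)
```
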

## Prediction

In [None]:
df_test = pd.DataFrame(data={
    "SepalLengthCm" : [5.1],
    "SepalWidthCm"  : [3.5],
    "PetalLengthCm" : [1.4],
    "PetalWidthCm"  : [0.2]
})

df_test[0:1]

,SepalLengthCm,SepalWidthCm,PetalLengthCm,PetalWidthCm
0,5.1,3.5,1.4,0.2


In [None]:
pred_test = model.predict(df_test[0:1])
pred_test[0]

'Iris-setosa'
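
Besides the predicted label, `predict_proba` returns the model's estimated probability for each class, in the order given by `model.classes_`. The sketch below shows this for the same sample. It rebuilds a comparable model from the bundled `load_iris` data, with column names and `Iris-` label prefixes recreated to mirror `iris.csv`, so the probabilities may differ from those of the model trained above.

```python
import pandas as pd
from sklearn.datasets import load_iris
from sklearn.naive_bayes import MultinomialNB

iris = load_iris()
columns = ["SepalLengthCm", "SepalWidthCm", "PetalLengthCm", "PetalWidthCm"]
X = pd.DataFrame(iris.data, columns=columns)
# Rebuild string labels in the same "Iris-<name>" style as iris.csv.
y = pd.Series(["Iris-" + iris.target_names[i] for i in iris.target])

model = MultinomialNB().fit(X, y)

# Same measurements as the sample predicted above.
sample = pd.DataFrame([[5.1, 3.5, 1.4, 0.2]], columns=columns)
probabilities = model.predict_proba(sample)[0]

# One probability per class, aligned with model.classes_.
for label, p in zip(model.classes_, probabilities):
    print(f"{label}: {p:.3f}")
```
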