# 03 Machine Learning Workflow with scikit-learn

## Dataset Preparation

#### Load Sample Dataset: Iris Dataset

In [1]:
from sklearn.datasets import load_iris

iris = load_iris()

X = iris.data
y = iris.target
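Before splitting, it can help to check what `load_iris()` returned. The snippet below is a minimal sketch that repeats the load step so it runs on its own; it only inspects the shapes and the label names.

```python
from sklearn.datasets import load_iris

iris = load_iris()
X, y = iris.data, iris.target

# Feature matrix: one row per flower, one column per measurement
print(X.shape)  # (150, 4)
# Target vector: one integer class label per flower
print(y.shape)  # (150,)

# Human-readable names for the columns of X and the values in y
print(iris.feature_names)
print(iris.target_names)
```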

#### Splitting Dataset: Training & Testing Set

In [2]:
from sklearn.model_selection import train_test_split

X_train, X_test, y_train, y_test = train_test_split(X,
                                                    y,
                                                    test_size=0.4,
                                                    random_state=1)
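As a quick check, with `test_size=0.4` the 150 samples are divided into 90 training rows and 60 testing rows, and fixing `random_state` makes the split reproducible. The sketch below repeats the split twice to illustrate this; the `_again` variable names are illustrative only.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split

X, y = load_iris(return_X_y=True)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.4, random_state=1)
print(X_train.shape, X_test.shape)  # (90, 4) (60, 4)

# Same random_state, same split: the second call yields identical subsets
X_train_again, X_test_again, _, _ = train_test_split(
    X, y, test_size=0.4, random_state=1)
print(np.array_equal(X_test, X_test_again))  # True
```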

## Training Model

- In scikit-learn, a machine learning model is built from a class known as an **estimator**.
- Every estimator implements `fit()`; estimators that make predictions, such as classifiers and regressors, also implement `predict()`.
- The `fit()` method trains the model.
- The `predict()` method produces estimates/predictions using the trained model.
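Because every predictive estimator exposes the same two methods, different models can be trained and queried with identical code. The sketch below illustrates this with two classifiers; `DecisionTreeClassifier` is not used elsewhere in this notebook and appears here only as a second example.

```python
from sklearn.datasets import load_iris
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)

estimators = [KNeighborsClassifier(n_neighbors=3),
              DecisionTreeClassifier(random_state=0)]

for estimator in estimators:
    estimator.fit(X, y)                     # training step
    sample_preds = estimator.predict(X[:5])  # prediction step
    print(type(estimator).__name__, sample_preds)
```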

In [3]:
from sklearn.neighbors import KNeighborsClassifier

model = KNeighborsClassifier(n_neighbors=3)
model.fit(X_train, y_train)

KNeighborsClassifier(n_neighbors=3)

## Model Evaluation

In [4]:
from sklearn.metrics import accuracy_score

y_pred = model.predict(X_test)
acc = accuracy_score(y_test, y_pred)
print(f'Accuracy: {acc}')

Accuracy: 0.9833333333333333
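Accuracy is the fraction of test samples whose predicted label matches the true label, so the value above corresponds to 59 correct predictions out of 60. The sketch below rebuilds the same pipeline so it runs on its own and computes the figure three ways: by hand, with `accuracy_score`, and with the classifier's `score()` method.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.4, random_state=1)

model = KNeighborsClassifier(n_neighbors=3).fit(X_train, y_train)
y_pred = model.predict(X_test)

# Fraction of matching labels, computed by hand
manual_acc = np.mean(y_pred == y_test)
# Same quantity via the metrics helper
acc = accuracy_score(y_test, y_pred)
# For classifiers, score() reports mean accuracy on the given data
score_acc = model.score(X_test, y_test)

print(manual_acc, acc, score_acc)
```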


## Using the Trained Model

In [5]:
data_baru = [[5, 5, 3, 2],
             [2, 4, 3, 5]]

preds = model.predict(data_baru)
preds

array([1, 2])

In [6]:
pred_species = [iris.target_names[p] for p in preds]
print(f'Predicted species: {pred_species}')

Predicted species: ['versicolor', 'virginica']
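Besides the predicted label, `KNeighborsClassifier` can report how the neighbors voted through `predict_proba()`. The following is a sketch that retrains the same model so it runs on its own. With `n_neighbors=3` and the default uniform weights, each probability is a multiple of 1/3.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

iris = load_iris()
X_train, X_test, y_train, y_test = train_test_split(
    iris.data, iris.target, test_size=0.4, random_state=1)
model = KNeighborsClassifier(n_neighbors=3).fit(X_train, y_train)

data_baru = [[5, 5, 3, 2],
             [2, 4, 3, 5]]

# One row per sample, one column per class (same order as iris.target_names)
proba = model.predict_proba(data_baru)

for sample, class_proba in zip(data_baru, proba):
    votes = {str(name): round(float(p), 2)
             for name, p in zip(iris.target_names, class_proba)}
    print(sample, votes)
```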


## Dump & Load Trained Model

#### Dumping the Machine Learning Model to a `joblib` File

In [7]:
import joblib

joblib.dump(model, 'iris_classifier_knn.joblib')

['iris_classifier_knn.joblib']

#### Loading the Machine Learning Model from a `joblib` File

In [8]:
production_model = joblib.load('iris_classifier_knn.joblib')
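A simple way to confirm that the round trip worked is to check that the loaded model predicts exactly what the original did. The sketch below is self-contained and writes to a temporary directory (an assumption made here to avoid leaving files behind); in practice the file would be saved wherever the deployment expects it.

```python
import os
import tempfile

import joblib
import numpy as np
from sklearn.datasets import load_iris
from sklearn.neighbors import KNeighborsClassifier

X, y = load_iris(return_X_y=True)
model = KNeighborsClassifier(n_neighbors=3).fit(X, y)

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, 'iris_classifier_knn.joblib')
    joblib.dump(model, path)              # serialize the trained model
    production_model = joblib.load(path)  # restore it as a new object

# The restored model should reproduce the original predictions
same = np.array_equal(model.predict(X), production_model.predict(X))
print(same)  # True
```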