
In [17]:
from sklearn import datasets
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import Perceptron
import numpy as np

## Data Handling

### Loading the data

In [10]:
iris_X, iris_y = datasets.load_iris(
    return_X_y=True # Return the (X, y) arrays instead of a Bunch object
)
print("Iris data size : ", iris_X.shape)
print("Iris data unique labels : ", np.unique(iris_y))

Iris data size :  (150, 4)
Iris data unique labels :  [0 1 2]


### Splitting the data into train and test

In [11]:
X_train, X_test, y_train, y_test = train_test_split(
    iris_X,
    iris_y,
    test_size=0.3, # Hold out 30% of the samples for testing
    random_state=1, # Fix the seed so the split is reproducible
    stratify=iris_y # Preserve the class proportions in both splits
)

print("Train data size : ", X_train.shape)
print("Train data label number for each label : ", np.bincount(y_train))

print("Test data size : ", X_test.shape)
print("Test data label number for each label : ", np.bincount(y_test))

Train data size :  (105, 4)
Train data label number for each label :  [35 35 35]
Test data size :  (45, 4)
Test data label number for each label :  [15 15 15]
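
The effect of `stratify` can be checked by repeating the split without it. The following is a minimal sketch, not part of the original notebook; the variable names (`y_test_strat`, `y_test_plain`) are illustrative. With 50 samples per class and `test_size=0.3`, the stratified split should give 15 test samples per class, while the unstratified counts depend on the random draw.

```python
import numpy as np
from sklearn import datasets
from sklearn.model_selection import train_test_split

X, y = datasets.load_iris(return_X_y=True)

# Same settings as the split above, once with and once without stratification
_, _, _, y_test_strat = train_test_split(
    X, y, test_size=0.3, random_state=1, stratify=y
)
_, _, _, y_test_plain = train_test_split(
    X, y, test_size=0.3, random_state=1
)

# minlength=3 keeps one count per class even if a class were missing
strat_counts = np.bincount(y_test_strat, minlength=3)
plain_counts = np.bincount(y_test_plain, minlength=3)

print("Stratified test counts   : ", strat_counts)
print("Unstratified test counts : ", plain_counts)
```

Both splits contain 45 test samples; only the stratified one is guaranteed to be balanced across the three classes.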


### Scaling the data

In [16]:
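standard_scaler = StandardScaler()
# Fit on the training split only, so test statistics do not leak into the scaler
standard_scaler.fit(X_train)

# .mean() without an axis averages over every sample and feature at once
print("Mean of the train data BEFORE scaling : ", X_train.mean())

# Apply the training-set mean and standard deviation to both splits
X_train_scaled = standard_scaler.transform(X_train)
X_test_scaled = standard_scaler.transform(X_test)

print("Mean of the train data AFTER scaling : ", X_train_scaled.mean())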

Mean of the train data BEFORE scaling :  3.4783333333333335
Mean of the train data AFTER scaling :  4.1871268357291617e-16
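
The means printed above are taken over all features at once, whereas `StandardScaler` standardizes each feature separately. As a hedged sketch (not part of the original notebook), the per-feature statistics can be inspected with `axis=0`, and the result compared against the manual formula `(x - mean) / std` using the training statistics. The name `manual` is illustrative.

```python
import numpy as np
from sklearn import datasets
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

X, y = datasets.load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=1, stratify=y
)

scaler = StandardScaler().fit(X_train)
X_train_scaled = scaler.transform(X_train)
X_test_scaled = scaler.transform(X_test)

# Manual standardization with the training statistics
# (NumPy's default std uses ddof=0, the population standard deviation)
manual = (X_train - X_train.mean(axis=0)) / X_train.std(axis=0)

print("Per-feature train mean after scaling : ", X_train_scaled.mean(axis=0).round(6))
print("Per-feature train std after scaling  : ", X_train_scaled.std(axis=0).round(6))
print("Per-feature test mean after scaling  : ", X_test_scaled.mean(axis=0).round(6))
print("Matches manual formula : ", np.allclose(manual, X_train_scaled))
```

The scaled test features are generally not exactly zero-mean, because they are transformed with the statistics of the training split rather than their own.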


## Model Generation

### Perceptron model

In [18]:
perceptron_model = Perceptron(
    eta0=0.1, # Learning rate of the model
    random_state=1 # Seed for shuffling the training data between epochs
)
perceptron_model.fit(
    X_train_scaled,
    y_train
)
perceptron_prediction = perceptron_model.predict(X_test_scaled)
# Accuracy: fraction of test samples whose predicted label matches the true label
perceptron_accuracy = (perceptron_prediction==y_test).sum()/y_test.shape[0]
print("Perceptron model accuracy is : ", perceptron_accuracy)

Perceptron model accuracy is :  0.9555555555555556
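
The manual accuracy computation above should agree with scikit-learn's built-in helpers. The sketch below is an addition, not part of the original notebook: it rebuilds the same pipeline and compares the manual value with `accuracy_score` and the estimator's `score` method, then prints a confusion matrix to show which classes are confused. The exact accuracy may differ slightly between scikit-learn versions, so no value is asserted here.

```python
from sklearn import datasets
from sklearn.linear_model import Perceptron
from sklearn.metrics import accuracy_score, confusion_matrix
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

X, y = datasets.load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=1, stratify=y
)

scaler = StandardScaler().fit(X_train)
X_train_scaled = scaler.transform(X_train)
X_test_scaled = scaler.transform(X_test)

model = Perceptron(eta0=0.1, random_state=1).fit(X_train_scaled, y_train)
pred = model.predict(X_test_scaled)

# Three equivalent ways of computing the test accuracy
manual_accuracy = (pred == y_test).sum() / y_test.shape[0]
metric_accuracy = accuracy_score(y_test, pred)
score_accuracy = model.score(X_test_scaled, y_test)

# Rows are true classes, columns are predicted classes
cm = confusion_matrix(y_test, pred)

print("Manual accuracy         : ", manual_accuracy)
print("accuracy_score          : ", metric_accuracy)
print("Perceptron.score        : ", score_accuracy)
print("Confusion matrix :\n", cm)
```

The diagonal of the confusion matrix counts correct predictions per class, so its sum divided by the number of test samples gives the same accuracy.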
