
In [4]:
import tensorflow as tf
import numpy as np

from sklearn.datasets import load_iris # practice dataset

__Activation function: Hyperbolic tangent function__

$\Large{ \hat{y} = \tanh(w^T x + b) }$

- The tanh function outputs values in the range -1 to 1. Because this range is zero-centered, it can sometimes make learning faster and more stable. It also matches the class labels used below (-1 and 1).
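
As a quick check of the output range described above, here is a minimal NumPy sketch. The input values in `z` are arbitrary examples, not taken from the iris data.

```python
import numpy as np

# Arbitrary pre-activation values (illustrative only).
z = np.array([-5.0, -1.0, 0.0, 1.0, 5.0])
y_hat = np.tanh(z)

print(y_hat)           # every value lies strictly between -1 and 1
print(np.sign(y_hat))  # the sign can be read as the predicted class (0 stays 0)
```

Large positive or negative inputs saturate close to 1 or -1, while inputs near zero stay near zero.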

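__Cost function (hinge-style loss)__

$
\large{
Loss = \sum_{i=1}^N \max(0, -y_i \hat{y}_i)
}
$

- This is the hinge loss with the margin set to zero, also known as the perceptron criterion: a sample contributes to the loss only when the sign of $\hat{y}_i$ disagrees with its label $y_i$. The standard hinge loss, $\max(0, 1 - y_i \hat{y}_i)$, is commonly used for "maximum-margin" binary classification tasks, such as Support Vector Machines (SVMs).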
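
To make the loss above concrete, the following NumPy sketch evaluates it on a few made-up label/prediction pairs and compares it with the standard hinge loss $\max(0, 1 - y \hat{y})$. The function names `zero_margin_loss` and `hinge_loss` and the sample values are illustrative assumptions, not part of the notebook.

```python
import numpy as np

def zero_margin_loss(y, y_hat):
    # Loss used in this notebook: max(0, -y * y_hat), per sample.
    return np.maximum(0.0, -y * y_hat)

def hinge_loss(y, y_hat):
    # Standard hinge loss with margin 1, shown for comparison only.
    return np.maximum(0.0, 1.0 - y * y_hat)

# Made-up labels (+1 / -1) and tanh outputs.
y = np.array([1.0, 1.0, -1.0])
y_hat = np.array([0.8, -0.3, 0.5])

print(zero_margin_loss(y, y_hat))  # nonzero only where the signs disagree
print(hinge_loss(y, y_hat))        # also penalizes correct but low-margin outputs
```

The first sample is classified correctly, so the zero-margin form assigns it no loss, whereas the standard hinge loss still penalizes it for falling inside the margin.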

In [5]:
iris = load_iris()
print(iris.DESCR)

.. _iris_dataset:

Iris plants dataset
--------------------

**Data Set Characteristics:**

    :Number of Instances: 150 (50 in each of three classes)
    :Number of Attributes: 4 numeric, predictive attributes and the class
    :Attribute Information:
        - sepal length in cm
        - sepal width in cm
        - petal length in cm
        - petal width in cm
        - class:
                - Iris-Setosa
                - Iris-Versicolour
                - Iris-Virginica
                
    :Summary Statistics:

                    Min  Max   Mean    SD   Class Correlation
    sepal length:   4.3  7.9   5.84   0.83    0.7826
    sepal width:    2.0  4.4   3.05   0.43   -0.4194
    petal length:   1.0  6.9   3.76   1.76    0.9490  (high!)
    petal width:    0.1  2.5   1.20   0.76    0.9565  (high!)

    :Missing Attribute Values: None
    :Class Distribution: 33.3% for each of 3 classes.
    :Creator: R.A. Fisher
    :Donor: Michael Marshall (MARSHALL%PLU@io.arc.nasa.gov)
    :

In [6]:
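# use only two classes (0: setosa, 2: virginica) and the first two features (sepal length, sepal width)
idx = np.isin(iris.target, [0, 2])  # np.isin replaces np.in1d, which is deprecated in NumPy 2.0
X_data = iris.data[idx, 0:2]
# shift labels from {0, 2} to {-1, 1} to match the tanh output range; add an axis to get shape (100, 1)
y_data = (iris.target[idx] - 1.0)[:, np.newaxis]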

In [7]:
X_data.shape, y_data.shape

((100, 2), (100, 1))

In [8]:
num_iter = 500  # number of passes (epochs) over the data
lr = 0.0003     # learning rate

In [24]:
# initialize weight and bias
w = tf.Variable(tf.random.normal([2,1], dtype=tf.float64))
b = tf.Variable(tf.random.normal([1,1], dtype=tf.float64))

In [25]:
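zero = tf.constant(0, dtype=tf.float64)  # float64 zero so dtypes match inside tf.maximum

for epoch in range(num_iter):
  # stochastic gradient descent: update on one sample at a time
  for i in range(X_data.shape[0]):
    x = X_data[i:i+1]  # shape (1, 2)
    y = y_data[i:i+1]  # shape (1, 1)

    with tf.GradientTape() as tape:
      logit = tf.matmul(x, w) + b
      y_hat = tf.tanh(logit)
      # loss = max(0, -y * y_hat): nonzero only when the prediction has the wrong sign
      loss = tf.maximum(zero, tf.multiply(-y, y_hat))

    # gradients of the loss with respect to w and b, followed by a gradient descent step
    grad = tape.gradient(loss, [w, b])
    w.assign_sub(lr * grad[0])
    b.assign_sub(lr * grad[1])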
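
For reference, the per-sample gradient that `tf.GradientTape` computes in the loop above can also be worked out by hand with the chain rule. The NumPy sketch below is an illustration under stated assumptions, not the notebook's implementation: the helper `loss_and_grads`, the sample `x`, and the parameter values are hypothetical, chosen so that the sample is misclassified.

```python
import numpy as np

def loss_and_grads(x, y, w, b):
    """Return max(0, -y * tanh(w.x + b)) and its gradients w.r.t. w and b."""
    y_hat = np.tanh(x @ w + b)
    loss = max(0.0, -y * y_hat)
    # The loss is nonzero only when y and y_hat have opposite signs.
    # There, d(loss)/dz = -y * (1 - tanh(z)^2); otherwise the gradient is 0.
    dz = -y * (1.0 - y_hat ** 2) if -y * y_hat > 0 else 0.0
    return loss, dz * x, dz

# Hypothetical sample and parameters (not taken from the trained model).
x = np.array([5.1, 3.5])
y = 1.0
w = np.array([-0.2, 0.1])
b = 0.0
lr = 0.0003

loss, grad_w, grad_b = loss_and_grads(x, y, w, b)
w_new = w - lr * grad_w  # same update rule as w.assign_sub(lr * grad[0])
b_new = b - lr * grad_b
new_loss, _, _ = loss_and_grads(x, y, w_new, b_new)
print(loss, new_loss)    # the loss on this sample decreases slightly after one step
```

Correctly classified samples produce a zero gradient, so only misclassified samples move the weights.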

In [26]:
y_pred = tf.tanh(tf.matmul(X_data, w) + b)

In [27]:
X_data.shape, w.shape, b.shape

((100, 2), TensorShape([2, 1]), TensorShape([1, 1]))

In [32]:
print('Predicted: ', y_pred[0], 'Answer: ', y_data[0])
print('Predicted: ', y_pred[13], 'Answer: ', y_data[13])
print('Predicted: ', y_pred[50], 'Answer: ', y_data[50])

Predicted:  tf.Tensor([-0.12466277], shape=(1,), dtype=float64) Answer:  [-1.]
Predicted:  tf.Tensor([-0.2721102], shape=(1,), dtype=float64) Answer:  [-1.]
Predicted:  tf.Tensor([0.17045922], shape=(1,), dtype=float64) Answer:  [1.]


In [31]:
# threshold the tanh output at 0 to get a class label (-1 or 1) comparable to the answer
print('Predicted: ', -1 if y_pred[0] < 0 else 1, 'Answer: ', y_data[0])
print('Predicted: ', -1 if y_pred[13] < 0 else 1, 'Answer: ', y_data[13])
print('Predicted: ', -1 if y_pred[50] < 0 else 1, 'Answer: ', y_data[50])

Predicted:  -1 Answer:  [-1.]
Predicted:  -1 Answer:  [-1.]
Predicted:  1 Answer:  [1.]
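
The same thresholding can be applied to every prediction at once to get an overall accuracy, rather than inspecting a few rows. The sketch below uses small made-up arrays so that it runs on its own; the values in `y_pred_demo` and `y_true_demo` are hypothetical and do not come from the trained model. With the notebook's variables, `y_pred.numpy()` and `y_data` could presumably be used in their place.

```python
import numpy as np

# Made-up tanh outputs and true labels, shaped (N, 1) like y_pred and y_data.
y_pred_demo = np.array([[-0.12], [-0.27], [0.17], [-0.05]])
y_true_demo = np.array([[-1.0], [-1.0], [1.0], [1.0]])

# Threshold at 0: negative outputs become -1, everything else becomes 1.
labels = np.where(y_pred_demo < 0, -1.0, 1.0)

# Fraction of samples whose thresholded prediction matches the label.
accuracy = np.mean(labels == y_true_demo)
print(accuracy)  # 0.75 (3 of the 4 made-up samples match)
```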
