<a href="https://colab.research.google.com/github/raj-vijay/dl/blob/master/09_Activation_Functions_in_TensorFlow.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

**Activation function**

Components of a typical hidden layer
- Linear: Matrix multiplication
- Nonlinear: Activation function

An activation function is a mathematical function applied to the weighted input of each neuron in the network. It determines the neuron's output, and therefore whether and how strongly the neuron is activated (“fired”), based on whether that input is relevant for the model's prediction.
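The two components listed above can be sketched for a single neuron in plain Python. This is a minimal illustration rather than TensorFlow's implementation: the function `neuron`, its parameters, and all input and weight values are hypothetical.

```python
def neuron(inputs, weights, bias, activation):
    """Single neuron: linear step followed by a nonlinear activation."""
    # Linear step: weighted sum of the inputs plus a bias.
    z = sum(x * w for x, w in zip(inputs, weights)) + bias
    # Nonlinear step: the activation decides how strongly the neuron fires.
    return activation(z)


def relu(z):
    """Pass positive values through and clamp negative values to zero."""
    return max(0.0, z)


# Hypothetical inputs and weights, chosen only for illustration.
quiet = neuron([1.0, 2.0], [0.5, -1.0], 0.0, relu)   # weighted sum is -1.5
active = neuron([1.0, 2.0], [0.5, 1.0], 0.0, relu)   # weighted sum is 2.5
print(quiet, active)
```

With the first set of weights the weighted sum is negative, so the neuron stays at zero; with the second it is positive and passes through unchanged.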

**Sigmoid activation function**
- Binary classification
- Low-level: tf.keras.activations.sigmoid()
- High-level: sigmoid

![alt text](https://upload.wikimedia.org/wikipedia/commons/8/88/Logistic-curve.svg)
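The logistic curve shown above can be reproduced with a few lines of plain Python. This is a sketch of the formula 1 / (1 + exp(-z)) only; it is not the TensorFlow implementation, and the helper name `sigmoid` is chosen here for illustration.

```python
import math


def sigmoid(z):
    """Logistic sigmoid: squashes any real number into the interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


# Large negative inputs approach 0, large positive inputs approach 1.
for z in (-4.0, 0.0, 4.0):
    print(z, round(sigmoid(z), 4))
```

Because the output lies between 0 and 1, it can be read as the probability of the positive class, which is why sigmoid suits the output layer of a binary classifier.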

**Rectified Linear Unit (ReLU) activation function**
- Hidden layers
- Low-level: tf.keras.activations.relu()
- High-level: relu

![alt text](https://upload.wikimedia.org/wikipedia/commons/0/0e/Rectified_linear_and_Softplus_activation_functions.png)
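As a rough illustration of the rectifier curve above, ReLU can be written as max(0, z) and applied elementwise to a list of pre-activation values. The sample values below are made up for the example.

```python
def relu(z):
    """Rectified linear unit: zero for negative inputs, identity otherwise."""
    return max(0.0, z)


# Hypothetical pre-activation values for five hidden units.
pre_activations = [-2.0, -0.5, 0.0, 1.5, 3.0]
activations = [relu(z) for z in pre_activations]
print(activations)  # → [0.0, 0.0, 0.0, 1.5, 3.0]
```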

**Softmax activation function**
- Output layer (>2 classes)
- Low-level: tf.keras.activations.softmax()
- High-level: softmax
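To make the role of softmax in a multiclass output layer concrete, the sketch below computes it by hand in plain Python. The helper name `softmax` and the input scores are illustrative assumptions; subtracting the maximum score is a common numerical-stability step and does not change the result.

```python
import math


def softmax(scores):
    """Turn a list of raw scores into probabilities that sum to 1."""
    # Subtract the largest score so exp() never sees a large positive number.
    largest = max(scores)
    exps = [math.exp(s - largest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical raw scores for three classes.
probs = softmax([2.0, 1.0, 0.1])
print([round(p, 3) for p in probs])
print(sum(probs))
```

Each output is positive and the outputs sum to 1, so they can be interpreted as class probabilities.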

**Default of Credit Card Clients Dataset**

This research examined customers' default payments in Taiwan and compared the predictive accuracy of the probability of default among six data mining methods. 

From a risk management perspective, an accurate estimate of the probability of default is more valuable than a binary classification of clients as credible or not credible. 

Because the real probability of default is unknown, this study presented the novel Sorting Smoothing Method to estimate the real probability of default. With the real probability of default as the response variable (Y), and the predictive probability of default as the independent variable (X), the simple linear regression result (Y = A + BX) shows that the forecasting model produced by artificial neural network has the highest coefficient of determination; its regression intercept (A) is close to zero, and regression coefficient (B) to one. 

Therefore, among the six data mining techniques, artificial neural network is the only one that can accurately estimate the real probability of default.



Install the Kaggle package to access the credit card default dataset from Kaggle.

In [0]:
!pip install kaggle



Create a .kaggle directory under the home directory (/root on Colab) to hold the Kaggle authentication JSON.

In [0]:
!mkdir ~/.kaggle

Copy the uploaded kaggle.json to ~/.kaggle/kaggle.json.

In [0]:
!cp /content/kaggle.json ~/.kaggle/kaggle.json

chmod 600 (equivalent to chmod a+rwx,u-x,g-rwx,o-rwx) restricts the file so that the (U)ser / owner can read and write but not execute it, while the (G)roup and (O)thers cannot read, write, or execute it.
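For readers who prefer to set and verify the permission from Python, here is a minimal standard-library sketch. It works on a temporary stand-in file (a hypothetical placeholder, not the real credentials) and assumes a POSIX system such as the Colab runtime.

```python
import os
import stat
import tempfile

# Use a throwaway file so the sketch never touches real credentials.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "kaggle.json")
    with open(path, "w") as handle:
        handle.write("{}")

    # 0o600: owner may read and write; group and others get no access.
    os.chmod(path, 0o600)

    # Read the permission bits back to confirm the change.
    mode = stat.S_IMODE(os.stat(path).st_mode)
    print(oct(mode))
```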

In [0]:
!chmod 600 /root/.kaggle/kaggle.json

Download the credit card default dataset from Kaggle.

In [0]:
!kaggle datasets download -d uciml/default-of-credit-card-clients-dataset

Downloading default-of-credit-card-clients-dataset.zip to /content
  0% 0.00/0.98M [00:00<?, ?B/s]
100% 0.98M/0.98M [00:00<00:00, 70.4MB/s]


In [0]:
import pandas as pd
file = '/content/default-of-credit-card-clients-dataset.zip'
data = pd.read_csv(file, compression = 'zip')

In [0]:
data.head()

Unnamed: 0,ID,LIMIT_BAL,SEX,EDUCATION,MARRIAGE,AGE,PAY_0,PAY_2,PAY_3,PAY_4,PAY_5,PAY_6,BILL_AMT1,BILL_AMT2,BILL_AMT3,BILL_AMT4,BILL_AMT5,BILL_AMT6,PAY_AMT1,PAY_AMT2,PAY_AMT3,PAY_AMT4,PAY_AMT5,PAY_AMT6,default.payment.next.month
0,1,20000.0,2,2,1,24,2,2,-1,-1,-2,-2,3913.0,3102.0,689.0,0.0,0.0,0.0,0.0,689.0,0.0,0.0,0.0,0.0,1
1,2,120000.0,2,2,2,26,-1,2,0,0,0,2,2682.0,1725.0,2682.0,3272.0,3455.0,3261.0,0.0,1000.0,1000.0,1000.0,0.0,2000.0,1
2,3,90000.0,2,2,2,34,0,0,0,0,0,0,29239.0,14027.0,13559.0,14331.0,14948.0,15549.0,1518.0,1500.0,1000.0,1000.0,1000.0,5000.0,0
3,4,50000.0,2,2,1,37,0,0,0,0,0,0,46990.0,48233.0,49291.0,28314.0,28959.0,29547.0,2000.0,2019.0,1200.0,1100.0,1069.0,1000.0,0
4,5,50000.0,1,2,1,57,-1,0,-1,0,0,0,8617.0,5670.0,35835.0,20940.0,19146.0,19131.0,2000.0,36681.0,10000.0,9000.0,689.0,679.0,0


The input layer contains 3 features: 
1. Education
2. Marital status, and
3. Age

which are available as borrower_features. 

The network defined below has two hidden layers with 16 and 8 nodes, and an output layer with 4 nodes.

In [0]:
borrower_features = data[['EDUCATION','MARRIAGE','AGE']]

In [0]:
borrower_features.head()

Unnamed: 0,EDUCATION,MARRIAGE,AGE
0,2,1,24
1,2,2,26
2,2,2,34
3,2,1,37
4,2,1,57


For each layer, we take the previous layer as an input, initialize a set of weights, compute the product of the inputs and weights, and then apply an activation function. 
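The per-layer recipe described above can be sketched without TensorFlow. The `dense` helper below is hypothetical and only mimics the idea of a dense layer: its uniform random initialization is an assumption for illustration and is not the initializer Keras uses. The feature values are the EDUCATION, MARRIAGE, and AGE entries of the first row shown earlier, and the layer sizes are arbitrary.

```python
import math
import random


def relu(z):
    return max(0.0, z)


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def dense(inputs, n_units, activation, rng):
    """One dense layer: initialize weights, take weighted sums, apply activation."""
    # One weight per (input, unit) pair; biases start at zero.
    weights = [[rng.uniform(-1.0, 1.0) for _ in range(n_units)] for _ in inputs]
    biases = [0.0] * n_units
    outputs = []
    for j in range(n_units):
        z = sum(x * weights[i][j] for i, x in enumerate(inputs)) + biases[j]
        outputs.append(activation(z))
    return outputs


rng = random.Random(0)            # fixed seed so the sketch is repeatable
features = [2.0, 1.0, 24.0]       # EDUCATION, MARRIAGE, AGE of the first borrower
hidden = dense(features, 2, relu, rng)    # previous layer in, new layer out
output = dense(hidden, 1, sigmoid, rng)   # the hidden layer feeds the output layer
print(hidden, output)
```

Each call takes the previous layer's values as input, which is the same chaining pattern used with `tf.keras.layers.Dense` in the cells that follow.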

In [0]:
from tensorflow import Variable, matmul, ones, float32, constant
import tensorflow as tf
import numpy as np

In [0]:
# Define input layer
inputs = tf.constant(borrower_features, tf.float32)

In [0]:
# Define dense layer 1
dense1 = tf.keras.layers.Dense(16, activation='relu')(inputs)

In [0]:
# Define dense layer 2
dense2 = tf.keras.layers.Dense(8, activation='sigmoid')(dense1)

In [0]:
# Define output layer
outputs = tf.keras.layers.Dense(4, activation='softmax')(dense2)

In [0]:
print(outputs)

tf.Tensor(
[[0.41216153 0.10426651 0.20044154 0.28313044]
 [0.41147256 0.10402402 0.20055237 0.283951  ]
 [0.40902454 0.10331793 0.20200165 0.28565592]
 ...
 [0.4082709  0.10311431 0.20256901 0.28604576]
 [0.40600854 0.10250545 0.20443723 0.28704882]
 [0.40598437 0.10249762 0.20443855 0.28707942]], shape=(30000, 4), dtype=float32)


**Binary Classification Problems**

Here, we make use of credit card data. 

The target variable, default, indicates whether a credit card holder defaults on her payment in the following period. Since there are only two options--default or not--this is a binary classification problem. 

While the dataset has many features, we will focus on just three: the bill amounts in BILL_AMT4, BILL_AMT5, and BILL_AMT6. 

We will compute predictions from an untrained network, outputs, and compare them to the target variable, default.

In [0]:
bill_amounts = data[['BILL_AMT4', 'BILL_AMT5', 'BILL_AMT6']]

In [0]:
bill_amounts.head()

Unnamed: 0,BILL_AMT4,BILL_AMT5,BILL_AMT6
0,0.0,0.0,0.0
1,3272.0,3455.0,3261.0
2,14331.0,14948.0,15549.0
3,28314.0,28959.0,29547.0
4,20940.0,19146.0,19131.0


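In [0]:
default = data['default.payment.next.month']

In [0]:
default.head()

0    1
1    1
2    0
3    0
4    0
Name: default.payment.next.month, dtype: int64

In [0]:
# Convert the target to a float32 tensor after inspecting it; a tensor has no .head() method
default = constant(default, float32)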

In [0]:
# Construct input layer from features
inputs = constant(bill_amounts, float32)

In [0]:
# Define first dense layer
dense1 = tf.keras.layers.Dense(3, activation='relu')(inputs)

In [0]:
# Define second dense layer
dense2 = tf.keras.layers.Dense(2, activation='relu')(dense1)

In [0]:
# Define output layer
outputs = tf.keras.layers.Dense(1, activation='sigmoid')(dense2)

In [0]:
print(outputs.numpy())

[[0.5]
 [1. ]
 [1. ]
 ...
 [1. ]
 [1. ]
 [1. ]]


In [0]:
# Print error for first five examples
# Select column 0 so the predictions have shape (5,), matching the targets;
# a (5, 1) slice would broadcast against (5,) and produce a (5, 5) matrix
error = default[:5] - outputs.numpy()[:5, 0]
print(error)

tf.Tensor([ 0.5  0.  -1.  -1.  -1. ], shape=(5,), dtype=float32)
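A shape detail is worth checking when computing errors like this: subtracting a (5, 1) column of predictions from a length-5 vector of targets broadcasts to a (5, 5) matrix instead of five elementwise differences. The NumPy sketch below uses small illustrative values; TensorFlow follows the same broadcasting rules.

```python
import numpy as np

# Illustrative targets and predictions with the shapes seen in this exercise.
targets = np.array([1.0, 1.0, 0.0, 0.0, 0.0])                 # shape (5,)
predictions = np.array([[0.5], [1.0], [1.0], [1.0], [1.0]])   # shape (5, 1)

# (5,) minus (5, 1) broadcasts: every target is paired with every prediction.
broadcast_error = targets - predictions

# Selecting column 0 gives shape (5,), so the subtraction is elementwise.
elementwise_error = targets - predictions[:, 0]

print(broadcast_error.shape)
print(elementwise_error)
```

The elementwise result is the diagonal of the broadcast matrix, so flattening the predictions (or reshaping the targets to a column) yields one error per example.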


**Multiclass classification problems**

A multiclass problem has targets that can take on three or more values. 

In the credit card dataset, the education variable can take on 6 different values, each corresponding to a different level of education. 

We will use that as our target in this exercise and will also expand the feature set from 3 to 8 columns.

In [0]:
borrower_features = data[['EDUCATION', 'MARRIAGE', 'BILL_AMT1', 'BILL_AMT2', 'BILL_AMT3', 'BILL_AMT4', 'BILL_AMT5', 'BILL_AMT6']]

In [0]:
borrower_features.head()

Unnamed: 0,EDUCATION,MARRIAGE,BILL_AMT1,BILL_AMT2,BILL_AMT3,BILL_AMT4,BILL_AMT5,BILL_AMT6
0,2,1,3913.0,3102.0,689.0,0.0,0.0,0.0
1,2,2,2682.0,1725.0,2682.0,3272.0,3455.0,3261.0
2,2,2,29239.0,14027.0,13559.0,14331.0,14948.0,15549.0
3,2,1,46990.0,48233.0,49291.0,28314.0,28959.0,29547.0
4,2,1,8617.0,5670.0,35835.0,20940.0,19146.0,19131.0


In [0]:
borrower_features = constant(borrower_features, float32)

In [0]:
# Construct input layer from borrower features
inputs = constant(borrower_features)

# Define first dense layer
dense1 = tf.keras.layers.Dense(10, activation='sigmoid')(inputs)

# Define second dense layer
dense2 = tf.keras.layers.Dense(8, activation='relu')(dense1)

# Define output layer
outputs = tf.keras.layers.Dense(6, activation='softmax')(dense2)

# Print first five predictions
print(outputs.numpy()[:5])

[[0.16265075 0.19698463 0.11350538 0.2522056  0.15524304 0.11941069]
 [0.13544767 0.1998328  0.1369536  0.23823678 0.17253537 0.11699376]
 [0.15057723 0.1615248  0.16702412 0.17602593 0.18551677 0.15933105]
 [0.13380052 0.16518627 0.14076625 0.24564546 0.19083086 0.12377059]
 [0.13544767 0.1998328  0.1369536  0.23823678 0.17253537 0.11699376]]
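Each row of softmax output is a probability distribution over the six classes, so a predicted class can be read off as the index of the largest value. The sketch below does this in plain Python for the first and third rows printed above; the `argmax` helper is defined here for illustration, and since the network is untrained the resulting labels carry no meaning.

```python
# Two rows copied from the printed softmax predictions.
predictions = [
    [0.16265075, 0.19698463, 0.11350538, 0.2522056, 0.15524304, 0.11941069],
    [0.15057723, 0.1615248, 0.16702412, 0.17602593, 0.18551677, 0.15933105],
]


def argmax(row):
    """Index of the largest value in a list."""
    return max(range(len(row)), key=row.__getitem__)


for row in predictions:
    # Each row sums to roughly 1, and the largest entry marks the predicted class.
    print(argmax(row), round(sum(row), 4))
```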
