<a href="https://colab.research.google.com/github/raj-vijay/dl/blob/master/07_Dense_layers_in_Neural_Networks_and_TensorFlow.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

**Neural Networks**

The simplest definition of a neural network, more properly referred to as an 'artificial' neural network (ANN), is provided by the inventor of one of the first neurocomputers, Dr. Robert Hecht-Nielsen. 

He defines a neural network as:

"...a computing system made up of a number of simple, highly interconnected processing elements, which process information by their dynamic state response to external inputs."

Quoted in "Neural Network Primer: Part I" by Maureen Caudill, AI Expert, Feb. 1989.

ANNs are processing devices (algorithms or actual hardware) that are loosely modeled after the neuronal structure of the mammalian cerebral cortex, but on much smaller scales.

A large ANN might have hundreds or thousands of processor units, whereas a mammalian brain has billions of neurons, with a corresponding increase in the magnitude of their overall interaction and emergent behavior. ANN researchers are generally not concerned with whether their networks accurately resemble biological systems, although some are.

For example, researchers have accurately simulated the function of the retina and modeled the eye rather well.

Source: http://pages.cs.wisc.edu/~bolo/shipyard/neural/local.html

![alt text](https://www.tensorflow.org/images/custom_estimators/full_network.png)

Neural networks are typically organized in layers. Layers are made up of a number of interconnected 'nodes', each of which contains an 'activation function'. Patterns are presented to the network via the 'input layer', which communicates with one or more 'hidden layers' where the actual processing is done via a system of weighted 'connections'.

**Dense Layer**

A dense layer applies weights to all nodes from the previous layer.

A dense layer is just a regular layer of neurons in a neural network. Each neuron receives input from all the neurons in the previous layer, hence 'densely connected'. The layer has a weight matrix W and a bias vector b, which it applies to the activations a of the previous layer.

![alt text](https://raw.githubusercontent.com/raj-vijay/dl/master/images/Dense%20Layer.png)
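
As a rough illustration of the description above, a dense layer with a sigmoid activation computes sigmoid(aW + b). The sketch below uses NumPy rather than TensorFlow; the helper name `dense_forward` and all the numbers are made up for illustration and do not come from the notebook.

```python
import numpy as np

def sigmoid(z):
    """Logistic function, applied element-wise."""
    return 1.0 / (1.0 + np.exp(-z))

def dense_forward(a, W, b):
    """Hypothetical helper: one dense layer, sigmoid(a @ W + b)."""
    return sigmoid(a @ W + b)

# Illustrative values only: 2 examples, 3 input nodes, 2 output nodes
a = np.array([[1.0, 2.0, 3.0],
              [0.0, 0.0, 0.0]])
W = np.array([[0.1, -0.1],
              [0.2, -0.2],
              [0.3, -0.3]])   # every input node connects to every output node
b = np.zeros(2)               # one bias per output node

out = dense_forward(a, W, b)
print(out.shape)  # (2, 2): one row per example, one column per output node
print(out[1])     # the all-zero example gives sigmoid(0) = 0.5 for both nodes
```

The weight matrix has one row per input node and one column per output node, which is what makes the layer "dense".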

**A simple dense layer**

In [0]:
import tensorflow as tf
from tensorflow import float32

In [0]:
# Define inputs (features)
inputs = tf.constant([[1, 35]], float32)

In [0]:
# Define weights
weights = tf.Variable([[-0.05], [-0.01]], dtype=float32)

In [0]:
# Define the bias
bias = tf.Variable([0.5], dtype=float32)

In [0]:
# Multiply inputs (features) by the weights
product = tf.matmul(inputs, weights)

In [0]:
# Define dense layer
dense = tf.keras.activations.sigmoid(product+bias)
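
The arithmetic behind this small layer can be checked by hand. The sketch below repeats it in plain Python, using only the input, weight, and bias values from the cells above; it is a sanity check, not a replacement for the TensorFlow code.

```python
import math

# Values copied from the cells above
inputs = [1.0, 35.0]
weights = [-0.05, -0.01]
bias = 0.5

# Weighted sum: 1 * -0.05 + 35 * -0.01 = -0.4
product = sum(x * w for x, w in zip(inputs, weights))

# Add the bias, then apply the sigmoid: sigmoid(-0.4 + 0.5) = sigmoid(0.1)
dense = 1.0 / (1.0 + math.exp(-(product + bias)))
print(round(dense, 4))  # 0.525
```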

**Defining a complete model**

TensorFlow provides high-level APIs through Keras, which remove the need to write the linear algebra by hand.

In [0]:
import tensorflow as tf

In [0]:
# Define input (features) layer
inputs = tf.constant(inputs, tf.float32)

In [0]:
# Define first dense layer
dense1 = tf.keras.layers.Dense(10, activation='sigmoid')(inputs)

In [0]:
# Define second dense layer
dense2 = tf.keras.layers.Dense(5, activation='sigmoid')(dense1)

In [0]:
# Define output (predictions) layer
outputs = tf.keras.layers.Dense(1, activation='sigmoid')(dense2)

![alt text](https://raw.githubusercontent.com/raj-vijay/dl/master/images/TensorFlow%20Approach.png)

**Default of Credit Card Clients Dataset**

This research examined customers' default payments in Taiwan and compared the predictive accuracy of the probability of default among six data mining methods.

From the perspective of risk management, the result of predictive accuracy of the estimated probability of default will be more valuable than the binary result of classification - credible or not credible clients. 

Because the real probability of default is unknown, this study presented the novel Sorting Smoothing Method to estimate the real probability of default. With the real probability of default as the response variable (Y), and the predictive probability of default as the independent variable (X), the simple linear regression result (Y = A + BX) shows that the forecasting model produced by artificial neural network has the highest coefficient of determination; its regression intercept (A) is close to zero, and regression coefficient (B) to one. 

Therefore, among the six data mining techniques, artificial neural network is the only one that can accurately estimate the real probability of default.



Install the Kaggle package to access the Default of Credit Card Clients dataset from Kaggle.

In [0]:
!pip install kaggle



Create the .kaggle directory in the home directory (/root on Colab) to hold the Kaggle authentication JSON.

In [0]:
!mkdir -p ~/.kaggle

Copy the uploaded kaggle.json to ~/.kaggle/kaggle.json.

In [0]:
!cp /content/kaggle.json ~/.kaggle/kaggle.json

chmod 600 (equivalent to chmod a+rwx,u-x,g-rwx,o-rwx) sets the permissions so that the (u)ser/owner can read and write but not execute, while the (g)roup and (o)thers can neither read, write, nor execute.

In [0]:
!chmod 600 ~/.kaggle/kaggle.json
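
For readers who want to see what mode 600 looks like from Python, here is a minimal sketch using only the standard library. It works on a throwaway temporary file (a hypothetical stand-in for kaggle.json) and assumes a POSIX system such as the Colab runtime.

```python
import os
import stat
import tempfile

# Hypothetical stand-in for kaggle.json so the real credentials file is not touched
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    path = tmp.name

os.chmod(path, 0o600)                                # same mode as `chmod 600`
mode = stat.S_IMODE(os.stat(path).st_mode)           # permission bits only
mode_string = stat.filemode(os.stat(path).st_mode)   # ls-style string
print(oct(mode), mode_string)                        # on POSIX: 0o600 -rw-------

os.remove(path)
```

The `rw-` triplet belongs to the owner; the two `---` triplets show that group and others have no access.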

Download the Default of Credit Card Clients dataset from Kaggle.

In [0]:
!kaggle datasets download -d uciml/default-of-credit-card-clients-dataset

Downloading default-of-credit-card-clients-dataset.zip to /content
  0% 0.00/0.98M [00:00<?, ?B/s]
100% 0.98M/0.98M [00:00<00:00, 63.5MB/s]


In [0]:
import pandas as pd
file = '/content/default-of-credit-card-clients-dataset.zip'
data = pd.read_csv(file, compression = 'zip')

In [0]:
data.head()

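| | ID | LIMIT_BAL | SEX | EDUCATION | MARRIAGE | AGE | PAY_0 | PAY_2 | PAY_3 | PAY_4 | PAY_5 | PAY_6 | BILL_AMT1 | BILL_AMT2 | BILL_AMT3 | BILL_AMT4 | BILL_AMT5 | BILL_AMT6 | PAY_AMT1 | PAY_AMT2 | PAY_AMT3 | PAY_AMT4 | PAY_AMT5 | PAY_AMT6 | default.payment.next.month |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 1 | 20000.0 | 2 | 2 | 1 | 24 | 2 | 2 | -1 | -1 | -2 | -2 | 3913.0 | 3102.0 | 689.0 | 0.0 | 0.0 | 0.0 | 0.0 | 689.0 | 0.0 | 0.0 | 0.0 | 0.0 | 1 |
| 1 | 2 | 120000.0 | 2 | 2 | 2 | 26 | -1 | 2 | 0 | 0 | 0 | 2 | 2682.0 | 1725.0 | 2682.0 | 3272.0 | 3455.0 | 3261.0 | 0.0 | 1000.0 | 1000.0 | 1000.0 | 0.0 | 2000.0 | 1 |
| 2 | 3 | 90000.0 | 2 | 2 | 2 | 34 | 0 | 0 | 0 | 0 | 0 | 0 | 29239.0 | 14027.0 | 13559.0 | 14331.0 | 14948.0 | 15549.0 | 1518.0 | 1500.0 | 1000.0 | 1000.0 | 1000.0 | 5000.0 | 0 |
| 3 | 4 | 50000.0 | 2 | 2 | 1 | 37 | 0 | 0 | 0 | 0 | 0 | 0 | 46990.0 | 48233.0 | 49291.0 | 28314.0 | 28959.0 | 29547.0 | 2000.0 | 2019.0 | 1200.0 | 1100.0 | 1069.0 | 1000.0 | 0 |
| 4 | 5 | 50000.0 | 1 | 2 | 1 | 57 | -1 | 0 | -1 | 0 | 0 | 0 | 8617.0 | 5670.0 | 35835.0 | 20940.0 | 19146.0 | 19131.0 | 2000.0 | 36681.0 | 10000.0 | 9000.0 | 689.0 | 679.0 | 0 |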


**The linear algebra of dense layers**

There are two ways to define a dense layer in TensorFlow. The first uses low-level linear algebra operations. The second uses high-level Keras operations.

Here, we use the first method to construct the network shown in the image below.

![alt text](https://raw.githubusercontent.com/raj-vijay/dl/master/images/Simple%20Neural%20Network.png)

The input layer contains 3 features: 
1. Education
2. Marital status, and
3. Age

which are available as borrower_features. 

The hidden layer contains 2 nodes and the output layer contains a single node.

In [0]:
borrower_features = data[['EDUCATION','MARRIAGE','AGE']]

In [0]:
borrower_features.head()

| | EDUCATION | MARRIAGE | AGE |
|---|---|---|---|
| 0 | 2 | 1 | 24 |
| 1 | 2 | 2 | 26 |
| 2 | 2 | 2 | 34 |
| 3 | 2 | 1 | 37 |
| 4 | 2 | 1 | 57 |


For each layer, we take the previous layer as an input, initialize a set of weights, compute the product of the inputs and weights, and then apply an activation function. 

In [0]:
from tensorflow import Variable, matmul, ones, float32
from tensorflow import keras
import numpy as np

In [0]:
borrower_features = np.array(borrower_features, np.float32)

In [0]:
# Initialize bias1
bias1 = Variable(1.0, dtype=float32)

# Initialize weights1 as 3x2 variable of ones
weights1 = Variable(ones((3, 2)), dtype=float32)

# Perform matrix multiplication of borrower_features and weights1
product1 = matmul(borrower_features, weights1)

# Apply sigmoid activation function to product1 + bias1
dense1 = keras.activations.sigmoid(product1 + bias1)

# Print shape of dense1
print("\n dense1's output shape: {}".format(dense1.shape))


 dense1's output shape: (30000, 2)


In [0]:
bias1 = Variable(1.0)
weights1 = Variable(ones((3, 2)))
product1 = matmul(borrower_features, weights1)
dense1 = keras.activations.sigmoid(product1 + bias1)

# Initialize bias2 and weights2
bias2 = Variable(1.0)
weights2 = Variable(ones((2, 1)))

# Perform matrix multiplication of dense1 and weights2
product2 = matmul(dense1, weights2)

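# Apply activation to product2 + bias2 and print the prediction for the first borrower
prediction = keras.activations.sigmoid(product2 + bias2)
print('\n prediction: {}'.format(prediction.numpy()[0,0]))
# The first borrower's label in default.payment.next.month is 1 (see data.head() above)
print('\n actual: 1')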


 prediction: 0.9525741338729858

 actual: 1
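
The printed prediction can be reproduced by hand, which also shows where the number comes from. The sketch below is plain Python and assumes the all-ones weights and biases of 1.0 used in the cell above, together with the first row of borrower_features.

```python
import math

def sigmoid(z):
    """Logistic function for a single number."""
    return 1.0 / (1.0 + math.exp(-z))

# First borrower from borrower_features.head(): EDUCATION=2, MARRIAGE=1, AGE=24
features = [2.0, 1.0, 24.0]

# Hidden layer: both nodes have weights of 1 and a bias of 1
hidden_input = sum(features) + 1.0        # 28.0
hidden = [sigmoid(hidden_input)] * 2      # both nodes are very close to 1.0

# Output layer: weights of 1 and a bias of 1, so the input is close to 3
prediction = sigmoid(sum(hidden) + 1.0)
print(prediction)                         # close to sigmoid(3), about 0.9526
```

Because both hidden nodes saturate near 1 for feature sums of this size, this untrained network returns roughly sigmoid(3) for such borrowers, so the agreement with the label here should not be read as evidence of learning.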


**Low-level approach with multiple examples**

In [0]:
weights1 = tf.constant([-0.6, 0.6, 0.8, -0.3, -0.09, -0.08], shape=[3, 2], dtype=float32)

In [0]:
print(weights1)

tf.Tensor(
[[-0.6   0.6 ]
 [ 0.8  -0.3 ]
 [-0.09 -0.08]], shape=(3, 2), dtype=float32)


In [0]:
bias1 = tf.constant([0.1], dtype=float32)

In [0]:
print(bias1)

tf.Tensor([0.1], shape=(1,), dtype=float32)


In [0]:
# Compute the product of borrower_features and weights1
products1 = matmul(borrower_features, weights1)

# Apply a sigmoid activation function to products1 + bias1
dense1 = keras.activations.sigmoid(products1+bias1)

# Print the shapes of borrower_features, weights1, bias1, and dense1
print('\n shape of borrower_features: ', borrower_features.shape)
print('\n shape of weights1: ', weights1.shape)
print('\n shape of bias1: ', bias1.shape)
print('\n shape of dense1: ', dense1.shape)


 shape of borrower_features:  (30000, 3)

 shape of weights1:  (3, 2)

 shape of bias1:  (1,)

 shape of dense1:  (30000, 2)
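
The shapes above work out because the bias of shape (1,) is broadcast across every row and column of the matrix product. A small NumPy sketch of the same computation, assuming only the first two rows of borrower_features and the weights1 and bias1 values shown earlier:

```python
import numpy as np

# First two rows of borrower_features.head(): EDUCATION, MARRIAGE, AGE
features = np.array([[2, 1, 24],
                     [2, 2, 26]], dtype=np.float32)

# Same values as weights1 and bias1 above
weights1 = np.array([[-0.6, 0.6],
                     [0.8, -0.3],
                     [-0.09, -0.08]], dtype=np.float32)
bias1 = np.array([0.1], dtype=np.float32)

products1 = features @ weights1        # (2, 3) @ (3, 2) -> (2, 2)
pre_activation = products1 + bias1     # the (1,) bias is broadcast to every entry
dense1 = 1.0 / (1.0 + np.exp(-pre_activation))

print(products1.shape, bias1.shape, dense1.shape)  # (2, 2) (1,) (2, 2)
print(pre_activation[0])               # approximately [-2.46 -0.92]
```

With all 30000 rows the first dimension becomes 30000, and the rest of the shape arithmetic is unchanged.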


In [0]:
# High-level Keras approach: define the first dense layer with 7 output nodes
dense1 = keras.layers.Dense(7, activation='sigmoid')(borrower_features)

# Define a dense layer with 3 output nodes
dense2 = keras.layers.Dense(3, activation='sigmoid')(dense1)

# Define a dense layer with 1 output node
predictions = keras.layers.Dense(1, activation='sigmoid')(dense2)

# Print the shapes of dense1, dense2, and predictions
print('\n shape of dense1: ', dense1.shape)
print('\n shape of dense2: ', dense2.shape)
print('\n shape of predictions: ', predictions.shape)


 shape of dense1:  (30000, 7)

 shape of dense2:  (30000, 3)

 shape of predictions:  (30000, 1)
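
The output shapes say nothing about how many trainable parameters the three Keras layers hold. As a rough sketch, assuming each Dense layer keeps its default bias term, the count per layer is inputs times units plus units; `dense_param_count` is a hypothetical helper, not a Keras function.

```python
def dense_param_count(n_inputs, n_units):
    """Hypothetical helper: weights (n_inputs * n_units) plus one bias per unit."""
    return n_inputs * n_units + n_units

# Layer widths from the cell above: 3 features -> 7 -> 3 -> 1
widths = [3, 7, 3, 1]
counts = [dense_param_count(i, o) for i, o in zip(widths, widths[1:])]
print(counts, sum(counts))  # [28, 24, 4] 56
```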
