# Python Library: TensorFlow 2.0

**Run this from tf env, with tensorflow 2.0+**

In [1]:
import tensorflow as tf

In [2]:
tf.__version__

'2.0.0'

In [3]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

## 0. Introduction

TensorFlow is an open-source library for graph-based numerical computation.  
It provides both low-level and high-level APIs.  
It can perform basic operations such as addition, multiplication and differentiation.  
It can also be used to train ML models.  

Important changes in TensorFlow 2.0:  
Eager execution is now enabled by default, which allows users to write simpler and more intuitive code.  
Model building is now centered around the Keras and Estimators high-level APIs.  



In [4]:
# 1D tensor with a single element (a true 0D tensor would be tf.ones(()))
d0 = tf.ones((1,))
d0

<tf.Tensor: id=5, shape=(1,), dtype=float32, numpy=array([1.], dtype=float32)>

In [5]:
# 1D Tensor
d1 = tf.ones((2,))
d1

<tf.Tensor: id=8, shape=(2,), dtype=float32, numpy=array([1., 1.], dtype=float32)>

In [6]:
# 2D Tensor
d2 = tf.ones((2,2))
d2

<tf.Tensor: id=11, shape=(2, 2), dtype=float32, numpy=
array([[1., 1.],
       [1., 1.]], dtype=float32)>

In [7]:
# 3D Tensor
d3 = tf.ones((2,2,2))
d3

<tf.Tensor: id=14, shape=(2, 2, 2), dtype=float32, numpy=
array([[[1., 1.],
        [1., 1.]],

       [[1., 1.],
        [1., 1.]]], dtype=float32)>

In [8]:
# Print the 3D tensor
print(d3.numpy())

[[[1. 1.]
  [1. 1.]]

 [[1. 1.]
  [1. 1.]]]


### 0.1 Defining constants in TensorFlow

A constant is the simplest category of tensor.      
A constant does not change and cannot be trained.      

It can, however, have any dimension.      


In [9]:
from tensorflow import constant

# Define a 2x3 constant tensor of 3s.
a = constant(3, shape=[2,3])

# define a 2x2 tensor, which is constructed from the 1D tensor: 1,2,3,4
b = constant([1,2,3,4],shape=[2,2])

In [10]:
print(a.numpy())
print(b.numpy())

[[3 3 3]
 [3 3 3]]
[[1 2]
 [3 4]]


In [12]:
# Some convenient ways of defining constants
input_tensor = constant([1,2,3,4],shape=[2,2])

a = tf.constant([1,2,3])

b = tf.zeros([2,2])

c = tf.zeros_like(input_tensor)

d = tf.ones([2,2])

e = tf.ones_like(input_tensor)

f = tf.fill([3,3],7)

print(a.numpy())
print(b.numpy())
print(c.numpy())
print(d.numpy())
print(e.numpy())
print(f.numpy())

[1 2 3]
[[0. 0.]
 [0. 0.]]
[[0 0]
 [0 0]]
[[1. 1.]
 [1. 1.]]
[[1 1]
 [1 1]]
[[7 7 7]
 [7 7 7]
 [7 7 7]]


### 0.2 Defining and initialising variables

Unlike a constant, a variable's value can change during computation. The value of a variable is shared, persistent and modifiable.         
Its data type and shape are fixed.         



In [13]:
# Define a variable
a0 = tf.Variable([1,2,3,4,5,6], dtype=tf.float32)
a1 = tf.Variable([1,2,3,4,5,6], dtype=tf.int16)

In [14]:
# Define a constant
b = tf.constant(2,tf.float32)

In [15]:
# compute their product
c0 = tf.multiply(a0,b)
c1 = a0*b

In [16]:
print(c0.numpy())
print(c1.numpy())

[ 2.  4.  6.  8. 10. 12.]
[ 2.  4.  6.  8. 10. 12.]


Defining data as constants:  

The next cells use constant() to transform a numpy array, credit_numpy, into a tensorflow constant, credit_constant. This array contains feature columns from a dataset on credit card holders. We will return to this dataset later.  

Note that tensorflow version 2.0 allows you to use data as either a numpy array or a tensorflow constant object. Using a constant will ensure that any operations performed with that object are done in tensorflow.

In [4]:
credit_numpy = np.array([[ 2.0000e+00,  2.4000e+01,  1.0000e+00,  3.9130e+03],
       [ 2.0000e+00,  2.6000e+01,  2.0000e+00,  2.6820e+03],
       [ 2.0000e+00,  3.4000e+01,  2.0000e+00,  2.9239e+04],
       [ 2.0000e+00,  3.7000e+01,  2.0000e+00,  3.5650e+03],
       [ 3.0000e+00,  4.1000e+01,  1.0000e+00, -1.6450e+03],
       [ 2.0000e+00,  4.6000e+01,  1.0000e+00,  4.7929e+04]])

In [5]:
# Import constant from TensorFlow
from tensorflow import constant

# Convert the credit_numpy array into a tensorflow constant
credit_constant = constant(credit_numpy)

# Print constant datatype
print('The datatype is:', credit_constant.dtype)

# Print constant shape
print('The shape is:', credit_constant.shape)

The datatype is: <dtype: 'float64'>
The shape is: (6, 4)


## 1. Basic Operation of Tensorflow

TensorFlow has a model of computation that revolves around the use of graphs. A TensorFlow graph contains edges and nodes, where the edges are tensors and the nodes are operations.  

The add() operation performs element-wise **addition** with two tensors.  
Element-wise addition requires both tensors to have the same shape (or shapes that can be broadcast together).  
The add() operator is overloaded, which means that we can also perform addition using the plus symbol.  


In [4]:
# addition operator
from tensorflow import constant, add

# Define single-element tensors (shape (1,), not true 0-dimensional scalars)
A0 = constant([1])
B0 = constant([2])

# Define 1-dimensional tensors
A1 = constant([1,2])
B1 = constant([3,4])

# Define 2-dimensional tensors
A2 = constant([[1,2],[3,4]])
B2 = constant([[5,6],[7,8]])

# Perform tensor addition with add()
C0 = add(A0,B0) # single-element addition
C1 = add(A1,B1) # vector addition
C2 = add(A2,B2) # matrix addition

print(C0.numpy())
print(C1.numpy())
print(C2.numpy())

[3]
[4 6]
[[ 6  8]
 [10 12]]


**Element-wise multiplication** is performed using the multiply() operation.  
The tensors involved must have the same shape.  
**Matrix multiplication** is performed with the matmul() operation.  
The matmul(A,B) operation multiplies A by B.  
The number of columns of A must equal the number of rows of B.  
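
The shape rule for matrix multiplication can be checked without TensorFlow. The following is a minimal plain-Python sketch, not TensorFlow's implementation: `matmul_lists` is a hypothetical helper that works on nested lists, and `a34` and `a43` are stand-ins for 3x4 and 4x3 tensors of ones.

```python
def matmul_lists(a, b):
    """Multiply two matrices stored as nested lists (one inner list per row)."""
    rows_a, cols_a = len(a), len(a[0])
    rows_b, cols_b = len(b), len(b[0])
    # The inner dimensions must agree: columns of a == rows of b.
    if cols_a != rows_b:
        raise ValueError(
            f"cannot multiply {rows_a}x{cols_a} by {rows_b}x{cols_b}"
        )
    return [
        [sum(a[i][k] * b[k][j] for k in range(cols_a)) for j in range(cols_b)]
        for i in range(rows_a)
    ]


a34 = [[1.0] * 4 for _ in range(3)]  # 3x4 matrix of ones
a43 = [[1.0] * 3 for _ in range(4)]  # 4x3 matrix of ones

# 3x4 times 4x3 gives a 3x3 result; each entry sums four products of ones.
product = matmul_lists(a34, a43)
print(product)

# 3x4 times 3x4 violates the shape rule and is rejected.
try:
    matmul_lists(a34, a34)
except ValueError as err:
    print("shape error:", err)
```

Element-wise multiplication of `a34` with itself would be valid, because the two shapes match, while matrix multiplication needs the inner dimensions to agree instead.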


In [9]:
from tensorflow import ones, matmul, multiply

# Define tensors
A0 = ones(1)
A31 = ones([3,1])
A34 = ones([3,4])
A43 = ones([4,3])

ex1 = multiply(A0,A0)
ex2 = multiply(A31,A31)
ex3 = multiply(A34,A34)

print(ex1.numpy())
print(ex2.numpy())
print(ex3.numpy())

[1.]
[[1.]
 [1.]
 [1.]]
[[1. 1. 1. 1.]
 [1. 1. 1. 1.]
 [1. 1. 1. 1.]]


Summing over tensor dimensions:          

The reduce_sum() operator sums over the dimensions of a tensor.  
reduce_sum(A) sums over all dimensions of A and returns a scalar.  
reduce_sum(A,i) sums over dimension i.  
Summing over a dimension removes it, so the result has fewer dimensions than the input.

In [12]:
from tensorflow import ones, reduce_sum

# Define a 2x3x4 tensor of ones
A = ones([2,3,4])

# Sum over all dimensions
B = reduce_sum(A)

# Sum over dimensions 0, 1 and 2 in turn
B0 = reduce_sum(A,0) # 3x4 matrix of 2s
B1 = reduce_sum(A,1) # 2x4 matrix of 3s
B2 = reduce_sum(A,2) # 2x3 matrix of 4s

print(A.numpy())
print(B.numpy())
print(B0.numpy())
print(B1.numpy())
print(B2.numpy())

[[[1. 1. 1. 1.]
  [1. 1. 1. 1.]
  [1. 1. 1. 1.]]

 [[1. 1. 1. 1.]
  [1. 1. 1. 1.]
  [1. 1. 1. 1.]]]
24.0
[[2. 2. 2. 2.]
 [2. 2. 2. 2.]
 [2. 2. 2. 2.]]
[[3. 3. 3. 3.]
 [3. 3. 3. 3.]]
[[4. 4. 4.]
 [4. 4. 4.]]


In [14]:
from tensorflow import ones_like

# Define tensors A1 and A23 as constants
A1 = constant([1, 2, 3, 4])
A23 = constant([[1, 2, 3], [1, 6, 4]])

# Define B1 and B23 to have the correct shape
B1 = ones_like(A1)
B23 = ones_like(A23)

# Perform element-wise multiplication
C1 = multiply(A1,B1)
C23 = multiply(A23,B23)

# Print the tensors C1 and C23
print('C1: {}'.format(C1.numpy()))
print('C23: {}'.format(C23.numpy()))

C1: [1 2 3 4]
C23: [[1 2 3]
 [1 6 4]]


In [15]:
# Define features, params, and bill as constants
features = constant([[2, 24], [2, 26], [2, 57], [1, 37]])
params = constant([[1000], [150]])
bill = constant([[3913], [2682], [8617], [64400]])

# Compute billpred using features and params
billpred = matmul(features,params)

# Compute and print the error
error = bill - billpred
print(error.numpy())

[[-1687]
 [-3218]
 [-1933]
 [57850]]


In [16]:
wealth = constant([[11, 50], [7, 2], [4, 60], [3, 0],[25,10]])

wealth_all = reduce_sum(wealth)
# Sum over dimension 0 and dimension 1
wealth0 = reduce_sum(wealth,0) # column sums, shape (2,)
wealth1 = reduce_sum(wealth,1) # row sums, shape (5,)


print(wealth.numpy())
print(wealth_all.numpy())
print(wealth0.numpy())
print(wealth1.numpy())


[[11 50]
 [ 7  2]
 [ 4 60]
 [ 3  0]
 [25 10]]
172
[ 50 122]
[61  9 64  3 35]


## 2. Advanced operations

gradient():  
Computes the slope of a function at a point.  

reshape():  
Reshapes a tensor (e.g. 10x10 to 100x1).  

random():  
Populates a tensor with entries drawn from a probability distribution.  


**Finding the optimum (gradient())**        

In many ML problems, we will want to find the optimum of a function, e.g. minimise the loss function or maximise the objective function.  

Minimum: lowest value of a loss function.  
Maximum: highest value of an objective function.  

We can do this using the gradient() operation.  
We pass points to the gradient operation until we find one where the gradient is 0; that point is a candidate **optimum**.  
Minimum: the gradient is increasing at that point (change in gradient > 0).  
Maximum: the gradient is decreasing at that point (change in gradient < 0).  
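
The two conditions (gradient equal to 0, then the sign of the change in gradient) can be illustrated numerically. This is a rough sketch using finite differences in plain Python rather than TensorFlow's automatic differentiation; `slope`, `curvature`, `loss` and `objective` are hypothetical names chosen for this example.

```python
def slope(f, x, h=1e-4):
    """Central-difference estimate of the gradient of f at x."""
    return (f(x + h) - f(x - h)) / (2 * h)


def curvature(f, x, h=1e-4):
    """Central-difference estimate of the change in gradient of f at x."""
    return (f(x + h) - 2 * f(x) + f(x - h)) / (h * h)


def loss(x):
    # A function with a minimum at x = 0.
    return x * x


def objective(x):
    # A function with a maximum at x = 1.
    return -(x - 1.0) ** 2


# At an optimum the slope is (approximately) zero.
print(slope(loss, 0.0), curvature(loss, 0.0))            # curvature > 0: minimum
print(slope(objective, 1.0), curvature(objective, 1.0))  # curvature < 0: maximum
```

Finite differences are only approximate, so the slopes print as values very close to zero rather than exactly zero.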




In [17]:
# Define x
x = tf.Variable(-1.0)

# Define y = x^2 within an instance of GradientTape.
# Calling tape.watch(x) tells the tape to track x (trainable variables are watched automatically),
# which allows us to compute the rate of change of y with respect to x.
with tf.GradientTape() as tape:
    tape.watch(x)
    y = tf.multiply(x,x)
    
# Evaluate the gradient of y at x = -1, using the tape instance of gradienttape
g = tape.gradient(y,x)
print(g.numpy())

# slope is -2 at x = -1

-2.0


**Images as tensor (reshape())**        

A grayscale image has a natural representation as a matrix with values between 0 and 255. Some algorithms require us to reshape matrices into vectors before using them as inputs.

In [19]:
# generate a grayscale image (maxval is exclusive, so 256 allows values 0 to 255)
gray = tf.random.uniform([2,2],maxval=256, dtype="int32")

# reshape grayscale image
gray = tf.reshape(gray, [2*2,1])
print(gray.numpy())

# generate a color image
color = tf.random.uniform([2,2,3],maxval=256, dtype="int32")

# reshape color image
color = tf.reshape(color,[2*2,3])
print(color.numpy())

[[221]
 [232]
 [206]
 [111]]
[[105 109  38]
 [145 170 187]
 [ 94 225  87]
 [ 50 136 170]]
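
Reshaping keeps the elements in row-major order: the first row is read left to right, then the second row, and so on. The sketch below mimics that ordering in plain Python for a hypothetical 2x2 image; `reshape_to_column` is an illustrative helper, not a TensorFlow function.

```python
def reshape_to_column(matrix):
    """Flatten a nested-list matrix into an (n*m)x1 column in row-major order."""
    return [[value] for row in matrix for value in row]


# Hypothetical 2x2 grayscale image (values chosen for illustration).
gray_image = [[221, 232],
              [206, 111]]

column = reshape_to_column(gray_image)
print(column)
```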


Suppose we want to minimize a loss function, y = x^2. We can do this by computing the slope with tf.GradientTape() at different values of x. If the slope is positive, we can decrease the loss by lowering x. If it is negative, we can decrease it by increasing x. This is how gradient descent works.

In [21]:
def compute_gradient(x0):
    # Define x as a variable with an initial value of x0
    x = tf.Variable(x0)
    with tf.GradientTape() as tape:
        tape.watch(x)
        # Define y using the multiply operation
        y = tf.multiply(x,x)
    # Return the gradient of y with respect to x
    return tape.gradient(y, x).numpy()

# Compute and print gradients at x = -1, 1, and 0
print(compute_gradient(-1.0))
print(compute_gradient(1.0))
print(compute_gradient(0.0))

-2.0
2.0
0.0
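
The gradients printed above are the ingredient for gradient descent: step against the sign of the slope until the slope is close to zero. The loop below is a minimal plain-Python sketch of that idea for y = x^2. It uses the analytic derivative 2x in place of GradientTape, and the learning rate and step count are arbitrary values picked for illustration.

```python
def gradient(x):
    """Analytic derivative of y = x**2."""
    return 2.0 * x


def gradient_descent(x0, learning_rate=0.1, steps=50):
    """Repeatedly move x a small step in the direction that lowers y."""
    x = x0
    for _ in range(steps):
        # A negative slope increases x, a positive slope decreases x.
        x -= learning_rate * gradient(x)
    return x


x_final = gradient_descent(-1.0)
print(x_final)  # close to the minimum at x = 0
```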


## 3. Linear Model

### 3.1 Input data

When we train an ML model, we will want to import data from an external source. Numeric data will need to be assigned a type, and text and image data will need to be converted into a usable format.  

External datasets can be imported using TensorFlow, which is useful for complex data pipelines. We can also use pandas to import data and then convert it into a numpy array, which we can use without further modification in TensorFlow.  
A dataset will typically contain columns with different data types.

In [27]:
housing = pd.read_csv("../Machine_Learning_basics/data/kc_house_data.csv")
housing.head()

| | id | date | price | bedrooms | bathrooms | sqft_living | sqft_lot | floors | waterfront | view | ... | grade | sqft_above | sqft_basement | yr_built | yr_renovated | zipcode | lat | long | sqft_living15 | sqft_lot15 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 7129300520 | 20141013T000000 | 221900.0 | 3 | 1.0 | 1180 | 5650 | 1.0 | 0 | 0 | ... | 7 | 1180 | 0 | 1955 | 0 | 98178 | 47.5112 | -122.257 | 1340 | 5650 |
| 1 | 6414100192 | 20141209T000000 | 538000.0 | 3 | 2.25 | 2570 | 7242 | 2.0 | 0 | 0 | ... | 7 | 2170 | 400 | 1951 | 1991 | 98125 | 47.721 | -122.319 | 1690 | 7639 |
| 2 | 5631500400 | 20150225T000000 | 180000.0 | 2 | 1.0 | 770 | 10000 | 1.0 | 0 | 0 | ... | 6 | 770 | 0 | 1933 | 0 | 98028 | 47.7379 | -122.233 | 2720 | 8062 |
| 3 | 2487200875 | 20141209T000000 | 604000.0 | 4 | 3.0 | 1960 | 5000 | 1.0 | 0 | 0 | ... | 7 | 1050 | 910 | 1965 | 0 | 98136 | 47.5208 | -122.393 | 1360 | 5000 |
| 4 | 1954400510 | 20150218T000000 | 510000.0 | 3 | 2.0 | 1680 | 8080 | 1.0 | 0 | 0 | ... | 8 | 1680 | 0 | 1987 | 0 | 98074 | 47.6168 | -122.045 | 1800 | 7503 |


In [28]:
housing.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 21613 entries, 0 to 21612
Data columns (total 21 columns):
id               21613 non-null int64
date             21613 non-null object
price            21613 non-null float64
bedrooms         21613 non-null int64
bathrooms        21613 non-null float64
sqft_living      21613 non-null int64
sqft_lot         21613 non-null int64
floors           21613 non-null float64
waterfront       21613 non-null int64
view             21613 non-null int64
condition        21613 non-null int64
grade            21613 non-null int64
sqft_above       21613 non-null int64
sqft_basement    21613 non-null int64
yr_built         21613 non-null int64
yr_renovated     21613 non-null int64
zipcode          21613 non-null int64
lat              21613 non-null float64
long             21613 non-null float64
sqft_living15    21613 non-null int64
sqft_lot15       21613 non-null int64
dtypes: float64(5), int64(15), object(1)
memory usage: 3.5+ MB


In [26]:
#convert to numpy array
housing_arr = np.array(housing)

In [None]:
# setting data type with a numpy array

# convert the price column to float32
price = np.array(housing["price"],np.float32)

# convert the waterfront column to Boolean
# waterfront is a numpy array after conversion
# (np.bool_ works across NumPy versions; the np.bool alias is missing from some releases)
waterfront = np.array(housing["waterfront"], np.bool_)


In [None]:
# setting data type using cast in TensorFlow

# convert the price column to float32
price = tf.cast(housing["price"],tf.float32)

# convert the waterfront column to Boolean
# waterfront is a tf.Tensor after conversion
waterfront = tf.cast(housing["waterfront"],tf.bool)



In [29]:
# Import numpy and tensorflow with their standard aliases
import tensorflow as tf
import numpy as np

# Use a numpy array to define price as a 32-bit float
price = np.array(housing['price'], np.float32)

# Define waterfront as a Boolean using cast
waterfront = tf.cast(housing['waterfront'], tf.bool)

# Print price and waterfront
print(price)
print(waterfront)

[221900. 538000. 180000. ... 402101. 400000. 325000.]
tf.Tensor([False False False ... False False False], shape=(21613,), dtype=bool)


### 3.2 Loss functions

We need loss functions to train models because they tell us how well our model explains the data.  
A loss is a measure of model fit: without this feedback, it is unclear how to adjust model parameters during the training process.  
A high loss value indicates that the model fit is poor, so we typically want to minimise the loss function. In some cases we might want to maximise a function instead; we can always place a minus sign before the function we want to maximise and minimise that instead.  

TensorFlow has operations for common loss functions (for linear model):       
1) Mean squared error (MSE)           
2) Mean absolute error (MAE)        
3) Huber error           

Loss functions are accessible from tf.keras.losses:  
tf.keras.losses.mse()  
tf.keras.losses.mae()  
tf.keras.losses.Huber() (a class: create an instance, then call it on targets and predictions)  

MSE:       
Strongly penalise outliers      
High sensitivity near minimum       

MAE:  
Scales linearly with size of error  
Low sensitivity near minimum  

Huber:  
Similar to MSE near minimum  
Similar to MAE away from minimum  

For greater sensitivity near the minimum, use MSE or Huber.      
To minimise the impact of outliers, use MAE or Huber.          
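
The trade-off between the three losses shows up clearly when one prediction is far off. The sketch below writes the three formulas in plain Python to make the arithmetic visible. It is not the Keras implementation; the Huber threshold `delta=1.0` and the sample numbers are assumptions made for this illustration.

```python
def mse(targets, predictions):
    """Mean squared error."""
    errors = [t - p for t, p in zip(targets, predictions)]
    return sum(e * e for e in errors) / len(errors)


def mae(targets, predictions):
    """Mean absolute error."""
    errors = [t - p for t, p in zip(targets, predictions)]
    return sum(abs(e) for e in errors) / len(errors)


def huber(targets, predictions, delta=1.0):
    """Huber loss: quadratic for small errors, linear for large ones."""
    total = 0.0
    for t, p in zip(targets, predictions):
        e = abs(t - p)
        if e <= delta:
            total += 0.5 * e * e               # MSE-like near the minimum
        else:
            total += delta * (e - 0.5 * delta)  # MAE-like away from it
    return total / len(targets)


targets = [1.0, 2.0, 3.0, 4.0]
close_fit = [1.1, 1.9, 3.2, 3.8]      # small errors only
with_outlier = [1.1, 1.9, 3.2, 14.0]  # last prediction is off by 10

for name, preds in [("close fit", close_fit), ("with outlier", with_outlier)]:
    print(name, mse(targets, preds), mae(targets, preds), huber(targets, preds))
```

With the outlier, MSE grows much faster than MAE or Huber, which is the behaviour summarised in the lists above.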

In many cases, the training process will require us to supply a function that accepts our model's variables and data and returns a loss.

In [None]:
# Define a loss function

# Compute the MSE loss.
# It needs two tensors: the actual values (targets)
# and the predicted values (predictions).

loss = tf.keras.losses.mse(targets, predictions)

In [None]:
# define a linear regression model
# (slope, features and targets are assumed to be defined earlier)
def linear_regression(intercept, slope = slope, features = features):
    return intercept + features*slope

# define a loss function to compute the MSE
def loss_function(intercept, slope, targets = targets, features = features):
    # compute the predictions for a linear model
    predictions = linear_regression(intercept, slope, features)

    # return the loss
    return tf.keras.losses.mse(targets, predictions)

In [None]:
# compute the loss for test data inputs
loss_function(intercept, slope, test_targets, test_features)

In [None]:
# Initialize a variable named scalar
scalar = tf.Variable(1.0, dtype=tf.float32)

# Define the model
def model(scalar, features = features):
    return scalar * features

# Define a loss function
def loss_function(scalar, features = features, targets = targets):
    # Compute the predicted values
    predictions = model(scalar, features)

    # Return the mean absolute error loss
    return tf.keras.losses.mae(targets, predictions)

# Evaluate the loss function and print the loss
print(loss_function(scalar).numpy())

### 3.3 Linear Regression

A linear regression model assumes a linear relationship (univariate regression):       
price = intercept + size * slope + error (to predict house price with size of house)      



In [39]:
# Linear regression in TensorFlow

# Define the targets and features
price = np.array(housing["price"],np.float32)
size = np.array(housing["sqft_living"],np.float32)

# Define the intercept and slope, initialisation
intercept = tf.Variable(0.1, dtype=np.float32)
slope = tf.Variable(0.1, dtype=np.float32)

# define a model: linear regression
def linear_regression(intercept, slope, features = size):
    return intercept + features*slope

# compute the predictions and the loss
def loss_function(intercept, slope, targets = price, features = size):
    predictions = linear_regression(intercept, slope, features)
    return tf.keras.losses.mse(targets, predictions)

# Define an optimisation operation
# learning rate = 0.5
opt = tf.keras.optimizers.Adam(0.5)

# minimise the loss function and print the loss every 100 steps
for j in range(1000):
    opt.minimize(lambda: loss_function(intercept,slope), var_list=[intercept,slope])
    if j % 100 == 0:
        print(loss_function(intercept, slope).numpy())

# print the trained parameters
print(intercept.numpy(), slope.numpy())

424840500000.0
305881000000.0
219757840000.0
160266650000.0
121212770000.0
97023190000.0
82984410000.0
75397860000.0
71600530000.0
69847540000.0
246.97383 253.7275
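
The optimizer loop above is iterative, and the printed loss is still falling after 1000 steps. For a univariate model, the least-squares intercept and slope also have a closed form, which can serve as a sanity check on an iterative fit. The sketch below is plain Python and is a different technique from the Adam loop; `fit_line` is a hypothetical helper, and `sizes` and `prices` are made-up numbers, not values from kc_house_data.csv.

```python
def fit_line(features, targets):
    """Closed-form least-squares fit of targets = intercept + slope * features."""
    n = len(features)
    mean_x = sum(features) / n
    mean_y = sum(targets) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(features, targets))
    variance = sum((x - mean_x) ** 2 for x in features)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Made-up data lying exactly on the line price = 200000 + 150 * size.
sizes = [1000.0, 1500.0, 2000.0, 2500.0]
prices = [200000.0 + 150.0 * s for s in sizes]

intercept_hat, slope_hat = fit_line(sizes, prices)
print(intercept_hat, slope_hat)  # recovers the intercept and slope of the line
```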


### 3.4 Batch training

We can use batch training to handle large datasets.  

If the dataset is too large to fit in memory, we can instead divide it into batches and train on those batches sequentially.  

A single pass over all of the batches is called an epoch, and the process itself is called batch training.  
Batch training also allows us to update model weights and optimizer parameters after each batch, rather than at the end of the epoch.  

pd.read_csv() allows us to load data in batches, which avoids loading the entire dataset at once.  
We can do this by using the chunksize parameter to provide the batch size.  

Full sample:       
1) One update per epoch         
2) Accepts datasets w/o modification       
3) Limited by memory      

Batch Training:       
1) Multiple updates per epoch       
2) Requires division of dataset      
3) No limit on dataset size          
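
The difference in updates per epoch can be seen in a small plain-Python sketch. It does not use pandas or TensorFlow: `batches` and `train_one_epoch` are hypothetical helpers, the model is a single slope with no intercept, and the data and learning rate are made up for illustration.

```python
def batches(rows, batch_size):
    """Yield consecutive slices of rows, each holding at most batch_size items."""
    for start in range(0, len(rows), batch_size):
        yield rows[start:start + batch_size]


def train_one_epoch(rows, slope, learning_rate, batch_size):
    """One pass over the data, with one gradient step per batch."""
    updates = 0
    for batch in batches(rows, batch_size):
        # Gradient of the batch MSE with respect to slope for y_hat = slope * x.
        grad = sum(2 * (slope * x - y) * x for x, y in batch) / len(batch)
        slope -= learning_rate * grad
        updates += 1
    return slope, updates


# Made-up data on the line y = 3x.
rows = [(float(x), 3.0 * x) for x in range(1, 11)]

batch_slope, batch_updates = train_one_epoch(rows, 0.0, 0.01, batch_size=2)
full_slope, full_updates = train_one_epoch(rows, 0.0, 0.01, batch_size=10)

print(batch_updates, batch_slope)  # 5 updates in one epoch
print(full_updates, full_slope)    # 1 update in one epoch
```

Only one batch needs to be held in memory at a time, which is the property that lets batch training scale to data that does not fit in memory.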



In [None]:
# load data in batches

for batch in pd.read_csv("../Machine_Learning_basics/data/kc_house_data.csv", chunksize=100):
    # extract price column
    price = np.array(batch["price"],np.float32)

    # extract size column (sqft_living in this dataset)
    size = np.array(batch["sqft_living"],np.float32)

In [41]:
# train a linear model in batches, minimal example

# Define trainable variable
intercept = tf.Variable(0.1, dtype=np.float32)
slope = tf.Variable(0.1,dtype=np.float32)

# Define model
def linear_regression(intercept, slope, features):
    return intercept + features*slope

# Compute predicted values and return loss function
def loss_function(intercept, slope,targets,features):
    predictions = linear_regression(intercept,slope,features)
    return tf.keras.losses.mse(targets,predictions)

# Define optimizer
opt = tf.keras.optimizers.Adam()

# load data in batches
for batch in pd.read_csv("../Machine_Learning_basics/data/kc_house_data.csv", chunksize=100):
    #extract the target and feature columns
    price_batch = np.array(batch["price"],np.float32)
    size_batch = np.array(batch["sqft_lot"],np.float32)
    
    #minimize the loss function
    opt.minimize(lambda: loss_function(intercept, slope, price_batch, size_batch),var_list=[intercept,slope])
    
print(intercept.numpy(),slope.numpy())

0.31781912 0.29831016


## 4. Neural Networks



### 4.1 Dense layer

The input layer consists of features. The output layer contains our prediction. Each hidden layer takes inputs from the previous layer, applies numerical weights to them, sums them together, and then applies an activation function.  

Dense layer: applies weights to all nodes from the previous layer.  


In [43]:
# Define inputs (features)
inputs = tf.constant([[1.0,35.0]]) # 2 input (feature) values (e.g. marital status (1/0) and age)

# Define weights
weights = tf.Variable([[-0.5],[-0.01]])

# Define bias
# plays a similar role to the intercept in the linear regression model
bias = tf.Variable([0.5])

# A simple dense layer
# multiply inputs (features) by the weights
product = tf.matmul(inputs, weights)

# Define dense layer
# add the "product" to bias and apply non-linear transformation (sigmoid function, activation function)
dense = tf.keras.activations.sigmoid(product+bias)
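
The cell above does not print its result, so the arithmetic can be checked by hand. The following plain-Python sketch repeats the same steps (weighted sum, add bias, sigmoid) for the same input, weight and bias values. `sigmoid` and `dense_layer` are hypothetical helpers written for this check, not Keras functions.

```python
import math


def sigmoid(z):
    """Logistic activation function."""
    return 1.0 / (1.0 + math.exp(-z))


def dense_layer(inputs, weights, bias):
    """Forward pass for one example.

    inputs:  list of feature values
    weights: one row per input feature, one column per output node
    bias:    one value per output node
    """
    n_outputs = len(weights[0])
    return [
        sigmoid(sum(x * weights[i][j] for i, x in enumerate(inputs)) + bias[j])
        for j in range(n_outputs)
    ]


inputs = [1.0, 35.0]
weights = [[-0.5], [-0.01]]
bias = [0.5]

# Weighted sum: 1.0 * -0.5 + 35.0 * -0.01 = -0.85; adding the bias gives -0.35.
output = dense_layer(inputs, weights, bias)
print(output)  # one value, roughly 0.41
```

TensorFlow computes in float32 here, so its result may differ from the plain-Python value in the last few digits.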


In [None]:
# A complete model 
# using high level API Keras

# Define input (features) layer
inputs = tf.constant(data, tf.float32)

# Define first dense layer
# 10 outgoing nodes
# by default, a bias will be included
# inputs is passed as the argument to the first dense layer
dense1 = tf.keras.layers.Dense(10,activation="sigmoid")(inputs)

# Define second dense layer
dense2 = tf.keras.layers.Dense(5,activation="sigmoid")(dense1)

# Define output (predictions)
outputs = tf.keras.layers.Dense(1,activation="sigmoid")(dense2)

High-level approach:  
Relies on complex operations in high-level APIs, such as Keras and Estimators, reducing the amount of code needed. The weights and the mathematical operations will typically be hidden by the layer constructor.  

Low-level approach:  
Defines the weights and biases explicitly and builds each layer from linear algebra operations such as matmul().  


In the next example, the input layer contains 3 features (education, marital status, and age), which are available as borrower_features. The hidden layer contains 2 nodes and the output layer contains a single node.  

For each layer, we take the previous layer as an input, initialize a set of weights, compute the product of the inputs and weights, and then apply an activation function.

In [55]:
borrower_features = np.array([[2.,  1., 24.]],dtype=np.float32)

# Initialize bias1
bias1 = tf.Variable(1.0)

# Initialize weights1 as 3x2 variable of ones
weights1 = tf.Variable(ones((3, 2)))

# Perform matrix multiplication of borrower_features and weights1
product1 = tf.matmul(borrower_features,weights1)

# Apply sigmoid activation function to product1 + bias1
dense1 = tf.keras.activations.sigmoid(product1 + bias1)

# Print shape of dense1
print("\n dense1's output shape: {}".format(dense1.shape))

# Initialize bias2 and weights2
bias2 = tf.Variable(1.0)
weights2 = tf.Variable(ones((2, 1)))

# Perform matrix multiplication of dense1 and weights2
product2 = tf.matmul(dense1,weights2)

# Apply activation to product2 + bias2 and print the prediction
prediction = tf.keras.activations.sigmoid(product2 + bias2)
print('\n prediction: {}'.format(prediction.numpy()[0,0]))
print('\n actual: 1')


 dense1's output shape: (1, 2)

 prediction: 0.9525741338729858

 actual: 1


In [58]:
weights1 = np.array(
       [[-0.6 ,  0.6 ],
       [ 0.8 , -0.3 ],
       [-0.09, -0.08]],dtype=np.float32)

borrower_features = np.array(
       [[ 3.,  3., 23.],
       [ 2.,  1., 24.],
       [ 1.,  1., 49.],
       [ 1.,  1., 49.],
       [ 2.,  1., 29.]],dtype=np.float32)

bias1 = np.array([0.1], dtype=np.float32)
# Compute the product of borrower_features and weights1
products1 = tf.matmul(borrower_features,weights1)

# Apply a sigmoid activation function to products1 + bias1
dense1 = tf.keras.activations.sigmoid(products1+bias1)

# Print the shapes of borrower_features, weights1, bias1, and dense1
print('\n shape of borrower_features: ', borrower_features.shape)
print('\n shape of weights1: ', weights1.shape)
print('\n shape of bias1: ', bias1.shape)
print('\n shape of dense1: ', dense1.shape)



 shape of borrower_features:  (5, 3)

 shape of weights1:  (3, 2)

 shape of bias1:  (1,)

 shape of dense1:  (5, 2)


We've now seen how to define dense layers in tensorflow using linear algebra. Next, we'll skip the linear algebra and let keras work out the details. This will allow us to construct a network with 2 hidden layers and 10 features using less code than we needed for the network with 1 hidden layer and 3 features.  

To construct this network, we'll need to define three dense layers, each of which takes the previous layer as an input, multiplies it by weights, and applies an activation function. The input data, borrower_features, is a 100x10 array.

In [59]:
borrower_features = np.array([[6.96469188e-01, 2.86139339e-01, 2.26851448e-01, 5.51314771e-01,
        7.19468951e-01, 4.23106462e-01, 9.80764210e-01, 6.84829712e-01,
        4.80931908e-01, 3.92117530e-01],
       [3.43178004e-01, 7.29049683e-01, 4.38572258e-01, 5.96778952e-02,
        3.98044258e-01, 7.37995386e-01, 1.82491735e-01, 1.75451756e-01,
        5.31551361e-01, 5.31827569e-01],
       [6.34400964e-01, 8.49431813e-01, 7.24455297e-01, 6.11023486e-01,
        7.22443402e-01, 3.22958916e-01, 3.61788660e-01, 2.28263229e-01,
        2.93714046e-01, 6.30976140e-01],
       [9.21049416e-02, 4.33701187e-01, 4.30862755e-01, 4.93685097e-01,
        4.25830305e-01, 3.12261224e-01, 4.26351309e-01, 8.93389165e-01,
        9.44160044e-01, 5.01836658e-01],
       [6.23952925e-01, 1.15618393e-01, 3.17285478e-01, 4.14826214e-01,
        8.66309166e-01, 2.50455379e-01, 4.83034253e-01, 9.85559762e-01,
        5.19485116e-01, 6.12894535e-01],
       [1.20628662e-01, 8.26340795e-01, 6.03060126e-01, 5.45068026e-01,
        3.42763841e-01, 3.04120779e-01, 4.17022198e-01, 6.81300759e-01,
        8.75456870e-01, 5.10422349e-01],
       [6.69313788e-01, 5.85936546e-01, 6.24903500e-01, 6.74689054e-01,
        8.42342436e-01, 8.31949860e-02, 7.63682842e-01, 2.43666381e-01,
        1.94222957e-01, 5.72456956e-01],
       [9.57125202e-02, 8.85326803e-01, 6.27248943e-01, 7.23416328e-01,
        1.61292069e-02, 5.94431877e-01, 5.56785166e-01, 1.58959642e-01,
        1.53070509e-01, 6.95529521e-01],
       [3.18766415e-01, 6.91970289e-01, 5.54383278e-01, 3.88950586e-01,
        9.25132513e-01, 8.41669977e-01, 3.57397556e-01, 4.35914621e-02,
        3.04768085e-01, 3.98185670e-01],
       [7.04958856e-01, 9.95358467e-01, 3.55914861e-01, 7.62547791e-01,
        5.93176901e-01, 6.91701770e-01, 1.51127458e-01, 3.98876280e-01,
        2.40855902e-01, 3.43456000e-01],
       [5.13128161e-01, 6.66624546e-01, 1.05908483e-01, 1.30894944e-01,
        3.21980596e-01, 6.61564350e-01, 8.46506238e-01, 5.53257346e-01,
        8.54452491e-01, 3.84837806e-01],
       [3.16787899e-01, 3.54264677e-01, 1.71081826e-01, 8.29112649e-01,
        3.38670850e-01, 5.52370071e-01, 5.78551471e-01, 5.21533072e-01,
        2.68806447e-03, 9.88345444e-01],
       [9.05341566e-01, 2.07635865e-01, 2.92489409e-01, 5.20010173e-01,
        9.01911378e-01, 9.83630896e-01, 2.57542074e-01, 5.64359069e-01,
        8.06968689e-01, 3.94370049e-01],
       [7.31073022e-01, 1.61069021e-01, 6.00698590e-01, 8.65864456e-01,
        9.83521581e-01, 7.93657899e-02, 4.28347290e-01, 2.04542860e-01,
        4.50636476e-01, 5.47763586e-01],
       [9.33267102e-02, 2.96860784e-01, 9.27584231e-01, 5.69003761e-01,
        4.57412004e-01, 7.53525972e-01, 7.41862178e-01, 4.85790335e-02,
        7.08697379e-01, 8.39243352e-01],
       [1.65937886e-01, 7.80997932e-01, 2.86536604e-01, 3.06469738e-01,
        6.65261447e-01, 1.11392170e-01, 6.64872468e-01, 8.87856781e-01,
        6.96311295e-01, 4.40327883e-01],
       [4.38214391e-01, 7.65096068e-01, 5.65641999e-01, 8.49041641e-02,
        5.82671106e-01, 8.14843714e-01, 3.37066382e-01, 9.27576602e-01,
        7.50716984e-01, 5.74063838e-01],
       [7.51644015e-01, 7.91489631e-02, 8.59389067e-01, 8.21504116e-01,
        9.09871638e-01, 1.28631204e-01, 8.17800835e-02, 1.38415575e-01,
        3.99378717e-01, 4.24306870e-01],
       [5.62218368e-01, 1.22243546e-01, 2.01399505e-01, 8.11644375e-01,
        4.67987567e-01, 8.07938218e-01, 7.42637832e-03, 5.51592708e-01,
        9.31932151e-01, 5.82175434e-01],
       [2.06095725e-01, 7.17757583e-01, 3.78985852e-01, 6.68383956e-01,
        2.93197222e-02, 6.35900378e-01, 3.21979336e-02, 7.44780660e-01,
        4.72912997e-01, 1.21754356e-01],
       [5.42635918e-01, 6.67744428e-02, 6.53364897e-01, 9.96086299e-01,
        7.69397318e-01, 5.73774099e-01, 1.02635257e-01, 6.99834049e-01,
        6.61167860e-01, 4.90971319e-02],
       [7.92299330e-01, 5.18716574e-01, 4.25867707e-01, 7.88187146e-01,
        4.11569238e-01, 4.81026262e-01, 1.81628838e-01, 3.21318895e-01,
        8.45533013e-01, 1.86903745e-01],
       [4.17291075e-01, 9.89034534e-01, 2.36599818e-01, 9.16832328e-01,
        9.18397486e-01, 9.12963450e-02, 4.63652730e-01, 5.02216339e-01,
        3.13668936e-01, 4.73395362e-02],
       [2.41685644e-01, 9.55296382e-02, 2.38249913e-01, 8.07791114e-01,
        8.94978285e-01, 4.32228930e-02, 3.01946849e-01, 9.80582178e-01,
        5.39504826e-01, 6.26309335e-01],
       [5.54540846e-03, 4.84909445e-01, 9.88328516e-01, 3.75185519e-01,
        9.70381573e-02, 4.61908758e-01, 9.63004470e-01, 3.41830611e-01,
        7.98922718e-01, 7.98846304e-01],
       [2.08248302e-01, 4.43367690e-01, 7.15601265e-01, 4.10519779e-01,
        1.91006958e-01, 9.67494309e-01, 6.50750339e-01, 8.65459859e-01,
        2.52423584e-02, 2.66905814e-01],
       [5.02071083e-01, 6.74486384e-02, 9.93033290e-01, 2.36462399e-01,
        3.74292195e-01, 2.14011908e-01, 1.05445869e-01, 2.32479781e-01,
        3.00610125e-01, 6.34442270e-01],
       [2.81234771e-01, 3.62276763e-01, 5.94284385e-03, 3.65719140e-01,
        5.33885956e-01, 1.62015840e-01, 5.97433090e-01, 2.93152481e-01,
        6.32050514e-01, 2.61966046e-02],
       [8.87593448e-01, 1.61186308e-02, 1.26958027e-01, 7.77162433e-01,
        4.58952338e-02, 7.10998714e-01, 9.71046150e-01, 8.71682942e-01,
        7.10161626e-01, 9.58509743e-01],
       [4.29813325e-01, 8.72878909e-01, 3.55957657e-01, 9.29763675e-01,
        1.48777649e-01, 9.40029025e-01, 8.32716227e-01, 8.46054852e-01,
        1.23923011e-01, 5.96486926e-01],
       [1.63924806e-02, 7.21184373e-01, 7.73751410e-03, 8.48222747e-02,
        2.25498408e-01, 8.75124514e-01, 3.63576323e-01, 5.39959908e-01,
        5.68103194e-01, 2.25463361e-01],
       [5.72146773e-01, 6.60951793e-01, 2.98245400e-01, 4.18626845e-01,
        4.53088939e-01, 9.32350636e-01, 5.87493777e-01, 9.48252380e-01,
        5.56034744e-01, 5.00561416e-01],
       [3.53221106e-03, 4.80889052e-01, 9.27455008e-01, 1.98365688e-01,
        5.20911328e-02, 4.06778902e-01, 3.72396469e-01, 8.57153058e-01,
        2.66111158e-02, 9.20149207e-01],
       [6.80903018e-01, 9.04226005e-01, 6.07529044e-01, 8.11953306e-01,
        3.35543871e-01, 3.49566221e-01, 3.89874220e-01, 7.54797101e-01,
        3.69291186e-01, 2.42219806e-01],
       [9.37668383e-01, 9.08011079e-01, 3.48797321e-01, 6.34638071e-01,
        2.73842216e-01, 2.06115127e-01, 3.36339533e-01, 3.27099890e-01,
        8.82276118e-01, 8.22303832e-01],
       [7.09623218e-01, 9.59345222e-01, 4.22543347e-01, 2.45033041e-01,
        1.17398441e-01, 3.01053345e-01, 1.45263731e-01, 9.21861008e-02,
        6.02932215e-01, 3.64187449e-01],
       [5.64570367e-01, 1.91335723e-01, 6.76905870e-01, 2.15505451e-01,
        2.78023601e-01, 7.41760433e-01, 5.59737921e-01, 3.34836423e-01,
        5.42988777e-01, 6.93984687e-01],
       [9.12132144e-01, 5.80713212e-01, 2.32686386e-01, 7.46697605e-01,
        7.77769029e-01, 2.00401321e-01, 8.20574224e-01, 4.64934856e-01,
        7.79766679e-01, 2.37478226e-01],
       [3.32580268e-01, 9.53697145e-01, 6.57815099e-01, 7.72877812e-01,
        6.88374341e-01, 2.04304114e-01, 4.70688760e-01, 8.08963895e-01,
        6.75035119e-01, 6.02788571e-03],
       [8.74077454e-02, 3.46794724e-01, 9.44365561e-01, 4.91190493e-01,
        2.70176262e-01, 3.60423714e-01, 2.10652635e-01, 4.21200067e-01,
        2.18035445e-01, 8.45752478e-01],
       [4.56270605e-01, 2.79802024e-01, 9.32891667e-01, 3.14351350e-01,
        9.09714639e-01, 4.34180908e-02, 7.07115054e-01, 4.83889043e-01,
        4.44221050e-01, 3.63233462e-02],
       [4.06831913e-02, 3.32753628e-01, 9.47119534e-01, 6.17659986e-01,
        3.68874848e-01, 6.11977041e-01, 2.06131533e-01, 1.65066436e-01,
        3.61817271e-01, 8.63353372e-01],
       [5.09401739e-01, 2.96901524e-01, 9.50251639e-01, 8.15966070e-01,
        3.22973937e-01, 9.72098231e-01, 9.87351120e-01, 4.08660144e-01,
        6.55923128e-01, 4.05653208e-01],
       [2.57348120e-01, 8.26526731e-02, 2.63610333e-01, 2.71479845e-01,
        3.98639083e-01, 1.84886038e-01, 9.53818381e-01, 1.02879882e-01,
        6.25208557e-01, 4.41697389e-01],
       [4.23518062e-01, 3.71991783e-01, 8.68314683e-01, 2.80476987e-01,
        2.05761567e-02, 9.18097019e-01, 8.64480257e-01, 2.76901782e-01,
        5.23487568e-01, 1.09088197e-01],
       [9.34270695e-02, 8.37466121e-01, 4.10265714e-01, 6.61716521e-01,
        9.43200588e-01, 2.45130599e-01, 1.31598311e-02, 2.41484065e-02,
        7.09385693e-01, 9.24551904e-01],
       [4.67330277e-01, 3.75109136e-01, 5.42860448e-01, 8.58916819e-01,
        6.52153850e-01, 2.32979894e-01, 7.74580181e-01, 1.34613499e-01,
        1.65559977e-01, 6.12682283e-01],
       [2.38783404e-01, 7.04778552e-01, 3.49518538e-01, 2.77423948e-01,
        9.98918414e-01, 4.06161249e-02, 6.45822525e-01, 3.86995859e-02,
        7.60210276e-01, 2.30089962e-01],
       [8.98318663e-02, 6.48449719e-01, 7.32601225e-01, 6.78095341e-01,
        5.19009456e-02, 2.94306934e-01, 4.51088339e-01, 2.87103295e-01,
        8.10513437e-01, 1.31115109e-01],
       [6.12179339e-01, 9.88214970e-01, 9.02556539e-01, 2.22157061e-01,
        8.18876142e-05, 9.80597317e-01, 8.82712960e-01, 9.19472456e-01,
        4.15503561e-01, 7.44615436e-01],
       [2.12831497e-01, 3.92304063e-01, 8.51548076e-01, 1.27612218e-01,
        8.93865347e-01, 4.96507972e-01, 4.26095665e-01, 3.05646390e-01,
        9.16848779e-01, 5.17623484e-01],
       [8.04026365e-01, 8.57651770e-01, 9.22382355e-01, 3.03380728e-01,
        3.39810848e-01, 5.95073879e-01, 4.41324145e-01, 9.32842553e-01,
        3.97564054e-01, 4.77778047e-01],
       [6.17186069e-01, 4.04739499e-01, 9.92478430e-01, 9.88512859e-02,
        2.20603317e-01, 3.22655141e-01, 1.47722840e-01, 2.84219235e-01,
        7.79245317e-01, 5.22891998e-01],
       [3.39536369e-02, 9.82622564e-01, 6.16006494e-01, 5.89394793e-02,
        6.61168754e-01, 3.78369361e-01, 1.35673299e-01, 5.63664615e-01,
        7.27079928e-01, 6.71126604e-01],
       [2.47513160e-01, 5.24866223e-01, 5.37663460e-01, 7.16803372e-01,
        3.59867334e-01, 7.97732592e-01, 6.27921820e-01, 3.83316055e-02,
        5.46479046e-01, 8.61912072e-01],
       [5.67574143e-01, 1.75828263e-01, 5.10376394e-01, 7.56945848e-01,
        1.10105194e-01, 8.17099094e-01, 1.67481646e-01, 5.34076512e-01,
        3.85743469e-01, 2.48623773e-01],
       [6.47432506e-01, 3.73921096e-02, 7.60045826e-01, 5.26940644e-01,
        8.75771224e-01, 5.20718336e-01, 3.50331701e-02, 1.43600971e-01,
        7.95604587e-01, 4.91976053e-01],
       [4.41879272e-01, 3.18434775e-01, 2.84549206e-01, 9.65886295e-01,
        4.32969332e-01, 8.84003043e-01, 6.48163140e-01, 8.58427644e-01,
        8.52449536e-01, 9.56312001e-01],
       [6.97942257e-01, 8.05396914e-01, 7.33127892e-01, 6.05226815e-01,
        7.17354119e-01, 7.15750396e-01, 4.09077927e-02, 5.16110837e-01,
        7.92651355e-01, 2.42962182e-01],
       [4.65147972e-01, 4.34985697e-01, 4.02787179e-01, 1.21839531e-01,
        5.25711536e-01, 4.46248353e-01, 6.63392782e-01, 5.49413085e-01,
        2.75429301e-02, 3.19179893e-02],
       [7.01359808e-01, 7.07581103e-01, 9.59939122e-01, 8.76704693e-01,
        4.68059659e-01, 6.25906527e-01, 4.57181722e-01, 2.22946241e-01,
        3.76677006e-01, 1.03884235e-01],
       [6.66527092e-01, 1.92030147e-01, 4.75467801e-01, 9.67436612e-01,
        3.16689312e-02, 1.51729956e-01, 2.98579186e-01, 9.41806972e-01,
        9.08841789e-01, 1.62000835e-01],
       [9.81117785e-01, 7.50747502e-01, 5.39977074e-01, 9.31702912e-01,
        8.80607128e-01, 3.91316503e-01, 6.56343222e-01, 6.47385120e-01,
        3.26968193e-01, 1.79390177e-01],
       [4.66809869e-01, 2.63281047e-01, 3.55065137e-01, 9.54143941e-01,
        4.61137861e-01, 6.84891462e-01, 3.36229891e-01, 9.95861053e-01,
        6.58767581e-01, 1.96009472e-01],
       [9.81839970e-02, 9.43180561e-01, 9.44777846e-01, 6.21328354e-01,
        1.69914998e-02, 2.25534886e-01, 8.01276803e-01, 8.75459850e-01,
        4.53989804e-01, 3.65520626e-01],
       [2.74224997e-01, 1.16970517e-01, 1.15744539e-01, 9.52602684e-01,
        8.08626115e-01, 1.64779365e-01, 2.07050055e-01, 6.55551553e-01,
        7.64664233e-01, 8.10314834e-01],
       [1.63337693e-01, 9.84128296e-01, 2.27802068e-01, 5.89415431e-01,
        5.87615728e-01, 9.67361867e-01, 6.57667458e-01, 5.84904253e-01,
        5.18772602e-01, 7.64657557e-01],
       [1.06055260e-01, 2.09190114e-03, 9.52488840e-01, 4.98657674e-01,
        3.28335375e-01, 3.68053257e-01, 8.03843319e-01, 3.82370204e-01,
        7.70169199e-01, 4.40461993e-01],
       [8.44077468e-01, 7.62040615e-02, 4.81128335e-01, 4.66849715e-01,
        2.64327973e-01, 9.43614721e-01, 9.05028462e-01, 4.43596303e-01,
        9.71596092e-02, 2.06783146e-01],
       [2.71491826e-01, 4.84219760e-01, 3.38377118e-01, 7.74136066e-01,
        4.76026595e-01, 8.70370507e-01, 9.95781779e-01, 2.19835952e-01,
        6.11671388e-01, 8.47502291e-01],
       [9.45236623e-01, 2.90086418e-01, 7.27042735e-01, 1.50161488e-02,
        8.79142463e-01, 6.39385507e-02, 7.33395398e-01, 9.94610369e-01,
        5.01189768e-01, 2.09333986e-01],
       [5.94643593e-01, 6.24149978e-01, 6.68072760e-01, 1.72611743e-01,
        8.98712695e-01, 6.20991349e-01, 4.35687043e-02, 6.84041083e-01,
        1.96084052e-01, 2.73407809e-02],
       [5.50953269e-01, 8.13313663e-01, 8.59941125e-01, 1.03520922e-01,
        6.63042784e-01, 7.10075200e-01, 2.94516981e-01, 9.71364021e-01,
        2.78687477e-01, 6.99821860e-02],
       [5.19280374e-01, 6.94314897e-01, 2.44659781e-01, 3.38582188e-01,
        5.63627958e-01, 8.86678159e-01, 7.47325897e-01, 2.09591955e-01,
        2.51777083e-01, 5.23880661e-01],
       [7.68958688e-01, 6.18761778e-01, 5.01324296e-01, 5.97125351e-01,
        7.56060004e-01, 5.37079811e-01, 8.97752762e-01, 9.47067499e-01,
        9.15354490e-01, 7.54518330e-01],
       [2.46321008e-01, 3.85271460e-01, 2.79999942e-01, 6.57660246e-01,
        3.24221611e-01, 7.54391611e-01, 1.13509081e-01, 7.75364757e-01,
        5.85901976e-01, 8.35388660e-01],
       [4.30875659e-01, 6.24964476e-01, 5.54412127e-01, 9.75671291e-01,
        7.55474389e-01, 5.44813275e-01, 1.74032092e-01, 9.04114246e-01,
        2.05837786e-01, 6.50043249e-01],
       [9.36471879e-01, 2.23579630e-01, 2.25923538e-01, 8.51818919e-01,
        8.27655017e-01, 3.51703346e-01, 2.65096277e-01, 1.27388477e-01,
        9.87936080e-01, 8.35343122e-01],
       [8.99391592e-01, 5.13679326e-01, 1.14384830e-01, 5.25803380e-02,
        3.30582112e-01, 9.20330405e-01, 9.47581828e-01, 8.41163874e-01,
        1.58679143e-01, 4.19923156e-01],
       [2.46242926e-01, 2.05349773e-01, 6.84825838e-01, 4.86111671e-01,
        3.24909657e-01, 1.00214459e-01, 5.44763386e-01, 3.47025156e-01,
        3.91095817e-01, 3.10508728e-01],
       [3.87195200e-01, 5.55859566e-01, 1.41438060e-02, 8.47647011e-01,
        9.21919882e-01, 5.50529718e-01, 2.68021107e-01, 9.90239024e-01,
        3.83194029e-01, 6.93655372e-01],
       [6.89952552e-01, 4.34309065e-01, 1.99158162e-01, 9.66579378e-01,
        6.36908561e-02, 4.85149384e-01, 2.20730707e-01, 2.93974131e-01,
        8.28527331e-01, 3.67265552e-01],
       [8.33482668e-02, 1.96309000e-01, 8.60373437e-01, 9.77028847e-01,
        2.67982155e-01, 6.75408959e-01, 8.11989978e-02, 7.23465621e-01,
        4.16436613e-01, 9.18159902e-01],
       [3.11536163e-01, 9.41466987e-01, 5.03247440e-01, 3.48892927e-01,
        6.47019625e-01, 2.49746203e-01, 2.29763597e-01, 1.96346447e-01,
        9.59899545e-01, 4.92913723e-01],
       [7.51614988e-01, 4.73991871e-01, 5.87540150e-01, 5.84138989e-01,
        9.79886293e-01, 6.68433130e-01, 2.39769474e-01, 1.51976589e-02,
        2.18682140e-01, 4.55519646e-01],
       [3.93420339e-01, 8.12326252e-01, 7.85556734e-01, 8.90959650e-02,
        9.52010751e-01, 5.27456701e-01, 5.96403956e-01, 4.05056775e-01,
        6.49500966e-01, 8.71326327e-01],
       [6.73935950e-01, 9.70098555e-01, 7.01122224e-01, 8.21720719e-01,
        4.50395830e-02, 6.72698498e-01, 6.54752672e-01, 1.01746053e-01,
        8.42387497e-01, 6.14172399e-01],
       [9.83280912e-02, 5.94467103e-01, 4.78415847e-01, 2.33293563e-01,
        1.97560899e-02, 3.65567267e-01, 6.19851053e-01, 3.29279125e-01,
        3.07254642e-01, 7.51121223e-01],
       [7.58624673e-01, 7.18765855e-01, 1.01181954e-01, 5.16165972e-01,
        5.57798684e-01, 7.44804502e-01, 9.03177738e-01, 3.69038880e-01,
        4.28663462e-01, 7.32767463e-01],
       [6.62636399e-01, 5.57869911e-01, 3.50139618e-01, 1.95352346e-01,
        1.83807373e-01, 8.15832913e-02, 8.12008530e-02, 8.45798194e-01,
        3.83672744e-01, 6.07396215e-02],
       [8.96425664e-01, 2.23270476e-01, 2.68124431e-01, 1.94497839e-01,
        9.67501044e-01, 1.12540089e-01, 7.22163260e-01, 9.32088733e-01,
        6.68001294e-01, 8.58726621e-01],
       [2.42447108e-01, 6.73927963e-01, 7.00871348e-01, 4.58332509e-01,
        8.70545626e-01, 6.94386125e-01, 8.94877791e-01, 7.53204346e-01,
        5.20290434e-01, 4.98688221e-01],
       [4.53727633e-01, 2.16468628e-02, 5.35141408e-01, 4.22973245e-01,
        1.57533601e-01, 1.19069695e-01, 4.49351877e-01, 3.99130546e-02,
        9.86579895e-01, 3.78120929e-01],
       [3.82109195e-01, 5.11263013e-02, 4.26672339e-01, 1.57454368e-02,
        3.00936326e-02, 3.39099228e-01, 8.20968926e-01, 4.58821088e-01,
        1.48405796e-02, 1.63220033e-01],
       [7.39922702e-01, 7.38293707e-01, 7.54522920e-01, 3.51669371e-01,
        3.52276951e-01, 8.02075684e-01, 3.98137897e-01, 7.27191031e-01,
        5.81122994e-01, 3.64341676e-01],
       [8.00065175e-02, 1.16125375e-01, 8.89558733e-01, 4.52340513e-01,
        9.94004548e-01, 3.63896936e-01, 2.49954298e-01, 3.50539327e-01,
        3.43086094e-01, 6.37356758e-01],
       [1.27375638e-02, 7.63268650e-01, 4.16414618e-01, 4.32239205e-01,
        4.81115013e-01, 4.49212462e-01, 4.97470886e-01, 3.45904320e-01,
        4.53346133e-01, 4.04651344e-01],
       [5.18242717e-01, 6.23269081e-01, 2.41040602e-01, 5.08437157e-01,
        5.94621897e-01, 1.69483144e-02, 5.20493746e-01, 2.39293247e-01,
        4.04538542e-01, 8.26530159e-01],
       [3.26235592e-01, 4.83216912e-01, 2.47411542e-02, 3.08750868e-01,
        6.39721096e-01, 3.15161765e-01, 2.05797508e-01, 2.90655673e-01,
        9.54378307e-01, 8.68018195e-02],
       [4.63357776e-01, 5.83869033e-02, 5.38658261e-01, 1.46035731e-01,
        6.34084821e-01, 2.64397472e-01, 6.90915406e-01, 3.47146064e-01,
        4.16848855e-03, 2.94894695e-01]],dtype=np.float32)

# Define the first dense layer
dense1 = tf.keras.layers.Dense(7, activation='sigmoid')(borrower_features)

# Define a dense layer with 3 output nodes
dense2 = tf.keras.layers.Dense(3,activation="sigmoid")(dense1)

# Define a dense layer with 1 output node
predictions = tf.keras.layers.Dense(1,activation="sigmoid")(dense2)

# Print the shapes of dense1, dense2, and predictions
print('\n shape of dense1: ', dense1.shape)
print('\n shape of dense2: ', dense2.shape)
print('\n shape of predictions: ', predictions.shape)


 shape of dense1:  (100, 7)

 shape of dense2:  (100, 3)

 shape of predictions:  (100, 1)


### 4.2 Activation Functions

A typical hidden layer consists of 2 operations:       
1) Linear: performs matrix multiplication, which is a linear operation.         
2) Nonlinear: activation function.         
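
The two steps can be sketched without TensorFlow. This is only an illustration for a single example with hand-picked weights; `dense_forward` and `relu` are hypothetical helpers written for this note, not Keras functions.

```python
def relu(z):
    """Nonlinear step used in this sketch: negative values become 0."""
    return max(z, 0.0)

def dense_forward(inputs, weights, bias, activation):
    """Apply one dense layer to a single example.

    inputs: list of n features; weights: n x m nested list; bias: list of m.
    """
    n_out = len(bias)
    # 1) Linear step: matrix multiplication plus bias.
    linear = [
        sum(inputs[i] * weights[i][j] for i in range(len(inputs))) + bias[j]
        for j in range(n_out)
    ]
    # 2) Nonlinear step: element-wise activation.
    return [activation(z) for z in linear]

# Two features (age, bill amount) feeding a layer with two output nodes.
inputs = [0.3, 0.5]
weights = [[1.0, -1.0], [2.0, -2.0]]
bias = [0.0, 0.0]
outputs = dense_forward(inputs, weights, bias, relu)
print(outputs)
```

The first node sees a positive weighted sum and passes it through; the second sees a negative one and is clipped to 0 by the activation.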



In [62]:
# a simple example where we assume that the weight on age is 1 and the weight on the bill amount is 2

# Define example borrower features
young, old = 0.3, 0.6
low_bill, high_bill = 0.1, 0.5

# apply the matrix multiplication step for all feature combinations
young_high = 1.0*young + 2.0*high_bill
young_low = 1.0*young + 2.0*low_bill
old_high = 1.0*old + 2.0*high_bill
old_low = 1.0*old + 2.0*low_bill

# Difference in default predictions for young
print(young_high - young_low)

# Difference in default predictions for old
print(old_high - old_low)

0.8
0.8


In [63]:
# with activation function
print(tf.keras.activations.sigmoid(young_high).numpy()-tf.keras.activations.sigmoid(young_low).numpy())

print(tf.keras.activations.sigmoid(old_high).numpy()-tf.keras.activations.sigmoid(old_low).numpy())

0.16337568
0.14204389


The most common activation functions are sigmoid, ReLU and softmax.        

The sigmoid activation function is used primarily in the output layer of binary classification problems.     
Low-level: tf.keras.activations.sigmoid()         
High-level: sigmoid       

The ReLU activation function (rectified linear unit) is typically used in all layers other than the output layer. It simply takes the maximum of the value passed to it and 0, so negative inputs are mapped to 0.          
Low-level: tf.keras.activations.relu()         
High-level: relu         

The softmax activation function is used in the output layer in classification problems with more than two classes. The outputs from a softmax activation function can be interpreted as predicted class probabilities in multiclass classification problems.      
Low-level: tf.keras.activations.softmax()       
High-level: softmax          
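
As a rough sketch of what the three activations compute, the plain-Python versions below follow the textbook definitions. They are illustrative stand-ins, not the TensorFlow implementations. The first two printed values should match the sigmoid differences from the age / bill example above up to float32 rounding.

```python
import math

def sigmoid(x):
    """Squash any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

def relu(x):
    """Return x for positive inputs and 0 otherwise."""
    return max(x, 0.0)

def softmax(values):
    """Turn a list of scores into probabilities that sum to 1."""
    # Subtract the maximum before exponentiating for numerical stability.
    shift = max(values)
    exps = [math.exp(v - shift) for v in values]
    total = sum(exps)
    return [e / total for e in exps]

# Same weights as the example above: 1 on age, 2 on the bill amount.
young_diff = sigmoid(0.3 + 2.0 * 0.5) - sigmoid(0.3 + 2.0 * 0.1)
old_diff = sigmoid(0.6 + 2.0 * 0.5) - sigmoid(0.6 + 2.0 * 0.1)
print(young_diff, old_diff)

print(relu(-2.0), relu(3.0))

probs = softmax([1.0, 2.0, 3.0])
print(probs, sum(probs))
```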



In [64]:
# Define input layer
inputs = tf.constant(borrower_features, dtype = tf.float32)

# Define dense layer 1
dense1 = tf.keras.layers.Dense(16, activation="relu")(inputs)

# Define dense 2
dense2 = tf.keras.layers.Dense(8, activation="sigmoid")(dense1)

# Define output layer
outputs = tf.keras.layers.Dense(4, activation="softmax")(dense2)

In [65]:
print(outputs)

tf.Tensor(
[[0.1105312  0.19927394 0.20831117 0.48188367]
 [0.11558039 0.19875197 0.20812516 0.47754243]
 [0.11165091 0.19399762 0.20439456 0.48995692]
 [0.11063999 0.19841023 0.20914055 0.48180923]
 [0.11378781 0.20506497 0.20466019 0.476487  ]
 [0.11229768 0.19330044 0.20363392 0.490768  ]
 [0.11060317 0.19343947 0.19988917 0.4960683 ]
 [0.10830142 0.1926553  0.20854698 0.49049637]
 [0.11424928 0.19637963 0.20755497 0.48181608]
 [0.1126796  0.2006236  0.21552572 0.47117105]
 [0.11160199 0.19368425 0.20495002 0.48976383]
 [0.1067858  0.20228057 0.21580915 0.47512448]
 [0.11331404 0.20366342 0.2087831  0.47423944]
 [0.10725141 0.20448974 0.21608381 0.472175  ]
 [0.10671645 0.18948689 0.20531829 0.4984784 ]
 [0.11295969 0.19189045 0.20054823 0.49460164]
 [0.1190635  0.18990554 0.19749677 0.4935342 ]
 [0.10769681 0.20565833 0.21392554 0.4727194 ]
 [0.11045446 0.20933622 0.21675651 0.46345285]
 [0.11290117 0.20186439 0.21643701 0.46879745]
 [0.10897365 0.20334868 0.21717392 0.47050372]
 [

In this exercise, you will again make use of credit card data. The target variable, default, indicates whether a credit card holder defaults on her payment in the following period. Since there are only two options (default or not), this is a binary classification problem. While the dataset has many features, you will focus on just three: the size of the three latest credit card bills. Finally, you will compute predictions from your untrained network, outputs, and compare those to the target variable, default.

In [67]:
cc_df = pd.read_csv("../Machine_Learning_basics/data/uci_credit_card.csv")
cc_df.head()

Unnamed: 0,ID,LIMIT_BAL,SEX,EDUCATION,MARRIAGE,AGE,PAY_0,PAY_2,PAY_3,PAY_4,...,BILL_AMT4,BILL_AMT5,BILL_AMT6,PAY_AMT1,PAY_AMT2,PAY_AMT3,PAY_AMT4,PAY_AMT5,PAY_AMT6,default.payment.next.month
0,1,20000.0,2,2,1,24,2,2,-1,-1,...,0.0,0.0,0.0,0.0,689.0,0.0,0.0,0.0,0.0,1
1,2,120000.0,2,2,2,26,-1,2,0,0,...,3272.0,3455.0,3261.0,0.0,1000.0,1000.0,1000.0,0.0,2000.0,1
2,3,90000.0,2,2,2,34,0,0,0,0,...,14331.0,14948.0,15549.0,1518.0,1500.0,1000.0,1000.0,1000.0,5000.0,0
3,4,50000.0,2,2,1,37,0,0,0,0,...,28314.0,28959.0,29547.0,2000.0,2019.0,1200.0,1100.0,1069.0,1000.0,0
4,5,50000.0,1,2,1,57,-1,0,-1,0,...,20940.0,19146.0,19131.0,2000.0,36681.0,10000.0,9000.0,689.0,679.0,0


In [70]:
bill_amounts = np.array(cc_df[["BILL_AMT1","BILL_AMT2","BILL_AMT3"]],np.float32)

In [78]:
default = np.array(cc_df[["default.payment.next.month"]],np.int64)

In [80]:
# Construct input layer from features
inputs = tf.constant(bill_amounts)

# Define first dense layer
dense1 = tf.keras.layers.Dense(3, activation='relu')(inputs)

# Define second dense layer
dense2 = tf.keras.layers.Dense(2,activation="relu")(dense1)

# Define output layer
outputs = tf.keras.layers.Dense(1,activation="sigmoid")(dense2)

# Print error for first five examples
error = default[:5] - outputs.numpy()[:5]
print(error)

[[ 0.        ]
 [ 0.18238378]
 [-1.        ]
 [-1.        ]
 [-1.        ]]


Multiclass classification problems       
In this exercise, we expand beyond binary classification to cover multiclass problems. A multiclass problem has targets that can take on three or more values. In the credit card dataset, the education variable can take on 6 different values, each corresponding to a different level of education. We will use that as our target in this exercise and will also expand the feature set from 3 to 10 columns.              

As in the previous problem, you will define an input layer, dense layers, and an output layer. You will also print the untrained model's predictions, which are probabilities assigned to the classes. The tensor of features has been loaded and is available as borrower_features.

In [81]:
borrower_features = np.array(cc_df[["BILL_AMT1","BILL_AMT2","BILL_AMT3","BILL_AMT4","BILL_AMT5","BILL_AMT6","PAY_AMT1","PAY_AMT2","PAY_AMT3","PAY_AMT4","PAY_AMT5"]],np.float32)

In [85]:
# Construct input layer from borrower features
inputs = tf.constant(borrower_features)

# Define first dense layer
dense1 = tf.keras.layers.Dense(10, activation='sigmoid')(inputs)

# Define second dense layer
dense2 = tf.keras.layers.Dense(8,activation="relu")(dense1)

# Define output layer
outputs = tf.keras.layers.Dense(6,activation="softmax")(dense2)

# Print first five predictions
print(outputs.numpy()[:5])

[[0.16175532 0.21448301 0.1641436  0.15772898 0.13908899 0.16280012]
 [0.17216966 0.17142151 0.15821598 0.17170802 0.1622757  0.16420914]
 [0.18177326 0.1902991  0.15739845 0.1547137  0.14012657 0.17568892]
 [0.20669936 0.20017597 0.11092839 0.20265287 0.1336803  0.14586306]
 [0.17216966 0.17142151 0.15821598 0.17170802 0.1622757  0.16420914]]


### 4.3 Optimisers

In a neural network, we pick a point, measure the loss, and then try to move to a point with a lower loss.        

Stochastic gradient descent (SGD) optimiser       
tf.keras.optimizers.SGD()         
learning_rate (typically between 0.001 and 0.5)       

RMS propagation (RMSprop) optimiser        
Applies a different learning rate to each feature, which is useful for high-dimensional problems.        
tf.keras.optimizers.RMSprop()            
learning_rate       
momentum (allows the optimiser to break through local minima)         
decay (a low value prevents momentum from accumulating over long periods during training)       

Adam (adaptive moment estimation) optimiser        
tf.keras.optimizers.Adam()           
learning_rate       
beta_1 (plays a role similar to decay; a lower beta_1 makes momentum decay faster)       
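
To make the difference between the update rules concrete, here is a minimal sketch on a toy one-dimensional loss, `(x - 3)^2`. The functions use the textbook forms of plain gradient descent and RMSprop (without momentum). The names `sgd`, `rmsprop`, `rho` and `epsilon` are choices made for this illustration and are not guaranteed to mirror the Keras internals.

```python
import math

def grad(x):
    """Gradient of the toy loss (x - 3)^2."""
    return 2.0 * (x - 3.0)

def sgd(x, learning_rate=0.1, steps=100):
    """Plain gradient descent: step against the gradient."""
    for _ in range(steps):
        x -= learning_rate * grad(x)
    return x

def rmsprop(x, learning_rate=0.01, rho=0.9, epsilon=1e-7, steps=1000):
    """RMSprop: scale each step by a running average of squared gradients."""
    avg_sq = 0.0
    for _ in range(steps):
        g = grad(x)
        avg_sq = rho * avg_sq + (1.0 - rho) * g * g
        x -= learning_rate * g / (math.sqrt(avg_sq) + epsilon)
    return x

x_sgd = sgd(0.0)
x_rms = rmsprop(0.0)
print(x_sgd, x_rms)
```

Both runs should end close to the minimum at `x = 3`. Because RMSprop normalises the gradient, its steps stay roughly the size of the learning rate, so it hovers near the minimum rather than settling on it exactly.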


In [None]:
# define model function
def model(bias, weights, features=borrower_features):
    product = tf.matmul(features, weights)
    return tf.keras.activations.sigmoid(product + bias)

# binary cross-entropy is the standard loss for binary classification problems
def loss_function(bias, weights, targets=default, features=borrower_features):
    predictions = model(bias, weights, features)
    return tf.keras.losses.binary_crossentropy(targets, predictions)

# initialise the trainable variables: one weight per feature column and a scalar bias
weights = tf.Variable(tf.random.normal([borrower_features.shape[1], 1]))
bias = tf.Variable(0.0)

# minimise the loss function with RMS propagation
opt = tf.keras.optimizers.RMSprop(learning_rate=0.01, momentum=0.9)
opt.minimize(lambda: loss_function(bias, weights), var_list=[bias, weights])
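
For intuition about the loss being minimised, the sketch below computes binary cross-entropy by hand in plain Python. It follows the standard formula, averaged over examples. The clipping constant `epsilon` is an assumption added here to avoid `log(0)`, and the function is a stand-in for illustration rather than the Keras implementation.

```python
import math

def binary_crossentropy(targets, predictions, epsilon=1e-7):
    """Mean of -[y*log(p) + (1-y)*log(1-p)] over all examples."""
    total = 0.0
    for y, p in zip(targets, predictions):
        # Clip predictions away from 0 and 1 so the logarithm stays finite.
        p = min(max(p, epsilon), 1.0 - epsilon)
        total += -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(targets)

# Confident and correct predictions give a small loss ...
loss_good = binary_crossentropy([1, 0], [0.9, 0.1])
# ... confident and wrong predictions give a large one.
loss_bad = binary_crossentropy([1, 0], [0.1, 0.9])
print(loss_good, loss_bad)
```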

### 4.4 Training a network in TensorFlow

We often need to initialise hundreds or thousands of variables, mainly the weights.    
We use random or algorithmic generation of initial values for gradient descent.       

We can draw them from a probability distribution, such as the normal or uniform distribution. There are also specialised options, such as the Glorot initialiser, which are designed for ML algorithms.        
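
As an illustration of what a specialised initialiser does, the sketch below draws weights the way the Glorot uniform scheme is usually described: uniformly from `[-limit, limit]` with `limit = sqrt(6 / (fan_in + fan_out))`. The helper `glorot_uniform` and its `seed` argument are hypothetical names for this note, not the Keras API.

```python
import math
import random

def glorot_uniform(fan_in, fan_out, seed=0):
    """Return a fan_in x fan_out nested list of Glorot-uniform weights."""
    rng = random.Random(seed)
    # The range shrinks as the layer gets wider, keeping activations in scale.
    limit = math.sqrt(6.0 / (fan_in + fan_out))
    return [
        [rng.uniform(-limit, limit) for _ in range(fan_out)]
        for _ in range(fan_in)
    ]

# Weights for a layer mapping 10 input features to 7 output nodes.
weights = glorot_uniform(10, 7)
limit = math.sqrt(6.0 / (10 + 7))
print(len(weights), len(weights[0]), limit)
```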



In [None]:
# low level approach to initialise a 500x500 variable

# Define 500x500 random normal variable
weights = tf.Variable(tf.random.normal([500,500]))

# Define 500x500 truncated random normal variable
# draws more than two standard deviations from the mean are discarded and redrawn
weights = tf.Variable(tf.random.truncated_normal([500,500]))



In [None]:
# high level approach: initialise a dense layer using the default Keras options,
# currently the Glorot uniform initialiser.

# Define a dense layer with the default initialiser
dense = tf.keras.layers.Dense(32,activation="relu")

# Define a dense layer with the zeros initialiser
dense = tf.keras.layers.Dense(32,activation="relu",kernel_initializer="zeros")


For neural networks, one solution to the overfitting problem is dropout, an operation that randomly drops the weights connected to certain nodes in a layer during the training process. This forces the network to develop more robust rules for classification, since it cannot rely on any particular node being passed to an activation function. This tends to improve out-of-sample performance.      
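
The mechanics can be sketched in plain Python. This follows the common "inverted dropout" formulation, in which each value is zeroed with probability `rate` and the survivors are scaled by `1 / (1 - rate)` so the expected sum is unchanged. The helper `dropout` and its `seed` argument are hypothetical names for this illustration.

```python
import random

def dropout(values, rate, training=True, seed=None):
    """Randomly zero a fraction of values during training only."""
    if not training or rate == 0.0:
        # At prediction time the layer passes its inputs through untouched.
        return list(values)
    rng = random.Random(seed)
    keep_scale = 1.0 / (1.0 - rate)
    return [0.0 if rng.random() < rate else v * keep_scale for v in values]

activations = [1.0] * 8
dropped = dropout(activations, rate=0.25, seed=42)
unchanged = dropout(activations, rate=0.25, training=False)
print(dropped)
print(unchanged)
```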



In [None]:
# dropout working

# Define input data
inputs = np.array(borrower_features, np.float32)

# Define dense layer 1
dense1 = tf.keras.layers.Dense(32,activation="relu")(inputs)

# define dense layer 2
dense2 = tf.keras.layers.Dense(16, activation="relu")(dense1)

# apply dropout operation (add a dropout layer)
# drop 25% of the nodes randomly
dropout1 = tf.keras.layers.Dropout(0.25)(dense2)

# Define output layer
outputs = tf.keras.layers.Dense(1, activation="sigmoid")(dropout1)

