<a href="https://colab.research.google.com/github/raj-vijay/dl/blob/master/08_Loss_Functions_in_TensorFlow.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

**Loss Function**

Fundamental TensorFlow operation
- Used to train a model
- Measure of model fit

Higher value -> worse fit
- Minimize the loss function

**Common loss functions in TensorFlow**

TensorFlow has operations for common loss functions
- Mean squared error (MSE)
- Mean absolute error (MAE)
- Huber error

Loss functions are accessible from the tf.keras.losses module
- tf.keras.losses.mse()
- tf.keras.losses.mae()
- tf.keras.losses.Huber() (a class: instantiate it, then call the instance on the targets and predictions)

MSE
- Strongly penalizes outliers
- Gradient is proportional to the error, so it shrinks toward zero near the minimum

MAE
- Scales linearly with size of error
- Gradient has constant magnitude, even near the minimum

Huber
- Similar to MSE near minimum
- Similar to MAE away from minimum
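
The contrast between the three losses is easier to see on numbers. The following is a minimal plain-Python sketch of the formulas, not the TensorFlow implementation. The data are hypothetical, with the last prediction chosen as an outlier, and `delta = 1.0` is an assumed Huber threshold.

```python
# Plain-Python versions of the three losses (illustrative, no TensorFlow needed).
def mse(targets, predictions):
    # Mean of the squared errors
    return sum((t - p) ** 2 for t, p in zip(targets, predictions)) / len(targets)

def mae(targets, predictions):
    # Mean of the absolute errors
    return sum(abs(t - p) for t, p in zip(targets, predictions)) / len(targets)

def huber(targets, predictions, delta=1.0):
    # Quadratic for small errors, linear for errors larger than delta
    total = 0.0
    for t, p in zip(targets, predictions):
        error = abs(t - p)
        if error <= delta:
            total += 0.5 * error ** 2
        else:
            total += delta * (error - 0.5 * delta)
    return total / len(targets)

# Hypothetical data: the last prediction is off by 10 (an outlier)
targets = [1.0, 2.0, 3.0, 4.0]
predictions = [1.5, 2.0, 2.5, 14.0]

mse_value = mse(targets, predictions)
mae_value = mae(targets, predictions)
huber_value = huber(targets, predictions)

print(mse_value, mae_value, huber_value)  # 25.125 2.75 2.4375
```

The single outlier dominates the MSE, while the MAE and Huber values stay close to each other.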

![alt text](https://raw.githubusercontent.com/raj-vijay/dl/master/images/Loss%20Functions.png)
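
One way to read the curves above is through the slope of each per-sample loss with respect to the error. The sketch below writes those derivatives out by hand in plain Python. The function names and `delta = 1.0` are assumptions made for illustration, and the MAE derivative at exactly zero error is set to 0.0 by convention.

```python
def sign(value):
    # Returns -1.0, 0.0 or 1.0
    return (value > 0) - (value < 0) + 0.0

def mse_grad(error):
    # d/de (e^2) = 2e: proportional to the error
    return 2.0 * error

def mae_grad(error):
    # d/de |e| = sign(e): constant magnitude (0.0 at e = 0 by convention)
    return sign(error)

def huber_grad(error, delta=1.0):
    # Like 0.5 * e^2 for small errors, like delta * |e| for large ones
    if abs(error) <= delta:
        return error
    return delta * sign(error)

small, large = 0.1, 5.0
grads_small = (mse_grad(small), mae_grad(small), huber_grad(small))
grads_large = (mse_grad(large), mae_grad(large), huber_grad(large))

print(grads_small)  # (0.2, 1.0, 0.1)
print(grads_large)  # (10.0, 1.0, 1.0)
```

Near the minimum the Huber slope tracks the quadratic loss, and far from it the slope is capped like the MAE slope.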

**Defining Loss Functions**

In [0]:
# Import TensorFlow under standard alias
import tensorflow as tf

In [0]:
# Define a linear regression model
def linear_regression(intercept, slope, features):
  return intercept + features*slope

In [0]:
# Define a loss function to compute the MSE
def loss_function(intercept, slope, targets, features):
  # Compute the predictions for a linear model
  predictions = linear_regression(intercept, slope, features)
  # Return the loss
  return tf.keras.losses.mse(targets, predictions)

In [0]:
# Compute the loss for test data inputs
# loss_function(intercept, slope, test_targets, test_features)
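
The call above is commented out because no test data is defined at this point. To check the arithmetic by hand, here is a plain-Python mirror of the same linear model and MSE loss. The intercept, slope, `test_features` and `test_targets` values are hypothetical, chosen only for illustration.

```python
# Plain-Python mirror of the linear model (no TensorFlow needed)
def linear_regression(intercept, slope, features):
    return [intercept + x * slope for x in features]

# Plain-Python mirror of the MSE loss for the linear model
def loss_function(intercept, slope, targets, features):
    predictions = linear_regression(intercept, slope, features)
    squared_errors = [(t - p) ** 2 for t, p in zip(targets, predictions)]
    return sum(squared_errors) / len(squared_errors)

# Hypothetical parameters and test data
intercept, slope = 1.0, 2.0
test_features = [1.0, 2.0, 3.0, 4.0]
test_targets = [3.5, 5.0, 6.0, 9.0]

# Predictions are 3, 5, 7, 9, so the errors are 0.5, 0, -1, 0
test_loss = loss_function(intercept, slope, test_targets, test_features)
print(test_loss)  # 0.3125
```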

**Loss functions in TensorFlow**

**King County Housing Dataset**

Online property companies offer valuations of houses using machine learning techniques. The aim of this report is to predict house sale prices in King County, Washington State, USA, using Multiple Linear Regression (MLR). The dataset consists of historic data on houses sold between May 2014 and May 2015. We will predict the sale prices of houses in King County with an accuracy of at least 75-80% and understand which factors are responsible for higher property values ($650K and above).

The dataset consists of house prices from King County, an area in the US state of Washington that includes Seattle. It was obtained from Kaggle and released under CC0: Public Domain. Unfortunately, the uploader has not indicated the original source of the data. Please find the citation and database description in the Glossary and Bibliography.

The dataset consists of 21 variables and 21,613 observations.

Install the Kaggle package to access the housing dataset from Kaggle.

In [0]:
!pip install kaggle



Create a .kaggle directory in the home directory (/root on Colab) to hold the Kaggle authentication JSON.

In [0]:
!mkdir -p ~/.kaggle

Copy kaggle.json from /content to ~/.kaggle/kaggle.json

In [0]:
!cp /content/kaggle.json ~/.kaggle/kaggle.json

chmod 600 (equivalent to chmod a+rwx,u-x,g-rwx,o-rwx) sets the permissions so that the (u)ser/owner can read and write but not execute, while the (g)roup and (o)thers cannot read, write, or execute.

In [0]:
!chmod 600 /root/.kaggle/kaggle.json
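
The meaning of mode 600 can also be checked from Python. This is a small sketch that assumes a POSIX system such as the Colab VM. It works on a throwaway stand-in file in a temporary directory rather than the real kaggle.json.

```python
import os
import stat
import tempfile

with tempfile.TemporaryDirectory() as tmp_dir:
    # Stand-in for kaggle.json (hypothetical content)
    path = os.path.join(tmp_dir, "kaggle.json")
    with open(path, "w") as handle:
        handle.write("{}")

    # Same effect as `chmod 600 path`
    os.chmod(path, 0o600)

    file_mode = os.stat(path).st_mode
    mode = stat.S_IMODE(file_mode)          # permission bits only
    mode_string = stat.filemode(file_mode)  # ls-style string

print(oct(mode), mode_string)  # 0o600 -rw-------
```

The `rw-` triple applies to the owner, and the two `---` triples show that group and others have no access.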

Download housing dataset from Kaggle!

In [0]:
!kaggle datasets download -d shivachandel/kc-house-data

Downloading kc-house-data.zip to /content
  0% 0.00/770k [00:00<?, ?B/s]
100% 770k/770k [00:00<00:00, 51.2MB/s]


**Load data using pandas**

In [0]:
# Import pandas under the alias pd
import pandas as pd
import numpy as np

# Assign the path to a string variable named data_path
data_path = '/content/kc-house-data.zip'

# Load the dataset as a dataframe named housing
housing = pd.read_csv(data_path, compression='zip')


In [0]:
housing.head()

Unnamed: 0,id,date,price,bedrooms,bathrooms,sqft_living,sqft_lot,floors,waterfront,view,condition,grade,sqft_above,sqft_basement,yr_built,yr_renovated,zipcode,lat,long,sqft_living15,sqft_lot15
0,7129300520,20141013T000000,221900.0,3,1.0,1180,5650,1.0,0,0,3,7,1180.0,0,1955,0,98178,47.5112,-122.257,1340,5650
1,6414100192,20141209T000000,538000.0,3,2.25,2570,7242,2.0,0,0,3,7,2170.0,400,1951,1991,98125,47.721,-122.319,1690,7639
2,5631500400,20150225T000000,180000.0,2,1.0,770,10000,1.0,0,0,3,6,770.0,0,1933,0,98028,47.7379,-122.233,2720,8062
3,2487200875,20141209T000000,604000.0,4,3.0,1960,5000,1.0,0,0,5,7,1050.0,910,1965,0,98136,47.5208,-122.393,1360,5000
4,1954400510,20150218T000000,510000.0,3,2.0,1680,8080,1.0,0,0,3,8,1680.0,0,1987,0,98074,47.6168,-122.045,1800,7503


In [0]:
price = housing['price']

In [0]:
# Print the price column of housing
print(housing['price'])

0        221900.0
1        538000.0
2        180000.0
3        604000.0
4        510000.0
           ...   
21608    360000.0
21609    400000.0
21610    402101.0
21611    400000.0
21612    325000.0
Name: price, Length: 21613, dtype: float64


In [0]:
# Use the price column itself as the predictions (a perfect fit, so the loss is zero)
predictions = housing['price']

**Loss functions in TensorFlow**

In [0]:
# Import the keras module from tensorflow
from tensorflow import keras

# Compute the mean squared error (mse)
loss = keras.losses.mse(price, predictions)

# Print the mean squared error (mse)
print(loss.numpy())

0.0


In [0]:
# Compute the mean absolute error (mae)
loss = keras.losses.mae(price, predictions)

# Print the mean absolute error (mae)
print(loss.numpy())

0.0
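
Both losses are 0.0 because the predictions are a copy of the price column. As a quick plain-Python check, the sketch below uses the five prices shown by `housing.head()` and a hypothetical set of predictions that are each off by 10,000.

```python
def mse(targets, predictions):
    # Mean of the squared errors
    return sum((t - p) ** 2 for t, p in zip(targets, predictions)) / len(targets)

def mae(targets, predictions):
    # Mean of the absolute errors
    return sum(abs(t - p) for t, p in zip(targets, predictions)) / len(targets)

# First five prices from the housing data
prices = [221900.0, 538000.0, 180000.0, 604000.0, 510000.0]

# Hypothetical predictions: every price overestimated by 10,000
shifted = [p + 10000.0 for p in prices]

# Identical predictions give zero loss
mse_same, mae_same = mse(prices, prices), mae(prices, prices)

# Shifted predictions give a non-zero loss
mse_shifted, mae_shifted = mse(prices, shifted), mae(prices, shifted)

print(mse_same, mae_same)        # 0.0 0.0
print(mse_shifted, mae_shifted)  # 100000000.0 10000.0
```

The MAE stays in the units of the price, while the MSE is in squared units and grows much faster.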


**Modifying the Loss Function**

Here, we compute the loss within another function called loss_function(), which first generates predicted values from the data and variables. 

The purpose of this is to construct a function of the trainable model variables that returns the loss. 

This function can then be evaluated repeatedly for different variable values until the minimum is found.

In practice, this function is passed to an optimizer in TensorFlow.

In [0]:
import tensorflow as tf
from tensorflow import Variable, float32
from tensorflow import keras

In [0]:
features = Variable([1, 2, 3, 4, 5], dtype=tf.float32)
targets = Variable([2., 4., 6., 8., 10.], dtype=tf.float32)

In [0]:
# Initialize a variable named scalar
# (dtype must be passed by keyword; the second positional argument of Variable is trainable)
scalar = Variable(1.0, dtype=float32)

# Define the model
def model(scalar, features = features):
  return scalar * features

# Define a loss function
def loss_function(scalar, features = features, targets = targets):
  # Compute the predicted values
  predictions = model(scalar, features)

  # Return the mean absolute error loss
  return keras.losses.mae(targets, predictions)


In [0]:
# Evaluate the loss function and print the loss
print(loss_function(scalar).numpy())

3.0
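
The idea of evaluating the loss for different variable values until the minimum is found can be sketched with a simple grid search. This is a plain-Python illustration that reuses the features and targets above, not what a TensorFlow optimizer actually does. The candidate grid is an arbitrary assumption.

```python
features = [1.0, 2.0, 3.0, 4.0, 5.0]
targets = [2.0, 4.0, 6.0, 8.0, 10.0]

def model(scalar, features=features):
    # Predictions of the one-parameter model
    return [scalar * x for x in features]

def loss_function(scalar, features=features, targets=targets):
    # Mean absolute error of the model for a given scalar
    predictions = model(scalar, features)
    return sum(abs(t - p) for t, p in zip(targets, predictions)) / len(targets)

# Hypothetical grid of candidate values: 0.0, 0.5, ..., 4.0
candidates = [i * 0.5 for i in range(9)]

# Keep the candidate with the smallest loss
best_scalar = min(candidates, key=loss_function)
best_loss = loss_function(best_scalar)

print(loss_function(1.0))      # 3.0, the value computed above
print(best_scalar, best_loss)  # 2.0 0.0
```

An optimizer typically replaces the fixed grid with gradient-based updates, but the function it minimizes has the same shape: trainable variables in, loss out.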
