# TensorFlow Developer Certificate Preparation
___
## Introduction to TensorFlow in Python - DataCamp - ML-Scientist-Career-Track - by Isaiah Hull
___
## Chapter 2.3 - Linear Regression

### 1. Linear regression
Now that you understand how to construct loss functions, you're well-equipped to start training models. We'll do that for the first time in this video with a linear regression model.

### 2. What is a linear regression?
So what is a linear regression model? We can answer this with a simple illustration. Let's say we want to examine the relationship between house size and price in the King County housing dataset. We might start by plotting the size in square feet against the price in dollars. Note that we've actually plotted the relationship after taking the natural logarithm of each variable, which is useful when we suspect that the relationship is proportional. That is, we might expect an x% increase in size to be associated with a y% increase in price.  
![2.3.1](./figures/2.3.1.PNG)
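
To make the proportional interpretation concrete, here is a minimal sketch of a log-log model. The intercept and slope are made-up illustrative values, not estimates from the King County data, and `predict_price` is a hypothetical helper.

```python
import math

# Hypothetical log-log model: log(price) = intercept + slope * log(size).
# These coefficients are illustrative only, not fitted to any dataset.
intercept, slope = 7.0, 0.8

def predict_price(size_sqft):
    """Predict a price in dollars from a size in square feet."""
    return math.exp(intercept + slope * math.log(size_sqft))

base = predict_price(2000.0)
bigger = predict_price(2000.0 * 1.10)  # the same house, 10% larger

# In a log-log model the price ratio depends only on the size ratio:
# bigger / base == 1.10 ** slope, whatever the starting size.
pct_change = (bigger / base - 1.0) * 100.0
print(f"10% more size -> {pct_change:.2f}% higher predicted price")
```

With a slope below 1, the predicted price grows by a smaller percentage than the size does, and that percentage is the same at every starting size.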

A linear regression model assumes that the relationship between these variables can be captured by a line. That is, two parameters--the line's slope and intercept--fully characterize the relationship between size and price.  
![2.3.2](./figures/2.3.2.PNG)

### 3. The linear regression model
In our case, we've assumed that the relationship is linear after taking natural logarithms. 
- Training the model will involve recovering the slope of the line and the intercept, where the line intersects the vertical axis. 
- Once we have trained the intercept and slope, we can take a house's size and predict its price. 
- The difference between the predicted price and actual price is the error, which can be used to construct a loss function. 
- The example we've shown is for a univariate regression, which has only one feature, size. 
- A multiple regression has multiple features, such as size and location.  
![2.3.3](./figures/2.3.3.PNG)
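
The chain from prediction to error to loss can be sketched in plain Python. The data and parameter values below are made up for illustration, and the helper names are hypothetical.

```python
# Made-up toy data: log size, number of bedrooms, and log price.
sizes = [7.0, 7.5, 8.0]
bedrooms = [2.0, 3.0, 4.0]
prices = [12.5, 13.0, 13.5]

def predict_univariate(intercept, slope, size):
    # One feature: a line with an intercept and a single slope
    return intercept + slope * size

def predict_multiple(intercept, slopes, features):
    # Several features: one slope per feature
    return intercept + sum(s * f for s, f in zip(slopes, features))

# Predictions from an (untrained) univariate model
uni_preds = [predict_univariate(5.0, 1.0, s) for s in sizes]

# The error is the gap between the actual and the predicted value
errors = [p - q for p, q in zip(prices, uni_preds)]

# Averaging the squared errors gives one possible loss
loss = sum(e ** 2 for e in errors) / len(errors)
print(errors, loss)

# A multiple regression whose extra slope is zero reduces to the univariate model
multi_preds = [predict_multiple(5.0, [1.0, 0.0], [s, b])
               for s, b in zip(sizes, bedrooms)]
```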

### 4. Linear regression in TensorFlow
Let's look at some code to see how this can be implemented. 
- We will first define our target variable, price, and feature, size. 
- We also initialize the intercept and slope as trainable variables.

In [21]:
# Import numpy, pandas and tensorflow
import numpy as np
import pandas as pd
import tensorflow as tf

# Load data from csv
housing = pd.read_csv('./datasets/kc_house_data.csv')

# Define the targets and features
price = np.array(housing['price'], np.float32)
size = np.array(housing['sqft_living'], np.float32)

# Define the intercept and slope as trainable variables
# (the dtype must be passed by keyword; the second positional argument is `trainable`)
intercept = tf.Variable(0.1, dtype=tf.float32)
slope = tf.Variable(0.1, dtype=tf.float32)

- After that, we **define the model**, which we'll use to make predictions by multiplying size and slope and then adding the intercept. 
- Again, remember that we can do this using the addition and multiplication symbols, since these are overloaded operators and intercept and slope are TensorFlow variables. 

In [22]:
# Define a linear regression model
def linear_regression(intercept, slope, features = size):
    return intercept + features*slope
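
As a rough illustration of what "overloaded operators" means in Python, the sketch below defines a toy `Scalar` class. It is a hypothetical stand-in, not a TensorFlow class, and TensorFlow's own implementation is considerably more involved.

```python
class Scalar:
    """A toy tensor-like value (hypothetical, for illustration only)."""

    def __init__(self, value):
        self.value = value

    def __add__(self, other):
        # Called for `self + other`
        return Scalar(self.value + _unwrap(other))

    __radd__ = __add__  # Called for `number + self`

    def __mul__(self, other):
        # Called for `self * other`
        return Scalar(self.value * _unwrap(other))

    __rmul__ = __mul__  # Called for `number * self`


def _unwrap(x):
    # Accept either a Scalar or a plain Python number
    return x.value if isinstance(x, Scalar) else x


intercept = Scalar(0.5)
slope = Scalar(2.0)

# `3.0 * slope` dispatches to __rmul__, then `intercept + ...` to __add__
prediction = intercept + 3.0 * slope
print(prediction.value)
```

Because the class defines these special methods, the model can be written with ordinary `+` and `*` instead of explicit function calls.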

- Our **next step** is to define a **loss function**. 
- This function will take the model's parameters and the data as an input. 
- It will first use the model to compute the predicted values. 
- We then set the function to return the mean squared error loss. 
- We, of course, could have selected a different loss.

In [27]:
# Define a loss function to compute the MSE
def loss_function(intercept, slope, targets = price, features = size):
    # Compute the predictions for a linear model, using the features passed in
    predictions = linear_regression(intercept, slope, features)

    # Return the loss
    return tf.keras.losses.mse(targets, predictions)

- With the loss function defined, the next step is to **define an optimization operation**. 
- We'll do this using the Adam optimizer. 
- For now, you can ignore the choice of optimization algorithm. 
- We will discuss the selection of optimizers in greater detail later. 
- For our purposes, it is sufficient to understand that executing this operation will change the slope and intercept in a direction that will lower the value of the loss. 

In [28]:
# Define an optimization operation
opt = tf.keras.optimizers.Adam()

- We will next **perform minimization on the loss function using the optimizer**. 
- Notice that we've passed the loss function as a lambda function to the minimize operation. 
- We also supplied a variable list, which contains intercept and slope, the two variables we defined earlier. 
- We will execute our optimization step 1000 times. 
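- Printing the loss, we'll see that it tends to decline, moving closer to the minimum value with each step. In the output below the decline is slow, because price and size are unscaled and Adam's default learning rate (0.001) changes each parameter only slightly per step. 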
- Finally, we print the intercept and the slope. This is our linear model, which enables us to predict the value of a house given its size.

In [29]:
# Minimize the loss function and print the loss
for j in range(1000):
    opt.minimize(lambda: loss_function(intercept, slope),
                 var_list=[intercept, slope])
    print(loss_function(intercept, slope))

tf.Tensor(423483800000.0, shape=(), dtype=float32)
tf.Tensor(423481150000.0, shape=(), dtype=float32)
tf.Tensor(423478460000.0, shape=(), dtype=float32)
tf.Tensor(423475700000.0, shape=(), dtype=float32)
tf.Tensor(423473000000.0, shape=(), dtype=float32)
tf.Tensor(423470300000.0, shape=(), dtype=float32)
tf.Tensor(423467600000.0, shape=(), dtype=float32)
tf.Tensor(423464900000.0, shape=(), dtype=float32)
tf.Tensor(423462140000.0, shape=(), dtype=float32)
tf.Tensor(423459420000.0, shape=(), dtype=float32)
tf.Tensor(423456740000.0, shape=(), dtype=float32)
tf.Tensor(423454000000.0, shape=(), dtype=float32)
tf.Tensor(423451260000.0, shape=(), dtype=float32)
tf.Tensor(423448580000.0, shape=(), dtype=float32)
tf.Tensor(423445920000.0, shape=(), dtype=float32)
tf.Tensor(423443170000.0, shape=(), dtype=float32)
tf.Tensor(423440480000.0, shape=(), dtype=float32)
tf.Tensor(423437700000.0, shape=(), dtype=float32)
tf.Tensor(423435040000.0, shape=(), dtype=float32)
tf.Tensor(423432360000.0, shape=(), dtype=float32)
...

In [30]:
# print the trained parameters
print(intercept.numpy(), slope.numpy())

2.0983524 2.098373
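
To see what a single optimization step does to the intercept and slope, here is a minimal sketch that uses plain gradient descent rather than Adam, on made-up data generated from y = 1 + 2x. The learning rate and the helper names are assumptions chosen for this toy problem.

```python
# Made-up toy data from the line y = 1 + 2x
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]

def mse_loss(intercept, slope):
    # Mean squared error of the linear model on the toy data
    return sum((y - (intercept + slope * x)) ** 2
               for x, y in zip(xs, ys)) / len(xs)

def gradients(intercept, slope):
    # Partial derivatives of the MSE with respect to intercept and slope
    n = len(xs)
    errors = [(intercept + slope * x) - y for x, y in zip(xs, ys)]
    d_intercept = 2.0 * sum(errors) / n
    d_slope = 2.0 * sum(e * x for e, x in zip(errors, xs)) / n
    return d_intercept, d_slope

intercept, slope = 0.1, 0.1  # same starting values as in the notes
learning_rate = 0.05         # assumed step size for this toy problem
losses = []

for step in range(1000):
    d_i, d_s = gradients(intercept, slope)
    # Move each parameter a small step against its gradient
    intercept -= learning_rate * d_i
    slope -= learning_rate * d_s
    losses.append(mse_loss(intercept, slope))

print(intercept, slope, losses[-1])
```

On small, well-scaled inputs like these, the parameters approach the true intercept and slope within the 1000 steps. With raw prices and square footage, the same number of steps leaves the fit far from converged, as the printed losses above suggest.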


___
### Multiple linear regression (Practice Question)
In most cases, performing a univariate linear regression will not yield a model that is useful for making accurate predictions. In this exercise, you will perform a multiple regression, which uses more than one feature.

- You will use ``price_log`` as your target and ``size_log`` and ``bedrooms`` as your features. Each of these tensors has been defined and is available. 
- You will also switch from using the mean squared error loss to the mean absolute error loss: ``keras.losses.mae()``. 
- Finally, the predicted values are computed as follows: ``params[0] + feature1*params[1] + feature2*params[2]``. 
- Note that we've defined a vector of parameters, ``params``, as a variable, rather than using three variables. Here, ``params[0]`` is the ``intercept`` and ``params[1]`` and ``params[2]`` are the ``slopes``.
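
The parameter-vector form and the switch of loss can be sketched in plain Python before turning to TensorFlow. The data and parameter values are made up, and the last target is a deliberate outlier so that the two losses differ visibly.

```python
# Made-up data; the last target is an outlier
feature1 = [1.0, 2.0, 3.0, 4.0]
feature2 = [0.0, 1.0, 0.0, 1.0]
targets = [2.0, 5.0, 4.0, 27.0]

# One vector holds all parameters: [intercept, slope for feature1, slope for feature2]
params = [1.0, 1.0, 2.0]

def linear_regression(params, f1, f2):
    return [params[0] + a * params[1] + b * params[2] for a, b in zip(f1, f2)]

def mae(targets, predictions):
    # Mean absolute error: grows linearly with the size of each error
    return sum(abs(t - p) for t, p in zip(targets, predictions)) / len(targets)

def mse(targets, predictions):
    # Mean squared error: grows quadratically with the size of each error
    return sum((t - p) ** 2 for t, p in zip(targets, predictions)) / len(targets)

predictions = linear_regression(params, feature1, feature2)
print(mae(targets, predictions), mse(targets, predictions))
```

Only the outlier is mispredicted here, yet it dominates the squared error far more than the absolute error, which is one common reason to prefer the mean absolute error.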

In [20]:
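# The DataCamp exercise preloads keras, price_log, size_log, bedrooms, params and
# print_results. Run standalone, the original cell stopped with
# "NameError: name 'size_log' is not defined", so assumed stand-ins are defined here.
from tensorflow import keras

price_log = np.log(np.array(housing['price'], np.float32))
size_log = np.log(np.array(housing['sqft_living'], np.float32))
bedrooms = np.array(housing['bedrooms'], np.float32)
params = tf.Variable([0.1, 0.05, 0.02], dtype=tf.float32)  # assumed starting values

def print_results(params):
    # Assumed helper: print the current intercept and slopes
    print(params.numpy())

# Define the linear regression model
def linear_regression(params, feature1 = size_log, feature2 = bedrooms):
    return params[0] + feature1*params[1] + feature2*params[2]

# Define the loss function
def loss_function(params, targets = price_log, feature1 = size_log, feature2 = bedrooms):
    # Set the predicted values
    predictions = linear_regression(params, feature1, feature2)

    # Use the mean absolute error loss
    return keras.losses.mae(targets, predictions)

# Define the optimize operation
opt = keras.optimizers.Adam()

# Perform minimization and print trainable variables
for j in range(10):
    opt.minimize(lambda: loss_function(params), var_list=[params])
    print_results(params)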