In [None]:
import numpy as np
import matplotlib.pyplot as plt

# Problem Statement

As in the lecture, we will use housing price prediction as the running example.  
This lab uses a simple data set with only two data points: a house of 1000 square feet (sqft) that sold for \$300,000 and a house of 2000 square feet that sold for \$500,000. These two points constitute our data, or *training set*. In this lab, size is measured in units of 1000 sqft and price in units of 1000s of dollars.

| Size (1000 sqft)     | Price (1000s of dollars) |
| -------------------| ------------------------ |
| 1.0               | 300                      |
| 2.0               | 500                      |

You would like to fit a linear regression model (a straight line) through these two points, so that you can then predict the price of other houses, say a house of 1200 sqft.


# Initializing Data

In [None]:
# Arrays storing the training data
x_train = np.array([1.0, 2.0])  # sizes in 1000 sqft
y_train = np.array([300, 500])  # prices in 1000s of dollars
print(x_train)
print(y_train)

In [None]:
# Number of training examples
m = len(x_train)
print(m)
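
NumPy arrays also report their size through the `.shape` attribute, so the same count can be read from `x_train.shape[0]`. The sketch below redefines `x_train` so that it runs on its own; `m_from_shape` is a name introduced here for illustration only.

```python
import numpy as np

x_train = np.array([1.0, 2.0])  # sizes in 1000 sqft, same values as above

# .shape is a tuple with one entry per dimension;
# entry 0 is the number of training examples
print(f"x_train.shape: {x_train.shape}")
m_from_shape = x_train.shape[0]
print(f"Number of training examples: {m_from_shape}")
```

For a one-dimensional array, `len(x_train)` and `x_train.shape[0]` give the same value.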

# Inspecting Training Examples

In [None]:
# Change i to inspect a different training example (valid indices: 0 to m-1)
i = 0
x_i = x_train[i]
y_i = y_train[i]
print(f"Training example {i+1}: ({x_i}, {y_i})")

# Plotting Data

In [None]:
# Plot the data points
plt.scatter(x_train, y_train, marker='x', c='r')
# Set the title
plt.title("Housing Prices")
# Set the y-axis label
plt.ylabel('Price (in 1000s of dollars)')
# Set the x-axis label
plt.xlabel('Size (1000 sqft)')
plt.show()

# Model function

The model function for linear regression is represented as 

$$ f_{w,b}(x^{(i)}) = wx^{(i)} + b \tag{1}$$

Different values of $w$ and $b$ give you different straight lines on the plot. 

Let's try to get a better intuition for this through the code blocks below. Let's start with $w = 100$ and $b = 100$. 

In [None]:
w = 100
b = 100
print(w)
print(b)

In [None]:
# Function to compute the model output f_wb = w*x + b for every example in x
def compute_output(w, b, x):
    m = len(x)          # number of examples
    f_wb = np.zeros(m)  # one prediction per example
    for i in range(m):
        f_wb[i] = w * x[i] + b

    return f_wb
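
The explicit loop mirrors equation (1) one example at a time. Because NumPy applies arithmetic element-wise, the same result can be obtained without a loop. The following is a minimal sketch, not part of the original lab; `compute_output_vectorized` is a hypothetical name introduced here.

```python
import numpy as np

def compute_output_vectorized(w, b, x):
    """Hypothetical loop-free variant: evaluates w*x + b for all elements of x at once."""
    x = np.asarray(x, dtype=float)
    return w * x + b

x_train = np.array([1.0, 2.0])  # sizes in 1000 sqft
f_wb_vec = compute_output_vectorized(100, 100, x_train)
print(f_wb_vec)  # predictions 200 and 300 (in 1000s of dollars)
```

For two data points the difference is negligible; the vectorized form mainly pays off on larger arrays.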

In [None]:
# Model predictions for the current w and b
tmp_f_wb = compute_output(w, b, x_train)
plt.plot(x_train, tmp_f_wb, c='b', label='Our Prediction')

# Plot the data points
plt.scatter(x_train, y_train, marker='x', c='r',label='Actual Values')

# Set the title
plt.title("Housing Prices")
# Set the y-axis label
plt.ylabel('Price (in 1000s of dollars)')
# Set the x-axis label
plt.xlabel('Size (1000 sqft)')

plt.legend()
plt.show()

As we can see, setting $w = 100$ and $b = 100$ does *not* result in a line that fits our data. 
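
To put numbers on the mismatch, the sketch below compares the predictions for $w = 100$ and $b = 100$ with the actual prices. It redefines the training data so that it runs on its own; `predictions` and `errors` are names introduced here for illustration.

```python
import numpy as np

x_train = np.array([1.0, 2.0])  # sizes in 1000 sqft
y_train = np.array([300, 500])  # prices in 1000s of dollars
w, b = 100, 100

predictions = w * x_train + b   # model output for each training example
errors = predictions - y_train  # negative values mean the model underestimates

for x_i, y_i, p_i, e_i in zip(x_train, y_train, predictions, errors):
    print(f"size={x_i}: actual={y_i}, predicted={p_i:.0f}, error={e_i:.0f}")
```

The model predicts 200 and 300 against actual prices of 300 and 500, so it underestimates both houses.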

Thus, we need to experiment with different values of $w$ and $b$ to find values that fit our data.
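
With only two training points, one such pair can be worked out by hand: the slope is $(500 - 300)/(2.0 - 1.0) = 200$ and the intercept is $300 - 200 \cdot 1.0 = 100$. The sketch below checks that these values reproduce both training prices and then predicts the price of the 1200 sqft house mentioned earlier. This shortcut relies on there being exactly two points and is not a general fitting method; `x_new` and `price_new` are names introduced here.

```python
import numpy as np

x_train = np.array([1.0, 2.0])      # sizes in 1000 sqft
y_train = np.array([300.0, 500.0])  # prices in 1000s of dollars

# Slope and intercept of the straight line through the two training points
w = (y_train[1] - y_train[0]) / (x_train[1] - x_train[0])
b = y_train[0] - w * x_train[0]
print(f"w = {w}, b = {b}")

# The line should pass through both training points
print("Fits training data:", np.allclose(w * x_train + b, y_train))

# Predict the price of a 1200 sqft house (x = 1.2 in units of 1000 sqft)
x_new = 1.2
price_new = w * x_new + b
print(f"Predicted price: {price_new:.0f} thousand dollars")
```

Under these assumptions the predicted price is about 340 thousand dollars.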