# 1 - Model Representation

## Notation

Here is a summary of some of the notation you will encounter.

| General Notation                   | Description                                                               | Python (if applicable) |
|:-----------------------------------|:--------------------------------------------------------------------------|:-----------------------:|
| $a$                                | scalar, non-bold                                                          |                        |
| $\mathbf{a}$                       | vector, bold                                                              |                        |
| **Regression**                     |                                                                           |                        |
| $\mathbf{x}$                       | Training example feature values (e.g., size in 1000 sqft)                 | `x_train`              |
| $\mathbf{y}$                       | Training example targets (e.g., price in 1000s of dollars)                | `y_train`              |
| $x^{(i)}$, $y^{(i)}$               | $i^{th}$ training example                                                 | `x_i`, `y_i`           |
| $m$                                | Number of training examples                                               | `m`                    |
| $w$                                | parameter: weight                                                         | `w`                    |
| $b$                                | parameter: bias                                                           | `b`                    |
| $f_{w,b}(x^{(i)})$                 | Model evaluation at $x^{(i)}$: $f_{w,b}(x^{(i)}) = wx^{(i)}+b$            | `f_wb`                 |


## Tools
- NumPy
- Matplotlib

In [None]:
import numpy as np
import matplotlib.pyplot as plt
plt.style.use('seaborn-v0_8-bright')


## Training Data
We will use a simple data set with only two data points: a house with 1000 square feet (sqft) sold for \$300,000 and a house with 2000 square feet sold for \$500,000. These two points will constitute our *data or training set*. The units of size are 1000 sqft and the units of price are 1000s of dollars.

| Size (1000 sqft)     | Price (1000s of dollars) |
| -------------------| ------------------------ |
| 1.0               | 300                      |
| 2.0               | 500                      |

You would like to fit a linear regression model (shown above as the blue straight line) through these two points, so you can then predict the price of other houses, say, a house with 1200 sqft.

In [None]:
x_train = np.array([1.0, 2.0])
y_train = np.array([300, 500])
print(x_train)
print(y_train)

## Number of training examples `m`
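Use the NumPy `.shape` attribute to find the number of training examples. `x_train.shape` returns a tuple with one entry per dimension, and `x_train.shape[0]` is the length of the array, which here is the number of examples.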

In [None]:
m = x_train.shape[0]
print(f'Number of training examples is: {m}')
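
The notation table also lists `x_i`, `y_i` for the $i^{th}$ training example. As a minimal sketch (assuming zero-based Python indexing, so $(x^{(0)}, y^{(0)})$ is the first example), the cell below re-creates the two-point data set, shows that Python's built-in `len()` gives the same count as `.shape[0]` for a 1-D array, and reads out one example. The index `i = 0` is an arbitrary choice for illustration.

```python
import numpy as np

# Same two-point training set as above
x_train = np.array([1.0, 2.0])
y_train = np.array([300, 500])

# For a 1-D array, len() and .shape[0] agree
m = len(x_train)
print(f"Number of training examples is: {m}")

# Access the i-th training example (i is zero-based here)
i = 0
x_i = x_train[i]
y_i = y_train[i]
print(f"(x^({i}), y^({i})) = ({x_i}, {y_i})")
```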

### Plotting the data

You can plot these two points using the `scatter()` function in the `matplotlib` library, as shown in the cell below. 
- The function arguments `marker` and `c` show the points as red crosses (the default is blue dots).

You can use other functions in the `matplotlib` library to set the title and axis labels.

In [None]:
# Plot the data points as red crosses
plt.scatter(x_train, y_train, marker="x", c="r")

# Set title
plt.title("Housing Prices")

# Set the y-axis label
plt.ylabel("Price (in 1000s of dollars)")

# Set the x-axis label
plt.xlabel("Size (in 1000s of sqft)")
plt.show()

## Model function
The model function for linear regression (which is a function that maps from `x` to `y`) is represented as 

$$ f_{w,b}(x^{(i)}) = wx^{(i)} + b \tag{1}$$

The formula above represents a straight line: $w$ is the slope and $b$ is the $y$-intercept, so different values of $w$ and $b$ give you different straight lines on the plot.
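
To make this concrete, here is a small sketch that evaluates equation (1) at the two training sizes for two parameter settings. The first pair, $w = 100$, $b = 100$, is an arbitrary illustrative guess and not derived from the data; the second, $w = 200$, $b = 100$, is the pair used in the cells that follow. The helper name `line` is hypothetical.

```python
def line(x, w, b):
    """Evaluate f_wb(x) = w * x + b for a single scalar x."""
    return w * x + b

sizes = [1.0, 2.0]      # training sizes, in 1000 sqft
prices = [300, 500]     # training prices, in 1000s of dollars

# (w, b) candidates: the first is an arbitrary guess, the second is the notebook's choice
candidates = [(100, 100), (200, 100)]

for w, b in candidates:
    predictions = [line(x, w, b) for x in sizes]
    print(f"w={w}, b={b}: predictions={predictions}, targets={prices}")
```

With these numbers, only the second pair reproduces both training prices.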

In [None]:
w = 200 
b = 100

def compute_model_output(x, w, b):
    """
    Computes the prediction of a linear model
    Args:
        x (ndarray (m,)): Data, m examples
        w,b (scalar)    : model parameters
    Returns:
        f_wb (ndarray (m,)): model prediction
    """
    m = x.shape[0]
    f_wb = np.zeros(m)
    for i in range(m):
        f_wb[i] = w * x[i] + b
    return f_wb
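
The explicit loop mirrors equation (1) one example at a time. As an alternative sketch, NumPy's elementwise arithmetic can compute all predictions in a single expression. The name `compute_model_output_vectorized` is hypothetical, and the check below assumes `x` is a 1-D float array.

```python
import numpy as np

def compute_model_output(x, w, b):
    """Loop version, as defined above."""
    m = x.shape[0]
    f_wb = np.zeros(m)
    for i in range(m):
        f_wb[i] = w * x[i] + b
    return f_wb

def compute_model_output_vectorized(x, w, b):
    """Same prediction, using NumPy elementwise multiply and add."""
    return w * x + b

x_train = np.array([1.0, 2.0])
loop_out = compute_model_output(x_train, 200, 100)
vec_out = compute_model_output_vectorized(x_train, 200, 100)

# Both versions should agree on every example
print(np.allclose(loop_out, vec_out))
```

The loop version is kept in this notebook because it follows the notation closely; the vectorized form is usually shorter and faster on large arrays.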


In [None]:
tmp_f_wb = compute_model_output(x_train, w, b)

# Plot our model prediction
plt.plot(x_train, tmp_f_wb, c='b', label='Model Prediction')

# Plot the data points
plt.scatter(x_train, y_train, marker='x', c='r', label='Actual Values')

# Set title
plt.title("Housing Prices")

# Set y-axis label
plt.ylabel("Price (in 1000s of dollars)")

# Set x-axis label
plt.xlabel("Size (in 1000s of sqft)")
plt.legend()
plt.show()

### Prediction
Now that we have a model, we can use it to make our original prediction. Let's predict the price of a house with 1200 sqft. Since the units of $x$ are 1000s of sqft, $x$ is 1.2.

In [None]:
x_i = 1.2
cost_1200sqft = w * x_i + b

print(f"The predicted price of a {x_i * 1000:.0f} sqft home is ${cost_1200sqft * 1000:.2f}")
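
Because the model works in units of 1000 sqft and 1000s of dollars, it is easy to mix up units when predicting. One way to keep the conversions in one place is a small wrapper. The sketch below is only an illustration, and the name `predict_price_dollars` is hypothetical.

```python
def predict_price_dollars(size_sqft, w, b):
    """Predict a price in dollars from a size in sqft.

    Assumes w and b were chosen for x in 1000 sqft and y in 1000s of dollars.
    """
    x = size_sqft / 1000           # convert sqft -> 1000 sqft
    price_thousands = w * x + b    # model output, in 1000s of dollars
    return price_thousands * 1000  # convert back to dollars

price = predict_price_dollars(1200, w=200, b=100)
print(f"Predicted price: ${price:.2f}")
```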