# Real Estate estimator

In [1]:
import numpy as np

## (1) Define the matrix $\boldsymbol X$ of `features`:

❓ Create a $(4,3)$ `numpy.ndarray` storing the values of the 3 features (surface, bedrooms, floors) for the 4 observations. 

In [2]:
X = np.array([
    [620, 1, 1],
    [3280, 4, 2],
    [1900, 2, 2],
    [1320, 3, 3]
])
X

array([[ 620,    1,    1],
       [3280,    4,    2],
       [1900,    2,    2],
       [1320,    3,    3]])

In [3]:
print(f"Shape: {X.shape}")
print(f"Size: {X.size}")
print(f"Dimensions: {X.ndim}")

Shape: (4, 3)
Size: 12
Dimensions: 2


In [4]:
# Define x0 as a (4,1) column vector of ones, built directly with np.ones
x0 = np.ones((4,1))
x0

array([[1.],
       [1.],
       [1.],
       [1.]])

In [5]:
# Use `numpy.hstack` to create the (4,4) matrix X by concatenating x0 to your previous (4,3) matrix
X = np.hstack((x0, X))
X

array([[1.00e+00, 6.20e+02, 1.00e+00, 1.00e+00],
       [1.00e+00, 3.28e+03, 4.00e+00, 2.00e+00],
       [1.00e+00, 1.90e+03, 2.00e+00, 2.00e+00],
       [1.00e+00, 1.32e+03, 3.00e+00, 3.00e+00]])

## (2) Define the vector $\boldsymbol y$ of `Prices`

In [6]:
y = np.array([
    [244],
    [671],
    [504],
    [510]
])
y

array([[244],
       [671],
       [504],
       [510]])

## (3) Find the solution of the system

⏰ Now it's time to find the vector of coefficients $\boldsymbol \theta = \begin{bmatrix}
    \theta_0 \\
    \theta_1 \\
    \theta_2 \\
    \theta_3
\end{bmatrix}$!

👍 Because $\boldsymbol X$ is a square matrix, the system can be solved by left-multiplying both sides by its inverse (provided the inverse exists):
 
$$ \large \boldsymbol X \cdot \boldsymbol \theta = \boldsymbol y 
\large \iff \boldsymbol X^{-1} \cdot \boldsymbol X \cdot \boldsymbol \theta = \boldsymbol X^{-1} \cdot \boldsymbol y 
\large \iff \boldsymbol \theta = \boldsymbol X^{-1} \cdot \boldsymbol y
$$

where $\large \boldsymbol X^{-1}$ is the inverse of $\large \boldsymbol X$.

In [7]:
# Compute the inverse of the matrix X with the right NumPy method
X_inv = np.linalg.inv(X)
X_inv

array([[ 1.64516129e+00, -1.03509515e-16, -2.90322581e-01,
        -3.54838710e-01],
       [-5.37634409e-04,  1.66950831e-19,  1.07526882e-03,
        -5.37634409e-04],
       [ 3.70967742e-01,  5.00000000e-01, -1.24193548e+00,
         3.70967742e-01],
       [-6.82795699e-01, -5.00000000e-01,  8.65591398e-01,
         3.17204301e-01]])

👉 You can check that the inversion worked by testing the following equality:

$$\boldsymbol X^{-1} \cdot\boldsymbol X = \boldsymbol I_4$$
where $\boldsymbol I_4$ is the $ 4 \times 4 $ identity matrix $ \begin{bmatrix}
    1 & 0 & 0 & 0 \\
    0 & 1 & 0 & 0 \\
    0 & 0 & 1 & 0 \\
    0 & 0 & 0 & 1
\end{bmatrix}$

In [8]:
# Define I4 using the right NumPy method
I4 = np.eye(4)
I4

array([[1., 0., 0., 0.],
       [0., 1., 0., 0.],
       [0., 0., 1., 0.],
       [0., 0., 0., 1.]])

In [9]:
x_comp = X_inv @ X
x_comp

array([[ 1.00000000e+00,  1.70530257e-13,  0.00000000e+00,
         2.22044605e-16],
       [ 0.00000000e+00,  1.00000000e+00,  4.33680869e-19,
         0.00000000e+00],
       [ 5.55111512e-17,  1.13686838e-13,  1.00000000e+00,
         0.00000000e+00],
       [-1.11022302e-16, -1.70530257e-13,  1.11022302e-16,
         1.00000000e+00]])

In [10]:
# result = np.dot(X_inv, X)
result = X_inv @ X
print(result)
print(np.allclose(result,I4))

[[ 1.00000000e+00  1.70530257e-13  0.00000000e+00  2.22044605e-16]
 [ 0.00000000e+00  1.00000000e+00  4.33680869e-19  0.00000000e+00]
 [ 5.55111512e-17  1.13686838e-13  1.00000000e+00  0.00000000e+00]
 [-1.11022302e-16 -1.70530257e-13  1.11022302e-16  1.00000000e+00]]
True
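The check relies on `np.allclose` rather than an exact comparison because floating-point round-off leaves tiny non-zero entries off the diagonal (of order 1e-13 in the output above). A minimal sketch of the difference, rebuilding `X` from the values above so it runs on its own; `product` and `max_dev` are names introduced here for illustration only:

```python
import numpy as np

# Same (4,4) matrix as above: a column of ones followed by the 3 features
X = np.array([
    [1, 620, 1, 1],
    [1, 3280, 4, 2],
    [1, 1900, 2, 2],
    [1, 1320, 3, 3],
], dtype=float)

product = np.linalg.inv(X) @ X
I4 = np.eye(4)

# Largest absolute deviation from the identity (illustrative name)
max_dev = np.abs(product - I4).max()
print(f"Largest deviation from I4: {max_dev:.2e}")

# Tolerance-based comparison: robust to round-off
print(np.allclose(product, I4))

# Exact element-wise comparison: typically False here because of round-off
print(np.array_equal(product, I4))
```

The exact result of the last line can vary with the platform and the linear algebra backend, which is one more reason to prefer the tolerance-based test.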


🎉 You are finally able to compute `theta` using the following formula: $ \large \boldsymbol \theta = \boldsymbol X^{-1}\cdot \boldsymbol y $:

In [11]:
thetha = X_inv @ y
thetha

array([[ 74.12903226],
       [  0.13655914],
       [-10.72580645],
       [ 95.93010753]])
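Forming the inverse explicitly works for a small 4×4 system, but `np.linalg.solve` solves $\boldsymbol X \cdot \boldsymbol \theta = \boldsymbol y$ directly and is generally the preferred route numerically. A minimal sketch, assuming the same `X` and `y` as above (rebuilt here so the block is self-contained; `theta_inv` and `theta_solve` are illustrative names, not part of the original notebook):

```python
import numpy as np

# Feature matrix with the intercept column, and the price vector
X = np.array([
    [1, 620, 1, 1],
    [1, 3280, 4, 2],
    [1, 1900, 2, 2],
    [1, 1320, 3, 3],
], dtype=float)
y = np.array([[244], [671], [504], [510]], dtype=float)

# Route used in the notebook: explicit inverse, then matrix product
theta_inv = np.linalg.inv(X) @ y

# Alternative: solve the linear system without forming the inverse
theta_solve = np.linalg.solve(X, y)

print(theta_solve)
print(np.allclose(theta_inv, theta_solve))
```

Both routes should agree up to round-off; `solve` simply skips the intermediate inverse.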

## (4) Estimation of a new flat price

Having solved the system for $\boldsymbol \theta$, you can now estimate the `Price` (in thousands of dollars) of a 5th flat with these characteristics:

- `Surface`: 3000 $\text{ft}^2$
- `Bedrooms`: 5 
- `Floors`: 1

with the following formula:

$$y_{flat5} = \boldsymbol x_{flat5} \cdot \boldsymbol \theta$$

In [12]:
# Define x5

x5 = np.array([1, 3000, 5, 1])

# Compute y5 (the dot product is in thousands of dollars)
# int() truncates it to whole thousands, so you should find a Price of 526,000 dollars

y5 = x5 @ thetha
y5 = int(y5[0]) * 1000
y5

526000
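Written out with the coefficients from `In [11]` rounded to a few digits, the dot product is approximately:

$$y_{flat5} \approx 74.129 \cdot 1 + 0.13656 \cdot 3000 - 10.726 \cdot 5 + 95.930 \cdot 1 \approx 526.1$$

The untruncated estimate is therefore about 526,100 dollars. `int()` drops the fractional part before the multiplication by 1000, which is why the cell returns 526000.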

## (5) Reality-check

In [13]:
# Create the new feature matrix X_new of shape (5,4) by stacking x5 under X
X_new = np.vstack((X, x5))
X_new.shape

(5, 4)

In [14]:
X_new

array([[1.00e+00, 6.20e+02, 1.00e+00, 1.00e+00],
       [1.00e+00, 3.28e+03, 4.00e+00, 2.00e+00],
       [1.00e+00, 1.90e+03, 2.00e+00, 2.00e+00],
       [1.00e+00, 1.32e+03, 3.00e+00, 3.00e+00],
       [1.00e+00, 3.00e+03, 5.00e+00, 1.00e+00]])

In [15]:
# Create new y of shape (5,1)
y_new = np.array([[244],[671],[504],[510],[700]])
y_new

array([[244],
       [671],
       [504],
       [510],
       [700]])

In [16]:
# 5 observations for 4 unknowns: the system is overdetermined, so fit theta by least squares
theta_new, residuals, rank, s = np.linalg.lstsq(X_new, y_new, rcond=None)

In [17]:
theta_new

array([[1.11020221e+02],
       [8.92124222e-02],
       [5.31235728e+01],
       [4.15795102e+01]])
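To put the refit in context, the sketch below compares how the original exact solution and the least-squares fit each predict the fifth flat. It assumes that the last entry of `y_new` (700) is the flat's observed price in thousands of dollars, which the notebook does not state explicitly; `theta_old`, `pred_old` and `pred_new` are illustrative names.

```python
import numpy as np

# The 5 observations: intercept column, surface, bedrooms, floors
X_new = np.array([
    [1, 620, 1, 1],
    [1, 3280, 4, 2],
    [1, 1900, 2, 2],
    [1, 1320, 3, 3],
    [1, 3000, 5, 1],
], dtype=float)
y_new = np.array([[244], [671], [504], [510], [700]], dtype=float)

# Exact solution from the first 4 flats only (square system)
theta_old = np.linalg.solve(X_new[:4], y_new[:4])

# Least-squares fit on all 5 flats (overdetermined system)
theta_new, residuals, rank, s = np.linalg.lstsq(X_new, y_new, rcond=None)

# Predictions for the fifth flat under each set of coefficients
x5 = X_new[4]
pred_old = (x5 @ theta_old).item()
pred_new = (x5 @ theta_new).item()

print(f"Assumed observed price: {y_new[4, 0]:.1f}")
print(f"Prediction from the 4-flat solution: {pred_old:.1f}")
print(f"Prediction from the 5-flat least-squares fit: {pred_new:.1f}")
print(f"Sum of squared residuals of the fit: {residuals[0]:.1f}")
```

The 4-flat solution reproduces its four prices exactly, since four equations determine four unknowns, so the fifth flat is the first observation that actually tests the model. The least-squares fit no longer matches any flat exactly but spreads the error across all five.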