# House Prices

A multiple linear regression model that predicts house prices from the number of bedrooms, the size of the house, and its age.

## Import Libraries

In [1]:
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error, r2_score

## Create a Sample Dataset

- Generate sample data
- Assign the sample data to a variable as DataFrame, `houses`

We create a sample dataset with features such as the number of bedrooms, the size of the house, and the age of the house, along with the target variable, the house price.

In [2]:
sample_data = {
    'bedrooms': [3, 4, 2, 3, 5],
    'size_sqft': [1500, 2000, 1200, 1800, 2500],
    'age_years': [10, 15, 5, 8, 20],
    'price': [400000, 500000, 300000, 350000, 600000]
}

houses = pd.DataFrame(sample_data)

## Define Features and Target Variable

We define `X` as the feature matrix and `y` as the target variable.

In [3]:
X = houses[['bedrooms', 'size_sqft', 'age_years']]
y = houses['price']

## Split the Data

- Split the data into training and testing sets

The dataset is split into training and testing sets so that the model can be evaluated on data it did not see during training.

In [4]:
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)
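
As a quick check on the split, the sketch below rebuilds the same sample data and prints how many rows land in each set. It assumes the same `test_size=0.2` and `random_state=0` as above; because `train_test_split` rounds the test share up, five rows yield one test row and four training rows.

```python
import pandas as pd
from sklearn.model_selection import train_test_split

# Same sample data as above, rebuilt so this cell runs on its own.
houses = pd.DataFrame({
    'bedrooms': [3, 4, 2, 3, 5],
    'size_sqft': [1500, 2000, 1200, 1800, 2500],
    'age_years': [10, 15, 5, 8, 20],
    'price': [400000, 500000, 300000, 350000, 600000],
})
X = houses[['bedrooms', 'size_sqft', 'age_years']]
y = houses['price']

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

# Report how many rows ended up in each set.
print('Training rows:', len(X_train))
print('Testing rows:', len(X_test))
```

A test set this small is enough to run the notebook end to end, but any metric computed on it rests on a single prediction.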

## Build and Train the Model

- Create a linear regression model
- Train the model

We fit the `LinearRegression` model to the training data.

In [5]:
model = LinearRegression()
model.fit(X_train, y_train)

## Make Predictions

Use the trained model to predict prices for the data points in the test set.

In [6]:
y_pred = model.predict(X_test)

## Evaluate the Model Performance

- Calculate the Mean Squared Error
- Calculate the R-squared

Assess the model's accuracy using the following metrics:
- Mean Squared Error (MSE): the average squared difference between predicted and actual values.
- R-squared: the proportion of variance in the dependent variable that is explained by the independent variables.
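
To make the two definitions concrete, here is a minimal sketch that computes both metrics by hand with NumPy. The arrays `y_true` and `y_hat` are hypothetical values chosen for illustration, not results from this notebook.

```python
import numpy as np

# Hypothetical actual and predicted prices, for illustration only.
y_true = np.array([300000.0, 400000.0, 500000.0])
y_hat = np.array([310000.0, 390000.0, 520000.0])

residuals = y_true - y_hat

# Mean Squared Error: the average of the squared residuals.
mse_manual = np.mean(residuals ** 2)

# R-squared: 1 minus (residual sum of squares / total sum of squares).
ss_res = np.sum(residuals ** 2)
ss_tot = np.sum((y_true - y_true.mean()) ** 2)
r2_manual = 1 - ss_res / ss_tot

print('MSE:', mse_manual)
print('R-squared:', r2_manual)
```

The total sum of squares in the denominator measures how much the actual values vary around their own mean, so R-squared needs at least two distinct actual values to be defined.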

In [7]:
mse = mean_squared_error(y_test, y_pred)
r2 = r2_score(y_test, y_pred)

## Preview the Evaluation

In [8]:
print(f'Mean Squared Error: {mse}')
print(f'R-squared: {r2}')

Mean Squared Error: 19504597.784994457
R-squared: nan
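
The `nan` for R-squared follows from the size of the test set: with five rows and `test_size=0.2`, only one row is held out. R-squared divides by the total sum of squares of the actual values around their mean, and that sum is zero for a single value, so scikit-learn reports the score as undefined. The sketch below shows the zero denominator; the single price used is an illustrative value.

```python
import numpy as np

# One held-out price; the specific value is illustrative.
y_single = np.array([300000.0])

# Total sum of squares around the mean. With one value, the mean equals
# that value, so every deviation is zero.
ss_tot_single = np.sum((y_single - y_single.mean()) ** 2)

print('Total sum of squares:', ss_tot_single)
```

Holding out at least two rows, which in practice means starting from a larger dataset, gives R-squared a non-degenerate denominator.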


## Preview the Coefficients

Examine the values of the model's coefficients and intercept using the `print()` function.
- Understand the impact of each feature on the predicted outcome.
- Identify important predictors.

In [11]:
print('Coefficients:', model.coef_)
print('Intercept:', model.intercept_)

Coefficients: [ 2602.50624448   -22.082005   21687.69925066]
Intercept: 208438.49625331434
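
The coefficients follow the column order of `X`: `bedrooms`, `size_sqft`, `age_years`. As a rough sketch of how to read them, the code below copies the printed values, pairs each with its feature name, and rebuilds one prediction by hand as the intercept plus the sum of each coefficient times its feature value. The names `feature_names`, `first_house`, and `predicted_price` are introduced here for illustration.

```python
import numpy as np

feature_names = ['bedrooms', 'size_sqft', 'age_years']

# Values copied from the printed output above.
coef = np.array([2602.50624448, -22.082005, 21687.69925066])
intercept = 208438.49625331434

# Pair each coefficient with the feature it belongs to.
for name, value in zip(feature_names, coef):
    print(f'{name}: {value:.2f}')

# First row of the sample data: 3 bedrooms, 1500 sqft, 10 years old.
first_house = np.array([3, 1500, 10])

# Linear model: price = intercept + sum(coefficient * feature value).
predicted_price = intercept + coef @ first_house
print(f'Predicted price: {predicted_price:.2f}')
```

The hand-built prediction for this row comes out within a dollar of its listed price of 400000. With so few rows, the signs and sizes of the individual coefficients (for example, the negative value for `size_sqft`) are best read as properties of this tiny sample rather than as general effects.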
