## **California Housing**
- Regression is a type of supervised learning in which the goal is to predict a continuous target variable from one or more input features. Below, I'll teach you about regression in Python using three different datasets, each illustrating a different regression problem and its evaluation metrics.
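
To make "predicting a continuous target from an input feature" concrete, here is a minimal sketch of one-feature least-squares regression in plain Python. The toy data and the helper name `fit_simple_linear` are assumptions made up for illustration; they are not part of the California Housing workflow below, which uses scikit-learn's `LinearRegression`.

```python
# Toy data invented for illustration: the target follows y = 2x + 1 exactly.
x = [1.0, 2.0, 3.0, 4.0]
y = [3.0, 5.0, 7.0, 9.0]


def fit_simple_linear(x, y):
    """Return (slope, intercept) of the least-squares line through the points."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Sum of products of deviations (unnormalized covariance of x and y)
    sum_xy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    # Sum of squared deviations of x (unnormalized variance of x)
    sum_xx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sum_xy / sum_xx
    intercept = mean_y - slope * mean_x
    return slope, intercept


slope, intercept = fit_simple_linear(x, y)

# Predict the continuous target for an unseen feature value
prediction = slope * 5.0 + intercept
print(slope, intercept, prediction)
```

With more than one feature the same idea applies, but the fit is usually delegated to a library rather than written by hand.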

In [None]:
# import libraries
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score

In [None]:
from sklearn import datasets

In [None]:
dir(datasets)

In [None]:
data = datasets.fetch_california_housing()

In [None]:
data

In [None]:
california_housing = pd.DataFrame(data["data"], columns = data["feature_names"])

In [None]:
california_housing

In [None]:
california_housing["target"] = data["target"]

In [None]:
california_housing.shape

In [None]:
california_housing.info()

In [None]:
# Preview the first rows; printing the full DataFrame would flood the output
print(california_housing.head(10).to_string())

In [None]:
m = california_housing.drop(["target"], axis = 1)

In [None]:
m.head()

In [None]:
m.shape

In [None]:
n = california_housing["target"]

In [None]:
n.head()

In [None]:
n.shape

In [None]:
# splitting data into training and testing
m_train, m_test, n_train, n_test = train_test_split(m, n, test_size = 0.25, random_state = 42)

In [None]:
m_train.shape

In [None]:
m_test.shape

In [None]:
n_train.shape

In [None]:
n_test.shape

## **Model Training**

In [None]:
# Linear Regression
# Initialize model
from sklearn.linear_model import LinearRegression
lr_model = LinearRegression()

In [None]:
# Fit the Model
lr_model = lr_model.fit(m_train, n_train)

In [None]:
# Make Predictions
pred = lr_model.predict(m_test)

#### **1. Mean Absolute Error (MAE)**
- Mean Absolute Error is the average of the absolute differences between the ground-truth values and the predicted values.
- Strengths: Intuitive and easy to understand; expressed in the same units as the target.
- Weaknesses: Penalizes errors only linearly, so a few large errors are not emphasized over many small ones.
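
As a rough illustration of the MAE definition, the sketch below computes it by hand in plain Python. The four values and the helper name `manual_mae` are made up for this example; this is the same quantity that `mean_absolute_error` from scikit-learn reports. The example also shows why the absolute value matters: without it, positive and negative errors can cancel.

```python
# Toy values invented for illustration
y_true = [3.0, 1.5, 2.0, 4.5]
y_pred = [2.5, 1.5, 3.0, 4.0]


def manual_mae(y_true, y_pred):
    """Average of the absolute differences between targets and predictions."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


# Signed errors are 0.5, 0.0, -1.0, 0.5, so their plain average cancels to zero
mean_signed_error = sum(t - p for t, p in zip(y_true, y_pred)) / len(y_true)

mae = manual_mae(y_true, y_pred)
print(mean_signed_error)  # 0.0
print(mae)                # 0.5
```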

#### **2. Mean Squared Error (MSE)**
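- Mean Squared Error is the average of the squared differences between the target values and the values predicted by the regression model.
- Strengths: Highlights large errors, because squaring magnifies them.
- Weaknesses: Sensitive to outliers, and expressed in squared units of the target, which makes it harder to interpret.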
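
The outlier sensitivity can be seen in a small hand computation. The sketch below uses made-up values and hypothetical helper names (`manual_mse`, `manual_mae`); it replaces one reasonable prediction with a badly wrong one and compares how much each metric grows.

```python
# Toy values invented for illustration
y_true = [3.0, 1.5, 2.0, 4.5]
y_pred = [2.5, 1.5, 3.0, 4.0]
# Same predictions, except the last one is now off by 4.0
y_pred_outlier = [2.5, 1.5, 3.0, 0.5]


def manual_mse(y_true, y_pred):
    """Average of the squared differences between targets and predictions."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def manual_mae(y_true, y_pred):
    """Average of the absolute differences, for comparison."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


mse_clean = manual_mse(y_true, y_pred)
mse_outlier = manual_mse(y_true, y_pred_outlier)
mae_clean = manual_mae(y_true, y_pred)
mae_outlier = manual_mae(y_true, y_pred_outlier)

# A single large error inflates MSE far more than MAE
print(f"MAE grows by a factor of {mae_outlier / mae_clean:.2f}")
print(f"MSE grows by a factor of {mse_outlier / mse_clean:.2f}")
```

In this toy case one bad prediction raises MSE by a much larger factor than MAE, which is the behavior the strengths and weaknesses above describe.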

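#### **3. R-squared (R²)**
- R-squared is the proportion of the variance in the target that the model explains: 1 minus the ratio of the residual sum of squares to the total sum of squares.
- Strengths: Indicates overall model fit on a unit-free scale, where 1 is a perfect fit and 0 matches always predicting the mean of the target.
- Weaknesses: Cannot detect overfitting on its own. It can also be negative, which means the model performs worse than always predicting the mean.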
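
A minimal sketch of how R² behaves, computed by hand as 1 minus the residual sum of squares divided by the total sum of squares. The values and the helper name `manual_r2` are assumptions for illustration; `r2_score` from scikit-learn follows the same definition. Three sets of predictions are compared: reasonable ones, a constant prediction equal to the mean of the targets, and deliberately poor ones.

```python
# Toy values invented for illustration
y_true = [3.0, 1.5, 2.0, 4.5]


def manual_r2(y_true, y_pred):
    """R-squared: 1 - (residual sum of squares / total sum of squares)."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


good_pred = [2.5, 1.5, 3.0, 4.0]           # close to the targets
mean_pred = [sum(y_true) / len(y_true)] * 4  # always predict the mean
bad_pred = [1.0, 4.0, 4.5, 1.5]            # worse than predicting the mean

r2_good = manual_r2(y_true, good_pred)
r2_mean = manual_r2(y_true, mean_pred)
r2_bad = manual_r2(y_true, bad_pred)

print(f"good predictions: {r2_good:.3f}")
print(f"mean baseline:    {r2_mean:.3f}")
print(f"bad predictions:  {r2_bad:.3f}")
```

The mean baseline scores 0, and predictions that are worse than that baseline produce a negative score.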

In [None]:
# Preview the first test targets instead of printing the whole Series
print(n_test.head(10).to_string())

In [None]:
pred

In [None]:
# Evaluate Model Performance
mae = mean_absolute_error(n_test, pred)

mse = mean_squared_error(n_test, pred)

r2 = r2_score(n_test, pred)

In [None]:
print(f"Mean Absolute Error (mae): {mae}")
print(f"Mean Squared Error (mse): {mse}")
print(f"R-Squared (r2): {r2}")