# Linear Regression
You should build a machine learning pipeline using a linear regression model. In particular, you should do the following:
- Load the `housing` dataset using [Pandas](https://pandas.pydata.org/docs/reference/api/pandas.read_csv.html). You can find this dataset in the datasets folder.
- Split the dataset into training and test sets using [Scikit-Learn](https://scikit-learn.org/stable/modules/generated/sklearn.model_selection.train_test_split.html). 
- Conduct data exploration, data preprocessing, and feature engineering if necessary. 
- Train and test a linear regression model using [Scikit-Learn](https://scikit-learn.org/stable/modules/generated/sklearn.linear_model.LinearRegression.html).
- Check the documentation to identify the most important hyperparameters, attributes, and methods of the model. Use them in practice.
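
The steps above can be sketched end to end on synthetic data. This is a minimal illustration, not the assignment solution: the data, the true weights (3.0 and -2.0), the true intercept (5.0), and all variable names are assumptions made for the example. It touches one constructor hyperparameter (`fit_intercept`), two fitted attributes (`coef_`, `intercept_`), and three methods (`fit`, `predict`, `score`) of `LinearRegression`.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

# Synthetic data (assumption for illustration): y = 3*x0 - 2*x1 + 5 + small noise
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))
y = 3.0 * X[:, 0] - 2.0 * X[:, 1] + 5.0 + rng.normal(scale=0.1, size=200)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# fit_intercept=True (the default) estimates a bias term alongside the weights
model = LinearRegression(fit_intercept=True)
model.fit(X_train, y_train)

y_predicted = model.predict(X_test)  # one prediction per test row

print("coefficients:", model.coef_)              # fitted attribute: one weight per feature
print("intercept:", model.intercept_)            # fitted attribute: bias term
print("test R^2:", model.score(X_test, y_test))  # score() returns R^2 for regressors
```

Because the noise is small, the printed coefficients and intercept should land close to the values used to generate the data.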

## Business Problem Statement
Predict the `price` of a house from the remaining attributes in the `housing` dataset using a linear regression model.


## Importing Required Libraries

In [None]:
import pandas as pd
import numpy as np
import sklearn.model_selection
import matplotlib.pyplot as plt
import seaborn as sns
import sklearn.compose
from sklearn.preprocessing import StandardScaler
from sklearn.preprocessing import OneHotEncoder
import sklearn.svm
from sklearn.metrics import mean_squared_error
from sklearn.metrics import mean_absolute_error
from sklearn.metrics import r2_score 

## Data Collection

In [None]:

df = pd.read_csv("https://raw.githubusercontent.com/m-mahdavi/teaching/main/datasets/housing.csv")
df.head(10)

In [None]:
df.shape

In [None]:
# A fixed random_state makes the split reproducible across runs
df_train, df_test = sklearn.model_selection.train_test_split(df, test_size=0.25, random_state=42)

print("df size:",df.shape)
print("df train size:",df_train.shape)
print("df test size:",df_test.shape)

## Data Exploration

In [None]:
df_train.dtypes

In [None]:
sns.histplot(data=df_train, x="bedrooms")

In [None]:
# for column in df_train:
#     new = df_train[column].value_counts()
#     print(new)


df_train['bedrooms'].value_counts()


In [None]:
# List the distinct values of every column to spot categorical attributes
for column in df_train:
    print(f"Column Name : {column}")
    print(df_train[column].unique())

## Data Pre-Processing

In [None]:
x_train = df_train.drop(['price'], axis=1)
y_train = df_train['price']
x_test = df_test.drop(['price'], axis=1)
y_test = df_test['price']

print(f"X Train size: {x_train.shape}")
print(f"Y Train size: {y_train.shape}")
print(f"X Test size: {x_test.shape}")
print(f"Y Test size: {y_test.shape}")

print(x_test)

## Feature Engineering

In [None]:
numerical_attributes = x_train.select_dtypes(include=["int64","float64"])
numerical_attributes
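
The selection above keeps only the numeric columns. If the dataset also contains text columns (an assumption; `df_train.dtypes` shows whether it does), those need to be turned into numbers before a linear model can use them. A minimal sketch with `OneHotEncoder` on a hypothetical `furnishing` column, whose name and values are made up for the example:

```python
import pandas as pd
from sklearn.preprocessing import OneHotEncoder

# Hypothetical toy column, not taken from the housing dataset
toy = pd.DataFrame({"furnishing": ["furnished", "unfurnished", "furnished", "semi"]})

# handle_unknown="ignore" encodes categories unseen during fit as all zeros
encoder = OneHotEncoder(handle_unknown="ignore")
encoded = encoder.fit_transform(toy).toarray()  # sparse result converted to a dense array

print(encoder.categories_[0])  # learned categories, in sorted order
print(encoded)                 # one 0/1 column per category
```

Each row ends up with a single 1 in the column of its category, so the model never sees an artificial ordering between categories.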
    



In [None]:

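# Scale numeric columns and one-hot encode any non-numeric ones.
numerical_columns = x_train.select_dtypes(include="number").columns
categorical_columns = x_train.columns.difference(numerical_columns)
preprocessor = sklearn.compose.ColumnTransformer([
    ("num", StandardScaler(), numerical_columns),
    ("cat", OneHotEncoder(handle_unknown="ignore"), categorical_columns),
], sparse_threshold=0)  # sparse_threshold=0 always returns a dense array

# Fit on the training set only; transform() returns new data, so keep the results.
x_train = pd.DataFrame(preprocessor.fit_transform(x_train), columns=preprocessor.get_feature_names_out(), index=x_train.index)
x_test = pd.DataFrame(preprocessor.transform(x_test), columns=preprocessor.get_feature_names_out(), index=x_test.index)

print(f"X Train size: {x_train.shape}")
print(f"X Test size: {x_test.shape}")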





In [None]:
x_train.head(10)
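
`StandardScaler.transform` returns a new array and leaves its input unchanged, so the result has to be assigned before it can be used. The sketch below shows this on made-up numbers (the `area` column and its values are hypothetical). It also shows that the test set is scaled with the mean and standard deviation learned from the training set:

```python
import numpy as np
import pandas as pd
from sklearn.preprocessing import StandardScaler

# Hypothetical toy data, not taken from the housing dataset
toy_train = pd.DataFrame({"area": [1000.0, 2000.0, 3000.0]})
toy_test = pd.DataFrame({"area": [2000.0, 4000.0]})

scaler = StandardScaler()
scaler.fit(toy_train)                           # learns mean_ and scale_ from the training data
toy_train_scaled = scaler.transform(toy_train)  # returns a new NumPy array
toy_test_scaled = scaler.transform(toy_test)    # reuses the training statistics

print(scaler.mean_, scaler.scale_)  # training mean and (population) standard deviation
print(toy_train["area"].tolist())   # the original DataFrame is unchanged
print(toy_test_scaled.ravel())      # test values expressed in training-set units
```

Fitting the scaler on the training set alone keeps information about the test set from leaking into preprocessing.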

## Model Training

In [None]:
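# Linear regression for the continuous target `price`
from sklearn.linear_model import LinearRegression

model = LinearRegression()  # key hyperparameter: fit_intercept (default True)
model.fit(x_train, y_train)


In [None]:
# Fitted attributes: one coefficient per input feature, plus the intercept
coefficients = model.coef_
intercept = model.intercept_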


## Model Assessment

In [None]:
y_predicted = model.predict(x_test)
MSE = mean_squared_error(y_test, y_predicted)
MAE = mean_absolute_error(y_test, y_predicted)
R2 = r2_score(y_test, y_predicted)

print(f" MSE : {MSE}")
print(f" MAE : {MAE}")
print(f" R2  : {R2}")
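
For reference, the three scores printed above follow from short formulas. The sketch below recomputes them with NumPy on made-up values (the two arrays are illustrative, not model output) so they can be compared with the Scikit-Learn functions:

```python
import numpy as np
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score

# Illustrative values only
y_true = np.array([3.0, 5.0, 7.0, 9.0])
y_pred = np.array([2.5, 5.0, 8.0, 9.5])

errors = y_true - y_pred
mse = np.mean(errors ** 2)                      # mean squared error
mae = np.mean(np.abs(errors))                   # mean absolute error
ss_res = np.sum(errors ** 2)                    # residual sum of squares
ss_tot = np.sum((y_true - y_true.mean()) ** 2)  # total sum of squares
r2 = 1.0 - ss_res / ss_tot                      # coefficient of determination

print(mse, mean_squared_error(y_true, y_pred))
print(mae, mean_absolute_error(y_true, y_pred))
print(r2, r2_score(y_true, y_pred))
```

MSE is expressed in squared units of the target. Its square root (RMSE) is in the same units as `price`, which is often easier to interpret.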

## Conclusion