# Linear Regression: Predicting House Prices

#### Problem Statement 

You are provided with a dataset containing information about houses, including features like the size of the house (in square feet) and the number of bedrooms. The goal is to predict the price of a house based on these features using linear regression.

| Size (sqft) | Bedrooms | Price ($) |
|-------------|----------|-----------|
| 2104        | 3        | 399900    |
| 1600        | 3        | 329900    |
| 2400        | 3        | 369000    |
| 1416        | 2        | 232000    |
| 3000        | 4        | 539900    |





```python
data = {
    'Size (sqft)': [2104, 1600, 2400, 1416, 3000],
    'Bedrooms': [3, 3, 3, 2, 4],
    'Price ($)': [399900, 329900, 369000, 232000, 539900]
}
```


#### Create DataFrame
```python
import pandas as pd

df = pd.DataFrame(data)
df.head()
```

#### Split the data into features (X) and target (y)
```python
X = df[['Size (sqft)', 'Bedrooms']]
y = df['Price ($)']
```

#### Split the data into training and test sets
```python
from sklearn.model_selection import train_test_split

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
```
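
With only five rows, it is worth checking how many end up in each set. The following is a minimal sketch that rebuilds the same dataset so it runs on its own; with `test_size=0.2`, a single row is held out for testing and the other four are used for training.

```python
import pandas as pd
from sklearn.model_selection import train_test_split

# Same five-row dataset as above, rebuilt so this block is self-contained
df = pd.DataFrame({
    'Size (sqft)': [2104, 1600, 2400, 1416, 3000],
    'Bedrooms': [3, 3, 3, 2, 4],
    'Price ($)': [399900, 329900, 369000, 232000, 539900],
})
X = df[['Size (sqft)', 'Bedrooms']]
y = df['Price ($)']

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Report how many rows landed in each set
print(f"Training rows: {len(X_train)}, test rows: {len(X_test)}")
```

A one-row test set is too small to say much about how well the model generalizes; it is enough here only to illustrate the workflow.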

#### Plot Size vs. Price
```python
import matplotlib.pyplot as plt

plt.scatter(df['Size (sqft)'], df['Price ($)'], color='blue', label='Size')
plt.xlabel('Size (sqft)')
plt.ylabel('Price ($)')
plt.title('Size vs Price')
plt.show()
```

#### Plot Bedrooms vs. Price
```python
import matplotlib.pyplot as plt

plt.scatter(df['Bedrooms'], df['Price ($)'], color='green', label='Bedrooms')

plt.xlabel('Bedrooms')
plt.ylabel('Price ($)')
plt.title('Bedrooms vs Price')
plt.show()
```

#### Define & Train a linear regression model
```python
from sklearn.linear_model import LinearRegression

model = LinearRegression()
model.fit(X_train, y_train)
```

#### The intercept and coefficients
```python
theta0 = model.intercept_
print(f"Intercept (theta0) : {theta0}")

theta_1, theta_2 = model.coef_
print(f"Coefficient for size (theta_1) : {theta_1}")
print(f"Coefficient for bedrooms (theta_2) : {theta_2}")
```
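
The intercept and coefficients define the fitted equation `price = theta0 + theta_1 * size + theta_2 * bedrooms`. As a rough check, the sketch below applies that equation by hand to the first row of the table (2104 sqft, 3 bedrooms) and compares the result with `model.predict`. It rebuilds the data and model so it runs on its own; the names `manual` and `from_model` are introduced only for this illustration.

```python
import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

# Rebuild the dataset, split, and model exactly as in the steps above
df = pd.DataFrame({
    'Size (sqft)': [2104, 1600, 2400, 1416, 3000],
    'Bedrooms': [3, 3, 3, 2, 4],
    'Price ($)': [399900, 329900, 369000, 232000, 539900],
})
X = df[['Size (sqft)', 'Bedrooms']]
y = df['Price ($)']
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
model = LinearRegression().fit(X_train, y_train)

theta0 = model.intercept_
theta_1, theta_2 = model.coef_

# Apply the fitted equation by hand to the first row (2104 sqft, 3 bedrooms)
size, bedrooms = 2104, 3
manual = theta0 + theta_1 * size + theta_2 * bedrooms

# Ask the model for the same row; X.iloc[[0]] keeps the column names intact
from_model = model.predict(X.iloc[[0]])[0]

print(f"By hand: {manual:.2f}, model.predict: {from_model:.2f}")
```

The two numbers should agree up to floating-point rounding, since `predict` evaluates the same linear equation.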

#### Predict the price from sample data
```python
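# Wrap the sample in a DataFrame so its column names match the training features
sample_data = pd.DataFrame([[2500, 3]], columns=X.columns)
prediction = model.predict(sample_data)
print(f"Prediction for sample data {sample_data.values.tolist()}: {prediction[0]:.2f}")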
```

#### Make predictions for the full dataset
```python
y_pred = model.predict(X)
df['Predicted Price ($)'] = y_pred.round(2)
print(df)
```

#### Compare actual and predicted values
```python
comparison = pd.DataFrame({
    'Actual Price ($)': y,
    'Predicted Price ($)': y_pred.round(2),
    'Difference': (y - y_pred).round(2)
})
print(comparison)
```
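
The comparison above covers all five rows, four of which the model was trained on, so most of the differences reflect how well the model fits data it has already seen. A sketch of scoring the training and test sets separately follows. Mean absolute error is a choice made for this example rather than something the steps above prescribe; it is used here because R² is not well defined for a single test row.

```python
import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_absolute_error
from sklearn.model_selection import train_test_split

# Rebuild the dataset, split, and model exactly as in the steps above
df = pd.DataFrame({
    'Size (sqft)': [2104, 1600, 2400, 1416, 3000],
    'Bedrooms': [3, 3, 3, 2, 4],
    'Price ($)': [399900, 329900, 369000, 232000, 539900],
})
X = df[['Size (sqft)', 'Bedrooms']]
y = df['Price ($)']
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
model = LinearRegression().fit(X_train, y_train)

# Average absolute error in dollars on rows the model was fitted to
train_mae = mean_absolute_error(y_train, model.predict(X_train))

# Error on the single held-out row the model never saw during training
test_mae = mean_absolute_error(y_test, model.predict(X_test))

print(f"Training MAE: {train_mae:.2f}")
print(f"Test MAE:     {test_mae:.2f}")
```

With a dataset this small the test error comes from one house, so it should be read as an illustration of the procedure, not as a reliable estimate.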