# **Problem Statement**  
## **4. Predict house prices using linear regression and deploy as Flask app.**

### Problem Statement

House Price Prediction Using Linear Regression and Flask Deployment

The objective is to predict house prices based on features like area, number of bedrooms, and age of the house using Linear Regression, and deploy the trained model as a Flask REST API for real-time predictions.

### Constraints & Example Inputs/Outputs

### Constraints
- Regression problem
- Small, structured tabular dataset
- Model must be interpretable
- API response time < 1 second
- No deep learning

### Input Features
| Feature  | Description              |
| -------- | ------------------------ |
| area     | Area in square feet      |
| bedrooms | Number of bedrooms       |
| age      | Age of the house (years) |

### Example Input (API Request):
```json
{
  "area": 1200,
  "bedrooms": 3,
  "age": 10
}
```

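### Expected Output:
The response has the shape below. The value shown is illustrative; the actual number depends on the trained model.
```json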
{
  "predicted_price": 750000
}
```
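Before wiring this contract into an API, it can help to pin it down in plain Python. The sketch below uses a hypothetical helper, `validate_payload`, which is not part of the original notebook. It checks that a request body carries the three numeric features. The rule that values must be non-negative is an assumption made for illustration.

```python
# Hypothetical helper: checks that a request body matches the input contract above.
REQUIRED_FEATURES = ("area", "bedrooms", "age")

def validate_payload(payload):
    """Return a list of problems found in the payload (empty list means valid)."""
    if not isinstance(payload, dict):
        return ["payload must be a JSON object"]
    problems = []
    for name in REQUIRED_FEATURES:
        if name not in payload:
            problems.append(f"missing field: {name}")
            continue
        value = payload[name]
        # bool is a subclass of int in Python, so exclude it explicitly
        if isinstance(value, bool) or not isinstance(value, (int, float)):
            problems.append(f"{name} must be a number")
        elif value < 0:  # assumption: negative values are never meaningful here
            problems.append(f"{name} must be non-negative")
    return problems

print(validate_payload({"area": 1200, "bedrooms": 3, "age": 10}))  # []
print(validate_payload({"area": 1200, "bedrooms": "three"}))
```

Returning a list of problems rather than raising on the first one lets an API report every issue in a single response.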

### Solution Approach

### Step 1: Understand the Business Logic
House price increases with:
- Larger area
- More bedrooms
- Newer construction

### Step 2: Why Linear Regression?
- Simple
- Explainable
- Widely accepted in pricing problems
- Fast inference (ideal for APIs)
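Written out for the three features in this problem, the model is a weighted sum. The symbols $\beta_0, \dots, \beta_3$ are notation introduced here for illustration, and the expected signs follow from the business logic in Step 1.

$$
\hat{y} = \beta_0 + \beta_1\,\text{area} + \beta_2\,\text{bedrooms} + \beta_3\,\text{age},
\qquad \beta_1 > 0,\quad \beta_2 > 0,\quad \beta_3 < 0 \ \text{(expected)}
$$

Each coefficient reads as the price change per unit of its feature with the other features held fixed, which is what makes the model easy to explain.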

### Step 3: End-to-End Flow
1. Create / load housing dataset
2. Train linear regression model
3. Save trained model
4. Build Flask API
5. Send test requests
6. Validate predictions

### Solution Code

In [1]:
# Approach 1: Brute Force (Manual Price Estimation)
def manual_price_estimator(area, bedrooms, age):
    base_price = 500000
    price = base_price
    price += area * 100
    price += bedrooms * 50000
    price -= age * 10000
    return price

manual_price_estimator(1200, 3, 10)


670000

### Limitations
- Hardcoded logic
- No learning from data
- Not scalable
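To make these limitations concrete, the sketch below scores the hand-written rule against the five example houses used in Approach 2. The choice of mean absolute error as the metric is an assumption made for illustration.

```python
# Rule-based estimator, same arithmetic as Approach 1
def manual_price_estimator(area, bedrooms, age):
    return 500000 + area * 100 + bedrooms * 50000 - age * 10000

# (area, bedrooms, age, actual price): the rows used in Approach 2
houses = [
    (800, 2, 20, 400000),
    (1000, 2, 15, 500000),
    (1200, 3, 10, 650000),
    (1500, 4, 5, 800000),
    (1800, 4, 2, 950000),
]

# Absolute error of the hand-written rule on each house
errors = [abs(manual_price_estimator(a, b, g) - price) for a, b, g, price in houses]
mae = sum(errors) / len(errors)

print(errors)  # [80000, 50000, 20000, 0, 90000]
print(mae)     # 48000.0
```

The fixed weights cannot adjust to reduce these errors, whereas a fitted model estimates them from the data.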


### Alternative Solution

In [2]:
# Approach 2: Optimized (Linear Regression – ML)

# Step 1: Create Dataset

import pandas as pd

data = pd.DataFrame({
    "area": [800, 1000, 1200, 1500, 1800],
    "bedrooms": [2, 2, 3, 4, 4],
    "age": [20, 15, 10, 5, 2],
    "price": [400000, 500000, 650000, 800000, 950000]
})

X = data[["area", "bedrooms", "age"]]
y = data["price"]


In [3]:
# Step 2: Train Model

from sklearn.linear_model import LinearRegression

model = LinearRegression()
model.fit(X, y)

model.coef_, model.intercept_


(array([  399.71346705, 20988.53868195, -5945.55873926]),
 np.float64(155229.22636103106))
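Since interpretability is one of the constraints, it is worth reading these numbers by name. The sketch below refits the same toy dataset, pairs each coefficient with its feature, and rebuilds one prediction by hand. The variable names `features`, `by_hand`, and `by_model` are introduced here for illustration.

```python
import pandas as pd
from sklearn.linear_model import LinearRegression

# Same toy dataset as in Step 1
data = pd.DataFrame({
    "area": [800, 1000, 1200, 1500, 1800],
    "bedrooms": [2, 2, 3, 4, 4],
    "age": [20, 15, 10, 5, 2],
    "price": [400000, 500000, 650000, 800000, 950000],
})
features = ["area", "bedrooms", "age"]
model = LinearRegression().fit(data[features], data["price"])

# Pair each coefficient with its feature name for readability
for name, coef in zip(features, model.coef_):
    print(f"{name:>9}: {coef:,.2f} per unit")
print(f"intercept: {model.intercept_:,.2f}")

# Rebuild one prediction by hand: intercept + sum(coefficient * value)
house = {"area": 1200, "bedrooms": 3, "age": 10}
by_hand = model.intercept_ + sum(model.coef_[i] * house[f] for i, f in enumerate(features))
by_model = model.predict(pd.DataFrame([house]))[0]
print(by_hand, by_model)
```

The signs agree with Step 1: area and bedrooms add to the price, age subtracts from it. With only five rows, the magnitudes are illustrative rather than reliable estimates.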

In [4]:
# Step 3: Save Model

import joblib

joblib.dump(model, "house_price_model.pkl")


['house_price_model.pkl']
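A quick round-trip check can confirm that the saved file reproduces the in-memory model before it is handed to the API. This is a minimal sketch: it writes to a temporary directory (an assumption, so that no files are left behind) instead of the working directory used above.

```python
import os
import tempfile

import joblib
import pandas as pd
from sklearn.linear_model import LinearRegression

data = pd.DataFrame({
    "area": [800, 1000, 1200, 1500, 1800],
    "bedrooms": [2, 2, 3, 4, 4],
    "age": [20, 15, 10, 5, 2],
    "price": [400000, 500000, 650000, 800000, 950000],
})
X, y = data[["area", "bedrooms", "age"]], data["price"]
model = LinearRegression().fit(X, y)

# Save and reload inside a temporary directory
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "house_price_model.pkl")
    joblib.dump(model, path)
    reloaded = joblib.load(path)

# The reloaded model should give the same predictions as the original
same = bool((model.predict(X) == reloaded.predict(X)).all())
print(same)
```

Files written by `joblib.dump` are pickle-based, so they should only be loaded from trusted sources, ideally with the same scikit-learn version that produced them.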

### Alternative Approaches

**Brute Force**
- Rule-based pricing
- Not adaptive

**ML Alternatives**
- Ridge / Lasso Regression
- Decision Trees
- Random Forest (often more accurate on non-linear data, but less interpretable)

**✅ Why Linear Regression First?**
- Easy to explain to business
- Fast deployment
- Strong baseline
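Swapping in one of the ML alternatives takes little code because scikit-learn estimators share the same `fit`/`predict` interface. The sketch below fits `Ridge` next to `LinearRegression` on the same toy data. The value `alpha=1.0` is scikit-learn's default and has not been tuned, so the comparison is illustrative only.

```python
import pandas as pd
from sklearn.linear_model import LinearRegression, Ridge

data = pd.DataFrame({
    "area": [800, 1000, 1200, 1500, 1800],
    "bedrooms": [2, 2, 3, 4, 4],
    "age": [20, 15, 10, 5, 2],
    "price": [400000, 500000, 650000, 800000, 950000],
})
X, y = data[["area", "bedrooms", "age"]], data["price"]

# alpha=1.0 is the library default regularization strength, not a tuned value
models = {"linear": LinearRegression(), "ridge": Ridge(alpha=1.0)}
test_house = pd.DataFrame({"area": [1200], "bedrooms": [3], "age": [10]})

predictions = {}
for name, estimator in models.items():
    estimator.fit(X, y)
    predictions[name] = float(estimator.predict(test_house)[0])
    print(name, estimator.coef_.round(2), round(predictions[name], 2))
```

Ridge keeps the same one-coefficient-per-feature form, so it stays as explainable as the baseline while shrinking the weights toward zero.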

### Test Case

In [5]:
# Test Case 1: Normal House
test_input = pd.DataFrame({
    "area": [1200],
    "bedrooms": [3],
    "age": [10]
})

model.predict(test_input)


array([638395.41547278])

In [6]:
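# Test Case 2: Large New House
# area=2000 lies outside the training range (800-1800), so this is an extrapolation
test_input = pd.DataFrame({
    "area": [2000],
    "bedrooms": [4],
    "age": [1]
})

# Predict the price for this house
model.predict(test_input)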
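Spot checks on individual houses can be complemented with summary metrics. The sketch below computes mean absolute error and R² on the training rows themselves. These are in-sample figures, so they are optimistic, and a held-out set would be needed for a realistic estimate. The choice of these two metrics is an assumption.

```python
import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_absolute_error, r2_score

data = pd.DataFrame({
    "area": [800, 1000, 1200, 1500, 1800],
    "bedrooms": [2, 2, 3, 4, 4],
    "age": [20, 15, 10, 5, 2],
    "price": [400000, 500000, 650000, 800000, 950000],
})
X, y = data[["area", "bedrooms", "age"]], data["price"]
model = LinearRegression().fit(X, y)

# Compare fitted values against the known prices (in-sample evaluation)
fitted = model.predict(X)
mae = mean_absolute_error(y, fitted)
r2 = r2_score(y, fitted)

print(f"in-sample MAE: {mae:,.0f}")
print(f"in-sample R^2: {r2:.4f}")
```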


### Flask Deployment 

In [7]:
from flask import Flask, request, jsonify
import joblib
import pandas as pd

app = Flask(__name__)
model = joblib.load("house_price_model.pkl")

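@app.route("/predict", methods=["POST"])
def predict():
    # silent=True returns None instead of raising when the body is not valid JSON
    data = request.get_json(silent=True)
    if not isinstance(data, dict):
        return jsonify({"error": "request body must be a JSON object"}), 400

    # Reject requests that omit any of the three model features
    missing = [f for f in ("area", "bedrooms", "age") if f not in data]
    if missing:
        return jsonify({"error": f"missing fields: {missing}"}), 400

    # Column names must match the DataFrame used for training
    df = pd.DataFrame([{
        "area": data["area"],
        "bedrooms": data["bedrooms"],
        "age": data["age"]
    }])

    prediction = model.predict(df)[0]
    return jsonify({"predicted_price": float(prediction)})

if __name__ == "__main__":
    # debug=True is for local development only; disable it in production
    app.run(debug=True)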




### API Testing 

In [None]:
# Using curl

curl -X POST http://127.0.0.1:5000/predict \
-H "Content-Type: application/json" \
-d '{"area":1200,"bedrooms":3,"age":10}'


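### Expected Response

The value is the model's prediction for this input, the same as in Test Case 1. It is shown here rounded to two decimals; the API returns the full float.

In [None]:
{
  "predicted_price": 638395.42
}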
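The same request can also be exercised without starting a server, using Flask's built-in test client. This is a minimal sketch: it trains the model in memory rather than loading `house_price_model.pkl` (an assumption, so that the example is self-contained) and defines a simplified copy of the `/predict` route.

```python
import pandas as pd
from flask import Flask, jsonify, request
from sklearn.linear_model import LinearRegression

# Train in memory so the sketch does not depend on a saved model file
data = pd.DataFrame({
    "area": [800, 1000, 1200, 1500, 1800],
    "bedrooms": [2, 2, 3, 4, 4],
    "age": [20, 15, 10, 5, 2],
    "price": [400000, 500000, 650000, 800000, 950000],
})
model = LinearRegression().fit(data[["area", "bedrooms", "age"]], data["price"])

app = Flask(__name__)

@app.route("/predict", methods=["POST"])
def predict():
    payload = request.get_json()
    df = pd.DataFrame([{k: payload[k] for k in ("area", "bedrooms", "age")}])
    return jsonify({"predicted_price": float(model.predict(df)[0])})

# The test client sends requests to the app in-process, without opening a port
client = app.test_client()
response = client.post("/predict", json={"area": 1200, "bedrooms": 3, "age": 10})
result = response.get_json()
print(response.status_code, result)
```

Because no network is involved, a check like this can run in an automated test suite alongside the curl-based manual test.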


### Business Explanation

**Real-World Use**
- Real estate platforms
- Loan eligibility systems
- Property valuation tools

**Business Impact**
- Faster pricing decisions
- Reduced manual valuation cost
- Consistent predictions

### Complexity Analysis

**Training**
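- Time: O(n × features²) for the least-squares solve (effectively linear in n when the feature count is small and fixed)
- Space: O(n × features) for the training matrix; the fitted model itself stores only O(features) numbers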

**Prediction (API)**
- Time: O(features)
- Space: O(1)
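These costs are easier to see when the two phases are written out with NumPy. The sketch below fits ordinary least squares with `np.linalg.lstsq` and predicts with a single dot product. It illustrates the technique and is not necessarily how scikit-learn implements `LinearRegression` internally.

```python
import numpy as np

# Same five houses: columns are area, bedrooms, age
X = np.array([
    [800, 2, 20],
    [1000, 2, 15],
    [1200, 3, 10],
    [1500, 4, 5],
    [1800, 4, 2],
], dtype=float)
y = np.array([400000, 500000, 650000, 800000, 950000], dtype=float)

# Training: add a column of ones for the intercept, then solve min ||A w - y||^2
A = np.column_stack([np.ones(len(X)), X])
weights, *_ = np.linalg.lstsq(A, y, rcond=None)
intercept, coef = weights[0], weights[1:]

# Prediction: one multiply-add per feature, independent of the training set size
new_house = np.array([1200, 3, 10], dtype=float)
prediction = intercept + coef @ new_house

print(coef, intercept, prediction)
```

Training touches every row of the data, while prediction only needs the stored weights, which is why inference stays fast enough for an API.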

#### Thank You!!