# Task 3: House Recommendation System

This notebook demonstrates a customer recommendation system (the `HouseRecommendationSystem` class imported from `../src`) that suggests houses based on:
- Price range
- Area requirements
- Number of bedrooms and bathrooms
- Other preferences (garage, quality, etc.)

## Table of Contents
1. Load Data and Initialize System
2. Basic Filtering Examples
3. Comprehensive Recommendations
4. Similar House Finder
5. Best Value Recommendations
6. Visualize Recommendations

In [None]:
# Import required libraries
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
import warnings
warnings.filterwarnings('ignore')

# Set visualization style
sns.set_style('whitegrid')
plt.rcParams['figure.figsize'] = (12, 6)

# Import custom modules
import sys
sys.path.append('../src')
from recommendation_system import HouseRecommendationSystem, create_customer_profile

print("Libraries imported successfully!")

## 1. Load Data and Initialize System

In [None]:
# Load the training data
df = pd.read_csv('../data/train.csv')
print(f"Loaded {len(df)} houses")

# Initialize recommendation system
recommender = HouseRecommendationSystem()
recommender.load_data(df, target_col='SalePrice')

# Display sample data
print("\nSample houses:")
df[['Id', 'SalePrice', 'GrLivArea', 'BedroomAbvGr', 'FullBath', 
    'GarageCars', 'YearBuilt', 'Neighborhood']].head(10)
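
Before handing a DataFrame to the recommender, it can help to confirm that the columns used throughout this notebook are present. The following is a minimal sketch; `REQUIRED_COLUMNS` and `missing_columns` are hypothetical names and are not part of `recommendation_system`:

```python
import pandas as pd

# Columns referenced by the cells in this notebook (hypothetical constant).
REQUIRED_COLUMNS = [
    "Id", "SalePrice", "GrLivArea", "BedroomAbvGr",
    "FullBath", "GarageCars", "YearBuilt", "Neighborhood",
]

def missing_columns(houses, required=REQUIRED_COLUMNS):
    """Return the required column names that are absent from `houses`."""
    return [name for name in required if name not in houses.columns]

# Toy frame with only three of the expected columns.
toy = pd.DataFrame(columns=["Id", "SalePrice", "GrLivArea"])
absent = missing_columns(toy)
print(absent)
```

An empty list means every column used later in the notebook is available.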

## 2. Basic Filtering Examples

### Example 1: Filter by Price Range

In [None]:
# Find houses between $150,000 and $200,000
price_filtered = recommender.filter_by_price(min_price=150000, max_price=200000)

print(f"\nFound {len(price_filtered)} houses in price range $150,000 - $200,000")
print("\nSample results:")
price_filtered[['Id', 'SalePrice', 'GrLivArea', 'BedroomAbvGr', 
                'FullBath', 'Neighborhood']].head(10)
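
The internals of `filter_by_price` are not shown in this notebook. As a rough sketch of the idea, assuming the method keeps rows whose `SalePrice` falls within an inclusive range, a price filter can be written as a pandas boolean mask. The function name `filter_by_price_sketch` and the toy data are hypothetical:

```python
import pandas as pd

def filter_by_price_sketch(houses, min_price=None, max_price=None,
                           price_col="SalePrice"):
    """Keep rows with min_price <= price <= max_price; None means no bound."""
    mask = pd.Series(True, index=houses.index)
    if min_price is not None:
        mask &= houses[price_col] >= min_price
    if max_price is not None:
        mask &= houses[price_col] <= max_price
    return houses[mask]

toy = pd.DataFrame({
    "Id": [1, 2, 3, 4],
    "SalePrice": [120000, 150000, 185000, 230000],
})
in_range = filter_by_price_sketch(toy, min_price=150000, max_price=200000)
print(in_range["Id"].tolist())  # [2, 3]
```

Whether the real method treats the bounds as inclusive is an assumption; check the source in `../src` if boundary cases matter.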

### Example 2: Filter by Living Area

In [None]:
# Reset recommender data
recommender.load_data(df, target_col='SalePrice')

# Find houses with 1500-2000 sq ft living area
area_filtered = recommender.filter_by_area(min_area=1500, max_area=2000)

print(f"\nFound {len(area_filtered)} houses with 1500-2000 sq ft")
area_filtered[['Id', 'SalePrice', 'GrLivArea', 'BedroomAbvGr', 
               'FullBath']].head(10)

### Example 3: Filter by Multiple Features

In [None]:
# Reset recommender data
recommender.load_data(df, target_col='SalePrice')

# Define feature filters
filters = {
    'BedroomAbvGr': {'min': 3, 'max': 4},
    'FullBath': {'min': 2},
    'GarageCars': {'min': 2}
}

feature_filtered = recommender.filter_by_features(filters)

print(f"\nFound {len(feature_filtered)} houses matching criteria:")
print("- 3-4 bedrooms")
print("- At least 2 full bathrooms")
print("- At least 2-car garage")
print("\nSample results:")
feature_filtered[['Id', 'SalePrice', 'GrLivArea', 'BedroomAbvGr', 
                  'FullBath', 'GarageCars']].head(10)
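
The `filters` dictionary maps a column name to optional `min` and `max` bounds. A minimal sketch of how such a dictionary could be applied with pandas is shown below; `filter_by_features_sketch` and the toy data are hypothetical, and the real `filter_by_features` may behave differently (for example, in how it handles missing values):

```python
import pandas as pd

def filter_by_features_sketch(houses, filters):
    """Apply {column: {'min': ..., 'max': ...}} bounds; both keys are optional."""
    mask = pd.Series(True, index=houses.index)
    for column, bounds in filters.items():
        if "min" in bounds:
            mask &= houses[column] >= bounds["min"]
        if "max" in bounds:
            mask &= houses[column] <= bounds["max"]
    return houses[mask]

toy = pd.DataFrame({
    "Id": [1, 2, 3, 4],
    "BedroomAbvGr": [3, 5, 4, 3],
    "FullBath": [2, 2, 2, 1],
    "GarageCars": [2, 2, 1, 2],
})
toy_filters = {
    "BedroomAbvGr": {"min": 3, "max": 4},
    "FullBath": {"min": 2},
    "GarageCars": {"min": 2},
}
matched = filter_by_features_sketch(toy, toy_filters)
print(matched["Id"].tolist())  # [1]
```

Only house 1 passes: house 2 has too many bedrooms, house 3 has a one-car garage, and house 4 has a single full bathroom.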

## 3. Comprehensive Recommendations

### Scenario 1: Young Family Looking for First Home

In [None]:
# Reset recommender
recommender.load_data(df, target_col='SalePrice')

# Young family preferences
young_family_prefs = {
    'max_price': 175000,
    'min_area': 1200,
    'max_area': 1800,
    'bedrooms': {'min': 3},
    'bathrooms': {'min': 2},
    'garage': {'min': 1}
}

recommendations = recommender.get_recommendations(young_family_prefs, n_recommendations=10)

print("\n" + "=" * 70)
print("RECOMMENDATIONS FOR: Young Family (First Home)")
print("=" * 70)
print("\nPreferences:")
print("- Budget: Under $175,000")
print("- Living Area: 1,200 - 1,800 sq ft")
print("- Bedrooms: 3+")
print("- Bathrooms: 2+")
print("- Garage: 1+ car")
print("\nTop Recommendations:")

if not recommendations.empty:
    summary = recommender.summarize_recommendations(recommendations)
    display(summary)
else:
    print("No houses match all criteria. Try relaxing some requirements.")

### Scenario 2: Established Family with Higher Budget

In [None]:
# Reset recommender
recommender.load_data(df, target_col='SalePrice')

# Established family preferences
established_family_prefs = {
    'min_price': 200000,
    'max_price': 350000,
    'min_area': 2000,
    'bedrooms': {'min': 4},
    'bathrooms': {'min': 2},
    'garage': {'min': 2}
}

recommendations = recommender.get_recommendations(established_family_prefs, n_recommendations=10)

print("\n" + "=" * 70)
print("RECOMMENDATIONS FOR: Established Family (Larger Home)")
print("=" * 70)
print("\nPreferences:")
print("- Budget: $200,000 - $350,000")
print("- Living Area: 2,000+ sq ft")
print("- Bedrooms: 4+")
print("- Bathrooms: 2+")
print("- Garage: 2+ car")
print("\nTop Recommendations:")

if not recommendations.empty:
    summary = recommender.summarize_recommendations(recommendations)
    display(summary)
else:
    print("No houses match all criteria. Try relaxing some requirements.")

### Scenario 3: Retiree Downsizing

In [None]:
# Reset recommender
recommender.load_data(df, target_col='SalePrice')

# Retiree preferences
retiree_prefs = {
    'max_price': 150000,
    'min_area': 800,
    'max_area': 1500,
    'bedrooms': {'min': 2, 'max': 3},
    'bathrooms': {'min': 1}
}

recommendations = recommender.get_recommendations(retiree_prefs, n_recommendations=10)

print("\n" + "=" * 70)
print("RECOMMENDATIONS FOR: Retiree (Downsizing)")
print("=" * 70)
print("\nPreferences:")
print("- Budget: Under $150,000")
print("- Living Area: 800 - 1,500 sq ft")
print("- Bedrooms: 2-3")
print("- Bathrooms: 1+")
print("\nTop Recommendations:")

if not recommendations.empty:
    summary = recommender.summarize_recommendations(recommendations)
    display(summary)
else:
    print("No houses match all criteria. Try relaxing some requirements.")
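
The preference dictionaries in the scenarios above use customer-friendly keys (`bedrooms`, `bathrooms`, `garage`, `min_price`, and so on) rather than dataset column names. One plausible way to bridge the two is a translation step like the sketch below. The mapping to `BedroomAbvGr`, `FullBath`, and `GarageCars` is an assumption based on the columns used in Section 2, and `preferences_to_bounds` is a hypothetical helper, not a function from `recommendation_system`:

```python
# Assumed mapping from preference keys to dataset columns (hypothetical).
FEATURE_COLUMNS = {
    "bedrooms": "BedroomAbvGr",
    "bathrooms": "FullBath",
    "garage": "GarageCars",
}
RANGE_COLUMNS = {"price": "SalePrice", "area": "GrLivArea"}

def preferences_to_bounds(prefs):
    """Translate a preference dict into {column: {'min': ..., 'max': ...}}."""
    bounds = {}
    # min_price / max_price / min_area / max_area become column ranges.
    for key, column in RANGE_COLUMNS.items():
        entry = {}
        if f"min_{key}" in prefs:
            entry["min"] = prefs[f"min_{key}"]
        if f"max_{key}" in prefs:
            entry["max"] = prefs[f"max_{key}"]
        if entry:
            bounds[column] = entry
    # bedrooms / bathrooms / garage already carry their own min/max dicts.
    for key, column in FEATURE_COLUMNS.items():
        if key in prefs:
            bounds[column] = dict(prefs[key])
    return bounds

retiree_example = {
    "max_price": 150000,
    "min_area": 800,
    "max_area": 1500,
    "bedrooms": {"min": 2, "max": 3},
    "bathrooms": {"min": 1},
}
retiree_bounds = preferences_to_bounds(retiree_example)
print(retiree_bounds)
```

The resulting dictionary has the same shape as the `filters` argument in Example 3, so the same bound-checking logic could be reused. How `get_recommendations` ranks the matching houses is not covered by this sketch.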

## 4. Similar House Finder

In [None]:
# Reset recommender
recommender.load_data(df, target_col='SalePrice')

# Select a reference house by row position (iloc[100] is the 101st row, not necessarily Id 100)
reference_house = df.iloc[100]

print("Reference House:")
print("=" * 50)
print(f"Price: ${reference_house['SalePrice']:,.0f}")
print(f"Living Area: {reference_house['GrLivArea']:,.0f} sq ft")
print(f"Bedrooms: {reference_house['BedroomAbvGr']}")
print(f"Bathrooms: {reference_house['FullBath']}")
print(f"Year Built: {reference_house['YearBuilt']}")
print(f"Neighborhood: {reference_house['Neighborhood']}")

# Find similar houses
similar_houses = recommender.find_similar_houses(reference_house, n_similar=5)

print("\n\nTop 5 Similar Houses:")
print("=" * 50)
similar_houses[['Id', 'SalePrice', 'GrLivArea', 'BedroomAbvGr', 
                'FullBath', 'YearBuilt', 'Neighborhood', 'Similarity_Score']]
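
The code behind `find_similar_houses` is not shown in this notebook. One common way to do a nearest-neighbor lookup, which may differ from what the class actually does, is to standardize the numeric features and rank houses by Euclidean distance to the reference. The function `nearest_houses` and the toy array are hypothetical:

```python
import numpy as np

def nearest_houses(features, reference, n_similar=5):
    """Return (row indices, distances) of the rows closest to `reference`.

    Features are z-scored per column so that large-valued columns such as
    price do not dominate the Euclidean distance.
    """
    features = np.asarray(features, dtype=float)
    reference = np.asarray(reference, dtype=float)
    mean = features.mean(axis=0)
    std = features.std(axis=0)
    std[std == 0] = 1.0  # constant columns: avoid division by zero
    scaled = (features - mean) / std
    scaled_ref = (reference - mean) / std
    distances = np.linalg.norm(scaled - scaled_ref, axis=1)
    order = np.argsort(distances)[:n_similar]
    return order, distances[order]

# Toy data, columns: SalePrice, GrLivArea, BedroomAbvGr
toy = np.array([
    [200000, 1800, 3],
    [205000, 1750, 3],
    [120000,  900, 2],
    [350000, 2600, 4],
])
idx, dist = nearest_houses(toy, toy[0], n_similar=2)
print(idx.tolist())  # [0, 1]
```

Because the reference row is itself part of the data, it comes back first with distance zero. A real lookup would typically drop it or request one extra neighbor.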

## 5. Best Value Recommendations

### Best Value by Price per Square Foot

In [None]:
# Reset recommender
recommender.load_data(df, target_col='SalePrice')

# Find best value by price per sq ft
best_value_psf = recommender.get_best_value_houses(n_houses=10, 
                                                    value_metric='price_per_sqft')

print("\nTop 10 Best Value Houses (Lowest Price per Sq Ft):")
print("=" * 70)
best_value_psf[['Id', 'SalePrice', 'GrLivArea', 'PricePerSqFt', 
                'BedroomAbvGr', 'FullBath', 'OverallQual']].head(10)

### Best Value by Quality-to-Price Ratio

In [None]:
# Reset recommender
recommender.load_data(df, target_col='SalePrice')

# Find best value by quality/price ratio
best_value_quality = recommender.get_best_value_houses(n_houses=10, 
                                                       value_metric='quality_price_ratio')

print("\nTop 10 Best Value Houses (Highest Quality per Dollar):")
print("=" * 70)
best_value_quality[['Id', 'SalePrice', 'OverallQual', 'QualityPriceRatio',
                    'GrLivArea', 'BedroomAbvGr', 'YearBuilt']].head(10)
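
The `PricePerSqFt` and `QualityPriceRatio` columns are produced inside `get_best_value_houses`, and their exact formulas are not shown here. The sketch below assumes the natural definitions (price divided by living area, and `OverallQual` per $100,000 of price); the scaling factor in particular is an assumption:

```python
import pandas as pd

toy = pd.DataFrame({
    "Id": [1, 2, 3],
    "SalePrice": [150000, 200000, 120000],
    "GrLivArea": [1500, 1600, 1000],
    "OverallQual": [6, 7, 6],
})

# Lower is better: dollars paid per square foot of living area.
toy["PricePerSqFt"] = toy["SalePrice"] / toy["GrLivArea"]

# Higher is better: quality points per $100,000 (assumed scaling).
toy["QualityPriceRatio"] = toy["OverallQual"] / (toy["SalePrice"] / 100000)

cheapest_per_sqft = toy.nsmallest(2, "PricePerSqFt")
best_quality_per_dollar = toy.nlargest(2, "QualityPriceRatio")

print(cheapest_per_sqft["Id"].tolist())        # [1, 3]
print(best_quality_per_dollar["Id"].tolist())  # [3, 1]
```

The two metrics can rank the same houses differently: house 1 is the cheapest per square foot, while house 3 offers the most quality per dollar.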

## 6. Visualize Recommendations

In [None]:
# Reset recommender for visualization
recommender.load_data(df, target_col='SalePrice')

# Get recommendations for a specific scenario
prefs = {
    'max_price': 200000,
    'min_area': 1500,
    'bedrooms': {'min': 3},
    'bathrooms': {'min': 2}
}

recs = recommender.get_recommendations(prefs, n_recommendations=20)

if not recs.empty:
    # Price vs Area scatter
    fig, axes = plt.subplots(1, 2, figsize=(15, 6))
    
    # Plot 1: Recommended houses
    axes[0].scatter(df['GrLivArea'], df['SalePrice'], alpha=0.3, 
                   label='All Houses', s=30)
    axes[0].scatter(recs['GrLivArea'], recs['SalePrice'], alpha=0.8, 
                   color='red', label='Recommended', s=80, edgecolors='black')
    axes[0].set_xlabel('Living Area (sq ft)')
    axes[0].set_ylabel('Sale Price ($)')
    axes[0].set_title('Recommended Houses - Price vs Area')
    axes[0].legend()
    axes[0].grid(True, alpha=0.3)
    
    # Plot 2: Price distribution
    axes[1].hist(df['SalePrice'], bins=50, alpha=0.5, label='All Houses')
    axes[1].hist(recs['SalePrice'], bins=20, alpha=0.8, 
                color='red', label='Recommended')
    axes[1].set_xlabel('Sale Price ($)')
    axes[1].set_ylabel('Count')
    axes[1].set_title('Price Distribution Comparison')
    axes[1].legend()
    axes[1].grid(True, alpha=0.3)
    
    plt.tight_layout()
    plt.show()
else:
    print("No recommendations to visualize")

## Summary and Recommendations

### Key Features of the Recommendation System:

1. **Price-Based Filtering**: Find houses within a specific budget
2. **Area-Based Filtering**: Search by living area requirements
3. **Multi-Feature Filtering**: Combine multiple criteria (bedrooms, bathrooms, garage, etc.)
4. **Similar House Finder**: Find houses similar to a reference property using k-nearest neighbors (k-NN)
5. **Best Value Analysis**: Identify the best deals by price per square foot or quality-to-price ratio

### How to Use This System:

1. **Define Customer Preferences**: Identify budget, size, and feature requirements
2. **Apply Filters**: Use the recommendation system to narrow down options
3. **Evaluate Results**: Review suggested houses and their features
4. **Find Similar Options**: Use similarity search to expand choices
5. **Assess Value**: Compare value metrics to find the best deals

### Customization Options:

- Adjust price ranges for different budgets
- Modify area requirements based on family size
- Add quality constraints for higher-end homes
- Include neighborhood preferences
- Consider age of home (YearBuilt)
- Factor in specific amenities (pool, fireplace, etc.)
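
Several of the customization options above can be combined in plain pandas before or after calling the recommender. The following is a minimal sketch on toy data: `Neighborhood`, `YearBuilt`, and `OverallQual` appear elsewhere in this notebook, while `Fireplaces` is assumed to be a column in the dataset, and the thresholds are illustrative only:

```python
import pandas as pd

toy = pd.DataFrame({
    "Id": [1, 2, 3, 4],
    "Neighborhood": ["NAmes", "CollgCr", "NAmes", "OldTown"],
    "YearBuilt": [1965, 2003, 1998, 1920],
    "OverallQual": [5, 7, 6, 5],
    "Fireplaces": [0, 1, 1, 0],  # assumed column name
})

mask = (
    toy["Neighborhood"].isin(["NAmes", "CollgCr"])  # neighborhood preference
    & (toy["YearBuilt"] >= 1990)                    # age of home
    & (toy["OverallQual"] >= 6)                     # quality constraint
    & (toy["Fireplaces"] >= 1)                      # specific amenity
)
custom = toy[mask]
print(custom["Id"].tolist())  # [2, 3]
```

Each condition is an independent boolean Series, so individual constraints can be dropped or relaxed when too few houses match.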