# Feature Importance
This notebook ranks the features that contribute most to the model's house price predictions and plots the top ten.


In [None]:
# Import necessary libraries
import pandas as pd
import numpy as np
import pickle
import matplotlib.pyplot as plt
import seaborn as sns


## Load Optimized Model
We load the trained model from the previous step to analyze feature importance.



In [None]:
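# Load the trained model saved in the previous step.
# Note: pickle can execute arbitrary code on load, so only open files you trust.
with open("../models/optimized_model.pkl", "rb") as f:
    model = pickle.load(f)

print("Optimized model loaded successfully.")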
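
Not every model type exposes `feature_importances_`, so it can help to check the loaded object before relying on that attribute. The snippet below is a minimal sketch, assuming the pickled object follows scikit-learn conventions. The helper `describe_importance_source` and the stand-in objects are hypothetical and are not part of this project.

```python
from types import SimpleNamespace


def describe_importance_source(model):
    """Return the name of the importance-like attribute a fitted model exposes.

    Hypothetical helper, assuming scikit-learn naming conventions.
    """
    # Tree-based estimators (random forests, gradient boosting) expose this attribute.
    if hasattr(model, "feature_importances_"):
        return "feature_importances_"
    # Linear models expose coefficients instead; these are not on the same scale.
    if hasattr(model, "coef_"):
        return "coef_"
    return None


# Stand-in objects used only to illustrate the check; swap in the loaded `model`.
tree_like = SimpleNamespace(feature_importances_=[0.7, 0.3])
linear_like = SimpleNamespace(coef_=[1.5, -0.2])

print(describe_importance_source(tree_like))
print(describe_importance_source(linear_like))
```

If the helper returns `None` for the loaded model, the extraction step in the next section would raise an `AttributeError`.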


## Compute Feature Importance
We read the importance scores from the model's `feature_importances_` attribute and pair them with the feature names from the processed training data.


In [None]:
# Load processed dataset to get feature names
data = pd.read_csv("../data/processed_train.csv")

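# Extract feature importance from model
feature_importances = model.feature_importances_
feature_names = data.drop(columns=["SalePrice"]).columns  # Remove target column

# The CSV columns must match the features the model was trained on, in the same order
assert len(feature_names) == len(feature_importances), "Feature names and importances differ in length"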

# Create DataFrame
importance_df = pd.DataFrame({
    "Feature": feature_names,
    "Importance": feature_importances
})

# Sort by importance
importance_df = importance_df.sort_values(by="Importance", ascending=False)

# Display top features
print("Top 10 Features by Importance:")
print(importance_df.head(10))
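
The pairing and ranking step above only works if the names and scores line up one to one. The sketch below shows the same logic in plain Python with an explicit length check. It is illustrative only: `rank_features` is a hypothetical helper, and the feature names and scores are made-up placeholders, not results from this model.

```python
def rank_features(names, importances, top_n=10):
    """Pair feature names with scores and return the top_n pairs, highest first."""
    names = list(names)
    importances = [float(value) for value in importances]
    # A mismatch here means the names do not describe the model's inputs.
    if len(names) != len(importances):
        raise ValueError(
            f"Got {len(names)} names but {len(importances)} importance scores."
        )
    ranked = sorted(zip(names, importances), key=lambda pair: pair[1], reverse=True)
    return ranked[:top_n]


# Placeholder inputs for illustration; real values come from the model and CSV.
example_names = ["feat_a", "feat_b", "feat_c"]
example_scores = [0.6, 0.3, 0.1]

top_two = rank_features(example_names, example_scores, top_n=2)
print(top_two)
```

With the notebook's variables, the call would be `rank_features(feature_names, feature_importances)`.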



## Visualize Feature Importance
We plot the ten highest-scoring features as a horizontal bar chart.



In [None]:
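# Plot the 10 highest-ranked features (importance_df is already sorted)
top_features = importance_df.head(10)
plt.figure(figsize=(12, 6))
# seaborn >= 0.13 warns about palette without hue; there, add hue="Feature", legend=False
sns.barplot(data=top_features, x="Importance", y="Feature", palette="viridis")
plt.xlabel("Feature Importance Score")
plt.ylabel("Feature")
plt.title("Top 10 Most Important Features in House Price Prediction")
plt.tight_layout()  # Keep long feature names from being clipped
plt.show()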


## Summary
- Loaded the trained model.
- Extracted feature importance scores from the model.
- Ranked the features and identified the top 10.
- Visualized the top 10 features in a bar chart.
