## Description of Numeric Features

**SalePrice:** The property's sale price in dollars. This is the target variable that this project aims to predict.

* **MSSubClass**: The building class
* **LotFrontage**: Linear feet of street connected to property
* **LotArea**: Lot size in square feet
* **Street**: Type of road access
* **OverallQual**: Overall material and finish quality
* **OverallCond**: Overall condition rating
* **YearBuilt**: Original construction date
* **YearRemodAdd**: Remodel date
* **MasVnrArea**: Masonry veneer area in square feet
* **BsmtFinSF1**: Basement type 1 finished area in square feet
* **BsmtFinSF2**: Basement type 2 finished area in square feet
* **BsmtUnfSF**: Unfinished square feet of basement area
* **TotalBsmtSF**: Total square feet of basement area
* **1stFlrSF**: First floor square feet
* **2ndFlrSF**: Second floor square feet
* **LowQualFinSF**: Low quality finished square feet (all floors)
* **GrLivArea**: Above grade (ground) living area square feet
* **BsmtFullBath**: Basement full bathrooms
* **BsmtHalfBath**: Basement half bathrooms
* **FullBath**: Full bathrooms above grade
* **HalfBath**: Half baths above grade
* **BedroomAbvGr**: Number of bedrooms above basement level
* **KitchenAbvGr**: Number of kitchens above grade
* **TotRmsAbvGrd**: Total rooms above grade (does not include bathrooms)
* **Fireplaces**: Number of fireplaces
* **GarageYrBlt**: Year garage was built
* **GarageFinish**: Interior finish of the garage
* **GarageCars**: Size of garage in car capacity
* **GarageArea**: Size of garage in square feet
* **WoodDeckSF**: Wood deck area in square feet
* **OpenPorchSF**: Open porch area in square feet
* **EnclosedPorch**: Enclosed porch area in square feet
* **3SsnPorch**: Three season porch area in square feet
* **ScreenPorch**: Screen porch area in square feet
* **PoolArea**: Pool area in square feet
* **MiscVal**: Value of miscellaneous feature
* **MoSold**: Month sold
* **YrSold**: Year sold
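
Numeric columns like those listed above can be selected by dtype rather than by hand. The following is a minimal sketch, assuming the data sits in a pandas DataFrame; the toy frame `toy_df` and its values are made up for illustration, and the notebook's own `num_df` may have been built differently.

```python
import pandas as pd

# Toy stand-in for the housing data (values are illustrative, not from the dataset)
toy_df = pd.DataFrame({
    "LotArea": [8450, 9600, 11250],
    "OverallQual": [7, 6, 7],
    "MSZoning": ["RL", "RL", "RM"],
    "SalePrice": [208500, 181500, 223500],
})

# Keep only columns with a numeric dtype; string columns such as MSZoning are dropped
toy_num_df = toy_df.select_dtypes(include="number")
print(toy_num_df.columns.tolist())
```

Selecting by dtype keeps integer-coded categories such as MSSubClass, so the result may still need manual review.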

**TEST Dataset Features with Null Values**

* MSZoning = 4
* LotFrontage = 227
* Alley = 1352
* Utilities = 2
* Exterior1st = 1
* Exterior2nd = 1
* MasVnrType = 894
* MasVnrArea = 15
* BsmtQual = 44
* BsmtCond = 45
* BsmtExposure = 44
* BsmtFinType1 = 42
* BsmtFinSF1 = 1
* BsmtFinType2 = 42
* BsmtFinSF2 = 1
* BsmtUnfSF = 1
* TotalBsmtSF = 1
* BsmtFullBath = 2
* BsmtHalfBath = 2
* Electrical = 1
* KitchenQual = 1
* Functional = 2
* FireplaceQu = 730
* GarageType = 76
* GarageYrBlt = 78
* GarageFinish = 78
* GarageCars = 1
* GarageArea = 1
* GarageQual = 78
* GarageCond = 78
* PoolQC = 1456
* Fence = 1169
* MiscFeature = 1408
* SaleType = 1

**TRAINING Dataset Features with Null Values**

* LotFrontage = 259
* Alley = 1369
* MasVnrType = 872
* MasVnrArea = 8
* BsmtQual = 37
* BsmtCond = 37
* BsmtExposure = 38
* BsmtFinType1 = 37
* BsmtFinType2 = 38
* Electrical = 1
* FireplaceQu = 690
* GarageType = 81
* GarageYrBlt = 81
* GarageFinish = 81
* GarageQual = 81
* GarageCond = 81
* PoolQC = 1453
* Fence = 1179
* MiscFeature = 1406

**COMBINED Dataset Features with Null Values**

* MSZoning = 4
* LotFrontage = 486
* Alley = 2721
* Utilities = 2
* Exterior1st = 1
* Exterior2nd = 1
* MasVnrType = 1766
* MasVnrArea = 23
* BsmtQual = 81
* BsmtCond = 82
* BsmtExposure = 82
* BsmtFinType1 = 79
* BsmtFinSF1 = 1
* BsmtFinType2 = 80
* BsmtFinSF2 = 1
* BsmtUnfSF = 1
* TotalBsmtSF = 1
* BsmtFullBath = 2
* BsmtHalfBath = 2
* KitchenQual = 1
* Functional = 2
* FireplaceQu = 1420
* GarageType = 157
* GarageYrBlt = 159
* GarageFinish = 159
* GarageCars = 1
* GarageArea = 1
* GarageQual = 159
* GarageCond = 159
* PoolQC = 2909
* Fence = 2348
* MiscFeature = 2814
* SaleType = 1
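
Since the combined dataset stacks the training and test rows, each combined null count should equal the training count plus the test count. The following is a minimal sketch of a side-by-side check, assuming pandas; the frames `toy_train` and `toy_test` and their values are hypothetical stand-ins, not the notebook's data.

```python
import pandas as pd

# Toy stand-ins for the train and test frames (values are illustrative only)
toy_train = pd.DataFrame({
    "LotFrontage": [65.0, None, 80.0],
    "Alley": [None, None, "Grvl"],
    "LotArea": [8450, 9600, 11250],
})
toy_test = pd.DataFrame({
    "LotFrontage": [None, 70.0],
    "Alley": [None, "Pave"],
    "LotArea": [11622, 14267],
})

# Stack the rows of both frames into one combined frame
toy_combined = pd.concat([toy_train, toy_test], ignore_index=True)

# One row per column, one count per dataset
null_summary = pd.DataFrame({
    "train": toy_train.isnull().sum(),
    "test": toy_test.isnull().sum(),
    "combined": toy_combined.isnull().sum(),
})

# Show only the columns that have at least one null in the combined data
null_summary = null_summary[null_summary["combined"] > 0]
print(null_summary)
```

A table like this makes a mismatched or missing count easy to spot, because every row must satisfy train + test = combined.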

In [None]:
# Count null/NaN values per column in the TEST dataset
test_null_count = test_df.isnull().sum()
print("Null Values by Column\n\n", test_null_count)

In [None]:
# Count null/NaN values per column in the TRAINING dataset
null_count = df_train.isnull().sum()
print("Null Values by Column\n\n", null_count)

In [None]:
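# Heat map showing only the strong correlations among numeric features, including SalePrice

# Compute the correlation matrix first; thresholding the raw data would not show correlations
corr = num_df.corr()

plt.figure(figsize=(20, 17))
# Cells with a correlation between -0.4 and 0.5 become NaN and are left blank
sns.heatmap(corr[(corr >= 0.5) | (corr <= -0.4)], annot=True, cmap='magma', fmt='.2f')
plt.title("Combined Housing Numeric Features Correlation Heatmap\n", fontsize=18,
          weight='bold', style='italic')
plt.show()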

In [None]:
# Calculate the correlation matrix
correlation_matrix = num_df.corr()

# Mask (hide) cells whose absolute correlation is below 0.5
mask = correlation_matrix.abs() < 0.5

# Set up the matplotlib figure
plt.figure(figsize=(20, 17))

# Draw the heatmap with the mask
sns.heatmap(correlation_matrix, annot=True, cmap='magma', fmt='.2f', mask=mask, 
            cbar_kws={"shrink": .8})  # cbar_kws controls the color bar size
plt.title("Combined Housing Numeric Features Correlation Heatmap\n", fontsize=18, 
          weight='bold', style='italic')
plt.show()
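
The heatmaps above cover every pair of features, but for prediction the SalePrice column of the correlation matrix matters most. The following is a minimal sketch of ranking features by that column, assuming pandas; `toy_num_df`, its values, and the `threshold` of 0.5 are illustrative choices, not results from the notebook.

```python
import pandas as pd

# Toy numeric frame (values are illustrative, not taken from the dataset)
toy_num_df = pd.DataFrame({
    "OverallQual": [5, 6, 7, 8, 9],
    "GrLivArea": [1000, 1500, 1400, 2000, 2600],
    "MoSold": [6, 1, 6, 1, 6],
    "SalePrice": [120000, 150000, 180000, 240000, 320000],
})

# Pearson correlation of every feature with SalePrice, excluding SalePrice itself
target_corr = toy_num_df.corr()["SalePrice"].drop("SalePrice")

# Keep features whose absolute correlation reaches the threshold
threshold = 0.5
strong = target_corr[target_corr.abs() >= threshold]

# Order the kept features from strongest to weakest absolute correlation
strong = strong.reindex(strong.abs().sort_values(ascending=False).index)
print(strong)
```

Filtering on the absolute value keeps strong negative correlations as well as positive ones.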