<a href="https://colab.research.google.com/github/ValW007/Group-Proj---Rental-Scam-Check/blob/main/Monthly_Rental_Prediction_(test).ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

In [None]:
import pandas as pd

In [None]:
from sklearn.model_selection import train_test_split

In [29]:
from sklearn.preprocessing import LabelEncoder, OneHotEncoder

In [None]:
from sklearn.linear_model import LinearRegression

In [35]:
from sklearn.metrics import mean_squared_error

In [51]:
from sklearn.metrics import mean_absolute_percentage_error, r2_score

In [99]:
from sklearn.compose import ColumnTransformer

In [100]:
df = pd.read_csv("/content/drive/MyDrive/Module 3 - AI in Finance, Credit Risks, Risks Mgmt/Rental_Data.csv")

In [101]:
# Convert region names to integer codes (assigned in sorted order of the names)
label_encoder = LabelEncoder()
df['REGION'] = label_encoder.fit_transform(df['REGION'])

In [102]:
X = df[['REGION', 'No of Bedroom', "Property Type"]]
y = df['Monthly Rent ($)']

In [103]:
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)


In [104]:
# Preprocessing: pass the numeric columns through unchanged and one-hot encode Property Type.
categorical_features = ["Property Type"]  # Columns to one-hot encode
preprocessor = ColumnTransformer(transformers=[
        ("num", "passthrough", ["REGION", "No of Bedroom"]),  # Already numeric (REGION holds label-encoded codes)
        ("cat", OneHotEncoder(sparse_output=False, handle_unknown='ignore'), categorical_features),  # One binary column per property type
    ])

In [105]:
# Fit the preprocessor on the training data only, then apply it to both splits:
X_train = preprocessor.fit_transform(X_train)
X_test = preprocessor.transform(X_test)


In [106]:
model = LinearRegression()

In [107]:
model.fit(X_train, y_train)

In [108]:
y_pred = model.predict(X_test)

In [109]:
mse = mean_squared_error(y_test, y_pred)
print(f'Mean Squared Error: {mse}')


Mean Squared Error: 2153027.700793001
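The MSE is expressed in squared dollars, which makes it hard to compare against actual rents. A minimal sketch of converting it to a root mean squared error (RMSE) in dollars is shown here; it simply takes the square root of the value printed above and is an added illustration, not a cell from the original notebook.

```python
import math

# MSE value copied from the printed output above (units: dollars squared).
mse = 2153027.700793001

# RMSE is the square root of the MSE, so it is back in dollars.
rmse = math.sqrt(mse)
print(f"Root Mean Squared Error: {rmse:.2f}")
```

An RMSE of roughly $1,467 is easier to judge against the rents in the data, which are in the low thousands of dollars per month.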


In [110]:
predictions_df = pd.DataFrame({'Actual': y_test, 'Predicted': y_pred})
print(predictions_df.head())

      Actual    Predicted
3113    5700  5996.655482
1068    4000  4719.536208
2046    4300  6194.372532
787     4000  4495.710045
798     3900  4297.992995


In [111]:
# scikit-learn returns MAPE as a fraction (0.22 means about 22%), not as a percentage
mape = mean_absolute_percentage_error(y_test, y_pred)
print(f'Mean Absolute Percentage Error (MAPE): {mape:.2f}')


Mean Absolute Percentage Error (MAPE): 0.22
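scikit-learn's `mean_absolute_percentage_error` returns a fraction rather than a percentage, so a value of 0.22 corresponds to an average error of roughly 22%. As a rough check of the scale, the sketch below recomputes the metric by hand on only the five rows printed by `predictions_df.head()`. It will not reproduce the full test-set figure, since it uses five rows rather than the whole test split.

```python
# The five (actual, predicted) pairs shown in the predictions_df.head() output.
rows = [
    (5700, 5996.655482),
    (4000, 4719.536208),
    (4300, 6194.372532),
    (4000, 4495.710045),
    (3900, 4297.992995),
]

# MAPE = mean of |actual - predicted| / |actual|, expressed as a fraction.
sample_mape = sum(abs(a - p) / abs(a) for a, p in rows) / len(rows)
print(f"MAPE on these five rows: {sample_mape:.4f} (fraction), {sample_mape:.1%} (percent)")
```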


In [112]:
r_squared = r2_score(y_test, y_pred)
print(f'R-squared (R²): {r_squared:.2f}')

R-squared (R²): 0.31


In [113]:
# Print the coefficients and intercept of the model
print(f'Intercept: {model.intercept_}')
print(f'Coefficients: {model.coef_}')

Intercept: 1628.836995622884
Coefficients: [  55.95654073 1251.0101607   -56.69048446  -84.3360807   141.02656517]


# Explanation
Intercept
Intercept: 1628.836995622884
This is the predicted Monthly Rent ($) when every model input is zero: REGION code 0, zero bedrooms, and all three Property Type indicator columns set to 0. No real listing looks like that, because each one has exactly one property type, so the intercept is best read as the model's offset rather than the rent of an actual property.
Coefficients
The coefficients represent the change in the target variable (Monthly Rent) for a one-unit change in the corresponding feature, holding all other features constant. Here’s how each coefficient is linked to the features:

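REGION (55.95654073)
For each one-unit increase in the encoded REGION value, the predicted Monthly Rent rises by approximately $55.96, holding the other features constant. Because LabelEncoder assigns these integer codes in sorted order of the region names, the codes carry no geographic or price ordering, so this coefficient should not be read as a genuine per-region effect.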
No of Bedroom (1251.0101607)
For each additional bedroom, the Monthly Rent increases by approximately $1251.01, assuming all other factors remain constant.
Property Type (OneHotEncoded)
The Property Type feature was one-hot encoded without dropping a category, so each of the three types has its own binary column and there is no baseline category. The three coefficients sum to approximately zero, so each one is the adjustment for that type relative to the unweighted average across the three types, and only the differences between types are meaningful.
Property Type_Condo (-56.69048446)
If the property is a Condo, the predicted Monthly Rent is approximately $56.69 lower than that average.
Property Type_HDB (-84.3360807)
If the property is an HDB, the predicted Monthly Rent is approximately $84.34 lower than that average.
Property Type_Landed (141.02656517)
If the property is a Landed property, the predicted Monthly Rent is approximately $141.03 higher than that average. For example, a Landed property is predicted to rent for about $225.36 more than an otherwise identical HDB, which is the difference between the two coefficients.
Summary
Intercept: The model's offset, i.e. the prediction when all inputs are zero.
REGION: Each unit increase in the encoded REGION value increases the predicted rent by $55.96, although the order of the encoded values is arbitrary.
No of Bedroom: Each additional bedroom increases the predicted rent by $1251.01.
Property Type: Condo and HDB lower the predicted rent by $56.69 and $84.34 respectively, and Landed raises it by $141.03, each relative to the average across the three types.
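To make the coefficients concrete, the sketch below evaluates the fitted linear equation by hand using the intercept and coefficients printed above. The listing used (encoded REGION value 2, 3 bedrooms, Condo) is hypothetical and chosen only for illustration, and `predict_rent` is a helper name introduced here, not part of the notebook.

```python
# Intercept and coefficients copied from the model output above.
intercept = 1628.836995622884
coef_region = 55.95654073
coef_bedroom = 1251.0101607
coef_type = {"Condo": -56.69048446, "HDB": -84.3360807, "Landed": 141.02656517}


def predict_rent(region_code, bedrooms, property_type):
    """Evaluate the fitted linear equation by hand for one listing."""
    return (intercept
            + coef_region * region_code
            + coef_bedroom * bedrooms
            + coef_type[property_type])


# Hypothetical listing: encoded REGION value 2, 3 bedrooms, Condo.
example_rent = predict_rent(region_code=2, bedrooms=3, property_type="Condo")
print(f"Predicted rent: ${example_rent:.2f}")

# Only differences between property types are meaningful.
landed_vs_hdb = predict_rent(0, 0, "Landed") - predict_rent(0, 0, "HDB")
print(f"Landed minus HDB, all else equal: ${landed_vs_hdb:.2f}")
```

The hand calculation comes to about $5,437 for the hypothetical Condo, and the Landed-versus-HDB gap of about $225 is the same whatever region code or bedroom count is used.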

In [114]:
import joblib
# Save the trained model
joblib.dump(model, 'rental_price_model.pkl')
# Save the preprocessor
joblib.dump(preprocessor, 'preprocessor.pkl')

['preprocessor.pkl']