# 房屋租金预测

房屋租金取决于很多因素，例如：
1.BHK:卧室、大厅和厨房的数量
2.房产面积
3.房子的地板
4.区域类型
5.区域位置
6.城市
7.房屋的装修状况

要构建房屋租金预测系统，我们需要基于影响房屋租金因素的数据。我在 Kaggle 上找到了一个数据集，它包含了我们需要的所有特征。

使用 Python 预测房屋租金

In [24]:
# 导入必要的Pyhon库和数据
import pandas as pd
import numpy as np 
import matplotlib.pyplot as plt
import plotly.express as px
import plotly.graph_objects as go

In [25]:
data = pd.read_csv("House_Rent_Dataset.csv")
print(data.head())

    Posted On  BHK   Rent  Size            Floor    Area Type  \
0  2022-05-18    2  10000  1100  Ground out of 2   Super Area   
1  2022-05-13    2  20000   800       1 out of 3   Super Area   
2  2022-05-16    2  17000  1000       1 out of 3   Super Area   
3  2022-07-04    2  10000   800       1 out of 2   Super Area   
4  2022-05-09    2   7500   850       1 out of 2  Carpet Area   

              Area Locality     City Furnishing Status  Tenant Preferred  \
0                    Bandel  Kolkata       Unfurnished  Bachelors/Family   
1  Phool Bagan, Kankurgachi  Kolkata    Semi-Furnished  Bachelors/Family   
2   Salt Lake City Sector 2  Kolkata    Semi-Furnished  Bachelors/Family   
3               Dumdum Park  Kolkata       Unfurnished  Bachelors/Family   
4             South Dum Dum  Kolkata       Unfurnished         Bachelors   

   Bathroom Point of Contact  
0         2    Contact Owner  
1         1    Contact Owner  
2         1    Contact Owner  
3         1    Contact Owner

看是否有缺失值

In [26]:
print(data.isnull().sum())

Posted On            0
BHK                  0
Rent                 0
Size                 0
Floor                0
Area Type            0
Area Locality        0
City                 0
Furnishing Status    0
Tenant Preferred     0
Bathroom             0
Point of Contact     0
dtype: int64


看看数据的描述统计数据

In [27]:
print(data.describe())

               BHK          Rent         Size     Bathroom
count  4746.000000  4.746000e+03  4746.000000  4746.000000
mean      2.083860  3.499345e+04   967.490729     1.965866
std       0.832256  7.810641e+04   634.202328     0.884532
min       1.000000  1.200000e+03    10.000000     1.000000
25%       2.000000  1.000000e+04   550.000000     1.000000
50%       2.000000  1.600000e+04   850.000000     2.000000
75%       3.000000  3.300000e+04  1200.000000     2.000000
max       6.000000  3.500000e+06  8000.000000    10.000000


让我们来看看房屋的平均租金、中位数、最高租金和最低租金：

In [28]:
print(f"平均租金：{data.Rent.mean()}")
print(f"租金中位数：{data.Rent.median()}")
print(f"最高租金：{data.Rent.max()}")
print(f"最低租金：{data.Rent.min()}")



平均租金：34993.45132743363
租金中位数：16000.0
最高租金：3500000
最低租金：1200


现在我们来看看各个城市按照卧室、厅堂、厨房数量划分的房屋租金情况

In [29]:
# BHK:卧室、大厅和厨房的数量
figure  = px.bar(data,x=data["City"],
                 y=data["Rent"],
                 color = data["BHK"],
                 title="根据BHK计算不同城市的租金")
figure.show()


现在我们来看看不同城市按区域类型划分的房屋租金情况：

In [30]:
figure = px.bar(data, x=data["City"], 
                y = data["Rent"], 
                color = data["Area Type"],
            title="按区域类型划分不同的城市租金")
figure.show()

下面我们来看看各个城市按照房屋装修状况的租金情况：

In [31]:
figure = px.bar(data, x=data["City"], 
                y = data["Rent"], 
                color = data["Furnishing Status"],
            title="根据装修情况不同城市租金")
figure.show()

现在我们来看看不同城市按房屋面积的租金情况：

In [32]:

figure = px.bar(data, x=data["City"], 
                y = data["Rent"], 
                color = data["Size"],
            title="在不同城市根据房屋面积的租金情况")
figure.show()

现在我们根据数据集来看看不同城市可供出租的房屋数量：

In [33]:
cities = data["City"].value_counts()
print(cities)
label = cities.index
print(label)
counts = cities.values
print(counts)
colors = ['gold','lightgreen']
fig = go.Figure(data=[go.Pie(labels=label,values=counts,hole=0.5)])
fig.update_layout(title_text="可供出租的房屋数量")
fig.update_traces(hoverinfo='label+percent',textinfo='value',textfont_size=30,
                  marker=dict(colors=colors,line=dict(color='black',width=3)))

fig.show()


City
Mumbai       972
Chennai      891
Bangalore    886
Hyderabad    868
Delhi        605
Kolkata      524
Name: count, dtype: int64
Index(['Mumbai', 'Chennai', 'Bangalore', 'Hyderabad', 'Delhi', 'Kolkata'], dtype='object', name='City')
[972 891 886 868 605 524]


现在让我们来看看可供不同类型租户使用的房屋数量：

In [34]:
# 租客偏好
tenant = data["Tenant Preferred"].value_counts()
print(tenant)
label = tenant.index
counts = tenant.values
colors = ['gold','lightgreen']
print(label)
fig = go.Figure(data=[go.Pie(labels=label, values=counts, hole=0.5)])
fig.update_layout(title_text='印度租户的偏好')
fig.update_traces(hoverinfo='label+percent', textinfo='value', textfont_size=30,
                  marker=dict(colors=colors, line=dict(color='black', width=3)))
fig.show()

Tenant Preferred
Bachelors/Family    3444
Bachelors            830
Family               472
Name: count, dtype: int64
Index(['Bachelors/Family', 'Bachelors', 'Family'], dtype='object', name='Tenant Preferred')


# 房屋租金预测模型

现在将把所有分类特征转换为训练房屋租金预测模型所需的数值特征：

In [35]:
data["Area Type"] = data["Area Type"].map({"Super Area":1,"Carpet Area":2,"Built":3})

data["City"] = data["City"].map({"Mumbai": 4000, "Chennai": 6000, 
                                 "Bangalore": 5600, "Hyderabad": 5000, 
                                 "Delhi": 1100, "Kolkata": 7000})
data["Furnishing Status"] = data["Furnishing Status"].map({"Unfurnished": 0, 
                                                           "Semi-Furnished": 1, 
                                                           "Furnished": 2})
data["Tenant Preferred"] = data["Tenant Preferred"].map({"Bachelors/Family": 2, 
                                                         "Bachelors": 1, 
                                                         "Family": 3})
print(data.head())

    Posted On  BHK   Rent  Size            Floor  Area Type  \
0  2022-05-18    2  10000  1100  Ground out of 2        1.0   
1  2022-05-13    2  20000   800       1 out of 3        1.0   
2  2022-05-16    2  17000  1000       1 out of 3        1.0   
3  2022-07-04    2  10000   800       1 out of 2        1.0   
4  2022-05-09    2   7500   850       1 out of 2        2.0   

              Area Locality  City  Furnishing Status  Tenant Preferred  \
0                    Bandel  7000                  0                 2   
1  Phool Bagan, Kankurgachi  7000                  1                 2   
2   Salt Lake City Sector 2  7000                  1                 2   
3               Dumdum Park  7000                  0                 2   
4             South Dum Dum  7000                  0                 1   

   Bathroom Point of Contact  
0         2    Contact Owner  
1         1    Contact Owner  
2         1    Contact Owner  
3         1    Contact Owner  
4         1    Contac

将数据划分为训练集和测试集

In [36]:
# 划分数据
from sklearn.model_selection import train_test_split
x = np.array(data[["BHK", "Size", "Area Type", "City", 
                   "Furnishing Status", "Tenant Preferred", 
                   "Bathroom"]])
y = np.array(data[["Rent"]])

xtrain, xtest, ytrain, ytest = train_test_split(x, y, 
                                                test_size=0.10, 
                                                random_state=42)

## 现在让我们使用LSTM神经网络模型训练房屋租金预测模型：

In [39]:
from keras.models import Sequential
from keras.layers import Dense, LSTM
model = Sequential()
model.add(LSTM(128, return_sequences=True, 
               input_shape= (xtrain.shape[1], 1)))
model.add(LSTM(64, return_sequences=False))
model.add(Dense(25))
model.add(Dense(1))
model.summary()

In [41]:
model.compile(optimizer='adam', loss='mean_squared_error')
model.fit(xtrain, ytrain, batch_size=2, epochs=10)

Epoch 1/10
[1m2136/2136[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m10s[0m 4ms/step - loss: nan
Epoch 2/10
[1m2136/2136[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m8s[0m 4ms/step - loss: nan
Epoch 3/10
[1m2136/2136[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m7s[0m 3ms/step - loss: nan
Epoch 4/10
[1m2136/2136[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m7s[0m 3ms/step - loss: nan
Epoch 5/10
[1m2136/2136[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m7s[0m 3ms/step - loss: nan
Epoch 6/10
[1m2136/2136[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m7s[0m 3ms/step - loss: nan
Epoch 7/10
[1m2136/2136[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m8s[0m 4ms/step - loss: nan
Epoch 8/10
[1m2136/2136[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m8s[0m 4ms/step - loss: nan
Epoch 9/10
[1m2136/2136[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m8s[0m 4ms/step - loss: nan
Epoch 10/10
[1m2136/2136[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m9s[0m 4ms/step - loss: na

<keras.src.callbacks.history.History at 0x19e162b7200>

In [None]:
print("Enter House Details to Predict Rent")
a = int(input("Number of BHK: "))
b = int(input("Size of the House: "))
c = int(input("Area Type (Super Area = 1, Carpet Area = 2, Built Area = 3): "))
d = int(input("Pin Code of the City: "))
e = int(input("Furnishing Status of the House (Unfurnished = 0, Semi-Furnished = 1, Furnished = 2): "))
f = int(input("Tenant Type (Bachelors = 1, Bachelors/Family = 2, Only Family = 3): "))
g = int(input("Number of bathrooms: "))
features = np.array([[a, b, c, d, e, f, g]])
print("Predicted House Price = ", model.predict(features))

Enter House Details to Predict Rent
