# Rent Price Prediction Case Study

  'Kaizen Property Solutions' is an innovative firm focused on providing data-driven insights for rental property management and investment. The company aims to enhance the rental decision-making process for landlords, tenants, and real estate investors through advanced analytical solutions.
  
Business Challenge:
    
  - In the dynamic rental market of Pune and Mumbai, accurately predicting rental prices is crucial for maximizing occupancy rates and optimizing rental income. However, property managers and landlords often face challenges in estimating rental prices due to various influencing factors, including property characteristics, location, and local market trends specific to these metropolitan areas.





Objective:
  - To tackle this challenge, 'Kaizen Property Solutions' is conducting a machine learning case study aimed at developing a predictive model for estimating rental prices based on a comprehensive dataset that includes the following features: house_type, house_size, location, city (specifically Pune and Mumbai), latitude, longitude, price, currency, numBathrooms, numBalconies, isNegotiable, priceSqFt, verificationDate, description, SecurityDeposit, and Status.


In [None]:
import pandas as pd
import numpy as np
pune=pd.read_csv('/content/Indian_housing_Pune_data.csv')
mumbai=pd.read_csv('/content/Indian_housing_Mumbai_data.csv')

In [None]:
pune

Unnamed: 0,house_type,house_size,location,city,latitude,longitude,price,currency,numBathrooms,numBalconies,isNegotiable,priceSqFt,verificationDate,description,SecurityDeposit,Status
0,2 BHK Apartment,906 sq ft,Lohegaon,Pune,18.605820,73.912407,12000,INR,2.0,,,,Posted 3 years ago,A spacious 2 bhk multistorey apartment is avai...,No Deposit,Unfurnished
1,1 BHK Apartment,650 sq ft,Anand Nagar,Pune,18.474377,73.820549,11000,INR,1.0,,,,Posted 2 years ago,It has a built-up area of 650 sqft and is avai...,No Deposit,Semi-Furnished
2,1 RK Studio Apartment,350 sq ft,Wagholi,Pune,18.580336,73.980507,4500,INR,1.0,,,,Posted 2 years ago,This spacious 1 rk independent house is availa...,No Deposit,Unfurnished
3,3 BHK Apartment,"1,500 sq ft",Sangamvadi,Pune,18.541786,73.882454,35000,INR,2.0,,,,Posted 3 years ago,"Furnishings include 1 tv, 1 refrigerator, 1 so...",No Deposit,Furnished
4,5 BHK Independent House,"5,000 sq ft",Wadgaon Sheri,Pune,18.560164,73.924927,110000,INR,5.0,,,,Posted 3 years ago,It's a 5 bhk independent house situated in W...,No Deposit,Unfurnished
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
3905,1 BHK Apartment,655 sq ft,Wakad,Pune,18.603699,73.761238,12500,INR,1.0,,,,Posted 3 years ago,Well designed 1 bhk multistorey apartment is a...,36000,Unfurnished
3906,2 BHK Apartment,920 sq ft,Pimple Saudagar,Pune,18.594870,73.798187,16000,INR,2.0,,,,Posted 3 years ago,It's a 2 bhk multistorey apartment situated ...,50000,Unfurnished
3907,1 BHK Apartment,650 sq ft,Pimple Saudagar,Pune,18.595701,73.797890,16000,INR,1.0,1.0,,,Posted 3 years ago,A spacious 1 bhk multistorey apartment is avai...,35000,Furnished
3908,2 BHK Apartment,"1,200 sq ft",Pimple Saudagar,Pune,18.592476,73.798538,20000,INR,2.0,,,,Posted 3 years ago,It has a built-up area of 1200 sqft and is ava...,60000,Furnished


In [None]:
# Data cleaning plan
# house_type          -> replace "Floor" with "House"
# house_size          -> remove "," and "sq ft", then convert to numeric
# location            -> check for spelling mistakes
# city                -> no problem
# latitude, longitude -> drop
# price               -> no problem
# currency            -> drop
# numBathrooms        -> convert to int
# numBalconies, isNegotiable -> Yes/No
# priceSqFt           -> drop
# verificationDate    -> drop or use
# description         -> needs more research
# SecurityDeposit     -> convert to numeric; replace "No Deposit" with zero
# Status              -> no problem

In [None]:
pune.shape

(3910, 16)

In [None]:
mumbai.shape

(5000, 16)

In [None]:
finaldf=pd.concat([pune,mumbai]) # stack Pune and Mumbai rows; each frame's original row labels are kept

In [None]:
import re
def clean_House_size(text):
  x=re.sub(',','',text)
  y=re.sub('sq ft','',x)
  return y.strip()

In [None]:
finaldf['house_size'].apply(clean_House_size).astype("int")

Unnamed: 0,house_size
0,906
1,650
2,350
3,1500
4,5000
...,...
4995,800
4996,1200
4997,1040
4998,1930


In [None]:
finaldf['house_size']=finaldf['house_size'].apply(clean_House_size).astype("int")
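The `astype` conversion raises on the first value that does not fit the expected pattern. A minimal sketch of a more defensive parser, assuming sizes always look like `1,500 sq ft`; `parse_house_size` is a hypothetical helper, not part of the notebook, and it returns `None` for anything it cannot read so that such rows can be inspected instead of stopping the conversion.

```python
import re

# Hypothetical helper: matches strings such as "906 sq ft" or "1,500 sq ft".
_SIZE_PATTERN = re.compile(r"^\s*(\d[\d,]*)\s*sq\s*ft\s*$", re.IGNORECASE)

def parse_house_size(text):
    """Return the size in square feet as an int, or None if the text is unexpected."""
    match = _SIZE_PATTERN.match(str(text))
    if match is None:
        return None
    return int(match.group(1).replace(",", ""))

for raw in ["906 sq ft", "1,500 sq ft", "unknown"]:
    print(repr(raw), "->", parse_house_size(raw))
```

With pandas, this could be applied as `finaldf['house_size'].map(parse_house_size)`, which leaves a missing value wherever parsing failed.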

In [None]:
text1=finaldf['description'].iloc[2]

In [None]:
text1

'This spacious 1 rk independent house is available for rental and is located in the heart of Wagholi. It has an area of 350 sqft with a carpet area of 300 sqft . The property is available at a monthly rental of Rs. 4,500 . This residential property is ready-to-move-in. It is made in way to provide a comfortable living for the residents. Contact us for more details. '

In [None]:
re.findall(r"\d+",text1)[2]  # raw string avoids the invalid "\d" escape; third number in the text

'300'
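Picking the third number by position only works while every description follows the same template. A sketch that anchors on the phrase `carpet area of` instead; `extract_carpet_area` is a hypothetical helper, and the pattern is an assumption based on the single description shown above.

```python
import re

# Assumed phrasing, taken from the sample description: "carpet area of 300 sqft".
_CARPET_PATTERN = re.compile(r"carpet area of\s*(\d[\d,]*)\s*sq\s*ft", re.IGNORECASE)

def extract_carpet_area(description):
    """Return the carpet area in sq ft as an int, or None when it is not mentioned."""
    match = _CARPET_PATTERN.search(description)
    if match is None:
        return None
    return int(match.group(1).replace(",", ""))

sample = ("It has an area of 350 sqft with a carpet area of 300 sqft . "
          "The property is available at a monthly rental of Rs. 4,500 .")
print(extract_carpet_area(sample))
print(extract_carpet_area("Contact us for more details."))
```

Descriptions that never mention a carpet area yield `None`, so the feature can be treated as missing rather than silently picking up an unrelated number such as the rent.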

In [None]:
finaldf1=finaldf.drop(columns=['latitude','longitude','currency','priceSqFt','verificationDate','description'])

In [None]:
def clean_Security_Deposit(text):
  return re.sub(',','',text)

In [None]:
finaldf1['SecurityDeposit']=finaldf1['SecurityDeposit'].apply(clean_Security_Deposit)

In [None]:
finaldf1['SecurityDeposit']=np.where(finaldf1['SecurityDeposit']=="No Deposit",0,finaldf1['SecurityDeposit'])

In [None]:
finaldf1['SecurityDeposit']=finaldf1['SecurityDeposit'].astype('int')
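The three deposit steps above can also be written as one chained, vectorized expression. A sketch on a toy Series; the values are made up to mirror the formats seen in the data.

```python
import pandas as pd

# Toy values mirroring the formats in the SecurityDeposit column.
deposits = pd.Series(["No Deposit", "36,000", "50000"])

cleaned = (
    deposits.str.replace(",", "", regex=False)  # drop thousands separators
            .replace("No Deposit", "0")          # treat "No Deposit" as zero
            .astype(int)
)
print(cleaned.tolist())
```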

In [None]:
finaldf[finaldf['isNegotiable']=='Negotiable']['description'].iloc[811]

'AVAILABLE 2 BHK FLAT AT ANDHERI WEST \nVERY BIG AND LUXURY FLATS FOR FAMILY AND BACHELORS AND PETS MOST WELCOME.\nWE HAVE MULTIPLE FLAT OPTIONS AVAILABLE AT ANDHERI WEST. FOR MORE DETAILS OR MORE FLAT OPTIONS PLEASE CONTACT US.\nPRISM PROPERTY'

In [None]:
finaldf1['numBathrooms'].isna().sum()

31

In [None]:
finaldf1['numBathrooms']=finaldf1['numBathrooms'].fillna(1)  # assign back instead of inplace=True on a column selection

In [None]:
finaldf1['numBathrooms']=finaldf1['numBathrooms'].astype('int')

In [None]:
finaldf1['numBalconies']=finaldf1['numBalconies'].astype('object')


In [None]:
finaldf1.groupby('numBalconies')['price'].median()

Unnamed: 0_level_0,price
numBalconies,Unnamed: 1_level_1
1.0,17000.0
2.0,18000.0
3.0,29500.0
4.0,44000.0
5.0,60000.0
6.0,888000.0


In [None]:
finaldf1['numBalconies'].isna().sum()

6356

In [None]:
text=finaldf1['house_type'].head(1)[0]

In [None]:
text

'2 BHK Apartment '

In [None]:
text=text.strip()

In [None]:
import re
re.sub(' ',"_",text)


'2_BHK_Apartment'

In [None]:
def house_type_cleaning(text):
  text=text.strip()
  return re.sub(' ',"_",text)

In [None]:
finaldf1['house_type']=finaldf1['house_type'].apply(house_type_cleaning)

In [None]:
import numpy as np

finaldf1['numBalconies'] = np.where(finaldf1['house_type'] == '2_BHK_Apartment',
                                    finaldf1['numBalconies'].fillna(1),
                                    finaldf1['numBalconies'].fillna(2))
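The rule above fills missing balconies with 1 for `2_BHK_Apartment` and with 2 for every other type. An alternative worth comparing is to fill each gap with the median of its own house type. A sketch on made-up data, assuming `numBalconies` is numeric; it has not been evaluated on this dataset.

```python
import pandas as pd

# Made-up rows; None marks a missing balcony count.
df = pd.DataFrame({
    "house_type": ["1_BHK_Apartment", "1_BHK_Apartment",
                   "2_BHK_Apartment", "2_BHK_Apartment", "2_BHK_Apartment"],
    "numBalconies": [1.0, None, 2.0, None, 2.0],
})

# Median balcony count per house type, aligned row by row with df.
group_median = df.groupby("house_type")["numBalconies"].transform("median")
df["numBalconies"] = df["numBalconies"].fillna(group_median)
print(df)
```

House types with no observed balcony count at all would stay missing under this approach and would still need a fallback value.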


In [None]:
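# Returns no rows: house_type now uses underscores (e.g. '2_BHK_Apartment'), so the spaced label no longer matches
finaldf1[finaldf1['house_type']=='2 BHK Apartment']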

Unnamed: 0,house_type,house_size,location,city,price,numBathrooms,numBalconies,isNegotiable,SecurityDeposit,Status


In [None]:
finaldf['description']

Unnamed: 0,description
0,A spacious 2 bhk multistorey apartment is avai...
1,It has a built-up area of 650 sqft and is avai...
2,This spacious 1 rk independent house is availa...
3,"Furnishings include 1 tv, 1 refrigerator, 1 so..."
4,It's a 5 bhk independent house situated in W...
...,...
4995,Gundecha asta complex. Andheri East. sakinaka ...
4996,It has an area of 1200 sqft . The property is ...
4997,It is located on the 7th floor(out of a total ...
4998,"It is a furnished property. It has 1 wardrobe,..."


In [None]:
finaldf1['isNegotiable']=np.where(finaldf1['isNegotiable']=='Negotiable','Yes','No')

In [None]:
finaldf1.isna().sum()

Unnamed: 0,0
house_type,0
house_size,0
location,0
city,0
price,0
numBathrooms,0
numBalconies,0
isNegotiable,0
SecurityDeposit,0
Status,0


In [None]:
from sklearn.compose import ColumnTransformer
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder,OrdinalEncoder
from sklearn.metrics import r2_score,mean_absolute_error
from sklearn.linear_model import LinearRegression
from sklearn.neighbors import KNeighborsRegressor
from sklearn.tree import DecisionTreeRegressor
from sklearn.ensemble import RandomForestRegressor,GradientBoostingRegressor,AdaBoostRegressor
from sklearn.svm import SVR

In [None]:
x=finaldf1.drop(columns=['price'])
y=finaldf1['price']

In [None]:
# train test split
from sklearn.model_selection import train_test_split
X_train,X_test,y_train,y_test = train_test_split(x,y,test_size=0.20,random_state=2)

In [None]:
X_train['city'].value_counts()

Unnamed: 0_level_0,count
city,Unnamed: 1_level_1
Mumbai,4013
Pune,3107
Hisar,8
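The counts include 8 training rows labelled `Hisar`, although the study targets Pune and Mumbai. One option, sketched here on a toy frame with made-up values, is to keep only the expected cities before splitting. Whether those rows are mislabelled or genuinely out of scope would need checking first.

```python
import pandas as pd

# Toy frame with one out-of-scope city.
toy = pd.DataFrame({
    "city": ["Pune", "Mumbai", "Hisar", "Pune"],
    "price": [12000, 45000, 8000, 16000],
})

expected_cities = ["Pune", "Mumbai"]
filtered = toy[toy["city"].isin(expected_cities)].reset_index(drop=True)
print(filtered["city"].value_counts())
```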


In [None]:
X_train

Unnamed: 0,house_type,house_size,location,city,numBathrooms,numBalconies,isNegotiable,SecurityDeposit,Status
251,3_BHK_Apartment,1310,Kalyan West,Mumbai,3,2.0,No,0,Furnished
482,2_BHK_Apartment,950,Bandra West,Mumbai,2,1.0,No,0,Furnished
1449,1_BHK_Apartment,956,Wadala,Mumbai,1,2.0,No,0,Semi-Furnished
2316,1_BHK_Apartment,700,Ulwe,Mumbai,1,2.0,No,0,Furnished
623,2_BHK_Apartment,780,Lohegaon,Pune,2,1.0,No,0,Semi-Furnished
...,...,...,...,...,...,...,...,...,...
1099,2_BHK_Apartment,990,Wagholi,Pune,2,2.0,No,25000,Furnished
2514,2_BHK_Apartment,1100,Aundh,Pune,2,1.0,Yes,84000,Semi-Furnished
2727,2_BHK_Apartment,1000,Ambernath West,Mumbai,2,1.0,No,0,Semi-Furnished
2575,1_BHK_Apartment,633,Dhanori,Pune,1,2.0,Yes,50000,Unfurnished


In [None]:
from sklearn.compose import ColumnTransformer
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, OrdinalEncoder
from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score, mean_absolute_error

# Step 1: Column Transformer with encoding
step1 = ColumnTransformer(transformers=[
    ('tnf1', OneHotEncoder(sparse_output=False, categories='auto', handle_unknown='ignore'), [2, 3, 6]),
    ('tnf2', OrdinalEncoder(categories=[['1_RK_Studio_Apartment', '1_BHK_Apartment', '1_BHK_Independent_House',
                                         '1_BHK_Independent_Floor', '1_BHK_Villa', '2_BHK_Apartment',
                                         '2_BHK_Independent_House', '2_BHK_Independent_Floor', '2_BHK_Villa',
                                         '3_BHK_Apartment', '3_BHK_Independent_House', '3_BHK_Independent_Floor',
                                         '3_BHK_Villa', '4_BHK_Apartment', '4_BHK_Independent_House',
                                         '4_BHK_Independent_Floor', '4_BHK_Villa', '5_BHK_Apartment',
                                         '5_BHK_Independent_House', '5_BHK_Villa', '6_BHK_Apartment',
                                         '6_BHK_Independent_House', '6_BHK_Villa', '6_BHK_penthouse']],
                         handle_unknown='use_encoded_value', unknown_value=-1), [0]),
    ('tnf3', OrdinalEncoder(categories=[['Unfurnished', 'Semi-Furnished', 'Furnished']],
                            handle_unknown='use_encoded_value', unknown_value=-1), [8]),
], remainder='passthrough')

# Step 2: Use Multiple Linear Regression instead of Gradient Boosting
step2 = LinearRegression()
# Create pipeline
pipe = Pipeline([
    ('step1', step1),
    ('step2', step2)
])

# Fit the pipeline to the training data
pipe.fit(X_train, y_train)
# Make predictions on the test data
y_pred = pipe.predict(X_test)
# Evaluate the model
print('R2 score:', r2_score(y_test, y_pred))
print('MAE:', mean_absolute_error(y_test, y_pred))


R2 score: 0.6614131312692343
MAE: 17102.42474754782
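For reference, both metrics are simple to compute by hand. MAE is the mean absolute difference between actual and predicted rent. R2 is one minus the ratio of the residual sum of squares to the total sum of squares around the mean. A sketch with made-up numbers; the helper names are illustrative.

```python
def mean_absolute_error_manual(y_true, y_pred):
    """Average absolute gap between actual and predicted values."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

def r2_score_manual(y_true, y_pred):
    """1 - SS_res / SS_tot, where SS_tot is measured around the mean of y_true."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot

actual = [10000, 20000, 30000]
predicted = [12000, 18000, 33000]
mae = mean_absolute_error_manual(actual, predicted)
r2 = r2_score_manual(actual, predicted)
print(round(mae, 2), round(r2, 3))  # 2333.33 0.915
```

It follows that an R2 below zero means the predictions have a larger squared error than always predicting the mean rent.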


In [None]:
from sklearn.tree import DecisionTreeRegressor
from sklearn.compose import ColumnTransformer
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, OrdinalEncoder
from sklearn.metrics import r2_score, mean_absolute_error

# Step 1: Column Transformer with encoding
step1 = ColumnTransformer(transformers=[
    ('tnf1', OneHotEncoder(sparse_output=False, categories='auto', handle_unknown='ignore'), [2, 3, 6]),
    ('tnf2', OrdinalEncoder(categories=[['1_RK_Studio_Apartment', '1_BHK_Apartment', '1_BHK_Independent_House',
                                         '1_BHK_Independent_Floor', '1_BHK_Villa', '2_BHK_Apartment',
                                         '2_BHK_Independent_House', '2_BHK_Independent_Floor', '2_BHK_Villa',
                                         '3_BHK_Apartment', '3_BHK_Independent_House', '3_BHK_Independent_Floor',
                                         '3_BHK_Villa', '4_BHK_Apartment', '4_BHK_Independent_House',
                                         '4_BHK_Independent_Floor', '4_BHK_Villa', '5_BHK_Apartment',
                                         '5_BHK_Independent_House', '5_BHK_Villa', '6_BHK_Apartment',
                                         '6_BHK_Independent_House', '6_BHK_Villa', '6_BHK_penthouse']],
                         handle_unknown='use_encoded_value', unknown_value=-1), [0]),
    ('tnf3', OrdinalEncoder(categories=[['Unfurnished', 'Semi-Furnished', 'Furnished']],
                            handle_unknown='use_encoded_value', unknown_value=-1), [8]),
], remainder='passthrough')

# Step 2: Use Decision Tree Regressor
step2 = DecisionTreeRegressor()

# Create pipeline
pipe = Pipeline([
    ('step1', step1),
    ('step2', step2)
])

# Fit the pipeline to the training data
pipe.fit(X_train, y_train)

# Make predictions on the test data
y_pred = pipe.predict(X_test)

# Evaluate the model
print('R2 score:', r2_score(y_test, y_pred))
print('MAE:', mean_absolute_error(y_test, y_pred))


R2 score: 0.7869902683681171
MAE: 9683.418262798956


In [None]:
# Step 1: Column Transformer with encoding
step1 = ColumnTransformer(transformers=[
    ('tnf1', OneHotEncoder(sparse_output=False, categories='auto', handle_unknown='ignore'), [2, 3, 6]),
    ('tnf2', OrdinalEncoder(categories=[['1_RK_Studio_Apartment', '1_BHK_Apartment', '1_BHK_Independent_House',
                                         '1_BHK_Independent_Floor', '1_BHK_Villa', '2_BHK_Apartment',
                                         '2_BHK_Independent_House', '2_BHK_Independent_Floor', '2_BHK_Villa',
                                         '3_BHK_Apartment', '3_BHK_Independent_House', '3_BHK_Independent_Floor',
                                         '3_BHK_Villa', '4_BHK_Apartment', '4_BHK_Independent_House',
                                         '4_BHK_Independent_Floor', '4_BHK_Villa', '5_BHK_Apartment',
                                         '5_BHK_Independent_House', '5_BHK_Villa', '6_BHK_Apartment',
                                         '6_BHK_Independent_House', '6_BHK_Villa', '6_BHK_penthouse']],
                         handle_unknown='use_encoded_value', unknown_value=-1), [0]),
    ('tnf3', OrdinalEncoder(categories=[['Unfurnished', 'Semi-Furnished', 'Furnished']],
                            handle_unknown='use_encoded_value', unknown_value=-1), [8]),
], remainder='passthrough')
# Step 2: Use Random Forest Regressor
step2 = RandomForestRegressor(n_estimators=100)
# Create pipeline
pipe = Pipeline([
    ('step1', step1),
    ('step2', step2)
])
# Fit the pipeline to the training data
pipe.fit(X_train, y_train)

# Make predictions on the test data
y_pred = pipe.predict(X_test)

# Evaluate the model
print('R2 score:', r2_score(y_test, y_pred))
print('MAE:', mean_absolute_error(y_test, y_pred))


R2 score: 0.8260226367050573
MAE: 9240.007035678038


In [None]:
from sklearn.svm import SVR
from sklearn.compose import ColumnTransformer
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, OrdinalEncoder, StandardScaler
from sklearn.metrics import r2_score, mean_absolute_error
# Step 1: Column Transformer with encoding
step1 = ColumnTransformer(transformers=[
    ('tnf1', OneHotEncoder(sparse_output=False, categories='auto', handle_unknown='ignore'), [2, 3, 6]),
    ('tnf2', OrdinalEncoder(categories=[['1_RK_Studio_Apartment', '1_BHK_Apartment', '1_BHK_Independent_House',
                                         '1_BHK_Independent_Floor', '1_BHK_Villa', '2_BHK_Apartment',
                                         '2_BHK_Independent_House', '2_BHK_Independent_Floor', '2_BHK_Villa',
                                         '3_BHK_Apartment', '3_BHK_Independent_House', '3_BHK_Independent_Floor',
                                         '3_BHK_Villa', '4_BHK_Apartment', '4_BHK_Independent_House',
                                         '4_BHK_Independent_Floor', '4_BHK_Villa', '5_BHK_Apartment',
                                         '5_BHK_Independent_House', '5_BHK_Villa', '6_BHK_Apartment',
                                         '6_BHK_Independent_House', '6_BHK_Villa', '6_BHK_penthouse']],
                         handle_unknown='use_encoded_value', unknown_value=-1), [0]),
    ('tnf3', OrdinalEncoder(categories=[['Unfurnished', 'Semi-Furnished', 'Furnished']],
                            handle_unknown='use_encoded_value', unknown_value=-1), [8]),
], remainder='passthrough')

# Step 2: Use StandardScaler to scale features before applying SVR
step2_scaler = StandardScaler()

# Step 3: Use SVR
step3 = SVR(kernel='rbf')  # You can choose different kernels, like 'linear', 'poly', etc.

# Create pipeline
pipe = Pipeline([
    ('step1', step1),
    ('step2_scaler', step2_scaler),  # Apply scaling after encoding
    ('step3', step3)
])

# Fit the pipeline to the training data
pipe.fit(X_train, y_train)

# Make predictions on the test data
y_pred = pipe.predict(X_test)

# Evaluate the model
print('R2 score:', r2_score(y_test, y_pred))
print('MAE:', mean_absolute_error(y_test, y_pred))


R2 score: -0.07539762368515457
MAE: 27528.59094449541
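A negative R2 means the model does worse than predicting the mean rent for every listing. A plausible cause, not verified here, is that SVR's default `C` and `epsilon` suit targets near unit scale, while rents run into the tens of thousands. One commonly tried remedy is to fit on a transformed target. The sketch below wraps SVR in scikit-learn's `TransformedTargetRegressor` with a log transform on synthetic data; the data and variable names are made up, and no result on the rental dataset is implied.

```python
import numpy as np
from sklearn.compose import TransformedTargetRegressor
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR

# Synthetic stand-in: rent grows with size, with multiplicative noise.
rng = np.random.default_rng(0)
size_sqft = rng.uniform(300, 2000, size=300)
rent = 20 * size_sqft * rng.lognormal(mean=0.0, sigma=0.1, size=300)
X = size_sqft.reshape(-1, 1)

# Features are standardized; the target is modelled as log(1 + rent)
# and mapped back to the original scale with expm1 at prediction time.
model = TransformedTargetRegressor(
    regressor=make_pipeline(StandardScaler(), SVR(kernel="rbf")),
    func=np.log1p,
    inverse_func=np.expm1,
)
model.fit(X, rent)
predicted_rent = model.predict(X[:5])
print(predicted_rent)
```

In the notebook's pipeline, the same wrapper could take the existing encoder, scaler, and SVR steps as its `regressor`; tuning `C` and `epsilon` is a separate option to compare.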


In [None]:


# Step 1: Column Transformer with encoding
step1 = ColumnTransformer(transformers=[
    ('tnf1', OneHotEncoder(sparse_output=False, categories='auto', handle_unknown='ignore'), [2, 3, 6]),
    ('tnf2', OrdinalEncoder(categories=[['1_RK_Studio_Apartment', '1_BHK_Apartment', '1_BHK_Independent_House',
                                         '1_BHK_Independent_Floor', '1_BHK_Villa', '2_BHK_Apartment',
                                         '2_BHK_Independent_House', '2_BHK_Independent_Floor', '2_BHK_Villa',
                                         '3_BHK_Apartment', '3_BHK_Independent_House', '3_BHK_Independent_Floor',
                                         '3_BHK_Villa', '4_BHK_Apartment', '4_BHK_Independent_House',
                                         '4_BHK_Independent_Floor', '4_BHK_Villa', '5_BHK_Apartment',
                                         '5_BHK_Independent_House', '5_BHK_Villa', '6_BHK_Apartment',
                                         '6_BHK_Independent_House', '6_BHK_Villa', '6_BHK_penthouse']],
                         handle_unknown='use_encoded_value', unknown_value=-1), [0]),
    ('tnf3', OrdinalEncoder(categories=[['Unfurnished', 'Semi-Furnished', 'Furnished']],
                            handle_unknown='use_encoded_value', unknown_value=-1), [8]),
], remainder='passthrough')

# Step 2: Use AdaBoost Regressor with DecisionTreeRegressor as base estimator
# `estimator` replaces the deprecated `base_estimator` argument (scikit-learn 1.2+)
step2 = AdaBoostRegressor(estimator=DecisionTreeRegressor(), n_estimators=100)

# Create pipeline
pipe = Pipeline([
    ('step1', step1),
    ('step2', step2)
])
# Fit the pipeline to the training data
pipe.fit(X_train, y_train)
# Make predictions on the test data
y_pred = pipe.predict(X_test)
# Evaluate the model
print('R2 score:', r2_score(y_test, y_pred))
print('MAE:', mean_absolute_error(y_test, y_pred))



R2 score: 0.792221715454946
MAE: 10365.135330877352


In [None]:
# Step 1: Column Transformer with encoding
# (`sparse_output` replaces the deprecated `sparse` argument, matching the cells above)
step1 = ColumnTransformer(transformers=[
    ('tnf1', OneHotEncoder(sparse_output=False, categories='auto', handle_unknown='ignore'), [2, 3, 6]),
    ('tnf2', OrdinalEncoder(categories=[['1_RK_Studio_Apartment', '1_BHK_Apartment', '1_BHK_Independent_House',
                                         '1_BHK_Independent_Floor', '1_BHK_Villa', '2_BHK_Apartment',
                                         '2_BHK_Independent_House', '2_BHK_Independent_Floor', '2_BHK_Villa',
                                         '3_BHK_Apartment', '3_BHK_Independent_House', '3_BHK_Independent_Floor',
                                         '3_BHK_Villa', '4_BHK_Apartment', '4_BHK_Independent_House',
                                         '4_BHK_Independent_Floor', '4_BHK_Villa', '5_BHK_Apartment',
                                         '5_BHK_Independent_House', '5_BHK_Villa', '6_BHK_Apartment',
                                         '6_BHK_Independent_House', '6_BHK_Villa', '6_BHK_penthouse']],
                         handle_unknown='use_encoded_value', unknown_value=-1), [0]),
    ('tnf3', OrdinalEncoder(categories=[['Unfurnished', 'Semi-Furnished', 'Furnished']],
                            handle_unknown='use_encoded_value', unknown_value=-1), [8]),
], remainder='passthrough')

# Step 2: Use Gradient Boosting Regressor
step2 = GradientBoostingRegressor(n_estimators=500)

# Create pipeline, fit, predict, and evaluate
pipe = Pipeline([
    ('step1', step1),
    ('step2', step2)
])
pipe.fit(X_train, y_train)
y_pred = pipe.predict(X_test)
print('R2 score', r2_score(y_test, y_pred))
print('MAE', mean_absolute_error(y_test, y_pred))



R2 score 0.8339732144535235
MAE 10506.16642911997
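The model cells above repeat the same `ColumnTransformer` and differ only in the estimator. A sketch of how the comparison could be driven from one loop, using a small synthetic frame with made-up values and only three of the columns. `build_preprocessor` and the `toy` frame are hypothetical, and the scores it prints say nothing about the real data.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_absolute_error, r2_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, OrdinalEncoder

# Synthetic stand-in for finaldf1 (values are made up for illustration).
rng = np.random.default_rng(0)
n = 200
toy = pd.DataFrame({
    "city": rng.choice(["Pune", "Mumbai"], size=n),
    "Status": rng.choice(["Unfurnished", "Semi-Furnished", "Furnished"], size=n),
    "house_size": rng.integers(300, 2000, size=n),
})
toy["price"] = toy["house_size"] * 20 + np.where(toy["city"] == "Mumbai", 15000, 0)

X = toy.drop(columns=["price"])
y = toy["price"]
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=2)

def build_preprocessor():
    """Return a fresh, unfitted encoder so each model gets its own copy."""
    return ColumnTransformer(
        transformers=[
            ("city", OneHotEncoder(handle_unknown="ignore"), ["city"]),
            ("status", OrdinalEncoder(
                categories=[["Unfurnished", "Semi-Furnished", "Furnished"]]), ["Status"]),
        ],
        remainder="passthrough",
    )

models = {
    "linear": LinearRegression(),
    "random_forest": RandomForestRegressor(n_estimators=50, random_state=0),
}

results = {}
for name, model in models.items():
    pipe = Pipeline([("prep", build_preprocessor()), ("model", model)])
    pipe.fit(X_train, y_train)
    y_pred = pipe.predict(X_test)
    results[name] = {
        "r2": r2_score(y_test, y_pred),
        "mae": mean_absolute_error(y_test, y_pred),
    }

for name, scores in results.items():
    print(name, scores)
```

Collecting the scores in one dictionary also makes it easier to compare models side by side, since R2 and MAE do not always rank them the same way.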


In [None]:
import pickle
pickle.dump(finaldf1,open('/content/dataf.pkl','wb'))
pickle.dump(pipe,open('/content/pipe1.pkl','wb'))

In [None]:
!pip install pandas==1.5.3

Collecting pandas==1.5.3
  Downloading pandas-1.5.3-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (11 kB)
Downloading pandas-1.5.3-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (12.1 MB)
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 12.1/12.1 MB 81.6 MB/s eta 0:00:00
Installing collected packages: pandas
  Attempting uninstall: pandas
    Found existing installation: pandas 2.1.4
    Uninstalling pandas-2.1.4:
      Successfully uninstalled pandas-2.1.4
ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
cudf-cu12 24.4.1 requires pandas<2.2.2dev0,>=2.0, but you have pandas 1.5.3 which is incompatible.
google-colab 1.0.0 requires pandas==2.1.4, but you have pandas 1.5.3 which is incompatible.
mizani 0.11.4 requires pandas>=2.1.0, but you have pandas 1.5.3 which is incompatible.
plotn

In [None]:
!pip install gradio

Collecting gradio
  Using cached gradio-4.44.0-py3-none-any.whl.metadata (15 kB)
Collecting aiofiles<24.0,>=22.0 (from gradio)
  Using cached aiofiles-23.2.1-py3-none-any.whl.metadata (9.7 kB)
Collecting fastapi<1.0 (from gradio)
  Using cached fastapi-0.115.0-py3-none-any.whl.metadata (27 kB)
Collecting ffmpy (from gradio)
  Using cached ffmpy-0.4.0-py3-none-any.whl.metadata (2.9 kB)
Collecting gradio-client==1.3.0 (from gradio)
  Using cached gradio_client-1.3.0-py3-none-any.whl.metadata (7.1 kB)
Collecting httpx>=0.24.1 (from gradio)
  Using cached httpx-0.27.2-py3-none-any.whl.metadata (7.1 kB)
Collecting orjson~=3.0 (from gradio)
  Using cached orjson-3.10.7-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (50 kB)
Collecting python-multipart>=0.0.9 (from gradio)
  Using cached python_multipart-0.0.9-py3-none-any.whl.metadata (2.5 kB)
Collecting uvicorn>=0.14.0 (from gradio)
  Using cached uvicorn-0.30.6-py3-none-any.whl.metadata (6.6 kB)
Collecting starlette<0.3

In [None]:
import gradio as gr
import pickle
import numpy as np

# Load the fitted pipeline and the cleaned data saved earlier
pipe = pickle.load(open('/content/pipe1.pkl', 'rb'))
dataf = pickle.load(open('/content/dataf.pkl', 'rb'))

def predict_rent_price(house_type, house_size, location, city, numBathrooms,
       numBalconies, isNegotiable, SecurityDeposit, Status):
    # Keep the inputs in training column order; dtype=object stops NumPy
    # from coercing the numeric inputs to strings
    query = np.array([house_type, house_size, location, city, numBathrooms,
       numBalconies, isNegotiable, SecurityDeposit, Status], dtype=object)
    query = query.reshape(1, 9)
    prediction = pipe.predict(query)[0]
    return round(prediction)

# Define the inputs and outputs for the Gradio interface
inputs = [
    gr.Dropdown(choices=dataf['house_type'].unique().tolist(), label="House Type"),
    gr.Number(label="House Size in Sqft"),
    gr.Dropdown(choices=dataf['location'].unique().tolist(), label="Location"),
    gr.Dropdown(choices=dataf['city'].unique().tolist(), label="City"),
    gr.Number(label="Number of Bathrooms"),
    gr.Number(label="Number of Balconies"),
    gr.Dropdown(choices=dataf['isNegotiable'].unique().tolist(), label="IsNegotiable"),
    gr.Number(label="SecurityDeposit"),
    gr.Dropdown(choices=dataf['Status'].unique().tolist(), label="Status"),
]
outputs = gr.Textbox(label="Predicted Price")

# Create and launch the Gradio interface
gr.Interface(fn=predict_rent_price, inputs=inputs, outputs=outputs, title="Home Rent Prediction").launch()


Setting queue=True in a Colab notebook requires sharing enabled. Setting `share=True` (you can turn this off by setting `share=False` in `launch()` explicitly).

Colab notebook detected. To show errors in colab notebook, set debug=True in launch()
Running on public URL: https://4e957e4e4e2d67325a.gradio.live

This share link expires in 72 hours. For free permanent hosting and GPU upgrades, run `gradio deploy` from Terminal to deploy to Spaces (https://huggingface.co/spaces)




In [None]:
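# This cell raised the NameError shown below: finaldf1 is not defined in this session.
# The same frame is available as `dataf`, loaded from the pickle in the Gradio cell.
finaldf1.groupby('isNegotiable')['price'].mean().plot(kind='bar')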

NameError: name 'finaldf1' is not defined

In [None]:
# TODO: t-test of price between negotiable and non-negotiable listings
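A sketch of the planned test, assuming the question is whether mean rent differs between negotiable and non-negotiable listings. It computes Welch's t statistic, which does not assume equal variances, with the standard library on made-up prices; `welch_t` is a hypothetical helper.

```python
import math
import statistics

def welch_t(sample_a, sample_b):
    """Welch's t statistic for the difference in means of two samples."""
    mean_a, mean_b = statistics.mean(sample_a), statistics.mean(sample_b)
    var_a, var_b = statistics.variance(sample_a), statistics.variance(sample_b)
    standard_error = math.sqrt(var_a / len(sample_a) + var_b / len(sample_b))
    return (mean_a - mean_b) / standard_error

# Made-up monthly rents for the two groups.
negotiable = [10000, 12000, 14000]
fixed = [20000, 22000, 24000]
t_stat = welch_t(negotiable, fixed)
print(t_stat)
```

If SciPy is available, `scipy.stats.ttest_ind(negotiable, fixed, equal_var=False)` reports the same statistic together with a p-value.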

In [None]:
finaldf1