## 🏠 House Price Prediction – ML Model Training Notebook
This notebook trains a machine learning model to predict apartment prices in Bishkek from three features: number of rooms, area, and floor. The trained model is then saved for use in a Flask web app.

#### 📥 Step 1: Import Required Libraries

In [1]:
import pandas as pd
import pickle
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

#### 🧹 Step 2: Load and Prepare the Dataset

In [2]:
# Load dataset
df = pd.read_csv('Houses.csv', index_col=0)

In [3]:
df.head()

|   | Number of rooms | Area (in m2) | Floor | Address | Price in USD | Price in KGS |
|---|---|---|---|---|---|---|
| 0 | 2 | 65 | 1 | Бишкек, Чокана Валиханова 2/9-2/12 | 48500 | 3964026 |
| 1 | 3 | 116 | 10 | Бишкек, Военторг, Исанова/Токтогула | 112000 | 9154040 |
| 2 | 1 | 54 | 10 | Бишкек, Академия Наук, Шевченко 94-98/Уметалиева | 65000 | 5312613 |
| 3 | 2 | 41 | 2 | Бишкек, 9 м-н, Суеркулова 4б | 49000 | 4004893 |
| 4 | 3 | 60 | 1 | Бишкек, Азия Молл, Балтагулова 1а/Айтматова | 50000 | 4086625 |


In [4]:
# Drop columns that are not needed
df = df.drop(columns=['Price in KGS', 'Address'])

#### ❗ Step 3: Clean Incorrect Entries
One row contains the string "более" (Russian for "more") in the "Area (in m2)" column instead of a number. That row cannot be used for training and must be removed.

In [5]:
df = df[df['Area (in m2)'] != 'более']
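
Filtering out the bad row most likely leaves the column stored as strings, because pandas inferred the column type while the non-numeric value was still present. The following is a minimal sketch of a more general cleanup, assuming a small made-up frame in place of `Houses.csv` (the `sample` and `cleaned` names are hypothetical): `pd.to_numeric` with `errors='coerce'` turns any unparseable entry into `NaN`, which can then be dropped.

```python
import pandas as pd

# Hypothetical stand-in for the real dataset; values are illustrative only
sample = pd.DataFrame({
    'Number of rooms': [2, 3, 1],
    'Area (in m2)': ['65', 'более', '54'],
    'Floor': [1, 10, 10],
})

# Convert to numbers; anything that cannot be parsed becomes NaN
sample['Area (in m2)'] = pd.to_numeric(sample['Area (in m2)'], errors='coerce')

# Drop rows where the area could not be parsed
cleaned = sample.dropna(subset=['Area (in m2)'])

print(cleaned.dtypes)
print(len(cleaned))
```

This approach also catches other non-numeric entries, not only the one known bad value.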

#### 🎯 Step 4: Define Features and Target

In [6]:
# Split features and target
X = df.copy()
y = X.pop('Price in USD')

In [7]:
X.head()

|   | Number of rooms | Area (in m2) | Floor |
|---|---|---|---|
| 0 | 2 | 65 | 1 |
| 1 | 3 | 116 | 10 |
| 2 | 1 | 54 | 10 |
| 3 | 2 | 41 | 2 |
| 4 | 3 | 60 | 1 |


#### 🔀 Step 5: Train/Test Split

In [8]:
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)
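
As a quick check of what this split produces, here is a small sketch on made-up data (the 100-row frame and the `*_demo` names are assumptions, not the real dataset). With `test_size=0.25`, a quarter of the rows go to the test set, and fixing `random_state` makes the split reproducible across runs.

```python
import pandas as pd
from sklearn.model_selection import train_test_split

# Hypothetical 100-row frame with the same feature names as the notebook
X_demo = pd.DataFrame({
    'Number of rooms': [1 + i % 4 for i in range(100)],
    'Area (in m2)': [30 + i for i in range(100)],
    'Floor': [1 + i % 12 for i in range(100)],
})
y_demo = pd.Series([20000 + 900 * i for i in range(100)], name='Price in USD')

# Two calls with the same random_state select the same rows
X_tr, X_te, y_tr, y_te = train_test_split(X_demo, y_demo, test_size=0.25, random_state=0)
X_tr2, X_te2, _, _ = train_test_split(X_demo, y_demo, test_size=0.25, random_state=0)

print(X_tr.shape, X_te.shape)
print(X_tr.index.equals(X_tr2.index))
```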

#### 🌲 Step 6: Train the Model (Random Forest)

In [9]:
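# Initialize model.
# Note: RandomForestClassifier treats every distinct price as its own class label;
# for a continuous target such as price, RandomForestRegressor is the usual choice.
model = RandomForestClassifier(
    random_state=0, criterion='gini', n_estimators=100
)

# Train on the training split
model.fit(X_train, y_train)

# Evaluate on the training data only: score() returns the fraction of training
# rows whose price is predicted exactly. The test split is not used here.
print("Train accuracy:", model.score(X_train, y_train))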

Train accuracy: 0.8556149732620321
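
Since price is a continuous value, a regressor evaluated on the held-out test set is a common alternative to the classifier above. The sketch below is not the notebook's method: it swaps in `RandomForestRegressor` and runs on synthetic data (the `*_demo` names, the price formula, and the noise level are all assumptions), so its numbers say nothing about the real dataset. It only illustrates how train and test scores can be compared.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_absolute_error
from sklearn.model_selection import train_test_split

# Synthetic stand-in data (NOT the real Houses.csv): price grows with area plus noise
rng = np.random.default_rng(0)
n = 300
X_demo = pd.DataFrame({
    'Number of rooms': rng.integers(1, 5, size=n),
    'Area (in m2)': rng.uniform(30, 150, size=n),
    'Floor': rng.integers(1, 13, size=n),
})
y_demo = 900 * X_demo['Area (in m2)'] + rng.normal(0, 5000, size=n)

X_tr, X_te, y_tr, y_te = train_test_split(X_demo, y_demo, test_size=0.25, random_state=0)

reg = RandomForestRegressor(n_estimators=100, random_state=0)
reg.fit(X_tr, y_tr)

# For regressors, score() returns R^2 rather than accuracy
train_r2 = reg.score(X_tr, y_tr)
test_r2 = reg.score(X_te, y_te)
test_mae = mean_absolute_error(y_te, reg.predict(X_te))

print("Train R^2:", train_r2)
print("Test R^2:", test_r2)
print("Test MAE (USD):", test_mae)
```

A large gap between the train and test scores would suggest overfitting, which a training-only score cannot reveal.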


#### 💾 Step 7: Save the Trained Model

In [10]:
# Save model to file
with open('model.pkl', 'wb') as f:
    pickle.dump(model, f)
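
To show how the saved file might be used later, for example inside the web app, here is a minimal round-trip sketch. It assumes a tiny made-up training set and a temporary file path so that it runs on its own. The `model_demo`, `loaded`, and `new_flat` names are hypothetical. The input passed to `predict` should use the same column names and order as the training data.

```python
import os
import pickle
import tempfile

import pandas as pd
from sklearn.ensemble import RandomForestClassifier

# Tiny illustrative training set, only so the sketch is self-contained
X_demo = pd.DataFrame({
    'Number of rooms': [1, 2, 3, 2],
    'Area (in m2)': [40, 65, 116, 41],
    'Floor': [3, 1, 10, 2],
})
y_demo = pd.Series([40000, 48500, 112000, 49000])

model_demo = RandomForestClassifier(n_estimators=10, random_state=0).fit(X_demo, y_demo)

# Save to a temporary path (the notebook itself writes 'model.pkl')
path = os.path.join(tempfile.mkdtemp(), 'model.pkl')
with open(path, 'wb') as f:
    pickle.dump(model_demo, f)

# Load it back, as a separate process would
with open(path, 'rb') as f:
    loaded = pickle.load(f)

# One new apartment, described with the training column names
new_flat = pd.DataFrame([{'Number of rooms': 2, 'Area (in m2)': 65, 'Floor': 1}])
print(loaded.predict(new_flat))
```

Pickle files should only be loaded from trusted sources, and loading generally works best with the same scikit-learn version that created the file.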