# Train a machine learning model on historical data and export it
Let's say you have a house to sell. What price should you ask for it?

In [1]:
your_house = {
    'BEDROOMS': 3,
    'BATHROOMS': 2,
    'GARAGE': 2,
    'FLOOR_AREA': 200,
    'BUILD_YEAR': 2000
}

In [2]:
import pandas as pd

name = 'Wall Street'
df_house = pd.DataFrame(your_house, index=[name])
df_house

| | BEDROOMS | BATHROOMS | GARAGE | FLOOR_AREA | BUILD_YEAR |
|---|---|---|---|---|---|
| Wall Street | 3 | 2 | 2 | 200 | 2000 |


## Historical data

Given a dataset of many houses and the prices they sold for, you can train a machine learning model to estimate a price for your house.

In [3]:
import pandas as pd

df_base = pd.read_csv('../../../data/house-price.csv', index_col=0)
df_base

| ADDRESS | PRICE | BEDROOMS | BATHROOMS | GARAGE | FLOOR_AREA | BUILD_YEAR |
|---|---|---|---|---|---|---|
| 1 Datchet Turn | 270000 | 3 | 2 | 2.0 | 109 | 2011.0 |
| 1 McKenzie Corner | 470000 | 4 | 2 | 2.0 | 279 | 2005.0 |
| ... | ... | ... | ... | ... | ... | ... |
| 93 Centennial Avenue | 350000 | 4 | 2 | 2.0 | 177 | 2005.0 |
| 98 Centennial Avenue | 441000 | 4 | 2 | 2.0 | 195 | 2004.0 |


## Feature selection

In [4]:
# Target: the price each house sold for
y = df_base['PRICE']
# Features: every remaining column
X = df_base.drop(columns='PRICE')

## Train Machine Learning model

In [5]:
from sklearn.tree import DecisionTreeRegressor
model = DecisionTreeRegressor()
model.fit(X, y)

In [6]:
model.score(X, y)

0.9746449876358452
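
The score above is R² computed on the same rows the model was trained on. A fully grown decision tree can nearly memorise its training data, so this number says little about how well the model prices houses it has not seen. Below is a minimal sketch of a held-out evaluation, assuming scikit-learn's `train_test_split` and a small synthetic dataset standing in for `house-price.csv`. The synthetic values, the `_demo` names, and the `random_state` settings are illustrative assumptions, not part of the original notebook.

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeRegressor

# Synthetic stand-in for the house-price data (illustrative values only).
rng = np.random.default_rng(0)
n = 200
X_demo = pd.DataFrame({
    'BEDROOMS': rng.integers(2, 6, n),
    'BATHROOMS': rng.integers(1, 4, n),
    'GARAGE': rng.integers(0, 3, n),
    'FLOOR_AREA': rng.integers(80, 300, n),
    'BUILD_YEAR': rng.integers(1980, 2015, n),
})
# Made-up price formula plus noise, so the target is not perfectly predictable.
y_demo = (
    1500 * X_demo['FLOOR_AREA']
    + 20000 * X_demo['BEDROOMS']
    + rng.normal(0, 20000, n)
)

# Hold out 25% of the rows; the model never sees them during training.
X_train, X_test, y_train, y_test = train_test_split(
    X_demo, y_demo, test_size=0.25, random_state=0
)

model_demo = DecisionTreeRegressor(random_state=0)
model_demo.fit(X_train, y_train)

train_r2 = model_demo.score(X_train, y_train)  # R² on rows seen in training
test_r2 = model_demo.score(X_test, y_test)     # R² on held-out rows
print(f'train R²: {train_r2:.3f}, test R²: {test_r2:.3f}')
```

The held-out score is the more honest estimate of how the model would perform on a new house; a large gap between the two scores is a sign of overfitting.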

## Calculate prediction

In [7]:
df_house

Unnamed: 0,BEDROOMS,BATHROOMS,GARAGE,FLOOR_AREA,BUILD_YEAR
Wall Street,3,2,2,200,2000


In [8]:
model.predict(df_house)

array([373000.])

## Export model

In [9]:
import pickle

with open('../artifacts/model.pkl', 'wb') as f:
    pickle.dump(model, f)
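
To reuse the exported model later, it can be loaded back with `pickle.load`. The following is a minimal round-trip sketch, assuming a temporary directory in place of `../artifacts/` and made-up training rows so that it runs on its own. Unpickling can execute arbitrary code, so only load files you trust, and load the model with the same scikit-learn version that saved it where possible.

```python
import pickle
import tempfile
from pathlib import Path

import pandas as pd
from sklearn.tree import DecisionTreeRegressor

# Toy training data (made-up values), using the notebook's feature columns.
X_toy = pd.DataFrame({
    'BEDROOMS': [3, 4, 2, 4],
    'BATHROOMS': [2, 2, 1, 3],
    'GARAGE': [2, 2, 1, 2],
    'FLOOR_AREA': [109, 279, 90, 195],
    'BUILD_YEAR': [2011, 2005, 1995, 2004],
})
y_toy = [270000, 470000, 200000, 441000]
model_toy = DecisionTreeRegressor(random_state=0).fit(X_toy, y_toy)

with tempfile.TemporaryDirectory() as tmp_dir:
    model_path = Path(tmp_dir) / 'model.pkl'

    # Export: serialise the fitted model to disk.
    with open(model_path, 'wb') as f:
        pickle.dump(model_toy, f)

    # Import: read the model back as a new object.
    with open(model_path, 'rb') as f:
        loaded_model = pickle.load(f)

# The loaded model should reproduce the original model's predictions.
same_predictions = (loaded_model.predict(X_toy) == model_toy.predict(X_toy)).all()
```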