<h1>Table of Contents<span class="tocSkip"></span></h1>
<div class="toc"><ul class="toc-item"><li><span><a href="#Import-Data" data-toc-modified-id="Import-Data-1"><span class="toc-item-num">1&nbsp;&nbsp;</span>Import Data</a></span><ul class="toc-item"><li><span><a href="#Load-Data" data-toc-modified-id="Load-Data-1.1"><span class="toc-item-num">1.1&nbsp;&nbsp;</span>Load Data</a></span></li><li><span><a href="#Extract-Features-and-Targets" data-toc-modified-id="Extract-Features-and-Targets-1.2"><span class="toc-item-num">1.2&nbsp;&nbsp;</span>Extract Features and Targets</a></span></li><li><span><a href="#Create-Validation-Set" data-toc-modified-id="Create-Validation-Set-1.3"><span class="toc-item-num">1.3&nbsp;&nbsp;</span>Create Validation Set</a></span></li><li><span><a href="#Explore-Data" data-toc-modified-id="Explore-Data-1.4"><span class="toc-item-num">1.4&nbsp;&nbsp;</span>Explore Data</a></span></li></ul></li><li><span><a href="#Model-Creation" data-toc-modified-id="Model-Creation-2"><span class="toc-item-num">2&nbsp;&nbsp;</span>Model Creation</a></span></li><li><span><a href="#Train-Model" data-toc-modified-id="Train-Model-3"><span class="toc-item-num">3&nbsp;&nbsp;</span>Train Model</a></span></li><li><span><a href="#Evaluate-Model" data-toc-modified-id="Evaluate-Model-4"><span class="toc-item-num">4&nbsp;&nbsp;</span>Evaluate Model</a></span></li></ul></div>

# Import Packages

In [1]:
import pandas as pd
from sklearn.metrics import mean_absolute_error
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestRegressor

## Import Data

### Load Data

In [2]:
# Define data locations
data_dir        = '../Data/house-prices-advanced-regression-techniques/'
train_file_name = 'train.csv'
test_file_name  = 'test.csv'

# Load training and testing data
train_data = pd.read_csv( data_dir + train_file_name )
test_data  = pd.read_csv( data_dir + test_file_name )

### Extract Features and Targets

In [9]:
features = ['LotArea', 'YearBuilt', '1stFlrSF', '2ndFlrSF', 'FullBath', 'BedroomAbvGr', 'TotRmsAbvGrd']
#targets  = ['SalePrice']

X = train_data[features]
y = train_data.SalePrice

X_test = test_data[features]

### Create Validation Set

In [10]:
train_X, val_X, train_y, val_y = train_test_split(X, y, train_size=0.8, test_size=0.2, random_state=0)

### Explore Data

To avoid cluttering the notebook with output, the commands below are commented out; uncomment whichever one is of interest.

In [11]:
#train_X.describe()
#train_X.head()
#train_y.describe()
#train_y.head()
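
Alongside `describe()` and `head()`, a per-column count of missing values is a common exploratory check before fitting. The sketch below is only an illustration and is not part of the original notebook: `demo_X` is a hypothetical stand-in frame that borrows two of the feature names with made-up values, so the block runs without the CSV files.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in frame (assumption): two of the notebook's feature
# names with made-up values, one of which is missing.
demo_X = pd.DataFrame({
    'LotArea':   [8450, 9600, np.nan, 9550],
    'YearBuilt': [2003, 1976, 2001, 1915],
})

# Count missing values in each column
missing_counts = demo_X.isnull().sum()

# Keep only the columns that actually contain gaps
cols_with_missing = missing_counts[missing_counts > 0]
print(cols_with_missing)
```

In the notebook itself, the same two lines could be applied to `train_X` in place of `demo_X`.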

## Model Creation

Following the instructions, we define several models to compare here, instead of the single model used in the last course.

In [12]:
# Define the models
model_1 = RandomForestRegressor(n_estimators=50, random_state=0)
model_2 = RandomForestRegressor(n_estimators=100, random_state=0)
# criterion='absolute_error' is the name used since scikit-learn 1.0 (formerly 'mae')
model_3 = RandomForestRegressor(n_estimators=100, criterion='absolute_error', random_state=0)
model_4 = RandomForestRegressor(n_estimators=200, min_samples_split=20, random_state=0)
model_5 = RandomForestRegressor(n_estimators=100, max_depth=7, random_state=0)

models = [model_1, model_2, model_3, model_4, model_5]

## Train Model

In [13]:
for model in models:
    model.fit( train_X, train_y )

## Evaluate Model

In [26]:
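# Report each model's mean absolute error on the validation set
for i, model in enumerate(models, start=1):
    val_preds = model.predict( val_X )
    val_mae   = mean_absolute_error( val_y, val_preds )  # argument order: (y_true, y_pred)
    print( f'Model {i} validation MAE: {val_mae:.0f}' )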

Model 1 validation MAE: 24015
Model 2 validation MAE: 23741
Model 3 validation MAE: 23529
Model 4 validation MAE: 23997
Model 5 validation MAE: 23707
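
The comparison above can also be done programmatically, so that the best model is selected by code rather than by reading the printout. The following is a minimal sketch under stated assumptions, not part of the original notebook: `score_model` and `pick_best` are hypothetical helper names, and synthetic data from `make_regression` stands in for the house-price features so the block runs on its own.

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_absolute_error
from sklearn.model_selection import train_test_split


def score_model(model, train_X, val_X, train_y, val_y):
    """Fit the model on the training split and return its validation MAE."""
    model.fit(train_X, train_y)
    val_preds = model.predict(val_X)
    return mean_absolute_error(val_y, val_preds)


def pick_best(scores):
    """Return the key with the lowest MAE from a {name: mae} mapping."""
    return min(scores, key=scores.get)


# Synthetic stand-in data (assumption): 7 features, matching the length of
# the notebook's feature list but otherwise unrelated to the real dataset.
X_demo, y_demo = make_regression(n_samples=200, n_features=7, noise=10.0, random_state=0)
tr_X, va_X, tr_y, va_y = train_test_split(
    X_demo, y_demo, train_size=0.8, test_size=0.2, random_state=0
)

# Small forests keep the demo fast; the settings are illustrative only.
demo_models = {
    'model_1': RandomForestRegressor(n_estimators=10, random_state=0),
    'model_2': RandomForestRegressor(n_estimators=20, max_depth=7, random_state=0),
}

# Score every candidate, then select the one with the lowest validation MAE
demo_scores = {
    name: score_model(model, tr_X, va_X, tr_y, va_y)
    for name, model in demo_models.items()
}
best_name = pick_best(demo_scores)
print(f'Best model: {best_name} (validation MAE: {demo_scores[best_name]:.2f})')
```

Applied to the five models and the validation MAEs printed above, the same selection would return Model 3, which has the lowest MAE (23529). The differences between the models are small, so the ranking may change with a different `random_state` or validation split.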
