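# House Price Prediction - Feature Engineering
In this course you will learn how to build a regression model that predicts a house's price from information about the house. Through this project you will learn to manipulate data with the Python packages pandas and numpy, visualize it with matplotlib and seaborn, and build a deep learning model with keras.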

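### Environment notes
Before running this example, check that the Jupyter notebook is configured correctly: from the main menu choose "Edit" > "Notebook settings" > "Runtime type" and select "Python3", change the "Hardware accelerator" drop-down from "None" to "GPU", then click "Save".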

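### Course structure
In this house price prediction project, learners build a deep learning model and use it to predict house prices. The work consists of four main steps:

>1.   How to preprocess the data (Preprocessing)

>2.   How to carry out exploratory data analysis (Exploratory Data Analysis)

>3.   How to apply feature engineering (Feature Engineering)

>4.   How to choose a model and evaluate how well it learns (Model & Inference)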

---

**3.1 Load the required packages**

---

In [1]:
# 3-1
# Load the required packages first; `import (package_name) as (xxx)` gives each package a short alias so later calls are more convenient

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns # higher-level visualization package built on top of matplotlib
from keras.models import Sequential
from keras.layers import Dense, Dropout
from keras.models import load_model

import warnings
plt.style.use('ggplot')
warnings.filterwarnings('ignore')
%matplotlib inline

Using TensorFlow backend.


In [2]:
# 3-2
# pandas can read CSV files with pd.read_csv('file_name')

train = pd.read_csv('train/train-v3.csv')
test = pd.read_csv('test/test-v3.csv')
valid = pd.read_csv('vaild/valid-v3.csv')
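
After loading, it can help to confirm that each file was parsed as expected. The following is a minimal sketch, not part of the original notebook: it reads a small in-memory CSV (hypothetical columns and values standing in for a file such as `train-v3.csv`) and prints its shape, column types, and missing-value counts.

```python
import io

import pandas as pd

# Hypothetical CSV content; the real files have 23 columns and are not reproduced here.
csv_text = """id,price,bathrooms,sqft_living
1,350000,2.0,1800
2,520000,2.5,2400
3,275000,1.0,1100
"""

# pd.read_csv accepts a file path or a file-like object such as StringIO.
demo = pd.read_csv(io.StringIO(csv_text))

print(demo.shape)           # (rows, columns)
print(demo.dtypes)          # inferred type of each column
print(demo.isnull().sum())  # number of missing values per column
```

The same three checks could be run on `train`, `test`, and `valid` once the real files are loaded.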

---

**3.2 Handling outliers**

---

In [3]:
# 3-3
# Keep only the rows whose 'bathrooms' value lies within 3 standard deviations of the mean

print("Shape before outlier removal:", train.shape)
train1 = train[np.abs(train['bathrooms'] - train['bathrooms'].mean()) <= (3 * train['bathrooms'].std())]
print("Shape after outlier removal:", train1.shape)

Shape before outlier removal: (12967, 23)
Shape after outlier removal: (12863, 23)


In [4]:
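# 3-4
# The mask uses the mean and std of the original `train`, then is aligned to the rows still present in train1
# (the "before" shape printed here is that of the original `train`)

print("Shape before outlier removal:", train.shape)
mask = np.abs(train['sqft_living'] - train['sqft_living'].mean()) <= (3 * train['sqft_living'].std())
train2 = train1[mask.loc[train1.index]]
print("Shape after outlier removal:", train2.shape)

Shape before outlier removal: (12967, 23)
Shape after outlier removal: (12758, 23)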


In [5]:
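# 3-5
# Same rule for 'sqft_basement': the mask comes from the original `train` and is aligned to the rows still present in train2

print("Shape before outlier removal:", train.shape)
mask = np.abs(train['sqft_basement'] - train['sqft_basement'].mean()) <= (3 * train['sqft_basement'].std())
train3 = train2[mask.loc[train2.index]]
print("Shape after outlier removal:", train3.shape)

Shape before outlier removal: (12967, 23)
Shape after outlier removal: (12670, 23)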
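
The three cells above apply the same rule to one column at a time: keep a row only if its value lies within three standard deviations of the column mean. A minimal sketch of that rule as a reusable function follows. The helper name `within_three_sigma` and the synthetic data are assumptions for illustration and do not appear in the original notebook. Like the cells above, it takes the mean and standard deviation from the unfiltered data.

```python
import numpy as np
import pandas as pd


def within_three_sigma(df, column, reference=None, n_sigma=3.0):
    """Hypothetical helper: boolean mask of the rows of `df` whose `column`
    value is within `n_sigma` standard deviations of the mean.

    The mean and std are taken from `reference` (defaults to `df` itself).
    """
    ref = df if reference is None else reference
    mean = ref[column].mean()
    std = ref[column].std()
    return (df[column] - mean).abs() <= n_sigma * std


# Synthetic data: 200 ordinary rows plus one extreme 'bathrooms' value.
rng = np.random.default_rng(0)
demo = pd.DataFrame({
    'bathrooms': np.append(rng.normal(2.0, 0.5, 200), [12.0]),
    'sqft_living': np.append(rng.normal(2000.0, 400.0, 200), [1900.0]),
})

# Filter column by column, always using the statistics of the unfiltered data.
filtered = demo
for col in ['bathrooms', 'sqft_living']:
    filtered = filtered[within_three_sigma(filtered, col, reference=demo)]

print("Shape before outlier removal:", demo.shape)
print("Shape after outlier removal:", filtered.shape)
```

Passing `reference=None` would instead recompute the statistics on the already filtered rows, which generally gives slightly different thresholds and row counts.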


---

**3.3 Export the training dataset**

---

In [6]:
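# 3-6
# Export the cleaned training set to an Excel file
# (writing .xls depends on the xlwt engine; recent pandas versions may require .xlsx or CSV instead)
train3.to_excel('train/train_new.xls')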
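
One way to confirm that an export worked is to write the data and read it back. The following is a minimal sketch under stated assumptions: it uses a small synthetic DataFrame and a temporary directory, and it writes CSV rather than `.xls` purely so that the example needs no Excel engine. The file name `train_new.csv` is illustrative only.

```python
import os
import tempfile

import pandas as pd

# Synthetic stand-in for the cleaned training set.
cleaned = pd.DataFrame({
    'price': [350000, 520000, 275000],
    'bathrooms': [2.0, 2.5, 1.0],
})

with tempfile.TemporaryDirectory() as tmp_dir:
    out_path = os.path.join(tmp_dir, 'train_new.csv')
    # index=False keeps the row index out of the file, so no extra column appears on reload.
    cleaned.to_csv(out_path, index=False)
    reloaded = pd.read_csv(out_path)

print(reloaded.shape)
print(list(reloaded.columns))
```

Without `index=False`, the row index is written as an additional unnamed column, which then has to be dropped or passed as `index_col=0` when the file is read again.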

-----