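##### K-nearest-neighbor (KNN) models are non-parametric: there are no parameters to train. In the KNN classification task covered earlier, the algorithm decides using only the labels of the K nearest samples.
##### KNN regression follows the same idea and predicts from the target values of the K nearest samples. There are two ways to combine those values:
##### 1. Take the arithmetic mean of the K neighbors' target values
##### 2. Take a weighted mean that accounts for each neighbor's distance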
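
The two decision rules can be sketched in plain Python. This is only an illustration, not scikit-learn's implementation: the helper `knn_predict`, the toy data, and the choice of Euclidean distance with inverse-distance weights are assumptions made for the example.

```python
import math

def knn_predict(train_x, train_y, query, k=3, weights="uniform"):
    """Predict a target for `query` from its k nearest training points."""
    # Euclidean distance from the query to every training sample.
    dists = [math.dist(x, query) for x in train_x]
    # Indices of the k closest samples.
    nearest = sorted(range(len(train_x)), key=lambda i: dists[i])[:k]

    if weights == "uniform":
        # Rule 1: plain arithmetic mean of the neighbors' targets.
        return sum(train_y[i] for i in nearest) / k

    # Rule 2: inverse-distance weighting, so closer neighbors count more.
    exact = [train_y[i] for i in nearest if dists[i] == 0.0]
    if exact:
        # A zero distance would divide by zero, so use the exact matches only.
        return sum(exact) / len(exact)
    w = [1.0 / dists[i] for i in nearest]
    return sum(wi * train_y[i] for wi, i in zip(w, nearest)) / sum(w)

# Toy one-feature data set (hypothetical values).
train_x = [(1.0,), (2.0,), (4.0,), (8.0,)]
train_y = [10.0, 20.0, 40.0, 80.0]

uniform_pred = knn_predict(train_x, train_y, (2.5,), k=3, weights="uniform")
distance_pred = knn_predict(train_x, train_y, (2.5,), k=3, weights="distance")
print(uniform_pred, distance_pred)
```

With these toy values the nearest neighbor (target 20.0) pulls the distance-weighted prediction closer to itself than the plain mean does.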

###### Predict Boston house prices with two variants of KNN regression

### 1. Load the data

In [1]:
# Note: load_boston was removed in scikit-learn 1.2, so this cell needs an older version
from sklearn.datasets import load_boston
df = load_boston()

### 2. Split into training and test sets

In [2]:
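# sklearn.cross_validation was removed in scikit-learn 0.20; train_test_split lives in model_selection
from sklearn.model_selection import train_test_split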
x_train,x_test,y_train,y_test = train_test_split(df.data,df.target,test_size=0.25,random_state=123)



### 3. Standardize the data

In [3]:
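from sklearn.preprocessing import StandardScaler
# Separate scalers for the features and the target
ss_X = StandardScaler()
ss_Y = StandardScaler()

# Fit on the training set only, then apply the same transformation to the test set
x_train = ss_X.fit_transform(x_train)
x_test  = ss_X.transform(x_test)

# StandardScaler expects 2-D input, so reshape the 1-D target into a column
y_train = ss_Y.fit_transform(y_train.reshape(-1,1))
y_test  = ss_Y.transform(y_test.reshape(-1,1))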
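
Because the targets are standardized as well, the model predicts in standardized units. The sketch below shows the transformation and its inverse in plain Python. It assumes the usual z-score definition with the population standard deviation; the helper names and the sample prices are hypothetical.

```python
from statistics import fmean, pstdev

def fit_scaler(values):
    """Return the (mean, population std) needed for z-score scaling."""
    return fmean(values), pstdev(values)

def transform(values, mean, std):
    # z = (x - mean) / std
    return [(v - mean) / std for v in values]

def inverse_transform(z_values, mean, std):
    # x = z * std + mean
    return [z * std + mean for z in z_values]

# Hypothetical target values.
prices = [10.0, 20.0, 30.0, 40.0]
mean, std = fit_scaler(prices)

scaled = transform(prices, mean, std)
restored = inverse_transform(scaled, mean, std)
print(mean, std)
print(scaled)
print(restored)
```

In scikit-learn, `StandardScaler.inverse_transform` performs the reverse mapping. An MAE measured in standardized units can be converted back to the original units by multiplying by the target's standard deviation, and an MSE by its square.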

### 4. Predict house prices with the KNN regressor

In [5]:
from sklearn.neighbors import KNeighborsRegressor
# Uniform weights: plain average of the neighbors' targets
knr_uniform = KNeighborsRegressor(weights='uniform',n_neighbors=5)
knr_uniform.fit(x_train,y_train)
knr_uniform_predict = knr_uniform.predict(x_test)

# Distance weights: closer neighbors count more
knr_distance = KNeighborsRegressor(weights='distance',n_neighbors=5)
knr_distance.fit(x_train,y_train)
knr_distance_predict = knr_distance.predict(x_test)

### 5. Evaluate performance

In [6]:
from sklearn.metrics import r2_score,mean_absolute_error,mean_squared_error
# The targets were standardized, so MAE and MSE are in standardized units.
# The metric functions take (y_true, y_pred) in that order.
print('KNN with uniform averaging\nR-squared:%s,MAE:%s,MSE:%s'%(knr_uniform.score(x_test,y_test),
                                               mean_absolute_error(y_test,knr_uniform_predict),
                                               mean_squared_error(y_test,knr_uniform_predict)))

print('KNN with distance-weighted averaging\nR-squared:%s,MAE:%s,MSE:%s'%(knr_distance.score(x_test,y_test),
                                               mean_absolute_error(y_test,knr_distance_predict),
                                               mean_squared_error(y_test,knr_distance_predict)))

KNN with uniform averaging
R-squared:0.708147737002,MAE:0.315640563308,MSE:0.267189558961
KNN with distance-weighted averaging
R-squared:0.764442026598,MAE:0.293790676113,MSE:0.215652365948
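
For reference, the three reported metrics can be computed by hand. This sketch applies their standard definitions to made-up numbers; the helper `regression_metrics` is hypothetical and is not part of scikit-learn.

```python
def regression_metrics(y_true, y_pred):
    """Return (R-squared, MAE, MSE) using the standard definitions."""
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]
    mae = sum(abs(e) for e in errors) / n   # mean absolute error
    mse = sum(e * e for e in errors) / n    # mean squared error
    mean_true = sum(y_true) / n
    # Total sum of squares around the mean of the true values.
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    # Residual sum of squares of the predictions.
    ss_res = sum(e * e for e in errors)
    r2 = 1.0 - ss_res / ss_tot
    return r2, mae, mse

# Made-up true values and predictions.
y_true = [1.0, 2.0, 3.0, 4.0]
y_pred = [1.5, 2.0, 2.5, 4.0]
r2, mae, mse = regression_metrics(y_true, y_pred)
print(r2, mae, mse)
```

A higher R-squared and lower MAE and MSE indicate a better fit, which is how the two regressors above are compared.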


### 6. Analysis

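##### On this train/test split, distance-weighted averaging gives the better predictions: R-squared rises from about 0.708 to 0.764, and both MAE and MSE drop. KNN classification and KNN regression are both non-parametric models, so neither has a parameter-training step.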