## k-Nearest Neighbors Algorithm
### Basic principle
The first machine-learning algorithm we’ll look at is k-Nearest Neighbors (kNN). It
works like this: we have an existing set of example data, our training set, and we have
labels for all of it, so we know which class each example belongs to. When we’re given
a new piece of data without a label, we compare it to every example in the training
set. Here, similarity is measured by Euclidean distance: the smaller the distance, the
more similar the two points. We then take the k most similar examples (the nearest
neighbors) and look at their labels; this is where the k comes from. (k is an integer,
usually less than 20.) Lastly, we take a majority vote among those k labels, and the
winning class is the one we assign to the new piece of data.

In [18]:
import numpy as np
import operator

In [89]:
def classify0(inX, dataSet, labels, k):
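    dataSetSize = dataSet.shape[0]
    # Compute the Euclidean distance from inX to every row of dataSet
    diffMat = np.tile(inX, (dataSetSize, 1)) - dataSet
    sqDiffMat = diffMat**2
    sqDistance = sqDiffMat.sum(axis=1)
    Distance = sqDistance**0.5
    # Indices of the training points, ordered from nearest to farthest
    sortedDistance = Distance.argsort()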
    
    # Tally the labels of the k nearest neighbors
    classCount = {}
    for i in range(k):
        voteIlabel = labels[sortedDistance[i]]
        classCount[voteIlabel] = classCount.get(voteIlabel, 0) + 1

    # Sort the (label, count) pairs by count, most votes first
    sortedClassCount = sorted(classCount.items(), key=operator.itemgetter(1), reverse=True)
   
    '''
    print('diffMat=%s' %diffMat)
    print('-'*50)
    print('sqDiffMat=%s' %sqDiffMat)
    print('-'*20)
    print('sqDistance=%s' %sqDistance)
    print('-'*20)
    print('Distance=%s' %Distance)
    print('-'*20)
    print('sortedDistance=%s' %sortedDistance)
    print('-'*20)
    print('sortedClassCount=%s' %sortedClassCount)
    print('-'*20)
    '''   
    return sortedClassCount[0][0]

In [90]:
group = np.array([[1.0, 1.1], [1.0, 1.0], [0, 0], [0, 0.1]])
labels = ['A', 'A', 'B', 'B']

In [91]:
group

array([[ 1. ,  1.1],
       [ 1. ,  1. ],
       [ 0. ,  0. ],
       [ 0. ,  0.1]])

In [92]:
classify0([0,0], group, labels, 3)

'B'
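
To see why this call returns `'B'`, the intermediate values can be traced step by step. The following is a minimal sketch that repeats the distance and voting steps on the same `group` and `labels`. The names `query`, `distances`, `order`, `nearest_labels`, and `votes` are introduced here for illustration only, and `collections.Counter` stands in for the dictionary tally used in `classify0`.

```python
import numpy as np
from collections import Counter

group = np.array([[1.0, 1.1], [1.0, 1.0], [0, 0], [0, 0.1]])
labels = ['A', 'A', 'B', 'B']
query = np.array([0, 0])  # the point being classified (illustrative name)

# Euclidean distance from the query to each training point
distances = np.sqrt(((group - query) ** 2).sum(axis=1))

# Indices of the training points, ordered from nearest to farthest
order = distances.argsort()

# Labels of the 3 nearest neighbors, followed by a majority vote
nearest_labels = [labels[i] for i in order[:3]]
votes = Counter(nearest_labels)
predicted = votes.most_common(1)[0][0]

print(distances)
print(order)
print(nearest_labels)
print(predicted)
```

The two `'B'` points lie at distances 0 and 0.1 from the query, while the closer of the two `'A'` points is at about 1.414. With k = 3 the neighbors are therefore B, B, A, and the vote goes to `'B'` by two to one.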

#### numpy broadcasting
![numpy broadcasting](http://www.scipy-lectures.org/_images/numpy_broadcasting.png)
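
The figure's point can be checked against the distance step in `classify0`: subtracting a 1-D point from a 2-D array broadcasts the point across every row, which gives the same result as the explicit `np.tile` copy. A minimal sketch follows; the names `point`, `tiled`, `tiled_diff`, and `broadcast_diff` are illustrative and do not appear in the original notebook.

```python
import numpy as np

group = np.array([[1.0, 1.1], [1.0, 1.0], [0, 0], [0, 0.1]])
point = np.array([0.5, 0.5])

# Explicit copy: repeat the point once per row, as classify0 does
tiled = np.tile(point, (group.shape[0], 1))
tiled_diff = tiled - group

# Broadcasting: the shape (2,) operand is stretched against shape (4, 2)
broadcast_diff = point - group

print(tiled.shape)                                 # (4, 2)
print(np.array_equal(tiled_diff, broadcast_diff))  # True
```

Under this assumption, the `np.tile` call in `classify0` could likely be replaced by a plain subtraction once `inX` is converted to an array, which avoids building the repeated copy.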

## `np.tile` reference: <http://docs.scipy.org/doc/numpy-1.10.0/reference/generated/numpy.tile.html>
 