# Real-Time Location System Case Study

## Team Names:
David Samuel, Brian West, Kumar Raja

This notebook will explore a Real Time Location System developed to accurately predict device locations indoors, using wifi signal strength.  Our team will examine, format, and clean the data; and then perform a K-Nearest-Neighbors analysis using the signal strength and location data to predict the need for using data from two conflicting wireless access point addresses.

## References

K-Nearest-Neighbor Analysis of Received Signal Strength Distance Estimation Across Environments
Aaron Ault, Xuan Zhong, Edward J. Coyle The Center for Wireless Systems and Applications Purdue University, West Lafayette, Indiana 47907 Email: {ault,zhongx,coyle}@ecn.purdue.edu

https://s3-us-west-2.amazonaws.com/smu-mds/prod/Quantifying+the+World/Course+Materials/WiNMee_Ault.pdf


LOCATION ESTIMATION IN WIRELESS NETWORKS: A BAYESIAN APPROACH
David Madigan1,2, Wen-Hua Ju2, P. Krishnan2, A. S. Krishnakumar2 and Ivan Zorych1
1Rutgers University and 2Avaya Labs Research

https://s3-us-west-2.amazonaws.com/smu-mds/prod/Quantifying+the+World/Course+Materials/A16n210.pdf


Weighted Least Squares Techniques for Improved Received Signal Strength Based Localization
Paula Tarr ́ıo ⋆, Ana M. Bernardos and Jose ́ R. Casar
Data Processing and Simulation Group, Universidad Polite ́cnica de Madrid, ETSI. Telecomunicacio ́n, Avda. Complutense 30, 28040 Madrid, Spain; E-Mails: abernardos@grpss.ssr.upm.es (A.M.B.); jramon@grpss.ssr.upm.es (J.R.C.)

https://s3-us-west-2.amazonaws.com/smu-mds/prod/Quantifying+the+World/Course+Materials/sensors-11-08569.pdf


<a id="top"></a>
## Table of Contents
________________________________________________________________________________________________________


# Skip Ahead to Pickled DataFrame

* <a href="#pickled_data">Skip to formatted Dataset</a>

In [None]:
import pandas as pd

series = pd.read_table("http://rdatasciencecases.org/Data/offline.final.trace.txt", squeeze=True, 
                       engine='python', skiprows=2, skipfooter=1)
series.tail()

In [None]:
df = pd.DataFrame([x.split(';') for x in series.values])
df

## Name Columns

In [None]:
var_names = [x.split("=")[0] for x in df.iloc[1][:4].values]

var_names

In [None]:
mac_names = ['MAC_'+str(x) for x in range(len(df.iloc[1][4:].values))]

mac_names

In [None]:
col_names = var_names + mac_names

col_names

In [None]:
cols = dict(list(zip(df.columns.values, col_names)))

cols

In [None]:
df.rename(columns=cols, inplace=True)
df.head()

## Set values

### Time

In [None]:
df['t'] = df.t.str.replace("t=", "")

df.tail()

## Position

In [None]:
df['pos'] = df.pos.str.replace("pos=", "")

df.tail()

In [None]:
d = {}
for i, x in enumerate(df.pos.values):
    if x != None:
        temp = x.split(",")
        d[i] = [float(y) for y in temp]
    else:
        d[i] = None

pos = pd.Series(d)

df['pos'] = pos

df.pos

In [None]:
df.head()

## Degree

In [None]:
df['degree'] = df.degree.str.replace("degree=", "")

df.tail()

In [None]:
df['degree'] = pd.to_numeric(df.degree, errors='coerce', downcast='float')

df.head()

## id

In [None]:
df['id'] = df.id.str.replace("id=", "")

df.head()

# MAC Addresses

In [None]:
l = []
for col in df.iloc[:,4:].columns:
    d = {}
    colString = '{}'.format(col)
    for i, x in enumerate(df[colString].values):
        if x != None:
            temp = x.split("=")
            macId = temp[0]
            coords = [int(y) for y in temp[1].split(",")]
            d[i] = {macId: [coords[:2], coords[-1]]}
            #import pdb; pdb.set_trace()
        else:
            d[i] = None
    
    df[colString] = pd.Series(d)
    
df.head()

# Create new formatted file as a checkpoint

<a id='pickled_data'></a>
<a href="#top">Back to Top</a>

In [None]:
df.to_pickle("mac.pkl")

In [1]:
import pandas as pd

df = pd.read_pickle("mac.pkl")
df.head()

Unnamed: 0,t,id,pos,degree,MAC_0,MAC_1,MAC_2,MAC_3,MAC_4,MAC_5,...,MAC_11,MAC_12,MAC_13,MAC_14,MAC_15,MAC_16,MAC_17,MAC_18,MAC_19,MAC_20
0,1139643118358,00:02:2D:21:0F:33,"[0.0, 0.0, 0.0]",0.0,"{'00:14:bf:b1:97:8a': [[-38, 2437000000], 3]}","{'00:14:bf:b1:97:90': [[-56, 2427000000], 3]}","{'00:0f:a3:39:e1:c0': [[-53, 2462000000], 3]}","{'00:14:bf:b1:97:8d': [[-65, 2442000000], 3]}","{'00:14:bf:b1:97:81': [[-65, 2422000000], 3]}","{'00:14:bf:3b:c7:c6': [[-66, 2432000000], 3]}",...,,,,,,,,,,
1,1139643118744,00:02:2D:21:0F:33,"[0.0, 0.0, 0.0]",0.0,"{'00:14:bf:b1:97:8a': [[-38, 2437000000], 3]}","{'00:0f:a3:39:e1:c0': [[-54, 2462000000], 3]}","{'00:14:bf:b1:97:90': [[-56, 2427000000], 3]}","{'00:14:bf:3b:c7:c6': [[-67, 2432000000], 3]}","{'00:14:bf:b1:97:81': [[-66, 2422000000], 3]}","{'00:14:bf:b1:97:8d': [[-70, 2442000000], 3]}",...,,,,,,,,,,
2,1139643119002,00:02:2D:21:0F:33,"[0.0, 0.0, 0.0]",0.0,"{'00:14:bf:b1:97:8a': [[-38, 2437000000], 3]}","{'00:0f:a3:39:e1:c0': [[-54, 2462000000], 3]}","{'00:14:bf:b1:97:90': [[-57, 2427000000], 3]}","{'00:14:bf:b1:97:81': [[-66, 2422000000], 3]}","{'00:14:bf:3b:c7:c6': [[-69, 2432000000], 3]}","{'00:14:bf:b1:97:8d': [[-70, 2442000000], 3]}",...,,,,,,,,,,
3,1139643119263,00:02:2D:21:0F:33,"[0.0, 0.0, 0.0]",0.0,"{'00:14:bf:b1:97:8a': [[-38, 2437000000], 3]}","{'00:14:bf:b1:97:90': [[-52, 2427000000], 3]}","{'00:0f:a3:39:e1:c0': [[-54, 2462000000], 3]}","{'00:14:bf:b1:97:81': [[-64, 2422000000], 3]}","{'00:14:bf:3b:c7:c6': [[-68, 2432000000], 3]}","{'00:14:bf:b1:97:8d': [[-74, 2442000000], 3]}",...,,,,,,,,,,
4,1139643119538,00:02:2D:21:0F:33,"[0.0, 0.0, 0.0]",0.0,"{'00:14:bf:b1:97:8a': [[-46, 2437000000], 3]}","{'00:0f:a3:39:e1:c0': [[-55, 2462000000], 3]}","{'00:14:bf:b1:97:90': [[-57, 2427000000], 3]}","{'00:14:bf:3b:c7:c6': [[-67, 2432000000], 3]}","{'00:0f:a3:39:dd:cd': [[-66, 2412000000], 3]}","{'00:0f:a3:39:e0:4b': [[-80, 2462000000], 3]}",...,,,,,,,,,,


# Clean Data

In [2]:
df.count()

t         151388
id        146080
pos       146080
degree    146080
MAC_0     146074
MAC_1     146041
MAC_2     146030
MAC_3     145965
MAC_4     145308
MAC_5     141435
MAC_6     127802
MAC_7      97147
MAC_8      54627
MAC_9      21489
MAC_10      6926
MAC_11      1860
MAC_12       566
MAC_13       218
MAC_14        85
MAC_15        36
MAC_16        13
MAC_17         3
MAC_18         1
MAC_19         1
MAC_20         1
dtype: int64

In [3]:
new = df.fillna(method='pad')

new.isnull().all()

t         False
id        False
pos       False
degree    False
MAC_0     False
MAC_1     False
MAC_2     False
MAC_3     False
MAC_4     False
MAC_5     False
MAC_6     False
MAC_7     False
MAC_8     False
MAC_9     False
MAC_10    False
MAC_11    False
MAC_12    False
MAC_13    False
MAC_14    False
MAC_15    False
MAC_16    False
MAC_17    False
MAC_18    False
MAC_19    False
MAC_20    False
dtype: bool

In [4]:
new.isnull().any()

t         False
id        False
pos       False
degree    False
MAC_0     False
MAC_1     False
MAC_2     False
MAC_3     False
MAC_4     False
MAC_5     False
MAC_6     False
MAC_7     False
MAC_8     False
MAC_9     False
MAC_10    False
MAC_11     True
MAC_12     True
MAC_13     True
MAC_14     True
MAC_15     True
MAC_16     True
MAC_17     True
MAC_18     True
MAC_19     True
MAC_20     True
dtype: bool

In [5]:
df10 = new.iloc[:,:15]
df10.isnull().any()

t         False
id        False
pos       False
degree    False
MAC_0     False
MAC_1     False
MAC_2     False
MAC_3     False
MAC_4     False
MAC_5     False
MAC_6     False
MAC_7     False
MAC_8     False
MAC_9     False
MAC_10    False
dtype: bool

## data imputed to MAC_10, and the rest was dropped due to sparseness throughout the set

# Alternate dataset with dropped nulls below

# Shape of id, pos, degree are equal

In [6]:
df5 = df[df.MAC_5.notnull()==True]
df5.MAC_5.isnull().any()

False

## Use 6 access points "MAC_5" and ommit all rows with nulls for "MAC_5"

In [7]:
df5.count()

t         141435
id        141435
pos       141435
degree    141435
MAC_0     141435
MAC_1     141435
MAC_2     141435
MAC_3     141435
MAC_4     141435
MAC_5     141435
MAC_6     127802
MAC_7      97147
MAC_8      54627
MAC_9      21489
MAC_10      6926
MAC_11      1860
MAC_12       566
MAC_13       218
MAC_14        85
MAC_15        36
MAC_16        13
MAC_17         3
MAC_18         1
MAC_19         1
MAC_20         1
dtype: int64

## Set new data frame as columns to MAC_9, ommitting 11-20

In [8]:
df5 = df5.iloc[:,:15]

df5.head()

Unnamed: 0,t,id,pos,degree,MAC_0,MAC_1,MAC_2,MAC_3,MAC_4,MAC_5,MAC_6,MAC_7,MAC_8,MAC_9,MAC_10
0,1139643118358,00:02:2D:21:0F:33,"[0.0, 0.0, 0.0]",0.0,"{'00:14:bf:b1:97:8a': [[-38, 2437000000], 3]}","{'00:14:bf:b1:97:90': [[-56, 2427000000], 3]}","{'00:0f:a3:39:e1:c0': [[-53, 2462000000], 3]}","{'00:14:bf:b1:97:8d': [[-65, 2442000000], 3]}","{'00:14:bf:b1:97:81': [[-65, 2422000000], 3]}","{'00:14:bf:3b:c7:c6': [[-66, 2432000000], 3]}","{'00:0f:a3:39:dd:cd': [[-75, 2412000000], 3]}","{'00:0f:a3:39:e0:4b': [[-78, 2462000000], 3]}","{'00:0f:a3:39:e2:10': [[-87, 2437000000], 3]}","{'02:64:fb:68:52:e6': [[-88, 2447000000], 1]}","{'02:00:42:55:31:00': [[-84, 2457000000], 1]}"
1,1139643118744,00:02:2D:21:0F:33,"[0.0, 0.0, 0.0]",0.0,"{'00:14:bf:b1:97:8a': [[-38, 2437000000], 3]}","{'00:0f:a3:39:e1:c0': [[-54, 2462000000], 3]}","{'00:14:bf:b1:97:90': [[-56, 2427000000], 3]}","{'00:14:bf:3b:c7:c6': [[-67, 2432000000], 3]}","{'00:14:bf:b1:97:81': [[-66, 2422000000], 3]}","{'00:14:bf:b1:97:8d': [[-70, 2442000000], 3]}","{'00:0f:a3:39:e0:4b': [[-79, 2462000000], 3]}","{'00:0f:a3:39:dd:cd': [[-73, 2412000000], 3]}","{'00:0f:a3:39:e2:10': [[-83, 2437000000], 3]}","{'02:00:42:55:31:00': [[-85, 2457000000], 1]}",
2,1139643119002,00:02:2D:21:0F:33,"[0.0, 0.0, 0.0]",0.0,"{'00:14:bf:b1:97:8a': [[-38, 2437000000], 3]}","{'00:0f:a3:39:e1:c0': [[-54, 2462000000], 3]}","{'00:14:bf:b1:97:90': [[-57, 2427000000], 3]}","{'00:14:bf:b1:97:81': [[-66, 2422000000], 3]}","{'00:14:bf:3b:c7:c6': [[-69, 2432000000], 3]}","{'00:14:bf:b1:97:8d': [[-70, 2442000000], 3]}","{'00:0f:a3:39:e0:4b': [[-78, 2462000000], 3]}","{'00:0f:a3:39:e2:10': [[-83, 2437000000], 3]}","{'00:0f:a3:39:dd:cd': [[-65, 2412000000], 3]}","{'02:64:fb:68:52:e6': [[-90, 2447000000], 1]}",
3,1139643119263,00:02:2D:21:0F:33,"[0.0, 0.0, 0.0]",0.0,"{'00:14:bf:b1:97:8a': [[-38, 2437000000], 3]}","{'00:14:bf:b1:97:90': [[-52, 2427000000], 3]}","{'00:0f:a3:39:e1:c0': [[-54, 2462000000], 3]}","{'00:14:bf:b1:97:81': [[-64, 2422000000], 3]}","{'00:14:bf:3b:c7:c6': [[-68, 2432000000], 3]}","{'00:14:bf:b1:97:8d': [[-74, 2442000000], 3]}","{'00:0f:a3:39:dd:cd': [[-78, 2412000000], 3]}","{'00:0f:a3:39:e0:4b': [[-78, 2462000000], 3]}","{'00:0f:a3:39:e2:10': [[-83, 2437000000], 3]}","{'02:00:42:55:31:00': [[-84, 2457000000], 1]}","{'02:64:fb:68:52:e6': [[-87, 2447000000], 1]}"
4,1139643119538,00:02:2D:21:0F:33,"[0.0, 0.0, 0.0]",0.0,"{'00:14:bf:b1:97:8a': [[-46, 2437000000], 3]}","{'00:0f:a3:39:e1:c0': [[-55, 2462000000], 3]}","{'00:14:bf:b1:97:90': [[-57, 2427000000], 3]}","{'00:14:bf:3b:c7:c6': [[-67, 2432000000], 3]}","{'00:0f:a3:39:dd:cd': [[-66, 2412000000], 3]}","{'00:0f:a3:39:e0:4b': [[-80, 2462000000], 3]}","{'00:0f:a3:39:e2:10': [[-83, 2437000000], 3]}","{'00:14:bf:b1:97:81': [[-66, 2422000000], 3]}","{'02:00:42:55:31:00': [[-87, 2457000000], 1]}",,


In [9]:
df5.fillna(method='pad')

df5.isnull().any()

t         False
id        False
pos       False
degree    False
MAC_0     False
MAC_1     False
MAC_2     False
MAC_3     False
MAC_4     False
MAC_5     False
MAC_6      True
MAC_7      True
MAC_8      True
MAC_9      True
MAC_10     True
dtype: bool

# Focus on not null data

In [10]:
df5_10 = df5.iloc[:,:10]

In [11]:
df5_10.isnull().any()

t         False
id        False
pos       False
degree    False
MAC_0     False
MAC_1     False
MAC_2     False
MAC_3     False
MAC_4     False
MAC_5     False
dtype: bool

# Now try running KNN Unsupervised

In [12]:
df5_10.head()

Unnamed: 0,t,id,pos,degree,MAC_0,MAC_1,MAC_2,MAC_3,MAC_4,MAC_5
0,1139643118358,00:02:2D:21:0F:33,"[0.0, 0.0, 0.0]",0.0,"{'00:14:bf:b1:97:8a': [[-38, 2437000000], 3]}","{'00:14:bf:b1:97:90': [[-56, 2427000000], 3]}","{'00:0f:a3:39:e1:c0': [[-53, 2462000000], 3]}","{'00:14:bf:b1:97:8d': [[-65, 2442000000], 3]}","{'00:14:bf:b1:97:81': [[-65, 2422000000], 3]}","{'00:14:bf:3b:c7:c6': [[-66, 2432000000], 3]}"
1,1139643118744,00:02:2D:21:0F:33,"[0.0, 0.0, 0.0]",0.0,"{'00:14:bf:b1:97:8a': [[-38, 2437000000], 3]}","{'00:0f:a3:39:e1:c0': [[-54, 2462000000], 3]}","{'00:14:bf:b1:97:90': [[-56, 2427000000], 3]}","{'00:14:bf:3b:c7:c6': [[-67, 2432000000], 3]}","{'00:14:bf:b1:97:81': [[-66, 2422000000], 3]}","{'00:14:bf:b1:97:8d': [[-70, 2442000000], 3]}"
2,1139643119002,00:02:2D:21:0F:33,"[0.0, 0.0, 0.0]",0.0,"{'00:14:bf:b1:97:8a': [[-38, 2437000000], 3]}","{'00:0f:a3:39:e1:c0': [[-54, 2462000000], 3]}","{'00:14:bf:b1:97:90': [[-57, 2427000000], 3]}","{'00:14:bf:b1:97:81': [[-66, 2422000000], 3]}","{'00:14:bf:3b:c7:c6': [[-69, 2432000000], 3]}","{'00:14:bf:b1:97:8d': [[-70, 2442000000], 3]}"
3,1139643119263,00:02:2D:21:0F:33,"[0.0, 0.0, 0.0]",0.0,"{'00:14:bf:b1:97:8a': [[-38, 2437000000], 3]}","{'00:14:bf:b1:97:90': [[-52, 2427000000], 3]}","{'00:0f:a3:39:e1:c0': [[-54, 2462000000], 3]}","{'00:14:bf:b1:97:81': [[-64, 2422000000], 3]}","{'00:14:bf:3b:c7:c6': [[-68, 2432000000], 3]}","{'00:14:bf:b1:97:8d': [[-74, 2442000000], 3]}"
4,1139643119538,00:02:2D:21:0F:33,"[0.0, 0.0, 0.0]",0.0,"{'00:14:bf:b1:97:8a': [[-46, 2437000000], 3]}","{'00:0f:a3:39:e1:c0': [[-55, 2462000000], 3]}","{'00:14:bf:b1:97:90': [[-57, 2427000000], 3]}","{'00:14:bf:3b:c7:c6': [[-67, 2432000000], 3]}","{'00:0f:a3:39:dd:cd': [[-66, 2412000000], 3]}","{'00:0f:a3:39:e0:4b': [[-80, 2462000000], 3]}"


# Create labeled Training Dataset 

In [13]:
train = df5_10.iloc[:,4:]
train.head()

Unnamed: 0,MAC_0,MAC_1,MAC_2,MAC_3,MAC_4,MAC_5
0,"{'00:14:bf:b1:97:8a': [[-38, 2437000000], 3]}","{'00:14:bf:b1:97:90': [[-56, 2427000000], 3]}","{'00:0f:a3:39:e1:c0': [[-53, 2462000000], 3]}","{'00:14:bf:b1:97:8d': [[-65, 2442000000], 3]}","{'00:14:bf:b1:97:81': [[-65, 2422000000], 3]}","{'00:14:bf:3b:c7:c6': [[-66, 2432000000], 3]}"
1,"{'00:14:bf:b1:97:8a': [[-38, 2437000000], 3]}","{'00:0f:a3:39:e1:c0': [[-54, 2462000000], 3]}","{'00:14:bf:b1:97:90': [[-56, 2427000000], 3]}","{'00:14:bf:3b:c7:c6': [[-67, 2432000000], 3]}","{'00:14:bf:b1:97:81': [[-66, 2422000000], 3]}","{'00:14:bf:b1:97:8d': [[-70, 2442000000], 3]}"
2,"{'00:14:bf:b1:97:8a': [[-38, 2437000000], 3]}","{'00:0f:a3:39:e1:c0': [[-54, 2462000000], 3]}","{'00:14:bf:b1:97:90': [[-57, 2427000000], 3]}","{'00:14:bf:b1:97:81': [[-66, 2422000000], 3]}","{'00:14:bf:3b:c7:c6': [[-69, 2432000000], 3]}","{'00:14:bf:b1:97:8d': [[-70, 2442000000], 3]}"
3,"{'00:14:bf:b1:97:8a': [[-38, 2437000000], 3]}","{'00:14:bf:b1:97:90': [[-52, 2427000000], 3]}","{'00:0f:a3:39:e1:c0': [[-54, 2462000000], 3]}","{'00:14:bf:b1:97:81': [[-64, 2422000000], 3]}","{'00:14:bf:3b:c7:c6': [[-68, 2432000000], 3]}","{'00:14:bf:b1:97:8d': [[-74, 2442000000], 3]}"
4,"{'00:14:bf:b1:97:8a': [[-46, 2437000000], 3]}","{'00:0f:a3:39:e1:c0': [[-55, 2462000000], 3]}","{'00:14:bf:b1:97:90': [[-57, 2427000000], 3]}","{'00:14:bf:3b:c7:c6': [[-67, 2432000000], 3]}","{'00:0f:a3:39:dd:cd': [[-66, 2412000000], 3]}","{'00:0f:a3:39:e0:4b': [[-80, 2462000000], 3]}"


# Create Position Dataset

In [14]:
dataset = df5_10.iloc[:,:4]
dataset.head()

Unnamed: 0,t,id,pos,degree
0,1139643118358,00:02:2D:21:0F:33,"[0.0, 0.0, 0.0]",0.0
1,1139643118744,00:02:2D:21:0F:33,"[0.0, 0.0, 0.0]",0.0
2,1139643119002,00:02:2D:21:0F:33,"[0.0, 0.0, 0.0]",0.0
3,1139643119263,00:02:2D:21:0F:33,"[0.0, 0.0, 0.0]",0.0
4,1139643119538,00:02:2D:21:0F:33,"[0.0, 0.0, 0.0]",0.0


# Create Labels for corresponding Signal Data Matrix

In [15]:
labels = {}
data = {}

strength = {}
freq = {}
mode = {}

for i, x in enumerate(train.values):
    
    labels[i] = [list(y.keys()) for y in x][0][0]
    data[i] = [list(y.values()) for y in x][0][0][0][0], [list(y.values()) for y in x][0][0][0][1], [list(y.values()) for y in x][0][0][-1]

labels = pd.Series(labels)
data = pd.Series(data)

In [16]:
labels.shape, data.shape

((141435,), (141435,))

In [17]:
data.head(), labels.head()

(0    (-38, 2437000000, 3)
 1    (-38, 2437000000, 3)
 2    (-38, 2437000000, 3)
 3    (-38, 2437000000, 3)
 4    (-46, 2437000000, 3)
 dtype: object, 0    00:14:bf:b1:97:8a
 1    00:14:bf:b1:97:8a
 2    00:14:bf:b1:97:8a
 3    00:14:bf:b1:97:8a
 4    00:14:bf:b1:97:8a
 dtype: object)

## We now have two 1-dimensional arrays, of type string, and type integer triple for the MAC address, and corresonding signal strength (Ordinal), Frequency (Numeric), and Mode (Binary) 

In [18]:
dataset.shape, data.shape

((141435, 4), (141435,))

## make position data 1-dimensional

In [22]:
position = {}

for i, x in enumerate(dataset.values):
    #import pdb; pdb.set_trace()
    position[i] = [y for y in x][2:]

position = pd.Series(position)
position.head()

0    [[0.0, 0.0, 0.0], 0.0]
1    [[0.0, 0.0, 0.0], 0.0]
2    [[0.0, 0.0, 0.0], 0.0]
3    [[0.0, 0.0, 0.0], 0.0]
4    [[0.0, 0.0, 0.0], 0.0]
dtype: object

In [39]:
position.shape, data.shape

((141435,), (141435,))

## Join Datasets

In [88]:
new = {}

for i, x in enumerate(position.values):
    new[i] = (position[0][0][0], position[0][0][1], position[0][0][2]), position[0][1], data[0], labels[0]
    
new = pd.DataFrame(new).T
new.head()

Unnamed: 0,0,1,2,3
0,"(0.0, 0.0, 0.0)",0,"(-38, 2437000000, 3)",00:14:bf:b1:97:8a
1,"(0.0, 0.0, 0.0)",0,"(-38, 2437000000, 3)",00:14:bf:b1:97:8a
2,"(0.0, 0.0, 0.0)",0,"(-38, 2437000000, 3)",00:14:bf:b1:97:8a
3,"(0.0, 0.0, 0.0)",0,"(-38, 2437000000, 3)",00:14:bf:b1:97:8a
4,"(0.0, 0.0, 0.0)",0,"(-38, 2437000000, 3)",00:14:bf:b1:97:8a
5,"(0.0, 0.0, 0.0)",0,"(-38, 2437000000, 3)",00:14:bf:b1:97:8a
6,"(0.0, 0.0, 0.0)",0,"(-38, 2437000000, 3)",00:14:bf:b1:97:8a
7,"(0.0, 0.0, 0.0)",0,"(-38, 2437000000, 3)",00:14:bf:b1:97:8a
8,"(0.0, 0.0, 0.0)",0,"(-38, 2437000000, 3)",00:14:bf:b1:97:8a
9,"(0.0, 0.0, 0.0)",0,"(-38, 2437000000, 3)",00:14:bf:b1:97:8a


In [89]:
train = new.iloc[:,:-1]
target

Unnamed: 0,0,1,2
0,"(0.0, 0.0, 0.0)",0,"(-38, 2437000000, 3)"
1,"(0.0, 0.0, 0.0)",0,"(-38, 2437000000, 3)"
2,"(0.0, 0.0, 0.0)",0,"(-38, 2437000000, 3)"
3,"(0.0, 0.0, 0.0)",0,"(-38, 2437000000, 3)"
4,"(0.0, 0.0, 0.0)",0,"(-38, 2437000000, 3)"
5,"(0.0, 0.0, 0.0)",0,"(-38, 2437000000, 3)"
6,"(0.0, 0.0, 0.0)",0,"(-38, 2437000000, 3)"
7,"(0.0, 0.0, 0.0)",0,"(-38, 2437000000, 3)"
8,"(0.0, 0.0, 0.0)",0,"(-38, 2437000000, 3)"
9,"(0.0, 0.0, 0.0)",0,"(-38, 2437000000, 3)"


## Set train and Target

In [103]:
train = new.iloc[:,:3].values
target = new.iloc[:,3:].values.ravel()

(array([[(0.0, 0.0, 0.0), 0.0, (-38, 2437000000, 3)],
        [(0.0, 0.0, 0.0), 0.0, (-38, 2437000000, 3)],
        [(0.0, 0.0, 0.0), 0.0, (-38, 2437000000, 3)],
        ..., 
        [(0.0, 0.0, 0.0), 0.0, (-38, 2437000000, 3)],
        [(0.0, 0.0, 0.0), 0.0, (-38, 2437000000, 3)],
        [(0.0, 0.0, 0.0), 0.0, (-38, 2437000000, 3)]], dtype=object),
 array(['00:14:bf:b1:97:8a', '00:14:bf:b1:97:8a', '00:14:bf:b1:97:8a', ...,
        '00:14:bf:b1:97:8a', '00:14:bf:b1:97:8a', '00:14:bf:b1:97:8a'], dtype=object))

In [106]:
train.shape, target.shape

((141435, 3), (141435,))

# Begin SKLearn's KNN Classification Approach

In [109]:
from sklearn.neighbors import KNeighborsClassifier

X, y = train.tolist(), target.tolist()

neigh = KNeighborsClassifier(n_neighbors=5, algorithm='ball_tree', p=2, n_jobs=-1)
neigh.fit(X,y)

ValueError: setting an array element with a sequence.

# the two addresses in question are 00:0f:a3:39:e1:c0 and 00:0f:a3:39:dd:cd

In [91]:
print("There are", len(labels.unique()), "unique addresses:\n", labels.unique())

There are 7 unique addresses:
 ['00:14:bf:b1:97:8a' '00:14:bf:b1:97:90' '00:0f:a3:39:e1:c0'
 '00:14:bf:b1:97:81' '00:14:bf:3b:c7:c6' '00:14:bf:b1:97:8d'
 '00:0f:a3:39:dd:cd']


## So now we predict position with mixed data from these addresses.