# Indoor localization

An indoor positioning system (IPS) locates objects or people inside a building using radio waves, magnetic fields, acoustic signals, or other sensory information collected by mobile devices. Several commercial systems are on the market, but there is no standard for IPS.

IPSes rely on a range of techniques, including distance measurement to nearby anchor nodes (nodes with known positions, e.g., WiFi access points), magnetic positioning, and dead reckoning. They either actively locate mobile devices and tags or provide ambient location or environmental context that devices can sense.
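
To make the anchor-node idea concrete, here is a minimal sketch of 2D trilateration by linear least squares. It is an illustration only and is not the method used in this notebook, which classifies rooms directly from raw signal values. The function name `trilaterate`, the anchor coordinates, and the test position are all hypothetical.

```python
import numpy as np

def trilaterate(anchors, distances):
    """Least-squares 2D position from >= 3 anchor positions and measured distances."""
    anchors = np.asarray(anchors, dtype=float)
    d = np.asarray(distances, dtype=float)
    x0, y0 = anchors[0]
    # Subtracting the first circle equation from the others removes the
    # quadratic terms and leaves a linear system A @ [x, y] = b.
    A = 2.0 * (anchors[1:] - anchors[0])
    b = (d[0] ** 2 - d[1:] ** 2
         + np.sum(anchors[1:] ** 2, axis=1) - (x0 ** 2 + y0 ** 2))
    position, *_ = np.linalg.lstsq(A, b, rcond=None)
    return position

# Hypothetical anchors (e.g. access points) and a made-up true position.
anchors = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0)]
true_position = np.array([3.0, 4.0])
distances = [np.linalg.norm(true_position - np.array(a)) for a in anchors]

estimate = trilaterate(anchors, distances)
print(estimate)
```

With noise-free distances the estimate coincides with the true position. Real distance estimates derived from radio signals are noisy, which is one reason a learned classifier over raw signal values can be attractive.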

According to a MarketsandMarkets [report](https://www.marketsandmarkets.com/Market-Reports/indoor-positioning-navigation-ipin-market-989.html), the global indoor location market is expected to grow from USD 7.11 billion in 2017 to USD 40.99 billion by 2022, at a compound annual growth rate (CAGR) of 42.0% during the forecast period. Hassle-free navigation, improved decision-making, and increased adoption of connected devices are driving this growth across the globe.
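
As a quick sanity check, the growth rate implied by the two quoted market sizes over the five years from 2017 to 2022 can be recomputed. The helper name `cagr` is hypothetical and the snippet only reproduces the arithmetic.

```python
def cagr(start_value, end_value, years):
    """Compound annual growth rate between two values over `years` periods."""
    return (end_value / start_value) ** (1.0 / years) - 1.0

# Figures quoted above: USD 7.11 billion (2017) to USD 40.99 billion (2022).
rate = cagr(7.11, 40.99, 2022 - 2017)
print("Implied CAGR: %.1f%%" % (rate * 100.0))
```

The result comes out at roughly 42%, consistent with the rate stated in the report.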

In this problem, you will use signals from seven different Wi-Fi access points to determine which room the user is in.

In [1]:
import pandas
import numpy as np
from xgboost import XGBClassifier
from sklearn.grid_search import GridSearchCV
from sklearn.metrics import accuracy_score
import warnings
warnings.filterwarnings('ignore')



Load the training and cross-validation sets, then split each into signal features and room labels.

In [2]:
train_set = pandas.read_csv('train_set.csv')
cv_set = pandas.read_csv('cv_set.csv')

train_data = train_set[['wifi'+str(i) for i in range(1, len(train_set.columns) - 1)]]
train_labels = train_set['room']
cv_data = cv_set[['wifi'+str(i) for i in range(1, len(cv_set.columns) - 1)]]
cv_labels = cv_set['room']
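
The cell above builds the feature names from the number of columns, which ties it to the exact layout of the CSV files. A possible alternative, sketched here on a made-up frame, is to select feature columns by their name prefix. The helper `split_features_labels` and the toy values are hypothetical.

```python
import pandas

def split_features_labels(frame, label_column='room', feature_prefix='wifi'):
    """Return (features, labels), picking feature columns by name prefix."""
    feature_columns = [c for c in frame.columns if str(c).startswith(feature_prefix)]
    return frame[feature_columns], frame[label_column]

# Toy frame standing in for train_set.csv (values are made up).
toy = pandas.DataFrame({
    'wifi1': [-68, -63], 'wifi2': [-57, -60], 'wifi3': [-61, -60],
    'room': [1, 1],
})
features, labels = split_features_labels(toy)
print(list(features.columns))
```

This keeps working if an index column is added or removed, since only columns whose names start with `wifi` are treated as features.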

In [3]:
print(train_data[:10])
print(train_labels[:10])

   wifi1  wifi2  wifi3  wifi4  wifi5  wifi6  wifi7
0    -68    -57    -61    -65    -71    -85    -85
1    -63    -60    -60    -67    -76    -85    -84
2    -61    -60    -68    -62    -77    -90    -80
3    -65    -61    -65    -67    -69    -87    -84
4    -61    -63    -58    -66    -74    -87    -82
5    -62    -60    -66    -68    -80    -86    -91
6    -65    -59    -61    -67    -72    -86    -81
7    -63    -57    -61    -65    -73    -84    -84
8    -66    -60    -65    -62    -70    -85    -83
9    -67    -60    -59    -61    -71    -86    -91
0    1
1    1
2    1
3    1
4    1
5    1
6    1
7    1
8    1
9    1
Name: room, dtype: int64


In [4]:
print(cv_data[:10])
print(cv_labels[:10])

   wifi1  wifi2  wifi3  wifi4  wifi5  wifi6  wifi7
0    -64    -56    -61    -66    -71    -82    -81
1    -63    -65    -60    -63    -77    -81    -87
2    -64    -55    -63    -66    -76    -88    -83
3    -65    -60    -59    -63    -76    -86    -82
4    -67    -61    -62    -67    -77    -83    -91
5    -61    -59    -65    -63    -74    -89    -87
6    -63    -56    -63    -65    -72    -82    -89
7    -66    -59    -64    -68    -68    -97    -83
8    -67    -57    -64    -71    -75    -89    -87
9    -63    -57    -59    -67    -71    -82    -93
0    1
1    1
2    1
3    1
4    1
5    1
6    1
7    1
8    1
9    1
Name: room, dtype: int64


### Training XGBoost classifier

In [5]:
# Baseline model with default hyperparameters.
clf = XGBClassifier(objective='multi:softmax', num_classes=4)
model = clf.fit(train_data, train_labels)
cv_predicted_labels = model.predict(cv_data)

# Fraction of cross-validation samples assigned to the correct room.
accuracy = accuracy_score(cv_labels, cv_predicted_labels)
print(model)
print("Accuracy: %.2f%%" % (accuracy * 100.0))

XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1,
       colsample_bytree=1, gamma=0, learning_rate=0.1, max_delta_step=0,
       max_depth=3, min_child_weight=1, missing=None, n_estimators=100,
       n_jobs=1, nthread=None, num_classes=4, objective='multi:softprob',
       random_state=0, reg_alpha=0, reg_lambda=1, scale_pos_weight=1,
       seed=None, silent=True, subsample=1)
Accuracy: 98.24%
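
A single accuracy figure does not show which rooms are confused with each other. One way to look closer, sketched here with made-up labels rather than the notebook's data, is a confusion matrix and the per-room recall derived from it. In the notebook the inputs would be `cv_labels` and the model's predictions on `cv_data`.

```python
import numpy as np
from sklearn.metrics import confusion_matrix

# Made-up labels for illustration only.
true_rooms = [1, 1, 2, 2, 3, 3, 4, 4]
predicted_rooms = [1, 1, 2, 3, 3, 3, 4, 1]

rooms = [1, 2, 3, 4]
matrix = confusion_matrix(true_rooms, predicted_rooms, labels=rooms)

# Row i counts samples whose true room is rooms[i]; the diagonal holds the hits.
per_room_recall = matrix.diagonal() / matrix.sum(axis=1)
for room, recall in zip(rooms, per_room_recall):
    print("room %d: %.2f" % (room, recall))
```

Off-diagonal entries point at pairs of rooms, possibly adjacent ones, whose signal patterns overlap.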


### Tuning hyperparameters

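Here I tune the hyperparameters with a staged grid search: first `max_depth` and `min_child_weight`, then `gamma`, then `learning_rate` and `subsample`, and finally `reg_alpha`. Each stage keeps the values fixed in the earlier ones.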
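
The cells in this section search the parameters in four separate stages instead of one joint grid. The sketch below counts the candidate settings in each case, using the grids exactly as written in the cells that follow (including the repeated `0.01` in the `reg_alpha` list). The helper `grid_size` is hypothetical.

```python
from itertools import product

# Grids as they appear in the tuning cells of this section.
stages = [
    {'max_depth': range(3, 8), 'min_child_weight': range(3, 8)},
    {'gamma': [i / 10.0 for i in range(0, 5)]},
    {'learning_rate': [0.1, 0.01, 0.001], 'subsample': [0.7, 0.8, 0.9, 1.0]},
    {'reg_alpha': [0, 1e-5, 1e-2, 0.1, 0.01, 1, 100]},
]

def grid_size(grid):
    """Number of parameter combinations in one grid."""
    return len(list(product(*grid.values())))

# Staged search: the stage sizes add up.
staged_total = sum(grid_size(g) for g in stages)

# Joint search over all parameters at once: the stage sizes multiply.
joint_total = 1
for g in stages:
    joint_total *= grid_size(g)

print(staged_total, joint_total)
```

The staged search evaluates 49 settings where a joint grid would need 10500, each multiplied by the number of cross-validation folds. The price is that interactions between parameters tuned in different stages are not explored.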

In [6]:
params = {'max_depth': range(3, 8), 'min_child_weight': range(3, 8)}

optimized_GBM = GridSearchCV(XGBClassifier(objective='multi:softmax', num_classes=4, learning_rate=0.1),
                             params,
                             scoring='accuracy')


In [7]:
optimized_GBM.fit(train_data, train_labels)
optimized_GBM.grid_scores_, optimized_GBM.best_params_, optimized_GBM.best_score_

([mean: 0.97442, std: 0.00686, params: {'max_depth': 3, 'min_child_weight': 3},
  mean: 0.97567, std: 0.00608, params: {'max_depth': 3, 'min_child_weight': 4},
  mean: 0.97505, std: 0.00686, params: {'max_depth': 3, 'min_child_weight': 5},
  mean: 0.97505, std: 0.00686, params: {'max_depth': 3, 'min_child_weight': 6},
  mean: 0.97567, std: 0.00455, params: {'max_depth': 3, 'min_child_weight': 7},
  mean: 0.97567, std: 0.00547, params: {'max_depth': 4, 'min_child_weight': 3},
  mean: 0.97629, std: 0.00633, params: {'max_depth': 4, 'min_child_weight': 4},
  mean: 0.97629, std: 0.00488, params: {'max_depth': 4, 'min_child_weight': 5},
  mean: 0.97505, std: 0.00614, params: {'max_depth': 4, 'min_child_weight': 6},
  mean: 0.97629, std: 0.00633, params: {'max_depth': 4, 'min_child_weight': 7},
  mean: 0.97567, std: 0.00663, params: {'max_depth': 5, 'min_child_weight': 3},
  mean: 0.97629, std: 0.00614, params: {'max_depth': 5, 'min_child_weight': 4},
  mean: 0.97692, std: 0.00438, params: {

In [8]:
params = {'gamma':[i/10.0 for i in range(0,5)]}
optimized_GBM = GridSearchCV(XGBClassifier(objective='multi:softmax', num_classes=4, max_depth=6, min_child_weight=5), 
                             params, 
                             scoring = 'accuracy')

In [9]:
optimized_GBM.fit(train_data, train_labels)
optimized_GBM.grid_scores_, optimized_GBM.best_params_, optimized_GBM.best_score_

([mean: 0.97817, std: 0.00488, params: {'gamma': 0.0},
  mean: 0.97629, std: 0.00488, params: {'gamma': 0.1},
  mean: 0.97692, std: 0.00614, params: {'gamma': 0.2},
  mean: 0.97754, std: 0.00697, params: {'gamma': 0.3},
  mean: 0.97692, std: 0.00719, params: {'gamma': 0.4}],
 {'gamma': 0.0},
 0.9781659388646288)

In [10]:
params = {'learning_rate': [0.1, 0.01, 0.001], 'subsample': [0.7,0.8,0.9, 1.0]}
optimized_GBM = GridSearchCV(XGBClassifier(objective='multi:softmax', num_classes=4, max_depth=6, min_child_weight=5, gamma=0), 
                             params, 
                             scoring = 'accuracy')

In [11]:
optimized_GBM.fit(train_data, train_labels)
optimized_GBM.grid_scores_, optimized_GBM.best_params_, optimized_GBM.best_score_

([mean: 0.97567, std: 0.00401, params: {'subsample': 0.7, 'learning_rate': 0.1},
  mean: 0.97629, std: 0.00381, params: {'subsample': 0.8, 'learning_rate': 0.1},
  mean: 0.97817, std: 0.00633, params: {'subsample': 0.9, 'learning_rate': 0.1},
  mean: 0.97817, std: 0.00488, params: {'subsample': 1.0, 'learning_rate': 0.1},
  mean: 0.96132, std: 0.01733, params: {'subsample': 0.7, 'learning_rate': 0.01},
  mean: 0.96257, std: 0.01758, params: {'subsample': 0.8, 'learning_rate': 0.01},
  mean: 0.96319, std: 0.01629, params: {'subsample': 0.9, 'learning_rate': 0.01},
  mean: 0.96195, std: 0.01615, params: {'subsample': 1.0, 'learning_rate': 0.01},
  mean: 0.96132, std: 0.01881, params: {'subsample': 0.7, 'learning_rate': 0.001},
  mean: 0.96070, std: 0.01946, params: {'subsample': 0.8, 'learning_rate': 0.001},
  mean: 0.96007, std: 0.01966, params: {'subsample': 0.9, 'learning_rate': 0.001},
  mean: 0.95758, std: 0.01930, params: {'subsample': 1.0, 'learning_rate': 0.001}],
 {'learning_rat

In [12]:
params = {'reg_alpha':[0, 1e-5, 1e-2, 0.1, 0.01, 1, 100]}
optimized_GBM = GridSearchCV(XGBClassifier(objective='multi:softmax', num_classes=4, max_depth=6, min_child_weight=5, gamma=0, learning_rate=0.1, subsample=0.9), 
                             params, 
                             scoring = 'accuracy')

In [13]:
optimized_GBM.fit(train_data, train_labels)
optimized_GBM.grid_scores_, optimized_GBM.best_params_, optimized_GBM.best_score_

([mean: 0.97817, std: 0.00633, params: {'reg_alpha': 0},
  mean: 0.97817, std: 0.00633, params: {'reg_alpha': 1e-05},
  mean: 0.97692, std: 0.00614, params: {'reg_alpha': 0.01},
  mean: 0.97879, std: 0.00576, params: {'reg_alpha': 0.1},
  mean: 0.97692, std: 0.00614, params: {'reg_alpha': 0.01},
  mean: 0.97567, std: 0.00547, params: {'reg_alpha': 1},
  mean: 0.95633, std: 0.01502, params: {'reg_alpha': 100}],
 {'reg_alpha': 0.1},
 0.9787897691827823)
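
The searches above use `sklearn.grid_search` and the `grid_scores_` attribute, both of which were removed in scikit-learn 0.20. A minimal sketch of the same pattern with `sklearn.model_selection` and `cv_results_` follows. To stay self-contained it makes two assumptions that are not part of the notebook: the data is synthetic, and `GradientBoostingClassifier` stands in for `XGBClassifier`.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import GridSearchCV

# Synthetic stand-in data: 4 rooms, 7 access points, 50 samples per room.
rng = np.random.RandomState(0)
room_means = rng.uniform(-90, -40, size=(4, 7))
X = np.vstack([mean + rng.normal(0, 3, size=(50, 7)) for mean in room_means])
y = np.repeat([1, 2, 3, 4], 50)

search = GridSearchCV(
    GradientBoostingClassifier(n_estimators=20, random_state=0),
    {'max_depth': [2, 3]},
    scoring='accuracy',
    cv=3,
)
search.fit(X, y)

# cv_results_ replaces grid_scores_: one entry per parameter setting.
results = search.cv_results_
for mean, std, params in zip(results['mean_test_score'],
                             results['std_test_score'],
                             results['params']):
    print("mean: %.5f, std: %.5f, params: %s" % (mean, std, params))
print(search.best_params_, search.best_score_)
```

Passing `cv` explicitly makes the number of folds visible instead of relying on a library default that has changed between versions.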

Build the final classifier using the hyperparameters obtained above.

In [14]:
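# NOTE: `req_alpha` misspells `reg_alpha`, so reg_alpha stays at its default 0 (see the repr below).
final_classifier = XGBClassifier(objective='multi:softmax', num_classes=4, learning_rate=0.1, subsample=0.7, max_depth=5, min_child_weight=5, gamma=0, req_alpha=0.1)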


In [15]:
final_classifier

XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1,
       colsample_bytree=1, gamma=0, learning_rate=0.1, max_delta_step=0,
       max_depth=5, min_child_weight=5, missing=None, n_estimators=100,
       n_jobs=1, nthread=None, num_classes=4, objective='multi:softmax',
       random_state=0, reg_alpha=0, reg_lambda=1, req_alpha=0.1,
       scale_pos_weight=1, seed=None, silent=True, subsample=0.7)

In [16]:
model = final_classifier.fit(train_data, train_labels)


In [17]:
cv_predicted_labels = model.predict(cv_data)
accuracy = accuracy_score(cv_labels, cv_predicted_labels)
print("Final Accuracy: %.2f%%" % (accuracy * 100.0))

Final Accuracy: 98.99%
