# Indoor localization

An indoor positioning system (IPS) locates objects or people inside a building using radio waves, magnetic fields, acoustic signals, or other sensory information collected by mobile devices. Several commercial systems are on the market, but there is no standard for IPS.

IPSs rely on different technologies, including distance measurement to nearby anchor nodes (nodes with known positions, e.g., Wi-Fi access points), magnetic positioning, and dead reckoning. They either actively locate mobile devices and tags, or provide ambient location or environmental context that the devices sense.

According to a MarketsandMarkets [report](https://www.marketsandmarkets.com/Market-Reports/indoor-positioning-navigation-ipin-market-989.html), the global indoor location market is expected to grow from USD 7.11 billion in 2017 to USD 40.99 billion by 2022, a compound annual growth rate (CAGR) of 42.0% over the forecast period. The report attributes this growth to hassle-free navigation, improved decision-making, and increased adoption of connected devices.

In this problem, you will use signal readings from seven different Wi-Fi access points to determine which room the user is in.

In [1]:
import pandas
import numpy as np
import xgboost
from sklearn.metrics import accuracy_score
from sklearn.model_selection import GridSearchCV

Load the training and cross-validation sets, then split each into features (the Wi-Fi columns) and labels (the room).

In [2]:
# Each CSV holds one reading per Wi-Fi access point plus the room label.
train_set = pandas.read_csv('train_set.csv')
cv_set = pandas.read_csv('cv_set.csv')

# Select the feature columns by name (wifi1, wifi2, ...) rather than by position.
wifi_columns = [col for col in train_set.columns if col.startswith('wifi')]
train_data = train_set[wifi_columns]
train_labels = train_set['room']
cv_data = cv_set[wifi_columns]
cv_labels = cv_set['room']

In [5]:
print(train_labels)

0       1
1       1
2       1
3       1
4       1
       ..
1598    4
1599    4
1600    4
1601    4
1602    4
Name: room, Length: 1603, dtype: int64


In [6]:
print(cv_data[:10])
print(cv_labels[:10])

   wifi1  wifi2  wifi3  wifi4  wifi5  wifi6  wifi7
0    -64    -56    -61    -66    -71    -82    -81
1    -63    -65    -60    -63    -77    -81    -87
2    -64    -55    -63    -66    -76    -88    -83
3    -65    -60    -59    -63    -76    -86    -82
4    -67    -61    -62    -67    -77    -83    -91
5    -61    -59    -65    -63    -74    -89    -87
6    -63    -56    -63    -65    -72    -82    -89
7    -66    -59    -64    -68    -68    -97    -83
8    -67    -57    -64    -71    -75    -89    -87
9    -63    -57    -59    -67    -71    -82    -93
0    1
1    1
2    1
3    1
4    1
5    1
6    1
7    1
8    1
9    1
Name: room, dtype: int64


### Training XGBoost classifier

In [3]:
model = xgboost.XGBClassifier(seed=44)
model.fit(train_data, train_labels)
# predict() on a classifier already returns room labels, so no rounding is needed.
preds = model.predict(cv_data)
accuracy_score(cv_labels, preds)

0.9848866498740554
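
A single accuracy figure does not show which rooms get confused with each other. A confusion matrix breaks the score down per room. The sketch below uses scikit-learn's `confusion_matrix` on small made-up label lists: `toy_true` and `toy_pred` are illustrative and not the notebook's data, and four rooms numbered 1 to 4 are assumed. In the notebook, `cv_labels` and `preds` would take their place.

```python
from sklearn.metrics import confusion_matrix

# Assumed room identifiers; the toy labels below are made up for illustration.
rooms = [1, 2, 3, 4]
toy_true = [1, 1, 2, 2, 3, 3, 4, 4]
toy_pred = [1, 1, 2, 3, 3, 3, 4, 1]

# Rows are true rooms, columns are predicted rooms.
cm = confusion_matrix(toy_true, toy_pred, labels=rooms)

# Per-room recall: correctly predicted samples divided by all samples of that room.
per_room_recall = cm.diagonal() / cm.sum(axis=1)

print(cm)
for room, recall in zip(rooms, per_room_recall):
    print(f"room {room}: recall {recall:.2f}")
```

Off-diagonal entries point at specific pairs of rooms that the model mixes up, which is often more actionable than the overall score.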

### Tuning hyperparameters

In [3]:
params = {
    'gamma':[i/10.0 for i in range(0,5)],
    'max_depth':range(3,10,2),
    'min_child_weight':range(1,6,2),
    'reg_alpha':[0, 0.001, 0.005, 0.01, 0.05],
    'subsample':[i/10.0 for i in range(6,10)],
    'colsample_bytree':[i/10.0 for i in range(6,10)]
}
gsearch = GridSearchCV(estimator=xgboost.XGBClassifier(seed=44), n_jobs=-1, param_grid=params, scoring='neg_log_loss', cv=4)
gsearch.fit(train_data, train_labels)

GridSearchCV(cv=4, error_score=nan,
             estimator=XGBClassifier(base_score=None, booster=None,
                                     colsample_bylevel=None,
                                     colsample_bynode=None,
                                     colsample_bytree=None, gamma=None,
                                     gpu_id=None, importance_type='gain',
                                     interaction_constraints=None,
                                     learning_rate=None, max_delta_step=None,
                                     max_depth=None, min_child_weight=None,
                                     missing=nan, monotone_constraints=None,
                                     n_estim...
                                     validate_parameters=False,
                                     verbosity=None),
             iid='deprecated', n_jobs=-1,
             param_grid={'colsample_bytree': [0.6, 0.7, 0.8, 0.9],
                         'gamma': [0.0, 0.1, 0.2, 0.3, 0
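
To get a feel for the cost of this search, the size of the grid can be counted directly. The sketch below repeats the `params` dictionary from the tuning cell and multiplies the number of candidates per parameter. The names `n_combinations` and `n_fits` are illustrative, and the count excludes the final refit on the full training set.

```python
import math

# Same grid as in the tuning cell.
params = {
    'gamma': [i / 10.0 for i in range(0, 5)],
    'max_depth': range(3, 10, 2),
    'min_child_weight': range(1, 6, 2),
    'reg_alpha': [0, 0.001, 0.005, 0.01, 0.05],
    'subsample': [i / 10.0 for i in range(6, 10)],
    'colsample_bytree': [i / 10.0 for i in range(6, 10)],
}
cv_folds = 4

# Grid search tries every combination, so the sizes multiply.
n_combinations = math.prod(len(values) for values in params.values())
n_fits = n_combinations * cv_folds

print(n_combinations, n_fits)  # 4800 19200
```

With this many fits, a randomized search over the same ranges (`RandomizedSearchCV` in scikit-learn) is a common cheaper alternative.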

In [4]:
gsearch.best_params_

{'colsample_bytree': 0.6,
 'gamma': 0.4,
 'max_depth': 9,
 'min_child_weight': 1,
 'reg_alpha': 0,
 'subsample': 0.9}
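
Copying the winning values into a new constructor by hand is easy to get wrong. With the default `refit=True`, `GridSearchCV` already retrains the best configuration on the whole training set and exposes it as `best_estimator_`; alternatively, `best_params_` can be unpacked straight into the constructor. The sketch below shows both on synthetic data, with a `DecisionTreeClassifier` standing in for XGBoost to keep it fast. The dataset and the small grid are assumptions for illustration only.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the Wi-Fi data: 7 features, 4 classes.
X, y = make_classification(
    n_samples=200, n_features=7, n_informative=5,
    n_classes=4, random_state=44,
)

search = GridSearchCV(
    estimator=DecisionTreeClassifier(random_state=44),
    param_grid={'max_depth': [2, 4, 6]},
    cv=4,
)
search.fit(X, y)

# Option 1: use the model that GridSearchCV refitted on all of X, y.
best_model = search.best_estimator_

# Option 2: rebuild the model by unpacking best_params_, with no retyping.
rebuilt = DecisionTreeClassifier(random_state=44, **search.best_params_)
rebuilt.fit(X, y)

print(search.best_params_)
```

In the notebook, the equivalent of option 1 would be `gsearch.best_estimator_.predict(cv_data)`.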

In [7]:
# Retrain with the best parameters found by the grid search.
updated_model = xgboost.XGBClassifier(
    gamma=0.4, max_depth=9, min_child_weight=1,
    reg_alpha=0, subsample=0.9, colsample_bytree=0.6, seed=44)
updated_model.fit(train_data, train_labels)
# predict() already returns room labels, so no rounding is needed.
preds = updated_model.predict(cv_data)
accuracy_score(cv_labels, preds)

0.9874055415617129
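
Because accuracy is a ratio of correct predictions to samples, the two scores can be turned back into fractions to see how large the improvement really is. This is a sketch using only the standard library. The bound of 2000 on the denominator is an assumption, since the size of the cross-validation set is not shown above, and a multiple of the recovered denominator would be equally consistent with the scores.

```python
from fractions import Fraction

# Accuracy scores reported before and after tuning.
baseline = Fraction(0.9848866498740554).limit_denominator(2000)
tuned = Fraction(0.9874055415617129).limit_denominator(2000)

# Simplest fractions matching the scores, and the gap between them.
print(baseline, tuned, tuned - baseline)  # 391/397 392/397 1/397
```

Under that reading, tuning changes the outcome for a single sample, so the gain may be within what a different random seed or data split would produce.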