# Can your predicted positions really stand?

* Predicted positions are not always in places where a person can actually stand
* This notebook shows a simple way to analyze predicted positions and correct them

## Reference and acknowledgement
* Notebooks
 * https://www.kaggle.com/devinanzelmo/wifi-features
* Datasets
 * https://www.kaggle.com/hiro5299834/indoor-navigation-and-location-wifi-features

In [None]:
# load libraries
import os
import cv2
import json
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

In [None]:
floor_map = {"B2":-2, "B1":-1, "F1":0, "F2":1, "F3":2, "F4":3, "F5":4, "F6":5, "F7":6, "F8":7, "F9":8,
             "1F":0, "2F":1, "3F":2, "4F":3, "5F":4, "6F":5, "7F":6, "8F":7, "9F":8}

In [None]:
# First, load submission.csv
# change the submission_csv path to point to your own submission
submission_csv = "../input/indoor-location-navigation-sample-submission/sample_submission.csv"
submission_df = pd.read_csv(submission_csv)

In [None]:
# list buildings
site_df = submission_df["site_path_timestamp"].str.split("_", expand=True)
site_df[0].unique()
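
The split above yields unnamed columns `0`, `1` and `2`. A minimal sketch of the same step with named columns, assuming each identifier follows the `<site>_<path>_<timestamp>` layout implied by the column name; the identifiers and the names `demo_df`, `parts` and `sites` are hypothetical and only serve the illustration:

```python
import pandas as pd

# Hypothetical identifiers in the "<site>_<path>_<timestamp>" layout
demo_df = pd.DataFrame({
    "site_path_timestamp": [
        "siteA_path1_0000000000009",
        "siteA_path1_0000000009017",
        "siteB_path7_0000000000250",
    ]
})

# Split on "_" into three columns and give them descriptive names
parts = demo_df["site_path_timestamp"].str.split("_", expand=True)
parts.columns = ["site", "path", "timestamp"]

# Timestamps are zero-padded strings; convert them to integers
parts["timestamp"] = parts["timestamp"].astype("int64")

# Distinct sites, in order of first appearance
sites = parts["site"].unique().tolist()
```

Named columns make later filters such as `parts["site"] == site` easier to read than `site_df[0] == site`.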

In [None]:
# sample site and floor to inspect
site = "5da1383b4db8ce0c98bc11ab"
floor = "F3"
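
The floor name set above is later translated to an integer through `floor_map`. The same mapping can be derived from the naming pattern instead of being typed out. This is a minimal sketch, assuming floor names only take the forms `F<n>`, `<n>F` or `B<n>` that appear in `floor_map`; `floor_to_index` is a hypothetical helper that is not part of the original notebook, and any other floor names in the metadata would need their own handling.

```python
import re

def floor_to_index(name):
    """Convert a floor name such as "F3", "3F" or "B1" to an integer index.

    Hypothetical helper: the first floor ("F1" or "1F") maps to 0 and
    basements ("B1", "B2", ...) map to negative numbers, as in floor_map.
    """
    basement = re.fullmatch(r"B(\d+)", name)
    if basement:
        return -int(basement.group(1))
    above = re.fullmatch(r"F(\d+)|(\d+)F", name)
    if above:
        # Exactly one of the two groups matched
        return int(above.group(1) or above.group(2)) - 1
    raise ValueError(f"unrecognized floor name: {name!r}")
```

With this helper, `floor_to_index(floor)` could stand in for `floor_map[floor]`, at the cost of raising `ValueError` instead of `KeyError` for unknown names.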

In [None]:
# extract the rows for the chosen building and floor from submission.csv
submission_df_ext = submission_df[site_df[0] == site]
submission_df_ext = submission_df_ext[submission_df_ext["floor"] == floor_map[floor]]

# load train positions
train_csv = "../input/indoor-navigation-and-location-wifi-features/%s_train.csv" % (site)
train_df = pd.read_csv(train_csv)
train_df_ext = train_df[train_df["f"] == floor_map[floor]]

# load building information
floor_image = "../input/indoor-location-navigation/metadata/%s/%s/floor_image.png" % (site, floor)
json_path = "../input/indoor-location-navigation/metadata/%s/%s/floor_info.json" % (site, floor)
with open(json_path, "r") as f:
    train_floor_info = json.load(f)

# load image
img_bgr = cv2.imread(floor_image)
img_rgb = cv2.cvtColor(img_bgr, cv2.COLOR_BGR2RGB)
img_height, img_width, _ = img_bgr.shape

# convert positions to image pixel coordinates (the image y-axis points down)
submission_x = submission_df_ext["x"].values * img_width / train_floor_info["map_info"]["width"]
submission_y = img_height - submission_df_ext["y"].values * img_height / train_floor_info["map_info"]["height"]
train_x = train_df_ext["x"].values * img_width / train_floor_info["map_info"]["width"]
train_y = img_height - train_df_ext["y"].values * img_height / train_floor_info["map_info"]["height"]
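
The four conversions above all apply the same rule: scale by the ratio of image size to map size, then flip the y-axis because image rows count downward from the top. A minimal sketch of that rule as one function, assuming the map width and height are given in the same units as `x` and `y`; `to_pixel` and its parameter names are hypothetical and not part of the original notebook:

```python
def to_pixel(x, y, map_width, map_height, img_width, img_height):
    """Convert map coordinates to image pixel coordinates.

    Hypothetical helper. Works on plain numbers and on NumPy arrays,
    since it only uses arithmetic operators.
    """
    px = x * img_width / map_width
    # Map y grows upward, image rows grow downward, so flip the axis
    py = img_height - y * img_height / map_height
    return px, py


# Example: a 100 x 50 map drawn on a 1000 x 500 pixel image
origin = to_pixel(0, 0, 100, 50, 1000, 500)
corner = to_pixel(100, 50, 100, 50, 1000, 500)
inner = to_pixel(25, 10, 100, 50, 1000, 500)
```

The map origin lands at the bottom-left pixel of the image and the far corner of the map at the top-right pixel, which is a quick sanity check for the flip.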

In [None]:
# plot positions
fig = plt.figure(figsize=(15, 15))
plt.imshow(img_rgb, alpha=1)
plt.scatter(train_x, train_y, marker="o", color="blue", label="train")
plt.scatter(submission_x, submission_y, marker="o", color="red", label="predicted")
plt.legend(fontsize=16)
plt.show()

## Discussion

* Every position in the train data lies in a black area, where people appear to be able to stand
* However, some predicted positions lie in light blue areas, where people apparently cannot stand
* These forbidden positions have to be corrected in some way
* In this notebook, I simply move each one to the nearest permitted position in a black area
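
The idea of moving a forbidden position to the nearest permitted one can be illustrated on a toy mask. This is a minimal sketch, not the notebook's method: it searches all permitted pixels by brute force rather than using contours, and the mask, the function `snap_to_permitted` and the convention of 0 for permitted pixels are assumptions made for the illustration.

```python
import numpy as np

# Toy 5x5 mask: 0 = permitted (black), 255 = forbidden
mask = np.zeros((5, 5), dtype=np.uint8)
mask[1:4, 1:4] = 255  # forbidden block in the middle


def snap_to_permitted(point_xy, mask):
    """Return the point unchanged if permitted, else the nearest permitted pixel."""
    x, y = point_xy
    if mask[y, x] == 0:  # arrays are indexed [row, column], i.e. [y, x]
        return (x, y)
    ys, xs = np.nonzero(mask == 0)       # coordinates of all permitted pixels
    d2 = (xs - x) ** 2 + (ys - y) ** 2   # squared distance to each of them
    i = np.argmin(d2)
    return (int(xs[i]), int(ys[i]))


kept = snap_to_permitted((0, 0), mask)    # already permitted
moved = snap_to_permitted((1, 2), mask)   # forbidden, one pixel from the left edge
```

Brute force is easy to verify on small inputs but scans the whole image for every point, so a search restricted to boundary pixels is cheaper on a full floor image.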

In [None]:
# find the boundaries of the non-black regions
img_gray = cv2.cvtColor(img_rgb, cv2.COLOR_RGB2GRAY)
img_th = cv2.threshold(img_gray, 1, 255, cv2.THRESH_BINARY)[1]
contours, hierarchy = cv2.findContours(img_th, cv2.RETR_LIST, cv2.CHAIN_APPROX_NONE)
contour_arr = np.vstack(contours)
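coords = np.round(np.vstack([submission_x, submission_y]).T).astype(np.int32)
# keep indices inside the image so the pixel lookup below cannot go out of bounds
coords = np.clip(coords, 0, [img_width - 1, img_height - 1])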

# move each forbidden position (non-black pixel) to the nearest contour point;
# positions already on a black pixel are kept as they are
def snap(coord):
    if img_gray[coord[1], coord[0]] == 0:
        return coord
    dist = np.linalg.norm(contour_arr - coord, axis=2)
    return contour_arr[np.argmin(dist)]

new_coords = np.vstack([snap(coord) for coord in coords])
new_x, new_y = np.vsplit(new_coords.T, 2)

In [None]:
# plot positions, including the corrected ones
fig = plt.figure(figsize=(15, 15))
plt.imshow(img_rgb, alpha=1)
plt.scatter(train_x, train_y, marker="o", color="blue", label="train")
plt.scatter(submission_x, submission_y, marker="o", color="red", label="predicted")
plt.scatter(new_x, new_y, marker="o", color="orange", label="corrected")
plt.legend(fontsize=16)
plt.show()
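
Besides the plot, it can help to check how many positions were moved and how far. A minimal sketch with hypothetical pixel coordinates; in the notebook the corresponding arrays would be `coords` (before) and `new_coords` (after), and the names `before`, `after`, `shift`, `n_moved` and `max_shift` are assumptions made for the illustration.

```python
import numpy as np

# Hypothetical pixel coordinates before and after correction, shape (n, 2)
before = np.array([[10, 10], [20, 35], [40, 40]])
after = np.array([[10, 10], [23, 39], [40, 40]])

# Per-point displacement in pixels
shift = np.linalg.norm(after - before, axis=1)

n_moved = int((shift > 0).sum())   # number of positions that changed
max_shift = float(shift.max())     # largest correction, in pixels
```

A large maximum shift suggests a prediction that was far inside a forbidden area, which may deserve a closer look rather than a simple snap.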