# Preparing for map matching

* Map matching is the problem of matching predicted geographic coordinates to a model of the real map (see the figure below, from Wikipedia).
* To match predicted coordinates to the real map, we first have to build a model from the real map information.
* In this notebook, I present an example of how to build such a map model.

![map matching](https://upload.wikimedia.org/wikipedia/commons/thumb/8/8c/Map_Matching_Example_with_GraphHopper.png/483px-Map_Matching_Example_with_GraphHopper.png)


## References and Acknowledgements

* Papers
  * https://www.mdpi.com/1424-8220/17/6/1272
* Notebooks
  * https://www.kaggle.com/ihelon/indoor-location-exploratory-data-analysis
  * https://www.kaggle.com/wineplanetary/can-your-predicted-positions-really-stand
* Datasets
  * https://www.kaggle.com/hiro5299834/indoor-navigation-and-location-wifi-features
* Discussions
  * https://www.kaggle.com/c/indoor-location-navigation/discussion/217874

In [None]:
import os
import cv2
import math
import json
import glob
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

In [None]:
base_dir = "../input/indoor-location-navigation/metadata"
floor_map = {"B2":-2, "B1":-1, "F1":0, "F2":1, "F3":2, "F4":3, "F5":4, "F6":5, "F7":6, "F8":7, "F9":8,
             "1F":0, "2F":1, "3F":2, "4F":3, "5F":4, "6F":5, "7F":6, "8F":7, "9F":8}

def floor2strs(floor):
    return [key for key, val in floor_map.items() if val == floor]
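
The table above maps floor directory names to the integer floor index used in the train data. Because two naming styles (`F1` and `1F`) share the same index, the reverse lookup returns a list. A small usage sketch, with the table and function repeated so that it runs on its own:

```python
# Floor-name to floor-index table, copied from the cell above.
floor_map = {"B2":-2, "B1":-1, "F1":0, "F2":1, "F3":2, "F4":3, "F5":4, "F6":5, "F7":6, "F8":7, "F9":8,
             "1F":0, "2F":1, "3F":2, "4F":3, "5F":4, "6F":5, "7F":6, "8F":7, "9F":8}

def floor2strs(floor):
    # All directory names that correspond to one integer floor index.
    return [key for key, val in floor_map.items() if val == floor]

print(floor2strs(0))   # ['F1', '1F']
print(floor2strs(-2))  # ['B2']
print(floor2strs(20))  # [] (index not in the table)
```

A floor name that is missing from `floor_map` raises a `KeyError` on forward lookup, so the table may need extending for sites with other floor names.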

In [None]:
# sample
site = "5d27097f03f801723c320d97"

## First, let's see real floor maps!

In [None]:
%matplotlib inline
floor_image_list = sorted(glob.glob(os.path.join(base_dir, site, "*", "*.png")))
w = math.ceil(len(floor_image_list) / 2)
h = math.ceil(len(floor_image_list) / w)
fig = plt.figure(figsize=(16, 10))
for i, floor_image in enumerate(floor_image_list):
    floor = os.path.basename(os.path.dirname(floor_image))
    plt.subplot(h, w, i + 1)
    img = cv2.imread(floor_image, cv2.IMREAD_UNCHANGED) # need alpha
    img = cv2.cvtColor(img, cv2.COLOR_BGRA2RGBA) # OpenCV loads BGRA, matplotlib expects RGBA
    plt.imshow(img)
    plt.axis("off")
    plt.title(floor)
plt.tight_layout()
plt.show()

## Check distribution of position in train data

* To overlay the coordinates in the train data on the real maps, we have to convert the coordinates into pixels.

In [None]:
def coord2pix(x, y, img_width, img_height, train_floor_info):
    pixx = x * img_width / train_floor_info["map_info"]["width"]
    pixy = img_height - y * img_height / train_floor_info["map_info"]["height"]
    return pixx, pixy
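
As a quick sanity check, the conversion can be tried on made-up numbers. The sketch below repeats `coord2pix`, adds a hypothetical inverse `pix2coord` (not part of the original notebook), and uses an invented `floor_info` dictionary and image size instead of real metadata.

```python
import numpy as np

def coord2pix(x, y, img_width, img_height, train_floor_info):
    # Scale map coordinates to pixels and flip the y axis,
    # because image rows grow downward while map y grows upward.
    pixx = x * img_width / train_floor_info["map_info"]["width"]
    pixy = img_height - y * img_height / train_floor_info["map_info"]["height"]
    return pixx, pixy

def pix2coord(pixx, pixy, img_width, img_height, train_floor_info):
    # Hypothetical inverse of coord2pix: pixels back to map coordinates.
    x = pixx * train_floor_info["map_info"]["width"] / img_width
    y = (img_height - pixy) * train_floor_info["map_info"]["height"] / img_height
    return x, y

# Invented values for illustration only (not taken from any floor_info.json).
floor_info = {"map_info": {"width": 200.0, "height": 100.0}}
img_width, img_height = 1000, 500

x = np.array([0.0, 50.0, 200.0])
y = np.array([0.0, 25.0, 100.0])

pixx, pixy = coord2pix(x, y, img_width, img_height, floor_info)
print(pixx.tolist())  # [0.0, 250.0, 1000.0]
print(pixy.tolist())  # [500.0, 375.0, 0.0]

# Round trip: converting back should recover the original coordinates.
x_back, y_back = pix2coord(pixx, pixy, img_width, img_height, floor_info)
print(np.allclose(x_back, x), np.allclose(y_back, y))  # True True
```

The map origin ends up at the bottom-left corner of the image, which is why `pixy` equals the image height when `y` is 0.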

In [None]:
floor_image_list = sorted(glob.glob(os.path.join(base_dir, site, "*", "*.png")))
w = math.ceil(len(floor_image_list) / 2)
h = math.ceil(len(floor_image_list) / w)

train_csv = "../input/indoor-navigation-and-location-wifi-features/%s_train.csv" % (site)
train_df = pd.read_csv(train_csv)

fig = plt.figure(figsize=(20, 15))
for i, floor_image in enumerate(floor_image_list):
    floor = os.path.basename(os.path.dirname(floor_image))
    plt.subplot(h, w, i + 1)
    train_df_ext = train_df[train_df["f"] == floor_map[floor]]
    
    json_path = os.path.join(base_dir, site, floor, "floor_info.json")
    with open(json_path, "r") as f:
        train_floor_info = json.load(f)
    
    img = cv2.imread(floor_image, cv2.IMREAD_UNCHANGED) # need alpha
    img = cv2.cvtColor(img, cv2.COLOR_BGRA2RGBA) # OpenCV loads BGRA, matplotlib expects RGBA
    img_height, img_width, _ = img.shape
    pixx, pixy = coord2pix(train_df_ext["x"].values, train_df_ext["y"].values, img_width, img_height, train_floor_info)
    plt.imshow(img)
    plt.scatter(pixx, pixy, marker="o", color="blue", label="train")
    plt.axis("off")
    plt.title(floor)
plt.tight_layout()
plt.show()

### Discussion

* Blue points are the waypoints recorded in the train data.
* Most waypoints appear to lie in the white areas, so most predicted positions should fall in the white areas as well.
* To restrict predictions to the white areas, we have to extract them from the map information.
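
The visual impression can also be checked numerically once a mask of walkable pixels is available. A minimal sketch, assuming a 0/1 array `mask` with the same height and width as the floor image and waypoints already converted to pixels. The helper `fraction_on_mask` and the toy data are hypothetical and not part of the original notebook.

```python
import numpy as np

def fraction_on_mask(pixx, pixy, mask):
    """Share of pixel positions that land on a nonzero cell of `mask`."""
    height, width = mask.shape
    # Round to the nearest pixel and clip to the image bounds.
    cols = np.clip(np.round(pixx).astype(int), 0, width - 1)
    rows = np.clip(np.round(pixy).astype(int), 0, height - 1)
    return float(np.mean(mask[rows, cols] > 0))

# Toy mask: a 6x8 image with a horizontal corridor on rows 2 and 3.
mask = np.zeros((6, 8), dtype=np.uint8)
mask[2:4, :] = 1

# Four toy waypoints in pixel coordinates (x = column, y = row).
pixx = np.array([1.0, 4.2, 6.9, 3.0])
pixy = np.array([2.0, 3.4, 5.0, 0.0])

print(fraction_on_mask(pixx, pixy, mask))  # 0.5
```

Note that points outside the image are clipped to the nearest edge pixel here, which may or may not be the desired behaviour.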

## Extract white area from map information

In [None]:
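def extract_permitted_area_from_map(site, floor):
    """Return the floor image (BGRA) and a 0/1 mask of the permitted area."""
    floor_image_path = os.path.join(base_dir, site, floor, "floor_image.png")
    img = cv2.imread(floor_image_path, cv2.IMREAD_UNCHANGED)  # keep the alpha channel
    height, width, channel = img.shape
    # soft: 1 where the pixel is not transparent (alpha > 1)
    _, thimg_soft = cv2.threshold(img[:,:,3], 1, 1, cv2.THRESH_BINARY)
    # hard: 1 where the pixel is not fully opaque (alpha <= 254)
    _, thimg_hard = cv2.threshold(img[:,:,3], 254, 1, cv2.THRESH_BINARY_INV)
    # zero out the one-pixel image border
    thimg_soft[0, :] = 0
    thimg_soft[height - 1, :] = 0
    thimg_soft[:, 0] = 0
    thimg_soft[:, width - 1] = 0
    mask_img = np.zeros_like(thimg_soft).astype(np.uint8)
    # fill the outer outline of every large drawn region (contour area > 1000)
    contours, hierarchy = cv2.findContours(thimg_soft, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_NONE)
    contours = [contour for contour in contours if cv2.contourArea(contour) > 1000]
    cv2.fillPoly(mask_img, contours, 1)
    # permitted = inside a filled outline and not fully opaque
    permitted_img = np.minimum(mask_img, thimg_hard)
    # smooth the mask with a 5x5 box filter
    return img, cv2.blur(permitted_img, (5, 5))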

* The second value returned by this function is a binary (0/1) mask of the permitted area, which is shown as red regions in the figure below.
* Let's plot and check!
* To map your predicted coordinates onto the red areas, you first have to convert them into pixels.
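
One simple way to use such a mask is to move each predicted position to the nearest permitted pixel. The sketch below is only an illustration of that idea on a toy mask, not the method of the referenced notebooks. The function `snap_to_permitted` is hypothetical, and it expects positions that are already in pixel units.

```python
import numpy as np

def snap_to_permitted(pixx, pixy, permitted_mask):
    """Move each pixel position to the nearest pixel where the mask is 1."""
    rows, cols = np.nonzero(permitted_mask == 1)
    if rows.size == 0:
        raise ValueError("mask has no permitted pixels")
    # Points and candidates as (x, y) pairs: x is the column, y is the row.
    pts = np.stack([np.atleast_1d(pixx), np.atleast_1d(pixy)], axis=1).astype(float)
    cand = np.stack([cols, rows], axis=1).astype(float)
    # Squared distance between every point and every permitted pixel.
    d2 = ((pts[:, None, :] - cand[None, :, :]) ** 2).sum(axis=2)
    nearest = d2.argmin(axis=1)
    return cand[nearest, 0], cand[nearest, 1]

# Toy mask: a 5x7 image with a corridor on row 2, columns 1 to 5.
permitted_mask = np.zeros((5, 7), dtype=np.uint8)
permitted_mask[2, 1:6] = 1

# Three toy predictions in pixel coordinates.
pixx = np.array([3.0, 6.4, 2.2])
pixy = np.array([0.0, 4.0, 2.0])

sx, sy = snap_to_permitted(pixx, pixy, permitted_mask)
print(sx.tolist())  # [3.0, 5.0, 2.0]
print(sy.tolist())  # [2.0, 2.0, 2.0]
```

This brute-force search compares every point with every permitted pixel, so on full-size floor images a spatial index such as `scipy.spatial.cKDTree` would likely be a better fit. The snapped pixels also need to be converted back to map coordinates by inverting `coord2pix`.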

In [None]:
floor_image_list = sorted(glob.glob(os.path.join(base_dir, site, "*", "*.png")))
w = math.ceil(len(floor_image_list) / 2)
h = math.ceil(len(floor_image_list) / w)
floor_list = [os.path.basename(os.path.dirname(floor_image)) for floor_image in floor_image_list]

fig = plt.figure(figsize=(20, 15))
for i, floor in enumerate(floor_list):
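    plt.subplot(h, w, i + 1)
    floor_image, permitted_mask = extract_permitted_area_from_map(site, floor)
    # RGBA overlay: opaque red where the mask is 1, transparent elsewhere
    permitted_area = np.zeros_like(floor_image)
    permitted_area[:,:,0][permitted_mask == 1] = 255
    permitted_area[:,:,3][permitted_mask == 1] = 255
    plt.imshow(cv2.cvtColor(floor_image, cv2.COLOR_BGRA2RGBA)) # OpenCV loads BGRA, matplotlib expects RGBA
    plt.imshow(permitted_area)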
    plt.axis("off")
    plt.title(floor)
plt.tight_layout()
plt.show()

Thank you for reading :)