## <div class="alert alert-block alert-warning">This notebook contains HTML visualizations that take a few seconds to load, so please wait! ⌛</div>

# Understand the dataset by running the processing scripts

## Contents

1. [Introduction](#1)
1. [Prepare libraries](#2)
1. [Setting](#3)
1. [Run the tool](#4)
1. [Check output data](#5)
1. [How does the script extract data from a path trace file?](#6)

<a id="1"></a> <br>
# <div class="alert alert-block alert-info">Introduction</div>

## About this notebook

The [Data page](https://www.kaggle.com/c/indoor-location-navigation/data) of the competition introduces the [indoor-location-competition-20](https://github.com/location-competition/indoor-location-competition-20) repository for processing and visualizing data collected with Android smartphones.

This repository contains sample data and code for Indoor Location Competition 2.0.

I have made this sample code work in a Kaggle notebook, so I'll share the result. I also aim to get a better understanding of the given data by setting up and running the tool.

### <u>Note</u>

indoor-location-competition-20 is under MIT License. Copyright (c) 2017-2020 XYZ10, Inc. https://dangwu.com/

For details, please check the license [here](https://github.com/location-competition/indoor-location-competition-20/blob/master/LICENSE).

## About Indoor Location Competition 2.0

[Indoor Location Competition 2.0](https://location20.xyz10.com/) is a continuation of the Microsoft Indoor Location Competition.

The webinar video of Indoor Location Competition 2.0 is shown below.

In [None]:
from IPython.display import YouTubeVideo
YouTubeVideo('xt3OzMC-XMU')

## About sample processing scripts

The indoor trace data processing scripts have been published in the [indoor-location-competition-20](https://github.com/location-competition/indoor-location-competition-20) repository. These scripts provide the following six functions.

- Ground truth location visualization

- Sample step detection and visualization

- Geo-magnetic field intensity visualization

- WiFi RSSI ([Received signal strength indication](https://en.wikipedia.org/wiki/Received_signal_strength_indication)) heatmap generation

- iBeacon RSSI heatmap generation

- WiFi SSID counts visualization

If you want to see sample visualizations, please jump [here](#5)!

-------------------------------

<a id="2"></a> <br>
# <div class="alert alert-block alert-info">Prepare libraries</div>

Load the libraries and clone the tool.

In [None]:
import glob
import json
import os
from pathlib import Path
import sys


from IPython.display import IFrame
import numpy as np
import pandas as pd

In [None]:
!git clone -b develop https://github.com/tasotasoso/indoor-location-competition-20.git

This is what the repository looks like.

In [None]:
!ls indoor-location-competition-20

By running main.py, we can process the data. However, since the data path is hardcoded and main.py asks for standard input, I redefine only the contents of main.py in this Jupyter notebook.

Add indoor-location-competition-20 to the module search path and import the modules other than main.py.

In [None]:
sys.path.append('indoor-location-competition-20')

In [None]:
from compute_f import split_ts_seq, compute_step_positions
from io_f import read_data_file
from visualize_f import visualize_trajectory, visualize_heatmap, save_figure_to_html

I copied the functions in main.py. If you are interested, please open the cell or check the [repository](https://github.com/location-competition/indoor-location-competition-20).

In [None]:
def calibrate_magnetic_wifi_ibeacon_to_position(path_file_list):
    mwi_datas = {}
    for path_filename in path_file_list:
        print(f'Processing {path_filename}...')

        path_datas = read_data_file(path_filename)
        acce_datas = path_datas.acce
        magn_datas = path_datas.magn
        ahrs_datas = path_datas.ahrs
        wifi_datas = path_datas.wifi
        ibeacon_datas = path_datas.ibeacon
        posi_datas = path_datas.waypoint

        step_positions = compute_step_positions(acce_datas, ahrs_datas, posi_datas)
        # visualize_trajectory(posi_datas[:, 1:3], floor_plan_filename, width_meter, height_meter, title='Ground Truth', show=True)
        # visualize_trajectory(step_positions[:, 1:3], floor_plan_filename, width_meter, height_meter, title='Step Position', show=True)

        if wifi_datas.size != 0:
            sep_tss = np.unique(wifi_datas[:, 0].astype(float))
            wifi_datas_list = split_ts_seq(wifi_datas, sep_tss)
            for wifi_ds in wifi_datas_list:
                diff = np.abs(step_positions[:, 0] - float(wifi_ds[0, 0]))
                index = np.argmin(diff)
                target_xy_key = tuple(step_positions[index, 1:3])
                if target_xy_key in mwi_datas:
                    mwi_datas[target_xy_key]['wifi'] = np.append(mwi_datas[target_xy_key]['wifi'], wifi_ds, axis=0)
                else:
                    mwi_datas[target_xy_key] = {
                        'magnetic': np.zeros((0, 4)),
                        'wifi': wifi_ds,
                        'ibeacon': np.zeros((0, 3))
                    }

        if ibeacon_datas.size != 0:
            sep_tss = np.unique(ibeacon_datas[:, 0].astype(float))
            ibeacon_datas_list = split_ts_seq(ibeacon_datas, sep_tss)
            for ibeacon_ds in ibeacon_datas_list:
                diff = np.abs(step_positions[:, 0] - float(ibeacon_ds[0, 0]))
                index = np.argmin(diff)
                target_xy_key = tuple(step_positions[index, 1:3])
                if target_xy_key in mwi_datas:
                    mwi_datas[target_xy_key]['ibeacon'] = np.append(mwi_datas[target_xy_key]['ibeacon'], ibeacon_ds, axis=0)
                else:
                    mwi_datas[target_xy_key] = {
                        'magnetic': np.zeros((0, 4)),
                        'wifi': np.zeros((0, 5)),
                        'ibeacon': ibeacon_ds
                    }

        sep_tss = np.unique(magn_datas[:, 0].astype(float))
        magn_datas_list = split_ts_seq(magn_datas, sep_tss)
        for magn_ds in magn_datas_list:
            diff = np.abs(step_positions[:, 0] - float(magn_ds[0, 0]))
            index = np.argmin(diff)
            target_xy_key = tuple(step_positions[index, 1:3])
            if target_xy_key in mwi_datas:
                mwi_datas[target_xy_key]['magnetic'] = np.append(mwi_datas[target_xy_key]['magnetic'], magn_ds, axis=0)
            else:
                mwi_datas[target_xy_key] = {
                    'magnetic': magn_ds,
                    'wifi': np.zeros((0, 5)),
                    'ibeacon': np.zeros((0, 3))
                }

    return mwi_datas


def extract_magnetic_strength(mwi_datas):
    magnetic_strength = {}
    for position_key in mwi_datas:
        # print(f'Position: {position_key}')

        magnetic_data = mwi_datas[position_key]['magnetic']
        magnetic_s = np.mean(np.sqrt(np.sum(magnetic_data[:, 1:4] ** 2, axis=1)))
        magnetic_strength[position_key] = magnetic_s

    return magnetic_strength


def extract_wifi_rssi(mwi_datas):
    wifi_rssi = {}
    for position_key in mwi_datas:
        # print(f'Position: {position_key}')

        wifi_data = mwi_datas[position_key]['wifi']
        for wifi_d in wifi_data:
            bssid = wifi_d[2]
            rssi = int(wifi_d[3])

            if bssid in wifi_rssi:
                position_rssi = wifi_rssi[bssid]
                if position_key in position_rssi:
                    old_rssi = position_rssi[position_key][0]
                    old_count = position_rssi[position_key][1]
                    position_rssi[position_key][0] = (old_rssi * old_count + rssi) / (old_count + 1)
                    position_rssi[position_key][1] = old_count + 1
                else:
                    position_rssi[position_key] = np.array([rssi, 1])
            else:
                position_rssi = {}
                position_rssi[position_key] = np.array([rssi, 1])

            wifi_rssi[bssid] = position_rssi

    return wifi_rssi


def extract_ibeacon_rssi(mwi_datas):
    ibeacon_rssi = {}
    for position_key in mwi_datas:
        # print(f'Position: {position_key}')

        ibeacon_data = mwi_datas[position_key]['ibeacon']
        for ibeacon_d in ibeacon_data:
            ummid = ibeacon_d[1]
            rssi = int(ibeacon_d[2])

            if ummid in ibeacon_rssi:
                position_rssi = ibeacon_rssi[ummid]
                if position_key in position_rssi:
                    old_rssi = position_rssi[position_key][0]
                    old_count = position_rssi[position_key][1]
                    position_rssi[position_key][0] = (old_rssi * old_count + rssi) / (old_count + 1)
                    position_rssi[position_key][1] = old_count + 1
                else:
                    position_rssi[position_key] = np.array([rssi, 1])
            else:
                position_rssi = {}
                position_rssi[position_key] = np.array([rssi, 1])

            ibeacon_rssi[ummid] = position_rssi

    return ibeacon_rssi


def extract_wifi_count(mwi_datas):
    wifi_counts = {}
    for position_key in mwi_datas:
        # print(f'Position: {position_key}')

        wifi_data = mwi_datas[position_key]['wifi']
        count = np.unique(wifi_data[:, 2]).shape[0]
        wifi_counts[position_key] = count

    return wifi_counts
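
The core of `calibrate_magnetic_wifi_ibeacon_to_position` is a nearest-timestamp lookup: each group of WiFi, iBeacon, or magnetic readings is assigned to the step position whose timestamp is closest. Here is a minimal sketch of that lookup on made-up data. The helper name `nearest_step_xy` and all values are hypothetical and only illustrate the idea.

```python
import numpy as np

# Hypothetical step positions: columns are [timestamp_ms, x, y].
step_positions = np.array([
    [1000.0, 0.0, 0.0],
    [2000.0, 1.0, 0.5],
    [3000.0, 2.0, 1.0],
])

def nearest_step_xy(step_positions, timestamp):
    """Return the (x, y) of the step whose timestamp is closest to `timestamp`."""
    diff = np.abs(step_positions[:, 0] - float(timestamp))
    index = np.argmin(diff)  # on a tie, argmin keeps the first (earliest) step
    return tuple(step_positions[index, 1:3])

# WiFi timestamps are stored as strings, hence the float() conversion above.
wifi_scan_ts = "2300"
key = nearest_step_xy(step_positions, wifi_scan_ts)
print(key)
```

The resulting `(x, y)` tuple is what the function uses as the dictionary key in `mwi_datas`, so all readings that map to the same step end up under the same key.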

<a id="3"></a> <br>
# <div class="alert alert-block alert-warning">Setting</div> 

To run the sample code, set up the data path and other settings. Also, let's check the directory structure at the same time.

## <u>Note</u>

I'll mark the cells where you can change the analysis target by changing the values with an orange sentence <span style="color: orange; ">like this</span>!

First, in order to process the data, we have to specify four values.

- Directory for metadata

- `state` (site ID under the train and test directories)

- `floor` (floor ID under the site directory)

- Directory for data (train or test)

Since these values are hardcoded in the original main.py, I define them as variables here.

### <span style="color: orange; ">Set site & floor you want to analyze↓↓↓</span>

In [None]:
#User setting
data_root = "../input/indoor-location-navigation/train"
metadata_root = "../input/indoor-location-navigation/metadata"
state = "5a0546857ecc773753327266"
floor = "B1"

In [None]:
def list_files(startpath):
    """Show the directory structure recursively, like the tree command.
    Based on https://stackoverflow.com/questions/9727673/list-directory-tree-structure-in-python
    """
    for root, dirs, files in os.walk(startpath):
        level = root.replace(startpath, '').count(os.sep)
        indent = ' ' * 4 * (level)
        print(f'{indent}{os.path.basename(root)}/')
        subindent = ' ' * 4 * (level + 1)
        for f in files:
            print(f'{subindent}{f}')

Let's check the directory structure so that we can freely change the paths set above.

Start with the indoor trace data. The root of the data is either train or test. Under the train directory, the layout is "site/floor/path trace file". Let's try listing the files under the "5cd56c0ce2acfd2d33b6ab27" site.

In [None]:
# list_files is the utility function defined in the previous hidden cell.
list_files("../input/indoor-location-navigation/train/5cd56c0ce2acfd2d33b6ab27")

The dataset for this competition consists of dense indoor signatures of WiFi, geomagnetic field, iBeacons etc., as well as ground truth (waypoint) collected from hundreds of buildings in Chinese cities. The data found in path trace files (*.txt) corresponds to an indoor path between position p_1 and p_2 walked by a site-surveyor.

In other words, the data listed above is the path trace data collected at site 5cd56c0ce2acfd2d33b6ab27 in China.

During the walk, an Android smartphone is held flat in front of the surveyor's body, and a sensor data recording app is running on the device to collect IMU (accelerometer, gyroscope) and geomagnetic field (magnetometer) readings, as well as WiFi and Bluetooth iBeacon scanning results.

Let's check a path trace data file.

In [None]:
!head ../input/indoor-location-navigation/train/5cd56c0ce2acfd2d33b6ab27/B1/5d09a625bd54340008acddb9.txt -n 15

The lines with '#' at the beginning are metadata. They contain the location where the measurement was taken, the Android smartphone model, the OS version, etc.

Rows without '#' are the measured data. The first column is the Unix time in milliseconds. The second column is the type of data. For details on the sensor types in the second column, see the [Android Developers documentation](https://developer.android.com/guide/topics/sensors/sensors_overview). In particular, TYPE_WAYPOINT is our target and is not included in the test data. ~~There is only one TYPE_WAYPOINT listed per .txt file.~~ Some path trace files contain several TYPE_WAYPOINT lines with different timestamps.

The third and subsequent columns are the measured values. See [here](https://github.com/tasotasoso/indoor-location-competition-20#sample-data) for details.
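
Since the first column is Unix time in milliseconds, it can be converted to absolute time or to seconds elapsed since the start of the walk. The following is a small sketch using pandas. The timestamp values are made up for illustration and are not taken from the dataset.

```python
import pandas as pd

# Made-up timestamps in Unix milliseconds, as in the first column of a trace file.
timestamps_ms = pd.Series([1_600_000_000_000, 1_600_000_000_020, 1_600_000_001_000])

# Absolute time (UTC) and seconds elapsed since the first sample.
as_datetime = pd.to_datetime(timestamps_ms, unit="ms", utc=True)
elapsed_s = (timestamps_ms - timestamps_ms.iloc[0]) / 1000.0

print(as_datetime.iloc[0])
print(elapsed_s.tolist())
```

Elapsed seconds are often easier to work with than raw timestamps when comparing sensor streams within one path trace file.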

The metadata follows the same site/floor directory layout.

For each floor, there are the following three metadata files.
- floor_image.png
- floor_info.json
- geojson_map.json

In [None]:
list_files(metadata_root + "/5cd56c0ce2acfd2d33b6ab27")

In [None]:
#The above settings should allow the data to be loaded.
floor_data_dir = '/'.join([metadata_root, state, floor])
path_data_dir = '/'.join([data_root, state, floor])
floor_plan_filename = '/'.join([floor_data_dir, 'floor_image.png'])
floor_info_filename = '/'.join([floor_data_dir, 'floor_info.json'])

Next, I set where the processing results will be written.

In [None]:
#Output setting
save_dir = '/'.join(['./output', state, floor])
path_image_save_dir = '/'.join([save_dir, 'path_images'])
step_position_image_save_dir = save_dir
magn_image_save_dir = save_dir
wifi_image_save_dir = '/'.join([save_dir, 'wifi_images'])
ibeacon_image_save_dir = '/'.join([save_dir, 'ibeacon_images'])
wifi_count_image_save_dir = save_dir

In [None]:
#If the directories do not exist, they should be created automatically.
Path(path_image_save_dir).mkdir(parents=True, exist_ok=True)
Path(magn_image_save_dir).mkdir(parents=True, exist_ok=True)
Path(wifi_image_save_dir).mkdir(parents=True, exist_ok=True)
Path(ibeacon_image_save_dir).mkdir(parents=True, exist_ok=True)

<a id="4"></a> <br>
# <div class="alert alert-block alert-warning">Run the tool</div> 

OK, we're ready to go. Let's proceed with the process corresponding to main.py. The code is somewhat long, so I have hidden some cells. Please expand and check them if necessary.

## Create ground truth location, step detection and geo-magnetic field intensity visualization

In [None]:
with open(floor_info_filename) as f:
    floor_info = json.load(f)
width_meter = floor_info["map_info"]["width"]
height_meter = floor_info["map_info"]["height"]
path_filenames = list(Path(path_data_dir).resolve().glob("*.txt"))

# 1. visualize ground truth positions
print('Visualizing ground truth positions...')
for path_filename in path_filenames:
    print(f'Processing file: {path_filename}...')
    path_data = read_data_file(path_filename)
    path_id = path_filename.name.split(".")[0]
    fig = visualize_trajectory(path_data.waypoint[:, 1:3], floor_plan_filename, width_meter, height_meter, title=path_id, show=False)
    html_filename = f'{path_image_save_dir}/{path_id}.html'
    html_filename = str(Path(html_filename).resolve())
    save_figure_to_html(fig, html_filename)
    
# 2. visualize step position, magnetic, wifi, ibeacon
print('Visualizing more information...')
mwi_datas = calibrate_magnetic_wifi_ibeacon_to_position(path_filenames)
step_positions = np.array(list(mwi_datas.keys()))
fig = visualize_trajectory(step_positions, floor_plan_filename, width_meter, height_meter, mode='markers', title='Step Position', show=True)
html_filename = f'{step_position_image_save_dir}/step_position.html'
html_filename = str(Path(html_filename).resolve())
save_figure_to_html(fig, html_filename)
magnetic_strength = extract_magnetic_strength(mwi_datas)
heat_positions = np.array(list(magnetic_strength.keys()))
heat_values = np.array(list(magnetic_strength.values()))
fig = visualize_heatmap(heat_positions, heat_values, floor_plan_filename, width_meter, height_meter, colorbar_title='mu tesla', title='Magnetic Strength', show=True)
html_filename = f'{magn_image_save_dir}/magnetic_strength.html'
html_filename = str(Path(html_filename).resolve())
save_figure_to_html(fig, html_filename)
wifi_rssi = extract_wifi_rssi(mwi_datas)
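
The magnetic strength plotted above is what `extract_magnetic_strength` computes for each position: the Euclidean norm of every (x, y, z) magnetometer sample, averaged over the samples assigned to that position. A minimal sketch with made-up readings:

```python
import numpy as np

# Hypothetical magnetometer rows: [timestamp_ms, x, y, z] in microtesla.
magnetic_data = np.array([
    [1000.0, 3.0, 4.0, 0.0],
    [1020.0, 0.0, 6.0, 8.0],
])

# Per-sample Euclidean norm of the (x, y, z) vector, then the mean over samples.
norms = np.sqrt(np.sum(magnetic_data[:, 1:4] ** 2, axis=1))
mean_strength = np.mean(norms)
print(norms, mean_strength)
```

Because only the norm is used, the value does not depend on how the phone was oriented when the sample was taken.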

## Create Wifi RSSI heatmap

In [None]:
print(f'This floor has {len(wifi_rssi.keys())} wifi aps')
ten_wifi_bssids = list(wifi_rssi.keys())[0:10]
print('Example 10 wifi ap bssids:\n')
for bssid in ten_wifi_bssids:
    print(bssid)
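
The first ten keys are not necessarily the most informative access points. As one possible way to choose a BSSID, the sketch below ranks access points by the number of positions at which they were observed. It runs on a hypothetical miniature of the `wifi_rssi` structure (`{bssid: {(x, y): [mean_rssi, count]}}`). The helper `rank_by_coverage` and the BSSID names are made up.

```python
import numpy as np

# Hypothetical miniature of the structure returned by extract_wifi_rssi:
# {bssid: {(x, y): array([mean_rssi, count])}}
wifi_rssi_example = {
    "bssid_a": {(0.0, 0.0): np.array([-60, 3]), (1.0, 0.5): np.array([-72, 1])},
    "bssid_b": {(0.0, 0.0): np.array([-80, 2])},
    "bssid_c": {(0.0, 0.0): np.array([-55, 4]), (1.0, 0.5): np.array([-58, 2]),
                (2.0, 1.0): np.array([-66, 1])},
}

def rank_by_coverage(wifi_rssi):
    """Sort BSSIDs by the number of positions where they were observed."""
    return sorted(wifi_rssi, key=lambda bssid: len(wifi_rssi[bssid]), reverse=True)

ranked = rank_by_coverage(wifi_rssi_example)
print(ranked)
```

Calling `rank_by_coverage(wifi_rssi)` on the real dictionary should work the same way, and a widely observed BSSID tends to give a denser heatmap.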

### <span style="color: orange; ">We can choose wifi_bssids↓↓↓</span>

In [None]:
target_wifi = "db01605eac3f33540038bd9722aba25774871d43"

In [None]:
heat_positions = np.array(list(wifi_rssi[target_wifi].keys()))
heat_values = np.array(list(wifi_rssi[target_wifi].values()))[:, 0]
fig = visualize_heatmap(heat_positions, heat_values, floor_plan_filename, width_meter, height_meter, colorbar_title='dBm', title=f'Wifi: {target_wifi} RSSI', show=False)
html_filename = f'{wifi_image_save_dir}/{target_wifi.replace(":", "-")}.html'
html_filename = str(Path(html_filename).resolve())
save_figure_to_html(fig, html_filename)
ibeacon_rssi = extract_ibeacon_rssi(mwi_datas)
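
The heat values come from a running mean that `extract_wifi_rssi` and `extract_ibeacon_rssi` keep for each position as a `[mean, count]` pair. The sketch below reproduces that update rule on made-up readings. One detail worth knowing: the copied functions create the pair with `np.array([rssi, 1])` from an integer `rssi`, which gives an integer array, so NumPy drops the fractional part of the mean at every update. The float variant here is my own assumption about what an exact running mean would look like, not the upstream code.

```python
import numpy as np

def update_running_mean(state, rssi):
    """Fold one RSSI reading into a [mean, count] pair, in place."""
    old_mean, old_count = state
    state[0] = (old_mean * old_count + rssi) / (old_count + 1)
    state[1] = old_count + 1
    return state

readings = [-70, -75, -71]  # made-up RSSI values in dBm

# Float state: the running mean matches the batch mean.
float_state = np.array([readings[0], 1], dtype=float)
for rssi in readings[1:]:
    update_running_mean(float_state, rssi)

# Integer state, as created by np.array([rssi, 1]) with an int rssi:
# each assignment truncates the fractional part of the mean.
int_state = np.array([readings[0], 1])
for rssi in readings[1:]:
    update_running_mean(int_state, rssi)

print(float_state[0], np.mean(readings), int_state[0])
```

The difference is at most about 1 dB per position, so it hardly changes the look of the heatmap, but it matters if the values are reused as features.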

## Create iBeacon RSSI heatmap and Wifi Count heatmap

In [None]:
print(f'This floor has {len(ibeacon_rssi.keys())} ibeacons')
ten_ibeacon_ummids = list(ibeacon_rssi.keys())[0:10]
print('Example 10 ibeacon UUID_MajorID_MinorIDs:\n')
for ummid in ten_ibeacon_ummids:
    print(ummid)

### <span style="color: orange; ">We can choose ibeacon_ummid↓↓↓</span>

In [None]:
target_ibeacon = "89cb11b04122cef23388b0da06bd426c1f48a9b5_cfc84f0752adc96b489f71195d91a946c5f6d3e8_8159618423dfa22f1ca0b62543e2f18eef630ce8"

In [None]:
heat_positions = np.array(list(ibeacon_rssi[target_ibeacon].keys()))
heat_values = np.array(list(ibeacon_rssi[target_ibeacon].values()))[:, 0]
fig = visualize_heatmap(heat_positions, heat_values, floor_plan_filename, width_meter, height_meter, colorbar_title='dBm', title=f'iBeacon: {target_ibeacon} RSSI', show=False)
html_filename = f'{ibeacon_image_save_dir}/{target_ibeacon}.html'
html_filename = str(Path(html_filename).resolve())
save_figure_to_html(fig, html_filename)
wifi_counts = extract_wifi_count(mwi_datas)
heat_positions = np.array(list(wifi_counts.keys()))
heat_values = np.array(list(wifi_counts.values()))

In [None]:
# filter out positions where no wifi was detected
mask = heat_values != 0
heat_positions = heat_positions[mask]
heat_values = heat_values[mask]
fig = visualize_heatmap(heat_positions, heat_values, floor_plan_filename, width_meter, height_meter, colorbar_title='number', title='Wifi Count', show=False)
html_filename = f'{wifi_count_image_save_dir}/wifi_count.html'
html_filename = str(Path(html_filename).resolve())
save_figure_to_html(fig, html_filename)

<a id="5"></a> <br>
# <div class="alert alert-block alert-success">Check output data</div>

We got some processed output data.

- path_images: Ground truth location visualization

- step_position.html: Sample step detection and visualization

- magnetic_strength.html: Geo-magnetic field intensity visualization

- wifi_images: WiFi RSSI heatmap generation

- ibeacon_images: iBeacon RSSI heatmap generation

- wifi_count.html: WiFi SSID counts visualization

In [None]:
!ls ./output/5a0546857ecc773753327266/B1/

Let's look at each output.

In [None]:
!ls ./output/5a0546857ecc773753327266/B1/path_images	    

In [None]:
IFrame(src='./output/5a0546857ecc773753327266/B1/path_images/5e158ee11506f2000638fd0f.html', width=950, height=950)

In [None]:
IFrame(src='./output/5a0546857ecc773753327266/B1/step_position.html', width=950, height=950)

In [None]:
IFrame(src='./output/5a0546857ecc773753327266/B1/magnetic_strength.html', width=950, height=950)

In [None]:
!ls ./output/5a0546857ecc773753327266/B1/wifi_images

In [None]:
IFrame(src='./output/5a0546857ecc773753327266/B1/wifi_images/db01605eac3f33540038bd9722aba25774871d43.html', width=950, height=950)

In [None]:
!ls ./output/5a0546857ecc773753327266/B1/ibeacon_images

In [None]:
IFrame(src='./output/5a0546857ecc773753327266/B1/ibeacon_images/89cb11b04122cef23388b0da06bd426c1f48a9b5_cfc84f0752adc96b489f71195d91a946c5f6d3e8_8159618423dfa22f1ca0b62543e2f18eef630ce8.html', width=950, height=950)

In [None]:
IFrame(src='./output/5a0546857ecc773753327266/B1/wifi_count.html', width=950, height=950)

<a id="6"></a> <br>
# <div class="alert alert-block alert-success">How does the script extract data from a path trace file?</div>

Knowing this is very helpful for preprocessing the data, so I will describe it here. The script uses the read_data_file function of the io_f.py module to parse path trace files.

In [None]:
from io_f import read_data_file

This function reads a given path trace file and returns a dataclass instance with the following members. By accessing these members, we can get the data of each path trace file as NumPy arrays.

In [None]:
"""
@dataclass
class ReadData:
    acce: np.ndarray
    acce_uncali: np.ndarray
    gyro: np.ndarray
    gyro_uncali: np.ndarray
    magn: np.ndarray
    magn_uncali: np.ndarray
    ahrs: np.ndarray
    wifi: np.ndarray
    ibeacon: np.ndarray
    waypoint: np.ndarray
"""

Let's parse one path trace file.

In [None]:
sample_path_trace_file_path = "../input/indoor-location-navigation/train/5a0546857ecc773753327266/B1/5e15730aa280850006f3d005.txt"
data = read_data_file(sample_path_trace_file_path)
data

We can access extracted data like this.

In [None]:
data.acce

The script reads each file and splits every line on \t, as follows.

In [None]:
#From https://github.com/tasotasoso/indoor-location-competition-20/blob/master/io_f.py

"""
        line_data = line_data.split('\t')

        if line_data[1] == 'TYPE_ACCELEROMETER':
            acce.append([int(line_data[0]), float(line_data[2]), float(line_data[3]), float(line_data[4])])
            continue

        if line_data[1] == 'TYPE_ACCELEROMETER_UNCALIBRATED':
            acce_uncali.append([int(line_data[0]), float(line_data[2]), float(line_data[3]), float(line_data[4])])
            continue

        if line_data[1] == 'TYPE_GYROSCOPE':
            gyro.append([int(line_data[0]), float(line_data[2]), float(line_data[3]), float(line_data[4])])
            continue

        if line_data[1] == 'TYPE_GYROSCOPE_UNCALIBRATED':
            gyro_uncali.append([int(line_data[0]), float(line_data[2]), float(line_data[3]), float(line_data[4])])
            continue

        if line_data[1] == 'TYPE_MAGNETIC_FIELD':
            magn.append([int(line_data[0]), float(line_data[2]), float(line_data[3]), float(line_data[4])])
            continue

        if line_data[1] == 'TYPE_MAGNETIC_FIELD_UNCALIBRATED':
            magn_uncali.append([int(line_data[0]), float(line_data[2]), float(line_data[3]), float(line_data[4])])
            continue

        if line_data[1] == 'TYPE_ROTATION_VECTOR':
            ahrs.append([int(line_data[0]), float(line_data[2]), float(line_data[3]), float(line_data[4])])
            continue

        if line_data[1] == 'TYPE_WIFI':
            sys_ts = line_data[0]
            ssid = line_data[2]
            bssid = line_data[3]
            rssi = line_data[4]
            lastseen_ts = line_data[6]
            wifi_data = [sys_ts, ssid, bssid, rssi, lastseen_ts]
            wifi.append(wifi_data)
            continue

        if line_data[1] == 'TYPE_BEACON':
            ts = line_data[0]
            uuid = line_data[2]
            major = line_data[3]
            minor = line_data[4]
            rssi = line_data[6]
            ibeacon_data = [ts, '_'.join([uuid, major, minor]), rssi]
            ibeacon.append(ibeacon_data)
            continue

        if line_data[1] == 'TYPE_WAYPOINT':
            waypoint.append([int(line_data[0]), float(line_data[2]), float(line_data[3])])
"""
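
The same idea can be tried without the repository code. The following sketch groups a few made-up lines by their type column, assuming that lines starting with '#' are metadata and are skipped. The sample lines, including the header line, and the helper `group_by_type` are hypothetical.

```python
from collections import defaultdict

# Made-up lines in the layout described above: '#' header lines, then
# tab-separated records of timestamp, type, and values.
sample_lines = [
    "#\tstartTime:1000",
    "1000\tTYPE_WAYPOINT\t10.5\t20.25",
    "1010\tTYPE_ACCELEROMETER\t0.1\t0.2\t9.8\t2",
    "1030\tTYPE_ACCELEROMETER\t0.0\t0.3\t9.7\t2",
]

def group_by_type(lines):
    """Group records by their type column, skipping '#' metadata lines."""
    records = defaultdict(list)
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        fields = line.split("\t")
        records[fields[1]].append(fields)
    return records

records = group_by_type(sample_lines)

# Convert waypoint fields the same way as the quoted code: int timestamp, float x and y.
waypoint = [[int(f[0]), float(f[2]), float(f[3])] for f in records["TYPE_WAYPOINT"]]
print({k: len(v) for k, v in records.items()}, waypoint)
```

Grouping first and converting later makes it easy to count how many records of each type a file contains before deciding which ones to load.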

For example, we can extract all data types and turn them into pandas DataFrames in the following way.

In [None]:
train_acce = pd.DataFrame()
train_acce_uncali = pd.DataFrame()
train_gyro = pd.DataFrame()
train_gyro_uncali = pd.DataFrame()
train_magn = pd.DataFrame()
train_magn_uncali = pd.DataFrame()
train_ahrs = pd.DataFrame()
train_wifi = pd.DataFrame()
train_ibeacon = pd.DataFrame()
train_waypoint = pd.DataFrame()

dfs_train = {
    "acce": train_acce,
    "acce_uncali": train_acce_uncali,
    "gyro": train_gyro,
    "gyro_uncali": train_gyro_uncali,
    "magn": train_magn,
    "magn_uncali": train_magn_uncali,
    "ahrs": train_ahrs,
    "wifi": train_wifi,
    "ibeacon": train_ibeacon,
    "waypoint": train_waypoint 
}

In [None]:
def concat_data(dfs, file_path):
    """Extract data from a path trace file and concatenate it to the given pandas DataFrames.
    """
    data = read_data_file(file_path)
    file_path_parsed =  file_path.split("/")
    floor = file_path_parsed[-2]
    trace_file = file_path_parsed[-1].split(".")[0]    

    if data.acce.size > 0:        
        df_tmp = pd.DataFrame(data.acce, columns=("time_stamp", "x", "y", "z"))
        df_tmp["floor"] = floor
        df_tmp["trace_file"] = trace_file
        dfs["acce"] = pd.concat([dfs["acce"], df_tmp])
        
    if data.acce_uncali.size > 0:                              
        df_tmp = pd.DataFrame(data.acce_uncali, columns=("time_stamp", "x", "y", "z"))
        df_tmp["floor"] = floor
        df_tmp["trace_file"] = trace_file
        dfs["acce_uncali"] = pd.concat([dfs["acce_uncali"], df_tmp])
    
    if data.gyro.size > 0:                              
        df_tmp =  pd.DataFrame(data.gyro, columns=("time_stamp", "x", "y", "z"))
        df_tmp["floor"] = floor
        df_tmp["trace_file"] = trace_file
        dfs["gyro"] = pd.concat([dfs["gyro"], df_tmp])
    
    if data.gyro_uncali.size > 0:                              
        df_tmp = pd.DataFrame(data.gyro_uncali, columns=("time_stamp", "x", "y", "z"))
        df_tmp["floor"] = floor
        df_tmp["trace_file"] = trace_file
        dfs["gyro_uncali"] = pd.concat([dfs["gyro_uncali"], df_tmp])
    
    if data.magn.size > 0:                              
        df_tmp = pd.DataFrame(data.magn, columns=("time_stamp", "x", "y", "z"))
        df_tmp["floor"] = floor
        df_tmp["trace_file"] = trace_file
        dfs["magn"] = pd.concat([dfs["magn"], df_tmp])

    if data.magn_uncali.size > 0:    
        df_tmp = pd.DataFrame(data.magn_uncali, columns=("time_stamp", "x", "y", "z"))
        df_tmp["floor"] = floor
        df_tmp["trace_file"] = trace_file
        dfs["magn_uncali"] = pd.concat([dfs["magn_uncali"], df_tmp])

    if data.ahrs.size > 0:    
        df_tmp = pd.DataFrame(data.ahrs, columns=("time_stamp", "x", "y", "z"))
        df_tmp["floor"] = floor
        df_tmp["trace_file"] = trace_file
        dfs["ahrs"] = pd.concat([dfs["ahrs"], df_tmp])
    
    if data.wifi.size > 0:    
        df_tmp = pd.DataFrame(data.wifi, columns=("time_stamp", "ssid", "bssid", "rssi","lastseen_ts"))
        df_tmp["floor"] = floor
        df_tmp["trace_file"] = trace_file
        dfs["wifi"] = pd.concat([dfs["wifi"], df_tmp])
 
    if data.ibeacon.size > 0:    
        df_tmp = pd.DataFrame(data.ibeacon, columns=("time_stamp", "uuid_major_minor", "rssi"))
        df_tmp["floor"] = floor
        df_tmp["trace_file"] = trace_file
        dfs["ibeacon"] = pd.concat([dfs["ibeacon"], df_tmp])

    if data.waypoint.size > 0:    
        df_tmp = pd.DataFrame(data.waypoint , columns=("time_stamp", "x", "y"))
        df_tmp["floor"] = floor
        df_tmp["trace_file"] = trace_file
        dfs["waypoint"] = pd.concat([dfs["waypoint"], df_tmp])

In [None]:
concat_data(dfs_train, sample_path_trace_file_path)

In [None]:
dfs_train["acce"]
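
`concat_data` repeats the same four lines for every data type. As an alternative sketch, the column names can be kept in one dictionary and the frames built in a loop. The names `COLUMNS` and `to_frames` are my own, and the `SimpleNamespace` object is only a stand-in for the result of `read_data_file`, filled with made-up values.

```python
from types import SimpleNamespace

import numpy as np
import pandas as pd

# Column names per attribute, matching the ones used in concat_data.
XYZ = ("time_stamp", "x", "y", "z")
COLUMNS = {
    "acce": XYZ, "acce_uncali": XYZ, "gyro": XYZ, "gyro_uncali": XYZ,
    "magn": XYZ, "magn_uncali": XYZ, "ahrs": XYZ,
    "wifi": ("time_stamp", "ssid", "bssid", "rssi", "lastseen_ts"),
    "ibeacon": ("time_stamp", "uuid_major_minor", "rssi"),
    "waypoint": ("time_stamp", "x", "y"),
}

def to_frames(data, floor, trace_file):
    """Build one DataFrame per non-empty attribute of a parsed trace."""
    frames = {}
    for name, columns in COLUMNS.items():
        values = getattr(data, name)
        if values.size > 0:
            frames[name] = pd.DataFrame(values, columns=list(columns)).assign(
                floor=floor, trace_file=trace_file)
    return frames

# Stand-in for the object returned by read_data_file (made-up values).
empty = np.zeros((0, 4))
fake = SimpleNamespace(
    acce=np.array([[1000, 0.1, 0.2, 9.8]]), acce_uncali=empty, gyro=empty,
    gyro_uncali=empty, magn=empty, magn_uncali=empty, ahrs=empty,
    wifi=np.zeros((0, 5)), ibeacon=np.zeros((0, 3)),
    waypoint=np.array([[1000, 10.5, 20.25]]),
)
frames = to_frames(fake, floor="B1", trace_file="example")
print(sorted(frames), frames["acce"].columns.tolist())
```

When processing many files, collecting the per-file frames in a list and calling `pd.concat` once at the end usually avoids the repeated copying that happens when concatenating inside the loop.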