# 1. Problem Description & Motivation

Most robots are composed of 3 main parts:
- The controller ‐ also known as the "brain", which is run by a computer program.
- Mechanical parts ‐ motors, pistons, grippers, wheels, and gears that make the robot move, grab, turn, and lift.
- Sensors ‐ to tell the robot about its surroundings.

For a robot to learn and carry out a task correctly, it needs to rely on and integrate all 3 components effectively.
In our problem, we help robots recognize the floor surface they’re standing on using data collected from Inertial Measurement Units (IMU sensors).
*IMU sensor data* is collected while driving a small mobile robot over *different floor surfaces* on the university premises. The task is to **predict which one of the nine floor types (e.g. carpet, tiles, concrete) the robot is on, using sensor data such as acceleration and velocity**.

## What is IMU?
IMU stands for Inertial Measurement Unit, an electronic device that measures and reports a body's specific force, angular rate, and sometimes the magnetic field surrounding the body, using a combination of [accelerometers](https://en.wikipedia.org/wiki/Accelerometer) and [gyroscopes](https://en.wikipedia.org/wiki/Gyroscope), and sometimes also [magnetometers](https://en.wikipedia.org/wiki/Magnetometers). With these 3 incorporated sensors, the IMU measures at least 3 different types of quantities:
1. 3D Orientation
2. Linear Acceleration
3. Angular Velocity
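
To make the orientation quantity more concrete: the data preview later in this document shows four orientation columns (`orientation_X`, `orientation_Y`, `orientation_Z`, `orientation_W`), which look like the components of a unit quaternion. Assuming that reading is correct, the sketch below converts a quaternion into roll, pitch and yaw angles using the standard conversion formulas. The function name `quaternion_to_euler` is our own and not part of the dataset or of any library.

```python
import numpy as np

def quaternion_to_euler(x, y, z, w):
    """Convert unit quaternion(s) (x, y, z, w) to roll, pitch, yaw in radians."""
    x, y, z, w = (np.asarray(v, dtype=float) for v in (x, y, z, w))
    # Roll: rotation about the X axis
    roll = np.arctan2(2.0 * (w * x + y * z), 1.0 - 2.0 * (x * x + y * y))
    # Pitch: rotation about the Y axis; clip guards against rounding outside [-1, 1]
    pitch = np.arcsin(np.clip(2.0 * (w * y - z * x), -1.0, 1.0))
    # Yaw: rotation about the Z axis
    yaw = np.arctan2(2.0 * (w * z + x * y), 1.0 - 2.0 * (y * y + z * z))
    return roll, pitch, yaw

# Sanity check: a 90 degree rotation about Z is the quaternion (0, 0, sin 45°, cos 45°)
half = np.sqrt(0.5)
roll, pitch, yaw = quaternion_to_euler(0.0, 0.0, half, half)
print(np.degrees([roll, pitch, yaw]))
```

Because the function works on NumPy arrays, it can also be applied to whole columns at once, which may be useful if Euler angles are later wanted as features.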

# 2. Data Acquisition

The dataset that we will be using comes from a [Kaggle competition](https://www.kaggle.com/c/career-con-2019/data).

# 3. Data Exploration

In [7]:
# Import necessary libraries
import pandas as pd
import numpy as np

# Import all data and labels
X_train = pd.read_csv('./data/X_train.csv')
y_train = pd.read_csv('./data/y_train.csv')
X_test = pd.read_csv('./data/X_test.csv')

# Report the shape of each table
print(f'X_train has {X_train.shape[0]} rows, {X_train.shape[1]} cols.')
print(f'y_train has {y_train.shape[0]} rows, {y_train.shape[1]} cols.')
print(f'X_test has {X_test.shape[0]} rows, {X_test.shape[1]} cols.')

X_train has 487680 rows, 13 cols.
y_train has 3810 rows, 3 cols.
X_test has 488448 rows, 13 cols.


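The number of rows in X_train (487,680) differs from the number in y_train (3,810). The ratio is exactly 128, which suggests that each row of y_train labels a whole series of 128 measurements in X_train rather than a single measurement. Let us explore the data further.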
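
One way to check how the rows of the two tables relate is to count the measurements per `series_id`. The sketch below uses small toy frames instead of the real files so that it runs on its own. It assumes that `y_train` also carries a `series_id` column, which is not shown in the output above, so treat that as a guess until the head of `y_train` has been inspected.

```python
import pandas as pd

# Toy stand-ins for X_train and y_train (hypothetical values, not the real data)
X_toy = pd.DataFrame({
    "series_id": [0, 0, 0, 1, 1, 1],
    "measurement_number": [0, 1, 2, 0, 1, 2],
})
y_toy = pd.DataFrame({"series_id": [0, 1]})  # assumed key column

# Number of measurement rows belonging to each series
rows_per_series = X_toy.groupby("series_id").size()
print(rows_per_series)

# Does every series in the measurements have exactly one label row?
same_series = set(X_toy["series_id"]) == set(y_toy["series_id"])
print("Same series in both tables:", same_series)

# The same ratio for the real row counts reported above
print(487680 / 3810)  # 128.0
```

On the real data, replacing `X_toy` and `y_toy` with `X_train` and `y_train` should show whether every series has the same number of measurements.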

In [9]:
print(X_train.head())
print(y_train.head())
# https://www.kaggle.com/hiralmshah/robot-sensor-eda-feature-engg-and-prediction

  row_id  series_id  measurement_number  orientation_X  orientation_Y  \
0    0_0          0                   0       -0.75853       -0.63435   
1    0_1          0                   1       -0.75853       -0.63434   
2    0_2          0                   2       -0.75853       -0.63435   
3    0_3          0                   3       -0.75852       -0.63436   
4    0_4          0                   4       -0.75852       -0.63435   

   orientation_Z  orientation_W  angular_velocity_X  angular_velocity_Y  \
0       -0.10488       -0.10597            0.107650            0.017561   
1       -0.10490       -0.10600            0.067851            0.029939   
2       -0.10492       -0.10597            0.007275            0.028934   
3       -0.10495       -0.10597           -0.013053            0.019448   
4       -0.10495       -0.10596            0.005135            0.007652   

   angular_velocity_Z  linear_acceleration_X  linear_acceleration_Y  \
0            0.000767               -0.
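
With 13 columns, the printed preview wraps across several blocks and is hard to read. As a minimal sketch, the first rows can be transposed or the display width widened for a single print. The toy frame below reuses a few values from the preview above; `toy` is a hypothetical name, and on the real data `X_train` would take its place.

```python
import pandas as pd

# Small stand-in built from the first rows and columns shown above
toy = pd.DataFrame({
    "row_id": ["0_0", "0_1", "0_2"],
    "series_id": [0, 0, 0],
    "measurement_number": [0, 1, 2],
    "orientation_X": [-0.75853, -0.75853, -0.75853],
})

# Option 1: transpose so that each column becomes a row
transposed = toy.head().T
print(transposed)

# Option 2: temporarily lift the column limit and widen the output
with pd.option_context("display.max_columns", None, "display.width", 200):
    print(toy.head())
```

The `option_context` form only changes the display settings inside the `with` block, so the rest of the notebook keeps the default formatting.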

# 4. Data Preprocessing