    This notebook takes in a datalog CSV file, cleans it, adds the data we want for our model, and writes the result to a CSV file
    Justin Wasserman - 

## Import and Verify Datalog

In [126]:
import pandas as pd
import numpy as np
import math

In [127]:
datalog_DIR = '../../data/'

In [128]:
datalogFile = datalog_DIR + '02-11-2019_13-08-38.csv'
#Blank fields are read in as NaN by read_csv
#WallId(s) should be the only column that has NaN

df = pd.read_csv(datalogFile, sep=',')
df.head()

Unnamed: 0,Time,ID,X,Y,Yaw,ResetID,checkCorrectness,NumberOfWalls,WallId(s)
0,0 1000000,0,0.0,0.0,0.0,1,1,0,
1,0 2000000,0,0.0,0.0,0.0,1,1,0,
2,0 3000000,0,0.0,0.0,0.0,1,1,0,
3,0 4000000,0,0.0,0.0,0.0,1,1,0,
4,0 5000000,0,0.0,0.0,0.0,1,1,0,


In [129]:
#Drop the last row in df. Sometimes the datalog is stopped while a row is still being written,
#which causes NaNs to be inserted. So it is best to always drop the last row.
df.drop(df.tail(1).index,inplace=True) # drop last row

In [130]:
#Verify that WallId(s) is the only column with NaN in it
NaNs = df.isnull().any() #Boolean Series: True for each column that contains an NA
if NaNs.sum() != 1 or not NaNs['WallId(s)']: #Should be exactly 1 column with NA, and it should be WallId(s)
    print("[cleans_minimal] A column other than WallId(s) has a NaN in it, or WallId(s) has none")


## Time

The Time column has the form seconds, a space, then milliseconds, where the millisecond part (0 to 999) is followed by six trailing 0's. So "0 1000000" is 1 millisecond and "0 10000000" is 10 milliseconds. The exception is a whole number of seconds: when the millisecond part is 0 it is written as a single 0, so one second is "1 0" and not "1 000000".

In [131]:
for i in df.index:
    (second, millisecond) = df['Time'][i].split(' ')
    second = float(second)
    if(millisecond != '0'):
        millisecond = float(millisecond[:-6]) / 1000.0
    else:
        millisecond = float(millisecond)
    df.at[i, 'Time'] = second + millisecond
df.head()

Unnamed: 0,Time,ID,X,Y,Yaw,ResetID,checkCorrectness,NumberOfWalls,WallId(s)
0,0.001,0,0.0,0.0,0.0,1,1,0,
1,0.002,0,0.0,0.0,0.0,1,1,0,
2,0.003,0,0.0,0.0,0.0,1,1,0,
3,0.004,0,0.0,0.0,0.0,1,1,0,
4,0.005,0,0.0,0.0,0.0,1,1,0,
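
The conversion above can also be written as a small standalone function. This is a minimal sketch that assumes the Time format is exactly as described (a single "0" or a millisecond count followed by six zeros); `parse_time` is a hypothetical helper name that is not used anywhere else in this notebook.

```python
def parse_time(raw):
    """Convert a 'seconds milliseconds000000' string to float seconds."""
    seconds, fraction = raw.split(' ')
    # A bare '0' means no fractional part; otherwise strip the six trailing zeros
    # so that only the millisecond count remains.
    millis = 0.0 if fraction == '0' else float(fraction[:-6])
    return float(seconds) + millis / 1000.0


print(parse_time('0 1000000'))  # 0.001
print(parse_time('1 0'))        # 1.0
```

If this helper were used in the notebook, `df['Time'] = df['Time'].map(parse_time)` would replace the row-by-row loop.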


## Check Correctness

The Gazebo simulator verifies that each ball is inside a hub and that the hubs/weaselballs are within the environment. checkCorrectness is the variable printed to the datalog to indicate whether the simulator was running correctly at a given timestep. So, any row with checkCorrectness = 0 should be removed.

In [132]:
#If the checkCorrectness code doesn't become generalized, then it may be appropriate to skip this cell
df = df[df.checkCorrectness != 0]
df.head()

Unnamed: 0,Time,ID,X,Y,Yaw,ResetID,checkCorrectness,NumberOfWalls,WallId(s)
0,0.001,0,0.0,0.0,0.0,1,1,0,
1,0.002,0,0.0,0.0,0.0,1,1,0,
2,0.003,0,0.0,0.0,0.0,1,1,0,
3,0.004,0,0.0,0.0,0.0,1,1,0,
4,0.005,0,0.0,0.0,0.0,1,1,0,


## WallId(s) / NumberOfWalls

Because the Gazebo simulator makes the models shoot away from a wall right after a collision, I add a heuristic: if a wall was touched within the last n ms and the current row reports no collision, the row is still treated as colliding with that wall.

In [133]:
#Make all wallId(s) strings
for i in df.index:
    if df['NumberOfWalls'][i] > 0:
        if(type(df['WallId(s)'][i]) == float):
            df.at[i, 'WallId(s)'] = str(int(df['WallId(s)'][i]))

In [134]:
#verify
for i in df.index:
    if df['NumberOfWalls'][i] > 0:
        if type(df['WallId(s)'][i]) != str:
            print("Error, non-string detected!")

In [135]:
n = 5 #milliseconds since last collision

In [136]:
rowsSinceLastWall = 0
lastWall = None
lastNumberOfWalls = None
for i in df.index:
    rowNumberOfWalls = df['NumberOfWalls'][i]
    rowWallIds = df['WallId(s)'][i]
    if rowNumberOfWalls > 0:
        rowsSinceLastWall = 0
        lastWall = rowWallIds
        lastNumberOfWalls = rowNumberOfWalls
    elif rowsSinceLastWall < n and lastWall != None:
        df.at[i, 'NumberOfWalls'] = lastNumberOfWalls
        df.at[i, 'WallId(s)'] = lastWall
    rowsSinceLastWall += 1
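
The same carry-forward heuristic can be sketched on plain lists, which makes it easy to try on a tiny example. This is only an illustration under the assumption that each row is one timestep; `carry_forward_walls` is a hypothetical name and is not part of the notebook. It mirrors the loop above, including the detail that the contact row itself counts toward n, so only the n - 1 rows after a contact are filled in.

```python
def carry_forward_walls(num_walls, wall_ids, n):
    """Return copies of the inputs where rows shortly after a contact inherit it."""
    out_num, out_ids = list(num_walls), list(wall_ids)
    rows_since, last_num, last_ids = 0, None, None
    for idx, (count, ids) in enumerate(zip(num_walls, wall_ids)):
        if count > 0:
            # A real contact: remember it and restart the counter.
            rows_since, last_num, last_ids = 0, count, ids
        elif last_ids is not None and rows_since < n:
            # No contact reported, but one happened recently: reuse it.
            out_num[idx], out_ids[idx] = last_num, last_ids
        rows_since += 1
    return out_num, out_ids


num, ids = carry_forward_walls([0, 1, 0, 0, 0, 0],
                               [None, '2', None, None, None, None], n=3)
print(num)  # [0, 1, 1, 1, 0, 0]
print(ids)  # [None, '2', '2', '2', None, None]
```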

In [137]:
total = int(df['NumberOfWalls'].sum())
total

2312

## Enclosure Data

Here I will import the enclosure data

In [138]:
enclosureFile = datalog_DIR + 'boundaryDescription.txt'
enclosure_df = pd.read_csv(enclosureFile, sep=',')
enclosure_df.head()


Unnamed: 0,name,X,Y,Z,Roll,Pitch,Yaw,sizeX,sizeY,sizeZ
0,rail01,0.56355,0.0,0.03175,0,0,0.0,0.01905,1.12713,0.0889
1,rail02,0.0,0.56356,0.03175,0,0,1.57,0.01905,1.1525,0.0889
2,rail03,-0.56355,0.0,0.03175,0,0,3.14,0.01905,1.12713,0.0889
3,rail04,0.0,-0.56356,0.03175,0,0,-1.57319,0.01905,1.1525,0.0889


Next I will replace each railXX name with its integer ID so that it matches the wall IDs used in df.

In [139]:
for i in enclosure_df.index:
    enclosure_df.at[i, 'name'] = int(enclosure_df.at[i, 'name'].replace("rail",""))
enclosure_df.head()

Unnamed: 0,name,X,Y,Z,Roll,Pitch,Yaw,sizeX,sizeY,sizeZ
0,1,0.56355,0.0,0.03175,0,0,0.0,0.01905,1.12713,0.0889
1,2,0.0,0.56356,0.03175,0,0,1.57,0.01905,1.1525,0.0889
2,3,-0.56355,0.0,0.03175,0,0,3.14,0.01905,1.12713,0.0889
3,4,0.0,-0.56356,0.03175,0,0,-1.57319,0.01905,1.1525,0.0889


Now I will get a vector to represent each wall. Combined with the trajectory of the robot going into/out of a wall, it gives the angle at which the robot enters/leaves (the bounce-angle cells below do this with a dot product).

In [140]:
#get vector
for i in enclosure_df.index:
    x1 = enclosure_df.at[i, 'X'] - (enclosure_df.at[i,'sizeX'] / 2.0) * np.cos(enclosure_df.at[i,'Yaw'])
    y1 = enclosure_df.at[i, 'Y'] - (enclosure_df.at[i,'sizeX'] / 2.0) * np.sin(enclosure_df.at[i,'Yaw'])
    x2 = x1 - (enclosure_df.at[i,'sizeY']*np.sin(enclosure_df.at[i,'Yaw']))
    y2 = y1 + (enclosure_df.at[i,'sizeY']*np.cos(enclosure_df.at[i,'Yaw']))
    
    v = (x2-x1, y2-y1)
    
    enclosure_df.at[i,'vector_x'] = v[0]
    enclosure_df.at[i,'vector_y'] = v[1]
enclosure_df


Unnamed: 0,name,X,Y,Z,Roll,Pitch,Yaw,sizeX,sizeY,sizeZ,vector_x,vector_y
0,1,0.56355,0.0,0.03175,0,0,0.0,0.01905,1.12713,0.0889,0.0,1.12713
1,2,0.0,0.56356,0.03175,0,0,1.57,0.01905,1.1525,0.0889,-1.1525,0.000918
2,3,-0.56355,0.0,0.03175,0,0,3.14,0.01905,1.12713,0.0889,-0.001795,-1.127129
3,4,0.0,-0.56356,0.03175,0,0,-1.57319,0.01905,1.1525,0.0889,1.152497,-0.002759
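
In the cell above, the start point (x1, y1) cancels out when v = (x2-x1, y2-y1) is formed, so the wall vector depends only on Yaw and sizeY. A minimal sketch of that simplification follows; `wall_vector` is a hypothetical helper name, and the sketch assumes, as the cell does, that Yaw is in radians and that sizeY is the length along the wall.

```python
import math


def wall_vector(yaw, size_y):
    """Direction vector along a wall of length size_y rotated by yaw (radians)."""
    # Same result as (x2 - x1, y2 - y1) in the cell above.
    return (-size_y * math.sin(yaw), size_y * math.cos(yaw))


print(wall_vector(0.0, 1.12713))
print(wall_vector(1.57, 1.1525))
```

The two calls correspond to rail01 and rail02 and should agree with the vector_x and vector_y columns in the table above, up to rounding.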


## Bounce angle

To get the bounce angle, two lines are needed. The first is the wall line, given by the wall vector found above. The second is a line of best fit through the point where the wall is hit and the points from the previous k time steps (for k = 1 to MAX_K).
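
The angle between two such lines comes from the dot product of their direction vectors. Below is a minimal sketch of that step on its own; `angle_between` is a hypothetical helper name. One assumption differs from the cells below: instead of catching the error and writing a sentinel value, it clamps the cosine to [-1, 1], because floating-point rounding can push it slightly outside that range and make acos fail.

```python
import math


def angle_between(u, v):
    """Angle in radians between 2D vectors u and v, in the range [0, pi]."""
    dot = u[0] * v[0] + u[1] * v[1]
    norms = math.hypot(u[0], u[1]) * math.hypot(v[0], v[1])
    if norms == 0.0:
        raise ValueError("angle is undefined for a zero-length vector")
    # Clamp to the valid acos domain to absorb rounding error.
    cos_theta = max(-1.0, min(1.0, dot / norms))
    return math.acos(cos_theta)


# A wall along +y and a trajectory along +x meet at a right angle.
print(angle_between((0.0, 1.12713), (1.0, 0.0)))
```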

In [141]:
MAX_K = 5
df.head()

Unnamed: 0,Time,ID,X,Y,Yaw,ResetID,checkCorrectness,NumberOfWalls,WallId(s)
0,0.001,0,0.0,0.0,0.0,1,1,0,
1,0.002,0,0.0,0.0,0.0,1,1,0,
2,0.003,0,0.0,0.0,0.0,1,1,0,
3,0.004,0,0.0,0.0,0.0,1,1,0,
4,0.005,0,0.0,0.0,0.0,1,1,0,


In [149]:
#Get incoming bounce angles
for i in df.index:
    #The second condition also fires when the wall count increases (e.g. 1 wall -> 2 walls),
    #because a 2-wall contact can start out as a 1-wall contact.
    #I want the person analyzing the data to decide if that is useful
    if df.at[i, 'NumberOfWalls'] > 0 and df.at[i-1, 'NumberOfWalls'] < df.at[i, 'NumberOfWalls']:
        print(i)
        angles = []
        wallIds = (df.at[i, 'WallId(s)']).split('&')
        for wall in wallIds:
            wallV = (enclosure_df.at[int(wall)-1,'vector_x'], enclosure_df.at[int(wall)-1,'vector_y'])
            weaselPoints = []
            for local_max_k in range(1,MAX_K+1):
                localWeaselPoints = []
                for k in range(local_max_k+1):
                    localWeaselPoints.append((df.at[i-k, 'X'],df.at[i-k, 'Y'])) #k = 0 is the contact row, k >= 1 are the rows before it
                weaselPoints.append(localWeaselPoints)
            local_angles = []
            for points in weaselPoints:
                #Find line of best fit for k
                x,y = zip(*points)
                line = (np.polyfit(x, y, 1)) 
                print(x,y)
                weaselV = (1, line[0]) #line[0] is the slope of the best-fit line
                try:
                    angle = math.acos((np.dot(weaselV,wallV)) / (np.linalg.norm(weaselV) * np.linalg.norm(wallV)))
                except Exception as e:
                    print(e)
                    angle = -999999999
                local_angles.append(angle)
            angles.append(local_angles)
        #Write angle data to df.
        #If it is just 1 wall, then just write the angle
        #If it is more than 1 wall, write angles with an & in between them
        for k in range(1,MAX_K+1):
            #Join the per-wall angles so a multi-wall row keeps every wall's angle
            df.at[i, 'in_angle'+str(k)] = "&".join(str(wallAngles[k-1]) for wallAngles in angles)
                

    


1880
(0.273929, 0.273383) (0.493964, 0.49298800000000004)
1.78754578754554 1.787545787545775
(0.273929, 0.273383, 0.273339) (0.493964, 0.49298800000000004, 0.49300299999999997)
1.6959386069930542 1.787545787545775
(0.273929, 0.273383, 0.273339, 0.273117) (0.493964, 0.49298800000000004, 0.49300299999999997, 0.49250200000000005)
1.7700905616385814 1.787545787545775
(0.273929, 0.273383, 0.273339, 0.273117, 0.272893) (0.493964, 0.49298800000000004, 0.49300299999999997, 0.49250200000000005, 0.49199799999999994)
1.8761956425766781 1.787545787545775
(0.273929, 0.273383, 0.273339, 0.273117, 0.272893, 0.272667) (0.493964, 0.49298800000000004, 0.49300299999999997, 0.49250200000000005, 0.49199799999999994, 0.491494)
1.961606307335914 1.787545787545775
1927
(0.28015999999999996, 0.280113) (0.492824, 0.492831)
-0.14893617021275052 -0.14893617021362035
(0.28015999999999996, 0.280113, 0.280065) (0.492824, 0.492831, 0.49283699999999997)
-0.13680011818636548 -0.14893617021362035
(0.28015999999999996, 0



 (0.494752, 0.494749, 0.494729, 0.49472700000000003, 0.494704, 0.49468)
0.2045301387442599 0.047619047619509164
11368
(-0.48331599999999997, -0.483417) (0.49498500000000006, 0.49495500000000003)
0.297029702971257 0.29702970297054193
(-0.48331599999999997, -0.483417, -0.483493) (0.49498500000000006, 0.49495500000000003, 0.49495600000000006)
0.17095083076192352 0.29702970297054193
(-0.48331599999999997, -0.483417, -0.483493, -0.48358999999999996) (0.49498500000000006, 0.49495500000000003, 0.49495600000000006, 0.49493400000000004)
0.1720751916897721 0.29702970297054193
(-0.48331599999999997, -0.483417, -0.483493, -0.48358999999999996, -0.48368199999999995) (0.49498500000000006, 0.49495500000000003, 0.49495600000000006, 0.49493400000000004, 0.494914)
0.18156435479331787 0.29702970297054193
(-0.48331599999999997, -0.483417, -0.483493, -0.48358999999999996, -0.48368199999999995, -0.483757) (0.49498500000000006, 0.49495500000000003, 0.49495600000000006, 0.49493400000000004, 0.494914, 0.494903

21656
(0.257768, 0.25759299999999996) (0.49586, 0.495864)
-0.022857142856314256 -0.02285714285716098
(0.257768, 0.25759299999999996, 0.257419) (0.49586, 0.495864, 0.495867)
-0.020059988396207734 -0.02285714285716098
(0.257768, 0.25759299999999996, 0.257419, 0.257248) (0.49586, 0.495864, 0.495867, 0.495871)
-0.020759335382019562 -0.02285714285716098
(0.257768, 0.25759299999999996, 0.257419, 0.257248, 0.25707399999999997) (0.49586, 0.495864, 0.495867, 0.495871, 0.49586899999999995)
-0.01442864125554948 -0.02285714285716098
(0.257768, 0.25759299999999996, 0.257419, 0.257248, 0.25707399999999997, 0.256898) (0.49586, 0.495864, 0.495867, 0.495871, 0.49586899999999995, 0.49586899999999995)
-0.01052073767061601 -0.02285714285716098
23765
(-0.49405, -0.49468100000000004) (0.325864, 0.325787)
0.12202852614898861 0.12202852614895066
(-0.49405, -0.49468100000000004, -0.49413900000000005) (0.325864, 0.325787, 0.325406)
-0.16739825618778195 0.12202852614895066
(-0.49405, -0.49468100000000004, -0.494



In [150]:
df.iloc[1880]

Time                             1.881
ID                                   0
X                             0.273929
Y                             0.493964
Yaw                          0.0203294
ResetID                              1
checkCorrectness                     1
NumberOfWalls                        1
WallId(s)                            2
in_angle1            2.080051422681121
in_angle2           2.1027699831751225
in_angle3            2.084243216471946
in_angle4           2.0596926786680867
in_angle5           2.0414459547908397
out_angle1                         NaN
out_angle2                         NaN
out_angle3                         NaN
out_angle4                         NaN
out_angle5                         NaN
Name: 1880, dtype: object

In [143]:
#Get outgoing bounce angles
for i in df.index:
    #The second condition also fires when the wall count decreases (e.g. 2 walls -> 1 wall),
    #because a 2-wall contact can end as a 1-wall contact.
    #I want the person analyzing the data to decide if that is useful
    if df.at[i, 'NumberOfWalls'] > 0 and df.at[i+1, 'NumberOfWalls'] < df.at[i, 'NumberOfWalls']:
        angles = []
        wallIds = (df.at[i, 'WallId(s)']).split('&')
        for wall in wallIds:
            wallV = (enclosure_df.at[int(wall)-1,'vector_x'], enclosure_df.at[int(wall)-1,'vector_y'])
            weaselPoints = []
            for local_max_k in range(1,MAX_K+1):
                localWeaselPoints = []
                for k in range(local_max_k+1):
                    localWeaselPoints.append((df.at[i+k, 'X'],df.at[i+k, 'Y'])) #k = 0 is the last contact row, k >= 1 are the rows after it
                weaselPoints.append(localWeaselPoints)
            local_angles = []
            for points in weaselPoints:
                #Find line of best fit for k
                x,y = zip(*points)
                line = np.poly1d(np.polyfit(x, y, 1)) 
                weaselV = (1, line.c[0]) #line.c[0] is the slope of the line
                try:
                    angle = math.acos((np.dot(wallV,weaselV)) / (np.linalg.norm(weaselV) * np.linalg.norm(wallV)))
                except Exception as e:
                    print(e)
                    angle = -999999999
                local_angles.append(angle)
            angles.append(local_angles)
        #Write angle data to df.
        #If it is just 1 wall, then just write the angle
        #If it is more than 1 wall, write angles with an & in between them
        for k in range(1,MAX_K+1):
            #Join the per-wall angles so a multi-wall row keeps every wall's angle
            df.at[i, 'out_angle'+str(k)] = "&".join(str(wallAngles[k-1]) for wallAngles in angles)
                

    




## Output CSV

In [144]:
df.head()

Unnamed: 0,Time,ID,X,Y,Yaw,ResetID,checkCorrectness,NumberOfWalls,WallId(s),in_angle1,in_angle2,in_angle3,in_angle4,in_angle5,out_angle1,out_angle2,out_angle3,out_angle4,out_angle5
0,0.001,0,0.0,0.0,0.0,1,1,0,,,,,,,,,,,
1,0.002,0,0.0,0.0,0.0,1,1,0,,,,,,,,,,,
2,0.003,0,0.0,0.0,0.0,1,1,0,,,,,,,,,,,
3,0.004,0,0.0,0.0,0.0,1,1,0,,,,,,,,,,,
4,0.005,0,0.0,0.0,0.0,1,1,0,,,,,,,,,,,


In [145]:
df.to_csv(datalog_DIR + "results.csv")

## Debug

In [146]:
#Print rows where a collision begins (NumberOfWalls increases from the previous row)
for i in df.index:
    if df.at[i, 'NumberOfWalls'] > 0 and df.at[i-1, 'NumberOfWalls'] < df.at[i, 'NumberOfWalls']:
        print(df.loc[i])

Time                             1.881
ID                                   0
X                             0.273929
Y                             0.493964
Yaw                          0.0203294
ResetID                              1
checkCorrectness                     1
NumberOfWalls                        1
WallId(s)                            2
in_angle1            2.080051422681121
in_angle2           2.1027699831751225
in_angle3            2.084243216471946
in_angle4           2.0596926786680867
in_angle5           2.0414459547908397
out_angle1                         NaN
out_angle2                         NaN
out_angle3                         NaN
out_angle4                         NaN
out_angle5                         NaN
Name: 1880, dtype: object
Time                             1.928
ID                                   0
X                              0.28016
Y                             0.492824
Yaw                          0.0538737
ResetID                              1

Time                            25.476
ID                                   0
X                            -0.495408
Y                             0.317056
Yaw                           -1.73805
ResetID                              1
checkCorrectness                     1
NumberOfWalls                        1
WallId(s)                            3
in_angle1           2.6352119123719664
in_angle2            2.417751284278263
in_angle3            2.378534306198969
in_angle4           2.2199323760787837
in_angle5           2.1373984888147204
out_angle1                         NaN
out_angle2                         NaN
out_angle3                         NaN
out_angle4                         NaN
out_angle5                         NaN
Name: 25475, dtype: object
Time                            25.548
ID                                   0
X                            -0.496336
Y                             0.308714
Yaw                           -1.82748
ResetID                              