# RF Solution

The introduction of this file is courtesy of the team at Perceptive Automata

### Imports

In [1]:
import ast
import pandas as pd
import numpy as np
from pandas import DataFrame
from sklearn.ensemble import RandomForestClassifier

In [2]:
# Older version of sklearn
# from sklearn.cross_validation import train_test_split

# Newer versions of sklearn
from sklearn.model_selection import train_test_split

# Pedestrian Crossing Prediction

This problem uses the open-source JAAD dataset, which you can read more about here: http://data.nvision2.eecs.yorku.ca/JAAD_dataset/. The data is available in this library as "pedestrians_df.csv".

The dataset consists of people tracked in videos from a car's dashcam. Each person has been carefully annotated with a set of attributes, such as whether they are stopped, moving fast, or moving slow. For this problem we will try to predict whether or not the pedestrian will cross the street in the next frame, based on all the previous data we have about that pedestrian. You will use the pedestrians' bounding boxes, along with the other actions they take, to make this prediction for a test set.

## Dataframe

Each row of the dataframe that we construct for you consists of some metadata (the video id and the ped id, so that you can match rows up with the JAAD videos), followed by an ordered list of frames in which that pedestrian appears.

* frame_numbers - These should be continuous, with no gaps in the list. The other fields all align with this field.
* bounding_boxes - A series of boxes that aligns with the frame_numbers field. Each box is [box x, box y, box width, box height], where x and y are the coordinates of the upper left-hand corner of the box.
* moving slow, stopped, handwave, look, clear path, moving fast, looking, standing, slow down, nod, speed up - The annotated attributes you will use to train the model. Each is a list, aligned with the frame_numbers field, stating whether or not the attribute is true for that frame number.
* crossing - Whether or not the pedestrian is crossing in the corresponding frame.
* cross_overall - This is the field that you will try to predict: whether or not the person crossed at any point in the sequence.
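
Because the list-valued fields are stored as text in the CSV, they have to be parsed back into Python lists before use. The sketch below illustrates this on a made-up row (the values and the reduced set of fields are assumptions for illustration, not taken from the dataset) and checks that every per-frame list has the same length as frame_numbers.

```python
import ast

# Hypothetical raw row, as the fields would look when read from a CSV file
raw_row = {
    "frame_numbers": "[3, 4, 5]",
    "bounding_boxes": "[[1135, 673, 28, 97], [1139, 672, 29, 92], [1142, 671, 29, 93]]",
    "stopped": "[False, False, True]",
    "crossing": "[True, True, False]",
}

# ast.literal_eval turns the text of a Python literal into the object itself
parsed_row = {name: ast.literal_eval(text) for name, text in raw_row.items()}

# Every per-frame field should align with frame_numbers
n_frames = len(parsed_row["frame_numbers"])
aligned = all(len(values) == n_frames for values in parsed_row.values())

# Unpack the first box: upper-left corner (x, y), then width and height
x, y, width, height = parsed_row["bounding_boxes"][0]
print(aligned, (x, y, width, height))
```

If a list were shorter or longer than frame_numbers, `aligned` would be False, which is a cheap sanity check to run before building features.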

In [4]:
pedestrians_df = pd.read_csv('data/pedestrian_df.csv')
for col_name in ['bounding_boxes', 'frame_numbers', 'moving slow', 'stopped', 'handwave', 'look', 'clear path', 'crossing', 'moving fast', 'looking', 'standing', 'slow down', 'nod', 'speed up']:
    pedestrians_df[col_name] = pedestrians_df[col_name].apply(ast.literal_eval)
pedestrians_df.head()

Unnamed: 0,video_id,ped_ind,frame_numbers,bounding_boxes,moving slow,stopped,handwave,look,clear path,crossing,moving fast,looking,standing,slow down,nod,speed up,cross_overall
0,video_0071,1,"[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13,...","[[1209, 598, 51, 191], [1214, 598, 52, 192], [...","[False, False, False, False, False, False, Fal...","[True, True, True, True, True, True, True, Tru...","[False, False, False, False, False, False, Fal...","[False, False, False, False, False, False, Fal...","[False, False, False, False, False, False, Fal...","[False, False, False, False, False, False, Fal...","[False, False, False, False, False, False, Fal...","[False, False, False, False, False, False, Fal...","[False, False, False, False, False, False, Fal...","[False, False, False, False, False, False, Fal...","[False, False, False, False, False, False, Fal...","[False, False, False, False, False, False, Fal...",False
1,video_0071,2,"[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13,...","[[1249, 621, 51, 127], [1254, 620, 51, 129], [...","[True, True, True, True, True, True, True, Tru...","[False, False, False, False, False, False, Fal...","[False, False, False, False, False, False, Fal...","[False, False, False, False, False, False, Fal...","[False, False, False, False, False, False, Fal...","[False, False, False, False, False, False, Fal...","[False, False, False, False, False, False, Fal...","[True, True, True, True, True, True, True, Tru...","[False, False, False, False, False, False, Fal...","[False, False, False, False, False, False, Fal...","[False, False, False, False, False, False, Fal...","[False, False, False, False, False, False, Fal...",True
2,video_0204,1,"[3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, ...","[[1135, 673, 28, 97], [1139, 672, 29, 92], [11...","[False, False, False, False, False, False, Fal...","[False, False, False, False, False, False, Fal...","[False, False, False, False, False, False, Fal...","[False, False, False, False, False, False, Fal...","[False, False, False, False, False, False, Fal...","[True, True, True, True, True, True, True, Tru...","[False, False, False, False, False, False, Fal...","[False, False, False, False, False, False, Fal...","[False, False, False, False, False, False, Fal...","[False, False, False, False, False, False, Fal...","[False, False, False, False, False, False, Fal...","[False, False, False, False, False, False, Fal...",True
3,video_0204,3,"[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13,...","[[906, 670, 35, 65], [906, 672, 32, 65], [907,...","[False, False, False, False, False, False, Fal...","[False, False, False, False, False, False, Fal...","[False, False, False, False, False, False, Fal...","[False, False, False, False, False, False, Fal...","[False, False, False, False, False, False, Fal...","[True, True, True, True, True, True, True, Tru...","[False, False, False, False, False, False, Fal...","[False, False, False, False, False, False, Fal...","[False, False, False, False, False, False, Fal...","[False, False, False, False, False, False, Fal...","[False, False, False, False, False, False, Fal...","[False, False, False, False, False, False, Fal...",True
4,video_0204,2,"[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13,...","[[1152, 657, 42, 114], [1158, 657, 42, 117], [...","[False, False, False, False, False, False, Fal...","[True, True, True, True, True, True, True, Tru...","[False, False, False, False, False, False, Fal...","[False, False, False, False, False, False, Fal...","[False, False, False, False, False, False, Fal...","[False, False, False, False, False, False, Fal...","[False, False, False, False, False, False, Fal...","[True, True, True, True, True, True, True, Tru...","[False, False, False, False, False, False, Fal...","[False, False, False, False, False, False, Fal...","[False, False, False, False, False, False, Fal...","[False, False, False, False, False, False, Fal...",False


In [5]:
count = 0
for i, row in pedestrians_df.iterrows():
    count += len(row['frame_numbers'])
    
print("Number of Pedestrian-Frames: %d" % count)

In [6]:
# Let's take a more in-depth look at a single row (index 2):
print(pedestrians_df.iloc[2])
print(pedestrians_df.iloc[2]['crossing'])
print(pedestrians_df.iloc[2]['speed up'])

video_id                                                 video_0204
ped_ind                                                           1
frame_numbers     [3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, ...
bounding_boxes    [[1135, 673, 28, 97], [1139, 672, 29, 92], [11...
moving slow       [False, False, False, False, False, False, Fal...
stopped           [False, False, False, False, False, False, Fal...
handwave          [False, False, False, False, False, False, Fal...
look              [False, False, False, False, False, False, Fal...
clear path        [False, False, False, False, False, False, Fal...
crossing          [True, True, True, True, True, True, True, Tru...
moving fast       [False, False, False, False, False, False, Fal...
looking           [False, False, False, False, False, False, Fal...
standing          [False, False, False, False, False, False, Fal...
slow down         [False, False, False, False, False, False, Fal...
nod               [False, False, False, False, F

# More info

Your task is to predict, for each pedestrian, whether or not they will be crossing the road at each frame. For example, for row 0 of the above dataframe, the pedestrian appears in frames 0-329. For each of those frames, you need to predict whether or not they will be crossing in the next frame. So, for frame number 5, you can use whatever data you want from frames 0-4 to predict whether or not they will be crossing in frame 5. And for frame 329, you can use whatever data you want from frames 0-328 to predict whether or not they will be crossing.

You can skip the first few frames for each pedestrian if your solution requires a certain number of frames to be initialized.

You will need to:
- unravel the existing per-pedestrian dataframe to build your new per-pedestrian-frame dataframe.  
- extract features
- split the data into train and validation sets (70%-30% split is probably about the right size)
- build a baseline that simply predicts the previous frame's "crossing" value for the next frame
- make some models
- test your final model on your validation set
- write up your analysis
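
The first step in the list above, unraveling per-pedestrian rows into per-pedestrian-frame rows, could be sketched as follows. This is only an illustration on a toy table: the `unravel` helper, the `video_A` ids, and the reduced set of fields are assumptions, not part of the provided data or of the solution further down.

```python
# Hypothetical two-pedestrian table with list-valued fields (toy values)
pedestrians = [
    {"video_id": "video_A", "ped_ind": 1,
     "frame_numbers": [0, 1, 2], "stopped": [True, True, False],
     "crossing": [False, False, True]},
    {"video_id": "video_A", "ped_ind": 2,
     "frame_numbers": [5, 6], "stopped": [False, False],
     "crossing": [True, True]},
]

per_frame_fields = ["stopped", "crossing"]

def unravel(pedestrians, per_frame_fields):
    """Return one record per (pedestrian, frame) pair."""
    records = []
    for ped in pedestrians:
        for position, frame in enumerate(ped["frame_numbers"]):
            record = {"video_id": ped["video_id"],
                      "ped_ind": ped["ped_ind"],
                      "frame_number": frame}
            # Pick the value of each per-frame field at the same position
            for field in per_frame_fields:
                record[field] = ped[field][position]
            records.append(record)
    return records

frame_records = unravel(pedestrians, per_frame_fields)
print(len(frame_records))
```

Passing `frame_records` to `pd.DataFrame(...)` would turn the list of dictionaries into a per-pedestrian-frame dataframe.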

# Build Dataset Functions

In [7]:
def CheckAnyOccurence(current_window,index):
    # Return 1 if the event happens at any time in the window, else 0
    if 1 in current_window[index]:
        return 1
    else:
        return 0

def CheckTwoOccurences(current_window,index1,index2):
    # Return 1 if the two events happen in the same frame at any point in the window, else 0
    window_size = len(current_window[0])
    data1 = current_window[index1]
    data2 = current_window[index2]
    for i in range(window_size):
        if data1[i] == 1 and data2[i] == 1:
            return 1
    return 0

def CheckLast5Steps(current_window,index):
    # Return 1 if the event occurred in the last 5 frames of the window, else 0
    if 1 in current_window[index][-5:]:
        return 1
    else:
        return 0

def CheckLastStep(current_window,index):
    # Return 1 if the event occurred in the last frame of the window, else 0
    if 1 in current_window[index][-1:]:
        return 1
    else:
        return 0

def ClassifyWindow(current_window,attribute_dict,new_data):
    
    window_data = np.zeros(len(new_data)).astype(int)
    window_data[0] = CheckAnyOccurence(current_window,attribute_dict['looking'])
    window_data[1] = CheckAnyOccurence(current_window,attribute_dict['handwave'])
    window_data[2] = CheckAnyOccurence(current_window,attribute_dict['nod'])
    window_data[3] = CheckAnyOccurence(current_window,attribute_dict['standing'])
    window_data[4] = CheckAnyOccurence(current_window,attribute_dict['speed up'])
    window_data[5] = CheckAnyOccurence(current_window,attribute_dict['slow down'])
    window_data[6] = CheckAnyOccurence(current_window,attribute_dict['stopped'])
    window_data[19] = CheckAnyOccurence(current_window,attribute_dict['look'])
    
    window_data[7] = CheckTwoOccurences(current_window,attribute_dict['stopped'],attribute_dict['nod'])
    window_data[8] = CheckTwoOccurences(current_window,attribute_dict['looking'],attribute_dict['nod'])
    window_data[9] = CheckTwoOccurences(current_window,attribute_dict['looking'],attribute_dict['handwave'])
    window_data[10] = CheckTwoOccurences(current_window,attribute_dict['nod'],attribute_dict['handwave'])
    
    window_data[11] = CheckLast5Steps(current_window,attribute_dict['handwave'])
    window_data[12] = CheckLast5Steps(current_window,attribute_dict['nod'])
    window_data[13] = CheckLast5Steps(current_window,attribute_dict['standing'])
    window_data[14] = CheckLast5Steps(current_window,attribute_dict['speed up'])
    window_data[15] = CheckLast5Steps(current_window,attribute_dict['slow down'])
    window_data[16] = CheckLast5Steps(current_window,attribute_dict['stopped'])
    window_data[17] = CheckLast5Steps(current_window,attribute_dict['clear path'])
    window_data[18] = CheckLast5Steps(current_window,attribute_dict['look'])
    
    
    window_data[20] = CheckLastStep(current_window,attribute_dict['moving slow'])
    window_data[21] = CheckLastStep(current_window,attribute_dict['stopped'])
    window_data[22] = CheckLastStep(current_window,attribute_dict['handwave'])
    window_data[23] = CheckLastStep(current_window,attribute_dict['look'])
    window_data[24] = CheckLastStep(current_window,attribute_dict['clear path'])
    window_data[25] = CheckLastStep(current_window,attribute_dict['moving fast'])
    window_data[26] = CheckLastStep(current_window,attribute_dict['looking'])
    window_data[27] = CheckLastStep(current_window,attribute_dict['standing'])
    window_data[28] = CheckLastStep(current_window,attribute_dict['slow down'])
    window_data[29] = CheckLastStep(current_window,attribute_dict['nod'])
    window_data[30] = CheckLastStep(current_window,attribute_dict['speed up'])

    return window_data

def GenerateData(moving_window_size, pedestrians_df):
    # Attribute of the pedestrian dataframe
    attribute_list = ['moving slow','stopped','handwave','look',
                      'clear path','moving fast','looking',
                      'standing','slow down','nod','speed up','crossing']
    
    # Dictionary assigning the attributes to their index number in the dataframe
    attribute_dict = {'moving slow':0,'stopped':1,'handwave':2,'look':3,
                      'clear path':4,'moving fast':5,'looking':6,
                      'standing':7,'slow down':8,'nod':9,'speed up':10,'crossing':11}
    
    # List naming the features of the new dataset
    # Only the length of this list is used (it sizes the feature vector in ClassifyWindow);
    # the names are kept for bookkeeping and do not line up one-to-one with the indices filled there
    new_data = ['looking','waved','nodded','stood','sped up','slowed down','stopped','stop and nod','look and nod',
                'look and wave','wave and nod','looked last 5','waved 5','nodded 5','stood 5', 'sped up 5', 
                'slowed down 5','stopped 5','clear path 5','look 5','look all','moving slow','stopped','handwave',
                'look','clear path','moving fast','looking','standing','slow down','nod','speed up']

    pedestrian_data = []
    result_data = []
    for ped_n in range(len(pedestrians_df)):
        
        pedestrian = pedestrians_df.iloc[ped_n]
        n_attributes = len(attribute_list)
        n_frames = len(pedestrian['frame_numbers'])
        
        #Generate empty list of lists to store all data for pedestrian in 
        general_attribute_list = [[] for _ in range(n_attributes)]
        
        #Insert the classified attributes from the dataframe into the general attribute list
        for j in range(n_attributes):
            attribute = attribute_list[j]
            general_attribute_list[j] = pedestrians_df[attribute][ped_n]
        general_attribute_list = np.multiply(general_attribute_list,1)
        
        # Go through the windows that can be generated from the general attribute list
        # The number of windows is n_frames - moving_window_size, or 1 if the sequence is too short
        # Each window spans moving_window_size + 1 frames
        # The crossing attribute of each window is added to result_data so it can serve as the target
        if n_frames <= moving_window_size:
            current_window = np.copy(general_attribute_list)
            
            pedestrian_data.append(ClassifyWindow(current_window,attribute_dict,new_data))
            result_data.append(CheckAnyOccurence(current_window,attribute_dict['crossing']))
        else:
            n_windows = n_frames - moving_window_size
            for i in range(n_windows):
                max_window = i+moving_window_size+1
                current_window = general_attribute_list[:,i:max_window]
                pedestrian_data.append(ClassifyWindow(current_window,attribute_dict,new_data))
                result_data.append(CheckAnyOccurence(current_window,attribute_dict['crossing']))
                
    # Features: one row per window; targets: column vector of shape (n_windows, 1)
    out_data = np.asarray(pedestrian_data)
    out_result_data = np.asarray(result_data).reshape(-1, 1)
    return out_data, out_result_data
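
To make the windowing easier to follow, here is a small standalone sketch of the same slicing pattern on toy data. It uses plain lists instead of NumPy arrays, and the `sliding_windows` helper and the toy values are assumptions made for illustration only.

```python
def sliding_windows(attributes, window_size):
    """Yield windows of window_size + 1 frames, mirroring the slicing in GenerateData.

    attributes is a list of per-attribute lists, all of the same length.
    """
    n_frames = len(attributes[0])
    if n_frames <= window_size:
        # Sequence too short: use it whole as a single window
        yield [list(row) for row in attributes]
        return
    for start in range(n_frames - window_size):
        stop = start + window_size + 1
        yield [row[start:stop] for row in attributes]

# Toy data: 2 attributes over 6 frames (illustrative values)
toy = [
    [0, 0, 1, 1, 0, 0],   # e.g. "looking"
    [0, 0, 0, 0, 1, 1],   # e.g. "crossing"
]

windows = list(sliding_windows(toy, window_size=3))

# Target per window: 1 if "crossing" (row 1) is true anywhere in the window
targets = [int(1 in window[1]) for window in windows]
print(len(windows), targets)
```

With 6 frames and a window size of 3, this produces 3 windows of 4 frames each; a sequence no longer than the window size yields a single window covering all of its frames.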



# Baseline

In [8]:
(Xdata,Ydata) = GenerateData(10,pedestrians_df)


X_train, X_test, y_train, y_test = train_test_split(Xdata, Ydata, test_size=0.3)

# Flatten y data from shape (n, 1) to shape (n,)

y_train = np.ravel(y_train)
y_test = np.ravel(y_test)

baseline_prediction = []

from sklearn.metrics import mean_squared_error
from sklearn.metrics import accuracy_score
from sklearn.metrics import confusion_matrix


for i in range(1,len(X_train)):
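    # Baseline: predict each sample's crossing value from the sample just before it
    # The first sample has no predecessor, so no prediction is made for it
    # Note: train_test_split shuffles by default, so y_train[i-1] is the previous row of the
    # shuffled training set, not necessarily the previous frame of the same pedestrian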
    baseline_prediction.append(y_train[i-1])
baseline_score = mean_squared_error(y_train[1:],baseline_prediction)
print('Baseline MSE: %.3f' % baseline_score)

Baseline MSE: 0.472
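
A previous-frame baseline can also be computed directly on each pedestrian's own ordered crossing list, so that the "previous" value always comes from the same pedestrian. The sketch below shows the idea on toy sequences; the helper names and the data are hypothetical, and the resulting number says nothing about the real dataset.

```python
def previous_frame_baseline(crossing_sequences):
    """Predict each frame's crossing value from the frame before it.

    Returns (truth, prediction) lists; the first frame of every
    sequence is skipped because it has no previous frame.
    """
    truth, prediction = [], []
    for sequence in crossing_sequences:
        for i in range(1, len(sequence)):
            truth.append(int(sequence[i]))
            prediction.append(int(sequence[i - 1]))
    return truth, prediction

def mean_squared_error_01(truth, prediction):
    # For 0/1 labels the squared error of a sample is 1 exactly when the
    # prediction is wrong, so the MSE equals the misclassification rate
    return sum((t - p) ** 2 for t, p in zip(truth, prediction)) / len(truth)

# Toy per-pedestrian crossing lists (illustrative values)
toy_sequences = [
    [False, False, True, True, True],
    [True, True, False],
]

truth, prediction = previous_frame_baseline(toy_sequences)
error = mean_squared_error_01(truth, prediction)
print(error)
```

Here the baseline is wrong only at the two frames where the crossing state changes, out of six predicted frames.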


# Best Random Forest Classifier

In [11]:
def random_forest_classifier(features, target):
    """
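    Train the random forest classifier on the given features and target data
    :param features: 2-D array of window features, one row per window
    :param target: 1-D array of 0/1 crossing labels, one per window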
    :return: trained random forest classifier
    """
    clf = RandomForestClassifier(n_estimators=500,min_samples_split=100,
                                 min_samples_leaf=50,n_jobs=4)
    clf.fit(features, target)
    return clf

train_accuracy = []
test_accuracy = []

# Train and evaluate the model 20 times, collecting the train and test accuracy of each run
for i in range(20):
    print("Running iteration "+str(i))
    trained_model = random_forest_classifier(X_train,y_train)

    predictions = trained_model.predict(X_test)
    train_accuracy.append(accuracy_score(y_train, trained_model.predict(X_train)))
    test_accuracy.append(accuracy_score(y_test, predictions))
    
print("Trained model :: ", trained_model)


Running iteration 0
Running iteration 1
Running iteration 2
Running iteration 3
Running iteration 4
Running iteration 5
Running iteration 6
Running iteration 7
Running iteration 8
Running iteration 9
Running iteration 10
Running iteration 11
Running iteration 12
Running iteration 13
Running iteration 14
Running iteration 15
Running iteration 16
Running iteration 17
Running iteration 18
Running iteration 19
Trained model ::  RandomForestClassifier(bootstrap=True, class_weight=None, criterion='gini',
            max_depth=None, max_features='auto', max_leaf_nodes=None,
            min_impurity_decrease=0.0, min_impurity_split=None,
            min_samples_leaf=50, min_samples_split=100,
            min_weight_fraction_leaf=0.0, n_estimators=500, n_jobs=4,
            oob_score=False, random_state=None, verbose=0,
            warm_start=False)


In [15]:

print( "Average Train Accuracy :: ", np.average(train_accuracy))
print("Average Test Accuracy  :: ", np.average(test_accuracy))
error_p =round((100*(1-np.average(test_accuracy))),5)
print("Average Test Error :: {0} %\n".format(error_p))

print("Baseline score divided by the average error :: ",(100*baseline_score)/error_p)

Average Train Accuracy ::  0.9785131736667685
Average Test Accuracy  ::  0.9785741705511379
Average Test Error :: 2.14258 %

Baseline score divided by the average error ::  22.04218814213172


## Analysis



I created my dataset by first looking at the different attributes and trying to determine what I would consider when driving and while crossing the street as a pedestrian myself. I then decided to include several different cases for each set of frames that I was compiling together into one data entry. 

In the dataset-generating functions I used a window size of 10 frames. In ClassifyWindow, I first used CheckAnyOccurence to check whether each event occurred at any time in the window and recorded the result as a boolean. Then, with CheckTwoOccurences, I checked whether two attributes were both true in the same frame at any point in the window. With CheckLast5Steps I checked whether the events happened in the last 5 frames. Finally, with CheckLastStep, I checked all of the attributes in the last frame of the window.
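
As a compact illustration of these four kinds of checks, the sketch below computes them for two attributes on a toy 10-frame window. The helper names and the window values are assumptions chosen for the example; they are simplified stand-ins for the functions in the notebook, not replacements for them.

```python
def any_occurrence(window, row):
    # 1 if the attribute is true in any frame of the window
    return int(1 in window[row])

def co_occurrence(window, row_a, row_b):
    # 1 if both attributes are true in the same frame at least once
    return int(any(a == 1 and b == 1
                   for a, b in zip(window[row_a], window[row_b])))

def in_last_n(window, row, n):
    # 1 if the attribute is true in any of the last n frames
    return int(1 in window[row][-n:])

LOOKING, NOD = 0, 1
toy_window = [
    [1, 1, 0, 0, 0, 0, 0, 0, 0, 0],  # looking
    [0, 1, 0, 0, 0, 0, 0, 0, 0, 1],  # nod
]

features = {
    "looking (any)": any_occurrence(toy_window, LOOKING),
    "nod (any)": any_occurrence(toy_window, NOD),
    "looking and nod": co_occurrence(toy_window, LOOKING, NOD),
    "looking (last 5)": in_last_n(toy_window, LOOKING, 5),
    "nod (last 5)": in_last_n(toy_window, NOD, 5),
    "looking (last frame)": in_last_n(toy_window, LOOKING, 1),
    "nod (last frame)": in_last_n(toy_window, NOD, 1),
}
print(features)
```

The same attribute can therefore contribute several features: here "looking" is 1 over the whole window but 0 over the last 5 frames, which lets the classifier distinguish recent events from older ones.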

By separating the attributes in this way I was able to build a large dataset that could then be used for training and testing. Prior to creating the model, I made a baseline prediction for the data: for every entry from index 1 to the end of the training labels, I assigned the previous entry's crossing value as the predicted value. I then computed the mean squared error between the prediction and the actual values, which for 0/1 labels is the fraction of wrong predictions. I ended up with an error of around 47.2%, which is very large.

After tweaking the dataset, I settled on the structure described previously. With this data in hand, I split it into a training set and a test set. These were then run through the Random Forest Classifier with the number of estimators set to 500, the minimum number of samples required to split a node set to 100, and the minimum number of samples required at a leaf set to 50. I found these values through trial and error. I considered writing an Evolutionary Algorithm or a Particle Swarm Optimization algorithm if I could not get below an 11% error, but by modifying my dataset I got past the bottleneck, and with some individual tweaking of the parameters I decreased the error further.

Using a window size of 10, the 20-run average test error was 2.14%, which is around 22 times smaller than the baseline error of 47.2%. Overall, I found that the dataset structure had the most impact on the results. I was getting anywhere between 40% and 11% error before I settled on my current structure. With the current structure I was getting between 5% and 2.14% error. Fine-tuning the minimum number of samples required at a leaf node and the minimum number of samples required to split a node helped reduce the error to its current value of 2.14%. I would like to restate that I believe the model result is highly dependent on the input data. I look forward to tweaking the model and data further to see if I can improve on my result of 2.14% error.