## Conditional Imitation Learning

SDVM will adopt an end-to-end approach to autonomy. The base algorithm comes from Conditional Imitation Learning by Felipe Codevilla, Matthias Muller, Antonio Lopez, Vladlen Koltun and Alexey Dosovitskiy. In this approach the model learns how to drive together with the intent of the expert, expressed as a high-level command, which makes the model's behavior on the road easier to understand and direct. For more about the research, see their paper: https://arxiv.org/abs/1710.02410


![A controller receives an observation and a command and outputs an action; applying the action to the environment produces a new observation.](control.jpeg)

A controller receives an observation and a command and outputs an action. When that action is applied to the environment, it produces a new observation.
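The loop described above can be sketched in a few lines of plain Python. This is only a toy illustration, not SDVM code: the observation is reduced to a single lateral offset, and `TARGET_OFFSET`, `toy_controller`, `toy_environment` and `run_episode` are hypothetical names introduced here.

```python
from typing import Callable, List, Sequence, Tuple

Observation = float  # toy stand-in for a camera image: lateral offset from lane center
Action = float       # toy stand-in for a steering value

# Hypothetical target offset for each command (illustrative values only).
TARGET_OFFSET = {"follow_lane": 0.0, "turn_left": -1.0, "turn_right": 1.0}


def toy_controller(observation: Observation, command: str) -> Action:
    """Steer halfway toward the offset requested by the command."""
    return 0.5 * (TARGET_OFFSET[command] - observation)


def toy_environment(observation: Observation, action: Action) -> Observation:
    """Applying the action to the environment yields the next observation."""
    return observation + action


def run_episode(
    controller: Callable[[Observation, str], Action],
    environment: Callable[[Observation, Action], Observation],
    first_observation: Observation,
    commands: Sequence[str],
) -> List[Tuple[Observation, str, Action]]:
    """Closed loop: (observation, command) -> action -> new observation."""
    observation = first_observation
    trajectory = []
    for command in commands:
        action = controller(observation, command)
        trajectory.append((observation, command, action))
        observation = environment(observation, action)
    return trajectory


trajectory = run_episode(toy_controller, toy_environment, 1.0, ["follow_lane"] * 3)
```

Each entry of `trajectory` records one pass around the loop. In a real system the controller would be the trained network and the environment would be the simulator or the vehicle.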


## Conditional Imitation Learning Network Architecture

![network](net.png)


Data, including images and measurements, are collected in a simulator or in the real world.

The command selects which module (branch) of the network produces the action: one module might specialize in lane following, another in right turns, and a third in left turns. All modules share the perception stream.
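As a rough sketch of this idea, the snippet below computes the shared features once and then lets the command pick which branch turns them into an action. Everything here is a simplified stand-in: `perceive` replaces the convolutional perception stream, and the branch functions, `COMMAND_TO_BRANCH` and `act` are hypothetical names chosen for illustration.

```python
from typing import Callable, Dict, List, Sequence

# Hypothetical mapping from command name to branch index.
COMMAND_TO_BRANCH: Dict[str, int] = {"follow_lane": 0, "turn_right": 1, "turn_left": 2}


def perceive(image: Sequence[float]) -> float:
    """Shared perception stream (stand-in for the CNN): mean pixel intensity."""
    return sum(image) / len(image)


# One specialized module per command; all of them consume the same features.
BRANCHES: List[Callable[[float], float]] = [
    lambda features: 0.0 * features,   # lane following: no steering in this toy
    lambda features: 1.0 * features,   # right turn: positive steering
    lambda features: -1.0 * features,  # left turn: negative steering
]


def act(image: Sequence[float], command: str) -> float:
    """Run perception once, then evaluate only the branch chosen by the command."""
    features = perceive(image)
    branch = BRANCHES[COMMAND_TO_BRANCH[command]]
    return branch(features)
```

For example, `act([0.5, 0.5], "turn_right")` and `act([0.5, 0.5], "turn_left")` see the same features but return steering values of opposite sign, because a different branch is active.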

In [None]:
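from tensorflow.keras.layers import (BatchNormalization, Conv2D, Dense, Dropout,
                                     Flatten, Input, MaxPooling2D, concatenate)
from tensorflow.keras.models import Model

""" Assumed configuration: these names were not defined in the original cell """
image_size = (88, 200, 3)  # assumed input shape (height, width, channels)
""" Assumed layout: one control branch per command, plus one speed branch """
branch_config = [["Steer", "Gas", "Brake"]] * 3 + [["Speed"]]

def conv_block(inputs, filters, kernel_size, strides):
    # padding='same' is an assumption added here: with the default 'valid'
    # padding, eight pooled blocks shrink the assumed image below the kernel size.
    x = Conv2D(filters, (kernel_size, kernel_size), strides=strides,
               padding='same', activation='relu')(inputs)
    x = MaxPooling2D(pool_size=(3, 3), strides=(2, 2), padding='same')(x)
    x = BatchNormalization()(x)
    x = Dropout(0.4)(x)
    return x

def fc_block(inputs, units):
    fc = Dense(units, activation='relu')(inputs)
    fc = Dropout(0.2)(fc)
    return fc

''' Branches of the network; the active branch is chosen by the planner or the user '''
branches = []

inputs = Input(image_size)

""" Conv 1 """
x = conv_block(inputs, 32, 5, 2)
x = conv_block(x, 32, 3, 1)

""" Conv 2 (continues from x, not from inputs) """
x = conv_block(x, 64, 3, 2)
x = conv_block(x, 64, 3, 1)

""" Conv 3 """
x = conv_block(x, 128, 3, 2)
x = conv_block(x, 128, 3, 1)

""" Conv 4 """
x = conv_block(x, 256, 3, 1)
x = conv_block(x, 256, 3, 1)

""" Reshape """
x = Flatten()(x)

""" fc """
x = fc_block(x, 512)

""" fc """
x = fc_block(x, 512)

"""Process Control"""

""" Speed (measurements) """
speed_input = Input((1,))  # assumed: a single scalar speed measurement
speed = fc_block(speed_input, 128)
speed = fc_block(speed, 128)

""" Joint sensory """
j = concatenate([x, speed])
j = fc_block(j, 512)


""" Final dense block on every branch """
for branch in branch_config:
    # Use only the image features to predict the speed
    if branch[0] == "Speed":
        branch_output = fc_block(x, 256)
        branch_output = fc_block(branch_output, 256)
    else:
        branch_output = fc_block(j, 256)
        branch_output = fc_block(branch_output, 256)

    # One output unit per target in the branch (e.g. steer, gas, brake)
    branches.append(Dense(len(branch))(branch_output))

model = Model(inputs=[inputs, speed_input], outputs=branches)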
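
During training, conditional imitation learning penalizes only the branch that matches the expert's command, so each branch specializes in its own maneuver. The sketch below illustrates that masking with a mean squared error in plain Python. It is an assumption about how the loss could be written, not the training code of this project, and `branch_masked_loss` is a hypothetical helper.

```python
from typing import Sequence


def branch_masked_loss(
    branch_predictions: Sequence[Sequence[float]],
    expert_action: Sequence[float],
    command_index: int,
) -> float:
    """Mean squared error on the branch selected by the command.

    Predictions from all other branches are ignored, so they receive
    no learning signal from this sample.
    """
    selected = branch_predictions[command_index]
    squared_errors = [(p - t) ** 2 for p, t in zip(selected, expert_action)]
    return sum(squared_errors) / len(squared_errors)


# Two branches, each predicting (steer, gas, brake); values are made up.
predictions = [[0.0, 1.0, 0.0], [0.5, 0.5, 0.0]]
expert = [0.5, 0.5, 0.0]
loss_matching_branch = branch_masked_loss(predictions, expert, command_index=1)
loss_other_branch = branch_masked_loss(predictions, expert, command_index=0)
```

Here the same expert action gives zero loss when the command selects the second branch and a positive loss when it selects the first, which is what drives each branch toward its own behavior.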

### Future changes and improvements

- Use a CNN-RNN model to predict speed from images only
- Add more advanced image processing to speed up inference
- Apply ResNet to avoid vanishing gradients
- Add an RL model to improve performance
- Implement the motion planner (not yet implemented)