### YOLO-inspired model

Note - my forward pass is based on the Tiny-YOLO model. Each 416x416 input image is split into a 13x13 grid (169 cells in total), and each cell makes one bounding box guess (a 1x7 vector). I built a simplified cost function for this task, choosing one bounding box guess per cell rather than having the model specialize in different box sizes. I also use a constant learning rate, which deviates from the standard YOLO model. Further, I do not fully train the model on all the data because that is very computationally expensive; the goal is simply to show that the model can learn.

In [1]:
import numpy as np
import tensorflow as tf
import pandas as pd
from keras import backend as K
import matplotlib.pyplot as plt
import latex
from sklearn.utils import shuffle

%matplotlib inline
import warnings
warnings.filterwarnings('ignore')

Using TensorFlow backend.


### Loading the training data

In [2]:
theX = np.load("../../data/yolo_cleaned_data/X_416.npy")

In [3]:
# This encoding of y is of shape (13,13,7)
they = np.load("../../data/yolo_cleaned_data/y_416_sm.npy")

### Defining the tensorflow placeholders for the training data

In [13]:
# Placeholder values for input X,y data
def get_placeholders(x_h,x_w,x_c,y_h,y_w,y_c):
    """
    x_h: Height for x input 
    x_w: Width for x input
    x_c: Channels for x input
    y_h: Height for y input
    y_w: Width for y input
    y_c: Channels for y input
    """
    X = tf.placeholder(tf.float32, name="X", shape=(None,x_h,x_w,x_c))
    y = tf.placeholder(tf.float32, name="y", shape=(None,y_h,y_w,y_c))
    return X,y

In [5]:
# Testing if get_placeholders function breaks
tf.reset_default_graph()
with tf.Session() as sess:
    X,y = get_placeholders(416,416,3,13,13,7)
    print("X shape:",X.shape)
    print("y shape:",y.shape)

X shape: (?, 416, 416, 3)
y shape: (?, 13, 13, 7)


### Defining the forward propagation step

In [10]:
# Defining constant layer for 2d convolution, batch norm, activation, and maxpool
def conv_maxpool(the_input,layer,f,s=2,ps=2,ks=3):
    """
    layer: specifies the layer number for naming sections of graph
    the_input: the layer which will be used as input in conv layer
    f (filters): the number of filters to be used for each layer
    s (strides): strides used in max_pool
    ps (pool_size): kernel size for max_pooling
    ks (kernel_size): kernel size for conv2d layer
    Note - the conv2d layers use 'same' padding, so only the max-pool step reduces the spatial size
    """
    layer = str(layer)
    Z = tf.layers.conv2d(the_input,filters=f,kernel_size=[ks,ks],strides=(1,1),padding="same",name="Z"+layer,kernel_initializer=tf.contrib.layers.xavier_initializer(seed=0))
    Bn = tf.layers.batch_normalization(Z,name="Bn"+layer)
    A = tf.nn.leaky_relu(Bn,alpha=0.1,name="A"+layer)
    P = tf.layers.max_pooling2d(A,pool_size=[ps,ps],strides=s,padding="valid",name="P"+layer)
    return P

In [11]:
# Same input parameters as conv_maxpool but doesn't involve a max_pool step
def standard_conv(the_input,layer,f,ks=3):
    layer = str(layer)
    Z = tf.layers.conv2d(the_input,filters=f,kernel_size=[ks,ks],strides=(1,1),padding="same",name="Z"+layer,kernel_initializer=tf.contrib.layers.xavier_initializer(seed=0))
    Bn = tf.layers.batch_normalization(Z,name="Bn"+layer)
    A = tf.nn.leaky_relu(Bn,alpha=0.1,name="A"+layer)
    return A

In [12]:
# Building the forward pass based on Tiny-YOLO
# Note - forward pass will use leaky_relu
def forward_pass(X):
    input_layer = tf.reshape(X,[-1,416,416,3]) # Input shape of images
    P1 = conv_maxpool(input_layer,1,16)
    P2 = conv_maxpool(P1,2,32)
    P3 = conv_maxpool(P2,3,64)
    P4 = conv_maxpool(P3,4,128)
    P5 = conv_maxpool(P4,5,256)
    A6 = standard_conv(P5,6,512)
    A7 = standard_conv(A6,7,1024)
    # Final layer - no batch norm, linear activation
    A8 = tf.layers.conv2d(A7,filters=7,kernel_size=[1,1],strides=(1,1),padding="valid",name="A8",activation=None)
    return A8

In [9]:
# Testing if forward_pass breaks
tf.reset_default_graph()
with tf.Session() as sess:
    np.random.seed(1)
    X,y = get_placeholders(416,416,3,13,13,7)
    Z13 = forward_pass(X) # Computation graph
    init = tf.global_variables_initializer()
    sess.run(init)
    aZ = sess.run(Z13,feed_dict={X:np.random.randn(3,416,416,3),y:np.random.randn(3,13,13,7)})
    print("Z shape:", str(aZ.shape))

Z shape: (3, 13, 13, 7)
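
The 13x13 output grid follows from the pooling arithmetic: each of the five `conv_maxpool` blocks halves the spatial size, while the 'same'-padded convolutions leave it unchanged. A minimal sketch of that bookkeeping in plain Python (the helper name `pooled_size` is hypothetical and not part of the notebook):

```python
def pooled_size(size, pool_size=2, stride=2):
    """Output size of a 'valid' max-pool along one spatial dimension."""
    return (size - pool_size) // stride + 1

size = 416
sizes = [size]
for _ in range(5):  # five conv_maxpool blocks in forward_pass
    size = pooled_size(size)
    sizes.append(size)

print(sizes)           # [416, 208, 104, 52, 26, 13]
print(sizes[-1] ** 2)  # 169 cells
```

The last three layers use stride 1 and do not pool, so the grid stays at 13x13 and only the channel count changes (down to 7 in the final 1x1 convolution).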


### Loss Function for YOLO

$$ \lambda_{coord} \sum_{i=0}^{S^{2}} \sum_{j=0}^{B} 1_{ij}^{obj} \bigg[(x_i-\hat{x_{i}})^2 + (y_i - \hat{y_i})^2\bigg]$$
$$ + \lambda_{coord} \sum_{i=0}^{S^{2}} \sum_{j=0}^{B} 1_{ij}^{obj} \bigg[(\sqrt{w_i}-\sqrt{\hat{w_{i}}})^2 + (\sqrt{h_i} - \sqrt{\hat{h_i}})^2\bigg] $$
$$ + \sum_{i=0}^{S^{2}} \sum_{j=0}^{B} 1_{ij}^{obj} (c_i-\hat{c_{i}})^2 $$
$$ + \lambda_{noobj} \sum_{i=0}^{S^{2}} \sum_{j=0}^{B} 1_{ij}^{noobj} (c_i-\hat{c_{i}})^2 $$
$$ + \sum_{i=0}^{S^{2}} 1_{i}^{obj} \sum_{c \in classes} (p_i(c)-\hat{p_{i}}(c))^2 $$

Note - among the B predicted boxes in a cell, the one responsible for the prediction is the box that has the highest IoU with the true box

Terms:
- S<sup>2</sup>: the number of cells in an image (13x13)
- B: the number of bounding boxes per cell (1)
- 1<sup>obj</sup><sub>ij</sub>: denotes that bounding box predictor j in cell i is responsible for the prediction
- 1<sup>obj</sup><sub>i</sub>: denotes if an object appears in cell i
- c<sub>i</sub>: confidence score for whether there is an object in cell i
- lambda<sub>coord</sub>: (5) weight factor that increases loss from bounding box predictions 
- lambda<sub>noobj</sub>: (0.5) weight factor that decreases loss from predictions for boxes that don't contain objects

### Two cost functions
One applies to multiple bounding box guesses per cell and the other applies to one bounding box guess per cell. I label the helper functions that are only used for the multiple-guess version.

In [1]:
# NOTE - Only used when there are multiple bounding box guesses
def get_iou(box1, box2):
    """
    box1 - dict of corner coordinates with keys "x1", "y1", "x2", "y2"
    box2 - dict of corner coordinates with keys "x1", "y1", "x2", "y2"
    """
    xi1 = tf.maximum(box1["x1"], box2["x1"])
    yi1 = tf.maximum(box1["y1"], box2["y1"])
    xi2 = tf.minimum(box1["x2"], box2["x2"])
    yi2 = tf.minimum(box1["y2"], box2["y2"])
    inter_area = tf.maximum((xi2 - xi1),0) * tf.maximum((yi2 - yi1),0)

    box1_area = (box1["x2"]-box1["x1"]) * (box1["y2"] - box1["y1"])
    box2_area = (box2["x2"]-box2["x1"]) * (box2["y2"] - box2["y1"])
    union_area = box1_area + box2_area - inter_area
    iou = inter_area / union_area
    
    return iou
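
To make the IoU computation concrete, here is a minimal plain-Python sketch of the same arithmetic on scalar boxes. The function name `iou_xyxy` and the example boxes are illustrative assumptions, not part of the notebook.

```python
def iou_xyxy(box1, box2):
    """IoU of two boxes given as dicts with keys x1, y1, x2, y2."""
    # Corners of the intersection rectangle
    xi1 = max(box1["x1"], box2["x1"])
    yi1 = max(box1["y1"], box2["y1"])
    xi2 = min(box1["x2"], box2["x2"])
    yi2 = min(box1["y2"], box2["y2"])
    # Clamp at zero so disjoint boxes get an empty intersection
    inter = max(xi2 - xi1, 0) * max(yi2 - yi1, 0)
    area1 = (box1["x2"] - box1["x1"]) * (box1["y2"] - box1["y1"])
    area2 = (box2["x2"] - box2["x1"]) * (box2["y2"] - box2["y1"])
    return inter / (area1 + area2 - inter)

a = {"x1": 0, "y1": 0, "x2": 2, "y2": 2}
b = {"x1": 1, "y1": 1, "x2": 3, "y2": 3}  # overlaps a in a 1x1 square
c = {"x1": 5, "y1": 5, "x2": 6, "y2": 6}  # disjoint from a

iou_ab = iou_xyxy(a, b)  # 1 / (4 + 4 - 1) = 1/7
iou_ac = iou_xyxy(a, c)  # 0.0
```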

In [8]:
# Returns the values with a specific mask applied to it
def get_box_values(box,mask):
    """
    Index:
    0: confidence there is an object in cell, 1: mid_x, 2: mid_y, 
    3: width, 4: length, 5: prob_dog, 6: prob_cat
    """
    confidence = tf.boolean_mask(box[:,:,:,0:1],mask)
    mid_x = tf.boolean_mask(box[:,:,:,1:2],mask)
    mid_y = tf.boolean_mask(box[:,:,:,2:3],mask)
    width = tf.boolean_mask(box[:,:,:,3:4],mask)
    length = tf.boolean_mask(box[:,:,:,4:5],mask)
    prob_dog = tf.boolean_mask(box[:,:,:,5:6],mask)
    prob_cat = tf.boolean_mask(box[:,:,:,6:7],mask)
    box = {"co":confidence, "mx":mid_x,"my":mid_y,"w":width,"l":length,"d":prob_dog,"c":prob_cat}
    return box

In [12]:
# NOTE - Only used when there are multiple bounding box guesses
# Returns the (x1,y1),(x2,y2) coordinates for each bounding box as a dict
def get_xy(box_v):
    mid_x = box_v["mx"]
    mid_y = box_v["my"]
    width = box_v["w"]
    length = box_v["l"]
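    # Scale width and length from fractions of the 416x416 image to pixels
    # Note - only width and length are rescaled here; mid_x and mid_y are used as stored
    width = width * 416
    length = length * 416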
    
    x1 = mid_x - (1/2*width)
    x2 = mid_x + (1/2*width)
    y1 = mid_y - (1/2*length)
    y2 = mid_y + (1/2*length)
    
    box_xy = {"x1":x1,"x2":x2,"y1":y1,"y2":y2}
    return box_xy

In [13]:
# NOTE - Only used when there are multiple bounding box guesses
# This returns the information associated with the bounding box prediction with the max IoU
# Note this was built for handling two values specifically
def get_max_iou(box1, box2, y, mask):
    """
    b1_v: bounding box values associated with box1
    b2_v: bounding box values associated with box2
    y_v: bounding box values associated with the ground truth y
    """
    b1_v = get_box_values(box1,mask)
    b2_v = get_box_values(box2,mask)
    y_v = get_box_values(y,mask)
    # These new coordinates will be used to get the IoU
    b1_xy = get_xy(b1_v)
    b2_xy = get_xy(b2_v)
    y_xy = get_xy(y_v)
    # Getting the Iou for each bounding box prediction
    b1_iou = get_iou(b1_xy, y_xy)
    b2_iou = get_iou(b2_xy, y_xy)
    # Comparing the ious to determine which guess is the ground truth prediction
    def b1(): return b1_v # Named functions b/c tf.cond requires callables for the true and false branches
    def b2(): return b2_v
    highest_iou_values = tf.cond(tf.reshape(tf.less(b1_iou,b2_iou),[]), b2, b1) # conditional statement
    return highest_iou_values

In [14]:
# NOTE - Only used when there are multiple bounding box guesses
# This cost function is used for predictions which have two bounding box guesses - still in dev
# Ground truth box is defined as the box with the highest IoU - this box's confidence score will be evaluated
# Note - there is only one object in one cell so no summation over bounding box error for multiple cells
def cost_function_mult(Z,y,coord=5,noobj=0.5):
    """
    Z - shape (None,13,13,14)
    y - shape (None,13,13,7)
    """
    # Masks for guesses that are ground truth or are not
    c_mask_true = y[:,:,:,0:1] > 0
    c_mask_false = y[:,:,:,0:1] < 1
    
    # Returning values of max IoU of the two guesses
    box1 = Z[:,:,:,0:7]
    box2 = Z[:,:,:,7:]
    y_v = get_box_values(y,c_mask_true) # values for y for cell with object
    m_v = get_max_iou(box1,box2,y,c_mask_true) # values for highest IoU guess
    
    b1_f = get_box_values(box1,c_mask_false) # For comparing confidence values when no obj in cell
    b2_f = get_box_values(box2,c_mask_false)
    y_f = get_box_values(y,c_mask_false)
    
    # correspond to individual summations of the cost function:
    part1 = coord * tf.reduce_sum(tf.square(y_v["mx"]-m_v["mx"])+tf.square(y_v["my"]-m_v["my"]))
    part2 = 0 #coord * tf.reduce_sum(tf.square(tf.sqrt(y_v["w"])-tf.sqrt(m_v["w"]))+tf.square(tf.sqrt(y_v["l"])-tf.sqrt(m_v["l"])))
    part3 = tf.reduce_sum(tf.square(y_v["co"]-m_v["co"]))
    part4 = noobj * tf.reduce_sum(tf.square(y_f["co"]-b1_f["co"])+tf.square(y_f["co"]-b2_f["co"]))
    part5 = tf.reduce_sum(tf.add(tf.square(y_v["d"]-m_v["d"]),tf.square(y_v["c"]-m_v["c"])))# if obj in cell, if bounding box is highest IoU, compare class predictions
    total_cost = part1 + part2 + part3 + part4 + part5
    return total_cost

#### Standard cost function for one bounding box guess (the one used in my model)

In [9]:
# Standard cost function corresponding with a single bounding box prediction
def cost_function(Z,y,coord=5,noobj=0.5):
    """
    Z - shape (None,13,13,7)
    y - shape (None,13,13,7)
    """
    c_mask_true = y[:,:,:,0:1] > 0
    c_mask_false = y[:,:,:,0:1] < 1
    
    y_v = get_box_values(y,c_mask_true)
    m_v = get_box_values(Z,c_mask_true)
    mv_f = get_box_values(Z,c_mask_false)
    y_f = get_box_values(y,c_mask_false)
    
    m_v["w"] = tf.sqrt(tf.abs(m_v["w"]))
    m_v["l"] = tf.sqrt(tf.abs(m_v["l"]))
    y_v["w"] = tf.sqrt(y_v["w"])
    y_v["l"] = tf.sqrt(y_v["l"])
    
    # correspond to individual summations of the cost function:
    part1 = coord * tf.reduce_sum(tf.square(y_v["mx"]-m_v["mx"])+tf.square(y_v["my"]-m_v["my"]))
    part2 = coord * tf.reduce_sum(tf.square(y_v["w"]-m_v["w"])+tf.square(y_v["l"]-m_v["l"]))
    part3 = tf.reduce_sum(tf.square(y_v["co"]-m_v["co"]))
    part4 = noobj * tf.reduce_sum(tf.square(y_f["co"]-mv_f["co"]))
    part5 = tf.reduce_sum(tf.add(tf.square(y_v["d"]-m_v["d"]),tf.square(y_v["c"]-m_v["c"])))# if obj in cell, compare class predictions
    total_cost = part1 + part2 + part3 + part4 + part5
    return total_cost

In [16]:
# comparing to mult cost function w/ one input
ay = np.zeros((1,13,13,7))
ay[0][1][1] = np.array([1,0.5,0.5,0.2,0.2,1,0])
az = np.zeros((1,13,13,14))
az[0][1][1] = np.array([0,0,0,0,0,0,0,0.8,0.25,0.25,0.25,0.25,0.8,0.2])
az[0][0][0] = np.array([0,0,0,0,0,0,0,1,0,0,0,0,0,0])
az[0][0][1] = np.array([0,0,0,0,0,0,0,1,0,0,0,0,0,0])

with tf.Session() as sess:
    y = tf.placeholder(tf.float32,shape=(None,13,13,7))
    Z = tf.placeholder(tf.float32,shape=(None,13,13,14))
    aCost = cost_function_mult(Z,y)
    init = tf.global_variables_initializer()
    sess.run(init)
    tot = sess.run(aCost,feed_dict={Z:az,y:ay})
    print(tot)

1.745


In [17]:
# One input to the regular cost function (same values as the test above)
ay = np.zeros((1,13,13,7))
ay[0][1][1] = np.array([1,0.5,0.5,0.2,0.2,1,0])
az = np.zeros((1,13,13,7))
az[0][1][1] = np.array([0.8,0.25,0.25,0.25,0.25,0.8,0.2])
az[0][0][0] = np.array([1,0,0,0,0,0,0])
az[0][0][1] = np.array([1,0,0,0,0,0,0])

with tf.Session() as sess:
    y = tf.placeholder(tf.float32,shape=(None,13,13,7))
    Z = tf.placeholder(tf.float32,shape=(None,13,13,7))
    aCost = cost_function(Z,y)
    init = tf.global_variables_initializer()
    sess.run(init)
    tot = sess.run(aCost,feed_dict={Z:az,y:ay})
    print(tot)

1.7728641
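
The value above can be checked by hand. The sketch below re-implements the single-box cost in plain Python over a list of (target, prediction) cell pairs; the name `single_box_cost` is hypothetical, and it assumes target confidences are exactly 0 or 1 so that the two masks in `cost_function` split the cells cleanly. Only the three non-zero cells from the test are listed, since cells that are all zeros in both arrays contribute nothing.

```python
import math

def single_box_cost(y_cells, z_cells, coord=5, noobj=0.5):
    """Single-box cost; each cell is [conf, mid_x, mid_y, width, length, prob_dog, prob_cat]."""
    total = 0.0
    for y, z in zip(y_cells, z_cells):
        if y[0] > 0:  # cell contains an object
            total += coord * ((y[1] - z[1]) ** 2 + (y[2] - z[2]) ** 2)       # part1
            total += coord * ((math.sqrt(y[3]) - math.sqrt(abs(z[3]))) ** 2
                              + (math.sqrt(y[4]) - math.sqrt(abs(z[4]))) ** 2)  # part2
            total += (y[0] - z[0]) ** 2                                      # part3
            total += (y[5] - z[5]) ** 2 + (y[6] - z[6]) ** 2                 # part5
        else:  # no object: only the confidence is penalized
            total += noobj * (y[0] - z[0]) ** 2                              # part4
    return total

empty = [0, 0, 0, 0, 0, 0, 0]
y_cells = [[1, 0.5, 0.5, 0.2, 0.2, 1, 0], empty, empty]
z_cells = [[0.8, 0.25, 0.25, 0.25, 0.25, 0.8, 0.2],
           [1, 0, 0, 0, 0, 0, 0],
           [1, 0, 0, 0, 0, 0, 0]]

cost = single_box_cost(y_cells, z_cells)
print(round(cost, 5))  # 1.77286
```

The breakdown is 0.625 (coordinates) + 0.02786 (sizes) + 0.04 (confidence) + 1.0 (two false-positive confidences weighted by 0.5) + 0.08 (classes).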


In [18]:
ay = np.zeros((2,13,13,7))
ay[0][1][1] = np.array([1,0.5,0.5,0.2,0.2,1,0])
ay[1][1][1] = np.array([1,0.5,0.5,0.2,0.2,1,0])
az = np.zeros((2,13,13,7))
az[0][1][1] = np.array([0.8,0.25,0.25,0.25,0.25,0.8,0.2])
az[1][1][1] = np.array([0.8,0.25,0.25,0.25,0.25,0.8,0.2])

with tf.Session() as sess:
    y = tf.placeholder(tf.float32,shape=(None,13,13,7))
    Z = tf.placeholder(tf.float32,shape=(None,13,13,7))
    aCost = cost_function(Z,y)
    init = tf.global_variables_initializer()
    sess.run(init)
    tot = sess.run(aCost,feed_dict={Z:az,y:ay})
    print(tot)

1.5457281


#### Filter out low-score boxes, then suppress boxes that have a high IoU with each other, since those are probably predictions for the same object

$$ IoU = \frac{|B_1 \cap B_2|}{|B_1 \cup B_2|} $$

In [19]:
def yolo_filter_boxes(box_conf,box_class_conf,boxes,threshold=0.6):
    """
    box_conf: 13x13x1 - score of if there is something in box
    box_class_conf: 13x13x2 - score of whether object is cat or dog
    boxes: 13x13x4 - bounding box coordinates
    Returns only the predictions whose class score is >= threshold:
    scores: (None,), classes: (None,), boxes: (None,4)
    """
    # multiply prob of something being in box to prob of what is in box
    box_scores = box_conf * box_class_conf # 13x13x2 updated probs
    # Get max pred value of object in cell 
    box_class = tf.argmax(box_scores,axis=-1) # What object we pred for max pred 13x13
    box_class_score = tf.reduce_max(box_scores,axis=-1) # What value (highest predicted score) 13x13
    # Filtering to get predicted probs >= threshold
    mask = box_class_score >= threshold
    # Apply the mask to the scores, boxes, and classes to get only high prob predictions
    scores = tf.boolean_mask(box_class_score,mask)
    classes = tf.boolean_mask(box_class,mask)
    boxes = tf.boolean_mask(boxes,mask)

    return scores,classes,boxes
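
As a rough illustration of the score filtering step, the plain-Python sketch below applies the same rule (confidence times class probability, take the best class, keep it if it reaches the threshold) to a few cells. The function name `filter_cells` and all example values are made up for illustration.

```python
def filter_cells(cells, threshold=0.6):
    """Keep cells whose best class score (confidence * class prob) reaches the threshold.

    Each cell is (box_conf, [prob_dog, prob_cat], box).
    Returns a list of (score, class_index, box) tuples.
    """
    kept = []
    for conf, class_probs, box in cells:
        scores = [conf * p for p in class_probs]
        best = max(range(len(scores)), key=scores.__getitem__)  # argmax over classes
        if scores[best] >= threshold:
            kept.append((scores[best], best, box))
    return kept

cells = [
    (0.9, [0.8, 0.2], (10, 10, 50, 50)),  # best score about 0.72: kept as class 0
    (0.9, [0.5, 0.5], (12, 11, 52, 49)),  # best score 0.45: dropped
    (0.3, [0.1, 0.9], (80, 80, 90, 90)),  # best score about 0.27: dropped
]

kept = filter_cells(cells)
print(len(kept))  # 1
```

Multiplying the two probabilities means a cell needs both a confident "something is here" and a confident class guess to survive.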

In [20]:
# This takes the bounding boxes that survived score filtering (the high-score guesses)
# This returns the output for any given image, i.e. the one bounding box
def non_max_suppression(scores,classes,boxes,max_boxes=1,iou_threshold=0.5):
    """
    Scores: (None,) - Probability that a box is correctly classifying image
    Classes: (None,) - the class corresponding with classification
    Boxes: (None,4) - the bounding box for the prediction
    max_boxes: max number of predicted boxes, 1 in this case b/c there is one box per image
    """
    max_box = tf.Variable(max_boxes,dtype="int32")
    K.get_session().run(tf.variables_initializer([max_box])) # Initializing variable for non_max_suppression
    box_indicies = tf.image.non_max_suppression(boxes,scores,max_box,iou_threshold)

    # Getting the predicted box,class,and score based on non-max suppression
    scores = tf.gather(scores,box_indicies)
    classes = tf.gather(classes,box_indicies)
    boxes = tf.gather(boxes,box_indicies)
    
    return scores,classes,boxes
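
Conceptually, non-max suppression is a greedy loop: take the highest-scoring box, discard every remaining box that overlaps it too much, and repeat until `max_boxes` boxes are selected. The plain-Python sketch below illustrates that idea only; it is not how `tf.image.non_max_suppression` is implemented, and the names `box_iou` and `greedy_nms` and the example boxes are hypothetical.

```python
def box_iou(a, b):
    """IoU of two boxes given as (y1, x1, y2, x2) tuples."""
    inter_h = max(min(a[2], b[2]) - max(a[0], b[0]), 0)
    inter_w = max(min(a[3], b[3]) - max(a[1], b[1]), 0)
    inter = inter_h * inter_w
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)

def greedy_nms(boxes, scores, max_boxes=1, iou_threshold=0.5):
    """Return indices of the kept boxes, highest score first."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        if len(keep) == max_boxes:
            break
        # Keep a box only if it does not overlap an already kept box too much
        if all(box_iou(boxes[i], boxes[k]) <= iou_threshold for k in keep):
            keep.append(i)
    return keep

boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)]
scores = [0.9, 0.8, 0.7]

kept_one = greedy_nms(boxes, scores, max_boxes=1)  # [0]
kept_two = greedy_nms(boxes, scores, max_boxes=2)  # [0, 2]; box 1 overlaps box 0 (IoU about 0.68)
```

With `max_boxes=1`, as used in this notebook, the result is simply the highest-scoring box.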

In [21]:
# Testing non_max_suppression to see if it breaks
tf.reset_default_graph()
with tf.Session() as sess:
    scores = tf.random_normal([5,],mean=0.5,seed=1)
    classes = tf.random_normal([5,],mean=0.5,seed=1)
    boxes = tf.random_normal([5,4],mean=0.5,seed=1)
    scores,classes,boxes = non_max_suppression(scores,classes,boxes)
    print("scores = " + str(scores.eval()))
    print("classes = " + str(classes.eval()))
    print("boxes = " + str(boxes.eval()))
    print("scores.shape = " + str(scores.shape))
    print("classes.shape = " + str(classes.shape))
    print("boxes.shape = " + str(boxes.shape))

scores = [1.9845988]
classes = [-1.9427042]
boxes = [[ 0.52130264 -0.663239    1.8338274   0.89602387]]
scores.shape = (?,)
classes.shape = (?,)
boxes.shape = (?, 4)


In [22]:
# Creates shuffled mini batches
def random_mini_batches(X, y, mini_batch_size, seed):
    """
    Creates a list of random minibatches from (X, Y)
    Returns:
    mini_batches -- list of synchronous (mini_batch_X, mini_batch_Y)
    """
    rounds = int(X.shape[0] / mini_batch_size) # Max number of minibatches
    X_shuffle = shuffle(X, random_state=seed)
    y_shuffle = shuffle(y, random_state=seed)
    mini_batches = []
    a = 0 #used to siphon off sections of X
    b = 0 #used to siphon off sections of y
    
    for around in range(rounds):
        x_mini = X_shuffle[a:a+mini_batch_size]
        y_mini = y_shuffle[b:b+mini_batch_size]
        mini_batch = (x_mini,y_mini)
        mini_batches.append(mini_batch)
        a += mini_batch_size
        b += mini_batch_size
    
    return mini_batches
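
The function above keeps X and y aligned because both are shuffled with the same `random_state`. An alternative that makes the pairing explicit is to shuffle a single list of indices and use it for both arrays. The sketch below shows that variant on plain Python lists; `paired_mini_batches` is a hypothetical name and this is a swapped-in technique, not the notebook's implementation. Like the original, it drops any leftover samples that do not fill a whole mini batch.

```python
import random

def paired_mini_batches(X, y, mini_batch_size, seed):
    """Shuffle one index permutation and slice it into full-size mini batches."""
    idx = list(range(len(X)))
    random.Random(seed).shuffle(idx)   # one permutation shared by X and y
    rounds = len(X) // mini_batch_size  # leftover samples are dropped
    batches = []
    for r in range(rounds):
        chunk = idx[r * mini_batch_size:(r + 1) * mini_batch_size]
        batches.append(([X[i] for i in chunk], [y[i] for i in chunk]))
    return batches

X_demo = list(range(10))
y_demo = [10 * v for v in X_demo]  # label is tied to its sample so alignment can be checked
batches = paired_mini_batches(X_demo, y_demo, mini_batch_size=4, seed=0)

print(len(batches))  # 2 (the remaining 2 samples are dropped)
```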

### Creating a subset of my training set 

In [10]:
X_train = shuffle(theX,random_state=0)

In [11]:
y_train = shuffle(they,random_state=0)

In [12]:
# Smaller subset to show model potential
X_train_s = X_train[:200]
y_train_s = y_train[:200]

In [32]:
# Building and training YOLO model
def model(X_train,y_train,lr=0.001,minibatch_size=50,num_epochs=200,print_cost=True):
    tf.reset_default_graph() #resetting graph
    tf.set_random_seed(1)
    seed=0
    costs=[]
    x_h = X_train[0].shape[0]
    x_w = X_train[0].shape[1]
    x_c = X_train[0].shape[2]
    y_h = y_train[0].shape[0]
    y_w = y_train[0].shape[1]
    y_c = y_train[0].shape[2]
    m = X_train.shape[0]
    
    X,y = get_placeholders(x_h,x_w,x_c,y_h,y_w,y_c)
    Z = forward_pass(X)
    cost = cost_function(Z, y)
    optimizer = tf.train.AdamOptimizer(learning_rate=lr).minimize(cost)
    
    init = tf.global_variables_initializer()
    saver = tf.train.Saver()
    with tf.Session() as sess:
        # Loading saved model
        saver = tf.train.import_meta_graph("../../structured_dl_files/models/yolo_model.ckpt.meta")
        saver.restore(sess, "../../structured_dl_files/models/yolo_model.ckpt")
        
        #sess.run(init) # DONT RUN INIT IF LOADING MODEL
        train_writer = tf.summary.FileWriter('../../structured_dl_files/graphs', sess.graph)
        
        for epoch in range(num_epochs):
            minibatch_cost = 0
            seed += 1
            minibatches = random_mini_batches(X_train, y_train, minibatch_size, seed)
            
            for minibatch in minibatches:
                (mini_x,mini_y) = minibatch
                _,temp_cost = sess.run([optimizer,cost], feed_dict={X:mini_x,y:mini_y})
                minibatch_cost += temp_cost
                print(minibatch_cost)
                
            costs.append(minibatch_cost) # store the epoch's summed cost value, not the cost tensor
            if print_cost and epoch % 1 == 0:
                print("Cost at epoch {}: {}".format(epoch+1,minibatch_cost))
                
        loc = saver.save(sess, "../../structured_dl_files/models/yolo_model.ckpt")
        aZ = sess.run(Z,feed_dict={X:X_train,y:y_train}) # predictions
        return costs,aZ

In [None]:
# Second round of training
costs,preds = model(X_train_s,y_train_s)

INFO:tensorflow:Restoring parameters from ../../structured_dl_files/models/yolo_model.ckpt
106.44920349121094
216.62578582763672
323.6735382080078
432.7912902832031
Cost at epoch 1: 432.7912902832031
113.49668884277344
216.3068084716797
324.7654266357422
429.34898376464844
Cost at epoch 2: 429.34898376464844
104.82429504394531
207.84427642822266
315.4105682373047
416.12530517578125
Cost at epoch 3: 416.12530517578125
101.20387268066406
199.7838363647461
297.68029022216797
405.85037994384766
Cost at epoch 4: 405.85037994384766
102.01214599609375
199.57809448242188
305.50592041015625
403.7985076904297
Cost at epoch 5: 403.7985076904297
94.0064697265625
202.92708587646484
307.5936279296875
406.3806838989258
Cost at epoch 6: 406.3806838989258
105.31407165527344
200.9748077392578
300.3194351196289
405.46964263916016
Cost at epoch 7: 405.46964263916016
101.09192657470703
203.12165069580078
302.19994354248047
392.60133361816406
Cost at epoch 8: 392.60133361816406
94.88398742675781
186.5220108

2.019789457321167
5.141369342803955
7.2674572467803955
9.649935960769653
Cost at epoch 76: 9.649935960769653
1.8516333103179932
4.78200888633728
6.686229109764099
8.60006308555603
Cost at epoch 77: 8.60006308555603
2.506596803665161
4.425396084785461
6.105759382247925
7.833284974098206
Cost at epoch 78: 7.833284974098206
1.3440451622009277
3.381235361099243
5.18697464466095
7.399457812309265
Cost at epoch 79: 7.399457812309265
1.544036626815796
2.909920811653137
5.322439789772034
6.769153714179993
Cost at epoch 80: 6.769153714179993
2.3017094135284424
3.65920627117157
4.817486882209778
6.087842106819153
Cost at epoch 81: 6.087842106819153
2.0072267055511475
3.1317977905273438
4.5333287715911865
5.651757836341858
Cost at epoch 82: 5.651757836341858
1.8525433540344238
2.8768386840820312
3.989458441734314
5.022396802902222
Cost at epoch 83: 5.022396802902222
0.9377989768981934
2.6602107286453247
3.595832347869873
4.542452096939087
Cost at epoch 84: 4.542452096939087
1.6573599576950073
2.4

0.1663651019334793
0.32262490689754486
0.4942900687456131
0.654864564538002
Cost at epoch 149: 0.654864564538002
0.15620708465576172
0.3349378705024719
0.484584242105484
0.6328218877315521
Cost at epoch 150: 0.6328218877315521
0.16925179958343506
0.3195473849773407
0.46682174503803253
0.6150528937578201
Cost at epoch 151: 0.6150528937578201
0.14815379679203033
0.28602515161037445
0.4168928563594818
0.5553697049617767
Cost at epoch 152: 0.5553697049617767
0.12845489382743835
0.2796456217765808
0.4028455466032028
0.5358579009771347
Cost at epoch 153: 0.5358579009771347
0.12689100205898285
0.2567274868488312
0.37830693274736404
0.524681381881237
Cost at epoch 154: 0.524681381881237
0.13798139989376068
0.27265506982803345
0.4163765013217926
0.5439875721931458
Cost at epoch 155: 0.5439875721931458
0.13940772414207458
0.27276088297367096
0.4159250259399414
0.5344820097088814
Cost at epoch 156: 0.5344820097088814
0.1383335292339325
0.29689835011959076
0.4287816435098648
0.5623279362916946
Cos

In [None]:
# First round of training
costs,preds = model(X_train_s,y_train_s)

443.8983459472656
843.2326965332031
1018.7478790283203
2587.628372192383
Cost at epoch 1: 2587.628372192383
974.9049682617188
1346.6656494140625
1563.125503540039
1856.5045928955078
Cost at epoch 2: 1856.5045928955078
202.01513671875
447.2460632324219
650.0443267822266
844.9364166259766
Cost at epoch 3: 844.9364166259766
163.08554077148438
304.7502746582031
439.9272918701172
593.5941162109375
Cost at epoch 4: 593.5941162109375
139.77182006835938
258.9366912841797
382.5002899169922
511.2835998535156
Cost at epoch 5: 511.2835998535156
125.28668975830078
251.5133285522461
374.7551040649414
496.48729705810547
Cost at epoch 6: 496.48729705810547
129.27267456054688
244.60352325439453
363.78765869140625
486.10472869873047
Cost at epoch 7: 486.10472869873047
112.49811553955078
239.3030776977539
364.4149475097656
476.52024841308594
Cost at epoch 8: 476.52024841308594
116.88157653808594
239.99359130859375
361.3361053466797
477.6672897338867
Cost at epoch 9: 477.6672897338867
121.48662567138672
2

In [8]:
y_1 = they[0]
X_1 = theX[0]
y_1.shape = (1,13,13,7)
X_1.shape = (1,416,416,3)

In [14]:
tf.reset_default_graph()
X,y = get_placeholders(416,416,3,13,13,7)
Z = forward_pass(X)
saver = tf.train.Saver()
with tf.Session() as sess:
    saver = tf.train.import_meta_graph("../../structured_dl_files/models/yolo_model.ckpt.meta")
    saver.restore(sess, "../../structured_dl_files/models/yolo_model.ckpt")
    aZ = sess.run(Z,feed_dict={X:X_1,y:y_1})

INFO:tensorflow:Restoring parameters from ../../structured_dl_files/models/yolo_model.ckpt


In [15]:
aZ.shape

(1, 13, 13, 7)

In [24]:
y_1[0,4,8,:]

array([1.        , 0.0177665 , 0.394     , 0.46700508, 0.392     ,
       0.        , 1.        ])

In [25]:
aZ[0,4,8,:]

array([ 0.1185999 ,  0.6531289 ,  0.32525685,  0.50805163, -0.48630908,
        0.30823478,  0.60653496], dtype=float32)
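
To read the two vectors side by side, the sketch below copies the values printed above and labels them with the index layout from the `get_box_values` docstring. It is only a reading aid under that assumed layout; the variable names are hypothetical.

```python
FIELDS = ["confidence", "mid_x", "mid_y", "width", "length", "prob_dog", "prob_cat"]

# Values copied from the outputs of y_1[0,4,8,:] and aZ[0,4,8,:]
target = [1.0, 0.0177665, 0.394, 0.46700508, 0.392, 0.0, 1.0]
pred = [0.1185999, 0.6531289, 0.32525685, 0.50805163, -0.48630908, 0.30823478, 0.60653496]

for name, t, p in zip(FIELDS, target, pred):
    print("{:<10} target={:+.4f} pred={:+.4f} diff={:+.4f}".format(name, t, p, p - t))

# Class guess: the larger of the two class scores (index 5 is dog, index 6 is cat)
classes = ["dog", "cat"]
true_class = classes[target[5:].index(max(target[5:]))]
pred_class = classes[pred[5:].index(max(pred[5:]))]
print(true_class, pred_class)  # cat cat
```

For this cell the predicted class agrees with the label, but the predicted confidence (about 0.12) is far from 1 and the predicted length is negative, so the box itself is not yet well fitted.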