
Move bounding box drawing from the server to the client. #869

Merged — 1 commit merged into NVIDIA:master from js-bboxes on Aug 17, 2016

Conversation

jmancewicz
Contributor

In preparation for the multi-class object detection display, I have moved the drawing of the bounding boxes from the server to the client. This will allow quicker interactions with the display. Users will be able to toggle the display of bounding boxes by class, adjust the image desaturation if there are color conflicts between the image and a class's bounding box color, and change the line width.

screen shot 2016-06-24 at 1 49 43 pm

So that lines are drawn with a fixed width regardless of scale, the images are redrawn whenever a window resize changes the size of the image display.

The header text "Found 7 box(es) in 3 image(s)" has been changed to drop the "in n images" if the number of images is one and to pluralize "box" as needed.
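
A minimal sketch of the header-text rule just described (found_header is an illustrative name, not the actual DIGITS helper):

def found_header(n_boxes, n_images):
    # Pluralize "box" as needed.
    text = 'Found %d box%s' % (n_boxes, '' if n_boxes == 1 else 'es')
    # Drop the "in n images" part when there is only one image.
    if n_images != 1:
        text += ' in %d images' % n_images
    return text

# found_header(7, 3) -> 'Found 7 boxes in 3 images'
# found_header(1, 1) -> 'Found 1 box'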

The line width and desaturation settings are saved per browser.
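
A minimal sketch (not the actual DIGITS code) of the server-side shift this change implies: instead of rasterizing boxes into the image with digits.utils.image.add_bboxes_to_image, the server only collects the coordinates and ships them to the client, which draws them itself. The function name and return shape here are assumptions for illustration; the diff further down shows the real call being removed.

def process_outputs(outputs):
    bboxes = []
    for output in outputs:
        # Each output is assumed to hold (x1, y1, x2, y2), as in the diff below.
        box = ((output[0], output[1]), (output[2], output[3]))
        bboxes.append(box)
    # Returned as data (e.g. JSON) for the client-side drawing code.
    return {'bboxes': bboxes}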

@@ -11,6 +11,8 @@

CONFIG_TEMPLATE = "config_template.html"
HEADER_TEMPLATE = "header_template.html"
NG_HEADER_TEMPLATE = "ng_header_template.html"
NG_FOOTER_TEMPLATE = "ng_footer_template.html"
Member

It doesn't look like footer_template.html is used anywhere?

Contributor Author

You're right.

@lukeyeager
Member

Very cool. This is working great for me!

@@ -125,10 +129,9 @@ def process_data(
box = ((output[0], output[1]), (output[2], output[3]))
bboxes.append(box)
self.bbox_count += 1
digits.utils.image.add_bboxes_to_image(
Member

If we remove it here should we just delete it from utils.image?

Contributor Author

I left it there in case there was a use for it in the REST API at some point. I'll just pull it.

@Greendogo

Greendogo commented Jun 28, 2016

I take this to mean DIGITS currently isn't equipped to do multi-class detection? Any idea on an ETA for that? That kind of puts me and my whole team in a bit of a bind.
But we're more than willing to help out with testing when stuff starts to get pushed!

@@ -49,6 +49,32 @@ def get_header_template(self):
"""
return None, None

def get_ng_templates(self):
Contributor

Do you think we could stick to the existing get_header_template() (possibly we can add a get_footer_template() too)? It feels awkward to name APIs after the JavaScript package we are using. If you move to C3 or any other JavaScript library, you don't want to rename the APIs, right?

Contributor Author

@jmancewicz Jul 5, 2016

I agree, @gheinrich. I should at least rename it to something library-agnostic, like app_begin_html and app_end_html or something to that effect. I didn't insert the app header HTML into the existing header because the header is wrapped in a div, and the app code needs to have the interface views as child elements for scope purposes.

<div class="row">
    <h3>Summary</h3>
    {{header_html|safe}}
</div>

If I were to insert the app code into that header block, the interface view would be isolated from it. Thoughts?

I had added a footer, but it was not used and subsequently removed.

Contributor Author

Decided to rename ng_header and ng_footer to app_begin and app_end. It turns out that trying to work these into the header_html would make it too convoluted.
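
A minimal sketch of the renamed, library-agnostic hooks described above (the class and method names are illustrative guesses based on this thread, not the actual DIGITS signatures):

class VisualizationInterface(object):
    def get_app_begin_html(self):
        """HTML emitted before the interface views (replaces get_ng_templates)."""
        return None

    def get_app_end_html(self):
        """HTML emitted after the interface views."""
        return None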

@jmancewicz
Contributor Author

@gheinrich could you have another look before I merge?

@jmancewicz
Contributor Author

jmancewicz commented Jul 20, 2016

I've made some changes to support multi-class object detection visualization.

screen shot 2016-07-19 at 7 19 04 pm

screen shot 2016-07-19 at 7 21 59 pm

With an options popup to adjust the display settings.
screen shot 2016-07-19 at 7 23 02 pm

### Todo:

  • I still need to test the Test DB path.

Would like to do:

  • I'd like to add mouse-over data (class and confidence) for all bboxes under the pointer.
  • I'd like to add a zoom feature, but I find the canvas too slow. I may try WebGL.
  • I'd like to add (Mike's suggestion) an alternative view mode, showing confidence as a color from green to red. I'm not sure how confidence is computed and whether it can be normalized.

@gheinrich
Contributor

Hi Joe, this is fantastic:
detect-car ped
I tried Test DB on my validation database (1496 images). The UI becomes unresponsive, but it's working:
detect-db

@@ -5,18 +5,3 @@
{% from "helper.html" import mark_errors %}

<small>Draw a bounding box around a detected object. This expected network output is a nested list of list of box coordinates</small>
Contributor

if you don't mind fixing this typo: "This expected... " -> "The expected ..." (and maybe "list of list" -> "list of lists" too). Or if you want to completely rephrase it so it does sound English, you're most welcome to do so :-)

Contributor Author

I wrote something new. How English it sounds is arguable.

@lukeyeager
Member

The Travis build failed for some unrelated Torch issue. This looks great - thanks Joe!

@lukeyeager merged commit fb6d8e3 into NVIDIA:master Aug 17, 2016
@szm-R

szm-R commented Sep 5, 2016

Hi, how can I use this multi-class inference visualization? I'm using DIGITS 4 with NVIDIA Caffe 0.15.13.

@lukeyeager
Member

@szm2015 this pull request simply adds visualization support in DIGITS. It doesn't provide information about how to actually train a working model.

You can try to use the detectnet_network-2classes.prototxt that @gheinrich checked in with NVIDIA/caffe#157 (you seem to be already aware of that pull request), and that should help you get started with a multi-class network architecture. However, getting it to converge on KITTI data is tricky. Hopefully we'll be able to write up some documentation on how to train a model for yourself soon.

@szm-R

szm-R commented Sep 6, 2016

@lukeyeager, thank you, but my question was indeed about visualization, not training. I have already trained a 2-class network but have a problem visualizing it: it only shows one class, and I don't know whether that's because normal visualization doesn't support more than one class or because I'm not doing the training right at all.

@lukeyeager
Member

lukeyeager commented Sep 6, 2016

Oh I see. If your second class isn't showing up, then you simply aren't generating any proposals for that class.

You can monkey around with the thresholds in the clustering layer to try to generate more boxes.

Try changing from

param_str : '1248, 352, 16, 0.6, 3, 0.02, 22'

to

param_str : '1248, 352, 16, 0.2, 2, 0.01, 22'

for example.

Just make some edits to the deploy.prototxt in your job directory and try running inference again. You shouldn't need to restart the DIGITS server.
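
A minimal sketch of what the ClusterDetections param_str fields appear to mean, based on the values above and the field names mentioned later in this thread (gridbox_cvg_threshold, gridbox_rect_thresh, eps); loosening these values, as suggested above, tends to produce more proposals. Check caffe.layers.detectnet.clustering in your NVIDIA/caffe checkout to confirm the field order:

plist = '1248, 352, 16, 0.2, 2, 0.01, 22'.split(',')
image_size_x = int(plist[0])             # input width
image_size_y = int(plist[1])             # input height
stride = int(plist[2])                   # grid-cell size in pixels
gridbox_cvg_threshold = float(plist[3])  # coverage threshold (0.6 -> 0.2 in the example)
gridbox_rect_thresh = int(plist[4])      # min rectangle count (3 -> 2 in the example)
gridbox_rect_eps = float(plist[5])       # grouping eps (0.02 -> 0.01 in the example)
min_height = int(plist[6])               # minimum box height in pixels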

@jmancewicz
Contributor Author

When using the Bounding Boxes visualization, I see this when I have results for two classes:
screen shot 2016-09-06 at 12 57 19 pm

screen shot 2016-09-06 at 12 57 33 pm

You can use the Raw Data to see the data that describes the Bounding Boxes.
screen shot 2016-09-06 at 12 58 00 pm

The empty boxes are filtered out before going to the Bounding Box visualization, but are present in the Raw Data. You can see here that bbox-list-class1 (pedestrian) has one box and bbox-list-class0 (car) has two.

screen shot 2016-09-06 at 12 58 37 pm
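
A minimal sketch of the filtering described above: drop all-zero rows from a raw bbox-list blob before handing it to the Bounding Boxes visualization (drop_empty_boxes is an illustrative name, not the actual DIGITS helper; rows are assumed to be [x1, y1, x2, y2, confidence], as in the Raw Data view):

import numpy as np

def drop_empty_boxes(bbox_list):
    arr = np.asarray(bbox_list)
    # Keep only rows where at least one value is non-zero.
    return arr[np.any(arr != 0, axis=1)]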

@szm-R

szm-R commented Sep 7, 2016

@lukeyeager Thank you, I will try this.
A rather unrelated question: how should we cite DIGITS or DetectNet if we are using them in our research?

@lukeyeager
Member

lukeyeager commented Sep 7, 2016

@szm2015 thanks for asking!

For DIGITS you could cite either of these papers. I'd suggest using the first since it seems the links for the second paper will probably expire next year.

DetectNet doesn't have a paper, unfortunately. I guess you can link to this pull request? NVIDIA/caffe#144

EDIT

Actually, this blog post is probably a better link for DetectNet:
https://devblogs.nvidia.com/parallelforall/detectnet-deep-neural-network-object-detection-digits/

@varunvv

varunvv commented Dec 21, 2016

@lukeyeager I am trying to train a model to detect more than two classes of objects.
But as the number of classes increases, the mAP, precision, and recall for each model decrease.
Screenshots of the training results for 2-class and 3-class object detection are given below.
Result obtained for 2-class object detection:
1

Result obtained for 3-class object detection:
2

Please help me increase the performance of the models.

@ShashankVBhat

I have trained my network for 3 object classes and I am getting good mAP for all 3 classes, but I am not getting bounding boxes for the first 2 classes; it draws bounding boxes only for the last class.
If I view the raw data, it shows bounding boxes for all 3 classes.

I would like to know whether I have to make changes to visualize bounding boxes for multiple classes, or whether DIGITS handles multi-class bounding box visualization by default.

@varunvv

varunvv commented Dec 29, 2016

@ShashankVBhat Recheck the last few layers of the network, namely "cluster", "cluster_gt", "score", and "mAP".
There should be one "cluster" layer and one "cluster_gt" layer, each with n "top" parameters, where n is the number of object classes. In your case, the "top" parameter should occur 3 times in both the "cluster" and "cluster_gt" layers.
The "cluster" and "cluster_gt" layers for a 2-class case are shown below:
layer {
  type: 'Python'
  name: 'cluster'
  bottom: 'coverage'
  bottom: 'bboxes'
  top: 'bbox-list-class0'
  top: 'bbox-list-class1'
  python_param {
    module: 'caffe.layers.detectnet.clustering'
    layer: 'ClusterDetections'
    param_str : '1600, 1232, 16, 0.6, 3, 0.02, 22, 2'
  }
  include: { phase: TEST }
}

layer {
  type: 'Python'
  name: 'cluster_gt'
  bottom: 'coverage-label'
  bottom: 'bbox-label'
  top: 'bbox-list-label-class0'
  top: 'bbox-list-label-class1'
  python_param {
    module: 'caffe.layers.detectnet.clustering'
    layer: 'ClusterGroundtruth'
    param_str : '1600, 1232, 16, 2'
  }
  include: { phase: TEST stage: "val" }
}

Now, for each object class there should be a "score" layer and an "mAP" layer. The "score" and "mAP" layers for class0 are given below:

layer {
  type: 'Python'
  name: 'score-class0'
  bottom: 'bbox-list-label-class0'
  bottom: 'bbox-list-class0'
  top: 'bbox-list-scored-class0'
  python_param {
    module: 'caffe.layers.detectnet.mean_ap'
    layer: 'ScoreDetections'
  }
  include: { phase: TEST stage: "val" }
}
layer {
  type: 'Python'
  name: 'mAP-class0'
  bottom: 'bbox-list-scored-class0'
  top: 'mAP-class0'
  top: 'precision-class0'
  top: 'recall-class0'
  python_param {
    module: 'caffe.layers.detectnet.mean_ap'
    layer: 'mAP'
    param_str : '1600, 1232, 16'
  }
  include: { phase: TEST stage: "val" }
}
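
Since the score/mAP pair has to be repeated per class, here is a minimal sketch that emits those layer definitions for an arbitrary number of classes, following the pattern above (the layer names and param_str come from varunvv's example; verify them against your own network):

def per_class_map_layers(n_classes, image_size_x=1600, image_size_y=1232, stride=16):
    layers = []
    for c in range(n_classes):
        layers.append("""\
layer {
  type: 'Python'
  name: 'score-class%(c)d'
  bottom: 'bbox-list-label-class%(c)d'
  bottom: 'bbox-list-class%(c)d'
  top: 'bbox-list-scored-class%(c)d'
  python_param {
    module: 'caffe.layers.detectnet.mean_ap'
    layer: 'ScoreDetections'
  }
  include: { phase: TEST stage: "val" }
}
layer {
  type: 'Python'
  name: 'mAP-class%(c)d'
  bottom: 'bbox-list-scored-class%(c)d'
  top: 'mAP-class%(c)d'
  top: 'precision-class%(c)d'
  top: 'recall-class%(c)d'
  python_param {
    module: 'caffe.layers.detectnet.mean_ap'
    layer: 'mAP'
    param_str : '%(x)d, %(y)d, %(s)d'
  }
  include: { phase: TEST stage: "val" }
}""" % {'c': c, 'x': image_size_x, 'y': image_size_y, 's': stride})
    return '\n'.join(layers)

# print(per_class_map_layers(3))  # paste the output into your .prototxt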

@ShashankVBhat

@varunvv Thank you for the response. I have checked these settings as you mentioned, but I am still not able to get bounding boxes.
@lukeyeager @varunvv @szm2015
Does that depend on the values of gridbox_cvg_threshold, gridbox_rect_thresh, and the eps variable?
It would be very helpful to get the correct definitions of these variables.

@jmancewicz deleted the js-bboxes branch January 4, 2017 19:41
@findorion

@ShashankVBhat Hi, you said that you managed to create a network for 3 classes. I tried to do so with 2 classes using the modifications from https://github.com/NVIDIA/caffe/blob/caffe-0.15/examples/kitti/detectnet_network-2classes.prototxt, but it seems that neither the recall nor the precision/mAP moves from 0. I kept the values from the tutorial https://github.com/NVIDIA/DIGITS/blob/master/examples/object-detection/README.md for the batch size, learning rate, etc.
Do you have any idea, or could you help me through that?

@ShervinAr

@jmancewicz Hello, is there any chance you could provide the Caffe model for multi-class detection of cars and pedestrians in DIGITS?
I am using detectnet_network-2classes.prototxt as the network.
Many thanks.

SlipknotTN pushed a commit to cynnyx/DIGITS that referenced this pull request Mar 30, 2017
Move bounding box drawing from the server to the client.
@sulth

sulth commented Sep 25, 2017

@varunvv I got a 1-class network trained for cars, as well as for my own data, but I am struggling to find a solution for multi-class detection in DIGITS. I have tried various approaches in DIGITS, but the network always trains only for cars. Please help me solve this.

@chandanv2

@ShashankVBhat Hey, did you train for all 3 classes at once, or did you fine-tune for the other classes after training on a single class? Please share the details of the training parameters you've used. It would be of great help!
Thank you.

@chandanv2

@jmancewicz How are you getting such good results for both classes (cars and pedestrians)? Please share the training process details. Did you fine-tune over one class, or train the model for both classes at once?
Thanks for the info!

@eanmikale

eanmikale commented Aug 13, 2020

@lukeyeager @senecaur @gheinrich @dusty-nv

I am getting the following raw output for multi-class inference. Could someone show me an example of the Python binding layer? Here is my inference output:

bbox-list-class1 [[0. 0. 0. 0. 0.] [0. 0. 0. 0. 0.] [0. 0. 0. 0. 0.] [0. 0. 0. 0. 0.] [0. 0. 0. 0. 0.] [0. 0. 0. 0. 0.] [0. 0. 0. 0. 0.] [0. 0. 0. 0. 0.] [0. 0. 0. 0. 0.] [0. 0. 0. 0. 0.] [0. 0. 0. 0. 0.] [0. 0. 0. 0. 0.] [0. 0. 0. 0. 0.] [0. 0. 0. 0. 0.] [0. 0. 0. 0. 0.] [0. 0. 0. 0. 0.] [0. 0. 0. 0. 0.] [0. 0. 0. 0. 0.] [0. 0. 0. 0. 0.] [0. 0. 0. 0. 0.] [0. 0. 0. 0. 0.] [0. 0. 0. 0. 0.] [0. 0. 0. 0. 0.] [0. 0. 0. 0. 0.] [0. 0. 0. 0. 0.] [0. 0. 0. 0. 0.] [0. 0. 0. 0. 0.] [0. 0. 0. 0. 0.] [0. 0. 0. 0. 0.] [0. 0. 0. 0. 0.] [0. 0. 0. 0. 0.] [0. 0. 0. 0. 0.] [0. 0. 0. 0. 0.] [0. 0. 0. 0. 0.] [0. 0. 0. 0. 0.] [0. 0. 0. 0. 0.] [0. 0. 0. 0. 0.] [0. 0. 0. 0. 0.] [0. 0. 0. 0. 0.] [0. 0. 0. 0. 0.] [0. 0. 0. 0. 0.] [0. 0. 0. 0. 0.] [0. 0. 0. 0. 0.] [0. 0. 0. 0. 0.] [0. 0. 0. 0. 0.] [0. 0. 0. 0. 0.] [0. 0. 0. 0. 0.] [0. 0. 0. 0. 0.] [0. 0. 0. 0. 0.] [0. 0. 0. 0. 0.]] bbox-list-class0 [[0. 0. 0. 0. 0.] [0. 0. 0. 0. 0.] [0. 0. 0. 0. 0.] [0. 0. 0. 0. 0.] [0. 0. 0. 0. 0.] [0. 0. 0. 0. 0.] [0. 0. 0. 0. 0.] [0. 0. 0. 0. 0.] [0. 0. 0. 0. 0.] [0. 0. 0. 0. 0.] [0. 0. 0. 0. 0.] [0. 0. 0. 0. 0.] [0. 0. 0. 0. 0.] [0. 0. 0. 0. 0.] [0. 0. 0. 0. 0.] [0. 0. 0. 0. 0.] [0. 0. 0. 0. 0.] [0. 0. 0. 0. 0.] [0. 0. 0. 0. 0.] [0. 0. 0. 0. 0.] [0. 0. 0. 0. 0.] [0. 0. 0. 0. 0.] [0. 0. 0. 0. 0.] [0. 0. 0. 0. 0.] [0. 0. 0. 0. 0.] [0. 0. 0. 0. 0.] [0. 0. 0. 0. 0.] [0. 0. 0. 0. 0.] [0. 0. 0. 0. 0.] [0. 0. 0. 0. 0.] [0. 0. 0. 0. 0.] [0. 0. 0. 0. 0.] [0. 0. 0. 0. 0.] [0. 0. 0. 0. 0.] [0. 0. 0. 0. 0.] [0. 0. 0. 0. 0.] [0. 0. 0. 0. 0.] [0. 0. 0. 0. 0.] [0. 0. 0. 0. 0.] [0. 0. 0. 0. 0.] [0. 0. 0. 0. 0.] [0. 0. 0. 0. 0.] [0. 0. 0. 0. 0.] [0. 0. 0. 0. 0.] [0. 0. 0. 0. 0.] [0. 0. 0. 0. 0.] [0. 0. 0. 0. 0.] [0. 0. 0. 0. 0.] [0. 0. 0. 0. 0.] [0. 0. 0. 0. 0.]]

My mAP is zero after 300 epochs.

My DetectNet network is included below:

# DetectNet network

# Data/Input layers

name: "DetectNet"
layer {
name: "train_data"
type: "Data"
top: "data"
data_param {
backend: LMDB
source: "examples/kitti/kitti_train_images.lmdb"
batch_size: 10
}
include: { phase: TRAIN }
}
layer {
name: "train_label"
type: "Data"
top: "label"
data_param {
backend: LMDB
source: "examples/kitti/kitti_train_labels.lmdb"
batch_size: 10
}
include: { phase: TRAIN }
}
layer {
name: "val_data"
type: "Data"
top: "data"
data_param {
backend: LMDB
source: "examples/kitti/kitti_test_images.lmdb"
batch_size: 6
}
include: { phase: TEST stage: "val" }
}
layer {
name: "val_label"
type: "Data"
top: "label"
data_param {
backend: LMDB
source: "examples/kitti/kitti_test_labels.lmdb"
batch_size: 6
}
include: { phase: TEST stage: "val" }
}
layer {
name: "deploy_data"
type: "Input"
top: "data"
input_param {
shape {
dim: 1
dim: 3
dim: 720
dim: 1248
}
}
include: { phase: TEST not_stage: "val" }
}

# Data transformation layers

layer {
name: "train_transform"
type: "DetectNetTransformation"
bottom: "data"
bottom: "label"
top: "transformed_data"
top: "transformed_label"
detectnet_groundtruth_param: {
stride: 16
scale_cvg: 0.4
gridbox_type: GRIDBOX_MIN
coverage_type: RECTANGULAR
min_cvg_len: 20
obj_norm: true
image_size_x: 1248
image_size_y: 720
crop_bboxes: false
object_class: { src: 1 dst: 0} # cars -> 0
object_class: { src: 8 dst: 1} # pedestrians -> 1
}
detectnet_augmentation_param: {
crop_prob: 1
shift_x: 32
shift_y: 32
flip_prob: 0.5
rotation_prob: 0
max_rotate_degree: 5
scale_prob: 0.4
scale_min: 0.8
scale_max: 1.2
hue_rotation_prob: 0.8
hue_rotation: 30
desaturation_prob: 0.8
desaturation_max: 0.8
}
transform_param: {
mean_value: 127
}
include: { phase: TRAIN }
}
layer {
name: "val_transform"
type: "DetectNetTransformation"
bottom: "data"
bottom: "label"
top: "transformed_data"
top: "transformed_label"
detectnet_groundtruth_param: {
stride: 16
scale_cvg: 0.4
gridbox_type: GRIDBOX_MIN
coverage_type: RECTANGULAR
min_cvg_len: 20
obj_norm: true
image_size_x: 1248
image_size_y: 720
crop_bboxes: false
object_class: { src: 1 dst: 0} # cars -> 0
object_class: { src: 8 dst: 1} # pedestrians -> 1
}
transform_param: {
mean_value: 127
}
include: { phase: TEST stage: "val" }
}
layer {
name: "deploy_transform"
type: "Power"
bottom: "data"
top: "transformed_data"
power_param {
shift: -127
}
include: { phase: TEST not_stage: "val" }
}

# Label conversion layers

layer {
name: "slice-label"
type: "Slice"
bottom: "transformed_label"
top: "foreground-label"
top: "bbox-label"
top: "size-label"
top: "obj-label"
top: "coverage-label"
slice_param {
slice_dim: 1
slice_point: 1
slice_point: 5
slice_point: 7
slice_point: 8
}
include { phase: TRAIN }
include { phase: TEST stage: "val" }
}
layer {
name: "coverage-block"
type: "Concat"
bottom: "foreground-label"
bottom: "foreground-label"
bottom: "foreground-label"
bottom: "foreground-label"
top: "coverage-block"
concat_param {
concat_dim: 1
}
include { phase: TRAIN }
include { phase: TEST stage: "val" }
}
layer {
name: "size-block"
type: "Concat"
bottom: "size-label"
bottom: "size-label"
top: "size-block"
concat_param {
concat_dim: 1
}
include { phase: TRAIN }
include { phase: TEST stage: "val" }
}
layer {
name: "obj-block"
type: "Concat"
bottom: "obj-label"
bottom: "obj-label"
bottom: "obj-label"
bottom: "obj-label"
top: "obj-block"
concat_param {
concat_dim: 1
}
include { phase: TRAIN }
include { phase: TEST stage: "val" }
}
layer {
name: "bb-label-norm"
type: "Eltwise"
bottom: "bbox-label"
bottom: "size-block"
top: "bbox-label-norm"
eltwise_param {
operation: PROD
}
include { phase: TRAIN }
include { phase: TEST stage: "val" }
}
layer {
name: "bb-obj-norm"
type: "Eltwise"
bottom: "bbox-label-norm"
bottom: "obj-block"
top: "bbox-obj-label-norm"
eltwise_param {
operation: PROD
}
include { phase: TRAIN }
include { phase: TEST stage: "val" }
}

######################################################################

# Start of convolutional network

######################################################################

layer {
name: "conv1/7x7_s2"
type: "Convolution"
bottom: "transformed_data"
top: "conv1/7x7_s2"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 2
decay_mult: 0
}
convolution_param {
num_output: 64
pad: 3
kernel_size: 7
stride: 2
weight_filler {
type: "xavier"
std: 0.1
}
bias_filler {
type: "constant"
value: 0.2
}
}
}

layer {
name: "conv1/relu_7x7"
type: "ReLU"
bottom: "conv1/7x7_s2"
top: "conv1/7x7_s2"
}

layer {
name: "pool1/3x3_s2"
type: "Pooling"
bottom: "conv1/7x7_s2"
top: "pool1/3x3_s2"
pooling_param {
pool: MAX
kernel_size: 3
stride: 2
}
}

layer {
name: "pool1/norm1"
type: "LRN"
bottom: "pool1/3x3_s2"
top: "pool1/norm1"
lrn_param {
local_size: 5
alpha: 0.0001
beta: 0.75
}
}

layer {
name: "conv2/3x3_reduce"
type: "Convolution"
bottom: "pool1/norm1"
top: "conv2/3x3_reduce"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 2
decay_mult: 0
}
convolution_param {
num_output: 64
kernel_size: 1
weight_filler {
type: "xavier"
std: 0.1
}
bias_filler {
type: "constant"
value: 0.2
}
}
}

layer {
name: "conv2/relu_3x3_reduce"
type: "ReLU"
bottom: "conv2/3x3_reduce"
top: "conv2/3x3_reduce"
}

layer {
name: "conv2/3x3"
type: "Convolution"
bottom: "conv2/3x3_reduce"
top: "conv2/3x3"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 2
decay_mult: 0
}
convolution_param {
num_output: 192
pad: 1
kernel_size: 3
weight_filler {
type: "xavier"
std: 0.03
}
bias_filler {
type: "constant"
value: 0.2
}
}
}

layer {
name: "conv2/relu_3x3"
type: "ReLU"
bottom: "conv2/3x3"
top: "conv2/3x3"
}

layer {
name: "conv2/norm2"
type: "LRN"
bottom: "conv2/3x3"
top: "conv2/norm2"
lrn_param {
local_size: 5
alpha: 0.0001
beta: 0.75
}
}

layer {
name: "pool2/3x3_s2"
type: "Pooling"
bottom: "conv2/norm2"
top: "pool2/3x3_s2"
pooling_param {
pool: MAX
kernel_size: 3
stride: 2
}
}

layer {
name: "inception_3a/1x1"
type: "Convolution"
bottom: "pool2/3x3_s2"
top: "inception_3a/1x1"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 2
decay_mult: 0
}
convolution_param {
num_output: 64
kernel_size: 1
weight_filler {
type: "xavier"
std: 0.03
}
bias_filler {
type: "constant"
value: 0.2
}
}
}

layer {
name: "inception_3a/relu_1x1"
type: "ReLU"
bottom: "inception_3a/1x1"
top: "inception_3a/1x1"
}

layer {
name: "inception_3a/3x3_reduce"
type: "Convolution"
bottom: "pool2/3x3_s2"
top: "inception_3a/3x3_reduce"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 2
decay_mult: 0
}
convolution_param {
num_output: 96
kernel_size: 1
weight_filler {
type: "xavier"
std: 0.09
}
bias_filler {
type: "constant"
value: 0.2
}
}
}

layer {
name: "inception_3a/relu_3x3_reduce"
type: "ReLU"
bottom: "inception_3a/3x3_reduce"
top: "inception_3a/3x3_reduce"
}

layer {
name: "inception_3a/3x3"
type: "Convolution"
bottom: "inception_3a/3x3_reduce"
top: "inception_3a/3x3"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 2
decay_mult: 0
}
convolution_param {
num_output: 128
pad: 1
kernel_size: 3
weight_filler {
type: "xavier"
std: 0.03
}
bias_filler {
type: "constant"
value: 0.2
}
}
}

layer {
name: "inception_3a/relu_3x3"
type: "ReLU"
bottom: "inception_3a/3x3"
top: "inception_3a/3x3"
}

layer {
name: "inception_3a/5x5_reduce"
type: "Convolution"
bottom: "pool2/3x3_s2"
top: "inception_3a/5x5_reduce"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 2
decay_mult: 0
}
convolution_param {
num_output: 16
kernel_size: 1
weight_filler {
type: "xavier"
std: 0.2
}
bias_filler {
type: "constant"
value: 0.2
}
}
}
layer {
name: "inception_3a/relu_5x5_reduce"
type: "ReLU"
bottom: "inception_3a/5x5_reduce"
top: "inception_3a/5x5_reduce"
}
layer {
name: "inception_3a/5x5"
type: "Convolution"
bottom: "inception_3a/5x5_reduce"
top: "inception_3a/5x5"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 2
decay_mult: 0
}
convolution_param {
num_output: 32
pad: 2
kernel_size: 5
weight_filler {
type: "xavier"
std: 0.03
}
bias_filler {
type: "constant"
value: 0.2
}
}
}
layer {
name: "inception_3a/relu_5x5"
type: "ReLU"
bottom: "inception_3a/5x5"
top: "inception_3a/5x5"
}

layer {
name: "inception_3a/pool"
type: "Pooling"
bottom: "pool2/3x3_s2"
top: "inception_3a/pool"
pooling_param {
pool: MAX
kernel_size: 3
stride: 1
pad: 1
}
}

layer {
name: "inception_3a/pool_proj"
type: "Convolution"
bottom: "inception_3a/pool"
top: "inception_3a/pool_proj"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 2
decay_mult: 0
}
convolution_param {
num_output: 32
kernel_size: 1
weight_filler {
type: "xavier"
std: 0.1
}
bias_filler {
type: "constant"
value: 0.2
}
}
}
layer {
name: "inception_3a/relu_pool_proj"
type: "ReLU"
bottom: "inception_3a/pool_proj"
top: "inception_3a/pool_proj"
}

layer {
name: "inception_3a/output"
type: "Concat"
bottom: "inception_3a/1x1"
bottom: "inception_3a/3x3"
bottom: "inception_3a/5x5"
bottom: "inception_3a/pool_proj"
top: "inception_3a/output"
}

layer {
name: "inception_3b/1x1"
type: "Convolution"
bottom: "inception_3a/output"
top: "inception_3b/1x1"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 2
decay_mult: 0
}
convolution_param {
num_output: 128
kernel_size: 1
weight_filler {
type: "xavier"
std: 0.03
}
bias_filler {
type: "constant"
value: 0.2
}
}
}

layer {
name: "inception_3b/relu_1x1"
type: "ReLU"
bottom: "inception_3b/1x1"
top: "inception_3b/1x1"
}

layer {
name: "inception_3b/3x3_reduce"
type: "Convolution"
bottom: "inception_3a/output"
top: "inception_3b/3x3_reduce"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 2
decay_mult: 0
}
convolution_param {
num_output: 128
kernel_size: 1
weight_filler {
type: "xavier"
std: 0.09
}
bias_filler {
type: "constant"
value: 0.2
}
}
}
layer {
name: "inception_3b/relu_3x3_reduce"
type: "ReLU"
bottom: "inception_3b/3x3_reduce"
top: "inception_3b/3x3_reduce"
}
layer {
name: "inception_3b/3x3"
type: "Convolution"
bottom: "inception_3b/3x3_reduce"
top: "inception_3b/3x3"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 2
decay_mult: 0
}
convolution_param {
num_output: 192
pad: 1
kernel_size: 3
weight_filler {
type: "xavier"
std: 0.03
}
bias_filler {
type: "constant"
value: 0.2
}
}
}
layer {
name: "inception_3b/relu_3x3"
type: "ReLU"
bottom: "inception_3b/3x3"
top: "inception_3b/3x3"
}

layer {
name: "inception_3b/5x5_reduce"
type: "Convolution"
bottom: "inception_3a/output"
top: "inception_3b/5x5_reduce"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 2
decay_mult: 0
}
convolution_param {
num_output: 32
kernel_size: 1
weight_filler {
type: "xavier"
std: 0.2
}
bias_filler {
type: "constant"
value: 0.2
}
}
}
layer {
name: "inception_3b/relu_5x5_reduce"
type: "ReLU"
bottom: "inception_3b/5x5_reduce"
top: "inception_3b/5x5_reduce"
}
layer {
name: "inception_3b/5x5"
type: "Convolution"
bottom: "inception_3b/5x5_reduce"
top: "inception_3b/5x5"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 2
decay_mult: 0
}
convolution_param {
num_output: 96
pad: 2
kernel_size: 5
weight_filler {
type: "xavier"
std: 0.03
}
bias_filler {
type: "constant"
value: 0.2
}
}
}
layer {
name: "inception_3b/relu_5x5"
type: "ReLU"
bottom: "inception_3b/5x5"
top: "inception_3b/5x5"
}

layer {
name: "inception_3b/pool"
type: "Pooling"
bottom: "inception_3a/output"
top: "inception_3b/pool"
pooling_param {
pool: MAX
kernel_size: 3
stride: 1
pad: 1
}
}
layer {
name: "inception_3b/pool_proj"
type: "Convolution"
bottom: "inception_3b/pool"
top: "inception_3b/pool_proj"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 2
decay_mult: 0
}
convolution_param {
num_output: 64
kernel_size: 1
weight_filler {
type: "xavier"
std: 0.1
}
bias_filler {
type: "constant"
value: 0.2
}
}
}
layer {
name: "inception_3b/relu_pool_proj"
type: "ReLU"
bottom: "inception_3b/pool_proj"
top: "inception_3b/pool_proj"
}
layer {
name: "inception_3b/output"
type: "Concat"
bottom: "inception_3b/1x1"
bottom: "inception_3b/3x3"
bottom: "inception_3b/5x5"
bottom: "inception_3b/pool_proj"
top: "inception_3b/output"
}

layer {
name: "pool3/3x3_s2"
type: "Pooling"
bottom: "inception_3b/output"
top: "pool3/3x3_s2"
pooling_param {
pool: MAX
kernel_size: 3
stride: 2
}
}

layer {
name: "inception_4a/1x1"
type: "Convolution"
bottom: "pool3/3x3_s2"
top: "inception_4a/1x1"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 2
decay_mult: 0
}
convolution_param {
num_output: 192
kernel_size: 1
weight_filler {
type: "xavier"
std: 0.03
}
bias_filler {
type: "constant"
value: 0.2
}
}
}

layer {
name: "inception_4a/relu_1x1"
type: "ReLU"
bottom: "inception_4a/1x1"
top: "inception_4a/1x1"
}

layer {
name: "inception_4a/3x3_reduce"
type: "Convolution"
bottom: "pool3/3x3_s2"
top: "inception_4a/3x3_reduce"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 2
decay_mult: 0
}
convolution_param {
num_output: 96
kernel_size: 1
weight_filler {
type: "xavier"
std: 0.09
}
bias_filler {
type: "constant"
value: 0.2
}
}
}

layer {
name: "inception_4a/relu_3x3_reduce"
type: "ReLU"
bottom: "inception_4a/3x3_reduce"
top: "inception_4a/3x3_reduce"
}

layer {
name: "inception_4a/3x3"
type: "Convolution"
bottom: "inception_4a/3x3_reduce"
top: "inception_4a/3x3"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 2
decay_mult: 0
}
convolution_param {
num_output: 208
pad: 1
kernel_size: 3
weight_filler {
type: "xavier"
std: 0.03
}
bias_filler {
type: "constant"
value: 0.2
}
}
}

layer {
name: "inception_4a/relu_3x3"
type: "ReLU"
bottom: "inception_4a/3x3"
top: "inception_4a/3x3"
}

layer {
name: "inception_4a/5x5_reduce"
type: "Convolution"
bottom: "pool3/3x3_s2"
top: "inception_4a/5x5_reduce"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 2
decay_mult: 0
}
convolution_param {
num_output: 16
kernel_size: 1
weight_filler {
type: "xavier"
std: 0.2
}
bias_filler {
type: "constant"
value: 0.2
}
}
}
layer {
name: "inception_4a/relu_5x5_reduce"
type: "ReLU"
bottom: "inception_4a/5x5_reduce"
top: "inception_4a/5x5_reduce"
}
layer {
name: "inception_4a/5x5"
type: "Convolution"
bottom: "inception_4a/5x5_reduce"
top: "inception_4a/5x5"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 2
decay_mult: 0
}
convolution_param {
num_output: 48
pad: 2
kernel_size: 5
weight_filler {
type: "xavier"
std: 0.03
}
bias_filler {
type: "constant"
value: 0.2
}
}
}
layer {
name: "inception_4a/relu_5x5"
type: "ReLU"
bottom: "inception_4a/5x5"
top: "inception_4a/5x5"
}
layer {
name: "inception_4a/pool"
type: "Pooling"
bottom: "pool3/3x3_s2"
top: "inception_4a/pool"
pooling_param {
pool: MAX
kernel_size: 3
stride: 1
pad: 1
}
}
layer {
name: "inception_4a/pool_proj"
type: "Convolution"
bottom: "inception_4a/pool"
top: "inception_4a/pool_proj"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 2
decay_mult: 0
}
convolution_param {
num_output: 64
kernel_size: 1
weight_filler {
type: "xavier"
std: 0.1
}
bias_filler {
type: "constant"
value: 0.2
}
}
}
layer {
name: "inception_4a/relu_pool_proj"
type: "ReLU"
bottom: "inception_4a/pool_proj"
top: "inception_4a/pool_proj"
}
layer {
name: "inception_4a/output"
type: "Concat"
bottom: "inception_4a/1x1"
bottom: "inception_4a/3x3"
bottom: "inception_4a/5x5"
bottom: "inception_4a/pool_proj"
top: "inception_4a/output"
}

layer {
name: "inception_4b/1x1"
type: "Convolution"
bottom: "inception_4a/output"
top: "inception_4b/1x1"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 2
decay_mult: 0
}
convolution_param {
num_output: 160
kernel_size: 1
weight_filler {
type: "xavier"
std: 0.03
}
bias_filler {
type: "constant"
value: 0.2
}
}
}

layer {
name: "inception_4b/relu_1x1"
type: "ReLU"
bottom: "inception_4b/1x1"
top: "inception_4b/1x1"
}
layer {
name: "inception_4b/3x3_reduce"
type: "Convolution"
bottom: "inception_4a/output"
top: "inception_4b/3x3_reduce"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 2
decay_mult: 0
}
convolution_param {
num_output: 112
kernel_size: 1
weight_filler {
type: "xavier"
std: 0.09
}
bias_filler {
type: "constant"
value: 0.2
}
}
}
layer {
name: "inception_4b/relu_3x3_reduce"
type: "ReLU"
bottom: "inception_4b/3x3_reduce"
top: "inception_4b/3x3_reduce"
}
layer {
name: "inception_4b/3x3"
type: "Convolution"
bottom: "inception_4b/3x3_reduce"
top: "inception_4b/3x3"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 2
decay_mult: 0
}
convolution_param {
num_output: 224
pad: 1
kernel_size: 3
weight_filler {
type: "xavier"
std: 0.03
}
bias_filler {
type: "constant"
value: 0.2
}
}
}
layer {
name: "inception_4b/relu_3x3"
type: "ReLU"
bottom: "inception_4b/3x3"
top: "inception_4b/3x3"
}
layer {
name: "inception_4b/5x5_reduce"
type: "Convolution"
bottom: "inception_4a/output"
top: "inception_4b/5x5_reduce"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 2
decay_mult: 0
}
convolution_param {
num_output: 24
kernel_size: 1
weight_filler {
type: "xavier"
std: 0.2
}
bias_filler {
type: "constant"
value: 0.2
}
}
}
layer {
name: "inception_4b/relu_5x5_reduce"
type: "ReLU"
bottom: "inception_4b/5x5_reduce"
top: "inception_4b/5x5_reduce"
}
layer {
name: "inception_4b/5x5"
type: "Convolution"
bottom: "inception_4b/5x5_reduce"
top: "inception_4b/5x5"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 2
decay_mult: 0
}
convolution_param {
num_output: 64
pad: 2
kernel_size: 5
weight_filler {
type: "xavier"
std: 0.03
}
bias_filler {
type: "constant"
value: 0.2
}
}
}
layer {
name: "inception_4b/relu_5x5"
type: "ReLU"
bottom: "inception_4b/5x5"
top: "inception_4b/5x5"
}
layer {
name: "inception_4b/pool"
type: "Pooling"
bottom: "inception_4a/output"
top: "inception_4b/pool"
pooling_param {
pool: MAX
kernel_size: 3
stride: 1
pad: 1
}
}
layer {
name: "inception_4b/pool_proj"
type: "Convolution"
bottom: "inception_4b/pool"
top: "inception_4b/pool_proj"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 2
decay_mult: 0
}
convolution_param {
num_output: 64
kernel_size: 1
weight_filler {
type: "xavier"
std: 0.1
}
bias_filler {
type: "constant"
value: 0.2
}
}
}
layer {
name: "inception_4b/relu_pool_proj"
type: "ReLU"
bottom: "inception_4b/pool_proj"
top: "inception_4b/pool_proj"
}
layer {
name: "inception_4b/output"
type: "Concat"
bottom: "inception_4b/1x1"
bottom: "inception_4b/3x3"
bottom: "inception_4b/5x5"
bottom: "inception_4b/pool_proj"
top: "inception_4b/output"
}

layer {
name: "inception_4c/1x1"
type: "Convolution"
bottom: "inception_4b/output"
top: "inception_4c/1x1"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 2
decay_mult: 0
}
convolution_param {
num_output: 128
kernel_size: 1
weight_filler {
type: "xavier"
std: 0.03
}
bias_filler {
type: "constant"
value: 0.2
}
}
}

layer {
name: "inception_4c/relu_1x1"
type: "ReLU"
bottom: "inception_4c/1x1"
top: "inception_4c/1x1"
}

layer {
name: "inception_4c/3x3_reduce"
type: "Convolution"
bottom: "inception_4b/output"
top: "inception_4c/3x3_reduce"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 2
decay_mult: 0
}
convolution_param {
num_output: 128
kernel_size: 1
weight_filler {
type: "xavier"
std: 0.09
}
bias_filler {
type: "constant"
value: 0.2
}
}
}

layer {
name: "inception_4c/relu_3x3_reduce"
type: "ReLU"
bottom: "inception_4c/3x3_reduce"
top: "inception_4c/3x3_reduce"
}
layer {
name: "inception_4c/3x3"
type: "Convolution"
bottom: "inception_4c/3x3_reduce"
top: "inception_4c/3x3"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 2
decay_mult: 0
}
convolution_param {
num_output: 256
pad: 1
kernel_size: 3
weight_filler {
type: "xavier"
std: 0.03
}
bias_filler {
type: "constant"
value: 0.2
}
}
}
layer {
name: "inception_4c/relu_3x3"
type: "ReLU"
bottom: "inception_4c/3x3"
top: "inception_4c/3x3"
}
layer {
name: "inception_4c/5x5_reduce"
type: "Convolution"
bottom: "inception_4b/output"
top: "inception_4c/5x5_reduce"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 2
decay_mult: 0
}
convolution_param {
num_output: 24
kernel_size: 1
weight_filler {
type: "xavier"
std: 0.2
}
bias_filler {
type: "constant"
value: 0.2
}
}
}
layer {
name: "inception_4c/relu_5x5_reduce"
type: "ReLU"
bottom: "inception_4c/5x5_reduce"
top: "inception_4c/5x5_reduce"
}
layer {
name: "inception_4c/5x5"
type: "Convolution"
bottom: "inception_4c/5x5_reduce"
top: "inception_4c/5x5"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 2
decay_mult: 0
}
convolution_param {
num_output: 64
pad: 2
kernel_size: 5
weight_filler {
type: "xavier"
std: 0.03
}
bias_filler {
type: "constant"
value: 0.2
}
}
}
layer {
name: "inception_4c/relu_5x5"
type: "ReLU"
bottom: "inception_4c/5x5"
top: "inception_4c/5x5"
}
layer {
name: "inception_4c/pool"
type: "Pooling"
bottom: "inception_4b/output"
top: "inception_4c/pool"
pooling_param {
pool: MAX
kernel_size: 3
stride: 1
pad: 1
}
}
layer {
name: "inception_4c/pool_proj"
type: "Convolution"
bottom: "inception_4c/pool"
top: "inception_4c/pool_proj"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 2
decay_mult: 0
}
convolution_param {
num_output: 64
kernel_size: 1
weight_filler {
type: "xavier"
std: 0.1
}
bias_filler {
type: "constant"
value: 0.2
}
}
}
layer {
name: "inception_4c/relu_pool_proj"
type: "ReLU"
bottom: "inception_4c/pool_proj"
top: "inception_4c/pool_proj"
}
layer {
name: "inception_4c/output"
type: "Concat"
bottom: "inception_4c/1x1"
bottom: "inception_4c/3x3"
bottom: "inception_4c/5x5"
bottom: "inception_4c/pool_proj"
top: "inception_4c/output"
}

layer {
name: "inception_4d/1x1"
type: "Convolution"
bottom: "inception_4c/output"
top: "inception_4d/1x1"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 2
decay_mult: 0
}
convolution_param {
num_output: 112
kernel_size: 1
weight_filler {
type: "xavier"
std: 0.1
}
bias_filler {
type: "constant"
value: 0.2
}
}
}
layer {
name: "inception_4d/relu_1x1"
type: "ReLU"
bottom: "inception_4d/1x1"
top: "inception_4d/1x1"
}
layer {
name: "inception_4d/3x3_reduce"
type: "Convolution"
bottom: "inception_4c/output"
top: "inception_4d/3x3_reduce"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 2
decay_mult: 0
}
convolution_param {
num_output: 144
kernel_size: 1
weight_filler {
type: "xavier"
std: 0.1
}
bias_filler {
type: "constant"
value: 0.2
}
}
}
layer {
name: "inception_4d/relu_3x3_reduce"
type: "ReLU"
bottom: "inception_4d/3x3_reduce"
top: "inception_4d/3x3_reduce"
}
layer {
name: "inception_4d/3x3"
type: "Convolution"
bottom: "inception_4d/3x3_reduce"
top: "inception_4d/3x3"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 2
decay_mult: 0
}
convolution_param {
num_output: 288
pad: 1
kernel_size: 3
weight_filler {
type: "xavier"
std: 0.1
}
bias_filler {
type: "constant"
value: 0.2
}
}
}
layer {
name: "inception_4d/relu_3x3"
type: "ReLU"
bottom: "inception_4d/3x3"
top: "inception_4d/3x3"
}
layer {
name: "inception_4d/5x5_reduce"
type: "Convolution"
bottom: "inception_4c/output"
top: "inception_4d/5x5_reduce"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 2
decay_mult: 0
}
convolution_param {
num_output: 32
kernel_size: 1
weight_filler {
type: "xavier"
std: 0.1
}
bias_filler {
type: "constant"
value: 0.2
}
}
}
layer {
name: "inception_4d/relu_5x5_reduce"
type: "ReLU"
bottom: "inception_4d/5x5_reduce"
top: "inception_4d/5x5_reduce"
}
layer {
name: "inception_4d/5x5"
type: "Convolution"
bottom: "inception_4d/5x5_reduce"
top: "inception_4d/5x5"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 2
decay_mult: 0
}
convolution_param {
num_output: 64
pad: 2
kernel_size: 5
weight_filler {
type: "xavier"
std: 0.1
}
bias_filler {
type: "constant"
value: 0.2
}
}
}
layer {
name: "inception_4d/relu_5x5"
type: "ReLU"
bottom: "inception_4d/5x5"
top: "inception_4d/5x5"
}
layer {
name: "inception_4d/pool"
type: "Pooling"
bottom: "inception_4c/output"
top: "inception_4d/pool"
pooling_param {
pool: MAX
kernel_size: 3
stride: 1
pad: 1
}
}
layer {
name: "inception_4d/pool_proj"
type: "Convolution"
bottom: "inception_4d/pool"
top: "inception_4d/pool_proj"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 2
decay_mult: 0
}
convolution_param {
num_output: 64
kernel_size: 1
weight_filler {
type: "xavier"
std: 0.1
}
bias_filler {
type: "constant"
value: 0.2
}
}
}
layer {
name: "inception_4d/relu_pool_proj"
type: "ReLU"
bottom: "inception_4d/pool_proj"
top: "inception_4d/pool_proj"
}
layer {
name: "inception_4d/output"
type: "Concat"
bottom: "inception_4d/1x1"
bottom: "inception_4d/3x3"
bottom: "inception_4d/5x5"
bottom: "inception_4d/pool_proj"
top: "inception_4d/output"
}

layer {
name: "inception_4e/1x1"
type: "Convolution"
bottom: "inception_4d/output"
top: "inception_4e/1x1"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 2
decay_mult: 0
}
convolution_param {
num_output: 256
kernel_size: 1
weight_filler {
type: "xavier"
std: 0.03
}
bias_filler {
type: "constant"
value: 0.2
}
}
}
layer {
name: "inception_4e/relu_1x1"
type: "ReLU"
bottom: "inception_4e/1x1"
top: "inception_4e/1x1"
}
layer {
name: "inception_4e/3x3_reduce"
type: "Convolution"
bottom: "inception_4d/output"
top: "inception_4e/3x3_reduce"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 2
decay_mult: 0
}
convolution_param {
num_output: 160
kernel_size: 1
weight_filler {
type: "xavier"
std: 0.09
}
bias_filler {
type: "constant"
value: 0.2
}
}
}
layer {
name: "inception_4e/relu_3x3_reduce"
type: "ReLU"
bottom: "inception_4e/3x3_reduce"
top: "inception_4e/3x3_reduce"
}
layer {
name: "inception_4e/3x3"
type: "Convolution"
bottom: "inception_4e/3x3_reduce"
top: "inception_4e/3x3"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 2
decay_mult: 0
}
convolution_param {
num_output: 320
pad: 1
kernel_size: 3
weight_filler {
type: "xavier"
std: 0.03
}
bias_filler {
type: "constant"
value: 0.2
}
}
}
layer {
name: "inception_4e/relu_3x3"
type: "ReLU"
bottom: "inception_4e/3x3"
top: "inception_4e/3x3"
}
layer {
name: "inception_4e/5x5_reduce"
type: "Convolution"
bottom: "inception_4d/output"
top: "inception_4e/5x5_reduce"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 2
decay_mult: 0
}
convolution_param {
num_output: 32
kernel_size: 1
weight_filler {
type: "xavier"
std: 0.2
}
bias_filler {
type: "constant"
value: 0.2
}
}
}
layer {
name: "inception_4e/relu_5x5_reduce"
type: "ReLU"
bottom: "inception_4e/5x5_reduce"
top: "inception_4e/5x5_reduce"
}
layer {
name: "inception_4e/5x5"
type: "Convolution"
bottom: "inception_4e/5x5_reduce"
top: "inception_4e/5x5"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 2
decay_mult: 0
}
convolution_param {
num_output: 128
pad: 2
kernel_size: 5
weight_filler {
type: "xavier"
std: 0.03
}
bias_filler {
type: "constant"
value: 0.2
}
}
}
layer {
name: "inception_4e/relu_5x5"
type: "ReLU"
bottom: "inception_4e/5x5"
top: "inception_4e/5x5"
}
layer {
name: "inception_4e/pool"
type: "Pooling"
bottom: "inception_4d/output"
top: "inception_4e/pool"
pooling_param {
pool: MAX
kernel_size: 3
stride: 1
pad: 1
}
}
layer {
name: "inception_4e/pool_proj"
type: "Convolution"
bottom: "inception_4e/pool"
top: "inception_4e/pool_proj"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 2
decay_mult: 0
}
convolution_param {
num_output: 128
kernel_size: 1
weight_filler {
type: "xavier"
std: 0.1
}
bias_filler {
type: "constant"
value: 0.2
}
}
}
layer {
name: "inception_4e/relu_pool_proj"
type: "ReLU"
bottom: "inception_4e/pool_proj"
top: "inception_4e/pool_proj"
}
layer {
name: "inception_4e/output"
type: "Concat"
bottom: "inception_4e/1x1"
bottom: "inception_4e/3x3"
bottom: "inception_4e/5x5"
bottom: "inception_4e/pool_proj"
top: "inception_4e/output"
}

layer {
name: "inception_5a/1x1"
type: "Convolution"
bottom: "inception_4e/output"
top: "inception_5a/1x1"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 2
decay_mult: 0
}
convolution_param {
num_output: 256
kernel_size: 1
weight_filler {
type: "xavier"
std: 0.03
}
bias_filler {
type: "constant"
value: 0.2
}
}
}
layer {
name: "inception_5a/relu_1x1"
type: "ReLU"
bottom: "inception_5a/1x1"
top: "inception_5a/1x1"
}

layer {
name: "inception_5a/3x3_reduce"
type: "Convolution"
bottom: "inception_4e/output"
top: "inception_5a/3x3_reduce"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 2
decay_mult: 0
}
convolution_param {
num_output: 160
kernel_size: 1
weight_filler {
type: "xavier"
std: 0.09
}
bias_filler {
type: "constant"
value: 0.2
}
}
}
layer {
name: "inception_5a/relu_3x3_reduce"
type: "ReLU"
bottom: "inception_5a/3x3_reduce"
top: "inception_5a/3x3_reduce"
}

layer {
name: "inception_5a/3x3"
type: "Convolution"
bottom: "inception_5a/3x3_reduce"
top: "inception_5a/3x3"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 2
decay_mult: 0
}
convolution_param {
num_output: 320
pad: 1
kernel_size: 3
weight_filler {
type: "xavier"
std: 0.03
}
bias_filler {
type: "constant"
value: 0.2
}
}
}
layer {
name: "inception_5a/relu_3x3"
type: "ReLU"
bottom: "inception_5a/3x3"
top: "inception_5a/3x3"
}
layer {
name: "inception_5a/5x5_reduce"
type: "Convolution"
bottom: "inception_4e/output"
top: "inception_5a/5x5_reduce"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 2
decay_mult: 0
}
convolution_param {
num_output: 32
kernel_size: 1
weight_filler {
type: "xavier"
std: 0.2
}
bias_filler {
type: "constant"
value: 0.2
}
}
}
layer {
name: "inception_5a/relu_5x5_reduce"
type: "ReLU"
bottom: "inception_5a/5x5_reduce"
top: "inception_5a/5x5_reduce"
}
layer {
name: "inception_5a/5x5"
type: "Convolution"
bottom: "inception_5a/5x5_reduce"
top: "inception_5a/5x5"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 2
decay_mult: 0
}
convolution_param {
num_output: 128
pad: 2
kernel_size: 5
weight_filler {
type: "xavier"
std: 0.03
}
bias_filler {
type: "constant"
value: 0.2
}
}
}
layer {
name: "inception_5a/relu_5x5"
type: "ReLU"
bottom: "inception_5a/5x5"
top: "inception_5a/5x5"
}
layer {
name: "inception_5a/pool"
type: "Pooling"
bottom: "inception_4e/output"
top: "inception_5a/pool"
pooling_param {
pool: MAX
kernel_size: 3
stride: 1
pad: 1
}
}
layer {
name: "inception_5a/pool_proj"
type: "Convolution"
bottom: "inception_5a/pool"
top: "inception_5a/pool_proj"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 2
decay_mult: 0
}
convolution_param {
num_output: 128
kernel_size: 1
weight_filler {
type: "xavier"
std: 0.1
}
bias_filler {
type: "constant"
value: 0.2
}
}
}
layer {
name: "inception_5a/relu_pool_proj"
type: "ReLU"
bottom: "inception_5a/pool_proj"
top: "inception_5a/pool_proj"
}
layer {
name: "inception_5a/output"
type: "Concat"
bottom: "inception_5a/1x1"
bottom: "inception_5a/3x3"
bottom: "inception_5a/5x5"
bottom: "inception_5a/pool_proj"
top: "inception_5a/output"
}

layer {
name: "inception_5b/1x1"
type: "Convolution"
bottom: "inception_5a/output"
top: "inception_5b/1x1"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 2
decay_mult: 0
}
convolution_param {
num_output: 384
kernel_size: 1
weight_filler {
type: "xavier"
std: 0.1
}
bias_filler {
type: "constant"
value: 0.2
}
}
}
layer {
name: "inception_5b/relu_1x1"
type: "ReLU"
bottom: "inception_5b/1x1"
top: "inception_5b/1x1"
}
layer {
name: "inception_5b/3x3_reduce"
type: "Convolution"
bottom: "inception_5a/output"
top: "inception_5b/3x3_reduce"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 1
decay_mult: 0
}
convolution_param {
num_output: 192
kernel_size: 1
weight_filler {
type: "xavier"
std: 0.1
}
bias_filler {
type: "constant"
value: 0.2
}
}
}
layer {
name: "inception_5b/relu_3x3_reduce"
type: "ReLU"
bottom: "inception_5b/3x3_reduce"
top: "inception_5b/3x3_reduce"
}
layer {
name: "inception_5b/3x3"
type: "Convolution"
bottom: "inception_5b/3x3_reduce"
top: "inception_5b/3x3"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 2
decay_mult: 0
}
convolution_param {
num_output: 384
pad: 1
kernel_size: 3
weight_filler {
type: "xavier"
std: 0.1
}
bias_filler {
type: "constant"
value: 0.2
}
}
}
layer {
name: "inception_5b/relu_3x3"
type: "ReLU"
bottom: "inception_5b/3x3"
top: "inception_5b/3x3"
}
layer {
name: "inception_5b/5x5_reduce"
type: "Convolution"
bottom: "inception_5a/output"
top: "inception_5b/5x5_reduce"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 2
decay_mult: 0
}
convolution_param {
num_output: 48
kernel_size: 1
weight_filler {
type: "xavier"
std: 0.1
}
bias_filler {
type: "constant"
value: 0.2
}
}
}
layer {
name: "inception_5b/relu_5x5_reduce"
type: "ReLU"
bottom: "inception_5b/5x5_reduce"
top: "inception_5b/5x5_reduce"
}
layer {
name: "inception_5b/5x5"
type: "Convolution"
bottom: "inception_5b/5x5_reduce"
top: "inception_5b/5x5"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 2
decay_mult: 0
}
convolution_param {
num_output: 128
pad: 2
kernel_size: 5
weight_filler {
type: "xavier"
std: 0.1
}
bias_filler {
type: "constant"
value: 0.2
}
}
}
layer {
name: "inception_5b/relu_5x5"
type: "ReLU"
bottom: "inception_5b/5x5"
top: "inception_5b/5x5"
}
layer {
name: "inception_5b/pool"
type: "Pooling"
bottom: "inception_5a/output"
top: "inception_5b/pool"
pooling_param {
pool: MAX
kernel_size: 3
stride: 1
pad: 1
}
}
layer {
name: "inception_5b/pool_proj"
type: "Convolution"
bottom: "inception_5b/pool"
top: "inception_5b/pool_proj"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 2
decay_mult: 0
}
convolution_param {
num_output: 128
kernel_size: 1
weight_filler {
type: "xavier"
std: 0.1
}
bias_filler {
type: "constant"
value: 0.2
}
}
}
layer {
name: "inception_5b/relu_pool_proj"
type: "ReLU"
bottom: "inception_5b/pool_proj"
top: "inception_5b/pool_proj"
}
layer {
name: "inception_5b/output"
type: "Concat"
bottom: "inception_5b/1x1"
bottom: "inception_5b/3x3"
bottom: "inception_5b/5x5"
bottom: "inception_5b/pool_proj"
top: "inception_5b/output"
}
layer {
name: "pool5/drop_s1"
type: "Dropout"
bottom: "inception_5b/output"
top: "pool5/drop_s1"
dropout_param {
dropout_ratio: 0.4
}
}
layer {
name: "cvg/classifier"
type: "Convolution"
bottom: "pool5/drop_s1"
top: "cvg/classifier"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 2
decay_mult: 0
}
convolution_param {
num_output: 2
kernel_size: 1
weight_filler {
type: "xavier"
std: 0.03
}
bias_filler {
type: "constant"
value: 0.
}
}
}
layer {
name: "coverage/sig"
type: "Sigmoid"
bottom: "cvg/classifier"
top: "coverage"
}
layer {
name: "bbox/regressor"
type: "Convolution"
bottom: "pool5/drop_s1"
top: "bboxes"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 2
decay_mult: 0
}
convolution_param {
num_output: 4
kernel_size: 1
weight_filler {
type: "xavier"
std: 0.03
}
bias_filler {
type: "constant"
value: 0.
}
}
}

######################################################################

# End of convolutional network

######################################################################

# Convert bboxes

layer {
name: "bbox_mask"
type: "Eltwise"
bottom: "bboxes"
bottom: "coverage-block"
top: "bboxes-masked"
eltwise_param {
operation: PROD
}
include { phase: TRAIN }
include { phase: TEST stage: "val" }
}
layer {
name: "bbox-norm"
type: "Eltwise"
bottom: "bboxes-masked"
bottom: "size-block"
top: "bboxes-masked-norm"
eltwise_param {
operation: PROD
}
include { phase: TRAIN }
include { phase: TEST stage: "val" }
}
layer {
name: "bbox-obj-norm"
type: "Eltwise"
bottom: "bboxes-masked-norm"
bottom: "obj-block"
top: "bboxes-obj-masked-norm"
eltwise_param {
operation: PROD
}
include { phase: TRAIN }
include { phase: TEST stage: "val" }
}

# Loss layers

layer {
name: "bbox_loss"
type: "L1Loss"
bottom: "bboxes-obj-masked-norm"
bottom: "bbox-obj-label-norm"
top: "loss_bbox"
loss_weight: 2
include { phase: TRAIN }
include { phase: TEST stage: "val" }
}
layer {
name: "coverage_loss"
type: "EuclideanLoss"
bottom: "coverage"
bottom: "coverage-label"
top: "loss_coverage"
include { phase: TRAIN }
include { phase: TEST stage: "val" }
}

# Cluster bboxes

layer {
type: 'Python'
name: 'cluster'
bottom: 'coverage'
bottom: 'bboxes'
top: 'bbox-list-class0'
top: 'bbox-list-class1'
python_param {
module: 'caffe.layers.detectnet.clustering'
layer: 'ClusterDetections'
param_str : '1248, 720, 16, 0.6, 3, 0.02, 22, 2'
}
include: { phase: TEST }
}

# Calculate mean average precision

layer {
type: 'Python'
name: 'cluster_gt'
bottom: 'coverage-label'
bottom: 'bbox-label'
top: 'bbox-list-label-class0'
top: 'bbox-list-label-class1'
python_param {
module: 'caffe.layers.detectnet.clustering'
layer: 'ClusterGroundtruth'
param_str : '1248, 720, 16, 2'
}
include: { phase: TEST stage: "val" }
}
layer {
type: 'Python'
name: 'score-class0'
bottom: 'bbox-list-label-class0'
bottom: 'bbox-list-class0'
top: 'bbox-list-scored-class0'
python_param {
module: 'caffe.layers.detectnet.mean_ap'
layer: 'ScoreDetections'
}
include: { phase: TEST stage: "val" }
}
layer {
type: 'Python'
name: 'mAP-class0'
bottom: 'bbox-list-scored-class0'
top: 'mAP-class0'
top: 'precision-class0'
top: 'recall-class0'
python_param {
module: 'caffe.layers.detectnet.mean_ap'
layer: 'mAP'
param_str : '1248, 720, 16'
}
include: { phase: TEST stage: "val" }
}

layer {
type: 'Python'
name: 'score-class1'
bottom: 'bbox-list-label-class1'
bottom: 'bbox-list-class1'
top: 'bbox-list-scored-class1'
python_param {
module: 'caffe.layers.detectnet.mean_ap'
layer: 'ScoreDetections'
}
include: { phase: TEST stage: "val" }
}
layer {
type: 'Python'
name: 'mAP-class1'
bottom: 'bbox-list-scored-class1'
top: 'mAP-class1'
top: 'precision-class1'
top: 'recall-class1'
python_param {
module: 'caffe.layers.detectnet.mean_ap'
layer: 'mAP'
param_str : '1248, 720, 16'
}
include: { phase: TEST stage: "val" }
}
layer {
name: "accuracy"
type: "Python"
bottom: "bbox-list-class0" # These blob names may differ in your network
bottom: "bbox-list-class1"
top: "accuracy"
python_param {
module: "digits_python_layers" # File name
layer: "AccuracyLayer" # Class name
param_str: '{"top_k": 1}'
}
include { stage: "val" }
}

Any assistance is greatly appreciated!
