Training for a multi label loss using lmdb #2407

Closed
sukritshankar opened this issue May 3, 2015 · 32 comments
@sukritshankar

I wish to train AlexNet with a cross-entropy loss, where every input has multiple label probabilities. Until now I have been doing this with an HDF5 data layer, but that requires all sorts of manual pre-processing, and reading from it is roughly twice as slow as reading from LMDB.

In view of the above, I wish to know how one can train for a multi-label loss using an LMDB layer in Caffe. More precisely, what should go in train.txt so that one can specify multiple labels for a given image?

@rohrbach

rohrbach commented May 4, 2015

Please ask usage questions on caffe-users -- this issue tracker is primarily for Caffe development discussion. Thanks!

@rohrbach rohrbach closed this as completed May 4, 2015
@gavinmh

gavinmh commented Sep 14, 2015

I'm sorry to dredge up a closed thread; @sukritshankar, can you share your network definition? In particular, what accuracy layer are you using?

@sukritshankar
Author

Hi @gavinmh.
In recent months, we have been able to train Caffe with a multi-label loss using LMDB. We created two separate LMDBs, one for the data and one for the labels, both as 4-D blobs. We did not shuffle while creating the LMDBs, since that would break the correspondence between labels and data; instead, we shuffled the files ourselves beforehand to avoid erroneous gradients.
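
A minimal sketch of that pre-shuffle step (illustrative file names, not our exact script; it assumes an image list and an N x K label matrix with matching row order):

import numpy as np

# Hypothetical inputs: one image path per line, and an N x K matrix of
# label probabilities whose row i corresponds to image i.
with open('train.txt') as f:
    image_paths = [line.strip() for line in f]
labels = np.load('labels.npy')
assert len(image_paths) == labels.shape[0]

# Apply one shared permutation to both BEFORE writing either LMDB, so that
# key i in the data LMDB still matches key i in the label LMDB.
perm = np.random.permutation(len(image_paths))
image_paths = [image_paths[i] for i in perm]
labels = labels[perm]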

We used this setup for a slightly different purpose, where we had several softmax layers whose outputs fed into a multi-label Euclidean loss; but in all cases, we used the standard accuracy layer (http://caffe.berkeleyvision.org/doxygen/classcaffe_1_1AccuracyLayer.html).

I am pasting the network definition below; you can ignore the slicing and concatenation parts, since we were experimenting for a different purpose. We have tested this prototxt, and it takes in multiple labels correctly for each given image.

name: "AlexNet -- msr_face_LM_0_1_2_3"
layer {
name: "data"
type: "Data"
top: "data"
include {
phase: TRAIN
}
transform_param {
mirror: true
crop_size: 148
mean_file: "/scratchvol/t-susank/caffe/data/basic_tl/mean.binaryproto"
}
data_param {
source: "/scratchvol/t-susank/caffe/examples/basic_tl/train_dataFLM_lmdb"
batch_size: 32
backend: LMDB
}
}
layer {
name: "labels"
type: "Data"
top: "labels"
include {
phase: TRAIN
}
data_param {
source: "/scratchvol/t-susank/caffe/examples/basic_tl/msr_face_LM_0_1_2_3/train_label_lmdb"
batch_size: 32
backend: LMDB
}
}
layer {
name: "data"
type: "Data"
top: "data"
include {
phase: TEST
}
transform_param {
mirror: false
crop_size: 148
mean_file: "/scratchvol/t-susank/caffe/data/basic_tl/mean.binaryproto"
}
data_param {
source: "/scratchvol/t-susank/caffe/examples/basic_tl/test_dataFLM_lmdb"
batch_size: 10
backend: LMDB
}
}
layer {
name: "labels"
type: "Data"
top: "labels”
include {
phase: TEST
}
data_param {
source: "/scratchvol/t-susank/caffe/examples/basic_tl/msr_face_LM_0_1_2_3/test_label_lmdb"
batch_size: 10
backend: LMDB
}
}
layer {
name: "sliceL"
type: "Slice"
bottom: "labels"
top: "labels0"
top: "labels1"
top: "labels2"
top: "labels3"
include {
phase: TEST
}
slice_param {
axis: 1
slice_point: 2
slice_point: 6
slice_point: 11
}
}
layer {
name: "conv1"
type: "Convolution"
bottom: "data"
top: "conv1"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 2
decay_mult: 0
}
convolution_param {
num_output: 96
kernel_size: 11
stride: 4
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "relu1"
type: "ReLU"
bottom: "conv1"
top: "conv1"
}
layer {
name: "norm1"
type: "LRN"
bottom: "conv1"
top: "norm1"
lrn_param {
local_size: 5
alpha: 0.0001
beta: 0.75
}
}
layer {
name: "pool1"
type: "Pooling"
bottom: "norm1"
top: "pool1"
pooling_param {
pool: MAX
kernel_size: 3
stride: 2
}
}
layer {
name: "conv2"
type: "Convolution"
bottom: "pool1"
top: "conv2"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 2
decay_mult: 0
}
convolution_param {
num_output: 256
pad: 2
kernel_size: 5
group: 2
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0.1
}
}
}
layer {
name: "relu2"
type: "ReLU"
bottom: "conv2"
top: "conv2"
}
layer {
name: "norm2"
type: "LRN"
bottom: "conv2"
top: "norm2"
lrn_param {
local_size: 5
alpha: 0.0001
beta: 0.75
}
}
layer {
name: "pool2"
type: "Pooling"
bottom: "norm2"
top: "pool2"
pooling_param {
pool: MAX
kernel_size: 3
stride: 2
}
}
layer {
name: "conv3"
type: "Convolution"
bottom: "pool2"
top: "conv3"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 2
decay_mult: 0
}
convolution_param {
num_output: 384
pad: 1
kernel_size: 3
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "relu3"
type: "ReLU"
bottom: "conv3"
top: "conv3"
}
layer {
name: "conv4"
type: "Convolution"
bottom: "conv3"
top: "conv4"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 2
decay_mult: 0
}
convolution_param {
num_output: 384
pad: 1
kernel_size: 3
group: 2
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0.1
}
}
}
layer {
name: "relu4"
type: "ReLU"
bottom: "conv4"
top: "conv4"
}
layer {
name: "conv5"
type: "Convolution"
bottom: "conv4"
top: "conv5"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 2
decay_mult: 0
}
convolution_param {
num_output: 256
pad: 1
kernel_size: 3
group: 2
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0.1
}
}
}
layer {
name: "relu5"
type: "ReLU"
bottom: "conv5"
top: "conv5"
}
layer {
name: "pool5"
type: "Pooling"
bottom: "conv5"
top: "pool5"
pooling_param {
pool: MAX
kernel_size: 3
stride: 2
}
}
layer {
name: "fc6"
type: "InnerProduct"
bottom: "pool5"
top: "fc6"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 2
decay_mult: 0
}
inner_product_param {
num_output: 4096
weight_filler {
type: "gaussian"
std: 0.005
}
bias_filler {
type: "constant"
value: 0.1
}
}
}
layer {
name: "relu6"
type: "ReLU"
bottom: "fc6"
top: "fc6"
}
layer {
name: "drop6"
type: "Dropout"
bottom: "fc6"
top: "fc6"
dropout_param {
dropout_ratio: 0.5
}
}
layer {
name: "fc7"
type: "InnerProduct"
bottom: "fc6"
top: "fc7"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 2
decay_mult: 0
}
inner_product_param {
num_output: 4096
weight_filler {
type: "gaussian"
std: 0.005
}
bias_filler {
type: "constant"
value: 0.1
}
}
}
layer {
name: "relu7"
type: "ReLU"
bottom: "fc7"
top: "fc7"
}
layer {
name: "drop7"
type: "Dropout"
bottom: "fc7"
top: "fc7"
dropout_param {
dropout_ratio: 0.5
}
}
layer {
name: "fc8_0"
type: "InnerProduct"
bottom: "fc7"
top: "fc8_0"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 2
decay_mult: 0
}
inner_product_param {
num_output: 2
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "softmax_0"
type: "Softmax"
bottom: "fc8_0"
top: "softmax_0"
}
layer {
name: "fc8_1"
type: "InnerProduct"
bottom: "fc7"
top: "fc8_1"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 2
decay_mult: 0
}
inner_product_param {
num_output: 4
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "softmax_1"
type: "Softmax"
bottom: "fc8_1"
top: "softmax_1"
}
layer {
name: "fc8_2"
type: "InnerProduct"
bottom: "fc7"
top: "fc8_2"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 2
decay_mult: 0
}
inner_product_param {
num_output: 5
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "softmax_2"
type: "Softmax"
bottom: "fc8_2"
top: "softmax_2"
}
layer {
name: "fc8_3"
type: "InnerProduct"
bottom: "fc7"
top: "fc8_3"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 2
decay_mult: 0
}
inner_product_param {
num_output: 5
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "softmax_3"
type: "Softmax"
bottom: "fc8_3"
top: "softmax_3"
}
layer {
name: "concat"
type: "Concat"
bottom: "softmax_0"
bottom: "softmax_1"
bottom: "softmax_2"
bottom: "softmax_3"
top: "concat"
concat_param {
axis: 1
}
}
layer {
name: "loss"
type: "EuclideanLoss"
bottom: "concat"
bottom: "labels"
top: "loss"
}
layer {
name: "argmax0"
type: "ArgMax"
bottom: "labels0"
top: "agrmax_0"
include {
phase: TEST
}
}
layer {
name: "accuracy_0"
type: "Accuracy"
bottom: "softmax_0"
bottom: "argmax0"
top: "accuracy"
include {
phase: TEST
}
}
layer {
name: "argmax1"
type: "ArgMax"
bottom: "labels1"
top: "agrmax_1"
include {
phase: TEST
}
}
layer {
name: "accuracy_1"
type: "Accuracy"
bottom: "softmax_1"
bottom: "argmax1"
top: "accuracy"
include {
phase: TEST
}
}
layer {
name: "argmax2"
type: "ArgMax"
bottom: "labels2"
top: "agrmax_2"
include {
phase: TEST
}
}
layer {
name: "accuracy_2"
type: "Accuracy"
bottom: "softmax_2"
bottom: "argmax2"
top: "accuracy"
include {
phase: TEST
}
}
layer {
name: "argmax3"
type: "ArgMax"
bottom: "labels3"
top: "agrmax_3"
include {
phase: TEST
}
}
layer {
name: "accuracy_3"
type: "Accuracy"
bottom: "softmax_3"
bottom: "argmax3"
top: "accuracy"
include {
phase: TEST
}
}

@baoqingping

@sukritshankar
It really works well. Thanks a lot!

@Elpidam

Elpidam commented Dec 15, 2015

Hello! How did you make the LMDB files? Did you use a script from the Caffe master branch, or did you write your own to import just the labels into LMDB files?

@ghost

ghost commented Dec 16, 2015

Hi Elpidam

You can use the script on this page to convert label vectors to LMDB:
https://groups.google.com/forum/#!searchin/caffe-users/multilabel/caffe-users/RuT1TgwiRCo/hoUkZOeEDgAJ

and use caffe-root/tools/convert_imageset to convert the images to LMDB format.
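
After building the two databases it is worth checking that they stay aligned; here is a small sketch (assuming both LMDBs were written with the same zero-padded keys; paths are illustrative):

import lmdb
import caffe

def count_and_peek(db_path):
    # Count the entries and decode the first Datum to inspect its shape.
    env = lmdb.open(db_path, readonly=True)
    with env.begin() as txn:
        n = txn.stat()['entries']
        key, value = next(txn.cursor().iternext())
        datum = caffe.proto.caffe_pb2.Datum()
        datum.ParseFromString(value)
    env.close()
    return n, (datum.channels, datum.height, datum.width)

n_data, data_shape = count_and_peek('train_data_lmdb')
n_label, label_shape = count_and_peek('train_label_lmdb')
assert n_data == n_label, 'data and label LMDBs must hold the same number of entries'
print(data_shape, label_shape)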

@jerpint

jerpint commented May 26, 2016

Hello, this might be helpful to some. It is code I slightly modified from a Google Groups post to create separate training and testing LMDB files using continuous-valued labels (in my case a 6-D vector).

The Y_train variable should be loaded first as an Mx6 ndarray (the 6 is arbitrary, of course; just change the reshape line in the code), and trainingset.txt and testingset.txt should contain the names of the images to be loaded.

# code from a Google Groups post, source: https://groups.google.com/forum/#!topic/caffe-users/-vWuaM3bnro
# convert data to lmdb for test and training
# load Y_train and Y_test first!


import sys
import lmdb
import re, fileinput, math

import numpy as np
import caffe  # pycaffe must be on the Python path; used below for array_to_datum / load_image

data = 'images_for_caffe/train/trainingset.txt'
lmdb_data_name = 'train_data_lmdb_JP'
lmdb_label_name = 'train_score_lmdb_JP'

path_image = 'images_for_caffe/train/'

Inputs = []
Labels = Y_train.tolist()


for line in fileinput.input(data):
    entries = re.split(' ', line.strip())
    Inputs.append(entries[0])


print('Writing labels for training')

# Size of buffer: 1000 elements to reduce memory consumption
for idx in range(int(math.ceil(len(Labels)/1000.0))):
    in_db_label = lmdb.open(lmdb_label_name, map_size=int(1e12))
    with in_db_label.begin(write=True) as in_txn:
        for label_idx, label_ in enumerate(Labels[(1000*idx):(1000*(idx+1))]):
            im_dat = caffe.io.array_to_datum(np.array(label_).astype(float).reshape(1,1,6))
            in_txn.put('{:0>10d}'.format(1000*idx + label_idx), im_dat.SerializeToString())

            string_ = str(1000*idx+label_idx+1) + ' / ' + str(len(Labels))
            sys.stdout.write("\r%s" % string_)
            sys.stdout.flush()
    in_db_label.close()
print('')

print('Writing image data for training')

for idx in range(int(math.ceil(len(Inputs)/1000.0))):
    in_db_data = lmdb.open(lmdb_data_name, map_size=int(1e12))
    with in_db_data.begin(write=True) as in_txn:
        for in_idx, in_ in enumerate(Inputs[(1000*idx):(1000*(idx+1))]):
            im = caffe.io.load_image(path_image + in_)
            im_dat = caffe.io.array_to_datum(im.astype(float).transpose((2, 0, 1)))
            in_txn.put('{:0>10d}'.format(1000*idx + in_idx), im_dat.SerializeToString())

            string_ = str(1000*idx+in_idx+1) + ' / ' + str(len(Inputs))
            sys.stdout.write("\r%s" % string_)
            sys.stdout.flush()
    in_db_data.close()
print('')


#testing data

data = 'images_for_caffe/test/testingset.txt'
lmdb_data_name = 'test_data_lmdb_JP'
lmdb_label_name = 'test_score_lmdb_JP'

path_image = 'images_for_caffe/test/'

Inputs = []
Labels = Y_test.tolist()


for line in fileinput.input(data):
    entries = re.split(' ', line.strip())
    Inputs.append(entries[0])


print('Writing labels for testing')

# Size of buffer: 1000 elements to reduce memory consumption
for idx in range(int(math.ceil(len(Labels)/1000.0))):
    in_db_label = lmdb.open(lmdb_label_name, map_size=int(1e12))
    with in_db_label.begin(write=True) as in_txn:
        for label_idx, label_ in enumerate(Labels[(1000*idx):(1000*(idx+1))]):
            im_dat = caffe.io.array_to_datum(np.array(label_).astype(float).reshape(1,1,6))
            in_txn.put('{:0>10d}'.format(1000*idx + label_idx), im_dat.SerializeToString())

            string_ = str(1000*idx+label_idx+1) + ' / ' + str(len(Labels))
            sys.stdout.write("\r%s" % string_)
            sys.stdout.flush()
    in_db_label.close()
print('')

print('Writing image data for testing')

for idx in range(int(math.ceil(len(Inputs)/1000.0))):
    in_db_data = lmdb.open(lmdb_data_name, map_size=int(1e12))
    with in_db_data.begin(write=True) as in_txn:
        for in_idx, in_ in enumerate(Inputs[(1000*idx):(1000*(idx+1))]):
            im = caffe.io.load_image(path_image + in_)
            im_dat = caffe.io.array_to_datum(im.astype(float).transpose((2, 0, 1)))
            in_txn.put('{:0>10d}'.format(1000*idx + in_idx), im_dat.SerializeToString())

            string_ = str(1000*idx+in_idx+1) + ' / ' + str(len(Inputs))
            sys.stdout.write("\r%s" % string_)
            sys.stdout.flush()
    in_db_data.close()
print('')

@jerpint

jerpint commented May 26, 2016

@sukritshankar, could you please share the code you used to generate the prototxt you provided?

@sukritshankar
Author

Sorry for the delayed reply! I was entangled in some critical personal matters. Please see my recent GitHub check-in at https://github.com/sukritshankar/CaffeLMDBCreationMultiLabel. That should solve the issue!

@jerpint

jerpint commented Jun 12, 2016

@sukritshankar thank you very much

@mvasil

mvasil commented Aug 15, 2016

@sukritshankar I looked through your code and noticed that it requires that we "make sure labels are integer values in [0, 255]". I have continuous labels in the range [0, 1]; what would I need to modify? Thank you!

@sukritshankar
Author

Well, normally you will have continuous labels in [0,1]; I had the same case. I simply converted them to the [0,255] range via floor(x*255), where x is the value in [0,1]. This should suffice. The labels are then stored in the LMDB as values in [0,255] and get rescaled back to [0,1] in the train prototxt, as specified in my code. Hope that helps!
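
A minimal sketch of that conversion (illustrative names and shapes, not my exact script):

import numpy as np

y = np.random.rand(1000, 21)                   # hypothetical N x K ground-truth probabilities in [0, 1]
y_uint8 = np.floor(y * 255).astype(np.uint8)   # stored in the label LMDB as integers in [0, 255]

# The label Data layer then rescales back to [0, 1] at training time, e.g.:
#   transform_param { scale: 0.00392156862745 }   # i.e. 1/255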


@jerpint

jerpint commented Aug 16, 2016

Hey, here is the code I used to go from regular JPEG images to separate LMDB files for labels and images. I plan on posting a blog post going through all the steps in more detail at some point in the near future. I was still testing a bunch of things in this version of the code and will update with a cleaner version eventually. I preprocessed my JPEG images to 227x227, with mirrored images as well (also preprocessed), and used roughly 50,000 images in total for regression with a 1x6 continuous label vector with values in [0,1]; it worked decently well.

import os
# set up Python environment: numpy for numerical routines, and matplotlib for plotting
import lmdb
# %matplotlib inline  (IPython notebook magic; uncomment when running inside a notebook)

import numpy as np
import matplotlib.pyplot as plt
# display plots in this notebook


import sys
caffe_root = '../'  # added ../, otherwise this file should be run from {caffe_root}/examples (otherwise change this line)
sys.path.insert(0, caffe_root + 'python')

import caffe
import skimage as sk
# If you get "No module named _caffe", either you have not built pycaffe or you have the wrong path.

def load_Y_labels( path_to_Y_label ):
    Y_label = np.loadtxt(path_to_Y_label, usecols = (1,2,3,4,5,6), unpack=True)
    Y_label = Y_label.transpose()
    Y_label = Y_label.astype('float32')
    Y_label = Y_label.tolist()
    return Y_label

def create_LMDB_files(data,lmdb_data_name,lmdb_label_name,path_image,Inputs,Labels):

    for line in fileinput.input(data):
        entries = re.split(' ', line.strip())
        Inputs.append(entries[0])


    print('Writing labels')


    in_db_label = lmdb.open(lmdb_label_name, map_size=int(1e12))
    max_key_label = in_db_label.stat()["entries"] # # of entries at beg of run
    in_db_label.close()
    # Size of buffer: 1000 elements to reduce memory consumption
    for idx in range(int(math.ceil(len(Labels)/1000.0))):
        in_db_label = lmdb.open(lmdb_label_name, map_size=int(1e12))
        with in_db_label.begin(write=True) as in_txn:
            for label_idx, label_ in enumerate(Labels[(1000*idx):(1000*(idx+1))]):
                im_dat = caffe.io.array_to_datum(np.array(label_).astype(float).reshape(1,1,6))
                in_txn.put('{:0>10d}'.format(1000*idx + label_idx + max_key_label), im_dat.SerializeToString())

                string_ = str(1000*idx+label_idx+1) + ' / ' + str(len(Labels))
                sys.stdout.write("\r%s" % string_)
                sys.stdout.flush()

        max_key_label_close = in_db_label.stat()["entries"]

        in_db_label.close()
    print('')
    print('file size at open :')
    print(max_key_label)
    print('')
    print('')
    print('file size at close :')
    print(max_key_label_close)
    print('')
    print('files appended :')
    print(max_key_label_close - max_key_label)
    print('')
    print('')
    print('')

    print('Writing image data')

    in_db_data = lmdb.open(lmdb_data_name, map_size=int(1e12))
    max_key_data = in_db_data.stat()["entries"]
    in_db_data.close()

    for idx in range(int(math.ceil(len(Inputs)/1000.0))):
        in_db_data = lmdb.open(lmdb_data_name, map_size=int(1e12))
        with in_db_data.begin(write=True) as in_txn:


            for in_idx, in_ in enumerate(Inputs[(1000*idx):(1000*(idx+1))]):
                im = caffe.io.load_image(path_image + in_)
                im_dat = caffe.io.array_to_datum(im.astype(float).transpose((2, 0, 1)))
                in_txn.put('{:0>10d}'.format(1000*idx + in_idx + max_key_data), im_dat.SerializeToString())

                string_ = str(1000*idx+in_idx+1) + ' / ' + str(len(Inputs))
                sys.stdout.write("\r%s" % string_)
                sys.stdout.flush()
        max_key_data_close = in_db_data.stat()["entries"]
        in_db_data.close()
    print('')
    print('')
    print('file size at open :')
    print(max_key_data)
    print('')
    print('file size at close :')
    print(max_key_data_close)
    print('')
    print('files appended :')
    print(max_key_data_close - max_key_data)
    print('')
    print('')
    print('')
    print('')


def create_LMDB_labels_only(data,lmdb_data_name,lmdb_label_name,path_image,Inputs,Labels):

    for line in fileinput.input(data):
        entries = re.split(' ', line.strip())
        Inputs.append(entries[0])


    print('Writing labels')

    # Size of buffer: 1000 elements to reduce memory consumption
    in_db_label = lmdb.open(lmdb_label_name, map_size=int(1e12))
    max_key_label = in_db_label.stat()["entries"]
    in_db_label.close()
    for idx in range(int(math.ceil(len(Labels)/1000.0))):
        in_db_label = lmdb.open(lmdb_label_name, map_size=int(1e12))
        with in_db_label.begin(write=True) as in_txn:
            for label_idx, label_ in enumerate(Labels[(1000*idx):(1000*(idx+1))]):
                im_dat = caffe.io.array_to_datum(np.array(label_).astype(float).reshape(1,1,6))
                in_txn.put('{:0>10d}'.format(1000*idx + label_idx + max_key_label), im_dat.SerializeToString())

                string_ = str(1000*idx+label_idx+1) + ' / ' + str(len(Labels))
                sys.stdout.write("\r%s" % string_)
                sys.stdout.flush()

        max_key_label_close = in_db_label.stat()["entries"]

        in_db_label.close()
    print('')
    print('file size at open :')
    print(max_key_label)
    print('')
    print('')
    print('file size at close :')
    print(max_key_label_close)
    print('')
    print('files appended :')
    print(max_key_label_close - max_key_label)
    print('')
    print('')
    print('')


ref_path = '/home/jerpint/Desktop/caffestuff/'

from pylab import *

#create / append LMDB files


import lmdb
import re, fileinput, math
ref_path = '/home/jerpint/Desktop/caffestuff/'
# textfile = ('trainingset_not_normalized','trainingset_no_zeros','trainingset_no_zeros_not_normalized',
#             'trainingset_no_zeros_not_normalized_penalty','trainingset_no_zeros_penalty','trainingset_only_zeros',
#             'trainingset_penalty')
# define all paths for training

textfile = ('trainingset')


#for ii in range(len(textfile)) :

ii = 1
path_image =  ref_path + 'images_caffe/train_with_mirror/'
    #data = ref_path + 'images_caffe/textfiles/' + textfile[ii] + '.txt' #path_image + 'trainingset.txt'
data = ref_path  + 'images_caffe/train_with_mirror/' + textfile + '.txt'
    #lmdb_label_name = ref_path + 'JP_Kitti/all_data/' +textfile[ii] + '_label'
    #lmdb_data_name = ref_path + 'JP_Kitti/all_data/' + textfile[ii] + '_data'

lmdb_label_name = ref_path + 'JP_Kitti/all_data/' +'train_with_mirror' + '_label'
lmdb_data_name = ref_path + 'JP_Kitti/all_data/' + 'train_with_mirror' + '_data'



Inputs = []


    #textfile_train = ref_path + 'images_caffe/textfiles/' + textfile[ii] + '.txt' #'/media/ubuntu/JERPINTSD/images_caffe/train_large/trainingset.txt'
textfile_train = ref_path  + 'images_caffe/train_with_mirror/' + textfile + '.txt'
Labels =load_Y_labels(textfile_train)

print('TRAINING')
    #print(textfile[ii])
print(textfile)






create_LMDB_files(data,lmdb_data_name,lmdb_label_name,path_image,Inputs,Labels)


# path_image = ref_path + 'images_caffe/test_short2/'
# data = path_image + 'testingset.txt'
# lmdb_label_name = ref_path + 'JP_Kitti/test_label_JP_big'
# lmdb_data_name = ref_path + 'JP_Kitti/test_data_JP_big'



# Inputs = []


# textfile_test =  data #'/media/ubuntu/JERPINTSD/images_caffe/test_short2/testingset.txt'
# Labels =load_Y_labels(textfile_test)



# print('TESTING')
# create_LMDB_files(data,lmdb_data_name,lmdb_label_name,path_image,Inputs,Labels)


The following is an example of what trainingset.txt contained, line by line (column 0 is the image file name; I preprocessed my images to 227x227 to use AlexNet; columns 1-6 are the normalized values I used for regression):

0_0_1.jpg 194.85399156 137.44848517   1.00000000   1.00000000   1.00000000   1.00000000
0_0_1m.jpg  33.14600844 137.44848517   1.00000000   1.00000000   1.00000000   1.00000000
0_0_2.jpg  11.30230862 137.44848517   1.00000000   1.00000000   1.00000000   1.00000000
0_0_2m.jpg 216.69769138 137.44848517   1.00000000   1.00000000   1.00000000   1.00000000

@mvasil

mvasil commented Aug 24, 2016

@sukritshankar and @jerpint -- Thank you for the useful LMDB conversion examples.
I have a further question regarding training the network with SigmoidCrossEntropyLossLayer: I have looked through the layer definition, and was wondering if the gradient backprop, as is, works correctly for the case of multiple labels? Did you have to modify the loss function or the gradient computation code in any way in your experiments? In particular, my concern comes from the form of the cross entropy loss function with multiple labels given in equation (63) here.
Thanks again!

@sukritshankar
Author

No, we did not have to modify the loss function or the gradient computation code!
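
For reference, here is the standard sigmoid cross-entropy algebra (a sketch, not lifted from the Caffe source) showing why per-label targets need no changes: the loss decomposes over labels, and the gradient with respect to each logit is elementwise.

\[
E = -\frac{1}{N}\sum_{n=1}^{N}\sum_{k=1}^{K}\Big[\,p_{nk}\log\sigma(x_{nk}) + (1-p_{nk})\log\big(1-\sigma(x_{nk})\big)\Big],
\qquad
\frac{\partial E}{\partial x_{nk}} = \frac{1}{N}\big(\sigma(x_{nk}) - p_{nk}\big)
\]

Here p_{nk} in [0,1] is the ground-truth probability for label k of instance n, so nothing in the backward pass assumes a single label per instance.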


@jerpint

jerpint commented Aug 24, 2016

I did not use cross-entropy, I used Euclidean loss (the L2 norm) only. I am currently away from my work computer, so I cannot provide my prototxt files just yet :/

@mvasil

mvasil commented Aug 24, 2016

Thank you, @jerpint, that would be very helpful when you get a chance!

I started fine-tuning a pre-trained VGG model using the above multi-label training schema, and tried both a Sigmoid + Cross-Entropy Loss layer and a Euclidean Loss layer, but the results are not good (a high loss value and oscillations even up to 50 epochs of training). I based my conversion of training instances and labels to LMDB on @sukritshankar's linked code, but will give your Python script a go too, to make sure the problem is not in how I read the data in.

@jerpint

jerpint commented Aug 24, 2016

@mvasil no problem, tag me again in a few days in case I forget. A big challenge was figuring out how to prepare my data (images) to match the preprocessing steps used for AlexNet, because I couldn't use their code directly for many practical reasons.

@mvasil

mvasil commented Sep 3, 2016

@jerpint Are you able to share your solver prototxt and network configuration prototxt files?
I have still not been able to get good results on my problem; it seems the training/test losses are decreasing, but very slowly.

@jerpint

jerpint commented Sep 15, 2016

Hey @mvasil, sorry for the slow reply. Below are both my solver and deploy prototxt files. I had to fine-tune from AlexNet to get my regression to work. I also did all the preprocessing outside of Caffe, for practical reasons related to my application, so all of my images are 227x227. Hope this helps.

deploy.prototxt

name: "JPNet"
layer {
  name: "data"
  type: "Input"
  top: "data"
  input_param { shape: { dim: 1 dim: 3 dim: 227 dim: 227 } }
}
layer {
  name: "conv1"
  type: "Convolution"
  bottom: "data"
  top: "conv1"
  param {
    lr_mult: 1
    decay_mult: 1
  }
  param {
    lr_mult: 2
    decay_mult: 0
  }
  convolution_param {
    num_output: 96
    kernel_size: 11
    stride: 4
  }
}
layer {
  name: "relu1"
  type: "ReLU"
  bottom: "conv1"
  top: "conv1"
}
layer {
  name: "norm1"
  type: "LRN"
  bottom: "conv1"
  top: "norm1"
  lrn_param {
    local_size: 5
    alpha: 0.0001
    beta: 0.75
  }
}
layer {
  name: "pool1"
  type: "Pooling"
  bottom: "norm1"
  top: "pool1"
  pooling_param {
    pool: MAX
    kernel_size: 3
    stride: 2
  }
}
layer {
  name: "conv2"
  type: "Convolution"
  bottom: "pool1"
  top: "conv2"
  param {
    lr_mult: 1
    decay_mult: 1
  }
  param {
    lr_mult: 2
    decay_mult: 0
  }
  convolution_param {
    num_output: 256
    pad: 2
    kernel_size: 5
    group: 2
  }
}
layer {
  name: "relu2"
  type: "ReLU"
  bottom: "conv2"
  top: "conv2"
}
layer {
  name: "norm2"
  type: "LRN"
  bottom: "conv2"
  top: "norm2"
  lrn_param {
    local_size: 5
    alpha: 0.0001
    beta: 0.75
  }
}
layer {
  name: "pool2"
  type: "Pooling"
  bottom: "norm2"
  top: "pool2"
  pooling_param {
    pool: MAX
    kernel_size: 3
    stride: 2
  }
}
layer {
  name: "conv3"
  type: "Convolution"
  bottom: "pool2"
  top: "conv3"
  param {
    lr_mult: 1
    decay_mult: 1
  }
  param {
    lr_mult: 2
    decay_mult: 0
  }
  convolution_param {
    num_output: 384
    pad: 1
    kernel_size: 3
  }
}
layer {
  name: "relu3"
  type: "ReLU"
  bottom: "conv3"
  top: "conv3"
}
layer {
  name: "conv4"
  type: "Convolution"
  bottom: "conv3"
  top: "conv4"
  param {
    lr_mult: 1
    decay_mult: 1
  }
  param {
    lr_mult: 2
    decay_mult: 0
  }
  convolution_param {
    num_output: 384
    pad: 1
    kernel_size: 3
    group: 2
  }
}
layer {
  name: "relu4"
  type: "ReLU"
  bottom: "conv4"
  top: "conv4"
}
layer {
  name: "conv5"
  type: "Convolution"
  bottom: "conv4"
  top: "conv5"
  param {
    lr_mult: 1
    decay_mult: 1
  }
  param {
    lr_mult: 2
    decay_mult: 0
  }
  convolution_param {
    num_output: 256
    pad: 1
    kernel_size: 3
    group: 2
  }
}
layer {
  name: "relu5"
  type: "ReLU"
  bottom: "conv5"
  top: "conv5"
}
layer {
  name: "pool5"
  type: "Pooling"
  bottom: "conv5"
  top: "pool5"
  pooling_param {
    pool: MAX
    kernel_size: 3
    stride: 2
  }
}
layer {
  name: "fc6"
  type: "InnerProduct"
  bottom: "pool5"
  top: "fc6"
  param {
    lr_mult: 1
    decay_mult: 1
  }
  param {
    lr_mult: 2
    decay_mult: 0
  }
  inner_product_param {
    num_output: 4096
  }
}
layer {
  name: "relu6"
  type: "ReLU"
  bottom: "fc6"
  top: "fc6"
}
layer {
  name: "drop6"
  type: "Dropout"
  bottom: "fc6"
  top: "fc6"
  dropout_param {
    dropout_ratio: 0.5
  }
}
layer {
  name: "fc7"
  type: "InnerProduct"
  bottom: "fc6"
  top: "fc7"
  param {
    lr_mult: 1
    decay_mult: 1
  }
  param {
    lr_mult: 2
    decay_mult: 0
  }
  inner_product_param {
    num_output: 4096
  }
}
layer {
  name: "relu7"
  type: "ReLU"
  bottom: "fc7"
  top: "fc7"
}
layer {
  name: "drop7"
  type: "Dropout"
  bottom: "fc7"
  top: "fc7"
  dropout_param {
    dropout_ratio: 0.5
  }
}
layer {
  name: "fc8_JP"
  type: "InnerProduct"
  bottom: "fc7"
  top: "fc8_JP"
  param {
    lr_mult: 1
    decay_mult: 1
  }
  param {
    lr_mult: 2
    decay_mult: 0
  }
  inner_product_param {
    num_output: 6
  }
}

solver.prototxt:

# The train/test net protocol buffer definition
net: "/home/Desktop/caffestuff/JP_Kitti/all_proto/mirror_shuffle/lenet_auto_trainJP0.prototxt"
#test_net: "/home/Desktop/caffestuff/JP_Kitti/all_proto/mirror_shuffle/lenet_auto_testJP.prototxt"
test_initialization: false
base_lr: 0.01
lr_policy: "step"
gamma: 0.1
stepsize: 100000
#test_iter: 3
#test_interval: 1000
max_iter: 450000
momentum: 0.9
weight_decay: 0.0005

# Display every 100 iterations
display: 100
# The maximum number of iterations

# snapshot intermediate results
snapshot: 500
snapshot_prefix: "/home/Desktop/caffestuff/JP_Kitti/all_proto/mirror_shuffle/snapshot"
snapshot_after_train: true

@yang-fei

yang-fei commented Oct 9, 2016

@sukritshankar Hi sukritshankar, you mentioned having continuous labels in [0,1], and I am not sure about this. I have labels in 3 classes, each with 7 categories, so should the label vector be 3 numbers? What do you mean by continuous labels in [0,1]?



@sukritshankar
Author

Perhaps I did not write this clearly, sorry for that! By continuous labels in [0,1], I mean that for each given label we have a ground-truth score in [0,1]. In your case, if you have 3 classes with 7 labels each, you have 21 labels in all, and for a given data instance your ground truth will be a 21-dim vector, with each value in [0,1].
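
An illustrative sketch of building such a vector (hypothetical indices, not from the shared code):

import numpy as np

chosen = [2, 5, 0]            # hypothetical: picked category within each of the 3 classes
label = np.zeros(3 * 7)       # 3 classes x 7 categories = 21-dim ground truth
for g, cat in enumerate(chosen):
    label[g * 7 + cat] = 1.0  # probability 1 for the picked category, 0 elsewhere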

@yang-fei

@sukritshankar Oh, I see, but then the ground truth should be a 21-dim vector with each number either 0 or 1, which would be mapped to 0 or 255 according to what you said. So what do you mean by continuous? I checked your example label.mat file, and it contains many numbers between 0 and 255.

@sukritshankar
Author

[0,1] does not mean only 0 or 1, but any real number from 0 to 1, inclusive of both. This is necessary for something like a sigmoid cross-entropy loss, where every label annotation is a probability. Note we do not write {0,1}, but [0,1].


@yang-fei

@sukritshankar Sorry about that. What I am confused about is the ground truth label, which is labeled manually. Is this number 0 or 1? So the ground truth label is in {0, 1}?

@sukritshankar
Author

That's fine! Yes, your ground-truth labels are in {0,1}. Think of 0 and 1 as being part of [0,1]: 0 as a probability of 0, and 1 as a probability of 1. In such a case you either have 0 or 255 after conversion. However, the code I have shared applies to the generic case of continuous values in [0,1]; yours is just a special case of it :)


@shengyudingli

Thanks a lot for so many useful discussions! I have a question about this issue: does this code support a variable number of labels per image? @sukritshankar

@sukritshankar
Author

If your data instances have a variable number of labels, make them consistent by choosing the maximum length of the label vector and putting zeros where a label is missing.
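
A minimal sketch of that padding (illustrative values; assumes per-image label lists of varying length):

import numpy as np

labels_per_image = [[0.9, 0.4], [0.7], [0.2, 0.5, 0.8]]  # hypothetical variable-length labels
max_len = max(len(l) for l in labels_per_image)

# Zero-pad every label vector to the maximum length so all rows
# share one shape for the label LMDB.
padded = np.zeros((len(labels_per_image), max_len))
for i, l in enumerate(labels_per_image):
    padded[i, :len(l)] = l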


@Shine41

Shine41 commented Oct 21, 2016

@sukritshankar Hi, I am trying to apply the multi-regression architecture you posted above to my net, but one error comes out:
I1020 21:58:32.243350 15732 layer_factory.hpp:77] Creating layer data
I1020 21:58:32.267315 15732 net.cpp:91] Creating Layer data
I1020 21:58:32.267349 15732 net.cpp:399] data -> data
I1020 21:58:32.267393 15732 data_transformer.cpp:25] Loading mean file from: /home/ying/Desktop/My_RenderForCNN/caffe_models/imagenet_mean.binaryproto
I1020 21:58:32.341429 15737 db_lmdb.cpp:35] Opened lmdb /home/ying/Desktop/My_RenderForCNN/data/syn_lmdbs/syn_lmdb_train_image
I1020 21:58:32.590544 15732 data_layer.cpp:41] output data size: 64,3,227,227
I1020 21:58:32.649722 15732 net.cpp:141] Setting up data
I1020 21:58:32.649780 15732 net.cpp:148] Top shape: 64 3 227 227 (9893568)
I1020 21:58:32.649791 15732 net.cpp:156] Memory required for data: 39574272
I1020 21:58:32.649802 15732 layer_factory.hpp:77] Creating layer label
I1020 21:58:32.649905 15732 net.cpp:91] Creating Layer label
I1020 21:58:32.649924 15732 net.cpp:399] label -> label
I1020 21:58:32.652168 15739 db_lmdb.cpp:35] Opened lmdb /home/ying/Desktop/My_RenderForCNN/data/syn_lmdbs/syn_lmdb_train_label
F1020 21:58:32.655133 15738 data_transformer.cpp:63] Check failed: datum_height == data_mean.height() (227 vs. 256)

How did you resize your mean file, namely imagenet_mean.binaryproto, to match the LMDB size? Thank you in advance!

@Franklin-Yao

@jerpint I guess your label shape is wrong. It should be N x M x 1 x 1, not N x 1 x 1 x M.

@jerpint

jerpint commented Nov 11, 2016

@xcmax why do you say this? Look at the line:

im_dat = caffe.io.array_to_datum(np.array(label_).astype(float).reshape(1,1,6))

This is the line that embeds each label, M of them, for a total of N samples, hence the N x 1 x 1 x M.
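
For what it is worth, array_to_datum treats a 3-D array as channels x height x width, so both layouts are expressible; a quick sketch:

import numpy as np
import caffe

label = np.arange(6, dtype=float)

d1 = caffe.io.array_to_datum(label.reshape(1, 1, 6))  # channels=1, height=1, width=6 -> N x 1 x 1 x M
d2 = caffe.io.array_to_datum(label.reshape(6, 1, 1))  # channels=6, height=1, width=1 -> N x M x 1 x 1
print(d1.channels, d1.height, d1.width)  # 1, 1, 6
print(d2.channels, d2.height, d2.width)  # 6, 1, 1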

@Franklin-Yao

Franklin-Yao commented Nov 12, 2016

@jerpint I have seen many people use N x M x 1 x 1, including @sukritshankar's example https://github.com/sukritshankar/Caffe-LMDBCreation-MultiLabel:

print X.shape  # check that X is of shape N x M x 1 x 1

with env.begin(write=True) as txn:
    # txn is a Transaction object
    for i in range(N):
        datum = caffe.proto.caffe_pb2.Datum()
        datum.channels = X.shape[1]
        datum.height = X.shape[2]
        datum.width = X.shape[3]
        datum.data = X[i].tostring()  # or .tobytes() if numpy < 1.9
        datum.label = int(y[i])
        str_id = '{:08}'.format(i)
