This repository has been archived by the owner on Feb 7, 2023. It is now read-only.

Creating an image LMDB for Caffe2 training, testing, etc. #1755

Open
MatthewInkawhich opened this issue Jan 16, 2018 · 9 comments

@MatthewInkawhich
Contributor

In the original Caffe framework, there was an executable under caffe/build/tools called convert_imageset, which took a directory of JPEG images and a text file with labels for each image, and output an LMDB that could be fed to a Caffe model to train, test, etc.

What is the best way to convert raw JPEG images and labels to an LMDB that Caffe2 can ingest using the AddInput() function from this MNIST tutorial on the Caffe2 website (https://github.com/caffe2/caffe2/blob/master/caffe2/python/tutorials/MNIST.ipynb)?

According to my research, you cannot simply create an LMDB file with that tool and feed it to a Caffe2 model.

The tutorial script just downloads two LMDBs (mnist-train-nchw-lmdb and mnist-test-nchw-lmdb) and passes them to AddInput(), but gives no insight as to how the LMDBs were created.
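
For reference, AddInput() in that tutorial reads TensorProtos records out of a DB and normalizes them, roughly like this (paraphrased from the notebook):

from caffe2.python import core

def AddInput(model, batch_size, db, db_type):
    # read (image, label) TensorProtos records from the DB, e.g. an LMDB
    data_uint8, label = model.TensorProtosDBInput(
        [], ["data_uint8", "label"], batch_size=batch_size,
        db=db, db_type=db_type)
    # cast to float and scale pixel values from [0, 255] down to [0, 1)
    data = model.Cast(data_uint8, "data", to=core.DataType.FLOAT)
    data = model.Scale(data, data, scale=float(1. / 256))
    # the input data does not need gradients in the backward pass
    data = model.StopGradient(data, data)
    return data, label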

@xinqinglxl

Hi.
If your training is on the CPU, you can follow the example below:
https://github.com/caffe2/caffe2/blob/master/caffe2/python/examples/lmdb_create_example.py

If you want to train the model on multiple GPUs, you can use the make_image_db command to create the LMDB from images.

@sreelu1995

"If you want to train the model with multi-gpu you can use the make_image_db command to create the lmdb from image."
Can you please be more specific about this? How can this command be used to convert .jpg images to LMDB format? Please provide some links to examples or tutorials

@xinqinglxl

xinqinglxl commented Jan 19, 2018

Can you please be more specific about this? How can this command be used to convert .jpg images to LMDB format? Please provide some links to examples or tutorials

Hi, @sreelu1995.
First, check whether the make_image_db command has been compiled.
If you built your Caffe2 with OpenCV, the command will have been compiled and should be in the following directory:
caffe2/build/bin/make_image_db

Then prepare your images for training, e.g.
eg. dataset/image1.jpg
dataset/image2.jpg
...

Create a text file specifying the category that each picture belongs to. This text file (e.g. train.labels) is formatted like so:

dataset/image1.jpg 0
dataset/image2.jpg 1
...

Execute the make_image_db command and the LMDB will be created:
caffe2/build/bin/make_image_db -color -db lmdb -input_folder ./ -list_file train.labels -num_threads 10 -output_db_name ../lmdb/age2labels_train_nchw_caffe2_lmdb -raw -scale 256 -shuffle
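
One way to sanity-check the resulting DB (just a sketch of mine, not part of make_image_db; it assumes each LMDB value is a serialized caffe2_pb2.TensorProtos with the image in protos[0] and the int32 label in protos[1]):

import lmdb
from caffe2.proto import caffe2_pb2

# path below is the output_db_name used in the command above
env = lmdb.open('../lmdb/age2labels_train_nchw_caffe2_lmdb', readonly=True)
with env.begin() as txn:
    for key, value in txn.cursor():
        protos = caffe2_pb2.TensorProtos()
        protos.ParseFromString(value)
        # print the key, the image dims, and the label of the first record
        print(key, protos.protos[0].dims, protos.protos[1].int32_data)
        break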

@MatthewInkawhich
Contributor Author

MatthewInkawhich commented Jan 19, 2018

@xinqinglxl Thanks for your answer. Just to be clear, does the make_image_db executable create LMDBs that work exclusively as data containers when training on a multi-GPU system? Likewise, is the lmdb_create_example.py script only suitable for CPU usage?

Sorry if I am making you repeat yourself; I am just struggling to understand why the LMDB creation method is system specific.

@xinqinglxl

Just to be clear, does the make_image_db executable create LMDBs that exclusively work as data containers when training on a multi-gpu system?

Sorry, but I've never used an LMDB created by make_image_db for CPU training.
I think it may work on CPU; you can try it.

I am just struggling to understand why the LMDB creation method is system specific.

I don't think the LMDB creation method is system specific.
The reason the LMDB created by the lmdb_create_example.py script can't be used for multi-GPU training is that this Python-script-made LMDB is a caffe-type LMDB, which works for CPU and single-GPU training.
The LMDB created by make_image_db is a caffe2-type LMDB, so it can be used for multi-GPU training.
Unfortunately, it seems there is no example of creating such an LMDB in Python; @breadbread1984 provided a C++ example here.
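
For what it's worth, a minimal Python sketch of such a caffe2-type image LMDB writer could look like the one below. The function name write_image_lmdb and the use of OpenCV for loading/resizing are my own choices, and the record layout (TensorProtos with protos[0] = raw HWC uint8 image and protos[1] = int32 label, mirroring what make_image_db -raw appears to produce) is an assumption, so verify it against your build before relying on it:

import cv2
import lmdb
from caffe2.proto import caffe2_pb2

def write_image_lmdb(db_path, list_file, scale=256):
    # write_image_lmdb, db_path, list_file and scale are hypothetical names
    env = lmdb.open(db_path, map_size=1 << 40)  # generous map size
    with env.begin(write=True) as txn, open(list_file) as f:
        for index, line in enumerate(f):
            path, label = line.split()            # e.g. "dataset/image1.jpg 0"
            img = cv2.imread(path)                # HWC uint8 (BGR)
            img = cv2.resize(img, (scale, scale))
            protos = caffe2_pb2.TensorProtos()
            img_proto = protos.protos.add()
            img_proto.dims.extend(img.shape)      # assumed HWC layout
            img_proto.data_type = caffe2_pb2.TensorProto.BYTE
            img_proto.byte_data = img.tobytes()
            label_proto = protos.protos.add()
            label_proto.data_type = caffe2_pb2.TensorProto.INT32
            label_proto.int32_data.append(int(label))
            txn.put('{:08d}'.format(index).encode('ascii'),
                    protos.SerializeToString())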

@MatthewInkawhich
Contributor Author

MatthewInkawhich commented Jan 22, 2018

I am running into a problem using the LMDB created by make_image_db executable.

For testing purposes, I use the same code to create, train, and test the net as in the MNIST tutorial (https://github.com/caffe2/caffe2/blob/master/caffe2/python/tutorials/MNIST.ipynb). However, instead of downloading the pre-created LMDBs like the tutorial does, I downloaded the MNIST dataset as JPEG files from https://www.kaggle.com/scolianni/mnistasjpg. For simplicity, I am using a sample of this data (10,000 training | 2,000 validation | 2,000 testing).

Steps:

  1. Extract a sample of JPEGs for training, validation, and testing, and put them in central "pool" directories, while creating label files for each set. These label files are formatted as @xinqinglxl showed in a previous comment.

  2. Create LMDBs for each set using the following commands:
    ./make_image_db -db lmdb -input_folder ~/mnistasjpg/mnist_training/pool/ -list_file ~/mnistasjpg/training_labels_v2 -num_threads 10 -output_db_name ~/mnistasjpg/train_lmdb -raw -scale 28 -shuffle
    ./make_image_db -db lmdb -input_folder ~/mnistasjpg/mnist_validation/pool/ -list_file ~/mnistasjpg/validation_labels_v2 -num_threads 10 -output_db_name ~/mnistasjpg/validation_lmdb -raw -scale 28
    ./make_image_db -db lmdb -input_folder ~/mnistasjpg/mnist_testing/pool/ -list_file ~/mnistasjpg/testing_labels_v2 -num_threads 10 -output_db_name ~/mnistasjpg/test_lmdb -raw -scale 28
    These all ran quickly, with none of the stdout output that the original Caffe's convert_imageset command produces.

  3. Modify my train.py script (which successfully runs training on the downloaded MNIST LMDBs, as in the tutorial) so that the LMDB path variables point to the newly created LMDBs.

  4. Run train.py.

Error:
WARNING:root:This caffe2 python run does not have GPU support. Will run in CPU only mode.
WARNING:root:Debug message: No module named caffe2_pybind11_state_gpu
training data folder: /Users/matthewinkawhich/mnistasjpg
workspace root folder: /Users/matthewinkawhich/caffe2_examples/myex
Protocol buffers files have been created in your root folder: /Users/matthewinkawhich/caffe2_examples/myex
Traceback for operator 4 in network myex_train
/usr/local/caffe2/python/helpers/conv.py:149
/usr/local/caffe2/python/helpers/conv.py:196
/usr/local/caffe2/python/brew.py:119
train.py:71
train.py:160
Traceback (most recent call last):
  File "train.py", line 226, in <module>
    workspace.RunNet(train_model.net)
  File "/usr/local/caffe2/python/workspace.py", line 228, in RunNet
    StringifyNetName(name), num_iter, allow_fail,
  File "/usr/local/caffe2/python/workspace.py", line 193, in CallWithExceptionIntercept
    return func(*args, **kwargs)
RuntimeError: [enforce fail at conv_op_impl.h:46] C == filter.dim32(1) * group_.
Convolution op: input channels does not match: # of input channels 28 is not equal to kernel channels * group:1*1
Error from operator:
input: "data" input: "conv1_w" input: "conv1_b" output: "conv1" name: "" type: "Conv" arg { name: "kernel" i: 5 } arg { name: "exhaustive_search" i: 0 } arg { name: "order" s: "NCHW" } engine: "CUDNN"

The important line seems to be:
**Convolution op: input channels does not match: # of input channels 28 is not equal to kernel channels * group:1*1**

Code:

# define model architecture in terms of layers
def AddLeNetModel(model, data):
    # Image size: 28 x 28 -> 24 x 24
    conv1 = brew.conv(model, data, 'conv1', dim_in=1, dim_out=20, kernel=5)
    # Image size: 24 x 24 -> 12 x 12
    pool1 = brew.max_pool(model, conv1, 'pool1', kernel=2, stride=2)
    # Image size: 12 x 12 -> 8 x 8
    conv2 = brew.conv(model, pool1, 'conv2', dim_in=20, dim_out=50, kernel=5)
    # Image size: 8 x 8 -> 4 x 4
    pool2 = brew.max_pool(model, conv2, 'pool2', kernel=2, stride=2)
    # 50 * 4 * 4 stands for dim_out from previous layer multiplied by the image size
    fc3 = brew.fc(model, pool2, 'fc3', dim_in=50 * 4 * 4, dim_out=500)
    fc3 = brew.relu(model, fc3, fc3)
    pred = brew.fc(model, fc3, 'pred', 500, num_classes)
    softmax = brew.softmax(model, pred, 'softmax')
    return softmax

Note that this code is unchanged from the tutorial. Line 71, which the error output refers to, is the line starting with conv1 =.

Fix Attempts:
I thought that maybe make_image_db was storing the MNIST images with 3 color channels, when in fact they are grayscale images. To fix this, I tried the following options:

  1. Changed the line:
    conv1 = brew.conv(model, data, 'conv1', dim_in=1, dim_out=20, kernel=5) to
    conv1 = brew.conv(model, data, 'conv1', dim_in=3, dim_out=20, kernel=5). However, I still get the same error, with one line difference.
    The line:
    Convolution op: input channels does not match: # of input channels 28 is not equal to kernel channels * group:1*1
    is now:
    Convolution op: input channels does not match: # of input channels 28 is not equal to kernel channels * group:3*1

  2. Ran the same commands as shown in Step 2 above, but with a -color false flag. However, I am still getting the same error.

@xinqinglxl

I guess the reason may be a dimension mismatch in the input blob.
The MNIST LeNet demo needs the input to be NCHW, but the input read from the LMDB is NHWC.
P.S. I'm not sure, but you can check this by printing the shape of the input blob.

Then, you can either switch the input blobs to NCHW, or maybe change the arg_scope to {"order": "NHWC"}.
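
A quick way to check that (a sketch of mine, reusing the train_lmdb path from the earlier comment): build a tiny net that only runs the DB reader and print the shape of the data blob. A shape like (N, 28, 28, 1) would indicate NHWC, while (N, 1, 28, 28) would be NCHW; if it turns out to be NHWC, the NHWC2NCHW operator can transpose it before conv1.

from caffe2.python import core, workspace

# tiny net that only reads one batch from the LMDB, nothing else
check_net = core.Net("lmdb_shape_check")
check_net.TensorProtosDBInput(
    [], ["data", "label"], batch_size=16,
    db="/Users/matthewinkawhich/mnistasjpg/train_lmdb", db_type="lmdb")
workspace.RunNetOnce(check_net)
print("data shape:", workspace.FetchBlob("data").shape)
print("label shape:", workspace.FetchBlob("label").shape)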

@sreelu1995

@xinqinglxl Thank you so much. But I still have a doubt: when I look at the sample command you have given, the path to the image dataset is not specified anywhere. Or are you specifying the path in train.labels itself?

@xinqinglxl

Hi @sreelu1995
caffe2/build/bin/make_image_db -color -db lmdb -input_folder ./ -list_file train.labels -num_threads 10 -output_db_name ../lmdb/age2labels_train_nchw_caffe2_lmdb -raw -scale 256 -shuffle

The image dataset is specified by the -input_folder parameter; in this example I point it to the current directory (./).
