
.rec must use the DPNs's Mxnet im2rec? #16

Open
shipengai opened this issue Oct 25, 2017 · 12 comments

Comments

@shipengai

I used the new mxnet/tools/im2rec.py to produce the .rec file.
When I run the MXNet included in your DPNs repo, it fails with "Segmentation fault (core dumped)".

@cypw
Owner

cypw commented Oct 25, 2017

The provided MXNet in the DPN repo is a fairly old version and may not be compatible with .rec files generated by newer versions.

You can either regenerate the .rec files using the old MXNet [code], or simply add the customized data augmentations to the latest MXNet and use that.

*Note: I personally recommend moving to the latest MXNet, since it provides much faster training and testing. But be careful with the official MXNet's default data augmentations, since it uses different strategies and may lead to lower accuracy.

@shipengai
Author

It always shows this error.

@cypw
Owner

cypw commented Oct 25, 2017

Have you followed my recommendations above?

There are numerous possible mistakes that can lead to a segmentation fault.
Could you provide me with more details?

@shipengai
Author

@cypw Thanks for your answer. I am using the old MXNet to train.

@shipengai
Author

shipengai commented Oct 28, 2017

@cypw But I added the torch augmentations to the latest MXNet and ran make.
My steps were:
1. Copy image_aug_torch.cc into new_mxnet/src/io/
2. Rename image_aug_default.cc to image_aug_default.cc.bk
3. Change image_augmenter.h as follows:

    namespace mxnet {
    namespace io {
    /*! \return the parameter of default augmenter */
    // std::vector<dmlc::ParamFieldInfo> ListDefaultAugParams();
    std::vector<dmlc::ParamFieldInfo> ListTorchAugParams();
    std::vector<dmlc::ParamFieldInfo> ListDefaultDetAugParams();
    }  // namespace io
    }  // namespace mxnet
    #endif  // MXNET_IO_IMAGE_AUGMENTER_H_

4. make -j $(nproc) USE_OPENCV=1 USE_BLAS=openblas USE_CUDA=1 USE_CUDA_PATH=/usr/local/cuda USE_CUDNN=1

There are some errors as follows:
    src/io/image_aug_torch.cc:214:11: error: ‘cv::Mat mxnet::io::TorchImageAugmenter::Process(const cv::Mat&, mxnet::common::RANDOM_ENGINE*)’ marked ‘override’, but does not override
       cv::Mat Process(const cv::Mat &src,
               ^
    src/io/image_aug_torch.cc: In lambda function:
    src/io/image_aug_torch.cc:429:36: error: invalid new-expression of abstract class type ‘mxnet::io::TorchImageAugmenter’
         return new TorchImageAugmenter();
                                        ^
    src/io/image_aug_torch.cc:141:7: note: because the following virtual functions are pure within ‘mxnet::io::TorchImageAugmenter’:
     class TorchImageAugmenter : public ImageAugmenter {
           ^
    In file included from src/io/image_aug_torch.cc:11:0:
    src/io/./image_augmenter.h:41:19: note: virtual cv::Mat mxnet::io::ImageAugmenter::Process(const cv::Mat&, std::vector<float>*, mxnet::common::RANDOM_ENGINE*)
       virtual cv::Mat Process(const cv::Mat &src, std::vector<float> *label,
                       ^
    src/io/image_aug_torch.cc: At global scope:
    src/io/image_aug_torch.cc:430:4: error: no matching function for call to ‘mxnet::io::ImageAugmenterReg::set_body(mxnet::io::<lambda()>)’
     });
        ^
    In file included from /home/shipeng/mxnet/nnvm/include/nnvm/./base.h:13:0,
                     from /home/shipeng/mxnet/nnvm/include/nnvm/op.h:16,
                     from include/mxnet/base.h:33,
                     from src/io/image_aug_torch.cc:6:
    /home/shipeng/mxnet/dmlc-core/include/dmlc/registry.h:165:21: note: candidate: EntryType& dmlc::FunctionRegEntryBase<EntryType, FunctionType>::set_body(FunctionType) [with EntryType = mxnet::io::ImageAugmenterReg; FunctionType = std::function<mxnet::io::ImageAugmenter*()>]
     inline EntryType &set_body(FunctionType body) {
                       ^
    /home/shipeng/mxnet/dmlc-core/include/dmlc/registry.h:165:21: note: no known conversion for argument 1 from ‘mxnet::io::<lambda()>’ to ‘std::function<mxnet::io::ImageAugmenter*()>’
    Makefile:275: recipe for target 'build/src/io/image_aug_torch.o' failed
    make: *** [build/src/io/image_aug_torch.o] Error 1
    make: *** Waiting for unfinished jobs....

@cypw
Owner

cypw commented Oct 28, 2017

@shipeng-uestc
If you are using the provided old MXNet @ 92053bd, please do the following tests to debug:

Step 1: Make sure your train.rec and val.rec are correct and your MXNet is good to go.

This can be verified by simply running the testing code on your *.rec files (both train.rec and val.rec).
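A damaged *.rec file is one common cause of such a segfault. As a first-pass check that does not depend on MXNet at all, you can walk the file's record framing with stdlib Python. This sketch assumes the dmlc-core RecordIO layout (magic `0xced7230a`, a 32-bit length word whose lower 29 bits hold the payload size, payloads padded to 4-byte alignment); verify those constants against your dmlc-core version:

```python
# Structural sanity check for a dmlc RecordIO .rec file (stdlib only).
# Assumed framing: [magic: uint32 LE][lrec: uint32 LE][payload][pad to 4 bytes],
# where the lower 29 bits of lrec hold the payload length.
import struct

KMAGIC = 0xCED7230A  # dmlc-core RecordIO magic number (assumption: see dmlc/recordio.h)

def count_records(path):
    """Walk the file record by record; raise ValueError if the framing is broken."""
    n = 0
    with open(path, "rb") as f:
        while True:
            header = f.read(8)
            if not header:
                break  # clean end of file
            magic, lrec = struct.unpack("<II", header)
            if magic != KMAGIC:
                raise ValueError("bad magic at record %d" % n)
            length = lrec & ((1 << 29) - 1)   # lower 29 bits = payload size
            pad = (4 - length % 4) % 4        # payloads are 4-byte aligned
            payload = f.read(length + pad)
            if len(payload) != length + pad:
                raise ValueError("truncated record %d" % n)
            n += 1
    return n
```

If this raises on train.rec or val.rec, the file itself is damaged and should be regenerated; if it passes but MXNet still segfaults, the problem is more likely in the augmenter or iterator setup.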

Step 2: Check if you have used the iterator correctly. Here, I give you an example:

 # data iter
 def get_data_iter(args, kv):
     mean_r = 124
     mean_g = 117
     mean_b = 104
     data_shape = (3, 224, 224)
     train = mx.io.ImageRecordIter(
         data_name           = 'data',
         label_name          = 'softmax_label',
         # ------------------------------------
         path_imgrec         = os.path.join(args.data_dir, "train.rec"),
         aug_seq             = 'aug_torch',
         label_width         = 1,
         data_shape          = data_shape,
         force2color         = True,
         preprocess_threads  = 15,
         verbose             = True,
         num_parts           = 1,
         part_index          = 0,
         shuffle             = True,
         shuffle_chunk_size  = 1024,
         shuffle_chunk_seed  = kv.rank,
         # ------------------------------------
         batch_size          = args.batch_size,
         # ------------------------------------
         rand_mirror         = True,
         mean_r              = mean_r,
         mean_g              = mean_g,
         mean_b              = mean_b,
         scale               = 0.0167,
         seed                = kv.rank,
         # ------------------------------------
         rand_crop           = True,
         min_aspect_ratio    = 0.7500,
         max_aspect_ratio    = 1.3333,
         min_random_area     = 0.08,
         max_random_area     = 1.0,
         random_h            = 20,
         random_s            = 40,
         random_l            = 50,
         fill_value          = (mean_r, mean_g, mean_b),
         inter_method        = 2    # 1-bilinear 2-cubic 9-auto
         )
     val = mx.io.ImageRecordIter(
         data_name           = 'data',
         label_name          = 'softmax_label',
         # ------------------------------------
         path_imgrec         = os.path.join(args.data_dir, "val.rec"),
         aug_seq             = 'aug_torch',
         label_width         = 1,
         data_shape          = data_shape,
         force2color         = True,
         preprocess_threads  = 4,
         verbose             = True,
         num_parts           = kv.num_workers,
         part_index          = kv.rank,
         # ------------------------------------
         batch_size          = args.batch_size,
         # ------------------------------------
         rand_mirror         = False,
         mean_r              = mean_r,
         mean_g              = mean_g,
         mean_b              = mean_b,
         scale               = 0.0167,
         seed                = 0,
         # ------------------------------------
         rand_crop           = False,
         min_random_area     = 0.765625,
         max_random_area     = 0.765625,
         fill_value          = (mean_r, mean_g, mean_b),
         inter_method        = 2    # 1-bilinear 2-cubic 9-auto
         )
     return (train, val)

As for moving to the latest MXNet, I haven't tried it yet. But it seems that the newer MXNet added another input argument (i.e. std::vector<float> *label) to the Process function #L59, so you may also need to add a std::vector<float> *label, just like what they did in their code image_aug_default.cc#L196.

@shipengai
Author

@cypw Thank you very much. But why min_random_area = 0.765625? If I set rand_crop = False, min_random_area = 1, max_random_area = 1,
is that a center crop?

@cypw
Owner

cypw commented Oct 28, 2017

I fixed it to 0.765625 to evaluate a 224x224 center crop with all input images resized to short side = 256.
Note: min_random_area = max_random_area = 0.765625 = (224^2)/(256^2).

Yes, if you set min_random_area = max_random_area = 1.0 and rand_crop = False, then you are using a center crop with all input images resized to short side = input length = 224. In other words, the input will be a 224x224 center crop of a 224xN (or Nx224) image.
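The arithmetic behind that fixed area fraction can be checked directly in a few lines of Python:

```python
# Check the relation min_random_area = max_random_area = (crop / resize)^2
# for a 224x224 center crop taken from an image whose short side is resized to 256.
crop, short_side = 224, 256
area_fraction = (crop ** 2) / (short_side ** 2)
assert area_fraction == 0.765625  # exact, since both sides are powers of two

# Conversely, given a fixed area fraction, recover the implied resize length:
implied_short_side = crop / area_fraction ** 0.5
assert round(implied_short_side) == 256

# With area fraction 1.0, the crop spans the whole short side, i.e. a
# 224x224 center crop of a 224xN (or Nx224) image.
```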

@shipengai
Author

Thank you very much, it helps me a lot.
I had always misunderstood min_random_area.

@shipengai
Author

shipengai commented Oct 28, 2017

I still have a question. Now I want to ensemble three DPN-92 models. I have read some blogs that say this needs different training data, so I used "unchanged = 1", "resize = 395", and "resize = 480" to produce three different train.rec files. The input size is 320x320. Do you have a better suggestion?

@cypw
Owner

cypw commented Oct 28, 2017

@shipeng-uestc
The last two *.rec files are unnecessary since all resizing can be done inside the data iterator.

Actually, I don't quite understand why you need training sets at different scales for multi-model ensembling. Usually, multi-model ensembling happens after the training phase and only involves the validation and test sets.
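For reference, post-training ensembling can be as simple as averaging the per-class probabilities each trained model assigns to a test image and taking the argmax. A minimal stdlib sketch (the probability vectors below are made-up placeholders, not outputs of any real DPN model):

```python
# Average the per-class probabilities of several models, then take the argmax.
def ensemble_predict(prob_lists):
    """prob_lists: one probability vector per model, all of equal length."""
    n_models = len(prob_lists)
    n_classes = len(prob_lists[0])
    avg = [sum(p[c] for p in prob_lists) / n_models for c in range(n_classes)]
    return max(range(n_classes), key=lambda c: avg[c]), avg

# Three hypothetical models scoring one image over 3 classes:
model_probs = [
    [0.6, 0.3, 0.1],
    [0.2, 0.5, 0.3],
    [0.5, 0.4, 0.1],
]
label, avg = ensemble_predict(model_probs)  # label is 0: class 0 wins on average
```

Averaging smooths out errors that individual models make independently, which is why the models only need to differ (in seed, architecture, or augmentation), not necessarily in training-set scale.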

@shipengai
Author

@cypw Thank you. I want to train three different DPN-92 models and then use all three to predict on the same test set, in order to get higher accuracy. I read the ResNet paper; at test time they use two ResNet-152 models to get higher accuracy.
