This repository has been archived by the owner on Nov 17, 2023. It is now read-only.

use 224x224 size to train mobilenet will encounter memory problem #8391

Closed
HaoLiuHust opened this issue Oct 23, 2017 · 7 comments

Comments

@HaoLiuHust
Contributor

HaoLiuHust commented Oct 23, 2017

I was trying to train a MobileNet on the CIFAR dataset. Since the original images are 32x32, I resize them to 224x224, which is the MobileNet input size:

def ExtractImageAndLabel(path, file):
    # Load one CIFAR-10 python batch (10,000 images of shape 3x32x32)
    # and resize every image to 224x224 for the MobileNet input.
    with open(path + file, 'rb') as f:
        data = cPickle.load(f)
        images = data['data']
        images = np.reshape(images, (10000, 3, 32, 32))
        images_resize = mx.nd.zeros(shape=(10000, 3, 224, 224))
        imagearray = mx.nd.array(images)

        for i, img in enumerate(imagearray):
            # move channels last for mx.image.imresize (H == W here, so (2,1,0) works)
            img = mx.nd.transpose(img, (2, 1, 0))
            images_rs = mx.image.imresize(img, 224, 224)
            # back to channels-first for the network
            images_resize[i] = mx.nd.transpose(images_rs, (2, 1, 0))

        labels = data['labels']
        labelarray = mx.nd.array(labels)

        return images_resize, labelarray

But after I resize them, I run into this error:
mxnet_source/dmlc-core/include/dmlc/./logging.h:308: [16:20:13] include/mxnet/././tensor_blob.h:275: Check failed: this->shape_.Size() == shape.Size() (4515840000 vs. 220872704) TBlob.get_with_shape: new and old shape do not match total elements

If I do not resize the images, everything works fine. I also checked the memory usage and it looks like a memory leak: my computer has 64GB of memory and the program uses all of it.
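For scale, here is a rough sketch of the memory that the resized training set alone needs, assuming all 50,000 CIFAR-10 training images are kept in RAM as float32 NDArrays (the default dtype); the numbers are only back-of-the-envelope arithmetic, not a measurement of this script:

bytes_per_float = 4
n_images = 50000

# holding the whole training set in memory at once
original = n_images * 3 * 32 * 32 * bytes_per_float      # ~0.6 GB at 32x32
resized = n_images * 3 * 224 * 224 * bytes_per_float     # ~30 GB at 224x224

print("32x32  : %.1f GB" % (original / 1e9))
print("224x224: %.1f GB" % (resized / 1e9))

# Note: each mx.nd.concatenate below allocates a new array, so while the
# dataset is being assembled several partial copies may be alive at once.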

The rest of my script is:

import mxnet as mx
from mxnet import nd, autograd, gluon
import numpy as np
import time
import Utils
import logging
import mobilenet

def ReadTrainSets(path):
    # Load all five CIFAR-10 training batches and stack them into one NDArray.
    train_data = []
    train_label = []

    data_pre = 'data_batch_'
    for i in range(1,6):
        file_name = data_pre+str(i)
        data_batch, label_batch = Utils.ExtractImageAndLabel(path, file_name)
        if not len(train_data):
            train_data = data_batch
            train_label = label_batch

        else:
            train_data = mx.nd.concatenate([train_data,data_batch])
            train_label = mx.nd.concatenate([train_label,label_batch])

    return train_data, train_label

path = './cifar-10-batches-py/'
batch = 128
ctx =(mx.gpu(2),mx.gpu(3))
train_data, train_label=ReadTrainSets(path)

print(train_data.shape)
print(train_label.shape)

train_iter = mx.io.NDArrayIter(data=train_data,label=train_label,batch_size=batch,shuffle=True)

net = gluon.model_zoo.vision.get_mobilenet(pretrained=False,ctx=ctx, multiplier = 1,classes=10)
data=mx.sym.Variable('data')
output = net(data)
softmax = mx.symbol.SoftmaxOutput(data=output, name='softmax')
#softmax = mobilenet.get_symbol(10)

logging.basicConfig(level=logging.INFO)

mod = mx.mod.Module(symbol=softmax, context=ctx)

print(mod.symbol.get_internals())

mod.bind(data_shapes=train_iter.provide_data, label_shapes=train_iter.provide_label)
mod.init_params(initializer=mx.init.Xavier(magnitude=2.))

mod.fit(train_iter, optimizer_params={'learning_rate':0.1, 'momentum':0.9}, num_epoch=100)
mod.save_checkpoint("mobilenet", epoch=100)

Can anyone help?

@chinakook
Contributor

Use Tiny ImageNet instead of CIFAR.

@HaoLiuHust
Contributor Author

I tried to use ImageRecordIter and it works fine, so maybe it is a bug in NDArray.
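For reference, a minimal sketch of that workaround: ImageRecordIter decodes and resizes each batch on the fly instead of materializing the whole 224x224 dataset in memory. It assumes a RecordIO file has already been packed with the im2rec tool; the file name cifar_train.rec is only illustrative.

import mxnet as mx

# Resize per batch at load time rather than preloading resized NDArrays.
train_iter = mx.io.ImageRecordIter(
    path_imgrec='cifar_train.rec',  # assumed path, created with tools/im2rec.py
    data_shape=(3, 224, 224),       # shape fed to the network
    resize=224,                     # resize the shorter edge before cropping
    batch_size=128,
    shuffle=True)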

@szha
Member

szha commented Jan 23, 2018

@apache/mxnet-committers: This issue has been inactive for the past 90 days. It has no label and needs triage.

For general "how-to" questions, our user forum (and Chinese version) is a good place to get help.

@lanking520
Member

@sandeep-krishnamurthy please add the Python, Bug, NDArray, and Memory labels to this issue.

@vandanavk
Contributor

@HaoLiuHust are you still seeing this issue?

With the latest code, I don't see the error "Check failed: this->shape_.Size() == shape.Size() (4515840000 vs. 220872704) TBlob.get_with_shape: new and old shape do not match total elements".

Could you share the system configuration on which you see this issue?

@HaoLiuHust
Contributor Author

@vandanavk It has been a long time... I have lost the program, so maybe it is OK now.

@sandeep-krishnamurthy
Contributor

Closing the issue as it is not reproducible. Please reopen if the issue still persists.
