DeepLearning on Imagenet with mxnet issues translating .lst to .rec files #9766
Comments
Can you please also provide the script that you ran to reproduce the issue? I see that you are using an older version of MXNet: 0.11.0. Have you tried 1.0.0?
I am traveling for the next couple of days. I will post the script when I
return. I have not tried version 1.0.
Dave
Here is the script used to train AlexNet:

# USAGE
# python train_alexnet.py --checkpoints checkpoints --prefix alexnet
# python train_alexnet.py --checkpoints checkpoints --prefix alexnet --start-epoch 25

# import the necessary packages
from config import imagenet_alexnet_config as config

# construct the argument parse and parse the arguments
ap = argparse.ArgumentParser()

# set the logging level and output file
logging.basicConfig(level=logging.DEBUG,

# load the RGB means for the training set, then determine the batch size
means = json.loads(open(config.DATASET_MEAN).read())

# construct the training image iterator
trainIter = mx.io.ImageRecordIter(

# construct the validation image iterator
valIter = mx.io.ImageRecordIter(

# initialize the optimizer
opt = mx.optimizer.SGD(learning_rate=1e-2, momentum=0.9, wd=0.0005,

# construct the checkpoints path, initialize the model argument and
# auxiliary parameters
checkpointsPath = os.path.sep.join([args["checkpoints"],

# if there is no specific model starting epoch supplied, then
# initialize the network
if args["start_epoch"] <= 0:

# otherwise, a specific checkpoint was supplied
else:

# compile the model
model = mx.mod.Module(

# initialize the callbacks and evaluation metrics
batchEndCBs = [mx.callback.Speedometer(batchSize, 500)]

# train the network
print("[INFO] training network...")
Here is the complete error trace:

Stack trace returned 6 entries:
terminate called after throwing an instance of 'dmlc::Error'
Stack trace returned 6 entries:
Aborted (core dumped)
I have posted the script.
I have resolved the issue. I had used resize=256 when building the .rec files, while my training script was using 3x227x227 images. You can close the issue.
Description
I ran the following command and got double the expected output
$ ~/mxnet/bin/im2rec imagenet/lists/train.lst "" \
    imagenet/rec/train.rec \
    resize=256 encoding='.jpg' \
    quality=100
The output I got from running the command
$ ls -l imagenet/rec/
total 217313924
-rw-rw-r-- 1 stonedl3 stonedl3 8310150340 Feb 4 01:27 test.rec
-rw-rw-r-- 1 stonedl3 stonedl3 205306062916 Feb 4 00:46 train.rec
-rw-rw-r-- 1 stonedl3 stonedl3 8913201356 Feb 4 01:10 val.rec
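To put the listing above in more readable units, the byte counts can be converted to GiB. This is a small sketch that uses only the sizes shown in the `ls -l` output; the helper name `to_gib` is an illustrative choice, not part of any tool mentioned in this issue.

```python
# Byte counts copied from the `ls -l imagenet/rec/` listing above.
sizes = {
    "test.rec": 8310150340,
    "train.rec": 205306062916,
    "val.rec": 8913201356,
}


def to_gib(num_bytes):
    """Convert a byte count to GiB (1 GiB = 2**30 bytes)."""
    return num_bytes / 2**30


for name, num_bytes in sizes.items():
    print(f"{name}: {to_gib(num_bytes):.1f} GiB")

# Combined size of the three record files, in bytes.
total_bytes = sum(sizes.values())
print(f"total: {to_gib(total_bytes):.1f} GiB")
```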
I then tried to train alexnet and got an error
[10:57:18] /home/stonedl3/mxnet/dmlc-core/include/dmlc/./logging.h:308: [10:57:18] src/io/image_aug_default.cc:300: Check failed: static_cast<index_t>(res.rows) >= param_.data_shape[1] && static_cast<index_t>(res.cols) >= param_.data_shape[2] input image size smaller than input shape
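The check that fails in the message above requires every decoded image to have at least as many rows and columns as the requested data shape. Below is a minimal Python sketch of that condition, assuming the data shape is laid out as (channels, height, width); the function name `fits_data_shape` is hypothetical and is not part of MXNet.

```python
def fits_data_shape(rows, cols, data_shape):
    """Mirror the failing check: rows >= data_shape[1] and cols >= data_shape[2].

    data_shape is assumed to be (channels, height, width), e.g. (3, 227, 227).
    """
    _, height, width = data_shape
    return rows >= height and cols >= width


# An image at least as large as the input shape in both dimensions passes.
print(fits_data_shape(256, 340, (3, 227, 227)))

# An image with fewer rows than the input height fails the check.
print(fits_data_shape(200, 256, (3, 227, 227)))
```

Running a check like this over the image dimensions stored in the record file, before training, would surface any undersized image without waiting for the iterator to abort.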
Environment info (Required)
I am using an HP Omen desktop with the following specs:
Graphics cards: two NVIDIA GTX 1080 Ti GPUs
Memory: 31.9 GB