Error: neon --gpu nervanagpu examples/convnet/i1k-alexnet-fp32.yaml #26
Comments
Do you have the ImageNet data files (the tar files containing the images)?
Alex
Yes, I have them, and I changed the path in the -fp32.yaml file.
Can you confirm that the following files are in $repo_path/I1K (where [...] ILSVRC2012_img_train.tar [...]
From the error, it seems like the batch_writer is not finding the train tar.
Cool, it works!
A little suggestion: maybe we should provide a meaningful debug/error message. ;)
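One way such a message could be produced is to check for the expected archives before any batches are written. The following is only a minimal sketch: the helper name `check_required_files` and the wording of the error are hypothetical and not part of neon, and the only file name taken from this thread is ILSVRC2012_img_train.tar.

```python
import os


def check_required_files(data_dir, required):
    """Raise a descriptive error if any expected file is missing.

    Hypothetical helper, not part of neon: `data_dir` is the directory
    that should hold the archives, `required` is a list of file names.
    """
    missing = [name for name in required
               if not os.path.isfile(os.path.join(data_dir, name))]
    if missing:
        raise IOError("Missing file(s) in %s: %s"
                      % (data_dir, ", ".join(missing)))
```

Calling `check_required_files(path, ["ILSVRC2012_img_train.tar"])` before the batch-writing step would then fail early and name the missing archive, rather than surfacing later as an arithmetic error.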
ubgpu@ubgpu:~/github/neon/neon$ neon --gpu nervanagpu examples/convnet/i1k-alexnet-fp32.yaml
WARNING:neon.util.persist:deserializing object from: examples/convnet/i1k-alexnet-fp32.yaml
WARNING:neon.datasets.imageset:Imageset initialized with dtype <type 'numpy.float32'>
2015-05-15 22:00:54,319 WARNING:neon - setting log level to: 20
2015-05-15 22:00:54,447 INFO:gpu - Initialized NervanaGPU with stochastic_round=None
2015-05-15 22:00:54,447 INFO:gpu - Seeding random number generator with: None
2015-05-15 22:00:54,448 INFO:init - NervanaGPU backend, RNG seed: None, numerr: None
2015-05-15 22:00:54,449 INFO:mlp - Layers:
ImageDataLayer d0: 3 x (224 x 224) nodes
ConvLayer conv1: 3 x (224 x 224) inputs, 64 x (55 x 55) nodes, RectLin act_fn
PoolingLayer pool1: 64 x (55 x 55) inputs, 64 x (27 x 27) nodes, Linear act_fn
ConvLayer conv2: 64 x (27 x 27) inputs, 192 x (27 x 27) nodes, RectLin act_fn
PoolingLayer pool2: 192 x (27 x 27) inputs, 192 x (13 x 13) nodes, Linear act_fn
ConvLayer conv3: 192 x (13 x 13) inputs, 384 x (13 x 13) nodes, RectLin act_fn
ConvLayer conv4: 384 x (13 x 13) inputs, 256 x (13 x 13) nodes, RectLin act_fn
ConvLayer conv5: 256 x (13 x 13) inputs, 256 x (13 x 13) nodes, RectLin act_fn
PoolingLayer pool3: 256 x (13 x 13) inputs, 256 x (6 x 6) nodes, Linear act_fn
FCLayer fc4096a: 9216 inputs, 4096 nodes, RectLin act_fn
DropOutLayer dropout1: 4096 inputs, 4096 nodes, Linear act_fn
FCLayer fc4096b: 4096 inputs, 4096 nodes, RectLin act_fn
DropOutLayer dropout2: 4096 inputs, 4096 nodes, Linear act_fn
FCLayer fc1000: 4096 inputs, 1000 nodes, Softmax act_fn
CostLayer cost: 1000 nodes, CrossEntropy cost_fn
2015-05-15 22:00:54,449 INFO:batch_norm - BatchNormalization set to train mode
2015-05-15 22:00:54,450 INFO:val_init - Generating AutoUniformValGen values of shape (363, 64)
2015-05-15 22:00:54,452 INFO:batch_norm - BatchNormalization set to train mode
2015-05-15 22:00:54,453 INFO:val_init - Generating AutoUniformValGen values of shape (1600, 192)
2015-05-15 22:00:54,458 INFO:batch_norm - BatchNormalization set to train mode
2015-05-15 22:00:54,459 INFO:val_init - Generating AutoUniformValGen values of shape (1728, 384)
2015-05-15 22:00:54,469 INFO:batch_norm - BatchNormalization set to train mode
2015-05-15 22:00:54,470 INFO:val_init - Generating AutoUniformValGen values of shape (3456, 256)
2015-05-15 22:00:54,483 INFO:batch_norm - BatchNormalization set to train mode
2015-05-15 22:00:54,484 INFO:val_init - Generating AutoUniformValGen values of shape (2304, 256)
2015-05-15 22:00:54,492 INFO:batch_norm - BatchNormalization set to train mode
2015-05-15 22:00:54,493 INFO:val_init - Generating AutoUniformValGen values of shape (4096, 9216)
2015-05-15 22:00:54,964 INFO:batch_norm - BatchNormalization set to train mode
2015-05-15 22:00:54,965 INFO:val_init - Generating AutoUniformValGen values of shape (4096, 4096)
2015-05-15 22:00:55,175 INFO:val_init - Generating AutoUniformValGen values of shape (1000, 4096)
2015-05-15 22:00:55,229 WARNING:imageset - Batch dir cache not found in /home/ubgpu/data/I1K/imageset_batches/dataset_cache.pkl:
Press Y to create, otherwise exit: Y
/usr/local/lib/python2.7/dist-packages/neon/util/batch_writer.py:137: RuntimeWarning: divide by zero encountered in log10
self.val_start = 10 ** int(np.log10(self.ntrain * 10))
Traceback (most recent call last):
  File "/usr/local/bin/neon", line 199, in <module>
    experiment, result, status = main()
  File "/usr/local/bin/neon", line 168, in main
    result = experiment.run()
  File "/usr/local/lib/python2.7/dist-packages/neon/experiments/fit_predict_err.py", line 97, in run
    super(FitPredictErrorExperiment, self).run()
  File "/usr/local/lib/python2.7/dist-packages/neon/experiments/fit.py", line 70, in run
    self.dataset.load()
  File "/usr/local/lib/python2.7/dist-packages/neon/datasets/imageset.py", line 176, in load
    self.bw.run()
  File "/usr/local/lib/python2.7/dist-packages/neon/util/batch_writer.py", line 215, in run
    self.write_csv_files()
  File "/usr/local/lib/python2.7/dist-packages/neon/util/batch_writer.py", line 137, in write_csv_files
    self.val_start = 10 ** int(np.log10(self.ntrain * 10))
OverflowError: cannot convert float infinity to integer
ubgpu@ubgpu:~/github/neon/neon$
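The traceback comes down to one expression, `10 ** int(np.log10(self.ntrain * 10))`, evaluated with `ntrain` equal to 0: `np.log10(0)` returns `-inf` with a RuntimeWarning, and `int()` cannot convert infinity. A guarded version might look like the sketch below. The function name `compute_val_start` and the error text are hypothetical and not neon's code, and `math.log10` stands in for `np.log10` only to keep the sketch free of dependencies.

```python
import math


def compute_val_start(ntrain):
    """Same arithmetic as the line in the traceback, with an explicit guard.

    Hypothetical sketch, not neon's implementation.
    """
    if ntrain <= 0:
        # With zero training images the logarithm is undefined, so fail
        # here with a message that points at the likely cause.
        raise ValueError(
            "no training images found (ntrain=%d); check that the train "
            "tar is in the expected data directory" % ntrain)
    return 10 ** int(math.log10(ntrain * 10))
```

With a guard like this, a missing train tar would be reported as a data problem instead of an OverflowError.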