
Resource exhausted: OOM when allocating tensor with shape[2304,384] Traceback (most recent call last): #1993

Closed
ehfo0 opened this issue Jul 20, 2017 · 19 comments

Comments


ehfo0 commented Jul 20, 2017


I tried to run models/tutorials/image/cifar10/train.py and let it run for about a day on my PC (Windows 10, tensorflow-gpu 1.2). After
2017-07-20 13:58:20.441224: step 941580, loss = 0.14 (3076.2 examples/sec; 0.042 sec/batch)

I got this error:

2017-07-20 13:58:20.791379: W c:\tf_jenkins\home\workspace\release-win\m\windows-gpu\py\35\tensorflow\core\framework\op_kernel.cc:1158] Resource exhausted: OOM when allocating tensor with shape[2304,384]
Traceback (most recent call last):
  File "D:\Anaconda3\lib\site-packages\tensorflow\python\client\session.py", line 1139, in _do_call
    return fn(*args)
  File "D:\Anaconda3\lib\site-packages\tensorflow\python\client\session.py", line 1121, in _run_fn
    status, run_metadata)
  File "D:\Anaconda3\lib\contextlib.py", line 66, in __exit__
    next(self.gen)
  File "D:\Anaconda3\lib\site-packages\tensorflow\python\framework\errors_impl.py", line 466, in raise_exception_on_not_ok_status
    pywrap_tensorflow.TF_GetCode(status))
tensorflow.python.framework.errors_impl.ResourceExhaustedError: OOM when allocating tensor with shape[2304,384]
	 [[Node: ExponentialMovingAverage/AssignMovingAvg_4/sub_1 = Sub[T=DT_FLOAT, _class=["loc:@local3/weights"], _device="/job:localhost/replica:0/task:0/cpu:0"](local3/weights/ExponentialMovingAverage/read, local3/weights/read)]]

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "C:/Users/Hoda/Documents/GitHub/models/tutorials/image/cifar10/cifar10_train.py", line 127, in <module>
    tf.app.run()
  File "D:\Anaconda3\lib\site-packages\tensorflow\python\platform\app.py", line 48, in run
    _sys.exit(main(_sys.argv[:1] + flags_passthrough))
  File "C:/Users/Hoda/Documents/GitHub/models/tutorials/image/cifar10/cifar10_train.py", line 123, in main
    train()
  File "C:/Users/Hoda/Documents/GitHub/models/tutorials/image/cifar10/cifar10_train.py", line 115, in train
    mon_sess.run(train_op)
  File "D:\Anaconda3\lib\site-packages\tensorflow\python\training\monitored_session.py", line 505, in run
    run_metadata=run_metadata)
  File "D:\Anaconda3\lib\site-packages\tensorflow\python\training\monitored_session.py", line 842, in run
    run_metadata=run_metadata)
  File "D:\Anaconda3\lib\site-packages\tensorflow\python\training\monitored_session.py", line 798, in run
    return self._sess.run(*args, **kwargs)
  File "D:\Anaconda3\lib\site-packages\tensorflow\python\training\monitored_session.py", line 952, in run
    run_metadata=run_metadata)
  File "D:\Anaconda3\lib\site-packages\tensorflow\python\training\monitored_session.py", line 798, in run
    return self._sess.run(*args, **kwargs)
  File "D:\Anaconda3\lib\site-packages\tensorflow\python\client\session.py", line 789, in run
    run_metadata_ptr)
  File "D:\Anaconda3\lib\site-packages\tensorflow\python\client\session.py", line 997, in _run
    feed_dict_string, options, run_metadata)
  File "D:\Anaconda3\lib\site-packages\tensorflow\python\client\session.py", line 1132, in _do_run
    target_list, options, run_metadata)
  File "D:\Anaconda3\lib\site-packages\tensorflow\python\client\session.py", line 1152, in _do_call
    raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.ResourceExhaustedError: OOM when allocating tensor with shape[2304,384]
	 [[Node: ExponentialMovingAverage/AssignMovingAvg_4/sub_1 = Sub[T=DT_FLOAT, _class=["loc:@local3/weights"], _device="/job:localhost/replica:0/task:0/cpu:0"](local3/weights/ExponentialMovingAverage/read, local3/weights/read)]]

Caused by op 'ExponentialMovingAverage/AssignMovingAvg_4/sub_1', defined at:
  File "C:/Users/Hoda/Documents/GitHub/models/tutorials/image/cifar10/cifar10_train.py", line 127, in <module>
    tf.app.run()
  File "D:\Anaconda3\lib\site-packages\tensorflow\python\platform\app.py", line 48, in run
    _sys.exit(main(_sys.argv[:1] + flags_passthrough))
  File "C:/Users/Hoda/Documents/GitHub/models/tutorials/image/cifar10/cifar10_train.py", line 123, in main
    train()
  File "C:/Users/Hoda/Documents/GitHub/models/tutorials/image/cifar10/cifar10_train.py", line 79, in train
    train_op = cifar10.train(loss, global_step)
  File "C:\Users\Hoda\Documents\GitHub\models\tutorials\image\cifar10\cifar10.py", line 373, in train
    variables_averages_op = variable_averages.apply(tf.trainable_variables())
  File "D:\Anaconda3\lib\site-packages\tensorflow\python\training\moving_averages.py", line 392, in apply
    self._averages[var], var, decay, zero_debias=zero_debias))
  File "D:\Anaconda3\lib\site-packages\tensorflow\python\training\moving_averages.py", line 72, in assign_moving_average
    update_delta = (variable - value) * decay
  File "D:\Anaconda3\lib\site-packages\tensorflow\python\ops\variables.py", line 694, in _run_op
    return getattr(ops.Tensor, operator)(a._AsTensor(), *args)
  File "D:\Anaconda3\lib\site-packages\tensorflow\python\ops\math_ops.py", line 838, in binary_op_wrapper
    return func(x, y, name=name)
  File "D:\Anaconda3\lib\site-packages\tensorflow\python\ops\gen_math_ops.py", line 2501, in _sub
    result = _op_def_lib.apply_op("Sub", x=x, y=y, name=name)
  File "D:\Anaconda3\lib\site-packages\tensorflow\python\framework\op_def_library.py", line 767, in apply_op
    op_def=op_def)
  File "D:\Anaconda3\lib\site-packages\tensorflow\python\framework\ops.py", line 2510, in create_op
    original_op=self._default_original_op, op_def=op_def)
  File "D:\Anaconda3\lib\site-packages\tensorflow\python\framework\ops.py", line 1273, in __init__
    self._traceback = _extract_stack()

ResourceExhaustedError (see above for traceback): OOM when allocating tensor with shape[2304,384]
	 [[Node: ExponentialMovingAverage/AssignMovingAvg_4/sub_1 = Sub[T=DT_FLOAT, _class=["loc:@local3/weights"], _device="/job:localhost/replica:0/task:0/cpu:0"](local3/weights/ExponentialMovingAverage/read, local3/weights/read)]]

How can I fix it? And do I have to run it again from scratch, or is the previous result saved?


drpngx commented Jul 20, 2017

The most expedient way is probably to reduce the batch size. It'll run slower, but use less memory.
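For this tutorial specifically, the batch size is defined as a flag in cifar10.py; a minimal sketch of the change, assuming the flag definition still looks like it did in the r1.2 tutorial, is below (equivalently, pass --batch_size=64 to cifar10_train.py):

import tensorflow as tf  # TF 1.x

# Sketch only: the CIFAR-10 tutorial reads its batch size from this flag.
# Lowering the default from 128 to 64 roughly halves the per-step activation memory.
tf.app.flags.DEFINE_integer('batch_size', 64,
                            """Number of images to process in a batch.""")

cifar10_train.py reads the value through FLAGS.batch_size, so no other change should be needed.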


drpngx commented Jul 20, 2017

Are you saying that there is a memory leak?

drpngx added the stat:awaiting response (Waiting on input from the contributor) label on Jul 20, 2017

ehfo0 commented Jul 21, 2017

Thanks for the answer.
I changed the batch_size from 128 to 64 and now it's running! I'm running it on my PC, so it might take a while!
I don't know about a memory leak, but I guess the cache just ran out of memory. I have 16 GB of RAM and a GeForce GTX 970 graphics card.
Does it lose the previous training time, or does the network get better each time we run train.py?

aselle removed the stat:awaiting response (Waiting on input from the contributor) label on Jul 21, 2017

ehfo0 commented Jul 21, 2017

Thanks, I reduced the batch size and it worked!
I got a precision of:
2017-07-21 21:42:04.630874: precision @ 1 = 0.859


drpngx commented Jul 22, 2017

Yay!!

drpngx closed this as completed on Jul 22, 2017
@deepakmeena635

I'm having the same issue; you can find more details below:
tensorflow/tensorflow#4735 (comment)
Please look into it.

@ShubhamKanitkar

The most expedient way is probably to reduce the batch size. It'll run slower, but use less memory.

I am using the CPU to train the model.
I have already set the batch size to 1 and resized the images to 200 x 200.
It is still throwing a Resource Exhausted error.

Please help.


krw0320 commented Jul 2, 2019

I am having the same issue even with a reduced batch size


heizie commented Jul 8, 2019

Same here. The batch size was already 1, I've changed the fixed_shape_resizer to 500x500 (using Faster R-CNN models), and

  import tensorflow as tf  # TF 1.x

  # Cap the session at 30% of the GPU's memory.
  session_config = tf.ConfigProto()
  session_config.gpu_options.per_process_gpu_memory_fraction = 0.3
  config = tf.estimator.RunConfig(model_dir=FLAGS.model_dir, session_config=session_config)

is also set.

But it keeps showing this (with SSD ResNet, same error):

2019-07-08 18:37:10.194834: W tensorflow/core/framework/op_kernel.cc:1502] OP_REQUIRES failed at cwise_ops_common.cc:70 : Resource exhausted: OOM when allocating tensor with shape[100,51150] and type float on /job:localhost/replica:0/task:0/device:GPU:0 by allocator GPU_0_bfc
until it breaks automatically.
I can only train the SSD MobileNet. Very confusing.

I'm using a GTX 1060 6GB and an RTX 2070; both have the same error.
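A hedged note on the per_process_gpu_memory_fraction setting above: capping the session at 0.3 limits TensorFlow to roughly 30% of the card, which can itself trigger OOM on a 6 GB GPU. With the TF 1.x API an alternative is to let the allocator grow on demand (model_dir below is a placeholder):

import tensorflow as tf  # TF 1.x

# Sketch: let the GPU allocator grow as needed instead of pre-reserving a fixed fraction.
session_config = tf.ConfigProto()
session_config.gpu_options.allow_growth = True
run_config = tf.estimator.RunConfig(model_dir='/tmp/model', session_config=session_config)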

@Alex-Naxitus

Hello, I'm facing the same problem when training with the kangaroo dataset.
Reducing the training batch from 16 to 4 has not changed anything.
CPU: 8 GB RAM + Intel(R) UHD Graphics 630
GPU: GeForce GTX 1050 3 GB, Windows 10 + Anaconda

From running

import tensorflow as tf
from tensorflow.python.client import device_lib
print(device_lib.list_local_devices())

I get the response:
GPU:0 with 2131 MB memory) -> physical GPU (device: 0, name: GeForce GTX 1050, pci bus id: 0000:01:00.0, compute capability: 6.1)
[name: "/device:CPU:0"
device_type: "CPU"
memory_limit: 268435456
locality {
}
incarnation: 18102239670215265869
, name: "/device:GPU:0"
device_type: "GPU"
memory_limit: 2235275673
locality {
bus_id: 1
links {
}
}
incarnation: 6041356209009565047
physical_device_desc: "device: 0, name: GeForce GTX 1050, pci bus id: 0000:01:00.0, compute capability: 6.1"
]

Thanks for your help and advice.

@dlavrantonis

The issue is still there; I'm not sure why the ticket is closed. There must be a leak, since it goes away if you reboot the machine.


n-92 commented Jan 5, 2021

This issue should not be closed. Tsk.


n-92 commented Jan 5, 2021

The most expedient way is probably to reduce the batch size. It'll run slower, but use less memory.

Hi, which file do I modify?


aeon0 commented Jan 8, 2021

@o92 I would say this is not a TensorFlow issue to begin with. There are multiple things you can do:

  • Decrease the batch size (see the sketch below)
  • Decrease the model input size
  • Decrease other model properties such as filter size
  • Get better hardware
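A minimal, hypothetical tf.keras sketch of the first three knobs; every layer, size, and value here is illustrative rather than taken from any model in this thread:

import tensorflow as tf

INPUT_SIZE = 128   # smaller input resolution -> smaller activations
FILTERS = 32       # fewer/narrower filters -> smaller weights and activations
BATCH_SIZE = 8     # smaller batch -> less activation memory per step

# Toy model used only to show where each knob lives.
model = tf.keras.Sequential([
    tf.keras.layers.Conv2D(FILTERS, 3, activation='relu',
                           input_shape=(INPUT_SIZE, INPUT_SIZE, 3)),
    tf.keras.layers.GlobalAveragePooling2D(),
    tf.keras.layers.Dense(10, activation='softmax'),
])
model.compile(optimizer='adam', loss='sparse_categorical_crossentropy')

# Random data just so the snippet runs end to end.
x = tf.random.uniform((64, INPUT_SIZE, INPUT_SIZE, 3))
y = tf.random.uniform((64,), maxval=10, dtype=tf.int32)
model.fit(x, y, batch_size=BATCH_SIZE, epochs=1)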

@sidharth1805

The main reason for these OOM allocation errors is that there isn't enough memory available, so these are the fixes you can try:

  1. Switch to another TF2 object detection model, such as MobileNet, with less processing time (you have to sacrifice some accuracy)
  2. Decrease the batch size
  3. Change the image resizer values

Try them one by one. If it still shows the same error, choose a lite model, reduce the batch size to 1, and try running it on Google Colab. For people with 8 GB of RAM it is very difficult to run this on a CPU, and Colab gives you 12 GB of RAM. These higher-level object detection models require a lot of computing power, so don't go with complex models unless necessary; otherwise you will surely need to upgrade the hardware.

Reducing the batch size didn't help me at first; then I did this and now it's working.

@ChuaCheowHuan

I encountered this issue when trying to do fine-tuning on Colab with EfficientNetB7. When I reduced the number of trainable layers during the fine-tuning process, everything worked fine. The batch size I'm using is 64.

# Fine tune from this layer onwards
#fine_tune_at = 100   # Error
fine_tune_at = 700   # OK

# Freeze all the layers before the `fine_tune_at` layer
for layer in base_model.layers[:fine_tune_at]:
  layer.trainable =  False

Number of layers in the base model (EfficientNetB7): 813
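For context, a hypothetical end-to-end sketch of the setup described above (the head, data shapes, and weights are placeholders; the point is that frozen layers keep no gradient tensors or optimizer slots, so freezing more of the 813 layers cuts GPU memory):

import tensorflow as tf

# Hypothetical base model; EfficientNetB7 is available in tf.keras.applications (TF 2.3+).
base_model = tf.keras.applications.EfficientNetB7(include_top=False, weights='imagenet',
                                                  input_shape=(600, 600, 3))

fine_tune_at = 700  # freezing up to a later layer keeps memory use manageable

# Freeze all the layers before the `fine_tune_at` layer.
for layer in base_model.layers[:fine_tune_at]:
    layer.trainable = False

model = tf.keras.Sequential([
    base_model,
    tf.keras.layers.GlobalAveragePooling2D(),
    tf.keras.layers.Dense(1, activation='sigmoid'),  # placeholder binary head
])
model.compile(optimizer='adam', loss='binary_crossentropy')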

@tinayzdzd

Hi all,
I have the same issue.
My data is text data and it contains 3097 rows.
I am using Google Colab with a GPU.
I use TensorFlow 1.15 and ELMo embeddings for a feed-forward network.
After checking different max sequences, with a max_sequence of 1024 and a batch size of 8, 2144/2355 training samples had trained and then I got this error message:

ResourceExhaustedError: 2 root error(s) found.
(0) Resource exhausted: OOM when allocating tensor with shape[8,1024,4849,44] and type float on /job:localhost/replica:0/task:0/device:GPU:0 by allocator GPU_0_bfc
[[{{node lambda_3/module_4_apply_default/bilm/CNN/Conv2D_6}}]]
Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info.

[[loss_3/mul/_523]]
Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info.

(1) Resource exhausted: OOM when allocating tensor with shape[8,1024,4849,44] and type float on /job:localhost/replica:0/task:0/device:GPU:0 by allocator GPU_0_bfc
[[{{node lambda_3/module_4_apply_default/bilm/CNN/Conv2D_6}}]]
Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info.

0 successful operations.

0 derived errors ignored.

As you can see in the error, there is an OOM when allocating a tensor with shape [8, 1024, 4849, 44]. My questions are:

  1. Why is the shape 4-dimensional? As far as I know, the ELMo embedding has a 3-dimensional shape.
  2. 8 is the batch size and 1024 is the max sequence; I don't know what 4849 and 44 are here.
  3. Are there any recommendations for fixing this issue?

Thanks a lot in advance.
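On the hint printed in the log: report_tensor_allocations_upon_oom is a field of the TF 1.x RunOptions proto, which is passed to a session run. A minimal sketch, with sess and train_op standing in for your own session and training op:

import tensorflow as tf  # TF 1.x

# Ask TensorFlow to list the live tensor allocations when an OOM happens,
# which usually points at the op that dominates memory.
run_options = tf.RunOptions(report_tensor_allocations_upon_oom=True)

# sess.run(train_op, options=run_options)  # placeholders for your own session / op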

@WittmannF

Coming here because reducing the batch size didn't solve the problem in my case. If you are having this problem during inference, the following might help:

with tf.device("cpu:0"):
  prediction = model.predict(...)

The difference between CPU and GPU inference time is not that high, and you'll have way more memory available on the CPU.
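A self-contained sketch of the same idea; the model below is a hypothetical stand-in for whatever model is hitting the OOM:

import numpy as np
import tensorflow as tf

# Hypothetical small model; replace with your own.
model = tf.keras.Sequential([tf.keras.layers.Dense(10, input_shape=(100,))])
x = np.random.rand(32, 100).astype('float32')

# Pin inference to the CPU: slower per batch, but it uses host RAM rather than
# the (usually much smaller) GPU memory.
with tf.device('/cpu:0'):
    prediction = model.predict(x)
print(prediction.shape)  # (32, 10)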

@oldmonkABA

with tf.device("cpu:0"):

God bless you, dude. I was struggling with this for 2 days.
