INFO:tensorflow:Error reported to Coordinator: Nan in summary histogram for: image_pooling/BatchNorm/moving_variance_1 #3716

@GeorgeBohw

Please go to Stack Overflow for help and support:

http://stackoverflow.com/questions/tagged/tensorflow

Also, please understand that many of the models included in this repository are experimental and research-style code. If you open a GitHub issue, here is our policy:

  1. It must be a bug, a feature request, or a significant problem with documentation (for small docs fixes please send a PR instead).
  2. The form below must be filled out.

Here's why we have that policy: TensorFlow developers respond to issues. We want to focus on work that benefits the whole community, e.g., fixing bugs and adding features. Support only helps individuals. GitHub also notifies thousands of people when issues are filed. We want them to see you communicating an interesting problem, rather than being redirected to Stack Overflow.


System information

  • What is the top-level directory of the model you are using:
  • Have I written custom code (as opposed to using a stock example script provided in TensorFlow):
  • OS Platform and Distribution (e.g., Linux Ubuntu 16.04): Linux Ubuntu 16.04
  • TensorFlow installed from (source or binary): binary
  • TensorFlow version (use command below): 1.6-gpu
  • Bazel version (if compiling from source):
  • CUDA/cuDNN version: 9.0/7.0
  • GPU model and memory: 11G
  • Exact command to reproduce:

You can collect some of this information using our environment capture script:

https://github.com/tensorflow/tensorflow/tree/master/tools/tf_env_collect.sh

You can obtain the TensorFlow version with

python -c "import tensorflow as tf; print(tf.GIT_VERSION, tf.VERSION)"

Describe the problem

Describe the problem clearly here. Be sure to convey here why it's a bug in TensorFlow or a feature request.

When I run local_test.sh with a single change, setting --***_crop_size to 1000, the following error appears:

INFO:tensorflow:Error reported to Coordinator: Nan in summary histogram for: image_pooling/BatchNorm/moving_variance_1

 [[Node: image_pooling/BatchNorm/moving_variance_1 = HistogramSummary[T=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:CPU:0"](image_pooling/BatchNorm/moving_variance_1/tag, image_pooling/BatchNorm/moving_variance/read)]]
 [[Node: xception_65/middle_flow/block1/unit_13/xception_module/separable_conv3_pointwise/weights/read/_617 = _Recv[client_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device_incarnation=1, tensor_name="edge_2728_...ights/read", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:GPU:0"]()]]

Caused by op u'image_pooling/BatchNorm/moving_variance_1', defined at:
File "/home/george/project/deeplabv3/models-master/research/deeplab/train.py", line 347, in <module>
tf.app.run()
File "/home/george/anaconda2/lib/python2.7/site-packages/tensorflow/python/platform/app.py", line 126, in run
_sys.exit(main(argv))
File "/home/george/project/deeplabv3/models-master/research/deeplab/train.py", line 268, in main
summaries.add(tf.summary.histogram(model_var.op.name, model_var))
File "/home/george/anaconda2/lib/python2.7/site-packages/tensorflow/python/summary/summary.py", line 193, in histogram
tag=tag, values=values, name=scope)
File "/home/george/anaconda2/lib/python2.7/site-packages/tensorflow/python/ops/gen_logging_ops.py", line 189, in _histogram_summary
"HistogramSummary", tag=tag, values=values, name=name)
File "/home/george/anaconda2/lib/python2.7/site-packages/tensorflow/python/framework/op_def_library.py", line 787, in _apply_op_helper
op_def=op_def)
File "/home/george/anaconda2/lib/python2.7/site-packages/tensorflow/python/framework/ops.py", line 3271, in create_op
op_def=op_def)
File "/home/george/anaconda2/lib/python2.7/site-packages/tensorflow/python/framework/ops.py", line 1650, in __init__
self._traceback = self._graph.extract_stack() # pylint: disable=protected-access
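
The error above means a model variable already contained NaN by the time the histogram summary read it, so it may help to find out which variables are affected. This is a minimal sketch, assuming the variable values have already been fetched as NumPy arrays (for example from a checkpoint). The helper name find_nonfinite and the sample values are hypothetical and are not part of DeepLab or TensorFlow:

```python
import numpy as np


def find_nonfinite(named_values):
    """Return the sorted names of arrays that contain any NaN or Inf."""
    bad = []
    for name, value in named_values.items():
        arr = np.asarray(value, dtype=np.float64)
        if not np.all(np.isfinite(arr)):
            bad.append(name)
    return sorted(bad)


# Hypothetical values, only to illustrate the check.
sample = {
    "image_pooling/BatchNorm/moving_variance": np.array([1.0, np.nan, 0.5]),
    "image_pooling/BatchNorm/moving_mean": np.array([0.1, 0.2, 0.3]),
}
print(find_nonfinite(sample))
```

If only the batch norm moving statistics show up in the result, the NaN probably originated in the activations feeding that layer rather than in the summary op itself.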

What is the cause? Is 1000 too large? If I want to run the model on 1920*1080 images, how should I set this up?
I look forward to your response. Thank you!
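
For reference on the crop size question, here is a small sketch of the arithmetic, assuming the convention that a crop size should have the form output_stride * k + 1 for an integer k and be at least as large as the image dimension. That convention and the default output_stride of 16 are assumptions here and should be checked against the DeepLab documentation for the version in use. The helper name aligned_crop_size is hypothetical:

```python
def aligned_crop_size(dimension, output_stride=16):
    """Smallest value of the form output_stride * k + 1 that is >= dimension.

    Assumes the crop-size convention stated above; it is not read from DeepLab.
    """
    # Ceiling division using integers only, so it behaves the same on
    # Python 2 and Python 3.
    k = -(-(dimension - 1) // output_stride)
    return output_stride * k + 1


# Under this assumption, 1000 is not itself of the required form.
print(aligned_crop_size(1000))
# Height and width for a 1920*1080 image.
print(aligned_crop_size(1080), aligned_crop_size(1920))
```

Whether a misaligned crop size is what produces the NaN here is not established. The sketch only shows how candidate values could be computed.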

Source code / logs

Include any logs or source code that would be helpful to diagnose the problem. If including tracebacks, please include the full traceback. Large logs and files should be attached. Try to provide a reproducible test case that is the bare minimum necessary to generate the problem.
