ppc64le: //tensorflow/contrib/tensor_forest:scatter_add_ndim_op_test #21833

Closed
wdirons opened this issue Aug 23, 2018 · 3 comments

wdirons commented Aug 23, 2018

Please assign this issue to me

System information

  • Have I written custom code (as opposed to using a stock example script provided in TensorFlow): No
  • OS Platform and Distribution (e.g., Linux Ubuntu 16.04): PPC64LE Linux Ubuntu 16.04
  • Mobile device (e.g. iPhone 8, Pixel 2, Samsung Galaxy) if the issue happens on mobile device: N/A
  • TensorFlow installed from (source or binary): source
  • TensorFlow version (use command below): master branch git clone from 8/23/18 (Last commit 9289302)
  • Python version: 2.7
  • Bazel version (if compiling from source): 0.15.0
  • GCC/Compiler version (if compiling from source): 5.4.0
  • CUDA/cuDNN version: 9.0, 7
  • GPU model and memory: 4 V100 GPUs with 16 GB of memory each
  • Exact command to reproduce:
    bazel test --config=cuda --test_tag_filters=-no_oss,-oss_serial,-no_gpu,-benchmark-test --test_timeout 300,450,1200,3600 --local_test_jobs=4 --test_output=errors --build_tests_only --config=opt --run_under=//tensorflow/tools/ci_build/gpu_build:parallel_gpu_execute //tensorflow/contrib/tensor_forest:scatter_add_ndim_op_test

Describe the problem

5 of 6 test cases fail with an error similar to:

======================================================================
FAIL: testBadInput (__main__.ScatterAddNdimTest)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/root/.cache/bazel/_bazel_root/eab0d61a99b6696edb3d2aff87b585e8/execroot/org_tensorflow/bazel-out/ppc-opt/bin/tensorflow/contrib/tensor_forest/scatter_add_ndim_op_test.runfiles/org_tensorflow/tensorf                              on/kernel_tests/scatter_add_ndim_op_test.py", line 75, in testBadInput
    self.assertAllEqual(init_val, input_data.eval())
  File "/opt/anaconda2/lib/python2.7/contextlib.py", line 35, in __exit__
    self.gen.throw(type, value, traceback)
  File "/root/.cache/bazel/_bazel_root/eab0d61a99b6696edb3d2aff87b585e8/execroot/org_tensorflow/bazel-out/ppc-opt/bin/tensorflow/contrib/tensor_forest/scatter_add_ndim_op_test.runfiles/org_tensorflow/tensorf                              .py", line 1615, in assertRaisesWithPredicateMatch
    str(e)))
AssertionError: Exception of type <class 'tensorflow.python.framework.errors_impl.InvalidArgumentError'>: AttrValue must not have reference type value of float_ref
         for attr 'tensor_type'
        ; NodeDef: Variable/_5 = _Recv[_start_time=0, client_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device_incarnation=1, tensor_name="edge_2_Variable", tensor_type=DT_FLOAT_REF, _device="/job:localhost/replica:0/task:0/device:CPU:0"](^ScatterAddNdim/indices, ^ScatterAddNdim/deltas); Op<name=_Recv; signature= -> tensor:tensor_type; attr=tensor_type:type; attr=tensor_name:string; attr=send_device:string; attr=send_device_incarnation:int; attr=recv_device:string; attr=client_terminated:bool,default=false; is_stateful=true>
         [[Node: Variable/_5 = _Recv[_start_time=0, client_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device_incarnation=1, tensor_name="edge_2_Variable", tensor_type=DT_FLOAT_REF, _device="/job:localhost/replica:0/task:0/device:CPU:0"](^ScatterAddNdim/indices, ^ScatterAddNdim/deltas)]]

----------------------------------------------------------------------
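
The NodeDef in the error is long, so here is a minimal sketch (plain Python, no TensorFlow needed) that pulls out the three fields that seem most relevant: the sending device, the receiving device, and the tensor type. The helper names `summarize_recv` and `crosses_devices_with_ref` are hypothetical and not part of TensorFlow, and `NODE_DEF` is a shortened excerpt of the message above. This only reads what the log already says (a `DT_FLOAT_REF` tensor being received on CPU from GPU); it is not a confirmed root cause.

```python
import re

# Shortened excerpt of the NodeDef string from the error message above.
NODE_DEF = (
    'Variable/_5 = _Recv[_start_time=0, client_terminated=false, '
    'recv_device="/job:localhost/replica:0/task:0/device:CPU:0", '
    'send_device="/job:localhost/replica:0/task:0/device:GPU:0", '
    'tensor_type=DT_FLOAT_REF, '
    '_device="/job:localhost/replica:0/task:0/device:CPU:0"]'
)


def summarize_recv(node_def):
    """Return (send_device, recv_device, tensor_type) parsed from a _Recv NodeDef string."""
    def attr(name):
        # Match name=value or name="value"; the value ends at a quote, comma or ']'.
        match = re.search(r'(?<!\w)' + re.escape(name) + r'="?([^",\]]+)"?', node_def)
        return match.group(1) if match else None

    return attr("send_device"), attr("recv_device"), attr("tensor_type")


def crosses_devices_with_ref(node_def):
    """True when the node receives a reference-typed tensor from a different device."""
    send, recv, dtype = summarize_recv(node_def)
    return send != recv and dtype is not None and dtype.endswith("_REF")


if __name__ == "__main__":
    print(summarize_recv(NODE_DEF))
    print(crosses_devices_with_ref(NODE_DEF))
```

Running the same helpers over the other failing tests' NodeDef strings should show whether they all hit the same GPU-to-CPU edge.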

Source code / logs

[root@690470e3d41a workspace]# bazel test --config=cuda --test_tag_filters=-no_oss,-oss_serial,-no_gpu,-benchmark-test --test_timeout 300,450,1200,3600 --local_test_jobs=4 --test_output=errors --build_tests_only --config=opt --run_under=//tensorflow/tools/ci_build/gpu_build:parallel_gpu_execute //tensorflow/contrib/tensor_forest:scatter_add_ndim_op_test
WARNING: The following configs were expanded more than once: [cuda]. For repeatable flags, repeats are counted twice and may lead to unexpected behavior.
WARNING: --batch mode is deprecated. Please instead explicitly shut down your Bazel server using the command "bazel shutdown".
WARNING: /root/.cache/bazel/_bazel_root/eab0d61a99b6696edb3d2aff87b585e8/external/grpc/BUILD:1992:1: in srcs attribute of cc_library rule @grpc//:grpc_nanopb: please do not import '@grpc//third_party/nanopb:pb_common.c' directly. You should either move the file to this package or depend on an appropriate rule there. Since this rule was created by the macro 'grpc_generate_one_off_targets', the error might have been caused by the macro implementation in /root/.cache/bazel/_bazel_root/eab0d61a99b6696edb3d2aff87b585e8/external/grpc/bazel/grpc_build_system.bzl:172:12
WARNING: /root/.cache/bazel/_bazel_root/eab0d61a99b6696edb3d2aff87b585e8/external/grpc/BUILD:1992:1: in srcs attribute of cc_library rule @grpc//:grpc_nanopb: please do not import '@grpc//third_party/nanopb:pb_decode.c' directly. You should either move the file to this package or depend on an appropriate rule there. Since this rule was created by the macro 'grpc_generate_one_off_targets', the error might have been caused by the macro implementation in /root/.cache/bazel/_bazel_root/eab0d61a99b6696edb3d2aff87b585e8/external/grpc/bazel/grpc_build_system.bzl:172:12
WARNING: /root/.cache/bazel/_bazel_root/eab0d61a99b6696edb3d2aff87b585e8/external/grpc/BUILD:1992:1: in srcs attribute of cc_library rule @grpc//:grpc_nanopb: please do not import '@grpc//third_party/nanopb:pb_encode.c' directly. You should either move the file to this package or depend on an appropriate rule there. Since this rule was created by the macro 'grpc_generate_one_off_targets', the error might have been caused by the macro implementation in /root/.cache/bazel/_bazel_root/eab0d61a99b6696edb3d2aff87b585e8/external/grpc/bazel/grpc_build_system.bzl:172:12
INFO: Analysed target //tensorflow/contrib/tensor_forest:scatter_add_ndim_op_test (129 packages loaded).
INFO: Found 1 test target...
FAIL: //tensorflow/contrib/tensor_forest:scatter_add_ndim_op_test (see /root/.cache/bazel/_bazel_root/eab0d61a99b6696edb3d2aff87b585e8/execroot/org_tensorflow/bazel-out/ppc-opt/testlogs/tensorflow/contrib/tensor_forest/scatter_add_ndim_op_test/test.log)
INFO: From Testing //tensorflow/contrib/tensor_forest:scatter_add_ndim_op_test:
==================== Test output for //tensorflow/contrib/tensor_forest:scatter_add_ndim_op_test:
Running test /root/.cache/bazel/_bazel_root/eab0d61a99b6696edb3d2aff87b585e8/execroot/org_tensorflow/bazel-out/ppc-opt/bin/tensorflow/contrib/tensor_forest/scatter_add_ndim_op_test.runfiles/org_tensorflow/tensorflow/contrib/tensor_forest/scatter_add_ndim_op_test  on GPU 0
2018-08-21 21:51:07.422276: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1405] Found device 0 with properties:
name: Tesla V100-SXM2-16GB major: 7 minor: 0 memoryClockRate(GHz): 1.53
pciBusID: 0004:04:00.0
totalMemory: 15.75GiB freeMemory: 15.32GiB
2018-08-21 21:51:07.422327: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1484] Adding visible gpu devices: 0
2018-08-21 21:51:07.669332: I tensorflow/core/common_runtime/gpu/gpu_device.cc:965] Device interconnect StreamExecutor with strength 1 edge matrix:
2018-08-21 21:51:07.669384: I tensorflow/core/common_runtime/gpu/gpu_device.cc:971]      0
2018-08-21 21:51:07.669393: I tensorflow/core/common_runtime/gpu/gpu_device.cc:984] 0:   N
2018-08-21 21:51:07.669859: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1097] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 4838 MB memory) -> physical GPU (device: 0, name: Tesla V100-SXM2-16GB, pci bus id: 0004:04:00.0, compute capability: 7.0)
2018-08-21 21:51:07.725500: E tensorflow/core/framework/op_segment.cc:53] Create kernel failed: Invalid argument: AttrValue must not have reference type value of float_ref
         for attr 'tensor_type'
        ; NodeDef: Variable/_5 = _Recv[_start_time=0, client_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device_incarnation=1, tensor_name="edge_2_Variable", tensor_type=DT_FLOAT_REF, _device="/job:localhost/replica:0/task:0/device:CPU:0"](^ScatterAddNdim/indices, ^ScatterAddNdim/deltas); Op<name=_Recv; signature= -> tensor:tensor_type; attr=tensor_type:type; attr=tensor_name:string; attr=send_device:string; attr=send_device_incarnation:int; attr=recv_device:string; attr=client_terminated:bool,default=false; is_stateful=true>
2018-08-21 21:51:07.725565: E tensorflow/core/common_runtime/executor.cc:697] Executor failed to create kernel. Invalid argument: AttrValue must not have reference type value of float_ref
         for attr 'tensor_type'
        ; NodeDef: Variable/_5 = _Recv[_start_time=0, client_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device_incarnation=1, tensor_name="edge_2_Variable", tensor_type=DT_FLOAT_REF, _device="/job:localhost/replica:0/task:0/device:CPU:0"](^ScatterAddNdim/indices, ^ScatterAddNdim/deltas); Op<name=_Recv; signature= -> tensor:tensor_type; attr=tensor_type:type; attr=tensor_name:string; attr=send_device:string; attr=send_device_incarnation:int; attr=recv_device:string; attr=client_terminated:bool,default=false; is_stateful=true>
         [[Node: Variable/_5 = _Recv[_start_time=0, client_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device_incarnation=1, tensor_name="edge_2_Variable", tensor_type=DT_FLOAT_REF, _device="/job:localhost/replica:0/task:0/device:CPU:0"](^ScatterAddNdim/indices, ^ScatterAddNdim/deltas)]]
E2018-08-21 21:51:07.730994: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1484] Adding visible gpu devices: 0
2018-08-21 21:51:07.731026: I tensorflow/core/common_runtime/gpu/gpu_device.cc:965] Device interconnect StreamExecutor with strength 1 edge matrix:
2018-08-21 21:51:07.731034: I tensorflow/core/common_runtime/gpu/gpu_device.cc:971]      0
2018-08-21 21:51:07.731041: I tensorflow/core/common_runtime/gpu/gpu_device.cc:984] 0:   N
2018-08-21 21:51:07.731432: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1097] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 4838 MB memory) -> physical GPU (device: 0, name: Tesla V100-SXM2-16GB, pci bus id: 0004:04:00.0, compute capability: 7.0)
2018-08-21 21:51:07.741437: E tensorflow/core/framework/op_segment.cc:53] Create kernel failed: Invalid argument: AttrValue must not have reference type value of float_ref
         for attr 'tensor_type'
        ; NodeDef: Variable/_5 = _Recv[_start_time=0, client_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device_incarnation=1, tensor_name="edge_2_Variable", tensor_type=DT_FLOAT_REF, _device="/job:localhost/replica:0/task:0/device:CPU:0"](^ScatterAddNdim/indices, ^ScatterAddNdim/deltas); Op<name=_Recv; signature= -> tensor:tensor_type; attr=tensor_type:type; attr=tensor_name:string; attr=send_device:string; attr=send_device_incarnation:int; attr=recv_device:string; attr=client_terminated:bool,default=false; is_stateful=true>
2018-08-21 21:51:07.741477: E tensorflow/core/common_runtime/executor.cc:697] Executor failed to create kernel. Invalid argument: AttrValue must not have reference type value of float_ref
         for attr 'tensor_type'
        ; NodeDef: Variable/_5 = _Recv[_start_time=0, client_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device_incarnation=1, tensor_name="edge_2_Variable", tensor_type=DT_FLOAT_REF, _device="/job:localhost/replica:0/task:0/device:CPU:0"](^ScatterAddNdim/indices, ^ScatterAddNdim/deltas); Op<name=_Recv; signature= -> tensor:tensor_type; attr=tensor_type:type; attr=tensor_name:string; attr=send_device:string; attr=send_device_incarnation:int; attr=recv_device:string; attr=client_terminated:bool,default=false; is_stateful=true>
         [[Node: Variable/_5 = _Recv[_start_time=0, client_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device_incarnation=1, tensor_name="edge_2_Variable", tensor_type=DT_FLOAT_REF, _device="/job:localhost/replica:0/task:0/device:CPU:0"](^ScatterAddNdim/indices, ^ScatterAddNdim/deltas)]]
E2018-08-21 21:51:07.745582: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1484] Adding visible gpu devices: 0
2018-08-21 21:51:07.745601: I tensorflow/core/common_runtime/gpu/gpu_device.cc:965] Device interconnect StreamExecutor with strength 1 edge matrix:
2018-08-21 21:51:07.745608: I tensorflow/core/common_runtime/gpu/gpu_device.cc:971]      0
2018-08-21 21:51:07.745615: I tensorflow/core/common_runtime/gpu/gpu_device.cc:984] 0:   N
2018-08-21 21:51:07.745978: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1097] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 4838 MB memory) -> physical GPU (device: 0, name: Tesla V100-SXM2-16GB, pci bus id: 0004:04:00.0, compute capability: 7.0)
2018-08-21 21:51:07.755530: E tensorflow/core/framework/op_segment.cc:53] Create kernel failed: Invalid argument: AttrValue must not have reference type value of float_ref
         for attr 'tensor_type'
        ; NodeDef: Variable/_5 = _Recv[_start_time=0, client_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device_incarnation=1, tensor_name="edge_2_Variable", tensor_type=DT_FLOAT_REF, _device="/job:localhost/replica:0/task:0/device:CPU:0"](^ScatterAddNdim/indices, ^ScatterAddNdim/deltas); Op<name=_Recv; signature= -> tensor:tensor_type; attr=tensor_type:type; attr=tensor_name:string; attr=send_device:string; attr=send_device_incarnation:int; attr=recv_device:string; attr=client_terminated:bool,default=false; is_stateful=true>
2018-08-21 21:51:07.755565: E tensorflow/core/common_runtime/executor.cc:697] Executor failed to create kernel. Invalid argument: AttrValue must not have reference type value of float_ref
         for attr 'tensor_type'
        ; NodeDef: Variable/_5 = _Recv[_start_time=0, client_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device_incarnation=1, tensor_name="edge_2_Variable", tensor_type=DT_FLOAT_REF, _device="/job:localhost/replica:0/task:0/device:CPU:0"](^ScatterAddNdim/indices, ^ScatterAddNdim/deltas); Op<name=_Recv; signature= -> tensor:tensor_type; attr=tensor_type:type; attr=tensor_name:string; attr=send_device:string; attr=send_device_incarnation:int; attr=recv_device:string; attr=client_terminated:bool,default=false; is_stateful=true>
         [[Node: Variable/_5 = _Recv[_start_time=0, client_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device_incarnation=1, tensor_name="edge_2_Variable", tensor_type=DT_FLOAT_REF, _device="/job:localhost/replica:0/task:0/device:CPU:0"](^ScatterAddNdim/indices, ^ScatterAddNdim/deltas)]]
F2018-08-21 21:51:07.760553: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1484] Adding visible gpu devices: 0
2018-08-21 21:51:07.760575: I tensorflow/core/common_runtime/gpu/gpu_device.cc:965] Device interconnect StreamExecutor with strength 1 edge matrix:
2018-08-21 21:51:07.760583: I tensorflow/core/common_runtime/gpu/gpu_device.cc:971]      0
2018-08-21 21:51:07.760590: I tensorflow/core/common_runtime/gpu/gpu_device.cc:984] 0:   N
2018-08-21 21:51:07.760945: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1097] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 4838 MB memory) -> physical GPU (device: 0, name: Tesla V100-SXM2-16GB, pci bus id: 0004:04:00.0, compute capability: 7.0)
2018-08-21 21:51:07.770746: E tensorflow/core/framework/op_segment.cc:53] Create kernel failed: Invalid argument: AttrValue must not have reference type value of float_ref
         for attr 'tensor_type'
        ; NodeDef: Variable/_5 = _Recv[_start_time=0, client_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device_incarnation=1, tensor_name="edge_2_Variable", tensor_type=DT_FLOAT_REF, _device="/job:localhost/replica:0/task:0/device:CPU:0"](^ScatterAddNdim/indices, ^ScatterAddNdim/deltas); Op<name=_Recv; signature= -> tensor:tensor_type; attr=tensor_type:type; attr=tensor_name:string; attr=send_device:string; attr=send_device_incarnation:int; attr=recv_device:string; attr=client_terminated:bool,default=false; is_stateful=true>
2018-08-21 21:51:07.770780: E tensorflow/core/common_runtime/executor.cc:697] Executor failed to create kernel. Invalid argument: AttrValue must not have reference type value of float_ref
         for attr 'tensor_type'
        ; NodeDef: Variable/_5 = _Recv[_start_time=0, client_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device_incarnation=1, tensor_name="edge_2_Variable", tensor_type=DT_FLOAT_REF, _device="/job:localhost/replica:0/task:0/device:CPU:0"](^ScatterAddNdim/indices, ^ScatterAddNdim/deltas); Op<name=_Recv; signature= -> tensor:tensor_type; attr=tensor_type:type; attr=tensor_name:string; attr=send_device:string; attr=send_device_incarnation:int; attr=recv_device:string; attr=client_terminated:bool,default=false; is_stateful=true>
         [[Node: Variable/_5 = _Recv[_start_time=0, client_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device_incarnation=1, tensor_name="edge_2_Variable", tensor_type=DT_FLOAT_REF, _device="/job:localhost/replica:0/task:0/device:CPU:0"](^ScatterAddNdim/indices, ^ScatterAddNdim/deltas)]]
E2018-08-21 21:51:07.774856: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1484] Adding visible gpu devices: 0
2018-08-21 21:51:07.774874: I tensorflow/core/common_runtime/gpu/gpu_device.cc:965] Device interconnect StreamExecutor with strength 1 edge matrix:
2018-08-21 21:51:07.774882: I tensorflow/core/common_runtime/gpu/gpu_device.cc:971]      0
2018-08-21 21:51:07.774889: I tensorflow/core/common_runtime/gpu/gpu_device.cc:984] 0:   N
2018-08-21 21:51:07.775259: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1097] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 4838 MB memory) -> physical GPU (device: 0, name: Tesla V100-SXM2-16GB, pci bus id: 0004:04:00.0, compute capability: 7.0)
2018-08-21 21:51:07.787517: E tensorflow/core/framework/op_segment.cc:53] Create kernel failed: Invalid argument: AttrValue must not have reference type value of float_ref
         for attr 'tensor_type'
        ; NodeDef: Variable/_5 = _Recv[_start_time=0, client_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device_incarnation=1, tensor_name="edge_2_Variable", tensor_type=DT_FLOAT_REF, _device="/job:localhost/replica:0/task:0/device:CPU:0"](^ScatterAddNdim/indices, ^ScatterAddNdim/deltas); Op<name=_Recv; signature= -> tensor:tensor_type; attr=tensor_type:type; attr=tensor_name:string; attr=send_device:string; attr=send_device_incarnation:int; attr=recv_device:string; attr=client_terminated:bool,default=false; is_stateful=true>
2018-08-21 21:51:07.787565: E tensorflow/core/common_runtime/executor.cc:697] Executor failed to create kernel. Invalid argument: AttrValue must not have reference type value of float_ref
         for attr 'tensor_type'
        ; NodeDef: Variable/_5 = _Recv[_start_time=0, client_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device_incarnation=1, tensor_name="edge_2_Variable", tensor_type=DT_FLOAT_REF, _device="/job:localhost/replica:0/task:0/device:CPU:0"](^ScatterAddNdim/indices, ^ScatterAddNdim/deltas); Op<name=_Recv; signature= -> tensor:tensor_type; attr=tensor_type:type; attr=tensor_name:string; attr=send_device:string; attr=send_device_incarnation:int; attr=recv_device:string; attr=client_terminated:bool,default=false; is_stateful=true>
         [[Node: Variable/_5 = _Recv[_start_time=0, client_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device_incarnation=1, tensor_name="edge_2_Variable", tensor_type=DT_FLOAT_REF, _device="/job:localhost/replica:0/task:0/device:CPU:0"](^ScatterAddNdim/indices, ^ScatterAddNdim/deltas)]]
E.
======================================================================
ERROR: test1dim (__main__.ScatterAddNdimTest)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/root/.cache/bazel/_bazel_root/eab0d61a99b6696edb3d2aff87b585e8/execroot/org_tensorflow/bazel-out/ppc-opt/bin/tensorflow/contrib/tensor_forest/scatter_add_ndim_op_test.runfiles/org_tensorflow/tensorflow/contrib/tensor_forest/python/kernel_tests/scatter_add_ndim_op_test.py", line 37, in test1dim
    tensor_forest_ops.scatter_add_ndim(input_data, indices, updates).run()
  File "/root/.cache/bazel/_bazel_root/eab0d61a99b6696edb3d2aff87b585e8/execroot/org_tensorflow/bazel-out/ppc-opt/bin/tensorflow/contrib/tensor_forest/scatter_add_ndim_op_test.runfiles/org_tensorflow/tensorflow/python/framework/ops.py", line 2241, in run
    _run_using_default_session(self, feed_dict, self.graph, session)
  File "/root/.cache/bazel/_bazel_root/eab0d61a99b6696edb3d2aff87b585e8/execroot/org_tensorflow/bazel-out/ppc-opt/bin/tensorflow/contrib/tensor_forest/scatter_add_ndim_op_test.runfiles/org_tensorflow/tensorflow/python/framework/ops.py", line 4986, in _run_using_default_session
    session.run(operation, feed_dict)
  File "/root/.cache/bazel/_bazel_root/eab0d61a99b6696edb3d2aff87b585e8/execroot/org_tensorflow/bazel-out/ppc-opt/bin/tensorflow/contrib/tensor_forest/scatter_add_ndim_op_test.runfiles/org_tensorflow/tensorflow/python/client/session.py", line 877, in run
    run_metadata_ptr)
  File "/root/.cache/bazel/_bazel_root/eab0d61a99b6696edb3d2aff87b585e8/execroot/org_tensorflow/bazel-out/ppc-opt/bin/tensorflow/contrib/tensor_forest/scatter_add_ndim_op_test.runfiles/org_tensorflow/tensorflow/python/client/session.py", line 1100, in _run
    feed_dict_tensor, options, run_metadata)
  File "/root/.cache/bazel/_bazel_root/eab0d61a99b6696edb3d2aff87b585e8/execroot/org_tensorflow/bazel-out/ppc-opt/bin/tensorflow/contrib/tensor_forest/scatter_add_ndim_op_test.runfiles/org_tensorflow/tensorflow/python/client/session.py", line 1272, in _do_run
    run_metadata)
  File "/root/.cache/bazel/_bazel_root/eab0d61a99b6696edb3d2aff87b585e8/execroot/org_tensorflow/bazel-out/ppc-opt/bin/tensorflow/contrib/tensor_forest/scatter_add_ndim_op_test.runfiles/org_tensorflow/tensorflow/python/client/session.py", line 1291, in _do_call
    raise type(e)(node_def, op, message)
InvalidArgumentError: AttrValue must not have reference type value of float_ref
         for attr 'tensor_type'
        ; NodeDef: Variable/_5 = _Recv[_start_time=0, client_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device_incarnation=1, tensor_name="edge_2_Variable", tensor_type=DT_FLOAT_REF, _device="/job:localhost/replica:0/task:0/device:CPU:0"](^ScatterAddNdim/indices, ^ScatterAddNdim/deltas); Op<name=_Recv; signature= -> tensor:tensor_type; attr=tensor_type:type; attr=tensor_name:string; attr=send_device:string; attr=send_device_incarnation:int; attr=recv_device:string; attr=client_terminated:bool,default=false; is_stateful=true>
         [[Node: Variable/_5 = _Recv[_start_time=0, client_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device_incarnation=1, tensor_name="edge_2_Variable", tensor_type=DT_FLOAT_REF, _device="/job:localhost/replica:0/task:0/device:CPU:0"](^ScatterAddNdim/indices, ^ScatterAddNdim/deltas)]]

======================================================================
ERROR: test3dim (__main__.ScatterAddNdimTest)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/root/.cache/bazel/_bazel_root/eab0d61a99b6696edb3d2aff87b585e8/execroot/org_tensorflow/bazel-out/ppc-opt/bin/tensorflow/contrib/tensor_forest/scatter_add_ndim_op_test.runfiles/org_tensorflow/tensorflow/contrib/tensor_forest/python/kernel_tests/scatter_add_ndim_op_test.py", line 50, in test3dim
    tensor_forest_ops.scatter_add_ndim(input_data, indices, updates).run()
  File "/root/.cache/bazel/_bazel_root/eab0d61a99b6696edb3d2aff87b585e8/execroot/org_tensorflow/bazel-out/ppc-opt/bin/tensorflow/contrib/tensor_forest/scatter_add_ndim_op_test.runfiles/org_tensorflow/tensorflow/python/framework/ops.py", line 2241, in run
    _run_using_default_session(self, feed_dict, self.graph, session)
  File "/root/.cache/bazel/_bazel_root/eab0d61a99b6696edb3d2aff87b585e8/execroot/org_tensorflow/bazel-out/ppc-opt/bin/tensorflow/contrib/tensor_forest/scatter_add_ndim_op_test.runfiles/org_tensorflow/tensorflow/python/framework/ops.py", line 4986, in _run_using_default_session
    session.run(operation, feed_dict)
  File "/root/.cache/bazel/_bazel_root/eab0d61a99b6696edb3d2aff87b585e8/execroot/org_tensorflow/bazel-out/ppc-opt/bin/tensorflow/contrib/tensor_forest/scatter_add_ndim_op_test.runfiles/org_tensorflow/tensorflow/python/client/session.py", line 877, in run
    run_metadata_ptr)
  File "/root/.cache/bazel/_bazel_root/eab0d61a99b6696edb3d2aff87b585e8/execroot/org_tensorflow/bazel-out/ppc-opt/bin/tensorflow/contrib/tensor_forest/scatter_add_ndim_op_test.runfiles/org_tensorflow/tensorflow/python/client/session.py", line 1100, in _run
    feed_dict_tensor, options, run_metadata)
  File "/root/.cache/bazel/_bazel_root/eab0d61a99b6696edb3d2aff87b585e8/execroot/org_tensorflow/bazel-out/ppc-opt/bin/tensorflow/contrib/tensor_forest/scatter_add_ndim_op_test.runfiles/org_tensorflow/tensorflow/python/client/session.py", line 1272, in _do_run
    run_metadata)
  File "/root/.cache/bazel/_bazel_root/eab0d61a99b6696edb3d2aff87b585e8/execroot/org_tensorflow/bazel-out/ppc-opt/bin/tensorflow/contrib/tensor_forest/scatter_add_ndim_op_test.runfiles/org_tensorflow/tensorflow/python/client/session.py", line 1291, in _do_call
    raise type(e)(node_def, op, message)
InvalidArgumentError: AttrValue must not have reference type value of float_ref
         for attr 'tensor_type'
        ; NodeDef: Variable/_5 = _Recv[_start_time=0, client_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device_incarnation=1, tensor_name="edge_2_Variable", tensor_type=DT_FLOAT_REF, _device="/job:localhost/replica:0/task:0/device:CPU:0"](^ScatterAddNdim/indices, ^ScatterAddNdim/deltas); Op<name=_Recv; signature= -> tensor:tensor_type; attr=tensor_type:type; attr=tensor_name:string; attr=send_device:string; attr=send_device_incarnation:int; attr=recv_device:string; attr=client_terminated:bool,default=false; is_stateful=true>
         [[Node: Variable/_5 = _Recv[_start_time=0, client_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device_incarnation=1, tensor_name="edge_2_Variable", tensor_type=DT_FLOAT_REF, _device="/job:localhost/replica:0/task:0/device:CPU:0"](^ScatterAddNdim/indices, ^ScatterAddNdim/deltas)]]

======================================================================
ERROR: testIncompleteIndices (__main__.ScatterAddNdimTest)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/root/.cache/bazel/_bazel_root/eab0d61a99b6696edb3d2aff87b585e8/execroot/org_tensorflow/bazel-out/ppc-opt/bin/tensorflow/contrib/tensor_forest/scatter_add_ndim_op_test.runfiles/org_tensorflow/tensorflow/contrib/tensor_forest/python/kernel_tests/scatter_add_ndim_op_test.py", line 85, in testIncompleteIndices
    tensor_forest_ops.scatter_add_ndim(input_data, indices, updates).run()
  File "/root/.cache/bazel/_bazel_root/eab0d61a99b6696edb3d2aff87b585e8/execroot/org_tensorflow/bazel-out/ppc-opt/bin/tensorflow/contrib/tensor_forest/scatter_add_ndim_op_test.runfiles/org_tensorflow/tensorflow/python/framework/ops.py", line 2241, in run
    _run_using_default_session(self, feed_dict, self.graph, session)
  File "/root/.cache/bazel/_bazel_root/eab0d61a99b6696edb3d2aff87b585e8/execroot/org_tensorflow/bazel-out/ppc-opt/bin/tensorflow/contrib/tensor_forest/scatter_add_ndim_op_test.runfiles/org_tensorflow/tensorflow/python/framework/ops.py", line 4986, in _run_using_default_session
    session.run(operation, feed_dict)
  File "/root/.cache/bazel/_bazel_root/eab0d61a99b6696edb3d2aff87b585e8/execroot/org_tensorflow/bazel-out/ppc-opt/bin/tensorflow/contrib/tensor_forest/scatter_add_ndim_op_test.runfiles/org_tensorflow/tensorflow/python/client/session.py", line 877, in run
    run_metadata_ptr)
  File "/root/.cache/bazel/_bazel_root/eab0d61a99b6696edb3d2aff87b585e8/execroot/org_tensorflow/bazel-out/ppc-opt/bin/tensorflow/contrib/tensor_forest/scatter_add_ndim_op_test.runfiles/org_tensorflow/tensorflow/python/client/session.py", line 1100, in _run
    feed_dict_tensor, options, run_metadata)
  File "/root/.cache/bazel/_bazel_root/eab0d61a99b6696edb3d2aff87b585e8/execroot/org_tensorflow/bazel-out/ppc-opt/bin/tensorflow/contrib/tensor_forest/scatter_add_ndim_op_test.runfiles/org_tensorflow/tensorflow/python/client/session.py", line 1272, in _do_run
    run_metadata)
  File "/root/.cache/bazel/_bazel_root/eab0d61a99b6696edb3d2aff87b585e8/execroot/org_tensorflow/bazel-out/ppc-opt/bin/tensorflow/contrib/tensor_forest/scatter_add_ndim_op_test.runfiles/org_tensorflow/tensorflow/python/client/session.py", line 1291, in _do_call
    raise type(e)(node_def, op, message)
InvalidArgumentError: AttrValue must not have reference type value of float_ref
         for attr 'tensor_type'
        ; NodeDef: Variable/_5 = _Recv[_start_time=0, client_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device_incarnation=1, tensor_name="edge_2_Variable", tensor_type=DT_FLOAT_REF, _device="/job:localhost/replica:0/task:0/device:CPU:0"](^ScatterAddNdim/indices, ^ScatterAddNdim/deltas); Op<name=_Recv; signature= -> tensor:tensor_type; attr=tensor_type:type; attr=tensor_name:string; attr=send_device:string; attr=send_device_incarnation:int; attr=recv_device:string; attr=client_terminated:bool,default=false; is_stateful=true>
         [[Node: Variable/_5 = _Recv[_start_time=0, client_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device_incarnation=1, tensor_name="edge_2_Variable", tensor_type=DT_FLOAT_REF, _device="/job:localhost/replica:0/task:0/device:CPU:0"](^ScatterAddNdim/indices, ^ScatterAddNdim/deltas)]]

======================================================================
ERROR: testNoUpdates (__main__.ScatterAddNdimTest)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/root/.cache/bazel/_bazel_root/eab0d61a99b6696edb3d2aff87b585e8/execroot/org_tensorflow/bazel-out/ppc-opt/bin/tensorflow/contrib/tensor_forest/scatter_add_ndim_op_test.runfiles/org_tensorflow/tensorflow/contrib/tensor_forest/python/kernel_tests/scatter_add_ndim_op_test.py", line 62, in testNoUpdates
    tensor_forest_ops.scatter_add_ndim(input_data, indices, updates).run()
  File "/root/.cache/bazel/_bazel_root/eab0d61a99b6696edb3d2aff87b585e8/execroot/org_tensorflow/bazel-out/ppc-opt/bin/tensorflow/contrib/tensor_forest/scatter_add_ndim_op_test.runfiles/org_tensorflow/tensorf                              line 2241, in run
    _run_using_default_session(self, feed_dict, self.graph, session)
  File "/root/.cache/bazel/_bazel_root/eab0d61a99b6696edb3d2aff87b585e8/execroot/org_tensorflow/bazel-out/ppc-opt/bin/tensorflow/contrib/tensor_forest/scatter_add_ndim_op_test.runfiles/org_tensorflow/tensorf                              line 4986, in _run_using_default_session
    session.run(operation, feed_dict)
  File "/root/.cache/bazel/_bazel_root/eab0d61a99b6696edb3d2aff87b585e8/execroot/org_tensorflow/bazel-out/ppc-opt/bin/tensorflow/contrib/tensor_forest/scatter_add_ndim_op_test.runfiles/org_tensorflow/tensorf                               line 877, in run
    run_metadata_ptr)
  File "/root/.cache/bazel/_bazel_root/eab0d61a99b6696edb3d2aff87b585e8/execroot/org_tensorflow/bazel-out/ppc-opt/bin/tensorflow/contrib/tensor_forest/scatter_add_ndim_op_test.runfiles/org_tensorflow/tensorf                               line 1100, in _run
    feed_dict_tensor, options, run_metadata)
  File "/root/.cache/bazel/_bazel_root/eab0d61a99b6696edb3d2aff87b585e8/execroot/org_tensorflow/bazel-out/ppc-opt/bin/tensorflow/contrib/tensor_forest/scatter_add_ndim_op_test.runfiles/org_tensorflow/tensorf                               line 1272, in _do_run
    run_metadata)
  File "/root/.cache/bazel/_bazel_root/eab0d61a99b6696edb3d2aff87b585e8/execroot/org_tensorflow/bazel-out/ppc-opt/bin/tensorflow/contrib/tensor_forest/scatter_add_ndim_op_test.runfiles/org_tensorflow/tensorf                               line 1291, in _do_call
    raise type(e)(node_def, op, message)
InvalidArgumentError: AttrValue must not have reference type value of float_ref
         for attr 'tensor_type'
        ; NodeDef: Variable/_5 = _Recv[_start_time=0, client_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device="/job:localhost/replica:0/task:0/device:GPU:0", send_dev                              ="edge_2_Variable", tensor_type=DT_FLOAT_REF, _device="/job:localhost/replica:0/task:0/device:CPU:0"](^ScatterAddNdim/indices, ^ScatterAddNdim/deltas); Op<name=_Recv; signature= -> tensor:tensor_type; attr=t                              ame:string; attr=send_device:string; attr=send_device_incarnation:int; attr=recv_device:string; attr=client_terminated:bool,default=false; is_stateful=true>
         [[Node: Variable/_5 = _Recv[_start_time=0, client_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device="/job:localhost/replica:0/task:0/device:GPU:0", send_devic                              edge_2_Variable", tensor_type=DT_FLOAT_REF, _device="/job:localhost/replica:0/task:0/device:CPU:0"](^ScatterAddNdim/indices, ^ScatterAddNdim/deltas)]]

======================================================================
FAIL: testBadInput (__main__.ScatterAddNdimTest)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/root/.cache/bazel/_bazel_root/eab0d61a99b6696edb3d2aff87b585e8/execroot/org_tensorflow/bazel-out/ppc-opt/bin/tensorflow/contrib/tensor_forest/scatter_add_ndim_op_test.runfiles/org_tensorflow/tensorf                              on/kernel_tests/scatter_add_ndim_op_test.py", line 75, in testBadInput
    self.assertAllEqual(init_val, input_data.eval())
  File "/opt/anaconda2/lib/python2.7/contextlib.py", line 35, in __exit__
    self.gen.throw(type, value, traceback)
  File "/root/.cache/bazel/_bazel_root/eab0d61a99b6696edb3d2aff87b585e8/execroot/org_tensorflow/bazel-out/ppc-opt/bin/tensorflow/contrib/tensor_forest/scatter_add_ndim_op_test.runfiles/org_tensorflow/tensorf                              .py", line 1615, in assertRaisesWithPredicateMatch
    str(e)))
AssertionError: Exception of type <class 'tensorflow.python.framework.errors_impl.InvalidArgumentError'>: AttrValue must not have reference type value of float_ref
         for attr 'tensor_type'
        ; NodeDef: Variable/_5 = _Recv[_start_time=0, client_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device="/job:localhost/replica:0/task:0/device:GPU:0", send_dev                              ="edge_2_Variable", tensor_type=DT_FLOAT_REF, _device="/job:localhost/replica:0/task:0/device:CPU:0"](^ScatterAddNdim/indices, ^ScatterAddNdim/deltas); Op<name=_Recv; signature= -> tensor:tensor_type; attr=t                              ame:string; attr=send_device:string; attr=send_device_incarnation:int; attr=recv_device:string; attr=client_terminated:bool,default=false; is_stateful=true>
         [[Node: Variable/_5 = _Recv[_start_time=0, client_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device="/job:localhost/replica:0/task:0/device:GPU:0", send_devic                              edge_2_Variable", tensor_type=DT_FLOAT_REF, _device="/job:localhost/replica:0/task:0/device:CPU:0"](^ScatterAddNdim/indices, ^ScatterAddNdim/deltas)]]

----------------------------------------------------------------------
Ran 6 tests in 1.097s

FAILED (failures=1, errors=4)
================================================================================
Target //tensorflow/contrib/tensor_forest:scatter_add_ndim_op_test up-to-date:
  bazel-bin/tensorflow/contrib/tensor_forest/scatter_add_ndim_op_test
INFO: Elapsed time: 64.835s, Critical Path: 45.36s
INFO: 1 process: 1 local.
INFO: Build completed, 1 test FAILED, 5 total actions
//tensorflow/contrib/tensor_forest:scatter_add_ndim_op_test              FAILED in 2.8s
  /root/.cache/bazel/_bazel_root/eab0d61a99b6696edb3d2aff87b585e8/execroot/org_tensorflow/bazel-out/ppc-opt/testlogs/tensorflow/contrib/tensor_forest/scatter_add_ndim_op_test/test.log
@wdirons
Contributor Author

wdirons commented Aug 23, 2018

I had read online that ScatterAddNdim has no GPU support. I looked at https://github.com/tensorflow/tensorflow/blob/master/tensorflow/contrib/tensor_forest/BUILD#L460-L465 and see that the test case is tagged 'no_pip_gpu'. So I believe this test shouldn't run in the GPU unit tests, but the GPU unit test run only excludes tests tagged no_gpu.

@gunan - Can I get your opinion on how to fix this? I see three options:

  1. Add the no_gpu tag to the test case in tensor_forest/BUILD.
  2. Add -no_pip_gpu to the test_tag_filters parameter in the bazel command for the GPU unit tests.
  3. I noticed linux/gpu doesn't define a test for contrib the way linux/cpu does. Should the GPU tests not cover contrib at all?
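
To illustrate why the test is still picked up, here is a minimal Python sketch of how a Bazel-style `--test_tag_filters` string selects tests: a test is dropped if it carries any excluded tag, and otherwise kept when there are no positive filters. The helper name `is_selected` and the assumption that the test carries only the `no_pip_gpu` tag are mine, for illustration only; this is not Bazel's implementation.

```python
def is_selected(test_tags, tag_filters):
    """Return True if a test with `test_tags` passes a comma-separated filter string.

    Hypothetical helper: entries starting with '-' are excluded tags,
    all other entries are required tags (at least one must match).
    """
    required = set()
    excluded = set()
    for item in tag_filters.split(","):
        item = item.strip()
        if not item:
            continue
        if item.startswith("-"):
            excluded.add(item[1:])
        else:
            required.add(item)

    tags = set(test_tags)
    if tags & excluded:
        return False
    # With no positive filters, every non-excluded test is selected.
    return not required or bool(tags & required)


# Filter string taken from the reproduce command in this issue.
GPU_FILTERS = "-no_oss,-oss_serial,-no_gpu,-benchmark-test"

# Assumed current tagging of the test: still selected for the GPU run.
print(is_selected(["no_pip_gpu"], GPU_FILTERS))

# Option 1: also tag the test no_gpu, so the existing filter excludes it.
print(is_selected(["no_pip_gpu", "no_gpu"], GPU_FILTERS))

# Option 2: extend the filter string instead of changing the test's tags.
print(is_selected(["no_pip_gpu"], GPU_FILTERS + ",-no_pip_gpu"))
```

Under these assumptions, options 1 and 2 both exclude the test. Option 1 changes only this one target, while option 2 would also drop every other test tagged `no_pip_gpu` from the GPU run.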

CC @jayfurmanek, who would prefer to test as much as possible in the GPU tests.

@jayfurmanek
Contributor

Option 3 is the most interesting one for me. We do ship most of contrib (with limited support) in PowerAI, so maximum test coverage is ideal.
Historically, what was the idea behind the no_pip_gpu tag?

@gunan
Contributor

gunan commented Aug 24, 2018

Sorry for the late reply.
Ideally, Linux GPU should test contrib too, but we are already dedicating thousands of GPUs to TensorFlow CI. We test contrib on GPU every night to get the most coverage with fewer machines.
So I would go with (1).
As for the no_pip_gpu tag: at one point, we mass-tagged the failing tests to get all builds green and filed bugs about them.

wdirons added a commit to wdirons/tensorflow that referenced this issue Aug 27, 2018
As scatter_add_ndim doesn't have implementation for GPU, the
test needs to be excluded from GPU test to prevent it from failing.
Currently fails on both x86_64 and ppc64le.
Fixes tensorflow#21833