contrib.data.Dataset - doc issue with Dataset.map / tf.py_func in 1.3.0rc0 #11786

Closed
cheind opened this issue Jul 26, 2017 · 8 comments
Labels
type:docs-bug Document issues

@cheind

cheind commented Jul 26, 2017

System information

  • Have I written custom code (as opposed to using a stock example script provided in TensorFlow): no
  • OS Platform and Distribution (e.g., Linux Ubuntu 16.04): Windows 10
  • TensorFlow installed from (source or binary): 1.3.0rc0
  • Python version: 3.6
  • CUDA/cuDNN version: 8/6
  • GPU model and memory: GTX 1080
  • Exact command to reproduce:

The following sample is taken from here and works in TF 1.2.1:

import tensorflow as tf
import numpy as np

def _read_py_function(filename, label):
  return np.zeros((100, 100, 1), dtype=np.uint8), label

def _resize_function(image_decoded, label):
  image_decoded.set_shape([None, None, None])
  image_resized = tf.image.resize_images(image_decoded, [28, 28])
  return image_resized, label

filenames = np.array(["/var/data/image1.jpg", "/var/data/image2.jpg"])
labels = np.array([0, 37])

dataset = tf.contrib.data.Dataset.from_tensor_slices((filenames, labels))
dataset = dataset.map(
    lambda filename, label: tf.py_func(
        _read_py_function, [filename, label], [tf.uint8, label.dtype]))
dataset = dataset.map(_resize_function)

In 1.3.0rc0 it produces the following error:

Cannot convert a list containing a tensor of dtype <dtype: 'int32'> to <dtype: 'uint8'> (Tensor is: <tf.Tensor 'PyFunc:1' shape=<unknown> dtype=int32>)

This is due to the breaking change mentioned in the release notes. To fix it, one now has to wrap the py_func result in an explicit tuple(), like so:

dataset = dataset.map(
    lambda filename, label: tuple(tf.py_func(
        _read_py_function, [filename, label], [tf.uint8, label.dtype])))

This should at least be mentioned in the API docs / programmer guide.
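
The distinction can be pictured with a small pure-Python model. It is only an illustration of the behavior described in this issue, not TensorFlow's actual code; `split_components` is a hypothetical name and the strings are stand-ins for tensors.

```python
def split_components(result):
    """Toy model: a tuple is a structure of components; a list is one value."""
    if isinstance(result, tuple):
        # Each tuple element is treated as a separate dataset component.
        return list(result)
    # Anything else, including a list, is treated as a single component.
    return [result]

image, label = "uint8 image", "int32 label"  # stand-ins for tensors

as_list = split_components([image, label])
as_tuple = split_components((image, label))

print(len(as_list))   # 1
print(len(as_tuple))  # 2
```

Under this reading, the list case leaves one component that has to be packed into a single tensor with a single dtype, which would explain the "Cannot convert a list containing a tensor of dtype ..." message when the elements differ in type.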

@aselle
Contributor

aselle commented Jul 26, 2017

@mrry, could you update the documentation to reflect this change?

@aselle aselle added the type:docs-bug Document issues label Jul 26, 2017
@aselle
Contributor

aselle commented Jul 26, 2017

Thanks @cheind for reporting this.

@cheind
Author

cheind commented Jul 27, 2017

@mrry You're welcome. While this is certainly a doc issue at this point, I want to raise the concern that giving tuple and list totally different semantics is very uncommon in Python (even in TensorFlow) and could lead to many surprising moments on the user side.

vrv pushed a commit to vrv/tensorflow that referenced this issue Jul 28, 2017
END_PUBLIC

I dropped the following commit because it doesn't compile.
I will follow up with Andrew to fix it or revert it.
Commit 003deb8 authored by osdamv<osdamv@gmail.com>
Committed by Vijay Vasudevan<vrv@google.com>:
Refactor and implementation of the camera API 1, it fixes tensorflow#8736 (tensorflow#10771)

List of commits in this CL:
---
Commit 4464503 authored by A. Unique TensorFlower<gardener@tensorflow.org>
Committed by TensorFlower Gardener<gardener@tensorflow.org>:
Use identity of param variable in cudnn_rnn.RNNParamsSaveable instead of parameter
variable directly. The RNNParamsSaveable is usually used in a graph which also
has a saver for the cudnn param variable itself, if the same op is used for
both, fails with a two savers for same op error.

PiperOrigin-RevId: 163431826

---
Commit d629a83 authored by RJ Ryan<rjryan@google.com>
Committed by TensorFlower Gardener<gardener@tensorflow.org>:
Increase bound on tf.contrib.signal.inverse_stft gradient error to avoid flakiness on macOS.

PiperOrigin-RevId: 163426631

---
Commit 253bcbb authored by Kay Zhu<kayzhu@google.com>
Committed by TensorFlower Gardener<gardener@tensorflow.org>:
[XLA] Use HloEvaluator for convolution in reference_util.

Also Speed up HloEvaluator's HandleConvolution in non-opt build, by moving calls
to HloInstruction::shape() out of the inner loop.

PiperOrigin-RevId: 163416183

---
Commit 569a00e authored by A. Unique TensorFlower<gardener@tensorflow.org>
Committed by TensorFlower Gardener<gardener@tensorflow.org>:
Update API to traffic in unique_ptrs rather than owning raw pointers

PiperOrigin-RevId: 163414320

---
Commit 31a77bc authored by Asim Shankar<ashankar@google.com>
Committed by TensorFlower Gardener<gardener@tensorflow.org>:
Java: Update release to 1.3.0-rc1

PiperOrigin-RevId: 163413736

---
Commit 1ebbf43 authored by Jonathan Hseu<vomjom@vomjom.net>
Committed by GitHub<noreply@github.com>:
Add missing grpc dependency (tensorflow#11828)

---
Commit 905abb1 authored by A. Unique TensorFlower<gardener@tensorflow.org>
Committed by TensorFlower Gardener<gardener@tensorflow.org>:
Test asserts should have `expected` first.

PiperOrigin-RevId: 163409348

---
Commit d5cc143 authored by A. Unique TensorFlower<gardener@tensorflow.org>
Committed by TensorFlower Gardener<gardener@tensorflow.org>:
Increase timeout to deflake the test.

PiperOrigin-RevId: 163407824

---
Commit ce1c7f0 authored by Eli Bendersky<eliben@google.com>
Committed by TensorFlower Gardener<gardener@tensorflow.org>:
Properly include logging header in xla_internal_test_main

PiperOrigin-RevId: 163405986

---
Commit 22241cd authored by joetoth<joetoth@gmail.com>
Committed by Vijay Vasudevan<vrv@google.com>:
External leveldb link changed (tensorflow#11833)

table_format.txt was renamed to table_format.md
---
Commit 6b7314d authored by A. Unique TensorFlower<gardener@tensorflow.org>
Committed by TensorFlower Gardener<gardener@tensorflow.org>:
Consolidating the code to fill the partition's function library
into one place. Previously, Partition() and MasterSession::RegisterPartition()
both fills in the partitioned graph's function library.

PiperOrigin-RevId: 163400992

---
Commit 28373cf authored by Frank Chen<frankchn@google.com>
Committed by TensorFlower Gardener<gardener@tensorflow.org>:
Adds preliminary support for Cloud TPUs with Cluster Resolvers. This aims to allow users to have a better experience when specifying one or multiple Cloud TPUs for their training jobs by allowing users to use names rather than IP addresses.

PiperOrigin-RevId: 163393443

---
Commit e5353c9 authored by A. Unique TensorFlower<gardener@tensorflow.org>
Committed by TensorFlower Gardener<gardener@tensorflow.org>:
Don't prune nodes that have reference inputs.

PiperOrigin-RevId: 163390862

---
Commit 2265108 authored by Asim Shankar<ashankar@google.com>
Committed by TensorFlower Gardener<gardener@tensorflow.org>:
C API: Groundwork for experimenting with TF_Tensor in device memory.

TF_Tensor objects are always backed by host memory. This commit lays
the groundwork for allowing TF_Tensor objects to refer to tensor data
on device (e.g., GPU) memory.

PiperOrigin-RevId: 163388079

---
Commit 613bf1c authored by Yuefeng Zhou<yuefengz@google.com>
Committed by TensorFlower Gardener<gardener@tensorflow.org>:
fix asan test failure in SingleMachineTest::ReleaseMemoryAfterDestruction.

PiperOrigin-RevId: 163386941

---
Commit 4653d37 authored by Eli Bendersky<eliben@google.com>
Committed by TensorFlower Gardener<gardener@tensorflow.org>:
[XLA] Change type to appease GPU builds.

PiperOrigin-RevId: 163384927

---
Commit 9f131bd authored by A. Unique TensorFlower<gardener@tensorflow.org>
Committed by TensorFlower Gardener<gardener@tensorflow.org>:
Internal change

PiperOrigin-RevId: 163378484

---
Commit 8bc0236 authored by A. Unique TensorFlower<gardener@tensorflow.org>
Committed by TensorFlower Gardener<gardener@tensorflow.org>:
PiperOrigin-RevId: 163366493

---
Commit 3b97f1f authored by Yangzihao Wang<yangzihao@google.com>
Committed by TensorFlower Gardener<gardener@tensorflow.org>:
Change to only run one round of matmul benchmark.

PiperOrigin-RevId: 163364341

---
Commit a4a3a33 authored by Yun Peng<pcloudy@google.com>
Committed by Vijay Vasudevan<vrv@google.com>:
Fix ./configure on Windows (tensorflow#11775)

* Fix ./configure on Windows

* Disable bitwise_ops_test on Windows

---
Commit ae3119d authored by A. Unique TensorFlower<gardener@tensorflow.org>
Committed by TensorFlower Gardener<gardener@tensorflow.org>:
Small changes to op framework.

PiperOrigin-RevId: 163361071

---
Commit f40189d authored by qjivy<ji.qiu@spreadtrum.com>
Committed by Vijay Vasudevan<vrv@google.com>:
PR again: Enable building label_image with jpeg/gif/png decoder for Android.  (tensorflow#11475)

* Enable building label_image with jpeg/gif/png decoder for Android.
Add dependency "android_tesnorflow_image_op" to label_image, which
is not overlapped with android_tensorflow_kernels.

* Running buildifier to reformat the BUILD files for
sanity check.

---
Commit 5991658 authored by KB Sriram<kbsriram@gmail.com>
Committed by Vijay Vasudevan<vrv@google.com>:
Add the Constant operator class (tensorflow#11559)

Create a custom operator class to create constants in the Graph,
and introduce the Operator marker annotation to identify
operator classes.

Please see tensorflow#7149 for the master tracking issue.
---
Commit 86ca350 authored by A. Unique TensorFlower<gardener@tensorflow.org>
Committed by TensorFlower Gardener<gardener@tensorflow.org>:
Further BUILD cleanup

PiperOrigin-RevId: 163360750

---
Commit 376bb06 authored by Pete Warden<petewarden@google.com>
Committed by TensorFlower Gardener<gardener@tensorflow.org>:
Look inside functions to see which node types are used.

PiperOrigin-RevId: 163360375

---
Commit 2139e7d authored by A. Unique TensorFlower<gardener@tensorflow.org>
Committed by TensorFlower Gardener<gardener@tensorflow.org>:
[tf.contrib.data] map expects a nested structure.

Fixes tensorflow#11786

PiperOrigin-RevId: 163359134

---
Commit d09304f authored by Jonathan Hseu<vomjom@vomjom.net>
Committed by Vijay Vasudevan<vrv@google.com>:
Upgrade gRPC (tensorflow#11768)

* BUILD rule modifications

* More build fixes

* Code changes

* More code fixes

* Working tests

* CMake build

* Fix pprof

* Fix header includes

* CMake fix test

* Bazel clean

* Fix verbs

* More verbs fixes

* bazel clean for XLA

* Windows build fix test

* Add openssl/rand.h

* New cmake build command

* --config Release

---
Commit 3cd8284 authored by David Norman<DavidNorman@users.noreply.github.com>
Committed by Vijay Vasudevan<vrv@google.com>:
Fix error with default python path selection (tensorflow#11814)

* Fix error with default python path selection

* Move setting of environment var outside if / else

---
Commit ddd8e21 authored by Eli Bendersky<eliben@google.com>
Committed by TensorFlower Gardener<gardener@tensorflow.org>:
[XLA] Consolidate all similar main()s in tests into a single target.

PiperOrigin-RevId: 163354724

---
Commit a36bca2 authored by Tayo Oguntebi<tayo@google.com>
Committed by TensorFlower Gardener<gardener@tensorflow.org>:
Remove ShapeWithoutPadding() utility function, as it is no longer needed.

PiperOrigin-RevId: 163353430

---
Commit b26f9cd authored by David Norman<DavidNorman@users.noreply.github.com>
Committed by Vijay Vasudevan<vrv@google.com>:
Ensure that the multi-instruction fuse can take shared inputs (tensorflow#11748)

* Ensure that the multi-instruction fuse can take shared inputs

Note that the fuse action only works when the shared input / constant
appears after all of its consumers in the list of instructions.

* Add a comment describing the test

---
Commit 34cbf16 authored by Jiri Simsa<jsimsa@google.com>
Committed by TensorFlower Gardener<gardener@tensorflow.org>:
Update Dataset API documentation.

PiperOrigin-RevId: 163349457

---
Commit 2381ce5 authored by Abdullah Alrasheed<a.rasheed@tc-sa.com>
Committed by Vijay Vasudevan<vrv@google.com>:
DOC: Fix typo. (tensorflow#11813)

you could could be I/O bottlenecked.
TO:
you could be I/O bottlenecked.
---
Commit e4a5c53 authored by Toby Boyd<tobyboyd@google.com>
Committed by TensorFlower Gardener<gardener@tensorflow.org>:
["Variable", "VariableV2", "VarHandleOp"] is the default for ps_ops=None

PiperOrigin-RevId: 163344629

---
Commit 722f6f3 authored by A. Unique TensorFlower<gardener@tensorflow.org>
Committed by TensorFlower Gardener<gardener@tensorflow.org>:
Fix TensorForest's saveable object names so loading a savedmodel works.

PiperOrigin-RevId: 163332598

---
Commit cda80a7 authored by Eric Liu<ioeric@google.com>
Committed by TensorFlower Gardener<gardener@tensorflow.org>:
[tpu profiler] Dump HLO graphs in profile responses to the log directory.

PiperOrigin-RevId: 163318992

---
Commit cea9ef6 authored by horance<horance-liu@users.noreply.github.com>
Committed by Vijay Vasudevan<vrv@google.com>:
Refactoring device name utils (tensorflow#11797)

* remove duplicated code for full_name and legacy_name for DeviceNameUtils

* replace tabs

* Real->Device

---
Commit 1f7c0f9 authored by Kongsea<kongsea@gmail.com>
Committed by Vijay Vasudevan<vrv@google.com>:
Refine docstrings (tensorflow#11800)

---
Commit dd1f0cd authored by A. Unique TensorFlower<gardener@tensorflow.org>
Committed by TensorFlower Gardener<gardener@tensorflow.org>:
Supports lookup devices by fullname either in the canonical form or the
legacy form. This makes DeviceSet behaves the same as DeviceMgr's
FindDevice method.

PiperOrigin-RevId: 163300346

---
Commit 631a364 authored by Kay Zhu<kayzhu@google.com>
Committed by TensorFlower Gardener<gardener@tensorflow.org>:
[XLA] Add Reduce, DynamicSlice and DynamicSliceUpdate to HloEvaluator.

- Reduce is disabled explicitly for constant folding, as not all types of
embedded computation can be currently supported by the evaluator.

- Added support to evaluate HloModule to HloEvaluator.

- Minor signature change to Evaluate().

PiperOrigin-RevId: 163299238

---
Commit a524701 authored by A. Unique TensorFlower<gardener@tensorflow.org>
Committed by TensorFlower Gardener<gardener@tensorflow.org>:
Sets the incarnation number even when the attribute is set.

PiperOrigin-RevId: 163299121

---
Commit a49fe03 authored by Suharsh Sivakumar<suharshs@google.com>
Committed by TensorFlower Gardener<gardener@tensorflow.org>:
Remove platform bridge for grpc_response_reader.

PiperOrigin-RevId: 163295986

---
Commit 4404aa7 authored by A. Unique TensorFlower<gardener@tensorflow.org>
Committed by TensorFlower Gardener<gardener@tensorflow.org>:
[XLA] Add TODO comment explaining why the IsScalar check exists.

PiperOrigin-RevId: 163292777

---
Commit 43036ac authored by A. Unique TensorFlower<gardener@tensorflow.org>
Committed by TensorFlower Gardener<gardener@tensorflow.org>:
Remove unnecessary break statements.

PiperOrigin-RevId: 163291947

---
Commit fd5de46 authored by A. Unique TensorFlower<gardener@tensorflow.org>
Committed by TensorFlower Gardener<gardener@tensorflow.org>:
[XLA] Add regression test for a corner case using Reduce that currently fails with the GPU backend.

PiperOrigin-RevId: 163287986

---
Commit 32e198f authored by Chris Leary<leary@google.com>
Committed by TensorFlower Gardener<gardener@tensorflow.org>:
[TF:XLA] Add tf.cross support.

See tensorflow#11788

PiperOrigin-RevId: 163287731

---
Commit 88abddb authored by Alan Yee<alyee@ucsd.edu>
Committed by Vijay Vasudevan<vrv@google.com>:
Update README.md (tensorflow#11793)

Remove bad practices of sudo pip and install use safer pip install commands
---
Commit 9b30dc3 authored by A. Unique TensorFlower<gardener@tensorflow.org>
Committed by TensorFlower Gardener<gardener@tensorflow.org>:
Remove final mentions of `get_shape` in docstring.

PiperOrigin-RevId: 163282839

---
Commit 423c1ee authored by A. Unique TensorFlower<gardener@tensorflow.org>
Committed by TensorFlower Gardener<gardener@tensorflow.org>:
BREAKING CHANGE: Fix semantic error in how maybe_batch* handles sparse tensors.

PiperOrigin-RevId: 163276613

---
Commit 6028c07 authored by Justin Lebar<jlebar@google.com>
Committed by TensorFlower Gardener<gardener@tensorflow.org>:
Highlight incoming/outgoing edges on hover in HLO graphviz dumps, and other improvements.

Other improvements:

 - Don't show tooltips for nodes and clusters.  Previously we'd show a
   tooltip containing a pointer value expressed as decimal.  Not so
   useful.

 - Show tooltips on edges with the to/from node names.

 - Fix bug wherein if we had

   - a node at the "edge" of the graph (so its operands aren't included
     unless they're referenced by another node),
   - with all of its operands included in the graph save one or more
     constants, and
   - those constants weren't referenced by any nodes not at the edge of
     the graph,

   we would incorrectly draw the node as "grayed out", indicating that
   one of its operands (namely, its constant operand) wasn't present in
   the graph.

   This is wrong because constants are inlined into their users, so they
   should always count as "displayed" for the purposes of determining
   whether a node is grayed out.

PiperOrigin-RevId: 163276108

---
Commit ce7a355 authored by Joshua V. Dillon<jvdillon@google.com>
Committed by TensorFlower Gardener<gardener@tensorflow.org>:
Update contrib/distributions/estimator_test build dependency.

PiperOrigin-RevId: 163272464

---
Commit 1b8458a authored by A. Unique TensorFlower<gardener@tensorflow.org>
Committed by TensorFlower Gardener<gardener@tensorflow.org>:
Shorten docstring line.

PiperOrigin-RevId: 163269709

---
Commit 69e323c authored by Asim Shankar<ashankar@google.com>
Committed by TensorFlower Gardener<gardener@tensorflow.org>:
Fix comment ypo

PiperOrigin-RevId: 163266376

---
Commit 08790e7 authored by Chris Leary<leary@google.com>
Committed by TensorFlower Gardener<gardener@tensorflow.org>:
[XLA] Fix a bug in cloning outfeeds, carried the wrong shape.

PiperOrigin-RevId: 163265592

---
Commit 1bad826 authored by Yangzihao Wang<yangzihao@google.com>
Committed by TensorFlower Gardener<gardener@tensorflow.org>:
Rollback of GPU kernel implementation of transpose for tensors with one small dimension.
END_PUBLIC

BEGIN_PUBLIC
Automated g4 rollback of changelist 162525519

PiperOrigin-RevId: 163490703

@GPhilo

GPhilo commented Aug 21, 2017

I have a similar error message, but I'm not sure if I got the proposed solution correctly.
The code raising the error:

dataset = tf.contrib.data.TFRecordDataset(shard_files)
dataset = dataset.map(partial(decoder.decode, items=['label', 'image']))

where decoder is a tf.contrib.slim.tfexample_decoder.TFExampleDecoder()

So, if I got the issue right, decoder.decode() returns a list [label, decoded_image_data], which is implicitly cast to a single tensor (and the cast fails because label and image_data have different types).
However, if I write a lambda that wraps tuple() around the result of the call to decoder.decode(), this should fix the problem:

dataset = dataset.map(lambda s: tuple(decoder.decode(s, items=['label', 'image'])))

I still get the error, however:

TypeError: Cannot convert a list containing a tensor of dtype <dtype: 'float32'> to <dtype: 'int64'> (Tensor is: <tf.Tensor 'distort_image/Mul:0' shape=(224, 224, 3) dtype=float32>)

Am I getting the solution wrong? Is this really the intended way to use the API? Even when many other TF APIs return lists instead of tuples?

EDIT: I had map() in several places and, of course, the new error was coming from the line below the one I fixed.
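
When map() is called in several places, one way to avoid missing a call site is a small wrapper that converts list results to tuples. This is a sketch only: `tuple_outputs` and `fake_decode` are hypothetical names, not part of TensorFlow, and `fake_decode` merely stands in for a decoder that returns a list.

```python
import functools

def tuple_outputs(map_fn):
    """Wrap map_fn so that a list return value is converted to a tuple."""
    @functools.wraps(map_fn)
    def wrapper(*args, **kwargs):
        result = map_fn(*args, **kwargs)
        # Only lists are converted; tuples and single values pass through.
        return tuple(result) if isinstance(result, list) else result
    return wrapper

def fake_decode(record):
    # Stand-in for a decoder that returns [label, image] as a list.
    return [record + "_label", record + "_image"]

safe_decode = tuple_outputs(fake_decode)
print(safe_decode("r0"))  # ('r0_label', 'r0_image')
```

With the code from this comment, the call would presumably become dataset.map(tuple_outputs(partial(decoder.decode, items=['label', 'image']))), and the same wrapper could be applied to every other function passed to map().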

@mrry
Contributor

mrry commented Aug 21, 2017

@GPhilo We've fixed this issue in the internal branch, and it should appear at HEAD soon. (Follow #12396 to see when the commit lands.) After the fix, the behavior of Dataset.map() will be the same if a function returns a tuple or a list containing the same elements, and the tuple(...) workaround will no longer be necessary.

@cheind
Author

cheind commented Aug 22, 2017

@mrry sounds great!

@tamizharasank

TypeError: Cannot convert a list containing a tensor of dtype <dtype: 'int32'> to <dtype: 'float32'> (Tensor is: <tf.Tensor 'Preprocessor/stack_1:0' shape=(1, 3) dtype=int32>)

@mrry
Contributor

mrry commented Jul 20, 2018

@tamizharasank This issue has been closed. If you think you've found a bug, please open a new issue with enough details about your program to reproduce the problem. Otherwise, Stack Overflow may be a more useful venue for your question.
