Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Equality operator not overloaded #9359

Closed
carlthome opened this issue Apr 21, 2017 · 26 comments
Closed

Equality operator not overloaded #9359

carlthome opened this issue Apr 21, 2017 · 26 comments

Comments

@carlthome
Copy link
Contributor

@carlthome carlthome commented Apr 21, 2017

It's unexpected and a gotcha that some comparison operators (>, <, <=, =>) are overloaded but not all (==, !=). Is there a particular reason for this? I think either all standard operators should be overloaded, or none at all.

In [1]: import tensorflow as tf

In [2]: tensor = tf.zeros((3,5))

In [3]: tensor > 0
Out[3]: <tf.Tensor 'Greater:0' shape=(3, 5) dtype=bool>

In [4]: tensor == 0
Out[4]: False

This is on TensorFlow 1.1.0rc2.

@drpngx
Copy link
Contributor

@drpngx drpngx commented Apr 22, 2017

I think the reason is that we thought it might be confusing to have statements like if foo, which check for None.

@aselle may have some comment.

@carlthome
Copy link
Contributor Author

@carlthome carlthome commented Apr 23, 2017

Surely for those cases it would be better with the less ambiguous if foo is None:

@drpngx
Copy link
Contributor

@drpngx drpngx commented Apr 23, 2017

The google style guide mandates implicit comparisons, but not for integers. A lot of code has been written that way.

Whatever the case, reading the description again, it looks to me that case 4 doesn't make sense in the current state no matter what.

@shoyer
Copy link
Member

@shoyer shoyer commented Apr 23, 2017

This may be a complication of fact that tensors can be used as keys in dictionaries, which I believe use == to find the matching object with the same hash:
https://docs.python.org/2/reference/datamodel.html#object.__hash__

if foo actually calls bool(foo) -> foo.__bool__(), so that's something different.

@gunan
Copy link
Contributor

@gunan gunan commented Jun 16, 2017

Automatically closing due to lack of recent activity. Please update the issue when new information becomes available, and we will reopen the issue. Thanks!

@gunan gunan closed this Jun 16, 2017
@twangnh
Copy link

@twangnh twangnh commented Feb 20, 2018

but how would you suggest to write the code if we want == operator?

@carlthome
Copy link
Contributor Author

@carlthome carlthome commented Feb 20, 2018

tf.equal I guess? I wish this was reopened though.

@gunan
Copy link
Contributor

@gunan gunan commented Feb 21, 2018

@alextp @josh11b @drpngx to check and see if we would like to reopen this. But it has been almost one year with no activity, so it seems to be very low priority.

@carlthome
Copy link
Contributor Author

@carlthome carlthome commented Feb 21, 2018

I've seen several bugs caused by this inconsistency. I even managed to be tripped up by this myself just last week, and I would not be surprised if these bugs are very common across people's experiments, where the Python expression always evaluates to False and an implicit tf.convert_to_tensor on the expression just becomes a runtime constant.

@alextp
Copy link
Contributor

@alextp alextp commented Feb 21, 2018

We cannot override == and != because that would make tensors no longer work as dictionary keys, which would break a lot of internal and external code :-/

@shoyer
Copy link
Member

@shoyer shoyer commented Feb 21, 2018

Agreed with @alextp, this is unfortunately a non-starter.

The most obvious use of tensors as dictionary keys is placeholders in feed_dict.

@karandwivedi42
Copy link

@karandwivedi42 karandwivedi42 commented Apr 4, 2018

I just spent more than an hour debugging this. :/

Strongly agree that the inconsistency is misleading (I saw some code with a <= b and modified it with a==b and always got 0).

@carlthome
Copy link
Contributor Author

@carlthome carlthome commented Apr 5, 2018

This is an extremely tricky wart.

For TensorFlow 2.0 maybe tensors could stop being hashable and instead users would have to explicitly call e.g. tf.constant(9).hashable()? That would free up == and != for the expected use as overloaded operators for graph construction.

That's unfortunately the best I could come up with.

x = tf.placeholder(...)
feed_dict = {x.hashable(): np.array(...)}

@jeromerg
Copy link

@jeromerg jeromerg commented Nov 7, 2018

Would it be possible to at least produce a warning?

@alextp
Copy link
Contributor

@alextp alextp commented Nov 7, 2018

If we add a warning it'll trigger all the time for innocuous code since TF internally uses hash tables keyed by tensors in many places :-/

@TomoshibiAkira
Copy link

@TomoshibiAkira TomoshibiAkira commented Feb 26, 2019

Stumbled upon this issue when I found out tensor_x == 0 returns a False...
Well, here goes my time :(
Can't we at least warn people about the operator's inconsistent behavior? Maybe somewhere in the doc, suggest them to use tf functions instead.

@carlthome
Copy link
Contributor Author

@carlthome carlthome commented Feb 26, 2019

It's String comparisons in Java all over again. 😟

@josh11b
Copy link
Contributor

@josh11b josh11b commented Feb 26, 2019

We have recently been looking at this and been trying things to work around the hashable issue, but we haven't found anything that doesn't break a lot of code. In addition to the hashing issue, I've also seen problems with code like:

tensorflow/python/ops/embedding_ops.py, line 117, in _embedding_lookup_and_transform
if params is None or params in ((), []):

which is relying on Tensor.eq being the same as "is" when params is a Tensor.

@ericrannaud
Copy link

@ericrannaud ericrannaud commented Jun 5, 2019

This behavior could benefit from documentation.

No mention could be found in the following pages, where, presumably, a user may expect to discover it:

https://www.tensorflow.org/versions/r1.14/api_docs/python/tf/math/equal
https://www.tensorflow.org/versions/r1.14/api_docs/python/tf/math/not_equal
https://www.tensorflow.org/guide/eager
https://www.tensorflow.org/guide/autograph
https://www.tensorflow.org/alpha/guide/eager
https://www.tensorflow.org/alpha/guide/autograph
https://www.tensorflow.org/alpha/tutorials/eager/basics
https://www.tensorflow.org/alpha/tutorials/eager/tf_function
https://developers.google.com/machine-learning/crash-course/first-steps-with-tensorflow/programming-exercises
https://colab.research.google.com/notebooks/mlcc/first_steps_with_tensor_flow.ipynb

The palm being awarded to:
https://www.tensorflow.org/alpha/tutorials/eager/basics
which contains the sentence "Operator overloading is also supported." with no qualification whatsoever. It also contains a section titled "NumPy Compatibility" which omits to mention this tiny little wrinkle (and yes, I understand this section is technically about converting numpy arrays. but readers infer things from what could be seen as subtext here: "it's basically the same").

A lot of the examples casually use tf.equal(), ==, >= in the same breadth, without even hinting to the subtelty underlying their respective use.

The only tangeantial reference I could find is with respect to autograph.experimental.Feature.EQUALITY_OPERATORS mentioned in one of the links above, although the need for this is not explained in the least. https://www.tensorflow.org/versions/r2.0/api_docs/python/tf/autograph/experimental/Feature states that EQUALITY_OPERATORS is going away as support will be directly in the tf.Tensor class, contradicting @josh11b 's comment above.

If I may, I would advise bright red flashing letters in a giant pulsating warning box.

P.S. @carlthome At least in Java, it's not like a >= b works and a == b doesn't...

@carlthome
Copy link
Contributor Author

@carlthome carlthome commented Jun 5, 2019

P.S. @carlthome At least in Java, it's not like a >= b works and a == b doesn't...

True, this one is worse. 😢

@ericrannaud
Copy link

@ericrannaud ericrannaud commented Jun 5, 2019

Frankly, you have a great opportunity to fix this historical wart with TensorFlow 2. Make Tensor.__hash__() throw an error, and require the use of Tensor.hashable() or Tensor.handle(). Debug builds can instrument __contains__ on standard containers to catch the use of in with Tensors items. You will break code, but in a (mostly) safe manner. And will save a lot of trouble down the road. And document (!) the change in TensorFlow 2 release notes.

If you really must, only enable the new sane definition of __eq__ and __ne__ in eager execution mode and within autograph transformations where the use of Tensor as an abstract handle is a lot less likely (and sensible) than in the traditional manual graph building mode.

Internal Google code can be updated using static analysis. Internal pedestrian constraints shouldn't drive the design of a tool aiming for universality. Presumably, if I understand TensorFlow's growth plans correctly, a lot more TensorFlow code is going to be written in the future than was written to this day.

@jaingaurav
Copy link
Contributor

@jaingaurav jaingaurav commented Aug 12, 2019

Thank you everyone for requesting this feature. Due to feedback from the community we have built this feature behind the tf.enable_tensor_equality() toggle. This will be enabled by default in 2.0.

Please provide any feedback on this feature with the upcoming nightly builds.

@carlthome
Copy link
Contributor Author

@carlthome carlthome commented Aug 13, 2019

Fantastic! Looking forward to trying it out @jaingaurav. But so how was the dict key hash usage rectified? ⁉️

TensorFlow-Docs-Copybara pushed a commit to tensorflow/docs that referenced this issue Aug 13, 2019
tensorflow/tensorflow#9359

Also:
  guide/data needs nightly
  imbalance_data was trying to import the wrong package.
PiperOrigin-RevId: 263189464
jaingaurav added a commit to jaingaurav/tensorflow that referenced this issue Aug 19, 2019
Fixes tensorflow#9359

PiperOrigin-RevId: 262948811
(cherry picked from commit 4089730)
@jaingaurav
Copy link
Contributor

@jaingaurav jaingaurav commented Aug 29, 2019

@carlthome: We solved dictionary hashing in a couple of different ways:

  1. With short-lived/locally scoped dictionaries, using id() as the key is fine
  2. Exposed a wrapper object that can be used as a key with a reference to the Tensor: https://www.tensorflow.org/versions/r2.0/api_docs/python/tf/Tensor#experimental_ref
  3. Changed certain dictionaries to be a map of id(key) to a tuple of (key, value)
  4. Refactored code to avoid unnecessary use of a dictionary.

All of these options are reasonable for user code to use as well and we have been able to successfully apply the different approaches to various projects. As a result of enabling tensor equality we uncovered a number of bugs in code as well as tests where code wasn't really testing what it thought it was testing.

@ericrannaud
Copy link

@ericrannaud ericrannaud commented Aug 29, 2019

@carlthome
Copy link
Contributor Author

@carlthome carlthome commented Aug 30, 2019

You would go from having a community of users to having a community of
developers. As it is, it's impossible to contribute more than small bug
fixes to a project that doesn't open up its technical discussions and offer
detailed, day-to-day insights into its work and, most importantly, its
thinking.

AFAIK @ewilderj is doing a large push for opening up the Google-centric work process to a wider open source community. You might be interested in joining one of the SIGs for example: https://www.tensorflow.org/community/forums#special_interest_groups

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Linked pull requests

Successfully merging a pull request may close this issue.

None yet