RFC: Functions, not sessions in 2.0 #20

Merged
merged 15 commits into from Nov 16, 2018

Conversation

@asimshankar
Contributor

asimshankar commented Sep 20, 2018

Review period closes 2018-10-04

Functions, not sessions in 2.0

Status Proposed
Author(s) ashankar@google.com, joshl@google.com
Sponsor apassos@google.com
Updated 2018-09-18

Proposal to make TensorFlow more "Pythonic" in 2.0. In four bullet points:

  • Encourage the encapsulation of graph computation as Python functions
    (where the graph is executed when the function is invoked, instead of via Session); see the sketch after this list.
  • Align "state" in the TensorFlow runtime (e.g., resource tensors like those that back tf.Variable objects) with state in the Python program (e.g., Python objects corresponding to the runtime state, with the lifetimes of the two tied together).
  • Make it easy to export these encapsulations to a GraphDef+Checkpoint and/or SavedModel.
  • Enable eager execution by default.
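
To make the first bullet concrete, here is a minimal sketch of the before/after shape this implies, using the tf.function name discussed in the comments below (the decorator's final name and exact semantics were still under review at this point):

import tensorflow as tf

# TF 1.x style: build a graph with placeholders, then execute it via a Session.
#
#   x = tf.placeholder(tf.float32)
#   y = x * 2.0
#   with tf.Session() as sess:
#       result = sess.run(y, feed_dict={x: 3.0})

# Proposed 2.0 style: the graph computation is encapsulated in a Python
# function and executed when the function is called.
@tf.function
def double(x):
    return x * 2.0

result = double(tf.constant(3.0))  # traced to a graph and run on the first call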

asimshankar added some commits Sep 20, 2018

@ewilderj ewilderj added this to Needs attention in RFC management via automation Sep 20, 2018

@ewilderj ewilderj moved this from Needs attention to Open reviews in RFC management Sep 20, 2018

@ngc92

ngc92 commented Sep 20, 2018

For Python 3 users, it would be nice if, instead of specifying expected types and shapes as an argument to defun, Python's own type annotations were parsed. So instead of

@tf.defun(input_signature=((tf.float32, [None, 5]),))
def f(x):
  pass

we could write

@tf.defun
def f(x: tf.TensorSpec[tf.float32, (None, 5)]):
  pass

While this could not be used in TensorFlow itself (unless there were a build step that, when targeting Python 2, converted the latter into the former), it would give projects not concerned with backward compatibility a nicer syntax, and give linters the ability to detect some API misuses before the program is run.
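
For comparison, the explicit input_signature mechanism, as it eventually shipped under the tf.function name in 2.x, spells the types and shapes out with tf.TensorSpec objects; a minimal sketch (the reduce_sum body is just illustrative):

import tensorflow as tf

# Explicit signature: one float32 input of shape [None, 5], given as a
# tf.TensorSpec rather than a Python 3 type annotation.
@tf.function(input_signature=[tf.TensorSpec(shape=[None, 5], dtype=tf.float32)])
def f(x):
    return tf.reduce_sum(x, axis=1)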

asimshankar added some commits Sep 21, 2018

@carlthome

carlthome commented Sep 21, 2018

Definitely prefer tf.function over tf.defun! Why would we de-fun TensorFlow programs? 🙃

@omoindrot

omoindrot commented Sep 21, 2018

It also makes sense to have @tf.function if we have @tf.method (and not @tf.defmethod).

@seanpmorgan

Member

seanpmorgan commented Sep 21, 2018

Strongly agree with @tf.function or @tf.func. My first impression of tf.contrib.defun was that it signaled some kind of defunct function. Since the decorator will be so commonly used, it's worth making sure the name is intuitive at first glance and doesn't need to be looked up.

Looking at common Python decorators, I find the clearest ones use either:

  • an understood noun (@classmethod, @property, @app_endpoint())
  • a written-out action that wraps it (@autograph.convert(), @login_required, @atexit.register)

@seanpmorgan

Member

seanpmorgan commented Sep 21, 2018

Separately, will we need to handle tf.defun wrapping a function which uses tf.py_func? It seems like, if nothing else, it may need to throw an error, since the Python function won't be serialized in the graph.

Related to tensorflow/tensorflow#10282
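
The situation in question, sketched with the names that shipped in 2.x (tf.function and tf.py_function rather than the proposal-era tf.defun and tf.py_func; the sin example is hypothetical): the Python body below is not representable as graph ops, so a GraphDef/SavedModel containing it cannot run without a Python interpreter.

import numpy as np
import tensorflow as tf

def python_only(x):
    # Arbitrary Python/NumPy code; not expressible as TensorFlow graph ops.
    return np.sin(x.numpy())

@tf.function
def f(x):
    # tf.py_function embeds a callback into the Python interpreter, so the
    # traced graph cannot be serialized and executed without Python.
    return tf.py_function(python_only, [x], tf.float32)

f(tf.constant([0.0, 1.0]))  # works in-process, but won't survive export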

@ngc92

ngc92 commented Sep 21, 2018

How does this interact with device selection? I.e.

@tf.defun
def f(...):
    ...

with tf.device("/gpu:0"):
  f()
with tf.device("/gpu:1"):
  f()

The different calls to f have the same signature here, but to execute them on different devices we need two different graphs.

@samjabrahams

samjabrahams commented Sep 22, 2018

At a high level, I really like the idea to push towards a more functional/composable API. For me, having to figure out a clean way to reuse components of a graph while passing in different inputs (e.g. dataset inputs during training, then export a graph with placeholders for inference) has always been an annoyance. I will likely have more comments (currently traveling and have limited time to type things up), but I had a couple of concerns I want to bring up:

  • Currently, the Session API takes a few runs to "warm up". That is, the first couple of runs will run slower. I suppose this is more of a question about Eager in general, but are there risks of things always running slowly by not using a session? Or will there actually be less overhead due to not needing to traverse a graph to find what operations to run?
  • I'm concerned about this change being a half-measure. I haven't done user research myself, but I think that the difficulty of new-user on-boarding is about more than just the fact that the graph is run lazily. It's also that the API is massive and thus prone to make new users' eyes glaze over. Not only that, but there is limited guidance on what idiomatic TensorFlow code should look like. This leads to confusion and a lot of people who have to learn the same lessons over and over. I believe that one of the largest opportunities of 2.0 is the ability to remove cruft and imagine what TensorFlow could be if it were built from the ground up. I understand that there are needs in terms of enabling backwards compatibility to prevent alienating the userbase, but right now I think this risks adding bloat in the form of "there's yet another way to run a graph!".
    • I'm not saying that this API is exactly what we want; I need more time to digest it. What I would say is that either the scope should be much larger, or it should be reduced so that it doesn't require as many awkward compromises/workarounds in order to kind-of work with existing semantics.
  • I think there should be more weight given to (and conversation about) diverging the Python vs. non-Python APIs. I know that most of the people using the C++/Go APIs may be considered "power users", and thus better equipped to deal with additional things to learn, but one of the strengths of TensorFlow is that the semantics of your Python/training code are similar to those in deployment. It's hard to argue against "Python code should be Pythonic", but I wonder if there is a way to keep the APIs unified at the same time.

@asimshankar asimshankar referenced this pull request Sep 25, 2018

Merged

RFC: Sunset tf.contrib #18

@asimshankar

Contributor Author

asimshankar commented Sep 26, 2018

@samjabrahams , thanks for the comments. Some responses:

  • "warm up": There should be no risk of running slowly by not using a session. Any warmup (e.g., creating the kernels etc.) may happen on the first time the function is executed, but there should be no overhead beyond that. As you guessed, there may even be less overhead (the ~10us or so sometimes spent in looking up the exact graph to execute based on feeds and fetches).

  • This change is just one of many towards 2.0. You'll see talk about reducing the API surface in other RFCs like #15 #16 #18, and others coming. Is there something specific to this proposal that you think can be improved?

  • Point taken about the other language APIs. I'll follow up on that.

@asimshankar

Contributor Author

asimshankar commented Sep 26, 2018

@ngc92 , re #20 (comment) - yes, different graphs will be created for those. This happens with tf.contrib.eager.defun today, as the context of execution is implicitly a part of the input signature. I'll clarify that in the document too.

asimshankar added some commits Oct 9, 2018

@ewilderj ewilderj moved this from Open reviews to Awaiting Committee Notes in RFC management Oct 11, 2018

tensorflow-copybara pushed a commit to tensorflow/tensorflow that referenced this pull request Oct 18, 2018

TF 2.0: tf.Session -> tf.compat.v1.Session
(In the spirit of tensorflow/community#20)

PiperOrigin-RevId: 217627136

@goldiegadde goldiegadde self-assigned this Oct 30, 2018

@goldiegadde

Contributor

goldiegadde commented Oct 30, 2018

@asimshankar could you please post the notes from the design review meeting into this thread?

Also, are there any updates to this document?

Once both of those are resolved we can proceed to merge this.

@asimshankar

Contributor Author

asimshankar commented Nov 16, 2018

@goldiegadde : I updated the document in place based on the meeting notes.

@goldiegadde

Contributor

goldiegadde commented Nov 16, 2018

> @goldiegadde : I updated the document in place based on the meeting notes.

Thanks @asimshankar. Can you push a version of the doc with the Status set to "Accepted"? We can then proceed to merge it.

@asimshankar

Contributor Author

asimshankar commented Nov 16, 2018

@goldiegadde : Done. Thanks.

@goldiegadde

Contributor

goldiegadde commented Nov 16, 2018

Thanks everyone for reviewing the RFC. I am going to merge this request now.

@ewilderj ewilderj reopened this Nov 16, 2018

RFC management automation moved this from Awaiting Committee Notes to Needs attention Nov 16, 2018

@ewilderj ewilderj merged commit feda983 into tensorflow:master Nov 16, 2018

1 check passed

cla/google All necessary CLAs are signed

RFC management automation moved this from Needs attention to Accepted RFCs Nov 16, 2018

@facaiy facaiy referenced this pull request Mar 9, 2019

Open

Migrate dense image warp #53
