
Recursively copying elements from one graph to another #557

Merged
merged 14 commits into tensorflow:master on Apr 19, 2016

Conversation

srjoglekar246
Contributor

Allows for easy portability of elements (Variables and Ops) from one TensorFlow Graph to another. If called on a top-level element in a dataflow graph, it automatically copies all required instances.

Provides an API to retrieve the copied elements in the other graph, using a namespace.

Reference: https://codesachin.wordpress.com/2015/11/20/recursively-copying-elements-from-one-graph-to-another-in-tensorflow/
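The copy mechanism described above can be sketched, independently of TensorFlow, as a post-order recursion over a toy node type: to copy an element, first copy everything it depends on, reusing dependencies that were already copied. The Node class, the dict standing in for a second Graph, and the namespace convention below are illustrative stand-ins, not the PR's actual API; only the helper names loosely mirror the copy_op_to_graph / get_copied_op entry points.

```python
# Illustrative sketch only: a toy dataflow node, not TensorFlow's Operation class.
class Node:
    def __init__(self, name, inputs=()):
        self.name = name
        self.inputs = list(inputs)

def copy_to_graph(node, target, copied=None, namespace="copied"):
    """Recursively copy `node` and everything it depends on into `target`.

    `target` is a dict mapping namespaced names to copied nodes, standing in
    for a second Graph. Already-copied dependencies are reused, so shared
    inputs are copied exactly once.
    """
    if copied is None:
        copied = {}
    if node.name in copied:
        return copied[node.name]
    # Copy all inputs first (post-order), then the node itself.
    new_inputs = [copy_to_graph(i, target, copied, namespace) for i in node.inputs]
    clone = Node("%s/%s" % (namespace, node.name), new_inputs)
    copied[node.name] = clone
    target[clone.name] = clone
    return clone

def get_copied(name, target, namespace="copied"):
    """Look up a copied element by its original name, mirroring get_copied_op."""
    return target["%s/%s" % (namespace, name)]

# A small diamond: loss depends on w and x, which both feed two intermediate ops.
w = Node("w")
x = Node("x")
mul = Node("mul", [w, x])
add = Node("add", [w, x])
loss = Node("loss", [mul, add])

other_graph = {}
copy_to_graph(loss, other_graph)
assert get_copied("w", other_graph).name == "copied/w"
assert len(other_graph) == 5  # w and x were each copied once, not twice
```

The namespace prefix is what lets the copied elements coexist with (and be retrieved from) an otherwise-populated target graph.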

@srjoglekar246
Contributor Author

Just added the code, haven't changed any imports yet. Want to get general feedback before refining the code.

@vrv

vrv commented Dec 19, 2015

Hi @srjoglekar246, this looks pretty cool, and thanks for going through the work to put this together!

One of the ideas we have been working on is a notion of "Functions" in the GraphDef, which would allow for re-usable components. See here for the proto definition.

It's not ready yet and still needs some work (we may not do this as a proto in the GraphDef), but the underlying idea would obviate the need for copying elements around with unique names: you could define a subgraph as a function with inputs and outputs, and then re-use it. So instead of having multiple versions of the graph stamped out at once, each with unique names, you could have one named function that could be called, ported across graphs, and so on.

Do you feel that such a feature would accomplish your higher level goals?

@srjoglekar246
Contributor Author

Aah yes. The whole idea was to enable reusability of dataflow structures across Graph instances. If I'm not wrong, your solution would need some careful handling of how sessions 'run' these portable functions, but if implemented right, it could save a lot of memory and accomplish what my code intends.

@vrv

vrv commented Dec 19, 2015

Yeah, there's some more plumbing in the internal execution of graphs that specially handles functions.

For now, do you mind if we keep this pull request open but on the back burner, at least until we figure out whether functions are an easier-to-use abstraction?

(If you really want this checked in somewhere and know lots of others are using this, we've been intending to create a 'contrib' directory or repo where these types of utilities / functions could be placed).

@srjoglekar246
Contributor Author

Yeah, no problem!
On the other hand, a contrib directory for such scripts would be nice, especially for code that might be useful to a wide audience as a utility (rather than living inside the main framework).

@tensorflow-jenkins
Collaborator

Can one of the admins verify this patch?

@srjoglekar246
Contributor Author

Ping @vrv

@vrv

vrv commented Jan 9, 2016

We've considered adding a contrib directory, but ownership and bug reports would be hard to manage -- it probably needs to be a separate repo. Adding @martinwicke, since I think he's in the process of figuring this out.

There's been more progress on functions over the past few weeks. I think it's close -- take a look at an example: https://github.com/tensorflow/tensorflow/blob/master/tensorflow/python/framework/function_test.py#L279 and let us know whether that would accomplish what you want.

@bhack
Contributor

bhack commented Jan 9, 2016

@vrv With the experience of opencv-contrib, it is hard to maintain a quality level in a contributed repository and to guarantee decent review times for its PRs. One of the best community-scaling efforts is the Debian developer/maintainer process, and we could implement a very light version of it here. GitHub lowered the entry cost for new developers through the fork-and-PR process, but it also allowed a proliferation of sparse or short-lived contributions. We could find a way to reward medium- and long-term contributors (or contributor groups) by having them maintain some contrib modules in TensorFlow and review and accept the PRs and issues that target those modules. I don't know the best way to do this with the current GitHub management features for handling module orphaning, contributors going MIA, acceptance of new module proposals, and module removal. But by wiki-fying the community rules a bit and using labels, repositories, and submodules, the process could be managed in some way.

@srjoglekar246
Contributor Author

@vrv A separate repo would be nice; it would even let users build a proper codebase of different algorithms implemented in TensorFlow, something like what sklearn is to scipy.
As far as the code goes, it pretty much does what I wanted to achieve, with a better interface for reusable functions across graphs. Is it coming out in the next version? The Function class would also enable running different algorithms within the same environment, which is a nice bonus.

@vrv

vrv commented Jan 9, 2016

@srjoglekar246, @bhack: we'll try to find something that works for contributions.

As for functions: I'm not entirely sure what its current state is, but I wanted to solicit early feedback from you, since it seems like what you were originally trying to do with this PR. It's being actively worked on, so I'm hoping it will be ready "soon".

@srjoglekar246
Contributor Author

I guess once we can define reusable functions, we could do away with initialising ops as "tf.add" and instead use the method shown here: https://github.com/tensorflow/tensorflow/blob/master/tensorflow/python/framework/function_test.py#L99 . My only suggestion would be to make the system smarter with respect to argument types (for example, adding a float type to an int type should automatically return a float, assuming the shapes are compatible). But I guess that won't be too easy, especially in C++.
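The promotion rule suggested above (combining an int operand with a float operand yields a float result type) can be sketched as a lookup over a widening order. The ranking below is purely illustrative and is not TensorFlow's actual dtype lattice:

```python
# Hypothetical widening order: higher rank wins when two dtypes are combined.
# This is an illustration of the suggestion above, not TensorFlow's real rule.
PROMOTION_RANK = {"int32": 0, "int64": 1, "float32": 2, "float64": 3}

def promote(dtype_a, dtype_b):
    """Return the wider of two dtypes, so int + float -> float."""
    return max(dtype_a, dtype_b, key=PROMOTION_RANK.__getitem__)

assert promote("int32", "float32") == "float32"
assert promote("int64", "int32") == "int64"
```

A real implementation would also have to decide cases this toy ordering glosses over, such as whether int64 combined with float32 should widen all the way to float64.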

@cesarsalgado
Contributor

I'm afraid that creating a separate repo would dilute the focus of the community as a whole.

Edit: For example, I would like all implementations of new papers to have high-quality documentation. I'm afraid the contrib repo would end up with poor docs, like some Caffe PRs have. And if tensorflow carries semi-official implementations of some papers, that may disincentivize the main repo from producing an official, better-documented implementation earlier than it otherwise would.

@martinwicke
Member

Sorry for the long silence -- if you're still interested, I'd like to merge this into contrib. Can you move the file to tensorflow/contrib/copy_graph/python/util/copy.py?

Also, can you add a license header and the Python 3 from __future__ imports, and can you add a test for this?

@tensorflow-jenkins
Collaborator

Can one of the admins verify this patch?

@srjoglekar246
Contributor Author

@martinwicke Will get it done by tomorrow. Any particular format/guideline for the unit test to be added?

@martinwicke
Member

Just make sure that it tests the functionality you claim your functions provide.

@martinwicke
Member

And can you modify the docstrings to match the TensorFlow style guide (see the "Writing Documentation" howto)?

Thanks!
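For reference, the docstring convention the TensorFlow style guide asks for uses Args/Returns/Raises sections. The sketch below shows that shape for the copy_variable_to_graph entry point named later in this PR; the exact parameter list here is illustrative, not the PR's final signature:

```python
def copy_variable_to_graph(org_instance, to_graph, scope=""):
  """Copies the Variable `org_instance` into the graph `to_graph`.

  This docstring is a sketch of the TensorFlow documentation style;
  the parameter set shown here is illustrative.

  Args:
    org_instance: The `Variable` instance to copy.
    to_graph: The `Graph` object to copy the variable into.
    scope: A scope (name prefix) for the copied elements. Defaults to "".

  Returns:
    The copied `Variable` in `to_graph`.

  Raises:
    TypeError: If `org_instance` is not a `Variable`.
  """
```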

@srjoglekar246
Contributor Author

@martinwicke Could you take a look at the code (especially the BUILD and test files) and tell me if I am on the right track?

from tensorflow.python.ops.variables import Variable
from tensorflow.python.client.session import Session
from tensorflow.python.framework import ops
from copy import deepcopy
Member

Can you import this before tensorflow?


@@copy_op_to_graph
@@copy_variable_to_graph
@@get_copied_op
Member

Can you add this module to gen_docs_combined.py? See the other contrib modules in there.

Contributor Author

Sure. Can you let me know if/how to run the tensorflow tests on my machine?

Member

Follow the instructions to build from source. Once you can do that, you should be able to run bazel test tensorflow/... to run the tests. You can also give explicit test targets to re-run only some tests.

@srjoglekar246
Contributor Author

@martinwicke Made the changes. Let me know if there's anything more to be modified.

@srjoglekar246
Contributor Author

Ping @martinwicke

@gunan
Contributor

gunan commented Apr 15, 2016

Can one of the admins verify this patch?

@martinwicke
Member

Jenkins, test this please.

from __future__ import division
from __future__ import print_function

import sys
Member

This import seems redundant here.

@martinwicke
Member

I had some minor comments about Python module things, but otherwise it looks good.

@srjoglekar246
Contributor Author

@martinwicke I made the changes and fixed the bazel test errors. They all pass now. Have a look.

@martinwicke
Member

Thanks!

Jenkins, test this please.

@srjoglekar246
Contributor Author

@martinwicke Seems to work. Okay to be pushed in?

@martinwicke martinwicke merged commit 0cb193e into tensorflow:master Apr 19, 2016
@martinwicke
Member

Thanks!

@thjashin
Contributor

thjashin commented Nov 8, 2016

Hi @vrv ,

I'm currently writing a high-level library based on tensorflow, and I rely a lot on copying existing ops to achieve re-usability of dataflow structures. So I'm very interested in the "Functions" idea you mentioned here, and I'm wondering how it is going.
I've recently run into problems caused by not copying op._control_flow_context, which makes gradients through tf.cond fail in the copied subgraph. This problem also exists in this contribution.
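The failure mode described above (a clone silently missing an attribute the original carried) can be shown with a toy op class. Here `_control_flow_context` is the real TensorFlow attribute name; the ToyOp class and both copy helpers are stand-ins, not code from this PR:

```python
class ToyOp:
    """Stand-in for a TensorFlow Operation; only the fields this sketch needs."""
    def __init__(self, name, control_flow_context=None):
        self.name = name
        # In real TensorFlow this is op._control_flow_context, which links an
        # op to its enclosing tf.cond / tf.while_loop context.
        self._control_flow_context = control_flow_context

def naive_copy(op):
    # Copies only the name: the control flow context is silently dropped,
    # which is the bug described above.
    return ToyOp(op.name + "_copy")

def context_aware_copy(op):
    # Also carries over the control flow context, so gradient code that
    # inspects it still works in the copied subgraph.
    return ToyOp(op.name + "_copy", control_flow_context=op._control_flow_context)

original = ToyOp("merge", control_flow_context="cond_context_1")
assert naive_copy(original)._control_flow_context is None          # lost
assert context_aware_copy(original)._control_flow_context == "cond_context_1"
```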

@vrv

vrv commented Nov 8, 2016

There's been some work on functions, but it's still fairly primitive, and I don't know how well it composes with control flow.

We have https://github.com/tensorflow/tensorflow/blob/5a566a7701381a5cf7f70fce397759483764e482/tensorflow/python/framework/function.py which isn't public yet (though until we seal the public interface it's still available to play around with), and it isn't getting much love / attention, unfortunately. But if it proves useful, let us know and maybe we can at least make it public at some point.

@thjashin
Contributor

thjashin commented Nov 9, 2016

@vrv Thanks for the link. I had a look at functions, and unfortunately that's not what I actually want. Let me describe my high-level goals here. It's something like theano.clone() (related issues: #5479, #1070), which, in my view, can be seen as operation-level reuse rather than subgraph-level reuse: it lets one replace the inputs of any operation in the graph. I guess this is also what the author of tf.contrib.graph_editor is trying to achieve.
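The theano.clone()-style, operation-level substitution described above can be sketched on a toy expression graph: rebuild the graph recursively, swapping in a replacement wherever a node appears in the substitution map. The Expr class and helper below are illustrative, not theano's or graph_editor's actual API; only the `replace` argument semantics mirror theano.clone():

```python
class Expr:
    """Toy expression node standing in for a graph operation."""
    def __init__(self, name, inputs=()):
        self.name = name
        self.inputs = list(inputs)

def clone_with_replacements(node, replace, memo=None):
    """Rebuild the graph rooted at `node`, substituting per `replace`.

    `replace` maps original nodes to their stand-ins, mirroring the
    `replace` argument of theano.clone(). Shared nodes are cloned once.
    """
    if memo is None:
        memo = {}
    if node in replace:          # operation-level substitution point
        return replace[node]
    if node in memo:
        return memo[node]
    clone = Expr(node.name, [clone_with_replacements(i, replace, memo)
                             for i in node.inputs])
    memo[node] = clone
    return clone

x = Expr("x")
y = Expr("y")
out = Expr("add", [Expr("square", [x]), y])

# Swap x for a new input without touching the rest of the expression.
x_new = Expr("x_new")
out2 = clone_with_replacements(out, {x: x_new})
assert out2.inputs[0].inputs[0] is x_new   # x was swapped out
assert out2.inputs[1] is not y             # y itself was cloned...
assert out2.inputs[1].name == "y"          # ...but is structurally identical
```

The key difference from subgraph-level reuse is that the substitution map can target any node anywhere in the graph, not just the declared inputs of a packaged function.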
